项目的完整代码在 C2j-Compiler <https://github.com/dejavudwh/C2j-Compiler>

前言


在上一篇完成对JVM指令的生成,下面就可以真正进入代码生成部分了。通常现代编译器都是先把生成IR,再经过代码优化等等,最后才编译成目标平台代码。但是时间水平有限,我们没有IR也没有代码优化,就直接利用AST生成Java字节码

入口

进行代码生成的入口在CodeGen,和之前解释器一样:先获取main函数的头节点,从这个节点开始,先进入函数定义,再进入代码块

函数定义节点

在进入函数定义节点的时候,就要生成一个函数定义对应的Java字节码,即一个静态方法(因为我们对整个C语言文件生成为一个类,main方法为public
static main,其它的则是对应的静态方法,结构体则是另外的类)

* 对于函数定义先从节点中拿到对应的函数命和参数
* emitArgs是用来处理参数的,根据参数生成相应的Java字节码
* 如果这个函数是main的话已经是交由前面处理了,逻辑差不多(具体在start.Start中) case
SyntaxProductionInit.NewName_LP_RP_TO_FunctDecl: root.reverseChildren();
AstNode n = root.getChildren().get(0); String name = (String)
n.getAttribute(NodeKey.TEXT); symbol = (Symbol)
root.getAttribute(NodeKey.SYMBOL); generator.setCurrentFuncName(name); if (name
!= null && !name.equals("main")) { String declaration = name +
emitArgs(symbol); generator.emitDirective(Directive.METHOD_PUBBLIC_STATIC,
declaration); generator.setNameAndDeclaration(name, declaration); }
copyChild(root, root.getChildren().get(0)); break; case
SyntaxProductionInit.NewName_LP_VarList_RP_TO_FunctDecl: n =
root.getChildren().get(0); name = (String) n.getAttribute(NodeKey.TEXT); symbol
= (Symbol) root.getAttribute(NodeKey.SYMBOL);
generator.setCurrentFuncName(name); if (name != null && !name.equals("main")) {
String declaration = name + emitArgs(symbol);
generator.emitDirective(Directive.METHOD_PUBBLIC_STATIC, declaration);
generator.setNameAndDeclaration(name, declaration); } Symbol args =
symbol.getArgList(); if (args == null || argsList == null ||
argsList.isEmpty()) { System.err.println("generate function with arg list but
arg list is null"); System.exit(1); } break;
创建结构体和数组

数组


创建结构体和数组的节点在DefGenerate里,可以看到在这里只处理了数组和普通变量,有关结构体的处理是在对结构体第一次使用的时候。顺便提一下代码生成对于赋初值操作是没有进行处理的。

* 如果是个数组,酒直接调用ProgramGenerator直接生成创建数组的指令
*
如果是个普通变量,就直接找到它并且赋值为0(这里变量在队列里的位置是根据符号表来计算的,具体可以看上一篇的getLocalVariableIndex方法)
public class DefGenerate extends BaseGenerate { @Override public Object
generate(AstNode root) { int production = (int)
root.getAttribute(NodeKey.PRODUCTION); ProgramGenerator generator =
ProgramGenerator.getInstance(); Symbol symbol = (Symbol)
root.getAttribute(NodeKey.SYMBOL); switch (production) { case
SyntaxProductionInit.Specifiers_DeclList_Semi_TO_Def: Declarator declarator =
symbol.getDeclarator(Declarator.ARRAY); if (declarator != null) { if
(symbol.getSpecifierByType(Specifier.STRUCTURE) == null) {
generator.createArray(symbol); } } else { int i =
generator.getLocalVariableIndex(symbol); generator.emit(Instruction.SIPUSH, ""
+ 0); generator.emit(Instruction.ISTORE, "" + i); } break; default: break; }
return root; } }
结构体

处理结构体定义的代码在UnaryNodeGenerate,也就是只有在使用到结构体定义时才会进行定义

* 先拿到当前UNARY的符号,如果instanceof
ArrayValueSetter就说明是一个结构体数组,就进入getStructSymbolFromStructArray方法创建一个结构体数组,并返回当前下标的结构体对象
* 设置当前结构体的作用域范围
* 对结构体作为类进行定义
* 然后对读取结构体的域
* 其实可以忽略指针部分,因为代码生成并没有对指针进行模拟 case
SyntaxProductionInit.Unary_StructOP_Name_TO_Unary: child =
root.getChildren().get(0); String fieldName = (String)
root.getAttribute(NodeKey.TEXT); Object object =
child.getAttribute(NodeKey.SYMBOL); boolean isStructArray = false; if (object
instanceof ArrayValueSetter) { symbol = getStructSymbolFromStructArray(object);
symbol.addValueSetter(object); isStructArray = true; } else { symbol = (Symbol)
child.getAttribute(NodeKey.SYMBOL); } if (isStructArray) { ArrayValueSetter vs
= (ArrayValueSetter) object; Symbol structArray = vs.getSymbol();
structArray.addScope(ProgramGenerator.getInstance().getCurrentFuncName()); }
else { symbol.addScope(ProgramGenerator.getInstance().getCurrentFuncName()); }
ProgramGenerator.getInstance().putStructToClassDeclaration(symbol); if
(isSymbolStructPointer(symbol)) { copyBetweenStructAndMem(symbol, false); }
Symbol args = symbol.getArgList(); while (args != null) { if
(args.getName().equals(fieldName)) { args.setStructParent(symbol); break; }
args = args.getNextSymbol(); } if (args == null) { System.err.println("access a
filed not in struct object!"); System.exit(1); } if (args.getValue() != null) {
ProgramGenerator.getInstance().readValueFromStructMember(symbol, args); }
root.setAttribute(NodeKey.SYMBOL, args); root.setAttribute(NodeKey.VALUE,
args.getValue()); if (isSymbolStructPointer(symbol)) {
checkValidPointer(symbol); structObjSymbol = symbol; monitorSymbol = args;
GenerateBrocasterImpl.getInstance().registerReceiverForAfterExe(this); } else {
structObjSymbol = null; } break;
一元操作节点

这个节点和在解释器的有很多相同,除了有对结构体的操作,其它的也是有非常重要的作用

* 像数字、字符串或者是变量和之前的操作都是把信息传递到父节点,交由父节点处理 case
SyntaxProductionInit.Number_TO_Unary: text = (String)
root.getAttribute(NodeKey.TEXT); boolean isFloat = text.indexOf('.') != -1; if
(isFloat) { value = Float.valueOf(text); root.setAttribute(NodeKey.VALUE,
value); } else { value = Integer.valueOf(text);
root.setAttribute(NodeKey.VALUE, value); } break; case
SyntaxProductionInit.Name_TO_Unary: symbol = (Symbol)
root.getAttribute(NodeKey.SYMBOL); if (symbol != null) {
root.setAttribute(NodeKey.VALUE, symbol.getValue());
root.setAttribute(NodeKey.TEXT, symbol.getName()); } break; case
SyntaxProductionInit.String_TO_Unary: text = (String)
root.getAttribute(NodeKey.TEXT); root.setAttribute(NodeKey.VALUE, text); break;
case SyntaxProductionInit.Unary_LB_Expr_RB_TO_Unary: child =
root.getChildren().get(0); symbol = (Symbol)
child.getAttribute(NodeKey.SYMBOL); child = root.getChildren().get(1); int
index = 0; if (child.getAttribute(NodeKey.VALUE) != null) { index = (Integer)
child.getAttribute(NodeKey.VALUE); } Object idxObj =
child.getAttribute(NodeKey.SYMBOL); try { Declarator declarator =
symbol.getDeclarator(Declarator.ARRAY); if (declarator != null) { Object val =
declarator.getElement((int) index); root.setAttribute(NodeKey.VALUE, val);
ArrayValueSetter setter; if (idxObj == null) { setter = new
ArrayValueSetter(symbol, index); } else { setter = new ArrayValueSetter(symbol,
idxObj); } root.setAttribute(NodeKey.SYMBOL, setter);
root.setAttribute(NodeKey.TEXT, symbol.getName()); } Declarator pointer =
symbol.getDeclarator(Declarator.POINTER); if (pointer != null) {
setPointerValue(root, symbol, index); PointerValueSetter pv = new
PointerValueSetter(symbol, index); root.setAttribute(NodeKey.SYMBOL, pv);
root.setAttribute(NodeKey.TEXT, symbol.getName()); } } catch (Exception e) {
e.printStackTrace(); System.exit(1); } break;
赋值操作

* 如果当前是一个数组,先拿到它的符号和下标
* 如果不是结构体数组,那么拿到下标直接用readArrayElement生成读取数组元素的指令
* 如果是一个符号则用getLocalVariableIndex读取这个符号的值
* 如果是一个常数,则直接生成IPUSH指令
* 最后进行赋值操作,如果不是对结构体的域进行赋值就直接用getLocalVariableIndex拿到队列位置然后生成ISTORE
*
如果是对结构体数组的元素的域的赋值,就调用assignValueToStructMemberFromArray生成代码,如果只是结构体就直接调用assignValueToStructMember生成代码
ProgramGenerator generator = ProgramGenerator.getInstance(); if
(BaseGenerate.resultOnStack) { this.value = obj; BaseGenerate.resultOnStack =
false; } else if (obj instanceof ArrayValueSetter) { ArrayValueSetter setter =
(ArrayValueSetter) obj; Symbol symbol = setter.getSymbol(); Object index =
setter.getIndex(); if (symbol.getSpecifierByType(Specifier.STRUCTURE) == null)
{ if (index instanceof Symbol) {
ProgramGenerator.getInstance().readArrayElement(symbol, index); if (((Symbol)
index).getValue() != null) { int i = (int) ((Symbol) index).getValue(); try {
this.value = symbol.getDeclarator(Declarator.ARRAY).getElement(i); } catch
(Exception e) { e.printStackTrace(); } } } else { int i = (int) index; try {
this.value = symbol.getDeclarator(Declarator.ARRAY).getElement(i); } catch
(Exception e) { e.printStackTrace(); }
ProgramGenerator.getInstance().readArrayElement(symbol, index); } } } else if
(obj instanceof Symbol) { Symbol symbol = (Symbol) obj; this.value =
symbol.value; int i = generator.getLocalVariableIndex(symbol);
generator.emit(Instruction.ILOAD, "" + i); } else if (obj instanceof Integer) {
Integer val = (Integer) obj; generator.emit(Instruction.SIPUSH, "" + val);
this.value = obj; } if (!this.isStructMember()) { int idx =
generator.getLocalVariableIndex(this); if (!generator.isPassingArguments()) {
generator.emit(Instruction.ISTORE, "" + idx); } } else { if
(this.getStructSymbol().getValueSetter() != null) {
generator.assignValueToStructMemberFromArray(this.getStructSymbol().getValueSetter(),
this, this.value); } else {
generator.assignValueToStructMember(this.getStructSymbol(), this, this.value);
} }
最后

完成这部分后,对下面的代码
void quicksort(int A[10], int p, int r) { int x; int i; i = p - 1; int j; int
t; int v; v = r - 1; if (p < r) { x = A[r]; for (j = p; j <= v; j++) { if (A[j]
<= x) { i++; t = A[i]; A[i] = A[j]; A[j] = t; } } v = i + 1; t = A[v]; A[v] =
A[r]; A[r] = t; t = v - 1; quicksort(A, p, t); t = v + 1; quicksort(A, t, r); }
} void main () { int a[10]; int i; int t; printf("before quick sort:"); for(i =
0; i < 10; i++) { t = (10 - i); a[i] = t; printf("value of a[%d] is %d", i,
a[i]); } quicksort(a, 0, 9); printf("after quick sort:"); for (i = 0; i < 10;
i++) { printf("value of a[%d] is %d", i, a[i]); } }
则会生成下面的Java字节码
.class public C2Bytecode .super java/lang/Object .method public static
main([Ljava/lang/String;)V sipush 10 newarray int astore 0 sipush 0 istore 1
sipush 0 istore 2 getstatic java/lang/System/out Ljava/io/PrintStream; ldc
"before quick sort:" invokevirtual
java/io/PrintStream/print(Ljava/lang/String;)V getstatic java/lang/System/out
Ljava/io/PrintStream; ldc " " invokevirtual
java/io/PrintStream/print(Ljava/lang/String;)V sipush 0 istore 1 loop0: iload 1
sipush 10 if_icmpge branch0 sipush 10 iload 1 isub istore 2 aload 0 iload 1
iload 2 iastore aload 0 iload 1 iaload istore 3 iload 1 istore 4 getstatic
java/lang/System/out Ljava/io/PrintStream; ldc "value of a[" invokevirtual
java/io/PrintStream/print(Ljava/lang/String;)V getstatic java/lang/System/out
Ljava/io/PrintStream; iload 4 invokevirtual java/io/PrintStream/print(I)V
getstatic java/lang/System/out Ljava/io/PrintStream; ldc "] is " invokevirtual
java/io/PrintStream/print(Ljava/lang/String;)V getstatic java/lang/System/out
Ljava/io/PrintStream; iload 3 invokevirtual java/io/PrintStream/print(I)V
getstatic java/lang/System/out Ljava/io/PrintStream; ldc " " invokevirtual
java/io/PrintStream/print(Ljava/lang/String;)V iload 1 sipush 1 iadd istore 1
goto loop0 branch0: aload 0 sipush 0 sipush 9 invokestatic
C2Bytecode/quicksort([III)V getstatic java/lang/System/out
Ljava/io/PrintStream; ldc "after quick sort:" invokevirtual
java/io/PrintStream/print(Ljava/lang/String;)V getstatic java/lang/System/out
Ljava/io/PrintStream; ldc " " invokevirtual
java/io/PrintStream/print(Ljava/lang/String;)V sipush 0 istore 1 loop2: iload 1
sipush 10 if_icmpge branch4 aload 0 iload 1 iaload istore 3 iload 1 istore 4
getstatic java/lang/System/out Ljava/io/PrintStream; ldc "value of a["
invokevirtual java/io/PrintStream/print(Ljava/lang/String;)V getstatic
java/lang/System/out Ljava/io/PrintStream; iload 4 invokevirtual
java/io/PrintStream/print(I)V getstatic java/lang/System/out
Ljava/io/PrintStream; ldc "] is " invokevirtual
java/io/PrintStream/print(Ljava/lang/String;)V getstatic java/lang/System/out
Ljava/io/PrintStream; iload 3 invokevirtual java/io/PrintStream/print(I)V
getstatic java/lang/System/out Ljava/io/PrintStream; ldc " " invokevirtual
java/io/PrintStream/print(Ljava/lang/String;)V iload 1 sipush 1 iadd istore 1
goto loop2 branch4: return .end method .method public static quicksort([III)V
sipush 2 newarray int astore 6 sipush 0 istore 5 sipush 1 istore 5 aload 6
iload 5 sipush 1 iastore aload 6 sipush 1 iaload istore 10 getstatic
java/lang/System/out Ljava/io/PrintStream; ldc "before quick sort: "
invokevirtual java/io/PrintStream/print(Ljava/lang/String;)V getstatic
java/lang/System/out Ljava/io/PrintStream; iload 10 invokevirtual
java/io/PrintStream/print(I)V getstatic java/lang/System/out
Ljava/io/PrintStream; ldc " " invokevirtual
java/io/PrintStream/print(Ljava/lang/String;)V sipush 0 istore 9 sipush 0
istore 3 iload 1 sipush 1 isub istore 3 sipush 0 istore 4 sipush 0 istore 7
sipush 0 istore 8 iload 2 sipush 1 isub istore 8 iload 1 iload 2 if_icmpge
branch1 aload 0 iload 2 iaload istore 9 iload 1 istore 4 loop1: iload 4 iload 8
if_icmpgt ibranch1 aload 0 iload 4 iaload iload 9 if_icmpgt ibranch2 iload 3
sipush 1 iadd istore 3 aload 0 iload 3 iaload istore 7 aload 0 iload 3 aload 0
iload 4 iaload iastore aload 0 iload 4 iload 7 iastore ibranch2: iload 4 sipush
1 iadd istore 4 goto loop1 ibranch1: iload 3 sipush 1 iadd istore 8 aload 0
iload 8 iaload istore 7 aload 0 iload 8 aload 0 iload 2 iaload iastore aload 0
iload 2 iload 7 iastore iload 8 sipush 1 isub istore 7 aload 0 iload 1 iload 7
invokestatic C2Bytecode/quicksort([III)V iload 8 sipush 1 iadd istore 7 aload 0
iload 7 iload 2 invokestatic C2Bytecode/quicksort([III)V branch1: return .end
method .end class
小结

这篇的代码生成和之前解释器的思路很相似,都是根据AST和对应的产生式来执行或者生成代码。

其实主要的思路是很清晰的,只是其中有太多细节容易让人太过纠结。这个系列算作是我自己的学习笔记,到这也有十三篇了,下一篇可能写写总结就正式结束了。

欢迎Star!