In this assignment you will write the final pass of our Decaf compiler - it traverses the syntax tree and emits Jasmin code.
For this assignment, you will be downloading a compressed "tar" (tape-archive) file which contains all the files you will need. Keep all the files in a single directory.
Once you've downloaded the compressed tar file, execute the following commands:
% gunzip pa5.tar.gz
% tar xf pa5.tar
This will create a directory called pa5 and will put in it all the files you need. You can then remove the pa5.tar file.
This is the final stage of our Decaf compiler. In the last assignment you added a check() method to each kind of Decaf node. In this one you will add a code() method. That method should print out Jasmin code that performs whatever is appropriate for that kind of node. For example, an arithmetic expression pushes its left and right sub-expressions onto the runtime stack, and then performs the appropriate operation via an opcode like iadd or imul. An assignment statement pushes its right-hand side onto the stack, then stores it into some local variable via the opcode istore.
The code() methods will often be pretty short, recursively calling the code() methods of their own members, and then adding a bit more code of their own. This is very similar to how things were done in the checking phase. What matters is that you understand exactly what sort of Jasmin instructions you need to use to express the semantics you have in mind for a given Decaf instruction. The next sections will give you some tips on what to do.
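The recursive pattern described above can be sketched as follows. This is only an illustration under stated assumptions: the classes Expr, IntLit, LocalVar, BinOp and Assign are hypothetical stand-ins, not the node classes shipped in pa5, and the sketch collects instructions in a list instead of printing them so the result is easy to inspect.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified node classes (NOT the pa5 classes).
abstract class Expr {
    // The assignment's code() prints to System.out; this sketch appends to a list.
    abstract void code(List<String> out);
}

class IntLit extends Expr {
    final int value;
    IntLit(int value) { this.value = value; }
    void code(List<String> out) { out.add("ldc " + value); }
}

class LocalVar extends Expr {
    final int localNum;
    LocalVar(int localNum) { this.localNum = localNum; }
    void code(List<String> out) { out.add("iload " + localNum); }
}

class BinOp extends Expr {
    final Expr left, right;
    final String opcode; // e.g. "iadd", "isub", "imul"
    BinOp(Expr left, String opcode, Expr right) {
        this.left = left;
        this.opcode = opcode;
        this.right = right;
    }
    void code(List<String> out) {
        left.code(out);   // push the left operand
        right.code(out);  // push the right operand
        out.add(opcode);  // pop both, push the result
    }
}

class Assign {
    final int targetLocal;
    final Expr rhs;
    Assign(int targetLocal, Expr rhs) { this.targetLocal = targetLocal; this.rhs = rhs; }
    void code(List<String> out) {
        rhs.code(out);                    // push the right-hand side
        out.add("istore " + targetLocal); // pop it into the local variable
    }
}

class ArithDemo {
    // Emits code for: local2 = local0 + local1 * 3
    static List<String> emit() {
        List<String> out = new ArrayList<>();
        Expr product = new BinOp(new LocalVar(1), "imul", new IntLit(3));
        new Assign(2, new BinOp(new LocalVar(0), "iadd", product)).code(out);
        return out;
    }

    public static void main(String[] args) {
        emit().forEach(line -> System.out.println("    " + line));
    }
}
```

Each node only knows how to emit its own opcode; the ordering of the operands on the stack falls out of the order of the recursive calls.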
Download the tar file and un-tar it as described above. Now execute the following instructions:
% javac *.java
% java Main Empty.nocaf
You should get an output like this:
.source Empty.nocaf
.class Empty
.super java/lang/Object
.method static nothing()I
    .limit stack 16 ; (Hack!)
    .limit locals 0
    iconst_0
    ireturn
.end method
That's the Jasmin source for the Empty.nocaf program. Now we will convert it into a real Java classfile. Execute the following instructions:
% java Main Empty.nocaf > Empty.jasmin
% setenv CLASSPATH ".:/fs/cs-cls/cs160/lib/jasmin"
% java jasmin.Main Empty.jasmin
The Jasmin assembler has converted the JVM assembly code in Empty.jasmin to JVM bytecode and generated a class file. Now you have a Java classfile named Empty.class. How do you use it? Since Decaf doesn't support the main() method, we can't see the results of executing compiled Decaf code unless we call it from regular Java code. Open the file Caller.java and uncomment the one line of code there. Now execute the following instructions:
% javac Caller.java
% java Caller
This will call the nothing() method of the Empty class and print the result. This is how you will call methods of Decaf classes you write.
How does this all happen? The code in Main.java has just two new lines:
Scanner S = new Scanner(args[0]);
Lexer L = new Lexer(S);
Parser P = new Parser(L);
ClassDecl CD = P.parseClassDecl();
CD.check();
Coder.filename = args[0]; // This is new.
CD.code(); // This is new.
So, as with the checking phase, you have to follow the action down through the code() method of ClassDecl and see what it does. Aside from some administrative stuff, it calls the code() methods of various other things. Continuing this way you will eventually look at each node class and fill in its code() method.
The JVM + Jasmin virtual machine is a very straightforward low-level environment. It shouldn't be too hard to figure out what code to emit for most sorts of statements and declarations. The tricky cases will be equality expressions, if statements and while statements. Some basic things to keep in mind when writing your coding methods:
Every node class has a code() method defined, but many of them need finishing. In every method body are comments telling you what you need to do there.
The list classes (FormalParams, MethodDecls, etc.) have already implemented the code() method to just pass on the call to each of the elements in their list. You don't need to add anything.
You will need local variable numbers and jump label numbers as you emit code. The class Coder (described below) keeps track of these and provides some accessors for you.
If you are unsure what code to emit for some construct, write the equivalent Java, compile it with javac, and then decompile it to Jasmin and take a look.
Subclasses of Expression have a new field called decl in them. This field is set by the checking code to refer back to the node in the syntax tree where the type of the expression was declared. You'll need this info when emitting code. See the section Interfaces below for more about this.
This is a bit reminiscent of how we needed to retrieve the declaration of a method when we were checking a call expression. We did that by using the symbol table. But after the checking phase the table is empty. So while we go through the checking, we attach to expressions this link back to where they were declared.
Statement defines a method called numLocals(), which returns the value zero. This method is supposed to return the number of local variables declared in a statement. This has meaning for a local variable declaration itself, and for a block. You will need to override the method for those classes to return the correct number.
The .limit stack directive is hardwired in MethodDecl.code() to use the value 16. This is definitely a hack for the sample code. Another limit you need to set is the local variable space. Your final code generator should analyze the code for each method, predict how much stack and local variable space is needed, and set up proper limits.
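One possible shape for the numLocals() overrides, and for deriving the .limit locals value from them, is sketched below. The class names (Stmt, ExprStmt, VarDecl, BlockStmt, LimitsDemo) are hypothetical stand-ins for the pa5 classes. The sketch also assumes the simplest allocation policy, in which every declaration gets its own slot and slots are never reused across nested blocks, and that every variable is an int occupying one slot.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for the assignment's Statement hierarchy.
abstract class Stmt {
    // Default: a statement declares no local variables.
    int numLocals() { return 0; }
}

class ExprStmt extends Stmt { }

class VarDecl extends Stmt {
    // A local variable declaration declares exactly one variable.
    @Override int numLocals() { return 1; }
}

class BlockStmt extends Stmt {
    final List<Stmt> body = new ArrayList<>();

    BlockStmt add(Stmt s) { body.add(s); return this; }

    // A block declares whatever its statements declare, nested blocks included.
    @Override int numLocals() {
        int total = 0;
        for (Stmt s : body) total += s.numLocals();
        return total;
    }
}

class LimitsDemo {
    // In a static method the parameters occupy the first local slots,
    // so the limit is parameters plus declared locals.
    static String limitLocals(int numParams, Stmt body) {
        return ".limit locals " + (numParams + body.numLocals());
    }

    // { int a; expr; { int b; int c; } }
    static BlockStmt sample() {
        BlockStmt inner = new BlockStmt().add(new VarDecl()).add(new VarDecl());
        return new BlockStmt().add(new VarDecl()).add(new ExprStmt()).add(inner);
    }

    public static void main(String[] args) {
        System.out.println(limitLocals(2, sample()));
    }
}
```

A stack-depth estimate for .limit stack could be computed with a similar recursive method on expressions, but that is left out of this sketch.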
There is just one helper class provided here; following the pattern in the previous assignment it's called Coder. This class keeps a few global values that are inconvenient to store anywhere else. In particular it keeps track of the local variable number and the label number.
The JVM can support up to 65,535 local variables in a given method. It refers to them by number. From your perspective, this is like having 65,535 different registers. How do we connect a variable with its number? Each variable is declared exactly once, so we attach the variable number to its declaration. This is done in the code() method for the declaration: that method asks the Coder what the next unused variable number is, and uses that. A lengthier discussion of how this local variable number is stored with the declaration is given below in the section Interfaces.
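The numbering scheme might look roughly like the sketch below. SlotCounter is a hypothetical stand-in for the part of Coder that hands out variable numbers; the real Coder's method names are not shown in this handout, so nextLocal() and reset() are assumptions, as are the ParamNode and VarNode classes.

```java
// Hypothetical stand-in for the local-variable counter kept by Coder.
class SlotCounter {
    private static int next = 0;

    // Assumed to be called at the start of each method declaration.
    static void reset() { next = 0; }

    // Returns the next unused local variable number.
    static int nextLocal() { return next++; }
}

// Hypothetical declaration nodes; each remembers the number it was given.
class ParamNode {
    final String name;
    int localNum = -1;
    ParamNode(String name) { this.name = name; }
    void code() { localNum = SlotCounter.nextLocal(); }
}

class VarNode {
    final String name;
    int localNum = -1;
    VarNode(String name) { this.name = name; }
    void code() { localNum = SlotCounter.nextLocal(); }
}

class SlotDemo {
    // Models a method f(a, b) whose body declares one local x.
    static int[] assign() {
        SlotCounter.reset();
        ParamNode a = new ParamNode("a");
        ParamNode b = new ParamNode("b");
        VarNode x = new VarNode("x");
        a.code(); // parameters are numbered first, in order
        b.code();
        x.code(); // then locals, in declaration order
        return new int[] { a.localNum, b.localNum, x.localNum };
    }

    public static void main(String[] args) {
        for (int n : assign()) System.out.println(n);
    }
}
```

Numbering parameters before locals matters because invokestatic places the arguments in local slots starting at 0 of the called static method.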
In Jasmin you can use any string to denote a jump label. The class Coder provides a method that makes it easier to create jump labels that are unique, by returning the next unused jump number. You can create a label by just concatenating that number to the string "Label" (or whatever other string you want).
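As an illustration of how unique labels get used, here is a sketch of the control-flow skeletons for while and if statements. LabelCounter is a hypothetical stand-in for the label counter in Coder, and FlowEmitter is an invented helper; the sketch also assumes that a condition leaves 0 (false) or a nonzero value (true) on the stack, which this handout does not spell out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the jump-label counter kept by Coder.
class LabelCounter {
    private static int next = 0;
    static void reset() { next = 0; }
    static String fresh() { return "Label" + (next++); }
}

class FlowEmitter {
    // cond, body, thenPart and elsePart are already-generated instruction lists.
    static List<String> emitWhile(List<String> cond, List<String> body) {
        String top = LabelCounter.fresh();
        String exit = LabelCounter.fresh();
        List<String> out = new ArrayList<>();
        out.add(top + ":");        // loop entry, re-tested on every iteration
        out.addAll(cond);          // assumed to push 0 for false, nonzero for true
        out.add("ifeq " + exit);   // condition false: leave the loop
        out.addAll(body);
        out.add("goto " + top);
        out.add(exit + ":");
        return out;
    }

    static List<String> emitIf(List<String> cond, List<String> thenPart, List<String> elsePart) {
        String otherwise = LabelCounter.fresh();
        String end = LabelCounter.fresh();
        List<String> out = new ArrayList<>();
        out.addAll(cond);
        out.add("ifeq " + otherwise); // condition false: skip the then-part
        out.addAll(thenPart);
        out.add("goto " + end);
        out.add(otherwise + ":");
        out.addAll(elsePart);
        out.add(end + ":");
        return out;
    }
}

class FlowDemo {
    public static void main(String[] args) {
        List<String> cond = new ArrayList<>();
        cond.add("iload 0");
        List<String> body = new ArrayList<>();
        body.add("iload 0");
        body.add("iconst_1");
        body.add("isub");
        body.add("istore 0");
        // while (local0 != 0) local0 = local0 - 1;
        for (String line : FlowEmitter.emitWhile(cond, body)) System.out.println(line);
    }
}
```

Because every call to fresh() returns a new name, nested loops and conditionals never collide on labels.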
These are the JVM opcodes you will need to use to write your Decaf compiler. Longer descriptions of these can be found at http://www.mrl.nyu.edu/meyer/jvmref/.
iadd - Pop top two elements, push sum.
isub - Difference.
imul - Product.
idiv - Integer quotient.
irem - Remainder (mod).
iand - Bitwise and.
ior - Bitwise or.
But ... no ieq or ine! You'll have to use if_icmpne and if_icmpeq to get the same effect.
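One way to get the effect of the missing ieq opcode is to branch on the comparison and push 1 or 0 by hand. The sketch below shows that pattern; EqEmitter is a hypothetical helper, the two labels are passed in as plain strings (in the assignment they would come from Coder), and representing true and false as 1 and 0 is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class EqEmitter {
    // left and right are the already-generated instructions for the two operands.
    static List<String> emitEquals(List<String> left, List<String> right,
                                   String falseLabel, String endLabel) {
        List<String> out = new ArrayList<>();
        out.addAll(left);                        // push left operand
        out.addAll(right);                       // push right operand
        out.add("if_icmpne " + falseLabel);      // pops both; jumps if they differ
        out.add("iconst_1");                     // equal: push true (assumed to be 1)
        out.add("goto " + endLabel);
        out.add(falseLabel + ":");
        out.add("iconst_0");                     // not equal: push false (assumed to be 0)
        out.add(endLabel + ":");                 // either way, one int is on the stack
        return out;
    }

    public static void main(String[] args) {
        // local0 == 5
        List<String> code = emitEquals(Arrays.asList("iload 0"), Arrays.asList("ldc 5"),
                                       "Label0", "Label1");
        for (String line : code) System.out.println(line);
    }
}
```

An inequality test is the same skeleton with if_icmpeq in place of if_icmpne. Both paths reach the end label with exactly one value on the stack, which the JVM verifier requires.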
ldc # - Push single-word constant.
iload # - Push local variable #.
istore # - Pop top element, store in local #.
The previous three instructions take two bytes. If the operands of the instructions are small numbers, there are these special-case opcodes that take only one byte.
iconst_# - Push 0, 1, 2, 3, 4, or 5.
iload_# - Push local variable 0, 1, 2, or 3.
istore_# - Pop top element, store in local 0, 1, 2, or 3.
goto <label> - Jump to <label>.
ifeq <label> - Pop top element, jump to <label> if it == 0.
ifne <label> - Pop top element, jump to <label> if it != 0.
if_icmpne <label> - Pop top two elements, jump to <label> if they are !=.
if_icmpeq <label> - Pop top two elements, jump to <label> if they are ==.
invokestatic <method> - Call static method.
ireturn - Return, top element is integer.
An interesting design problem arises, due to the similarity of formal parameter declarations and local variable declarations. In fact, they're practically identical: each of them declares the type and name of a variable that is local to a method's scope. But they have different roles in the formal syntax of the language, and the inheritance hierarchy of their Node types reflects this:
Node
    Listable
        FormalParam
        ... etc. ...
        Statement
            LocalVarDecl
            ... etc. ...
It's clear that FormalParam is a kind of Listable, and that LocalVarDecl is a kind of Statement, so we don't want to give that up. But...
Every local variable in a method has its own unique number. The logical place to store this number is in the declaration, since there's exactly one of those, and possibly many uses of the variable in expressions. OK, so we add a field called localNum to both the classes LocalVarDecl and FormalParam. Up to now we've had to use these two classes in different contexts, so the fact that they had identical structures that had to be written in parallel instead of shared was just an inconvenience. But as we'll see below, things become considerably more annoying now. Interfaces offer us a way out.
Next we add to the expression node Id a reference that points back to its declaration, and set that reference during the checking pass. Now, how do we declare that? At this stage, the best we could do is this:
class Id extends Expression {
    public String spelling;
    public Listable decl; // A LocalVarDecl or FormalParam.
    ...
}
So what's the problem? Well, when we get to the code() method, we need to do something like this:
public void code() {
    System.out.print(" iload ");
    if (decl instanceof LocalVarDecl)
        System.out.println(((LocalVarDecl)decl).localNum);
    else if (decl instanceof FormalParam)
        System.out.println(((FormalParam)decl).localNum);
    else
        throw new IllegalStateException("unexpected declaration type");
}
Well, this sort of thing is what OO is supposed to help us avoid, isn't it? For various good reasons, Java lets a class inherit (extend) from only one parent class. Inheritance means you get both interface and implementation from your parent. Not just the family name, but the money too. But a class can also take on just an interface, with no implementation attached, from a Java interface, and it can do that for any number of interfaces. What we do here is this:
interface LocalDecl {
    int localNum();
}
class FormalParam extends Listable implements LocalDecl {
    private int localNum;
    public int localNum() { return localNum; }
    ...
}
class LocalVarDecl extends Statement implements LocalDecl {
    // similar
}
So we've created a sort of new superclass, just for FormalParam and LocalVarDecl, which specifies that they have to provide a certain interface, namely a method called localNum(). What they actually do with this method is their business (which is to say, your business when you write the code). In fact they do the obvious thing.
Now we can be more expressive in declaring to the Java compiler what sort of field this is that we're adding to Id:
class Id extends Expression {
    public String spelling;
    public LocalDecl decl; // Has a localNum() method.
    ...
}
And this in turn lets us be more succinct in the Id.code() method, while still letting the Java compiler be sure it's type-safe:
System.out.println(" iload " + decl.localNum());
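Putting the fragments above together, a compilable version of the interface approach might look like the sketch below. The classes are cut down to the bare minimum: Listable, Statement and Expression are empty stand-ins for the real pa5 classes, the constructors that set localNum directly are an assumption made for the demo, and codeLine() returns the instruction as a string instead of printing it.

```java
// The shared capability: anything that owns a local variable number.
interface LocalDecl {
    int localNum();
}

// Empty stand-ins; the real classes carry the rest of the hierarchy.
class Listable { }
class Statement { }
class Expression { }

class FormalParam extends Listable implements LocalDecl {
    private final int localNum;
    FormalParam(int localNum) { this.localNum = localNum; } // assumed constructor
    public int localNum() { return localNum; }
}

class LocalVarDecl extends Statement implements LocalDecl {
    private final int localNum;
    LocalVarDecl(int localNum) { this.localNum = localNum; } // assumed constructor
    public int localNum() { return localNum; }
}

class Id extends Expression {
    public String spelling;
    public LocalDecl decl; // Has a localNum() method.

    Id(String spelling, LocalDecl decl) {
        this.spelling = spelling;
        this.decl = decl;
    }

    // No instanceof tests and no casts: both declaration kinds look alike here.
    String codeLine() {
        return " iload " + decl.localNum();
    }
}

class InterfaceDemo {
    public static void main(String[] args) {
        System.out.println(new Id("n", new FormalParam(0)).codeLine());
        System.out.println(new Id("x", new LocalVarDecl(3)).codeLine());
    }
}
```

Id never needs to know which of the two declaration classes it is pointing at, so adding a third kind of declaration later would not touch Id at all.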