Programming Assignment 5 (Due on March 13)

In this assignment you will write the final pass of our Decaf compiler - it traverses the syntax tree and emits Jasmin code.

Files

For this assignment, you will be downloading a compressed "tar" (tape-archive) file which contains all the files you will need. Keep all the files in a single directory.

Once you've downloaded the compressed tar file, execute the following commands:

This will create a directory called pa5 and will put in it all the files you need. You can then remove the pa5.tar file.


The Problem

This is the final stage of our Decaf compiler. In the last assignment you added a check() method to each kind of Decaf node. In this one you will add a code() method. That method should print out Jasmin code that performs whatever is appropriate for that kind of node. For example, an arithmetic expression emits code that pushes the values of its left and right sub-expressions onto the runtime stack, and then performs the appropriate operation via an opcode like iadd or imul. An assignment statement emits code that pushes the value of its right-hand side onto the stack, then stores it into a local variable via the opcode istore.
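The recursive pattern described above might look roughly like the following sketch. All class names here (ExprSketch, BinarySketch, AssignSketch and so on) are hypothetical stand-ins, not the node classes shipped with the assignment, and the sketch appends to a StringBuilder instead of printing so that the emitted text is easy to inspect.

```java
// Hypothetical node classes for illustration only; the assignment's real
// node classes have different names and fields.
abstract class ExprSketch {
    // Appends Jasmin code that leaves this expression's value on the stack.
    abstract void code(StringBuilder out);
}

// An integer literal pushes its value with ldc.
class IntLitSketch extends ExprSketch {
    private final int value;
    IntLitSketch(int value) { this.value = value; }
    @Override void code(StringBuilder out) {
        out.append("    ldc ").append(value).append('\n');
    }
}

// A use of a local variable loads it from its numbered slot with iload.
class VarUseSketch extends ExprSketch {
    private final int slot;
    VarUseSketch(int slot) { this.slot = slot; }
    @Override void code(StringBuilder out) {
        out.append("    iload ").append(slot).append('\n');
    }
}

// A binary arithmetic node: left operand, right operand, then the opcode.
class BinarySketch extends ExprSketch {
    private final ExprSketch left;
    private final ExprSketch right;
    private final String opcode; // e.g. "iadd" or "imul"
    BinarySketch(ExprSketch left, String opcode, ExprSketch right) {
        this.left = left;
        this.opcode = opcode;
        this.right = right;
    }
    @Override void code(StringBuilder out) {
        left.code(out);
        right.code(out);
        out.append("    ").append(opcode).append('\n');
    }
}

// An assignment evaluates its right-hand side, then stores into a slot.
class AssignSketch {
    private final int slot;
    private final ExprSketch rhs;
    AssignSketch(int slot, ExprSketch rhs) { this.slot = slot; this.rhs = rhs; }
    void code(StringBuilder out) {
        rhs.code(out);
        out.append("    istore ").append(slot).append('\n');
    }
}

class ArithmeticDemo {
    // Emits code for "b = a + 3 * 4", assuming a is in slot 1 and b in slot 2.
    static String emit() {
        ExprSketch product = new BinarySketch(new IntLitSketch(3), "imul", new IntLitSketch(4));
        ExprSketch sum = new BinarySketch(new VarUseSketch(1), "iadd", product);
        StringBuilder out = new StringBuilder();
        new AssignSketch(2, sum).code(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(emit());
    }
}
```

Each code() method only knows about its own node: it delegates to its children and then appends one instruction of its own, which is why most of them stay short.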

The code() methods will often be pretty short, recursively calling the code() methods of their own members, and then adding a bit more code of their own. This is very similar to how things were done in the checking phase. What matters is that you understand exactly what sort of Jasmin instructions you need to use to express the semantics you have in mind for a given Decaf instruction. The next sections will give you some tips on what to do.


Getting Started

Download the tar file and un-tar it as described above. Now execute the following instructions:

You should get an output like this:

That's the Jasmin source for the Empty.nocaf program. Now we will convert it into a real Java classfile. Execute the following instructions:

The Jasmin assembler converted the JVM assembly code in Empty.jasmin to JVM bytecode, so you now have a Java classfile named Empty.class. How do you use it? Since Decaf doesn't support the main() method, we can't see the results of executing compiled Decaf code unless we call it from regular Java code. Open the file Caller.java and uncomment the one line of code there. Now execute the following instructions:

This will call the nothing() method of the Empty class and print the result. This is how you will call methods of Decaf classes you write.


What Next?

How does this all happen? The code in Main.java has just two new lines:

So, as with the checking phase, you have to follow the action down through the code() method of ClassDecl and see what it does. Aside from some administrative stuff, it calls the code() methods of various other things. Continuing this way you will eventually look at each node class and fill in its code() method.

The JVM + Jasmin virtual machine is a very straightforward low-level environment. It shouldn't be too hard to figure out what code to emit for most sorts of statements and declarations. The tricky cases will be equality expressions, if statements and while statements. Some basic things to keep in mind when writing your coding methods:


Boolean type

The JVM has no instructions dedicated to boolean values, so we will encode the boolean values true and false as integers: 1 for true and 0 for false. If the return type of a method is boolean, it is OK if the code you generate returns 0 when the result is false and 1 when the result is true.
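As a small illustration of this encoding, here is a sketch of what a boolean literal and a boolean return could emit. BooleanSketch and its methods are hypothetical helpers, not part of the provided files; iconst_0, iconst_1 and ireturn are standard JVM opcodes.

```java
// Hypothetical helper for illustration; not part of the assignment's files.
class BooleanSketch {
    // Jasmin code that pushes a boolean literal, encoded as the int 1 or 0.
    static String literal(boolean value) {
        return value ? "    iconst_1\n" : "    iconst_0\n";
    }

    // Jasmin code for returning a boolean literal: push the int, then
    // return it with the ordinary integer return instruction.
    static String returnLiteral(boolean value) {
        return literal(value) + "    ireturn\n";
    }

    public static void main(String[] args) {
        System.out.print(returnLiteral(true));
    }
}
```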

Helper classes

There is just one helper class provided here; following the pattern in the previous assignment it's called Coder. This class keeps a few global values that are inconvenient to store anywhere else. In particular it keeps track of the local variable number and the label number.

The JVM can support up to 65,535 local variables in a given method, and it refers to them by number. From your perspective, this is like having 65,535 different registers. How do we connect a variable with its number? Each variable is declared exactly once, so we attach the variable number to its declaration. This is done in the code() method for the declaration: that method asks the Coder what the next unused variable number is, and uses that. A lengthier discussion of how this local variable number is stored with the declaration is given below in the section Interfaces.
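The numbering scheme just described could be sketched as follows. CoderSketch, beginMethod() and newLocal() are made-up names for illustration; the real Coder class in the tar file may expose this differently. The starting slot is passed in as a parameter because it depends on the method: in an instance method the JVM reserves slot 0 for this.

```java
// Hypothetical stand-in for the provided Coder class; the real method
// names and signatures may differ.
class CoderSketch {
    private static int nextLocal = 0;

    // Called at the start of each method body to reset the counter.
    static void beginMethod(int firstFreeSlot) {
        nextLocal = firstFreeSlot;
    }

    // Returns the next unused local variable number and advances the counter.
    static int newLocal() {
        return nextLocal++;
    }
}

// Hypothetical declaration node: it claims its number when code() runs.
class VarDeclSketch {
    final String name;
    private int localNum = -1; // unassigned until code() is called

    VarDeclSketch(String name) { this.name = name; }

    void code() {
        localNum = CoderSketch.newLocal();
    }

    int localNum() { return localNum; }
}

class LocalNumberDemo {
    public static void main(String[] args) {
        CoderSketch.beginMethod(1); // assumes slot 0 is taken by "this"
        VarDeclSketch a = new VarDeclSketch("a");
        VarDeclSketch b = new VarDeclSketch("b");
        a.code();
        b.code();
        System.out.println(a.name + " -> " + a.localNum());
        System.out.println(b.name + " -> " + b.localNum());
    }
}
```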

In Jasmin you can use any string to denote a jump label. The class Coder provides a method that makes it easier to create jump labels that are unique, by returning the next unused jump number. You can create a label by just concatenating that number to the string "Label" (or whatever other string you want).
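To make the label idea concrete, here is a sketch of a label generator and of how a while statement might use two fresh labels. LabelMakerSketch and WhileSketch are hypothetical names, and the loop layout shown is one common arrangement, not necessarily the one expected by the assignment. It relies on the standard opcodes ifeq (branch if the top of the stack is 0, i.e. false) and goto.

```java
// Hypothetical label generator; the provided Coder class plays this role
// in the assignment, possibly with a different interface.
class LabelMakerSketch {
    private int next = 0;

    // Returns a label that has not been handed out before.
    String newLabel() {
        return "Label" + (next++);
    }
}

class WhileSketch {
    // conditionCode must leave 1 (true) or 0 (false) on the stack;
    // bodyCode is the already-generated code for the loop body.
    static String code(LabelMakerSketch labels, String conditionCode, String bodyCode) {
        String top = labels.newLabel();
        String exit = labels.newLabel();
        return top + ":\n"
             + conditionCode
             + "    ifeq " + exit + "\n"   // condition false: leave the loop
             + bodyCode
             + "    goto " + top + "\n"    // re-test the condition
             + exit + ":\n";
    }

    public static void main(String[] args) {
        // Roughly "while (flag) { flag = false; }" with flag in slot 1.
        LabelMakerSketch labels = new LabelMakerSketch();
        System.out.print(code(labels, "    iload 1\n", "    iconst_0\n    istore 1\n"));
    }
}
```

Because every call to newLabel() yields a different name, nested or consecutive loops never clash over labels.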


JVM Opcodes

These are the JVM opcodes you will need to use to write your Decaf compiler. Longer descriptions of these can be found at http://www.mrl.nyu.edu/meyer/jvmref/.

Operations

But ... no ieq or ine! You'll have to use if_icmpne and if_icmpeq to get the same effect.
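One way to get the effect of an equality operator is to branch on the integer comparison and push 1 or 0 on the two paths. The sketch below uses the JVM's integer-compare branch opcodes if_icmpeq and if_icmpne; EqualitySketch and its parameters are hypothetical, and the label names are passed in on the assumption that the caller has obtained unique ones.

```java
// Hypothetical helper for illustration; not part of the assignment's files.
class EqualitySketch {
    // leftCode and rightCode each leave one int on the stack.
    // wantEqual selects "==" (true) or "!=" (false).
    static String code(String leftCode, String rightCode, boolean wantEqual,
                       String trueLabel, String endLabel) {
        String branch = wantEqual ? "if_icmpeq" : "if_icmpne";
        return leftCode
             + rightCode
             + "    " + branch + " " + trueLabel + "\n" // pops both operands
             + "    iconst_0\n"                         // comparison failed: push false
             + "    goto " + endLabel + "\n"
             + trueLabel + ":\n"
             + "    iconst_1\n"                         // comparison held: push true
             + endLabel + ":\n";
    }

    public static void main(String[] args) {
        // Roughly "a == b" with a in slot 1 and b in slot 2.
        System.out.print(code("    iload 1\n", "    iload 2\n", true, "Label0", "Label1"));
    }
}
```

Both paths reach the end label with exactly one int on the stack, which is what the surrounding expression or statement expects.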

Stack manipulation

The previous three instructions take two bytes. If the operands of the instructions are small numbers, there are these special-case opcodes that take only one byte.

Control flow


Interfaces

An interesting design problem arises, due to the similarity of formal parameter declarations and local variable declarations. In fact, they're practically identical - each of them declares the type and name of a variable that is local to a method's scope. But they have different roles in the formal syntax of the language, and the inheritance hierarchy of their Node types reflects this:

It's clear that FormalParam is a kind of Listable, and that LocalVarDecl is a kind of Statement, so we don't want to give that up. But...

Every local variable in a method has its own unique number. The logical place to store this number is in the declaration, since there's exactly one of those, and possibly many uses of the variable in expressions. OK, so we add a field called localNum to both the classes LocalVarDecl and FormalParam. Up to now we've had to use these two classes in different contexts, so the fact that they had identical structures that had to be written in parallel instead of shared was just an inconvenience. But as we'll see below, things become considerably more annoying now. Interfaces offer us a way out.

Next we add to the expression node Id a reference that points back to its declaration, and set that reference during the checking pass. Now, how do we declare that? At this stage, the best we could do is this:

So what's the problem? Well, when we get to the code() method, we need to do something like this:

Well, this sort of thing is what OO is supposed to help us avoid, isn't it? For various good reasons, Java lets a class inherit (extend) from only one parent class. Inheritance means you get both interface and implementation from your parent. Not just the family name, but the money too. But a class can also take on just an interface, with no implementation, and can do that for any number of interfaces. What we do here is this:

So we've created a sort of new superclass, just for FormalParam and LocalVarDecl, which specifies that they have to provide a certain interface, namely a method called localNum(). What they actually do with this method is their business (which is to say, your business when you write the code). In fact they do the obvious thing.

Now we can be more expressive in declaring to the Java compiler what sort of field this is that we're adding to Id:

And this in turn lets us be more succinct in the Id.code() method, while still letting the Java compiler be sure it's type-safe:
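Taken together, the arrangement described in this section might be sketched as below. Every name here is an assumption made for illustration: HasLocalNum is a hypothetical interface name, the Stub classes stand in for the real Listable, Statement, FormalParam, LocalVarDecl and Id classes, and code() returns a string instead of printing.

```java
// Stand-ins for the two unrelated parent classes in the node hierarchy.
abstract class ListableStub {}
abstract class StatementStub {}

// Hypothetical interface name; the assignment's actual interface may be
// named differently.
interface HasLocalNum {
    int localNum();
}

// A formal parameter stays a kind of Listable but also promises localNum().
class FormalParamStub extends ListableStub implements HasLocalNum {
    private final int localNum;
    FormalParamStub(int localNum) { this.localNum = localNum; }
    @Override public int localNum() { return localNum; }
}

// A local variable declaration stays a kind of Statement, same promise.
class LocalVarDeclStub extends StatementStub implements HasLocalNum {
    private final int localNum;
    LocalVarDeclStub(int localNum) { this.localNum = localNum; }
    @Override public int localNum() { return localNum; }
}

// An identifier use holds a reference to whichever declaration it names.
class IdStub {
    private final HasLocalNum decl; // set during checking in the real compiler

    IdStub(HasLocalNum decl) { this.decl = decl; }

    // No instanceof tests or casts: both kinds of declaration answer localNum().
    String code() {
        return "    iload " + decl.localNum() + "\n";
    }
}

class InterfaceDemo {
    public static void main(String[] args) {
        System.out.print(new IdStub(new FormalParamStub(1)).code());
        System.out.print(new IdStub(new LocalVarDeclStub(3)).code());
    }
}
```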