CMPSC 160 Course Description
Department and Course Number: CMPSC 160
Course Title: Translation of Programming Languages
Total Credits: 4
Course Coordinator: Tevfik Bultan
Current Catalog Description
Study of the structure of compilers. Topics include: lexical
analysis; syntax analysis including LL and LR parsers; type checking; runtime
environments; intermediate code generation; and compiler-construction tools.
Prerequisites
(CMPSC 64 or ECE 154), CMPSC 130A, and CMPSC 138
Course Goals
(1) To learn the structure of compilers.
(2) To learn basic techniques used in compiler construction
such as lexical analysis,
top-down and bottom-up parsing,
context-sensitive analysis, and intermediate code generation.
(3) To learn basic data structures used in compiler construction
such as abstract syntax trees, symbol tables, three-address code,
and stack machines.
(4) To learn software tools used in compiler construction such
as lexical analyzer generators (lex, flex) and
parser generators (yacc, bison).
(5) To construct a compiler for a small language using
the above techniques and tools.
Prerequisites by Topic
Automata theory and formal languages
Programming in C++
Data structures, algorithms, and complexity
Topics Covered in the Course
 Introduction: [1 lecture] Overview of compilers, phases of a compiler.
 Lexical analysis, scanning: [3 lectures, 2 discussions]
 Role of a scanner
 Tokens, lexemes,
specifications of tokens, regular expressions,
regular definitions,
regular expression extensions.

Recognizing tokens,
DFAs, NFAs, DFA simulation,
NFA simulation, recognizing
the longest matching prefix,
regular-expression-to-NFA conversion,
NFA-to-DFA conversion,
DFA minimization,
time-space tradeoffs in using DFAs and NFAs for scanning.

Lexical analyzer generators,
lex, flex.
 Syntax analysis, parsing: [6 lectures, 3 discussions]

Role of a parser.

Context-free grammars, derivations, sentential forms,
leftmost and rightmost derivations,
parse trees. Ambiguity,
ambiguous grammars, precedence, associativity, dangling-else problem, eliminating
ambiguity.
Regular languages vs. context-free languages,
non-context-free languages.

Top-down vs. bottom-up parsing. Top-down parsing, recursive-descent parsing,
predictive parsing,
left-recursion elimination, left factoring.
 Stack-based predictive parsing, parse tables,
table-driven predictive parsing algorithm (LL parsing algorithm),
FIRST sets, FOLLOW sets.
LL(1), LL(k) grammars, constructing LL(1), LL(k) parse tables,
conflicts in LL(1) parse tables,
parsing ambiguous grammars with LL parsers,
building a parse tree while parsing.
 Bottom-up parsing. Handles, shift-reduce parsers, stack-based shift-reduce
parsing, viable prefixes,
table-driven shift-reduce parsing algorithm (LR parsing algorithm), LR(0) items,
closure, goto operations for LR(0) items.
Sets-of-items construction for LR(0) items,
the NFA and the DFA that recognize viable prefixes,
valid LR(0) items for a viable prefix,
constructing LR(0) parse tables, constructing SLR(1) parse tables,
conflicts in LR parse tables,
LR(k) items, LR(1) items, valid LR(1) items for a viable prefix,
closure, goto operations for LR(1) items.
Sets-of-items construction for LR(1) items,
constructing LR(1) parse tables,
LALR parse table construction,
parsing conflicts in LR parsers,
parsing ambiguous grammars with LR parsers.
 Error recovery strategies in parsing,
error recovery in LL and LR parsing.

Parser generators,
yacc, bison.
 Context-sensitive analysis: [3 lectures, 2 discussions]

Syntax-directed definitions, attribute grammars,
synthesized and inherited attributes, dependency graphs, evaluation order,
topological sort, constructing syntax trees, constructing syntax trees for expressions using
syntax-directed definitions.
 S-attributed definitions,
bottom-up evaluation of S-attributed definitions,
L-attributed definitions, depth-first evaluation order, translation schemes,
top-down translation, eliminating left recursion from a translation scheme, designing
predictive translators, bottom-up evaluation of
inherited attributes.
 Type checking, type systems,
static vs. dynamic type checking,
type expressions, basic types, type constructors, type graphs,
equivalence of type expressions, structural vs. name equivalence,
type conversions.
 Runtime environments: [2 lectures, 1 discussion]
 Symbol tables. Procedure abstraction,
activation trees, control stack, scope of a declaration,
runtime storage organization, activation records, static data, control stack,
heap,
storage-allocation strategies, procedure calls, call sequence, return sequence,
access to nonlocal names, lexical (static) scope with (or without) nested
procedures, access links, displays, procedure parameters, dynamic scope,
parameter passing, call-by-value, call-by-reference, copy-restore, call-by-name.
 Intermediate code generation: [4 lectures, 2 discussions]
 Intermediate representations,
intermediate code generation for assignment statements,
intermediate code generation for boolean expressions,
numerical and flow-of-control representations,
short-circuiting,
intermediate code generation for case statements, backpatching.
 Allocating storage for variables and procedures, generating code for
addressing array elements.
 Generating code for x86.
 Review [1 lecture]
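
As one illustration of the scanning topics above (DFA simulation and recognizing the longest matching prefix), here is a minimal sketch in C++. The token kinds, the three-state DFA, and the names Token, Match, and longestMatch are assumptions made up for this example; they are not taken from the course materials.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical token kinds, used only in this sketch.
enum class Token { None, Identifier, Integer };

struct Match {
    Token kind;          // kind of the longest token found (None if nothing matched)
    std::size_t length;  // number of characters in that token
};

// Transition function of a small hand-written DFA.
// States: 0 = start, 1 = inside an identifier, 2 = inside an integer, -1 = dead.
static int step(int state, char c) {
    unsigned char u = static_cast<unsigned char>(c);
    switch (state) {
        case 0:
            if (std::isalpha(u)) return 1;
            if (std::isdigit(u)) return 2;
            return -1;
        case 1: return std::isalnum(u) ? 1 : -1;
        case 2: return std::isdigit(u) ? 2 : -1;
        default: return -1;
    }
}

// Maps an accepting state to its token kind; non-accepting states map to None.
static Token accepting(int state) {
    if (state == 1) return Token::Identifier;
    if (state == 2) return Token::Integer;
    return Token::None;
}

// Simulates the DFA starting at `pos` and remembers the last accepting
// position seen, which yields the longest matching prefix.
Match longestMatch(const std::string& input, std::size_t pos) {
    assert(pos <= input.size());
    Match best{Token::None, 0};
    int state = 0;
    for (std::size_t i = pos; i < input.size(); ++i) {
        state = step(state, input[i]);
        if (state < 0) break;  // dead state: no longer match is possible
        Token t = accepting(state);
        if (t != Token::None) best = Match{t, i - pos + 1};
    }
    return best;
}
```

For example, longestMatch("abc123+4", 0) reports an Identifier of length 6, and longestMatch("42x", 0) reports an Integer of length 2. A generated scanner (lex, flex) applies the same idea with tables built from the regular expressions.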
Laboratory projects
A five-part programming project in C++. The goal is to incrementally
build a compiler that translates programs written in a simple
programming language into x86 assembly.
The input language does not have any object-oriented features
and only allows integer and boolean variable types.
The project involves using lex/flex lexical analyzer generators
and yacc/bison parser generators.
Parts of the project are:
 [1 week] Warm-up: Building a recursive-descent parser for
a simple expression language.
 [2 weeks] Scanning and Parsing: Building a scanner and parser
for a simple programming language.
Students will write a lex/flex specification
to automatically generate a scanner
and a yacc/bison specification to automatically generate
a bottom-up parser.
 [2 weeks] Intermediate Representation Generation:
Students will write semantic actions to create an abstract syntax tree for
the given program.
 [2 weeks] Context-sensitive analysis: Students will write methods that
traverse the abstract syntax tree to determine whether the input program is legal,
checking conditions such as that identifiers are declared before they are used
and that there are no type errors.
 [2 weeks] Code generation: Students will write methods which will
traverse the abstract syntax tree and emit x86 assembly code.
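
To give a feel for the warm-up part, the following is a minimal sketch of a recursive-descent parser that evaluates integer expressions as it parses. The grammar (expr, term, factor with +, -, *, / and parentheses) and the names ExprParser and evaluate are assumptions chosen for this example; the actual project language is not specified here.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>

// Assumed grammar for this sketch:
//   expr   -> term   { ('+' | '-') term }
//   term   -> factor { ('*' | '/') factor }
//   factor -> NUMBER | '(' expr ')'
class ExprParser {
public:
    explicit ExprParser(std::string text) : text_(std::move(text)) {}

    // Parses the whole input and returns its value; throws on a syntax error.
    long parse() {
        long value = expr();
        if (peek() != '\0') throw std::runtime_error("unexpected trailing input");
        return value;
    }

private:
    std::string text_;
    std::size_t pos_ = 0;

    // Skips whitespace and returns the next character without consuming it
    // ('\0' at end of input).
    char peek() {
        while (pos_ < text_.size() &&
               std::isspace(static_cast<unsigned char>(text_[pos_]))) ++pos_;
        return pos_ < text_.size() ? text_[pos_] : '\0';
    }

    // One function per nonterminal; the loops give left associativity.
    long expr() {
        long value = term();
        for (;;) {
            char op = peek();
            if (op != '+' && op != '-') return value;
            ++pos_;
            long rhs = term();
            value = (op == '+') ? value + rhs : value - rhs;
        }
    }

    long term() {
        long value = factor();
        for (;;) {
            char op = peek();
            if (op != '*' && op != '/') return value;
            ++pos_;
            long rhs = factor();
            if (op == '/' && rhs == 0) throw std::runtime_error("division by zero");
            value = (op == '*') ? value * rhs : value / rhs;
        }
    }

    long factor() {
        char c = peek();
        if (c == '(') {
            ++pos_;
            long value = expr();
            if (peek() != ')') throw std::runtime_error("expected ')'");
            ++pos_;
            return value;
        }
        if (!std::isdigit(static_cast<unsigned char>(c)))
            throw std::runtime_error("expected a number or '('");
        long value = 0;
        while (pos_ < text_.size() &&
               std::isdigit(static_cast<unsigned char>(text_[pos_]))) {
            value = value * 10 + (text_[pos_] - '0');
            ++pos_;
        }
        return value;
    }
};

// Convenience wrapper (hypothetical helper for this sketch).
long evaluate(const std::string& text) { return ExprParser(text).parse(); }
```

With this sketch, evaluate("1 + 2 * 3") yields 7 and evaluate("(1 + 2) * 3") yields 9, because precedence is encoded in the nesting of expr, term, and factor. The grammar has no left recursion, which is what makes it suitable for recursive descent.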
Estimated CSAB Category Content
CSAB Category                             CORE   ADVANCED
Data Structures                           _      0.5
Computer Organization and Architecture    _      _
Algorithms                                _      0.5
Concepts of Programming Languages         _      2
Software Design                           _      1
Oral and Written Communications
Social and Ethical Issues
Students learn the impact of design decisions in programming
languages on future generations of engineers.
Theoretical Content
Students review the following theoretical concepts in this course:
regular expressions, DFAs, NFAs, context-free grammars,
regular, context-free, and non-context-free languages.
Students learn theoretical concepts such as
LL and LR parsing and attribute grammars.
Problem Analysis
Students learn how theoretical concepts such as finite automata,
regular expressions, and context-free grammars can be used in solving
practical problems.
Solution Design
Students learn how a modular design makes it possible to build
complex software systems like compilers.
Students learn how to build complex data structures such as
abstract syntax trees using object-oriented design concepts.
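
As a rough illustration of the last point, the sketch below models an abstract syntax tree as a small class hierarchy and walks it to emit three-address code. The class names (Node, Num, Var, BinOp, CodeBuffer) and the textual instruction format are assumptions for this example, not the design used in the course project.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Collects emitted three-address instructions and hands out fresh temporaries.
struct CodeBuffer {
    std::vector<std::string> lines;
    int nextTemp = 0;
    std::string newTemp() { return "t" + std::to_string(nextTemp++); }
};

// Base class of all AST nodes (hypothetical design for this sketch).
struct Node {
    virtual ~Node() = default;
    // Emits code for this subtree and returns the name that holds its value.
    virtual std::string emit(CodeBuffer& out) const = 0;
};

// Integer literal: needs no instruction, its value is used directly.
struct Num : Node {
    int value;
    explicit Num(int v) : value(v) {}
    std::string emit(CodeBuffer&) const override { return std::to_string(value); }
};

// Variable reference: its name is used directly as an operand.
struct Var : Node {
    std::string name;
    explicit Var(std::string n) : name(std::move(n)) {}
    std::string emit(CodeBuffer&) const override { return name; }
};

// Binary operation: emits code for both children, then one instruction
// that stores the result in a fresh temporary.
struct BinOp : Node {
    char op;
    std::unique_ptr<Node> lhs, rhs;
    BinOp(char o, std::unique_ptr<Node> l, std::unique_ptr<Node> r)
        : op(o), lhs(std::move(l)), rhs(std::move(r)) {
        assert(lhs && rhs);  // both operands must be present
    }
    std::string emit(CodeBuffer& out) const override {
        std::string a = lhs->emit(out);
        std::string b = rhs->emit(out);
        std::string t = out.newTemp();
        out.lines.push_back(t + " = " + a + " " + op + " " + b);
        return t;
    }
};
```

For the tree of a + b * 2, calling emit on the root appends "t0 = b * 2" and then "t1 = a + t0" to the buffer and returns "t1". Adding a new kind of node only requires a new subclass that overrides emit, which is the modularity benefit the paragraph above refers to.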