Translate Expressions To Intermediate Representation
Tiger in Stratego -- Compilation by Program Transformation
Implement translation of
TigerAbstractSyntax expressions to
IntermediateRepresentation code in module
TAS2IR in the
TigerTrans package.
The
TigerTrans package contains the definition of
IntermediateRepresentation and (mostly empty) templates for the components to be implemented.
The
TigerXmpl package provides example Tiger programs and a makefile for testing the new components.
The
ASM package provides the signature of assembly code that is used by instruction selection in
TigerTrans.
Translation to Expression or Statement
Tiger knows only expressions. During translation a distinction should be made into Tiger expressions that return a value (proper expressions) and expressions that do not (statements). The same syntactic construct can represent expressions and statements. This hold in particular for if-then-else (
If
) and sequential composition (
Seq
). The expression/statement type of an expression can be determined by its context. For example, the body of a function is an expression if the function returns a value, otherwise it is a statement. This knowledge can be expressed in
the traversal strategy employed by the translator.
Make two kinds of rules,
TrExp
translates expressions and
TrStm
translates statements. Let the
strategy determine which should be called.
Records
For the translation of record creation and record access use the type information that is available
in the
TigerAbstractSyntax tree after typechecking
Function Calls
Translate a
Call
to a
CALL
. Don't worry about passing arguments in registers or on the stack. This will be dealt with during
InstructionSelection.
Passing Information Around
Information about memory locations of local variables, function arguments, and
static links can be passed around using
DynamicRules. For example, to translate a variable, generate a rule when translating its variable declaration (
VarDec
) that translates a
Var(x)
to a
TEMP
or to a memory access depending on where the variable is placed at declaration time.
Functions such as
getchar
, which are defined in the run-time system of Tiger, do not expect a static link. Thus, the call translation rules generated for the built-in functions should not
pass a static link.
Static Links
To compute
static links keep a list
stack frames. A stack frame can be represented by its name and the offset from the frame pointer to the location where its
static link is stored. When the static link is passed as first argument of a function the offset will be zero.
Keep the list in a dynamic rule, such that it can be updated while traversing into the tree.
The current frame is the head of the list.
The distance between the current frame and the frame in which a function or variable was declared
determines the number of steps that needs to be taken to obtain the pointer to its enclosing
stack frame.
Frame Pointer Offset
To allocate stack bound arguments and local variables on the stack it is necessary to maintain an offset to the
frame pointer. Increment the offset after each variable access.
After completing the translation of the function, this offset needs to be stored in the fragment for
the function. See the definition of the
intermediate representation for a description of the parameters passed to a
PROC
fragment.
Nested Functions
While translating, leave the nesting structure of functions by translating a
Let
with function declarations to a
LET
with as first argument a list of
PROC
fragments (see
intermediate representation.)
As the first action of
IRCanonicalize remove this nesting structure by lifting
PROC
fragments to top-level.
Function Bodies
After translation the statements of a function body will be shuffled around during canonicalization. In order to start at the right statement, the translation should remember where the body starts and where it ends. This should be communicated to
instruction selection by putting a start label at the beginning of the body and jump to an end label at the end of the body. These labels should be declared as parameters of the
PROC
fragment of the function.
String Constants
String constants should translated to a piece of code that declares static program data. For
example, with a
MIPS back-end the string
"hello world!"
should be translated to
.align 2
.rdata
a_0:
.word 12
.asciiz "hello world!"
This specific translation can be defferred until
instruction selection, however. During translation string constants should be translated to a combination of a
STRING
fragment that
declares the label at which the string data can be found and the string itself, and an expression that yields the address of the label using the
NAME
constructor. This can be done using the
LET
construct, i.e., a string
"hello world!"
translates to
LET([STRING("a_0", "hello world!")], NAME("a_0"))
In order to avoid translating multiple occurrences of the same string to multiple data declarations,
store the label associated with the string constant in a dynamic rule when first encountering the string and reproducing the label when next encountering the string.