Translate Expressions To Intermediate Representation

Tiger in Stratego -- Compilation by Program Transformation
Implement translation of TigerAbstractSyntax expressions to IntermediateRepresentation code in module TAS2IR in the TigerTrans package.

The TigerTrans package contains the definition of IntermediateRepresentation and (mostly empty) templates for the components to be implemented. The TigerXmpl package provides example Tiger programs and a makefile for testing the new components. The ASM package provides the signature of assembly code that is used by instruction selection in TigerTrans.

Translation to Expression or Statement

Tiger knows only expressions. During translation a distinction should be made into Tiger expressions that return a value (proper expressions) and expressions that do not (statements). The same syntactic construct can represent expressions and statements. This hold in particular for if-then-else (If) and sequential composition (Seq). The expression/statement type of an expression can be determined by its context. For example, the body of a function is an expression if the function returns a value, otherwise it is a statement. This knowledge can be expressed in the traversal strategy employed by the translator.

Make two kinds of rules, TrExp translates expressions and TrStm translates statements. Let the strategy determine which should be called.

Records

For the translation of record creation and record access use the type information that is available in the TigerAbstractSyntax tree after typechecking

Function Calls

Translate a Call to a CALL. Don't worry about passing arguments in registers or on the stack. This will be dealt with during InstructionSelection.

Passing Information Around

Information about memory locations of local variables, function arguments, and static links can be passed around using DynamicRules. For example, to translate a variable, generate a rule when translating its variable declaration (VarDec) that translates a Var(x) to a TEMP or to a memory access depending on where the variable is placed at declaration time.

Functions such as getchar, which are defined in the run-time system of Tiger, do not expect a static link. Thus, the call translation rules generated for the built-in functions should not pass a static link.

Static Links

To compute static links keep a list stack frames. A stack frame can be represented by its name and the offset from the frame pointer to the location where its static link is stored. When the static link is passed as first argument of a function the offset will be zero. Keep the list in a dynamic rule, such that it can be updated while traversing into the tree. The current frame is the head of the list. The distance between the current frame and the frame in which a function or variable was declared determines the number of steps that needs to be taken to obtain the pointer to its enclosing stack frame.

Frame Pointer Offset

To allocate stack bound arguments and local variables on the stack it is necessary to maintain an offset to the frame pointer. Increment the offset after each variable access. After completing the translation of the function, this offset needs to be stored in the fragment for the function. See the definition of the intermediate representation for a description of the parameters passed to a PROC fragment.

Nested Functions

While translating, leave the nesting structure of functions by translating a Let with function declarations to a LET with as first argument a list of PROC fragments (see intermediate representation.) As the first action of IRCanonicalize remove this nesting structure by lifting PROC fragments to top-level.

Function Bodies

After translation the statements of a function body will be shuffled around during canonicalization. In order to start at the right statement, the translation should remember where the body starts and where it ends. This should be communicated to instruction selection by putting a start label at the beginning of the body and jump to an end label at the end of the body. These labels should be declared as parameters of the PROC fragment of the function.

String Constants

String constants should translated to a piece of code that declares static program data. For example, with a MIPS back-end the string "hello world!" should be translated to

    .align 2
    .rdata
a_0:
    .word 12
    .asciiz "hello world!"     
This specific translation can be defferred until instruction selection, however. During translation string constants should be translated to a combination of a STRING fragment that declares the label at which the string data can be found and the string itself, and an expression that yields the address of the label using the NAME constructor. This can be done using the LET construct, i.e., a string "hello world!" translates to
  LET([STRING("a_0", "hello world!")], NAME("a_0"))
In order to avoid translating multiple occurrences of the same string to multiple data declarations, store the label associated with the string constant in a dynamic rule when first encountering the string and reproducing the label when next encountering the string.