Optimizing compiler using templates corresponding to portions of an intermediate language graph to determine an order of evaluation and to allocate lifetimes to temporary names for variables5613117Abstract A compiler framework uses a generic "shell" and a generic back end (where the code generator is target-specific). The generic back end provides the functions of optimization, register and memory allocation, and code generation. The code generation function of the back end may be targeted for any of a number of computer architectures. A front end is tailored for each different source language, such as Cobol, Fortran, Pascal, C, C++, etc. The front end scans and parses the source code modules, and generates from them an intermediate language representation of the source code programs expressed in the source code. The intermediate language represents any of the source code languages in a universal manner, so the interface between the front end and back end is of a standard format, and need not be rewritten for each language-specific front end. A feature is a method for doing code generation using code templates in a multipass manner. The selection and application of code templates occurs at four different times during the compilation process: (1) A pattern select phase does a pattern match in a context pass to select the best code templates; (2) Tasks of the context pass use context actions of the selected templates to analyze the evaluation order to expressions and to allocate temporary names; (3) A bind pass uses the binding actions of the selected templates to allocate template names; (4) Finally, a code pass uses code generation actions of the selected templates to guide the generation of object code. Claims What is claimed is: Description RELATED CASES
______________________________________
GEM.sub.-- TUPLE.sub.-- NODE =
GEM.sub.-- SE.sub.-- FIND.sub.-- EFFECT(
EIL.sub.-- TUPLE
: in GEM.sub.-- TUPLE.sub.-- NODE,
MIN.sub.-- EXPR.sub.-- COUNT: value)
______________________________________
Returns the most recently pushed tuple whose GEM.sub.-- TPL.sub.-- EXPR.sub.-- COUNT field is greater than MIN.sub.-- EXPR.sub.-- COUNT (see below), and whose execution may change the results produced by EIL.sub.-- TUPLE. Returns null (zero) if no tuple on the stack affects EIL.sub.-- TUPLE. May also return the same tuple specified in the parameter.
______________________________________
GEM.sub.-- TUPLE.sub.-- NODE =
GEM.sub.-- SE.sub.-- FIND.sub.-- EFFECTS(
VAR.sub.-- SYM
: in GEM.sub.-- SYMBOL.sub.-- NODE,
MIN.sub.-- EXPR.sub.-- COUNT: value)
______________________________________
Returns the most recently pushed tuple whose GEM.sub.-- TPL.sub.-- EXPR.sub.-- COUNT field is greater than MIN.sub.-- EXPR.sub.-- COUNT (see below), and whose execution may modify the value of variable VVAR.sub.-- SYM. Returns null (zero) if no tuple on the stack affects EIL.sub.-- TUPLE. GEM.sub.-- SE.sub.-- PUSH.sub.-- EFFECT and GEM.sub.-- SE.sub.-- POP.sub.-- EFFECT will be called only with tuples which can have effects. GEM.sub.-- SE.sub.-- FIND.sub.-- EFFECT will be called only with tuples which can have dependencies or effects. There is an order of invocation. Every EIL tuple has a field called GEM.sub.-- TPL.sub.-- EXPR.sub.-- COUNT. This field contains the index of the tuple in a walk of the EILG in which basic blocks are visited in dominator tree depth-first preorder. If the back end 12 calls GEM.sub.-- SE.sub.-- PUSH.sub.-- EFFECT with a tuple A, and subsequently calls GEM.sub.-- SE.sub.-- PUSH.sub.-- EFFECT or GEM.sub.-- SE.sub.-- FIND.sub.-- EFFECT with a tuple B, without having called GEM.sub.-- SE.sub.-- POP.sub.-- EFFECT with tuple A in the interim, then it is guaranteed that either tuple A precedes tuple B in the same basic block, or the basic block containing tuple A properly dominates the basic block containing tuple B. Therefore, the EXPR.sub.-- COUNT values of tuples on the effects stack decreases with increasing stack depth (i.e., more recently pushed tuples have higher EXPR.sub.-- COUNTs than less recently pushed tuples). This means that the FIND.sub.-- EFFECT routine can cut short its search of the effects stack as soon as it encounters a tuple T whose EXPR.sub.-- COUNT is less than or equal to the MIN.sub.-- EXPR.sub.-- COUNT argument. This is because all tuples stacked deeper than T are guaranteed to have EXPR.sub.-- COUNTs that are less than MIN.sub.-- EXPR.sub.-- COUNT. The mechanism actually used for the implementation of the effects stack is entirely up to the front end 20, as is the rule that it uses to determine when the execution of one tuple might affect the value computed by another tuple. A naive stack implementation is certainly possible, though it would probably be inefficient. A more sophisticated implementation might be built around a hash table, so that multiple small stacks (possibly each concerned with only one or a few variables) would be used instead of a single large stack. The effects-class interface will now be described. Recall that an effects set is a bit vector representing a set of effects classes, and that an effects class represents some arbitrary set of memory locations. Typically, an effects class will represent one of the following: 1. A single named variable. For effective optimization, each simple (i.e., non-aggregate) local variable which is used frequently in a routine should have an effects class dedicated to it. 2. A set of named variables with some common property; for example, in FORTRAN, all the variables in a particular named common block. 3. A set of memory locations which may not be determined until runtime, but which have some common property; for example, all the memory locations which are visible outside this routine (and which might therefore be modified by a routine call); or, in Pascal, all the memory locations which will be dynamically allocated with NEW calls and which have a particular type. The literal GEM.sub.-- SE.sub.-- K.sub.-- MAX.sub.-- EFFECTS is exported by the GEM.sub.-- SE package. It is the maximum number of distinct effects classes that the front end 20 may define. It will be 128 in the initial implementation. The GEM.sub.-- SE.sub.-- EFFECTS.sub.-- SET type is exported by the GEM.sub.-- SE package. It is a macro which expands to BITVECTOR[GEM.sub.-- SE.sub.-- K.sub.-- MAX.sub.-- EFFECTS]. Thus, given the declaration X: GEM.sub.-- SE.sub.-- EFFECTS.sub.-- SET, the following constructs are all natural (where 0.ltoreq.N.ltoreq.GEM.sub.-- SE.sub.-- K.sub.-- MAX.sub.-- EFFECTS-1):
______________________________________
X[N] = true;
! Add effects class N to set X.
X[N] = false;
! Remove effects class N from set X.
if .X[N] then . . .
! If effects class N is in set X . . .
______________________________________
The interface routines for the effects-class interface will now be described. The front end 20 must provide an implementation of the following routines:
______________________________________
GEM.sub.-- SE.sub.-- EFFECTS(
EIL.sub.-- TUPLE
: in GEM.sub.-- TUPLE.sub.-- NODE,
EFFECTS.sub.-- BV
: inout GEM.sub.-- SE.sub.-- EFFECTS.sub.-- SET)
______________________________________
The union of the effects of tuple EIL.sub.-- TUPLE and EFFECTS.sub.-- BV is written into EFFECTS.sub.-- BV.
______________________________________
GEM.sub.-- SE.sub.-- DEPENDENCIES(
EIL.sub.-- TUPLE
: in GEM.sub.-- TUPLE.sub.-- NODE,
EFFECTS.sub.-- BV
: inout GEM.sub.-- SE.sub.-- EFFECTS.sub.-- SET)
______________________________________
Writes the set of effects classes that EIL.sub.-- TUPLE depends on into EFFECTS.sub.-- BV.
______________________________________
GEM.sub.-- SE.sub.-- VARIABLE.sub.-- DEPENDENCIES(
SYMBOL : in GEM.sub.-- SYMBOL.sub.-- NODE,
EFFECTS.sub.-- BV
: out GEM.sub.-- SE.sub.-- EFFECTS.sub.-- SET)
______________________________________
Writes into EFFECTS.sub.-- BV the set of effects classes that might include the memory associated with variable SYMBOL. GEM.sub.-- SE.sub.-- EFFECTS will be called only with tuples which can have effects. GEM.sub.-- SE.sub.-- DEPENDENCIES will be called only with tuples which can have dependencies. The compiler may provide implementations for the interface routines mentioned above, but these routines are not intended for use in a production compiler. They are inefficient, and their rules for when one tuple invalidates another probably will not coincide exactly with the semantics of any particular language. However, they allow useful default optimizations to occur while other components of a front end 20 being implemented. The EFFECTS field of each symbol node is treated as an effects class number, between 32 and GEM.sub.-- SE.sub.-- K.sub.-- MAX.sub.-- EFFECTS. When the address expression of a fetch or store tuple has a base symbol, the EFFECTS field of the symbol is checked. If it zero, then it is set to a new value between 32 and GEM.sub.-- SE.sub.-- K.sub.-- MAX.sub.-- EFFECTS. For computing effects sets, using the effects class implementation as described above, the front end must call GEM.sub.-- SE.sub.-- INIT.sub.-- EFFECTS.sub.-- CLASSES before invoking the GEM.sub.-- IL.sub.-- BUILD phase. This implementation provides information about effects by defining a simple model for effects: 1. No variables are overlaid: 2. Data access operations not in canonical form (as defined in CT.006) have (for stores) or depend on (for fetches) effect 0. 3. Calls have effects 32 through GEM.sub.-- SE.sub.-- K.sub.-- MAX.sub.-- EFFECTS. ARGADR parameters are treated as if the call writes into their address operands. Effects classes 0 and 32 through GEM.sub.-- SE.sub.-- K.sub.-- MAX.sub.-- EFFECTS are reserved. Effect 0 represents references to memory such that the variables referenced can't be identified (pointer dereferences, parameters, etc.) When a variable is first referenced using a data access operator in canonical form it is assigned an effects class number n in the range 32 to GEM.sub.-- SE.sub.-- K.sub.-- MAX.sub.-- EFFECTS. The number is recorded in the EFFECTS field of the symbol node. The reference and all subsequent references to that variable will have effect or dependency n. The implementation includes some hooks for experimentation, testing, etc: 1. Tuples that may have effects or dependencies have one or more "effects fields" (EFFECTS, DEPENDENCIES, EFFECTS.sub.-- 2, etc.) reserved to the front end to record the effects and dependencies of the tuple. The compiler-supplied effects class callbacks interprets an effects field as a bitvector of length 32 representing the first word of a GEM.sub.-- SE.sub.-- EFFECTS.sub.-- SET. That is, if bit n of the field is true, the routines add effects class n to the computed effects of the tuple. 2. The front end can choose the effects class for a variable by writing the effects class number between 1 and GEM.sub.-- SE.sub.-- K.sub.-- MAX.sub.-- EFFECTS into the effects field of the variable's symbol node. The effects class routines do not assign an effects class if the EFFECTS field is not zero. 3. Effects classes 1 through 32 are reserved for use by the front end. It may assign any interpretation to those effects classes. To use the straight-line dependency implementation discussed above, the front end must call GEM.sub.-- SE.sub.-- INIT.sub.-- EFFECTS.sub.-- STACK before invoking the GEM.sub.-- DF.sub.-- DATAFLOW phase. This implementation uses the information provided by the GEM.sub.-- SE.sub.-- EFFECTS and GEM.sub.-- SE.sub.-- DEPENDENCIES callbacks to determine invalidations. That is, GEM.sub.-- SE.sub.-- FIND.sub.-- EFFECT(X) returns the most recently pushed tuple Y such that the intersection of GEM.sub.-- SE.sub.-- EFFECTS(Y) and GEM.sub.-- SE.sub.-- DEPENDENCIES(X) is non-null. INDUCTION VARIABLES According to one feature of the invention, an improved method of treating induction variables in a compiler is provided. First, the definition and detection of induction variables and inductive expressions will be discussed. An integer variable V is said to be an induction variable of loop L if each store to V that occurs in L: 1. increments (or decrements) V by the same amount each time it is executed. 2. is executed at most once in every "complete trip" through the loop. A trip is "complete" if it flows back to the loop top. For example, the following code illustrates an induction variable V:
______________________________________
Label L
V = 1
IF V > 10
GOTO LABEL M
ELSE
PRINT X
V = V + 1
END IF
______________________________________
In the compile function, in addition to finding induction variables, we are also interested in inductive expressions. Inductive expressions are expressions that can computed as linear functions of induction variables. Consider the following program:
______________________________________
DO I = 1,100
X = I * 8
T = I - 4
A[I] = T * 4
END DO
______________________________________
The expressions "I*8," "I-4," "T" and "T*4" are all inductive expressions in that they can be recomputed as linear functions of I. As a brief illustration of some of the optimizations based on induction variables, consider the following program example:
______________________________________
I = 1;
L: X = X + (4 * I)
I = I + 1
if I <= 100 GOTO L
______________________________________
This is a straightforward DO loop, I being the loop control variable. Notice that the inductive expression I*4 increases by 4 on each trip through the loop. By introducing a new variable, I2, we can replace the multiplication with an addition, which is a less expensive operation. This is optimization known as strength reduction, used in optimizing compilers for a long time:
______________________________________
I = 1;
I2 = 4;
L: X = X + I2
I = I + 1
I2 = I2 + 4
if I <= 100 GOTO L
______________________________________
Note that we now have two variables (I and I2) where we used to have one. We can eliminate the original loop control variable completely by recasting the uses of I to be in terms of I2:
______________________________________
I2 = 4;
L: X = X + I2
I2 = I2 + 4
if I2 <= 400 GOTO L
______________________________________
This optimization is known as induction variable elimination. These optimizations (strength reduction and induction variable elimination) operate directly on induction variables. In addition to these optimizations, induction variable detection provides information to other optimizations such as auto-inc/dec, vectorization, loop unrolling, etc. In the model used in the compiler of FIG. 1, induction variables may be incremented more than once during the loop. Furthermore, the number of changes can even differ with each iteration. In fact, the number of changes can be zero for a particular iteration. The loop invariant increment value may differ between individual stores, but each individual store must increment the variable by the same amount whenever it is executed. There are several different categories of inductive variables, with different properties, including basic induction variables, inductive expressions, and pseudo induction variables. Basic induction variables are the simplest form of induction variable. They have known properties that apply throughout the loop. All other induction variables and expressions are always built up as linear functions of a basic induction variables. Basic induction variables are generally modified in the form I=I+q or I=I-q where "q" is loop invariant. More generally, however, the requirement is that the assignment be of the form I-f(I) where f(I) is a linear function of I with a coefficient of 1. In the algorithms given in the Appendix, the basic induction variables of a particular loop are represented by a set in the loop top. In addition to this set, we also maintain the set of basic induction variables in the loop that have conditional stores that may not be executed on every trip through the loop. This inhibits vectorization and can make strength reduction more "desirable." An inductive expression is either a reference to an induction variable or a linear function of another inductive expression. Inductive expressions must be in one of the following forms:
______________________________________
-f(I)
f(I) + g(I)
f(I) - g(I)
f(I) + E E + f(I)
f(I) - E E - f(I)
f(I) * E E * f(I)
______________________________________
where f(I) and g(I) are inductive expressions derived from basic induction variable I with respect to loop L and E is invariant in loop L. If there are no stores to I between f(I) and the arithmetic operator of which it is an operand, then the arithmetic operator is an inductive expression derived from basic induction variable I with respect to loop L. The other category is pseudo induction variables. Under certain conditions, a variable may behave like an induction variable on all but the first trip through the loop. These can be turned into induction variables (and thus vectorized) by peeling the first iteration of the loop. Such variables are referred to as "pseudo induction variables." This occurs when a fetch within the loop is reached only by two stores, one within the loop that defines a derived induction variable, and another store whose value flows in through the loop top. Additionally, it must be guaranteed that all stores within the loop are executed once per trip. For example:
______________________________________
D = 50
DO I = 1, n
A[I] = D + . . .
D = I + 4
______________________________________
On the first trip through the loop, D has the value 50 at the assignment to I. On subsequent trips, D has the value 5,6,7, etc. By unrolling the loop once, the subsequent trips can be vectorized. Note that the algorithms given herein do not find induction variables that are pseudo induction variables. In order to identify a basic induction variable the compiler must be able to recognize all stores to it. The absence of the "has aliased stores" attribute guarantees this and thus we only recognize basic induction variables that do not have "has aliased stores." Detection of basic induction variables requires the use of "sets" of potential induction variables. Doing this dynamically for each loop is an expensive and complicated operation. Instead, we will use the side effect sets used to construct IDEF sets. A variable "X" is said to be "in" IDEF set S if the all the effects that fetch's of X depend on are in S. That is, X is in IDEF set S only if GET.sub.-- VARIABLE.sub.-- DEPENDENCIES(x) is a subset of S. Note that the presence of X in a basic induction set implies only that: (a) X is a basic induction variable or (b) X is loop invariant and shares IDEF bits with at least one variable that is a basic induction variable. The algorithm descriptions in the Appendix take the following liberties (perhaps more) in the interest of keeping the algorithm description simple: (1) The collection of the constant pans of the linear function cannot cause an overflow. (2) All stores completely redefine the variable. The algorithm as set forth in the Appendix starts out by assuming that all variables modified in the loop are basic induction variables. Each loop top has a basic induction variable set. As we find stores that don't satisfy the requirements for basic induction variables, we eliminate variables from the basic IV set of the loop top. Since inductive expressions and derived induction variables are always functions of basic IVs, we might say that fetches of basic IVs are the atomic forms of inductive expressions. That is, for an expression to have the inductive property it either has inductive operands, or it is a fetch of a basic induction variable. Using the rules given earlier, we build up inductive expressions from simpler inductive expressions based on assumptions about basic IVs. The basic IV of an inductive expression is always retained with the expression. Thus, after the algorithm has run, we can tell whether the expression is truly inductive by checking to see that the basic IV from which it is derived is still in the basic IV set of the loop. The FIND.sub.-- IV algorithm given in the Appendix will become part of the DATAFLOW phase which does a depth first dominator tree walk. The following is a summary overview of the tuple processing that is done:
______________________________________
select TUPLE[OPCODE}
[FETCH]
If base symbol is still a basis IV candidate
then
mark this tuple as being inductive.
[STORE]
Let V be the base symbol of the store.
If the value being stored is not inductive or.sub.-- else
the basic IV of the inductive value being stored
is not V or.sub.-- else
the coefficient of the stored value is not 1
remove V from the basic IV set of the loop top
then
remove V from the basic IV set of the loop top
then
mark the store as being inductive
[ADD, SUB, MUL, etc.]
If one operand is inductive and other operand
is loop invariant
then
mark this tuple as being inductive
______________________________________
The fields added to the tuple data structure, and fields added to the flow nodes, to accommodate induction variable detection, are set forth in Table 6a. AUTOMATIC CREATION OF KFOLD ROUTINE As previously discussed, the programming language compiler of FIG. 1 translates programs written in a source language into the machine language of a target machine 25. The compiler includes a front end 20, which incorporates knowledge of the source language in module 21 being compiled, and a back end 12, which incorporates knowledge of the machine language of the target machine 25. The front end translates programs from the source language into the intermediate language of the ILG 55, and the back end translates programs from the intermediate language into the target machine language. The intermediate language generally specifies a collection of operators (for example, add, shift, compare, fetch, store, or tangent), a collection of data types (for example, "signed 32-bit integer," "IEEE S-format floating point," or "character string"), and a representation for values of those data types. One of the optimizations included in the optimizer 26 is a constant expression evaluation routine. An example of a source code listing that may be related to a constant expression is shown in FIG. 6, where A and B are found to be constants, so A+B is a constant, then I and J are both equal to the same constant. The compiler can do the calculation (A+B), and save the fetch of A and B separately at run time, as well as saving the ADD operation. The I=A+B and J-A+B expressions of the code of FIG. 6 are thus both represented as merely STORE #9,I or STORE #9,J. This is known as "constant folding" because the constants are detected, calculated at compile time, and "folded" into the object code image. The mechanism for doing this is pan of the optimizer 26, referred to as a Kfold routine. The compiler of FIG. 1 incorporates a Kfold routine for evaluating expressions of the intermediate language to find these constant expressions. In general, given an operator of the intermediate language and the values of its operands, this routine will yield the same value which is computed by that operator when applied to those values. Such a constant expression evaluation routine has many applications in a compiler. For example, (a) The execution speed of the machine code which is generated for a program may be improved if some expressions of the program can be evaluated by the compiler itself rather than when the program is executed. (b) Some source languages may allow the use of expressions with constant operands to represent constant values. Compilation of a program in such a language requires the evaluation of such expressions by the compiler. (c) If the repertoire of operations provided in the intermediate language is richer than the set of operations provided by the programming language or environment in which the compiler is implemented, the most convenient way to perform some computation in the compiler may be to represent it as an expression in the intermediate language and submit it to the constant expression evaluation routine. The implementation of a constant expression evaluation routine may be a matter of considerable difficulty. The IL may have dozens of operations (e.g., ADD, SUBT, COSINE, etc.), and when distinct data types are considered (e.g., INT32, NINT64, FLOAT& etc.), an intermediate language may have hundreds or thousands of distinct operators. The evaluator must be able to apply each of the operations to each of the data types correctly, lest the compiler fail to perform its function fully or correctly. Particularly when floating-point types are involved, it is likely that not all of the operations which can be represented in the intermediate language will be directly available in the programming language in which the compiler is implemented. Consequently, a constant expression evaluation routine is liable to be extremely long, containing hundreds of distinct cases, and be highly error-prone. According to an important feature of one embodiment of the invention, the crucial point is that the one language in which the precise meaning of an operator of the intermediate language can always be specified both tersely and precisely is the intermediate language itself. That is, the compiler back end itself must be capable of generating code which correctly implements any operator of the intermediate language. Another way to say this is that compiler back end already embodies the knowledge of the sequences of machine code instructions necessary to realize the effect of each intermediate language operator, and it would be redundant to have to encode this same knowledge again in a different form in the constant expression evaluation routine. Based upon this concept, according to the invention, the mechanical generation of a constant expression evaluation routine becomes straightforward: The first step is to create a new compiler of FIG. 1, which uses the same back end 12 as the regular compiler, but replaces its front end 20 with the special front end described below. (Equivalently, provide a special mode for the compiler in which it operates as described below.) Second, the special front end 20 or special mode of operation does not read and translate a source program 21. Instead, it generates the intermediate language for the constant expression evaluation routine, as follows: (a) The routine does a conditional branch to select a case based on the intermediate language operator specified in the argument list. (b) Each case contains the code for a single operator. It fetches the operand values from the routine's argument list, applies the operator to them, and returns the result. (c) Since the routine is being generated directly in the intermediate language, the code for each case simply consists of intermediate language operators to fetch the operands from the argument list, then the intermediate language operator for the particular case, and then the intermediate language operators to return the result. Third, when this intermediate language graph is submitted to the compiler's back end, it will generate machine code for the constant expression evaluation routine. In the special front end just described, the front end can contain a list of all the operators for which cases must be generated, and can mechanically generate the intermediate language for each case. However, the process can be further simplified if, as may often occur, the compile back end already contains a table of operator information. (For example, such a table may be used to check the correctness of the intermediate language graph generated by the front end.) It is then possible for the special front end to use this table, already provided by the back end, to determine which cases to be generated. TYPE DEFINITION The compiler of FIG. 1 uses a type definition module referred to as the GEM.sub.-- TD module. GEM.sub.-- TD provides the mechanisms used by a front end 20 and back end 12 in constructing program type information to be incorporated in an object module for use by a linker or debugger. It is intended that this type specification service will allow a front end 20 to describe program symbols and their associated type information to the object module builder 29 in a manner independent of target object file requirements. This type specification service acts as a procedural "grammar of types" so that the compiler may associate abstract type specifications and program symbols. The type specification interfaces are defined below, and a number of examples of the use of the GEM.sub.-- TD services are referenced. The creation of type information takes place in the context of symbol table 30 creation and allows a front end 20 to specify an abstract representation of program type information. The object module builder 29 will later use this information in constructing Debug symbol table information. The GEM.sub.-- TD module provides service routines that allows a front end 20 to describe basic types and derived types. These routines typically construct internal data structures describing the specified type information. A new compiler node type, GEM.sub.-- TDI, will be defined to manage this type information. The definition of the type node data structure is private to the compiler 12 and may not be altered or examined by the front end 20. When defining a type, the front end 20 is returned a "handle" to the type node by the GEM.sub.-- TD routine defining the type. The handle allows a front end to associate a type with a program symbol but prohibits it from altering or examining the fields of the data structure. Type nodes will be created and managed by scope, that is, when transmitting type information, a front end 20 will specify the block node that a type is to be declared within, and the shell will be responsible for the management of the type nodes within that scope. The shell will manage type nodes in a list rooted in the block node in which the type is defined. The block node data structure will be expanded to define the fields TYPE.sub.-- LIST.sub.-- HEAD and TYPE.sub.-- LIST.sub.-- TAIL. A front end 20 may choose to make on-the-fly calls to the type specification service routines or may choose to make a pass over the entire symbol table to generate the type information. After defining a type the front end must associate this type information with the symbols of that type. Symbol nodes will have a new field DST.sub.-- TYPE.sub.-- INFO used to associate a symbol with its type. A symbol's DST.sub.-- TYPE.sub.-- INFO field will contain the address of the type node handle returned by a GEM.sub.-- TD service. A symbol node with a DST.sub.-- TYPE.sub.-- INFO value of null will have the target specified behavior for symbols not having type information. Referring to FIG. 7, the data fields and relationships are illustrated for the function:
______________________________________
int toy.sub.-- procl)
float b,c;
.
.
.
}
______________________________________
A block node 60 for toy-proc contains fields 61 and 62 (decl list pointers) pointing to the entries 63, 64 and 65 in the symbol table 30. Also, it contains fields 66 and 67 functioning as type list pointers, pointing to the entries 68 and 69 in the type list for int and float. The entries 63, 64 and 65 also have pointers 70, 71 and 72 pointing to the entries 68 and 69, for int and float, as the case may be. The GEM.sub.-- TD type specification service consists of routines to allow a front end 20 to define standard and derived types and to associate those types with program symbols. The compiler back end 12 will use the resulting type definitions and their symbol node associations to generate target specified Debug Symbol tables. Note that boolean is not considered a basic type. Compilers for languages such as Pascal should define boolean as an enumeration containing the elements true and false. ACTION LANGUAGE FOR MULTIPASS CODE GENERATOR A method for doing code generation in the back end 12 by code generator 29 using one set of code templates will now be described, referring to FIG. 8. The selection and application of this one set of code templates occurs at four different times during the compilation process. 1. The PATSELECT phase does a pattern match, block 80, in the CONTEXT pass to select the best code templates. (During this pattern match the UCOMP and DELAY optimization tasks are done in parallel as part of the pattern matching process.) 2. The TNASSIGN and TNLIFE tasks of the CONTEXT pass use context actions of the selected templates to analyze the evaluation order to expressions and to allocate TNs with lifetimes nonlocal to the code templates, block 81. 3. The TNBIND pass uses the binding actions of the selected templates to allocate TNs with lifetimes local to the code templates, block 82. 4. Finally, the CODE pass uses code generation actions of the selected templates to guide the generation of object code, block 83. A template is used at different times during a compilation. It consists of three major components: 1. ILG Pattern--which guides the template selection process that matches templates to applicable ILG structures. 2. Undelayed Actions--which determine the processing of matched ILG structures during the CONTEXT, TNBIND and CODE passes. The undelayed actions are performed when the template is first processed in each pass. As a result, the template actions for each ILG node are processed three different times--once for each pass. Some of the actions will have meaning for only one pass and will be ignored in the other passes. Other actions will have meaning in more than one pass but the required processing will be different in each pass. 3. Delayed Actions--which also determine the processing of matched ILG structures during the CONTEXT, TNBIND and CODE passes. The delayed actions are performed each pass when the result computed by the template is first processed as the leaf of another template. Delayed actions are useful on target machines like a VAX that have address modes. Simple register machines like a RISC would probably not make heavy use of delayed actions. An ILG pattern of a code generation template consists of four pieces of information: 1. A result value mode (see the examples given in the Appendix) which encodes the representation of a value computed by the template's generated code. 2. A pattern tree which describes the arrangement of ILG nodes that can be coded by this template. The interior nodes of the pattern tree are IL operators; the leaves of the pattern tree are either value mode sets or IL operators with no operands. 3. A sequence of Boolean tests. All of these must evaluate to true in order for the pattern to be applicable. 4. An integer that represents the "cost" of the code generated by this template. The pattern matching or PATSELECT phase matches an ILG subtree with the pattern of a template. If more than one template pattern can be applied at an ILG node then the pattern matcher delays choosing between the alternative templates until it knows which one leads to the lowest estimated code cost. There are three different action interpreters--the CONTEXT interpreter, the TNBIND interpreter and the CODE interpreter. The actions of each template are performed in three different passes of the compiler by the appropriate interpreter. Although the identical template is used in all three passes, the semantics of the actions are phase dependent so that different things are done each pass. Many actions have meanings in only one of the three passes and they do nothing in the other two passes. Other actions have meanings in more than one pass but the semantics of an action in one pass are often very different from the semantics of the same action in a different pass. However, having only one action sequence in a template makes it very easy to understand and to maintain the dependencies between the various passes. The action sequence for each template consists of two parts--the undelayed actions and the delayed actions. When a pattern of selected ILG nodes is first processed the undelayed actions are interpreted. When the ILG pattern is later used as the leaf of another ILG pattern then the delayed actions are interpreted. At the start of interpreting the undelayed actions a table of operand variables is created. An operand variable can contain a temporary name (TN), a literal or a target specific address mode. Temporary names are each partitioned into one of three classes: (1) permanent TNs, (2) delayed TNs and (3) local TNs. The class of a TN is determined by its lifetime and usage. Each TN must have an allocation lifetime. The allocation lifetime is begun by the appropriate template action and extends along all flow paths leading to the last use of the TN. The TNs in the permanent class can have a lifetime that ends some arbitrarily large amount of code into the future after creation of the TN. The life of a delayed class TN must begin in a delayed action of a template and terminate shortly afterwards when the TN is used as a leaf. The life of a local TN never extends beyond the interpretation of a single pattern. The class of a TN determines how it is processed. Permanent class TNs are created once in the CONTEXT pass and the same TN data structure is kept through all three passes and is used to store the complicated lifetime description of the TN. Delayed class and local class TNs have lifetimes of very restricted duration so they do not need a permanent data structure to track this information. As a result, the TN data structure for delayed class and local class TNs are built each pass when interpreting the actions and deleted immediately after their last use in each pass. Interpreting the same action sequence in each pass guarantees identical TN data structures are built in each pass for TNs of these classes. There will be a large list of different template actions. Some of the actions will be target machine dependent. The Appendix contains a list of proposed or example template actions, so that a user can by these code template examples determine for a particular embodiment what will be needed. THE INTERMEDIATE LANGUAGE REPRESENTATION The internal representation used in the compiler framework 10 of FIG. 1 comprises the symbol table 30 and intermediate language graph 55, which are the data structures created by the front end 20 to represent the structure, data, and code of a source module 21. The following describes the nodes which are the primitive components of these data structures, including a specification of the symbol table 30 and intermediate language used in the IL graph 55. In a compiler as described with reference to FIG. 1, the front end 20 generates a symbol table 30 to describe the blocks, routines, variables, literal values, etc. of a program contained in source module 21, and one or more intermediate language graphs 55, to describe the executable code. The following describes these internal data structures. The design of the compiler of FIG. 1 in general, and of the intermediate language and symbol table in particular, is intended to address a variety of architectures ranging from "Complex Instruction Set Computers" (CISC) such as VAX to "Reduced Instruction Set Computers" (RISC) such as PRISM, MIPS (a 32-bit RISC machine), or an advanced 64-bit RISC architecture. This design does assume that the architecture of target machine 25 has certain basic features. First byte organization and addressability are assumed and Twos-complement binary arithmetic, with "Little-endian" bit ordering. "Reasonable" address representation is also assumed, i.e., that an address fits in a register. In general, the front end 20 can be oblivious to the details of the target architecture 25 when creating the intermediate representation of a program. Most constructs of the intermediate representation have a well-defined meaning which is independent of the target architecture 25. There are some issues that must be resolved in implementing the front end 20, however. First, not all data types will be available on all architectures, as explained below. Second, arithmetic overflow behavior and the representation of "small integer" arithmetic may vary on different architectures, again, as discussed below. Third, the behaviors of some operators (such as the arithmetic shift operators) are defined only for subranges of the operand values for which the underlying machine instructions are defined on particular architectures. For operand values outside this specified range, such operators may be well behaved for any particular machine, but may have different behaviors on different machines. Lastly, calling conventions will be different on different target systems 25, requiring the front end 20 to generate different intermediate representations for the same source language constructs in some cases. The phrase "Intermediate Language" refers to an abstract language for specifying executable code. An "Intermediate Language Graph" (ILG) 55 is a particular program expressed in this language. The intermediate language in graph 55 is really a language of data structures in memory, with pointers providing the syntactic structure. However, there is also an approximate textual representation for ILGs, used for IL dumps written by the compiler as a debugging aid. The primitive concept of the IL is the tuple as described above with reference to FIG. 4--an ILG 55 is made up of tuples 35 representing the operations to be executed. These tuples 35 are tied together by pointers (e.g., operand pointers 38) which represent various relations. The most important relations are the operator-operand relation (a pointer 38 from an operator to each of its operands) and the linear ordering on all the tuples in each basic block of the ILG, which provides a nominal execution order. This linear order is represented by the tuple number 40 within a block, and by the pointers linking all the blocks of a routine or module. The computation defined by an ILG 55 is as follows: (1) Start at the BEGIN tuple of the ILG. (2) Evaluate each tuple in linear order: fetch the saved results of its operands, compute and save its result, and perform any secondary action that may be defined for it. (There are exceptions to this simple evaluation rule for "flow boolean" and "conditional selection" operators.) (3) After evaluating a branch tuple, continue evaluation at the label tuple selected by that branch tuple. It should be understood that these rules define the "meaning" of an IL graph 55. The code generator 29 is allowed to rearrange the actions indicated by the ILG, so long as it preserves their dependencies, as specified by the following rules: (1) If the ILG 55 contains an expression, and a statement whose execution might affect the value computed by evaluating the expression, then the generated code for the expression and the generated code for the statement must be executed in the same order that the statement and the expression occurred in the ILG. (2) If the ILG 55 contains two statements whose execution might affect the value computed by evaluating some common expression, then the generated code for the two statements must be executed in the same order that the statements occurred in the ILG. The question of when the execution of a statement might affect the value computed by the evaluation of an expression is resolved by reference to the side effects mechanism described below. The ILG 55 constructed by the front end 20 is not the same as the ILG processed by the back end 12. The front end 20 generates a Compact IL Graph, while the back end 12 processes an Expanded IL Graph. When the back end 12 generates code for a routine, the first thing it does is to expand that routine's CILG into an EILG. The differences between the two forms are several. First, the CIL provides "shorthand" tuples, which are expanded into sequences of lower-level tuples in the EIL. Second, the nodes which represent EIL tuples have many more fields than the nodes which represent CIL tuples. The additional fields contain information which is used by the back end 12, but which can be computed by the IL expander (or by other back end phases) from the fields in the CIL nodes. Third, there are different structural restrictions on the CILG and the EILG. This description is directed to the compact IL, although this information generally pertains to both the CIL and the EIL. The structure of a symbol table 30 represents the structure of the module 21 being compiled. At the heart of the table 30 is a tree of block nodes representing the blocks, routines, and lexical scopes of the module 21; the tree structure represents their nesting relationship. Associated with each block node is a list of the symbol nodes which are declared in that block. Associated with each routine block is an ILG 55 representing the code for that routine. A symbol node represents a symbolic entity in the module, such as a variable, label, or entry point. Constant values in the module 21 being compiled are represented by literal nodes. Literal nodes may be referred both from the symbol table 30 and from ILGs 55. The term literal table is also used to refer to the collection of all literal nodes that have been created in a compilation. Frame nodes represent areas of storage in which code and data can be allocated. Generally, these are either the stack frames of routines or PSECTs. Parameter nodes are used to build parameter lists, which are associated with entry point symbols. Each parameter node relates a parameter symbol in a routine with a location in the argument list of an entry point. Data Types: The intermediate representation used in graph 55 describes a program for an abstract machine 25, which has only a small set of types, the data types which are described in the following list. These data types are distinct from the data types of the source language of module 21, which are relevant only to the front end 20. It is the responsibility of the front end 20 to determine, for each target machine 25, the data types to be used to represent each source language data type. Data Types Null Representational Scalar Address Signed Integer Unsigned Integer Floating Point Complex Boolean The null data type is a special data type, which is the type of tuples that do not compute a value. A representational data type is a type whose values have a specific representation in the target machine architecture. The representational data types are divided into scalar data types and aggregate data types. A scalar data type is one whose values can be represented in a small fixed number of memory locations or registers. The scalar data types are subdivided into the address data type and the arithmetic data types. Note that the arithmetic types may be used to represent any other kind of data than can fit in the appropriate number of bits. In particular, source language character and logical data types must be represented with integer data types. There is a single address data type, ADDR. A value of type ADDR is represented as a binary integer with 32 or 64 bits. There are signed integer data types INT8, INT16, INT32, and INT64, where a value of type INT.sup.x-1 is represented as a signed binary integer with .sup.x-1 bits, and is therefore in the range--(2.sup.x-1) . . . (2.sup.x-1). The type INT8 may also be referred to as IBYTE. The type INT16 may also be referred to as IWORD. The type INT32 may also be referred to as ILONG. The type INT64 may also be referred to as IQUAD. The integer type with the same number of bits as an address may also be referred to as IADDR. The largest signed integer type supported for the target architecture (INT32 or INT64) may also be referred to as IMAX. Any binary scaling (as in PL/I) must be provided by the front end--there are no IL provisions for a scaled binary data type. There are unsigned integer data types UINT8, UINT16, UINT32, and UINT64, where a value of type UINT.sup.x-1 is represented as a signed binary integer with .sup.x-1 bits, and is therefore in the range 0 . . . (2.sup.x -1). The type UINT8 may also be referred to as UBYTE or as CHAR8. The type UINT16 may also be referred to as UWORD or as CHAR16. The type UINT32 may also be referred to as ULONG. The type UINT64 may also be referred to as UQUAD. The unsigned integer type with the same number of bits as an address may also be referred to as UADDR. The largest unsigned integer type supported for the target architecture (UINT32 or UINT64) may also be referred to as UMAX. The floating point data types are the VAX floating point types, REALF, REALD, REALG, and REALH, and the IEEE floating point types, REALS, REALT, REALQ, and REALE. Not all of these will necessarily be supported on any particular target architecture. The complex data types are CMPLXF, CMPLXD, CMPLXG, CMPLXS, and CMPLXT. A complex value is represented as a pair of values of the corresponding real type, which represent the real and imaginary pans of the complex value. Only complex types which correspond to supported floating point types will be supported on a particular target architecture. A value of an aggregate data type consists of a sequence of contiguous elements. An aggregate value is characterized by its body, the actual sequence of elements, and length, the number of elements in the sequence. The aggregate types are: (a) Character strings, type STR8, which have elements of type CHAR8. (b) Extended character strings, type STR16, which have elements of type CHAR16. (c) Bit strings, type BITS, whose elements are single bits, packed as tightly as possible. (d) PL/I and COBOL decimal strings, type DECIMAL, whose elements are decimal digits (represented as four-bit BCD digits, packed two per byte, with a leading sign digit). (The DECIMAL value is characterized by its precision, the number of digits it contains (not counting the leading sign digit), and its scale, the number of those digits which are regarded as coming after the decimal point. The elements of an aggregate value are numbered starting at zero. (Note that this will require many front ends to subtract one when translating a source program string index to an IL string index.) There is no limit on the number of elements which may be processed in a string operation. A flag might be introduced in the future to allow the front end to indicate character string expressions whose lengths were guaranteed not to exceed 65535 characters, and which could therefore be computed efficiently with the VAX character string instructions.) The length word of a varying-length string in memory will still be only 16 bits. Decimal strings are limited to 31-digits (plus the sign digit) on all target architectures. An example of the details of the representational type system for the various target architectures is indicated in Table 6. There is a single Boolean data type, BOOL. This is the type of logical values computed during the execution of a program. It does not have a specified physical representation. For example, a Boolean value might be represented by the value of a binary integer, the value of a processor condition code, or the value of the processor program counter. In particular, type BOOL does not correspond to any logical or Boolean data types that may be present in a source language. These must be represented as INT or UINT values, and convened to and from type BOOL as necessary. The general features that are common to all tuples in the intermediate language, and the structural characteristics of ILGs 55 (routines in the intermediate language) will now be described. An ILG 55 is made up of IL tuple nodes (usually just called tuples). All tuples contain the fields listed in Table 7. Other fields, known as attributes, occur only in particular kinds of tuples. Unlike symbol table nodes, which may be allocated with an arbitrary amount of space reserved for use by the front end 20, CIL tuple nodes will contain only the fields specified here. EIL tuple nodes will contain additional fields, located at a negative offset from the tuple node address, which are private to the back end 12. Structure of the ILG One tuple in an ILG can refer to another tuple in two different ways: as an operand or as an attribute. When only the operator-operand relation is considered, a CILG is directed acyclic graph (DAG), while an EILG is a forest (i.e., a collection of trees). Attribute pointers 39 create additional structure on the ILG, and also allow references from the ILG to the symbol table 30. The most important structural relation is the linear order of the ILG, defined by the next tuple and prey tuple attribute pointers. All of the tuples in a CILG occur in a single list defined by the linear order. The tuples of an EILG occur in a collection of circular lists, one for each basic block. The following rules apply to the structure of an ILG. If a front end 20 creates a CILG which violates these rules, the results are unpredictable, although the back end will attempt, where convenient, to detect violations and terminate compilation: (a) A tuple whose result type is NULL is referred to as a statement tuple, and a tuple whose result type is not NULL is referred to as an expression tuple. (b) In the CIL: (i) A scalar or Boolean expression tuple may be an operand of one or more other tuples. An aggregate expression tuple must be used as an operand of exactly one other tuple, which must be in the same basic block (see below). (ii) An operand may be an expression tuple, a symbol node, or a literal node. (iii) A symbol node used as an operand always has type ADDR. A literal node used as an operand has the data type of the literal. (iv) A symbol representing a variable which is allocated to a register does not have an address, in the normal sense. However, such a symbol may be used as the address operand of a tuple which reads from or writes to memory (a FETCH or STORE), in which case the tuple will access the indicated register. (v) If a symbol represents a variable in a stack frame, then that stack frame must be associated with the current routine or one of its ancestors in the symbol table block tree; otherwise, there would be no way of finding the stack frame at execution time. (c) In the EIL operands must be expression tuples, and every expression tuple must be an operand of exactly one other tuple. (d) No statement tuple may be an operand of any other tuple. (e) A tuple which is an operand of another tuple must precede that tuple in the linear ordering of the ILG. (In an EILG, this means that the operand and the operator must occur in the same basic block.) (f) An expression tuple must dominate every tuple which it is an operand of. That is, it must be impossible to get from an entry point of a routine to a tuple without encountering every operand of that tuple on the way. Subsequent paragraphs in this section describe the sorts of operations that are available in the intermediate language and the operators that are used to represent them. The individual operators are all collected in a data structure called <REFERENCE>(part.sub.-- tuple.sub.-- dictionary), the tuple dictionary. Each operator in the dictionary is documented using a structured format. Table 8 discusses the main categories in this format, the information presented under each, and the format used to present the information. The format section of a tuple specifies the number of operands and the allowed operator, operand, and result types in a single line of the form: op.type(type-1, . . . ,type-n): result where op is the name of the tuple operator, and type specifies the allowable operator types. If ".type" is omitted, then the operator type must be NULL. Otherwise, type must be on eof the following: (a) A specific type name (ADDR, BOOL, BITS, IADDR, etc.) indicates that only the specified type is allowed. (b) INT, UINT, REAL, CMPLX, or STR indicates that any type belonging to the specified family is legal. For example, CMPLX means that CMPLXF, CMPLXD, CMPLXG, CMPLXS, and CMPLXT are all allowed; STR means that STR8 and STR16 are allowed. (c) ALL indicates that any type other than NULL is legal. (d) A string of the letters I, U, R, C, A, S, and B indicates that any type belonging to a family represented by one of the letters is allowed, as follows:
______________________________________
I INT A ADDR
U UINT S STR
R REAL B BITS
C CMPLX
______________________________________
The expressions Type-1, . . . ,Type-n" specify the allowable types of the tuple's operands. If the parenthesized list is omitted, then the operator takes no operands. Otherwise, the tuple must have one operand for each type in the list. Each type-i must be one of the following: (a) T means that the operand type must be the same as the operator type. (b) A specific type name (ADDR, BOOL, BITS, IADDR, etc.) means that the operand must have the specified type. (c) A string of the type code letters I, U, R, C, A, S, and B has the same meaning that it does for the type specifier. Note that operands with the type specifier IU, which means "any integer," are generally convened to type IMAX in the generated code. Program behavior is therefore undefined if the actual value of such an operand cannot be convened to type IMAX. (d) If the operator and operand type specifiers are REAL and CMPLX or STR and CHAR, then the actual operator and operand types must be consistent. For example, the type specification "CADD.CMPLX(T, REAL): T" indicates that the second operand must have type REALF if the operator type is CMPLXF,REALS if the operator type is CMPLXT, etc. If the operator type is SB, i.e., character string or bit string, and an operand type specifier is CHAR, then the operand type must be CHAR8 if the operator type is STR8, CHAR16 if the operator type is STR16, and IMAX if the operator type is BITS. That is, IMAX is treated as the character type corresponding to the string type BITS. The actual operands of the tuple must be tuple nodes whose result types are consistent with the types specified by the operand type list. In the CIL, they may also be symbol nodes, which are always treated as having type ADDR, or literal nodes, which are treated as having the types specified by their data type fields. The expression "Result" specifies the allowable result types. If it is omitted, then the operator is a statement operator and the tuple's result type must be NULL. Otherwise, it is interpreted exactly the same way as the operand type specifiers. Addresses and Memory References An address expression is one of the references in the intermediate language. The simplest form of address expression is a symbol. That is, an operand field of a tuple node may contain the address of a symbol node, to represent the memory address (or the register) associated with that symbol. An address value can also be obtained by fetching it from memory (a "pointer variable"), by casting an arithmetic value, or by evaluating a preincrement tuple, a postincrement tuple, or one of the tuples of the following list:
______________________________________
Address Computation Operators
Operator Meaning
______________________________________
AMINUS Subtracts an integer from an address to yield a
new address.
APLUS Adds an integer to an address to yield a new
address.
BASEDREF Evaluates the address to yield a new address.
LITADDR Yields the address of a read-only memory location
containing a specified literal value.
UPLINK Yields the address of the stack frame for the
current routine or a routine that
contains the current routine.
______________________________________
A data access tuple is a tuple which causes a value to be loaded from or stored into memory. (The word "memory" here includes registers in a register set of the target CPU 25. The only difference between a register and a normal memory location of the CPU 25 is that the "address" of a register can only be used in a data access tuple.) The data access operators are listed in Table 9. In every data access tuple, the first operand is an address expression. Every data access tuple also has an offset attribute which contains a longword integer. The address of the memory location to be accessed is the sum of the run-time address operand and the compile-time constant offset attribute. All data access tuples will have some or all of the attributes listed in Table 10. The uses of the effects, effects2, and base symbol attributes are discussed in more detail below in the section Interface for Representing Effects. Another type of reference is the Array Reference. The APLUS and AMINUS tuples are sufficient for all address computations. However, they do not provide any information about the meaning of an address computation. In particular, they don't provide any information about array references and subscript expressions that might have been present in the source code. This information is needed for vetorization. Therefore, the IL has tuples which specifically describe array references. For example, given a BLISS vector declared as local X: vector[20,long], a reference to .X[.I] could be represented as
______________________________________
$1: FETCH.INT32(I);
$2: SUBSCR.IADDR($1, [4], [0]; POSITION = 1);
$3: FETCH.INT32(X, $2);
______________________________________
Given a Pascal array declared as var Y:packed array [1 . . . 10, 1 . . . 10] of 0 . . . 255, an assignment Y[I, J]:=Z could be represented as
______________________________________
$1: FETCH.INT32(J);
$2: SUBSCR.IADDR($1, [1], [0]; POSITION = 1);
$3: FETCH.INT32(I);
$4 SUBSCR.IADDR($3, [10], $2; POSITION = 2);
$5 FETCH.UINT8(Z);
$6 STORE.UINT8($4-11, $5);
______________________________________
The basic array reference operators are AREF and SUBSCR. AREF yields the address of a specified element in an array. SUBSCR computes the offset of an array element. The first operand or an AREF tuple is an address expression representing the base address of the array, and its second operand is a SUBSCR tuple which computes the byte offset from the base address to an element of the array. The AREF tuple adds the value of the SUBSCR tuple to the base address to compute the address of the indexed element. In fact, the code for AREF(origin, subscript) is identical to the code for APLUS(origin, subscript). A SUBSCR tuple computes the offset of an element along one dimension in an array. Its operands are: (a) The element index. Individual indices in a subscript expression are not normalized for a zero origin. Instead, an origin offset to account for non-zero lower bounds in the array declaration should be added into the address operand of the AREF tuple or the offset field of the tuple that uses the e | ||||||
