Method of replacing lvalues by variables in programs containing nested aggregates in an optimizing compiler5710927Abstract A method for analyzing and optimizing programs that define and use aggregate data structures. A program to be analyzed and optimized is inspected to find definitions and uses of lvalues, which are regions of memory. The lvalues may be denoted by program variables, pointer expressions, or components of aggregate lvalues. A data-flow solver determines where definitions of lvalues reach uses. A set of "least general unifiers" are computed for the definitions and uses. A replacement variable is created for each least general unifier that is determined to be replaceable. Each reference to an lvalue that corresponds to a replaceable least general unifier is replaced by a reference to the corresponding replacement variable or a component thereof. The method is applicable even in the presence of potential aliasing. Claims What I claim is: Description REFERENCES
______________________________________
LUB(LGU1,LGU2)
if LGU1 is IRREPLACEABLE or LGU2 is IRREPLACEABLE then
return IRREPLACEABLE
else if nodes pointed to by LGU1 and LGU2 have a common ancestor
return pointer to nearest common ancestor of the nodes
pointed to by LGU1 a
else return IRREPLACEABLE
______________________________________
The invention employs a recursive function MAKE.sub.-- SUBREGION that returns an expression denoting an lvalue. The function MAKE.sub.-- REGION takes as arguments nodes NODE1 and NODE2 such that NODE2 is identical to a descendant of NODE1 in REGION.sub.-- CANDIDATE.sub.-- FOREST, and given lvalue LVALUE1, returns an lvalue that solves the analogy "NODE1 is to NODE2 as LVALUE1 is to .sub.-- ?".
______________________________________
MAKE.sub.-- SUBREGION(NODE1,NODE2,LVALUE1)
if NODE1==NODE2 then
return LVALUE1
else
return component denoted by NODE2 of lvalue
MAKE.sub.-- SUBREGION(NODE1,PARENT(NODE2),
LVALUE1)
______________________________________
The invention employs a binary operation on groups called "UNIFY.sub.-- GROUP".
______________________________________
UNIFY (GROUP1,GROUP2)
Merge GROUP1 and GROUP2 into new group G.
Set LGU(G) := LUB( LGU(GROUP1), LGU(GROUP2) )
if IS.sub.-- LIVE(GROUP1)=TRUE OR IS.sub.-- LIVE(GROUP2)
Set IS.sub.-- LIVE(G) := TRUE;
else
Set IS.sub.-- LIVE(G) := FALSE;
______________________________________
Those skilled in the art will recognize that the merging of groups can be accomplished efficiently by the classical "union-find" algorithms. The preferred embodiment employs a flow-graph representation of a program in which edges of the graph represent transfer of control, evaluation of expressions, and/or creation/destruction of local variables. Vertices correspond to labels and unlabeled points of execution between statements. This kind of flow-graph departs from common practice primarily in that side-effects are represented as edges instead of vertices. This kind of flow-graph is not necessary for the invention, but simplifies minor details of its preferred embodiment. FIG. 2 shows an example C program and the corresponding flow-graph. Notice that the test "if(x<y)" becomes three edges in the flow-graph: one for the evaluation the expression "x<y", and two more corresponding to taking the "true" and "false" arms of the if-then-else. Also notice that not only are there edges for assignment expressions, but also edges for the creation and destruction of variable z. FIG. 3 shows an overview of the steps applied by the invention to a program and its associated flow-graph. The application of all these steps is considered a single "pass" of the invention. These steps are as follows. 1. Find all candidate definitions. These are assignments or initializations of lvalues. The candidate definition may contain side-effects other than the assignment or initialization itself. 2. Build the region-forest CANDIDATE.sub.-- REGION.sub.-- FOREST for the candidates. Begin by setting CANDIDATE.sub.-- REGION.sub.-- FOREST to empty. Each candidate is of the form L:=R, where the left-hand side L is an lvalued-expression and the right-hand side R. is an rvalued-expression. Add a node corresponding to L to the CANDIDATE.sub.-- REGION.sub.-- FOREST. Add any extra nodes that are necessary to connect the new node to the root of a region-tree. This may require adding a new root node to CANDIDATE.sub.-- REGION.sub.-- FOREST. 3. Refine the region-forest CANDIDATE.sub.-- REGION.sub.-- FOREST to the granularity of the smallest components of definitions that are potentially used. This step proceeds by scanning the program for loads and/or stores of components whose outermost containing lvalue is already represented in CANDIDATE.sub.-- REGION.sub.-- FOREST. If the tree does not yet have a node corresponding to the loaded/stored lvalue, a node is added to the tree. Adding the node requires adding intermediate nodes to connect it with the rest of the tree when the node is not a child of an existing node. The components of each lvalue L that correspond to the leaves in the final tree are called the "leaf-components" of L. 4. Construct and solve IS.sub.-- SAME problems for left-hand-sides of candidates. An IS.sub.-- SAME problem is the following question. Given an rvalued expression E, and two points x and y in the flow-graph, if expression E is evaluated at point y, does it yield the same rvalue as evaluation of the expression at the most recent encounter of point x would have yielded? If x and y are identical, then "most recent encounter of point x" is taken to mean the previous encounter. An answer to the IS.sub.-- SAME problem is denoted IS.sub.-- SAME(E,x)(y), where IS.sub.-- SAME(E,x)(y) returns a boolean value TRUE or FALSE. If E is the address of an lvalue expression L, then if IS.sub.-- SAME(E,x)(y)=FALSE, the support of L is said to have "collapsed". To solve the IS.sub.-- SAME problems, apply any kind of data-flow solver. The method of solving the IS.sub.-- SAME problems is not part of the present invention; only the application of the solution of the IS.sub.-- SAME problems is part of the invention. 5. Construct and solve the REACHES problems. Construct the REACHES problems as follows. For each definition D (of the form L:=R), there are one or more REACHES problems. There is one problem for each leaf in the tree REGION.sub.-- CANDIDATE.sub.-- SET that corresponds to a component or sub-component of L. Let L.c denote a component or sub-component of L, where L is the left-hand-side of a definition. A REACHES problem is the question: Let vertex h be the head of the edge corresponding to definition D. When vertex y is encountered, does the lvalue assigned to L.c by the most recent encounter of vertex h ever reach vertex y (even if the support of L changes)? The answer to a REACHES problem is denoted REACHES(D,c)(y), where REACHES(D,c)(y) returns a boolean value TRUE or FALSE. If REACHES(D,c)(y)=TRUE, the definition of D.c is said to "reach" vertex y. The answer to a REACHES problem is independent of the IS.sub.-- SAME problem for the left-hand-side. That is, if a value is assigned to L.c and the support of L collapses before vertex y is encountered, but the value assigned to the original L.c is still there when vertex y is encountered, then REACHES(D,c)(y)=TRUE nonetheless. To solve the REACHES problems, apply any kind of data-flow solver. The method of solving the REACHES problems is not part of the present invention; only the application of the solution of the REACHES problems is part of the invention. 6. Find least-general unifiers for the definitions. (a) Set initial LGUs as follows. For each component c of each definition D, set GROUP(D.c):=D.c. That is, the initial group for a component of the left-hand-side is the component itself. Set LGU(GROUP(D.c)):=NODE(D.c). Set IS.sub.-- LIVE(GROUP(D.c)):=false. Set REPLACEMENT(GROUP(D.c)):=NONE. (b) Consider each load or call in the program. For each call C, do step (c). For each load of an lvalue L, do steps (d)-(e). Steps (d)-(e) are shown in FIGS. 4 and 5. c) Update the LGUs to reflect the possible effects of the call as shown in FIG. 6. Select all components D.c of definitions that reach the point of call and might be modified by the call. For each such component D.c, set LGU(GROUP(D.c)):=IRREPLACEABLE and set IS.sub.-- LIVE(GROUP(D.c)):=TRUE. The method of determining which components might be modified by a call is not part of the invention. Only the aforementioned analysis of such components is part of the invention. (d) Consider the set of candidate definitions that reach the lead of lvalue L. If any definition reaches via an alias, or if the lead U possibly uses a value that is not from a candidate definition, then set its LGU to IRREPLACEABLE. Set LIVE(LGU(D.c)):=TRUE for all reaching components. (e) As shown in FIG. 8, inspect each definition D for the following two conditions: (i) The definition modifies a memory-region that may be loaded by a lead of lvalue L. (ii) The support of the definition may collapse before a lead of lvalue L is executed. If both conditions hold, then apply process MARK-IRREPLACEABLE-IF-LIVE to mark all components D.c that reach the lead of L as irreplaceable. Process MARK-IRREPLACEABLE-IF-LIVE is shown in FIG. 7. 7. Account for lvalues that are stored into before ("live-in") or loaded after ("live-out") the program is executed. Such lvalues cannot be replaced. For each definition component D.c that is live-in or liveout, set LGU(GROUP(D.c)):=IRREPLACEABLE. If the definition component D.c is live-in, set IS.sub.-- LIVE(GROUP(D.c)):=TRUE. 8. Create replacement variables. Note that up to this step, REPLACEMENT(G)=NONE for any group G. For each component D.c of each definition D, apply the following steps as shown in FIG. 9. Let G=GROUP(D.c). If IS.sub.-- LIVE(G)=TRUE and LGU(G) is not IRREPLACEABLE and REPLACEMENT(G)=NONE, then create a new variable V of type TYPE(LGU(G)) and set REPLACEMENT(G)=V. 9. Traverse the program, looking for references (loads and/or stores) to lvalues that are represented in REGION.sub.-- CANDIDATE.sub.-- TREE. Each reference may be a lead, store, or both (load-and-store). An example of a reference that is both a lead and a store is the reference to "i" in the C-language expression "++i". When replacing the lead within a load-and-store reference, the invention is careful to retain the original store, and vice versa. For example, when replacing the lead in "++i" by "j", the resulting expression is "i=j+1", not simply "++j". For each such reference that is a load, store, or load-and-store of an lvalue L, perform the following steps, as shown in FIG. 10: (a) If lvalue L is separable and there exist two components L.c1 and L.c2 of lvalue L such that GROUP(L.c1) and GROUP(L.c2) differ, then rewrite the reference as references to the separate components and perform steps (b)-(k) for each separate reference. Otherwise perform steps (b)-(k) for the whole load reference to lvalue L. (b) If the reference is a load from L (or a load-and-store), go to step (c). Otherwise continue to step (g). (c) If there is a definition component D.c that reaches the load of lvalue L and is referenced by the load of L, then go to step d; otherwise go to step g. (d) Set G:=GROUP(D.c) (e) If REPLACEMENT(G)=NONE, then consider the reference no further. Otherwise go to step f. (f) Replace the reference to L by a reference to the replacement variable REPLACEMENT(G) or appropriate subregion thereof. Both cases can be handled by replacing the reference to lvalue L by MAKE.sub.-- SUBREGION(LGU(G),NODE(L),REPLACEMENT(G)). Continue to step g. (g) If the reference is a store (or load-and-store) into lvalue L, and has not yet been processed by steps (h)-(k), go to step (h). Otherwise consider the reference no further. (h) If the store into L is a store that corresponds to a store into a component D.c executed by a definition D, then go to step (i); otherwise consider the reference no further. (i) Set G:=GROUP(D.c). (j) If IS.sub.-- LIVE(G) is TRUE, go to step (g). Otherwise, continue to step (k). (k) The store is not used and should be deleted from the program. Delete the store and consider the reference no further. It is emphasized that the particular methods for solving the data-flow problems in steps 4 and 5 are not part of the invention, and those skilled in the art will recognize that a variety of methods can be applied to the solution of the aforementioned data-flow problems. The invention concerns only the proposition of the problems and the application of their solution. Applying multiple passes of the invention to analyze and transform again may yield results superior to a single pass in cases where replacement of lvalues by variables improves the accuracy of alias-analysis and data-flow analysis. Best results are obtained by iteratively applying passes until no non-trivial replacements occur. A replacement is trivial if it is the replacement of all definitions and uses of an lvalue by a single variable. The following modification allows more replacements to be made. Omit step 7 from the process described above. Instead, after step 9, perform the following step to compensate for the omission of step 7: 10. For each replaced lvalue that was live-in, insert extra compensating assignments at the beginning of the program being optimized that assign the live-in value to the replacement variable. For each such replaced lvalue that was live-out, insert extra compensating assignments at the end of the program being optimized that assign the replacement variable's value to the value. The following additional modification to the present invention allows more replacements to be made in the presence of loops. The modification is to omit step 7 and include step 10, and analyze each loop (or nest of loops) separately. Doing so sometimes allows further replacements of lvalues inside a loop,when replacement of the lvalue would otherwise be prohibited by conditions outside of the loop. I explain below the deficiencies of prior art and why the present invention is superior. The prior art of Cytron et al for scalar renaming merely replaces scalar variables with other scalar variables. Its restriction to variables essentially means that it considers only lvalues whose support is empty, and thus the answers to the IS.sub.-- SAME problems would be trivially TRUE. Its restriction to scalars essentially means that it considers only lvalues whose corresponding region-tree would consist of a single trivial root node. In contrast, the present invention replaces arbitrary lvalues, which may be components of aggregates, and/or referenced by arbitrary expressions, and (possibly) may have aliases. The prior art of Budge et al does not use data-flow analysis. It cannot do any replacement if a structure has its address taken. In contrast, the present invention can replace structures whose address is taken, when the taking of the address is irrelevant to an opportune replacement. For example, consider the following C++ program fragment.
______________________________________
struct Complex {
float re, im;
Complex a;
capture.sub.-- address(&a);
// Capture address of "a"
for( int i=0; i<10; i++ ) {
a.re = a.re+3.0;
// Def. 1.
a.im = a.im+4.0;
// Def. 2.
a.re = 2.0*a.re;
// Def. 3.
a.im = 2.0*a.im;
// Def. 4.
}
use.sub.-- address();
// Use captured address
______________________________________
The prior art of Budge et al can do no replacement, because the address of structure "a" has been captured by function "capture", thus allowing structure "a" to be accessed by other functions. However, the present invention can apply replacement to definitions (1) and (2), because it performs data-flow analysis to prove that the address of "a" cannot be used when these definitions are live. Applying the present invention to the example above yields the code below, in which replacement variables "temp1" and "temp2" have been introduced.
______________________________________
struct Complex {
float re, im;
Complex a;
float temp1, temp2;
capture.sub.-- address(&a);
for( int i=0; i<10; i++) {
temp1 = a.re + 3.0;
temp2 = a.im + 4.0;
a.re 2.0*temp1;
a.im = 2.0*temp2;
}
use.sub.-- address();
______________________________________
Furthermore, the prior art of Budge et al considers only lvalues that are local variables. The present invention considers lvalues that are not necessarily local variables, and performs scalar-replacement where applicable. For example, consider the following fragment.
______________________________________
struct Complex {
float re, im;
Complex z;
Complex * p = some.sub.-- address();
// Set p to address of a Complex object.
(*p).re = 5; // Set "re" component to 5.
z = ((*p).re); // Use "re" component
(*p).re = 10; // Reset "re" component to 10.
______________________________________
Notice that the address returned by function some.sub.-- address could have an alias; that is, some other pointer-variable could contain the same address. Such an alias would not prevent the present invention from doing the replacement shown above. The prior art of Budge et al can do nothing with this example. In contrast, the present invention replaces the first set of references to the lvalue (*p).re by a scalar variable (named "temp" in the example). The present invention succeeds because it performs data-flow analysis and recognizes that the definition "(*p).re=5" is used only by the assignment to z.
______________________________________
struct Complex {
float re, im;
Complex * p = some.sub.-- address();
Complex temp = 5;
z = temp;
(*p).re = 10;
______________________________________
Furthermore, the prior art of Budge et al. and Cytron et al. do not work with unions. The present invention replaces references to fields of unions by scalar variables where possible. In this respect the present invention is limited only by how well the REACHES problem is solved for unions. For example, consider the following fragment.
______________________________________
union Cell {
// Storage for re, i, and car overlap.
float re;
int i;
struct {
Cell * car;
Cell * cdr;
} cons;
};
Cell a, b;
b.car = &a;
// Definition #1
use(b.car)
b.re = 3.14;
// Definition #2
______________________________________
If the solver of the REACHES problem is capable of indicating that definition #1 does not reach past definition #2, then the present invention determines that the LGU for b.car in the first definition is replaceable and transforms the example into the code below.
______________________________________
union Cell {
float re;
int i;
struct {
Cell * car;
Cell * cdr;
} cons;
};
Cell a, b;
Cell * temp = &a;
// Definition #1
use(temp)
b.re = 3.14;
// Definition #2
______________________________________
Furthermore, the prior art of Budge et al. unnecessarily breaks structure assignments up into scalars in cases where the present invention does not. For instance, consider the following example in which some components are never used.
______________________________________
struct Pair {
struct Hairy {
float a, b, c, d, e, f, g, h;
} x, y;
} r, s, t;
r.x = some.sub.-- value();
// Set r.x
s.x = r.x
t.x = s.x;
use( t.x ); // Use the value of t.x
______________________________________
The prior art of Budge et al., before attempting replacement, breaks the structure assignments "s.x=r.x" and "t.x=s.x" into multiple scalar assignments before replacement, and thus would yield code similar to that shown below, with separate temporaries u0 . . . u7 for each scalar component of "s.x".
______________________________________
r.x = some.sub.-- value();
// Set r.x
u0 = r.x.a;
u1 = r.x.b;
u2 = r.x.c;
u3 = r.x.d;
u4 = r.x.e;
u5 = r.x.f;
u6 = r.x.g;
u7 = r.x.h;
t.x.a = u0;
t.x.b = u1;
t.x.c = u2;
t.x.d = u3;
t.x.e = u4;
t.x.f = u5;
t.x.g = u6;
t.x.h = u7;
use( t.x ); // Use the value of t.x
______________________________________
In contrast, the present invention, by analyzing least-general-unifiers, determines that s.x need not be decomposed, and transforms the example into the simpler code shown below.
______________________________________
r.x = some.sub.-- value();
// Set r.x
u = r.x
t.x = u;
use( t.x ); // Use the value of t.x
______________________________________
There are two benefits of avoiding unnecessary decomposition of structures (and, more generally, other kinds of aggregates). First, it makes analysis faster because there are fewer components to analysis individually. Second, the resulting machine code can become bloated when all structure assignments are converted into series of scalar assignments. Finally, another benefit of the present invention is that it does not spend machine resources analyzing components of structures that are implicitly present but not used. For instance, consider the following example. float A›1000!; A›42!=x; y=A›42!; Only the component A›42! of the array A is explicitly referenced. When the invention analyzes the example, the region-forest CANDIDATE.sub.-- REGION.sub.-- FOREST will contain a node corresponding to A›42!, but will not contain nodes for the other 999 elements of A. Consequently, the invention can replace the references to A›42! by a scalar variable, without wasting time and space by individually analyzing the 999 implicit components of A.
|
Same subclass Same class Consider this |
||||||||||
