Method and system for target register allocation6321379Abstract A computer-based method and system for allocating target registers to branch operations and for determining the location of target definitions for the branch operations within a computer program. The target register allocation system of the present invention allocates a target register to be specified by each branch operation. The target register is to contain the address of the target that is loaded by the target definition. The target register allocation system determines a location in the computer program for a target definition such that whenever the branch operation is executed, the allocated target register contains the address of the target of the branch. The target allocation system may determine the location to be in a dominator block of the branch operation. The target allocation system may also determine the location a target definition so that the address of the target that is loaded by the target definition can be used by multiple branch operations. The target allocation system may also determine the location of the target definition based on execution frequency of locations. The target allocation system may, when a branch operation is in a loop, determine the location of the target definition to be outside the loop. The target allocation system may, when the program is a function, give preference to a non-callee save register in allocating a target register. The target allocation system may give preference to a callee save register of a function whose invocation is located in between the determined location and the location of the branch operation on a path of execution when allocating a target register. Claims What is claimed is: Description TECHNICAL FIELD
TABLE 1
LIVE RANGES
Field Description
target definition ("d") target definition for this live range
first_op first operation in the range of
operations that include all
operations in "branches"
last_op last operation in the range of
operations that include all
operations in "branches"
target target block for this live range; a
target can also be a called function,
a function return point, or a
computed location of a "switch"
statement
branches set of all branches in this live
range that use the target register
family pointer to the family that includes
this live range
FIG. 7 is a flow diagram of an example routine that implements the TRA system. This routine first numbers each block in the function that is to be allocated target registers. The routine then loops selecting each block in reverse order and establishing the location and target register for the target definition for each branch. Finally, the routine updates the target definitions and adds operations to save and restore any callee save register used by the function. In step 701, the routine invokes the number_operations routine, which numbers the operations in each block. In steps 702-711, the routine loops selecting and processing each block starting with the last block. In step 702, the routine selects the next previous block starting with the last block. In step 703, if all the blocks have already been selected, then the routine continues at step 712, else the routine continues at step 704. In step 704, if the selected block is the last block in a loop, then the routine continues at step 705, else the routine continues at step 706. In step 705, the routine invokes the process_last_block_in_loop routine, which initializes a value for the loop that indicates the callee save register of the functions that are called in the loop. In step 706, if the selected block is a loop preheader, then the routine continues at step 707, else the routine continues at step 708. In step 707, the routine invokes the process_loop_preheader routine, which extends live ranges, as appropriate, to encompass the loop so that the target definition can be placed in the preheader block. In step 708, if the selected block has a branch operation, then the routine continues at step 709, else the routine continues at step 710. In step 709, the routine invokes the start_live_range routine passing the selected block and the branch. The start_live_range routine creates a live range and adds it to a family of live ranges that have the same target and can possibly share the same target definition. In step 710, the routine invokes the coalesce_live_ranges routine, which coalesces the live ranges in the same family that are dominated by the selected block. In step 711, the routine invokes the process_calls_in_block routine and loops to step 702 to select the next previous block. The process_calls_in_block routine creates a live range for the return register and the call register for each call to a function in the selected block. Steps 712-713 are performed after all the blocks have been processed. In step 712, the routine invokes the adjust_target_register_for_families routine, which sets the target registers of the target definitions and branches. In step 713, the routine invokes the process_callee_save_registers routine to add save and restore operations for the callee save registers of the function, as appropriate, and then completes. FIG. 8 is a flow diagram of an example implementation of a number_operations routine. This routine assigns a sequential number to each operation within each block and records the sequential number assigned to the last operation in each block. This routine loops selecting each block and each operation within each block. In step 811, the routine sets the index i, which represents the sequential number, is equal to zero. In step 812, the routine selects the next block starting with the first block. In step 813, if all the blocks have already been selected, then the routine returns, else the routine continues at step 814. In step 814, the routine selects the next operation in the selected block starting with the first. In step 815, if all the operations have already been selected, then the routine continues at step 817, else the routine continues at step 816. In step 816, the routine sets the sequential number ("loc") of the selected operation to the index i, increments the index i, and loops to step 814 to select the next operation. In step 817, the routine sets the last operation ("block_last_op") within the selected block equal to the index i minus 1 to record the number of the last operation in the block and loops to step 814. FIG. 9 is a flow diagram of an example implementation of a process_last_block_in_loop routine. This routine is invoked whenever the last block in a loop is encountered. This routine initializes the call limit (i.e., minimum number of callee save register used by a function that is called in the loop) for the loop depth of the passed block and records the number of the last operation in the passed block so that a live range can be extended to that last block in the loop. In step 901, the routine increments the loop depth ("loop_depth"). In step 902, the routine sets the call limit for this loop depth ("call_limit(loop_depth)") equal to the available number of target registers ("MAX_LIVE") since no calls to functions within the loop have been encountered yet. In step 903, the routine records the number of the last operation in this loop ("loop_last_op(loop_depth)") as the last operation in the passed block ("block_last_op (block)") and then returns. FIG. 10 is a flow diagram of an example implementation of a process_loop_preheader routine. This routine extends any live ranges that can encompass the loop. This routine is passed the block that is the preheader of a loop. In step 1001, the routine invokes the extend_loop_live_ranges routine to extend any live ranges. In step 1002, the routine decrements the loop depth ("loop_depth") because a loop is being exited. In step 1003, if the preheader is within an outer loop ("loop_depth>0"), then the routine continues at step 1004, else the routine returns. In step 1004, the routine adjusts the call limit for the outer loop ("call_limit(loop_depth)") to the minimum of the call limit of the outer loop depth and of the call limit of the inner loop ("call_limit(loop_depth+1")) whose preheader was just processed. This adjustment is made so that the minimum number of callee save registers for the inner loop is reflected in the outer loop. The routine then returns. FIG. 11 is a flow diagram of an example implementation of the coalesce_live_ranges routine. This routine coalesces the live ranges for each active family that is dominated by the passed block. The coalescing represents those live ranges by a single live range with its target definition in the passed block. In step 1101, the routine selects the next set of live ranges ("s") such that all of the live ranges ("l") in the set are in the dominator list ("dom_list") for the passed block and all the live ranges in the set are in the same active family. In step 1102, if all such sets have already been selected, then the routine returns, else the routine continues at step 1103. In step 1103, the routine invokes the coalesce routine passing the passed block and the selected set to coalesce the live ranges in the selected set into a single live range. The routine then loops to step 1101 to select the next set. FIG. 12 is a flow diagram of an example implementation of a process_calls_in_block routine. This routine creates a live range for the return register and the call register for each call to a function in the passed block. This routine attempts to assign active families to callee save registers of the called function so that their contents survive the call. In step 1201, the routine selects the previous operation in the passed block starting with the last operation. In step 1202, if all the operations in the passed block have already been selected, then the routine returns, else the routine continues at step 1203. In step 1203, if the selected operation is a call, then the routine continues at step 1204, else the routine loops to step 1201 to select the next previous operation. In step 1204, the routine invokes the process_call routine passing the selected operation to create a live range for the return register. In step 1205, the routine invokes the start_live_range routine passing the block and the selected operation to create a live range for the call register. The routine then loops to step 1201 to select the next previous operation. FIG. 13 is a flow diagram of an example implementation of a adjust_target_registers_for_families routine. After all the blocks have been processed, this routine sets the target register for each target definition and for each branch in all the families. In step 1301, the routine selects the next family ("f"). In step 1302, if all the families have already been selected, then the routine returns, else the routine continues at step 1303. In step 1303, the routine selects the next live range ("l") of the selected family ("f.range"). In step 1304, if all the live ranges have already been selected, then the routine loops to step 1301 to select the next family, else the routine continues at step 1305. In step 1305, the routine sets the target register for the target definition of the selected live range ("l.d") and sets the branch instructions for the selected live range ("l.branches") to use the target register of the selected family ("f.tr"). The routine then loops to step 1303 to select the next live range in a family. In one embodiment, the callee save registers may not be the lowest number target register. If not, this routine maps the calculated target register to the actual target register. FIG. 14 is a flow diagram of an example implementation of a process_callee_save_registers routine. This routine inserts operations to save and restore the callee save registers that a family uses so that the function that is having its target registers allocated will adhere to the callee save register convention. In step 1401, the routine selects the next callee save register. In step 1402, if all the callee save registers have already been selected, then the routine returns, else the routine continues at step 1403. In step 1403, if a family is assigned to the selected callee save register, then the routine continues at step 1404, else the routine loops to step 1402 to select the next callee save register. In step 1404, the routine adds operations to save and restore the selected callee save register and loops to step 1402 to select the next callee save register. FIG. 15 is a flow diagram of example implementation of a start_live_range routine. This routine is passed an indication of a block and of a branch. This routine creates a live range for the passed block and branch. The routine sets the initial location of a target definition for the branch within the passed block, initializes the bounds of the live range to the location of the branch, and adds the live range either to an active family with the same target or, if none exists, to a newly created family. In step 1501, the routine creates a live range ("l") for the passed branch and creates a target definition ("l.d"). In step 1502, if the branch is a computed branch, the routine determines a computed branch location ("avail") before which the target definition cannot be placed. In step 1503, the routine sets the first ("l.first") and the last ("l.last") operation in the live range equal to the location of the passed branch operation ("loc(branch)"). The target definition is located in the block that contains the first operation. In step 1504, if an active family has a target that is the same as the passed branch, then the newly created live range can be added to that active family and the routine continues at step 1506, else a new active family is needed and the routine continues at step 1505. In step 1505, the routine invokes the create_new_active_family routine. In step 1506, the routine invokes the update_active_family routine to add the newly created live range to the active family. In step 1507, if the passed block has a dominator ("dom(block)!=NULL") and the dominator is after the computed branch location ("block_last_op(dom(block)>avail"), if any, then the routine continues at step 1508, else the routine returns. In step 1508, the routine adds the live range ("l") to the dominator list of the dominator block ("dom_list(dom(block))"), so that when that dominator block is encountered, the live range can be extended to include the dominator block, and the routine then returns. FIG. 16 is a flow diagram of an example implementation of a create_new_active_family routine. This routine creates a new active family and assigns a target register to it. The routine sets a family to inactive if all target registers are in use, so that one target register is available to be assigned to the newly created active family. In step 1601, the routine creates a new family ("f"). In step 1602, the routine indicates that this family is not bound to a certain target register ("f.physical"). In step 1603, the routine sets the limit for this new family ("f.limit") equal to the largest available target register ("MAX_LIVE"), which means that any target register can be assigned to this family. The limit is adjusted based on the callee save register of functions called within the live range. In step 1604, if all the target registers are in use by active families, then the routine continues at step 1605, else the routine continues at step 1606. In step 1605, the routine invokes the spill_cheapest routine to ensure that one target register is not in use by an active family. In step 1606, the routine selects the largest numbered target register ("tr") that is not assigned to an active family. The selection of the largest numbered target register results in non-callee save registers being assigned before callee save registers. In step 1607, the routine sets the target register for the new family ("f.tr") to the selected target register ("tr") and sets the pointer to the next active family ("f.next") to point to the family to which the selected target register was last assigned ("last_def(tr)"). In step 1608, the routine sets the last family defined for the selected target register ("last_def(tr)") to the newly created family. In step 1609, the routine sets the new family to active and returns. FIG. 17 is a flow diagram of an example implementation of the update_active_family routine. This routine is passed a family ("f"), a live range ("l"), and a block in which the live range ends. This routine adds the passed live range to the active family. The routine coalesces any live ranges in the active family that are dominated by the passed block because they can all share the same target definition that is placed in the passed block. In step 1701, the routine increases the cost of the family ("f.cost)" by the frequency of the passed block ("freq (block)"). In step 1702, the routine adds the created live range ("l") to the family ("f.ranges"). In steps 1703-1709, the routine coalesces all live ranges in the passed family that are dominated by the passed block. In step 1703, the routine selects a next live range in the family that is also in the dominator list for the passed block ("dom_list(block)"). In step 1704, if all such live ranges have already been selected, then the routine returns, else the routine continues at step 1705. In step 1705, the routine adds the branches of the selected live range ("x.branches") to the set of branches of the passed live range ("l.branches"). In step 1706, the routine sets the last operation of the passed live range ("l.last") to the greater of the last operation of the passed live range and the last operation of the selected live range. In step 1707, the routine removes the selected live range from the family. In step 1708, the routine decrements the cost of the family ("f.cost") based on the execution frequency of the block that contained the target definition of selected live range ("freq(x.d)"). In step 1709, the routine deletes the definition for the selected live range ("x.d") and loops to step 1703 to select the next live range. FIG. 18 is a flow diagram of an example implementation of the extend_loop_live_range routine. This routine extends, as appropriate, the live ranges of those active families whose last operation is within the loop. If the target register for the family is not a callee save register for each function called in the loop or the next family that uses that target register starts in the loop, then the routine tries to find another target register that is such a register. Otherwise, the routine extends the family to encompass the loop. If the routine can find no such other target register, then the routine sets a family to inactive and repacks the registers. If, however, such a target register can be found, then the routine assigns that found target register to the family and extends the family to encompass the loop. In step 1801, the routine sets the end of the loop ("end") to the last operation at that loop depth ("loop_last_op(loop_depth)"). In step 1802, the routine selects the next active family ("f") in order of decreasing cost ("f.cost"). In step 1803, if all the active families have already been selected, then the routine returns, else the routine continues at step 1804. In step 1804, if the last operation in the selected family is not within the loop ("last(f).gtoreq.end"), then the live range already includes the loop and the routine loops to step 1802 to select the next active family, else the routine continues step 1805. In step 1805, the routine calculates the limit for the target registers ("limit") for the selected family as the minimum of the limit for the family ("f.limit") and the call limit of this loop ("call_limit(loop_depth)"). In step 1806, the routine determines whether the target register for the selected family needs to be changed. The target register needs to be changed if the target register is greater than the calculated limit or if the next live range in the family extends into the loop. In step 1806, if the target register for the selected family ("f.tr") is greater than the calculated limit or the first operation of the next family ("first (f.next)") and is less than the last operation in the loop ("end"), then routine continues at step 1807, else the routine continues at step 1809. In step 1807, the routine invokes the find_new_target_register routine. In step 1808, if the invoked routine indicates to extend the range, then the routine continues at step 1809, else the routine loops to step 1802 to select the next active family. In steps 1809-1810, the routine extends the live ranges in the selected family to encompass the loop. In step 1809, the routine sets the limit for the family ("f.limit") to be the calculated limit. In step 1810, for each live range of the selected family ("f.ranges"), the routine sets the last operation of the live range ("l.last") equal to the last operation in the loop ("end") and loops to step 1802 to select the next active family. FIG. 19 is a flow diagram of an example implementation of the find_new_target_register routine. This routine is passed a family, a limit for the target registers, and the end of the loop. The routine finds a suitable target register for the family, if possible. In step 1901, the routine looks for a suitable target register ("tr"). A target register is suitable if it is a callee save register for each called function in the live ranges of the family ("tr<=limit") and either the target register has not yet been assigned to an active family ("last_def(tr)==NULL") or is defined outside the loop ("first (last_def(tr))>end") and that family is not active or that family has the least cost. In step 1902, if no suitable target register exists or if the passed family is bound to a specific target register ("f.physical==TRUE"), then the routine continues at step 1903, else the routine continues at step 1905. In step 1903, the routine sets the passed family to inactive because no available target register is suitable, In step 1904, the routine invokes the repack register routine passing an indication of the target register of the passed family ("f.tr)" and returns a flag indicating that the passed family is not to be extended. In steps 1905-1909, the routine switches the passed family to the found target register. In step 1905, the routine sets the family, if any, that was last assigned to the found target register to inactive. In step 1906, the routine unassigns the passed family from its currently assigned target register ("last_def(f.tr)=f-next"). In step 1907, the routine sets the passed family to point to the family last assigned to the found target register ("f.next=last_def(tr)"). In step 1908, the routine links the passed family onto the list of families for the found target register ("last_def(tr)=f"). In step 1909, the routine sets the target register of the passed family to the found target register ("f.tr=tr") and returns with a flag indicating that the passed family is to be extended. FIG. 20 is a flow diagram of an example implementation of the coalesce routine. This routine coalesces the live ranges in a set that are dominated by a block. This routine is passed the block and the set of live ranges. In step 2001, the routine calculates the total cost for the live ranges in the passed set ("cost=.SIGMA.freq(l.d)"). In step 2002, if the calculated cost is greater than the frequency of the passed block, then a reduction in cost would be achieved by moving the target definition into the passed block and the routine continues at step 2003, else the routine continues at step 2011 to extend the live ranges in the set to the passed block if the passed block dominates the live range. In steps 2003-2019, the routine coalesces the live ranges in the passed set. In step 2003, the routine selects a live range ("l") from the passed set. In step 2004, the routine decrements the cost of the family of the selected live range by the block that contains the target definition of the live range and increments that cost by the frequency of the passed block. The routine moves the target definition for the selected live range ("l.d") to the passed block and sets the first operation of the selected live range ("l.first") to equal the last operation of the passed block ("block_last_op(block)"). In steps 2005-2009, the routine loops selecting the other live ranges from the passed set and coalesces them into the selected live range. In step 2005, the routine selects another live range ("x") from the passed set to be coalesced. In step for 2006, if all the other live ranges have already been selected, then the routine continues at step 2010, else the routine continues at step 2007. In step 2007, the routine removes the live range to be coalesced ("x") from the ranges for its family ("x.family.ranges"). The routine also decrements the cost of the family of the selected live range by the frequency of the block that contains the live range to be coalesced. In step 2008, the routine adds the branches of the live range to be coalesced to the branches of the selected live range ("l.branches+=x.branches") and sets the last operation for the selected live range to be the maximum of the last operation for the selected live range and the live range to be coalesced. In step 2009, the routine deletes the definition for the live range to be coalesced and removes the live range to be coalesced from the set, and loops to step 2009 to select another live range. In step 2010, the routine decrements the cost of the family of the selected live range by the cost for the live ranges that were coalesced minus the frequency of the passed block ("l.family.cost-=cost-freq(block)"). In step 2011, if there is a dominator for the passed block ("dom(block).noteq.NULL"), then the routine continues at step 2012, else the routine returns. In steps 2012-2015, the routine loops extending the live ranges in the set. In step 2012, the routine selects the next live range in the passed set. In step 2013, if all the live ranges in the passed set have already been selected, then the routine returns, else the routine continues at step 2014. In step 2014, if the last operation of the dominator of the passed block is greater than the location of the operation that sets the computed branches ("f.avail"), then the selected live range can be added to the dominator list for the dominator of the passed block. The routine then continues at step 2015, else the routine loops to step 2012 to select the next live range in the passed set. In step 2015, the routine adds the selected live range to the dominator list of the dominator of the passed block and loops to step 2012. FIG. 21 is a flow diagram of an example implementation of the process_call routine. This routine is invoked when a call is encountered and ensures that the return register (e.g., register 1) is available to store the call return address. The routine creates a live range for the return register and a family for that live range. The routine then checks whether any active families use a target register that is not a callee save register for the called function. If so, the routine tries to find a suitable callee save register for the family. If none can be found, then the routine sets that family to inactive and repacks the registers. If, however, one can be found, then the routine assigns that suitable callee save register to the family. In step 2101, the routine sets the family to which the return register was last assigned to inactive. In step 2102, the routine creates a new live range ("l") for the return register and creates a new family ("f"), assigns the target register 1, which is the return register in one embodiment, and sets the family to be constrained to its target register. In steps 2103-2109, the routine loops processing each active family. If an active family is using a target register that is not a callee save register of the called function, then the routine tries to find a callee save register that the active family can use. If no such target register is found, then the routine sets the family to inactive and repacks the registers. If such a target register is found, then the routine sets the family that was last assigned that found register to inactive and switches the selected active family to use the found register. In step 2103, the routine selects the next active family in order of decreasing cost. In step 2104, if all the active families have already been selected, then the routine returns, else the routine continues at step 2105. In step 2105, if the target register for the selected family is greater then the number of callee save registers ("NUMM_CALLEE_SAVE") for the called function, then the routine attempts to find a suitable target register for the selected active family and the routine continues at step 2106, else the routine continues at step 2114. In step 2106, the routine looks for a suitable target register ("tr") that is a callee save register for the function being called and for which no family has been assigned that register ("last_def(tr)"==NULL") or the last assigned family does not overlap the selected family ("first(last_def(tr))>last(f)") and, if active, that family has the least cost. In step 2107, if no such target register exists or if it is bound to a certain target register ("f.physical==TRUE"), then the routine continues at step 2108, else the routine continues at step 2110. In step 2108, the routine sets the selected family to inactive. In step 2109, the routine invokes the repack_register routine passing the target register of the selected family. The routine then loops to step 2103 to select the next active family. In steps 2110-2113, the routine assigns the suitable target register to the selected family. In step 2110, the routine sets the family to which the suitable target register ("last_def(tr)") was assigned, if any, to inactive. In step 2111, the routine unlinks the selected family from its current target register ("last_def(f.tr)=f.next"). The routine also sets the target register for that selected family to the found target register ("f.tr=tr"). In step 2112, the routine links the selected family into the list of families for the found target register. In step 2113, the routine sets the limit for the family to the number of callee save registers ("NUM_CALLEE_SAVE"). In step 2114, if the call is in a loop, then the routine continues at step 2115, else the routine loops to step 2103 to select the next active family. In step 2115, the routine sets the limit for the current loop ("call_limit(loop_depth)") to the minimum of the limit for the current loop and the number of callee save registers. The routine then returns to select the next active register. FIG. 22 is a flow diagram of an example implementation of the spill_cheapest routine. In step 2201, the routine selects the active family with the lowest cost. In step 2202, the routine sets the selected family to inactive. In step 2203, the routine invokes the repack_register routine passing the target register of the selected active family and returns. FIG. 23 is a flow diagram of an example implementation of the repack_register routine. This routine determines if families assigned lower numbered registers can be assigned to higher numbered registers starting at the passed register. This routine assumes that the lower numbered registers are the callee save registers of the function whose target registers are being allocated by the TRA system. The routine selects those families whose live ranges are earlier in the function than any of the live ranges of a family that uses a higher numbered target register and assigns those families to the higher numbered target register. In step 2301, the routine records the first operation in the live ranges for the family that was last assigned the passed register ("start=first(last_def(reg))"). Families whose last operation is less than recorded value can be assigned to the passed register. In step 2302-2310, the routine loops processing each lower numbered register. In step 2302, the routine selects the next lower numbered register ("r") starting with the next lower numbered register than the passed register. In step 2303, if all such registers have already been selected, then the routine returns, else routine continues at step 2304. In step 2304, the routine selects the end operation ("end") such that all families assigned to the selected target register with their last operation less than the end operation and are not constrained to a certain register ("f.physical==FALSE") and the limit (of callee save register of called functions within the family) of the family is greater than or equal to the passed register ("f.limit.gtoreq.reg"). In step 2305, the routine selects the family ("f0") most recently assigned to the selected target register ("last_def(r)"). In step 2306, if there are some such families of the selected target register ("last(f0))<end"), then some families can be reassigned to the passed register and the routine continues at step 2306, else the routine loops to step 2302 to select the next lower numbered register. In step 2307, the routine selects the last family ("f") that can be moved. In step 2309, the routine unlinks those families that can be moved from the selected register and links them into the front of the linked list of families for the passed register, the routine sets the last defined family for the selected register to point to the next family of family f1 ("last_def(r)=f1.next"), sets the next family of family f1 to be the last defined family for the passed register ("f1.next=last_def(reg)"), and sets the last defined family for the passed register to be family f0 ("last_def (reg)=f0"). In step 2310, the routine recursively invokes the repack_register routine passing the selected register ("r") to repack lower numbered registers into the selected register and then returns. Tables 5-8 illustrate the processing of a sample function by the TRA system. Table 5 contains the pseudo code for the function. The blocks are labeled B1-B8. Block B2 has two parts--B2a and B2b --to represent the call within the block. The frequency column contains the execution frequencies of each block relative to the first block. The dominator column indicates the dominator block of each block. The target requirements column indicates branches that require target definitions. The example assumes that there are 3 target registers, that there are 2 callee save registers, and that the return register is register 1.
TABLE 5
PSEUDO CODE
Pseudo Code Frequency Dom Target Requirements
B1: S1 1.0 N/A
do{
B2a: S2 8.0 1 call/call return
call f;
B2b: S3
while( )
B3: S4 1.0 2 branch to B6
if( )then
B4: S5 0.5 3 branch to B8
if( )goto B8;
B5: S6 0.25 4
endif
B6: S7 0.75 3 branch to B8
if( )then
B7: S8 0.375 6
endif
B8: S9 1.0 3
Table 6 illustrates the live ranges and family that are created when processing the pseudo code.
TABLE 6
LIVE RANGES
Name Branches Target Physical? Family
lr1 b6 B8 no 1
lr2 b5 B8 no 1
lr3 b3 B6 no 2
lr4 b2 B2a no 3
lr5 call ret B2b yes(1) 4
lr6 call f no 5
Table 7 describes the processing performed by the TRA system for each block.
TABLE 7
ACTION OF THE ALGORITHM ASSUMING
NUM_CALLEE_SAVE = 2 AND MAX_LIVE = 3
Block Action
8 no action
7 no action
6 create live range lr1 and family f1
f1.tr = 3
active={f1}
dom_list(3) = {lr1}
5 no action
4 create live range lr2 and add to existing family f1
f1.ranges = {lr1,lr2}
active={f1}
dom_list(3) = {lr1,lr2}
3 create live range lr3 and family f2
f2.tr = 2
active = {f1,f2}
dom_list(2) = {lr3}
coalesce live ranges in dom_list(3):
cost of b3 = 1, total cost of {lr1,lr2} is 1.25
so move definition of lr1 to block 3
remove definition of lr2 and remove lr2 from f1
lr1.branches = {b6,b4}
dom_list(2) = {lr1,lr3}
2 start a new loop
loop_depth = 1
loop_last(1) = end of block 2
call_limit(1) = MAX_LIVE
create a live range lr4 and family f3
f3.tr = 1
active = {f1,f2,f3}
dom_list(1) = { lr4 }
no coalescing since the cost of block 2 is 8
dom_list(1) = {lr1,lr3,lr4}
create a live range lr5 and family f4 for call return. The return
address register needs to be made available. The cheapest element of
the
active set is removed. f1 and f2 have equal cost, assume f1 is
removed.
Family f3 is moved to target 3 and
f3.next = f1
f4.tr = 1
f4.physical = yes
active = {f2,f3,f4}
dom_list(1) = {lr1,lr3,lr4,lr5}
callee-save requirement is enforced
Consider families f4,f3,f2 in that order.
F4 (in 1) is already in a callee. (f4.limit = 2)
f3 (in 3) is not and has higher cost than f2 so
remove f2 from the active set and move f3 to
target
register 2. (f3.tr=2, f3.limit=2)
active={f3,f4}
call_limit(1) = 2 /* for inner loop */
create a live range lr6 and family f5 for the call to
function f.
f5.tr = 3
active = {f3,f4,f5}
dom_list(1) = {lr1,lr3,lr4,lr5,lr6}
1 a loop header: attempt to extend all active live
ranges to include the entire loop. In cost order, consider
families
f5,f4,f3. There are no target registers available in the range of
1..call_limit(1)=2 so remove family 5 from the active set. Both
families
3 and 4 can be extended to the entire loop. lr4.last = last(1)
lr5.last =
last(1)
loop_depth = 0
active = {f3,f4}
attempt to coalesce. only live ranges lr4 and lr5
are still active and both have cost greater than the
current
block so the definitions of these live ranges are moved into block 1.
Table 8 illustrates the final placement of the target definition and assignment of the target register.
TABLE 8
REGISTERS USED
Pseudo Code Target Definition
B1: S1
tr1 = B2b
tr2 = B2a
do {
B2a: S2
tr3 = f
call f tr3, tr1
B2b: S3
while( ) tr2
B3: S4
tr3 = B8
tr2 = B6
if( ) then tr2
B4: S5
if( ) goto B8; tr3
B5: S6
endif
B6: S7
if( ) then tr3
tr3
B7: S8
endif
B8 S9
From the foregoing it will be appreciated that, although specific embodiments of the invention have been described herein for purposes of illustration, various modifications may be made without deviating from the spirit and scope of the invention. Accordingly, the invention is not limited except as by the appended claims.
|
Same subclass Same class Consider this |
||||||||||
