Method to determine dynamic compilation time and to select bytecode execution mode6546550Abstract To perform efficient execution of a bytecode by combining an interpreter and a compiler. At a time of a bytecode execution by an interpreter, if an instruction to be executed is a backward conditional branch instruction, it is determined whether the backward conditional branch instruction is a back edge of a loop. And if it is determined the instruction is a back edge of a loop, the number of the loop iteration is estimated and stored into a storage. A bytecode execution mode is selected according to the estimated number of the loop iteration. This execution mode comprises the modes of immediately compiling a method including a loop, and having the interpreter execute a bytecode. Claims What is claimed is: Description DETAILED DESCRIPTION OF THE INVENTION
address bytecode
5 (head of a loop)
. . . (a process in a loop)
17 iinc 0 1 /*increase variable i by 1*/
20 iload_0 /*variable i to a stack*/
21 sipush 1000 /*1000 to a stack*/
24 if_icmplt 5 /*in the case of (i<1000), to address 5*/
There are several types of patterns of such a bytecode sequence according to description of a loop on a source code. They are detected by performing pattern matching backward from a conditional branch instruction so as to promptly determine whether it is a loop. If a bytecode once determined to be a loop turns out not to be a loop in fact, as mentioned later, an error code is returned from JIT compiler and, in this case, it can be executed by an interpreter. 2. If a possibility of being a loop is high, the number of the loop iteration from now on is estimated from operands of bytecodes. This should also be estimated in as short a time as possible. In the above-mentioned example, the number of the loop iteration can be estimated as 999 times from second operand (1) of iinc, operand (1000) of sipush and the current value of i. In the case of branching forward, in the case that a branch instruction is not taken, and in the case that it does not match any pattern, modification of a bytecode mentioned later in section 1.3 is implemented. Moreover, the estimated number of the loop iteration is stored in memory. 1.2: Change of an Operation Mode An operation mode from now on is determined according to the number of the loop iteration estimated as above. For instance, if an estimated number of the loop iteration is n, and the threshold values for changing an operation mode are N1<N2<N3<N4, an operation mode is changed as follows: (1) If n.ltoreq.N1, nothing is done, and it is compiled according to the number of method calls. (2) If N1<n.ltoreq.N2, a variable holding the number of method calls are manipulated so as to be compiled with fewer method calls than usual. (3) If N2<n.ltoreq.N3, a variable holding the number of method calls are manipulated so as to be compiled with a next method call. (4) If N3<n.ltoreq.N4, a hot spot compilation is immediately performed, and if possible, it is executed with a compiled code from a next iteration of the loop. (5) If N4<n, a hot spot compilation is immediately performed at a higher optimization level, and if possible, it is executed with a compiled code from a next iteration of the loop. As above, as to (1) to (4), a compilation timing is either determined or adjusted. In (5), two or more optimization levels may be determined. A "hot spot compilation" referred to here is to create a code, in addition to an ordinary compilation, which has an entry point from the head of this loop. JIT compiler creates an entry point to the head of a method by compiling the entire method, and additionally creates a code for entering at the head of a designated loop and returns it to an interpreter. At this time, if the designated part turns out not to be a loop, JIT compiler returns an error code to the interpreter. In response to a return code from JIT compiler, the interpreter executes with a compiled code from a next iteration if a hot spot compilation is successful. If not, it is executed by the interpreter. 1.3: Modification of a Bytecode Since a loop does not need to be checked twice, a bytecode instruction already checked should be modified to be a bytecode instruction not to check. At this time, it is also possible to modify it to be a different instruction depending on whether a conditional branch was taken or not taken in order to help optimization of JIT compiler. In the above example, a bytecode of if_icmplt is modified to be "if_icmplt_taken" or "if_icmplt_nottaken" which is defined by using an unused bytecode. JIT compiler can generate an optimized code by using this dynamic information. In addition, even if this conditional branch instruction is executed again with an interpreter, it can handle it with an interpretation routine optimized to whether it was taken or not taken respectively. The following explains an actual example, namely a case where the present invention is incorporated in Java.TM. interpreter/JIT compiler. In other words, while there are several methods for composing a loop in the Java.TM. language, for, while, do while are of high frequency of occurrence. Here it is explained by taking an example of Java.TM. bytecode compiler (Java.TM.c) which is incorporated in a JDK (Java.TM. Development Kit) of Sun Mircosystems, Inc. 2.1: Identification of a Loop and Estimation of the Number of the Loop Iteration Of loop control structures describable in the Java.TM. language, the following is covered as one which is identifiable as to whether a loop or not at a low cost, predictable as to the number of the loop iteration, and frequently used in general: Structure: for((process0));(condition);(increase and decrease of loop variable)) { (process 1) } and (increase and decrease of a loop variable) ends by * increasing or decreasing a loop variable by a value of a constant such as i++, i-+2, and (condition) is * a comparison of a loop variable and a constant such as i<1000, i!=0, * a comparison of a loop variable and a variable not changeable in a loop such as i<j, i!=j, and * a comparison of a loop variable and a static value or a field value such as i<m i<this.m. Even loops described in a while sentence and a do-while sentence are covered as long as represented in the above form A back edge of such a loop is compiled as a bytecode shown in the following Table 1, if (condition) is a comparison with 0 according to Java.TM.c. Here, boldface is an operation code, and opr means operand.
TABLE 1
type 1 iinc opr opr iload.sub.-- x ifxx
type 2 iinc opr opr iload opr ifxx
A bytecode shown in Table 1 can be identified by using state transition as shown in Appendix.A in the following Table 2.
TABLE 2
Appendix.A
Notation used in the following is described.
(state X) : Name of state.
{. . . . . } : What is known when transitioned to the state.
look-ahead pc[-x] : Check a bytecode of x byte before a conditional
branch instruction.
opcode1: : If a result of look-ahead pc[-x] is opcode1,
process the right of:.
<verify type x> ? (state X) : (state Y)
: Verify simply whether an instruction pattern is
type x of a bytecode pattern, and if OK, transit to
(state X), and if not, transit to (state Y).
(state 0) { } : look-ahead pc [-1]
iload.sub.-- x: <verify type 1> ? (is type 1) : (state 1)
others: (state 1)
(state 1) {pc[-1] != opcode} : look-ahead pc[-2]
iload: <verify type 2> ? (is type 2) : (not a loop)
others: (not a loop)
On the other hand, if (condition) is a comparison with anything other than 0, the above-mentioned back edge of a loop is compiled by a Java.TM.c as a bytecode shown in Table 3 below.
TABLE 3
type 1 iinc opr opr iload_x bipush
opr if_xx
type 2 iinc opr opr iload opr bipush
opr if_xx
type 3 iinc opr opr iload_x sipush opr
opr if_xx
type 4 iinc opr opr iload opr sipush opr
opr if_xx
type 5 iinc opr opr iload_x ldc_quick
opr if_xx
type 6 iinc opr opr iload opr ldc_quick
opr if_xx
type 7 iinc opr opr iload_x ldc_w_quick opr
opr if_xx
type 8 iinc opr opr iload opr ldc_w_quick opr
opr if_xx
type 9 iinc opr opr iload_x
iload_y if_xx
type 10 iinc opr opr iload opr
iload_y if_xx
type 11 iinc opr opr iload_x iload
opr if_xx
type 12 iinc opr opr iload opr iload
opr if_xx
type 13 iinc opr opr iload_x getstatic_quick opr
opr if_xx
type 14 iinc opr opr iload opr getstatic_quick opr
opr if_xx
type 15 iinc opr opr iload_x aload_0 getfield_quick opr
opr if_xx
type 16 iinc opr opr iload opr aload_0 getfield_quick opr
opr if_xx
A bytecode shown in Table 3 certainly has an operation code two to three bytes before if_xx, so they can be identified by using state transition as shown in Appendix.B in the following Table 4.
TABLE 4
Appendix.B
See Appendix.A as to notation used in the following.
(state 0) { } : look-ahead pc [-2]
bipush: (state 1)
ldc.sub.-- quick: (state 3)
iload.sub.-- x: <verify type 9> ? (is type 9) : (state 7)
iload: (state 5)
others: (state 7)
(state 1) {pc[-2] = bipush} : look-ahead pc[-3]
iload.sub.-- x: <verify type 1> ? (is type 1) : (state 2)
others: (state 2)
(state 2) {pc[-2] =bipush,pc[-3]!=opcode}:look-ahead pc [-4]
iload: <verify type 2> ? (is type 2) : (state 7)
others: (state 7)
(state 3) {pc[-2] = ldc.sub.-- quick} : look-ahead pc[-3]
iload.sub.-- x: <verify type 5> ? (is type 5) : (state 4)
others: (state 4)
(state 4) {pc[-2] = ldc.sub.-- quick,pc[-3] ! = opcode} : look-ahead pc[-4]
iload: <verify type 6> ? (is type 6) : (state 7)
others: (state 7)
(state 5) {pc[-2] - iload} : look-ahead pc[-3]
iload.sub.-- x: <verify type 11> ? (is type 11) : (state 6)
others: (state 6)
(state 6) {pc[-2] = iload,pc[-3] ! = opcode} : look-ahead pc[-4]
iload: <verify type 12> ? (is type 12) : (state 7)
other: (state 7)
(state 7) {pc[-2]! = opcode} : look-ahead pc[-3]
sipush: (state 8)
ldc.sub.-- w.sub.-- quick: (state 10)
iload: <verify type 10> ? (is type 10) : (not a loop)
getstatic.sub.-- quick: (state 12)
getfield.sub.-- quick: (state 14)
others: (not a loop)
(state 8) {pc[-3] = sipush} : look-ahead pc[-4]
iload.sub.-- x: <verify type 3> ? (is type 3) : (state 9)
others: (state 9)
(state 9) {pc[-3] = sipush,pc[-4]! = opcode} : look-ahead pc[-5]
iload: <verify type 4> ? (is type 4) : (not a loop)
others: (not a loop)
(state 10) {pc[-3] = ldc.sub.-- w.sub.-- quick} : look-ahead pc[-4]
iload.sub.-- x: <verify type 7> ? (is type 7) : (state 11)
others: (state 11)
(state 11) {pc[-3] = ldc.sub.-- w.sub.-- quick,pc[4]! = opcode} :
look-ahead pc[-5]
iload: <verify type 8> ? (is type 8) : (not a loop)
others: (not a loop)
(state 12) {pc[-3] = getstatic.sub.-- quick} : look-ahead pc[-4]
iload.sub.-- x: <verify type 13> ? (is type 13) : (state 13)
others: (state 13)
(state 13) {pc[-3] = getstatic.sub.-- quick,pc[-4]! = opcode} : look-ahead
pc[-5]
iload: <verify type 14> ? (is type 14) : (not a loop)
others: (not a loop)
(state 14) {pc[-3] = getfield.sub.-- quick} : look-ahead pc[-5]
iload.sub.-- x: <verify type 15> ? (is type 15) : (state 15)
others: (state 15)
(state 15) {pc[-3] = getfield.sub.-- quick,pc[-5]! = opcode} : look-ahead
pc[-6]
iload: <verify type 16> ? (is type 16) : (not a loop)
others: (not a loop)
If it is found that a possibility of being a loop is high, the number of the loop iteration is estimated next. For instance, in the case of type 3 in the above example, it is easily acquired by taking an increment of a loop variable from an operand of iinc, the current value of a loop variable held by a local variable from an operand of iload_x, and a value to be compared with a loop variable from an operand of sipush. Moreover, if it matches a pattern up to immediately before iinc instruction, it is found likely to be a back edge of a loop though the number of the loop iteration cannot be estimated. So, as a default operation, for instance, a variable holding the number of method calls is manipulated so that this method will be compiled when it is called next time. 2.2: Overhead in a Branch Instruction with No Possibility of Being a Back Edge In the case of Pentium (a trademark of Intel Corp.), an interpretation routine of if.sub.c-- icmplt before adopting the present invention is as shown in Appendix.C in the following Table 5, and it takes nine cycles if a condition is taken and five cycles if not taken. As opposed to this, an interpretation routine of if_icmplt after adopting the present invention is as shown in Appendix.D in the following Table 6, and it takes ten cycles if a forward conditional branch is taken and five cycles if not taken. Consequently, it is clear that, if the present invention is adopted, it takes overhead of only one cycle at a time of a forward conditional branch.
TABLE 5
Appendix.C
An interpretation routine of if_icmplt and execute cycles on a Pentium
processor before adopting the present invention are as follows.
entry_if_icmplt:
; cycle-1
cmp edx,eax ; Compare v1 and v2
jge SHORT if_icmple_not_tken
; If v1 >= v2, the condition
; is not taken
; if_icmplt_taken: ; (A process when a condition
; is taken)
; cycle-2
mov ch,BYTE PTR [esi+1] ; ch = a first operand of
; if_icmplt
; cycle-3
mov cl,BYTE PTR [esi+2] ; cl = a second operand of
; if_icmplt
; cycle-4
shl ecx, 16
; cycle-5
sar ecx, 16 ; ecx = a branch offset
; cycle-6
mov ebx,DWORD PTR [esi+ecs-1] ; Load a word including
; a next bytecode.
; cycle-7
and ebx, 0000FF00H ; Mask leaving an operation
; code.
add esi, ecx ; Update bytecode program
; counter.
; cycle-8
add ebx, L0_HandlerStart_ ; Calculate an address of a
; next bytecode
; interpretation routine.
; cycle-9
jmp ebx ; Jump to a next bytecode
; interpretation routine.
if_icmplet_not_taken: ; (A process when a condition
; is not taken)
; cycle-2
mov ebx,DWORD PTR [esi+3-1] ; Load a word including a
; next bytecode.
; cycle-3
and ebx,0000FF00H ; Mask leaving an operation
; code.
add esi,3 ; Update bytecode program
; counter.
; cycle-4
add ebx,L0_HandlerStart_ ; Calculate an address of a
; next bytecode
; interpretation routine.
; cycle-5
jmp ebx ; Jump to a next bytecode
; interpretation routine.
It is call check_backedge (check if it is a back edge of a loop) of cycle-7 in if_icmplt_taken_backward: (a process at a backward conditional branch) that should be noted here. This process not only checks whether it is a back edge of a loop but also includes a process of manipulating a variable holding the number of method calls in (2) and (3) of 1.2 Change of an Operation Mode. Moreover, as shown in cycle-8, the estimated number of the loop iteration is stored in a storage. Advantages of the Invention As it is clear from the above description, according to the present invention, detection of a method including a loop and estimation of an effect of compiling the method are performed at an execution time, and an execution mode is selected based on the estimation so as to allow processing speed to be significantly increased. In addition, if it is organized so that an execution history of a conditional branch instruction is recorded, optimization of a compiler is also facilitated. DESCRIPTION OF THE SYMBOLS
1: Server computer
3: Network
5: Client computer
10: Java .TM. source code
12: Java .TM. compiler
14: Bytecode
52: Java .TM. VM
54: Java .TM. interpreter
56: Java .TM. JIT compiler
58: Machine code
60: Garbage collector
53: OS
55: Hardware (including CPU and memory)
|
Same subclass Same class Consider this |
||||||||||
