Method and system for identifying instrumentation targets in computer programs related to logical transactions6327700Abstract A method and system for identifying sets of instructions within a computer program, execution of which serve as an indicator for processing of a transaction by the computer program and that together comprise a witness set. The witness set may be employed to monitor execution of the computer program and detect processing of the transaction. Witness sets are constructed by iteratively filtering an initial set of instructions based on profile data collected during execution of the computer program. Claims What is claimed is: Description TECHNICAL FIELD
TABLE 1
instruction opcode # arguments arguments
jsr 04 1 dest
ret 02 1 dest
mov 10 2 dest, source
cmp 20 2 source2, source1
br 30 1 dest
bne 36 1 dest
add 6A 2 dest, source
sub 6B 2 dest, source
The assembly language version of the routine "routeRequest" includes the following types of instructions, described above, in Table 1: (1) "jsr," an instruction that calls the routine specified by an address in a single destination argument; (2) "ret," a return-from-subroutine instruction that directs program execution to a calling subroutine specified by an address in a single destination argument; (3) "mov," a data transfer instruction that moves the value specified by a source argument into a memory location specified by a destination argument; (4) "cmp," an instruction that compares values specified by two source arguments; (5) "br," a branch instruction that directs execution to an address specified by a single destination argument; (5) "bne," a conditional branch instruction that directs execution to an address specified by a single destination argument in the case that execution of the most recent "cmp" instruction has determined that the two values supplied to the "cmp" instruction as arguments are not equal; (7) "add," an instruction that adds a value specified by a source argument to a value specified by a destination argument; and (8) "sub," an instruction that subtracts the value specified by a source argument from the value specified by a destination argument. The hypothetical assembly language used for the assembly language version of the routine "routeRequest" specifies general machine registers in two characters, the character "r" followed by a single decimal digit. A special program counter register is specified as "pc" and a special stack pointer register is specified as "sp." The simple assembly language in FIG. 2 employs three modes of memory location addressing: (1) direct addressing, where the address is specified by an alphanumeric label, such as "L1" or "_fetchEmployee;" (2) register-based addressing, where the contents of a specified register contain the address; and (3) relative addressing, notationally specified as a decimal value offset prefixing a parenthesized register, such as "8(sp)," specifying a memory location with an address 8 bytes greater than the address stored in the register "sp." Literal values are indicated in the assembly language by a "#" prefixing a decimal representation of the value. Note that much of the high-level mnemonic information and comments included in the source code of FIG. 1 are lost in the translation to assembly language in FIG. 2. For example, the fact that the first argument supplied to the routine "routeRequest" is an integer representation of a request type, apparent on line 3 of FIG. 1, is not readily discernible in the assembly language versions of the routine "routeRequest" shown in FIG. 2. Instead, that argument is represented in FIG. 2 as a value contained in the register "r1." The five routines called by the routine "routeRequest" are discernible in the assembly language version, shown in FIG. 2, as "jsr" instructions 202-206. The "switch" statement 116 on line 5 of FIG. 1 is implemented, in assembly language, as a series of "cmp" instructions 208-211 followed by "bne" instructions 214-217. Although assembly language was previously an obligatory intermediate phase in the process of compiling source-level language, such as the routine shown in FIG. 1, to machine code, it is currently more common to compile source-level language to machine code without an intermediate assembly code step. FIG. 3 shows a machine code version of the routine "routeRequest." Machine code is a rather direct translation of the assembly code shown in FIG. 2. The machine code is shown in two columns 302 and 304 consisting of hexadecimal address and hexadecimal value pairs. For example, the hexadecimal value "6B40" 306 is stored in address location "0" 308. Machine code thus comprises a sequence of numbers. Within the hypothetical machine architecture used for the current example, each 8-bit quantity of memory is separately addressable. In the machine code representation of FIG. 3, 32-bit values are shown stored in byte-addressable memory locations, each memory location having an address 4 bytes greater than the previous memory location. For example, the second 32-bit value "00CD" 310 is stored in a 32-bit memory location addressed as "0004" 312. The translation of assembly language instructions into numbers is relatively straight-forward, and is described with reference to Tables 2 and 3, below:
TABLE 2
assembly code assembly code
mask meaning example 1 example 2 machine code
00 direct #12 fetchProvider 4 bytes
01 register r2 r1 1 byte
10 register + 4(sp) 8(sp) 4 bytes + 1
offset byte
The addressing mode of each argument specified for an instruction is indicated by a mask comprising 2 bits. As indicated in Table 2, direct addressing is indicated by the mask "00," register addressing is indicated by the mask "01," and indirect register addressing is indicated by the mask "10." Direct addressing indicates that the 4-byte value representing an argument follows the instruction in memory. Register addressing indicates that a single byte specifying the register containing the value of an argument follows the instruction in memory, and direct register addressing indicates that a 4-byte offset followed by a single byte register specification follows the instruction in memory. Referring to Table 3, single argument instructions are formed by adding the mask to a two-byte opcode, followed by one, four or five bytes representing the argument, depending on the addressing mode specified by the mask. Two-argument instructions, by contrast, are formed from a two-byte opcode, followed by a byte including a two-bit mask for each argument, followed by one, four, or five bytes for each argument, depending on the addressing mode for that argument. For example, the first instruction in the assembly language version of the routine "routeRequest" is: sub sp, #12 This instruction is translated into the eight bytes "6B40 00CD" stored in the first two 32-bit memory locations of the machine code version of the routine "routeRequest" shown in FIG. 3 at addresses "0000" and "0004," respectively. The first two bytes "6B" represent the opcode of the "sub" instruction, as shown above in Table 1. The next byte "4" is the mask for the source and destination registers, represented in binary as "0100," indicating direct addressing for the source argument and register addressing for the destination argument. Next, the four bytes "000C" represent the decimal value "12," and the final byte "D" is a numeric designation of the register "sp." The machine code shown in FIG. 3 is intermediate machine code. In intermediate machine code, addresses are represented generally as relative offsets either from the beginning of a subroutine or from the beginning of tables that contain pointers to other subroutines or to library subroutines. In FIG. 3, the addresses 314-318 internal to the subroutine "routeRequest," represented in FIG. 2 by the alphanumeric labels "L1," "L2," "L3," "L4," and "L5," 220-223, are indicated by single underlining. The table-based addresses 321-323 for the application routines "fetchEmployee" 202, "fetchCustomer" 203, "fetchSupplier" 204, and "fetchProvider" 205 are indicated by double underlining, and the table-based library routine address "_AppErrorLib_AppError" is indicated by a wavy line 324. The intermediate-level machine code of FIG. 3 is finally transformed into executable code that appears within a loaded executable by substituting absolute addressing for the subroutine-based addressing of the intermediate-level code and resolving internal subroutine and external routine hardware routine addressing. In Table 4, below, the initial intermediate-level address indication and the final executable address indication for each of the five routines called by the subroutine "routeRequest" are shown:
TABLE 4
routine initial final
fetchEmployee 01A2 BE10
fetchCustomer 01BC 1090
fetchSupplier 01C9 36A4
fetchProvider 01E5 4BAC
appError CDE1 F132
The executable machine-code version of a computer program comprises millions and millions of contiguous memory locations containing mixed instructions and data, a small portion of which might correspond to the routine "routeRequest," illustrated in FIG. 4. The executable program contains almost no information related to the organization, function, and meaning of the instructions. This is dramatically illustrated in the above-referenced figures. The high-level source-code representation of the routine "routeRequest," shown in FIG. 1, is readily understandable to most trained computer programmers. However, the final machine-code version of the routine "routeRequest," shown in FIG. 4, can be only partially deciphered by very laborious backward translation to a list of instructions resembling the assembly language version of the routine "routeRequest" shown in FIG. 2. When only executable machine code of a computer program is available, the task of instrumenting or monitoring execution of the machine code in order to detect processing of a particular transaction is a daunting, if not a seemingly impossible task. First, instructions executed during processing of the transaction must be identified. Because processing of a transaction may involve hundreds of thousands of subroutine calls and execution of many millions of instructions dispersed throughout the executable, identifying instructions related to a particular transaction is difficult. Moreover, many of the instructions executed during processing of a transaction may correspond to common routines that are executed in processing of many different transactions, making these commonly executed instructions, by themselves, poor indications of the processing of a particular transaction. One approach to discovering a witness set for a particular transaction, taken by one embodiment of the present invention, is to start with a relatively large set of candidate instructions, instrument the executable computer program in order to profile execution of the candidate instructions, and then iteratively filter the candidate instructions based on profiling the executable computer program until a reasonable witness set remains. The final witness set generally contains a number of instructions that are executed only when a particular transaction is processed. The final witness set may also contain instructions that are executed in a less transaction-specific manner, but may instead identify groups of transactions or mark, in combination with transaction-specific instructions, the starting or ending points of transactions. For example, instructions that call subroutines, such as the "jsr" instructions in the assembly language version of the routine "routeRequest" in FIG. 2, may be a reasonable initial set of candidate instructions from which to choose a witness set. Of course, most of these candidate instructions will be eliminated during the filtering process, since most subroutines in a large computer program are either not invoked during processing of a particular transaction or are invoked during the processing of many different types of transactions. The goal of the filtering process is to identify instructions whose execution is characteristic of processing of a particular transaction. For example, the routine "routeRequest" in FIG. 1 may be an intermediary dispatch routine that is called after execution of a number of GUI routines that solicit and process user input, but that is called before execution of any subroutines particular to processing of various types of data retrieval transactions. The GUI user input processing routines are probably executed for a large number of different types of transactions. These routines may, for example, display graphical menus from which a user may select a particular transaction. The call instruction that invokes execution of the routine "routeRequest" is executed during processing of four different types of transactions that retrieve employee data, customer data, supplier data, and service provider data, respectively. The instruction that invokes the routine "routeRequest" may, in some circumstances, be a valuable member of a witness set, since it signals processing of any of four different transactions. If a witness set is being constructed for processing of the transaction that retrieves employee data, then the jsr instruction "jsr_fetchEmployee" (202 in FIG. 2) corresponding to the call to the routine "fetchemployee" on line 8 of FIG. 1 may be the first unambiguously particularized instruction involved in processing of the employee data retrieval transaction. It is very desirable that witness sets include such unambiguously particularized instructions. However, it may also be useful to include somewhat ambiguous instructions, such as the instruction that invokes the routine "routeRequest" in witness sets. It may even be useful, for some types of performance monitoring, to include instructions in a witness set that are executed for the processing of many different types of transactions. One embodiment of the present invention is illustrated in the following C-like pseudo-code implementation. This embodiment may be implemented in many different ways, using any number of different programming languages, with a level of automation ranging from computation interleaved with human user analysis and parameter selection to fully automated computation. The C-like pseudo-code implementation is provided to illustrate one embodiment of the present invention rather than to limit the scope of the present invention to the provided pseudo-code implementation. Comments are provided, in the C-like pseudo-code implementation, bracketed by "/*" and "*/" delimiters, as in the C programming language.
1 findWitnessSet (program *prog, InstructionSet *witnessSet,
Transaction *trans)
2 {
3 InstructionSet initialMonitorSet; /* initial instructions
for instrumentation */
4 InstructionSet monitorSet; /* instructions to
instrument for each
iteration */
5 Profile nextProfile; /* profile generated
from execution of
program */
6 Boolean qualified; /* indicates whether
adequate witness set
found */
7 SetOperation op; /* next set operation
to perform */
8 int numTrans; /* number of times
to process
transaction */
9 ExecutionPlan plan; /* plan for next
witness set
determination */
10
11 initializeMonitorSet (&initialMonitorSet);
12 setEqual (&monitorSet, &initialMonitorSet);
13 qualified = FALSE;
14
15 while (!qualified) /* iterate until a suitable witness
set has been found */
16 {
17 instrumentProgram (&monitorSet, prog);
18 initializePlan (&plan); /* determine an initial witness
set determination plan */
19 clear (witnessSet);
20 while (getNextExecutionParameters (&plan,
&numTrans, &op))
21 {
22 execute (prog, trans, numTrans, &nextprofile);
/* execute program */
23 filter (&nextProfile, numTrans, op, trans);
/* filter resulting profile */
24 switch (op) /* perform set operation indicated by
execution plan */
25 {
26 case SET_DIFFERENCE:
27 setDifference (witnessSet, &nextProfile);
28 break;
29 case SET_UNION:
30 setUnion (witnessSet, &nextProfile);
31 break;
32 case SET_INTERSECTION:
33 setIntersection (witnessSet, &nextProfile);
34 break;
35 }
36 }
37 qualified = qualifyWitnessSet (witnessSet);
38 if (!qualified) refineMonitorSet (&monitorSet,
&initialMonitorSet, &plan, trans);
39 }
40}
The function "findWitnessSet," provided above, takes the following arguments: (1) a pointer "prog" to a computer program; (2) a pointer "witnessSet" to a set of instructions; and (3) a pointer "trans" to a transaction. The data type described by the defined type "InstructionSet" is a type of data structure that contains, in the current implementation, a set of addresses of instructions within a program. The data type described by the defined type "Transaction" is a type of data structure that contains information about a particular transaction, including information about how to cause the program pointed to by prog to process the transaction and information about how processing of the transaction can be identified by either a human user or by automated methods. Additional type definitions used in the C-like implementation include: (1) "Profile," a type of data structure that contains the results of monitoring execution of a program in the form of instruction/count pairs, where each instruction executed during execution of the program is paired with a count of the number of times that the instruction was executed; (2) "SetOperation," a defined type that includes the set operation values "SET_DIFFERENCE," "SET_UNION," and "SET--INTERSECTION;" and (3) "ExecutionPlan," a type of data structure that contains pairs of indications, each pair including a number of times to process the transaction referenced by the pointer "trans" during, the next execution of the program referenced by the pointer "prog" and a SetOperation value to indicate the set operation to apply to the profile resulting from execution of the program referenced by the pointer "prog." On lines 3-9, above, the following local variables are declared: (1) "initialMonitorSet," an initial set of instructions to be monitored, or watched, during execution of the program referenced by the pointer "prog;" (2) "monitorSet," a set of instructions to be monitored, or watched, during execution of the program referenced by the pointer "prog;" (3) "nextProfile," a profile that holds the results of monitoring execution of the program referenced by the pointer "prog;" (4) "qualified," a Boolean variable that stores an indication of whether or not the witness set is of adequate quality; (5) "op," a variable that stores the next set operation to perform on nextProfile and witnessSet; (6) "numTrans," an integer variable that stores the number of times the transaction referenced by the pointer "trans" needs to be processed during the next execution of the program referenced by the pointer "prog;" and (7) "plan," the execution plan for construction of the witness set. The function "findWitnessSet" initializes the variables "initialMonitorSet," "monitorSet," and "qualified" on lines 11-13. The variable initialMonitorSet is initialized, on line 11, to contain some logical set of instructions to monitor in order to generate profiles of execution of the program referenced by the pointer "prog." Instructions of a certain instruction type, instructions associated with calling of certain library routines, and instructions known to be useful markers from previous witness set construction might be selected for this initial monitor set. The variable "monitorSet" is assigned to have the value of the variable "initialMonitorSet" by the call to function "setEqual" on line 12. On line 13, qualified is initialized to contain the Boolean value FALSE. Lines 15-39 comprise a while-loop during each iteration of which a witness set is constructed and evaluated. When the evaluation, carried out by a call to the function "qualifyWitnessSet," on line 37, is successful, the current witness set is selected as the witness set associated with the transaction referenced by the pointer "trans" and the function "findWitnessSet" returns. If the evaluation is unsuccessful, returning a Boolean value FALSE that is stored in the variable "qualified," then the witness set construction parameters are refined or changed via a call to function "refineMonitorSet," on line 38, and the while-loop iterates again, starting on line 17, to construct a different witness set. It is assumed that an adequate witness set can be found for any transaction, so that the function "findWitnessSet" will always return. Construction of a witness set begins on line 17. First, the computer program is instrumented via a call to the function "instrumentProgram" which takes, as arguments, the pointer "prog" and the address of monitorSet. The function "instrumentProgram" prepares for monitoring execution of the instructions in monitorSet during execution of the program referenced by the pointer "prog." Monitoring can be accomplished by replacing monitored instruction with subroutine calls or jump instructions that cause execution of code that increments a counter associated with the instruction as well as execution of the replaced instruction, or can be accomplished using various types of monitoring routines that can monitor instruction execution by the computer on which a program runs. It should be noted that instrumentation may, under certain circumstances, be an optional step. For example, programs may be instrumented in advance, and the preexisting may be used in place of an explicit instrumentation step. Next, on line 18, the variable "plan" is initialized to contain a number of pairs of indications, each pair including a numbers of times to process the transaction referenced by the pointer "trans" and a SetOperation value. Then, on line 19, witnessSet is initialized to contain no instructions. Lines 20-36 comprise an inner while-loop during which the computer program referenced by the pointer "prog" is repeatedly executed and profiled and the resulting profiles are filtered and used to modify the contents of witnessSet. In the while statement, on line 20, the function "findWitnessSet" calls the function "getNextExecutionParameters," passing to getNextExecutionParameters a pointer to the execution plan "plan," in order to extract from plan the number of times to process the transaction and the set operation to be used in the next iteration, placing those values in local variables "numTrans," and "op," respectively. If getNextExecutionParameters returns the Boolean value TRUE, then another iteration of the inner while-loop commences. If, on the other hand, getNextExecutionParameters returns the Boolean value FALSE, then the execution plan is complete, and the witness set is next evaluated on line 37. On line 22, the function "execute" is called to execute the computer program referenced by the pointer "prog," instrumented with the monitor set, in order to process the transaction referenced by the pointer "trans" the number of times indicated by the value of the variable "numTrans," placing a profile resulting from the execution of the computer program into the profile "nextProfile." Execute attempts to invoke or initiate the transaction referenced by the pointer "trans" in different ways in order to obtain a representative average profile of processing of the transaction referenced by the pointer "trans." It should be noted that preexisting profiles may be used by the method and system of the present invention, rather than explicitly generated profiles. Thus, an alternative implementation might use plans that indicate a sequence of set operations to conduct on identified preexisting profiles. Next, on line 23, the profile "nextProfile" is filtered via a call to the function "filter." The profile is filtered based on a variety of characteristics, including the values of the local variables "numTrans," "op," and "trans." Filtering may be done automatically, or may be done through user interaction and selection. In one common approach, filtering comprises removing instructions from the profile that have not been executed the number of times "numTrans" that the transaction was processed in the previous call to the function "execute." Other filtering techniques may be used as well. Then, the contents of witnessSet are updated according to the SetOperation value stored in the local variable "op." If op contains the value SET_DIFFERENCE, then, on line 27, witness set is modified to contain the set difference between the current value of witnessSet the profile "nextProfile" via a call to the function "setDifference." If op contains the value SET_UNION, then, on line 30, witnessSet is modified to be the union of the current value of witnessSet and the instructions in the filtered profile "nextProfile" via a call to the function "setUnion." If op contains the value SET_INTERSECTION, then, on line 33, witnessSet is modified to contain to the set intersection of the current value of witnessSet and the profile "nextProfile" via a call to the function "setIntersection." It should be noted that the set operation functions "setDifference," "setUnion," and "setIntersection" remove the count indications from the addresses of instructions in nextProfile prior to performing set operations. The inner while-loop comprising lines 20-36 continues to iterate until a call to function "getNextExecutionParameters" returns the Boolean value FALSE. When the inner while-loop comprising lines 20-36 is concluded, then, on line 37, the function "qualifyWitnessSet" is called to evaluate the contents of witnessSet determine whether it is of sufficient quality to be designated as a witness set for the transaction referenced by the pointer "trans," as discussed above. The quality of a witness set may be determined computationally or by user intervention and analysis. Quality witness sets may have a number of instructions within a desirable range of numbers, and may also contain instructions that are distributed in certain ways through a section or sections of the program referenced by the pointer "prog." A witness set may also be qualified through additional testing during monitoring of the program referenced by the pointer "prog" instrumented with the witness set. In one approach, a candidate witness set may be refined until the number and identity of instructions within the candidate witness set stabilizes. Generation of an empty candidate witness set during a particular iteration may indicate that either the computer program is not being properly exercised during execution, or that the transaction has not been properly defined or properly correlated with observable events. If witnessSet does not meet the quality standards, then the function "refineMonitorSet" is called, on line 38, to alter the witness set determination parameters, such as the contents of the initial monitor set "initialMonitorSet" and the execution plan "plan," based on the current values of monitorSet and plan. Following alteration of the witness set determination parameters, the outer while-loop comprising lines 15-39 is again executed, to construct a new witness set. The above implementation of the instrumented application constructor is of a general nature. Many different approaches to constructing witness sets are embodied in this implementation. Consider, for example, the construction of a witness set for a specific transaction represented in the implementation above as a single complete iteration of the while-loop comprising lines 15-39. First, the application is instrumented according to an initial monitor set on line 17. The initial monitor set can be chosen according to a number of different criteria. Often, choosing a certain set of library functions may be a reasonable initial monitor set for a large class of logical transactions. Alternatively, choosing subroutine calling instructions may provide a sufficient initial monitor set. The determination of a monitor set may be made by an instrumentation technician, automatically by computational methods, or by some mixture of these two strategies. Next, the execution plan "plan" is initialized, on line 18, to specify a sequence of program executions in order to generate a sequence of profiles. In one common approach, the execution plan may contain the following numTrans/op pairs: 1/SET_UNION, 3/SET_INTERSECTION, 5/SET_INTERSECTION, and 0/SE_DIFFERENCE. This execution plan then controls the while-loop comprising lines 20-36 to conduct the following operations. According to the first numTrans/op pair, the program, instrumented according to the monitor set, is executed to process the transaction one time. The resulting profile is stored into the profile "nextProfile," filtered to retain only instructions executed once, and the instructions contained in the filter profile are stored in witnessSet via a set union operation. Next, according to the numTrans/op pair "3/SET--INTERSECTION," the program is executed to process the transaction three times, and the resulting profile is stored in the profile "nextProfile." The profile "nextProfile" is then filtered to retain only those instructions executed three times. The instruction set "witnessSet" is updated, at this point, via a set intersection operation, to contain the intersection of the instructions currently in witnessSet and the instructions in nextProfile that result from executing the program to process the transaction three times. In general, this greatly reduces the number of addresses in witnessSet so that only those addresses remain that contain instructions executed the same number of times that the transaction in question is processed. Then, according to the numTrans/op pair "5/SET_INTERSECTION," the program is executed so that the transaction in question is processed five times, and witnessSet is modified to contain the intersection of the instructions currently contained in witnessSet and the instruction in the profile "nextprofile" generated by executing the instrumented program to process the transaction five times, filtered to retain only those instructions executed 5 times. At this point, witnessSet contains the addresses of those instructions executed the same number of times that the transaction in question is processed by the program when the transaction is processed one, three, and five times. The addresses of instructions in witnessSet thus have a relatively high probability of being associated with processing of the transaction in question. Generally, processing the transaction 5 times is more than adequate for identifying a high-quality witness set. Finally, according to the numTrans/op pair "0/SET_DIFFERENCE," the program is executed without processing the transaction in question, and the instructions in the resulting profile with counts greater than 0 are subtracted from witnessSet. This has the effect of removing from witnessSet the addresses of any instructions that are executed when the running program does not process the transaction in question. The addresses of the instructions remaining in the witnessSet after the set difference operation are associated with the processing of the transaction in question with very high probability and often comprise a valid and useful witness set. The instruction set "witnessSet" is analyzed, as indicated in the above implementation by the call to "qualifyWitnessSet" on line 37, by interactive, automated, or some combination of interactive and automated methodologies. If witnessSet is found to be of insufficient quality, then the process of constructing a witness set may be tried again after altering certain of the various parameters involved in witness set determination. For example, the monitor set can be altered, either by adding additional instructions, or selecting fewer instructions in which to initially monitor, the execution plan can be changed to modify the number of times the instrumented application is executed, the number of times the transaction in question is processed in each of those iterations, and to alter the order of set arithmetic operations performed on the resulting sequence of profiles generated. Consider the above-described approach with reference to the example of FIGS. 1-4. Although a normal application would comprises many hundreds or thousands of subroutines, the operations outlined above are considering only with respect to the routine "routeRequest," shown above in various forms in FIGS. 1-4, to illustrate the above described witness set construction process using a relatively small example. First, suppose that the initial monitor set includes the target addresses of "jsr" instructions. The machine code of FIG. 4 can be analyzed to locate "jsr" instructions at addresses BC3D, BC5F, BCA3, BCA7, and BCB7. The targets of these "jsr" instructions can then be determined to be the following addresses included in the column "final" in Table 4, above: BE10, 1090, 36A4, 3BAC, and F132. If the first instruction of the routine "routeRequest" is also included in the monitor set, then the initial monitor set in this example is: {BC00, BE10, 1090, 368A, 3BAC, F132} The application is instrumented to watch for the program counter to be set to the values contained in the monitor set. Various techniques can be used to instrument the application, including inserting jump instructions into the executable at the addresses in the monitor set, copying the instructions originally located at those addresses into memory locations at the target of the jump instructions and adding additional instructions after the original instructions to implement counters associated with the addresses and to return execution to instructions following the addresses. Other techniques include insertion of monitoring routines into the operating system or into firmware of the computer in order to monitor the values of the program counter and to count the number of times that the program counter takes on any new values in the monitor set. Additional monitoring techniques are possible. Next, profiles are generated by running the application and causing the transaction of interest to be processed one time. In the current example, assume that the transaction of interest is the retrieval of employee information. Execution of the application to process the employee retrieval transaction one time would likely generate the following profile: {BC00:1, BE10:1, 1090:0, 368A:0, 3BAC:0, F132:0} where the counts for each address in the profile appear to the right of the address, separated from the address by a colon. Next, the above profile is filtered to remove any addresses containing instructions that are not executed one time to produce the following set: {BC00:1, BC10:1} In the above pseudo-code implementation, the instructions in the above filtered profile is placed into the witness set via a set union operation. Next, the application is executed to process the fetchEmployee transaction three times, and the resulting profile is filtered to retain addresses containing instructions executed three times: nextProfile={BC00:3, BE10:1} Then, the witness set is modified to include the set of instructions resulting from a set intersection operation conducted on the current contents of the witness set and nextProfile, which, in this simple example, does not alter the witness set. Note that, for the set operations, the counts, or number of times the instruction contained in an address is executed, is immaterial, and the set operations are conducted only with respect to the instructions contained in the profiles. witnessSet=witnessSet nextProfile={BC00, BE10} It is not uncommon that substantial reduction in the size of a witness set occurs during the above described step, in contrast to this simple example, resulting, in some cases, in a decrease in witness set size from thousands of instructions in the initial monitor set to perhaps tens or hundreds of instructions that are executed once when the transaction in question is executed once, and three times when the transaction in question is executed three times. Then, the application is executed to process the fetchEmployee transaction five times. The resulting profile is filtered to retain only those addresses containing instructions executed five times, which is then intersected with the witness set. This intersection operation leaves the witness set unchanged in the current simple example: witnessSet=witnessSet nextProfile={BC00, BEC0} Again, in constructing a witness set for a real executable, additional contraction of the number of elements in the profile "setProfile" may often occur at this point. Finally, the instrumented application is executed without processing the fetchEmployee transaction, perhaps, in this case, instead executing the fetchCustomer transaction, to produce the following profile: nextProfile={BC00:1, BE10:0, 1090:1, 368A:0, 3BAC:0, F132:0} A set difference is taken between the current contents of the witness set and the profile "nextprofile" to produce the final candidate witness set: witnessSet=witnessSet-nextProfile={BC00, BE10}-{BC00:1 BE10:0, 1090:1, 368A:0, 3BAC:0, F132:0}={BE10} Note that this resulant witness set contains the single address corresponding to the beginning address of the routine "fetchEmployee," as shown in the column "final" of Table 4. In constructing a witness set for a real executable, the initial set may be far larger, and greater filtering may occur during each of the above steps, but the final witness set would likely include the address "BE10 " along with probably many other instructions that are executed only during processing of a fetchEmployee transaction. Although the present invention has been described in terms of a particular embodiment, it is not intended that the invention be limited to this embodiment. Modifications within the spirit of the invention will be apparent to those skilled in the art. For example, as discussed above, the method of the present invention may be practiced over a wide range of degree of automation, ranging from a large degree of manual manipulation and analysis to fully-automated witness set construction systems. For example, various computer program exercising systems are currently available for automatically executing computer programs for testing or performance analysis purposes, and such computer program exercising systems may be used for automating the execution step in the above-described implementation. The witness set construction systems may be implemented in a large number of different ways and using different types of instrumentation and monitoring strategies and systems, different programming languages and modular organizations. Many different types of selection criteria can be applied for selecting instructions for the initial monitor set, for filtering partial set results, for determining the iteration strategies used in the initial and subsequent candidate witness set construction steps, and for determining whether to accept a candidate witness set or to continue to refine the candidate witness set construction process to produce a more useful witness set. The foregoing description, for purposes of explanation, used specific nomenclature to provide a thorough understanding of the invention. However, it will be apparent to one skilled in the art that the specific details are not required in order to practice the invention. Thus, the foregoing descriptions of specific embodiments of the present invention are presented for purposes of illustration and description; they are not intended to be exhaustive or to limit the invention to the precise forms disclosed, obviously many modifications and variations are possible in view of the above teachings. The embodiments were chosen and described in order to best explain the principles of the invention and its practical applications and to thereby enable others skilled in the art to best utilize the invention and various embodiments with various modifications as are suited to the particular use contemplated. It is intended that the scope of the invention be defined by the following claims and their equivalents:
|
Same subclass Same class Consider this |
||||||||||
