Analysis of the effect of program execution of calling components with data variable checkpointing and resource allocation analysis6154876Abstract An error detection mechanism for detecting programming errors in a computer program. A component of the computer program, e.g., a procedure or function of the computer program, is analyzed to determine the effect of the component on resources used by the computer program. A component is analyzed by traversing the computer instructions, i.e., statements, of the component and tracking the state of resources used by the component as affected by the statements of the component. Each resource has a prescribed behavior represented by a number of states and transition between states. Violations in the prescribed behavior of a resource resulting from an emulated execution of the statements of the component are detected and reported as programming errors. Resources used by two or more components are modelled by modelling externals of the components. The effect of execution of a component on externals and resources of the component is determined by traversing one or more possible control flow paths through the component and tracking the use of each external and resource by each statement of each control flow path. Once the effect of execution of a component on externals and resources of the component is determined, a model of the component is created and used to model externals and resources of other components which invoke the modelled component. Claims What is claimed is: Description CROSS REFERENCE TO MICROFICHE APPENDIX
TABLE A
______________________________________
U = Unallocated
A = Allocated
Q = Questionably allocated
X = Invalid ("NULL")
E = Error or unknown state
______________________________________
States U and X are similar but distinct: an item associated with an unallocated resource has an indeterminate value, and an item associated with an invalid resource has a known, invalid value. A resource behavior model can be as complex as the actual behavior of the resource whose behavior is modelled. However, even substantially simplified resource behavior models such as that represented in state diagram 300 are effective in detecting a substantial majority of all possible errors in the use of such a resource. Resources are initially in state U since a resource is initially unallocated. Emulated execution of each computer instruction, actual execution of which causes a change in the state of a resource, applies an operation to the resource. By application of an operation to a resource, the state of the resource changes according to state diagram 300. The following are the operations which can be applied to a resource.
TABLE B
______________________________________
a = definitely allocates
m = maybe allocates
k = kills, i.e., frees or deallocates
c = uses in a calculation
p = uses in a predicate
i = uses in an indirection
x = mark invalid
______________________________________
Thus, according to state diagram 300, if an unallocated resource, i.e., a resource in state U, is definitely allocated by an instruction in a function, thereby applying operation a, the resource is then in state A, i.e., allocated. However, if an unallocated resource, i.e, in state U, is used in a calculation, thereby applying operation c, the resource is then in state E. State E indicates that a state violation has occurred as a result of a programming error. State E is optional in that state E does not describe the prescribed behavior of a resource, but is used in the disclosed embodiment as a convenient way to represent a state violation. In an alternative embodiment, state E is omitted and a violation is detected in the above example by noting that, when a resource is in state U, operation c is undefined. State diagram 300 (FIG. 3A) is summarized in Table C below.
TABLE C
______________________________________
New States Resulting from Operations
operation:
a m k c p i x
______________________________________
old state:
U: A Q U.sup.1
E.sup.2
E.sup.2
E.sup.6
E.sup.2
A: A Q U A A A X
Q: A Q U.sup.3
A.sup.4
A A.sup.4
X
X: A Q U.sup.5
E.sup.6
X E.sup.6
X
E: A Q U E E E E
______________________________________
Superscript numerals corresponding to operation identifiers in state diagram 300 and to new state identifiers in Table C indicate specific errors. The errors are listed in Table D.
TABLE D
______________________________________
1 Freeing an unallocated or freed resource.
2 Using an unallocated or freed resource.
3 Freeing potentially-allocated data without
checking.
4 Using potentially-allocated data without checking.
5 Freeing NULL data.
6 Using (e.g., dereferencing) NULL data.
______________________________________
In the example given above, applying operation c to a resource in state U places the resource in state E as indicated in state diagram 300 by an arrow from state U to state E identified by "c.sup.2 ". Thus, the error in this example is error number 2 in Table D, namely, the use of an unallocated resource. Each function model specifies which operations are applied to each external of a corresponding function. For example, function fopen(), which is defined for the C computer language and which is described in the C Standard at Section 7.9.5.3, defines two parameters, the first of which is accepted as input and which specifies a file to be opened, and defines a returned item which is a file pointer corresponding to the opened file. File pointers, i.e., pointers to items of the type "FILE", are well-known and are described in the C Standard at Section 7.9.1. The file pointer is an external of function fopen() and the file specified by the parameter is the resource associated with the external. The function model for function fopen() specifies that a new resource whose initial state is state Q is created. The initial state of the resource is state Q rather than state A because function fopen() does not guarantee that the file is opened successfully. Function fclose(), which is defined for the C computer language and which is described in the C Standard at Section 7.9.5.1, defines a parameter which is a file pointer. Execution of function fclose() closes the file to whose file descriptor the parameter points. The function model for function fclose() specifies that an operation k is applied to the parameter to reflect closing, and thus deallocating, the associated file. Similarly, function models for functions of the C computer language defining read and write operations to the file specify application of an operation c to a resource representing the file to reflect use of the file. If an item corresponding to a resource, e.g., the file pointer which is the returned item of function fopen(), is used as a predicate in a decision instruction, operation p is applied to the resource to thereby change the state of the resource according to state diagram 300. An item is used in a predicate if the item appears as an operand in a relational expression (e.g., an operation involving any of operators >, <, .ltoreq., .gtoreq., and !=) or a boolean expression (e.g., an operation involving any of operators &&, .parallel., and !) or if the item is used as the control expression in a "switch" statement. The "switch" statement is defined for the C computer language and controls flow of a function according to the value of the control expression. The "switch" statement is described more completely in the C Standard at Section 6.6.4.2. If an item corresponding to a resource is used in a calculation, operation c is applied to the resource to thereby change the state of the resource according to state diagram 300. An item is used in a calculation (i) if the item appears as an operand to a mathematical operation (e.g., +, /, *, or -), (ii) if the resource appears as a dereference of a pointer or as an access into an array, or (iii) if the resource appears as an array index. Pointers and arrays are well-known and are described in the C Standard. For completeness, pointers and arrays are briefly described herein. In the context of the C computer language, a pointer is an item whose value is the address in memory of another item. Thus, a pointer "points" to the other item. Dereferencing a pointer is retrieving the item to which the pointer points. Data structures, which are used to implement the disclosed embodiment of the present invention and which are described below in greater detail, are described as including pointers to other data structures. It is appreciated that mechanisms other than pointers are known for uniquely identifying a data structure and that these mechanisms can be substituted for pointers without deviating from the principles of the present invention. An array is a collection of one or more items of similar structure. The items of an array are called elements and are numbered sequentially. An access to an array is an access to an element of the array by reference to the number of the element, i.e., the index of the element. Operation x is applied to a resource corresponding to an item which is assumed to be NULL. NULL is generally an invalid value and is assigned to an item to indicate that the item has no valid value. For example, a pointer whose value is NULL points to no item. In the context of the C computer language, NULL is also a boolean value of "false". An item is assumed to be NULL, i.e., to have a value of NULL, if the item is compared to NULL and the result of the comparison is assumed to be true. As described more completely below, analysis of a function requires that assumptions be made regarding the particular behavior of the function when executed. For example, function fopen() either successfully opens a file or fails to do so. If the returned item, i.e., the file pointer, is compared to NULL and the result is assumed to be true, i.e., if function fopen() is assumed to have failed, operation x is applied to the resource representing the file as described more completely below. Illustrative Examples of the Basic Principles of the Present Invention The utility of the modelling of resources is described by way of example. The following source code excerpt (1) includes a programming error which is detected by the disclosed embodiment of the present invention. Source code excerpt (1) comports with the known C computer language and defines a function example.sub.-- 1(). Line numbers, which are not part of the C computer language, are added for clarity in the discussion below.
______________________________________
1 #include <stdio.h> (1)
3 #define MAX.sub.-- STR.sub.-- LEN 100
4 #define FALSE 0
5 #define TRUE 1
6
7 int example.sub.-- 1(input.sub.-- file.sub.-- name) /* begin function
*/
8 char *input.sub.-- file.sub.-- name;
/* parameter to function */
9 {
10 char *str;
/* Declaration of local variable "str" */
11 FILE *fptr; /* Declaration of local variable "fptr" */
12
13 /* try to open a file */
14 fptr = fopen(input.sub.-- file.sub.-- name, "r");
15 if (fptr == NULL)
16 {
17 /* could not open the file */
18 fprintf(stderr, "Could not open file %s.backslash.n",
19 input.sub.-- file.sub.-- name);
20 return FALSE; /* an error */
21 }
22 /* allocate some memory for a string buffer */
23 str = (char *)malloc(MAX.sub.-- STR.sub.-- LEN);
24 /* get some input from the file */
25 fgets(str, MAX.sub.-- STR.sub.-- LEN - 1, fptr);
26 /* print out the information */
27 printf(str);
28 /* clean up */
29 free(str);
30 fclose(fptr);
31 return TRUE; /* no error */
32 }
______________________________________
As function example.sub.-- 1() is analyzed, the state of each item, including each external, is tracked. Variable "str" is locally-defined, i.e., is defined only in the context of function example.sub.-- 1(). Variable "str" is a pointer to data whose type is "char" as defined in line 10. However, variable "str" is initially uninitialized and points to no specific data. Therefore, variable "str" is not associated with a resource. Execution of function malloc(), which is defined for the C computer language and which is described in the C Standard at Section 7.10.3.3, accepts a request for allocated memory, e.g., memory 104 (FIG. 1), and either allocates the memory or fails to do so. Function malloc() returns, as the returned item, a pointer to the allocated memory if the memory is successfully allocated or a NULL pointer otherwise. Therefore, function malloc() creates a new resource whose initial state is state Q and associates the new resource with the returned item of function malloc(). After variable "str" is assigned the value of the returned item of function malloc() at line 23, variable "str" points to newly allocated memory if such memory is allocated or is a NULL pointer otherwise. At line 25 of source code excerpt (1), variable "str" is used as a parameter in function fgets(), which is defined for the C computer language and which is described in the C Standard at Section 7.9.7.2. Execution of function fgets() dereferences the first parameter, which is variable "str" in the context of line 25 of source code excerpt (1). Therefore, operation i is applied to the resource associated with variable "str". As shown in state diagram 300 (FIG. 3A) and Tables C and D, application of operation i to a resource in state Q places the resource in state A, producing an error message indicating that potentially allocated data is used without checking. At line 29 of source code excerpt (1), variable "str" is passed as a parameter to function free(), which frees, i.e., deallocates, the memory to which variable "str" points. Therefore, operation k is applied to the resource associated with variable "str". As shown in state diagram 300 and Tables C and D, application of operation k to a resource in state A places the resource in state U. Since deallocation of an allocated resource is proper, no error is reported. Text (2) below illustrates the error messages produced by the disclosed embodiment of the present invention in analyzing function example.sub.-- 1() of source code excerpt (1).
______________________________________
example.sub.-- 1.c: In function `example.sub.-- 1`:
(2)
example.sub.-- 1.c:25: warning:
(6) : dereferencing invalid
data (argument 0)
______________________________________
In text (2), "example.sub.-- 1.c" refers to a file containing source code excerpt (1) above, and thus defining function example.sub.-- 1(). Thus, function example.sub.-- 1() fails to account for the contingency that there may be insufficient memory to allocate the amount of memory requested in calling, i.e., invoking execution of, function malloc() at line 23 of source code excerpt (1). If function malloc() fails to allocate the requested memory during execution of function example.sub.-- 1(), the computer process in which function example.sub.-- 1() is executed aborts abruptly without giving to a user an indication of the reason for the unexpected termination of processing. However, detecting and reporting the failure to account for such a contingency using, for example, text (2) above provides the developer of function example.sub.-- 1() with the necessary information to correct the defect in function example.sub.-- 1() and to properly provide for such a contingency. The utility of the present invention is further illustrated by considering the tracking of the state of file pointer "fptr" in function example.sub.-- 1() of source code excerpt (1). File pointer "fptr" is a locally-defined variable of function example.sub.-- 1(). File pointer "fptr" is a pointer to data of the type "FILE". Initially, file pointer "fptr" is uninitialized and is not associated with any resource. The returned item of function fopen() is assigned to file pointer "fptr" at line 14. As described above, function fopen() creates a new resource, whose initial state is state Q, and associates the new resource with the returned item of function fopen(). The "if" statement at line 15 determines whether the file to which file pointer "fptr" points is successfully opened by comparing file pointer "fptr" to NULL. If file pointer "fptr" is NULL, the file is not successfully opened and function example.sub.-- 1() terminates after reporting to a user the failure to open the file. Conversely, if file pointer "fptr" is not NULL, the file to which file pointer "fptr" points is known to be successfully opened and function example.sub.-- 1() continues at line 22. The comparison of file pointer "fptr" in line 15 applies operation p to the resource associated with file pointer "fptr". Thus, the state of the resource associated with file pointer "fptr" is changed from state Q to state A. As a result, any uses of file pointer "fptr", either in calculation (applying operation c) or in a predicate (applying operation p) do not produce any error messages as shown in state diagram 300 and Table C. Therefore, no errors with respect to the treatment of file pointer "fptr" are detected. As described above, functions fopen() and malloc(), when executed, perform specific processing on resources of parameters and returned items. Functions such as functions fopen() and malloc() are included in library functions 112 (FIG. 1) which are accessed by computer process 110. Calls to such functions are included in function 202 (FIG. 2). As used herein, a "call" to a function is a statement which, when executed, causes a processor, such as CPU 102 (FIG. 1), to (i) supply zero or more items as parameters to the function, (ii) execute the function, and (iii) produce a returned item representing the value to which the function evaluates if a returned item is defined by the function. A first function, which includes a call to a second function, is called a "calling function." The second function is called a "called function." To properly analyze resources of function 202 (FIG. 2) affected by execution of functions called by statements of function 202, function models describing the behavior of such called functions are maintained. In one embodiment, such function models are created from well-known textual descriptions of the behavior of such functions, e.g., from the C Standard, and those function models are stored in memory 104 of computer 100. Those function models are then retrieved from memory 104 prior to analyzing a computer program as described more completely below. The following are illustrative examples of function models of some of the functions called by function example.sub.-- 1() of source code excerpt (1) above. All of the called functions are from the C standard library's "stdio" (input/output) header file which is a well-known file for use with the C computer language and which is described in the C Standard in Sections 7.9 et seq.
______________________________________
(malloc /* model for function malloc()
*/ (3)
(retval (new Q "memory"))
/* returned item:
creates a new, possibly
allocated resource */
((param 0) (op c)) /* parameter 0: used in
a computation */ )
______________________________________
A function model structure, which represents in memory 104 (FIG. 1) a function model according to the disclosed embodiment of the present invention, is described more completely below. Function model (3) defines the effect of execution of function malloc() on the respective states of the externals of function malloc(). According to function model (3), a new resource is created, initialized to state Q, and associated with the returned item of function malloc(). Function model (3) also specifies that operation c is applied to parameter 0, i.e., the first parameter, of function malloc().
______________________________________
(free /* model for function free() */
(4)
((param 0) (op k)))
/* parameter 0: free (kill)
*/
______________________________________
Function model (4) represents the effect of execution of function free() on the externals of function free() and specifies that operation k is applied to parameter 0, i.e., the first parameter in the argument list.
______________________________________
(fgets /* model for function fgets() */
(5)
((param 0) (op i))
/* parameter 0 (string
buffer) : apply operation i,
indirection */
((param 1) (op c))
/* parameter 1 (buffer
length) : use in computation
(op c) */
((param 2) (op i))
/* parameter 2 (the file):
indirection (op i -- file must
be open) */
______________________________________
Function model (5) specifies that (i) operation i is applied to parameter 0, i.e., the first parameter, (ii) operation c is applied to parameter 1, i.e., the second parameter, and (iii) operation i is applied to parameter pb 2, i.e., the third parameter, by calling function fgets(). Detection of Resource Leaks By modelling resources and tracking associations of resources with externals of a function, the disclosed error detection mechanism provides a convenient mechanism for detecting resource leaks. A resource is "leaked" by a function when execution of the function terminates, leaving the resource in an allocated state, when the resource cannot be accessed by any external of the function. When a resource is leaked, the resource cannot be used since no pointer to the resource remains after execution of the leaking function terminates. If the resource is reusable, such as dynamically allocated memory 114 (FIG. 1), failure to free the resource prior to termination of execution of the function prevents other functions from reusing the resource. A process fragment which repeatedly leaks dynamically allocated memory can ultimately cause exhaustion of all memory which is available to the computer process of which the process fragment is a part. As an example of detection of a resource leak, function example.sub.-- 2() of source code excerpt (6) is considered.
______________________________________
0 #include <stdio.h)> (6)
1 #include <string.h)>
3 #define MAX.sub.-- STR.sub.-- LEN 100
4 #define FALSE 0
5 #define TRUE 1
6
7 char *example.sub.-- 2(input.sub.-- file.sub.-- name) /* begin
function */
8 char *input.sub.-- file.sub.-- name;
/* parameter to the function */
9 {
10 char *str;
/* declare local variable "str" */
11 FILE *fptr; /* declare local variable "fptr" */
12
13 /* allocate some memory for a string buffer */
14 str = (char *)malloc(MAX.sub.-- STR.sub.-- LEN);
15 /* check to ensure that the allocation succeeded */
16 if (str == NULL)
17 return NULL;
18 /* try to open a file */
19 fptr = fopen(input.sub.-- file.sub.-- name, "r");
20 if (fptr == NULL)
21 {
22 /* could not open the file */
23 fprintf(stderr, "Could not open file %s.backslash.n",
24 input.sub.-- file.sub.-- name);
25 return NULL; /* error condition */
26 }
27 fgets(str, MAX.sub.-- STR.sub.-- LEN - 1, fptr);
28 fclose(fptr);
/* close file */
29 return str; /* no error */
30 }
______________________________________
Variable "str" is local to function example.sub.-- 2() and is therefore not accessible to any function other than function example.sub.-- 2(). Since the memory to which variable "str" points is not freed prior to instruction "return" of line 25 of source code excerpt (6), that memory is not useable and cannot be deallocated or reallocated until computer process 110, which function example.sub.-- 2() partly defines, terminates. That resource therefore "leaks" from computer process 110. Since an external of a function is an item which exists past the termination of execution of the function, any allocated resource reachable through an external is not leaked. A resource which is not associated with a particular external can, in some circumstances, be reachable through the external. For example, a resource which is associated with a particular element of an array of items is reachable through an external which is a different element of the array of items. This is true since the location in memory of an element of an array can be calculated from the location of any other element of the array according to the C computer language. Leaks are checked at the conclusion of a traversal of a function. The detection of leaks is described more completely below and is summarized briefly here. All resources reachable through any external are marked. Any resource which is not marked and which is allocated is reported as leaked. Since variable "str", at line 25, is not returned, variable "str" is not an external. The memory pointed to by variable "str" is therefore allocated and not marked at the conclusion of the traversal of function example.sub.-- 2(). The memory pointed to by variable "str" is therefore leaked. Analysis of function example.sub.-- 2() produces the following error message.
______________________________________
example.sub.-- 2.c: In function `example.sub.-- 2`:
(7)
example.sub.-- 2.c:25: warning:
(15) : leaking resources
allocated on line 14
______________________________________
Static checkers of the prior art cannot detect resource leaks. Run-time checkers of the prior art often do not consider all potential events which might cause a function to leak a resource and generally cannot analyze a single function outside of the context of a larger computer program to detect resource leaks in that single function. In contrast, the disclosed embodiment of the present invention provides for efficient detection of resource leaks by analysis of a single function of a larger computer program. As described more completely below, the disclosed error detection mechanism considers all possible events which might cause a function to leak a resource. The present invention therefore represents a significant improvement over the prior art. Composite States of Externals As described more completely below, a function is analyzed by following the flow of control of the function, emulating execution of individual statements of the function, and tracking the state of externals and resources. The flow of control through a function is the particular sequence of computer instructions of the function executed during a particular execution of the function. When control transfers from a first computer instruction to a second computer instruction, the second computer instruction is executed following execution of the first computer instruction. The flow of control through a function is sometimes called herein the control flow path through the function. Flow of control through a function is often dependent upon particular events which occur during execution of the process fragment, defined by the function, in a computer process. In analyzing a function, it is preferred to consider all possible control flow paths through the function. It is therefore preferred to consider all events which can influence the control flow path through the function. Static checkers of the prior art often do not consider control flow paths at all. Run-time checkers only consider all control flow paths through a particular function to the extent a user can coerce, through manipulation of the events which influence the control flow path of the function, a computer process to follow each possible control flow path during execution of the computer process. In contrast, the disclosed error detection mechanism analyzes each possible control flow path through a function automatically without user intervention. Furthermore, the disclosed error detection mechanism can analyze a function outside of the context of a computer program or computer process which includes the function. Thus, individual functions can be more completely checked for errors prior to inclusion in a larger function or computer program or process. As an example, function example.sub.-- 2() of source code excerpt (6) is considered. The precise control flow path through function example.sub.-- 2() is not known until function example.sub.-- 2() is executed in a computer process. For example, control flows from the "if" statement at line 16 to a call to function fopen() at line 19 if function malloc(), called at line 14, successfully allocates memory as requested. In other words, if function malloc() successfully allocates memory as requested when called at line 14, the call to function fopen() at line 19 follows execution of the "if" statement at line 16. Conversely, control flows from the "if" statement at line 16 to the "return" statement at line 17 if the allocation of memory fails. Whether memory is successfully allocated by function malloc() as called at line 14 is typically not known until function example.sub.-- 2() is executed in a computer process. In analyzing function example.sub.-- 2(), it is preferred that each possible control flow path through function example.sub.-- 2() is considered. Multiple control flow paths through a function are considered by multiple traversals of the function under varying assumptions. For example, function example.sub.-- 2() is traversed once under the assumption that function malloc(), called at line 14, successfully allocates the requested memory and once under the assumption that function malloc() fails to allocate the requested memory. In one embodiment of the present invention which is described below in greater detail, a function is traversed repeatedly, and, during each traversal, assumptions are made by random chance. Each traversal of function example.sub.-- 2() tracks the state of the externals of function example.sub.-- 2(). Each external has a composite state which reflects the states of the external resulting from multiple traversals of function example.sub.-- 2(). Externals have composite RS, CP, and DK states. These composite states are used for the dual purposes of (i) detecting inconsistent uses of an external when varying control flow paths through the function are considered and (ii) building a function model describing the effect of execution of the function on the externals of the function. The function model can then be used to analyze other functions which call the modelled function. Within the context of a particular function, each external has a CP state, a DK state, and a RS state. The CP state of an external is used to determine whether the external is checked before being used. The term "CP" is derived from the operations of primary concern: operation c, which represents use of the external, before operation p, which represents checking of the external. The DK state of an external is used to determine whether the function allocates and/or frees the external. The term "DK" is derived from the purpose of the DK state: to determine whether a resource is defined ("D") before being killed ("K"), i.e., freed. The RS state of an external is the state of the resource associated with the external if a resource is so associated. The term "RS" is derived from resource ("R") state ("S"). Each external of a function also has a composite CP state, a composite DK state, and a composite RS state reflecting multiple CP, DK, and RS states, respectively, resulting from multiple traversals of the function. After each iterative traversal of a function, a new composite RS state of an external is composed, as described more completely below, from the previous composite RS state of the external and the RS state of the resource associated with the external resulting from the most recent traversal of the function. In a similar fashion, as described more completely below, new composite CP and DK states are composed from previous composite CP and DK states, respectively, and CP and DK states, respectively, resulting from the most recent traversal of the function. State diagram 350 (FIG. 3B) represents states and state transitions for a composite RS state. Arrows are used in state diagram 350 to represent composite RS state transitions from a previous composite RS state according to an RS state resulting from a traversal of the function. State diagram 350 is summarized in Table E.
TABLE E
______________________________________
New Composite RS States
next RS state: U A Q X E
______________________________________
U: U Q Q Q E
previous A: Q A Q Q E
composite Q: Q Q Q Q E
RS state: X: Q Q Q X E
E: E E E E E
______________________________________
State diagram 400 (FIG. 4A) represents states and state transitions for a CP state of an external. Arrows are used in state diagram 400 to represent CP state transitions resulting from application of operations. An external can have any of the following CP or composite CP states.
TABLE G
______________________________________
O = Used in neither a predicate nor a computation
(initial state).
C = Used in computation before checking.
I = Used for indirection before checking.
P = Checked (used in predicate) before using.
N = Neither; assigned to before checking or using.
______________________________________
The operations which can be applied to an external are described above with respect to Table B. State diagram 400 is summarized in Table H below.
TABLE H
______________________________________
New States Resulting from Operations
operation: a m k c p i x
______________________________________
old state:
O: N N C C P I N
C: C C C C C C C
I: I I I I I I I
P: P P P P P P P
N: N N N N N N N
______________________________________
State diagram 450 (FIG. 4B) represents states and state transitions for a composite CP state of an external. Arrows are used in state diagram 450 to represent composite CP state transitions from a previous composite CP state according to a CP state resulting from a traversal of the function. State diagram 450 is summarized in Table I below.
TABLE I
______________________________________
New Composite CP States
next CP state: O C I P N
______________________________________
O: O C I P N
previous C: C C I C C
composite I: I I I I I
CP state: P: P P I P P
N: N C I P N
______________________________________
State diagram 500 (FIG. 5A) represents states and state transitions for a DK state of an external. Arrows are used in state diagram 500 to represent DK state transitions resulting from application of operations. An external can have any of the following DK or composite DK states reflecting the effect of execution of the function on a resource associated with the external.
TABLE J
______________________________________
O = The function neither allocates nor kills the
resource (initial state).
A = The function definitely allocates the resource.
Q = The function questionably allocates the resource.
K = The function kills, i.e., deallocates, the
resource.
KA = The function kills, then definitely allocates, the
resource.
KQ = The function kills, then questionably allocates,
the resource.
E = Error (unknown state).
______________________________________
The operations which can be applied to an external are described above with respect to Table B. State diagram 500 is summarized in Table K below.
TABLE K
______________________________________
New States Resulting from Operations
operation: a m k c p i x
______________________________________
old state:
O: A Q K O O O O
A: A A O A A A A
Q: Q Q O Q Q Q Q
K: KA KQ K K K K K
KA: KA KA K KA KA KA KA
KQ: KA KQ K KQ KQ KQ KQ
E: E E E E E E E
______________________________________
State diagram 550 (FIG. 5B) represents states and state transitions for a composite DK state of an external. Arrows are used in state diagram 550 to represent composite DK state transitions from a previous composite DK state according to a DK state resulting from a traversal of the function. State diagram 550 is summarized in Table L below.
TABLE L
______________________________________
New Composite DK States
next DK state: O A Q K KA KQ E
______________________________________
O: C A Q K KA KQ E
A: A A Q E E E E
previous Q: Q Q Q E E E E
composite K: K E E K E KQ E
DK state: KA: KA E E E KA KQ E
KQ: KQ E E KQ KQ KQ E
E: E E E E E E E
______________________________________
Function example.sub.-- 2() of source code excerpt (6) above provides an illustrative example of the utility of composite states of externals. As described above, flow of control through function example.sub.-- 2() can take any of several paths depending on assumptions made with respect to events during an emulated execution of the function. For example, the "if" statement at line 16 can be followed by the "return" statement at line 17, if variable "str" is not NULL, or by the expression on line 19, otherwise. The returned item of function example.sub.-- 2() is an external of function example.sub.-- 2(). The returned item of function example.sub.-- 2() is assigned at line 17, line 25, or line 29 of source code excerpt (6) depending only the particular assumptions made during a particular traversal of function example.sub.-- 2(). At line 17 or line 25, the returned item has no associated resource. Thus, after a traversal of function example.sub.-- 2() in which control transfers through either line 17 or line 25 of source code excerpt (6), the composite RS state of the external representing the returned item is state U. After a subsequent traversal of function example.sub.-- 2() in which control transfers through line 29, the external representing the returned item is associated with a resource created within function example.sub.-- 2() and is definitely allocated, i.e., in state A. The resource is definitely allocated because lines 16-17 of source code excerpt (6) properly prescribe an action to be taken in the event that execution of function malloc() does not successfully allocate memory. As shown in state diagram 350 (FIG. 3B), an external, whose previous composite RS state is state U and whose next RS state is state A, has a new composite RS state of state Q. Such reflects the fact that execution of function example.sub.-- 2 can allocate, but does not necessarily allocate, memory to which the returned item points. Thus, when forming a function model describing the behavior of function example.sub.-- 2, the returned item of function example.sub.-- 2 is described as associated with a newly created resource whose initial state is state Q. Composite states can also be used to detect inconsistent use of an external by a function. For example, if a function terminates with an external in an allocated state, i.e., a RS state of state A, and, in a subsequent traversal of the function, the function terminates with the same external in a freed state, i.e., a RS state of state K, the composite RS state of the external is in state E. This can be viewed as an error since a calling function generally would not expect the function to allocate a resource associated with an external in one execution and to free a resource associated with the same external in another execution. Analysis of a Computer Program A computer program 610 (FIG. 6) is analyzed in accordance with the present invention by a resource checker 602 which analyzes the use of resources prescribed by computer program 610 as described herein. In the disclosed embodiment, resource checker 602 is a computer process executing in CPU 102 from memory 104, which is connected to CPU 102 through bus 108. The analysis of computer program 610 according to the present invention is illustrated by logic flow diagram 900 (FIG. 9). Processing begins in step 902 in which a command entered by a user, e.g., through keyboard 124 (FIG. 1) or mouse 122, initiates analysis of computer program 610 (FIG. 6) and specifies characteristics of the environment in which computer program 610 is analyzed. Characteristics of the environment which can be modified by the user include (i) specific types of errors to detect, (ii) a maximum number of errors to report, (iii) a maximum number of functions to analyze, (iv) a maximum number of iterative traversals of each function, and (v) the particular technique for traversing all possible control flow paths through a function. Processing transfers from step 902 (FIG. 9) to step 904 in which resource checker 602 (FIG. 6) initializes function models, which describe the effect on resources of execution of the various functions used by the computer program. Resource checker 602 includes a model parser 702 (FIG. 7) which reads models from a model description file 604 (FIG. 6) and constructs therefrom function model structures which are described more completely below. By creating function model structures within resource checker 602, the function models are initialized. Step 904 (FIG. 9) is described more completely below with respect to logic flow diagram 904 (FIG. 10). Processing transfers from step 904 (FIG. 9) to step 906, in which a program parser 704 (FIG. 7), which is part of resource checker 602, reads and parses computer program 610 (FIG. 6), using conventional techniques, according to the language to which computer program 610 comports. Program parser 704 (FIG. 7) parses computer program 610 (FIG. 6) into smaller program components, e.g., functions. In step 906 (FIG. 9), a single function is parsed from computer program 610 (FIG. 6) and a function structure, which represents the parsed function is transferred to a dynamic inspection engine 706, which is described more completely below. In an alternative embodiment, a preprocessor, which is described in more detail below, parses computer program 610 and stores a number of function structures representing the parsed functions of computer program 610. In this alternative embodiment, program parser 704 retrieves a single function structure and transfers the function structure to dynamic inspection engine 706. Processing transfers from step 906 (FIG. 9) to step 908. In step 908, dynamic inspection engine 706 (FIG. 7), which is part of resource checker 602, analyzes the "subject function", i.e., the function represented by the function structure transferred to dynamic inspection engine 706 by program parser 704 in step 906 (FIG. 9). In other words, the effect on the resources used by computer program 610 resulting from the execution of the subject function is determined and the state transitions of each of the resources affected by execution of the subject function are analyzed as described more completely below. The function models initialized in step 904 are used to analyze the states and state transitions of the resources and externals of the subject function. Any detected state violations are reported as programming errors. Once the behavior of the subject function with respect to resources and externals of the subject function is determined, model parser 702 forms and stores in model description file 604 a function model describing the behavior of the subject function. Step 908 (FIG. 9) is described more completely below with respect to logic flow diagram 908 (FIG. 24). Processing transfers from step 908 (FIG. 9) to test step 910 in which program parser 704 (FIG. 7) further parses computer program 610 (FIG. 6) to determine whether computer program 610 contains a function which has yet to be analyzed by dynamic inspection engine 706 (FIG. 7) according to step 908 (FIG. 9). In the alternative embodiment described above, program parser 704 (FIG. 6) determines whether a function structure representing a function of computer program 610 has yet to be analyzed by dynamic inspection engine 706 (FIG. 7) according to step 908 (FIG. 9). If dynamic inspection engine 706 (FIG. 7) has not processed a function structure representing a function of computer program 610, processing transfers to step 906 (FIG. 9) in which program parser 704 (FIG. 6) transfers the function structure to dynamic inspection engine 706 (FIG. 7) as described above. Conversely, if dynamic inspection engine 706 (FIG. 7) has processed every function structure representing a function of computer program 610, processing according to logic flow diagram 900 (FIG. 9) terminates. Initialization of Models As described above with respect to step 904 (FIG. 9) of logic flow diagram 900, function models describing the behavior of functions are initialized. Step 904 is shown in greater detail as logic flow diagram 904 (FIG. 10). Processing begins with step 1002 in which model description file 604 (FIG. 6), which contains function models as described above, is opened. In one embodiment, function models are stored in textual format and are read in, then stored in data structures within memory 104 (FIG. 1), which are described more completely below. A function model includes information which identifies a function and a singly-linked list of external models for the externals of the function. The information which identifies the function includes (i) the name of the function, (ii) the name of the source code file in which the function is defined, (iii) the number of the textual line within the source code file at which the definition of the function begins, and (iv) a short description of the function. A source code file is a file stored in memory 104 (FIG. 1), typically in secondary storage such as a magnetic disk, which contains a computer program such as computer program 610. The external models, as stored in a singly-linked list, define the effect of execution of the function on externals of the function in terms of operations applied to those externals and any resources created on behalf of those externals. An external model includes information specifying the type of external, information which identifies the external, and information which specifies the effect on the external of execution of the function. The information which identifies the external is either a parameter number, if the external is a parameter, a variable name, if the external is a global or static variable, or NULL, if the external is a returned item. The information which specifies the effect on the external of execution of the function includes (i) a list of the operations to be applied to the external, (ii) a flag specifying whether a new resource is created on behalf of the external, and (iii) the initial state of the new resource if one is created. The textual format of the models as stored in model description file 604 (FIG. 6) is defined by the following Backus-Naur Form (BNF) definition (8). Backus-Naur Form is a well-known format for describing a formal language.
______________________________________
<function-spec> : := ( <function-prefix> <extern-list> ) (8)
<function-prefix> : :=
<function-name>
[<defining-file> [<defining-line> [<description>]]]
<extern-list> : := <extern> .linevert split. <extern> <extern-list>
<extern> : := ( <extern-type> <result-list> )
<extern-type> : :=
retval // returned item
.linevert split. ( param <param-number> )
// parameter
.linevert split. ( var <var-name> )
// global/static item
<result-list> : := <result> .linevert split. <result> <result-list>
<result> : :=
( op <state-op> )
.linevert split. ( new <initial-state> [<description>] )
<initial-state> ::= A .linevert split. Q .linevert split. U .linevert
split. X .linevert split. E
<state-op> ::= a .linevert split. m .linevert split. k .linevert split. x
.linevert split. i .linevert split. c .linevert split. p
______________________________________
A function model, in textual format, is represented by non-terminal <function-spec> of BNF definition (8). In BNF, a terminal is a term that is not expanded further in a particular BNF definition, and, conversely, a non-terminal is a term that is expanded further. Terminal <function-name> is the identifier assigned to the function, i.e., is the identifier used by another function to call the function represented by the function model. Terminal <function-name> can be any function identifier which is valid according to the computer language with which the function is defined. Terminal <defining-file> is an alphanumeric identification of the source code file within which the function is defined. The alphanumeric identification can be a path description of the source code file, for example. Terminal <defining-line> is a textual representation of a non-negative number, i.e., using digits 0-9, specifying at which textual line of the source code file identified by terminal <defining-file> the definition of the modelled function begins. It should be noted that, in BNF, terms which are optionally present are enclosed in brackets ("[]"). Therefore, in the definition of terminal <function-prefix>, terminals <defining-file>, <defining-line>, and <description> are optionally present. If should be further noted that successive slashes ("//") denote the beginning of a comment and the slashes, and any text following the slashes to the end of a textual line, are not considered part of the BNF definition. Terminal <description> of BNF definition (8) is a series of one or more characters (i.e., letters, numerals, and/or symbols). Terminal <description> is not used by the resource checker 602 (FIG. 6) but is instead provided for the convenience and understanding of a user reading the model in the textual format. Terminal <param-number> of BNF definition (8) is a textual representation of a non-negative integer using the digits 0-9 and specifies a particular parameter in a list of parameters. Parameter zero is the first, i.e., leftmost, parameter in a list of parameters in a call to a function. Subsequent parameters are numbered sequentially. Terminal <var-name> of BNF definition (8) is an identifier of a variable. Thus, function models retrieved from model description file 604 (FIG. 6) each describe the effect of execution of a respective function on externals of the function. Processing transfers from step 1002 (FIG. 10) to loop step 1004 in which each function model stored in model description file 604 (FIG. 6) is retrieved and processed according to a loop defined by loop step 1004 (FIG. 10) and next step 1014. During each iteration of the loop, the function model which is processed is called the current function model. When each and every function model stored in the model description file has been processed according to the loop defined by loop step 1004 and next step 1014, processing transfers from loop step 1004 to step 1006 in which model description file 604 (FIG. 6) is closed and processing according to logic flow diagram 904 (FIG. 10) terminates. For each function model retrieved from the model description file, processing transfers from loop step 1004 to step 1008 in which the portion of the current function model corresponding to non-terminal <function-prefix> of BNF definition (8) above is parsed from the current function model. Processing transfers to step 1010 in which a function model structure is initialized and the information parsed from the current function model in step 1008 is stored in a function model structure. A function model structure 1100 (FIG. 11) includes a field "name" 1102, a field "file" 1110, a field "line" 1112, and a field "description" 1108. Portions of the function model corresponding to terminals <function-name>, <defining-file>, <defining-line>, and <description> of BNF definition (8), all of which are part of non-terminal <function-prefix>, are parsed from the function model and stored in field "name" 1102, field "file" 1110, field "line" 1112, and field "description" 1108, respectively, of function model structure 1100. Processing transfers from step 1010 (FIG. 10) to loop step 1012. Loop step 1012 and next step 1028 define a loop, in each iteration of which an external specified in the portion of the function model corresponding to non-terminal <extern-list> of BNF definition (8) above is processed. During each iteration of the loop defined by loop step 1012 and next step 1028, the currently processed external is called the subject external. After every external defined in the current function model has been processed according to the loop defined by loop step 1012 and next step 1028, processing transfers from loop step 1012 to next step 1014. Processing transfers from next step 1014 to loop step 1004 in which another function model retrieved from model description file 604 (FIG. 6) is processed or, if all function models have been processed, from which processing transfers to step 1006 (FIG. 10) as described above. For each external specified in the portion of the current function model corresponding to non-terminal <extern-list> of BNF definition (8), processing transfers from loop step 1012 to step 1016. In step 1016, a new external model structure, e.g., external model structure 1200 (FIG. 12), is created. External model structure 1200 includes a field "equivalent" 1202, a field "type" 1204, a field "parameter.sub.-- number" 1206, a field "name" 1208, a field "next" 1210, a field "number.sub.-- of.sub.-- operations" 1212, a field "operations" 1214, a field "new.sub.-- resource" 1218, a field "initial.sub.-- state" 1220, and a field "description" 1222. In step 1016 (FIG. 10), the portion of the subject external model corresponding to terminal <param-number> in the definition of non-terminal <external> of BNF definition (8) is parsed from the subject external model and is stored in field "parameter.sub.-- number" 1206 (FIG. 12) of external model structure 1200. In one embodiment, field "equivalent" 1202 is used to identify a second external model structure. By doing so, external model structure 1200 is related to the second external model structure. Such would be appropriate if, for example, the returned item of a function is the first parameter. The embodiment described herein does not make use of field "equivalent" 1202, which is therefore initialized to a NULL value. From step 1016 (FIG. 10), processing transfers to step 1018. In step 1018, the portion of the subject external model corresponding to non-terminal <extern-type> of BNF definition (8), which specifies the type of external represented by the subject external model, is parsed from the subject external model. As shown in BNF definition (8) above, an external represented by an external model can be a returned item, a parameter, or a globally-defined or static variable. Data specifying the type of external represented by the subject external model are stored in field "type" 1204 (FIG. 12) of external model structure 1200. Processing transfers from step 1018 (FIG. 10) to a loop step 1020. As shown in BNF definition (8) above, execution of a function can have one or more effects or "results" on each external of the function. Each result is represented in BNF definition (8) as non-terminal <result>. One or more results are included in non-terminal <result-list>. Loop step 1020 and next step 1024 define a loop in which each result in the list of non-terminal <result-list> of the subject external model is processed. During an iteration of the loop defined by loop step 1020 and next step 1024, the result being processed is called the subject result. After every result of the subject external model has been processed according to the loop defined by loop step 1020 and next step 1024, processing transfers from loop step 1020 to step 1026 which is described below. For each result for the subject external model, processing transfers from loop step 1020 to step 1022. In step 1022, the subject result is parsed from the subject external model. The result is then stored in an external model structure such as external model structure 1200 (FIG. 12). For example, function model (3), which is defined above, specifies one result for a first external, i.e., the returned item, and one result for a second external, i.e., parameter zero. The result of the returned item is specified as `(new Q "memory")`, indicating that a new resource is created for the returned item, the initial state of the resource is state Q, and provides "memory" as a brief description of the resource. Accordingly, if external model structure 1200 represents the external model for the returned item, (i) field "new.sub.-- resource" 1218 is set to a boolean value of "true" to indicate that a new resource is created, (ii) field "initial.sub.-- state" 1220 is set to indicate that the initial state of the new resource is state Q, and (iii) the text "memory" is stored in field description 1222. As a second example, function model (3) above specifies a result "(op c)" for the second external, i.e., parameter zero. Result "(op c)" specifies that operation c is applied to the external. Accordingly, if external model structure 1200 represents the external model for parameter zero, field "number.sub.-- of.sub.-- operations" 1212, which initially has a value of zero, is incremented and an operation identifier "c" is stored in field "operations" 1214 corresponding to a position indicated by field "number.sub.-- of.sub.-- operations" 1212. In this example, field "number.sub.-- of.sub.-- operations" 1212 stores a value of one and the first operation identifier in field "operations" 1214 is an identifier of operation c. If a second operation is applied to the second external, field "number.sub.-- of.sub.-- operations" 1212 is again incremented to a value of two and the second operation identifier in field "operations" 1214 is the identifier of the second operation. Processing transfers from step 1022 (FIG. 10) through next step 1024 to loop step 1020 which is described above. As described above, processing transfers from loop step 1020 to step 1026 once all results for the subject external model have been processed. In step 1026, the external model structure representing the subject external model is added to a singly linked list of externals in the current function model structure. An illustrative example is discussed in the context of function model (3) above. An external model structure 1200A (FIG. 13) is first added to a function model structure 1100A by storing in fields "first.sub.-- external" 1104A and "last.sub.-- external" 1106A pointers to external model structure 1200A. A second external model structure 1200B is then added to function model structure 1100A by storing in field "next" 1210A of external model structure 1200A, and in field "last.sub.-- external" 1106A of function model structure 1100A (superseding the pointer previously stored in field "last.sub.-- external" 1106A), a pointer to external model structure 1200B as shown in FIG. 13. Processing transfers from step 1026 (FIG. 10) through next step 1028 to loop step 1012. After every external model has been processed as described above, processing transfers from loop step 1012 through next step 1014 to loop step 1004. After every function model has been processed as described above, processing transfers from loop step 1004 to step 1006 in which the file containing function models in the textual format described above is closed as described above. Processing according to logic flow diagram 904 terminates after step 1006. Internal Representation of a Function Once computer program 610 (FIG. 6) is parsed by program parser 704 (FIG. 7), computer program 610 is represented in memory 104 by a series of function structures. In an alternative embodiment as described above, program parser 704 retrieves from computer program 610 function structures which have been formed by a previous parsing of a source computer program conforming to a particular computer language, e.g., the C computer language. The source computer program is parsed by a source code preprocessor which parses the source computer program according to the computer language to which the source computer program comports and forms and stores in computer program 610 function structures representing the functions defined in the source computer program. The source code preprocessor (not shown) is a separate computer process from resource checker 602. In this alternative embodiment, the source code preprocessor is based on the known GNU C compiler available from Free Software Foundation, Inc. of Cambridge, Mass. Appendix B, which is a part of this disclosure and is incorporated herein in its entirety, is a list of computer instructions which define data structures and functions for transporting parsed functions of a computer program from a source code preprocessor into data structures described more completely below for representing a parsed function. In one embodiment, a conventional compiler, such as the known GNU C compiler described above, is used to parse a computer program and the parsed program is represented in data structures such as those defined in Appendix B. The following is a description of a function structure. Familiarity with fields and relationships within a function structure facilitates the subsequent description of the processing of dynamic inspection engine 706 (FIG. 7). Function structure 1400 (FIG. 14) represents a function defined by computer program 610 or, in an alternative embodiment as described above, the source computer program and includes (i) a field "name" 1402, (ii) a field "line" 1404, (iii) a field "file" 1406, (iv) a field "result" 1408, (v) a field "externals" 1410, and (vi) a field "statement". Field "name" 1402 of function structure 1400 specifies the identifier of the function represented by function structure 1400. For example, the identifier of function example.sub.-- 1() of source code excerpt (1) above is "example.sub.-- 1". Field "file" 1406 and field "line" 1404 specify the source code file and line number within that file, respectively, at which the function represented by function structure 1400 is defined. For example, if source code excerpt (1) above represents the entire contents of a single source code file whose file name is "example.sub.-- 1.c", field "file" 1406 and field "line" 1404 of a function structure representing function example.sub.-- 1() contain, respectively, data specifying the text "example.sub.-- 1.c" and an integer value of seven (7). Field "result" 1408 points to a declaration structure 1418, which is analogous to declaration structure 1506 described below and which specifies the type of result returned by the function represented by function structure 1400. For example, function example.sub.-- 1() of source code excerpt (1) above returns a result which is an integer, i.e., data of the type "int", as specified at line 7 of source code excerpt (1). Thus, if function structure 1400 represents function example.sub.-- 1(), field "result" 1408 points to declaration structure 1418 which specifies integer data. Field "externals" 1410 of function structure 1400 is a pointer to an external list structure 1414, which is described below in greater detail. As described more completely below, external list structures such as external list structure 1414 include a pointer which is used to link external list structures in a singly-linked list. Thus, pointing to an external list structure is to point to a singly-linked list of external list structures, even if the length of the list is one. Such a singly-linked list, which is pointed to by field "externals" 1410 of function structure 1400, includes external list structures representing the externals of the function represented by function structure 1400. Field "first.sub.-- stmt" 1412 of function structure 1400 is a pointer to a statement structure 1416, which is described below in greater detail. As described more completely below, statement structures such as statement structure 1416 include a pointer which is used to link statement structures in a singly-linked list. Thus, pointing to a statement structure is to point to a singly-linked list of statement structures, even if the length of the list is one. Such a singly-linked list, which is pointed to by field "first.sub.-- stmt" 1412 of function structure 1400, includes statement structures representing the statements of the function represented by function structure 1400. External List Structures External list structure 1414 is shown in greater detail in FIG. 15. External list structure 1414 represents an external of the function represented by function structure 1400 (FIG. 14) and includes a field "first.sub.-- decl" 1502 (FIG. 15), a field "next" 1504, and a field "first.sub.-- external" 1510. Field "first.sub.-- decl" 1502 is a pointer to a declaration structure 1506, which specifies the data type of the external represented by external list structure 1414 and which is described below in greater detail. Field "next" 1504 is a pointer to another external list structure 1508 if external list structure 1508 immediately follows external list structure 1414 in the singly-linked list of externals. If no external list structure follows external list structure 1414 in the singly-linked list of external list structures, field "next" 1504 of external list structure 1414 is NULL, i.e., contains NULL data. Field "first.sub.-- external" 1510 is a pointer to an external state structure (not shown) which specifies the state of the external represented by external list structure 1414 and which is described below in greater detail. Declaration Structures Declaration structure 1506 is shown in greater detail in FIG. 16. A declaration structure is a structure which specifies a declared variable or function, i.e., a variable or function, respectively, specified in a declaration. Declarations in the context of the C computer language are well-known and are described in the C Standard. Declaration structure 1506 includes a field "kind" 1602, a field "name" 1604, and field "type" 1606, a field "item" 1608, and a field "model" 1610. Field "kind" 1602 contains data specifying whether the declared item or function is globally defined, static, or a locally defined. Field "name" 1604 contains textual data specifying an identifier of the item or function. As described above, in the context of the C computer language, an item or function is identified by a textual identifier and identifiers must conform to a specific format, which is described in Section 6.1.2 of the C Standard. Field "type" 1606 of declaration structure 1506 is a pointer to a type structure 1612 which specifies the particular type of data represented by the declared item or function. Type structure 1612 is described below. Field "item" 1608 is a pointer to item structure 2700 which represents the declared item. If declaration structure 1506 represents a declared function, field "item" 1608 is NULL and therefore points to no item structure. Field "model" 1610 of declaration structure 1506 is a pointer to function model structure 1100 if declaration structure 1506 represents a declaration of a function whose model is represented by function model structure 1100. If declaration structure 1506 does not represent a declaration of a function, field "model" 1610 is NULL, i.e., contains NULL data, and therefore points to no function model structure. Furthermore, if declaration structure 1506 represents a declaration of a function for which no function model structure exists, field "model" 1610 is NULL. Type Structures Type structure 1612 is shown in greater detail in FIG. 17. A type structure such as type structure 1612 specifies a particular data type, such as integer, floating point, alphanumeric characters, and user-defined types such as structures. Type structure 1612 includes a field "kind" 1702, a field "name" 1704, a field "size" 1706, a field "points.sub.-- to" 1708, and a field "fields" 1710. Field "kind" 1702 contains data specifying whether the type represented by type structure 1612 is integer, real (i.e., floating point numerical data), pointer, array, structure (i.e., data type "struct" as defined for the C computer language), or union. Each of these types are well-known and are described in the C Standard at Sections 6.1.2.5 and 6.5 et seq. Field "name" 1704 of type structure 1612 contains alphanumeric data specifying the identifier of the type if the type represented by type structure 1612 is user-defined. Otherwise, if the type represented by type structure 1612 is predefined by the C computer language, field "name" 1704 is NULL. Field "size" 1706 specifies the size of the type represented by type structure 1612. If the type is not an array, field "size" 1706 specifies the number of bits of data included in an item of the type represented by type structure 1612. For example, if the type is a 32-bit integer, field "size" 1706 of type structure 1612 specifies the value 32. If the type is an array, field "size" 1706 specifies the number of bits of data included in the entire array, i.e., the number of bits of data included in an item of the type represented by an element of the array multiplied by the number of elements in the entire array. For example, a declaration "int array[10];" declares an array with ten (10) elements. If the type "int" is a 32-bit integer, the size of the declared array is therefore ten (10) elements multiplied by 32 bits. The size of the array is therefore 320 bits. If the type represented by type structure 1612 is a pointer to a second type of data, field "points.sub.-- to" 1708 is a pointer to a type structure representing the second type of data, i.e., to type structure 1712. Type structure 1712 is analogous to type structure 1612. Conversely, if the type represented by type structure 1612 is not a pointer, field "points.sub.-- to" 1708 is NULL. If the type represented by type structure 1612 is a structure type (i.e., type "struct" of the C computer language) or a union type, field "fields" 1710 is a pointer to field structure 1714 representing the first field of the structure type or union type, respectively. As described more completely below, field structures corresponding to fields of a particular structure type or union type are linked to form a singly-linked list. If the type represented by type structure 1612 is neither a structure type nor a union type, field "fields" 1710 of structure type 1612 is NULL. Field Structures Field structure 1714 is shown in greater detail in FIG. 18. Field structure 1714 includes a field "name" 1802, a field "size" 1804, a field "offset" 1806, and a field "next" 1808. Field structure 1714 is described in the context of the illustrative example of the following type definition according to the C computer language. typedef struct { (9) int x; int y; } point; The type definition of source code excerpt (9) defines a structure type whose identifier is "point" and which has two fields. Type "point" is therefore a structure type. Each field is of the type "int", which is typically a 32-bit integer, and has either of respective identifiers "x" and "y". Field "name" 1802 of field structure 1714 contains alphanumeric data specifying the identifier of the field represented by field structure 1714. For example, field "name" of a field structure representing the first field of the structure defined in source code excerpt (9) contains the text "x". Field "size" 1804 of field structure 1714 specifies the number of bits of data contained in the field represented by field structure 1714. For example, in a typical implementation of the C computer language, such as that compiled by the SunOS C compiler available from Sun Microsystems, Inc. of Mountain View, Calif., an item of type "int" is 32 bits in length. In the example of source code excerpt (9), each field is a 32-bit integer and therefore contains 32 bits of data. Accordingly, field "size" of each field structure representing each respective field specifies the integer value 32. Field "offset" 1806 of field structure 1714 specifies the offset from the beginning of the structure to the data of the field represented by field structure 1714. For example, field "x" in source code excerpt (9) is the first field of type "point" and therefore has an offset of zero. Type "point" is shown diagrammatically in FIG. 19. Field "x" of type "point" is 32 bits in length and begins at offset zero (0) . Field "y" of type "point" is 32 bits in length and begins at offset 32. Accordingly, field "offset" 1806X (FIG. 20) of field structure 1714X, which is directly analogous to field structure 1714 (FIG. 18) and which represents field "x" of type "point", specifies the integer value of zero (0). Similarly, field "offset" 1806Y (FIG. 20) of field structure 1714Y, which is also directly analogous to field structure 1714 (FIG. 18) and which represents field "y" of type "point", specifies the integer value of thirty-two (32). Field "next" 1808 (FIG. 18) of field structure 1714 is a pointer to the next field structure in a singly-linked list of field structures of a given structure type. For example, field "fields" 1710P (FIG. 20) of type structure 1612P representing type "point" points to field structure 1714X, which represents field "x" which in turn is the first field of type "point". The next field of type "point" is field "y". Field "next" 1808X of field structure 1714X therefore points to field structure 1714Y which represents field "y" of type "point". Field "y" of type "point" is the last field of type "point" and is therefore not followed by any other field of type "point". Accordingly, field "next" 1810Y of field structure 1714Y is NULL. Statement Structures As described above, field "first.sub.-- stmt" 1412 (FIG. 14) of function structure 1400 points to statement structure 1416. Statement structure 1416 is shown in greater detail in FIG. 21. Statement structures such as statement structure 1416 represent statements which collectively form a function according to the C computer language. Statement structure 1416 includes the following fields: (i) a field "kind" 2102, (ii) a field "line" 2104, (iii) a field "next" 2106, (iv) a field "flags" 2108, and (v) a field "pointers" 2110. Field "kind" 2102 of statement structure 1416 specifies the kind of statement represented by statement structure 1416. Field "kind" 2102 identifies one of the following kinds of statement: error, declaration, expression, block, "if", "else", "return", loop, "switch", "break", "continue", and "goto". The representation of each of these kinds of statement by a statement structure is described below more completely. Field "line" 2104 of statement structure 1416 specifies the textual line on which the statement represented by statement structure 1416 appears within the source code file defining the function represented by function structure 1400 (FIG. 14), and therefore including the statement represented by statement structure 1416. The line on which the statement appears is maintained in statement structure 1416 so that reports of detected errors can specify to the user the specific statement causing the error. Field "next" 2106 (FIG. 21) of statement structure 1416 is a pointer to a second statement structure 2112, which represents the statement immediately following the statement represented by statement structure 1416 in a block of statements. In this way, the statements of a block of statements are represented by statement structures which are linked to form a singly-linked list. If the statement represented by statement structure 1416 (FIG. 21) is the last statement of the block of statements, field "next" 2106 is NULL, and therefore points to no other statement structure. Field "flags" 2108 of statement structure 1416 is an unsigned 32-bit integer whose individual bits are used as flags to indicate which errors associated with the statement represented by statement structure 1416 have been reported to the user. Each time an error is to be reported, the flag of field "flags" 2108 corresponding to the error to be reported is checked. If the flag is set, the error is not reported since the flag indicates that the error has already been reported in the context of the statement represented by statement structure 1416. If the flag is not set, the error is reported and the flag is set to reflect the reporting of the error. In this way, each type of error is reported only once with respect to any particular statement. Field "pointers" 2110 of statement structure 1416 is an array of one or more pointers to structures representing the respective parts of the statement represented by statement structure 1416. The number of pointers in the array depends on the particular kind of statement represented by statement structure 1416. Error, "break", and "continue" statements have no parts; therefore, field "pointers" 2110 is NULL if statement structure 1416 represents an error, break", or "continue" statement. An error statement is a statement which does not conform to the C computer language. "Break" and "continue" statements are well-known and are described in the C Standard at Sections 6.6.6.3 and 6.6.6.2, respectively. A declaration statement includes a declared variable having data of a specified type and perhaps an initial value for that variable. Accordingly, if statement structure 1416 represents a declaration statement, field "pointers" 2110 is an array of two pointers. The first pointer points to a declaration structure representing the declared variable. The second pointer points to an expression structure representing an expression which evaluates to the initial value of the declared variable, if an initial value is specified. Conversely, if no initial value is specified for the declared variable, the second pointer is NULL. An expression statement is a statement which is itself an expression. An expression is a well-known component of the C computer language and is a collection of one or more items, calls to functions, and operators. Every expression in the C computer language has a value. Evaluation of an expression results in an item, whose value is the value of the expression and which is sometimes called the item of the expression. The value of the item of an expression is sometimes called herein the value of the expression. The evaluation of an expression is described more completely below. If statement structure 1416 represents an expression statement, field "pointers" 2110 is an array of one pointer which points to an expression structure, such as expression structure 2200 (FIG. 22). Expression structure 2200 includes a field "kind" 2202, a field "type" 2204, a field "item" 2206, a field "num.sub.-- operands" 2208, and a field "operands" 2210. Field "kind" 2202 specifies the kind of expression represented by expression structure 2200. If the expression involves an operator, field "kind" 2202 specifies that operator. Field "type" 2204 is a pointer to a type structure 2212 which represents the data type of the expression, i.e., the type of item to which the expression evaluates. Type structure 2212 is analogous to type structure 1612 described above. Field "item" 2206 of expression structure 2200 is a pointer to an item structure 2214 which represents the item to which the expression evaluates. Item structure 2214 is analogous to item structure 2700 (FIG. 27) described above. Prior to evaluation of the expression represented by expression structure 2200, field "item" 2206 is NULL. Field "num.sub.-- operands" 2208 specifies the number of operands in the expression represented by expression structure 2200. Field "operands" 2210 is an array of expression structures, each of which represents an operand of the expression represented by expression structure 2200. The length of the array is equal to the number of operands specified in field "num.sub.-- operands" 2208. The various types of expression, which are defined in the C computer language, and the number and type of operands of each type of expression, are well-known and are described in the C Standard. A block statement is a statement which groups together one or more statements. Execution of a block statement is execution of the one or more statements. A block statement has one part, namely, the one or more statements. If statement structure 1416 (FIG. 21) represents a block statement, field "pointers" 2110 is a single pointer which in turn points to the statement structure representing the first statement of the one or more statements. The statement structures representing the one or more statements are linked to form a singly-linked list by using field "next" 2106 as described above. An "if" statement evaluates an expression, which is sometimes called the predicate of the "if" statement, and causes a second statement to be executed if the expression evaluates to a boolean value of "true". If statement structure 1416 represents an "if" statement, field "pointers" 2110 is an array of two pointers. The first pointer points to an expression structure which represents an expression whose value determines whether the second statement is executed. The second pointer points to a statement structure representing the second statement. An "else" statement is immediately preceded by an "if" statement and causes a third statement to be executed if the predicate of the "if" statement evaluates to a boolean value of "false". If statement structure 1416 represents an "else" statement, field "pointers" 2110 is an array of two pointers. The first pointer points to a statement structure which represents the third statement. The second pointer points to an expression structure or is NULL. The expression represented by the expression structure is sometimes called the predicate of the "else" statement. If the second pointer points to an expression structure, the third statement is executed only if the predicate of the "if" statement evaluates to a boolean value of "false" and the predicate of the "else" statement evaluates to a boolean value of "true". This represents an "else if" statement which is generally known and described in the C Standard. If the second pointer is NULL, the third statement is executed only if the predicate of the "if" statement evaluates to a boolean value of "false". A "return" statement terminates execution of a called function and transfers control to a calling function while optionally supplying to the calling function a returned item. Transferring control to a calling function while supplying a returned item to the calling function is called returning the returned item to the calling function. If statement structure 1416 represents a "return" statement, field "pointers" 2110 is a single pointer which points to an expression structure or is NULL. If the pointer points to an expression structure, the expression structure represents the expression which is evaluated to an item which in turn is returned to the calling function. A loop statement causes a second statement to be executed zero or more times. Examples of loop statements in the C computer language are a "for" statement, a "do" statement, and a "while" statement, each of which is generally known and described in the C Standard at Section 6.6.5. If statement structure 1416 represents a loop statement, field "pointers" 2110 is a single pointer which points to a statement structure representing the second statement. A "switch" statement evaluates an expression and transfers control within a block statement to a particular statement within the block statement according to the value to which the expression evaluates. The expression is sometimes called the predicate of the "switch" statement. If statement structure 1416 represents a "switch" statement, field "pointers" 2110 is an array of two pointers. The first pointer points to an expression structure which represents an expression according to whose value control transfers. The second pointer points to a statement structure representing the block statement. A "goto" statement causes a transfer of control to a second statement. In one embodiment, if statement structure 1416 represents a "goto" statement, field "pointers" 2110 is an array of a single pointer, which points to a statement structure representing the second statement. In a simpler embodiment, a "goto" statement is treated as terminating execution of the called routine. In this embodiment, a "goto" statement has no parts, and field "pointers" 2110 is an array of zero pointers. Thus, function structure 1400 (FIG. 14) represents a function to be analyzed by dynamic inspection engine 706. Analysis of a Function As described above, dynamic inspection engine 706 (FIG. 7) analyzes each function structure resulting from parsing of computer program 610 (FIG. 6) step 908 (FIG. 9). Step 908 is shown in greater detail in logic flow diagram 908 (FIG. 24). The function structure processed in a performance of step 908 is called the subject function structure. Similarly, the function represented by the subject function structure is called the subject function. In steps 2404 and 2408 of logic flow diagram 908, the subject function is analyzed under different assumptions. As described above in greater detail, the control flow path through a particular function sometimes depends on events which are not know until the function is executed. Even then, the events of one execution of the function may not always occur in every execution of the function. Thus, a function whose flow of control depends on an unknown event is repeatedly analyzed under different assumptions with respect to the unknown event. In one embodiment, every possible control flow path through the function is determined and analyzed. For example, control can flow along one of two possible paths for every "if" statement in the function, and control can flow along one of a number of possible paths for every "switch" statement in the function. In the case of a "switch" statement, the number of possible paths is equal to the number of "case" statements, including a "default" statement if one is present, associated with the "switch" statement. Once all of the possible control flow paths through a function are determined, the function is repeatedly analyzed, once using each possible control flow path through the function. In this way, the function is analyzed in view of all possible events which might affect flow of control through the function. In a simpler embodiment, the particular control flow path through a function is chosen randomly by making random assumptions with respect to events at each "if" statement and each "switch" statement within the function. The function is analyzed repeatedly and different control flow paths are selected randomly. The number of times the function is analyzed is chosen such that there is a substantial likelihood that every possible control flow path through the function is analyzed, or alternatively can be chosen to limit the amount of effort that is expended to analyze any one routine. Steps 2404 and 2408 (FIG. 24) illustrate the latter, simpler embodiment. In step 2404, the number of times the subject function is analyzed is determined. Step 2404 is shown in greater detail as logic flow diagram 2404 (FIG. 25). In step 2502, the number of times an "if" statement is used in the subject function is determined. Specifically, execution engine 802 (FIG. 8) compares field "kind" 2102 (FIG. 21) of each statement structure in the singly-linked list of statement structures pointed to, directly or indirectly, by field "first.sub.-- statement" 1412 (FIG. 14) of the function structure 1400 to data indicating an "if" statement. The number of times field "kind" of a statement structure matches data indicating an "if" statement is recorded as the number of times an "if" statement is used in the subject function. From step 2502, processing transfers to step 2504 in which the number of times the subject function is analyzed is determined. In one embodiment, the number of times the subject function is analyzed corresponds to the number of times the "if" statement is used in the subject function as shown in Table M.
TABLE M
______________________________________
No. of times the function is
No. of "if"s
analyzed
______________________________________
0 | ||||||
