Using program flow graph

Computer process resource modelling method and apparatus

5694539

Abstract

An error detection mechanism for detecting programming errors in a computer program. A component of the computer program, e.g., a procedure or function of the computer program, is analyzed to determine the effect of the component on resources used by the computer program. A component is analyzed by traversing the computer instructions, i.e., statements, of the component and tracking the state of resources used by the component as affected by the statements of the component. Each resource has a prescribed behavior represented by a number of states and transitions between states. Violations in the prescribed behavior of a resource resulting from an emulated execution of the statements of the component are detected and reported as programming errors. Resources used by two or more components are modelled by modelling externals of the components. The effect of execution of a component on externals and resources of the component is determined by traversing one or more possible control flow paths through the component and tracking the use of each external and resource by each statement of each control flow path. Once the effect of execution of a component on externals and resources of the component is determined, a model of the component is created and used to model externals and resources of other components which invoke the modelled component.
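The state tracking summarized above can be illustrated with a minimal sketch in C. The state names, the operation names, and the functions apply_op and check_path are hypothetical and are chosen only for illustration; they are not taken from the disclosure or from Microfiche Appendix A. The sketch models one resource along one control flow path and reports the first statement whose effect leaves the prescribed behavior.

```c
#include <stddef.h>

/* Hypothetical states for a dynamically allocated resource. */
typedef enum { RS_UNALLOCATED, RS_ALLOCATED, RS_FREED, RS_ERROR } res_state;

/* Hypothetical effects a statement may have on the resource. */
typedef enum { OP_ALLOC, OP_USE, OP_FREE } res_op;

/* Model the transition caused by one statement. Any transition that the
 * assumed prescribed behavior does not allow leads to RS_ERROR. */
res_state apply_op(res_state s, res_op op)
{
    switch (op) {
    case OP_ALLOC: return (s == RS_UNALLOCATED) ? RS_ALLOCATED : RS_ERROR;
    case OP_USE:   return (s == RS_ALLOCATED) ? RS_ALLOCATED : RS_ERROR;
    case OP_FREE:  return (s == RS_ALLOCATED) ? RS_FREED : RS_ERROR;
    }
    return RS_ERROR;
}

/* Emulate one control flow path, given as the sequence of effects of its
 * statements. Returns the index of the first violating statement, or -1
 * if every modelled transition is valid. */
int check_path(const res_op *ops, size_t n)
{
    res_state s = RS_UNALLOCATED;
    for (size_t i = 0; i < n; i++) {
        s = apply_op(s, ops[i]);
        if (s == RS_ERROR)
            return (int)i;
    }
    return -1;
}
```

Under these assumptions, a path that allocates, uses, and then frees the resource is accepted, while a path that uses the resource after freeing it is reported at the offending statement.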


Claims

What is claimed is:

1. A method for detecting programming errors in a component of a computer program, the component comprising one or more statements, the method comprising:

(a) defining one or more valid states for a resource which has a resource state;

(b) traversing a control flow path through the one or more statements;

(c) for each of the one or more statements along the control flow path, performing the following steps:

(i) evaluating an effect of the statement on the resource state of the resource; and

(ii) modelling a transition from the resource state to a new resource state of the resource in accordance with the effect; and

(d) detecting that the new state is not one of the valid states.

2. The method of claim 1 wherein the component is a function.

3. The method of claim 1 wherein the step of traversing comprises:

determining that a selected statement of the one or more statements determines a portion of the control flow path according to a value of a predicate; and

evaluating the predicate to determine the control flow path.

4. The method of claim 3 wherein the step of evaluating the predicate comprises:

determining that the value of the predicate is unknown; and

assigning to the predicate an assumed value.

5. The method of claim 4 wherein the step of evaluating the predicate further comprises:

inferring from the assumed value information regarding the state of the resource.

6. The method of claim 4 wherein the step of evaluating the predicate further comprises:

inferring from the assumed value respective values of one or more operands of the predicate.

7. The method of claim 6 wherein the step of traversing further comprises:

determining that a second selected statement of the one or more statements determines a second portion of the control flow path according to a second value of a second predicate, which comprises one or more of the operands of the first-mentioned predicate; and

evaluating the second predicate to determine the second portion of the control flow path using the inferred respective values of the one or more operands of the first predicate.

8. The method of claim 1 wherein step (c)(i) comprises:

determining whether the statement is a call to a called component;

if the statement is a call to a called component, emulating execution of the called component to evaluate the effect of the called component on the resource state of the resource.

9. The method of claim 8 wherein the step of emulating execution of the called component comprises:

determining that the resource is associated with an external of the called component; and

determining from a model of the called component the effect of execution of the called component on the resource, which is associated with the external.

10. The method of claim 1 further comprising:

reporting to a user that the resource state is not one of the valid states.

11. The method of claim 1 further comprising:

identifying to a user the statement whose effect on the resource state causes a transition to a new resource state which is not one of the valid states.

12. A method for detecting programming errors in a component of a computer program, the component comprising one or more statements, the method comprising:

(a) defining one or more valid states for a resource which has a resource state;

(b) defining one or more valid state transitions, each state transition being from a first of the valid states to a second of the valid states; and

(c) traversing a control flow path through the one or more statements;

(d) for each of the one or more statements along the control flow path, performing the following steps:

(i) evaluating an effect of the statement on the resource state of the resource; and

(ii) modelling a change in the resource state of the resource in accordance with the effect; and

(e) detecting that the change is not one of the valid state transitions.

13. The method of claim 12 wherein the step of traversing comprises:

determining that a selected statement of the one or more statements determines a portion of the control flow path according to a value of a predicate; and

evaluating the predicate to determine the control flow path.

14. The method of claim 13 wherein the step of evaluating the predicate comprises:

determining that the value of the predicate is unknown; and

assigning to the predicate an assumed value.

15. The method of claim 14 wherein the step of evaluating the predicate further comprises:

inferring from the assumed value information regarding the state of the resource.

16. The method of claim 14 wherein the step of evaluating the predicate further comprises:

inferring from the assumed value respective values of one or more elements of the predicate.

17. The method of claim 16 wherein the step of traversing further comprises:

determining that a second selected statement of the one or more statements determines a second portion of the control flow path according to a second value of a second predicate, which comprises one or more of the elements of the first-mentioned predicate; and

evaluating the second predicate to determine the second portion of the control flow path using the inferred respective values of the one or more elements of the first predicate.

18. The method of claim 12 wherein step (d)(i) comprises:

determining whether the statement is a call to a called component;

if the statement is a call to a called component, emulating execution of the called component to evaluate the effect of the called component on the resource state of the resource.

19. The method of claim 18 wherein the step of emulating execution of the called component comprises:

determining that the resource is associated with an external of the called component; and

determining from a model of the called component the effect of execution of the component on the resource, which is associated with the external.

20. The method of claim 12 further comprising:

reporting to a user that the resource state is not one of the valid states.

21. The method of claim 12 further comprising:

identifying to a user the statement whose effect on the resource state causes a transition to a new resource state which is not one of the valid states.

22. A method for modelling the behavior of a component of a computer program, the component including an external, the method comprising:

(a) defining one or more valid states for the external, which has an external state;

(b) traversing a control flow path, which comprises one or more statements of the component;

(c) for each of the one or more statements, performing the following steps:

(i) determining that execution of the statement has an effect on the external state of the external; and

(ii) modelling a transition from the external state to a new external state of the external in accordance with the effect; and

(d) building a model which represents a collective effect of the statements of the component on the external.

23. The method of claim 22 wherein the step of traversing comprises:

determining that a selected statement of the one or more statements determines a portion of the control flow path according to a value of a predicate; and

evaluating the predicate to determine the control flow path.

24. The method of claim 23 wherein the step of evaluating the predicate comprises:

determining that the value of the predicate is unknown; and

assigning to the predicate an assumed value.

25. The method of claim 24 wherein the step of evaluating the predicate further comprises:

inferring from the assumed value information regarding the state of the resource.

26. The method of claim 24 wherein the step of evaluating the predicate further comprises:

inferring from the assumed value respective values of one or more components of the predicate.

27. The method of claim 26 wherein the step of traversing further comprises:

determining that a second selected statement of the one or more statements determines a second portion of the control flow path according to a second value of a second predicate, which comprises one or more of the elements of the first-mentioned predicate; and

evaluating the second predicate to determine the second portion of the control flow path using the inferred respective values of the one or more elements of the first predicate.

28. The method of claim 24 wherein the step of evaluating the predicate further comprises:

inferring from the assumed value information regarding the state of the external.

29. The method of claim 22 wherein step (c)(i) comprises:

determining whether the statement is a call to a called component;

if the statement is a call to a called component, emulating execution of the called component to evaluate the effect of the called component on the resource state of the resource.

30. The method of claim 29 wherein the step of emulating execution of the called component comprises:

determining that the resource is associated with an external of the called component; and

determining from a model of the called component the effect of execution of the component on the resource, which is associated with the external.

31. The method of claim 22 further comprising:

(e) traversing a second control flow path, which comprises a second collection of one or more statements;

(f) for each statement of the second collection of one or more statements, performing the following steps:

(i) determining that execution of the statement has an effect on the external state of the external; and

(ii) modelling a transition from the external state to a second new external state of the external in accordance with the effect;

(g) forming from the first-mentioned new external state and the second new external state a composite external state of the external.

32. The method of claim 31 wherein the step of traversing comprises:

determining that a selected statement of the one or more statements determines a portion of the second control flow path according to a value of a predicate; and

evaluating the predicate to determine the second control flow path.

33. The method of claim 32 wherein the step of evaluating the predicate comprises:

determining that the value of the predicate is unknown; and

assigning to the predicate an assumed value.

34. The method of claim 33 wherein the step of evaluating the predicate further comprises:

inferring from the assumed value information regarding the state of the resource.

35. The method of claim 33 wherein the step of evaluating the predicate further comprises:

inferring from the assumed value respective values of one or more elements of the predicate.

36. The method of claim 33 wherein the step of evaluating the predicate further comprises:

inferring from the assumed value information regarding the state of the external.

37. The method of claim 35 wherein the step of traversing further comprises:

determining that a second selected statement of the one or more statements determines a second portion of the control flow path according to a second value of a second predicate, which comprises one or more of the elements of the first-mentioned predicate; and

evaluating the second predicate to determine the second portion of the control flow path using the inferred respective values of the one or more elements of the first predicate.

38. The method of claim 31 wherein step (f)(i) comprises:

determining whether the statement is a call to a called component;

if the statement is a call to a called component, emulating execution of the called component to evaluate the effect of the called component on the resource state of the resource.

39. The method of claim 38 wherein the step of emulating execution of the called component comprises:

determining that the resource is associated with an external of the called component; and

determining from a model of the called component the effect of execution of the component on the resource, which is associated with the external.

40. The method of claim 31 further comprising:

deriving from the composite state of the external a model of the behavior of the component.

41. The method of claim 40 further comprising:

including in the model of the behavior of the component information prescribing one or more transitions in the state of the external.

42. The method of claim 22 further comprising:

using the model during analysis of a calling component of the computer program, the calling component including a call to the first-mentioned component, to determine the effect of execution of the first component on the external.

43. A method for modelling the behavior of a component of a computer program, the component prescribing use of a resource, the method comprising:

(a) defining one or more valid states for the resource, which has a resource state;

(b) traversing a control flow path, which comprises one or more statements of the component;

(c) for each of the one or more statements, performing the following steps:

(i) determining that execution of the statement has an effect on the resource state of the resource; and

(ii) modelling a transition from the resource state to a new resource state of the resource in accordance with the effect; and

(d) building a model which represents a collective effect of the statements of the component on the resource.

44. The method of claim 43 wherein the step of traversing comprises:

determining that a selected statement of the one or more statements determines a portion of the control flow path according to a value of a predicate; and

evaluating the predicate to determine the control flow path.

45. The method of claim 44 wherein the step of evaluating the predicate comprises:

determining that the value of the predicate is unknown; and

assigning to the predicate an assumed value.

46. The method of claim 45 wherein the step of evaluating the predicate further comprises:

inferring from the assumed value information regarding the state of the resource.

47. The method of claim 45 wherein the step of evaluating the predicate further comprises:

inferring from the assumed value respective values of one or more elements of the predicate.

48. The method of claim 47 wherein the step of traversing further comprises:

determining that a second selected statement of the one or more statements determines a second portion of the control flow path according to a second value of a second predicate, which comprises one or more of the elements of the first-mentioned predicate; and

evaluating the second predicate to determine the second portion of the control flow path using the inferred respective values of the one or more elements of the first predicate.

49. The method of claim 43 wherein step (c)(i) comprises:

determining whether the statement is a call to a called component;

if the statement is a call to a called component, emulating execution of the called component to evaluate the effect of the called component on the resource state of the resource.

50. The method of claim 49 wherein the step of emulating execution of the called component comprises:

determining that the resource is associated with an external of the called component; and

determining from a model of the called component the effect of execution of the component on the resource, which is associated with the external.

51. The method of claim 43 further comprising:

(e) traversing a second control flow path, which comprises a second collection of one or more statements;

(f) for each statement of the second collection of one or more statements, performing the following steps:

(i) evaluating an effect of the statement on the resource state of the resource; and

(ii) modelling a transition from the resource state to a second new resource state of the resource in accordance with the effect;

(g) forming from the first-mentioned new resource state and the second new resource state a composite resource state of the resource.

52. The method of claim 51 wherein the step of traversing comprises:

determining that a selected statement of the one or more statements determines a portion of the control flow path according to a value of a predicate; and

evaluating the predicate to determine the control flow path.

53. The method of claim 52 wherein the step of evaluating the predicate comprises:

determining that the value of the predicate is unknown; and

assigning to the predicate an assumed value.

54. The method of claim 53 wherein the step of evaluating the predicate further comprises:

inferring from the assumed value information regarding the state of the resource.

55. The method of claim 53 wherein the step of evaluating the predicate further comprises:

inferring from the assumed value respective values of one or more elements of the predicate.

56. The method of claim 55 wherein the step of traversing further comprises:

determining that a second selected statement of the one or more statements determines a second portion of the control flow path according to a second value of a second predicate, which comprises one or more of the elements of the first-mentioned predicate; and

evaluating the second predicate to determine the second portion of the control flow path using the inferred respective values of the one or more elements of the first predicate.

57. The method of claim 51 wherein step (g)(i) comprises:

determining whether the statement is a call to a called component;

if the statement is a call to a called component, emulating execution of the called component to evaluate the effect of the called component on the resource state of the resource.

58. The method of claim 57 wherein the step of emulating execution of the called component comprises:

determining that the resource is associated with an external of the called component; and

determining from a model of the called component the effect of execution of the component on the resource, which is associated with the external.

59. The method of claim 51 further comprising:

deriving from the composite state of the resource a model of the behavior of the component.

60. The method of claim 59 further comprising:

including in the model of the behavior of the component information prescribing one or more transitions in the state of the resource.

61. The method of claim 43 further comprising:

using the model during analysis of a calling component of the computer program, the calling component including a call to the first-mentioned component, to determine the effect of execution of the first component on the resource.

62. A resource checker for detecting programming errors in a component of a computer program, the component comprising one or more statements, the resource checker comprising:

a resource behavior model which defines one or more valid states for a resource which in turn has a resource state;

an execution engine which traverses a control flow path through the one or more statements and which determines that each of the one or more statements has a respective effect on the resource state of the resource; and

a state machine which models a transition from the resource state to a new resource state of the resource in accordance with the effect of each of the one or more statements and which detects that the new state is not one of the valid states.

63. The resource checker of claim 62 wherein the component is a function.

64. The resource checker of claim 62 wherein the execution engine determines that a selected statement of the one or more statements determines a portion of the control flow path according to a value of a predicate and evaluates the predicate to determine the control flow path.

65. The resource checker of claim 64 wherein the execution engine evaluates the predicate by determining that the value of the predicate is unknown and by assigning to the predicate an assumed value.

66. The resource checker of claim 65 wherein the execution engine infers, from the assumed value, information regarding the state of the resource.

67. The resource checker of claim 65 wherein the execution engine infers, from the assumed value, respective values of one or more elements of the predicate.

68. The resource checker of claim 67 wherein the execution engine traverses the control flow path by determining that a second selected statement of the one or more statements determines a second portion of the control flow path according to a second value of a second predicate, which comprises one or more of the elements of the first-mentioned predicate; and

further wherein the execution engine evaluates the second predicate to determine the second portion of the control flow path using the inferred respective values of the one or more elements of the first predicate.

69. The resource checker of claim 62 wherein the execution engine determines whether a selected statement of the one or more statements is a call to a called component;

further wherein the execution engine emulates execution of the called component to evaluate the effect of the called component on the resource state of the resource if the selected statement is a call to a called component.

70. The resource checker of claim 69 wherein the execution engine emulates execution of the called component by determining that the resource is associated with an external of the called component and by determining from a model of the called component the effect of execution of the component on the resource, which is associated with the external.


Description

CROSS REFERENCE TO MICROFICHE APPENDIX

Appendix A, which is a part of this disclosure, is a microfiche appendix consisting of 2 sheets of microfiche having a total of 124 frames. Microfiche Appendix A is a list of computer programs and related data in one embodiment of the present invention, which is described more completely below.

A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent files or records, but otherwise reserves all copyright rights whatsoever.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to the analysis of computer programs and, in particular, to the detection of programming errors in a computer program through analysis of the use of resources prescribed by the computer program.

2. Discussion of Related Art

Some existing programming error detection methods detect violations in the computer instruction protocol with which a particular program comports. Such a programming error detection method is called "static checking" since the syntax of the computer instructions, or "statements", of the computer program is analyzed outside the context of the behavior resulting from the execution of those statements. The term "statement" is used herein as it is defined in Section 6.6 of American National Standard for Programming Languages--C (American National Standards Institute/International Organization for Standardization ANSI/ISO 9899-1990), which is reproduced in Herbert Schildt, The Annotated ANSI C Standard, (Osborne McGraw-Hill 1990) (hereinafter the C Standard). Briefly, in the context of the C computer language, a statement is a computer instruction other than a declaration. In other words, a statement is any expression or instruction which directs a computer to carry out one or more processing steps. Static checking in the context of the C computer language includes, for example, (i) making sure that no two variables in the computer program are identified by the same name; (ii) ensuring that each "break" statement corresponds to a preceding "while", "for", or "switch" statement; and (iii) verifying that operators are applied to compatible operands. Static checking is discussed, for example, in Alfred V. Aho et al., Compilers, (Addison Wesley 1988).

Some existing static checking methods, which are generally called "data flow analysis" techniques, analyze data flow through a program to detect programming errors. Such analysis includes use of control flow information, such as sequencing of statements and loop statements, to detect the improper use of data objects, e.g., the use of a variable before a value has been assigned to the variable. Flow of control in a computer program is the particular sequence in which computer instructions of the computer program are executed in a computer process defined by the computer program. Computer programs and processes and the relation therebetween are discussed more completely below. Data flow techniques are discussed in Beizer, Software Testing Techniques, (1990) at pp. 145-172.
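As a minimal sketch of the kind of check such data flow techniques perform, the following C fragment scans a straight-line sequence of statements and finds the first use of a variable to which no value has yet been assigned. The types stmt and stmt_kind and the function first_use_before_assign are hypothetical; a practical analysis would also follow branches and loops, which this sketch omits.

```c
#include <stdbool.h>
#include <stddef.h>

enum { MAX_VARS = 8 };   /* arbitrary limit assumed for this sketch */

typedef enum { ST_ASSIGN, ST_USE } stmt_kind;

typedef struct {
    stmt_kind kind;   /* whether the statement assigns or uses the variable */
    int var;          /* index of the variable, 0 .. MAX_VARS - 1 */
} stmt;

/* Returns the index of the first statement that uses a variable before any
 * assignment to it, or -1 if no such statement exists in the sequence. */
int first_use_before_assign(const stmt *seq, size_t n)
{
    bool assigned[MAX_VARS] = { false };
    for (size_t i = 0; i < n; i++) {
        int v = seq[i].var;
        if (v < 0 || v >= MAX_VARS)
            continue;                 /* out-of-range entries are ignored here */
        if (seq[i].kind == ST_ASSIGN)
            assigned[v] = true;
        else if (!assigned[v])
            return (int)i;            /* use with no prior assignment */
    }
    return -1;
}
```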

Existing static checking techniques suffer from the inability to track use of resources through several discrete components of a computer program such as several functions which collectively form a computer program. For example, a variable may be initialized in a first function and used in a calculation in a second, subsequently executed function. By analysis of only the computer instructions of the second function, the variable appears to be used before the variable is initialized, which can be erroneously reported as an error. In addition, existing static checking techniques are static in nature and do not consider particular data values associated with particular data objects. Static analysis is limited to what can be determined without considering the dynamic effects of program execution. Beizer describes several areas for which static analysis is inadequate, including: arrays, especially dynamically calculated indices and dynamically allocated arrays; records and pointers; files; and alternate state tables, representing the different semantics of different types in the same program.
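The example of a variable initialized in one function and used in another can be made concrete with a short C fragment. All names below (counter, counter_init, counter_bump, run_counter) are hypothetical. Viewed alone, counter_bump reads c->count without any preceding assignment in that function; the read is valid only because run_counter calls counter_init first.

```c
typedef struct {
    int count;
} counter;

/* First function: initializes the variable. */
void counter_init(counter *c)
{
    c->count = 0;
}

/* Second function: uses the variable. Analyzed in isolation, c->count
 * appears to be read before it is ever assigned. */
int counter_bump(counter *c)
{
    c->count = c->count + 1;
    return c->count;
}

/* Caller: the order of the calls is what makes the use valid. */
int run_counter(void)
{
    counter c;
    counter_init(&c);
    counter_bump(&c);
    return counter_bump(&c);
}
```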

Static checkers do not detect errors involving calculated addresses corresponding to dynamically allocated memory or calculated indices into arrays. Calculated addresses and indices are addresses and indices, respectively, which are calculated during the execution of a computer process. Static checkers do not detect such errors in a computer program because checking for such errors typically involves determining the precise values of calculated addresses and indices, which in turn involves consideration of the behavior of the computer program during execution, i.e., as a computer process.

Static checkers do not detect errors involving the use of questionably allocated resources or the use of resources whose state is determined by the value of a variable or other data object. In the C computer language, a resource, e.g., dynamically allocated memory or a file, is questionably allocated. In other words, a function which allocates the resource completes successfully, even if allocation of the resource failed. Whether the allocation succeeded is determined by comparison of the returned item of the function, which is a pointer to the allocated resource, to an invalid value, e.g., NULL. Static checkers do not consider the behavior of a called function but instead only verify that the syntax of the call to the called function comports with the syntax prescribed in the particular computer language. Therefore, static checkers do not detect errors involving use of a resource which is questionably allocated.
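A questionably allocated resource can be illustrated in C with the standard library function malloc, which returns NULL when allocation fails. The function copy_string below is a hypothetical example; it shows the comparison of the returned pointer to NULL which must precede any use of the resource.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated copy of s, or NULL if s is NULL or if the
 * allocation fails. The caller is responsible for freeing the result. */
char *copy_string(const char *s)
{
    if (s == NULL)
        return NULL;

    size_t n = strlen(s) + 1;
    char *p = malloc(n);      /* questionably allocated: p may be NULL */
    if (p == NULL)            /* the comparison to an invalid value */
        return NULL;

    memcpy(p, s, n);          /* on this path p is known to be valid */
    return p;
}
```

Omitting the comparison to NULL would leave the call to memcpy syntactically correct, so a checker which verifies only the syntax of the calls would not report the omission.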

As described above, a static checker does not consider the behavior of a called function. Thus, verifying the use of a resource which spans multiple functions is impossible. For example, if a first function allocates a resource, a second function uses the resource, and a third function deallocates the resource, static checking of any of the first, second, and third functions alone, or of a function calling all three functions, cannot verify the proper use of the resource.
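The three-function situation described above might look as follows in C. The functions buf_create, buf_fill, buf_destroy, and use_buffer are hypothetical. Whether use_buffer handles the buffer properly depends on what each called function does to the buffer, which is not apparent from the syntax of the calls alone.

```c
#include <stdlib.h>
#include <string.h>

/* First function: allocates the resource. May return NULL. */
char *buf_create(size_t n)
{
    return calloc(n, 1);
}

/* Second function: uses the resource. Assumes buf is valid and n > 0. */
void buf_fill(char *buf, size_t n, char c)
{
    memset(buf, c, n - 1);
    buf[n - 1] = '\0';
}

/* Third function: deallocates the resource. */
void buf_destroy(char *buf)
{
    free(buf);
}

/* Caller of all three functions. Returns the length of the filled string,
 * or -1 if the allocation failed. */
int use_buffer(void)
{
    char *buf = buf_create(8);
    if (buf == NULL)
        return -1;
    buf_fill(buf, 8, 'x');
    int len = (int)strlen(buf);
    buf_destroy(buf);
    return len;
}
```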

When an error detection technique employs insufficient information regarding the behavior of a computer program during execution, the errors reported by such a technique are either under-inclusive or over-inclusive. For example, if a function accepts as a parameter a pointer to an allocated resource, e.g., a file, and uses the parameter without comparing the parameter to an invalid pointer, the function contains a possible error. Whether the function contains an error depends on circumstances which are unknown within the context of the function. For example, if the pointer is verified to be a valid pointer before the function is called, there is no error in the function. To report the use of the pointer as an error would clutter an analysis of the function with a falsely reported error, and thus would be over-inclusive. Falsely reporting errors in analysis of a large program, at best, is an inconvenience to a program developer and, at worst, renders analysis of a computer program useless. If the pointer is not checked to be valid prior to calling the function, failure to report the error results in failure to detect an error which can cause an execution of the computer program to be aborted abruptly and can result in the corruption of data structures and possibly in the loss of valuable data.

One particular drawback of the failure of static checking techniques to consider the dynamic behavior of a computer program is the reporting of apparent, but "false", errors, i.e., errors resulting from computer instructions through which control cannot flow. In functions in which control flow paths depend on particular values associated with particular data structures and program variables, control flow cannot be determined without considering the values associated with those data structures and variables which generally in turn cannot be determined without consideration of the behavior of the function during execution. As a result, instructions which are not executed or which are executed only under specific circumstances are generally assumed to always be executed by static checkers.
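A small C function illustrates such a "false" error. The function scaled is hypothetical. The variable factor is assigned only when n is greater than zero and is read only when n is greater than zero, so no execution reads factor before it is assigned. A checker which ignores the value of n, however, also considers the path on which the first "if" statement is skipped and the second is taken, and on that path factor appears to be used before it is assigned.

```c
/* Returns 2 * n for positive n, and 0 otherwise. */
int scaled(int n)
{
    int factor;               /* deliberately not initialized here */

    if (n > 0)
        factor = 2;

    /* ... statements which do not change n or factor ... */

    if (n > 0)
        return n * factor;    /* factor is always assigned on this path */

    return 0;
}
```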

Another type of existing programming error detection technique is called program verification. In program verification, a computer program is treated as a formal mathematical object. Errors in the computer program are detected by proving, or failing to prove, certain properties of the computer program using theoretical mathematics. One property for which a proof is generally attempted is that, given certain inputs, a computer process defined by the computer program produces certain outputs. If the proof fails, the computer program contains a programming error. Such program verification techniques are described, for example, in Eric C. R. Hehner et al., A Practical Theory of Programming, (Verlag 1993) and Ole-Johan Dahl, Verifiable Programming, (Prentice Hall 1992).

Verified programming techniques are limited in at least two ways: (i) only properties of computer programs which can be expressed and automatically proven using formal logic can be verified, and (ii) a person developing a computer program generally must formally specify the properties of the computer program. Formally specifying the properties of a computer program is extremely difficult in any case and intractable for larger programs. As a result, commercially successful products employing verified programming techniques are quite rare.

In another type of programming error detection technique, a computer program is executed, thus forming a computer process, and the behavior of the computer process is monitored. Since a computer program is analyzed during execution, such a programming error detection technique is called "runtime checking". Some runtime checking techniques include automatically inserting computer instructions into a computer program such that execution of the inserted computer instructions notes, during execution of the computer program, the status of variables and resources of the computer program. Such an error detection technique is described by U.S. Pat. No. 5,193,180 to Hastings.

Runtime checking can typically detect errors such as array indices out of bounds and memory leaks. Examples of runtime checking include Purify which is available from Pure Software Inc. of Sunnyvale, Calif. and Insight which is available from Parasoft Corporation of Pasadena, Calif. Purify inserts monitoring computer instructions into a computer program after the computer program has been compiled into an object code form, and Insight inserts monitoring computer instructions into a computer program before the computer program is compiled, i.e., while the computer program is still in a source code form.

Runtime checking is generally limited to what can be determined by actually executing the computer instructions of a computer program with actual, specific inputs. Runtime checking does not consider all possible control flow paths through a computer program but considers only those control flow paths corresponding to the particular inputs to the computer program supplied during execution. It is generally impracticable to coerce a computer process, formed by execution of the computer instructions of a computer program, to follow all possible control flow paths. To do so requires that a programmer anticipate all possible contingencies which might occur during execution of the computer instructions of a computer program and cause or emulate all possible combinations of occurrences of such contingencies.

Furthermore, runtime checking can only be used when the computer program is complete. Analysis of a single function before the function is incorporated into a complete program is impossible in runtime checking since the function must be executed to be analyzed. Analysis of a function using runtime checking therefore requires that (i) all functions of a computer program be developed and combined to form the computer program prior to analysis of any of the functions or (ii) that a special purpose test program, which incorporates the function, be developed to test the function. Top-down programming, which involves the design, implementation, and testing of individual functions prior to inclusion in a complete computer program and which is a widely known and preferred method of developing more complex computer programs, therefore does not lend itself well to runtime analysis.

What is needed is a programming error detection technique which considers the dynamic behavior of a computer program, which automatically considers substantially all possible control flow paths through the computer program, and which does not require a programmer of such a computer program to express the computer program in an alternative, e.g., mathematical, form. What is further needed is a programming error detection technique which analyzes an individual component of a program, considering the behavior of the component during execution. What is further needed is a programming error detection technique which considers the behavior of a component whose execution is invoked by a computer program component under analysis.

SUMMARY OF THE INVENTION

In accordance with the present invention, a computer program is analyzed, and programming errors in the computer program are detected, by modelling the behavior of resources used by the computer program and detecting potential state violations in those resources. A resource is modelled according to resource states and resource state transitions which describe the behavior of the resource. The computer instructions of the computer program are dynamically inspected, i.e., the dynamic behavior of the computer instructions is determined and the states of resources are changed according to the dynamic behavior of the computer instructions.

Each component of a computer program is analyzed individually. Use of a resource whose use spans more than one component, e.g., a resource which is allocated by a first component, used by a second component and deallocated by a third component, is analyzed by modelling the externals of each component. Two components of a computer program communicate with one another through the externals of each component. For example, information regarding a resource allocated by a first component is transmitted to a second component, which uses the resource, through the externals of the first and second components. By analyzing the behavior of each component with respect to the externals of the component, resources whose use spans more than one component are properly modelled.

Each component is analyzed and the effect of execution of the component on each external of the component is determined. From the analysis of the component, a model of the component is created. The model of the component describes the effect of execution of the component on each external of the component in terms of changes in the respective states of the externals and the introduction of new resources associated with any external of the component. Execution of the modelled component can have any of a number of effects on any individual external, and those effects are represented in a composite state of the external. The model of the component can then be used in the analysis of other components which invoke execution of the modelled component.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram of a computer.

FIG. 2 is a block diagram of a computer process component, resources of the component, and other components.

FIGS. 3A and 3B are state diagrams representing the modelling of a resource according to one embodiment of the present invention.

FIGS. 4A, 4B, 5A and 5B are state diagrams representing the modelling of an external according to one embodiment of the present invention.

FIGS. 6 and 7 are block diagrams of a resource checker in accordance with the present invention.

FIG. 8 is a block diagram of a dynamic inspection engine in accordance with the present invention.

FIG. 9 is a logic flow diagram of the analysis of a computer program in accordance with the present invention.

FIG. 10 is a logic flow diagram of the initialization of a model in the logic flow diagram of FIG. 9.

FIG. 11 is a block diagram of a function model structure in accordance with an embodiment of the present invention.

FIG. 12 is a block diagram of an external model structure in accordance with an embodiment of the present invention.

FIG. 13 is a block diagram of a function model structure and two external model structures associated with the function model structure.

FIG. 14 is a block diagram of a function structure in accordance with an embodiment of the present invention.

FIG. 15 is a block diagram of an external list structure in accordance with an embodiment of the present invention.

FIG. 16 is a block diagram of a declaration structure in accordance with an embodiment of the present invention.

FIG. 17 is a block diagram of a type structure in accordance with an embodiment of the present invention.

FIG. 18 is a block diagram of a field structure in accordance with an embodiment of the present invention.

FIG. 19 is a block diagram of a two-field data object.

FIG. 20 is a block diagram of a type structure and two field structures representing the data object of FIG. 19.

FIG. 21 is a block diagram of a statement structure in accordance with an embodiment of the present invention.

FIG. 22 is a block diagram of an expression structure in accordance with an embodiment of the present invention.

FIG. 23 is a block diagram of an expression structure, an associated declaration structure and an associated item structure in accordance with an embodiment of the present invention.

FIG. 24 is a logic flow diagram of the analysis of an individual computer program component according to an embodiment of the present invention.

FIG. 25 is a logic flow diagram of a step in the logic flow diagram of FIG. 24.

FIG. 26 is a logic flow diagram of a single iterative evaluation of a computer program component according to the logic flow diagram of FIG. 24.

FIG. 27 is a block diagram of an item structure in accordance with an embodiment of the present invention.

FIG. 28 is a logic flow diagram of the analysis of a statement in accordance with an embodiment of the present invention.

FIG. 29 is a logic flow diagram of the evaluation of an expression in accordance with an embodiment of the present invention.

FIG. 30 is a block diagram of an external structure in accordance with an embodiment of the present invention.

FIG. 31 is a block diagram of a resource structure in accordance with an embodiment of the present invention.

FIG. 32 is a logic flow diagram of the application of an operation to an item in accordance with an embodiment of the present invention.

FIGS. 33A, 33B, 33C, 33D and 33E are a logic flow diagram of the processing of an operator in accordance with an embodiment of the present invention.

FIG. 34 is a logic flow diagram of the processing of a declaration in accordance with an embodiment of the present invention.

FIG. 35 is a logic flow diagram of the processing of an "if" statement in accordance with an embodiment of the present invention.

FIG. 36 is a logic flow diagram of the processing of a logical operator in accordance with the present invention.

FIG. 37 is a logic flow diagram of the processing of a step of the logic flow diagram of FIG. 36.

FIG. 38 is a logic flow diagram of the processing of another step of the logic flow diagram of FIG. 36.

FIG. 39 is a logic flow diagram of the processing of a "return" statement in accordance with an embodiment of the present invention.

FIG. 40 is a logic flow diagram of the processing of a "block" statement in accordance with an embodiment of the present invention.

FIG. 41 is a logic flow diagram of the detection of resource leaks in accordance with one embodiment of the present invention.

FIG. 42 is a logic flow diagram of the composition of the composite states of an external in accordance with an embodiment of the present invention.

FIG. 43 is a logic flow diagram of the production of a function model from the analysis of the function in accordance with an embodiment of the present invention.

FIG. 44 is a logic flow diagram of the processing of a step of the logic flow diagram of FIG. 43.

FIG. 45 is a logic flow diagram of the assignment of the value of one item to another item in accordance with an embodiment of the present invention.

FIG. 46 is a logic flow diagram of the emulation of a called routine in accordance with an embodiment of the present invention.

DETAILED DESCRIPTION

In accordance with the present invention, errors in a computer program are detected by modelling resources used by the computer program and detecting potential state violations in those resources. A resource is modelled by simulating the behavior of the resource in terms of states of the resource and transitions between those states. Each computer instruction of the computer program is analyzed and the state of the resource is changed according to the effect execution of the computer instruction would have on the resource. State violations, i.e., invalid states and invalid state transitions in the state of the resource, are detected and reported as programming errors. In this way, error detection according to the present invention considers the behavior of a computer process as defined by the computer program, thereby overcoming many of the limitations of static checkers of the prior art.

Each resource has a prescribed behavior which can be described in terms of valid states and valid transitions between those states. A common source of errors in computer programs is the failure of the developer of the computer program to observe the prescribed behavior of a resource. When a computer instruction in the computer program directs a computer to use the resource in violation of the prescribed behavior of the resource, a state violation occurs. An example of a state violation is the reading of a record from a file after the file has been closed when the prescribed behavior of the file dictates that the file must be open to be read.
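By way of illustration only, the file example above can be sketched in the C computer language as a two-state behavior model. The sketch is a deliberate simplification and is not the resource behavior model of the disclosed embodiment; the identifiers file_state, file_op and apply_file_op() are hypothetical, as is the assumption that opening a file which is already open is itself a violation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical two-state model of a file: it is either closed or open. */
enum file_state { FILE_CLOSED, FILE_OPEN };

/* Operations which a computer instruction can direct against the file. */
enum file_op { DO_OPEN, DO_READ, DO_CLOSE };

/*
 * Applies op to the modelled file. Returns true if the prescribed behavior
 * permits op in the current state. Returns false, leaving *state unchanged,
 * if op is a state violation (e.g., a read after the file has been closed).
 */
bool apply_file_op(enum file_state *state, enum file_op op)
{
    assert(state != NULL);
    switch (op) {
    case DO_OPEN:
        if (*state != FILE_CLOSED)
            return false;       /* assumed: re-opening an open file is invalid */
        *state = FILE_OPEN;
        return true;
    case DO_READ:
        return *state == FILE_OPEN;   /* the file must be open to be read */
    case DO_CLOSE:
        if (*state != FILE_OPEN)
            return false;
        *state = FILE_CLOSED;
        return true;
    }
    return false;               /* unknown operation */
}
```

Applying DO_OPEN, DO_READ and DO_CLOSE in that order to a closed file succeeds at every step, whereas a further DO_READ is rejected as a state violation.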

A computer 100 (FIG. 1) includes a central processing unit (CPU) 102, memory 104, and input/output circuitry (I/O circuitry) 106, all of which are interconnected through a bus 108. Memory 104 can include any type of memory, including randomly-accessible memory (RAM), read-only memory (ROM), and secondary storage devices such as magnetic disks. CPU 102 executes from memory 104 a computer process 110, which has access to library functions 112, dynamically allocated memory 114, and a second computer process 116. I/O circuitry 106 includes drivers 106A, 106B, 106C, 106D, and 106E, which drive a video monitor 118, secondary storage 120, a network 126, a locator device such as a mouse 122, and a keyboard 124.

As used herein, a resource is a part of a computer system which is used by a computer process and which generally must be allocated before being used and generally must be deallocated, i.e., freed, after being used. Examples of resources include global memory, files, windows, menus, and dialogs. Resources of computer process 110 include, for example, dynamically allocated memory 114, computer process 116, and magnetic disk 120.

As used herein, a computer process is a series of steps carried out by a computer. A computer program is a series of instructions which can be carried out by a computer. It should be understood that the instructions of a computer program define the steps which, when carried out by a computer, form a computer process. Thus, to model the behavior of computer process 110, the computer program defining computer process 110 is analyzed.

ANALYZING AT THE FUNCTION LEVEL

Computer programs are typically a combination of previously developed components and newly developed code. As used herein, "code" refers to source code, i.e., computer instructions in human intelligible form, and/or object code, i.e., computer instructions in computer intelligible form. A component of a computer program is a collection of computer instructions and/or data structures which are previously developed to perform a specified process fragment and which have typically been tested to ensure that the process fragment is performed faithfully by the component. A process fragment is one or more of the steps of a computer process, i.e., is a fragment of the computer process. A developer of a computer program uses such components to perform the specified process fragments and typically trusts that the components, when executed, perform as specified. Such components can include invocations of execution of, i.e., calls to, components previously developed by the developer or components acquired commercially. Thus, redundancy in developing a computer program is avoided.

A new computer program is typically developed by combining previously developed components and interconnecting those components using newly written computer instructions. The result of such combining and interconnecting can be either a new computer program or a new component that can be used by other components or computer programs. A component of a computer program defines a process fragment of the computer process defined by the computer program. Each process fragment of a computer process can alter the state of a resource used by the computer process. Thus, to properly analyze the state and state transitions of a resource used by a computer process, the effect on the state of the resource resulting from execution of the process fragment as defined by the component of the computer program must be ascertained. As an example, properly analyzing the use of a resource, which is allocated in a first process fragment defined by a first component, used in a second process fragment defined by a second component, and deallocated in a third process fragment defined by a third component, requires analysis of the effect of each of the first, second and third process fragments on the resource.

Computer programs can be written in any of a number of computer languages. Traditional computer languages are procedural in that the computer instructions of a computer program are organized into components, sometimes called procedures or "functions", each of which is designed to carry out a particular process fragment when executed. Examples of procedural languages include C, Ada, Pascal, Fortran, and Basic. Some procedural languages are object-oriented, such as C++ and SmallTalk. In object-oriented computer languages, functions and data structures are combined into objects which are in turn organized into components known as "classes".

Some computer languages are graphics-based in that instructions are represented as graphical images which are displayed on a computer screen and which are linked by a programmer to form a computer program. For example, Microsoft Visual Basic, which is available from Microsoft Corporation of Redmond, Wash., is such a graphics-based computer language. Some computer languages are specific to a particular software product such as the Microsoft Word Basic computer language for the Microsoft Word word processor available from Microsoft Corporation or the Lotus 1-2-3 macro language for the Lotus 1-2-3 Spreadsheet product available from Lotus Development Corporation of Cambridge, Mass. The present invention is applicable to any computer language, i.e., to any computer instruction protocol, in which resources are used. While source code computer instruction protocols are described above, it is appreciated that the teachings herein are equally applicable to computer instructions in the form of object code. In the illustrative embodiment described herein, the particular computer language analyzed is the well-known C computer language as described in the C Standard.

Computer programs written in the C computer language are typically divided into a number of functions. A function, when executed, accepts as input zero or more parameters and produces as output one returned item or no returned item. The parameters and the returned item are data structures which are stored in memory, such as memory 104, and which include data accessible by the function. An illustrative example of a function defined in the C computer language is given below in computer code excerpt (1).

In the illustrative embodiment described herein, each function of a computer program is analyzed individually. A function is analyzed by modelling changes to and uses of the resources, externals and items of the function effected by the computer instructions of the function. An item of a function represents a location in memory, such as memory 104, that is accessible by the function. An item has a type and a value. Types of items supported in one embodiment of the present invention include integer, floating point, and pointer data. The value of an item is the value represented by the particular data stored in the location of memory represented by the item. An external and a resource can be associated with each item of a function. Items are described more completely below. A variable is an association between an identifier and one or more items.

An external of a function represents a part of a computer process which exists outside of the context of the function, i.e., before execution of the function begins or after execution of the function terminates. Examples of externals of a function include the parameters and returned item of the function, globally defined variables, and static variables. The terms (i) "globally defined variables" and (ii) "static variables" are used herein to describe, respectively, (i) variables with "extern" linkage and (ii) variables with "intern" linkage and "static" storage duration. "Locally-defined variables" are variables with "intern" linkage and "automatic" storage duration. Linkage is discussed in the C Standard at Section 6.1.2.2, and storage duration is discussed in the C Standard at Section 6.1.2.4. Briefly, a globally-defined variable is defined for all process fragments of a computer process, and a static variable is defined for a number of process fragments, but not necessarily all process fragments, of a computer process.

Each process fragment uses a number of resources. For example, function 202 (FIG. 2) of process 110 (FIG. 1) uses dynamically allocated memory 114 and computer process 116. Function 202 (FIG. 2) also uses (i) globally defined memory 204, which is also accessible by functions 202A and 202B and other functions, (ii) local memory, (iii) parameters 208A-208C, and (iv) returned item 210. Function 202 is analyzed by modelling one or more of these resources.

Each resource and external has a state. Execution of each computer instruction of a function is emulated, modelling any changes in the state of any externals or resources of the function which would result from actual execution of the computer instruction. If the state of an external or resource is changed, the state change is compared to a corresponding external behavior model or resource behavior model, respectively, to determine whether the change in state reflects appropriate use of the external or resource, respectively. If the state change is inappropriate, a state violation occurs and an error is reported. The error can be reported to the user (i) by displaying an error message on video monitor 118 (FIG. 1) or similar output device, (ii) by recording an error message in an error log file in memory 104 or in secondary storage 120, or (iii) by both displaying an error message and recording an error message.

BEHAVIOR MODELS

A function model represents the abstraction of a function in terms of operations applied by the function to the externals of the function and any new resources the function allocates.

As described above, a resource has a state. The valid states and valid transitions between states of a resource are represented by a resource behavior model. The modelling of the behavior of a resource can be substantially simpler than the actual behavior of the resource. For example, the state of a resource is modelled according to a resource behavior model represented by state diagram 300 (FIG. 3A). According to state diagram 300, a resource can have any of the following states.

Table A

U=Unallocated

A=Allocated

Q=Questionably allocated

X=Invalid ("NULL")

E=Error or unknown state

States U and X are similar but distinct: an item associated with an unallocated resource has an indeterminate value, and an item associated with an invalid resource has a known, invalid value. A resource behavior model can be as complex as the actual behavior of the resource whose behavior is modelled. However, even substantially simplified resource behavior models such as that represented in state diagram 300 are effective in detecting a substantial majority of all possible errors in the use of such a resource.

Resources are initially in state U since a resource is initially unallocated. Emulated execution of each computer instruction, actual execution of which causes a change in the state of a resource, applies an operation to the resource. By application of an operation to a resource, the state of the resource changes according to state diagram 300. The following are the operations which can be applied to a resource.

Table B

a=definitely allocates

m=maybe allocates

k=kills, i.e., frees or deallocates

c=uses in a calculation

p=uses in a predicate

i=uses in an indirection

x=mark invalid

Thus, according to state diagram 300, if an unallocated resource, i.e., a resource in state U, is definitely allocated by an instruction in a function, thereby applying operation a, the resource is then in state A, i.e., allocated. However, if an unallocated resource, i.e., in state U, is used in a calculation, thereby applying operation c, the resource is then in state E. State E indicates that a state violation has occurred as a result of a programming error. State E is optional in that state E does not describe the prescribed behavior of a resource, but is used in the disclosed embodiment as a convenient way to represent a state violation. In an alternative embodiment, state E is omitted and a violation is detected in the above example by noting that, when a resource is in state U, operation c is undefined.

State diagram 300 (FIG. 3A) is summarized in Table C below.

                  TABLE C
    ____________________________________________________
    New States Resulting from Operations
                 operation:
    old state:   a     m     k     c     p     i     x
    ____________________________________________________
    U:           A     Q     U¹    E²    E²    E⁶    E²
    A:           A     Q     U     A     A     A     X
    Q:           A     Q     U³    A⁴    A     A⁴    X
    X:           A     Q     U⁵    E⁶    X     E⁶    X
    E:           A     Q     U     E     E     E     E
    ____________________________________________________


Superscript numerals corresponding to operation identifiers in state diagram 300 and to new state identifiers in Table C indicate specific errors. The errors are listed in Table D.

Table D

1--Freeing an unallocated or freed resource.

2--Using an unallocated or freed resource.

3--Freeing potentially-allocated data without checking.

4--Using potentially-allocated data without checking.

5--Freeing NULL data.

6--Using (e.g., dereferencing) NULL data.

In the example given above, applying operation c to a resource in state U places the resource in state E as indicated in state diagram 300 by an arrow from state U to state E identified by "c²". Thus, the error in this example is error number 2 in Table D, namely, the use of an unallocated resource.
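By way of illustration only, Tables A through D can be sketched in the C computer language as a lookup table indexed by old state and operation. The sketch transcribes Table C as printed; the identifiers state, op, transition, apply_operation() and error_text() are hypothetical and are not part of the disclosed embodiment.

```c
#include <assert.h>
#include <stddef.h>

/* Resource states of Table A. */
enum state { ST_U, ST_A, ST_Q, ST_X, ST_E, N_STATES };

/* Operations of Table B, in the column order of Table C. */
enum op { OP_A, OP_M, OP_K, OP_C, OP_P, OP_I, OP_X, N_OPS };

/* One cell of Table C: the new state and the Table D error number (0 = none). */
struct transition {
    enum state next;
    int error;
};

static const struct transition table_c[N_STATES][N_OPS] = {
    /*            a          m          k          c          p          i          x     */
    [ST_U] = { {ST_A, 0}, {ST_Q, 0}, {ST_U, 1}, {ST_E, 2}, {ST_E, 2}, {ST_E, 6}, {ST_E, 2} },
    [ST_A] = { {ST_A, 0}, {ST_Q, 0}, {ST_U, 0}, {ST_A, 0}, {ST_A, 0}, {ST_A, 0}, {ST_X, 0} },
    [ST_Q] = { {ST_A, 0}, {ST_Q, 0}, {ST_U, 3}, {ST_A, 4}, {ST_A, 0}, {ST_A, 4}, {ST_X, 0} },
    [ST_X] = { {ST_A, 0}, {ST_Q, 0}, {ST_U, 5}, {ST_E, 6}, {ST_X, 0}, {ST_E, 6}, {ST_X, 0} },
    [ST_E] = { {ST_A, 0}, {ST_Q, 0}, {ST_U, 0}, {ST_E, 0}, {ST_E, 0}, {ST_E, 0}, {ST_E, 0} },
};

/* Error messages of Table D, indexed by error number; index 0 means no error. */
static const char *const table_d[] = {
    NULL,
    "Freeing an unallocated or freed resource.",
    "Using an unallocated or freed resource.",
    "Freeing potentially-allocated data without checking.",
    "Using potentially-allocated data without checking.",
    "Freeing NULL data.",
    "Using (e.g., dereferencing) NULL data.",
};

/* Looks up the new state and error number for applying operation o in state s. */
struct transition apply_operation(enum state s, enum op o)
{
    assert((int)s >= 0 && s < N_STATES);
    assert((int)o >= 0 && o < N_OPS);
    return table_c[s][o];
}

/* Returns the Table D message for an error number, or NULL for no error. */
const char *error_text(int error)
{
    assert(error >= 0 && error <= 6);
    return table_d[error];
}
```

With these definitions, apply_operation(ST_U, OP_C) yields state E together with error number 2, matching the example above. Keeping the error number alongside the new state lets a checker report the specific violation even where, as in row Q, the resulting state is itself a valid state.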

Each function model specifies which operations are applied to each external of a corresponding function. For example, function fopen(), which is defined for the C computer language and which is described in the C Standard at Section 7.9.5.3, defines two parameters, the first of which is accepted as input and which specifies a file to be opened, and defines a returned item which is a file pointer corresponding to the opened file. File pointers, i.e., pointers to items of the type "FILE", are well-known and are described in the C Standard at Section 7.9.1. The file pointer is an external of function fopen() and the file specified by the parameter is the resource associated with the external. The function model for function fopen() specifies that a new resource whose initial state is state Q is created. The initial state of the resource is state Q rather than state A because function fopen() does not guarantee that the file is opened successfully.

Function fclose(), which is defined for the C computer language and which is described in the C Standard at Section 7.9.5.1, defines a parameter which is a file pointer. Execution of function fclose() closes the file to whose file descriptor the parameter points. The function model for function fclose() specifies that an operation k is applied to the parameter to reflect closing, and thus deallocating, the associated file. Similarly, function models for functions of the C computer language defining read and write operations to the file specify application of an operation c to a resource representing the file to reflect use of the file.
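By way of illustration only, the function models described above for functions fopen() and fclose() might be recorded as follows in the C computer language. The layout is a hypothetical sketch and is not the function model structure of the disclosed embodiment; the identifiers external_effect, function_model and find_model() are assumptions, as is the choice of function fgets() to represent a read operation which applies operation c to its third parameter.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record of the effect of a function on one of its externals. */
struct external_effect {
    int position;            /* 0 = returned item, n = n-th parameter */
    const char *ops;         /* Table B operations applied, in order; "" if none */
    char new_resource_state; /* Table A state of a newly created resource, or 0 */
};

/* Hypothetical function model; one effect per model keeps the sketch short. */
struct function_model {
    const char *name;
    struct external_effect effect;
};

static const struct function_model models[] = {
    { "fopen",  { 0, "",  'Q' } }, /* returned item: new resource, state Q */
    { "fclose", { 1, "k", 0   } }, /* parameter 1: operation k (kills) */
    { "fgets",  { 3, "c", 0   } }, /* parameter 3: operation c (assumed) */
};

/* Returns the model of the named function, or NULL if none is recorded. */
const struct function_model *find_model(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < sizeof models / sizeof models[0]; ++i) {
        if (strcmp(models[i].name, name) == 0)
            return &models[i];
    }
    return NULL;
}
```

An analysis of a calling function would consult find_model("fopen") on reaching a call to function fopen() and associate a new resource in state Q with the returned item, rather than analyzing the body of function fopen().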

If an item corresponding to a resource, e.g., the file pointer which is the returned item of function fopen(), is used as a predicate in a decision instruction, operation p is applied to the resource to thereby change the state of the resource according to state diagram 300. An item is used in a predicate if the item appears as an operand in a relational expression (e.g., an operation involving any of operators >, <, <=, >=, and !=) or a boolean expression (e.g., an operation involving any of operators &&, ||, and !) or if the item is used as the control expression in a "switch" statement. The "switch" statement is defined for the C computer language and controls flow of a function according to the value of the control expression. The "switch" statement is described more completely in the C Standard at Section 6.6.4.2.

If an item corresponding to a resource is used in a calculation, operation c is applied to the resource to thereby change the state of the resource according to state diagram 300. An item is used in a calculation (i) if the item appears as an operand to a mathematical operation (e.g., +, /, *, or -), (ii) if the resource appears as a dereference of a pointer or as an access into an array, or (iii) if the resource appears as an array index.

Pointers and arrays are well-known and are described in the C Standard. For completeness, pointers and arrays are briefly described herein. In the context of the C computer language, a pointer is an item whose value is the address in memory of another item. Thus, a pointer "points" to the other item. Dereferencing a pointer is retrieving the item to which the pointer points.

Data structures, which are used to implement the disclosed embodiment of the present invention and which are described below in greater detail, are described as including pointers to other data structures. It is appreciated that mechanisms other than pointers are known for uniquely identifying a data structure and that these mechanisms can be substituted for pointers without deviating from the principles of the present invention.

An array is a collection of one or more items of similar structure. The items of an array are called elements and are numbered sequentially. An access to an array is an access to an element of the array by reference to the number of the element, i.e., the index of the element.

Operation x is applied to a resource corresponding to an item which is assumed to be NULL. NULL is generally an invalid value and is assigned to an item to indicate that the item has no valid value. For example, a pointer whose value is NULL points to no item. In the context of the C computer language, NULL is also a boolean value of "false". An item is assumed to be NULL, i.e., to have a value of NULL, if the item is compared to NULL and the result of the comparison is assumed to be true. As described more completely below, analysis of a function requires that assumptions be made regarding the particular behavior of the function when executed. For example, function fopen() either successfully opens a file or fails to do so. If the returned item, i.e., the file pointer, is compared to NULL and the result is assumed to be true, i.e., if function fopen() is assumed to have failed, operation x is applied to the resource representing the file as described more completely below.

ILLUSTRATIVE EXAMPLES OF THE BASIC PRINCIPLES OF THE PRESENT INVENTION

The utility of the modelling of resources is described by way of example. The following source code excerpt (1) includes a programming error which is detected by the disclosed embodiment of the present invention. Source code excerpt (1) comports with the known C computer language and defines a function example_1(). Line numbers, which are not part of the C computer language, are added for clarity in the discussion below.

    ______________________________________
    1       #include <stdio.h>        (1)
    3       #define MAX_STR_LEN 100
    4       #define FALSE 0
    5       #define TRUE 1
    6
    7       int example_1(input_file_name) /* begin function */
    8       char *input_file_name; /* parameter to function */
    9       {
    10      char *str; /* Declaration of local variable "str" */
    11      FILE *fptr; /* Declaration of local variable "fptr" */
    12
    13      /* try to open a file */
    14      fptr = fopen(input_file_name, "r");
    15      if (fptr == NULL)
    16       {
    17       /* could not open the file */
    18       fprintf(stderr, "Could not open file %s\n",
    19        input_file_name);
    20       return FALSE; /* an error */
    21       }
    22      /* allocate some memory for a string buffer */
    23      str = (char *)malloc(MAX_STR_LEN);
    24      /* get some input from the file */
    25      fgets(str, MAX_STR_LEN - 1, fptr);
    26      /* print out the information */
    27      printf(str);
    28      /* clean up */
    29      free(str);
    30      fclose(fptr);
    31      return TRUE; /* no error */
    32      }
    ______________________________________


As function example_1() is analyzed, the state of each item, including each external, is tracked. Variable "str" is locally-defined, i.e., is defined only in the context of function example_1(). Variable "str" is a pointer to data whose type is "char" as defined in line 10. However, variable "str" is initially uninitialized and points to no specific data. Therefore, variable "str" is not associated with a resource.

Execution of function malloc(), which is defined for the C computer language and which is described in the C Standard at Section 7.10.3.3, accepts a request for allocated memory, e.g., memory 104 (FIG. 1), and either allocates the memory or fails to do so. Function malloc() returns, as the returned item, a pointer to the allocated memory if the memory is successfully allocated or a NULL pointer otherwise. Therefore, function malloc() creates a new resource whose initial state is state Q and associates the new resource with the returned item of function malloc(). After variable "str" is assigned the value of the returned item of function malloc() at line 23, variable "str" points to newly allocated memory if such memory is allocated or is a NULL pointer otherwise.

At line 25 of source code excerpt (1), variable "str" is used as a parameter in function fgets(), which is defined for the C computer language and which is described in the C Standard at Section 7.9.7.2. Execution of function fgets() dereferences the first parameter, which is variable "str" in the context of line 25 of source code excerpt (1). Therefore, operation i is applied to the resource associated with variable "str". As shown in state diagram 300 (FIG. 3A) and Tables C and D, application of operation i to a resource in state Q places the resource in state A, producing an error message indicating that potentially allocated data is used without checking.

At line 29 of source code excerpt (1), variable "str" is passed as a parameter to function free(), which frees, i.e., deallocates, the memory to which variable "str" points. Therefore, operation k is applied to the resource associated with variable "str". As shown in state diagram 300 and Tables C and D, application of operation k to a resource in state A places the resource in state U. Since deallocation of an allocated resource is proper, no error is reported.

Text (2) below illustrates the error messages produced by the disclosed embodiment of the present invention in analyzing function example_1() of source code excerpt (1). ##EQU1##

In text (2), "example_1.c" refers to a file containing source code excerpt (1) above, and thus defining function example_1(). As text (2) indicates, function example_1() fails to account for the contingency that there may be insufficient memory to allocate the amount of memory requested in calling, i.e., invoking execution of, function malloc() at line 23 of source code excerpt (1). If function malloc() fails to allocate the requested memory during execution of function example_1(), the computer process in which function example_1() is executed aborts abruptly without giving a user any indication of the reason for the unexpected termination of processing. However, detecting and reporting the failure to account for such a contingency using, for example, text (2) above provides the developer of function example_1() with the necessary information to correct the defect in function example_1() and to properly provide for such a contingency.
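One way in which a developer might correct the reported defect is to compare the returned item of function malloc() to NULL before the pointer is dereferenced. The following is a minimal sketch only and is not part of the disclosed embodiment. The name example_1_checked, the ANSI-style parameter declaration, the particular recovery actions, and the use of function fputs() in place of printf(str) are all assumptions made for illustration.

```c
#include <stdio.h>
#include <stdlib.h>

#define MAX_STR_LEN 100
#define FALSE 0
#define TRUE 1

/* Hypothetical corrected variant of example_1(); not from the source. */
int example_1_checked(const char *input_file_name)
{
    char *str;
    FILE *fptr;

    /* try to open a file */
    fptr = fopen(input_file_name, "r");
    if (fptr == NULL) {
        fprintf(stderr, "Could not open file %s\n", input_file_name);
        return FALSE;
    }
    /* allocate the buffer and check the result before any use */
    str = malloc(MAX_STR_LEN);
    if (str == NULL) {
        fprintf(stderr, "Could not allocate %d bytes\n", MAX_STR_LEN);
        fclose(fptr);               /* do not leave the file open */
        return FALSE;
    }
    /* print the line only if one was actually read */
    if (fgets(str, MAX_STR_LEN - 1, fptr) != NULL)
        fputs(str, stdout);
    /* clean up */
    free(str);
    fclose(fptr);
    return TRUE;
}
```

In this sketch the comparison of variable "str" to NULL corresponds to applying operation p to the resource before operation i is applied by the call to function fgets(), so the condition reported in text (2) does not arise.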

The utility of the present invention is further illustrated by considering the tracking of the state of file pointer "fptr" in function example_1() of source code excerpt (1). File pointer "fptr" is a locally-defined variable of function example_1(). File pointer "fptr" is a pointer to data of the type "FILE". Initially, file pointer "fptr" is uninitialized and is not associated with any resource.

The returned item of function fopen() is assigned to file pointer "fptr" at line 14. As described above, function fopen() creates a new resource, whose initial state is state Q, and associates the new resource with the returned item of function fopen(). The "if" statement at line 15 determines whether the file to which file pointer "fptr" points is successfully opened by comparing file pointer "fptr" to NULL. If file pointer "fptr" is NULL, the file is not successfully opened and function example_1() terminates after reporting to a user the failure to open the file. Conversely, if file pointer "fptr" is not NULL, the file to which file pointer "fptr" points is known to be successfully opened and function example_1() continues at line 22. The comparison of file pointer "fptr" in line 15 applies operation p to the resource associated with file pointer "fptr". Thus, the state of the resource associated with file pointer "fptr" is changed from state Q to state A. As a result, any subsequent uses of file pointer "fptr", either in a calculation (applying operation c) or in a predicate (applying operation p), do not produce any error messages, as shown in state diagram 300 and Table C. Therefore, no errors with respect to the treatment of file pointer "fptr" are detected.
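The state transitions traced above for variable "str" and file pointer "fptr" can be sketched in C as follows. The sketch encodes only the three transitions stated in this discussion (operation i on state Q, operation p on state Q, and operation k on state A). The complete behavior is defined by state diagram 300 and Tables C and D, which the sketch does not reproduce. The names res_state, res_op, and res_apply are assumptions made for illustration.

```c
/* Subset of resource states and operations used in the walkthrough. */
typedef enum { RES_U, RES_A, RES_Q } res_state;
typedef enum { RES_OP_I, RES_OP_P, RES_OP_K } res_op;

/* Apply an operation to a resource state.  *error is set to 1 when a
 * message would be reported.  Combinations not described in the
 * walkthrough leave the state unchanged, which is a simplification of
 * this sketch and not a statement about Tables C and D. */
res_state res_apply(res_state state, res_op op, int *error)
{
    *error = 0;
    if (state == RES_Q && op == RES_OP_I) {
        *error = 1;         /* possibly allocated data used unchecked */
        return RES_A;
    }
    if (state == RES_Q && op == RES_OP_P)
        return RES_A;       /* compared to NULL and assumed not NULL */
    if (state == RES_A && op == RES_OP_K)
        return RES_U;       /* proper deallocation, no error */
    return state;
}
```

Applying operation i and then operation k to a resource in state Q reproduces the treatment of variable "str" at lines 25 and 29, while applying operation p first reproduces the treatment of file pointer "fptr" at line 15.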

As described above, functions fopen() and malloc(), when executed, perform specific processing on resources of parameters and returned items. Functions such as functions fopen() and malloc() are included in library functions 112 (FIG. 1) which are accessed by computer process 110. Calls to such functions are included in function 202 (FIG. 2). As used herein, a "call" to a function is a statement which, when executed, causes a processor, such as CPU 102 (FIG. 1), to (i) supply zero or more items as parameters to the function, (ii) execute the function, and (iii) produce a returned item representing the value to which the function evaluates if a returned item is defined by the function. A first function, which includes a call to a second function, is called a "calling function." The second function is called a "called function."

To properly analyze resources of function 202 (FIG. 2) affected by execution of functions called by statements of function 202, function models describing the behavior of such called functions are maintained. In one embodiment, such function models are created from well-known textual descriptions of the behavior of such functions, e.g., from the C Standard, and those function models are stored in memory 104 of computer 100. Those function models are then retrieved from memory 104 prior to analyzing a computer program as described more completely below.

The following are illustrative examples of function models of some of the functions called by function example_1() of source code excerpt (1) above. The called functions are from the C standard library: function fgets() is declared in the "stdio" (input/output) header file, which is described in the C Standard in Sections 7.9 et seq., and functions malloc() and free() are declared in the "stdlib" (general utilities) header file, which is described in the C Standard in Sections 7.10 et seq.

    ______________________________________
    (malloc         /* model for function malloc() */
                                      (3)
     (retval (new Q "memory"))
                    /* returned item:
                    creates a new, possibly
                    allocated resource */
     ((param 0) (op c))
                    /* parameter 0: used in
                    a computation */ )
    ______________________________________


A function model structure, which represents in memory 104 (FIG. 1) a function model according to the disclosed embodiment of the present invention, is described more completely below. Function model (3) defines the effect of execution of function malloc() on the respective states of the externals of function malloc(). According to function model (3), a new resource is created, initialized to state Q, and associated with the returned item of function malloc(). Function model (3) also specifies that operation c is applied to parameter 0, i.e., the first parameter, of function malloc().

    ______________________________________
    (free           /* model for function free() */
                                      (4)
     ((param 0) (op k)))
                    /* parameter 0: free (kill) */
    ______________________________________


Function model (4) represents the effect of execution of function free() on the externals of function free() and specifies that operation k is applied to parameter 0, i.e., the first parameter in the argument list.

    ______________________________________
    (fgets          /* model for function fgets() */
                                      (5)
     ((param 0) (op i))
                    /* parameter 0 (string
                    buffer): apply operation i,
                    indirection */
     ((param 1) (op c))
                    /* parameter 1 (buffer
                    length): use in computation
                    (op c) */
     ((param 2) (op i)))
                    /* parameter 2 (the file):
                    indirection (op i -- file must
                    be open) */
    ______________________________________


Function model (5) specifies that (i) operation i is applied to parameter 0, i.e., the first parameter, (ii) operation c is applied to parameter 1, i.e., the second parameter, and (iii) operation i is applied to parameter 2, i.e., the third parameter, by calling function fgets().
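By way of illustration only, function models (3), (4), and (5) might be held in memory as shown in the following sketch. The function model structure of the disclosed embodiment is described separately below and may differ. All type names, field names, and the lookup function find_model are assumptions made for this sketch.

```c
#include <string.h>

typedef enum { EXT_RETVAL, EXT_PARAM } ext_kind;

/* Effect of a called function on one of its externals (hypothetical). */
typedef struct {
    ext_kind kind;       /* returned item or parameter */
    int      index;      /* parameter number; unused for EXT_RETVAL */
    char     op;         /* operation applied ('c', 'i', 'k'); 0 if none */
    char     new_state;  /* initial state of a new resource; 0 if none */
} ext_effect;

typedef struct {
    const char *name;
    int         n_effects;
    ext_effect  effects[4];
} function_model;

/* Function models (3), (4), and (5) transcribed into the sketch. */
static const function_model models[] = {
    { "malloc", 2, { { EXT_RETVAL, 0, 0,   'Q' },
                     { EXT_PARAM,  0, 'c', 0   } } },
    { "free",   1, { { EXT_PARAM,  0, 'k', 0   } } },
    { "fgets",  3, { { EXT_PARAM,  0, 'i', 0   },
                     { EXT_PARAM,  1, 'c', 0   },
                     { EXT_PARAM,  2, 'i', 0   } } },
};

/* Return the model for the named function, or NULL if none is stored. */
const function_model *find_model(const char *name)
{
    size_t i;

    for (i = 0; i < sizeof models / sizeof models[0]; i++)
        if (strcmp(models[i].name, name) == 0)
            return &models[i];
    return NULL;
}
```

A checker analyzing a call to function fgets() would, under this sketch, retrieve the model by name and apply the listed operation to the resource associated with each parameter in turn.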

DETECTION OF RESOURCE LEAKS

By modelling resources and tracking associations of resources with externals of a function, the disclosed error detection mechanism provides a convenient mechanism for detecting resource leaks. A resource is "leaked" by a function when execution of the function terminates while the resource is in an allocated state and the resource cannot be accessed through any external of the function. When a resource is leaked, the resource cannot be used since no pointer to the resource remains after execution of the leaking function terminates. If the resource is reusable, such as dynamically allocated memory 114 (FIG. 1), failure to free the resource prior to termination of execution of the function prevents other functions from reusing the resource. A process fragment which repeatedly leaks dynamically allocated memory can ultimately cause exhaustion of all memory which is available to the computer process of which the process fragment is a part.

As an example of detection of a resource leak, function example_2() of source code excerpt (6) is considered.

    ______________________________________
    0      #include <stdio.h>         (6)
    1      #include <string.h>
    3      #define MAX_STR_LEN 100
    4      #define FALSE 0
    5      #define TRUE 1
    6
    7      char *example_2(input_file_name) /* begin function */
    8      char *input_file_name; /* parameter to the function */
    9      {
    10     char *str; /* declare local variable "str" */
    11     FILE *fptr; /* declare local variable "fptr" */
    12
    13     /* allocate some memory for a string buffer */
    14     str = (char *)malloc(MAX_STR_LEN);
    15     /* check to ensure that the allocation succeeded */
    16     if (str == NULL)
    17      return NULL;
    18     /* try to open a file */
    19     fptr = fopen(input_file_name, "r");
    20     if (fptr == NULL)
    21      {
    22      /* could not open the file */
    23      fprintf(stderr, "Could not open file %s\n",
    24       input_file_name);
    25      return NULL; /* error condition */
    26      }
    27     fgets(str, MAX_STR_LEN - 1, fptr);
    28     fclose(fptr); /* close file */
    29     return str; /* no error */
    30     }
    ______________________________________


Variable "str" is local to function example_2() and is therefore not accessible to any function other than function example_2(). Since the memory to which variable "str" points is not freed prior to instruction "return" of line 25 of source code excerpt (6), that memory is not useable and cannot be deallocated or reallocated until computer process 110, which function example_2() partly defines, terminates. That resource therefore "leaks" from computer process 110.
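A developer might remove the leak by freeing the memory before the "return" instruction on the path in which the file cannot be opened. The following is a minimal sketch under that assumption and is not part of the disclosed embodiment. The name example_2_fixed, the ANSI-style parameter declaration, and the handling of a failed call to function fgets() are introduced here for illustration.

```c
#include <stdio.h>
#include <stdlib.h>

#define MAX_STR_LEN 100

/* Hypothetical corrected variant of example_2(); not from the source. */
char *example_2_fixed(const char *input_file_name)
{
    char *str;
    FILE *fptr;

    /* allocate some memory for a string buffer */
    str = malloc(MAX_STR_LEN);
    if (str == NULL)
        return NULL;
    /* try to open a file */
    fptr = fopen(input_file_name, "r");
    if (fptr == NULL) {
        fprintf(stderr, "Could not open file %s\n", input_file_name);
        free(str);          /* release the buffer on the error path */
        return NULL;
    }
    /* return an empty string if nothing could be read */
    if (fgets(str, MAX_STR_LEN - 1, fptr) == NULL)
        str[0] = '\0';
    fclose(fptr);
    return str;             /* the caller now owns the buffer */
}
```

With this change, every path through the function either frees the memory or makes it reachable through the returned item, which is an external of the function.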

Since an external of a function is an item which exists past the termination of execution of the function, any allocated resource reachable through an external is not leaked. A resource which is not associated with a particular external can, in some circumstances, be reachable through the external. For example, a resource which is associated with a particular element of an array of items is reachable through an external which is a different element of the array of items. This is true since the location in memory of an element of an array can be calculated from the location of any other element of the array according to the C computer language.

Leaks are checked at the conclusion of a traversal of a function. The detection of leaks is described more completely below and is summarized briefly here. All resources reachable through any external are marked. Any resource which is not marked and which is allocated is reported as leaked. Since variable "str", at line 25, is not returned, variable "str" is not an external. The memory pointed to by variable "str" is therefore allocated and not marked at the conclusion of the traversal of function example_2(). The memory pointed to by variable "str" is therefore leaked.
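The marking procedure summarized above can be sketched in C as follows. The sketch assumes, for simplicity, that each external reaches at most one resource and that resources are identified by an index into an array. The type resource and the function find_leaks are hypothetical names which do not appear in the disclosed embodiment.

```c
typedef struct {
    int allocated;  /* nonzero if allocated at the end of the traversal */
    int marked;     /* nonzero if reachable through some external */
} resource;

/* externals[i] holds the index of the resource reachable through
 * external i, or -1 if external i has no associated resource.
 * The indices of leaked resources are written to leaked[], which must
 * have room for n_res entries; the number of leaks is returned. */
int find_leaks(resource *res, int n_res,
               const int *externals, int n_ext, int *leaked)
{
    int i, n_leaked = 0;

    /* clear all marks */
    for (i = 0; i < n_res; i++)
        res[i].marked = 0;
    /* mark every resource reachable through an external */
    for (i = 0; i < n_ext; i++)
        if (externals[i] >= 0 && externals[i] < n_res)
            res[externals[i]].marked = 1;
    /* report every allocated resource that is not marked */
    for (i = 0; i < n_res; i++)
        if (res[i].allocated && !res[i].marked)
            leaked[n_leaked++] = i;
    return n_leaked;
}
```

For the traversal which ends at line 25, the memory resource is allocated but neither the parameter nor the returned item reaches it, so the sketch reports one leak. For the traversal which ends at line 29, the returned item reaches the memory and no leak is reported.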

Analysis of function example.sub.-- 2() produces the following error message. ##EQU2##

Static checkers of the prior art cannot detect resource leaks. Run-time checkers of the prior art often do not consider all potential events which might cause a function to leak a resource and generally cannot analyze a single function outside of the context of a larger computer program to detect resource leaks in that single function. In contrast, the disclosed embodiment of the present invention provides for efficient detection of resource leaks by analysis of a single function of a larger computer program. As described more completely below, the disclosed error detection mechanism considers all possible events which might cause a function to leak a resource. The present invention therefore represents a significant improvement over the prior art.

COMPOSITE STATES OF EXTERNALS

As described more completely below, a function is analyzed by following the flow of control of the function, emulating execution of individual statements of the function, and tracking the state of externals and resources. The flow of control through a function is the particular sequence of computer instructions of the function executed during a particular execution of the function. When control transfers from a first computer instruction to a second computer instruction, the second computer instruction is executed following execution of the first computer instruction. The flow of control through a function is sometimes called herein the control flow path through the function. Flow of control through a function is often dependent upon particular events which occur during execution of the process fragment, defined by the function, in a computer process.

In analyzing a function, it is preferred to consider all possible control flow paths through the function. It is therefore preferred to consider all events which can influence the control flow path through the function. Static checkers of the prior art often do not consider control flow paths at all. Runtime checkers only consider all control flow paths through a particular function to the extent a user can coerce, through manipulation of the events which influence the control flow path of the function, a computer process to follow each possible control flow path during execution of the computer process. In contrast, the disclosed error detection mechanism analyzes each possible control flow path through a function automatically without user intervention. Furthermore, the disclosed error detection mechanism can analyze a function outside of the context of a computer program or computer process which includes the function. Thus, individual functions can be more completely checked for errors prior to inclusion in a larger function or computer program or process.

As an example, function example_2() of source code excerpt (6) is considered. The precise control flow path through function example_2() is not known until function example_2() is executed in a computer process. For example, control flows from the "if" statement at line 16 to a call to function fopen() at line 19 if function malloc(), called at line 14, successfully allocates memory as requested. In other words, if function malloc() successfully allocates memory as requested when called at line 14, the call to function fopen() at line 19 follows execution of the "if" statement at line 16. Conversely, control flows from the "if" statement at line 16 to the "return" statement at line 17 if the allocation of memory fails. Whether memory is successfully allocated by function malloc() as called at line 14 is typically not known until function example_2() is executed in a computer process.

In analyzing function example_2(), it is preferred that each possible control flow path through function example_2() is considered. Multiple control flow paths through a function are considered by multiple traversals of the function under varying assumptions. For example, function example_2() is traversed once under the assumption that function malloc(), called at line 14, successfully allocates the requested memory and once under the assumption that function malloc() fails to allocate the requested memory.
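The relationship between assumptions and control flow paths can be illustrated by the following C sketch, which enumerates every combination of two assumptions for source code excerpt (6): whether function malloc() succeeds at line 14 and whether function fopen() succeeds at line 19. Exhaustive enumeration is a simplification chosen for this sketch, and the function names are hypothetical.

```c
/* Line number of the "return" statement reached in excerpt (6) under
 * the given assumptions (nonzero means the call is assumed to succeed). */
int example_2_return_line(int malloc_ok, int fopen_ok)
{
    if (!malloc_ok)
        return 17;      /* "str" is assumed NULL at line 16 */
    if (!fopen_ok)
        return 25;      /* "fptr" is assumed NULL at line 20 */
    return 29;          /* both calls are assumed to succeed */
}

/* Traverse under all four combinations of the two assumptions, record
 * each distinct return line in lines[], and return how many were found. */
int enumerate_paths(int lines[4])
{
    int mask, i, n = 0;

    for (mask = 0; mask < 4; mask++) {
        int line = example_2_return_line(mask & 1, mask & 2);

        /* skip a path that has already been recorded */
        for (i = 0; i < n && lines[i] != line; i++)
            ;
        if (i == n)
            lines[n++] = line;
    }
    return n;
}
```

Four combinations of assumptions yield only three distinct paths, since the assumption regarding function fopen() has no effect when function malloc() is assumed to fail.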

In one embodiment of the present invention which is described below in greater detail, a function is traversed repeatedly, and, during each traversal, assumptions are made by random chance. Each traversal of function example_2() tracks the state of the externals of function example_2(). Each external has a composite state which reflects the states of the external resulting from multiple traversals of function example_2().

Externals have composite RS, CP, and DK states. These composite states are used for the dual purposes of (i) detecting inconsistent uses of an external when varying control flow paths through the function are considered and (ii) building a function model describing the effect of execution of the function on the externals of the function. The function model can then be used to analyze other functions which call the modelled function.

Within the context of a particular function, each external has a CP state, a DK state, and a RS state. The CP state of an external is used to determine whether the external is checked before being used. The term "CP" is derived from the operations of primary concern: operation c, which represents use of the external, before operation p, which represents checking of the external. The DK state of an external is used to determine whether the function allocates and/or frees the external. The term "DK" is derived from the purpose of the DK state: to determine whether a resource is defined ("D") before being killed ("K"), i.e., freed. The RS state of an external is the state of the resource associated with the external if a resource is so associated. The term "RS" is derived from resource ("R") state ("S").

Each external of a function also has a composite CP state, a composite DK state, and a composite RS state reflecting multiple CP, DK, and RS states, respectively, resulting from multiple traversals of the function. After each iterative traversal of a function, a new composite RS state of an external is composed, as described more completely below, from the previous composite RS state of the external and the RS state of the resource associated with the external resulting from the most recent traversal of the function. In a similar fashion, as described more completely below, new composite CP and DK states are composed from previous composite CP and DK states, respectively, and CP and DK states, respectively, resulting from the most recent traversal of the function.

State diagram 350 (FIG. 3B) represents states and state transitions for a composite RS state. Arrows are used in state diagram 350 to represent composite RS state transitions from a previous composite RS state according to an RS state resulting from a traversal of the function. State diagram 350 is summarized in Table E.

                  TABLE E
    ______________________________________
    New Composite RS States
    previous
    composite   next RS state:
    RS state:   U       A       Q       X       E
    ______________________________________
    U:          U       Q       Q       Q       E
    A:          Q       A       Q       Q       E
    Q:          Q       Q       Q       Q       E
    X:          Q       Q       Q       X       E
    E:          E       E       E       E       E
    ______________________________________
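Table E lends itself to a direct table lookup. The following C sketch transcribes Table E into a two-dimensional array and composes the RS states resulting from a series of traversals. The names rs_state, compose_rs, and compose_rs_all are assumptions, as is the choice to begin the composition with the RS state of the first traversal.

```c
typedef enum { RS_U, RS_A, RS_Q, RS_X, RS_E, RS_COUNT } rs_state;

/* Table E: rows are the previous composite RS state,
 * columns are the RS state resulting from the latest traversal. */
static const rs_state composite_rs[RS_COUNT][RS_COUNT] = {
    /*          U     A     Q     X     E   */
    /* U */ { RS_U, RS_Q, RS_Q, RS_Q, RS_E },
    /* A */ { RS_Q, RS_A, RS_Q, RS_Q, RS_E },
    /* Q */ { RS_Q, RS_Q, RS_Q, RS_Q, RS_E },
    /* X */ { RS_Q, RS_Q, RS_Q, RS_X, RS_E },
    /* E */ { RS_E, RS_E, RS_E, RS_E, RS_E },
};

rs_state compose_rs(rs_state previous, rs_state next)
{
    return composite_rs[previous][next];
}

/* Compose the RS states of n traversals (n >= 1), in order. */
rs_state compose_rs_all(const rs_state *states, int n)
{
    rs_state composite = states[0];
    int i;

    for (i = 1; i < n; i++)
        composite = compose_rs(composite, states[i]);
    return composite;
}
```

As the table shows, a composite RS state remains U, A, or X only when every traversal agrees on that state, and any traversal ending in state E forces the composite to state E.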


State diagram 400 (FIG. 4A) represents states and state transitions for a CP state of an external. Arrows are used in state diagram 400 to represent CP state transitions resulting from application of operations. An external can have any of the following CP or composite CP states.

Table G

O=Used in neither a predicate nor a computation (initial state).

C=Used in computation before checking.

I=Used for indirection before checking.

P=Checked (used in predicate) before using.

N=Neither; assigned to before checking or using.

The operations which can be applied to an external are described above with respect to Table B. State diagram 400 is summarized in Table H below.

                  TABLE H
    ______________________________________
    New States Resulting from Operations
                operation:
    old state:  a      m      k      c      p      i      x
    ______________________________________
    O:          N      N      C      C      P      I      N
    C:          C      C      C      C      C      C      C
    I:          I      I      I      I      I      I      I
    P:          P      P      P      P      P      P      P
    N:          N      N      N      N      N      N      N
    ______________________________________
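As Table H shows, only the initial state O changes in response to an operation, so the CP state of an external records the first operation applied to the external. A minimal C sketch of Table H follows. The names cp_state and cp_apply, the use of a single character to identify an operation, and the treatment of an unrecognized character are assumptions.

```c
typedef enum { CP_O, CP_C, CP_I, CP_P, CP_N } cp_state;

/* Table H: new CP state after applying operation op to state old. */
cp_state cp_apply(cp_state old, char op)
{
    /* every state other than the initial state O is absorbing */
    if (old != CP_O)
        return old;
    switch (op) {
    case 'a': case 'm': case 'x':
        return CP_N;    /* assigned to before checking or using */
    case 'k': case 'c':
        return CP_C;    /* used in computation before checking */
    case 'p':
        return CP_P;    /* checked before using */
    case 'i':
        return CP_I;    /* used for indirection before checking */
    default:
        return old;     /* unrecognized operation: sketch assumption */
    }
}
```

For example, an external which is first compared to NULL and later dereferenced ends a traversal in state P, whereas an external which is dereferenced first ends the traversal in state I.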


State diagram 450 (FIG. 4B) represents states and state transitions for a composite CP state of an external. Arrows are used in state diagram 450 to represent composite CP state transitions from a previous composite CP state according to a CP state resulting from a traversal of the function. State diagram 450 is summarized in Table I below.

                  TABLE I
    ______________________________________
    New Composite CP States
    previous
    composite   next CP state:
    CP state:   O       C       I     P     N
    ______________________________________
    O:          O       C       I     P     N
    C:          C       C       I     C     C
    I:          I       I       I     I     I
    P:          P       P       I     P     P
    N:          N       C       I     P     N
    ______________________________________
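Table I can be transcribed in the same manner. The following C sketch is a literal transcription of Table I; the names cp_state and compose_cp are assumptions.

```c
typedef enum { CP_O, CP_C, CP_I, CP_P, CP_N, CP_COUNT } cp_state;

/* Table I: rows are the previous composite CP state,
 * columns are the CP state resulting from the latest traversal. */
static const cp_state composite_cp[CP_COUNT][CP_COUNT] = {
    /*          O     C     I     P     N   */
    /* O */ { CP_O, CP_C, CP_I, CP_P, CP_N },
    /* C */ { CP_C, CP_C, CP_I, CP_C, CP_C },
    /* I */ { CP_I, CP_I, CP_I, CP_I, CP_I },
    /* P */ { CP_P, CP_P, CP_I, CP_P, CP_P },
    /* N */ { CP_N, CP_C, CP_I, CP_P, CP_N },
};

cp_state compose_cp(cp_state previous, cp_state next)
{
    return composite_cp[previous][next];
}
```

In this transcription, state I dominates every other state: once any traversal dereferences an external before checking the external, the composite CP state is state I regardless of the results of other traversals.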


State diagram 500 (FIG. 5A) represents states and state transitions for a DK state of an external. Arrows are used in state diagram 500 to represent DK state transitions resulting from application of operations. An external can have any of the following DK or composite DK states reflecting the effect of execution of the function on a resource associated with the external.

Table J

O=The function neither allocates nor kills the resource (initial state).

A=The function definitely allocates the resource.

Q=The function questionably allocates the resource.

K=The function kills, i.e., deallocates, the resource.

KA=The function kills, then definitely allocates, the resource.

KQ=The function kills, then questionably allocates, the resource.

E=Error (unknown state).

The operations which can be applied to an external are described above with respect to Table B. State diagram 500 is summarized in Table K below.

                  TABLE K
    ______________________________________
    New States Resulting from Operations
                operation:
    old state:  a      m      k      c      p      i      x
    ______________________________________
    O:          A      Q      K      O      O      O      O
    A:          A      A      O      A      A      A      A
    Q:          Q      Q      O      Q      Q      Q      Q
    K:          KA     KQ     K      K      K      K      K
    KA:         KA     KA     K      KA     KA     KA     KA
    KQ:         KA     KQ     K      KQ     KQ     KQ     KQ
    E:          E      E      E      E      E      E      E
    ______________________________________
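Table K can likewise be expressed as a lookup table indexed by DK state and operation. The following C sketch is a transcription of Table K; the type and function names are assumptions.

```c
typedef enum { DK_O, DK_A, DK_Q, DK_K, DK_KA, DK_KQ, DK_E, DK_COUNT } dk_state;
typedef enum { OP_A, OP_M, OP_K, OP_C, OP_P, OP_I, OP_X, OP_COUNT } dk_op;

/* Table K: rows are the old DK state, columns are the operation. */
static const dk_state dk_table[DK_COUNT][OP_COUNT] = {
    /*           a      m      k      c      p      i      x    */
    /* O  */ { DK_A,  DK_Q,  DK_K,  DK_O,  DK_O,  DK_O,  DK_O  },
    /* A  */ { DK_A,  DK_A,  DK_O,  DK_A,  DK_A,  DK_A,  DK_A  },
    /* Q  */ { DK_Q,  DK_Q,  DK_O,  DK_Q,  DK_Q,  DK_Q,  DK_Q  },
    /* K  */ { DK_KA, DK_KQ, DK_K,  DK_K,  DK_K,  DK_K,  DK_K  },
    /* KA */ { DK_KA, DK_KA, DK_K,  DK_KA, DK_KA, DK_KA, DK_KA },
    /* KQ */ { DK_KA, DK_KQ, DK_K,  DK_KQ, DK_KQ, DK_KQ, DK_KQ },
    /* E  */ { DK_E,  DK_E,  DK_E,  DK_E,  DK_E,  DK_E,  DK_E  },
};

dk_state dk_apply(dk_state old, dk_op op)
{
    return dk_table[old][op];
}
```

Under this transcription, a function which questionably allocates a resource and then kills the same resource leaves the DK state of the external at state O, reflecting no net effect, whereas a function which kills and then questionably allocates leaves the DK state at state KQ.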


State diagram 550 (FIG. 5B) represents states and state transitions for a composite DK state of an external. Arrows are used in state diagram 550 to represent composite DK state transitions from a previous composite DK state according to a DK state resulting from a traversal of the function. State diagram 550 is summarized in Table L below.

                  TABLE L
    ______________________________________
    New Composite DK States
    previous
    composite   next DK state:
    DK state:   O      A      Q      K      KA     KQ     E
    ______________________________________
    O:          O      A      Q      K      KA     KQ     E
    A:          A      A      Q      E      E      E      E
    Q:          Q      Q      Q      E      E      E      E
    K:          K      E      E      K      E      KQ     E
    KA:         KA     E      E      E      KA     KQ     E
    KQ:         KQ     E      E      KQ     KQ     KQ     E
    E:          E      E      E      E      E      E      E
    ______________________________________
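Table L can be transcribed as follows. The C sketch below is a literal transcription of Table L, and the names dk_state and compose_dk are assumptions.

```c
typedef enum { DK_O, DK_A, DK_Q, DK_K, DK_KA, DK_KQ, DK_E, DK_COUNT } dk_state;

/* Table L: rows are the previous composite DK state,
 * columns are the DK state resulting from the latest traversal. */
static const dk_state composite_dk[DK_COUNT][DK_COUNT] = {
    /*           O      A      Q      K      KA     KQ     E    */
    /* O  */ { DK_O,  DK_A,  DK_Q,  DK_K,  DK_KA, DK_KQ, DK_E },
    /* A  */ { DK_A,  DK_A,  DK_Q,  DK_E,  DK_E,  DK_E,  DK_E },
    /* Q  */ { DK_Q,  DK_Q,  DK_Q,  DK_E,  DK_E,  DK_E,  DK_E },
    /* K  */ { DK_K,  DK_E,  DK_E,  DK_K,  DK_E,  DK_KQ, DK_E },
    /* KA */ { DK_KA, DK_E,  DK_E,  DK_E,  DK_KA, DK_KQ, DK_E },
    /* KQ */ { DK_KQ, DK_E,  DK_E,  DK_KQ, DK_KQ, DK_KQ, DK_E },
    /* E  */ { DK_E,  DK_E,  DK_E,  DK_E,  DK_E,  DK_E,  DK_E },
};

dk_state compose_dk(dk_state previous, dk_state next)
{
    return composite_dk[previous][next];
}
```

The entries of state E outside the last row and column identify combinations which Table L treats as inconsistent, e.g., a previous composite DK state of state A followed by a traversal which results in a DK state of state K.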


Function example_2() of source code excerpt (6) above provides an illustrative example of the utility of composite states of externals.

As described above, flow of control through function example_2() can take any of several paths depending on assumptions made with respect to events during an emulated execution of the function. For example, the "if" statement at line 16 can be followed by the "return" statement at line 17, if variable "str" is NULL, or by the expression on line 19, otherwise. The returned item of function example_2() is an external of function example_2(). The returned item of function example_2() is assigned at line 17, line 25, or line 29 of source code excerpt (6) depending only on the particular assumptions made during a particular traversal of function example_2().

At line 17 or line 25, the returned item has no associated resource. Thus, after a traversal of function example_2() in which control transfers through either line 17 or line 25 of source code excerpt (6), the composite RS state of the external representing the returned item is state U. After a subsequent traversal of function example_2() in which control transfers through line 29, the external representing the returned item is associated with a resource created within function example_2() and is definitely allocated, i.e., in state A. The resource is definitely allocated because lines 16-17 of source code excerpt (6) properly prescribe an action to be taken in the event that execution of function malloc() does not successfully allocate memory.

As shown in state diagram 350 (FIG. 3B), an external, whose previous composite RS state is state U and whose next RS state is state A, has a new composite RS state of state Q. Such reflects the fact that execution of function example_2() can allocate, but does not necessarily allocate, memory to which the returned item points. Thus, when forming a function model describing the behavior of function example_2(), the returned item of function example_2() is described as associated with a newly created resource whose initial state is state Q.

Composite states can also be used to detect inconsistent use of an external by a function. For example, if a function terminates with an external in an allocated state, i.e., a DK state of state A, and, in a subsequent traversal of the function, the function terminates with the same external in a freed state, i.e., a DK state of state K, the composite DK state of the external is state E, as shown in Table L. This can be viewed as an error since a calling function generally would not expect the function to allocate a resource associated with an external in one execution and to free a resource associated with the same external in another execution.

ANALYSIS OF A COMPUTER PROGRAM

A computer program 610 (FIG. 6) is analyzed in accordance with the present invention by a resource checker 602 which analyzes the use of resources prescribed by computer program 610 as described herein. In the disclosed embodiment, resource checker 602 is a computer process executing in CPU 102 from memory 104, which is connected to CPU 102 through bus 108.

The analysis of computer program 610 according to the present invention is illustrated by logic flow diagram 900 (FIG. 9). Processing begins in step 902 in which a command entered by a user, e.g., through keyboard 124 (FIG. 1) or mouse 122, initiates analysis of computer program 610 (FIG. 6) and specifies characteristics of the environment in which computer program 610 is analyzed. Characteristics of the environment which can be modified by the user include (i) specific types of errors to detect, (ii) a maximum number of errors to report, (iii) a maximum number of functions to analyze, (iv) a maximum number of iterative traversals of each function, and (v) the particular technique for traversing all possible control flow paths through a function.

Processing transfers from step 902 (FIG. 9) to step 904 in which resource checker 602 (FIG. 6) initializes function models, which describe the effect on resources of execution of the various functions used by the computer program. Resource checker 602 includes a model parser 702 (FIG. 7) which reads models from a model description file 604 (FIG. 6) and constructs therefrom function model structures which are described more completely below. By creating function model structures within resource checker 602, the function models are initialized. Step 904 (FIG. 9) is described more completely below with respect to logic flow diagram 904 (FIG. 10).

Processing transfers from step 904 (FIG. 9) to step 906, in which a program parser 704 (FIG. 7), which is part of resource checker 602, reads and parses computer program 610 (FIG. 6), using conventional techniques, according to the language to which computer program 610 comports. Program parser 704 (FIG. 7) parses computer program 610 (FIG. 6) into smaller program components, e.g., functions. In step 906 (FIG. 9), a single function is parsed from computer program 610 (FIG. 6) and a function structure, which represents the parsed function, is transferred to a dynamic inspection engine 706, which is described more completely below. In an alternative embodiment, a preprocessor, which is described in more detail below, parses computer program 610 and stores a number of function structures representing the parsed functions of computer program 610. In this alternative embodiment, program parser 704 retrieves a single function structure and transfers the function structure to dynamic inspection engine 706. Processing transfers from step 906 (FIG. 9) to step 908.

In step 908, dynamic inspection engine 706 (FIG. 7), which is part of resource checker 602, analyzes the "subject function", i.e., the function represented by the function structure transferred to dynamic inspection engine 706 by program parser 704 in step 906 (FIG. 9). In other words, the effect on the resources used by computer program 610 resulting from the execution of the subject function is determined and the state transitions of each of the resources affected by execution of the subject function are analyzed as described more completely below. The function models initialized in step 904 are used to analyze the states and state transitions of the resources and externals of the subject function. Any detected state violations are reported as programming errors.

Once the behavior of the subject function with respect to resources and externals of the subject function is determined, model parser 702 forms and stores in model description file 604 a function model describing the behavior of the subject function. Step 908 (FIG. 9) is described more completely below with respect to logic flow diagram 908 (FIG. 24).

Processing transfers from step 908 (FIG. 9) to test step 910 in which program parser 704 (FIG. 7) further parses computer program 610 (FIG. 6) to determine whether computer program 610 contains a function which has yet to be analyzed by dynamic inspection engine 706 (FIG. 7) according to step 908 (FIG. 9). In the alternative embodiment described above, program parser 704 (FIG. 7) determines whether a function structure representing a function of computer program 610 has yet to be analyzed by dynamic inspection engine 706 (FIG. 7) according to step 908 (FIG. 9). If dynamic inspection engine 706 (FIG. 7) has not processed a function structure representing a function of computer program 610, processing transfers to step 906 (FIG. 9) in which program parser 704 (FIG. 7) transfers the function structure to dynamic inspection engine 706 (FIG. 7) as described above. Conversely, if dynamic inspection engine 706 (FIG. 7) has processed every function structure representing a function of computer program 610, processing according to logic flow diagram 900 (FIG. 9) terminates.
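
The loop formed by steps 906, 908, and 910 can be pictured with the following minimal C sketch. It is illustrative only and is not taken from the disclosed embodiment: the names `struct function`, `analyze_function`, and `run_checker` are hypothetical, the body of step 908 is reduced to setting a flag, and the initialization of step 904 is omitted. The limit `max_functions` stands for the user-specified maximum number of functions to analyze mentioned with respect to step 902.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a parsed function; the function structure
 * described in this disclosure carries many more fields. */
struct function {
    const char *name;
    int analyzed;               /* set once step 908 has processed it */
};

/* Hypothetical placeholder for step 908: analyze one subject function. */
static void analyze_function(struct function *fn)
{
    fn->analyzed = 1;
}

/* Steps 906-910: hand one function at a time to the analysis step until
 * no unanalyzed function remains or the user-specified limit is reached.
 * Returns the number of functions analyzed in this run. */
size_t run_checker(struct function *fns, size_t count, size_t max_functions)
{
    size_t done = 0;

    assert(fns != NULL || count == 0);
    for (size_t i = 0; i < count && done < max_functions; i++) {
        analyze_function(&fns[i]);      /* step 908 */
        done++;                         /* test step 910 is the loop test */
    }
    return done;
}
```

Called on an array of three functions with a limit of two, such a loop would analyze the first two functions and leave the third untouched.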

INITIALIZATION OF MODELS

As described above with respect to step 904 (FIG. 9) of logic flow diagram 900, function models describing the behavior of functions are initialized. Step 904 is shown in greater detail as logic flow diagram 904 (FIG. 10). Processing begins with step 1002 in which model description file 604 (FIG. 6), which contains function models as described above, is opened.

In one embodiment, function models are stored in textual format and are read in, then stored in data structures within memory 104 (FIG. 1), which are described more completely below. A function model includes information which identifies a function and a singly-linked list of external models for the externals of the function. The information which identifies the function includes (i) the name of the function, (ii) the name of the source code file in which the function is defined, (iii) the number of the textual line within the source code file at which the definition of the function begins, and (iv) a short description of the function. A source code file is a file stored in memory 104 (FIG. 1), typically in secondary storage such as a magnetic disk, which contains a computer program such as computer program 610. The external models, as stored in a singly-linked list, define the effect of execution of the function on externals of the function in terms of operations applied to those externals and any resources created on behalf of those externals.

An external model includes information specifying the type of external, information which identifies the external, and information which specifies the effect on the external of execution of the function. The information which identifies the external is either a parameter number, if the external is a parameter, a variable name, if the external is a global or static variable, or NULL, if the external is a returned item. The information which specifies the effect on the external of execution of the function includes (i) a list of the operations to be applied to the external, (ii) a flag specifying whether a new resource is created on behalf of the external, and (iii) the initial state of the new resource if one is created.

The textual format of the models as stored in model description file 604 (FIG. 6) is defined by the following Backus-Naur Form (BNF) definition (8). Backus-Naur Form is a well-known format for describing a formal language.

    ______________________________________
    <function-spec> ::= ( <function-prefix> <extern-list> )        (8)
    <function-prefix> ::=
       <function-name>
         [<defining-file> [<defining-line> [<description>]]]
    <extern-list> ::= <extern> | <extern> <extern-list>
    <extern> ::= ( <extern-type> <result-list> )
    <extern-type> ::=
       retval                        // returned item
       | ( param <param-number> )    // parameter
       | ( var <var-name> )          // global/static item
    <result-list> ::= <result> | <result> <result-list>
    <result> ::=
       ( op <state-op> )
       | ( new <initial-state> [<description>] )
    <initial-state> ::= A | Q | U | X |
    <state-op> ::= a | m | k | x | i | c | p
    ______________________________________


A function model, in textual format, is represented by non-terminal <function-spec> of BNF definition (8). In BNF, a terminal is a term that is not expanded further in a particular BNF definition, and, conversely, a non-terminal is a term that is expanded further. Terminal <function-name> is the identifier assigned to the function, i.e., is the identifier used by another function to call the function represented by the function model. Terminal <function-name> can be any function identifier which is valid according to the computer language with which the function is defined. Terminal <defining-file> is an alphanumeric identification of the source code file within which the function is defined. The alphanumeric identification can be a path description of the source code file, for example. Terminal <defining-line> is a textual representation of a non-negative number, i.e., using digits 0-9, specifying at which textual line of the source code file identified by terminal <defining-file> the definition of the modelled function begins.

It should be noted that, in BNF, terms which are optionally present are enclosed in brackets ("[ ]"). Therefore, in the definition of non-terminal <function-prefix>, terminals <defining-file>, <defining-line>, and <description> are optionally present. It should be further noted that successive slashes ("//") denote the beginning of a comment and the slashes, and any text following the slashes to the end of a textual line, are not considered part of the BNF definition.

Terminal <description> of BNF definition (8) is a series of one or more characters (i.e., letters, numerals, and/or symbols). Terminal <description> is not used by the resource checker 602 (FIG. 6) but is instead provided for the convenience and understanding of a user reading the model in the textual format. Terminal <param-number> of BNF definition (8) is a textual representation of a non-negative integer using the digits 0-9 and specifies a particular parameter in a list of parameters. Parameter zero is the first, i.e., leftmost, parameter in a list of parameters in a call to a function. Subsequent parameters are numbered sequentially. Terminal <var-name> of BNF definition (8) is an identifier of a variable.
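
As an illustration of how a model in the textual format of BNF definition (8) might be split into lexical tokens before parsing, consider the following C sketch. It is a simplified assumption rather than the tokenizer of model parser 702: the function name `next_token` is hypothetical, a description is assumed to be enclosed in double quotes as in `(new Q "memory")`, and no error reporting is attempted.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Copies the next token of a model text into buf (capacity cap) and
 * returns a pointer just past that token, or NULL at end of input.
 * A token is "(", ")", a double-quoted string (quotes stripped), or a
 * run of characters free of white space and parentheses. Tokens longer
 * than cap - 1 characters are truncated. */
const char *next_token(const char *p, char *buf, size_t cap)
{
    size_t n = 0;

    assert(p != NULL && buf != NULL && cap >= 2);
    while (isspace((unsigned char)*p))
        p++;
    if (*p == '\0')
        return NULL;
    if (*p == '(' || *p == ')') {
        buf[0] = *p;
        buf[1] = '\0';
        return p + 1;
    }
    if (*p == '"') {                       /* quoted description */
        p++;
        while (*p != '\0' && *p != '"') {
            if (n + 1 < cap)
                buf[n++] = *p;
            p++;
        }
        if (*p == '"')
            p++;                           /* skip the closing quote */
    } else {                               /* name, number, or keyword */
        while (*p != '\0' && !isspace((unsigned char)*p) &&
               *p != '(' && *p != ')') {
            if (n + 1 < cap)
                buf[n++] = *p;
            p++;
        }
    }
    buf[n] = '\0';
    return p;
}
```

Repeated calls walk through a text such as `( f ( retval ( new Q "memory" ) ) )`, where `f` is a made-up function name, yielding each parenthesis, keyword, state letter, and description in turn.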

Thus, function models retrieved from model description file 604 (FIG. 6) each describe the effect of execution of a respective function on externals of the function. Processing transfers from step 1002 (FIG. 10) to loop step 1004 in which each function model stored in model description file 604 (FIG. 6) is retrieved and processed according to a loop defined by loop step 1004 (FIG. 10) and next step 1014. During each iteration of the loop, the function model which is processed is called the current function model. When each and every function model stored in the model description file has been processed according to the loop defined by loop step 1004 and next step 1014, processing transfers from loop step 1004 to step 1006 in which model description file 604 (FIG. 6) is closed and processing according to logic flow diagram 904 (FIG. 10) terminates.

For each function model retrieved from the model description file, processing transfers from loop step 1004 to step 1008 in which the portion of the current function model corresponding to non-terminal <function-prefix> of BNF definition (8) above is parsed from the current function model. Processing transfers to step 1010 in which a function model structure is initialized and the information parsed from the current function model in step 1008 is stored in a function model structure.

A function model structure 1100 (FIG. 11) includes a field "name" 1102, a field "file" 1110, a field "line" 1112, and a field "description" 1108. Portions of the function model corresponding to terminals <function-name>, <defining-file>, <defining-line>, and <description> of BNF definition (8), all of which are part of non-terminal <function-prefix>, are parsed from the function model and stored in field "name" 1102, field "file" 1110, field "line" 1112, and field "description" 1108, respectively, of function model structure 1100. Processing transfers from step 1010 (FIG. 10) to loop step 1012.

Loop step 1012 and next step 1028 define a loop, in each iteration of which an external specified in the portion of the function model corresponding to non-terminal <extern-list> of BNF definition (8) above is processed. During each iteration of the loop defined by loop step 1012 and next step 1028, the currently processed external is called the subject external. After every external defined in the current function model has been processed according to the loop defined by loop step 1012 and next step 1028, processing transfers from loop step 1012 to next step 1014. Processing transfers from next step 1014 to loop step 1004 in which another function model retrieved from model description file 604 (FIG. 6) is processed or, if all function models have been processed, from which processing transfers to step 1006 (FIG. 10) as described above.

For each external specified in the portion of the current function model corresponding to non-terminal <extern-list> of BNF definition (8), processing transfers from loop step 1012 to step 1016. In step 1016, a new external model structure, external model structure 1200 (FIG. 12), is created.

External model structure 1200 includes a field "equivalent" 1202, a field "type" 1204, a field "parameter_number" 1206, a field "name" 1208, a field "next" 1210, a field "number_of_operations" 1212, a field "operations" 1214, a field "new_resource" 1218, a field "initial_state" 1220, and a field "description" 1222. In step 1016 (FIG. 10), the portion of the subject external model corresponding to terminal <param-number> in the definition of non-terminal <extern> of BNF definition (8) is parsed from the subject external model and is stored in field "parameter_number" 1206 (FIG. 12) of external model structure 1200.

In one embodiment, field "equivalent" 1202 is used to identify a second external model structure. By doing so, external model structure 1200 is related to the second external model structure. Such would be appropriate if, for example, the returned item of a function is the first parameter. The embodiment described herein does not make use of field "equivalent" 1202, which is therefore initialized to a NULL value. From step 1016 (FIG. 10), processing transfers to step 1018.
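
For illustration, an external model structure with the fields listed above might be declared in C as follows. This is a sketch under stated assumptions, not the declaration used in the disclosed embodiment: the C types, the enumeration `extern_type`, the fixed capacity `MAX_OPERATIONS`, and the helper `external_model_init` are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_OPERATIONS 16   /* hypothetical capacity of field "operations" */

/* The three kinds of external named by <extern-type>. */
enum extern_type { EXT_RETVAL, EXT_PARAM, EXT_VAR };

struct external_model {
    struct external_model *equivalent;  /* 1202: unused here, kept NULL  */
    enum extern_type type;              /* 1204                          */
    int parameter_number;               /* 1206: valid for EXT_PARAM     */
    const char *name;                   /* 1208: valid for EXT_VAR       */
    struct external_model *next;        /* 1210: singly-linked list link */
    int number_of_operations;           /* 1212                          */
    char operations[MAX_OPERATIONS];    /* 1214: operation identifiers   */
    int new_resource;                   /* 1218: nonzero means "true"    */
    char initial_state;                 /* 1220: e.g., 'Q'               */
    const char *description;            /* 1222                          */
};

/* Puts a freshly created structure into a known empty state, with field
 * "equivalent" set to NULL and no operations recorded. */
void external_model_init(struct external_model *em)
{
    assert(em != NULL);
    memset(em, 0, sizeof *em);
    em->equivalent = NULL;
    em->name = NULL;
    em->next = NULL;
    em->description = NULL;
}
```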

In step 1018, the portion of the subject external model corresponding to non-terminal <extern-type> of BNF definition (8), which specifies the type of external represented by the subject external model, is parsed from the subject external model. As shown in BNF definition (8) above, an external represented by an external model can be a returned item, a parameter, or a globally-defined or static variable. Data specifying the type of external represented by the subject external model are stored in field "type" 1204 (FIG. 12) of external model structure 1200. Processing transfers from step 1018 (FIG. 10) to a loop step 1020.

As shown in BNF definition (8) above, execution of a function can have one or more effects or "results" on each external of the function. Each result is represented in BNF definition (8) as non-terminal <result>. One or more results are included in non-terminal <result-list>. Loop step 1020 and next step 1024 define a loop in which each result in the list of non-terminal <result-list> of the subject external model is processed. During an iteration of the loop defined by loop step 1020 and next step 1024, the result being processed is called the subject result. After every result of the subject external model has been processed according to the loop defined by loop step 1020 and next step 1024, processing transfers from loop step 1020 to step 1026 which is described below.

For each result for the subject external model, processing transfers from loop step 1020 to step 1022. In step 1022, the subject result is parsed from the subject external model. The result is then stored in an external model structure such as external model structure 1200 (FIG. 12). For example, function model (3), which is defined above, specifies one result for a first external, i.e., the returned item, and one result for a second external, i.e., parameter zero. The result of the returned item is specified as `(new Q "memory")`, indicating that a new resource is created for the returned item and that the initial state of the resource is state Q, and providing "memory" as a brief description of the resource. Accordingly, if external model structure 1200 represents the external model for the returned item, (i) field "new_resource" 1218 is set to a boolean value of "true" to indicate that a new resource is created, (ii) field "initial_state" 1220 is set to indicate that the initial state of the new resource is state Q, and (iii) the text "memory" is stored in field "description" 1222.

As a second example, function model (3) above specifies a result "(op c)" for the second external, i.e., parameter zero. Result "(op c)" specifies that operation c is applied to the external. Accordingly, if external model structure 1200 represents the external model for parameter zero, field "number_of_operations" 1212, which initially has a value of zero, is incremented and an operation identifier "c" is stored in field "operations" 1214 corresponding to a position indicated by field "number_of_operations" 1212. In this example, field "number_of_operations" 1212 stores a value of one and the first operation identifier in field "operations" 1214 is an identifier of operation c. If a second operation is applied to the second external, field "number_of_operations" 1212 is again incremented to a value of two and the second operation identifier in field "operations" 1214 is the identifier of the second operation.
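
The two examples above can be sketched in C as follows. The sketch is illustrative only: the reduced structure, the capacity `MAX_OPERATIONS`, and the helpers `record_new` and `record_op` are hypothetical, and operation identifiers are stored at zero-based array positions, so the first operation occupies element 0 once the count reaches one.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_OPERATIONS 16   /* hypothetical capacity */

/* Reduced external model holding only the fields touched in step 1022. */
struct external_model {
    int number_of_operations;           /* field 1212 */
    char operations[MAX_OPERATIONS];    /* field 1214 */
    int new_resource;                   /* field 1218 */
    char initial_state;                 /* field 1220 */
    const char *description;            /* field 1222 */
};

/* Records a result of the form ( new <initial-state> [<description>] ). */
void record_new(struct external_model *em, char state, const char *desc)
{
    assert(em != NULL);
    em->new_resource = 1;
    em->initial_state = state;
    em->description = desc;             /* may be NULL when omitted */
}

/* Records a result of the form ( op <state-op> ).
 * Returns 0 on success, or -1 if the operations array is full. */
int record_op(struct external_model *em, char op)
{
    assert(em != NULL);
    if (em->number_of_operations >= MAX_OPERATIONS)
        return -1;
    em->operations[em->number_of_operations] = op;
    em->number_of_operations++;
    return 0;
}
```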

Processing transfers from step 1022 (FIG. 10) through next step 1024 to loop step 1020 which is described above. As described above, processing transfers from loop step 1020 to step 1026 once all results for the subject external model have been processed.

In step 1026, the external model structure representing the subject external model is added to a singly linked list of externals in the current function model structure. An illustrative example is discussed in the context of function model (3) above. An external model structure 1200A (FIG. 13) is first added to a function model structure 1100A by storing in fields "first_external" 1104A and "last_external" 1106A pointers to external model structure 1200A. A second external model structure 1200B is then added to function model structure 1100A by storing in field "next" 1210A of external model structure 1200A, and in field "last_external" 1106A of function model structure 1100A (superseding the pointer previously stored in field "last_external" 1106A), a pointer to external model structure 1200B as shown in FIG. 13.
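
The append operation of step 1026 can be sketched in C as follows. The reduced structures and the function name `add_external` are hypothetical; only the pointer handling follows the description above.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced structures: only the fields needed for list maintenance. */
struct external_model {
    int parameter_number;
    struct external_model *next;            /* field "next" 1210 */
};

struct function_model {
    const char *name;
    struct external_model *first_external;  /* field 1104 */
    struct external_model *last_external;   /* field 1106 */
};

/* Appends em to the tail of fm's singly-linked list of externals. */
void add_external(struct function_model *fm, struct external_model *em)
{
    assert(fm != NULL && em != NULL);
    em->next = NULL;
    if (fm->last_external == NULL)
        fm->first_external = em;            /* first external in the list */
    else
        fm->last_external->next = em;       /* link behind the old tail */
    fm->last_external = em;                 /* supersede the tail pointer */
}
```

Keeping a pointer to the last external lets each append complete without walking the list.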

Processing transfers from step 1026 (FIG. 10) through next step 1028 to loop step 1012. After every external model has been processed as described above, processing transfers from loop step 1012 through next step 1014 to loop step 1004. After every function model has been processed as described above, processing transfers from loop step 1004 to step 1006 in which the file containing function models in the textual format described above is closed as described above. Processing according to logic flow diagram 904 terminates after step 1006.

INTERNAL REPRESENTATION OF A FUNCTION

Once computer program 610 (FIG. 6) is parsed by program parser 704 (FIG. 7), computer program 610 is represented in memory 104 by a series of function structures. In an alternative embodiment as described above, program parser 704 retrieves from computer program 610 function structures which have been formed by a previous parsing of a source computer program conforming to a particular computer language, e.g., the C computer language. The source computer program is parsed by a source code preprocessor which parses the source computer program according to the computer language to which the source computer program comports and forms and stores in computer program 610 function structures representing the functions defined in the source computer program. The source code preprocessor (not shown) is a separate computer process from resource checker 602.

In this alternative embodiment, the source code preprocessor is based on the known GNU C compiler available from Free Software Foundation, Inc. of Cambridge, Mass. Appendix B, which is a part of this disclosure and is incorporated herein in its entirety, is a list of computer instructions which define data structures and functions for transporting parsed functions of a computer program from a source code preprocessor into data structures described more completely below for representing a parsed function. In one embodiment, a conventional compiler, such as the known GNU C compiler described above, is used to parse a computer program and the parsed program is represented in data structures such as those defined in Appendix B.

The following is a description of a function structure. Familiarity with fields and relationships within a function structure facilitates the subsequent description of the processing of dynamic inspection engine 706 (FIG. 7).

Function structure 1400 (FIG. 14) represents a function defined by computer program 610 or, in an alternative embodiment as described above, the source computer program and includes (i) a field "name" 1402, (ii) a field "line" 1404, (iii) a field "file" 1406, (iv) a field "result" 1408, (v) a field "externals" 1410, and (vi) a field "first_stmt" 1412. Field "name" 1402 of function structure 1400 specifies the identifier of the function represented by function structure 1400. For example, the identifier of function example_1() of source code excerpt (1) above is "example_1".

Field "file" 1406 and field "line" 1404 specify the source code file and line number within that file, respectively, at which the function represented by function structure 1400 is defined. For example, if source code excerpt (1) above represents the entire contents of a single source code file whose file name is "example_1.c", field "file" 1406 and field "line" 1404 of a function structure representing function example_1() contain, respectively, data specifying the text "example_1.c" and an integer value of seven (7).

Field "result" 1408 points to a declaration structure 1418, which is analogous to declaration structure 1506 described below and which specifies the type of result returned by the function represented by function structure 1400. For example, function example_1() of source code excerpt (1) above returns a result which is an integer, i.e., data of the type "int", as specified at line 7 of source code excerpt (1). Thus, if function structure 1400 represents function example_1(), field "result" 1408 points to declaration structure 1418 which specifies integer data.

Field "externals" 1410 of function structure 1400 is a pointer to an external list structure 1414, which is described below in greater detail. As described more completely below, external list structures such as external list structure 1414 include a pointer which is used to link external list structures in a singly-linked list. Thus, pointing to an external list structure is to point to a singly-linked list of external list structures, even if the length of the list is one. Such a singly-linked list, which is pointed to by field "externals" 1410 of function structure 1400, includes external list structures representing the externals of the function represented by function structure 1400.

Field "first_stmt" 1412 of function structure 1400 is a pointer to a statement structure 1416, which is described below in greater detail. As described more completely below, statement structures such as statement structure 1416 include a pointer which is used to link statement structures in a singly-linked list. Thus, pointing to a statement structure is to point to a singly-linked list of statement structures, even if the length of the list is one. Such a singly-linked list, which is pointed to by field "first_stmt" 1412 of function structure 1400, includes statement structures representing the statements of the function represented by function structure 1400.
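
A function structure with the six fields described above might be sketched in C as follows. The C types are assumptions, the link field of a statement structure is given the hypothetical name `next`, and `count_statements` is a hypothetical helper that merely shows how the singly-linked list of statements is walked from field "first_stmt".

```c
#include <assert.h>
#include <stddef.h>

struct declaration;       /* described under DECLARATION STRUCTURES */
struct external_list;     /* described under EXTERNAL LIST STRUCTURES */

/* Reduced statement structure: only the list link is shown, and the
 * field name "next" is an assumption. */
struct statement {
    struct statement *next;
};

struct function {
    const char *name;                  /* field "name" 1402       */
    int line;                          /* field "line" 1404       */
    const char *file;                  /* field "file" 1406       */
    struct declaration *result;        /* field "result" 1408     */
    struct external_list *externals;   /* field "externals" 1410  */
    struct statement *first_stmt;      /* field "first_stmt" 1412 */
};

/* Walks the statement list and returns the number of statements. */
size_t count_statements(const struct function *fn)
{
    size_t n = 0;

    assert(fn != NULL);
    for (const struct statement *s = fn->first_stmt; s != NULL; s = s->next)
        n++;
    return n;
}
```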

EXTERNAL LIST STRUCTURES

External list structure 1414 is shown in greater detail in FIG. 15. External list structure 1414 represents an external of the function represented by function structure 1400 (FIG. 14) and includes a field "first_decl" 1502 (FIG. 15), a field "next" 1504, and a field "first_external" 1510. Field "first_decl" 1502 is a pointer to a declaration structure 1506, which specifies the data type of the external represented by external list structure 1414 and which is described below in greater detail. Field "next" 1504 is a pointer to another external list structure 1508 if external list structure 1508 immediately follows external list structure 1414 in the singly-linked list of externals. If no external list structure follows external list structure 1414 in the singly-linked list of external list structures, field "next" 1504 of external list structure 1414 is NULL, i.e., contains NULL data. Field "first_external" 1510 is a pointer to an external state structure (not shown) which specifies the state of the external represented by external list structure 1414 and which is described below in greater detail.

DECLARATION STRUCTURES

Declaration structure 1506 is shown in greater detail in FIG. 16. A declaration structure is a structure which specifies a declared variable or function, i.e., a variable or function, respectively, specified in a declaration. Declarations in the context of the C computer language are well-known and are described in the C Standard. Declaration structure 1506 includes a field "kind" 1602, a field "name" 1604, a field "type" 1606, a field "item" 1608, and a field "model" 1610.

Field "kind" 1602 contains data specifying whether the declared item or function is globally defined, static, or locally defined. Field "name" 1604 contains textual data specifying an identifier of the item or function. As described above, in the context of the C computer language, an item or function is identified by a textual identifier and identifiers must conform to a specific format, which is described in Section 6.1.2 of the C Standard.

Field "type" 1606 of declaration structure 1506 is a pointer to a type structure 1612 which specifies the particular type of data represented by the declared item or function. Type structure 1612 is described below. Field "item" 1608 is a pointer to item structure 2700 which represents the declared item. If declaration structure 1506 represents a declared function, field "item" 1608 is NULL and therefore points to no item structure.

Field "model" 1610 of declaration structure 1506 is a pointer to function model structure 1100 if declaration structure 1506 represents a declaration of a function whose model is represented by function model structure 1100. If declaration structure 1506 does not represent a declaration of a function, field "model" 1610 is NULL, i.e., contains NULL data, and therefore points to no function model structure. Furthermore, if declaration structure 1506 represents a declaration of a function for which no function model structure exists, field "model" 1610 is NULL.
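
The role of fields "item" 1608 and "model" 1610 can be illustrated with the following C sketch. The types, the placeholder contents of `struct item` and `struct function_model`, and the helper `classify` are hypothetical; the sketch further assumes that a declaration of a variable always has a non-NULL field "item".

```c
#include <assert.h>
#include <stddef.h>

enum decl_kind { DECL_GLOBAL, DECL_STATIC, DECL_LOCAL };

struct type;                                 /* see TYPE STRUCTURES */
struct item { int placeholder; };            /* stand-in for item structure */
struct function_model { const char *name; }; /* stand-in for a model */

struct declaration {
    enum decl_kind kind;              /* field "kind" 1602  */
    const char *name;                 /* field "name" 1604  */
    struct type *type;                /* field "type" 1606  */
    struct item *item;                /* field "item" 1608  */
    struct function_model *model;     /* field "model" 1610 */
};

enum decl_class { DECL_IS_ITEM, DECL_IS_MODELLED_FUNCTION, DECL_IS_UNMODELLED };

/* Classifies a declaration from its "item" and "model" pointers:
 * a non-NULL model marks a function with a model, a non-NULL item marks
 * a declared item, and anything else is treated as having no model. */
enum decl_class classify(const struct declaration *d)
{
    assert(d != NULL);
    if (d->model != NULL)
        return DECL_IS_MODELLED_FUNCTION;
    if (d->item != NULL)
        return DECL_IS_ITEM;
    return DECL_IS_UNMODELLED;
}
```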

TYPE STRUCTURES

Type structure 1612 is shown in greater detail in FIG. 17. A type structure such as type structure 1612 specifies a particular data type, such as integer, floating point, alphanumeric characters, and user-defined types such as structures. Type structure 1612 includes a field "kind" 1702, a field "name" 1704, a field "size" 1706, a field "points_to" 1708, and a field "fields" 1710. Field "kind" 1702 contains data specifying whether the type represented by type structure 1612 is integer, real (i.e., floating point numerical data), pointer, array, structure (i.e., data type "struct" as defined for the C computer language), or union. Each of these types is well-known and is described in the C Standard at Sections 6.1.2.5 and 6.5 et seq.

Field "name" 1704 of type structure 1612 contains alphanumeric data specifying the identifier of the type if the type represented by type structure 1612 is user-defined. Otherwise, if the type represented by type structure 1612 is predefined by the C computer language, field "name" 1704 is NULL.

Field "size" 1706 specifies the size of the type represented by type structure 1612. If the type is not an array, field "size" 1706 specifies the number of bits of data included in an item of the type represented by type structure 1612. For example, if the type is a 32-bit integer, field "size" 1706 of type structure 1612 specifies the value 32. If the type is an array, field "size" 1706 specifies the number of bits of data included in the entire array, i.e., the number of bits of data included in an item of the type represented by an element of the array multiplied by the number of elements in the entire array. For example, a declaration