Managing software components

Automated method for building and maintaining software including methods for verifying that systems are internally consistent and correct relative to their specifications

6275976

Abstract

Software development and maintenance involves assembling components, sometimes with explicit support during the design process but more frequently not. In neither case has it been possible to ensure internal consistency and correctness in a manner that scales well to large as well as small system. The invention disclosed herein is an intuitive, repeatable, formally verifiable and automated method for developing and maintaining software systems that is practical and easy to use. Specifically, this method provides human designers with automated support for specifying, designing, implementing and maintaining arbitrarily large, complex software systems that are both internally consistent and logically correct (i.e., consistent with external specifications). This method avoids the "combinatorial explosion" problem and minimizes the need for human input, respectively, by constructing specifications and solution designs via successive levels of refinement and by categorizing input and output values into equivalence classes.


Claims

Having thus described our invention, what we claim as new, and desire to secure by Letters Patent is:

1. A computer implemented method for specifying, designing, testing, implementing and maintaining at least one of a plurality of systems that can be guaranteed to be internally consistent and abstractly correct with respect to their specifications via successive refinement, said method comprising the steps of:

a) building and maintaining a specification hierarchy representing a behavioral system in which input and output variables have been hierarchically structured so that abstract behavior of said behavioral system at successive levels of refinement in said specification hierarchy is equivalent in response to a request to specify the behavior of said system;

b) in response to a request building and maintaining a design hierarchy corresponding to the behavioral system in which each refinement in said design hierarchy conforms to consistency constraints, which guarantee that the children in each said refinement produce behavior that is abstractly equivalent to that of the parent, wherein each design in said design hierarchy produces abstract behavior equivalent to a corresponding specification hierarchy hierarchy when said specification hierachy of step a) exists;

c) analyzing at least one of said specification hierarchy and said design hierarchy in response to a request to extract information inherent in at least one of said specification hierarchy and said design hierarchy;

d) modifying said specification hierarchy in response to a request to modify the abstract behavior of said specification hierarchy;

e) modifying said design hierarchy in response to a request to modify the abstract behavior of said design hierarchy;

f) testing at least one design in response to a request subsequent to at least one of the steps of dynamically modeling the behavior of said design and creating test cases for said design and proving conclusively that a component in at least one said design executes according to its specification;

g) in response to a request implementing at least one component by converting said components in said design into executables; and

h) in response to a request performing at least one of the steps of executing said design, calling said design from an existing application and executing a component or a design corresponding to at least one component in said design.

2. A method in accordance with claim 1, further comprises the steps of:

a1) specifying at least one output variable, and for each said output variable in said specification hierarchy, specifying zero or one input variable and at least one input variable to said specification hierarchy which maps into said output variable;

b1) partitioning the values of each said input and output variable of said specification hierarchy into equivalence classes, said partition consisting of a finite number of abstract values, one abstract value being the default number;

c1) specifying a mapping of said abstract input values of step b1, into said abstract output values;

d1) given one of an input variable and an output variable in said specification hierarchy, refining said variable into one of no child variables, said variable then being terminal in said specification hierarchy, and one of component child variables and category child variables and relationships between child variables and operations, specifying mappings between input and output child variables, said components, categories, relationships and operations more fully characterizing said input and output variables;

e1) specifying a mapping between the abstract values of each said parent variable of step d1) and the abstract values of said children variables of said parent variable;

f1) for each said child output variable of step d1), specifying a mapping from said abstract values of said input children variables of step d1 ) into said abstract values of each said child output variable; and

g1) ensuring that said input and output mappings of step c1) are consistent with said child mappings of step a1), in which case the mapping of step e1) followed by the mapping of step c1) generates the same abstract values for each said output variable of step c1) as does the mapping of step f1) followed by the mapping of step e1).

3. A method in accordance with claim 2, wherein said step d1) of refining each said input and output variable into child variables further comprises:

a2) when said refinement is a component refinement, distinguishing between prototype component refinements in which one child component represents a finite set of child components, each of which has exactly the same structure and the same abstract values, and those said component refinements which are not said prototype refinements; and

b2) when said refinement is a category refinement, distinguishing between category refinements in which all child variables are atomic, without any internal component structure, and those said category refinements In which at least one said child variable has component variables.

4. A method in accordance with claim 2 for constructing a higher order specification hierarchy when the values of at least one said variable of claim 2 are specifications.

5. A method in accordance with claim 2 for constructing specifications for real time systems wherein the values of at least one variable of said specification are themselves one of time and timed events.

6. A method in accordance with claim 1 wherein said step b) of building a design hierarchy by successive refinement of design components in which each refinement in said hierarchy conforms to consistency constraints, which guarantee that the children In each refinement produce behavior that is abstractly equivalent to that of the parent, wherein each design component in said hierarchy produces abstract behavior equivalent to said specification when said specification hierarchy of step a) of claim 1 exists, further comprises the steps of:

a1) constructing and choosing a name for a top-level design component in said design hierarchy, in which the parameters of said top-level design component include all of the input and output variables at the top-level of said specification hierarchy of claim 1, and when said specification hierarchy of claim 1 hierarchy exists, said design component also including as parameters one of no external events and input events and their abstract values from outside said design hierarchy;

b1) refining each design component of said design hierarchy into one of a sequence, parallel, selection, iteration, affiliate, relation and operation refinement in accordance with specified consistency constraints, wherein said consistency constraints ensure that the abstract behavior of each said design component is equivalent to the abstract behavior of the child design components in said refinement;

c1) for each parameter of each said child design component of step b) which is not identical to a variable in said specification hierarchy of claim 1 or a parameter of the parent design component of said refinement of step b), partitioning the values of said parameter into abstract values, one said abstract value being the default, and specifying the abstract input-output behavior for each said design component in said each refinement, the full set of input and output parameters of each child design component in said each refinement comprising the specifications for said child design component; d1) verifying that the abstract behavior specified by each parent design component in each said refinement is consistent with the abstract behavior specified by the children design components in each said refinement; and

e1) when said specification hierarchy of step a) of claim 1 exists, repeating steps b1) through d1) until all variables in said specification hierarchy of step a) of claim 1 have been referenced in said design hierarchy.

7. A method in accordance with claim 6 wherein said step b1) of refining said design component into one of a sequence, parallel, selection, iteration, affiliates, relation and operation, in accordance with specified constraints on design refinements, wherein said specified constraints on design refinements ensure consistency of said refinement, further comprises the steps of,

a1) when at least one parameter of said design component corresponds to a variable in said specification hierarchy of claim 6, herein called a specification parameter

(I) selecting one said specification parameter of said design component,

(ii) determining the specification refinement type of said specification parameter, and

(iii) suggesting the type of refinement of said design component as follows;

when the type of said specification parameter is a component refinement, then suggesting a parallel refinement for said design component,

when the type of said specification parameter is a prototype refinement representing a variable number of components, then suggesting a loop refinement for said design component,

when the type of said specification parameter is a prototype refinement representing a fixed number of components, then suggesting a navigation sequence refinement for said design component,

when the type of said specification parameter is a terminal refinement with at least two abstract values, then suggesting a case refinement for said design component, otherwise make no said suggestion,

when the type of said specification parameter is a category refinement with atomic sub-categories, then suggesting one of a selection and case refinement for said design component,

when the type of said specification parameter is a category refinement with non-atomic sub-categories, otherwise known as a class inheritance hierarchy, then suggesting an affiliate refinement for said design component

when the type of said specification parameter is an relational refinement, otherwise known as a extraction refinement, then suggesting a sequence refinement for said design component,

when the type of said specification parameter is a dynamic refinement, then suggesting an interactional refinement for said design component;

(iv) choosing a refinement type for said design component,

(v) verifying that said refinement type is consistent with said specification hierarchy, wherein said consistency means:

when the refinement type of said design component is a parallel refinement, then at least one of said specification parameter of said design component must be one of a terminal refinement and a component refinement,

when the refinement type of said design component is a loop refinement, then at least one specification parameter of said design component must be one of a terminal refinement and a prototype refinement representing a variable number of components,

when the refinement type of said design component is a navigation sequence refinement, then at least one specification parameter of said design component must be one of a terminal refinement and a prototype refinement representing a fixed number of components,

when the refinement type of said design component is one of a case refinement and a selection refinement, then at least one specification parameter of said design component must be one of a terminal refinement and a category refinement with atomic sub-categories,

when the refinement type of said design component is a case refinement, then at least one specification parameter of said design component must be one of a terminal refinement and an atomic category refinement,

when the refinement type of said design component is an affiliate refinement, in which at least one parameter of said design component is an object, wherein said design component is an abstract design component, then at least one specification parameter of said design component must be one of a terminal refinement and a non-atomic category refinement,

when the refinement type of said design component is a sequence refinement, then at least one specification parameter of said design component must be one of a terminal refinement and a relational refinement,

when the refinement type of said design component is an interactional refinement, then at least one specification parameter of said design component must be one of a terminal refinement and a dynamic refinement, and

b2) for said refinement of step a2), checking to ensure that consistency constraints for the refinement type of said refinement are satisfied, and ensuring that the abstract behavior of said design component is equivalent to the abstract behavior of the child design components in said refinement.

8. A method in accordance with claim 6 for constructing a higher order design hierarchy when the values of at least one parameter of said design component of claim 6 are design components.

9. A method in accordance with claim 6 for constructing designs for real time system wherein the values of at least one variable of a design are themselves one of time and timed events.

10. A method in accordance with claim 1 wherein said step of maintaining said specification hierarchy of step a) of claim 1 in which input and output variables have been hierarchically structured so that abstract behavior at successive levels of refinement is equivalent, further comprises the steps of,

a1) modifying the abstract input-output behavior of at least one variable in said specification hierarchy;

b1) updating the abstract input-output behavior of the parent and the children of said variable of step a) to maintain abstract consistency;

c1) updating refinements in which said parent is a child and in which those said children which are changed are parents; and

d1) repeating step c1) until there are no more parent variables and no more children which are changed.

11. A method in accordance with claim 1 wherein said step of modifying a design hierarchy in which the behavior of the parent and children In each said refinement is equivalent, further comprises the steps of:

a1) modifying the abstract behavior of a design component in said design hierarchy;

b1) updating the parent of said design component in accordance with the consistency constraints of the refinement containing said parent and ensuring that the abstract behavior of the parent and children in said refinement is equivalent and updating the children of said design component in accordance with the consistency constraints of the refinement containing said design component and ensuring that the abstract behavior of said design component and its children is equivalent;

c1) updating the parent in those refinements in which said parent is a child when said parent has been changed;

d1) updating the children in those refinements in which said child is a parent when said child has been changed; and

e1) repeating step c1) and d1) until there are no more parent and child design components.

12. A method in accordance with claim 11 wherein said step c1) of updating the parent of said design component in said design hierarchy further comprises the steps of:

a2) identifying the parent of said design component in said design hierarchy,

b2) identifying the type of refinement associated with said parent; and

c2) constructing a parent design component for said refinement of step b2) from the children of said refinement using consistency constraints specified for said type of refinement of step b2).

13. A method in accordance with claim 12 wherein the step of constructing a parent design component from children design components further comprises:

a) for each sequence refinement In said program, using consistency constraints for sequences to construct the patent design component in said sequence refinement;

b2) for each selection refinement in said program, using consistency constraints for selections to construct the parent design component in said selection refinement;

c3) for each loop refinement in said program, using consistency constraints for loops to construct the parent design component in said loop refinement; and

d3) when said refinement is a loop refinement, said loop refinement contains one of an unelaborated loop body and a condition and an advance operation, an action operation and a condition.

14. A method in accordance with claim 13 wherein said step of constructing a parent design component from said children design components in said sequence refinement further comprises:

a4) identifying those subsequences of said refinement that are navigation sequences and those subsequences of said sequence refinement that are parallel refinements;

b4) inserting into Said design hierarchy of claim 13 a parent for each said subsequence, said subsequence parent replacing its children in said sequence and updating the contents of each said subsequence parent; and

c4)c) updating said sequence refinement.

15. A method in accordance with claim 14 wherein said step of identifying those subsequences of said sequence that are navigation sequences and those subsequences of said sequence that are parallel refinements further comprises:

a5) one of not partitioning the design components in said sequence refinement and partitioning Said sequence into navigation sub-sequences which introduce alias variables and non-navigation sequences, which do not introduce said aliases, and using consistency constraints to convert navigation sub-sequences into navigation sequence refinements;

b5) for each non-navigation sequence and each navigation sequence refinement in said program, one of not partitioning said sequences and partitioning said sequences into sub-sequences of steps which must be carried out sequentially and those sub-sequences which may be carried out in parallel, and

c5) when a said subsequence may be carried out in parallel, using consistency constraints to construct a parent design component for said parallel refinement, wherein new higher level parameters, composed of component variables in child design components, may be introduced in said parent design component.

16. A method in accordance with the method of claim 11, which creates an internally consistent design hierarchy from a design that has been constructed using traditional methods of assembling design components, wherein said design is constructed from design components in a canonical form, including refinement types and wherein goto's have been eliminated, wherein said design defines an implicit design hierarchy whose higher level elements include the successively embedded refinement types in said design and wherein higher level design components in said implicit design hierarchy are constructed from said design, said method comprising the steps of,

a2) embedding said design in an implicit design hierarchy; and

b2) applying the method claim 11 to each refinement in said implicit design hierarchy, wherein each child of each said refinement is a design component in one of said design and said implicit design hierarchy, to construct a parent design component in said refinement.

17. A method for converting the steps in a plurality of programs into components represented in said canonical form of claim 16, wherein said programs are written in a structured programming language in which goto's have been eliminated, wherein said programs may be one of program, subroutine, function, affiliate operation or method associated with an object, callback and remote procedure call, and wherein said canonical form means that each design component is an operation or function, whose input and output parameters are typeless and which does not reference any global variables, said method comprising the steps of;

a) receiving said program;

be) converting global variables used in said program to parameters and insuring that said program consists entirely of modules;

c3) converting each statement in each module in said program into a design component represented in said canonical form wherein the type of each parameter of each said canonical design component is represented as a data structure of typeless data elements and the input, output and Input-output role of each said parameter is represented in said canonical form, and

d3) reverse engineering, said program to extract the implicit design hierarchy of refinement types defined by said program.

18. A method in accordance with claim 1 wherein the step of testing at least one design by one of dynamically modeling the behavior of said design, creating test cases for said design and proving that a design component executes correctly, wherein said design may be at any level of abstraction in said design hierarchy, wherein parameters of said design components may have multiple data types in which case said design components are called abstract design components, further comprises:

a1) selecting a design

b1) the dynamically modeling the behavior of said design comprising,

(I) when said design element is an abstract design component, then determining which affiliates to invoke at run-time based on the structure of the parameters and replacing said design component with a sequence whose elements are affiliate design components,

(ii) when said design component is not executable, then presenting an external agent with the input values of the parameters of said design component, and obtaining the corresponding output values and data structures of said parameters from said external agent,

(iii) when said design component is executable, then executing said design component,

(iv) repeating steps (i) through (iii) until there are no more design components to interpret;

c1) in response to a request to create test cases for said design wherein said test cases can be used to ensure that said design produces behavior consistent with said specification of claim 1,

(i) selecting at least one set of actual values for each combination of abstract values of input variables in said specification of claim 1 associated with said selected design, wherein each said set contains one actual value for each said abstract value of input variables;

(ii) for each combination of said abstract values of input variables of said step (i), determining the corresponding set of abstract output values of output variables in said specification of claim 1, each said combination of actual input values of step (i) and said corresponding abstract output values constituting a test case; and

(iii) applying said test cases of step c1 (ii) to said design of step a1) and determining whether each actual output value belongs to said corresponding abstract value of output variables of step c1) (ii) for said corresponding test case; and

d1) in response to a request to prove conclusively that a given design component executes according to its specification,

(i) selecting a verification method including at least one of formal proofs, automated theorem proving techniques and other verification methods;

(ii) when at least one of formal proofs and automated theorem proving techniques is selected in said step d1) (i) and the specification of said design component is not sufficiently precise to allow use of at least one of said formal proofs and automated theorem proving techniques, then

identifying variables in said specification of said design component whose abstract values are too abstract to allow proof and

introducing a relational refinement, also called an extract refinement, wherein the specification of the children in said relational refinement is represented as a relationship between two or more variables, wherein said relationship is sufficiently precise, wherein said relationship is expressed in some suitable logic, for use in at least one of said formal proofs and automated theorem proving techniques.

19. A method in accordance with claim 1, wherein step g) of implementing at least one component in a design in a manner which guarantees abstract correctness, said method comprising the steps of:

a) in response to a request, one of receiving and identifying and constructing a finite set of design components sufficient for implementing all contemplated design components in a given set of designs;

b) in response to a request, receiving an unimplemented design component and converting said design component into a problem whose input consists of the specification for said design component and whose goal is satisfied by designs consistent with asid specification;

c) in response to a request, deriving a design component from said finite set of components consistent with said specification of said design component of step b); and

d) in response to a request, when absolute correctness must be guaranteed, verifying that said derived design component executes correctly; and

e) in response to a request, repeating steps b) through d) until all design components in said design of step b) have been implemented.

20. Given a design in a design hierarchy Implemented in accordance with the method of claim 19, a method for implementing an equivalent design in said design hierarchy, in which said equivalent design is at least one of faster in execution and more compact in size; said method compricing the steps of:

a2) when the designer seeks to speed execution,

(I) identifying a portion of said design, commonly known as a section of code, which Is operating slower than desired,

(ii) identifying that design component, which is at a higher level of abstraction In said design hierarchy and which is the parent of a sub-hierarchy in said design hierarchy which includes said identified section of code,

(iii) implementing said higher level design component in a lower level language and

(iv) when the designer seeks to eliminate redundancy, at least one of deleting unused local variables in said design and replacing a local variable used in one subtree of said design hierarchy with a reusable local variable used in a preceding subtree of said design hierarchy;

b2) when the designer seeks to make said design more general,

(I) identifying a component in said design, wherein at least one original parameter of said given component may be replaced by another parameter, said another parameter having the same commonly understood "real world" meaning and representing a set of data structures, which includes the data structure of said original parameter,

(ii) replacing said original parameter with said another parameter and converting said design component into an abstract design component and constructing a class hierarchy, including affiliates that support all data structures associated with said another parameter.

21. A method in accordance with claim 19, wherein said step of constructing a finite set of design components sufficient for implementing all contemplated designs, wherein said finite set of design components consists of lower and higher order design components, and wherein said lower and higher design components collectively provide a sufficient basis for constructing designs that satisfy an arbitrarily diverse set of specifications in some implicitly, but ill-defined domain of problems, further comprises

a2) constructing a finite sample of specifications in said ill-defined domain;

b2) constructing a design component for each specification in said finite sample of step a);

c2) for each said design component, one of converting and not converting said design into a higher order specification;

d2) constructing a higher order design component for each said higher order specification;

e2) one of repeating steps c) and d) and not repeating said steps.

22. A method in accordance with claim 21, wherein said step c) of converting said design into a higher order specification further comprises:

a) for each said design of step b) of claim 21, identifying corresponding Input designs from which said design can be constructed;

be) for each said design of step a), constructing a pair consisting of the specification for said design and the set of input designs of step a) that map into said design;

c3) partitioning said pairs of step b) into equivalence classes using at least one of artificial intelligence, another automated technique and designer judgment; and

d3) converting each said equivalence class of step c) into a higher order specification, wherein the inputs and output in said higher order specification are at least one of an abstraction and a generalization of a corresponding pair.

23. Given a problem, wherein the input to said problem is the specification for a to-be-implemented design component, together with a finite set of available components, and the goal in said problem Is a design that satisfies said specification, a method in accordance with claim 19 wherein the step of deriving said design from said finite set of available components further comprises the steps of:

a2) finding a component in said finite set of available components, which matches said problem, wherein the output of said component contains the specification of said problem at a specified level of outputs in said specification said initial level being one;

b2) when said component is found, then applying said component to said Input to said problem and said set of available components and adding the design generated by said component to the set of available components and decrementing the level of outputs in said specification which said finding of step a) searches;

c2) when said component of step a) is not found, then incrementing the level of outputs at which said finding searches; and

d2)repeating said steps a2)a) through c2) until one of said component of Step a) is found and said level of outputs reaches some prescribed number.

24. A method in accordance with claim 23, wherein said problem may be any of a plurality of problems, wherein the input to said problem may be any given, the set of available components may be any set of problems, the goal in said problem may be any goal and a solution to said problem is a design that operates on said given and generates an output that satisfies said goal of said problem.

25. A method in accordance with claims 23, wherein said step of finding a component whose specification matches said solution, further comprises:

a) a) when a component has the desired behavior but the interface is not in a prescribed canonical form, then semantically wrapping said component to conform to said canonical form.

26. A method in accordance with claim 1, wherein a specification hierarchy is constructed from a design hierarchy constructed in accordance with step b), wherein each kind of design refinement corresponds one-to-one with a unique specification refinement, said method comprising the steps of:

a1) when said design hierarchy has no explicit specification, select the top level input and output variables in said design hierarchy as said specification;

b1) for each refinement in said design hierarchy of each said variable, identifying the refinement type, the children variables of said refinement, and the abstract values of said children variables in said design refinement:

c1) converting the type of said design refinement into the corresponding specification refinement type and adding said converted refinement type, including its children variables and abstract values, to said specification hierarchy; and

d1) repeating steps b1) and c1) until one of all said children variables have been identified and converted and said specification has otherwise been deemed by some human or intelligent agent to be sufficiently detailed.

27. A method by which designs constructed in accordance with step h) of claim 25, wherein said design can be utilized within or in conjunction with one of a plurality of existing designs, components, applications and operating systems, said method comprising at least one of the steps of:

a) executing said design;

b) when one of said application and said operating system calls a component corresponding to said design, then converting calls from one of said application and said operating system into calls on said design; and

c) when said design calls a design component, which can be implemented using at least one of the steps of calling an existing design, calling an executable component, calling an application and calling an operating system, then replacing said design component in said design with one of said existing design, component, application and operating system.


Description

BACKGROUND OF THE INVENTION

Given the growing complexity of modern software systems, there is an increasing demand for easier ways of locating errors and for ensuring or verifying the correctness of software designs. Tools for accomplishing these purposes range from the simple type checking provided by many compilers and source code analysis tools to formal verification proofs.

The former, unfortunately, miss many common errors. The latter involve constructing rigorous mathematical proofs. Such proofs are extremely difficult to construct with automation research limited to small programs. Despite limited successes, these solutions do not scale to non-trivial software. Consequently, formal verification is rarely used in practice. Jackson (see U.S. Pat. No. 5,423,027) has proposed a compromise solution. This approach does not eliminate errors as does formal verification, but it lends itself to automation and does catch more errors than simple type checking (during compilation). This is accomplished via optional programmer input and an automated checking tool that references various aspects (e.g., properties) of objects (types) rather than variables. The invention involves (a) specifying input and output values of these aspects for each procedure in a program, (b) examining and constructing an approximation of actual dependencies in procedural code and (c) comparing the approximate dependencies with the procedural specifications. The comparison of specifications with the approximations in the procedural code is used to identify omissions in the latter.

The above approach, in effect, is concerned with better ways to check given source code. The Jackson invention does not deal with design issues, cannot guarantee correctness and any concrete realization must address language specific issues.

Gaboury (U.S. Pat. No. 5,481,717) takes a different approach to formal verification. In parallel with our own efforts, Gaboury has disclosed a method for verifying a computer program in relation to its specification. This method is an extension of an existing Binary Decision Diagram algorithm to compare Finite State Machines, and involves comparison of a program and its specification after they both have been represented as logic programs (e.g., in Prolog, using variables and production rules). The basic method disclosed by Gaboury involves: (a) converting two logic programs into corresponding finite state machines, each having internal states, input values and output values, (b) determining the data types of the variables of the logic programs, (c) converting all expressions in the logic program into disjunctive form, (d) expanding procedure calls into procedure bodies, (e) translating the programs into transition functions and (f) determining whether equivalencies between internal states, input values and output values of the two finite state machine descriptions produce equivalent output values for all equivalent input values and states.

As with the Jackson method, however, this method requires formal verification of both specifications and solutions as parameterized logic programs. This method offers one way to solve the problem of unspecified parameters (e.g., in specifications); it effectively replaces details with unspecified parameters for comparison purposes. Although this approach is theoretically plausible, it solves the problem by introducing non-intuitive elements into the specification. By way of contrast, the present disclosure has no requirement that either the specification or the program be represented as a logic program. Specifications deal exclusively with input-output behavior and implementations are represented in traditional (sequential/event driven) real world terms.

In a similar vein the Gaboury disclosure eliminates syntactic differences such as variable names. Syntactic differences involving names, for example, although of limited importance in mathematical formulations, play an important role in human understanding. They are essential in building and maintaining complex, real world software. In effect, the Gaboury approach is not likely to be very helpful either during development or during ongoing maintenance. Moreover, the method cannot be used with many kinds of software because comparison of unrestricted logic programs is an undecidable problem.

Rather than verifying correctness after implementation, other approaches attempt to improve software quality by design. One approach derives from contemporary software and system development methodologies (e.g., structured analysis [Hatley & Pirbhai, 1987; Yourdon, 1989]; information engineering [Chen, 1976; Flaatten et al, 1989], object oriented design [Booch, 1994; Rumbaugh et al., 1991]). Most of these methods make a sharp distinction between design, implementation and testing. Implementation, for example, comes after design, and although limited consistency checking is included in many CASE tools, correctness testing is postponed until after implementation. Given exponential growth with complexity in the number of paths to be tested, these methodologies make complete verification testing impossible. For example, after implementation, a simple system having a sequence of 100 binary decisions has 2.sup.100 independent paths to be tested. Clearly, it is impossible to fully verify the correctness of even moderate sized programs empirically. However, only 300 paths are involved if testing is done at successive levels of refinement (see Scandura, 1992, p. 92). In this case the number of paths goes up only additively.

Traditional approaches to proving that software is correct (i.e., does exactly what specifications call for) require proving that the program will produce the desired output for any given set of inputs. This is true whether one attempts to prove correctness after software has been developed (e.g., Hoare & Shepherdson, 1985) or at successive stages of process refinement as in clean room engineering (Linger, 1994) and higher order software (Martin, 1985).

In contrast to the above, clean room engineering is based on mathematically rigorous correctness methods and has been successfully applied in a number of projects. The approach, however, involves manual verification (e.g., Linger, 1994; Linger, Mills & Witt, 1979). It is based on a hardware metaphor (where one knows all inputs and outputs before designing a system) and has not been shown to scale well to larger software systems. In recognition of the complexities of verification during development, "clean room" puts a heavy emphasis on software testing after implementation based on statistical sampling methods. Martin (1985) describes an alternative approach developed by Hamilton and Zeldin that is based on provably correct design constructs. The latter, however, also requires formal, non-intuitive modeling techniques, which again are not easily applied to large systems. A fundamental limitation of these methods is they assume that all input and output variables to a proposed system have been specified prior to system analysis (e.g., Martin, 1985, p. 50). This complicates design, especially in larger systems. Furthermore, no provision is made in the Martin (1985) approach for formulating precise specifications. Even where this is done, proving correctness is impractical using these methods because the numbers of input-output relationships to be verified is simply too large.

In contrast with Jackson, this invention deals with software design as well as code. Like other software engineering methodologies, its goal is to make software easier to understand, construct and modify. Unlike software engineering methodologies, however, including the more complete consistency checking provided by cleanroom engineering and Hamilton's "provably correct designs", the method proposed herein not only ensures consistency but it guarantees that software is correct by design and lends itself to automation.

Rather than catching errors after implementation, like Jackson, the latter more promising approaches attempt to ensure correctness by design. Cleanroom engineering, for example, has been successfully applied in a number of projects, but to date the approach involves manual verification (e.g., Linger, 1994; Linger, Mills & Witt, 1979). The "correctness theorem" (e.g., Linger, Mills & Witt, 1979, p. 222) shows that proving that a sequence refinement is correct involves direct comparison with the parent's behavior; a selection refinement involves checking at least two (branching) conditions; iteration requires checking three conditions, the third involving loop termination. This theorem simply states what must be shown to prove correctness--it refers essentially to syntactic consistency of a refinement. We refer to a refinement, which syntactically distinguishes these conditions, as "internally consistent". Actually demonstrating the correctness of any such condition with respect to its specification requires formal proof.

Martin (1985) describes an alternative approach to internal consistency developed by Hamilton and Zeldin. Although variously referred to as "provably correct" or higher order software, this approach involves showing designs to be consistent in the sense that each type of refinement follows certain specified rules. For example, a sequence refinement is consistent if and only if the input of the first sequence element is the same as the input to the parent, the output of the last sequence element is identical to the output of the parent and the output of each successive sequence element serves as input to the next element in the sequence. This approach requires the designer to adopt an entirely different mind set. Rather than using sequence, selection and iteration refinements, for example, the designer must adopt refinement types that are unfamiliar, and less easily applied to large systems. Even more basic, the approach does not address the issue of correctness with respect to independently defined specifications.

Although proving consistency in clean room engineering and in higher order software appears to be relatively straightforward (e.g., Linger, Mills & Witt, 1979; Martin, 1985), ensuring consistency in practice has been difficult because all input and output variables must be specified prior to system analysis. Except possibly in well organized data base applications, it is impossible even in relatively small systems, to know all possible input and output variables before a system has been implemented. Even here, checking for internal consistency gets increasingly cumbersome as systems get increasingly complex. Indeed, practically speaking, it would be impossible to achieve this goal with the large, complex systems on which government, industry and increasingly the home user depend.

Although processes are refined hierarchically, this is not the case with data. Both clean room and higher order software are based on a hardware metaphor where it is assumed that all inputs and outputs are known from the beginning (of design). In order to design a chip, for example, one must first know all of the input and output pins.

Apart from the hardware metaphor, reliance on a priori specification of all input and output variables derives from the tendency of most software engineers to think about software in terms of code. While the number of inputs and outputs to be considered at each stage of analysis can be reduced via top-down analysis, existing software engineering technologies (e.g., Chen, 1976) deal with data independently of process. There is no relationship between the abstraction levels in data and those in the corresponding design. Although contemporary object-oriented approaches (Booch, 1994; Rumbaugh et al, 1991) allow both data and process abstraction, processes are syntactically grouped according to data. Real world modeling, however, frequently and naturally calls for grouping data (structures) by semantically equivalent processes defined on such data. Furthermore, none of the above methods comes close to ensuring logical consistency of actual software.

In order to ensure such consistency, the numbers of variables to be considered during the design and implementation processes becomes unwieldy. Consequently, neither clean room engineering nor higher order software has been shown to scale well to larger software systems. In a recent talk, Sherer (1994), for example, has admitted a practical limit of 10,000-12,000 lines of code. To summarize, both methods assume that all inputs and outputs to a proposed system have been specified prior to system analysis.

Proving correctness with respect to external specifications is an even more daunting task. One must show that every possible sequence of input values to the system produces the correct output. Indeed, in describing higher order software, Martin (1985, p. 271-2) explicitly disavows the possibility of proving correctness with respect to external specifications. Similarly, in clean room engineering the task is viewed as being of only theoretical interest.

Proving correctness requires showing that a program produces the same behavior as designated in some independent specification. Proving correctness is impractical not only with complete systems but even where individual refinements must be shown to be consistent and correct (children with respect to their parent) during the design process. In standard approaches, the number of inputs and outputs invariably is simply too large. It is difficult in practice to keep track of the variables allowed--much less to prove that each instance of behavior is correct. In recognition of this fact, "clean room" puts a heavy emphasis on software testing after implementation based on statistical sampling methods.

To summarize these two points, internal consistency involves syntactic constraints between the parent in a refinement and its children. Ensuring internal consistency gets increasingly complex as the number of input and output variables gets larger and larger. Proving correctness of any given refinement is even more complex. This requires that refinement components produce the same results as their parent. This is impractical (if not impossible) with realistic software systems.

A third limitation of current approaches to proving correctness derives from the traditional standards used for this purpose. In order to prove that a program is correct with respect to its specification, one must begin by assuming that some basic set of functions operates correctly (e.g., the standard arithmetic functions of addition, subtraction, etc.). Typically, the same basic set is predetermined and assumed irrespective of the program or application. All proofs are effectively judged relative to a single standard. For example, consider some specified functional relationship and a program

d:=prog(a, b, c)

which maps each set of inputs to the function (i.e., a, b and c) into the specified response (i.e., d). In this case we must prove that the program prog produces the same behavior as the specified function.

That is, the function prog is correct if and only if prog's behavior can be produced using the basic set of assumed component functions (e.g., +, /) of which prog is composed. Notice that these component functions are assumed to operate on the same input values and/or values derived therefrom, as does prog itself. This is true at all levels of refinement in a design for prog. In each case there exists a unique output integer d.sub.i for each set of input integers a.sub.i, b.sub.i, c.sub.i, to the specification. In general, the task of proving correctness can be reduced to showing that the behavior associated with any given level in a refinement hierarchy is identical to that of the corresponding refinement, and that, ultimately, with a terminal refinement consisting entirely of the assumed basic functions. Although proving correctness of individual refinements is much simpler than proving that a (complete) design or program is correct, even the latter is a very demanding task. There is no known general-purpose solution that can accommodate arbitrarily complex components.

We have identified three basic limitations of current formal verification techniques as they are presently formulated: (1) the implicit assumption of a hardware metaphor which implies that all input and output variables must be specified prior to process refinement, (2) the assumption that the values of the variables referenced at all abstraction levels in a program design must be identical and (3) the practice of proving correctness with respect to a predetermined set of basic functions. The first limitation leads to unmanageable numbers of input and output variables in developing logically consistent designs. The second leads to the impossibility of devising any universal method for proving correctness. The third leads to impracticability with non-trivial systems. Any one of these limitations makes standard approaches impractical for use in actual practice.

In addition, current approaches have been unfamiliar and nonintuitive requiring highly specialized training. Indeed, existing formal techniques deal primarily, if not exclusively, with internal consistency. The correctness of programs cannot be assured. This requires comparing specifications (what a program is supposed to do) with what the program actually does. The sheer variety of possible inputs and outputs makes this impractical.

Cognitive Research

A completely different approach to the problem of system design derives from earlier research on human cognition and, in particular on structural (cognitive task) analysis by the present inventor. Structural analysis of a problem domain (corresponding to a specification in software engineering) requires specifying one or more "solution rules" (corresponding to language independent designs) which is sufficient for solving all problems in the domain. Each solution rule consists of a triple, domain (inputs), range (outputs) and procedure or operation. Rather than assuming (as standard software engineering approaches) that one knows all inputs and outputs at the beginning of analysis, these inputs and outputs are assumed to be known--that is, objectively specified--only to the same degree as the corresponding operation.

Structural analysis evolved over a long period of time and has been applied in the analysis of a wide variety of problem domains for both basic research on human cognition and the design of instructional materials (e.g., Scandura, 1971, 1973, 1977; Scandura, Durnin, Ehrenpreis et al, 1971; Scandura & Scandura, 1980). Most of the earlier research was directed at the identification of higher order rules--that is to identifying rules which operate on and/or generate other rules. Given a problem domain, typically one so complex that no one solution rules could reasonably be identified, this method involved first identifying subsets of the domain for which solution rules could readily be specified. Then "parallels" (i.e., analogies, similarities, etc.) among the various solution rules were used to specify higher order rules making it possible to automatically generate these and/or new solution rules (for new problem subsets) from more basic rules. Scandura (1982, 1984a) both summarizes much of this research and more fully details a method for specifying individual lower and higher order rules.

As described in Scandura (1984a), this method (which at the time was evolving completely independently of parallel work in software engineering) involved analyzing both data (problems) and solution processes in parallel from the top down. In essence, representative problems where solved step-by-step at successive levels of analysis from the top-down. Essentials of this method are best understood from the description given by Scandura (1984a).

" . . . top-down analysis has the advantage of making clear to the user the "flow of control" of the solution procedure. It also allows for arbitrary levels of procedural refinement. Top-down analysis deals with specific problem characteristics only in an informal heuristic manner . . . there is a close connection between the domain of a problem and the corresponding solution procedure. Informal awareness of either one necessarily influences specification of the other. It would be highly desirable to make such awareness public.

Step-by-step analysis, by way of contrast, deals more directly with particular problems and solution steps. It also provides a potentially mechanical means of translating sequences of solution states into general procedures. But, there are sometimes ambiguities as to overall structure and flow of control. This ambiguity derives from lack of specification of the interrelationships between problem domain and procedural representation.

. . . a method of analyzing procedures which combines the advantages of top-down and step-by-step analyses could go a long way toward providing the kind of objective and systematic method we seek.

One method is to employ problem and step-by-step analyses at progressive levels of top-down analysis. In this method the problem representation and the solution algorithm are refined in parallel. We illustrate the method in the context of subtraction, while emphasizing general principles.

I. First, the analyst would construct a column subtraction problem (on intuitive grounds). For example: ##STR1##

This problem, then, would be used to specify the corresponding problem domain to a first level of approximation. Toward this end, the analyst would employ the following procedure:

A. Identify the minimum essential characteristics of the problem which make it a problem of the desired type. Specifically, with regard to column subtraction, the "-" sign, the ".sub.------ ", the two numerals (437 and 275), where the one on top designates a number that is larger than the bottom one, constitute the "given" of the problem. The unspecified solution numeral (the one that goes under ".sub.------ ") specifies the problem goal.

B. Identify those input (i.e., "given") characteristics which may vary without changing the essential nature of the problem. By essential nature is meant those qualities of the problem which make it solvable via the, at this point, only intuitively known solution procedure associated with the problem. In the case of a column subtraction problem these characteristics are the two numerals, where the top one is greater than or equal to the bottom one.

C. Replace each such input characteristic with a variable and specify the sets of allowable values of these variables. In the case of subtraction, there are two domain variables: Top number (T), and bottom number (B), such that T>B.

Carrying out Steps A, B and C yields a problem schema which formally represents the domain of the associated rule to a first approximation.

Given the rule domain, the analyst next solves the original problem as specified and uses the resulting states as a basis for specifying the solution procedure. More exactly, the analyst uses the following method.

D. Take the specified values of the input variables of the original problem as inputs and apply step by step the solution procedure, which at this point is known only intuitively. In the case of the above subtraction problem, the analyst just subtracts the two given numbers--i.e., specified the solution. In doing this, he can refer formally speaking only to the variables (numbers) as wholes. Hence, the result is just a pair of states, the subtraction problem before and the subtraction problem after the difference has been written. (The analyst cannot refer to individual digits, for example, since they do not yet exist from the standpoint of the formal analysis. They are known intuitively to the analyst, but have not yet been represented formally. Hence, the digits cannot be referred to explicitly in formally representing the solution procedure at this level.)

E. Go to the first pair of states. (In this case there is only one pair of states: the subtraction problem itself and the problem together with its solution.) Identify the atomic operation corresponding to this pair of states. In this case, let us just call it "subtract." Since these are the only states, we are finished.

F. Identify the "normal" domain of the operation(s) identified, using a variation of step C. Specifically, over what ranges of values can the initial problem state vary without affecting applicability of the atomic operation? Equivalently, what is the corresponding atomic rule? In the present case, the normal domain is just the domain of subtraction problems as specified above.

G. Identify the extra-domain condition, if any, which must be satisfied if this atomic rule (operation--plus domain and implied range) is to be applied. In the context of the intuitively known solution procedure, the "subtract" atomic rule applies to every element in its domain (i.e., to all pairs of numbers where the top one is not smaller than the bottom one). Hence, there are no extra-domain conditions.

Since there is only one atomic rule in the solution procedure, we are finished. The "subtract" atomic rule is THE required solution rule at this level. (Note: This subtract rule consists of the subtraction operation, together with the problem domain and goal.)

II. Reapplication of step-by-step analysis gives a more finely structured rule [i.e., representation] at the next level of top-down analysis. At this point, we may select any problem (including the original one) from the subtraction domain, which we have formally characterized in terms of the "-", ".sub.------ ", and two number variables. (The unspecified solution variable constitutes the range.)

The first phase in refined analysis is to repeat the above steps (A-G) at a next level of detail.

A. What additional characteristics of the problems are referred to in applying the intuitively known solution procedure? In the case of column subtraction problems, columns clearly constitute additional characteristics because solutions proceed column by column.

B/C. Representation of the domain of subtraction problems thereby is enriched by allowing for a variable number of columns (from one up).

D. Next, the given problem is solved taking columns into account. That is, the subtract atomic rule (i.e., component) is refined. It is replaced with a sequence of states of the form ##STR2##

each state having one more digit zi than the preceding state (indicating that one more column has been subtracted).

IMPORTANT: The (xi, yi) column and the (zi-1) are encircled to distinguish these ACTIVE components of the problem (at state i) from those that are inactive. Active components are those that are attended to (i.e., operated on by the associated atomic rules) at any given step (state) of a problem solution. The square surrounding "zi" denotes the goal to be achieved at state i. (Notice at the initial level of analysis that both of the two given numbers are "active" in this sense.)

E-F. Next, the transition between the first pair of states is specified by the atomic rule "Go to the first column on the right and subtract." In turn, each successive state transition is specified by the atomic rule "Go to the next column to the left and subtract." The domain of this atomic rule is a problem schema. The active components in the schema are the just determined (last) digit in the partial difference (i.e., zi), which determines what the next column to the left is, and that column, which determines the column difference. As before, the result is a sequence of atomic rules.

G. Here, we see that the first atomic rule applies to every subtraction problem. There are no "extra-domain" conditions. Each of the succeeding atomic rules, however, is applied only when there is another column to the left (otherwise the problem is finished). Consequently, we introduce the extra-domain condition "There is a next column."

Suppose now that we have applied Steps D-G to just one problem (it could be the original one). At this point, we have a sequence of atomic rules, together with extra-domain conditions immediately preceding all but the first.

Where such extra-domain conditions exist, further analysis is required.

H. Examine each of the extra-domain conditions to determine whether some succession of the following atomic rules ever generates an output state whose active components satisfy the extra-domain condition (as well as the domain of the first atomic rule following the extra-domain condition). The given problem has three columns, so there is another column after subtracting each of the first two columns.

Indeed, by choosing a suitable problem, this extra-domain condition can be satisfied any number of times. Irrespective of how many columns are chosen, there is always some problem with one more column. What this means, effectively, is that any number of applications of the atomic rule "Go to (i.e., attend to or activate) the next column and subtract the column" is possible (with some problem). Alternatively, reapplication is indicated until an output state is generated (e.g., a fully solved problem) that no longer satisfies the extra-domain condition--namely, "There is no next column." Moreover, when this extra-domain condition is no longer satisfied, the process STOPS so the underlying procedure is completely defined.

Whenever such reapplication (of some succession of atomic rules) is indicated, the extra-domain condition in question defines an iteration. Equivalently, the successive atomic rules may be replaced by a loop, given a more efficient representation of the solution procedure.

More generally, one checks first to see if an extra-domain condition defines an iteration (i.e., loop). If it does not, and only if it does not, then it necessarily defines a selection.

III. Again, one can reapply step-by-step structural analysis to components of the Level II solution rule. As in the previous more informal top-down analysis, the "Subtract the column" atomic rule is one obvious candidate.

A. In this case, close attention is paid to what is done in subtracting individual columns, again building on the analyst's informal awareness of the solution procedure. In subtracting the first column of the given problem ##STR3##

we get the partial answer (2) directly. In the next column, however, we must "borrow" in order to subtract (3-7)--where borrowing involves reference to the top digit (4) in the next column.

In effect, at this third level of analysis it is necessary to distinguish individual digits in the subtraction problem. (The digits then constitute the "additional" characteristics.)

B/C. Each of these digits, in turn, may be replaced with a variable ranging from 0 to 9 (subject to the overall constraint that the top numeral represents a larger number than that bottom one).

D. Next, we detail more fully the sequences of states associated with designated components (e.g., the subtract column atomic rule) of the previously determined solution procedure. This is accomplished by solving the above, more finely structured problem, giving explicit attention to the "additional" characteristics specified. Consider a case where the subtract column component rule is applied to a column which requires "borrowing." For example, consider the sequence of states ##STR4##

where the encircled entities are active and the square designates the goal at that state.

E/F. In turn, the successive pairs of states correspond to the atomic rules: "Go to the top digit in the next column and borrow," "Add the borrowed 10 to the top number in the original column," "Subtract the original column (using basic facts)." The natural domains, respectively, are: "Problems with next column," "All problems in which a borrow has been initiated," "Columns where the top digit(s) are not smaller than the bottom digit(s)."

G. There is an extra-domain attached to the first of these atomic rules. Specifically, this rule is applied only where the "top digit(s) in a column is less than the bottom digit."

At this point, then, associated with the Level II "subtract the column" rule (as illustrated with the second column of the given problem), we have a sequence of subatomic rules, preceded by an extra-domain condition.

H. The next question is whether some consecutive subsequence of these subatomic rules, including the first one, ever generates a state whose active components satisfy the extra-domain condition (i.e., top digit(s) less than the bottom one). Going to the next column and borrowing (the first of the subatomic rules) certainly does not generate such a state, and neither do the first two subrules--or all three.

Once one has borrowed (after the first two subrules have been applied), the top digits (now there are two digits at the top of the column) always represent a number larger than the bottom one. Consequently, the extra-domain condition does not apply. After subtracting the original column (applying all three subrules in turn), we have done the equivalent of the Level II "subtract the column" rule so control moves to the previously defined iteration "While there is a next column."

For these reasons, the extra-domain condition "Top digit(s) less than bottom one" is NOT an iteration. Therefore, it defines a selection.

Moreover, analysis of subtraction in the first column of the problem tells what happens when the complementary extra-domain condition is satisfied (i.e., when the top number is greater than or equal to the bottom one). In this case, one simply subtracts using the basic facts. Hence, the selection is completely defined. If the top number is greater than or equal to the bottom one, we subtract using the basic facts; otherwise, we borrow and then subtract.

Notice that the entire selection construct, including the alternative atomic rules, is contained within (bounded by) the iteration condition ("There is a next column") defined at the previous level of analysis.

It also is important to observe that the iterations and selections identified in an analysis need not be fully defined. In the present case, for example, we were fortunate (or had the clairvoyance) to select a problem that involved both borrow and non-borrow columns. If this had not been the case, then our analysis would not have been complete. We would have known that a selection was involved, and we would have known one of the two alternatives. But we would not have known the other alternative.

In circumstances such as this, one must repeat the above process with new problems--hopefully ones in which the previously unused conditions are satisfied. The results of each new analysis, then, must be combined level by level and component by component with the initial one. The process must be continued with new problems until the underlying procedure is completely defined. In this regard, notice that there always is a finite number of problems that will suffice for this purpose because the number of conditions in any procedure must be finite."

Nothing new is involved in extending the above process to borrowing across O's. This can be accomplished by introducing a selection (within the borrow selection).

Clearly, the goal of structural analysis was to develop an explicit, repeatable and potentially automatable process for analyzing content. Although the context was completely different, these goals clearly are quite analogous to those common in most software engineering methodologies (e.g., Booch, 1994; Rumbaugh, et al., 1991; Yourdon 1989). Structural analysis, however, differs from traditional software engineering methods in several fundamental ways.

First, unlike software engineering, the original goal of structural analysis (e.g., Scandura et al, 1971; Scandura, 1982, 1984a) was to analyze complex, sometimes called ill-defined domains. III-defined domains are problem domains that transcend any given finite set of solution rules. That is, no finite set of rules can be found that is sufficient for solving all problems in the domain. Structural analysis was designed to address this problem. The basic idea was to identify sets of rules and higher order rules that collectively would make it possible to solve qualitatively greater numbers of problems in ill-defined domains than would otherwise be possible.

As described in Scandura (1984a, pp. 23-7), structural analysis of complex domains begins by selecting a representative sample of solvable problems in the domain. Next, a solution rule is derived (e.g., by the above methods) for each type of problem. This results in a set of solution rules that is sufficient for solving problems similar to the sample problems. A key idea in structural analysis is that one can usually find parallels between different solution rules. These parallels typically reflect the fact that the solution rules can be derived via the same higher order rule. Moreover, as indicated by Scandura (1984a, p.23), solution rules correspond to higher order problems, namely problems for which they are solutions.

Structural analysis takes higher order problems as new starting place and the process (of identifying solution rules) is repeated. This time the results are higher order rules which operate on and/or generate other rules. The process of moving to higher and higher levels can continue indefinitely.

Each iteration leads to rules which are simpler individually, but which collectively are more powerful in the sense that they account for qualitatively broader varieties of problems. That is, when used in combination, higher order rules may generate new rules which in turn make it possible to make it possible to generate new solution rules and/or problems. Structural analysis also takes into account that solution rules derived in this way may include previously identified rules. Hence, a final step in structural analysis is to eliminate redundant rules. A number of applications of structural analysis are summarized in Scandura (1982).

A second major difference between software engineering and structural analysis is that data and process in structural analysis are refined in parallel (from the top down). In structured analysis, for example, all of the required inputs and outputs are assumed from the beginning. In structural analysis, data abstractions are defined only as needed in process refinement.

A major limitation of structural analysis as regards software engineering is its informality. Ambiguities, omissions, and lack of scalability to complex systems, make it unsuitable as a foundation for automation and even less for ensuring the correctness of complex software systems. Indeed, the possibility of ensuring correctness of an analysis, whether internal or external, was not seriously considered.

What is needed is a repeatable, formally verifiable and intuitive process that is easy to use. Specifically, this process must provide human designers with automated support making it easy to implement arbitrarily large, complex software systems that are both internally consistent and logically correct (that are consistent with specifications).

Domain Analysis

Domain analysis (e.g., Krut et al, 1996) has become a new and active area of research in recent years. Like structural analysis, domain analysis is concerned with ill-defined domains (e.g., word processing, software engineering [tools]). Most software engineers, however, are not familiar with prior work in structural analysis (understandably using such terms as analysis, design and application). Consequently, whereas structural analysis would seek parallels between different (lower order) applications (attempting to represent these parallels in terms of higher order designs), domain analysis has taken a different tact. Thus, instead of seeking ways to identify higher and lower order components for constructing applications from more basic components, domain analysis looks for similarities and differences between alternative applications. Although such information can be useful in designing specific applications in a given domain, domain analysis does not provide the information necessary to help automate the construction of such applications.

Control Mechanisms

A final line of prior research is relevant to the present disclosure. This pertains to the way in which given sets of rules are used in solving problems (e.g., see Winston, 1977, pp. 129-156). Given a problem together with a set of rules (and higher order rules), the question is how those rules are to be used in attempting to solve the problem. A variety of mechanisms have been proposed for searching and controlling interactions among available rules (or what are commonly called productions). Various combinations of forward and backward chaining have been most common. Forward chaining involves selecting rules in order beginning with the problem given. Backward chaining begins by selecting a rule that would solve the problem given that rule's input.

Two basic approaches have been widely used. An approach called means-ends analysis bases the selection and use of rules on computed differences between given and goal states. The success of this approach depends on finding a useful and universally applicable way to compute differences.

Another approach is based on sets of independent productions, each consisting of an operation and a set of conditions defining its domain of applicability. Rather than relying on differences between goals and what is given, production systems allow more flexible control. Individual productions are triggered by currently available inputs. Productions are selected by matching what is available with production domains. The order in which productions are listed can be important. When more than one production matches a given situation, conflicts can be resolved in any number of ways: first production found, most complete match, most recently used, etc. Productions and problem categories can also be partitioned into classes.

Productions systems are fully modular. New productions may be added as desired. Given productions can even control interactions among other productions. Within the artificial intelligence community, productions are typically identified by a process called "knowledge engineering". This is a largely informal process whereby a domain expert identifies and represents individual items of knowledge as productions.

Structural (cognitive task) analysis (e.g., Scandura et al, 1971, Scandura, 1982) represents a more systematic approach to the same problem. Two major characteristics distinguish structural analysis: (a) operations identified via structural analysis may be refined into their constituents and (b) higher order rules which operate on and/or generate other rules are identified in a systematic manner. The former supports the inclusion of productions with non-elementary operations, which we call solution rules. The explicit introduction of higher as well as lower order rules in the latter provides an explicit basis for predicting coverage of the initial problem domain. Lower order rules can only be combined and/or otherwise modified via explicitly identified higher order rules. The order in which rules are listed plays no essential role.

The Structural Learning Theory (e.g., Scandura, 1971, 1973, 1977) builds on these two characteristics of structural analysis. This theory includes corresponding mechanisms for: (a) assessing the knowledge state of individual human beings relative to a set of rules and (b) controlling interactions between rules. Specifically, in the latter case, a generalized "goal switching" mechanism was proposed to control interactions between higher and lower order rules: If an available rules applies, use it. Otherwise, control automatically switches to a higher order goal of deriving one. Higher order rules are applied in this case, thereby generating new rules that apply to the original situation. This mechanism was later modified (Scandura, 1993) so a rule is applied only where exactly one rule applies (no more and no fewer). Otherwise, higher order goals are invoked. This modification provides an explicit base for rule selection in which more than one rule can be used.

The goal switching mechanism appears to be straightforward and potentially applicable to all problem domains. Moreover, strong empirical support suggesting its universal availability to human beings has been found in a variety of research studies (e.g., Scandura, 1977; Scandura & Scandura, 1980). Nonetheless, this mechanism has never been detailed with the precision necessary for implementation in a devise with memory in a completely general manner. Goal switching control and higher order rules, for example, are intrinsically intertwined in the computer simulation reported by Wulfeck and Scandura (in Chapter 14 of Scandura, 1977). In the original goal switching control mechanism, higher order goals are defined as sets of rules for solving lower order problems (see Scandura, 1973, p.293). Because higher order goals and higher order rules are defined in the same way, it is impossible to construct an independent test to determine whether a given higher order rule satisfies the higher order goal.

SUMMARY OF THE INVENTION

Software development and maintenance involves assembling components, sometimes support during the design process but more frequently not. In neither case has it been possible to ensure that systems will operate correctly. At best, verifying program correctness (component assembly) requires formal techniques, which in practice have been difficult to apply. Existing design methodologies have suffered from incompleteness. When wisely used, these methods help avoid certain problems but leave the most difficult--ensuring internal consistency and correctness (with respect to specifications) in a manner that scales well to large as well as small systems.

The invention disclosed herein are intuitive, repeatable, formally verifiable, and automated methods, that are practical and easy to use, for developing and maintaining software systems. Specifically, these methods provide human designers with automated support for specifying, designing, implementing and maintaining arbitrarily large, complex software systems that are both internally consistent and logically correct (i.e., consistent with external specifications). These methods avoid the "combinatorial explosion" problem and minimize the need for human input. This is accomplished, respectively, by constructing specifications and solution designs via simultaneous and successive levels of data and process refinement, by capitalizing on semantically meaningful relationships between different kinds of specification and design refinement, and by partitioning input and output values and types into equivalence classes.

Disclosed herein are methods lending themselves to automation for: (1) building unambiguous specifications in a manner which dramatically reduces the number of distinct input and output variables that the human analyst needs to specify and even more so the numbers of distinct values of said variables which need to be specified and hence checked for correctness, (2) constructing solution designs by successive refinement of solution rules and/or conditions in a manner which minimizes the need for human input, ensures internal consistency, minimizes side effects and guarantees that the design produces abstract behavior consistent with specifications, (3) implementing designs in a manner that both automates key aspects of the process and guarantees correct execution, (4) constructing and/or updating virtual solution rules in an existing or implied design hierarchy from terminal solution rules in a solution design (in that hierarchy) by successive abstraction in a manner which minimizes the need for human input, ensures internal consistency and minimizes side effects, (5) converting programs written in a structured language to a canonical form to which the above method 4 applies. Also included are methods for: (6) making designs more efficient, (7) allowing designers to assemble virtual and/or real components in combination with the above methods at any desired level of abstraction, (8) incorporating the above methods in procedural, event-driven and client-server architectures, (9) accommodating systems of any size, (10) creating library and object resources, (11) creating test cases, (12) constructing an interpreter that supports a behavioral approach to debugging OO designs, (13) implementing design elements even when no matching component exists and (14) identifying components and higher order components which collectively make it possible to solve problems in arbitrarily complex domains.

BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DISCLOSURE

FIG. 1. Flexform representing specifications for cleaning a room.

FIG. 1A. The room1A.spc Flexform representing an INPUT variable "room" and two catagories: "kitchen" and "bedroom".

FIG. 1B. The room1B.cat Flexform representing a room category structure and its sub-categories kitchen and bedroom, where room and its sub-categories are non-atomic.

FIG. 1C. The kitchen1C.category Flexform representing the specialized portion of the room data structure of FIG. 1B needed to characterize kitchens.

FIG. 1D. Category hierarchies represented in integrated tree-views.

FIG. 2. Flexform representing a design for cleaning a room, along with top-level sections containing supporting information.

FIG. 2A. Flexform representing the RULE SPECS subsection of the DETAILED SPECIFICATIONS section in the clean room design.

FIG. 2B. Flexform representing the DATA SPECS subsection of the DETAILED SPECIFICATIONS section in the clean room design.

FIG. 2C. Flexform representing top levels of an executable version of the clean room design with DOMAIN, RANGE and INTERMEDIATE, and INCLUDED FILE and GLOBAL sections expanded.

FIG. 2D. Flexform representing the top-level class in a class hierarchy.

FIG. 2E. Flexform representing the kitchen class that inherits from the room class of FIG. 2D.

FIG. 3. Flexform representing a sample problem (data) on which the clean room design of FIG. 2 operates.

FIG. 4. Flexform representing a sample project for cleaning houses.

FIG. 4A. Tree view showing the clean_house project Flexform of FIG. 4 with relationships to referenced Flexforms.

FIG. 5. Flexform representing a clean_component_library containing "vacuum" and "make" designs and their specifications.

FIG. 5A. Flexform representing specifications for vacuum carpeting in clean_component_library.

FIG. 5B. Flexform representing the top level of a design and external specifications for vacuum carpeting in clean_component_library.

FIG. 6. Flexform representing a room class containing a data structure (set of components) which represents a room, two designs that operate on this data structure and their specifications.

FIG. 7A. Flexform representing the clean_house_spc specifications in the clean house project shown in FIG. 4.

FIG. 7B. Flexform representing the clean_house design corresponding to the specification in FIG. 7A.

FIG. 8A. Flexform representing a high level of the specifications callback, with emphasis on automatic (called continue--suggest next) mode.

FIG. 8B. Flexform representing a high level of the design callback, with emphasis on automatic (called continue--suggest next) mode.

FIG. 9. Flexform representing the high level case selection used in building various types of procedural refinement.

FIG. 9A. Flexform representing a high level design for building sequence refinements.

FIG. 9B. Flexform representing a high level design for building navigation sequence refinements.

FIG. 9C. Flexform representing a high level design for building parallel refinements.

FIG. 9D. Flexform representing a high level design for building selection refinements.

FIG. 9E. Flexform representing a high level design for building case refinements.

FIG. 9F. Flexform representing a high level design for building repeat . . . until refinements.

FIG. 9G. Flexform representing a high level design for building while . . . do refinements.

FIG. 10. Flexform representing a high level design for updating the parent solution rule in a given design refinement.

FIG. 10A. Flexform representing a high level design for updating sequence refinements.

FIG. 10B. Flexform representing a high level design for updating navigation sequence refinements.

FIG. 10C. Flexform representing a high level design for updating parallel refinements.

FIG. 10D. Flexform representing a high level design for updating selection refinements.

FIG. 10E. Flexform representing a high level design for updating case refinements.

FIG. 10F. Flexform representing a high level design for updating repeat . . . until refinements.

FIG. 10G. Flexform representing a high level design for updating while . . . do refinements.

FIG. 11. Flexform representing the top-level design for checking the consistency of a given design refinement.

FIG. 12. Flexform summarizing the design for checking the abstract or relative correctness of given design refinements.

FIG. 13A. Sample FORTRAN subroutine illustrating both data and process source code.

FIG. 13B. FORTRAN Flexform representing all essential information in the sample FORTRAN subroutine of FIG. 13A.

FIG. 13C. Terminal-level design in this HLD design hierarchy Flexform represents the HLD equivalent of the FORTRAN Flexform shown in FIG. 13B.

FIG. 14. Abstract solution rule fn (Z, X . . . ) representing the combined effects of affiliated operations embedded in the class inheritance hierarchy associated with object Z.

FIG. 15. Preferred method for constructing and maintaining specifications.

FIG. 16. Preferred method for constructing and maintaining designs.

FIG. 17. Preferred method for constructing and updating a design hierarchy from a design or design hierarchy.

FIG. 18. Preferred method for combining top-down and bottom-up methods for building and maintaining designs.

FIG. 19. Preferred method for implementing solution rules in designs constructed via the method of FIG. 16 (for constructing and maintaining designs).

FIG. 20. Preferred method for converting a procedural module (program, subroutine or method) written in another programming language to canonical form.

FIG. 21. Preferred method for making a design more compact, more efficient or both.

FIG. 22. Preferred method for incorporating designs into various architectures.

FIG. 23. Preferred method for building, maintaining and implementing behavioral models.

FIG. 24. Preferred method for creating and executing test cases.

FIG. 25. Preferred method for interpreting designs.

FIG. 26. Preferred method based on a universal (component independent) control mechanism for deriving executable designs for implementing virtual solution rules.

FIG. 26A. Elaboration of the initialize_problem_and_apply step in FIG. 26.

FIG. 27. Preferred method of structural analysis for constructing a set of designs.

FIG. 27A. Alternative method of structural analysis offering the designer less specific guidance but allowing more flexibility.

FIG. 27B. Elaboration of the partition_and_construct_higher_order_specs step in FIG. 27.

FIG. 28A. File pull-down menu used in the current AutoBuilder implementation, including the kinds of files supported.

FIG. 28B. Project pull-down.

FIG. 28C. Simulate pull-down, including debugging and support options.

FIG. 28D. Tools pull-down, including Semantic Wrapper, Semantic Integrator options and C code from HDL designs options.

FIG. 29a. Spec Builder Modeless Dialog Box Used in the Current Implementation.

FIG. 29B. Edit design option dialog box.

FIG. 29c. Refine Option Dialog Box.

FIG. 29D. I/O Behavior option dialog box.

FIG. 29E. Go to Design option dialog box.

FIG. 29F. Tree-view representation of a sample project.

FIG. 30A. Design Builder modeless dialog box used in the current implementation.

FIG. 30B. Edit design refinement dialog box.

FIG. 30C. Refine design dialog box.

FIG. 30D. Implement solution rule dialog box.

FIG. 30E. Check Sub-design dialog box.

FIG. 30F. Edit Rule Spec dialog box.

FIG. 30G. Sample warning message pertaining to Update Abstractions.

FIG. 30H. Confirmation message pertaining to Prepare for Interpreter option.

FIG. 30I. AutoBuilder Options dialog box.

FIG. 30J. Go to Specification dialog box.

FIG. 30K. flexform class hierarchy containing two subclasses, treeview_representation and flexform_representation.

FIG. 30L. Design containing the abstract solution rule abstract_redraw (: current_flexform;).

FIG. 30M. Update Impacted Designs dialog box for the Update Interfaces option in the Design Builder.

FIG. 30N. Treeview after Update Impacted Designs has been run and the design logic in clean_room_dsn in the USES HIERARCHY has been adjusted to reflect changes to its calls and data.

FIG. 30O. Parent-Child Relationships dialog box for the prototype refinement associated with vacuum (: carpet;).

DETAILED DESCRIPTION OF THE INVENTION

FIG. 1. This Flexform represents specifications for cleaning a room and was automatically generated by the current AutoBuilder implementation. Each element in a specification Flexform contains one variable. Terminal elements contain the name of the variable in the name field [within square brackets] of the element and the variable's ABSTRACT VALUES in the value field (after the :) of the element. Non-terminal elements under INPUT also contain PARENT-CHILD RELATIONSHIPS. Non-terminal OUTPUT elements contain INPUT-OUTPUT BEHAVIOR as well as PARENT-CHILD RELATIONSHIPS. In this example, the variable room and its component descendants serve as both inputs and outputs. (NOTE: All Flexforms in the following figures have been generated by the AutoBuilder implementation.)

FIG. 1A. The room1A.spc Flexform represents an INPUT variable "room" and two categories: "kitchen" and "bedroom". Notice that abstract values "messy" and "neat" of room also partition both categories. Consequently, parent-child relationships are redundant in category refinements and may be omitted. In this example, room and its sub-categories, kitchen and bedroom, are atomic in the sense that they do not contain any components. This type of self-contained category representation is used where the analyst wishes to specify a procedural design. The extraction refinement of "kitchen" provides further guidance for the design, indicating a sequence refinement in which the abstract values of "stove" and "dishes" are first extracted from kitchen, followed by a function defined on these values which is used to compute the abstract value of kitchen.

FIG. 1B. The room1B.cat Flexform also represents a room category structure and its sub-categories kitchen and bedroom. This time, however, the room and its sub-categories are non-atomic. Specifically, the Flexform shows how data structures (categories) representing non-atomic variables in category refinements are represented in the preferred embodiment. The room1B.category Flexform represents a room category containing walls, a ceiling and a floor. This category Flexform also references kitchen (kitchen1C.category) and bedroom (bedroom.category) sub-categories. The latter represent specialized portions of the data structure needed to characterize kitchens and bedrooms, respectively. Non-atomic category refinements used in specifications correspond to class hierarchies in designs. Indeed, classes and sub-classes can easily be derived from category refinements by adding SPECIFICATION and DESIGN sections (see next section). The latter sections contain designs and corresponding specifications associated with the class data structures.

FIG. 1C. The kitchen1C.category Flexform represents the specialized portion of the room data structure of FIG. 1B needed to characterize kitchens. The kitchen category includes the data structure in the room category supplemented with that associated with the kitchen category.

FIG. 1D. Category hierarchies also can be represented in integrated tree-views. This figure shows a tree-view combining the room1B and kitchen1C categories. Notice that the room1B and kitchen1C categories are displayed in bold type indicating that they are in separate Flexforms.

FIG. 2. This Flexform represents a design for cleaning room, along with top-level sections containing supporting information. Definition of the condition "done" depends indirectly on the nature of the advance solution rule. Consequently, in the preferred AutoBuilder implementation, condition definitions can often be constructed automatically. The definition

done:=not (next_element_exists (current_rug, carpet))

is not shown in the figure. It is hidden below the "done" element where it can be accessed as needed. Nonetheless, the definition can be displayed in the Flexform as the child element of "done". (In general, condition definitions may be single elements or hierarchies of elements.)

FIG. 2A. This Flexform represents the RULE SPECS subsection of the DETAILED SPECIFICATIONS section in the clean room design. In the preferred embodiment, the RULE SPECS subsection lists specifications for each solution rule in the design sequentially in order as an alternative to retaining hierarchical relationships implicit in the design. The former representation is strictly optional and is done solely to speed access during processing. Solution rules are referenced by number rather than by name to reduce the need for checking (names) after editing. Notice also that the triples of +'s and -'s in the value field of the root spec for each rule refer to whether the rule has been checked, and if so, its consistency and correctness status.

FIG. 2B. This Flexform represent the DATA SPECS subsection of the DETAILED SPECIFICATIONS section in the clean room design. The DATA SPECS subsection lists all of the INPUT, OUTPUT and INTERMEDIATE variables used in the design. Notice that "room" is marked with a @ indicating that it is a "clone" shared by both INPUT and OUTPUT. This clone is displayed below as a separate section of the Flexform (with a "." between the initial and clone sections). In the preferred embodiment, clones can be referenced either separately (as in FIG. 2B) or in context. In the latter case only the first expansion is normally shown. The intermediate variable "done" is defined by data execution in the design (see FIG. 2).

FIG. 2C. This Flexform represents top levels of an executable version of the clean room design with DOMAIN, RANGE and INTERMEDIATE, and INCLUDED FILE and GLOBAL sections expanded. The former sections are automatically constructed from the DATA SPECS and the latter during implementation of the design. All of these sections are referenced during execution of a design. Notice that input-output variables are put in the RANGE section. Prototype variables, like "current_rug", become aliases used during processing and are listed under intermediate variables. Alias variables "point to" specific variables (e.g., specific "rugs") during execution. As noted above, the intermediate variable "done" is defined by data execution (refined in the design). The INCLUDED FILE and GLOBAL sections, respectively, contain sub-designs (e.g., "make_bed") and compiled components (e.g., "pickup_rug" and "vacuum_rug") introduced in implementing the design. These "executables" replace the virtual solution rules "make", "pickup" and "vacuum" in FIG. 2, respectively. "Next_element" and "=" are core components built into the current embodiment, and are not (but could be) listed.

FIG. 2D. This Flexform represents the top-level class in a class hierarchy. This Flexform represents a room class derived from the room category of FIG. 1B. This class is supplemented with designs (and one of the two optional specifications) defined on the room data structure. These designs are listed in the INCLUDED FILE (the name used for included designs in the current embodiment) section. Two classes inherit from this class: kitchen2E and bedroom. The major differences between class hierarchies and category hierarchies are their respective roles (designs vs. specifications) and the introduction of designs and specifications. Notice that sub-categories in category hierarchies correspond to subclasses in class hierarchies. These distinctions are reflected in the Flexform root suffix (*.category vs. *.class) and the section names (e.g., CATEGORY versus INHERITS). As with category refinements, various sections in class hierarchies can either be expanded in place or as file references. File references are used for sub-classes, designs and specifications in FIG. 2D but not for the data structure.

FIG. 2E. This Flexform represents the kitchen class that inherits from the room class of FIG. 2D. It contains two designs, one of which operates on stoves and the other on dishes. Other designs, of course, could operate on both. In the preferred embodiment, however, operations in a sub-class may not operate on data in super-ordinate classes.

FIG. 3. This Flexform represents a sample problem (data) on which the clean room design of FIG. 2 operates. In this example all of the input variables also are outputs. In the current embodiment, such variables are put in the GOAL section. Also notice that the particular carpet in this problem contains three rugs. (In general, a problem may have any number of rugs.) During execution, the design introduces two intermediate variables, current_rug and done. Consequently, the figure includes the INTERMEDIATE section in addition to the GIVEN and GOAL sections which comprise the problem. During execution, individual rugs are referenced by the alias current_rug: current_rug corresponds to the prototype element current_rug in the specification of FIG. 1. The value of the intermediate variable done is determined from its definition (see FIG. 2) during execution.

FIG. 4. This Flexform represents a sample project for cleaning houses. This Flexform references the specification clean_house_spc for the top-level design clean_house_dsn. In the current implementation, the INCLUDED FILE section contains references to Flexforms representing clean_component library and room_class resources. The referenced Flexforms are shown in FIGS. 5 through 7.

FIG. 4A. The tree view in this figure shows the clean_house project Flexform of FIG. 4 with relationships to referenced Flexforms. In this example, clean_house_dsn is the top-level design. It calls clean_room_dsn, which in turn calls vacuum_dsn and make_dsn. The latter designs are contained in clean_component_library.

FIG. 5. This Flexform represents a clean_component_library containing "vacuum" and "make" designs and their specifications.

FIG. 5A. This Flexform represents specifications for vacuum carpeting in clean_component_library.

FIG. 5B. This Flexform represents the top level of a design and external specifications for vacuum carpeting in clean_component_library.

FIG. 6. This Flexform represents a room class containing a data structure (set of components) which represents a room, two designs that operate on this data structure and their specifications. No sub-class inherits this room data structure.

FIG. 7A. This Flexform represents the clean_house_spc specifications in the clean house project shown in FIG. 4.

FIG. 7B. This Flexform represents the clean_house design corresponding to the specification in FIG. 7A. Notice that the "clean_room.room" solution rule refers to the "clean_room" design in the "room" class of FIG. 6. Reference to both an operation and the class containing it is common in OO design. An alternative OO implementation would be to introduce an abstract solution rule with affiliates in the room class. These object-oriented implementations can be further contrasted with the calling relationships depicted in FIG. 4A. In the latter case, clean_house_dsn calls clean_room_dsn, which in turn calls make_dsn and vacuum_dsn in the clean component library.

FIG. 8A. This Flexform represents a high level of the specifications callback, with emphasis on automatic (called continue--suggest next) mode. When the suggest-next button (see FIG. 29A) is selected, the AutoBuilder implementation automatically checks and advances through the specification hierarchy, ensuring that each refinement is complete and consistent. Suggest-next may also suggest which action(s) to perform next. Each of the case alternatives corresponds to a manual option that can be performed at each node in the specifications hierarchy. The REFINE option, for example, gives the analyst a choice of refinement types (e.g., component, prototype, category, and extraction). The refine callback also assists in construction of the refinement, including parent-child relationships and, in the case of output variables, input-output behavior. These manual options can be activated and/or turned off globally as desired (e.g., by a chief analyst).

FIG. 8B. This Flexform represents a high level of the design callback, with emphasis on automatic (called continue--suggest next) mode. When the suggest-next button (see FIG. 29A) is selected, the AutoBuilder implementation automatically checks and advances through the design hierarchy ensuring that each refinement is complete and consistent. Suggest-next may also suggest which action(s) to perform next. Each of the case alternatives corresponds to a manual option that can be performed at each node in the design hierarchy. The REFINE option, for example, automatically constructs a design refinement for the current node (see FIG. 9). In the process, it requests information on abstract behavior associated with the child solution rules while ensuring consistency (FIG. 11). Where specifications are available, the design callback suggests an appropriate refinement and abstract values based on the corresponding specifications variable. Otherwise, the callback asks for the key variable on which to base the design refinement. See FIG. 30A for other options. Where appropriate, global options can be preset to assure given degrees of automation. The check sub-tree option, for example, can be configured to check for consistency, correctness with respect to specifications or both (FIG. 12).

FIG. 9. This Flexform represents the high level case selection used in building various types of procedural refinement. Details on each case alternative are given in FIGS. 9A-G. Similar case logic is used in checking for consistency and correctness. Actual checking in the current embodiment is in accordance with the consistency rules detailed above as implemented in the FIGS. 11 and 12.

FIG. 9A. This Flexform represents a high level design for building sequence refinements.

FIG. 9B. This Flexform represents a high level design for building navigation sequence refinements.

FIG. 9C. This Flexform represents a high level design for building parallel refinements.

FIG. 9D. This Flexform represents a high level design for building selection refinements.

FIG. 9E. This Flexform represents a high level design for building case refinements.

FIG. 9F. This Flexform represents a high level design for building repeat . . . until refinements.

FIG. 9G. This Flexform represents a high level design for building while . . . do refinements.

FIG. 10. This Flexform represents a high level design for updating the parent solution rule in a given design refinement. The Update-Abstractions operation also involves a case with alternatives for each type of refinement. Notice in the preferred embodiment that the sequence alternative distinguishes those sub-sequences that may be converted to navigation sequences and those that may be carried out in parallel.

FIG. 10A. This Flexform represents a high level design for updating sequence refinements. In the preferred embodiment, a second pass is made through the design hierarchy to construct data abstractions where possible.

FIG. 10B. This Flexform represents a high level design for updating navigation sequence refinements.

FIG. 10C. This Flexform represents a high level design for updating parallel refinements.

FIG. 10D. This Flexform represents a high level design for updating selection refinements.

FIG. 10E. This Flexform represents a high level design for updating case refinements.

FIG. 10F. This Flexform represents a high level design for updating repeat . . . until refinements.

FIG. 10G. This Flexform represents a high level design for updating while . . . do refinements.

FIG. 11. This Flexform represents the top-level design for checking the consistency of a given design refinement. These consistency rules are detailed in the section on DESIGN DEFINITIONS.

FIG. 12. This Flexform summarizes the design for checking the abstract or relative correctness of given design refinements.

FIG. 13A. This sample FORTRAN subroutine illustrates both data and process source code.

FIG. 13B. This FORTRAN Flexform represents all essential information in the sample FORTRAN subroutine of FIG. 13A. It also represents the implicit design hierarchy extracted from the code during reverse engineering. Higher levels in the Flexform design hierarchy are blank (see the rectangular regions surrounded by dots). These regions may be filled in with documentation or, as in the current disclosure, with abstract solution rules representing the functionality of the refinements below.

FIG. 13C. The terminal-level design in this HLD design hierarchy Flexform represents the HLD equivalent of the FORTRAN Flexform shown in FIG. 13B. Notice that the higher levels in this HLD design hierarchy have been abstracted level by level from the terminal-level. All variables have been "passed" upward from children solution rules in this case because there are no hierarchical relationships between any of the variables. This is indicated by the flat list in the DATA SPECS section. Data abstractions are introduced by the method summarized in FIG. 10A.

Although the terminal-level HLD solution rules representing FORTRAN procedural statements in this example correspond one-to-one, this will not always be the case. FORTRAN FORMAT and PRINT statements provide a case in point. The sample data conversions are more complex. In particular, FORTRAN includes parameters and initializations in separate sections of the code. In an HLD design hierarchy parameters are associated with the top-level solution rule. Initial values of parameters belong to data files (problems) on which the HLD design operates. Moreover, because individual HLD elements are type-less, base types like INTEGER are ignored (although they need not be). Defined types, such as three-dimensional arrays, are represented hierarchically. These kinds of abstraction are characteristic of those required in converting from standard programming languages to HLD. Constructs which designate compiler options, or which serve primarily to improve efficiency (e.g., inherits in VAX Pascal) are simply ignored. Pointers (e.g., in C) may require manual adjustment. For example, pointers must correspond in HLD abstractions to aliases in navigation sequences or loops. Assuring that this is so may require reordering statements in a sequence, and/or other changes to satisfy predetermined refinement constraints.

FIG. 14. Abstract solution rules defined on objects raise polymorphism to a higher level. The abstract solution rule fn (Z, X . . . ) represents the combined effects of affiliated operations embedded in the class inheritance hierarchy associated with object Z. Rather than resolving reference to overloaded operations in different classes (e.g., A, B, C), abstract solution rules use the notion of affiliate operations to integrate semantically equivalent functionality across objects. Abstract solution rules also support dynamic definition (data execution) of objects (e.g., X) in terms of other objects (i.e., Y).

FIG. 15. This is the preferred method for constructing and maintaining specifications. This method corresponds to claim 1 and subordinate claim 12.

FIG. 16. This is the preferred method for constructing and maintaining designs. This method corresponds to claim 2 and subordinate claims 13 and 14.

FIG. 17. This is the preferred method for constructing and updating a design hierarchy from a design or design hierarchy. This method corresponds to claim 4 and subordinate claim 15.

FIG. 18. This is the preferred method for combining top-down and bottom-up methods for building and maintaining designs. This method corresponds to claim 7.

FIG. 19. This is the preferred method for implementing solution rules in designs constructed via the method of FIG. 16 (for constructing and maintaining designs). This method corresponds to claim 3.

FIG. 20. This is the preferred method for converting a procedural module (program, subroutine or method) written in another programming language to canonical form. This method corresponds to claim 5.

FIG. 21. This is the preferred method for making a design more compact, more efficient or both. This method corresponds to claim 6.

FIG. 22. This is the preferred method for incorporating designs into various architectures. This method corresponds to claim 8.

FIG. 23. This is the preferred method for building, maintaining and implementing behavioral models (also called projects or systems) possibly containing sub-projects as well as units (libraries and/or classes) that in turn contain designs and/or specifications. This method corresponds to claim 9.

FIG. 24. This is the preferred method for creating and executing test cases. This method corresponds to claim 11.

FIG. 25. This is the preferred method for interpreting designs.

FIG. 26. This is the preferred method based on a universal (component independent) control mechanism for deriving executable designs for implementing virtual solution rules. The input of the problem to be solved is the to-be-implemented solution rule. The solution is a plan or executable design whose specification matches the solution rule specification.

FIG. 27. This is the preferred method of structural analysis for constructing a set of designs that solves a "sufficiently large" (as determined by the structural analyst) subset of problems in an ill-defined domain.

FIG. 27A. This is an alternative method of structural analysis offering the designer less specific guidance but allowing more flexibility.

FIG. 28A. This figure shows the File pull-down menu used in the current AutoBuilder implementation, including the kinds of files supported. These include specifications and designs, and the projects, libraries and classes that reference them. Large categories also may be maintained as separate files.

FIG. 28B. This figure shows the Project pull-down. The Edit Full option shows the full Flexform file, including all support information, and can be used for debugging purposes. Edit Corresponding switches between corresponding specifications and designs.

FIG. 28C. This figure shows the Simulate pull-down, including debugging and support options for determining (and removing) test coverage information and automatic initialization of variables.

FIG. 28D. This figure shows the Tools pull-down. The Semantic Wrapper and Semantic Integrator options are used to make independently developed components interoperable with and available for use in HLD designs. Such components can then be used in implementing designs as disclosed herein. The highlighted option is used to generate compilable C code from HLD designs.

FIG. 29A. This figure shows the Spec Builder modeless dialog box used in the current implementation. This dialog box contains various options that control the specification callback, together with the corresponding specification Flexform and/or tree-view displays on which this callback operates. In this example, the clean1a.spc specification has been loaded (as shown in the background window). Its contents, including the type of refinement used (e.g., category), are displayed in the treeview in the foreground.

FIG. 29B. This figure shows the Edit design option dialog box. In this example, the clean1a.spc specification has been displayed as a Flexform showing more of the detail. The Parent-Child Relationships dialog box indicates that the room category refinement is being edited.

FIG. 28C. This figure shows the Refine option dialog box. In this example, the terminal variable bedroom is being refined. The designer must choose a type of refinement and identify the children and abstract values associated with bedroom. One abstract value is assumed when none is given for a variable. Completeness of the refinement is checked when either Accept Specs or Check Completeness (used for intermediate checking) is pressed.

FIG. 29D. This figure shows the I/O Behavior option dialog box. In this example, the abstract output value of the variable kitchen is presentable, irrespective of its input value.

FIG. 29E. This figure shows the Go to Design option dialog box. The default design is the design with the same prefix as the specification, in this case clean1a. Here, clean1a.spc is the specification so the corresponding design would be clean1a.dsn.

FIG. 29F. This figure shows a tree-view representation of a sample project containing: a) hidden specifications, b) a clean_house_dsn design, which in turn calls a clean_room_dsn, c) a clean_component_library, and d) a room_class.

FIG. 30A. This figure shows the Design Builder modeless dialog box used in the current implementation, along with a sample cleanrm2.dsn design. The dialog box includes various options controlling the design callback. The latter operates on the cleanrm2.dsn design Flexform (or other representation).

FIG. 30B. This figure shows the Edit design refinement dialog box. This example shows the Edit Sequence dialog box in conjunction with compound solution rule pickup_vacuum.

FIG. 30C. This figure shows the Refine design dialog box. This example shows the Refine Terminal dialog box in conjunction with the terminal solution rule make_bed.

FIG. 30D. This figure shows the Implement solution rule dialog box. This example shows the Implement dialog box in conjunction with the terminal virtual solution rule, make. After implementation, make will be converted to the executable component make_bed.

FIG. 30E. This figure shows the Check Sub-design dialog box. Because the cursor is at the top of the clean_room design hierarchy, the subtree in question is the entire design hierarchy. It was checked and, as reported, found to be consistent.

FIG. 30F. This figure shows the Edit Rule Spec dialog box. This example shows the input-output behavior associated with make (: bed;).

FIG. 30G. This figure shows a warning message pertaining to Update Abstractions. This option automatically reconstructs virtual solution rules (also called design abstractions) from their children in accordance with specified consistency constraints. Update Abstractions can be used, as in this example, where higher-level virtual solution rules already exist. This option also can be used in conjunction with legacy designs, which have been embedded via reverse engineering in an implicit design hierarchy. Higher level abstractions in the implicit design hierarchy may be missing or consist perhaps of English or other documentation.

FIG. 30H. This figure shows a confirmation message pertaining to the Prepare for Interpreter option. This option is used to update Flexform DOMAIN, RANGE and INTERMEDIATE data sections so the design can be interpreted.

FIG. 30I. This figure shows the AutoBuilder Options dialog box. This option provides control over the type of checking used. In the current embodiment, one can check for consistency, correctness and/or implementation status. Auto Implement automates the implementation of virtual solution rules used more than once in a design. The three advance options designate the order in which the Design Builder options, SUGGEST NEXT and Advance, operate. Other options control the way Update Abstractions, Undo and Implementation in the Design Builder are to work.

FIG. 30J. This figure shows the Go to Specification dialog box. It operates analogously to that in FIG. 29E.

FIG. 30K. This figure shows the flexform class hierarchy, which contains two subclasses, treeview_representation and flexform_representation. The flexform, treeview_representation (not expanded) and flexform_representation classes each contain two designs, one for creating objects and the other for redrawing representations.

FIG. 30L. This figure shows a design containing the abstract solution rule,

abstract_redraw (: current_flexform;)

This abstract solution rule has as an affiliate redraw in the flexform class. In turn, redraw has two affiliates, redraw_treeview in the treeview_representation sub-class and redraw_flexform in the flexform_representation sub-class (neither expanded). Although the Flexform design is expanded to show redraw, this expansion is simply part of the current implementation of the AutoBuilder Interpreter and C Code Generator. This expansion is accomplished automatically, and transparently. The designer only needs to introduce abstract_redraw. The use of affiliates handles calls on class-specific redraws automatically. (Although the affiliates in this example all have different names, overloading and polymorphism are supported similarly.)

FIG. 30M. This figure shows the Update Impacted Designs dialog box for the Update Interfaces option in the Design Builder. The treeview shows both a USES (Impact) HIERARCHY and the corresponding CALL (Behavioral) HIERARCHY in which make_dsn is highlighted. The dialog box shows that make_dsn's old interface is to be changed to the new interface. When OK is pressed the calls and data structures (corresponding to changes in parameters) in the impacted designs are automatically changed to reflect the new interface. In the current embodiment, the designer then adjusts the design logic of the impacted designs as necessary.

FIG. 30N. This figure shows the treeview after Update Impacted Designs has been run and the design logic in clean_room_dsn in the USES HIERARCHY has been adjusted to reflect changes to its calls and data. clean_house_dsn is marked with an "X" indicating that it still needs to be adjusted. In the preferred embodiment, the methods for building and maintaining designs summarized in FIGS. 16, 17 and 18 are used for this purpose. In particular, notice that various automation technologies may be used to automate the process.

FIG. 30O. This figure shows the Parent-Child Relationships dialog box for the prototype refinement associated with vacuum (: carpet; ). This example shows that the parent variable carpet is clean when all of the children variables associated with prototype variable current_rug are clean (*clean).

Specifications may be created externally, independently of designs. They consist of INPUT and OUTPUT hierarchies of variables. A variable in such a hierarchy may have two or more parents. The analyst must specify the top-level system variables. Optionally, the analyst also may specify a finite partition on the values of each variable. The partition effectively categorizes those values which the analyst believes should be distinguished, placing them in separate equivalence classes. These equivalence classes are herein referred to as "abstract values." In addition, the analyst may specify the functional relationship between abstract values of each parent variable and its children, as well as the nature of this relationship (e.g., component, prototype, category). Indeed, there is a close relationship between kinds of parent-child relationship and corresponding design refinements.

For each output variable, the analyst must also specify the associated abstract behavior. Abstract behavior is specified in terms of a functional mapping from abstract values of input variables to abstract values of the corresponding output variables. The parent-child and input-output mappings must be consistent. That is, input-output mappings between children must produce the same abstract behavior, as does the corresponding mapping between the parents.

The default number of abstract values is one. That is, unless otherwise specified, all values associated with a given variable are assumed to be in just one equivalence class or abstract value--namely, the variable itself. Two or more abstract values are introduced to ensure (to whatever degree the analyst desires) that the specifications will be adequate for their purpose. Refinements in which explicit relationships are specified between abstract values of parent and children variables are referred to as complete. Refinements need not be complete, but they must be consistent. They may be created prior to, in parallel with and/or after solution designs are formulated. In the preferred embodiment, information pertaining to the specification is stored both in the specification hierarchy and, for easier reference during design, also under DETAILED SPECIFICATIONS. (The latter includes a copy of the original external specifications [EXTERNAL SPECS], detailed specifications for each solution rule and data specifications.) The above ideas are represented in the Flexform shown in FIG. 1.

Solution designs are characterized by hierarchies of refinements of semantically meaningful components (herein called solution rules) at successive levels of abstraction. Designs are supplemented with (a) a DETAILED SPECIFICATIONS section which in addition to EXTERNAL SPECS contains hierarchies of parent-child relationships [DATA SPECS] and input-output specifications for all solution rules introduced in the design hierarchy [RULE SPECS], including non-terminal virtual rules, terminal sub-designs in which specifications and designs are elaborated separately and terminal atomic rules (which may be executable), (b) a DOMAIN hierarchy corresponding to the specification INPUT which specifies the structure of the formal input parameters of the design--optionally including their initial (abstract) values (i.e., each DOMAIN variable has exactly one of the abstract values listed in the INPUT specifications), (c) a RANGE hierarchy similarly corresponds to the specification OUTPUT and specifies the formal output parameters of the design, (d) an INTERMEDIATE section representing variables which are "local" to the design--in the preferred embodiment this section contains a list of variables, some of which may be hierarchically related to others, (e) an INCLUDED FILE section containing the names and file names of those sub-systems which elaborate terminal solution rules separately and (f) a GLOBAL section containing a list of executable rules and previously referenced sub-systems. A sample a design Flexform is shown in FIG. 2 with top levels of the supporting sections. The latter are expanded in FIGS. 2A, 2B and 2C.

Specific constraints guarantee that successive levels of refinement are syntactically consistent with respect to previous levels. These constraints, which are automatable by means of a device with memory, also make it possible to partially construct and/or correct solution refinements automatically. Similar constraints make it possible to construct abstractions (i.e., virtual solution rules) from given ones and/or to detect (to-be-remedied) inconsistencies. Other largely automatable processes ensure that specifications are consistent and that design semantics are abstractly correct with respect to the specifications.

Abstract correctness differs in th