Data flow analysis

Optimization apparatus and computer-readable storage medium storing optimization program

6289507

Abstract

In an optimization apparatus embedded in a compiler apparatus that compiles a high-level language program to a machine language program, equivalence relations among a plurality of expressions are analyzed in a short time period by calculating an equivalent expression set group of each basic block, the equivalent expression set group being composed of equivalent expression sets with equivalence relations. Specifically, an equivalent expression set group at the entry point of a basic block is calculated from equivalent expression set groups at the exit points of basic blocks that precede the basic block, and then an equivalent expression set group at the exit point of the basic block is calculated from the equivalent expression set group at the entry point of the basic block. These calculations are repeated until there are no more changes to the equivalent expression set group at the exit point of any of the basic blocks.


Claims

What is claimed is:

1. An optimization apparatus equipped in a compiler that converts a program into a plurality of machine language instructions,

the program being composed of a plurality of basic blocks which are each made up of instructions, each instruction having two expressions respectively on a left side and a right side, the expressions being selected from the group consisting of three types of expressions that are (a) a variable, (b) a constant, and (c) a combination of at least one operator and at least two operands that are variables and/or constants,

the compiler being equipped with a code generation apparatus for assigning variables included in expressions to registers or memories and generating the machine language instructions,

the optimization apparatus comprising:

analysis means for analyzing, for each basic block, equivalence relations among a plurality of expressions at an entry point of an analyzed basic block, and generating an E_IN equivalent expression set group that is composed of at least one set of equivalent expressions which have an equivalence relation at the entry point of the analyzed basic block, the equivalence relation meaning that one of the expressions can be replaced with a different one of the expressions without changing what the program computes; and

optimization means for optimizing all instructions in each basic block, using the E_IN equivalent expression set group for each basic block,

wherein the optimization by the optimization means includes an operation of replacing an operator-operand combination expression in an instruction with an equivalent expression that is a variable or constant.

2. The optimization apparatus of claim 1,

wherein the analysis means includes:

first analysis means for analyzing, for each basic block, equivalence relations among a plurality of expressions at an exit point of an analyzed basic block, the equivalence relations being maintained until an entry point of a basic block that is a branch destination for the analyzed basic block;

second analysis means for analyzing, for each basic block, which equivalence relations at an entry point of an analyzed basic block disappear due to processing of each instruction in the analyzed basic block, and for analyzing equivalence relations that are newly produced by the processing of each instruction in the analyzed basic block;

qualification judgement means for judging, for each basic block, whether an E_OUT equivalent expression set group, that is composed of at least one set of equivalent expressions which are found to have an equivalence relation at an exit point of a basic block as a result of analysis by the second analysis means, qualifies for being used to optimize the basic block; and

repeat means for having the first analysis means and the second analysis means repeat respective analyses, when the qualification judgement means judges that an E_OUT equivalent expression set group of any of the plurality of basic blocks does not qualify, and

the optimization means includes

block internal optimization means for optimizing, when the qualification judgement means judges that an E_OUT equivalent expression set group of each basic block qualifies, all instructions in each basic block rising an E_IN equivalent expression set group composed of at least one set of equivalent expressions which are found to have an equivalence relation at an entry point of each basic block, that is most recently obtained by the first analysis means and the second analysis means.

3. The optimization apparatus of claim 2, further comprising:

E_GEN equivalent expression set group generation means for generating, for each basic block, an E_GEN equivalent expression set group composed of at least one set of equivalent expressions whose equivalence relation is newly produced by processing of each instruction in a basic block;

E_PRE expression group generation means for generating, for each basic block, an E_PRE expression group composed of expressions, among all expressions appearing in the program, that are unchanged by the processing of each instruction in the basic block; and

initialization means for performing an equivalence union calculation on an E_GEN equivalent expression set group and an E_PRE expression group of each basic block other than an initial block, setting the calculation result as an initial E_OUT equivalent expression set group of each basic block other than the initial block, and setting an E_GEN equivalent expression set group of the initial block as an initial E_OUT equivalent expression set group of the initial block,

wherein the first analysis means analyzes, for each basic block, which equivalent expression sets, among equivalent expression sets in initial E_OUT equivalent expression set groups, maintain equivalence relations at an entry point of a basic block,

wherein the second analysis means analyzes, for each basic block, which equivalence relations maintained at the entry point of the basic block disappear due to processing of each instruction in the basic block, and analyzes which equivalence relations are newly produced by the processing of each instruction in the basic block,

wherein the first analysis means includes

first calculation means for regenerating, for each basic block, after an E_OUT equivalent expression set group of each basic block is generated, an E_IN equivalent expression set group of a basic block by performing an equivalence intersection calculation on E_OUT equivalent expression set groups of basic blocks which have the basic block as a common branch destination, and

wherein the second analysis means includes

second calculation means for regenerating, for each basic block, an E_OUT equivalent expression set group of the basic block by calculating "Formula 1" using the E_IN equivalent expression set group regenerated by the first calculation means, an E_GEN equivalent expression set group generated by the E_GEN equivalent expression set group generation means for the basic block, and an E_PRE expression group generated by the E_PRE expression group generation means for the basic block,

wherein "Formula 1" is

E.sub.-- OUT[B]=E.sub.-- GEN[B]4e(E.sub.-- IN[B]3eE.sub.-- PRE[B]),

B representing a basic block in the program,

3e representing an equivalence intersection operator, and

4e representing an equivalence union operator.

4. The optimization apparatus of claim 3,

wherein the qualification judgement means judges whether the E_OUT equivalent expression set group regenerated by the second calculation means qualifies for being used to optimize the basic block, and

wherein, when the qualification judgement means judges that an E_OUT equivalent expression set group of any of the plurality of basic blocks does not qualify, the first calculation means newly generates, for each basic block, an E_IN equivalent expression set group of a basic block by performing an equivalence intersection calculation on E_OUT equivalent expression set groups of basic blocks which have the basic block as a common branch destination, and

the second calculation means newly generates, for each basic block, an E_OUT equivalent expression set group of the basic block by calculating "Formula 1" using the E_IN equivalent expression set group newly generated by the first calculation means, the E_GEN equivalent expression set group generated by the E_GEN equivalent expression set group generation means for the basic block, and the E_PRE expression group generated by the E_PRE expression group generation means for the basic block.

5. The optimization apparatus of claim 4,

wherein the qualification judgement means includes:

a transient group storage unit for storing an E_OUT equivalent expression set group generated by the second calculation means as a transient group;

a comparison unit for comparing, when a new E_OUT equivalent expression set group is generated by the second calculation means, the new E_OUT equivalent expression set group with the transient group; and

a judgement unit for judging that the transient group qualifies if the new E_OUT equivalent expression set group and the transient group match, and judging that the transient group does not qualify if the new E_OUT equivalent expression set group and the transient group are different.

6. The optimization apparatus of claim 3,

wherein the E_GEN equivalent expression set group generation means includes:

a first retrieval unit for retrieving an instruction from the basic block to analyze equivalence relations;

an E_GEN processing storage unit for storing equivalent expression sets whose equivalence relations were generated by processing of instructions which precede the retrieved instruction in the basic block;

a first E_GEN processing storage unit update unit for deleting, when the retrieved instruction is an assignment instruction whose left side member is a variable, expressions affected by processing of the assignment instruction from the equivalent expression sets stored in the E_GEN processing storage unit;

a second E_GEN processing storage unit update unit for deleting, when the retrieved instruction is an assignment instruction whose left side member is an indirect calculation expression, expressions affected by variables which are potentially changed by processing of the assignment instruction from the equivalent expression sets in the E_GEN processing storage unit;

a third E_GEN processing storage unit update unit for adding, when the retrieved instruction is an assignment instruction and the E_GEN processing storage unit stores an equivalent expression set including an expression located on one side of the assignment instruction, an expression on another side of the assignment instruction to the equivalent expression set, and for generating, when the E_GEN processing storage unit stores no equivalent expression sets including an expression located on any side of the assignment instruction, an equivalent expression set on both sides of the assignment instruction and adding the generated equivalent expression set to the E_GEN processing storage unit; and

a fourth E_GEN processing storage unit update unit for deleting, when the retrieved instruction is a function call instruction, expressions affected by variables which are potentially changed by processing of the function call instruction from the equivalent expression sets in the E_GEN processing unit.

7. The optimization apparatus of claim 3,

wherein the E_PRE expression group generation means includes:

a second retrieval unit for retrieving an instruction from the basic block to analyze equivalence relations;

an E_PRE processing storage unit for storing all expressions which appear in the program;

a first E_PRE processing storage unit update unit for deleting, when the retrieved instruction is an assignment instruction whose left side member is a variable, expressions affected by processing of the assignment instruction from the E_PRE processing storage unit;

a second E_PRE processing storage unit update unit for deleting, when the retrieved instruction is an assignment instruction whose left side member is an indirect calculation expression, expressions affected by variables which are potentially changed by processing of the assignment instruction from the E_PRE processing storage unit; and

a third E_PRE processing storage unit update unit for deleting, when the retrieved instruction is a function call instruction, expressions affected by variables which are potentially changed by processing of the function call instruction from the E_PRE processing storage unit.

8. The optimization apparatus of claim 3,

wherein the equivalence intersection calculation is performed for sets in two set groups, the two set groups being respectively composed of mutually exclusive sets of members.

9. The optimization apparatus of claim 3,

wherein the equivalence union calculation first performs a set union calculation for two set groups and then joins sets in a union calculation result that have common members to form single sets until all sets in the union calculation result are mutually exclusive.

10. The optimization apparatus of claim 2,

wherein the block internal optimization means includes:

a processing storage unit for storing, in an initial state, an E_IN equivalent expression set group most recently obtained by the first analysis means for a basic block, and for storing, after optimization of the basic block starts, an equivalent expression set group obtained by updating the E_IN equivalent expression set group in response to changes resulting from the optimization;

a third retrieval unit for successively retrieving instructions from the basic block;

instruction optimization means for optimizing a retrieved instruction; and

update means for updating the E_IN equivalent expression set group stored in the processing storage unit after the retrieved instruction is optimized.

11. The optimization apparatus of claim 10,

wherein the block internal optimization means further includes

processing storage unit initialization means for setting the E_IN equivalent expression set group, most recently obtained by the first analysis means for the basic block, in the processing storage unit,

wherein the instruction optimization means includes:

a first redundancy elimination unit for replacing an expression in the retrieved instruction with an expression which has an equivalence relation with the expression by referring to equivalent expression sets in the E_IN equivalent expression set group stored in the processing storage unit; and

a second redundancy elimination unit for deleting, when the retrieved instruction is an assignment instruction whose expressions on both sides are included in one of the equivalent expression sets in the processing storage unit, the retrieved instruction from the program, and

wherein the update means includes:

a first processing storage unit update unit for deleting, when the retrieved instruction is an assignment instruction whose left side member is a variable, expressions affected by processing of the assignment instruction from the equivalent expression sets in the processing storage unit;

a second processing storage unit update unit for deleting, when the retrieved instruction is an assignment instruction whose left side member is an indirect calculation expression, expressions affected by variables which are potentially changed by processing of the assignment instruction from the equivalent expression sets in the processing storage unit;

a third processing storage unit update unit for adding, when the retrieved instruction is an assignment instruction and the processing storage unit stores at least one equivalent expression set including an expression located on one side of the assignment instruction, an expression on another side of the assignment instruction to the equivalent expression sets, and for generating, when the processing storage unit stores no equivalent expression sets including an expression located on any side of the assignment instruction, an equivalent expression set composed of expressions on both sides of the assignment instruction and adding the generated equivalent expression set to the processing storage unit;

a fourth processing storage unit update unit for deleting, when the retrieved instruction is a function call instruction, expressions affected by variables which are potentially changed by processing of the function call instruction from the equivalent expression sets stored in the processing storage unit; and

an equivalence replacement optimization control unit for activating the processing storage unit initialization means, then activating the first redundancy elimination unit and the second redundancy elimination unit for each instruction, and then successively activating the first processing storage unit update unit, the second processing storage unit update unit, the third processing storage unit update unit, land the fourth processing storage unit update unit.

12. The optimization apparatus of claim 11,

wherein the block internal optimization means includes a third redundancy elimination unit

for replacing, when the retrieved instruction is an assignment instruction whose expression on a right side is included in an equivalent expression set in the processing storage unit along with a constant, the expression with the constant, and

for replacing, when the retrieved instruction is an assignment instruction whose expression on the right side is one of a binary calculation expression and a monadic calculation expression with each variable used in the expression being included in an equivalent expression set in the processing storage unit along with a constant, each variable used in the expression with a corresponding constant, calculating the expression, and replacing the expression with the calculation result, and

wherein the equivalence replacement optimization control unit activates the processing storage unit initialization means, then activates the first redundancy elimination unit, the second redundancy elimination unit, and the third redundancy elimination unit for each instruction, and then successively activates the first processing storage unit update unit, the second processing storage unit update unit, the third processing storage unit update unit, and the fourth processing storage unit update unit.

13. The optimization apparatus of claim 12,

wherein the block internal optimization means includes

a fourth redundancy elimination unit for replacing, when the retrieved instruction is a conditional branch instruction and expressions on both sides of a conditional expression written in the conditional branch instruction are included in equivalent expression sets in the processing storage unit, the conditional expression with "(1)" if it is validated that a condition given by the conditional expression is definitely met, and replacing the conditional expression with "(0)" if it is validated that the condition given by the conditional expression is definitely unmet, and

wherein the equivalence replacement optimization control unit activates the processing storage unit initialization means, then activates the first redundancy elimination unit, the second redundancy elimination unit, the third redundancy elimination unit, and the fourth elimination unit for each instruction, and then successively activates the first processing storage unit update unit, the second processing storage unit update unit, the third processing storage unit update unit, and the fourth processing storage unit update unit.

14. A computer-readable storage medium storing an optimization program for use in a compiler that converts a program into a plurality of machine language instructions,

the program being composed of a plurality of basic blocks which are each made up of instructions, each instruction having two expressions respectively on a left side and a right side, the expressions being selected from the group consisting of three types of expressions that are (a) a variable, (b) a constant, and (c) a combination of at least one operator and at least two operands that are variables and/or constants,

the compiler being equipped with a code generation apparatus for assigning variables included in expressions to registers or memories and generating the machine language instructions,

the optimization apparatus comprising:

an analysis step for analyzing, for each basic block, equivalence relations among a plurality of expressions at an entry point of an analyzed basic block, and generating an E_IN equivalent expression set group that is composed of at least one set of equivalent expressions which have an equivalence relation at the entry point of the analyzed basic block, the equivalence relation meaning that one of the expressions can be replaced with a different one of the expressions without changing what the program computes; and

an optimization step for optimizing all instructions in each basic block, using the E_IN equivalent expression set group for each basic block,

wherein the optimization by the optimization step includes an operation of replacing an operator-operand combination expression in an instruction with an equivalent expression that is a variable or constant.

15. The computer-readable storage medium of claim 14,

wherein the analysis step includes:

a first analysis step for analyzing, for each basic block, equivalence relations among a plurality of expressions at an exit point of an analyzed basic block, the equivalence relations being maintained until an entry point of a basic block that is a branch destination for the analyzed basic block;

a second analysis step for analyzing, for each basic block, which equivalence relations at an entry point of an analyzed basic block disappear due to processing of each instruction in the analyzed basic block, and for analyzing equivalence relations that are newly produced by the processing of each instruction in the analyzed basic block;

a qualification judgement step for judging, for each basic block, whether an E_OUT equivalent expression set group, composed of at least one set of equivalent expressions which are found to have an equivalence relation at an exit point of a basic block as a result of analysis by the second analysis step, qualifies for being used to optimize the basic block; and

a repeat step for having the first analysis step and the second analysis step repeat respective analyses, when the qualification judgement step judges that an E_OUT equivalent expression set group of any of the plurality of basic blocks does not qualify, and

the optimization step includes

a block internal optimization step for optimizing, when the qualification judgement step judges that an E_OUT equivalent expression set group of each basic block qualifies, all instructions in each basic block using an E_IN equivalent expression set group composed of at least one get of equivalent expressions which have an equivalence relation at an entry point of each basic block, that is most recently obtained in the first analysis step and the second analysis step.

16. The computer-readable storage medium of claim 15,

wherein the optimization program further comprises:

an E_GEN equivalent expression set group generation step for generating, for each basic block, an E_GEN equivalent expression set group composed of at least one set of equivalent expressions whose equivalence relation is newly established by processing of each instruction in a basic block;

an E_PRE expression group generation step for generating, for each basic block, an E_PRE expression group composed of expressions, among all expressions appearing in the program, that are unchanged by the processing of each instruction in the basic block; and

an initialization step for performing an equivalence union calculation on an E_GEN equivalent expression set group and an E_PRE expression group of each basic block other than an initial block, setting the calculation result as an initial E_OUT equivalent expression set group of each basic block other than the initial block, and setting an E_GEN equivalent expression set group of the initial block as an initial E_OUT equivalent expression set group of the initial block,

wherein the first analysis step analyzes, for each basic block, which equivalent expression sets, among equivalent expression sets in initial E_OUT equivalent expression set groups, maintain equivalence relations at an entry point of a basic block,

wherein the second analysis step analyzes, for each basic block, which equivalence relations maintained at the entry point of the basic block disappear due to processing of each instruction in the basic block, and analyzes which equivalence relations are newly produced by the processing of each instruction in the basic block,

wherein the first analysis step includes

a first calculation step for regenerating, for each basic block, after an E_OUT equivalent expression set group of each basic block is generated, an E_IN equivalent expression set group of a basic block by performing an equivalence intersection calculation on E_OUT equivalent expression set groups of basic blocks which have the basic block as a common branch destination, and

wherein the second analysis step includes

a second calculation step for regenerating, for each basic block, an E_OUT equivalent expression set group of the basic block by calculating "Formula 1" using the E_IN equivalent expression set group regenerated in the first calculation step, an E_GEN equivalent expression set group generated in the E_GEN equivalent expression set group generation step for the basic block, and an E_PRE expression group generated in the E_PRE expression group generation step for the basic block,

wherein "Formula 1" is

E.sub.-- OUT[B]=E.sub.-- GEN[B]4e(E.sub.-- IN[B]3eE.sub.-- PRE[B]),

B representing a basic block in the program,

3e representing an equivalence intersection operator, and

4e representing an equivalence union operator.

17. The computer-readable storage medium of claim 16,

wherein the qualification judgement step judges whether the E_OUT equivalent expression set group regenerated in the second calculation step qualifies for being used to optimize the basic block, and

wherein, when the qualification judgement step judges that an E_OUT equivalent expression set group of any of the plurality of basic blocks does not qualify, the first calculation step newly generates, for each basic block, an E_IN equivalent expression set group of a basic block by performing an equivalence intersection calculation on E_OUT equivalent expression set groups of basic blocks which have the basic block as a common branch destination, and

the second calculation step newly generates, for each basic block, an E_OUT equivalent expression set group of the basic block by calculating "Formula 1" using the E_IN equivalent expression set group newly generated by the first calculation step, the E_GEN equivalent expression set group generated in the E_GEN equivalent expression set group generation step for the basic block, and the E_PRE expression group generated in the E_PRE expression group generation step for the basic block.

18. The computer-readable storage medium of claim 17,

wherein a computer that reads the computer-readable storage medium includes a transient group storage unit for storing an E_OUT equivalent expression set group generated in the second calculation step as a transient group, and

wherein the qualification judgement step includes:

a comparison substep for comparing, when a new E_OUT equivalent expression set group is generated in the second calculation step, the new E_OUT equivalent expression set group with the transient group; and

a judgement substep for judging that the transient group qualifies if the new E_OUT equivalent expression set group and the transient group match, and judging that the transient group does not qualify if the new E_OUT equivalent expression set group and the transient group are different.

19. The computer-readable storage medium of claim 16,

wherein a computer that reads the computer-readable storage medium includes an E_GEN processing storage unit for storing equivalent expression sets whose equivalence relations were generated by processing of instructions which precede a retrieved instruction in a basic block, and

wherein the E_GEN equivalent expression set group generation step includes:

a first retrieval substep for retrieving the instruction from the basic block to analyze equivalence relations;

a first E_GEN processing storage unit update substep for deleting, when the retrieved instruction is an assignment instruction whose left side member is a variable, expressions affected by processing of the assignment instruction from the equivalent expression sets stored in the E_GEN processing storage unit;

a second E_GEN processing storage unit update substep for deleting, when the retrieved instruction is an assignment instruction whose left side member is an indirect calculation expression, expressions affected by variables which are potentially changed by processing of the assignment instruction from the equivalent expression sets in the E_GEN processing storage unit;

a third E_GEN processing storage unit update substep for adding, when the retrieved instruction is an assignment instruction and the E_GEN processing storage unit stores an equivalent expression set including an expression located on one side of the assignment instruction, an expression on another side of the assignment instruction to the equivalent expression set, and for generating, when the E_GEN processing storage unit stores no equivalent expression sets including an expression located on any side of the assignment instruction, an equivalent expression set on both sides of the assignment instruction and adding the generated equivalent expression set to the E_GEN processing storage unit; and

a fourth E_GEN processing storage unit update substep for deleting, when the retrieved instruction is a function call instruction, expressions affected by variables which are potentially changed by processing of the function call instruction from the equivalent expression sets in the E_GEN processing storage unit.

20. The computer-readable storage medium of claim 16,

wherein a computer that reads the computer-readable storage medium includes an E_PRE processing storage unit for storing all expressions which appear in the program, and

wherein the E_PRE expression group generation step includes:

a second retrieval substep for retrieving an instruction from the basic block to analyze equivalence relations;

a first E_PRE processing storage unit update substep for deleting, when the retrieved instruction is an assignment instruction whose left side member is a variable, expressions affected by processing of the assignment instruction from the E_PRE processing storage unit;

a second E_PRE processing storage unit update substep for deleting, when the retrieved instruction is an assignment instruction whose left side member is an indirect calculation expression, expressions affected by variables which are potentially changed by processing of the assignment instruction from the E_PRE processing storage unit; and

a third E_PRE processing storage unit update substep for deleting, when the retrieved instruction is a function call instruction, expressions affected by variables which are potentially changed by processing of the function call instruction from the E_PRE processing storage unit.

21. The computer-readable storage medium of claim 15,

wherein a computer that reads the computer-readable storage medium includes a processing storage unit for storing, in an initial state, an E_IN equivalent expression set group most recently obtained in the first analysis step for a basic block, and for storing, after optimization of the basic block starts, an equivalent expression set group obtained by update of the E_IN equivalent expression set group in response to changes resulting from the optimization, and

wherein the block internal optimization step includes:

a third retrieval substep for successively retrieving instructions from the basic block;

an instruction optimization step for optimizing a retrieved instruction; and

an update step for updating the E_IN equivalent expression set group stored in the processing storage unit after the retrieved instruction is optimized.

22. The computer-readable storage medium of claim 21,

wherein the block internal optimization step further includes

a processing storage unit initialization step for setting the E_IN equivalent expression set group, most recently obtained in the first analysis step of the basic block, in the processing storage unit,

wherein the instruction optimization step includes:

a first redundancy elimination substep for replacing an expression in the retrieved instruction with an expression which has an equivalence relation with the expression by referring to equivalent expression sets in the E_IN equivalent expression set group stored in the processing storage unit; and

a second redundancy elimination substep for deleting, when the retrieved instruction is an assignment instruction whose expressions on both sides are included in one of the equivalent expression sets in the processing storage unit, the retrieved instruction from the program, and

wherein the update step includes:

a first processing storage unit update substep for deleting, when the retrieved instruction is an assignment instruction whose left side member is a variable, expressions affected by processing of the assignment instruction from the equivalent expression sets in the processing storage unit;

a second processing storage unit update substep for deleting, when the retrieved instruction is an assignment instruction whose left side member is an indirect calculation expression, expressions affected by variables which are potentially changed by processing of the assignment instruction from the equivalent expression sets in the processing storage unit;

a third processing storage unit update substep for adding, when the retrieved instruction is an assignment instruction and the processing storage unit stores at least one equivalent expression set including an expression located on one side of the assignment instruction, an expression on another side of the assignment instruction to the equivalent expression sets, and for generating, when the processing storage unit stores no equivalent expression sets including an expression located on any side of the assignment, instruction, an equivalent expression set composed of expressions on both sides of the assignment instruction and adding the generated equivalent expression set to the processing storage unit;

a fourth processing storage unit update substep for deleting, when the retrieved instruction is a function call instruction, expressions affected by variables which are potentially changed by processing of the function call instruction from the equivalent expression sets stored in the processing storage unit; and

an equivalence replacement optimization control substep for activating the processing storage unit initialization step, then activating the first redundancy elimination substep and the second redundancy elimination substep for each instruction, and then successively activating the first processing storage unit update substep, the second processing storage unit update substep, the third processing storage unit update substep, and the fourth processing storage unit update substep.

23. The computer-readable storage medium of claim 22,

wherein the block internal optimization step includes a third redundancy elimination substep

for replacing, when the retrieved instruction is an assignment instruction whose expression on a right side is included in an equivalent expression set in the processing storage unit along with a constant, the expression with the constant, and

for replacing, when the retrieved instruction is an assignment instruction whose expression on the right side is one of a binary calculation expression and a monadic calculation expression with each variable used in the expression being included in an equivalent expression set in the processing storage unit along with a constant, each variable used in the expression with a corresponding constant, calculating the expression, and replacing the expression with the calculation result, and

wherein the equivalence replacement optimization control substep activates the processing storage unit initialization step, then activates the first redundancy elimination substep, the second redundancy elimination substep, and the third redundancy elimination substep for each instruction, and then successively activates the first processing storage unit update substep, the second processing storage unit update substep, the third processing storage unit update substep, and the fourth processing storage unit update substep.

24. The computer-readable storage medium of claim 23,

wherein the block internal optimization step includes

a fourth redundancy elimination substep for replacing, when the retrieved instruction is a conditional branch instruction and expressions on both sides of a conditional expression written in the conditional branch instruction are included in equivalent expression sets in the processing storage unit, the conditional expression with "(1)" if it is validated that a condition given by the conditional expression is definitely met, and replacing the conditional expression with "(0)" if it is validated that the condition given by the conditional expression is definitely unmet, and

wherein the equivalence replacement optimization control substep activates the processing storage unit initialization step, then activates the first redundancy elimination substep, the second redundancy elimination substep, the third redundancy elimination substep, and the fourth redundancy elimination substep for each instruction, and then successively activates the first processing storage unit update substep, the second processing storage unit update substep, the third processing storage unit update substep, and the fourth processing storage unit update substep.


Description

This application is based on an application No. 9-265655 filed in Japan, the content of which is hereby incorporated by reference.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to an optimization apparatus equipped in a compiler apparatus, and to a computer-readable storage medium storing an optimization program.

2. Description of the Background Art

In recent years, electronics engineers have found it very difficult to develop embedded microcomputer systems that realize high-level and complex control. In general, an embedded microcomputer system refers to a computer system in which a mask ROM that stores all control programs from the firmware to application programs is integrated with a microprocessor. Such embedded microcomputer systems have increasingly been used in household electrical appliances, machine tools, information apparatuses, and communication apparatuses.

Nowadays it is common to develop programs embedded in such microcomputer systems using high-level programming language, such as C, since in view of recent rapid increases in scale of embedded-type application software, it is no longer possible to realize the high-level processing required for these embedded programs in the old software development environment based on assembly language. Also, to develop the embedded programs that realize high-level processing using assembly language puts a considerable burden on engineers.

However, when compared with application software developed using assembly language, machine-language software developed using high-level language has a problem of high redundancy. Accordingly, manufacturers who intend to suppress the cost of their products are reluctant to use high-level programming language to develop embedded programs.

For embedded processors, programs have to be stored in ROM, so that increases in program size can greatly affect manufacturing cost. Also, when specific performance (execution speed) is required for products, more expensive microprocessors have to be used or microprocessors have to operate at higher clock speeds, due to an increase in the execution time of embedded programs.

Thus, there are the notable disadvantages in using high-level programming language to develop embedded programs. To allow greater use of high-level programming language when developing embedded software, it is necessary to establish a high-level optimization algorithm that can eliminate redundancy of the resulting software.

While there are many definitions of program redundancy, in the present specification program redundancy indicates all factors, that are present in a program written in high-level language or intermediate language, which cause increases in code size and execution time of a machine language program after compiling.

Before explaining conventional optimization apparatuses, the construction of conventional compiler apparatuses is explained below, with reference to the following publications.

(1) A. V. Aho, R. Sethi, J. D. Ullman (1986): Compilers: Principles, Techniques, and Tools, Addison-Wesley Publishing Company Inc. (translated in Japanese by Kenichi Harada (1990): Compilers I, II, Science company Inc.)

(2) Hans Zima (1991): Supercompilers for Parallel and Vector Computers, Addison-Wesley Publishing Company Inc. (translated in Japanese by Yoichi Muraoka (1995): Supercompilers, Oum Company Inc.)

(3) Masataka Sasa (1989): Programming Language Processing System, Iwanami

FIG. 1 shows the construction of a conventional compiler apparatus. In the figure, the compiler apparatus includes a syntax analysis apparatus 41, an optimization apparatus 42, and a code generation apparatus 49.

The syntax analysis apparatus 41 performs lexical analysis, syntax analysis, and semantic analysis on a source program which is stored as a file in a storage device (not illustrated) and converts the source program to an intermediate program to simplify the processing by the compiler apparatus. Here, each step (construct) in the intermediate program is called an intermediate instruction. Types of intermediate instructions include a quadruple, a triple, and an abstract syntax tree (reference (1): p.464). Such intermediate instructions are further converted to object code by the code generation apparatus 49. In this specification, the syntax analysis apparatus 41 is not explained in detail, since it is not especially related to optimization processing which is the main focus of the present invention.

The optimization apparatus 42 optimizes the intermediate program to reduce the program size and the execution time of the resulting machine language program. The optimization apparatus 42 includes an optimization control unit 43, a control flow information analysis unit 44, a data flow information analysis unit 45, an intermediate code optimization unit 46, a control flow information storage unit 47, and a data flow information storage unit 48.

The control flow information analysis unit 44 divides the intermediate program into basic blocks where the control flow is unidirectional, and obtains control flow information that shows the control flow between basic blocks (reference (1): p.528). The basic blocks are explained in detail later.

The data flow information analysis unit 45 analyzes the intermediate code using the control flow information and obtains data flow information showing reaching definitions, available expressions, and live variables (for more details, see reference (1): pp.608-722). This data flow information is obtained by finding information that is to be used as data flow information, setting data flow equations for the entry and exit points of the basic blocks, and calculating the data flow equations according to an iterative algorithm.

The intermediate code optimization unit 46 uses the control flow information and the data flow information to optimize the intermediate code. Examples of such optimization processing are deletion of basic blocks that are beyond control using the control flow information, optimization of common subexpressions using the information on available expressions (reference (1): p.592), copy propagation using the information on reaching definitions (reference (1): p.594), and deletion of unnecessary code using the information on live variables (reference (3): p.482). The optimization processing is explained in greater detail later.

The control flow information storage unit 47 stores the control flow information generated by the control flow information analysis unit 44.

The data flow information storage unit 48 stores the data flow information generated by the data flow information analysis unit 45.

The code generation apparatus 49 allocates registers or memory to variables written in the intermediate program and converts each intermediate instruction to a machine language instruction. This code generation apparatus 49 is not the main focus of the present invention and so is not explained in detail here.

The following is a more detailed explanation of the optimization apparatus 42, focusing on features related to the present invention, namely, the copy propagation using the information on reaching definitions and the optimization of common subexpressions using the information on available expressions.

First, the terms used in the explanation are introduced.

<Program Point>

Any point located between two adjacent intermediate instructions (reference (3):p.461).

<Basic Block> (reference (1): p.528)

When optimizing the program, there is a danger that the algorithm will be destroyed by rewriting instructions which include jump instructions or jump destinations. Accordingly, in the optimization processing, the execution order has to be unidirectional from the start to the end. Hence, in the intermediate program, each part (block) that includes neither a jump nor a jump destination is called a basic block, which is the minimum unit of the optimization processing. A point immediately before a first intermediate instruction in a basic block is called the entry point of the basic block, while a point immediately after a last intermediate instruction in the basic block is called the exit point of the basic block. A basic block is not divided when it includes a subroutine (function) call intermediate instruction, since optimization information can be analyzed more extensively if the program is divided into larger blocks.

In the following explanation, each basic block is a set of intermediate instructions, whereby "s.epsilon.B" expresses that intermediate instruction s belongs to basic block B. Basic blocks executed immediately before basic block B are called "preceding blocks", a set of preceding blocks of basic block B being expressed as "pred(B)" (reference (1): p.532). Also, basic blocks executed immediately after basic block B are called "succeeding blocks", a set of succeeding blocks of basic block B being expressed as "succ(B)". An example of basic blocks is shown in FIG. 2. Preceding blocks of basic block BLK2 are basic blocks BLK1 and BLK5, while succeeding blocks of basic block BLK2 are basic blocks BLK3 and BLK4. A basic block which is executed first and which does not have preceding blocks is called an "initial block".

<Definitions and Uses of Variables>

"to define" means to set a certain value in a variable, and "to use" or "to refer" means to use the set value (reference (3): p.419). In an intermediate instruction "s1: a=b op c" (a, b, c=variables, op=operator), intermediate instruction s1 is both a definition of variable a and a use (reference) of variables b and c. For simplicity's sake, hereinafter "s1(a=)" expresses that intermediate instruction s1 is a definition of variable a, and "s1(=b)" expresses that intermediate instruction s1 is a use (reference) of variable b.

<Reaching Definition>

A reaching definition is one of the fundamental elements in data flow information. If variable x is used in intermediate instruction s, definition d that sets the value of variable x is called a reaching definition that reaches, which is to say, is unbroken until, intermediate instruction s. While the word "reaching" is used in reference (3), the present specification also uses the word "unbroken" to express this concept.

If, in a route from definition d that defines variable x to intermediate instruction s, a different definition of variable x exists, definition d does not reach intermediate instruction s. If, on the other hand, no other definitions exist, definition d reaches intermediate instruction s. Such definitions d that reach intermediate instruction s are called reaching definitions.

Note here that variable x is not necessarily used in intermediate instruction s. Also, in the above explanation, "intermediate instruction s" can be replaced with "point p immediately before intermediate instruction s".

When a basic block has a plurality of preceding blocks, a plurality of definitions included in the plurality of preceding blocks may reach the same intermediate instruction in the basic block. An example of this case is shown in FIG. 5. In a route from intermediate instruction s17 that defines variable b2 in basic block B1 to intermediate instruction s33 that uses variable b2, intermediate instruction s17 is a definition that reaches intermediate instruction s33. In the same way, intermediate instruction s29 that defines variable b2 in basic block B2 is also a definition that reaches intermediate instruction s33. Thus, variable b2 in intermediate instruction s33 has the two reaching definitions located in the two preceding blocks. Also, though intermediate instruction s33 does not use variable t21, intermediate instruction s13 which defines variable t21 is a definition that is unbroken until intermediate instruction s33, when no other definitions that define variable t21 exist between intermediate instruction s13 and intermediate instruction s33.

<Data Flow Equations>

In order to obtain data flow information showing which definition(s) reaches each intermediate instruction, it is necessary to set data flow equations and solve the data flow equations using an iterative method (reference (3): pp.471-472). This data flow equations are explained below.

Here, a group of definitions that are generated in basic block B and that are unbroken until the exit point of basic block B is expressed as "GEN[B]". This group is expressed by Equation 2.

{Equation 2}

GEN[B]={s.vertline.s(x=).epsilon.B, where s'(x=) does not exist between s and exit point B}

Next, when there are a group of definitions that define variable x in basic blocks other than basic block B, and when intermediate instruction "s'(x=)" exists in basic block B, the group of definitions of variable x is expressed as "KILL[B]". This group is expressed by Equation 3.

{Equation 3}

KILL[B]={s.vertline.s(x=).delta.(basic blocks other than B), where s'(x=).epsilon.B exists}

Next, a group of definitions that potentially reach the entry point or basic block B is expressed as "IN[B]", while a group of definitions that potentially reach the exit point of basic block B is expressed as "OUT[B]". These groups are expressed by Equation 4.

{Equation 4}

IN[B]={s.vertline.where s'(x=) does not exist between intermediate instruction s(x=) and entry point B}

OUT[B]={s.vertline.where s'(x=) does not exist between intermediate instruction s(x=) and exit point B}

From the above equations, data flow equations are expressed by Equation 5.

{Equation 5}

IN[B]=.orgate.OUT [B'] (1)

B'.epsilon.pred(B)

OUT[B]=GEN[B].orgate.(IN[B]-KILL[B]) (2)

Here, ".orgate." indicates a set total, while "-" indicates a set difference. Equation 5(1) shows that definitions which are unbroken until the entry point of basic block B are a set total of definitions which are unbroken until the exit points of preceding blocks of basic block B. Equation 5(2) shows that definitions which are unbroken until the exit point of basic block B are a set total of definitions GEN[B] generated in basic block B and definitions which are unbroken until the entry point of basic block B and which define variables that are not defined in basic block B.

Equation 5 is a simultaneous equation in which IN[B] and OUT[B] are variables. By solving this equation, IN[B] that reaches basic block B is obtained. As a result, a reaching definition for each intermediate instruction in basic block B can be easily obtained in the execution order by changing the content of IN[B] as necessary (see reference (3): p.475).

<Iterative Algorithm for the Data Flow Equations>

An iterative algorithm is commonly used to solve the data flow equations as in the case of reaching definitions (reference (3): pp.473-474). FIG. 3 shows such an iterative algorithm.

Here, each sentence, such as "repeat", "for", and "if", and each operator, such as "=", "!=", and "==", are written in C, while "false" and "true" are respectively given values 0 and 1. Other algorithms described in this specification are written in the same way.

In the above algorithm, the calculations of IN[B] and OUT[B] are repeated until the value of OUT[B] of each basic block B no longer changes, that is, until the value converges, IN[B] and OUT[B] obtained as such are the solutions of the data flow equations. For the convergency of the iterative algorithm, it is usually necessary for the sets of data flow information to be a semi-lattice to a confluent calculation (reference (2): pp.79-88). For instance, in the case of reaching definitions, it is necessary to show that a group of reaching definitions is a semi-lattice to the confluence calculation ".orgate." for preceding blocks.

Also, it is necessary to show that function f which shows the effect of each basic block is a monotonic function. For example, in the case of reaching definitions, it is necessary to show a function "f(X)=GEN[B].orgate.(X-KILL[B])" is a monotonic function, where X is IN[B] in the equation for OUT[B].

<Use-Definition Chain Information, Definition-Use Chain Information> (reference (3): p.476)

Use-definition chain information is information, produced from the reaching definition information, that shows a list of definitions which are unbroken until each use of a variable. In FIG. 4, when intermediate instructions s12 and s24 that define variable b1 are reaching definitions for intermediate instruction s32 that uses variable b1, the use-definition chain information for variable b1 in intermediate instruction s32 is given as a list (s12,s24).

On the other hand, definition-use chain information shows a list of uses of a variable until which each definition is unbroken. In FIG. 4, when intermediate instruction s12 which defines variable b1 reaches intermediate instruction s32, definition-use chain information for variable b1 in intermediate instruction s12 is given as a list (s32). The same can be applied to intermediate instruction s24. Also, in FIG. 7, when intermediate instruction s5 which defines variable x4 is unbroken until intermediate instructions s16, s27, and s35 that each use variables x4, definition-use chain information for variable x4 in intermediate instruction s5 shows a list (s16,s27,s35).

In FIGS. 4-8, use-definition chain information and definition-use chain information are illustrated by dashed arrows. For example, in FIG. 4, the dashed arrow from s32 to s12 shows use-definition chain information for variable b1, while the dashed arrow from s12 to s32 shows definition-use chain information for variable b1.

<Available Expressions>

An available expression is also a fundamental element in data flow information. In each route from the start point of the program to intermediate instruction s, an expression "E: x op y" (op=operator) is evaluation of expression E and intermediate instruction s, expression E is available in intermediate instruction s. In the above explanation, "intermediate instruction s" can be replaced with "point p immediately before intermediate instruction s".

An example is given in FIG. 7. In the route from intermediate instruction s16 to intermediate instruction s35, when intermediate instruction s16 is the last evaluation of expression x4+y4 (that is to say, there is no intermediate instruction, other than s16, that executes expression x4+y4 between s16 and s35), when there are no definitions of variables x4 and y4, and when the same applies to s27, expression x4+y4 is an available expression in intermediate instruction s35.

The information on available expressions is obtained by defining data flow equations relating to the available expressions and by calculating the equations according to an iterative algorithm, in the same way as the information on reaching definitions (reference (3): pp476-479).

The following is an explanation of the copy propagation and the common subexpression optimization using the data flow information on reaching definitions and available expressions.

<Copy Propagation>

When there is an intermediate instruction "s: x=y" (hereinafter referred to as "copy"), and intermediate instruction s is the only definition that is unbroken until intermediate instruction s' which uses variable x, variable x in intermediate instruction s' is replaced with y (this replacement is called copy propagation). Also, if there is no intermediate instruction, aside from intermediate instruction s', that uses x, intermediate instruction s is deleted. In FIG. 6, intermediate instruction s6 is the only definition of variable t32 that reaches intermediate instructions s26 and s30, while intermediate instructions s26 and s30 are the only uses of variable t32 defined in intermediate instruction s6. Accordingly, variable t32 in each intermediate instruction s26 and s30 is replaced with x3, and intermediate instruction s6 is deleted (reference (3): p.445).

<Common Subexpression Optimization>

The common subexpression optimization is performed in order to avoid the evaluation (execution) of an expression that has already been evaluated.

In FIG. 7, for instance, expression x4+y4 is an available expression in intermediate instruction s35. Here, since expression x4+y4 has already been evaluated in intermediate instructions s16 and s27, it is unnecessary to evaluate expression x4+y4 in intermediate instruction s35. Accordingly, the common subexpression optimization is performed by introducing new variable w, rewriting intermediate instructions s16, s27, and s35, and writing new intermediate instructions s50 and s51 as copies, as shown in FIG. 8. This type of optimization is used when, despite the inclusion of new intermediate instructions s50 and s51, an overall reduction of cost (program size and execution time) is possible by canceling the evaluation of expression x4+y4 in intermediate instruction s35 (reference (3): p.446).

However, to perform optimization based on global dependency between a plurality of basic blocks using the data flow information, much analysis time is required to confirm that the copy propagation or the deletion of instructions is possible. Since a plurality of execution orders and a plurality of dependence relations exist when a plurality of basic blocks are activated by conditional branch instructions, the analysis of the global dependency has to be performed thoroughly. Also, thorough analysis has to be performed on the assumption of feedback-type dependency, in which the control flow of a branch origin basic block depends on the control flow of a branch destination basic block. Even when considerable time is spent in analyzing the global dependency which crosses over between basic blocks, it is still uncertain whether it is safe to perform the optimization on the basic blocks. For these reasons, the above types of optimization are problematic.

The following is a description on how analysis to confirm the safety of optimization is executed, with reference to FIGS. 4-7.

In FIG. 4, since variable a1 is equal to variable b1 in intermediate instruction s32, it appears that intermediate instruction s32 can be deleted. However, there are two definitions of variable b1, namely, intermediate instructions s12 and s24, that are unbroken until intermediate instruction s32. Accordingly, copy propagation cannot be performed on intermediate instructions s12 and s24, so that a different type of optimization needs to be executed. To execute the different type of optimization, it is necessary to check which intermediate instruction is a reaching definition for variable b1 in intermediate instruction s32, and how variable b1 is defined in that intermediate instruction. In the example shown in FIG. 4, it is necessary to check that intermediate instructions s12 and s24 are copies that each have variable a1 on the right side. Here, use-definition dependency for variable b1 in intermediate instruction s32 is relatively simple, so that the need for optimization can be determined by referring to use-definition chain information for variable b1.

In FIG. 5, on the other hand, there are more complex dependency relations among the basic blocks and accordingly use-definition chain information needs to be analyzed in more detail. In the figure, it appears that intermediate instruction s33 can be deleted according to the use-definition chain information and definition-use chain information.

However, in order to confirm the safety in deleting intermediate instruction s33, first it is necessary to check intermediate instruction s17 shown in the use-definition chain information for variable b2 in intermediate instruction s33 and detect the right side member (variable t21) in intermediate instruction s17. Then, by referring to the use-definition chain information for variable t21 in intermediate instruction s17, intermediate instruction s13 that defines variable t21 is obtained and its right side member (variable a2) is detected. The same processing has to be performed on intermediate instruction s29 shown in the use-definition chain information for variable b2 in intermediate instruction s33 and intermediate instruction s25 shown in the use-definition chain information for variable t22 in intermediate instruction s29. Thus, it is necessary to thoroughly check that variable b2 is equal to variable a2 in every case by tracing use-definition chain information step by step. As a result, it is confirmed that intermediate instruction s33 can be deleted.

In FIG. 6, dependency relations are further complex, so that it is necessary to analyze not only use-definition chain information but definition-use chain information. Though it appears that intermediate instruction s34 can be deleted, both use-definition chain information and definition-use chain information have to be analyzed in order to confirm the safety in deleting intermediate instruction s34. First, as in FIG. 5, intermediate instructions s18 and s15 are traced respectively from the use-definition chain information for variable b3 in intermediate instruction s34 and the use-definition chain information for variable t31 in intermediate instruction s18, in order to confirm that variable b3 is equal to variable a3. Next, intermediate instruction s30 is traced from the use-definition chain information for variable b3 in intermediate instruction s34, and intermediate instruction s6 is traced from the use-definition chain information for variable t32 in intermediate instruction s30. Then, intermediate instruction s26 is traced from the definition-use chain information for variable t32 in intermediate instruction s6. Finally, by checking the left side member (variable a3) in intermediate instruction s26, it is confirmed that variable a3 is equal to variable b3 in intermediate instruction s34. If the left side member in intermediate instruction s26 is not variable a3, it is necessary to trace other intermediate instructions shown in the definition-use chain information for variable t32 in intermediate instruction s6.

Thus, considerable analysis time is spent in judging that variable x is equal to variable y in an intermediate instruction "s: x=y" in order to confirm the safety in optimization, since use-definition chain information and definition-use chain information have to be traced step by step.

In FIG. 7, it appears that intermediate instruction s35 can be deleted, since in the preceding blocks variable a4 has the same value as expression x4+y4 as shown in the use-definition chain information and the definition-use chain information in intermediate instructions s16, s20, s27, and s31. However, in the common subexpression optimization described above, intermediate instruction s35 can be changed to a copy but cannot be deleted. Accordingly, optimization different from the common subexpression optimization is required. To confirm that variable a4 is equal to the value of expression x4+y4 in intermediate instruction s35, not only the use-definition chain information and definition-use chain information (as in FIGS. 4-6) but expression x4+y4 needs to be analyzed, so that the analysis time is further prolonged.

The above analysis is necessary to eliminate the dangers associated with optimizing the program. However, since it requires considerable analysis time, such optimization processing is far from practical. Besides, there is no guarantee that the analysis time can be completed within a specified period. Thus, the conventional optimization methods are not effective in optimizing the program over a plurality of basic blocks, as it cannot sufficiently reduce the redundancy between the plurality of basic blocks.

SUMMARY OF THE INVENTION

In view of the above problems, the first object of the present invention is to provide an optimization apparatus that can eliminate the dangers associated with program rewriting within a short analysis time and that can efficiently eliminate program redundancy.

The second object of the present invention is to provide an optimization apparatus that has a high-level analysis capability for global optimization over a plurality of basic blocks.

In view of the stated objects, the most significant feature of the present invention is to analyze, for each basic block, which inter-expression equivalence relations are present at the exit point of a basic block, and how the equivalence relations are distributed to the entry points of branch destination basic blocks for the basic block. The main problem associated with conventional optimization methods is the difficulty to analyze inter-expression equivalence relations between the exit point of a branch origin basic block and the entry point of a branch destination basic block. The present invention addresses this problem and realizes an optimization apparatus that can analyze global dependency in the program.

The optimization apparatus of the present invention is an optimization apparatus for optimizing a program by analyzing inter-expression equivalence relations which each show that an expression located on one of a left side and a right side of an instruction can be replaced with another expression without causing changes in a program execution, the program being divided into a plurality of basic blocks based on branch origin-branch destination relations, the optimization apparatus including: a first analysis unit for analyzing, for each basic block, equivalence relations among a plurality of expressions at an exit point of an analyzed basic block, the equivalence relations being maintained until an entry point of a basic block that is a branch destination for the analyzed basic block; a second analysis unit for analyzing, for each basic block, which equivalence relations at an entry point of an analyzed basic block disappear due to processing on each instruction in the analyzed basic block, and for analyzing equivalence relations that are newly produced by the processing of each instruction in the analyzed basic block; a qualification judgement unit for judging, for each basic block, whether an E_OUT equivalent expression set group, composed of equivalent expression sets found to have equivalence relations at an exit point of a basic block, qualifies for being used to optimize the basic block, after analysis by the second analysis unit; a repeat unit for having the first analysis unit and the second analysis unit repeat respective analyses, when the qualification judgement unit judges that an E_OUT equivalent expression set group of any of the plurality of basic blocks does not quality; and a block internal optimization unit for optimizing, when the qualification judgement unit judges that an E_OUT equivalent expression set group of each basic block qualifies, instructions in each basic block using an E_IN equivalent expression set group, composed of equivalent expression sets found to have equivalence relations at an entry point of each basic block, that is most recently obtained by the first analysis unit and the second analysis unit.

The inter-expression equivalence relation shows that an expression in the program can be replaced with another expression without causing changes in program execution. The present specification provides data flow equations to calculate the E_IN and E_OUT equivalent expression set groups, which respectively show all equivalent expression sets whose equivalence relations are established at the entry point and exit point of each basic block, from the E_GEN equivalent expression set group and E_PRE expression group obtained by analyzing each basic block. By repeatedly calculating the data flow equations until the convergence solutions are obtained, equivalent expression set groups between a plurality of basic blocks can be efficiently obtained, even when basic blocks are activated by conditional branch instructions in complex ways and thus various execution orders exist, or when feedback dependency is present in the program in which the control flow of branch destination basic block.

With the above construction, global equivalence between a plurality of expressions can be efficiently analyzed using the E.sub.13 IN and E.sub.13 OUT equivalent expression set groups that show generation and disappearance of equivalence relations between a plurality of basic blocks.

Thus, by repeatedly calculating the data flow equations for each basic block, all equivalent expression sets which establish equivalence at the entry point of each basic block can be obtained as an equivalent expression set group, regardless of an execution order of the plurality of basic blocks. Accordingly, the global optimization of the program can be executed safely using the obtained equivalent expression set groups.

Here, the first analysis unit may include a first calculation unit for regenerating, for each basic block, after an E.sub.13 OUT equivalent expression set group of each basic block is generated, an E.sub.13 IN equivalent expression set group of a basic block by performing an equivalence intersection calculation on E.sub.13 OUT equivalent expression set groups of basic blocks which have the basic block as a common branch destination, wherein the second analysis unit includes a second calculation unit for regenerating, for each basic block, an E.sub.13 OUT equivalent expression set group of the basic block by calculating "Formula 1" using the E.sub.13 IN equivalent expression set group regenerated by the first calculation unit, an E.sub.13 GEN equivalent expression set group generated by the E.sub.13 GEN equivalent expression set group generation unit for the basic block, and an E.sub.13 PRE expression group generated by the E.sub.13 PRE expression group generation unit for the basic block, wherein "Formula 1" is E.sub.13 OUT[B]=E.sub.13 GEN[B].orgate.e(E.sub.13 IN[B].andgate.eE.sub.13 PRE[B]), B representing a basic block in the program, .andgate.e representing an equivalence intersection operator, and .orgate.e representing an equivalence union operator.

With the above construction, the first and second calculation units repeat the equivalence union calculation and equivalence intersection calculation for calculating the data flow equations, until the equivalent expression set groups of each basic block become unchanged.

Since the data flow equations are monotonic functions, the process of repeatedly solving the data flow equations can be completed within a limited time period.

Also, the initial E.sub.13 OUT equivalent expression set group of each basic block is calculated from the E.sub.13 PRE expression group which is composed of expressions, among all expressions written in the program, that are not changed by successive processing of instructions in each basic block, and the E.sub.13 GEN equivalent expression set group which is generated in each basic block. Accordingly, the largest solutions can be obtained for each basic block by repeatedly solving the data flow equations.

Since an equivalent expression set group is calculated from the E.sub.13 IN equivalent expression set group of a basic block whenever an instruction in the basic block is executed, an expression present in an instruction can be replaced with an expression which has an equivalence relation with the expression. Thus, redundant instructions can be effectively rewritten. As a result, the program redundancy is eliminated, and the execution time and the program size of the resulting machine language program can be reduced.

Here, the qualification judgement unit may judge whether the E.sub.13 OUT equivalent expression set group regenerated by the second calculation unit qualifies for being used to optimize the basic block, wherein, when the qualification judgement unit judges that an E.sub.13 OUT equivalent expression set group of any of the plurality of basic blocks does not qualify, the first calculation unit newly generates, for each basic block, an E.sub.13 IN equivalent expression set group of a basic block by performing an equivalence intersection calculation on E.sub.13 OUT equivalent expression set groups of basic blocks which have the basic block as a common branch destination, and the second calculation unit newly generates, for each basic block, an E.sub.13 OUT equivalent expression set group of the basic block by calculating "Formula 1" using the E.sub.13 IN equivalent expression set group newly generated by the first calculation unit, the E.sub.13 GEN equivalent expression set group generated by the E.sub.13 GEN equivalent expression set group generation unit for the basic block, and the E.sub.13 PRE expression group generated by the E.sub.13 PRE expression group generation unit for the basic block.

With the above construction, the process of solving the data flow equations is repeated until the most appropriate solutions to be used for the global optimization of the plurality of basic blocks are obtained.

Here, the qualification judgement unit may include: a transient group storage unit for storing an E.sub.13 OUT equivalent expression set group generated by the second calculation unit as a transient group; a comparison unit for comparing, when a new E.sub.13 OUT equivalent expression set group is generated by the second calculation unit, the new E.sub.13 OUT equivalent expression set group with the transient group; and a judgement unit for judging that the transient group qualifies if the new E.sub.13 OUT equivalent expression set group and the transient group match, and judging that the transient group does not qualify if the new E.sub.13 OUT equivalent expression set group and the transient group are different.

With the above construction, inter-expression equivalence relations can be analyzed over a plurality of basic blocks in a short time using the data flow equations. As a result, the overall program redundancy can be eliminated, so that the execution time and the program size of the machine language program can be reduced.

Here, the block internal optimization unit may include a processing storage unit initialization unit for setting the E.sub.13 IN equivalent expression set group, most recently obtained by the first analysis unit for the basic block, in the processing storage unit, wherein the instruction optimization unit includes: a first redundancy elimination unit for replacing an expression in the retrieved instruction with an expression which has an equivalence relation with the expression by referring to equivalent expression sets in the E.sub.13 IN equivalent expression set group stored in the processing storage unit; and a second redundancy elimination unit for deleting, when the retrieved instruction is an assignment instruction whose expressions on both sides are included in one of the equivalent expression sets in the processing storage unit, the retrieved instruction from the program, and wherein the update unit includes: a first processing storage unit update unit for deleting, when the retrieved instruction is an assignment instruction whose left side member is a variable, expressions affected by processing of the assignment instruction from the equivalent expression sets in the processing storage unit; a second processing storage unit update unit for deleting, when the retrieved instruction is an assignment instruction whose left side member is an indirect calculation expression, expressions affected by variables which are potentially changed by processing of the assignment instruction from the equivalent expression sets in the processing storage unit; a third processing storage unit update unit for adding, when the retrieved instruction is an assignment instruction and the processing storage unit stores at least one equivalent expression set including an expression located on one side of the assignment instruction, an expression on another side of the assignment instruction to the equivalent expression sets, and for generating, when the processing storage unit stores no equivalent expression sets including an expression located on any side of the assignment instruction, an equivalent expression set composed of expressions on both sides of the assignment instruction and adding the generated equivalent expression set to the processing storage unit; a fourth processing storage unit update unit for deleting, when the retrieved instruction is a function call instruction, expressions affected by variables which was potentially changed by processing of the function call instruction from the equivalent expression sets stored in the processing storage unit; and an equivalence replacement optimization control unit for activating the processing storage unit initialization unit, then activating the first redundancy elimination unit and the second redundancy elimination unit for each instruction, and then successively activating the first processing storage unit update unit, the second processing storage unit update unit, the third processing storage unit update unit, and the fourth processing storage unit update unit.

With the above construction, when expressions on both sides of an assignment instruction are equivalent, such an assignment instruction can be deleted from the program.

Here, the block internal optimization unit may include a third redundancy elimination unit for replacing, when the retrieved instruction is an assignment instruction whose expression on a right side is included in an equivalent expression set in the processing storage unit along with a constant, the expression with the constant, and for replacing, when the retrieved instruction is an assignment instruction whose expression on the right side is one of a binary calculation expression and a monadic calculation expression with each variable used in the expression being included in an equivalent expression set in the processing storage unit along with a constant, each variable used in the expression with a corresponding constant, calculating the expression, and replacing the expression with the calculation result, wherein the equivalence replacement optimization control unit activates the processing storage unit initialization unit, then activates the first redundancy elimination unit, the second redundancy elimination unit, and the third redundancy elimination unit for each instruction, and then successively activates the first processing storage unit update unit, the second processing storage unit update unit, the third processing storage unit update unit, an the fourth processing storage unit update unit.

With the above construction, redundancy can be eliminated by replacing an expression with a constant if possible.

Here, the block internal optimization unit may include a fourth redundancy elimination unit for replacing, when the retrieved instruction is a conditional branch instruction and expressions on both sides of a conditional expression written in the conditional branch instruction are included in equivalent expression sets in the processing storage unit, the conditional expression with "(1)" if it is validated that a condition given by the conditional expression is definitely met, and replacing the conditional expression with "(0)" if it is validated that the condition given by the conditional expression is definitely unmet, wherein the equivalence replacement optimization control unit activates the processing storage unit initialization unit, then activates the first redundancy elimination unit, the second redundancy elimination unit, the third redundancy elimination unit, and the fourth redundancy elimination unit for each instruction, and then successively activates the first processing storage unit update unit, the second processing storage unit update unit, the third processing storage unit update unit, and the fourth processing storage unit update unit.

With the above construction, redundancy can be eliminated by replacing a conditional expression with either "(1)" or "(0)" which shows whether the condition is definitely met. Also, while redundancy of instructions in each basic block is eliminated in the optimization process, the equivalent expression set group is modified in accordance with changes resulting from the optimization process. As a result, the highly accurate equivalent expression set group can be obtained, so that the execution time and the program size of the machine language program can be reduced.

BRIEF DESCRIPTION OF THE DRAWINGS

These and other objects, advantages and features of the invention will become apparent from the following description thereof taken in conjunction with the accompanying drawings that illustrate a specific embodiment of the invention. In the drawings:

FIG. 1 shows the construction of a conventional compiler apparatus;

FIG. 2 shows an example of a plurality of basic blocks;

FIG. 3 shows an iterative algorithm for reaching definitions;

FIG. 4 shows an example program used in the description of the background art;

FIG. 5 shows another example program used in the description of the background art;

FIG. 6 shows another example program used in the description of the background art;

FIG. 7 shows another example program used in the description of the background art;

FIG. 8 shows another example program used in the description of the background art;

FIG. 9 shows the construction of the optimization apparatus 1 of the embodiment of the present invention;

FIG. 10 shows the construction of the equivalence information analysis unit 14 show in FIG. 9;

FIG. 11 shows an example program;

FIG. 12A shows an example of a POINT set;

FIG. 12B shows an example of a CHANGE set;

FIG. 13A shows the storage content of the universal expression storage unit 23;

FIG. 13B shows the storage content of the E.sub.13 GEN group storage unit 30;

FIG. 13C shows the storage content of the E.sub.13 PRE group storage unit 31;

FIG. 14A shows the transition of the storage content of the E.sub.13 GEN processing expression set storage unit 28;

FIG. 14B shows the transition of the storage content of the E.sub.13 PRE processing expression storage unit 29;

FIG. 14C shows the transition of the storage content of the former E.sub.13 OUT group storage unit 35;

FIG. 15A shows the initial content of the E.sub.13 IN group storage unit 32;

FIG. 15B shows the initial content of the E.sub.13 OUT group storage unit 33;

FIG. 16A shows the storage content of the E.sub.13 IN group storage unit 32 after the first loop processing;

FIG. 16B shows a result of an equivalence intersection calculation between the E.sub.13 IN equivalent expression set group and the E.sub.13 PRE expression group;

FIG. 16C shows the storage content of the E.sub.13 OUT group storage unit 33 after the first loop processing;

FIG. 17A shows the content of the E.sub.13 IN equivalent expression set group in the equivalence information storage unit 19;

FIG. 17B shows the content of the E.sub.13 OUT equivalent expression set group in the equivalence information storage unit 19;

FIG. 18A shows the transition of the storage content of the processing expression set storage unit 101 when basic block BLK2 is processed;

FIG. 18B shows the transition of the storage content of the processing expression set storage unit 101 when basic block BLK4 is processed;

FIG. 18C shows the transition of the storage content of the processing expression set storage unit 101 when basic block BLK5 is processed;

FIG. 19 shows the program obtained after the equivalence replacement optimization unit 16 optimizes the program shown in FIG. 11;

FIG. 20 shows an example of the process of generating the E.sub.13 GEN equivalent expression set group;

FIG. 21A shows an example of an indicated variable set;

FIG. 21B shows an example of a changed variable set;

FIG. 22 shows a calculation algorithm for the E.sub.13 GEN equivalent expression set groups;

FIG. 23 shows a calculation algorithm for the E.sub.13 PRE expression groups;

FIG. 24 shows a calculation algorithm for the equivalent expression set groups;

FIG. 25 shows an equivalence replacement algorithm;

FIG. 26 is a flowchart showing the processing of the variable utilizing expression judgement unit 26;

FIG. 27 is a flowchart showing the processing of the top layer of the E.sub.13 GEN group generation unit 25;

FIG. 28 is a flowchart showing the processing of the middle layer of the E.sub.13 GEN group generation unit 25;

FIG. 29 is a flowchart showing the processing of the bottom layer of the E.sub.13 GEN group generation unit 25;

FIG. 30 is a flowchart showing the processing of the bottom layer of the E.sub.13 GEN group generation unit 25;

FIG. 31 is a flowchart showing the processing of the bottom layer of the E.sub.13 GEN group generation unit 25;

FIG. 32 is a flowchart showing the processing of the top layer of the E.sub.13 PRE group generation unit 27;

FIG. 33 is a flowchart showing the processing of the middle layer of the E.sub.13 PRE group generation unit 27;

FIG. 34 is a flowchart showing the processing of the bottom layer of the E.sub.13 PRE group generation unit 27;

FIG. 35 is a flowchart showing the processing of the equivalent expression set group generation unit 24;

FIG. 36 is a flowchart showing the processing of the equivalent expression set group generation unit 24;

FIG. 37 is a flowchart showing the processing of the top layer of the equivalence replacement optimization unit 16;

FIG. 38 is a flowchart showing the processing of the middle layer of the equivalence replacement optimization unit 16;

FIG. 39 is a flowchart showing the processing of the bottom layer of the equivalence replacement optimization unit 16;

FIG. 40 is a flowchart showing the processing of the bottom layer of the equivalence replacement optimization unit 16;

FIG. 41 is a flowchart showing the processing of the bottom layer of the equivalence replacement optimization unit 16;

FIG. 42 is a flowchart showing the processing of the middle layer of the equivalence replacement optimization unit 16; and

FIGS. 43A, 43B, and 43C show concepts of the equivalence union calculation.

DESCRIPTION OF THE PREFERRED EMBODIMENT(S)

1. Introduction

Here, definitions, features, and terms of the main concepts of the present invention are explained. Also explained are a new type of data flow information called "equivalent expression sets", date flow equations for calculating the equivalent expression sets, iterative algorithms for solving the data flow equations, and optimizations using the equivalent expression sets.

1.1. Concepts Used in Program System of Present Embodiment

<Intermediate Instruction>

There are several types of intermediate instructions, as stated in the description of the background art. For ease of understanding, this specification will refer to a three-address format that is the most common type. However, it should be remembered that the methods of the present invention may also be applied to other kinds of intermediate instructions. The formats of the intermediate instructions are as follows.

(1) Assignment Intermediate Instructions "a=b"

In this format, a is a variable or an indirect calculation expression, and b is variable, a constant, a monadic operator, or binary operator. These operators comply to the rules of C language. Examples of this format are intermediate instructions s1 and s17 in FIG. 11.

(2) Conditional Branch Intermediate Instructions

These intermediate instructions have a format given as "if (conditional expression) goto (branch label)". The conditional expression must comply to the rules of C language. Examples of this format are intermediate instruction s8 and s39 in FIG. 11.

(3) Unconditional Branch Intermediate Instructions

These intermediate instructions have a format given as "goto (branch label)". An example of this format is intermediate instruction s24 in FIG. 11.

(4) Function (Subroutine) Call Intermediate Instructions

These intermediate instructions have a format given as "function name (real argument list)". An example of this format is intermediate instruction s20 in FIG. 11.

(5) Labels

These intermediate instructions are used to express specific positions within the intermediate program. Examples of labels are L1, L2, and L3 in FIG. 11.

<Expressions>

In the 1996 edition of JIS Handbook No. 58 Information Processing--Terminology, Codes, Data Codes, an "expression" is defined as a "language element that calculates a value from one or more operands". In this specification, the term "expression" is defined as a language element such as a variable, constant, monadic expression, or binomial. Out of the monadic expressions, expressions of the format "*(variable)", which use an indirect operator "*" as in C language, are called indirect calculation expressions. Note that the variable used in "*(variable)" format is called a pointer variable.

<Indicated Variable Sets>

When variable v is a pointer variable, the set of variables which may be indicated by variable v in intermediate instruction s is expressed as POINT[s,v]. This indicated variable set is found according to the method described in p648-p660 of reference (1), so that only a simple description is given here, with reference to FIG. 21A.

In FIG. 21A, it is supposed that there is no definition of pointer variable p from a point immediately after intermediate instructions s1, s2 to intermediate instruction s3.

In the present example, intermediate instructions s1 and s2 assign the addresses of variables a and b to pointer variable p, so that for intermediate instruction s3, pointer variable p indicates variables a and b. As a result, indicated variable set POINT [s3,p] for intermediate instruction s3 includes variables a and b. In intermediate instruction s3, variable a or variable b is converted, which is to say, intermediate instruction s3 is a definition of variable a or variable b. Also, in this example, intermediate instruction s4 also uses variable a or variable b.

This calculation of the indicated variable set is performed by the data flow information analysis unit 13 shown in FIG. 9, before the calculation of the equivalent expression sets in the present method.

<Changed Variable Sets>

The set of variables that may have their values changed as a result of intermediate instruction s calling function f is expressed as changed variable set CHANGE[s,f]. This changed variable set is found according to the method described in p648-p660 of reference (1), so that only a simple description is given here, with reference to FIG. 21B.

In FIG. 21B, the called relationship of a function in the intermediate program is shown in C language notation, with variable a as an external variable and function f1 including variables b1 and b2 as local variables. Intermediate instructions s1 and s2 pass over the addresses of variables b1 and b2 as real arguments and call function f2. Function f2 uses pointer variable p which is a virtual argument. This function f2 assigns a value to the expression indicated by pointer variable p, as shown by arrow y1, and a value to external variable a, as shown by arrow y2.

As a result, function call f2 in intermediate instruction s1 in function f1 may result in a change in the values of variables b1 and a, so that CHANGE[s1,f2] includes these variables b1 and a.

In the same way, variables b2 and a are included in CHANGE[s2,f2].

This calculation of the changed variable set is performed by the data flow information analysis unit 13 shown in FIG. 9, before the calculation of the equivalent expression sets of the present method.

<Equivalent Expression Sets>

In this specification, equivalence refers to when one expression a may be replaced with another expression b, or vice versa, at one point in the program. This means that at that point, expression a is equivalent to b, and vice versa. Expressions that exhibit such equivalence relations are called equivalent expressions.

An equivalent expression set is a grouping of all expressions that are mutually equivalent. In the present specification, variables, constants, monadic expressions, and binomials are defined as expressions, so that for the example in FIG. 7, the lack of a definition of variable a4 between a point directly after intermediate instruction s20 and a point directly before intermediate instruction s35 and between a point directly after intermediate instruction s31 and a point directly before intermediate instruction s35 means that expressions x4+y4 and a4 are mutually equivalent just before intermediate instruction s35, making an equivalent expression set (x4+y4,a4). From this equivalent expression set (x4+y4,a4), it is clear that x4+y4 can be replaced with expression a4, which means that intermediate instruction s35 can be deleted.

In general, equivalent expression sets are different at different points in a program, with the possibility of there being several equivalent expression sets at certain points in the program. For the example in FIG. 7, suppose that the aforementioned assumption is valid for variable a4. Also assume that there are no definitions of variables x4 and p41 in any of the routes from a position directly after intermediate instruction s5 to a position directly before intermediate instruction s35 and no definitions of variables y4, p42 in any of the routes from a position directly after intermediate instruction s7 to a position directly before intermediate instruction s35. In such a case, a plurality of equivalent expression sets given as {(x4,p41),(y4,p42),(a4,x4+y4)} will be present directly before intermediate instruction s35. When the same assumptions are valid, the equivalent expression sets, given as {(x4,p41),(y4,p42),(t42,x4+y4,a4)}, will be present immediately after intermediate instruction s31. This means that it is normal for there to be a group of equivalent expression sets (hereinafter the equivalent expression set group) for each point in the program.

In FIG. 7, intermediate instruction s5 establishes equivalency between variable x4 and variable p41, and since this equivalency is still unbroken at a point immediately before intermediate instruction s35, the equivalency between variable x4 and variable p41 is said to "reach" the point immediately before intermediate instruction s35.

1.2. Mathematical Notation and Terminology Used in this Specification

The following is a description of the symbols that are generally used in set theory. The symbol ".di-elect cons." means "belongs to the set", ".andgate." indicates the intersection of sets, ".orgate." indicates the union of sets, "-" indicates the difference between sets, and ".OR right." indicates a state where two sets are equal or where one set includes all members of another set. The symbol ".o slashed." represents the empty set. Sets where the intersection is the empty set are called "mutually exclusive". The symbol ".A-inverted." represents the universal quantifier and the symbol ".E-backward." represents an existential quantifier.

<Universal Set of Expressions E>

The set of all expressions in the intermediate program aside from conditional expressions is set as "E". As one example, when all of the expressions are as shown in FIG. 11, the total set of expressions E is as shown in FIG. 13A.

In general the total set of expressions E will be the finite set shown by Equation 6 below.

E={e1,e2, . . . en} Equation 6

In the following explanation, the total set is given as E, the subset group of E as P(E) (all of the possible subsets of E), and the subset group of the subsets P(E) (all of the possible subsets of P(E)) as P02(E). The subset group of the subsets P02(E) (all of the subsets of P(E)) is given as P04(E). In general, the set E and its subset groups are expressed by the following Equation 7. ##EQU1##

<PR(E) Set>

PR(E) is a set of subsets that belong to P02(E) and are mutually exclusive. This can be formally found by the following Equation 8.

PR(E)={S.vertline.S.di-elect cons.P02(E), Si.di-elect cons.S, Sj.di-elect cons.S, Si.noteq.Sj, Si.andgate.Sj=.phi.} Equation 8

As one example, the set {(e1,e2),(e3,e4),(e5,e6,e7)} belongs to PR(E), but the subsets in {(e1,e2,e3),(e3,e4)} both include the element e3, so that the set does not belong to PR(E).

It should be noted here that the equivalent expression set groups that will be described later belong to PR(E) due to this property.

<CM relation>

The CM relation for the set X.di-elect cons.P02(E) is as follows.

(1) When the elements A,B of X have an intersecting part, A,B are said to be in a CM relation. This is formally expressed below.

If A,B.di-elect cons.X, A.andgate.B.noteq..phi., A,B are said to be in a CM relation. This is expressed as [A cm B].

(2) When the elements A,B, and C of X are such that A and B are in a CM relation and B and C are in a CM relation, then A and C will also be in a CM relation. This is formally expressed below.

If A,B,C.di-elect cons.X, and A and B are in a CM relation and B and C are in a CM relation, then A and C are also in a CM relation. This means that

if A cm B and B cm C, the A cm C.

(3) The element A in X is in a CM relation with itself. This is formally expressed below.

A.di-elect cons.X, A is in a CM relation with itself. This means A cm A.

As one example when X={(e1,e2),(e2,e3),(e3,e4),(e5,e6)}and these elements are respectively named A, B, C and D, elements A and B have the common element (intersection) e2, while elements B and C have the intersection e3. so that A cm B, B cm C, and A cm C. Element D, however, does not have a common element with any of the elements A, B or C, so that none of A, B, and C has a CM relation with element D.

As a result of the above definition, it can be seen that a CM relation is a reflexive (property (3) above), symmetric (property (1)) and transitive (property (2)) relation, as described on page 351 of reference (2). This means that a CM relation is an equivalence relation. Based on CM relations, set X.di-elect cons.P02(E) can be divided into mutually exclusive subsets.

As one example, the set X={(e1,e3),(e2,e4,e6),(e3,e5), (e4,e8),(e5,e7,e9)} can be divided into the two subsets X1={(e1,e3),(e3,e5),(e5,e7,e9)} and X2={(e2,e4,e6),(e4,e,8)}due to the above CM relation.

It should be especially noted here that an element in one of the sets within a given one of the subsets resulting from the above division, such as e3 in X1, is not present in any of the sets within the other subset (in this case, any of the sets in X2).

<CM.sub.13 PT[X] set>

The set CM.sub.13 PT[X] is a set that is produced by dividing the set X(.di-elect cons.P02(E)) based on CM relations. Here, it should be clear that CM.sub.13 PT[X].di-elect cons.P04(E).

As one example, for the same set X={(e1,e3),(e2,e4,e6), (e3,e5),(e4,e8),(e5,e7,e9)}, CM.sub.13 PT[X]={{(e1,e3),(e3,e5),(e5,e7,e9)}, {(e2,e4,e6),(e4,e8)}}.

<E.sub.13 SET[Y]>

E.sub.13 SET[Y](.di-elect cons.P02(E)) is a grouping of sets S(.di-elect cons.(E)) that are composed of elements e(.di-elect cons.E) in the set E(.di-elect cons.P(E)) belonging to set P, for each element set P(.di-elect cons.P02(E)) belonging to the set Y(.di-elect cons.P04(E)). This is formally expressed by Equation 9 below.

Y.di-elect cons.P04(E), Equation 9

E.sub.13 SET[Y]={S.vertline.{i.di-elect cons.Y, S=.orgate.Ej}

where

Ej.di-elect cons.Pi. 9

As one example, for the above case where X={(e1,e3), (e2,e4,e6),(e3,e5),(e4,e8),(e5,e7,e9)}and Y=CM.sub.13 PT[X]={{(e1,e3),(e3,e5), (e5,e7,e9)}, {(e2,e4,e6), (e4,e8)}}(.di-elect cons.P04(E)), E.sub.13 SET[Y] is given by Equation 10 below. ##EQU2##

FIGS. 43A to 43C give a graphic depiction of the set E.sub.13 SET[CM.sub.13 PT[X]]. FIG. 43A is a Venn diagram for the sets given in Equation 11 below.

X1=(e1, e3) Equation 11

X2=(e2, e4, e6)

X3=(e3, e5)

X4=(e4, e8)

X5=(e5, e7, e9)

FIG. 43B shows the CM_PM[X] set and FIG. 43C shows the E.sub.13 SET[CM_PT[X]]. In these drawings, sets are joined at what were originally intersecting parts. As one example, the set X1, X3, and X5 of FIG. 43A are joined to form a single set (X1.orgate.X3.orgate.X5 in FIG. 43B) that is part of E_SET[CM_PT[X]]. Since the sets X1.orgate.X3.orgate.X5 and X2.orgate.X4 that are elements in E_SET[CM_PT[X]] have different elements, the set E_SET[CM_PT[X]] belongs to the set PR(E). This property is also exhibited in general by the sets in P 2(E) aside from X

Definition of the Equivalence Union Calculation (Definition of the [.orgate.e] Operator)

The calculation of sets shown as Equation 12 below is called the equivalence union calculation, with this being expressed using the operator [.orgate.e].

A, B.di-elect cons.PR(E) Equation 12

A.orgate.e B=E_SET[CM_PT[A.orgate.B]]

As one example when A={(e1,e3), (e2,e4,e6)}, B={(e1, e3), (e4,e8), (e5,e7,e9)}, A.orgate.e B is as shown by Equation 13 below.

A.orgate.e B=E_SET[CM_PT[A.orgate.B]]=E_SET[CM_PT[(e1,e3), (e2,e4,e6), Equation 13

(e4,e8), (e5,e7,e9)]]=E_SET[{(e1,e3)}, {(e2,e4,e6), (e4,e8)}, {(e5,e7,e9)}]={(e1,e3), (e2,e4,e6,e8), (e5,e7,e9)}

As should be more apparent from FIG. 43, the set E_[CM_PT[X]] itself is equal to a set that results from repetition of a calculation, for the elements of the set X(.di-elect cons.P (E)), that links sets with an intersection to form a single set until no more elements with intersections are left.

Properties of the Equivalence Union Operator .orgate.e

As shown on page 81 of reference (2), the operator .orgate.e for PR(E) has the following properties.

(1) A.orgate.e A=A (idempotency)

Proof: From the properties of the union [.orgate.] of sets, A.orgate.A=A and A.di-elect cons.PR(E), so that A.orgate.e A=E_SET[CM_PT[A.orgate.A]]=E_SET[CM_PT[A]]=A is satisfied.

(2) A.orgate.e B=B.orgate.e A (commutation)

Proof: From the properties of the union [.orgate.] of sets, A.orgate.B=B.orgate.A, so that the relationship A.orgate.e B=E_SET[CM_PT[A.orgate.B]]=E_SET[CM_PT[B.orgate.A]]=B.orgate.e A is satisfied.

(3) (A.orgate.e B).orgate.e C=A.orgate.e(B.orgate.e C) (association)

Proof: As should be clear from the definition of the operator .orgate.e, first performing E_SET[CM_PT[B.orgate.B]]=X and then performing E_SET[CM_PT[X.orgate.C]] is the equivalent of performing E_SET[CM_PT[A.orgate.B.orgate.C]]. As a result, (A.orgate.e B).orgate.e C=E_SET[CM_PT[A.orgate.B.orgate.C]].

In the same way, A .orgate.e(B.orgate.e C)=E_SET[CM_PT[A.orgate.B.orgate.C]], so that (A.orgate.e B).orgate.e C=A.orgate.e(B.orgate.e C) is also satisfied.

Definition of the Equivalence Intersection Calculation (Definition of the [.andgate.e]operator

The set calculation shown below as Equation 14 is called the equivalence intersection calculation, with this being expressed using the operator [.andgate.e].

A,B.di-elect cons.PR(E) Equation 14

A.andgate.e B={C.vertline.Ai.di-elect cons.A, C=Ai.andgate.Bi}

As one example, when A={(e1,e3), (e2,e4,e6)} and B={(e1,e3), (e4,e8), (e5,e7,e9)}, A .andgate.e B={(e1,e3), (e4)}.

Properties of the Equivalence Intersection Operator .andgate.e

As shown on page 81 of reference (3), the operator .andgate.e for PR(E) has the following properties.

(1) A.andgate.e B=A (idempotency)

Proof: When Ai.di-elect cons.A, Aj.di-elect cons.A, and i=j, Ai.andgate.Ai is satisfied. When i.noteq.j, Ai.andgate.AJ=.phi., so that A.andgate.e A=A is satisfied. As one example, when A=(A1,A2, . . . An), for element A1, A1.andgate.Aj=.phi.(n.gtoreq.j.gtoreq.l), so that only A1.andgate.A=A1 is satisfied.

(2) A.andgate.e B=B.andgate.e A (commutation)

Proof: This is clear from the properties of the intersection [.andgate.] or sets under definition of .andgate.e.

(3) (A.andgate.e B).andgate.e C=C=A.andgate.e(B.andgate.e C) (association)

Proof: This is clear from the properties of the intersection [.andgate.] of sets under the definition of .andgate.e.

(4) For all A.di-elect cons.PR(E), A.andgate.e.phi.=.phi., so that the zero element is .phi..

(5) For all A.di-elect cons.PR(E), A.andgate.e{E}=A, so that the one element is {E}.

From the above properties, it can be seen that the set PR(E) is a semi-lattice for the calculation .andgate.e, as shown in page 81 of the reference (3).

Distribution of the Operator .andgate.e for the Operator .orgate.e

The following is an explanation of the proposition in the following Equation 15 where the operator .andgate.e has been distributed for the operator .orgate.e, using the property described below as "proof of the monotonicity of the *function f(X)=E_GEN[B].orgate.e(X.andgate.e {E_PRE[B]})".

for A,B,C.di-elect cons.PR(E) Equation 15

.A-inverted.Wi.di-elect cons.(A.andgate.e C).orgate.e(B.andgate.e C),

.E-backward.Xj.di-elect cons.(A.orgate.e B).andgate.e C,

Wi.OR right.Xj . . . (A)

(where for all Wi.di-elect cons.(A.andgate.e C).orgate.e(A.andgate.e C),

Xj.di-elect cons.(A.orgate.e B).andgate.e C exists, and Wi.OR right.Xj)

Proof

First, from the definition of .andgate.e.

(1) .A-inverted.Ak.di-elect cons.A, .E-backward.X1.di-elect cons.A.orgate.e B, Ak.OR right.X1

(2) .A-inverted.Bm.di-elect cons.B, .E-backward.Xn.di-elect cons.A.orgate.e B, Bm.OR right.Xn

(3) From (1), for .A-inverted.Cp.di-elect cons.C

Ak.andgate.Cp.OR right.X1.andgate.Cp

Here, (Ak.andgate.Cp).di-elect cons.(A.andgate.e C), (X1.andgate.Cp).di-elect cons.((A.orgate.e B).andgate.e C) Ak, Cp are arbitrary values so that

.A-inverted.ACi.di-elect cons.(A.andgate.e C), .E-backward.Yj.di-elect cons.((A.orgate.e B).andgate.e C), and ACi.OR right.Yj are satisfied.

(4) In the same way as (3), from (2) for .A-inverted.Cq.di-elect cons.C, (Bm.andgate.Cq).OR right.(Xn.andgate.Cq)

Here, (Bm.andgate.Cq).di-elect cons.(B.andgate.e C), (Xn.andgate.Cq).di-elect cons.((A.orgate.e B).andgate.e C) Bk, Cq are arbitrary values, so that

.A-inverted.BCi.di-elect cons.(B.andgate.e C), .left brkt-top.Yk.di-elect cons.((A.orgate.e B).andgate.e C), and BCi.OR right.Ym

(6) From the definition of the operator .orgate.e, for Wi.di-elect cons.(A.andgate.e C).orgate.e(B.andgate.e C)

Dividend Pi={Ek, Ek+1, . . . En}

(Pi.di-elect cons.CM_PT[(A.andgate.e C).orgate.(B.andgate.e C)],

Ei.di-elect cons.((A.andgate.e C).orgate.(B.andgate.e C)) (k=1. . . n)) is present, with Wi=.orgate.Ej

where j=k . . . n

(7) In (6), when Pi only includes one element, (which is to say when Pi={Ek}),

Wi=Ek is valid, so that from (5),

Wi.OR right.Ym, so that

Ym.di-elect cons.((A.orgate.e B).andgate.e C) exists.

(8) Since the CM relation is transitive, when Pi has a plurality of elements in (6)

for any elements E1,En.di-elect cons.Pi, there is a value Et.di-elect cons.Pi (t=1, . . . n) so

E1.andgate.E2.noteq.=.phi.

Et-1.andgate.Et.noteq.=.phi.

Et.andgate.Et+1.noteq.=.phi.

En-1.andgate.En.noteq.=.phi.

(9) Here, from (5) it can be seen that Et.OR right.Yt, Et+1.OR right.Yt+1 (Yt,Yt+1.di-elect cons.((A.orgate.e B).andgate.e C)) are valid, and that Et.andgate.Et+1.noteq..phi., so that the common element Yc=Et.andgate.Et+1 is included in both Yt and Yt+1.

This means that

Yc.OR right.Yt and Yc.OR right.Yt+1

However, since elements that conform to ((A.orgate.e B).andgate.e C).di-elect cons.PR(E) are mutually exclusive,

Yt=Yrt=1

(10) Since E1, En are arbitrary elements in Pi, this means that all elements in Pi are included in Yt, and that Wi, which is the set union of all elements of Pi, is included in Yt. This means that for Wi, a value Yt.di-elect cons.((A.orgate.e B).andgate.e C) where Wi.OR right.Yt exists.

(11) Conclusion

(A) is proved from (7) and (10) (End of Proof).

Variable Utilizing Expression Set (VEXP[x,v])

A set of expressions that are affected by an updating of the value of the variable v in a given intermediate instruction s is called the variable utilizing expression set of the variable v for intermediate instruction s. This is expressed as VEXP[x,v]. This set is formally expressed by the following Equation 16.

VEXP[s,v]={e.vertline. Equation 16

{e.vertline.e is an expression of the following format that belongs to E. Here, "op" may be any operator aside from the address operator "&".

variable v

expression [op v] (a monadic expression with v as an operand)

expression [v op b] or expression [b op v] (binomial with v as an operand)

expression [*p], However, v .di-elect cons. POINT[s,p]}

1.3. Special Uses Within the Present Embodiment

E_Group

In basic block B, E_GEN[B] is the group of the equivalent expression sets described below. Putting this another way, these are the equivalence relations that occur in basic block B and reach the exit point of basic block B. This group E_GEN[B] is found according to the algorithm shown in FIG. 22.

In FIG. 22, "WSET" represents the work variables that store sets of expressions, "a" represents a variable or an indirect calculation expression, and "b" represents an expression (a variable, a constant, a monadic expression or a binomial). Statement(1) in FIG. 22 refers to the extraction of an expression that utilizes variable a from the elements in set X belonging to WSET. Statement(2) refers to the extraction of an expression that utilizes variable v which is indicated by pointer variable p from set X belonging to WSET. The processing in statement(3) adds expression b to set Y if set Y, to which expression a belongs, is already present as an element in the set WSET. The processing in statement(4) adds expression a to set Y if set Y to which expression b belongs is already present as an element of the set WSET. Statement (5) adds a new equivalent expression set that has expressions a and b as elements to the set WSET. Statement (6) extracts expression that use variable v, which may be updated as a result of a call of function f, from the elements of set X belonging to WSET.

Next, the transition in the set WSET during basic block B1 in FIG. 20 is described. Here, it is assumed that the variables indicated by pointer variable d in intermediate instruction s4 are variables a and b. This, is to say, POINT[s4,d]={a,b}. It is also assumed that the variable whose value changes as a result of intermediate instruction s6 calling function f is x. This is to say, CHANGE[s6,f]={x}.

First, in the processing for intermediate instruction s1, {(a,b+10)} is added to WSET in FIG. 22(5), so that WSET becomes {(a,b+10)}.

In the processing for intermediate instruction s2, the expression b+10 is deleted from set (a,b+10) in FIG. 22(1), so that the set WSET becomes {(a)}. Next, in FIG. 22(5), {(a,x+20)} is added to the set WSET, so that WSET becomes {(a), (b,x+20)}.

In the processing for intermediate instruction s3, {(c,x+y)} is added to the set WSET in FIG. 22(5), so that WSET becomes {(a), (b,x+20), (c,x+y)}.

In the processing for intermediate instruction s4 in FIG. 22(2), a and b are deleted from the expression sets (a) and (b,x+20) that include expressions that utilize a and b, so that WSET becomes {(x+20), (c,x+y)}. Next, in FIG. 22(5), {(*d,30(} is added to WSET, so that WSET becomes {(x+20), (c,x+y), (*d,30)}.

In the processing for intermediate instruction s5, (e,x+y) is added to the element (c,x+y) in WSET in FIG. 22(4), so that the WSET becomes {(x+20), (c,x+y,e), (*d,30)}.

In the processing for intermediate instruction s6, expressions that utilize expression x are deleted from WSET in FIG. 22(6), so that the WSET becomes {(c,e), (*d,30)}.

In the processing for intermediate instruction s7, (z,*d) is added to the element (*d,30) in WSET in FIG. 22(4), so that WSET becomes {(c,e), (*d,30,z)}.

In the processing for intermediate instruction s8, the value of variable a is changed in FIG. 22(1), so that *d is deleted from (*d,30,z) making WSET={(c,e), (30,z)}. In FIG. 22(5), {(z,x+z)} is added to WSET, so that WSET becomes {(c,e), (30,z), (a,x+z)}.

As a result of the above processing, the final state of E_GEN[B1] is {(c,e), (30,z), (a,x+z)}.

While it can be understood from the calculation algorithm of E_GEN[B] that equivalent variables generated by assignments will belong to the same set, it should also be noted that these sets are mutually exclusive, so that E_GEN[B] belongs to the set PR(E).

E_PRE GROUP

The E_PRE group is the group of expressions that are unaffected by the updating of variables that results from all intermediate instructions in the basic block B. The algorithm for finding E_PRE[B] is shown in FIG. 23.

1.4. Data Flow Equations that are Special to this Embodiment

Data Flow Equations that relate to Equivalent Expression Sets

The following is an explanation of the equivalent relations between expressions at the entry point and exit point of basic block B, using the terminology defined above. The data flow equations that relate to the equivalent expression set groups E_IN[S] and E_OUT[B] are shown as Equation 17 below.

E_IN[B]=.andgate.e E_OUT[P] Equation 17

P.di-elect cons.pred[B]

E_OUT[B]=E_GEN[B].orgate.e {E_IN[B].andgate.e {E_PRE[B]}}

The notion of "reaching" of equivalent relations used in these data flow equations is as follows. Only equivalent relations that are established at the exit point of every preceding block for basic block B are included in E_IN[B], with these equivalent relations being said to "reach" the entry point of basic block B. E_OUT[B] shows the equivalent relations that reach the exit point B, which are the equivalent relations E_GEN[B] that are produced within basic block B and the equivalent relations present at the entry point of basic block B that are unaffected by the processing in basic block B.

Proof of the Monotonicity of Function f(X)=E_GEN[B].orgate.e(X.andgate.e {E_PRE[B]})

As mentioned earlier in the definition of "reaching", to solve the data flow equations that relate to equivalent expression sets using iterative algorithms, the following proofs are necessary, as shown between pages 79 and 88 in reference (2). First, it is necessary to prove that for the confluent calculation ".andgate.e", PR(E) is a semi-lattice (this has already been shown). Second, it is necessary to prove the monotonicity of the function f(X)=E_GEN[B] .orgate.e(X .andgate.e {E_PRE[B])} in the data flow equations that relate to the aforementioned equivalent expression sets, where the term E_IN[B] on the right side of the E_OUT[B] function (showing the effect of basic block B) has been replaced by X. Here, E_GEN[B]. {E_PRE[B]} are respectively set as G and P for ease of explanation, with the monotonicity of f(X)=G .andgate.e (X .andgate.e P) being proved below. From pages 81 to 83 of reference (3), it is necessary to prove Equation 18 below to prove the monotonicity.

X, Y.di-elect cons.PR(E) Equation 18

f(X.andgate.e Y).andgate.e(f(X).andgate.e f(Y))=f(X.andgate.e Y)

However, from page 81 in reference (3), it can be seen that it is sufficient to prove the following Equation 19.

For any Ai.di-elect cons.f(X.andgate.e Y), a value Bj.di-elect cons.f(X).andgate.e f(Y) exists

where Ai.OR right.Bj Equation 19

Proof

First, by using f(X.andgate.e Y) and f(X).andgate.e f(Y) as actual examples of function f, the following equations are given.

f(X.andgate.e Y)=G.orgate.e((X.andgate.e Y).andgate.e P) (1)

f(X).andgate.e f(Y)=(G.orgate.e(X.andgate.e P)).andgate.e(G.orgate.e(Y.andgate.e P)) (2)

Here, the right side of formula (2) above further changes.

If the right side of formula (2) is thought of as being the A part of "G", the part of "(X.andgate.e P)", and the C part of "(G .orgate.e(Y.andgate.e P)", the C part may be distributed among the A part and B part as follows.

(G.andgate.e(G.orgate.e(Y.andgate.e P))).orgate.e((X.andgate.e P).andgate.e(G.orgate.e(Y.andgate.e P))) (3)

Rearranging this formula based on the commutation of .andgate.e give the following.

(3)=((G.orgate.e(Y.andgate.e P)).andgate.e G).orgate.e((G.orgate.e(Y.andgate.e P)).andgate.e(X.andgate.e P)) =S

If formula (3) is thought of as being the A part of the original "G", the B part of "(Y.andgate.e P)", the C part of "G", the A part of the next "G", the B part of the next "(Y.andgate.e P)", and the C part of the next "(X.andgate.e P)", the C part may be distributed among the A part and B part as follows.

((G.andgate.e G).orgate.e((Y.andgate.e P).andgate.e G)).orgate.e ((G.andgate.e(X.andgate.e P)).orgate.e((Y.andgate.e P).andgate.e(X.andgate.e P))) (4)

Rearranging this formula based on the idempotency, commutation, and association of .andgate.e gives the following.

(4)=(G.orgate.e(G.andgate.e(Y.andgate.e P))).orgate.e ((G.andgate.e(X.andgate.e P)).orgate.e((X.andgate.e Y).andgate.e P))

Rearranging this formula based on the commutation and association of the calculation .orgate.e.

(4)=(G.orgate.e((X.andgate.e Y).andgate.e P)).orgate.e((G.andgate.e(Y.andgate.eP)).orgate.e(G.andgate.e(X.andgate.e P)))

Here, since G .orgate.e((X.andgate.e Y).andgate.eP) in formula (4) is f(X.andgate.e Y) as given in formula (1).

(4)=f(X.andgate.e Y).orgate.e((G.andgate.e Y.andgate.e P).orgate.e(G.andgate.e X.andgate.e P))=T

This is solved as follows.

X,Y.di-elect cons.PR(E), .A-inverted.Xi.di-elect cons.x, .E-backward.Zj.di-elect cons.(X.orgate.e Y), xi.OR right.zj.

Therefore

.A-inverted.Ai.di-elect cons.f(X.andgate.e Y), .E-backward.Tj.di-elect cons.T, Ai.OR right.Tj.

Based on the results of the "distribution of the operator .andgate.e into the operator .orgate.e" described earlier, the following relationships are established.

.A-inverted.Ti.di-elect cons.T, .E-backward.Sj.di-elect cons.S, Ti.OR right.Sj

.A-inverted.Si.di-elect cons.S, .E-backward.Bj.di-elect cons.(f(X).andgate.e f(Y)), Si.OR right.Bj

This means

.A-inverted.Ai.di-elect cons.f(X.andgate.e Y), .E-backward.Bj.di-elect cons.(f(X).andgate.e f(Y)), Ai.OR right.Bj is established. (End of proof)

1.5. Algorithms specially used in this Embodiment

Iterative Algorithm for the Equivalent Expression Set Groups

The iterative algorithm for solving the data flow equations relating to the equivalent expression set groups is shown in FIG. 24.

The algorithm of FIG. 24 repeatedly performs a calculation for all basic blocks B until there are no more changes to E_OUT[B] of any the basic blocks. The values of the E_IN[B] and E_OUT[B] once these groups have converged are the solutions for the data flow equations.

As is apparent from FIG. 24(1), the equivalent relations at the exit point of basic block B are set as the equivalent relations that are produced by the execution of the processing in basic block B shown as E_GEN[B]. Also, all expressions that are unaffected by basic block B are assumed to have equivalent relations with other expression by adding the expression set groups that have only E_PRE[B] as elements through an equivalence union calculation. By doing so, larger equivalence set groups can be found, which enables the largest solution to the data flow equations to be found.

Equivalence Replacement Algorithm

FIG. 25 shows an optimal example of an equivalence replacement algorithm that uses the equivalent expression set groups found by the iterative algorithm for the equivalent expression set groups. The processing in this algorithm is performed for one basic block at a time.

2.1. Overall Explanation of the Embodiment

The following is an explanation of an embodiment of an optimization apparatus of the present invention with reference to the figures. As a method for distributing and retailing optimization apparatuses that is familiar to those skilled in the art, software that achieves the functions of the present embodiment may be recorded onto a recording medium and then distributed and sold as packaged software. A customer who has bought this packaged software may install it into a standard computer, and the standard computer may thereafter operate in accordance with the installed software to achieve the functioning of the optimization apparatus.

When considering the widespread use of this software format, it is more appropriate to regard the functions of an optimization apparatus as software recorded on a recording medium, rather than hardware resources such as a processor or memory that are provided in a standard computer. Software that realizes complex processing will generally be composed of a plurality of subroutines and work areas, with it being appropriate to consider each subroutine and work area as an independent component. The following explanation describes the functions that need to be realized by subroutines and work areas to achieve the functioning of the optimization apparatus. It should be noted here that there is no need to develop entirely new components for this optimization apparatus, so that it is normal to use files that are stored in library format by an existing operating system, compiler, or optimization apparatus. The present specification will no describe the components that can be realized by such conventional subroutines and work areas in detail.

2.2. Overall Configuration of Optimization Apparatus

FIG. 9 shows the configuration of the optimization apparatus 1 in the embodiment of the present invention. As shown in FIG. 9, the optimization apparatus 1 is composed of an optimization control unit 11, a control flow information analysis unit 12, a data flow information analysis unit 13, an equivalence information analysis unit 14, an intermediate code optimization unit 15, an equivalence replacement optimization unit 16, a control flow information storage unit 17, a data flow information storage unit 18, an equivalence information storage unit 19, one a processing expression set storage unit 101. These components are described below.

The optimization control unit 11 controls the entire optimization process.

The control flow information analysis unit 12 is activated by the optimization control unit 11 and, as in the background art, divides an intermediate program into basic blocks where the control flow is unidirectional. The control flow information analysis unit 12 also produces control flow information showing the control flow between basic blocks.

The data flow information analysis unit 13 is activated by the optimization control unit 11 and, as in the background art, uses the control flow information described above to further analyze the intermediate code. By doi