ANDF compiler using the HPcode-Plus compiler intermediate language
Patent number: 5339419
Abstract
A computer software compiler system and method for distributing a machine independent computer program, created on a native computer platform, to heterogeneous target computer platforms. The system is comprised of a producer component and one or more installer components. The producer component receives the machine independent computer program as input and generates a compiler intermediate representation in a machine independent manner according to an HPcode-Plus compiler intermediate language. The compiler intermediate representation is architecture neutral and represents an architecture neutral distribution format (ANDF). The compiler intermediate representation is distributed to heterogeneous target computer platforms where the installer components reside. The installer components receive the compiler intermediate representation as input and generate object code representations in a machine dependent manner according to the HPcode-Plus compiler intermediate language, such that the object code representations are architecture dependent, or machine dependent, on the target computer platforms.
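Illustrative sketch (not part of the patent text): the producer/installer split described in the abstract can be pictured with a toy model in which the producer leaves `sizeof(int)` symbolic and each installer resolves it for its own platform. The names `Target`, `producer`, `installer`, and the `SIZEOF`/`CONST` opcodes are assumptions made for this example and are not HPcode-Plus instructions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    """Hypothetical description of one target platform."""
    name: str
    int_bytes: int


def producer(source_tokens):
    """Emit architecture-neutral instructions; machine dependent facts stay symbolic."""
    ir = []
    for tok in source_tokens:
        if tok == "sizeof(int)":
            ir.append(("SIZEOF", "int"))   # deferred: no number is chosen here
        else:
            ir.append(("CONST", int(tok)))
    return ir


def installer(ir, target):
    """Resolve the deferred decisions for one target and emit toy 'object code'."""
    code = []
    for op, arg in ir:
        value = target.int_bytes if op == "SIZEOF" else arg
        code.append(f"{target.name}: load {value}")
    return code


ir = producer(["7", "sizeof(int)"])          # produced once
code_16 = installer(ir, Target("m16", 2))    # installed on each platform
code_32 = installer(ir, Target("m32", 4))
```

The same `ir` list is handed to both installers unchanged, which is the property the abstract calls architecture neutral.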
Claims
What is claimed is:
1. A computer software compiler system, adapted for use with a machine independent computer program that may use machine dependent standard header files, said computer software compiler system comprising:
one or more heterogeneous computer platforms;
a producer, implemented in one of the one or more heterogeneous computer platforms, which receives the machine independent computer program as input and which generates a compiler intermediate representation of the machine independent computer program, said compiler intermediate representation comprising compiler intermediate instructions from a compiler intermediate language, wherein said producer generates said compiler intermediate representation in a machine independent manner, such that machine dependent decisions are deferred, and such that said compiler intermediate representation is architecture neutral and represents an architecture neutral distribution format; and
one or more installers, implemented in the one or more heterogeneous computer platforms, said one or more installers receiving said compiler intermediate representation as input and generating object code representations of the machine independent computer program, said object code representations being machine dependent on the one or more heterogeneous computer platforms upon which said one or more installers reside.
2. The computer software compiler system of claim 1, wherein said producer comprises:
(1) means for assigning unique key words to standard type identifiers, object-like macro identifiers, and/or function-like macro identifiers which are defined in the machine dependent standard header files and referenced by the machine independent computer program;
(2) means for translating references to said standard type identifiers, object-like macro identifiers, and/or function-like macro identifiers in the machine independent computer program to said compiler intermediate instructions by using said unique key words to refer to said standard type identifiers, object-like macro identifiers, and/or function-like macro identifiers, such that machine dependent decisions concerning said standard type identifiers, object-like macro identifiers, and/or function-like macro identifiers are deferred to said one or more installers, said compiler intermediate instructions forming part of said compiler intermediate representation.
3. The computer software compiler system of claim 1, wherein said producer further comprises:
(1) means for defining and declaring data types, functions, and/or variables contained in the machine independent computer program by using said compiler intermediate instructions for creating and assigning unique symbolic identifiers to said data types, said functions, and/or said variables, said compiler intermediate instructions forming part of said compiler intermediate representation;
(2) means for translating references to said data types, said functions, and/or said variables in the machine independent computer program to said compiler intermediate instructions by using said unique symbolic identifiers to refer to said data types, said functions, and/or said variables such that machine dependent decisions concerning said instructions having said data types, said functions, and/or said variables are deferred to said one or more installers, said compiler intermediate instructions forming part of said compiler intermediate representation.
4. The computer software compiler system of claim 1, wherein said producer further comprises:
(1) means for translating constant expressions, which are not guaranteed to fold without overflow, contained in the machine independent computer program to said compiler intermediate instructions without evaluating said constant expressions, said compiler intermediate instructions forming part of said compiler intermediate representation;
(2) means for assigning unique symbolic identifiers to said constant expressions; and
(3) means for translating references to said constant expressions in the machine independent computer program to said compiler intermediate instructions by using said unique symbolic identifiers to refer to said constant expressions, such that machine dependent decisions concerning said constant expressions are deferred to said one or more installers, said compiler intermediate instructions forming part of said compiler intermediate representation.
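Illustrative sketch (not part of the claims): one way to picture the deferred constant folding of claim 4. The producer folds a constant expression only when the result fits the minimum `int` range that ANSI C guarantees (-32767 to 32767); otherwise it stores the unevaluated expression under a symbolic identifier, and the installer evaluates it with the target's `int` width. The function names, tuple layouts, and the restriction to `+` and `*` are assumptions of this example.

```python
# Range every conforming ANSI C implementation must support for int.
SAFE_MIN, SAFE_MAX = -32767, 32767


def _apply(op, a, b):
    return a + b if op == "+" else a * b


def fold_or_defer(op, a, b, symtab):
    """Producer side: fold only if the result is safe on every platform."""
    result = _apply(op, a, b)
    if SAFE_MIN <= result <= SAFE_MAX:
        return ("CONST", result)
    symid = len(symtab) + 1            # hypothetical identifier numbering
    symtab[symid] = (op, a, b)         # stored unevaluated
    return ("SYMREF", symid)


def evaluate_deferred(symtab, symid, int_bits):
    """Installer side: evaluate with the target's int width, reporting overflow."""
    op, a, b = symtab[symid]
    result = _apply(op, a, b)
    lo, hi = -(2 ** (int_bits - 1)), 2 ** (int_bits - 1) - 1
    if not lo <= result <= hi:
        raise OverflowError(f"{a} {op} {b} overflows a {int_bits}-bit int")
    return result


symtab = {}
small = fold_or_defer("+", 2, 3, symtab)       # folded by the producer
large = fold_or_defer("*", 300, 300, symtab)   # deferred to the installer
```

On a 32-bit target `evaluate_deferred(symtab, 1, 32)` yields 90000, while a 16-bit target raises `OverflowError`, so the machine dependent outcome is decided only at install time.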
5. The computer software compiler system of claim 1, wherein said producer further comprises a conversion means for conversion of operands in the machine independent computer program from a first data type to a second data type by inserting said compiler intermediate instructions into said compiler intermediate representation, said first data type and said second data type being from a set of data types, said conversion means operating in a machine independent manner such that machine dependent decisions concerning said conversion of operands are deferred to said one or more installers.
6. The computer software compiler system of claim 1, wherein said producer further comprises:
(1) means for assigning two part sequences to character constants in the machine independent computer program, each of said two part sequences comprising:
(a) a first part, said first part identifying a first character set; and
(b) a second part, said second part identifying said character constant within said first character set;
(2) means for translating instructions having said character constants in the machine independent computer program to said compiler intermediate instructions by using said two part sequences to refer to said character constants, such that machine dependent decisions concerning said character constants are deferred to said one or more installers, said compiler intermediate instructions forming part of said compiler intermediate representation.
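Illustrative sketch (not part of the claims): the two part sequences of claim 6 can be modeled as a pair (character set, position within that set) that the installer maps to the target's execution character set. The identifier `CHARSET_LATIN`, the use of the Unicode code point as the position, and the use of Python codec names to stand in for execution character sets are all assumptions of this example.

```python
# Hypothetical character set identifier; the claim does not fix any numbering.
CHARSET_LATIN = 0


def encode_char_constant(ch):
    """Producer side: describe the character without choosing a byte value."""
    return (CHARSET_LATIN, ord(ch))


def resolve_char_constant(seq, execution_codec):
    """Installer side: pick the byte value used by the target's execution set."""
    charset, position = seq
    if charset != CHARSET_LATIN:
        raise ValueError(f"unknown character set: {charset}")
    return chr(position).encode(execution_codec)[0]


seq = encode_char_constant("A")
ascii_value = resolve_char_constant(seq, "ascii")    # ASCII-based target
ebcdic_value = resolve_char_constant(seq, "cp037")   # EBCDIC-based target
```

The same pair resolves to 0x41 on the ASCII target and 0xC1 on the EBCDIC (code page 037) target.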
7. The computer software compiler system of claim 1, wherein said compiler intermediate representation is distributed to said one or more heterogeneous computer platforms.
8. The computer software compiler system of claim 1, wherein each of said one or more installers comprise means to make said machine dependent decisions that were deferred from said producer.
9. A computer software compiler system, adapted for use with a machine independent computer program that may use machine dependent standard header files, said computer software compiler system comprising:
one or more heterogeneous computer platforms;
a producer, implemented in one of the one or more heterogeneous computer platforms, which receives the machine independent computer program as input and which generates a compiler intermediate representation of the machine independent computer program, wherein said producer generates said compiler intermediate representation in a machine independent manner according to an HPcode-Plus compiler intermediate language, such that said compiler intermediate representation comprises HPcode-Plus instructions from said HPcode-Plus compiler intermediate language, and such that said compiler intermediate representation is architecture neutral and represents an architecture neutral distribution format; and
one or more installers, implemented in the one or more heterogeneous computer platforms, each of said one or more installers receiving said compiler intermediate representation as input and generating object code representations of the machine independent computer program, said object code representations being machine dependent on the one or more heterogeneous computer platforms upon which said one or more installers reside.
10. The computer software compiler system of claim 9, wherein said producer comprises:
(1) a preprocessor, which translates the machine independent computer program to an expanded machine independent computer program; and
(2) a compiler, which translates said expanded machine independent computer program to said compiler intermediate representation;
wherein said expanded machine independent computer program and said compiler intermediate representation are machine independent.
11. The computer software compiler system of claim 10, wherein said compiler intermediate representation is distributed to said one or more heterogeneous computer platforms.
12. The computer software compiler system of claim 11, further adapted for use with a machine configuration file, the machine configuration file containing information describing a target computer platform, the target computer platform being one of the one or more heterogeneous computer platforms and having a register architecture, a memory architecture, and an instruction set, wherein each of said one or more installers which reside on said target computer platform comprise:
(1) a tuple-generator, which receives a second ANDF header file and said compiler intermediate representation as input and which translates said HPcode-Plus instructions contained in said compiler intermediate representation into quadruple instructions;
(2) a low-level code generator, which receives said quadruple instructions and the machine configuration file as input and which translates said quadruple instructions into low-level instructions, said low-level instructions being from the instruction set, said low-level instructions being stored in a low-level compiler intermediate representation;
(3) a register allocator, which receives said low-level compiler intermediate representation and the machine configuration file as input and which maps said low-level instructions to the register architecture of the target computer platform to produce machine instructions;
(4) an object file generator, which receives said machine instructions as input and which translates said machine instructions to said object code representation of the machine independent computer program, said object code representation being stored in an object code file;
wherein said tuple-generator, said low-level code generator, said register allocator, and said object file generator operate in a machine dependent manner according to said HPcode-Plus compiler intermediate language, such that said tuple-generator, said low-level code generator, said register allocator, and said object file generator make said machine dependent decisions that were deferred from said producer; and
wherein said object code representation is machine dependent upon said target computer platform.
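Illustrative sketch (not part of the claims): the four installer stages of claim 12 chained as plain functions. It assumes, purely for illustration, a stack-style input with made-up `LDC` and `ADD` instructions, a machine configuration given as a dictionary, and a text encoding standing in for an object file; none of these formats come from the patent, and the second ANDF header file input is omitted.

```python
def tuple_generator(ir):
    """Translate stack-style instructions into quadruples (op, arg1, arg2, result)."""
    stack, quads = [], []
    for ins in ir:
        temp = f"t{len(quads)}"            # one fresh temporary per quadruple
        if ins[0] == "LDC":
            quads.append(("const", ins[1], None, temp))
        elif ins[0] == "ADD":
            right, left = stack.pop(), stack.pop()
            quads.append(("add", left, right, temp))
        else:
            raise ValueError(f"unsupported instruction: {ins[0]}")
        stack.append(temp)
    return quads


def low_level_code_generator(quads, config):
    """Select target mnemonics from the machine configuration."""
    return [(config["opcodes"][op], a, b, res) for op, a, b, res in quads]


def register_allocator(low_level, config):
    """Map temporaries to registers; spilling is not modeled in this sketch."""
    free, assigned, machine = list(config["registers"]), {}, []
    for mnemonic, a, b, res in low_level:
        operands = [assigned.get(x, x) for x in (a, b) if x is not None]
        if not free:
            raise RuntimeError("out of registers")
        assigned[res] = free.pop(0)
        machine.append((mnemonic, assigned[res], *operands))
    return machine


def object_file_generator(machine):
    """Stand-in for object code: one encoded text line per machine instruction."""
    text = "\n".join(" ".join(str(f) for f in ins) for ins in machine)
    return text.encode("ascii")


config = {"opcodes": {"const": "LI", "add": "ADD"}, "registers": ["r1", "r2", "r3"]}
quads = tuple_generator([("LDC", 2), ("LDC", 3), ("ADD",)])
machine = register_allocator(low_level_code_generator(quads, config), config)
obj = object_file_generator(machine)
```

Swapping in a different `config` changes the mnemonics and registers without touching the input instructions, which mirrors how the machine configuration file carries the machine dependent decisions.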
13. The computer software compiler system of claim 12, wherein said compiler further comprises means for creating data types by using SYM HPcode-Plus instructions for creating and assigning unique symbolic identifiers to said data types, said SYM HPcode-Plus instructions forming part of said compiler intermediate representation and having a syntax
SYM <symid> <sym kind> <sym info>
wherein
<symid> is a field which represents one of said unique symbolic identifiers and which uniquely identifies an item;
<sym kind> is a field which contains symbolic kind information that describes said item;
<sym info> is a field which contains symbolic information that further describes said item;
wherein said item represents one of said data types.
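Illustrative sketch (not part of the claims): a reader for the three SYM fields, assuming a whitespace-separated text rendering of the instruction. The claim gives only the field order, so the textual encoding, the `Sym` record, and the sample `<sym info>` contents are assumptions of this example.

```python
from typing import NamedTuple


class Sym(NamedTuple):
    symid: int   # unique symbolic identifier
    kind: str    # contents of the <sym kind> field
    info: str    # contents of the <sym info> field, left uninterpreted here


def parse_sym(line):
    """Parse 'SYM <symid> <sym kind> <sym info>' from one text line."""
    parts = line.split(maxsplit=3)
    if len(parts) < 3 or parts[0] != "SYM":
        raise ValueError(f"not a SYM instruction: {line!r}")
    info = parts[3] if len(parts) == 4 else ""
    return Sym(int(parts[1]), parts[2], info)


entry = parse_sym("SYM 12 KIND_SVAR int counter")
```

Everything after the kind is kept as one string because the layout of `<sym info>` depends on the kind.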
14. The computer software compiler system of claim 13, wherein said compiler further comprises:
(1) means for locating first occurrences of repeated instruction sequences in said expanded machine independent computer program;
(2) means for translating said first occurrences of said repeated instruction sequences to said HPcode-Plus instructions to produce macros, said macros forming part of said compiler intermediate representation;
(3) means for creating and assigning unique symbolic identifiers to said macros by using said SYM HPcode-Plus instructions, said SYM HPcode-Plus instructions forming part of said compiler intermediate representation, wherein said item refers to said macros, and wherein said <sym kind> field has a value KIND_MACRO;
(4) means for locating all occurrences of said repeated instruction sequences; and
(5) means for translating said occurrences of said repeated instruction sequences to said HPcode-Plus instructions by using said unique symbolic identifiers to refer to said macros.
15. The computer software compiler system of claim 14, wherein said compiler further comprises means for creating and assigning unique symbolic identifiers to constants contained in said expanded machine independent computer program by using said SYM HPcode-Plus instructions, said SYM HPcode-Plus instructions forming part of said compiler intermediate representation, wherein said item refers to said constants, and wherein said <sym kind> field can have a value from one or more of KIND_MEMBER, KIND_OFFSETOF, KIND_SIZEOF, KIND_MAXOF, KIND_MINOF, and KIND_CONST.
16. The computer software compiler system of claim 15, wherein said compiler further comprises:
(1) means to define and declare functions and variables contained in said expanded machine independent computer program by using said SYM HPcode-Plus instructions for creating unique symbolic identifiers and assigning said unique symbolic identifiers to said functions and said variables, said SYM HPcode-Plus instructions forming part of said compiler intermediate representation, wherein said item refers to said functions and said variables, and wherein said <sym kind> field can have a value from one or more of KIND_FUNCTION, KIND_FUNC_DCL, KIND_FPARAM, KIND_SVAR, and KIND_DVAR;
(2) means for translating references to said functions and said variables in said expanded machine independent computer program to said HPcode-Plus instructions by using said unique symbolic identifiers to refer to said functions and said variables such that machine dependent decisions concerning said functions and said variables are deferred to said one or more installers, said HPcode-Plus instructions forming part of said compiler intermediate representation.
17. The computer software compiler system of claim 16, wherein said compiler further comprises:
(1) means for translating constant expressions, which are not guaranteed to fold without overflow, contained in said expanded machine independent computer program to said HPcode-Plus instructions without evaluating said constant expressions, said HPcode-Plus instructions forming part of said compiler intermediate representation;
(2) means for creating and assigning unique symbolic identifiers to said constant expressions by using said SYM HPcode-Plus instructions, said SYM HPcode-Plus instructions forming part of said compiler intermediate representation, wherein said item refers to one of said constant expressions, and wherein said <sym kind> field has a value KIND_NEW_CONST;
(3) means for translating references to said constant expressions in said expanded machine independent computer program to said HPcode-Plus instructions by using said unique symbolic identifiers to refer to said constant expressions, such that machine dependent decisions concerning said constant expressions are deferred to said one or more installers, said HPcode-Plus instructions forming part of said compiler intermediate representation.
18. The computer software compiler system of claim 17, wherein said tuple generator comprises:
(1) means for creating a symbol table, said symbol table comprising symbol table entries for said unique symbolic identifiers assigned to said variables, said functions, and said macros;
(2) means for storing in said symbol table entries said symbolic information from said SYM HPcode-Plus instructions used for creating and assigning said unique symbolic identifiers to said variables, said functions, and said macros;
(3) means for creating a type table, said type table comprising type table entries for said unique symbolic identifiers assigned to said data types;
(4) means for storing in said type table entries said symbolic information from said SYM HPcode-Plus instructions used for creating and assigning said unique symbolic identifiers to said data types.
19. The computer software compiler system of claim 18, wherein said tuple generator further comprises:
(1) means for locating and evaluating said HPcode-Plus instructions associated with said constant expressions to produce evaluated constant expressions, said evaluated constant expressions being machine dependent upon the target computer platform;
(2) means for storing in said symbol table entries said evaluated constant expressions and said unique symbolic identifiers which are associated with said evaluated constant expressions.
20. The computer software compiler system of claim 19, wherein said low-level code generator comprises:
(1) means for mapping said variables to the memory architecture of the target computer platform by referring to the machine configuration file and to said symbol table;
(2) means for mapping said data types to the memory architecture of the target computer platform by referring to the machine configuration file and to said type table.
21. The computer software compiler system of claim 12, wherein said preprocessor comprises:
(1) means for locating standard type identifiers, object-like macro identifiers, and function-like macro identifiers which are defined in the machine dependent standard header files and referenced by the machine independent computer program;
(2) means for accessing first ANDF header files, said first ANDF header files being related to the machine dependent standard header files and comprising object-like macro definitions and function-like macro definitions which assign unique key words to said standard type identifiers, object-like macro identifiers, and function-like macro identifiers contained in the machine dependent standard header files; and
(3) means for substituting said unique key words for said standard type identifiers, object-like macro identifiers, and function-like macro identifiers contained in the machine independent computer program to produce said expanded machine independent computer program.
22. The computer software compiler system of claim 21, wherein said compiler comprises means for translating references to said unique key words in said expanded machine independent computer program to said HPcode-Plus instructions by using unique symbolic identifiers having negative values to refer to said unique key words, such that machine dependent decisions concerning said instructions containing said unique key words are deferred to said one or more installers, said HPcode-Plus instructions forming part of said compiler intermediate representation.
23. The computer software compiler system of claim 22, wherein said tuple generator comprises:
(1) means for locating said HPcode-Plus instructions having said unique negative symbolic identifiers;
(2) means for accessing said second ANDF header files, said second ANDF header files being related to the machine dependent standard header files and comprising data type definitions and macro definitions which correspond to said unique negative symbolic identifiers, said data type definitions and said macro definitions being defined by and machine dependent upon said target computer platform;
(3) means for replacing said unique negative symbolic identifiers with said data type definitions and said macro definitions.
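Illustrative sketch (not part of the claims): the flow of claims 21 to 23 for a standard identifier such as `size_t`. The producer substitutes a unique key word taken from a first ANDF header and refers to it by a negative symbolic identifier; each installer replaces that identifier with the definition from its own second ANDF header. The key word spellings, the identifier numbers, and the two sample target definitions are assumptions of this example.

```python
# First ANDF header (producer side): standard identifier -> unique key word.
FIRST_ANDF_HEADER = {"size_t": "__andf_size_t", "EOF": "__andf_EOF"}

# Compiler: unique key word -> negative symbolic identifier.
KEYWORD_SYMIDS = {"__andf_size_t": -1, "__andf_EOF": -2}

# Second ANDF headers (installer side): one per target platform.
SECOND_ANDF_HEADER_A = {-1: "unsigned int", -2: "-1"}
SECOND_ANDF_HEADER_B = {-1: "unsigned long", -2: "-1"}


def producer_reference(identifier):
    """Translate a standard identifier to its negative symbolic identifier."""
    return KEYWORD_SYMIDS[FIRST_ANDF_HEADER[identifier]]


def installer_resolve(symids, second_header):
    """Replace negative identifiers with the target's own definitions."""
    return [second_header[s] if s < 0 else s for s in symids]


refs = [producer_reference("size_t"), 7]   # 7 stands for an ordinary positive symid
on_a = installer_resolve(refs, SECOND_ANDF_HEADER_A)
on_b = installer_resolve(refs, SECOND_ANDF_HEADER_B)
```

Using negative values keeps the reserved identifiers from colliding with the positive identifiers the producer assigns to the program's own symbols.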
24. The computer software compiler system of claim 12, wherein said compiler further comprises:
(1) means for identifying constant values with indeterminate data types in said expanded machine independent computer program;
(2) means for storing information in said compiler intermediate representation regarding said constant values with indeterminate data types by using CLDC HPcode-Plus instructions, said CLDC HPcode-Plus instructions forming part of said compiler intermediate representation and having a syntax
CLDC <flag> <constant value>
wherein
<flag> = a flag which indicates a data type set, said data type set comprising one or more data types, said one or more data types being organized sequentially from a first data type to a last data type;
<constant value> = a field which equals one of said constant values with indeterminate data types;
wherein said information is sufficient to enable said one or more installers to determine said data types of said constant values with indeterminate data types, such that machine dependent decisions concerning said constant values with indeterminate data types are deferred to said one or more installers.
25. The computer software compiler system of claim 24, wherein said low-level code generator further comprises means for assigning a data type to said <constant value> field from said CLDC HPcode-Plus instruction, said data type being from said data type set indicated by said <flag> field from said CLDC HPcode-Plus instruction, wherein said assigning means operates according to the ANSI-C computer programming language to assign said data type to said <constant value>.
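Illustrative sketch (not part of the claims): one reading of how an installer could assign a type to a CLDC constant. ANSI C gives an unsuffixed integer constant the first type in an ordered list that can represent its value, so the sketch walks the type set from first to last using the target's widths. The two type sets follow the ANSI C (C89) lists for decimal and for octal or hexadecimal constants; how `<flag>` actually encodes a set is not stated in the claims, so passing the tuple directly is an assumption.

```python
# Ordered type sets, first to last, as a <flag> value might select them.
DECIMAL = ("int", "long", "unsigned long")
HEX_OR_OCTAL = ("int", "unsigned int", "long", "unsigned long")


def type_range(name, bits):
    """Value range of an integer type of the given width (two's complement assumed)."""
    if name.startswith("unsigned"):
        return 0, 2 ** bits - 1
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


def assign_type(type_set, value, widths):
    """Return the first type in the set that can represent the value on this target."""
    for name in type_set:
        lo, hi = type_range(name, widths[name])
        if lo <= value <= hi:
            return name
    raise OverflowError(f"{value} fits no type in {type_set}")


# Hypothetical machine configurations: bit widths per type.
T16 = {"int": 16, "unsigned int": 16, "long": 32, "unsigned long": 32}
T32 = {"int": 32, "unsigned int": 32, "long": 32, "unsigned long": 32}
```

For example, the decimal constant 40000 becomes `long` on the 16-bit-int target and `int` on the 32-bit-int target, which is why the producer cannot fix the type in advance.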
26. The computer software compiler system of claim 12, further adapted for use with an ANSI-C computer programming language, wherein said compiler further comprises means to perform integral promotions specific to the ANSI-C computer programming language upon operands contained in said expanded machine independent computer program by using ICVT HPcode-Plus instructions, said ICVT HPcode-Plus instructions forming part of said compiler intermediate representation.
27. The computer software compiler system of claim 26, wherein said low-level code generator further comprises:
(1) means for locating said ICVT HPcode-Plus instructions in said compiler intermediate representation, said ICVT HPcode-Plus instructions operating to perform said integral promotions upon operands for integral promotion in said compiler intermediate representation; and
(2) means for performing said integral promotions upon said operands for integral promotion according to said ICVT HPcode-Plus instructions and the ANSI-C computer programming language.
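Illustrative sketch (not part of the claims): the machine dependent decision behind an ICVT instruction. Under the ANSI C integral promotions, a narrow integer operand becomes `int` if `int` can represent every value of the original type, and `unsigned int` otherwise. The function name and its parameters are assumptions of this example, and two's complement ranges are assumed.

```python
def integral_promotion(bits, is_unsigned, int_bits):
    """Promoted type for a char/short-sized operand on a target with the given int width."""
    src_max = 2 ** bits - 1 if is_unsigned else 2 ** (bits - 1) - 1
    int_max = 2 ** (int_bits - 1) - 1
    # int wins whenever it can hold every value of the source type.
    return "int" if src_max <= int_max else "unsigned int"


promoted_16 = integral_promotion(16, True, 16)   # unsigned short, 16-bit int
promoted_32 = integral_promotion(16, True, 32)   # unsigned short, 32-bit int
```

An `unsigned short` therefore promotes to `unsigned int` where `int` is 16 bits wide but to `int` where it is 32 bits wide, so the producer records only that a promotion is needed.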
28. The computer software compiler system of claim 12, wherein said compiler further comprises means for preparing two operands for processing by arithmetic operations by using an ACVT HPcode-Plus instruction to convert said two operands to a common data type, said two operands and said arithmetic operations being contained in said expanded machine independent computer program, said ACVT HPcode-Plus instruction forming part of said compiler intermediate representation, said ACVT HPcode-Plus instruction converting said two operands to said common data type by using conversion rules that are specific to the ANSI-C computer programming language, such that machine dependent decisions concerning said two operands are deferred to said one or more installers.
29. The computer software compiler system of claim 28, wherein said low-level code generator further comprises:
(1) means for locating said ACVT HPcode-Plus instructions in said compiler intermediate representation, said ACVT HPcode-Plus instructions operating to convert operands for arithmetic operations in said compiler intermediate representation to a common data type; and
(2) means for converting said operands for arithmetic operations to said common data type according to said ACVT HPcode-Plus instructions and the ANSI-C computer programming language.
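Illustrative sketch (not part of the claims): choosing the common data type for an ACVT instruction under the ANSI C (C89) usual arithmetic conversions. The only machine dependent case shown is `long` paired with `unsigned int`, where the result depends on whether `long` can represent every `unsigned int` value. The function name and the `widths` dictionary are assumptions, and operands are assumed to have already undergone integral promotion.

```python
def common_type(a, b, widths):
    """Common type of two already-promoted operands on a target with the given widths."""
    for floating in ("long double", "double", "float"):
        if floating in (a, b):
            return floating
    if "unsigned long" in (a, b):
        return "unsigned long"
    if {a, b} == {"long", "unsigned int"}:
        uint_max = 2 ** widths["int"] - 1
        long_max = 2 ** (widths["long"] - 1) - 1
        return "long" if uint_max <= long_max else "unsigned long"
    if "long" in (a, b):
        return "long"
    if "unsigned int" in (a, b):
        return "unsigned int"
    return "int"


mixed_16 = common_type("long", "unsigned int", {"int": 16, "long": 32})
mixed_32 = common_type("long", "unsigned int", {"int": 32, "long": 32})
```

With a 16-bit `int` the pair converts to `long`; with `int` and `long` both 32 bits wide it converts to `unsigned long`.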
30. The computer software compiler system of claim 12, wherein said compiler further comprises:
(1) means for assigning two part sequences to character constants in said expanded machine independent computer program, each of said two part sequences comprising:
(a) a first part, said first part identifying a first character set; and
(b) a second part, said second part identifying said character constant within said first character set;
(2) means for translating instructions having said character constants in said expanded machine independent computer program to said HPcode-Plus instructions by using said two part sequences to refer to said character constants, such that machine dependent decisions concerning said character constants are deferred to said one or more installers, said HPcode-Plus instructions forming part of said compiler intermediate representation.
31. The computer software compiler system of claim 30, wherein said low-level code generator further comprises means to assign character constant values to said two part sequences, said character constant values being from an execution character set of the target computer platform, said execution character set being machine dependent on the target computer platform.
32. The computer software compiler system of claim 12, wherein each of said one or more installers further comprise:
(1) a low-level optimizer, which optimizes said low-level instructions to produce optimized low-level instructions, said optimized low-level instructions being stored in said low-level compiler intermediate representation;
(2) a machine specific optimizer, which optimizes said machine instructions to produce optimized machine instructions;
wherein said low-level optimizer and said machine specific optimizer operate in a machine dependent manner.
33. The computer software compiler system of claim 10, wherein said producer further comprises:
(1) a High-Level Optimizer, which optimizes said compiler intermediate representation to produce an optimized compiler intermediate representation;
(2) an Archiver/Linker, which archives or links said optimized compiler intermediate representation with other compiler intermediate representations to produce an HPcode-Plus Archive file or a Linked HPcode-Plus file;
wherein said High-Level Optimizer and said Archiver/Linker operate in a machine independent manner such that said optimized compiler intermediate representation, said HPcode-Plus Archive file, and said Linked HPcode-Plus file are machine independent.
34. The computer software compiler system of claim 33, wherein said HPcode-Plus archive file or said Linked HPcode-Plus file is distributed to said one or more heterogeneous computer platforms.
35. The computer software compiler system of claim 34, further adapted for use with a machine configuration file and an extracting device which extracts said compiler intermediate representation from said HPcode-Plus archive file or said Linked HPcode-Plus file, the machine configuration file containing information describing a target computer platform, the target computer platform being one of the one or more heterogeneous computer platforms and having a register architecture, a memory architecture, and an instruction set, wherein each of said one or more installers which reside on said target computer platform comprise:
(1) a tuple-generator, which receives a second ANDF header file and said compiler intermediate representation as input and which translates said HPcode-Plus instructions contained in said compiler intermediate representation into quadruple instructions;
(2) a low-level code generator, which receives said quadruple instructions and the machine configuration file as input and which translates said quadruple instructions into low-level instructions, said low-level instructions being from the instruction set, said low-level instructions being stored in a low-level compiler intermediate representation;
(3) a register allocator, which receives said low-level compiler intermediate representation and the machine configuration file as input and which maps said low-level instructions to the register architecture of the target computer platform to produce machine instructions;
(4) an object file generator, which receives said machine instructions as input and which translates said machine instructions to said object code representation of the machine independent computer program, said object code representation being stored in an object code file;
wherein said tuple-generator, said low-level code generator, said register allocator, and said object file generator operate in a machine dependent manner according to said HPcode-Plus compiler intermediate language, such that said tuple-generator, said low-level code generator, said register allocator, and said object file generator make said machine dependent decisions that were deferred from said producer; and
wherein said object code representation is machine dependent upon said target computer platform.
36. A computer software compiler system, adapted for use with a machine independent computer program that may use machine dependent standard header files, said computer software compiler system comprising:
one or more heterogeneous computer platforms;
a producer, implemented in one of the one or more heterogeneous computer platforms, which receives the machine independent computer program as input and which generates a compiler intermediate representation of the machine independent computer program, wherein said producer generates said compiler intermediate representation in a machine independent manner according to an HPcode-Plus compiler intermediate language, such that said compiler intermediate representation comprises HPcode-Plus instructions from said HPcode-Plus compiler intermediate language, and such that said compiler intermediate representation is architecture neutral and represents an architecture neutral distribution format; and
one or more installer interpreters which receive said compiler intermediate representation as input and which execute said HPcode-Plus instructions contained therein without converting said compiler intermediate representation to object code representations of the machine independent computer program.
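Illustrative sketch (not part of the claims): an installer interpreter in the sense of claim 36 executes the intermediate instructions directly and never emits object code. The toy evaluator below assumes a stack-style instruction stream with made-up `LDC`, `ADD`, and `MPY` mnemonics; the real HPcode-Plus instruction set is not reproduced here.

```python
def interpret(ir):
    """Execute a list of toy stack instructions and return the final result."""
    stack = []
    for ins in ir:
        op = ins[0]
        if op == "LDC":                    # push a constant
            stack.append(ins[1])
        elif op in ("ADD", "MPY"):         # pop two operands, push the result
            right, left = stack.pop(), stack.pop()
            stack.append(left + right if op == "ADD" else left * right)
        else:
            raise ValueError(f"unsupported instruction: {op}")
    return stack.pop()


# (2 + 3) * 4, evaluated without generating any object code.
result = interpret([("LDC", 2), ("LDC", 3), ("ADD",), ("LDC", 4), ("MPY",)])
```

The trade-off is the usual one: no install-time translation step, at the cost of decoding each instruction every time it runs.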
37. A computer software compiler method, adapted for use with a machine independent computer program that may use machine dependent standard header files, and with one or more heterogeneous computer platforms, for translating the machine independent computer program to object code representations of the machine independent computer program, such that said object code representations are machine dependent on the one or more heterogeneous computer platforms, said computer software compiler method comprising the steps of:
(a) a producer step for translating the machine independent computer program to a compiler intermediate representation of the machine independent computer program in a machine independent manner according to a compiler intermediate language, said compiler intermediate representation comprising compiler intermediate instructions from said compiler intermediate language, such that machine dependent decisions are deferred, and such that said compiler intermediate representation is architecture neutral and represents an architecture neutral distribution format;
(b) distributing said compiler intermediate representation to the one or more heterogeneous computer platforms; and
(c) an installer step for translating said compiler intermediate representation to said object code representations of the machine independent computer program in a machine dependent manner according to said compiler intermediate language, such that said object code representations are machine dependent on the one or more heterogeneous computer platforms.
38. The computer software compiler method of claim 37, wherein said producer step comprises the steps of:
(a) assigning unique key words to standard type identifiers, object-like macro identifiers, and/or function-like macro identifiers which are defined in the machine dependent standard header files and referenced by the machine independent computer program;
(b) translating references to said standard type identifiers, object-like macro identifiers, and/or function-like macro identifiers in the machine independent computer program to said compiler intermediate instructions by using said unique key words to refer to said standard type identifiers, object-like macro identifiers, and/or function-like macro identifiers, such that machine dependent decisions concerning said standard type identifiers, object-like macro identifiers, and/or function-like macro identifiers are deferred to said installer step, said compiler intermediate instructions forming part of said compiler intermediate representation.
39. The computer software compiler method of claim 37, wherein said producer step further comprises the steps of:
(a) defining and declaring data types, functions, and/or variables contained in the machine independent computer program by using said compiler intermediate instructions for creating and assigning unique symbolic identifiers to said data types, said functions, and/or said variables, said compiler intermediate instructions forming part of said compiler intermediate representation;
(b) translating references to said data types, said functions, and/or said variables in the machine independent computer program to said compiler intermediate instructions by using said unique symbolic identifiers to refer to said data types, said functions, and/or said variables such that machine dependent decisions concerning said instructions having said data types, said functions, and/or said variables are deferred to said installer step, said compiler intermediate instructions forming part of said compiler intermediate representation.
40. The computer software compiler method of claim 37, wherein said producer step further comprises the steps of:
(a) translating constant expressions, which are not guaranteed to fold without overflow, contained in the machine independent computer program to said compiler intermediate instructions without evaluating said constant expressions, said compiler intermediate instructions forming part of said compiler intermediate representation;
(b) assigning unique symbolic identifiers to said constant expressions; and
(c) translating references to said constant expressions in the machine independent computer program to said compiler intermediate instructions by using said unique symbolic identifiers to refer to said constant expressions, such that machine dependent decisions concerning said constant expressions are deferred to said installer step, said compiler intermediate instructions forming part of said compiler intermediate representation.
41. The computer software compiler method of claim 37, wherein said producer step further comprises the step of converting operands in the machine independent computer program from a first data type to a second data type by inserting said compiler intermediate instructions into said compiler intermediate representation, said first data type and said second data type being from a set of data types, said converting step operating in a machine independent manner such that machine dependent decisions concerning said conversion of operands are deferred to said installer step.
42. The computer software compiler method of claim 37, wherein said producer step further comprises the steps of:
(a) assigning two part sequences to character constants in the machine independent computer program, each of said two part sequences comprising:
(1) a first part, said first part identifying a first character set; and
(2) a second part, said second part identifying said character constant within said first character set;
(b) translating instructions having said character constants in the machine independent computer program to said compiler intermediate instructions by using said two part sequences to refer to said character constants, such that machine dependent decisions concerning said character constants are deferred to said installer step, said compiler intermediate instructions forming part of said compiler intermediate representation.
43. The computer software compiler method of claim 37, wherein said installer step comprises the step of making said machine dependent decisions that were deferred from said producer step.
44. A computer software compiler method, adapted for use with a machine independent computer program that may use machine dependent standard header files, and with one or more heterogeneous computer platforms, for translating the machine independent computer program to object code representations of the machine independent computer program, such that said object code representations are machine dependent on the one or more heterogeneous computer platforms, said computer software compiler method comprising the steps of:
(a) a producer step for translating the machine independent computer program to a compiler intermediate representation of the machine independent computer program in a machine independent manner according to an HPcode-Plus compiler intermediate language, said compiler intermediate representation comprising HPcode-Plus instructions from said HPcode-Plus compiler intermediate language, such that said compiler intermediate representation is architecture neutral and represents an architecture neutral distribution format;
(b) an installer step for translating said compiler intermediate representation to said object code representations of the machine independent computer program in a machine dependent manner according to said HPcode-Plus compiler intermediate language, such that said object code representations are machine dependent on the one or more heterogeneous computer platforms.
45. The computer software compiler method of claim 44, wherein said producer step comprises the steps of:
(a) a preprocessor step for translating the machine independent computer program to an expanded machine independent computer program;
(b) a compiler step for translating said expanded machine independent computer program to said compiler intermediate representation;
wherein said expanded machine independent computer program and said compiler intermediate representation are machine independent.
46. The computer software compiler method of claim 45, further comprising the step of distributing said compiler intermediate representation to said one or more heterogeneous computer platforms.
47. The computer software compiler method of claim 46, further adapted for use with a machine configuration file, the machine configuration file containing information describing a target computer platform, the target computer platform being one of the one or more heterogeneous computer platforms and having a register architecture, a memory architecture, and an instruction set, wherein said installer step comprises the steps of:
(a) a tuple-generator step for receiving a second ANDF header file and said compiler intermediate representation as input and for translating said HPcode-Plus instructions contained in said compiler intermediate representation into quadruple instructions;
(b) a low-level code generator step for receiving said quadruple instructions and the machine configuration file as input and for translating said quadruple instructions into low-level instructions, said low-level instructions being from the instruction set, said low-level instructions being stored in a low-level compiler intermediate representation;
(c) a register allocator step for receiving said low-level compiler intermediate representation and the machine configuration file as input and for mapping said low-level instructions to the register architecture of the target computer platform to produce machine instructions;
(d) an object file generator step for receiving said machine instructions as input and for translating said machine instructions to said object code representation of the machine independent computer program, said object code representation being stored in an object code file;
wherein said tuple-generator step, said low-level code generator step, said register allocator step, and said object file generator step operate in a machine dependent manner according to said HPcode-Plus compiler intermediate language, such that said tuple-generator step, said low-level code generator step, said register allocator step, and said object file generator step make the machine dependent decisions that were deferred from said producer step; and
wherein said object code representation is machine dependent upon said target computer platform.
48. The computer software compiler method of claim 47, wherein said compiler step further comprises the step of creating data types by using SYM HPcode-Plus instructions for creating and assigning unique symbolic identifiers to said data types, said SYM HPcode-Plus instructions forming part of said compiler intermediate representation and having a syntax
SYM <symid> <sym kind> <sym info>
wherein
<symid> is a field which represents one of said unique symbolic identifiers and which uniquely identifies an item;
<sym kind> is a field which contains symbolic kind information that describes said item;
<sym info> is a field which contains symbolic information that further describes said item;
wherein said item represents one of said data types.
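As an illustration of the SYM syntax recited above, the following minimal Python sketch splits a textual SYM instruction into its three fields. The textual encoding, the SymInstruction class, the parse_sym helper, and the free-form contents of the <sym info> field are assumptions made for illustration only; the claim specifies just the three fields.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SymInstruction:
    symid: int     # unique symbolic identifier for the item
    sym_kind: str  # kind information, e.g. "KIND_SVAR"
    sym_info: str  # further description of the item (free-form in this sketch)


def parse_sym(line: str) -> SymInstruction:
    """Parse 'SYM <symid> <sym kind> <sym info>' (hypothetical text encoding)."""
    parts = line.split(maxsplit=3)
    if len(parts) < 3 or parts[0] != "SYM":
        raise ValueError(f"not a SYM instruction: {line!r}")
    info = parts[3] if len(parts) == 4 else ""
    return SymInstruction(int(parts[1]), parts[2], info)


# Hypothetical instruction: identifier 101 names a static variable.
sym = parse_sym("SYM 101 KIND_SVAR int counter")
```

Later instructions would then refer to the item by symid alone, which is what allows layout and size decisions to be left to the installer.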
49. The computer software compiler method of claim 48, wherein said compiler step further comprises the steps of:
(a) locating first occurrences of repeated instruction sequences in said expanded machine independent computer program;
(b) translating said first occurrences of said repeated instruction sequences to said HPcode-Plus instructions to produce macros, said macros forming part of said compiler intermediate representation;
(c) creating and assigning unique symbolic identifiers to said macros by using said SYM HPcode-Plus instructions, said SYM HPcode-Plus instructions forming part of said compiler intermediate representation, wherein said item refers to said macros, and wherein said <sym kind> field has a value KIND_MACRO;
(d) locating all occurrences of said repeated instruction sequences; and
(e) translating said occurrences of said repeated instruction sequences to said HPcode-Plus instructions by using said unique symbolic identifiers to refer to said macros.
50. The computer software compiler method of claim 49, wherein said compiler step further comprises the step of creating and assigning unique symbolic identifiers to constants contained in said expanded machine independent computer program by using said SYM HPcode-Plus instructions, said SYM HPcode-Plus instructions forming part of said compiler intermediate representation, wherein said item refers to said constants, and wherein said <sym kind> field can have a value from one or more of KIND_MEMBER, KIND_OFFSETOF, KIND_SIZEOF, KIND_MAXOF, KIND_MINOF, and KIND_CONST.
51. The computer software compiler method of claim 50, wherein said compiler step further comprises the steps of:
(a) defining and declaring functions and variables contained in said expanded machine independent computer program by using said SYM HPcode-Plus instructions for creating unique symbolic identifiers and assigning said unique symbolic identifiers to said functions and said variables, said SYM HPcode-Plus instructions forming part of said compiler intermediate representation, wherein said item refers to said functions and said variables, and wherein said <sym kind> field can have a value from one or more of KIND_FUNCTION, KIND_FUNC_DCL, KIND_FPARAM, KIND_SVAR, and KIND_DVAR;
(b) translating references to said functions and said variables in said expanded machine independent computer program to said HPcode-Plus instructions by using said unique symbolic identifiers to refer to said functions and said variables such that machine dependent decisions concerning said functions and said variables are deferred to said installer step, said HPcode-Plus instructions forming part of said compiler intermediate representation.
52. The computer software compiler method of claim 51, wherein said compiler step further comprises the steps of:
(a) translating constant expressions, which are not guaranteed to fold without overflow, contained in said expanded machine independent computer program to said HPcode-Plus instructions without evaluating said constant expressions, said HPcode-Plus instructions forming part of said compiler intermediate representation;
(b) creating and assigning unique symbolic identifiers to said constant expressions by using said SYM HPcode-Plus instructions, said SYM HPcode-Plus instructions forming part of said compiler intermediate representation, wherein said item refers to one of said constant expressions, and wherein said <sym kind> field has a value KIND_NEW_CONST;
(c) translating references to said constant expressions in said expanded machine independent computer program to said HPcode-Plus instructions by using said unique symbolic identifiers to refer to said constant expressions, such that machine dependent decisions concerning said constant expressions are deferred to said installer step, said HPcode-Plus instructions forming part of said compiler intermediate representation.
53. The computer software compiler method of claim 52, wherein said tuple generator step comprises the steps of:
(a) creating a symbol table, said symbol table comprising symbol table entries for said unique symbolic identifiers assigned to said variables, said functions, and said macros;
(b) storing in said symbol table entries said symbolic information from said SYM HPcode-Plus instructions used for creating and assigning said unique symbolic identifiers to said variables, said functions, and said macros;
(c) creating a type table, said type table comprising type table entries for said unique symbolic identifiers assigned to said data types;
(d) storing in said type table entries said symbolic information from said SYM HPcode-Plus instructions used for creating and assigning said unique symbolic identifiers to said data types.
54. The computer software compiler method of claim 53, wherein said tuple generator step further comprises the steps of:
(a) locating and evaluating said HPcode-Plus instructions associated with said constant expressions to produce evaluated constant expressions, said evaluated constant expressions being machine dependent upon the target computer platform;
(b) storing in said symbol table entries said evaluated constant expressions and said unique symbolic identifiers which are associated with said evaluated constant expressions.
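The deferral of constant folding described in claims 52 and 54 can be sketched as follows. This is a minimal Python illustration, not the patented implementation: the tuple-based expression tree, the evaluate helper, and the choice to signal overflow with an exception are all assumptions. The producer keeps the expression unevaluated under a symbolic identifier, and the installer evaluates it against the target's int width.

```python
def evaluate(expr, int_bits):
    """Evaluate a deferred constant expression for a target whose int has int_bits bits.

    expr is either an int literal or a tuple (op, left, right); this tree
    shape is a hypothetical stand-in for the HPcode-Plus instruction sequence.
    """
    if isinstance(expr, int):
        value = expr
    else:
        op, left, right = expr
        a = evaluate(left, int_bits)
        b = evaluate(right, int_bits)
        if op == "+":
            value = a + b
        elif op == "*":
            value = a * b
        else:
            raise ValueError(f"unsupported operator: {op!r}")
    lowest = -(1 << (int_bits - 1))
    highest = (1 << (int_bits - 1)) - 1
    if not lowest <= value <= highest:
        raise OverflowError(f"{value} does not fit in a {int_bits}-bit int")
    return value


# Producer side: 300 * 200 is stored unevaluated under symbolic identifier 7.
deferred = {"symid": 7, "expr": ("*", 300, 200)}

# Installer side on a hypothetical 32-bit target: evaluate and record the result.
symbol_table = {deferred["symid"]: evaluate(deferred["expr"], 32)}
```

On a target with a 16-bit int the same expression would not fit, which is why the producer cannot safely fold it.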
55. The computer software compiler method of claim 54, wherein said low-level code generator step comprises the steps of:
(a) mapping said variables to the memory architecture of the target computer platform by referring to the machine configuration file and to said symbol table;
(b) mapping said data types to the memory architecture of the target computer platform by referring to the machine configuration file and to said type table.
56. The computer software compiler method of claim 47, wherein said preprocessor step comprises the steps of:
(a) locating standard type identifiers, object-like macro identifiers, and function-like macro identifiers which are defined in the machine dependent standard header files and referenced by the machine independent computer program;
(b) accessing first ANDF header files, said first ANDF header files being related to the machine dependent standard header files and comprising object-like macro definitions and function-like macro definitions which assign unique key words to said standard type identifiers, object-like macro identifiers, and function-like macro identifiers contained in the machine dependent standard header files; and
(c) substituting said unique key words for said standard type identifiers, object-like macro identifiers, and function-like macro identifiers contained in the machine independent computer program to produce said expanded machine independent computer program.
57. The computer software compiler method of claim 56, wherein said compiler step comprises the step of translating references to said unique key words in said expanded machine independent computer program to said HPcode-Plus instructions by using unique symbolic identifiers having negative values to refer to said unique key words, such that machine dependent decisions concerning said instructions containing said unique key words are deferred to said installer step, said HPcode-Plus instructions forming part of said compiler intermediate representation.
58. The computer software compiler method of claim 57, wherein said tuple generator step further comprises the steps of:
(a) locating said HPcode-Plus instructions having said unique negative symbolic identifiers;
(b) accessing said second ANDF header files, said second ANDF header files being related to the machine dependent standard header files and comprising data type definitions and macro definitions which correspond to said unique negative symbolic identifiers, said data type definitions and said macro definitions being defined by and machine dependent upon said target computer platform;
(c) replacing said unique negative symbolic identifiers with said data type definitions and said macro definitions.
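The key word substitution of claims 56 through 58 can be pictured with the following Python sketch. Every name in it is hypothetical: the key words, the identifier values, the per-target definitions, and the DECLARE opcode are illustrative stand-ins, not actual HPcode-Plus or ANDF header contents.

```python
# Producer side: key words assigned by the first ANDF header files are
# referred to by negative symbolic identifiers (hypothetical values).
KEYWORD_IDS = {"__andf_size_t": -1, "__andf_EOF": -2}

# Installer side: each target's second ANDF header files supply the
# machine dependent definitions (hypothetical values).
TARGET_A = {-1: "unsigned long", -2: "-1"}
TARGET_B = {-1: "unsigned int", -2: "-1"}


def resolve(instructions, target_defs):
    """Replace negative symbolic identifiers with the target's definitions."""
    return [
        [target_defs[op] if isinstance(op, int) and op < 0 else op for op in instr]
        for instr in instructions
    ]


# "DECLARE" is a made-up opcode used only to carry the operand.
program = [["DECLARE", "n", KEYWORD_IDS["__andf_size_t"]]]
installed_a = resolve(program, TARGET_A)
installed_b = resolve(program, TARGET_B)
```

The same distributed program resolves to a different concrete type on each target, while non-negative identifiers and other operands pass through unchanged.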
59. The computer software compiler method of claim 47, wherein said compiler step further comprises the steps of:
(a) identifying constant values with indeterminate data types in said expanded machine independent computer program;
(b) storing information in said compiler intermediate representation regarding said constant values with indeterminate data types by using CLDC HPcode-Plus instructions, said CLDC HPcode-Plus instructions forming part of said compiler intermediate representation and having a syntax
CLDC <flag> <constant value>
wherein
<flag>=a flag which indicates a data type set, said data type set comprising one or more data types, said one or more data types being organized sequentially from a first data type to a last data type;
<constant value>=a field which equals one of said constant values with indeterminate data types;
wherein said information is sufficient to enable said installer step to determine said data types of said constant values with indeterminate data types, such that machine dependent decisions concerning said constant values with indeterminate data types are deferred to said installer step.
60. The computer software compiler method of claim 59, wherein said low-level code generator step further comprises the step of assigning a data type to said <constant value> field from said CLDC HPcode-Plus instruction, said data type being from said data type set indicated by said <flag> field from said CLDC HPcode-Plus instruction, wherein said assigning means operates according to the ANSI-C computer programming language to assign said data type to said <constant value>.
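A minimal Python sketch of the type assignment in claims 59 and 60 follows, assuming the installer walks the data type set in order and picks the first type able to represent the constant, as ANSI C does for integer constants. The flag value 0, the TYPE_SETS table, and the helper names are hypothetical; only the ordering for an unsuffixed decimal constant (int, long, unsigned long) is taken from ANSI C.

```python
# Hypothetical flag values mapped to ordered data type sets.
TYPE_SETS = {
    0: ["int", "long", "unsigned long"],  # unsuffixed decimal constant in ANSI C
}


def type_ranges(int_bits, long_bits):
    """Build value ranges for a target from its int and long widths."""
    return {
        "int": (-(1 << (int_bits - 1)), (1 << (int_bits - 1)) - 1),
        "long": (-(1 << (long_bits - 1)), (1 << (long_bits - 1)) - 1),
        "unsigned long": (0, (1 << long_bits) - 1),
    }


def assign_type(flag, constant_value, ranges):
    """Return the first type in the flagged set that can hold constant_value."""
    for type_name in TYPE_SETS[flag]:
        lowest, highest = ranges[type_name]
        if lowest <= constant_value <= highest:
            return type_name
    raise OverflowError(f"{constant_value} fits no type in set {flag}")


# The same 'CLDC 0 40000' resolves differently on two hypothetical targets.
on_16_bit_int = assign_type(0, 40000, type_ranges(16, 32))
on_32_bit_int = assign_type(0, 40000, type_ranges(32, 32))
```

Because the result depends on the target's widths, the producer records only the candidate set and the value.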
61. The computer software compiler method of claim 47, further adapted for use with an ANSI-C computer programming language, wherein said compiler step further comprises the step of performing integral promotions specific to the ANSI-C computer programming language upon operands contained in said expanded machine independent computer program by using ICVT HPcode-Plus instructions, said ICVT HPcode-Plus instructions forming part of said compiler intermediate representation.
62. The computer software compiler method of claim 61, wherein said low-level code generator step further comprises the steps of:
(a) locating said ICVT HPcode-Plus instructions in said compiler intermediate representation, said ICVT HPcode-Plus instructions operating to perform said integral promotions upon operands for integral promotion in said compiler intermediate representation; and
(b) performing said integral promotions upon said operands for integral promotion according to said ICVT HPcode-Plus instructions and the ANSI-C computer programming language.
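The reason integral promotion must wait for the installer can be shown with a short Python sketch of the ANSI C rule: a narrow operand promotes to int if int can represent all of its values, and to unsigned int otherwise. The promote helper, the width tables, and the treatment of plain char as signed are assumptions for illustration; the sketch does not model the ICVT instruction encoding.

```python
def promote(type_name, bits):
    """Apply the ANSI C integral promotion for a narrow integer type.

    bits maps base type names ("char", "short", "int") to their widths
    on the target; plain char is treated as signed in this sketch.
    """
    signed = not type_name.startswith("unsigned")
    base = type_name.replace("unsigned ", "").replace("signed ", "")
    max_value = (1 << (bits[base] - (1 if signed else 0))) - 1
    int_max = (1 << (bits["int"] - 1)) - 1
    return "int" if max_value <= int_max else "unsigned int"


# Two hypothetical targets that differ only in the width of int.
narrow_int_target = {"char": 8, "short": 16, "int": 16}
wide_int_target = {"char": 8, "short": 16, "int": 32}

on_narrow = promote("unsigned short", narrow_int_target)
on_wide = promote("unsigned short", wide_int_target)
```

An unsigned short operand therefore promotes to a different type on each target, so the producer can only mark where the promotion occurs.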
63. The computer software compiler method of claim 47, wherein said compiler step further comprises the step of preparing two operands for processing by arithmetic operations by using an ACVT HPcode-Plus instruction to convert said two operands to a common data type, said two operands and said arithmetic operations being contained in said expanded machine independent computer program, said ACVT HPcode-Plus instruction forming part of said compiler intermediate representation, said ACVT HPcode-Plus instruction converting said two operands to said common data type by using conversion rules that are specific to the ANSI-C computer programming language, such that machine dependent decisions concerning said two operands are deferred to said installer step.
64. The computer software compiler method of claim 63, wherein said low-level code generator step further comprises the steps of:
(a) locating said ACVT HPcode-Plus instructions in said compiler intermediate representation, said ACVT HPcode-Plus instructions operating to convert operands for arithmetic operations in said compiler intermediate representation to a common data type; and
(b) converting said operands for arithmetic operations to said common data type according to said ACVT HPcode-Plus instructions and the ANSI-C computer programming language.
65. The computer software compiler method of claim 47, wherein said compiler step further comprises the steps of:
(a) assigning two part sequences to character constants in said expanded machine independent computer program, each of said two part sequences comprising:
(1) a first part, said first part identifying a first character set; and
(2) a second part, said second part identifying said character constant within said first character set;
(b) translating instructions having said character constants in said expanded machine independent computer program to said HPcode-Plus instructions by using said two part sequences to refer to said character constants, such that machine dependent decisions concerning said character constants are deferred to said installer step, said HPcode-Plus instructions forming part of said compiler intermediate representation.
66. The computer software compiler method of claim 65, wherein said low-level code generator step further comprises the step of assigning character constant values to said two part sequences, said character constant values being from an execution character set of the target computer platform, said execution character set being machine dependent on the target computer platform.
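The two part character constants of claims 65 and 66 can be sketched in Python as a pair of character set name and code within that set, which the installer maps into the target's execution character set. The pair layout, the set name "ascii", the target names, and the use of Python codecs to stand in for execution character sets are all assumptions made for illustration.

```python
# Hypothetical targets and the Python codecs standing in for their
# execution character sets (cp037 is an EBCDIC code page).
TARGET_CODECS = {"ascii_target": "ascii", "ebcdic_target": "cp037"}


def install_char(two_part, target):
    """Map (character set, code within set) to the target's character value."""
    charset, code = two_part
    character = bytes([code]).decode(charset)          # identify the character
    return character.encode(TARGET_CODECS[target])[0]  # target's value for it


# Producer side: the constant 'A' is recorded as a two part sequence.
constant_a = ("ascii", 0x41)

value_on_ascii = install_char(constant_a, "ascii_target")
value_on_ebcdic = install_char(constant_a, "ebcdic_target")
```

The distributed form names the character rather than a numeric value, so each installer can supply the value its own execution character set uses.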
67. The computer software compiler method of claim 47, wherein said installer step further comprises the steps of:
(a) a low-level optimizer step for optimizing said low-level instructions to produce optimized low-level instructions, said optimized low-level instructions being stored in said low-level compiler intermediate representation;
(b) a machine specific optimizer step for optimizing said machine instructions to produce optimized machine instructions;
wherein said low-level optimizer step and said machine specific optimizer step operate in a machine dependent manner.
68. The computer software compiler method of claim 45, wherein said producer step further comprises the steps of:
(a) a High-Level Optimizer step for optimizing said compiler intermediate representation to produce an optimized compiler intermediate representation;
(b) an Archiver/Linker step archiving or linking said optimized compiler intermediate representation with other compiler intermediate representations to produce an HPcode-Plus Archive file or a Linked HPcode-Plus file;
wherein said High-Level Optimizer step and said Archiver/Linker step operate in a machine independent manner such that said optimized compiler intermediate representation, said HPcode-Plus Archive file, and said Linked HPcode-Plus file are machine independent.
69. The computer software compiler method of claim 68, further comprising the step of distributing said HPcode-Plus archive file or said Linked HPcode-Plus file to said one or more heterogeneous computer platforms.
70. The computer software compiler method of claim 69, further adapted for use with a machine configuration file and an extracting device which extracts said compiler intermediate representation from said HPcode-Plus archive file or said Linked HPcode-Plus file, the machine configuration file containing information describing a target computer platform, the target computer platform being one of the one or more heterogeneous computer platforms and having a register architecture, a memory architecture, and an instruction set, wherein said installer step, which is performed on said target computer platform, comprises the steps of:
(a) a tuple-generator step for receiving a second ANDF header file and said compiler intermediate representation as input and for translating said HPcode-Plus instructions contained in said compiler intermediate representation into quadruple instructions;
(b) a low-level code generator step for receiving said quadruple instructions and the machine configuration file as input and for translating said quadruple instructions into low-level instructions, said low-level instructions being from the instruction set, said low-level instructions being stored in a low-level compiler intermediate representation;
(c) a register allocator step for receiving said low-level compiler intermediate representation and the machine configuration file as input and for mapping said low-level instructions to the register architecture of the target computer platform to produce machine instructions;
(d) an object file generator step for receiving said machine instructions as input and for translating said machine instructions to said object code representation of the machine independent computer program, said object code representation being stored in an object code file;
wherein said tuple-generator step, said low-level code generator step, said register allocator step, and said object file generator step operate in a machine dependent manner according to said HPcode-Plus compiler intermediate language, such that said tuple-generator step, said low-level code generator step, said register allocator step, and said object file generator step make the machine dependent decisions that were deferred from said producer step; and
wherein said object code representation is machine dependent upon said target computer platform.
71. A computer software compiler method, adapted for use with a machine independent computer program that may use machine independent standard header files, and with one or more heterogeneous computer platforms, for translating the machine independent computer program to object code representations of the machine independent computer program, such that said object code representations are machine dependent on the one or more heterogeneous computer platforms, said computer software compiler method comprising the steps of:
(a) a producer step for translating the machine independent computer program to a compiler intermediate representation of the machine independent computer program in a machine independent manner according to an HPcode-Plus compiler intermediate language, said compiler intermediate representation comprising HPcode-Plus instructions from said HPcode-Plus compiler intermediate language, such that said compiler intermediate representation is architecture neutral and represents an architecture neutral distribution format;
(b) an installer interpreter step for receiving said compiler intermediate representation as input and for executing said HPcode-Plus instructions contained therein without converting said compiler intermediate representation to object code representations of the machine independent computer program.
Description
CROSS-REFERENCE TO OTHER APPLICATIONS
The following pending applications of common assignee contain some common disclosure, and are believed to have effective filing dates identical with that of the present application:
ANDF INSTALLER USING THE HPCODE-PLUS COMPILER INTERMEDIATE LANGUAGE Ser. No. 07/542,922, filed Jun. 25, 1990;
ANDF PRODUCER USING THE HPCODE-PLUS COMPILER INTERMEDIATE LANGUAGE Ser. No. 07/543,021, filed Jun. 25, 1990.
BACKGROUND OF THE INVENTION
The present invention relates generally to computer software compiler systems and methods, and specifically to computer software compiler systems and methods for enhanced distribution of computer software.
Ideally, the same version of a computer program could be distributed to heterogeneous computer platforms (heterogeneous computer platforms being computer platforms having different computer architectures and different computer operating systems). The computer program would operate, without modifications, on the heterogeneous computer platforms.
This distribution ideal is desirable for a number of reasons. First, the availability of computer software is enhanced if software is easily distributed. For end-users, easily-distributed computer programs mean that their software acquisition and purchasing tasks are simplified. For software vendors, easily-distributed computer programs mean their stocking and distribution costs are minimized.
Additionally, for software producers, easily-distributed computer programs are desirable for economic efficiency reasons. Initial development and subsequent maintenance costs would be minimized if a programming team could limit their design, implementation, and maintenance efforts to a single computer program version. Distribution costs would also be minimized if a single computer program version could be marketed to heterogeneous computer platforms.
The ability to reach this distribution ideal depends on two factors: the manner in which software is written and the format in which software is distributed.
Today, software is ordinarily written in a machine dependent manner. For example, software written for an IBM Personal Computer (IBM PC) will often use the function calls that are provided by DOS (Disk Operating System), the IBM PC operating system. Such software is machine dependent because it includes references to specific features (i.e., DOS function calls) of a particular computer platform (i.e., the IBM PC).
Machine dependent software can operate only on its native computer platform (i.e., the computer platform on which it was created). Modifications are necessary for it to operate on other computer platforms. Therefore, machine dependent software is economically inefficient because separate versions of each computer program are required, one for each target computer platform (i.e., a computer platform on which a computer program is meant to operate).
It is possible to write software so that it does not depend on the specific features of any particular computer platform. That is, the software depends on neither the specific hardware features nor the specific software features of any particular computer platform. Such software is said to be machine independent. Theoretically, machine independent software (or machine independent computer programs) can operate on heterogeneous target computer platforms without any modifications.
But the ability of software to operate on heterogeneous target computer platforms also depends on the manner in which software is distributed (i.e., the format of the software distribution copy). There are two software distribution formats: an architecture neutral distribution format and an architecture dependent distribution format.
A machine independent computer program that is distributed in the architecture dependent distribution format (ADDF) can only operate on its native computer platform. Object and executable code formats are examples of ADDFs. ADDFs are inefficient because multiple versions of the software distribution copy are required, one for each heterogeneous target computer platform.
Conversely, a machine independent computer program that is distributed in the architecture neutral distribution format (ANDF) can operate on any computer platform. Thus, ANDFs are efficient because only one version of the software distribution copy is required, and this version can be distributed without modifications to heterogeneous target computer platforms.
Therefore, the distribution ideal is reached through the combination of machine independent computer programs plus ANDF. That is, the combination of machine independent computer programs plus ANDF produces computer programs that can operate, without any modifications, on heterogeneous computer platforms.
There have been many attempts at defining a working ANDF specification. Perhaps the first attempt was in 1969 with the creation of UNCOL. UNCOL was a compiler intermediate language which had some ANDF features. The creators of UNCOL, however, were not attempting to define an ANDF specification. Thus, UNCOL, while having some ANDF features, was not a complete ANDF specification.
In November 1988, the European Roundtable commissioned Logica to perform an ANDF feasibility study. The Logica study, which was completed in April 1989, reiterated the goals, the requirements, and the impact of ANDF, but did not define a complete ANDF specification.
In April 1989, the Open Software Foundation (OSF) solicited proposals, via a Request for Technology (RFT), for an ANDF standard for Unix computer platforms. OSF received over 20 proposals (hereinafter referred to as the "OSF proposals") in response to its RFT.
Generally, ANDF specification proposals are based on one of the four generally accepted ANDF approaches: ANDF Using Source Code; ANDF Using Encrypted Source Code; ANDF Using Tagged Executable Code; and ANDF Using Compiler Intermediate Representation.
The first ANDF approach, ANDF Using Source Code, uses the computer program source code as the software distribution format. Under this approach, machine independent source code is distributed to heterogeneous target computer platforms. At each target computer platform, computer operators use their compilers to compile their source code copies.
The ANDF Using Source Code approach, however, is inherently flawed because proprietary secrets, embedded within the source code, cannot be protected if the source code is used as the ANDF. Therefore, distributing computer programs at the source code level, although being architecturally neutral, is not feasible for most business applications.
The second ANDF approach, ANDF Using Encrypted Source Code, is a variation of the first. Under this approach, encrypted source code is distributed to heterogeneous target computer platforms. The operators at each target computer platform use special compilers to compile their copies of the encrypted source code. These special compilers have two parts, a decrypter and a conventional compiler. The special compilers first decrypt, and then compile, the encrypted source code.
The ANDF Using Encrypted Source Code approach seemingly solves the security problem of the first approach, since embedded proprietary secrets are protected by the encryption process. The security problem is not completely solved, however, because the decrypted source code can be intercepted after decryption by the special compiler. Thus, like the first approach, the ANDF Using Encrypted Source Code approach is inherently flawed because it exposes embedded proprietary secrets to the public.
Under the third ANDF approach, ANDF Using Tagged Executable Code, the software distribution format is composed of a first part and a second part. The first part contains executable code in the native computer platform's machine language. The second part contains information concerning the native computer platform's machine language. This second part is called a Key.
Special compilers use the Key to convert the first part of the software distribution copy to executable code for their respective target computer platforms.
This third ANDF approach, however, is inherently flawed because it is not truly architecturally neutral. Instead, it is architecturally biased.
The fourth ANDF approach, ANDF Using Compiler Intermediate Representation, uses a compiler intermediate representation as the software distribution format. To understand this approach, it is necessary to describe some high-level software compiler concepts.
Software compilers are composed of two parts, a front end and a back end. The compiler front end receives computer programs as input. These computer programs are normally written in high level programming languages, such as Pascal, C, and Ada.
The compiler front end scans, parses, and performs semantic analysis on the computer program. In other words, the front end is responsible for language dependent processing of the computer program. After all language dependent processing is complete (and if no errors have been found), the front end generates a compiler intermediate representation of the computer program. The compiler intermediate representation is analogous to an assembly language representation of the computer program.
Compiler back ends receive the compiler intermediate representations as input and convert the compiler intermediate representation to object code representations for specific computer platforms.
The object code representations are then converted to executable code representations by linkers on the target computer platforms. Linkers are not part of compilers.
Normally, the front end generates compiler intermediate representations in a machine dependent manner. This is particularly true for operations involving memory allocation, data type conversion, and include file processing. Thus, compiler intermediate representations are normally machine dependent and thus unsuitable as an ANDF.
If, however, the front end operates in a machine independent manner, and if the resulting compiler intermediate representation makes no assumptions about the specific architectural features of particular computer platforms, then the compiler intermediate representation is architecturally neutral. Thus, such a compiler intermediate representation is an ANDF.
Under the ANDF Using Compiler Intermediate Representation approach, therefore, an architecture neutral compiler intermediate representation is used as the software distribution format. ANDF Compiler front ends (or "ANDF Producers") are located on native computer platforms and ANDF Compiler back ends (or "ANDF Installers") are located on target computer platforms.
ANDF Producers create compiler intermediate representations of computer programs. These compiler intermediate representations, being architecturally neutral, are distributed to heterogeneous target computer platforms. ANDF Installers install the compiler intermediate representations on target computer platforms. An ANDF Interpreter may be substituted for the ANDF Installer. An ANDF Interpreter directly executes intermediate instructions without first translating them to executable code.
The ANDF Using Compiler Intermediate Representation approach solves the security problems of the first and second ANDF approaches. High-level source code constructs, which encompass the computer program's proprietary secrets, are represented with difficult-to-read low-level instruction sequences. Also, low-level instruction sequences are represented by strings of numbers, rather than mnemonics.
The ANDF Using Compiler Intermediate Representation approach solves the inherent problems of the third ANDF approach, since the ANDF Using Compiler Intermediate Representation approach is truly architecture neutral (i.e., machine independent).
Thus, the ANDF Using Compiler Intermediate Representation approach has no inherent flaws. This ANDF approach, however, presents many difficult design and implementation problems.
Specifically, a compiler intermediate language must be defined so that the ANDF Producer, based on this definition, can produce compiler intermediate representations that are free from the machine dependencies which are normally produced by the application of inherently machine dependent computer operations, such as memory allocation, data type conversion, data folding, and include file processing. These operations are described below.
Additionally, the compiler intermediate language must be defined so that the ANDF Installer, based on this definition, can receive the compiler intermediate representation as input and produce executable code for any target computer platform.
Memory allocation operations are inherently machine dependent because they depend on a particular computer platform's specification for data alignment, data sizes, and data attributes. For example, some computer platforms align integers so that the most significant byte is in the lowest memory address, while others align integers so that the least significant byte is in the lowest memory address. Also, some computer platforms specify integers as being signed and 32 bits wide, while others specify integers as being unsigned and 16 bits wide.
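The byte order and integer width dependencies described above can be observed from within a C program. The following is a minimal sketch, not part of the HPcode-Plus specification; the function names is_little_endian, int_width_bits, and store_be32 are hypothetical and chosen for illustration only.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if this platform stores the least significant byte of an
 * integer at the lowest memory address, and 0 otherwise. */
int is_little_endian(void)
{
    uint32_t probe = 1u;
    const unsigned char *bytes = (const unsigned char *)&probe;
    return bytes[0] == 1;
}

/* Returns the width, in bits, of a plain int on this platform. */
size_t int_width_bits(void)
{
    return sizeof(int) * CHAR_BIT;
}

/* Writes a 32-bit value with the most significant byte first. The
 * shifts operate on the value, not on memory, so the result is the
 * same on every platform regardless of its native byte order. */
void store_be32(uint32_t value, unsigned char out[4])
{
    out[0] = (unsigned char)((value >> 24) & 0xFFu);
    out[1] = (unsigned char)((value >> 16) & 0xFFu);
    out[2] = (unsigned char)((value >> 8) & 0xFFu);
    out[3] = (unsigned char)(value & 0xFFu);
}
```

The first two functions return answers that differ from one platform to another, which is why a layout computed at the producer site cannot simply be reused at an install site. The third function shows the opposite case: code written in terms of values rather than memory layout behaves identically everywhere.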
Memory allocation operations are also dependent upon a particular computer platform's data representation scheme. For example, for computer platforms which support the ASCII character set, the string "HELLO" would be represented in memory as the following sequence of hexadecimal bytes: 48 45 4C 4C 4F. However, the string "HELLO" would be represented as a different sequence of hexadecimal bytes in computer platforms which support the EBCDIC character set.
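The character set dependency can be made concrete by dumping the bytes of "HELLO" under each encoding. The sketch below is illustrative only. The helper hex_dump and the two byte tables are hypothetical names, and the EBCDIC values are the usual EBCDIC assignments for uppercase letters, supplied here as an assumption rather than taken from this specification.

```c
#include <stddef.h>
#include <stdio.h>

/* "HELLO" as stored on a platform that uses ASCII. */
const unsigned char ASCII_HELLO[5] = { 0x48, 0x45, 0x4C, 0x4C, 0x4F };

/* "HELLO" as stored on a platform that uses EBCDIC (assumed values). */
const unsigned char EBCDIC_HELLO[5] = { 0xC8, 0xC5, 0xD3, 0xD3, 0xD6 };

/* Formats len bytes as uppercase hex pairs separated by single spaces.
 * The caller must supply an output buffer of at least 3 * len bytes
 * (or 1 byte when len is 0). */
void hex_dump(const unsigned char *buf, size_t len, char *out)
{
    size_t i;

    out[0] = '\0';
    for (i = 0; i < len; i++) {
        sprintf(out + 3 * i, "%02X", (unsigned)buf[i]);
        /* Replace the terminator with a separator unless this is the
         * final byte. */
        out[3 * i + 2] = (i + 1 < len) ? ' ' : '\0';
    }
}
```

Calling hex_dump on the two tables yields "48 45 4C 4C 4F" and "C8 C5 D3 D3 D6", respectively. A string literal laid out in memory at the producer site would therefore be wrong on an install site that uses the other character set.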
Data type conversion and data folding operations are also inherently machine dependent. For example, in converting a signed short integer (with a value of less than zero) to a standard sized signed integer, some computer platforms will insert all zeroes in front of the most significant digit. Other computer platforms will insert all ones.
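The two widening behaviors described above can be sketched in C by operating on bit patterns directly. This is an illustration under the assumption of 16-bit short and 32-bit standard integers; the function names are hypothetical.

```c
#include <stdint.h>

/* Widens a 16-bit pattern by filling the upper 16 bits with copies of
 * the sign bit (ones for a negative value, zeroes otherwise). */
uint32_t widen_sign_extend(uint16_t bits)
{
    uint32_t wide = bits;
    if (bits & 0x8000u)
        wide |= 0xFFFF0000u;
    return wide;
}

/* Widens a 16-bit pattern by filling the upper 16 bits with zeroes. */
uint32_t widen_zero_extend(uint16_t bits)
{
    return (uint32_t)bits;
}
```

For the 16-bit pattern 0xFFFE (the two's complement encoding of -2), the first function produces 0xFFFFFFFE and the second produces 0x0000FFFE. If a producer performed the conversion itself, it would commit the program to one of these results before the target platform is known.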
Also, the resulting data type of an expression is not always apparent. For example, in the expression y=x+20000+17000, some computer platforms may represent the result of 20000+17000 as an integer, while others may represent the result as a long integer.
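The effect of the result type on the constant subexpression 20000+17000 can be sketched as follows. The example assumes two hypothetical targets: one that folds the sum in a 32-bit integer, and one that folds it in a 16-bit two's complement integer that wraps on overflow. The wrapping behavior is an assumption made for illustration, since C itself leaves signed overflow undefined, and the function names are hypothetical.

```c
#include <stdint.h>

/* Folds a + b as a target with 32-bit integers would. The operands
 * are assumed to be small enough that the sum does not overflow. */
int32_t fold_add_int32(int32_t a, int32_t b)
{
    return a + b;
}

/* Folds a + b as a target with wrapping 16-bit two's complement
 * integers would: keep the low 16 bits, then reinterpret bit 15 as
 * the sign bit. */
int32_t fold_add_int16_wrapping(int32_t a, int32_t b)
{
    uint32_t sum = ((uint32_t)a + (uint32_t)b) & 0xFFFFu;
    return (sum & 0x8000u) ? (int32_t)sum - 0x10000 : (int32_t)sum;
}
```

With operands 20000 and 17000, the first function returns 37000, while the second returns -28536 because 37000 does not fit in 16 signed bits. A producer that folded the constant itself would embed one of these values in the distributed representation.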
Many high level languages, such as C, allow computer programmers to add predefined or often-used code into their programs through the use of include files. Often, these include files include macro operations, which are similar to software procedures and functions. Macros defined on one computer platform may not exist or may exist in different forms on other computer platforms.
Many of the OSF proposals were based on the ANDF Using Compiler Intermediate Representation approach. For the most part, however, the OSF proposals were not completely architecture neutral because they failed to address all the implementation problems described above.
A proposal describing the present invention was submitted in response to the OSF RFT.
The present invention represents an ANDF specification based on the ANDF Using Compiler Intermediate Representation approach. Unlike other ANDF specifications, the present invention is based on the Ucode compiler intermediate language. Additionally, the ANDF specification defined by the present invention is completely architecture neutral.
SUMMARY OF THE INVENTION
The present invention is directed to a computer software compiler system and method for distributing a machine independent computer program, created on a native computer platform, to heterogeneous target computer platforms. Specifically, the present invention is directed to an architecture-neutral distribution format (ANDF) computer software compiler (hereinafter called an "ANDF Compiler").
ANDF Compilers are composed of two components, an ANDF Producer and an ANDF Installer. The ANDF Producer and the ANDF Installer are analogous to a front end and a back end, respectively, of a conventional computer software compiler.
The ANDF Producer and the ANDF Installer reside on heterogeneous computer platforms. Specifically, the ANDF Producer resides on a producer site or a native computer platform, and the ANDF Installer resides on an install site or a target computer platform.
The ANDF Producer receives the machine independent computer program as input and generates a compiler intermediate representation of the machine independent computer program. The ANDF Producer generates the compiler intermediate representation according to an HPcode-Plus compiler intermediate language, such that the compiler intermediate representation is composed of HPcode-Plus instructions from the HPcode-Plus compiler intermediate language.
In generating the compiler intermediate representation according to the HPcode-Plus compiler intermediate language, the ANDF Producer is operating in an architecture neutral, or machine independent, manner.
Specifically, according to the HPcode-Plus compiler intermediate language, the ANDF Producer defers machine dependent decisions concerning inherently machine dependent computer operations to the ANDF Installer. Therefore, the compiler intermediate representation which is generated by the ANDF Producer is free from the machine dependencies which are normally produced by the application of inherently machine dependent computer operations, such as memory allocation, data type conversion, data folding, and include file processing.
Therefore, the compiler intermediate representation is architecture neutral and represents an architecture neutral distribution format (ANDF).
In the present invention, the compiler intermediate representation which is generated by the ANDF Producer is distributed to the install sites. The install sites represent heterogeneous target computer platforms.
The ANDF Installers, which reside on the install sites, receive the compiler intermediate representation as input and generate object code representations of the machine independent computer program. In generating object code, the ANDF Installers make the machine dependent decisions which were deferred from the ANDF Producer. Thus, the object code representations are architecture dependent, or machine dependent, on the target computer platforms.
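The division of labor between the ANDF Producer and the ANDF Installers can be sketched in C. In this sketch the producer emits an operand that names a machine dependent quantity symbolically, and the installer substitutes the value for its own target. All types and names below (target_info, operand, resolve_operand, and the operand kinds) are hypothetical and are not HPcode-Plus constructs.

```c
#include <stddef.h>

/* Hypothetical description of one target computer platform. */
struct target_info {
    const char *name;
    size_t int_size;  /* size of an int, in bytes */
    long int_max;     /* value of the standard INT_MAX macro */
};

/* Hypothetical intermediate operand. A literal is fixed by the
 * producer; the other kinds are deferred to the installer. */
enum operand_kind { OPERAND_LITERAL, OPERAND_SIZEOF_INT, OPERAND_INT_MAX };

struct operand {
    enum operand_kind kind;
    long literal;  /* used only when kind is OPERAND_LITERAL */
};

/* Installer-side step: makes the machine dependent decision that the
 * producer deferred, using the description of the local target. */
long resolve_operand(const struct operand *op,
                     const struct target_info *target)
{
    switch (op->kind) {
    case OPERAND_SIZEOF_INT:
        return (long)target->int_size;
    case OPERAND_INT_MAX:
        return target->int_max;
    case OPERAND_LITERAL:
    default:
        return op->literal;
    }
}
```

The same operand resolves to different values on different install sites, while the distributed representation itself contains no target-specific number. This mirrors, in a simplified way, how deferral keeps the intermediate representation architecture neutral.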
Thus, as is clear from the above, the machine independent computer program can be compiled using the ANDF Compiler of the present invention to produce the compiler intermediate representation which is free from any machine dependencies. The compiler intermediate representation represents an architecture neutral distribution format and can be distributed to heterogeneous target computer platforms. Distribution is enhanced since a single version of the machine independent computer program can be used, with no modifications, on heterogeneous target computer platforms.
Further features and advantages of the present invention will be apparent from the ensuing description with reference to the accompanying drawings to which, however, the scope of the present invention is in no way limited.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a high-level structural and operational block diagram of conventional computer software compilers. The square blocks represent modules and the arrows represent operation and data flow.
FIG. 2 is a high-level structural and operational block diagram of a preferred embodiment of the present invention. The square blocks represent modules and the arrows represent operation and data flow.
FIGS. 3A, 3B, and 3C are tables listing the instruction classes, instruction mnemonics, operational code (opcode) hex values, and descriptions of the HPcode-Plus compiler intermediate language instruction set for the ANSI-C computer programming language.
FIG. 3D illustrates the manner in which FIGS. 3A, 3B, and 3C are connected.
FIGS. 4A-4G are all associated with instruction classes, instruction mnemonics, operational code (opcode) hex values, and descriptions of the HPcode-Plus compiler intermediate language instructions for programming languages other than the ANSI-C computer programming language.
FIG. 4A is a table listing the symbolic identifiers and descriptions of additional HPcode-Plus predefined data types which support computer programming languages other than ANSI-C.
FIG. 4B is a table listing additional <sym kind> values for the SYM HPcode-Plus instruction which support computer programming languages other than ANSI-C.
FIGS. 4C1 and 4C2 are tables listing additional HPcode-Plus operators which support computer programming languages other than ANSI-C.
FIG. 4C3 illustrates the manner in which FIGS. 4C1 and 4C2 are connected.
FIGS. 4D, 4E, 4F, and 4G are tables listing additional HPcode-Plus operators which support the ADA, COBOL, FORTRAN, and PASCAL, respectively, computer programming languages.
FIG. 5 is a table listing HPcode-Plus predefined data types for the ANSI-C computer programming language.
FIG. 6 is a table showing the mapping from ANSI-C data types to HPcode-Plus data types.
FIG. 7 is a table listing values of <sym kind> of a HPcode-Plus instruction SYM for defining data types other than HPcode-Plus predefined data types.
FIG. 8 is a sequence of HPcode-Plus instructions which show the structure of a HPcode-Plus object file.
FIG. 9 is a table listing predefined symbolic identifiers for HPcode-Plus predefined data types for the ANSI-C computer programming language.
FIG. 10 is a table which lists values of <sym kind> of the HPcode-Plus instruction SYM.
FIG. 11 is a structural and operational block diagram of a preferred embodiment of the ANDF Producer. The square blocks represent modules and the arrows represent operation and data flow.
FIG. 12 is a structural and operational block diagram of a preferred embodiment of a compiler component of the ANDF Producer. The square blocks represent modules and the arrows represent operation and data flow.
FIG. 13 is a structural and operational block diagram of a preferred embodiment of the ANDF Installer. The square blocks represent modules and the arrows represent operation and data flow.
FIG. 14 is a partial computer program listing of an example machine independent ANSI-C computer program.
FIGS. 15A, 15B, 15C, and 15D show a sequence of HPcode-Plus instructions which represents a HPcode-Plus translation of the partial computer program listing of FIG. 14.
FIG. 15E illustrates the manner in which FIGS. 15A, 15B, 15C, and 15D are connected.
______________________________________
DETAILED DESCRIPTION
OF THE PREFERRED EMBODIMENTS
TABLE OF CONTENTS
______________________________________
1. ANDF Compiler
2. HPcode-Plus
2.1. Virtual Machine Model (Expression Stack Model)
2.2. Memory Model
2.3. Memory Allocation and Data Types
2.4. HPcode-Plus Object File
2.5. HPcode-Plus Instruction Set for ANSI-C
3. ANDF Producer
3.1. Preprocessor
3.2. Compiler
3.2.1. Scanner/Parser
3.2.2. Semantic Analyzer
3.2.3. Code Generator
3.2.3.1. Memory Allocation
3.2.3.2. Scope, Linkage, and Declaration of Variable
3.2.3.3. Constants
3.2.3.3.1. Floating Point and Integer Constants
3.2.3.3.2. Enumeration Constants
3.2.3.3.3. Character Constants
3.2.3.4. Data Conversions
3.2.3.5. Postfix Expressions
3.2.3.6. Unary Operations
3.2.3.7. Other Operations
3.2.3.8. Folding of Constant Operations
3.2.3.9. Initialization
3.2.3.10. Statements
3.2.3.11. Functions
3.2.3.12. Example
3.3. High-Level Optimizer
3.4. Archiver
4. ANDF Installer
4.1. Tuple-Generator
4.2. Low-Level Code Generator
4.2.1. Instruction Selection
4.2.2. Memory Allocation
4.2.3. Symbolic Debug Support
4.2.4. Optimization Support
4.2.5. Object File Management
4.3. Low-Level Optimizer
4.4. Register Allocator
4.5. Machine Specific Optimizer
4.6. Object File Generator
______________________________________
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
1. ANDF Compiler
The present invention is directed to the standards being promulgated by ANSI for the C programming language (i.e., ANSI-C) and by the Open Software Foundation (OSF) for Unix computer platforms. It should be understood, however, that the present invention is not limited to the ANSI and OSF standards.
The present invention, in either its present form or in the forms now contemplated, is applicable to computing environments which use deviations, modifications, and extensions of the ANSI and OSF standards. The scope of the present invention with respect to the ANSI and OSF standards is more fully described in the following text.
As shown in FIG. 1, a conventional compiler 106 is logically divided into two parts, a compiler front end 108 and a compiler back end 116. The compiler front end 108 receives a computer program source code 102 as input. The computer program 102 is ordinarily written in a high-level computer programming language such as Pascal, C, or Ada.
The compiler front end 108 is responsible for the language processing of computer programs, such as scanning, parsing, and semantic analysis. Following the completion of all language processing, the compiler front end 108 translates the computer program source code 102 into a compiler intermediate representation 112. The compiler intermediate representation 112 is written in a compiler intermediate language, such as Pcode and Ucode.
The compiler back end 116 receives as input the compiler intermediate representation 112 and generates object code 120 for a target computer platform (not shown). The target computer platform is the computer platform where the compiler back end 116 resides. The object code 120 is written in a particular machine language of the target computer platform.
Ordinarily, the compiler front end 108 operates in a machine dependent, or architecture dependent, manner. Thus, the compiler intermediate representation 112, which is generated by the compiler front end 108, is usually dependent upon the computer architecture of the native computer platform (i.e., the computer platform where the compiler front end 108 resides).
The present invention is a significant improvement from the conventional compiler 106 shown in FIG. 1. The improvement of the present invention is achieved by using a HPcode-Plus compiler intermediate language as the compiler intermediate language. The HPcode-Plus compiler intermediate language (or simply, HPcode-Plus) is an improvement upon conventional compiler intermediate languages, such as Pcode and Ucode, in that HPcode-Plus is architecture neutral. As such, compilers which are based on the HPcode-Plus compiler intermediate language operate in an architecture neutral, or machine independent, manner.
FIG. 2 presents an overview of a preferred embodiment of the present invention. Included in FIG. 2 is a high-level block diagram of an architecture neutral distribution format compiler 234 of the present invention (i.e., an ANDF compiler). The ANDF compiler 234 shown in FIG. 2 is based on the HPcode-Plus compiler intermediate language.
In the preferred embodiment of the present system, logic for the ANDF Compiler 234 is stored in a computer program. The computer program is stored in a computer readable medium, such as a magnetic tape, a magnetic disk, and a read only memory (ROM).
Like conventional compilers 106, the ANDF compiler 234 of the present invention has a front end. With the ANDF compiler 234, however, the front end is called an ANDF Producer 208. The ANDF compiler 234 also has one or more back ends. In the preferred embodiment of FIG. 2, two back ends, called ANDF Installers 218 and 228, are shown.
The ANDF Producer 208 resides on a native computer platform 206. It should be noted that the native computer platform 206 is also called a producer site. The one or more ANDF Installers 218 and 228 reside on install sites 216 and 226, respectively, which are also called target computer platforms. It should be noted that the producer site 206 and install sites 216, 226 may or may not represent the same computer platform.
The ANDF Producer 208 and the ANDF Installers 218, 228 can operate either independently, as two separate computer programs, or together, as two phases of a single computer program.
As shown in FIG. 2, the ANDF Producer 208 receives the ANSI-C source code of machine independent computer programs 202. It should be understood, however, that the present invention is not limited to support for only ANSI-C source language programs. As described below, support for other high-level languages is contemplated.
Machine independent computer programs 202 are computer programs which are composed of instructions. These instructions include high-level source statements and expressions.
Machine independent computer programs 202 do not make assumptions on system specific features, such as memory architectures and register architectures. Machine independent computer programs 202 also do not contain references to system specific functions, such as non-standard operating system function calls. Machine independent computer programs 202 may contain references to standard object-like macros, function-like macros, and data type definitions from standard header files. Machine independent computer programs 202 may also contain references to standardized function calls. The standard header files and standardized function calls are defined by a language standard, such as ANSI-C.
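The distinction between a program that assumes a system specific feature and one that does not can be illustrated with a short C sketch. The two function names are hypothetical, and the example considers only one kind of assumption, namely the size of an int.

```c
#include <stddef.h>

/* Machine dependent: hard-codes the assumption that an int occupies
 * 4 bytes, which holds on some computer platforms but not on others. */
size_t bytes_needed_dependent(size_t count)
{
    return count * 4;
}

/* Machine independent: relies only on the language-defined sizeof
 * operator, so the size is decided for whichever platform the
 * program is finally built for. */
size_t bytes_needed_portable(size_t count)
{
    return count * sizeof(int);
}
```

Only the second form is suitable as input to an ANDF Producer in the sense described above, because it leaves the size decision open instead of fixing it in the source text.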
The ANDF Producer 208 operates in an architecture neutral, or machine independent, manner according to the HPcode-Plus compiler intermediate language. Thus, in translating the ANSI-C source language program 202 to a compiler intermediate representation 212, the ANDF Producer 208 makes no assumptions about the architecture of the target computer platforms 216 and 226. Thus, the compiler intermediate representation 212 generated by the ANDF Producer 208 is architecture neutral, or machine independent, and represents an architecture neutral distribution format (ANDF).
The compiler intermediate representation 212 is distributed to the target computer platforms 216 and 226. The ANDF Installers 218 and 228, which reside on the target computer platforms 216 and 226, respectively, translate the compiler intermediate representation 212 to object code representations 222 and 232.
In an alternative embodiment of the present invention, ANDF Installers 218, 228 are replaced by ANDF Interpreters (not shown in FIG. 2). The ANDF Interpreters directly execute the compiler intermediate representation 212 without first translating the compiler intermediate representation 212 to object code representations 222, 232.
As noted above, the HPcode-Plus compiler intermediate language is used as the compiler intermediate language in the preferred embodiment of the present invention. Thus, the ANDF Producer 208 writes the compiler intermediate representation 212 in the HPcode-Plus compiler intermediate language.
The HPcode-Plus compiler intermediate language is an improvement upon HPcode, which was an improvement upon U-Code. U-Code is a compiler intermediate language which was originally used for distributing a Pascal compiler to the CRAY-1 and S-1 computer platforms. It was developed by Stanford and the University of California at San Diego. U-Code, however, was not architecture neutral and could only support Pascal and Fortran source language programs.
HPcode is also a compiler intermediate language. It was based on U-Code, but its instruction set was expanded to support other high-level languages, including Pascal, Fortran, Ada, Cobol, RPG, Business Basic, an internal Algol-like language, and a fourth generation language called HP-Transact.
As noted above, the HPcode-Plus compiler intermediate language is an improvement upon HPcode, and upon all conventional compiler intermediate languages, in that HPcode-Plus is architecture neutral. As such, compilers which are based on the HPcode-Plus compiler intermediate language, such as the ANDF compiler 234 of the present invention, operate in an architecture neutral, or machine independent, manner.
The HPcode-Plus compiler intermediate language is a further improvement upon HPcode, in that HPcode-Plus supports the C programming language.
The following sections describe the present invention in more detail.
Section 2 describes the HPcode-Plus compiler intermediate language, which is the compiler intermediate language of the present invention.
Section 3 describes the ANDF Producer 208 of the present invention.
Section 4 describes the ANDF Installer 218, 228 of the present invention.
Some aspects of the present invention can be implemented using existing compiler technology. However, modifications upon existing compiler technology are required to achieve the improvements of the present invention. The discussions in Sections 2, 3, and 4 focus on these modifications upon existing compiler technology. For a general discussion of existing compiler technology, see Compilers: Principles, Techniques, and Tools by Alfred V. Aho, Ravi Sethi, and Jeffrey D. Ullman (Addison Wesley 1986), which is incorporated in its entirety herein by reference.
As it presently exists, the HPcode-Plus compiler intermediate language can be used as the compiler intermediate language for architecture neutral C compilers on OSF computer platforms, such as the ANDF compiler 234 of the present invention. Thus, the descriptions of the preferred embodiment of the present invention in Sections 2, 3, and 4 are, for the most part, focused on this computing environment.
It should be understood, however, that the present invention is not restricted to this computing environment.
Since it is an improvement of HPcode, the HPcode-Plus compiler intermediate language has a rich instruction set for supporting other high-level languages, such as Pascal, Fortran, Ada, Cobol, RPG, Business Basic, an internal Algol-like language, and a fourth generation language called HP-Transact. It is contemplated to modify HPcode-Plus to support these languages in an architecture neutral manner.
Additionally, the compiler intermediate representations 212 produced by ANDF compilers 234 should operate on any computer platform, provided that (1) the source language program 202 was written in a machine independent manner using standardized function calls to a run-time library, and (2) appropriate ANDF Installers 218, 228 exist on the target computer platforms 216, 226.
2. HPcode-Plus
In the preferred embodiment of the present invention, the HPcode-Plus compiler intermediate language is used as the compiler intermediate language.
HPcode-Plus is an improvement upon conventional compiler intermediate languages in that HPcode-Plus is architecture neutral. As such, compilers which are based on the HPcode-Plus compiler intermediate language, such as the ANDF compiler 234 of the present invention, operate in an architecture neutral, or machine independent, manner.
Features of the HPcode-Plus compiler intermediate language are described in this section. HPcode-Plus instructions are shown with all letters capitalized.
Other features of HPcode-Plus are described as necessary in Sections 3 and 4 of this document. When reading these sections, it may be helpful to refer to FIGS. 3A, 3B, 3C and 3D which present a list of the HPcode-Plus compiler intermediate language instructions for ANSI-C.
2.1 Virtual Machine Model (Expression Stack Model)
The HPcode-Plus compiler intermediate language is very similar to assembly language for a HPcode-Plus virtual (i.e., fictional) computer platform. Correspondingly, compiler intermediate representations written in the HPcode-Plus compiler intermediate language are very similar to assembly language programs for the HPcode-Plus virtual computer platform.
Referring to FIG. 2, for example, the ANDF Producer 208 translates the source code 202 into its equivalent assembly language representation 212.
The ANDF Producer 208, however, does not generate the assembly language representation 212 for its native computer platform 206. Rather, the ANDF Producer 208 generates the assembly language representation 212 for the HPcode-Plus virtual computer platform. The HPcode-Plus compiler intermediate language represents the assembly language for the HPcode-Plus virtual computer platform.
The ANDF Installer 218, 228 receives the compiler intermediate representation 212, which represents assembly language for the HPcode-Plus virtual computer platform, and generates the object code 222, 232 for its target computer platform 216, 226.
The HPcode-Plus virtual computer platform contains an expression stack and a memory. Most HPcode-Plus instructions receive their arguments from and push their results to the expression stack. A data type is associated with each data object on the expression stack.
HPcode-Plus instructions do not directly manipulate arbitrary expression stack elements. At most, HPcode-Plus instructions manipulate only the top N elements of the expression stack, where N is defined for each HPcode-Plus instruction.
HPcode-Plus instructions DUP, SWP, and ROT manipulate the top elements on the expression stack without altering their values. A HPcode-Plus instruction DEL deletes the top element on the expression stack.
Several HPcode-Plus instructions are provided for moving data between the expression stack and the memory, including HPcode-Plus instructions LOD and ILOD for direct and indirect load, HPcode-Plus instructions STR and ISTR for direct and indirect store, and HPcode-Plus instruction INST for indirect non-destructive store.
In addition, data object addresses, labels, and procedures are loaded on the expression stack with HPcode-Plus instructions LDA, LDL, and LDP, respectively. HPcode-Plus instructions LDC and LCA are used to load constants and constant addresses on the expression stack.
The expression stack may not physically exist on target computer platforms 216, 226. For example, if target computer platforms 216, 226 are register-based, then the expression stack may be modeled in registers.
Care must be taken by the ANDF Installers 218, 228 to preserve the semantics of the expression stack. Values loaded onto the expression stack are "copied" onto the expression stack. HPcode-Plus instructions do not alter values already on the expression stack.
Certain restrictions are enforced concerning the use of the expression stack. Branching and label HPcode-Plus instructions require the expression stack to be empty. This relieves the ANDF Installer 218, 228 from having to determine all possible jump sources for each label.
HPcode-Plus procedure calls may occur when the expression stack is not empty. Elements which are on the expression stack when a HPcode-Plus instruction MST is executed will not be visible to the called procedure but will become visible again upon return.
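The expression stack model described above can be illustrated with a minimal sketch. The Python class below is a hypothetical model, not part of HPcode-Plus: the names `ExpressionStack`, `push`, `add`, and `run_add_example` are assumptions made for illustration, and `push` merely stands in for a load instruction such as LDC. ROT is omitted because its exact rotation semantics are not given in this section.

```python
class ExpressionStack:
    """Toy model of the expression stack; every element carries a data type."""

    def __init__(self):
        self._items = []  # list of (data_type, value) pairs; the top is last

    def push(self, data_type, value):
        # Hypothetical stand-in for a load instruction: the value is copied
        # onto the stack together with its data type.
        self._items.append((data_type, value))

    def pop(self):
        return self._items.pop()

    def is_empty(self):
        return not self._items

    def dup(self):
        # DUP: duplicate the top element without altering its value.
        self._items.append(self._items[-1])

    def swp(self):
        # SWP: exchange the top two elements without altering their values.
        self._items[-1], self._items[-2] = self._items[-2], self._items[-1]

    def delete(self):
        # DEL: delete the top element.
        self._items.pop()


def add(stack):
    """ADD: pop two operands of the same type and push their sum."""
    type2, op2 = stack.pop()
    type1, op1 = stack.pop()
    if type1 != type2:
        raise TypeError("ADD operands must have the same data type")
    stack.push(type1, op1 + op2)


def run_add_example():
    stack = ExpressionStack()
    stack.push("TYPE_INT", 2)
    stack.push("TYPE_INT", 3)
    stack.dup()     # stack: 2, 3, 3
    add(stack)      # stack: 2, 6
    stack.swp()     # stack: 6, 2
    stack.delete()  # stack: 6
    return stack.pop()
```

Each operation touches only the top one or two elements, which mirrors the rule that an instruction manipulates at most the top N elements of the expression stack.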
2.2 Memory Model
HPcode-Plus defines a memory model in which the memory is divided into 4 areas: Static memory, Local memory, Constant memory, and Parameter memory. Data objects in each memory area can have 3 attributes associated with them: Constant attribute, Register attribute, and Volatile attribute.
In the architecture neutral environment of the present invention, the ANDF Producer 208 makes no assumptions concerning the manner in which data objects are mapped into the memories of target computer platforms 216, 226. A HPcode-Plus instruction SYM is provided to defer actual memory allocation to the ANDF Installers 218, 228.
The ANDF Producer 208 uses the HPcode-Plus instruction SYM to give a unique symbolic identifier to each data object and data type. The only exception to this rule is that the symbolic identifier of data types and data objects in local scopes can be reused.
The ANDF Producer 208 also uses the HPcode-Plus instruction SYM to associate a memory type and attribute (i.e., Constant, Register, and Volatile) with each data object.
HPcode-Plus memory reference instructions access memory through the symbolic identifiers of data objects defined by the SYM HPcode-Plus instruction. These HPcode-Plus memory reference instructions include LDA and LCA to load an address of an HPcode-Plus data object onto the expression stack. Also, HPcode-Plus contains instructions which manipulate memory indirectly, such as the string instructions.
The ANDF Installers 218, 228 map data objects into the memories of their target computer platforms 216, 226. Such mapping depends on the memory type and attribute associated with each data object. A description of the memory types and attributes is presented in the following paragraphs.
Data objects with the Static memory type maintain their values from one invocation of a procedure to the next. Static memory can be subdivided into global static memory, imported static memory, exported static memory, and procedure static memory. Variables mapping into these memories include file static variables, imported static variables, exported static variables, and procedure (or local) static variables, respectively.
Data objects with the Local memory type do not maintain their values from one invocation of a procedure to the next. These objects are allocated to the Local memory area. Each procedure receives one Local memory area.
For languages which do not support nested level procedures, such as ANSI-C, local data objects can only be referenced by the defining procedure.
For languages supporting nested level procedures, such as Pascal, local data objects can be referenced by the defining procedure or its lower level nested procedures. For these languages, the scoping rules are equivalent to those in Pascal.
Data objects with the Parameter memory type are allocated in the Parameter memory area. Each procedure receives one Parameter memory area.
For languages which do not support nested procedures, such as ANSI C, variables in the Parameter memory area may be referenced only by the defining procedure.
For languages which support nested procedures, such as Pascal, variables in the Parameter memory area may be referenced from outside the defining procedure subject to the Pascal scoping rules. This area is allocated by the calling procedure.
All HPcode-Plus constants reside in the Constant memory area. In addition, any data objects with the Constant memory attribute defined by the SYM HPcode-Plus instruction of KIND_MODIFIER are allocated in the Constant memory area.
The ANDF Installers 218, 228, if possible, treat the Constant memory area as read-only. The ANDF Producer 208, however, does not assume that the target computer platforms 216, 226 can support read-only memory. Assignment to the Constant memory area yields undefined behavior.
The Register attribute serves as a hint to the ANDF Installers 218, 228 that the associated data object should be allocated in fast cache memory. The ANDF Installers 218, 228, however, are not obliged to honor the request. Loading an address of a variable with the register attribute set can automatically turn off the register attribute.
2.3 Memory Allocation and Data Types
The ANDF Producer 208 makes no assumptions concerning the manner in which data objects are mapped into the memories of target computer platforms 216, 226. Instead, the ANDF Producer 208 satisfies memory requests through the SYM HPcode-Plus instruction.
Actual memory allocation is performed by the ANDF Installers 218, 228. The actual size and alignment of each data object is determined by ANDF Installers 218, 228, based on the data type specified in the SYM HPcode-Plus instruction.
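To illustrate how deferring allocation keeps the Producer's output architecture neutral, the following Python sketch lets the same symbolic declarations yield different layouts on two targets. It is a hypothetical model only: the function `allocate`, the sequential allocation strategy, and the size and alignment figures in `LAYOUT_A` and `LAYOUT_B` are assumptions, not values taken from any actual ANDF Installer.

```python
def allocate(symbols, layout):
    """Assign an offset to each symbolic id using target-specific sizes.

    symbols: list of (symbolic_id, type_name) pairs, as a Producer might
             declare them through SYM (hypothetical representation).
    layout:  dict mapping type_name -> (size, alignment) for one target.
    """
    offsets = {}
    offset = 0
    for symid, type_name in symbols:
        size, align = layout[type_name]
        # Round the running offset up to the next multiple of the alignment.
        offset = (offset + align - 1) // align * align
        offsets[symid] = offset
        offset += size
    return offsets


# The Producer's view: symbolic ids and data types only, no sizes.
SYMBOLS = [(1, "TYPE_CHAR"), (2, "TYPE_INT"), (3, "TYPE_CHAR"), (4, "TYPE_LONGINT")]

# Two hypothetical targets with different sizes and alignments.
LAYOUT_A = {"TYPE_CHAR": (1, 1), "TYPE_INT": (2, 2), "TYPE_LONGINT": (4, 4)}
LAYOUT_B = {"TYPE_CHAR": (1, 1), "TYPE_INT": (4, 4), "TYPE_LONGINT": (8, 8)}
```

With these assumed layouts, `allocate(SYMBOLS, LAYOUT_A)` places the four objects at offsets 0, 2, 4, and 8, while `allocate(SYMBOLS, LAYOUT_B)` places them at 0, 4, 8, and 16. The declarations themselves never change.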
HPcode-Plus defines predefined data types. A list of the HPcode-Plus predefined data types is presented in FIG. 5. With minor differences, the HPcode-Plus predefined data types map into the corresponding data types in ANSI-C. The mapping of ANSI-C data types to HPcode-Plus predefined data types is presented in FIG. 6.
The HPcode-Plus predefined data types have unique predefined symbolic identifiers. The HPcode-Plus predefined data types are the only data types that need not be defined by the SYM HPcode-Plus instruction. They are the building blocks for user-defined data types.
The user-defined data types, which represent all data types besides the HPcode-Plus predefined data types, are defined using the SYM HPcode-Plus instruction with the <sym kind> parameter equal to one of the type values listed in FIG. 7.
The ANDF Installers 218, 228 must ensure that their memory allocation schemes are consistent with those of a compiler on the native computer platform 206.
2.4 HPcode-Plus Object File
The compiler intermediate representation 212 produced by the ANDF Producer 208 is stored in a HPcode-Plus Object file 1160 (FIG. 11).
The HPcode-Plus Object file 1160 is a file containing a sequence of HPcode-Plus instructions in ASCII form which follow certain rules of form. The HPcode-Plus Object file 1160 is also referred to as a compilation unit.
As shown in FIG. 11, several HPcode-Plus Object files 1150, 1160 can be archived or linked by an Archiver/Linker 1154 to produce a single HPcode-Plus Archive file 1158 or Linked HPcode-Plus File 1170.
The format of the HPcode-Plus Object file 1150, 1160 is shown in FIG. 8.
HPcode-Plus instruction names are shown in FIG. 8 in text form for readability only. In actual HPcode-Plus Object files 1150, 1160, numeric opcodes are used instead.
Instructions are delimited by ASCII new line characters. All integers, including opcodes, are represented by hexadecimal literals to minimize the size of the HPcode-Plus Object file 1150, 1160.
Within a line, fields are delimited by one or more blanks. An `opcode` field is first, followed by zero or more `operand` fields. Operand fields consist of integers, quoted strings, labels, real numbers, digit strings representing sets and `%` or `#` followed by integers representing macro arguments and SYM symbolic ids, respectively. Quoted strings are delimited by double quotes (an internal quote is represented by two double quotes in a row).
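The field rules above can be sketched as a small tokenizer. This Python function is a hypothetical illustration, not the format's reference parser: the name `split_fields` and the tags `"string"`, `"symid"`, `"macro_arg"`, and `"raw"` are assumptions. Integers, labels, real numbers, and set digit strings are left as raw text because their exact literal syntax is not spelled out here.

```python
def split_fields(line):
    """Split one instruction line into (kind, text) fields.

    The first field returned is the opcode; the rest are operands.
    """
    fields = []
    i, n = 0, len(line)
    while i < n:
        ch = line[i]
        if ch == " ":
            # Fields are delimited by one or more blanks.
            i += 1
            continue
        if ch == '"':
            # Quoted string: two double quotes in a row mean one literal quote.
            i += 1
            chars = []
            while True:
                if i >= n:
                    raise ValueError("unterminated quoted string")
                if line[i] == '"':
                    if i + 1 < n and line[i + 1] == '"':
                        chars.append('"')
                        i += 2
                        continue
                    i += 1
                    break
                chars.append(line[i])
                i += 1
            fields.append(("string", "".join(chars)))
            continue
        start = i
        while i < n and line[i] != " ":
            i += 1
        text = line[start:i]
        if text[0] == "#":
            fields.append(("symid", text[1:]))      # SYM symbolic id
        elif text[0] == "%":
            fields.append(("macro_arg", text[1:]))  # macro argument
        else:
            fields.append(("raw", text))            # opcode, integer, label, ...
    return fields
```

For example, the made-up line `1f #2a "say ""hi""" %1` splits into a raw opcode `1f`, a symbolic id `2a`, the string `say "hi"`, and a macro argument `1`.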
Each ENT/END HPcode-Plus instruction sequence denotes the code of a procedure. SYM HPcode-Plus instructions within a pair of KIND_FUNCTION and KIND_END SYM HPcode-Plus instructions represent the data declarations of that procedure.
For languages which allow multiple entry points, more than one ENT HPcode-Plus instruction appears before the END HPcode-Plus instruction. The first ENT HPcode-Plus instruction signals the primary entry point and defines the start of the scope of a procedure. The END HPcode-Plus instruction signals the end of the entire procedure.
If the HPcode-Plus Object file 1150, 1160 contains the procedure which serves as the program entry point, then the HPcode-Plus Object file 1150, 1160 is specially marked with an OPTN HPcode-Plus instruction. Execution of an HPcode-Plus computer program begins with the program entry point procedure. Each HPcode-Plus instruction is executed in sequence unless an error occurs or a HPcode-Plus instruction is executed which transfers control.
If the source computer program 202 contains nested procedures (Pascal, Cobol or Ada), the outer level procedure and all its inner procedures must appear in the same HPcode-Plus Object file 1150, 1160.
2.5 HPcode-Plus Instruction Set for ANSI-C
As it presently exists, HPcode-Plus can be used as the compiler intermediate language for architecturally neutral C compilers on OSF computer platforms. It should be understood, however, that the present invention is not restricted to this computing environment.
As described above, HPcode-Plus has a rich instruction set for supporting high-level languages other than C, such as Pascal, Fortran, Ada, Cobol, RPG, Business Basic, an internal Algol-like language, and a fourth generation language called HP-Transact. It is contemplated to modify HPcode-Plus to support these languages in an architecture neutral manner.
Additionally, the compiler intermediate representations 212 produced by ANDF compilers 234 should operate on any computer platform, provided that (1) the source computer program 202 was written in a machine independent manner using standardized function calls to the run-time library, and (2) appropriate ANDF Installers 218, 228 exist on the target computer platforms 216, 226.
The HPcode-Plus instruction set for ANSI-C is presented in FIGS. 3A, 3B, 3C and 3D. These HPcode-Plus instructions are described in the following sections. In these sections, expression stack elements and mandatory passed parameters are denoted by "<>". Optional passed parameters are denoted by "[]". The abbreviation "op" represents "operand".
In its present form, HPcode-Plus does not support parallelism and vectorization. Currently, all HPcode-Plus instructions operate on scalar items. However, HPcode-Plus is capable of carrying sufficient information to support vector operations. Thus, HPcode-Plus can easily be enhanced to support both parallelism and vectorization.
Existing HPcode-Plus instructions which are required to support other high-level languages, but which are not yet completely architecture neutral, are presented in FIGS. 4A through 4G.
2.5.1. ACVT--Arithmetic ConVerT
The syntax of HPcode-Plus instruction ACVT is presented below:
ACVT
ACVT is used to convert operands to a known data type. ACVT pops <op2> and <op1> from the expression stack. ACVT performs data type conversions on <op2> and <op1> to prepare <op2> and <op1> for arithmetic processing. After conversion, <op2> and <op1> are pushed onto the expression stack.
Conversion rules are language specific. HPcode-Plus instruction LANGUAGE OPTN indicates which set of rules to use. For example, the usual arithmetic conversions for ANSI-C are:
If either operand has type long double, the other operand is converted to long double.
Otherwise, if either operand has type double, the other operand is converted to double.
Otherwise, if either operand has type float, the other operand is converted to float.
Otherwise, if one operand has type long int and the other has type unsigned int, then if a long int can represent all values of an unsigned int, the operand of type unsigned int is converted to long int; if a long int cannot represent all the values of an unsigned int, both operands are converted to unsigned long int.
Otherwise, if either operand has type long int, the other operand is converted to long int.
Otherwise, if either operand has type unsigned int, the other operand is converted to unsigned int.
Otherwise, both operands are converted to type int.
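The rules listed above can be sketched as a single function. The Python code below is an illustration only and encodes exactly the rules as listed; the function name `usual_arithmetic_conversion` and the parameter `long_holds_all_unsigned` are hypothetical. The parameter models the one decision in the list that depends on the target computer platform, which is why the conversion cannot be resolved by the ANDF Producer 208.

```python
def usual_arithmetic_conversion(type1, type2, long_holds_all_unsigned):
    """Return the common type for two operand types under the listed rules.

    long_holds_all_unsigned: True if, on the target, a long int can
    represent every value of an unsigned int (a machine dependent fact).
    """
    operands = {type1, type2}
    # Floating types take precedence, widest first.
    for floating in ("long double", "double", "float"):
        if floating in operands:
            return floating
    # The long int / unsigned int pairing depends on the target's ranges.
    if operands == {"long int", "unsigned int"}:
        return "long int" if long_holds_all_unsigned else "unsigned long int"
    if "long int" in operands:
        return "long int"
    if "unsigned int" in operands:
        return "unsigned int"
    return "int"
```

For the pair long int and unsigned int, the result is long int when `long_holds_all_unsigned` is true and unsigned long int otherwise, so the same intermediate code yields different operand types on different targets.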
The ANDF Installers 218, 228 use a general conversion table to implement the conversion rules. This conversion table is implemented as a two dimension conversion table as follows:
__________________________________________________________________________
                 TYPE_CHAR    TYPE_UNS_CHAR    . . .
__________________________________________________________________________
TYPE_CHAR        TYPE_INT     TYPE_INT         . . .
TYPE_UNS_CHAR    TYPE_INT     TYPE_INT         . . .
. . .            . . .        . . .            . . .
__________________________________________________________________________
Each language has a unique conversion table which resides on each target computer platform 216, 226.
2.5.2. ADD--ADD
The syntax of HPcode-Plus instruction ADD is presented below:
ADD
ADD pops <op1> and <op2> from the expression stack. The addition <op1>+<op2> is performed to produce a result. The result, having the same data type as <op1> and <op2>, is then pushed on the expression stack.
2.5.3. AND--logical AND
The syntax of HPcode-Plus instruction AND is presented below:
AND
AND pops <op1> and <op2> from the expression stack. A logical AND operation, <op1> AND <op2>, is performed to produce a result. The result, having the same data type as <op1> and <op2>, is then pushed on the expression stack.
A bitwise AND is performed when <op1> and <op2> are integers or characters. That is, corresponding bits in <op1> and <op2> are ANDed to produce <result>.
2.5.4. CEND--Conditional evaluate END
The syntax of HPcode-Plus instruction CEND is presented below:
CEND
This HPcode-Plus instruction marks the end of a conditionally evaluated instruction sequence begun by a matching HPcode-Plus instruction CEXP.
The CEND HPcode-Plus instruction does not manipulate the expression stack. The item that is on top of the expression stack is the result produced by the matching CEXP instruction. The type of the item on top of the stack must be the same as the data type of the two CEXP clauses.
The CEND HPcode-Plus instruction can only be used to terminate the most recent CEXP HPcode-Plus instruction. It is an error if there is no preceding conditional instruction.
2.5.5. CEVL--Conditional EVaLuate
The syntax of HPcode-Plus instruction CEVL is presented below:
CEVL
This HPcode-Plus instruction leaves the expression stack unchanged. CEVL acts as a delimiter between two parts of a conditional expression. For example, two CEVLs are required with each CEXP instruction. One CEVL separates <boolean value> from <true expression>, and the other separates <false expression> from <true expression>.
Stack items on the expression stack at the time of CEVL are not accessed until the conditional expression is terminated with a CEXP instruction.
2.5.6. CEXP--Conditionally evaluate EXPression
The syntax of HPcode-Plus instruction CEXP is presented below:
CEXP
This HPcode-Plus instruction pops a boolean operand <boolean op> off the top of the expression stack. If <boolean op> is FALSE, then control is passed to a matching CSEP instruction. An item that is on top of the expression stack when a matching CEND instruction is encountered is treated as a result of the CEXP instruction.
If the item popped off the top of the expression stack is TRUE, then HPcode-Plus instructions up to the matching CSEP are evaluated. Control is then passed to the matching CEND HPcode-Plus instruction. The item that is on top of the stack when the matching CSEP instruction is encountered is treated as the result of the CEXP HPcode-Plus instruction.
The HPcode-Plus instructions between the CEXP and the matching CSEP HPcode-Plus instructions represent the true clause of the conditional evaluation. The HPcode-Plus instructions between the CSEP and the matching CEND HPcode-Plus instructions represent the false clause of the conditional evaluation. The true clause and the false clause must each result in either zero or one item being pushed onto the expression stack. The data types of the items pushed onto the expression stack by the two clauses must match, and this data type is the data type of the result of the CEXP HPcode-Plus instruction.
It is possible for the true and false clauses to not leave any result on the stack. In particular, this happens with the ANSI-C conditional expression operator if the true and false clauses are of type void. There is no <result> pushed to the expression stack in this case.
It is an error to specify the CEXP HPcode-Plus instruction without specifying matching CSEP and CEND HPcode-Plus instructions. It is also an error for either the true clause or the false clause to reference items pushed on the expression stack outside the respective clauses.
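The control flow of CEXP, CSEP, and CEND can be sketched with a toy interpreter. The Python code below is a hypothetical model: the tuple-based program representation, the `PUSH` pseudo-instruction (a stand-in for a load instruction), and the functions `find_matching` and `run` are assumptions made for illustration. Nested conditional expressions are matched by tracking nesting depth.

```python
def find_matching(program, cexp_index):
    """Return (csep_index, cend_index) matching the CEXP at cexp_index."""
    depth = 0
    csep_index = None
    for i in range(cexp_index + 1, len(program)):
        op = program[i][0]
        if op == "CEXP":
            depth += 1  # entering a nested conditional expression
        elif op == "CSEP" and depth == 0:
            if csep_index is not None:
                raise ValueError("more than one CSEP for a single CEXP")
            csep_index = i
        elif op == "CEND":
            if depth == 0:
                if csep_index is None:
                    raise ValueError("CEXP without matching CSEP")
                return csep_index, i
            depth -= 1  # leaving a nested conditional expression
    raise ValueError("CEXP without matching CEND")


def run(program):
    """Execute a list of instruction tuples and return the final stack."""
    stack = []
    skip_to = {}  # CSEP index -> matching CEND index
    pc = 0
    while pc < len(program):
        op, *args = program[pc]
        if op == "PUSH":
            stack.append(args[0])
        elif op == "CEXP":
            csep, cend = find_matching(program, pc)
            skip_to[csep] = cend
            if not stack.pop():
                pc = csep  # FALSE: continue just after the CSEP
        elif op == "CSEP":
            if pc not in skip_to:
                raise ValueError("CSEP without preceding CEXP")
            pc = skip_to[pc]  # end of true clause: skip the false clause
        elif op == "CEND":
            pass  # the item on top of the stack is the result
        else:
            raise ValueError("unknown instruction: " + op)
        pc += 1
    return stack


def conditional(flag):
    # Models the ANSI-C expression: flag ? 10 : 20
    return [("PUSH", flag), ("CEXP",), ("PUSH", 10), ("CSEP",), ("PUSH", 20), ("CEND",)]
```

Running `run(conditional(True))` leaves 10 on the stack, and `run(conditional(False))` leaves 20. In both cases exactly one clause is evaluated.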
2.5.7. CLDC--C Load Constant
The syntax of HPcode-Plus instruction CLDC is presented below:
CLDC <flag> <constant value>
CLDC converts <constant value> to a data type indicated by <flag> to produce a result. CLDC then pushes the result onto the expression stack.
The values and corresponding data types of <flag> are presented below.
______________________________________
<flag>   Data Types
______________________________________
0        TYPE_INT, TYPE_LONGINT, TYPE_UNS_LONGINT
1        TYPE_INT, TYPE_UNS_INT, TYPE_LONGINT, TYPE_UNS_LONGINT
2        TYPE_UNS_INT, TYPE_UNS_LONGINT
3        TYPE_LONGINT, TYPE_UNS_LONGINT
4        TYPE_UNS_LONGINT
______________________________________
If <flag> is 0, then the ANDF Installer 218, 228 converts <constant value> to TYPE_INT, if possible. If it is not possible, then the ANDF Installer 218, 228 converts <constant value> to TYPE_LONGINT, if possible. If it is not possible, then the ANDF Installer 218, 228 converts <constant value> to TYPE_UNS_LONGINT.
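The selection described for a <flag> of 0 can be sketched as a first-fit search over the candidate data types. The Python code below is a hypothetical illustration: the function `select_cldc_type` and the ranges in `TARGET_A` and `TARGET_B` are assumptions, and applying the same first-fit order to the other <flag> values is an extrapolation from the <flag> 0 description rather than something stated in the text.

```python
# Candidate data types for each <flag> value, in the order of the table above.
CLDC_CANDIDATES = {
    0: ["TYPE_INT", "TYPE_LONGINT", "TYPE_UNS_LONGINT"],
    1: ["TYPE_INT", "TYPE_UNS_INT", "TYPE_LONGINT", "TYPE_UNS_LONGINT"],
    2: ["TYPE_UNS_INT", "TYPE_UNS_LONGINT"],
    3: ["TYPE_LONGINT", "TYPE_UNS_LONGINT"],
    4: ["TYPE_UNS_LONGINT"],
}


def select_cldc_type(flag, value, type_ranges):
    """Pick the first candidate type that can represent value on a target.

    type_ranges: dict mapping type name -> (minimum, maximum) on the target.
    The last candidate is used when no earlier candidate fits.
    """
    candidates = CLDC_CANDIDATES[flag]
    for type_name in candidates[:-1]:
        low, high = type_ranges[type_name]
        if low <= value <= high:
            return type_name
    return candidates[-1]


# Hypothetical target with 16-bit int and 32-bit long int.
TARGET_A = {
    "TYPE_INT": (-2**15, 2**15 - 1),
    "TYPE_UNS_INT": (0, 2**16 - 1),
    "TYPE_LONGINT": (-2**31, 2**31 - 1),
    "TYPE_UNS_LONGINT": (0, 2**32 - 1),
}

# Hypothetical target with 32-bit int and 32-bit long int.
TARGET_B = {
    "TYPE_INT": (-2**31, 2**31 - 1),
    "TYPE_UNS_INT": (0, 2**32 - 1),
    "TYPE_LONGINT": (-2**31, 2**31 - 1),
    "TYPE_UNS_LONGINT": (0, 2**32 - 1),
}
```

Under these assumed ranges, the constant 40000 with a <flag> of 0 becomes TYPE_LONGINT on `TARGET_A` but TYPE_INT on `TARGET_B`, which shows why the final data type is only known at install time.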
The ANDF Producer 208 does not know what the ultimate data type of <constant value> will be. Thus, the ANDF Producer 208 cannot use CLDC prior to arithmetic operations. For arithmetic operations, ANDF Producers must use the ACVT HPcode-Plus instruction.
2.5.8. COMM--COMMent
The syntax of HPcode-Plus instruction COMM is presented below:
COMM <comment>
COMM is used to place comments within the HPcode-Plus Object file 1150, 1160. COMM HPcode-Plus instructions can appear anywhere in the HPcode-Plus Object file 1150, 1160.
2.5.9. CSEP--Conditional evaluation SEParator
The syntax of HPcode-Plus instruction CSEP is presented below:
CSEP
CSEP leaves the stack unchanged. CSEP is used in conjunction with CEXP HPcode-Plus instructions. CSEP HPcode-Plus instructions simply act as delimiters between true clauses and false clauses of matching CEXP HPcode-Plus instructions.
An error is generated if CSEP HPcode-Plus instructions are not accompanied by CEXP HPcode-Plus instructions.
2.5.10. CSJP--CaSe JumP
The syntax of HPcode-Plus instruction CSJP is presented below:
CSJP <else label>
CSJP pops <selector> from the expression stack.
CSJP uses <selector> to jump to either a location specified by HPcode-Plus instruction CTAB or <else label>.
CSJP must be immediately followed by one or more consecutive CTAB HPcode-Plus instructions, each of which specifies a label and a range of values. No pair of ranges may overlap.
CSJP branches to the label specified by the CTAB HPcode-Plus instruction whose range includes <selector>. If none of the CTAB ranges include <selector>, then CSJP branches to <else label>.
<selector> must be the only item on the expression stack when CSJP is executed.
2.5.11. CTAB--Case TABle
The syntax of HPcode-Plus instruction CTAB is presented below:
CTAB <case label> [#]<low bound> [#]<high bound>
CTAB specifies an entry in a case jump table. CTAB must immediately follow either HPcode-Plus instruction CSJP or another CTAB HPcode-Plus instruction. Different CTABs are allowed to have the same <case label>.
<Low bound> is an integer or a character specifying the lower bound of the range which selects <case label>. If [#] is passed, <low bound> must be an integer symbolic identifier of an integer or character constant.
<High bound> is an integer or a character specifying the upper bound of the range which selects <case label>. <Low bound> has to be less than or equal to <high bound>. If [#] is passed, <high bound> must be an integer symbolic identifier of an integer or character constant.
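The interplay of CSJP and its CTAB entries can be sketched as a range lookup. The Python function below is a hypothetical model: the name `case_jump` and the tuple representation of a CTAB entry are assumptions, and bounds are taken to be already resolved to plain integers (that is, any [#] symbolic identifiers have been looked up).

```python
def case_jump(selector, ctab_entries, else_label):
    """Return the label CSJP would branch to for the given selector.

    ctab_entries: list of (case_label, low_bound, high_bound) tuples, one
                  per CTAB instruction following the CSJP.
    """
    for _label, low, high in ctab_entries:
        if low > high:
            raise ValueError("low bound must not exceed high bound")
    # No pair of ranges may overlap: after sorting by low bound it is
    # enough to compare each range with its successor.
    ordered = sorted(ctab_entries, key=lambda entry: entry[1])
    for previous, following in zip(ordered, ordered[1:]):
        if following[1] <= previous[2]:
            raise ValueError("CTAB ranges overlap")
    for label, low, high in ctab_entries:
        if low <= selector <= high:
            return label
    # None of the CTAB ranges include the selector.
    return else_label


# Different CTAB entries may share the same case label.
EXAMPLE_TABLE = [("L1", 1, 3), ("L2", 4, 4), ("L1", 10, 12)]
```

With `EXAMPLE_TABLE`, the selectors 2 and 11 both select `L1`, the selector 4 selects `L2`, and the selector 7 falls through to the else label.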
2.5.12. CUP--Call User Procedure
The syntax of HPcode-Plus instruction CUP is presented below:
CUP <proc symid>
CUP is used to call a procedure or function. <Proc symid>, which represents the symbolic id of the procedure or function, must be previously defined by the KIND_FUNC_DCL or KIND_FUNCTION SYM HPcode-Plus instruction.
CUP initiates the procedure or function call using parameters in the procedure's or function's parameter area. These parameters were placed in the procedure's or function's parameter area by previous PAR HPcode-Plus instructions.
For function calls (i.e., when the type of the procedure is not TYPE_VOID), CUP reserves an area in memory for a return value. The return value is actually placed in this memory area by HPcode-Plus instruction STFN.
2.5.13. CVT--ConVerT
The syntax of HPcode-Plus instruction CVT is presented below:
CVT <result type>
CVT pops <value> from the expression stack. CVT converts <value> to the data type indicated by <result type>. The converted <value> is pushed on the expression stack.