Compilation and virtual machine arrangement and process for source code including pre-runtime executable language structure constructs
5838980
Abstract
Compilation and virtual machine arrangement and process for translating source code including pre-runtime executable instruction into compiled code having enhanced runtime effectiveness. The source code is formatted in accordance with a user determined and pre-runtime modifiable language definition. The source code is compiled by a generalized compiler and includes executable language specific structure constructs or instructions which pass through the generalized compiler in unexecuted form. The instructions are then executed in a virtual machine which produces compiled code of reduced size, which renders runtime execution of the compiled code of increased effectiveness.
Claims
What is claimed is:
Description
This application is related to co-pending patent applications each of them filed on Jan 18, 1994 and having the same inventorship as herein, and respectively entitled "Object Oriented Dispatch and Supercall Process and Arrangement," "Alternate Dispatch Variable Process and Arrangement for Run-Time Message Redirection and Enablement of Selected Object Oriented Methods," and "Variable Resolution Method and Arrangement." These co-pending patent applications have respective Ser. Nos., 08/184,492, 08/183,478, and 08/184,497. These patent applications are assigned to the same assignee as herein, and are incorporated herein by reference in their entirety.
______________________________________
METHOD XYZ (A INTEGER, B INTEGER)
{
RETURN ((A+B)*(A-B));
}.
______________________________________
This compiled source code exhibits a certain post-compilation syntax consistent with the expected compilation syntax of structure functions, as stated in the left-hand column of FIG. 3 setting forth a structure definition for applicable portions of the block of source code 12 set forth in FIG. 1, and which is consistent with the language definitions set forth in FIG. 10. The syntax, as indicated, requires that the function, "METHOD XYZ," which is a function of two integer variables, A and B, be expressed in terms of open and close parentheses surrounding the integer variable names, each name followed by an indication of its status as an integer and the variables separated by commas. Additionally, within brace brackets, the function represented by the method, i.e., RETURN ((A+B)*(A-B)), is set forth.
FIG. 4 shows details of the generalized compiler 14 (or compilers) in diagram form. The compiler(s) are effective to translate one or more languages into virtual machine code and into the executing address space within which the compiler(s) reside as well. More particularly, the generalized compiler 14 is shown in FIG. 4 to receive source code from source code block 12 which has been produced by the user at a suitable user interface, e.g., preferably a GUI. Generalized compiler 14 performs a syntax check on received source code in syntax checker 40, which operates on parsed code. The source code for generalized compiler 14 is received by a parser/tokenizer 42 operating according to conventional parser definitions, well known to one skilled in the art, which are applied by parser definition element 44. The resultant parsed and tokenized source code is next stored in a parsed source location 46, which supplies the parsed and tokenized source code for evaluation by syntax checker 40 in accordance with the process of recursive descent known to those skilled in the art.
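The tokenize-then-check path described above can be illustrated with a minimal Python sketch. It is not the patented compiler: the token pattern, the class name `SyntaxChecker`, the function `syntax_check`, and the small grammar for a METHOD definition are all assumptions made for illustration. It only shows how a recursive descent check over tokenized source either succeeds or yields an error report.

```python
import re

# Hypothetical token pattern: names/keywords, integers, single-character punctuation.
TOKEN_RE = re.compile(r"\s*([A-Za-z_]\w*|\d+|[(){},;+\-*])")

def tokenize(source):
    """Split source text into a flat list of token strings."""
    source = source.rstrip()
    tokens, pos = [], 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

class SyntaxChecker:
    """Recursive descent check for an assumed grammar:
    METHOD name ( name INTEGER {, name INTEGER} ) { RETURN expr ; }"""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expect(self, text):
        if self.peek() != text:
            raise SyntaxError(f"expected {text!r}, found {self.peek()!r}")
        self.pos += 1

    def name(self):
        tok = self.peek()
        if tok is None or not (tok[0].isalpha() or tok[0] == "_"):
            raise SyntaxError(f"expected a name, found {tok!r}")
        self.pos += 1

    def parameter(self):
        self.name()
        self.expect("INTEGER")

    def method(self):
        self.expect("METHOD")
        self.name()
        self.expect("(")
        self.parameter()
        while self.peek() == ",":
            self.expect(",")
            self.parameter()
        self.expect(")")
        self.expect("{")
        self.expect("RETURN")
        self.expression()
        self.expect(";")
        self.expect("}")
        if self.peek() is not None:
            raise SyntaxError(f"unexpected trailing token {self.peek()!r}")

    def expression(self):
        # expr := term { (+ | - | *) term }; precedence is ignored because
        # this sketch only checks syntax and does not evaluate anything.
        self.term()
        while self.peek() in ("+", "-", "*"):
            self.pos += 1
            self.term()

    def term(self):
        if self.peek() == "(":
            self.expect("(")
            self.expression()
            self.expect(")")
        elif self.peek() is not None and self.peek().isdigit():
            self.pos += 1
        else:
            self.name()

def syntax_check(source):
    """Return None when the source conforms, else an error report string."""
    try:
        SyntaxChecker(tokenize(source)).method()
    except SyntaxError as error:
        return str(error)
    return None

report = syntax_check("METHOD XYZ (A INTEGER, B INTEGER) { RETURN ((A+B)*(A-B)); }")
```

In this sketch each grammar rule is one method, which loosely mirrors the idea of separate functions being invoked on associated parsed syntax elements; a failed expectation plays the role of the error report.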
In accordance with the syntax check performed, indicated processes or functions 41a through 41n, corresponding to FUNCTIONS 1-N in FIG. 4, are conducted on associated parsed syntax elements. In the event of failure with respect to the syntax check, an error report 48 is prepared. A DG traversal pattern 50 is established in accordance with block 50. Finally, compiler virtual machine code generator 52 is established effectively to receive an input from parsed source output 46, functions 41a-41n, and DG traversal pattern 50. Accordingly, generalized compiler 14 produces compiler virtual machine code 16 from compiler virtual machine code generator 52.
FIG. 5 expresses the variably tokenized compiler virtual machine code 16 produced by the generalized compiler 14 in response to receipt of source code 12 and compiled language definition 22. In particular, FIG. 5 is a symbol table showing examples of tokens making up the compiler virtual machine code 16. As shown, there are many different kinds of tokens associated with the compiler virtual machine code 16 generated by the generalized compiler 14. FIG. 5 shows as many as "N+4" tokens in the symbol table, based upon an arbitrary integer number N. Specifically stated, a token is a selected integer number corresponding to a particular kind of code. For example, a particular integer can be selected to represent the first token, i.e., token 1. The kind of code assigned to token 1 in the example of FIG. 5 is a "compiler virtual machine (CVM) instruction code," called a _STRUCTURE_DEFINE code. Another kind of code represented in the symbol table of FIG. 5 is VALUE code. VALUE code includes such classes of code as "VARIABLE," "NAME," and "STRING" classes, as suggested in FIG. 5 respectively with reference to tokens 2, 4, and 5. Other kinds of CVM code include _ADD_ELEMENT, _PUSH_VARIABLE, and PLUS, as shown in FIG. 5.
These particular kinds of code are considered to be INSTRUCTION code types. Moreover, all of the above indicated kinds of code are examples of token code types.
FIG. 6 shows functional details of compiler virtual machine 18 in diagram form, according to the invention herein. As already noted above, compiler virtual machine 18 is effective for creation of compiled virtual machine code 20, based upon the compiler virtual machine code 16 which it receives as its input. The process of compiler virtual machine 18 begins with a first in order compiler virtual machine code token which is at the highest level of bracketing, in accordance with block 60. As will be discussed with reference to FIG. 10, structure constructs which are executable instructions are at a relatively high level of bracketing, permitting them priority of processing. Thus, at decision block 62, a determination is made whether the current token is a compiler virtual machine instruction. If the current token does not represent an instruction which is to be executed, the current token is output directly to a combined output stream, as suggested at block 64. After completion of output directly to the combined output stream as per block 64, the sequence proceeds to the next compiler virtual machine token. If the current token is a compiler virtual machine instruction, then the current token is dispatched to a compiler virtual machine instruction handler according to block 66. Compiler virtual machine instruction handlers 68a-68n are effective for performing a predetermined array of kinds of instruction programming and are designated as respective CVM INSTRUCTIONS 1-N. After processing has been completed by compiler virtual machine instruction handlers 68a-68n, a determination is made at decision block 70 as to whether the process has reached the end of the compiler virtual machine code 16. If not, then a return is made to decision block 62, for operation in connection with a next in order current token.
If the end of the particular level of compiler virtual machine code 16 has been reached, then a check is conducted in accordance with decision block 72 regarding whether operation is at the lowest bracket level. If not, according to block 74 operation proceeds by going to a next lower level of bracketing with a return to decision block 62 for continued operation at the next lower level of bracketing. However, if operation is already at the lowest bracketing level, then establishment of compiled virtual machine code 20 will have been completed.
FIG. 7 shows the compiled code 20 which may be produced in connection with the invention herein. In particular, FIG. 7 shows instances of instruction code including for example _PUSH_VARIABLE and PLUS. All of the above indicated kinds of code are examples of token code. Another kind of compiled code includes structures, as illustrated in the lower section of FIG. 7. In particular, the code produced includes structures of two kinds: variable structures and integer structures. Variable structures in the compiled code include variables and integers. The variable structures have metadata categories respectively of name, type, length, and number of flags. The integer structures have only a single metadata category, value. In FIG. 7, two variables, "A" and "B," are shown, and both are variables of type "INTEGER." The length of an integer variable is indicated to be zero, or "0." Integer variables are indicated to have eighteen flags. Integer structures are indicated to have integer values. Two examples of possible integer values are indicated in FIG. 7. These are "127" and "768." The compiler virtual machine 18 is effective for executing instructions which passed through generalized compiler 14. The result is to establish the already executed metadata structures of FIG. 7, which creates advantages at runtime, because the pre-execution of certain instructions yields a considerable gain in speed.
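The dispatch loop of FIG. 6 can be sketched in Python in a simplified, single-level form. The sketch below is an illustration under stated assumptions rather than the patented implementation: it ignores bracketing levels entirely, it assumes that _STRUCTURE_DEFINE takes one operand and _ADD_ELEMENT takes two, and the names `run_compiler_vm`, `HANDLERS`, and the element names used in the sample stream are hypothetical. It shows the central idea that instruction tokens are executed before runtime, while all other tokens pass straight to the output stream.

```python
def structure_define(operands, state):
    # Pre-runtime execution: record a new structure kind instead of emitting code.
    (name,) = operands
    state["structures"][name] = []
    state["current"] = name

def add_element(operands, state):
    # Attach one (element name, element type) pair to the structure being defined.
    element_name, element_type = operands
    state["structures"][state["current"]].append((element_name, element_type))

# Assumed table of pre-runtime instructions: token -> (operand count, handler).
HANDLERS = {
    "_STRUCTURE_DEFINE": (1, structure_define),
    "_ADD_ELEMENT": (2, add_element),
}

def run_compiler_vm(cvm_code):
    """Execute instruction tokens now; copy every other token to the output."""
    state = {"structures": {}, "current": None}
    output = []
    pos = 0
    while pos < len(cvm_code):
        token = cvm_code[pos]
        pos += 1
        if token in HANDLERS:
            # Dispatch to the matching instruction handler (block 66 in FIG. 6).
            arity, handler = HANDLERS[token]
            handler(cvm_code[pos:pos + arity], state)
            pos += arity
        else:
            # Not a pre-runtime instruction: output directly (block 64 in FIG. 6).
            output.append(token)
    return output, state["structures"]

compiled, structures = run_compiler_vm([
    "_STRUCTURE_DEFINE", "VARIABLE",
    "_ADD_ELEMENT", "NAME", "STRING",
    "_ADD_ELEMENT", "TYPE", "STRING",
    "_PUSH_VARIABLE", "A", "_PUSH_VARIABLE", "B", "PLUS",
])
```

In this sample run the structure-defining tokens are consumed and leave behind only metadata, so the output stream is shorter than the input, which is the size reduction the description attributes to pre-execution. Treating _PUSH_VARIABLE and PLUS as pass-through tokens is an assumption of the sketch.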
FIG. 8a shows the compiled language definition 22 employed by the generalized compiler 14 to compile the source code from source code block 12. As FIG. 8a shows, the compiled language definition 22 includes a plurality of nodes. In the example shown, three nodes are indicated, NODE 1, NODE 2, and NODE 3. Each node contains a plurality of tokens, for example tokens 1 through 11. In the example shown, token 6 of NODE 1 indicates a link between NODES 1 and 3. Further, token 11 of NODE 1 indicates a link between NODES 1 and 2. Simply stated, NODE 1 is directed toward both NODE 2 and NODE 3.
FIG. 8b is a symbol table relating to the compiled language definition 22 of FIG. 8a. A first token is the alphanumeric expression "IF." A second token is the integer number 127. Another token is the expression "FUNCTION." Another token represents the number 1.4142135. A next token represents the positive integer 768. Yet another token represents the expression "THEN." Another token represents the open brace { and another the close brace }. Each token corresponds to a different number.
FIG. 8c is a symbol table reference with respect to the compiled language definition 22 of FIG. 8a. The symbol table reference includes an indication of the type of token, which may be a literal token, an instruction token, or a right margin (RM) token. Each symbol has a particular node table reference number and a function table reference number.
FIG. 9 illustrates the language definition compiler 24 according to the invention herein. In particular, language definition compiler 24 receives as its input language definition 26. Then, the language definition compiler 24 transforms the input language definition 26 into a compiled language definition 22 (CLD). This is accomplished by parsing the input language definition received from language definition 26. The parsing is accomplished by a parser 90 indicated in FIG. 9 as an input oval from language definition 26.
Language definition 26 will be discussed in greater detail in connection with FIG. 10 below. By parsing, it is meant that the elements of the structures and language expressed in the syntax of the language are resolved into separate components. Each parsed language element and structure is tokenized by tokenized definition block 92. Tokenization is the association of a particular language element portion or structure element with a predetermined token integer number representing the particular kind of language element portion. As an example in connection with FIG. 10, and as will be expressed in greater detail below, the structure element "STRUCTURE" is associated with the token semantic [[_STRUCTURE_DEFINE]], which in turn has an integer number associated with it. FIG. 8a illustrates the expression of particular nodes as tokens expressed in terms of numerical integer values. Language definition compiler 24 in FIG. 9 further includes a process for identifying nodes 94 from tokenized definitions 92 which are produced by a parser 90 receiving language definition 26 in ASCII, for example. Once the nodes are identified according to oval 94 set forth in FIG. 9, these nodes are entered as information into a node table 96. A syntax check is undertaken according to oval 98 to determine conformance of each node with the syntax required of the particular structure or language element, as set forth in FIG. 10. The syntax check procedure of oval 98 may result in an error according to block 99, which results either in termination or resolution of the error identified. Particular node references are resolved in accordance with the resolve node references procedure represented by oval 100. Next, a cyclical directed graph of nodes is established as suggested at block 102. This cyclical directed graph of nodes is packaged in position independent form, as suggested at oval 104.
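A compact Python sketch can illustrate the tokenize, node table, and resolve-references steps just described. It is a simplification under assumptions that are not in the description: the input is a dictionary from node name to a list of elements, quoted elements are literals, unquoted elements are references to other nodes, and "position independent form" is modeled simply as integer indices into the node table rather than memory addresses. The function name `compile_language_definition` and the sample definition are hypothetical.

```python
def compile_language_definition(definition):
    """Turn {node name: [elements]} into a token-numbered, index-linked graph."""
    # Node table: each node name gets a fixed position.
    node_index = {name: i for i, name in enumerate(definition)}
    symbols = {}   # literal text -> token integer, numbered from 1
    nodes = []
    for name, elements in definition.items():
        compiled = []
        for element in elements:
            if element.startswith('"'):
                # Tokenization: associate each distinct literal with an integer.
                literal = element.strip('"')
                token = symbols.setdefault(literal, len(symbols) + 1)
                compiled.append(("LITERAL", token))
            elif element in node_index:
                # Reference resolution: replace the node name by its table index.
                compiled.append(("LINK", node_index[element]))
            else:
                raise ValueError(
                    f"node {name!r} refers to undefined node {element!r}")
        nodes.append(compiled)
    # Only integers and indices remain, so the result carries no addresses.
    return {"symbols": symbols, "nodes": nodes}

cld = compile_language_definition({
    "ASSIGNMENT": ["VARIABLE", '"="', "EXPRESSION"],
    "EXPRESSION": ["VARIABLE", "OPERATOR", "EXPRESSION"],  # links back to itself
    "VARIABLE": ['"A"'],
    "OPERATOR": ['"+"'],
})
```

In the sample, EXPRESSION links back to its own index, which gives a small cyclical directed graph. Because links are stored as table indices, the structure could be serialized and reloaded elsewhere without fixing up pointers, which is one plausible reading of position independence here.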
Finally, the result of the position independent form data is provided to block 22 as a compiled language definition (CLD) which represents a compiled interrelationship of nodes useful for compilation by generalized compiler 14.
FIG. 10 is a table of the language definition 26 according to the invention. The language definition 26 is expressed in two categories, syntax and semantics. Additionally, the language definition is directed toward structures (or structure constructs) and standard language elements or expressions. For example, as FIG. 10 shows, the syntax of a STRUCT_DEF is open parentheses (, the word STRUCTURE in quotes, the open brace {, a selected alphanumeric character string indicating the kind of structure, a close brace }, an open parenthesis ( in quotes, one or more instances of the word ELEMENT in quotes followed by two open close brace combinations each bounding another selected alphanumeric character string, a close parentheses character in quotes, and another close parentheses character. The semantics associated with the indicated structure definition is a number of double square bracketed expressions, the first of which includes the expression _STRUCTURE_DEFINE in double square brackets. Next, again in double square brackets and additionally in braces is the token number associated with the token expression CURRENT_TOKEN. Then, for each of the one or more instances of the word ELEMENT, the expression _ADD_ELEMENT is provided in double square brackets. Finally, after each _ADD_ELEMENT double square bracketed item, follow two similarly double bracketed token expressions of the kind CURRENT_TOKEN, enclosed in open and close braces. The language definition 26 of FIG. 10 further is expressed in terms of the same two categories, syntax and semantics with respect to standard language elements. For example, as FIG.
10 shows, the syntax and semantics of a plurality of language elements are expressed in detail. The language elements expressed in terms of syntax and semantics include the categories of EXPRESSION, ASSIGNMENT, VARIABLE, OPERATOR, and VARIABLE_REF. The language element EXPRESSION in FIG. 10 has the syntax of open parentheses (, the word VARIABLE OPERATOR, and in double open and close braces the word INTEGER followed by close parentheses ). The associated semantics related to the language element EXPRESSION is the expression in single square brackets PUSH_VAL and in double open and close braces the value of the token number associated with the expression CURRENT_TOKEN. The language element ASSIGNMENT in FIG. 10 has the syntax of open parentheses (, the word VARIABLE_REF, and in double open and close quotes the expression = followed by the name of a particular selected expression, followed by close parentheses ). The associated semantics related to the language element ASSIGNMENT is the expression bounded in open and close square brackets of the value of ASSIGN. The language element VARIABLE in FIG. 10 has the syntax of open parentheses (, the value of the expression ALPHA set between open and close braces, followed by close parentheses ). The associated semantics related to the language element VARIABLE are the successive first and second expressions in single square brackets: PUSH_VARIABLE and in double open and close braces the token number associated with the expression CURRENT_TOKEN, and _STRUCT_INSTANCE VARIABLE. The language element OPERATOR in FIG. 10 has the syntax of open parentheses (, and each in succession in double open and close quotes the two expressions + and -, followed by close parentheses ). The associated semantics related to the language element OPERATOR are the expressions bounded in open and close square brackets of the tokens PLUS and MINUS, respectively.
The language element VARIABLE_REF in FIG. 10 has the syntax of open parentheses (, the expression in double open and close braces of the word INTEGER, followed by close parentheses ). The associated semantics related to the language element VARIABLE_REF are the expression in single square brackets PUSH_ADDRESS and in double open and close braces the token number associated with the expression CURRENT_TOKEN.
In summary, the invention herein provides more effective compilation of source code according to selected language definitions, permitting the establishment of instruction structures which pass through a screening compilation stage and are executed at a subsequent compilation step, which provides virtual compilation and a reduced set of compiled code for more efficient runtime operation. The invention is directed toward a compilation and virtual machine arrangement and process for translating source code including pre-runtime executable instruction into compiled code having enhanced runtime effectiveness. The source code is formatted in accordance with a user determined and pre-runtime modifiable language definition. The source code is compiled by a generalized compiler and includes executable language specific structure constructs or instructions which pass through the generalized compiler in unexecuted form. The instructions are then executed in a virtual machine which produces compiled code of reduced size, which renders runtime execution of the compiled code of increased effectiveness. In summary, the invention is directed toward a method of processing source code constrained by preselected language definitions and including predetermined pre-runtime executable language specific structure constructs.
The method of the invention includes processing predetermined source code constrained by a predetermined language definition including predetermined language expressions and pre-runtime executable language specific structure constructs, in order to produce a compiler virtual machine code which includes unexecuted instructions reflecting the pre-runtime executable language specific structure constructs. The method further includes processing the produced compiler virtual machine code to produce compiled code for runtime execution. The effect of this is improved runtime processing. This method may further include making modifications to the predetermined source code on a graphical user interface (GUI) up to and including runtime, or making modifications to the language definition on a graphical user interface up to and including runtime. The method of the invention further includes, according to one version, executing the pre-runtime executable language specific structure constructs. The method may further include, under the invention, employing the language definition to structure said predetermined source code. Further under the invention, the source code processing is conducted by a generalized compiler. Further, the invention includes as a feature compiling the language definition to produce a compiled language definition prior to processing the predetermined source code. Additionally, processing the source code includes parsing and tokenizing the source code. The invention further includes a source code processing system for processing predetermined source code expressions and pre-runtime executable structure constructs established according to a predetermined syntactic and semantic scheme. The system includes a language subsystem for establishing a syntactic and semantic scheme according to which source code can be prepared for compilation according to the predetermined syntactic and semantic scheme.
It includes a source of source code expressions and pre-runtime executable structure constructs, the form of the pre-runtime executable structure constructs and expressions conforming to the syntax and semantics of said language subsystem. Further, the system includes a compilation system connected to the language subsystem and the source of source code. The compilation system is effective for parsing and tokenizing said predetermined source code, and is effective to produce compiler virtual machine code including unexecuted instructions representing said pre-runtime executable structure constructs. The system further includes a compiler virtual machine processing system connected to the compilation system and receiving compiler virtual machine code including unexecuted instructions from the compilation system. The compiler virtual machine processing system is effective for executing executable instructions received from said compilation system. Consequently, compiled code is produced which enables enhanced runtime execution. The compilation system is further effective for encapsulating pre-runtime executable structure constructs within predetermined characters effective for establishing a processing hierarchy to be observed by the compiler virtual machine processing system. The system includes an arrangement for making modifications in the source code. This arrangement may include a graphical user interface and a computer monitor. The language subsystem preferably includes a language definition compiler which is effective for producing a compiled language definition. The language definition compiler includes a parser for parsing input language definition information. The language definition compiler further includes a tokenizer for tokenizing language definition information. The language definition compiler includes a node identifier.
According to the invention, the language definition information is parsed, tokenized, node-identified, set in a node table, resolved as to node references subject to a syntax check, and established in a cyclical directed graph of nodes for packaging into position independent form as a compiled language definition. The invention herein further addresses a semantic resolution mechanism for receiving expressions and their semantic relationships. This mechanism includes an arrangement for structuring predefined semantic relationships, which is effective to establish semantic structures defining relationships between received expressions. The mechanism further includes a source of expressions and defined semantic relationships conforming to established semantic structures applicable to received expressions. The source of expressions and defined semantic relationships produces executable instructions for handling selected expressions in accordance with methods for enabling interaction between said expressions. The mechanism further includes an arrangement for parsing, tokenizing, and segregating expressions from received expressions and defined semantic relationships, this arrangement for parsing, tokenizing, and segregating being effective for producing intermediate virtual expressions and instructions for executably interrelating said intermediate virtual expressions. The mechanism additionally includes a system for receiving said intermediate virtual expressions and instructions, the system for receiving being effective for recognizing the virtual expressions and instructions and being effective for outputting the virtual expressions and executing the virtual instructions in prioritized fashion. The arrangement for parsing, tokenizing, and segregating is effective for delimiting virtual instructions in prioritized manner.
The system for receiving recursively executes said virtual instructions in accordance with the level of delimitation set by said means for parsing, tokenizing, and segregating. The system for receiving includes an instruction handler effective for dispatching virtual instructions to a suitable instruction handling mechanism selected from a predetermined set of instruction handling mechanisms. Further, the invention is directed toward a system comprising a data model and an evaluation model, each separately expressed in one or more languages and effective for permitting behavior to relate to data according to the data model at run time. While this invention has been described in terms of several preferred embodiments, it is contemplated that many alterations, permutations, and equivalents will be apparent to those skilled in the art. It is therefore intended that the following appended claims be interpreted as including all such alterations, permutations, and equivalents as fall within the true spirit and scope of the present invention.
