Text object compilation method and system
6202201
Abstract
A Text Object Compiler and Language able to produce binary and text objects that are not machine language code. An object oriented computer language that produces target files of information in any text or binary format; files are defined by the programmer as "pages" and file locations are defined by the programmer as "targets." The compiler compiles the language to produce any variety of output, which includes text formats (such as HTML, SGML, and other scripting languages) and binary formats (such as graphical pictures, binary data, or other multimedia information).
Claims
We claim:
Description
BACKGROUND OF THE INVENTION
TABLE 1
TOL Operators
Operator
    Description
:=
    The equality operator tells the compiler to define the keyword on the left-hand-side of the operator as the value on the right-hand-side of the operator. For example, the usage "length := 5" would define the variable "length" with the value of 5.
//
    This operator indicates to the compiler the presence of a comment-line. The compiler ignores the remainder of the line.
#include := filename
    Imports a source file named "filename" for compilation.
class classname := baseclass
    Defines a page class. Classname must be specified. Baseclass, if omitted, is assumed to be the base class for TOL. A class inherits all of the variables and functions of its base class. Classes can be public, private, or protected. A class can declare other classes as friends; a friend class is given public access to all of the declaring class' variables and functions. Within a class, variables or functions can be declared friends.
cleartargets := directory
    Clears all targets previously inline.
debug := [on/off]
    Enables/Disables debug while compiling. The default is to stop the debugger from source code.
func (classname) := name (var1, ... , varN = value)
endfunc
    Begins a new function. If classname is omitted, then the class is assumed to be the base class. The function name, name, must be unique for a class. The function can contain zero or more variables, var1, var2, etc. Functions can be overloaded, with several functions of the same name, but with a different number of variables; overloaded functions must contain a unique number of variables. Functions can be public, private, or protected. Functions can be made virtual, forcing them to be defined in a derived class. Once a function is declared virtual, all derived class instances of the function are virtual. Functions can contain a default value, for example, var1=default. Endfunc ends a function section.
lfcr := [on/off]
    Enables/Disables output file wrapping.
page (classname) := filename
endpage
    Begins a new page section. If classname is omitted, then the class is assumed to be the base class. The compiler builds an output page for each page/endpage section. Within this section, all variables and functions are resolved. Filename is the fully qualified location of the output file. Endpage ends a page section.
targets (classname)
endtargets
    Targets allow page output to be directed to different or multiple locations. Example targets include any media or memory storage device, such as disk drives, networked drives, FTP locations, memory cards. Endtargets ends a target section.
vars (classname)
endvars
    Begins a new variables section. If classname is omitted, then the class is assumed to be the base class. Variables can be public, private, or protected. Variables can be made virtual, forcing them to be defined in a derived class. Once a variable is declared virtual, all derived class instances of the variable are virtual.
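The overloading rule in the func entry of Table 1 (same name, distinct number of variables) can be sketched in Python as a lookup keyed by argument count. This is only an illustration under stated assumptions: the `FunctionTable` class, its method names, and the `heading` function are hypothetical, and default values (var1=default) are deliberately left out.

```python
class FunctionTable:
    """Hypothetical store keyed by (class name, function name, variable count)."""

    def __init__(self):
        self._functions = {}

    def define(self, class_name, name, variables, body):
        # Overloads are told apart only by how many variables they take.
        key = (class_name, name, len(variables))
        if key in self._functions:
            raise ValueError(
                f"{name} with {len(variables)} variables is already defined for {class_name}"
            )
        self._functions[key] = (list(variables), body)

    def lookup(self, class_name, name, argument_count):
        # Pick the overload whose variable count matches the call.
        return self._functions[(class_name, name, argument_count)]


table = FunctionTable()
table.define("baseclass", "heading", ["text"], "<H1>text</H1>")
table.define("baseclass", "heading", ["text", "sub"], "<H1>text</H1>\n<H2>sub</H2>")
```

A call with one argument would select the first definition and a call with two arguments the second; defining a second two-variable `heading` for the same class raises an error, mirroring the "unique number of variables" requirement.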
The TOL is similar to other programming languages in that it has variables, classes, functions, and subroutines, but it uses a language syntax recognized only by the TOC 102. In addition to the TOL operators, a number of compile-time utility commands exist to facilitate the document generation process.
TABLE 2
TOL Compile-Time Utility Commands
Utility Command
    Description
beep := length
    Causes the system to sound a beep with a duration of length in milliseconds. The default duration is 100 milliseconds.
chdir := path
    Changes the current directory to path. Any reference to a path or filename that does not include a drive letter will default to the current drive. "chdir" is executed by the compiler inline.
chdrive := driveletter
    Changes the current drive. Any reference to a path or filename that does not include a drive letter will assume the current drive. "chdrive" is executed by the compiler inline.
copy := source, destination
    Copies source to destination when the compiler reaches this inline command.
exec := program
    Runs the specified application in program.
kill := filespec
    Deletes the files specified in filespec.
md := path
    Creates the directory path.
rd := path
    Removes the directory path.
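As a rough illustration of how the inline commands of Table 2 might be carried out, the following Python sketch maps several of them onto standard-library calls. The function name `run_utility` and the dispatch structure are assumptions made for this example; "beep" and "chdrive" are omitted because they are platform specific.

```python
import glob
import os
import shutil
import subprocess


def run_utility(command, argument):
    """Execute one Table 2-style command (hypothetical dispatcher)."""
    if command == "chdir":
        os.chdir(argument)
    elif command == "copy":
        # The argument has the form "source, destination".
        source, destination = [part.strip() for part in argument.split(",", 1)]
        shutil.copy(source, destination)
    elif command == "exec":
        subprocess.run(argument, shell=True, check=False)
    elif command == "kill":
        # filespec may contain wildcards, e.g. "*.tmp".
        for path in glob.glob(argument):
            os.remove(path)
    elif command == "md":
        os.mkdir(argument)
    elif command == "rd":
        os.rmdir(argument)
    else:
        raise ValueError(f"unsupported utility command: {command}")
```

For instance, `run_utility("md", "build")` followed by `run_utility("rd", "build")` would create and then remove a directory named "build" in the current directory.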
An exemplary source file 101 written in the Text Object Language of Table 1 can be seen in Table 3.
TABLE 3
Example Source File written in the Text Object Language
("mysource.txt")
//define the class structure and relationships
class myclass := baseclass
class newclass := myclass
//variable definitions
//baseclass variables
vars
title := This is my document title
endvars
//variables for myclass pages only
vars (myclass)
title := This is the document title for myclass
endvars
//functions
//baseclass function example
func:=myfunc (var1, var2)
<H1>var1</H1>
<H2>var2</H2>
endfunc
//function for newclass pages only
func (newclass):=myfunc(var1, var2)
<center>
var1<br>
var2
</center>
endfunc
//target documents
page:=first.txt
myfunc(title, This is a base class example)
endpage
page (myclass):=second.txt
myfunc(title, This is a myclass example)
endpage
page (newclass):=third.txt
myfunc(title, This is a newclass example)
endpage
//target locations
targets
Local Drive := c:\web
LAN Drive := n:\web
Live Site := ftp://www.netcreate.com/web/html
endtargets
As illustrated in the example Text Object code source file 101 of Table 3, there are five primary components of TOL: classes, variables, functions, pages, and targets. Although the concepts of classes, variables, and functions exist in prior art computer languages, the additional concepts of pages and targets exist in the Text Object Language.
TOL classes are similar to and share many of the elements of common Object Oriented Programming (OOP) classes. Classes allow a document programmer to organize document sections into objects that can be reused throughout the source files and applied to any of the target files. Specifying classes is optional, since a default class, or "base class," is always assumed. A "derived" or "child" class inherits all of the variables and functions of its base or "parent" class. Classes can be public, private, or protected. Public classes allow their functions and variables to be redefined by other classes. By default, all classes are public. Private classes allow their functions and variables to be redefined only by other member or friend classes. Protected classes allow their variables and functions to be used only by member functions, friends of the class in which they are declared, and by member functions and friends of classes derived from the protected class. In addition, a class can declare other classes as friends; a friend class is given public access to all of the declaring class' variables and functions.
Variables allow the source file programmer to represent elements of a target document by reference, and use the reference to create sections or target documents rather than using the actual data. Like classes, in the preferred embodiment, variables can be public, private, or protected. Variables can also be made virtual, forcing them to be defined in a derived class; however, once a variable is declared virtual, all inherited class instances of the variable are virtual.
Note that no virtual variables of the class may exist within the program until the virtual variable is defined by the derived (child) class.
A function is a convenient way to encapsulate some computation, which can then be used without worrying about its implementation. Functions allow programmers a conceptual way to abstract a recurring procedure without worrying about the details. Functions are similar to typical programming subroutines. Like classes and variables, functions can be public, private, or protected. Function name overloading allows multiple function instances that provide a common operation on different argument types to share a common name. Functions can be overloaded, with several functions sharing the same name, but each having a different number of variables. Each overloaded function must have a unique number of variables, which allows the compiler to distinguish between each instance of the overloaded function. Functions can be made virtual, forcing them to be defined in a derived (child) class. Once a function is declared virtual, all derived class instances of the function are virtual. A virtual baseclass function is also virtual in the derived class if inherited by the derived class; such a function is treated as an abstract class, and no objects of the class may exist within the program until the function is defined by a derived class.
Pages are unique to the Text Object Language; page parameters instruct the Text Object Compiler 102 how to combine or parse source files 101 into the actual individual target documents 103. A page defines the starting and ending point of a resulting target document 103, and the contents of the target document 103.
Targets are also unique to the Text Object Language. Target parameters define the target location; once defined, the target parameters instruct a Text Object Compiler 102 on where to place the target documents. This location is referred to as the "target location."
The target location may be local to the computer running the TOC 102, or at a remote location that can be accessed over a computer network by the computer. If the source files 101 define multiple target locations with the target parameter, the TOC 102 will produce identical target documents 103 at each target location. Multiple targets are useful for creating experimental output, creating backup files for redundancy purposes, and updating main/production server files. For example, a programmer may define two targets to create a primary web-site and its "mirror" web-site at an alternate location. If target parameters are omitted from the Text Object code 101, the target documents 103 will be created in a default local location.
The Text Object Compiler (TOC)
The Text Object Compiler 102 performs the compilation of the text object language source files 101, resulting in target documents 103 defined by the pages parameter as output at a target location defined by the targets parameter. As discussed, target documents 103 may be in any format; note that this distinguishes the TOC 102 from prior art software compilers that only produce object machine language code, i.e., executable files. Note, however, that programmers define the output format of the files with their source program code 101.
Attention will now be given to the TOC structure and method. The TOC 102 is similar in structure to a conventional compiler. Like a conventional compiler, the TOC contains a lexical analyzer 10 and a parser 20. However, unlike the TOC, a conventional compiler, shown in FIG. 2, feeds parser output into a computer code generator 3, to generate executable computer object code 3. As illustrated in FIG. 4, in a Text Object Compiler 102, the parser 20 output is presented to a page generator 200 to produce the target documents 103 as output. The TOC lexical analyzer 10 examines expressions in a similar fashion to a conventional compiler lexical analyzer.
This division into units, known as "tokens," is a process known in the art as "lexical analysis." Essentially, the lexical analyzer looks for regular expressions. A regular expression is a pattern description using the computer language. The lexical analyzer performs as many regular expression matches as possible, and attempts to classify the text of the entire source file into tokens. In the Text Object Language, the expressions may include variable names, function names, class names, target locations, page definitions, constants, strings, operators, punctuation, and so forth. For example, when compiling the source file in Table 3, the compiler initially classifies each instance of a known operator (as listed in Table 1) as a known token. However, if the word or expression is unknown to the compiler, it too is still tokenized, but its value or relationship must still be determined by the parser.
As the input is divided into tokens, the compiler must establish the relationship between the tokens. The Text Object Compiler needs to find the expressions, statements, declarations, blocks, functions/procedures, class structures, and pages in the program, a process known as "parsing." The list of rules that define the relationships that the compiler understands is called a grammar. The grammar of an exemplary Text Object Language is shown above in Table 1.
The Text Object Compilation process is best explained by example. An existing source file 101, such as the example in Table 3, is written in the Text Object Language. The compiler reads the source file, as illustrated in block 250 of FIG. 5. In its process of compiling the source code, a compiler performs two tasks over and over: a.) dividing the input source code into meaningful units (block 260), and b.) discovering the relationship between the units (block 270). These two processes are respectively called "lexical analysis" (block 260) and "parsing" (block 270).
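The lexical analysis step described above can be sketched with a single regular expression that classifies each piece of a source line. This is a minimal illustration, not the TOC's actual analyzer: the token category names (OPERATOR, BASECLASS, PUNCT, UNKNOWN) and the function `tokenize` are assumptions, and only a subset of the Table 1 operators is listed.

```python
import re

# One alternative per token category; earlier alternatives win.
TOKEN_PATTERN = re.compile(
    r"""
    (?P<COMMENT>//.*)
  | (?P<OPERATOR>:=|\b(?:class|vars|endvars|func|endfunc|page|endpage|targets|endtargets)\b)
  | (?P<BASECLASS>\bbaseclass\b)
  | (?P<PUNCT>[(),])
  | (?P<UNKNOWN>[^\s(),:]+)
    """,
    re.VERBOSE,
)


def tokenize(line):
    """Return (category, text) pairs for one source line."""
    tokens = []
    for match in TOKEN_PATTERN.finditer(line):
        if match.lastgroup == "COMMENT":
            break  # the remainder of the line is ignored
        tokens.append((match.lastgroup, match.group()))
    return tokens


print(tokenize("class myclass := baseclass"))
```

Words the pattern does not recognize are still emitted, tagged UNKNOWN, so that a later parsing step can decide what they mean.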
If the parser cannot determine the relationship of the token, it next determines whether the end of the source file has been reached, block 280. If the end of the source file has been reached (block 280), the undetermined tokens are an error in either syntax or usage, and an error is reported, block 282. If the end of the source file has not been read, the compiler loops back to block 250, and reads the source file. Similarly, if the parsing of block 270 is successful, and the entire file has not been read, as determined by block 284, the compiler continues to read more lines of the source file, block 250.
An example of the lexical analysis and parsing is as follows. The compiler initially reads the first line of Table 3, block 250. Each word is tokenized, and matched against a known set of regular expressions, such as the TOL Operators. The first known operator, the comment operator ("//"), is identified, block 260. As defined by the implementation of this TOL grammar, the remainder of the line is determined to be a comment, and the compiler ignores the remainder of the line, block 270. Since the end of the source file has not been reached, as determined by block 284, the compilation process continues, and the compiler reads the next line of the source file, block 250. The lexical analyzer notes the presence of four tokens on the second line, the words "class," "myclass," ":=," and "baseclass," block 260. Two of these tokens, "class" and the equality operator (":="), are identified as operators, and a third token, "baseclass," is identified as the "baseclass" keyword, which defines the TOL base class. The token "myclass" is initially unknown by the lexical analyzer. The token information is forwarded to the parser, which realizes that the source file defines a child class "myclass" which descends from the TOL baseclass, block 270. The parser constructs a memory table, memory tree, or equivalent memory structure to categorize the class structure.
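The class-registration step just described can be sketched as a small table that maps each child class to its parent. This is an illustrative assumption about one possible "memory table"; the names `parse_class_line`, `ancestry`, and `BASE` are hypothetical and do not come from the source.

```python
BASE = "baseclass"


def parse_class_line(line, parents):
    """Record a 'class child := parent' declaration in the parents table."""
    text = line.split("//", 1)[0].strip()  # drop any trailing comment
    if not text.startswith("class "):
        return
    child, _, parent = text[len("class "):].partition(":=")
    child = child.strip()
    parent = parent.strip() or BASE  # an omitted parent means the base class
    if parent != BASE and parent not in parents:
        raise ValueError(f"unknown parent class: {parent}")
    parents[child] = parent


def ancestry(class_name, parents):
    """List a class followed by each of its ancestors up to the base class."""
    chain = [class_name]
    while chain[-1] != BASE:
        chain.append(parents[chain[-1]])
    return chain


parents = {}
for source_line in ["class myclass := baseclass", "class newclass := myclass"]:
    parse_class_line(source_line, parents)
print(ancestry("newclass", parents))
```

Walking the table from a class toward the base class gives the order in which variable and function definitions would be searched.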
The process is repeated with the next line, resulting in a class inheritance relationship depicted by FIG. 6. Class myclass 310 is derived from the base class 300, and class newclass 320 is a "child" class derived from the "parent" class myclass 310. The memory tree is expanded to reflect the newclass class.
The compiler processes the next several lines of Table 3, which consist of variable definitions for the variable "title." The variables are parsed and stored in the memory tree, linked to their appropriate class definition, as shown by FIG. 7. The baseclass 300 is associated with a variable "title" 301. Similarly, myclass 310 is associated with a different definition for another variable called "title" 311.
FIG. 8 illustrates the relationships of the functions declared with their defined classes. The code in Table 3 defines a "myfunc" function that is different for the baseclass 300 and newclass 320; consequently, a myfunc 302 is associated with the baseclass 300, and a different myfunc 322 function is associated with newclass 320.
FIG. 9 consolidates the inheritance diagrams with their related variables and functions. Baseclass 300 has both a variable, title 301, and a function, myfunc 302. The class myclass 310 also has a variable, title 311, and since it does not have a definition for myfunc, it inherits the function definition for myfunc 312 from the baseclass 300 definition of myfunc 302. Similarly, the class newclass 320 does not have a value for the "title" variable, and thus inherits its definition for title 321 from the myclass title definition 311. Newclass 320 does have its own definition for the function myfunc 322, and this is also reflected in the inheritance diagram.
The lexical analysis and parsing process is repeated for both the target document and target location sections of the code. As shown in FIG. 10, the target document "first.txt" 400 is of the baseclass 300, "second.txt" 410 is of class myclass 310, and "third.txt" 420 is of class newclass 320. Each of the three target documents consists of a single function call to the appropriate class function "myfunc."
Once all the source files have been tokenized and parsed to known values, the established token and relationship information is passed to the compiler page generator, block 286. As shown previously in FIG. 4, the compiler page generator 200 creates each target document based on the relationships and tokens forwarded from the parser 20, replacing variables with their appropriate values, evaluating function calls, and substituting the resulting information into the page table shown in FIG. 10.
The page generator sub-process is elaborated in FIG. 14. The token relationship information is passed to the compiler page generator, block 286. For each page class, the variables are replaced with their respective definitions, block 288. In a simple embodiment, this can merely be the substitution of the value into each memory table location where the variable appears. Each function call for every page class is then evaluated, block 290. The existence of each named target location is verified; if the location, such as a directory, does not exist, it may be created at this time by the compiler, block 292. Each page, corresponding to a target document, is then written at each target location, block 294. Lastly, the write is verified by the compiler, block 296.
For example, as illustrated in FIG. 11, the output for "first.txt" 400 is generated by noting the appropriate class, baseclass 300, which defines the functions and variables used in generating the page. Table 3 defines "first.txt" as a page generated by a function call to "myfunc" using the "title" variable and "This is a base class example" as the input.
Since "first.txt" is of class baseclass, the definitions for "title" and "myfunc" are taken directly from the baseclass. The results are shown in Table 4.
TABLE 4
Compiled output for "first.txt"
<H1>This is my document title</H1>
<H2>This is a base class example</H2>
FIG. 12 continues the compilation for "second.txt" 410, which is of class "myclass" 310. The definition for "title" is taken directly from class myclass 310. The definition for "myfunc" would normally also be taken from class myclass 310. However, since "myfunc" is not defined for myclass 310, the myfunc function definition for myclass' parent class, baseclass 300, is used. The compiled results for "second.txt" 410 are shown in Table 5.
TABLE 5
Compiled output for "second.txt"
<H1>This is the document title for myclass</H1>
<H2>This is a myclass example</H2>
FIG. 13 continues the compilation for "third.txt" 420, which is of class "newclass" 320. The definitions for "title" and "myfunc" are normally taken directly from class newclass 320. However, since the "title" variable is not defined for newclass 320, the "title" variable definition for newclass' parent class, myclass 310, is used. Since "myfunc" is defined for the class newclass 320, the newclass "myfunc" definition is used. The compiled results for "third.txt" 420 are shown in Table 6.
TABLE 6
Compiled output for "third.txt"
<center>
This is the document title for myclass<br>
This is a newclass example
</center>
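The inheritance lookups that produce Tables 4 through 6 can be sketched in Python as follows. The dictionaries hold the definitions from Table 3; the helper names `resolve` and `render` and the data layout are assumptions for illustration, not the compiler's actual page generator.

```python
import re

# Definitions taken from Table 3; the dictionary layout is hypothetical.
PARENTS = {"myclass": "baseclass", "newclass": "myclass"}
VARS = {
    "baseclass": {"title": "This is my document title"},
    "myclass": {"title": "This is the document title for myclass"},
}
FUNCS = {
    "baseclass": {"myfunc": (["var1", "var2"], "<H1>var1</H1>\n<H2>var2</H2>")},
    "newclass": {"myfunc": (["var1", "var2"], "<center>\nvar1<br>\nvar2\n</center>")},
}


def resolve(table, class_name, name):
    """Find name in the class or, failing that, in the nearest ancestor."""
    while True:
        if name in table.get(class_name, {}):
            return table[class_name][name]
        if class_name not in PARENTS:
            raise KeyError(name)
        class_name = PARENTS[class_name]


def render(class_name, func_name, args):
    """Evaluate one function call for a page of the given class."""
    params, body = resolve(FUNCS, class_name, func_name)
    bindings = {}
    for param, arg in zip(params, args):
        try:
            bindings[param] = resolve(VARS, class_name, arg)  # variable reference
        except KeyError:
            bindings[param] = arg  # plain literal text
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, params)) + r")\b")
    return pattern.sub(lambda match: bindings[match.group(1)], body)


print(render("newclass", "myfunc", ["title", "This is a newclass example"]))
```

Under these assumptions, rendering with "baseclass", "myclass", and "newclass" yields the text of Tables 4, 5, and 6 respectively, because each lookup climbs the class chain only when the page's own class lacks a definition.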
Once the compiler generates each page in memory, each page is written as a target document at each target location. The compiler may optionally create previously non-existing target locations, and verify the writing at the target locations; in one embodiment, the TOC performs both actions, reporting a warning message if a target location is not created, or an error message if a problem in writing the target document occurs.
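The final write-and-verify step can be sketched for local directory targets as below. This is a minimal sketch under stated assumptions: the function name `write_pages` and the message wording are hypothetical, and remote targets such as FTP locations are not handled.

```python
import os


def write_pages(pages, targets):
    """Write every page to every target directory and verify each write.

    pages maps file names to page text; targets maps a label to a directory.
    Returns a list of warning and error messages (empty on success).
    """
    problems = []
    for label, directory in targets.items():
        try:
            # Create the target location if it does not exist yet.
            os.makedirs(directory, exist_ok=True)
        except OSError as error:
            problems.append(f"warning: target '{label}' not created: {error}")
            continue
        for filename, text in pages.items():
            path = os.path.join(directory, filename)
            try:
                with open(path, "w", encoding="utf-8") as handle:
                    handle.write(text)
                # Verify the write by reading the file back.
                with open(path, encoding="utf-8") as handle:
                    if handle.read() != text:
                        problems.append(f"error: verification failed for {path}")
            except OSError as error:
                problems.append(f"error: could not write {path}: {error}")
    return problems
```

Each target receives an identical copy of every page, which matches the mirror-site use of multiple targets described earlier.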
