Method and apparatus for the modeling and query of database structures using natural language-like constructs5495604Abstract Computerized tools for modeling database designs and specifying queries of the data contained therein. Once it is determined that an information system needs to be created, the Fact Compiler of the present invention is invoked to create it. After creating the information system, the user creates a fact-tree as a prelude to generating queries to the system. After creating the fact-tree, the user verifies that it is correct using the Tree Interpreter of the present invention. Once the fact tree has been verified, the Query Mapper of the present invention is used to generate information system queries. Claims What is claimed is: Description TECHNICAL FIELD
__________________________________________________________________________
FIRST NOUN
SECOND NOUN
FACT TABLE COLUMN COLUMN
__________________________________________________________________________
Person lives at address
Person.sub.-- Table
Person Address
Person has Phone Number
Phone.sub.-- table
Person Phone Number
Person studies Subject
Studies.sub.-- Table
Person Subject Studies
Subject is taught by Person
Subject.sub.-- Table
Person Teacher Person
__________________________________________________________________________
The first step in query processing is specifying the fact-tree. In Fact-Tree Specification, the user selects a noun relevant to the query. For example, if the user wanted to find out the address, phone number, subjects studied, and teachers of Mr. Smith, they would start with the Person noun because the query is basically about a person. After choosing Person as the root of the query, they can select more information about the person--to find out their address etc. The only information they are able to select is the information contained in the facts about the person, i.e. 0 A person lives at an address. 0 A person has a phone number. 0 A person studies a subject. 0 A person teaches a subject. This set of facts is all of the information possible about a particular person. The information is displayed conceptually and the user didn't need to know any special keywords or phrases. In this case the user would select the facts 0 A person lives at an address. 0 A person has a phone number. 0 A person studies a subject. 0 A subject is taught by a person. since that is what they want to know about Mr. Smith. This would build up the following fact-tree.
______________________________________
Person
.sub.-- that lives at an address
.sub.-- that has a phone number
.sub.-- that studies a subject
.sub.-- that is taught by a person.
______________________________________
Finally, the user would restrict the person at the root of the tree to be equal to Mr. Smith, since this is the only person they are interested in. The meaning of the final tree is: Show the person Mr. Smith, the address that he lives at, the phone numbers that he has, the subject that he studies, and for the subjects he studies, show the people that teach those subject. After generating the fact-tree, the user verifies that the fact-tree is correct using the Tree Interpreter of the present invention. Doing so will preclude the possibility of an ambiguous query being generated. In use, the Tree Interpreter algorithm constructs a natural language description of the fact-tree. This algorithm is a recursive depth-first search function which is described in the following best mode section. This interpretation allows the user to verify that the question he or she is asking will retrieve the information desired. Once the user has specified the fact-tree and checked it using the Tree Interpreter, all that remains to do is generate the relational query itself. The algorithm to do this is again recursive on fact-tree nodes, and is detailed in the following section detailing the best mode of carrying out the invention. BRIEF DESCRIPTION OF THE DRAWINGS The various features and advantages of the present invention may be more readily understood with reference to the following detailed description of the preferred embodiments taken in conjunction with the accompanying drawings, wherein like reference characters designate like elements. FIG. 1 is a diagram of the external view of a digital, programmable, general purpose computer configured for operation of the present invention. FIG. 2 is a block diagram of the computer of FIG. 1 configured with the present invention. FIG. 3 is a flow chart illustrating the initial selection menu of the present invention, after selecting the Fact Compiler of the present invention. FIG. 4 is a flow chart illustrating the Fact Compiler of the present invention, including it's three main functions. FIG. 5 is a flow chart illustrating the Drag and Drop over Diagram function of the Fact Compiler of the present invention. FIG. 6 is a flow chart of the Drag and Drop Parse function invoked by the Drag and Drop over Diagram function. FIG. 7 is a flow chart of the Compile into Repository Only function invoked by the Fact Compiler. FIG. 8 is a flow chart of the Parse function invoked by the Compile into Repository Only function. FIG. 9 is a flow chart of the Scan function invoked by the Parse function. FIG. 10 is a table indicating the procedure for entering an error condition according to the present invention. FIG. 11 is a flow chart illustrating the Look for Object Specifications function invoked by the Parse function. FIG. 12 is a flow chart representing the Look for Fact Specifications function invoked by the Parse function. FIG. 13 is a flow chart representing the Look for Constraint Specifications function invoked by the Parse function. FIG. 14 is a flow chart illustrating the Allocate a New Object function invoked by the Look for Object Specifications function. FIG. 15 is a flow chart illustrating the Allocate a New Fact function invoked by the Look for Fact Specifications function. FIG. 16 is a flow chart illustrating the Allocate a New Constraint function invoked by the Look for Constraint Specifications function. FIG. 17 is a flow chart representing the Update Record in Repository function invoked by both the Drag and Drop Parse and Compile into Repository Only functions. FIG. 18 is a flow chart illustrating the initial selection menu of the present invention, after selecting Fact Tree Specification according to the present invention. FIG. 19 is a flow chart illustrating the Fact Tree Formation function and selection of either the Query Mapper or Tree Interpreter functions of the present invention. FIG. 20 is a flow chart representing the Fact Tree to SQL Query function invoked when a user selects the Query Mapper function of the present invention. FIG. 21 is a flow chart illustrating the Node to SQL function invoked by the Fact Tree to SQL function. FIG. 22 is a flow chart representing the Create Join function invoked by the Node to SQL function. FIG. 23 is a flow chart illustrating the Add Selector 1 function invoked by the Node to SQL function. FIG. 24 is a flow chart illustrating the Add Selector 2 function invoked by the Node to SQL function. FIG. 25 is a flow chart of the Fact Tree to Description function invoked when a user selects the Tree Interpreter of the present invention. FIG. 26 is a flow chart of the Create Text for Root function invoked by the Fact Tree to Description function. FIG. 27 is a flow chart illustrating the Create Text for Node function invoked by the Fact Tree to Description function. BEST MODE OF CARRYING OUT THE INVENTION The preferred embodiment of the present invention incorporates computer system 1 configured as shown in FIG. 1. Computer system 1 is a programmable digital computer. The invention is executable on an IBM compatible computer having an Intel 80386 or higher chip set, operating under the MS-DOS operating system, version 5.0 or higher. A minimum or 6 megabytes of available RAM is required for execution, as is a minimum of 6 megabytes of available hard disk storage space. These computers typically include a CPU, main storage, I/O resources, and a user interface, including a manually operated keyboard and mouse. The present invention also requires a graphical user interface program: Microsoft Windows is one well known example. The present invention was programmed on an IBM compatible computer having an Intel 80486 chip set, running Microsoft MS-DOS operating system, version 5.0. Microsoft Windows Version 3.1 was installed to provide the required graphical user interface. Finally, the system whose description follows was programmed in the Borland C language. FIG. 2 depicts the bus structure of the general purpose programmable computer of FIG. 1, with the present invention implemented thereon. Referring now to FIG. 3, a user initiates the system at manual input 50 and selects the desired function at function selection 51. The present invention provides a method and apparatus that allows users to: 1. Develop an information system description using a graphical user interface to a natural language-like computer language. One such language is FORML. 2. Specify the fact tree for query generation. 3. Check queries for semantic correctness. 4. Generate queries to the database system. Once it has been determined that an information system needs to be created, the Fact Compiler of the present invention is invoked. The Compile function of the Fact Compiler enables a user to type in text, using a natural language-like computer language. One such language is FORML. The text is typed in a window provided by the system, and may contain objects, facts and/or constraints. Using a translation function called "Drag and Drop over Diagram" and a graphical user interface, the user then drags the text from the entry window to the appropriate place over the ORM conceptual schema diagram of the Fact Compiler. The user then drops the text onto the diagram. The Fact Compiler validates the text entered and notifies the user of any errors encountered. During validation, the Fact Compiler first parses the text and creates an object list, a fact list and a constraint list in memory. Then the Fact Compiler iteratively compiles the text into the repository. The repository is essentially a "database of databases". Finally, the validated objects, facts and/or constraints are drawn in proper notation on the ORM conceptual schema diagram. At this point the information system specification may be considered complete. After the information system has been created, the user may wish to check and/or edit the previously entered information. This is accomplished by using the Decompile function of the Fact Compiler. Decompile is essentially the reverse of the previously discussed Compile function, in that it takes an ORM conceptual schema diagram and returns a textual listing of the objects, facts and constraints entered in the repository. The user can use this listing to verify the information system specification or to edit the system as it exists. Once the information system specification is complete, the conceptual schema depicted in the ORM representation of the information system is mapped to a relational database using RMAP. The RMAP process is fully described in McCormack et al (1993), which is incorporated by reference as if fully set forth herein. By way of example, for an example set of facts: Person lives at address Person has Phone Number Person studies Subject Subject is taught by Person if the relational database associated with an example fact tree is: Person.sub.-- Table: (Person, Address) Phone.sub.-- Table: (Person, Phone Number) Studies.sub.-- Table: (Person, Subject Studied) Subject.sub.-- Table: (Subject, Teacher Person) The associated RMAP mappings would be:
__________________________________________________________________________
FIRST NOUN
SECOND NOUN
FACT TABLE COLUMN COLUMN
__________________________________________________________________________
Person lives at address
Person.sub.-- Table
Person Address
Person has Phone Number
Phone.sub.-- table
Person Phone Number
Person studies Subject
Studies.sub.-- Table
Person Subject Studies
Subject is taught by Person
Subject.sub.-- Table
Person Teacher Person,
if the relational
database associ-
ated with an ex-
ample fact tree is:
__________________________________________________________________________
The first step in query processing is specifying the fact-tree. In Fact-Tree Specification, the user selects a noun relevant to the query. For example, if the user wanted to find out the address, phone number, subjects studied, and teachers of Mr. Smith, they would start with the Person noun because the query is basically about a person. After choosing Person as the root of the query, they can select more information about the person--to find out their address etc. The only information they are able to select is the information contained in the facts about the person, i.e. 0 A person lives at an address. 0 A person has a phone number. 0 A person studies a subject. 0 A person teaches a subject. This set of facts is all of the information possible about a particular person. The information is displayed conceptually and the user didn't need to know any special keywords or phrases. In this case the user would select the facts 0 A person lives at an address. 0 A person has a phone number. 0 A person studies a subject. 0 A subject is taught by a person. since that is what they want to know about Mr. Smith. This would build up the following fact-tree.
______________________________________
Person
.sub.-- that lives at an address
.sub.-- that has a phone number
.sub.-- that studies a subject
.sub.-- that is taught by a person.
______________________________________
Finally, the user would restrict the person at the root of the tree to be equal to Mr. Smith, since this is the only person they are interested in. The meaning of the final tree is: Show the person Mr. Smith, the address that he lives at, the phone numbers that he has, the subject that he studies, and for the subjects he studies, show the people that teach those subject. After generating the fact-tree, the user verifies that the fact-tree is correct using the Tree Interpreter of the present invention. Doing so will preclude the possibility of an ambiguous query being generated. In use, the Tree Interpreter algorithm constructs a natural language description of the fact-tree. This algorithm is a recursive depth-first search function which can be summarized as follows:
__________________________________________________________________________
function: Interpret.sub.-- Tree (fact-tree.sub.-- node) Il Interpret-Tree
operates on a node of
the fact-tree
begin
If the node is the root of the tree then
noun is the noun in the node
e.g.
Person
Print `For all noun(s)`
e.g.
For all Person(s)
if the node has a restriction then
e.g.
is equal to Mr. Smith
print "(where noun restriction)"
e.g.
(where Person = Mr. Smith)
Print "show:" and move on to a new line
otherwise
noun is the noun in the node
e.g.
Address
parent-noun is the noun in the node's parent
e.g.
Person
phrase is the phrase in the noun
e.g.
lives at
Print "the noun(s) that the parent-noun phrase"
e.g.
the Address(es)
that the person
lives at
if the node has a restriction then
e.g.
is equal to
Seattle
print "(where noun restriction)"
e.g.
(where address =
Seattle)
if the node has any children then
print ", and for those noun(s) show:"
move on to a new line
for all children of the node do
call Interpret.sub.-- Tree on the child-node
end
__________________________________________________________________________
The result of Interpret.sub.-- Tree on the example fact-tree would be
______________________________________
For all Person(s) (where Person = Mr. Smith) show:
the Address that the Person lives at
he Phone Number that the Person has
the Subjects that the Person studies,
and for those subjects show the Person(s) that the
Subject is taught by.
______________________________________
This interpretation allows the user to verify that the question he or she is asking will retrieve the information desired. Once the user has specified the fact-tree and checked it using the tree interpreter, all that remains to do is generate the relational query itself. The algorithm to do this is again recursive on fact-tree nodes.
______________________________________
function Create.sub.-- Query (fact-tree.sub.-- node)
begin
node is the node being mapped by this call to the function
child i . . . childn are the children of node
sentence1, . . . sentencen. are the respective sentences for
child i . . . childn each sentence i, (i=1 . . . n) has a mapping'
associated with it. The mapping corresponds to the relational
structure used to represent the sentence and contains the table
that the sentence was mapped to, the column for the first noun
and the column for the second noun. For example, the sentence
Person lives at Address maps to the Person table, with noun1
(person) being column 1, and noun2 (address) being column 2.
Join all of the mappings for sentences 1 . . . n together using outer
joins based on the noun in node and the respective positions of
that noun in sentences 1 to n. Form a query, including restriction
when required.
An SQL query representative of this applied to the example
fact-tree would be:
select Person. Person, Phone Number, Subject
from Person, Person has Phone.sub.-- Number, Person.sub.-- studies.sub.--
Subject
Outer join Person. Person = Person has Phone.sub.-- Number.Person
Outer join Person.Person = Person.sub.-- studies.sub.-- Subject.Person
where Person.Person = `Mr. Smith` . . .
______________________________________
If any childi i=1 . . . n have children, apply Create.sub.-- Query to child and use an outer join to include the result into the existing query. In the example fact-tree, this would result in Create.sub.-- Query being executed on the Subject node of the Person that studies Subject branch and would result in the query:
__________________________________________________________________________
select Person.Person, Phone Number, Person.sub.-- studies.sub.-- Subject.
Subject
Subject.Person
from Person, Person.sub.-- has.sub.-- Phone Number, Person.sub.--
studies.sub.-- Subject, Subject
Outer join Person.Person = Person.sub.-- has.sub.-- Phone.sub.-- Number.Pe
rson
Outer join Person.Person = Person.sub.-- studies.sub.-- Subject.Person
Outer join Person.sub.-- studies.sub.-- Subject.Subject
= Subject.Subject
where Person.Person = 'Mr. Smith . . . . .
__________________________________________________________________________
Note that SQL is used as a notational convenience and its use has no bearing on the theory behind the algorithm. It is a particular feature of the present invention that any relational language could have been used. The advantages of the Fact Compiler, Query Mapper and Tree Interpreter algorithms of the present invention are that they substantially reduce the number of concepts and amount of training required for a naive end-user to express meaningful queries in a relational database. The algorithm set of the present invention allows users to form conceptual queries without having to know keywords and physical structures. Also, the algorithms of the present invention provide a generated natural language description of the query to assure that the query is correct in syntax. These advantages are illustrated by contrasting the previously described example with the same example expressed in the state of the art natural language and SQL implementations. To express the query in natural language, the user would need to construct and type in the query: Show me the person called Mr. Smith, his Address, his Phone Number, the subjects he studies, and, for those subjects show people who teach them The distinct pieces of information required to phrase this query are: 1. An overall knowledge of how to express the query (having to include keywords like person, etc.) 2. The knowledge that you could ask for Address, Phone Numbers, Subjects, etc . . . Fact Compiler The fact compiler of the present invention is selected at 100. The fact tree of the present invention is selected at 300. As explained above, a Fact Compiler is provided by the present invention, a detailed description of which follows. Referring now to FIG. 4, after selecting fact compiler 100, the user opens a diagram which represents one level of the information system to be modelled. After opening the fact compiler diagram, the user types in a factual sentence, an object type, or a constraint, using a natural language-like computer language. One such language is FORML. An example of such an input is "the INSTRUCTOR with the ID "100" is qualified to teach the SUBJECT with the name "database design" at the SUBJECT LEVEL 300". At this point, the user may select one of three options: validate the input using Validate Only function 105; compile the information only into the repository 140; or to drag and drop the fact over the diagram at function 120. Referring now to Validate Only function 105 of FIG. 4 of the present invention, after all text has been combined from the edit windows at 110, it is parsed into its component words using function 111. An error checker at 112 determines if there are errors in the text which is input. If there are no errors, the system indicates successful validation at 113. If error checker 112 determines that there are errors in the textual input, the errors will be shown at error window 114. Drag and Drop over Diagram function 120, is detailed in FIG. 5. After the user selects Drag and Drop over Diagram at 120, the user utilizing a mouse moves a pointer over the icon from the edit window at 121. The user presses the left mouse button down and holds it down at 122. The system will test for the type of item which was input in the edit window at 123. Item kind selector 124 will change the nature of the cursor depending upon the type of information input in the edit window. If the data input is a fact, the cursor will change to a fact cursor at 125. If the data input is a constraint, the cursor will be changed to a constraint cursor at 126. If the data input is an object, the cursor will change to an object cursor at 127. The user drags the modified cursor over the diagram at 128. The edit window will disappear while the cursor is being moved. If the cursor moves over a non-drawing area the cursor change to a "Can't Draw" cursor. In the event the user needs to reorient the direction of the data which is input, the right mouse button is used. For each click of the right mouse button, the icon changes its orientation 90.degree. at 130. When the user releases the left mouse button at 131, the cursor will drop the data text over the diagram. Cursor checker 132 determines if the cursor is actually over the diagram or not. If it is, Drag and Drop Parse function 135 is invoked. If not, an indication is given the user that the cursor is not in the diagram. Drag and Drop Parse function 135 is detailed in FIG. 6. After collecting text at the current edit window, the text is parsed using Parse function 142. After parsing an error checker 161 determines if an error has been made in the textual input. If no error has been made, the record is updated to the repository using the Update Record to Repository function 149. After the record has been updated to the repository, the item is drawn on the diagram using Draw Item on Diagram function 164. At error checker 161, should an error condition be determined to exist, Enter Error Condition function 162 is invoked and function 135 exited at 165. Referring again to FIG. 4, function 140 gives the user the option of compiling the input into the repository only. Function 140 is detailed in FIG. 7. After selecting function 140, Compiling into Repository Only, the system combines all text from the edit windows at 141 and parses the input data using Parse function 142. Error checker 143 determines if an error has been made in the textual input. If errors have been made, they are shown at error window 145. In the event no errors were incurred, successful compilation is shown in the status bar at 144. An iterative process is detailed at 146. Each of the lists generated, the object list, fact list and constraint list, is searched record by record. Each record is retrieved from it's respective list at 147 and its status tested at 148. In the event the record has been changed, function 149, which updates the record to the repository, is invoked. In the event the record has not been changed, the list pointer is incremented at 150 and a new record is retrieved from the list. In the event the record is new, function 149 is again invoked after which the record type is tested at 152. If the new record is an object, function 152, which allocates a new object, is invoked. In the event the new record is a fact, a new fact is allocated using function 154. If the new record type is a constraint, a new constraint is allocated at 155. Following any of these allocations, the list pointer is again incremented at 150. Parse function 142 of FIGS. 6 and 7 is detailed at FIG. 8. After Parse function 142 is invoked, it invokes Scan function 170 to retrieve a token. After the token is retrieved, it is partitioned into sections at 171. An iterative loop through each of the tokens is set up at 172. Each token is tested for its type at 173. In the event the section is an object, function 174, which looks for object specifications, is invoked. In the event the section is a fact, function 175, which looks for fact specifications is invoked. In the event the section being tested is a constraint, function 176 is invoked which looks for constraint specifications. After either functions 174, 175 or 176 have been invoked, a test is made to determine if the section tested is the terminal section. If not, the next section is subjected to the same test. In the event the section is a terminal section, Parse function 142 is exited at 179. Scan function 170 of FIG. 8 is detailed at FIG. 9. After Parse function 142 invokes Scan function 170, it reads characters from the input source text at 180. At 181, the function matches each character sequence with a token specification shown as regular expressions in the Lexical Analyzer Declarations at 181, after which the token is returned at 182. Function 162, Enter Error Condition which was previously invoked at FIG. 6, is detailed at FIG. 10. For each error encountered, the system will retrieve the character position and line number where the error occurred. This function will then store the error number, error text and error context information in an error list. Referring again to FIG. 8, function 174, Look for object Specification, is detailed at FIG. 11. When function 174 is invoked, the object is syntactically specified by any "Object Declaration" at 190. The function continues to "parse" and "scan". A determination is made at 191 as to whether it is still an object section. If not, the function exits at 192. If the object is still in the object section, it breaks the textual components into structures for storing in the object list at 193. At 194, the object status is tested. If it is a new object, function 220 which allocates a new object, is invoked, If the object already exists in an object list, function 174 is exited at 198. If the object exists in the repository but it is not in the object list, the system reads the structure at 196 from the repository and puts it in the object list. After either function 196 or 220 is invoked, the function is again exited at 198. Function 175, Look for Fact Specification, previously invoked at FIG. 8, is detailed in FIG. 12. After function 175 is invoked, the fact is syntactically specified by a "fact declaration". The system continues to "parse" and "scan". A determination is made at 201 as to whether the fact is still in the fact section. If not, function 175 is exited at 202. If it is still in the fact section, the system breaks the textual components into structures for storing in a fact list at 203 after which the fact status is tested at 204. If the fact is a new fact, function 230 is invoked, which allocates a new fact after which fact parsing is exited at 206. If the fact already exists in a fact list, the function 175 again exits at 206. If the fact exists in the repository but is not in the fact list, the system reads the structure at 205 from the repository and puts it in the fact list, after which function 175 exits once again at 206. Function 176, also previously invoked at FIG. 8, is detailed at FIG. 13. When function 176 is invoked to look for constraint specifications, a constraint is syntactically specified by a "constraint declaration" at 210. The system continues to "parse" and "scan". At 211 a determination is made if the constraint is still in the constraint section. If not, function 176 is exited at 212. If the constraint is still in the constraint section, the system breaks the textural components into structures for storing in the constraint list at 213. After which the constraint status is tested at 214. If a new constraint, function 240 is invoked, which allocates a new constraint. If the constraint already exists in the constraint list, function 176 is exited at 217. If the constraint exists in the repository but it is not in the constraint list, the function reads the structure from the repository and puts it in the constraint list at 216. After either function 240 or process 216 is accomplished, function 176 exits at 217. Function 220, previously invoked in FIG. 11, is detailed in FIG. 14. Function 220 creates an empty object structure at 221, enters the name and all object attributes into the structure at 222, places the object structure in the object list at 223, and exits at 224. Function 230, previously invoked in FIG. 12, is detailed in FIG. 15. Function 230 creates an empty fact structure at 231. At 232 the function puts the predicate text, the objects involved and internal constraint information into the structure. At 233, the function places the fact structure into the fact list and exits at 234. Function 240, previously invoked at FIG. 13, is detailed in FIG. 16. Function 240 allocates a new constraint as follows: it creates an empty constraint structure at 241. At 242 it puts the constraint information (role positions, predicate IDs) into the structure. At 243 the function places the constraint structure into the constraint list and exits at 244. Function 149, previously invoked in both FIGS. 7 and 8, is detailed in FIG. 17. After function 149 is invoked, an updated record exists in a fact list, an object list or a constraint list, as shown at 250. At 251, the record type is tested. If the record is an object, the object structure is sent the object update function at 252. If the record is a fact, the fact structure is sent to the fact update function at 254 and if a constraint structure is sent to a constraint update function at 253. After any of the aforementioned update functions is accomplished, function 149 is exited at 256. Fact Tree Formation Referring to FIG. 18, the second option possible from function selection 51 is initiation of Fact Tree Specification 300, which is detailed at FIG. 19. Referring to FIG. 19, the fact-tree is formed at 300. An example of a fact tree is:
______________________________________
Person ( = Mr Smith ) . . . . . . . . . . . . . . . . . . . . . . .
restriction
- that person lives at an address
- that has a phone number . . . . . . . . . . . . . . . . . . . . . . . .
. noun
- that studies a subject
- that is taught by a Person
.vertline. . . . . . . phrase
Each node in the fact tree has a noun (e.g. Person).
Each node in the fact tree may have a restriction (e.g. =
Mr Smith).
Each non-root node in the fact tree (all but the very top node)
has a phrase (e.g. is taught by).
______________________________________
The root of the tree is then assigned to the variable Root at 301. In this case, the shaded node (Person) is assigned to Root. If the fact tree is to be mapped to an SQL query, Root is passed as the parameter to the function Fact.sub.-- Tree.sub.-- To.sub.-- SQL 400. The return value of this function will be an SQL query. Fact.sub.-- Tree.sub.-- To.sub.-- SQL is described using functions 400 to 465. If the fact tree is to be mapped to an English description, Root is passed as the parameter to the function Fact.sub.-- Tree.sub.-- To.sub.-- Description (500). This function has no return value. The result of Fact Tree To Description will be to print out the description of the tree. Fact.sub.-- Tree.sub.-- To.sub.-- Description is described using functions 500 to 535. Tree Interpreter The present invention also provides a Tree Interpreter, invoked as "Fact.sub.-- Tree.sub.-- to.sub.-- Description", a detailed description of which follows. Referring to FIG. 25, Fact.sub.-- Tree.sub.-- To.sub.-- Description (500) is a recursive function that takes a node of a fact tree as input and returns a description of the query represented by that tree or sub-tree. The parameter Root is the node on which the function is to operate. Function 501 assigns some working variables. Root is the root of a tree of subtree that may or may not contain a parent and may or may not contain children. For example, if the shaded node in the example tree (Parent) was passed to Fact.sub.-- Tree.sub.-- To.sub.-- Description, there would be no parent, and three children. Parent is assigned Root's parent (in this case NULL). Nodes is assigned the number of Root's children (three). Child [i . . . Nodes] is an array which is assigned Root's children (the three children of Person--"that lives at Address", "that has a phone number", and "that studies a subject". Next, the temporary variable Text is assigned the "description" of the Root. (502-504). If Root has no parent (it is the root of the fact tree), the Root is passed to the function 503 Create.sub.-- Text.sub.-- For.sub.-- Root (503), otherwise Root is passed to Function 507, Create.sub.-- Text.sub.-- For.sub.-- Node. The return value of both of these functions is the description of Root. At 505, Text is then printed out followed by a carriage return. If Root referred to the Person node, Function 503, Create Text For Root would have been used and Text would be "For all Person(s) (where Person=Mr. Smith) show:" If Root referred to the "that studies Subject" node, function 504, Create Text For Node would have been used and Text would be "the Subject(s) that the Person studies, and for those Subject(s) show:". The next step is to recursively process Root's children using a depth-first-search algorithm, detailed in functions 505-510. Nodes is the number of Root's children and Child [i . . . Nodes] are Root's actual children. The variable I is used as a counter variable. It is initially assigned to 1 at 506. If I is greater than Nodes (there are no more children to process), this instantiation of Fact.sub.-- Tree.sub.-- To.sub.-- Description is complete (507-508) otherwise, Fact.sub.-- Tree.sub.-- To.sub.-- Description is invoked for Child[i] at 509, I is incremented at 510 and the loop continues (507) until there are no more children to process. The result of Fact.sub.-- Tree.sub.-- To.sub.-- Description applied to the Person node in the example tree would be:
__________________________________________________________________________
the Address that the Person lives at
Output
For all Person(s) (where Mr. Smith) show:
the Phone Number that the Person has
the Subjects that the Person studies,
and for those subjects show:
Processing
The Person(s) that the Subject
Fact.sub.-- Tree To .Description called on Person
is taught by
Fact.sub.-- Tree To .Description called on
that lives at Address
Fact Tree To .Description called on
that has phone number
Fact.sub.-- Tree To .Description called on
that studies subject
Fact Tree To Description called on
that Is taught by Person
__________________________________________________________________________
Function 503, is invoked in FIG. 25 and Create.sub.-- Text.sub.-- For.sub.-- Root, takes the root node of a fact tree as an argument (Root) and returns the description of that node. At 520, Root contains a noun (e.g. Person); this is assigned to Noun. Root may also contain a Restriction (e.g. =Mr. Smith); this is assigned to Restriction. The variable Text is assigned `For all`+noun+`(s)` (e.g. `For all Person(s)`). If Root contains a Restriction, that restriction is added to Text at 521 and 522. (e.g. Text="For all Person(s)(where Person=Mr. Smith)") "show": `is added to Text at 523 and Text is returned at 524 (e.g. Text="For all Person(s)(where Person=Mr. Smith) show:"). Referring back to FIG. 25, Create.sub.-- Text.sub.-- For.sub.-- Node (504), which takes a non-root node of a fact tree as an argument (Root) and returns the description of that node is detailed at FIG. 27. For example, the node `that studies a Subject` could be passed as an argument. At 530 Root contains a noun (e.g. Subject), this is assigned to Noun. Root contains a phrase (e.g. `studies`) this is assigned to Phrase. Root's parent contains a noun (e.g. Person), this is assigned to Parent-Noun. Root may also contain a Restriction (e.g. =Mr. Smith) this is assigned to Restriction. The variable Text is assigned `the`+noun+`(s) that the`+Parent-Noun+` `+Phrase (e.g. `the Subject(s) that the Person studies`). If Root contains a Restriction, that restriction is added to Text at 531 and 532. If Root has children`, and for those`+Noun+`:(s) show:`is added to Text at 533 and 534. At 535, Text is returned (e.g. Text=`the Subject(s) that the Person studies, and for those Subject(s) show:`). Query Mapper The present invention also provides a Query Mapper, invoked as "Fact.sub.-- Tree.sub.-- to.sub.-- SQL.sub.-- Query", a detailed description of which follows. Referring back to FIG. 19, function 400 is detailed at FIG. 20. This function takes the root node of a fact tree as input and returns an equivalent SQL query. The parameter Root is the root of the tree on which the function is to operate. Each non-root node in the fact tree (all but the very top node) has a relational mapping associated with it. The relational mapping specifies the node's representation in a relational database. By way of example, for an example set of facts: Person lives at address Person has Phone Number Person studies Subject Subject is taught by Person if the relational database associated with an example fact tree is: Person.sub.-- Table: (Person, Address) Phone.sub.-- Table: (Person, Phone Number) Studies.sub.-- Table: (Person, Subject Studied) Subject.sub.-- Table: (Subject, Teacher Person) The associated RMAP mappings would be:
__________________________________________________________________________
FIRST NOUN
SECOND NOUN
FACT TABLE COLUMN COLUMN
__________________________________________________________________________
Person lives at address
Person.sub.-- Table
Person Address
Person has Phone Number
Phone.sub.-- table
Person Phone Number
Person studies Subject
Studies.sub.-- Table
Person Subject Studies
Subject is taught by Person
Subject.sub.-- Table
Person Teacher Person
__________________________________________________________________________
Table denotes the table in which the fact is stored. First Noun Column denotes the column for the node's parent. Second Noun Column denotes the column for the node's noun. For example, the mapping associated with (Person) lives at Address means that all addresses are stored in the Person Table table, with the people who live at them in the Person column and the actual addresses in the Address column. The predicate mappings are derived by an algorithm similar to the one described in McCormack & Halpin, Automated Mapping of Conceptual Schemas to Relational Schemas, Proc CAiSE 93, Sorbonne University, Paris, 1993. An SQL query contains three parts, a SelectList, a From List, and a WhereClause. These are gradually build up using recursive calls to Node.sub.-- To.sub.-- SQL. Initially, SelectList, FromList, and WhereClause are set to the empty string (" ") at 401. At 402, Node.sub.-- To.sub.-- SQL is called, with Root as its parameter, to build up the query. At 403, 404 and 405, SelectList, FromList, and WhereClause are respectively formatted. At 406, the query is assembled as the result of the function and returned at 407. Node.sub.-- To.sub.-- SQL is invoked at 402. This is a recursive function that maps a node of a fact tree into an SQL query. Successive calls to this function build up the SelectList, FromList, and WhereClause variables. Node.sub.-- To.sub.-- SQL has three parameters. The first, Root, is the root of the tree or sub tree being mapped. The second and third parameters are the table and column used to join the query for Root's sub tree to the rest of the query. At 410, Parent is the parent of Root, Nodes is the number of children of Root and Child [i . . . Nodes] are the children of Root. If Root has no children, no processing is required so the function simply returns (411, 412). Otherwise, the children of Root are added to the query as follows: If Root has a parent, it needs to be joined into the query using the JoinTable and JoinColumn function 413 and 414; and Root needs to be added to the select list of the query at 415. To join all of Root's children into the query, they are processed sequentially as follows: The counter variable i is initialized to 1 at 416. The value of nodes is checked at 417, and if i is less than nodes Add.sub.-- Selector.sub.-- 2 (node) is invoked at 419. Each child is added to the select-list using Add.sub.-- Selector.sub.-- 2 at 419, and that child's children are added into the query using recursive calls to function 421, Node.sub.-- To.sub.-- SQL. The selector for each node is used to join the subtree's queries together at 420. Create.sub.-- Join, invoked at 414 is detailed at FIG. 22. This function joins a subquery to the main query by adding an inner join to the where-clause. The join is based in the first noun in the passes node (Node) and the passed parameters. Referring back to FIG. 21, function 415, Add.sub.-- Selector.sub.-- 1 is detailed at FIG. 23. This function adds Node's table to the FromList and the column for Node's first noun to the SelectList. Referring again to FIG. 21, function 419, Add.sub.-- Selector.sub.-- 2 is detailed at FIG. 24. This function adds Node's table to the FromList and the column for Node's second noun to the SelectList. It is to be understood that the above-described embodiments are merely illustrative of some of the many specific embodiments which represent applications of the principles of the present invention. Clearly, numerous variations can be readily devised by those skilled in the art without departing from the scope of the present invention.
|
Same subclass Same class Consider this |
||||||||||
