Method and apparatus for querying a database containing disjunctive information5628012Abstract A method and apparatus for querying a disjunctive database is disclosed. An annotated object, which defines a complete object which is a unique element of the normal form of the database, is stored and maintained in memory. A complete object is created using the annotated object, and is stored in memory. A query is run against the complete object. If continued processing is required, a next function produces a next annotated object, which represents another unique element of the normal form of the database. The next annotated object is stored in memory, overwriting the previous annotated object. A complete object is then created using the next annotated object and is stored in memory, overwriting the previous complete object. The query is then applied to the complete object stored in memory. By processing the elements of the normal form one at a time and reusing memory spaces, the present invention has a polynomial space requirement. The invention also allows processing to begin from a pre-defined annotated object, either user defined or random. A user defined annotated object may define certain sub-objects which are not to be processed. Further, processing may terminate with the last processed, annotated object as an output. Claims I claim: Description FIELD OF THE INVENTION
______________________________________
Object
Type Annotation
______________________________________
Base: (B)
Pair: (P, bool)
Set: (S, bool)
Or-set: (O,bool, [bool,... bool])
______________________________________
.sup.1 In FIG. 1, A1 is shown with only two subobjects. However, A1 is
defined as a set instead of as a pair because all elements of an orset
must, by definition, contain elements of the same type. This is because
orset elements are alternatives, and therefore they must be compatible.
Since A2 contains 3 elements, and must be defined as a set, A1 must also
be defined as a set. This does not present a problem since a pair is a
special form of a set.
The first element of each of the annotations identifies the object's type: B (base), P (pair), S (set), and O (or-set). The element bool represents a boolean value of true or false. The first boolean value shown above in the annotation for the pair, set and or-set type objects, is used to identify whether the object needs to be processed further while creating the elements of the normal form. In other words, the annotation indicates whether the object, or any of its sub-objects, contains disjunctive information which needs to be further considered in creating the elements of the normal form. For base type objects, this boolean value is always assumed to be false, and therefore is not stored in the annotation for base type objects. The use of this boolean value will become clear with reference to the discussion that follows. The or-set annotation contains an additional list of boolean values, shown above as [bool, . . . bool]. These boolean values are used to indicate the element of the or-set that is the current choice given by that or-set. Each boolean value in the boolean list positionally represents one of the elements in the or-set. Only one of these boolean values will contain the value true at any one time. The position of this true value indicates which element is the current choice for the or-set. For example, if the second boolean value in the list is true, then the second element in the or-set is the current choice. To avoid confusion, the boolean elements used to identify whether an object needs further processing while creating the elements of the normal form will be referred to as a boolean annotation. The list of boolean elements contained in the annotation for the or-sets to indicate the element of the or-set which is the current choice will be referred to as a boolean list annotation. The initial annotated object created in step 310 is shown in FIG. 4A as 402. Each of the objects from FIG. 1 is shown with an associated annotation. The annotation associated with each of the objects is represented in FIGS. 4A and 9A-F by an attachment to the object, e.g. the attachment 404 to object B in FIG. 4A. The initial annotation is created as follows. The base objects, w, k, l, m, x, y, z, v, p, q, r, s, t, and u are annotated with their type B. The pair objects, Design and B, are annotated with their type P and the boolean annotation set to T (true). The set objects, A1 and A2, are annotated with their type S and the boolean annotation set to T. The or-set objects, A, A1.1, A1.2, B1, B2, A2.1, A2.2, and A2.3 are annotated with their type O and the boolean annotation set to T. As described above, the or-set annotation also contains a boolean list annotation [bool, . . . bool]. Initially, the first boolean value in the list is set to T, with all remaining boolean values set to F. This indicates that initially, the first element of the or-set will be chosen for the completed object. Thus, for objects of type pair, set, and or-set, the boolean annotation is initially set to T to indicate that the object requires additional processing to create elements of the normal form. For or-set objects the boolean list annotation is set to contain a T in the first position and an F in all other positions to indicate that the first element in the or-set is the initial chosen element. The initial annotated object 402 is stored in memory unit 210 in memory space 216. In step 315, a complete object, which is an element of the normal form, is created in memory space 214 using the annotated object stored in memory space 216. Apick function is used to create the element of the normal form from the current annotated object. The pick function operates as follows for each of the object types:
______________________________________
Type Operation
______________________________________
Base: pick the element itself
Pair: pick each element of the pair to create a pair
Set: pick each element of the set to create a set
Or-set:
pick the element of the or-set which corresponds to a true
value in the annotation's boolean list
______________________________________
The element of the normal form which will be created from the initial annotation is shown in FIG. 4B. The Design object is a pair and each element of the pair, A and B, are picked to create a pair in the complete object. Since A is an or-set object, it will contain the element which corresponds to a true value in A's boolean list annotation. The boolean list annotation of object A is [T,F]. Thus, the first element, A1, of the or-set will be chosen for the current complete object. A1 is a set, and each element, A1.1 and A1.2, are picked to create a set. Since A1.1 and A1.2 are or-sets, the elements chosen for A1.1 and A1.2 are determined from their boolean list annotations. Thus, x is chosen for A1.1 and z is chosen for A1.2. In a similar manner, the object B is a set containing objects B1 and B2. Since B1 and B2 are or-sets, the values for B1 and B2 are determined from their boolean list annotations. Thus, w is chosen for B1 and l is chosen for B2. The complete object, which is the element of the normal form defined by the initial annotated object shown in FIG. 4A, is shown in FIG. 4B. This is one complete object of the normal form which is represented by the Design object. The next step, 320, is to query the complete object which is an element of the normal form and update the query accumulator. The complete object created by the pick function is stored in memory space 214, and a query is applied to that object in a conventional manner. For example, assume the complete object represents a complete design, and also assume that each sub-object contains information about the cost of that sub-object. Then a query on the complete object may take the sum of the costs of each sub-object to calculate the cost of the complete design. Numerous other conventional query techniques which are well known in the art may be applied to the complete object, and will not be described further herein. Various new query techniques which are made possible by the present invention will be described below. In step 325 a query condition is checked. The query condition will depend on the particular query being run. If the query condition is satisfied, then the query results are output in step 340 in some convention manner. If the query condition is not satisfied, then it is determined in step 330 whether the current annotated object stored in memory space 216 is the last annotated object. This test determines whether there are any elements of the normal form which have not yet been created as complete objects. This test is performed by testing the highest level object, in the present example, the Design object. If the boolean annotation of the highest level object is F, then the current annotated object is the last one and the query results are output in step 345. As an example, a query may ask for the number of complete designs that are represented by a database object containing disjunctive information. In such a case, the query accumulator would contain a counter which would be incremented by one in step 320 each time a new element of the normal form is created. When the test of step 330 is YES, the query accumulator would contain the number of complete designs represented by the database object and the results would be output in step 345. If the boolean annotation of the highest level object is T, then there are additional complete objects which have not yet been created. Control is passed to step 335 in which the next annotated object is created using the next function. The next function operates on the current annotated object stored in memory space 216. As discussed above, in order to avoid the exponential space requirement of the prior art, the present invention queries one element of the normal form at a time. As represented in FIG. 2, the annotated objects are stored in memory space 216 and the completed objects which are elements of the normal form are stored in memory space 214. Each time a new annotated object is created, it is stored in memory space 216 and overwrites the previous annotated object. Similarly, each time a completed object, which is an element of the normal form, is created, it is stored in memory space 214 and overwrites the previous complete object. The next function algorithm will be described in conjunction with a specific example and with reference to FIGS. 5-9. FIG. 5 shows the algorithm for pair type objects. FIG. 6 shows the algorithm for set type objects. FIG. 7 shows the algorithm for or-set type objects. FIG. 9A shows a starting annotated object for the example. The resulting annotated object, after applying the next function to the annotated object shown in FIG. 9A, is shown in FIG. 9F. FIGS. 9B-9E show intermediate annotated objects which result from the recursive nature of the algorithm. In FIGS. 9B-9E, the portion of the annotated object which has changed from the immediately preceding annotated object is shown enclosed in broken lines. FIG. 8 illustrates the nesting of the recursive function calls. The next function is applied to the entire DESIGN annotated object shown in FIG. 9A as follows. The next function is called to perform the algorithm on the DESIGN object. This is the top level of recursion and is shown as level 802 in FIG. 8. Since the DESIGN object shown in FIG. 9A is a pair object, the algorithm shown in FIG. 5 is applied. Referring to FIG. 5, the general algorithm for a pair type object is as follows. In step 502 the next function is performed recursively on the y object of the pair. For purposes of this algorithm, the first object of the pair is identified as the x object and the second object of the pair is identified as the y object. After the next function is applied to an object, the boolean annotation of the object will indicate whether the object needs to be processed further. If the boolean annotation is T, then additional processing is necessary. If the boolean annotation is F, then no additional processing is necessary. In step 504 the y object's boolean annotation is examined. If the boolean annotation is T, then the function ends in step 514. If the y object's boolean annotation is F, then the next function is performed recursively on the x object in step 506. In step 508 it is determined whether the x object's boolean annotation is true. If it is, then object y, including all of y's sub-objects, is reset to its initial annotation in step 516, and the function ends in step 518. If the x object's boolean annotation is F, then the boolean annotation of the pair object is set to F in step 510, and the function ends in step 512. Returning now to the specific example, in step 502, the next function is called recursively on object B. This is shown in FIG. 8 as level 804 of recursion. Since B is a pair object, the pair algorithm is applied again. In step 502, the next function is called recursively on the B2 object of the pair, shown as level 806 of recursion. Since B2 is an or-set, the or-set algorithm is applied. Referring now to FIG. 7, the general algorithm for an or-set type object is as follows. In step 702 it is determined if the or-set is empty or if the or-set's boolean annotation is F. If either of these conditions is true, then the function ends in step 718. Otherwise, object x.sub.i of the or-set, which corresponds to the T value in the or-set object's boolean list annotation, is chosen in step 704. As an example, consider the or-set (a,b,c) . If the or-set's boolean list annotation contained [F,T,F], then the second object, b, would be chosen as object x.sub.i. Thus, the true value in the boolean list positionally represents which element to choose as object x.sub.i. In step 706 the next function is applied recursively to x.sub.i. In step 708 it is determined whether the x.sub.i object's boolean annotation is true. If it is, then the function ends in step 720. If it is false, then in step 710 it is determined whether x.sub.i is the last object of the or-set. If it is, then the boolean annotation of the or-set object is set to false in step 722 and the function ends in step 724. If x.sub.i is not the last object of the or-set, then the boolean value corresponding to object x.sub.i in the or-set object's boolean list annotation is set to F in step 712, and the boolean value corresponding to object x.sub.i+1 in the or-set object's boolean list annotation is set to T in step 714. In effect, steps 712 and 714 shift the T boolean value in the or-set's boolean list annotation one position to the right. The function ends in step 716. Returning now to the specific example, the test in step 702 is YES because the boolean annotation of the B2 or-set is F. The function ends in step 718 and the algorithm returns to processing the B object at recursion level 804. The next function continues processing the B object at step 504. Since the boolean annotation of B2 is F, control passes to step 506 and next is performed recursively on B1 at recursion level 808. Since B1 is an or-set, the algorithm of FIG. 7 is performed and control passes to step 702. Since the boolean annotation of B1 is F, the function ends in step 718 and the algorithm returns to processing the B object at recursion level 804 at step 508. Since the boolean annotation of B1 is F, the B object's boolean annotation is set to F in step 510. The resulting intermediate annotated object is shown in FIG. 9B. The function ends in step 512 and the algorithm returns to processing the DESIGN object at the top recursion level 802 at step 504. Since the boolean annotation of B is F, the condition of step 504 is NO, and control passes to step 506, where the next function is called recursively on object A. The algorithm enters recursion level 810 to process object A. Since A is an or-set, the algorithm of FIG. 7 is applied. In step 702 it is determined that the or-set is not empty and the or-set annotation is not F, so control passes to step 704. In step 704, A1 is chosen as x.sub.i since object A's boolean list annotation has a T in the first position, thus indicating that the first element of the or-set is to be chosen. In step 706, next is applied recursively to A1 at recursion level 812. Since A1 is a set object (see footnote 1), the algorithm of FIG. 6 is applied. Referring now to FIG. 6, the general algorithm for a set type object is as follows. Regarding notation, the set [X] is considered to consists of the first object in the set, x.sub.1, plus the rest of the objects in the set, x.sub.rest. Thus, the set [X] is treated as [X]={x.sub.1 }.orgate.[x.sub.rest ]. In step 602, the next function is applied recursively to object x.sub.1. In step 604 it is determined whether the x.sub.1 object's boolean annotation is T. If it is, then the function ends in step 614. If the boolean annotation is F, then the x.sub.1 object, along with all of its sub-objects, is reset to its initial annotation in step 606. In step 608, the next function is applied recursively to X.sub.rest. In step 610, the boolean annotation of the set object is set to the value of the boolean annotation of the last object of the set which was processed. The function ends in step 612. Returning to the specific example, in step 602, next is applied recursively to object A1.1 at recursion level 814. Since A1.1 is an or-set, the algorithm of FIG. 7 is applied. The test of step 702 is NO, and the object y is selected in step 704. In step 706, the next function is applied recursively to object y at recursion level 816. y is a base type. If the next function is applied to a base type, there is no change to the annotated object. Thus there is no processing necessary, and the function returns to recursion level 814 to continue processing the A1.1 object. The test of step 708 is NO because, as described above, the boolean annotation of a base object is always assumed to be F. Since y is the last element of the or-set, the test of step 710 is YES and control passes to step 722. In step 722, the boolean annotation of object A1.1 is set to F. The resulting intermediate annotated object is shown in FIG. 9C. The function ends in step 724, and the algorithm returns to recursion level 812 to continue processing the A1 object. Since the boolean annotation of object A1.1 is F, the test in step 604 is NO, and A1.1 is reset to its initial annotation in step 606. The resulting intermediate annotated object is shown in FIG. 9D. In step 608, next is applied recursively to A1.2, which is x.sub.rest, at recursion level 818. Since A1.2 is an or-set, the algorithm of FIG. 7 is applied. The test of step 702 is NO, and object z is selected in step 704. In step 706, the next function is applied recursively to object z at recursion level 820. Since z is a base type, there is no processing necessary, and the function returns to recursion level 818 to continue processing the A1.2 object. The test of step 708 is NO, and control passes to step 710. Since z is not the last object in the or-set, the boolean value in the A1.2 object's boolean list annotation corresponding to z is set to F in step 712, and the boolean value in the A1.2 object's boolean list annotation corresponding to v is set to T in step 714. In effect, steps 712 and 714 shift the T value in A1.2's boolean list annotation one position to the right. The resulting intermediate annotated object is shown in FIG. 9E. The function ends in step 716 and the algorithm returns to recursion level 812 to continue processing object A1. In step 610, the boolean annotation of A1 remains set to T, since the boolean annotation of the last sub-object processed, A1.2, has a boolean annotation of T. The function ends in step 612 and the algorithm returns to recursion level 810 to continue processing object A. In step 708 it is determined that A1 has a true annotation. The function ends in step 720 and the algorithm returns to the top recursion level 802 to continue processing the DESIGN object. In step 508 it is determined that A has a true annotation. In step 516, object B, including all of its sub-objects, is reset to its initial annotation. The resulting annotated object is shown in FIG. 9F. The function ends in step 518. Since this was the top recursion level, the next function is finished processing the DESIGN object, and FIG. 9F represents the final annotated object which results from applying the next function to the annotated object of FIG. 9A. Referring to FIG. 3, upon completion of step 335, control is passed to step 315, and the algorithm continues as described above. In an alternate embodiment of the invention, step 310 is modified so that a pre-defined annotated object may be input and used as the starting point for the main loop (steps 315 through 335) of the algorithm. This is illustrated in FIG. 3 by annotated object 350. In addition, the results 340 which are output following a YES condition in steps 325, and the results 345 which are output following a YES condition in step 330, may contain the current annotated object which is stored in memory space 216. Thus, in this alternate embodiment, the invention allows for a pre-defined annotated object to be used as input, and for the results of the algorithm to include the current annotated object as an output. These aspects allow for novel query techniques, as described below. One technique made possible by the present invention is normalization with time constraints. The time needed to normalize a large database may be very long. Thus, iterating over the entire normal form may be impractical. Using the above techniques, it is possible to start at the initial annotated object in step 310, and iterate for a pre-defined time period. Thus, one of the conditions of step 325 would be whether the pre-defined time period has expired. If it has, then the results 340 would include the results of the particular query being run, along with the last annotated object 355 which was processed. The user may then examine the query results and determine if they are satisfactory. If so, no further processing is necessary. If the user would like to continue processing the query, the last annotated object 355 processed which was output with the results 340, is used as an input annotated object 350 for further processing. The algorithm then continues processing from step 315 using the last processed annotated object from the previous iteration. This allows for querying a disjunctive database over its entire normal form, while allowing the processing to be interrupted at certain intervals so that the intermediate results can be examined. In addition, a random annotation could be generated and used as the input annotated object 350. This would be useful for optimization queries over large normal forms. In such a case, it may not be feasible to calculate the entire normal form. In addition, the time limit approach may not be sufficient, because optimal values may be far from any given starting annotation in terms of the number of times the next function must be applied. The solution to this problem is to generate a number of random annotations and run the optimization query starting from each random annotation for a predetermined time period. In another embodiment, another object type is defined for use in annotating the objects. This new type is Fake Base (FB). This allows for additional user control as follows. Consider the DESIGN object of FIG. 1 with its initial annotation shown in FIG. 4A. It is possible that a user may want to run a query on the DESIGN object, but the user is only interested in optimizing over the A object, including its sub-objects. In such a case, the user is not interested in the B object or its sub-objects. The use of the fake base type allows the user to specify this by creating an annotated object in which the B object is annotated with the type FB. This annotated object is used as the input 350 to the algorithm. The pick function is modified to treat FB annotated objects as base objects, disregarding their complex structure. The next function also treats these fake base types as base types, and thus will not create all the elements of the normal form represented by these objects. In the example above, where B is annotated as FB, the query can still run correctly because it does not consider the B object's complex structure. Functional Description The following is an additional description of certain aspects of the present invention using a functional language specification. For further information on functional language specification see, C. Reade, Elements of Functional Programming, Addison-Wesley, 1989. The same algorithms which were described above in terms of flow diagrams, are described below using a functional language specification. Note that the invention has been fully described above. What follows is an alternate description using a different type of descriptive language. This section is included for pedagogical reasons and is not intended to limit the above description in any way. Definition of Annotated Object For each object type t, there is a new annotated type A(t) and the initial translation t.fwdarw.A(t). Type K (kind) has four possible values: B (base), P (pair), S (set), and O (or-set). [t] denotes the type of lists of type t. For each type t, an annotated type A(t) is created as follows: ##EQU1## The first boolean value in these translations is set to true if there are still entries represented by the object that have not been looked at. For or-sets, the boolean component inside lists is used for indicating the element that is currently used as the choice given by that or-set. In all algorithms only one entry in such a list will have the true boolean component. Initial Function initial:t.fwdarw.A(t) produces the initial annotation of an object. The definition of initial is given in FIG. 10. Pick Function pick:A(t).fwdarw.sk(t) produces an element of the normal form given by an annotation. sk(t) is t from which all or-set brackets are removed. If an object has type t, elements of its normal form have type sk(t). The definition of pick is given in FIG. 11. In FIG. 11, void means a special object used to indicate the end of the process of going over the normal form. P1-P5 gives a version of pick in which void is not propagated to the top level. Such propagation can be done to detect inconsistencies represented by empty or-sets. End Function end: A(t).fwdarw.bool returns true if and only if all possibilities represented by its argument have been exhausted. This function always returns true on (B,x). On any other annotated object, x is of the form (k,c,v), where: k is kind; c is condition; and v is value. For such x, end (x)c. Reset Function rest:A(t).fwdarw.A(t) disregards the annotation of an object and restores the initial one. This definition repeats the initial function. Next Function The algorithm for the next function is recursive. In the definition of the next function, for any list X=[x.sub.1, . . . , x.sub.n ] X.sub.oi stands for [x.sub.1, . . . , x.sub.i-1 ] and X.sub.1i denotes [x.sub.i+1, . . . , x.sub.n ]. These sets may be empty. The notations :: and @ are used for consing and appending. That is, a::x puts a as the new head before the list x, and x@y appends y to the end of x. The next algorithm is shown in FIG. 12. The following algorithm, which defines a function norm, can be used to query a disjunctive database:
______________________________________
norm (cond, init, update, out) (o)
acc:=init;
ao:=initial o;
last:=end ao;
while not (cond(pick ao) or last)
do
acc:=update(pick ao, acc);
ao:=next ao;
last:=end ao
end;
return out((pick ao, last), acc, ao)
______________________________________
The output value is accumulated in acc; cond is used to break the loop if the condition is satisfied; last indicates if all possibilities have been looked at; and out forms the output. As noted at the beginning of this section, the algorithms of the present invention have been fully described in a conventional manner in conjunction with the flow diagrams of FIGS. 3 and 5-7, and the detailed description associated with those figures. The functional language description is given only as an alternate description of certain aspects of the present invention. The foregoing Detailed Description is to be understood as being in every respect illustrative and exemplary, but not restrictive, and the scope of the invention disclosed herein is not to be determined from the Detailed Description, but rather from the claims as interpreted according to the full breadth permitted by the patent laws. It is to be understood that the embodiments shown and described herein are only illustrative of the principles of the present invention and that various modifications may be implemented by those skilled in the art without departing from the scope and spirit of the invention.
|
Same subclass Same class Consider this |
||||||||||
