Method of conjoining clauses during unification using opaque clauses5903860Abstract A method of using a processor to conjoin a first clause and a second clause as part of a unification of a first graph. If the first clause is not associated with the first graph, then a third clause is created that is opaque and has a pointer to the first clause. Afterward, the third clause is conjoined with the second clause. Claims What is claimed is: Description REFERENCE TO MICROFICHE APPENDIX
______________________________________
Edge›
id: an integer uniquely identifying this edge
right: an integer identifying the rightmost word of the edge
left: an integer identifying the leftmost word of the edge
category: indicates the grammatical category of the edge; e.g., NP, S,
VP, etc.
subtrees: a list of the various ways of making this edge
graph: a pointer to the graph data structure for this edge
Each Subtree Data Structure represents a subtree in Chomsky Normal
Form. and includes the following information:
Subtree›
partial: points to the left daughter of the rule, a partial starts at
the
leftmost word of the edge and ends somewhere in the middle
complete: points to the right daughter of the rule, a complete starts
somewhere in the middle of the edge and ends at the rightmost word of
the edge
constraint: defines how partial and complete should be combined to
produce the subtree
graph: a pointer to the graph data structure for this subtree
!
______________________________________
Note that each subtree data structure includes only two daughters, partial and complete, even though conceptually context-free rules permit an arbitrary number of daughters per subtree. Application of a standard transformation to a grammar produces a new grammar in which all rules are binary. For example, by applying this transformation to the rule S .fwdarw. NP VP ADV two rules are produced S .fwdarw.S1 ADV and S1.fwdarw. NP VP. The class of Graph Data Structures include three types: Graph, AVPair, and CVPair. Each Graph Data Structure represents a feature structure and related information and includes the following information:
______________________________________
Graph ›
attrs: a list of pointers to the AVPairs associated with this graph data
structure
context: represents the context that this graph is in, and corresponds a
variable assigned to a subtree to distinguish it from other valid
subtrees
associated with the same edge
nogood: a boolean value indicating whether this feature data structure
is
nogood
nogoods: a list of clauses that are nogood and associated with this
graph
edge: a pointer to the edge associated with this graph
disjunctive: a boolean value that indicates whether or not the graph is
an OR graph; i.e., a single graph that represents many alternative
graphs
clauses: a list of clauses allocated in this graph
disjunctions: a list of locally instantiated disjunctions
solutions: a list of pointers to Restriction Sets and their solutions
______________________________________
Each AVPair Data Structure represents an attribute contexted value pair and includes the following information:
______________________________________
AVPair›
attr: the name of the type of attribute
attrs: a list of pointers to the attributes that are included in the
feature
structure that this AVPair represents
equals: a list of pointers to CVPair data structures that store the
value
for this attribute; e.g., if attr is NUM then the values in this field
might by
SG in one context and PL in another context
copies: a list of pointers to CVPair data structures that store copy
links
to values that have been copied to or copied from this AVPair
contexts: contexts in which this AVPair already has constraints
prefix: a pointer to the AVPair that includes this AVPair. For example,
in
the LFG term (.uparw. SUBJ NUM), the NUM AVPair would have the
(.uparw.SUBJ) AVPair as its prefix.
graph: a pointer to the feature data structure that this AVPair belongs
to
expanded: a boolean value that indicates whether the contexted lazy
links associated with this AVPair have been expanded
______________________________________
Each CVPair data structure represents a context value pair and includes the following information:
______________________________________
CVPair›
contexts: the context, or clause, associated with this value, for
regular
copy links the context is TRUE, any other value indicates a contexted
value
value: this field my include a value or a pointer to another AVPair data
structure, depending on whether the pointer to this CVPair is stored in
the equals field or copies field of an AVPair
lazy: a boolean variable that indicates whether the value field is a
forward copy link pointing forward to a recipient or is a lazy copy link
that
points backward to a source
______________________________________
There are two types of Clause Data Structures: Clause and Disjunction. Each Clause Data Structure represents a clause and has its own clause cache, which is a list of cache items. A Clause Data Structure includes the following information:
______________________________________
Clause:›
type: Clause type- AND, OR, CHOICE, OPAQUE, nogood, TRUE
body: union of{
- this type of clause is a conjunction of contexts
first: a pointer to the first clause of the conjunction
rest: a pointer to the rest of the conjoined clauses
OR{--this type of clause is a disjunction of contexts--
first: a pointer to the first clause of the disjunction
rest: a pointer to the rest of the disjoined clauses
}
NOT{--this type of clause is a negation of a context--
negated: the clause negated
}
- this a primitive choice of disjunction
disjunction: pointer to disjunction data structure
containing the choice
}
OPAQUE{--this imports a context
imported: the imported clause that is being wrapped
graph: a pointer to graph data structure from which clause was imported
}
}
graph - a pointer to the graph data structure with which this clause is
associated
cache - a pointer to an area of the clause cache associated with this
clause. This area of the clause cache stores results of previously
performed
operations with this clause
exported - a boolean variable that indicates whether this clause has
been imported to another graph data structure
nogood - boolean variable that indicates whether this clause has been
determined to be nogood
!
______________________________________
Each Disjunction Data Structure represents a disjunction includes the following information:
______________________________________
Disjunction›
count - integer indication the number of disjuncts in this disjunction
context - the context with which this disjunction is associated
arm1 - the first choice context (if only one)
disj1 - the first choice disjunction (if more than one)
arm2 - the second choice context (if only one)
disj2 - second choice disjunction (if more than one)
______________________________________
There are three types of Solution Data Structures: Restriction Set, Restricted Solutions and Internal Solution Data Structures. Each graph has a solution cache within memory 50 that stores the graph's three solution data structures. Each Restriction Set data structure represents a set of clauses for which solutions are sought and includes the following information:
______________________________________
Restriction Set ›
restriction set: a list of clauses given to Get Edge Solutions for
solution
solutions: a list of pointers to restricted solution data structures
for the restriction set
______________________________________
Each restricted solution data structure represents solutions to a restricted set and includes the following information:
______________________________________
Restricted Solution ›
clauses: the group of clauses that make up a solution. For
example, if the restriction set is a:1, b:0, and (a:0 & b:0), then a
solution
might b:0 and (a:0 & b:0). This should be a subset of the restriction
set. If
there is a clause in the restriction set that isn't in this field, then
it's value
is presumed to be FALSE.
map: list of pointers to Internal Solution data structures that
evaluate to the solutions in the clauses field. For any particular
restriction
set, all of the internal solutions must be a member of the map of
exactly
one restricted solution. Each internal solution appears once in each
restriction set.
______________________________________
Each Internal Solution data structure represents internal solutions to a restricted set and includes the following information:
______________________________________
Internal Solution ›
graph - a pointer to the graph that this internal solution was
extracted from
choices - a list of local choices of local disjunctions
partial - a solution for the partial edge; i.e., left daughter
complete - a solution for the complete edge
______________________________________
C. The Main Routine 100 FIG. 8 illustrates in flow diagram form the instructions of Main 100. Briefly described, Main 100 builds a chart for a natural language string, processes the edge constraints of the chart and builds graph data structures for the chart in a bottom-up fashion. After finding the graph feature structure for the root-spanning edge, processor 48 finds solutions for that edge. Upon receipt of a natural language string in machine readable form, processor 48 begins execution of instructions 100 with step 120. During step 120 processor 48 builds a context free parse forest, a chart, for that natural language string. Standard techniques, known to those of ordinary skill in the art, are used to build the chart. The chart constructed, processor 48 exits step 120 and advances to step 122. Processor 48 determines during step 122 whether the chart defines a string S that spans the entire natural language string. If the chart does not, then it has no solution and processor 48 branches to step 124. On the other hand, if the chart defines an S that spans the entire natural language string, then the chart may have a solution. In response, processor 48 advances to step 126 from step 122. During step 126 processor 48 adds to the chart the lexical and grammatical constraints associated with the grammar being used. Standard methods of decorating the chart are used. After decorating the chart, processor 48 exits step 126 and advances to step 128. Processor 48 finds solutions for the root spanning edge of the chart during step 128 by recursively Process Edge Constraints 102 and building graph data structures for the chart. These recursive calls cause processor 48 to walk down the chart until a leaf is reached, at which point processor 48 begins building graph data structures and walking back up the chart. Process Edge Constraints 102 will be described in detail later with respect to FIG. 9. Having generated graph data structures for the chart, processor 48 advances to step 130 from step 128 and begins the process of finding a solution to the chart. Processor 48 does so by calling Get Edge Solutions 104. Processor 48 finds solutions to the root spanning edge by walking down the chart, passing down opaque contexts of interest, until a leaf is reached. At that point, processor 48 begins determining local solutions to opaque contexts that have been imported and passing those solutions back up the chart. This continues until solutions have been found for the root spanning edge of the chart. Processing of edge solutions by instructions 104 will occur in cubic time for context-free parts of the grammar. Context-freeness causes local nogoods to factor nicely. Thus, even though solution computation time is an exponential in the number of opaque variables, K, experience has shown the number of solutions actually produced tends to be small. After executing instructions 104, processor 48 exits step 130 and branches to step 124, processing of the natural language string complete. D. The Process Edge Constraints Routine Process Edge Constraints 102, illustrated in FIG. 9, enables processor 48 to create a graph data structure for an edge given a pointer to the relevant edge. Briefly described, as each graph data structure represents all of the alternative ways that the edge can be constructed, instructions 102 begin by creating a graph data structure for each subtree associated with the edge in steps 150-188. To generate a graph data structure for an edge, graph data structures are first generated for each daughter of the subtree in steps 154-164. Using these, graph data structures are created for each subtree in steps 166-186 and with those a graph data structure is finally created for the edge during steps 190-212. In response to receipt of a pointer to a selected edge processor 48 begins execution of instructions 102 with step 140. During step 140 processor 48 determines whether a graph data structure need be constructed by examining the pointer just received. A null pointer indicates that the selected edge does not exist, perhaps because it is from a subtree that has no partial edge. In response to a null edge pointer, processor 48 branches to step 142 from step 140. During step 142 processor 48 indicates that the selected edge is TRUE; i.e., can be combined with any other edge, by setting to a value of 0 the pointer stored in the graph field of the selected edge data structure. Afterward, processor 48 returns to the calling routine. If the edge pointer is not null, processor 48 branches to step 150 from step 140. With step 150, processor 48 turns its attention to building a graph data structure for the selected edge. To do so, processor 48 must first generate graph data structures for each subtree associated with the selected edge. Thus, during step 150 processor 48 determines whether there are any subtrees for which a graph data structure should be generated. So long as there are, processor 48 advances to step 152 from step 150. Processor 48 selects one of the remaining subtrees as the selected subtree during step 152 and then advances to step 154. Processor 48 will generate a graph data structure for the selected subtree by first creating graph data structure for both the left daughter and the right daughter. Thus, during step 154 processor 48 creates a graph data structure for the left daughter of the selected subtree by a recursive call to Process Edge Constraints 102 and indicating the left daughter as the selected edge. Having generated a graph data structure for the left daughter of the selected subtree, processor 48 exits step 154 and advances to step 156. During step 156 processor 48 determines whether the graph for the left daughter is nogood. Processor 48 makes this determination by examining the nogood field of the graph data structure for the left daughter or if the pointer to the graph is the NOGOOD value; i.e., 1. If the graph is nogood, then the graph for the selected subtree will also be nogood. In this case, processor 48 advances to step 160 and generates a nogood graph data structure for the selected subtree. That done, processor 48 returns to step 150. On the other hand, if the graph for the left daughter is not nogood, then processor 48 exits step 156 and advances to step 162. Processor 48 turns its attention to generating a graph data structure for the right daughter of the selected subtree during step 162. Processor 48 performs this task by calling Process Edge Constraints 102 and indicating that left daughter is the selected edge. Afterward, processor 48 determines whether or not the graph data structure is nogood during step 164. If so, processor 48 returns to step 150. If not, processor 48 exits step 164 and advances to step 166. Having generated graph data structures for both the left and right daughters of the selected subtree, processor 48 begins the process of constructing a graph data structure for the selected subtree during step 166. In this effort, the first task is to convert any grammatical constraints associated with the selected subtree into a graph data structure. Next, during step 168 processor 48 stores a pointer to the graph data structure just created in the graph field of the subtree data structure of the selected subtree. Processor 48 then advances to step 180. During step 180 processor 48 begins incorporating constraints into the selected subtree that are imposed by its left daughter. Processor 48 does so by calling Copy AVPair 106, which copies one AVPair at a time and will be described in detail later. Afterward, during step 182, processor 48 determines whether this causes the graph data structure of the selected subtree to be nogood. If so, processor 48 returns to step 150. If not, processor 48 exits step 182 and advances to step 184. Processor 48 incorporates into the selected subtree constraints imposed by its right daughter during step 184. Processor 48 does so by again calling Copy AVPair 106. Processor 48 then determines whether these constraints make the selected subtree's graph nogood during step 186. If so, processor 48 returns to step 160. If the graph is not nogood, processor 48 returns to step 150 from step 186. Processor 48 branches through steps 150-186 so long as there are graph data structures to be generated for subtrees associated with the selected edge. Once all subtree graph data structures have been generated, processor 48 exits step 150 and branches to step 188 to begin the process of creating a single graph data structure for the selected edge that represents each of its not nogood subtrees. Processor 48 first counts the number of subtrees associated with the selected edge that have not nogood graphs during step 188. Next, during step 190, processor 48 determines whether the number of subtrees with not nogood graphs is greater than or equal to two. If that number is less than two, processor 48 determines whether there is at least one subtree with a not nogood graph during step 192. If there is one subtree with a not nogood graph, processor 48 advances to step 193 and stores a pointer to the graph data structure for the good subtree in the graph field of the edge data structure of the selected edge. On the other hand, if there is not at least one subtree with a good graph, then processor 48 exits step 192 and advances to step 194. Processor 48 then uses the graph field of the selected edge data structure to indicate that the edge is nogood by storing a pointer value of 1. Having completed construction of a graph data structure for the selected edge during step 194 or 193, processor 48 advances to step 144. Processor 48 advances to step 196 when the number of subtrees with not nogood graphs is greater than, or equals, two. In this case, the graph data structure of the selected edge will be an OR type because it will represent multiple, alternative subtrees. Thus, during step 196 processor 48 marks the graph as OR type by appropriately setting the disjunctive field of the selected edge data structure. That done, processor 48 exits step 196 and advances to step 198. During step 198 processor 48 builds disjunction data structures for the OR clause just generated in step 196. Processor 48 sets the count field of the disjunction data structure to the number of good graphs counted during step 188. Processor 48 then completes construction of the disjunction data structure, which produces contexted variables to represent each of the good subtrees associated with the selected edge. Afterward, processor 48 advances to step 200. With step 200 processor 48 begins incorporating information from the subtrees into the graph data structure for the selected edge. Processor 48 branches through the loop of steps 200, 210 and 212 until the information from all of the good subtrees has been incorporated into the selected edge data structure. During step 212 processor 48 selects one of the subtrees with a not nogood graph and finds the context associated with that subtree by the disjunction data structure. Processor 48 then copies up into the selected edge data structure information from the selected subtree by calling Copy AVPairs 106 and indicating the selected clause. Afterward, processor 48 returns to step 200. After copying information from all of the subtrees with good graphs into the graph data structure for the selected edge, processor 48 returns to step 144. E. The Copy AVPair Routine Copy AVPair 106, illustrated in FIG. 10, enables processor 48 to copy information from a source AVPair data structure into a recipient AVPair data structure. Instructions 106 are used to copy information from daughters into their mother, as well as to copy information from subtrees into their associated edge. Briefly described, instructions 106 attempt to avoid copying information by adding to the recipient AVPair contexted lazy copy links pointing back to the source AVpair. If the relevant information is already represented by a contexted copy link, then processor 48 copies the first level of information from the source into the recipient AVPair via calls to Expand Lazy Links 108 and Copy Facts 110. Processor 48 begins execution of instructions 106 in response to a pointer to the source AVPair, the recipient AVPair, and a selected clause. During step 230 processor 48 determines whether there is already a contexted copy link between the two AVPairs for the selected clause. Processor 48 makes this determination by examining the copies fields of both the source and recipient AVPairs. If a copy link with the selected clause is found in either the source or recipient, then nothing further need be done and processor 48 responds by branching to step 244. On the other hand, if no contexted copy link with the selected clause exists between the source and recipient AVPairs, processor 48 exits step 230 and branches to step 232. During step 232 processor 48 begins efforts to determine whether it can represent the source AVPair in the recipient AVPair via a contexted lazy copy link. That depends, in part, upon whether the other lazy copy links of the recipient have already been expanded. Processor 48 determines whether that is the case by examining the expanded bit of the recipient AVPair. If that bit indicates that lazy copy links have not been expanded, processor 48 may be able to represent the source AVPair using a contexted lazy copy link if that link is the recipient AVPair's only lazy copy link in an overlapping context. Processor 48 determines during step 234 whether the recipient AVPair already includes other contexted lazy copy links within the context given as an argument to Copy AVPair by enumerating the lazy copy links that are in the AVPair's copies field and for each lazy copy link conjoining its context with the selected context. If all of the conjunctions are nogood, then processor 48 goes to step 236. On the other hand, if any of conjoined contexts are not nogood, then processor 48 advances to step 238. Processor 48 advances to step 238 from step 234 when all lazy copy links associated with the recipient AVPair have yet to be expanded. Processor 48 expands those lazy copy links during step 238 by calling Expand Lazy Links 108. Afterward, processor 48 proceeds to step 240. Processor 48 reaches step 240 when the source AVPair cannot be represented in the recipient AVPair by a lazy copy link. To indicate in the source AVPair that it has been copied into the recipient, processor 48 adds to the copies field of the source AVPair a forward copy link pointing to recipient. Then, in step 242 processor 248 copies the constraints of the source AVPair into the recipient AVPair by calling Copy Facts 110. That done, processor 48 exits step 242 and advances to step 244. F. The Expand Lazy Links Routine Expand Lazy Links 108, illustrated in FIG. 11, enables processor 48 to replace a contexted lazy copy link with one level of greater detail and, if necessary, other contexted lazy copy links. After adding forward copy links in the source to record the expansion of a lazy copy link, processor 48 copies the relevant information by calling Copy Facts 110. Processor 48 begins execution of instructions 108 with step 260 in response to receipt of a pointer to the selected AVPair, whose lazy copy links are to be expanded. During step 260 processor 48 determines whether the contexted lazy copy links of the selected AVPair have already been expanded by examining the expanded bit. If that bit indicates that contexted lazy copy links associated with the selected AVPair have already been expanded, processor 48 advances to step 276. On the other hand, if the contexted lazy copy links have not yet been expanded, processor 48 branches to step 262 from step 260. Processor 48 begins expansion of the contexted lazy copy links of the selected AVPair by setting the value of expanded field to indicate expansion. Afterward, processor 48 advances to step 264 to begin expanding the contexted lazy copy links, one contexted lazy copy link at a time. So long as any contexted lazy copy link remains to be expanded, processor 48 advances from step 264 to step 266. During step 266 processor selects for expansion one of the remaining contexted lazy copy links in the copies field. Processor 48 then advances to step 270. Frequently, the target AVPair pointed to by the selected contexted lazy copy link is represented by lazy copy links, which will also require expansion. In anticipation of such a situation, during step 270 processor 48 calls Expand Lazy Links 108 to expand the lazy copy links pointed to by the selected contexted lazy copy link. During step 272 processor 48 adds a forward copy link pointing from the target AVPair to the selected AVPair. Processor 48 then advances to step 274 from step 272. Having expanded the target AVpair, processor 48 can now expand the selected lazy copy link by copying one level of information from the target AVpair into the selected AVpair. Processor 48 does so by calling Copy Facts 110. That done, processor 48 returns to step 264 and branches through steps 266, 270, 272, 274, and 264 until every contexted lazy copy link associated with the selected AVPair has been expanded. G. The Copy Facts Routine FIG. 12 illustrates in flow diagram form the instructions of Copy Facts 110. Briefly, described instructions 110 enable processor 48 to copy information from a source attribute-value pair into a recipient attribute value pair. Each attribute value pair may include a number of attributes, as well as a number of contexted values, to which the attribute value pair is equal. Instructions 110 use analogous means to copy both attributes during steps 292-302 and contexted values during steps 204-312; consequently, only steps 292-302 will be described in detail herein. Processor 48 begins execution of instructions 110 with step 290 in response to receipt of pointers to the source AVPair and the recipient AVPair, as well as a selected clause, associated with the recipient. Processor 48 begins in step 290 by determining whether any facts need be copied. No facts need be copied if they are associated with a nogood clause. Processor 48 thus examines the nogood field of the clause and if the selected clause is nogood, processor 48 exits step 290 and proceeds to step 292. With step 292 processor 48 begins efforts to copy the attributes associated with the source AVPair into the recipient AVPair. If any attributes remain to be copied, processor 48 selects one of the remaining attributes during step 292. Next, during step 294 processor 48 conjoins the selected clause with the clause associated with the selected attribute. Processor 48 does so by calling Conjoin Clauses 112, which returns the resulting clause. If the resulting clause is not nogood, processor 48 branches from step 298 to step 300. During step 300, if there isn't already an AVPair data structure for the recipient, processor 48 creates one that points back to the recipient AVPair and adds a pointer to this new AVPair in the attrs field of the recipient AVPair. Having created a data structure into which information can be copied, processor 48 advances to step 302 and calls Copy AVPair. Copy AVPair may or may not copy information from the source AVPair, depending upon whether there is an interaction between contexted lazy copy links. Afterward, processor 48 exits step 302 and returns to step 292. Processor 48 branches through the loop of steps 292-302 until Copy AVPair has been called for every relevant attribute of the source AVPair. When that occurs, processor 48 exits step 292 and advances to step 304 to begin copying values of the source AVPair into the recipient AVPair in substantially the same fashion as attributes were copied. After copying all contexted values associated with the source AVPair, processor 48 advances to step 320. During step 320 processor 48 examines the new constraints imposed upon the recipient AVPair and deduces new local nogoods, if possible. That done processor 48 returns to step 322. H. The Conjoin Clauses Routine Instructions 60 conjoin two clauses together to produce a new clause, which is cached in the clause cache of memory 50. This is done in a standard fashion for the most part. If the clauses are disjunctive, then each pair of disjuncts is conjoined and the conjoined disjuncts are then disjoined. If the two clauses are conjunctive then all pairs of conjuncts are conjoined and checked to see if the resulting conjoined conjunct is nogood. If none of the conjoined conjuncts are nogood then they are merged together, while eliminating redundant conjuncts. Two differences do exist between how instructions 60 conjoin clauses and standard approaches. First, before beginning to conjoin two clauses, processor 48 searches the clause cache for an entry involving the same two clauses. Processor 48 starts such a search by examining the cache field of the clause data structure with the higher id to search for the desired operation and operand. If such an entry is found, then the previously stored result can be used without performing the conjoin, saving processing time. The second difference between standard approach to conjoining clauses is the use of opaque clauses. The flow diagram of FIG. 13 illustrates those portions of Conjoin Clauses 112 that handle opaque clauses to reduce unification processing time. Briefly described, when two opaque clauses are imported from the same graph, the two opaque clauses are unwrapped, conjoined together to produce a new clause, the new clause is wrapped to produce a new opaque clause, which is then imported. Processor 48 begins execution of instructions 112 with step 360. Processor 48 is able to reduce the number of propositional variables imported between graphs when the two opaque clauses are associated with the same graph data structure. Processor 48 checks this possibility during step 360 by examining the graph fields of the two clause data structures to be conjoined, clause 1 and clause 2. If both clauses are associated with the same graph processor 48 exits step 360 and branches to step 362. Because of the possibility that the conjoining of the two opaque clauses may result in a simpler clause, like TRUE or nogood, processor 48 "unwraps" both clause 1 and clause 2 during step 362. Processor 48 unwraps the opaque clauses by retrieving the imported field of each opaque clause. Afterward, processor 48 conjoins the two unwrapped clauses during step 364 by calling Conjoin Clauses 112. Processor 48 examines the resulting clause during step 366 to determine whether it is nogood. If so, processor 48 advances to step 368 and returns a pointer indicating that the resulting clause is nogood. On the other hand, if the resulting clause is not nogood, processor 48 branches to step 370 from step 366. Processor 48 `wraps` the resulting clause and then imports the new opaque clause during step 370 by calling Import Clause 116. Processor 48 then advances to step 372 from step 370. To prevent expending further processing time conjoining clause 1 and clause 2 again, during step 372 processor 48 caches in clause cache of memory 50 the result of conjoining clause 1 and clause 2. Preferably, clauses within the clause cache are indexed according to the clause with the highest id and the information stored is the triple: operator, operand, and resulting clause. In this case, the triple would be: conjoin, clause 2, resulting clause. Afterward, processor 48 stores a pointer to this entry in the clause cache in the cache field of clause data structure for the clause with the higher id. Processor 48 performs step 372 whenever two clauses are conjoined, or disjoined, regardless of whether they are opaque. Caching the results of all operations with clauses helps reduce the processing time to unify graph data structures. I. The Import Clause Routine FIG. 14 illustrates in flow diagram form instructions 116 for importing a clause into a graph. In doing so, processor 48 creates a new opaque clause data structure to "wrap" the clause being imported. This ensures that the graph generated for the root spanning edge will be context-free equivalent where possible by replacing multiple propositional variables with a single propositional variable. Processor 48 begins execution of instructions 116 by determining during step 400 whether the selected clause, the one to be imported, has already been imported to the selected graph. Processor 48 does so by searching the clause cache for an entry with whose resulting clause is equal to the selected clause. If processor 48 finds such an entry, then the selected clause has already been exported to the selected graph, eliminating the need to do so. In response, processor 48 branches to step 402 from step 400, returning the imported clause. On the other hand, if the clause cache does not indicate that the selected clause has already been exported, processor 48 branches to step 404 from step 400. During step 404 processor 48 creates a new opaque clause data structure, storing the selected clause in the imported field. Subsequently, during step 406 processor 48 records that the opaque variable has been exported in the imported field and indicates which graph it has been exported to by storing a pointer to the selected graph in the graph field. Finally, during step 408, processor 48 caches the result of this operation in the clause cache of memory 50 by storing the triple: selected clause, opaque, new opaque clause. Having imported the new opaque clause, processor 11 returns to step 402 from step 408. J. The Get Edge Solutions Routine FIG. 15 illustrates in flow diagram form instructions 104 for obtaining solutions for a selected edge given the clauses for which solutions are sought, called a restriction set because solutions may not be sought for all clauses associated with the selected edge. A pointer to the selected edge with which the restriction set is associated is also passed in to instructions 104. Briefly described, Get Edge Solutions 104 generates solutions for the selected edge by finding which combinations of clauses of the restriction set have solutions. Processor 48 identifies possible internal solutions a subtree at a time by first finding solutions to the left and right daughters of the selected subtree. Processor 48 then takes the cross product of the solutions for the left and right daughter. Processor 48 evaluates the validity of the possible internal solutions using the local nogoods. Given internal solutions that are not invalid, processor 48 then determines which of them, when TRUE, cause clauses of the selected restriction set to evaluate to TRUE. Those clauses of the restriction set that evaluate to TRUE are then made into a restriction solution. Execution of instructions 104 begin with step 450, during which processor 48 begins searching for obvious solutions for the selected edge. There are three. First, processor 48 examines the pointer to the selected edge. If the pointer is null, this means that the selected edge can be successfully combined with any other edge. In response to such a discovery, processor 48 advances to step 452 and indicates that the solution for the selected edge is TRUE. Processor 48 then returns to step 454. On the other hand, if the pointer to the selected edge is not null, processor 48 branches to step 460 to examine another obvious solution for the selected edge. During step 460 processor 48 examines the nogood field of the graph of the selected edge data structure to determine if the selected edge is nogood. If it is, during step 462 processor 48 sets the solution for the selected edge to null and returns to the step 454. When the selected edge is not categorized as nogood, processor 48 exits step 460 to search for the last obvious solution. During step 464 processor 48 searches the solution cache of the graph to see if this restriction set has already been solved. If it has, then processor 48 returns a pointer to the solutions, if any, during step 466. When the effort to find to an obvious and easy solution fails, processor 48 exits step 464 and advances to step 468. Processor 48 creates a Restriction Solution data structure for the selected clauses in the solution cache, with all fields set to null. That done, processor 48 advances to step 470 to begin searching for solutions for the selected restricted set, a subtree at a time. During step 472 processor 48 selects one of the subtrees still requiring solution. Next, processor 48 next determines whether the selected subtree is nogood by examining the nogood field of the graph data structure for the selected subtree. If the selected subtree is nogood, processor 48 turns its attention to other subtrees by returning to step 470. On the other hand, if the graph for the selected subtree is not nogood, then processor 48 advances to step 476 from step 474. Finding solutions for the selected subtree requires first finding solutions for the selected subtree's left and right daughters. This effort occurs during steps 476-486. First, processor 48 determines which of the subtree graph's clauses are imported from its left daughter, defining a new restricted set. Processor 48 uses this information during step 478 to find solutions for the left daughter by calling Get Edge Solutions 104. If there are no solutions to the left daughter, there can be no solution to the selected subtree. Processor 48 responds to this situation by branching to step 470 from step 480, turning attention to another subtree. On the other hand, if the left daughter has a solution, then the selected subtree may have a solution. In response, processor 48 branches to step 484 from step 480 to identify solutions to the right daughter of the selected subtree. Processor 48 begins by identifying which of the clauses of the subtree graph have been imported from the right daughter. This defines a new restricted set, which processor 48 uses when calling Get Edge Solutions 104 during step 484. Processor 48 responds to the discovery that the right daughter has no solutions by returning to step 470. If, on the other hand, the right daughter has a solution, processor 48 branches to step 490 from step 486. Given solutions to the left and right daughters of the selected edge, with step 490 processor 48 begins efforts to identify solutions for the selected edge. These solutions are represented by a Restriction Solution data structure. During step 490 processor 48 creates local solutions based on the disjunctions that were introduced when the local constraints were instantiated. Processor 48 then takes the cross product of the local solutions and solutions to the left and right daughter to produce a number of candidate internal solutions for the selected edge. Processor 48 will examine these candidate internal solutions one at a time during subsequent steps. Processor 48 selects one of the candidate internal solutions for evaluation and evaluates it using local nogood clauses to determine the validity of the selected candidate internal solution clause. If processor 48 determines during step 506 that the selected candidate internal solution is not valid, processor 48 returns to step 500 to begin evaluating another candidate internal solution. On the other hand, if the selected candidate solution is valid, processor 48 advances to step 508. During that step each of the clauses in the selected candidate internal solution are assumed to be TRUE and the clauses of the selected restricted set are examined to determine which of them evaluate to TRUE. Processor 48 notes those clauses of the restricted set that evaluate to TRUE during step 510 and compares them to those listed in the clauses field of restricted solution data structures associated with the selected restricted set. If not yet included in the solutions to the restriction set, during step 514 processor 48 creates a new restrictions solution with the noted clause and adds it to the solutions for the restriction set. Finally, during step 516, processor 48 adds to the map field of the restriction solution data structure a pointer to the data structure for the selected candidate internal solution. Having completed evaluation of one candidate solution, processor 48 returns to step 500 to continue examination of candidate internal solutions as described with respect to steps 502-516. After examining of the candidate internal solutions for the selected subtree, processor 48 branches to step 470 from step 500. After processing all the subtrees associated with the selected edge, processor 48 returns to step 454 and returns any solutions to the restriction set. In the foregoing specification, the invention has been described with reference to specific exemplary embodiments thereof. It will, however, be evident that various modifications and changes may be made thereto without departing from the broader spirit and scope of the invention as set forth in the appended claims. Accordingly, the specification and drawings are to be regarded in an illustrative rather than a restrictive sense. Appendix A Definitions Attribute--An attribute is the name of a grammatical function such as subject, object, number, etc., which takes a value. Chart--A chart is a data structure for caching the constituents that have already been constructed by a parser. Clause--A clause is a combination of contexts. More technically, a clause is a boolean combination of propositional variables. Constituent--A group of words within a string, which may or may not be contiguous within the string. According to this definition, a string may include constituents regardless of whether the grammar being used defines constituents explicitly or implicitly. Context Free--A grammar is context free when two constituents can be combined with each other without regard to the internal structure of the constituents being combined. Contexted Copy Link--A copy link, lazy or forward, that has a context other than TRUE. Contexted Unification--A known method of efficiently unifying disjunctive feature graphs together. It allows disjunctions to be embedded with a feature graph, and produces a database of "nogoods," which represent inconsistent combinations of disjuncts. Contexted unification allows multiple analyses to be represented using a single feature graph. Contexted unification is also referred to as "disjunctive unification." Disjunction--A complex sentence in logic that is true when either one or both of its disjuncts are true. Expanding--Expanding is the act of increasing the level of specification of a feature structure so that the resulting feature structure includes information previously omitted and indicated via a lazy link. Expansion may not remove all lazy links from a feature structure because levels below that expanded to are represented via lazy links. Fact--An attribute or value. Feature Structure--A representation of a surface grammatical function of a part of a string. This information is represented as a set of ordered pairs, each of which consists of an attribute and a specification of that attribute's value. Feature structures are described notationally by arranging the set of ordered pairs vertically inside square brackets with each attribute value pair placed on a single horizontal line. Forward Copy Link--A copy link that points forward to the recipient of information from the source of information. Graph Data Structure--a data structure that represents a feature structure, solutions to the nogood database, and related material. Interact--A lazy link is said to interact, or to be activated, when it is unified with another lazy link or feature in an overlapping context. Lazy Copying--A standard technique for unifying feature structures together that only copies those parts of feature structures that interact. Lazy Copy Link--A copy link that points backward to the source of information from the recipient of the information. Non-finite Feature Structure--A feature structure that has feature structures as the values of its features. Opaque Contexts--An opaque context is a context whose complexity is not visible to the graph which imported it, because the context is represented by a propositional variable. Opaque contexts allow nogood databases to be solved as a collection of little databases. Parsing--A process of analyzing a sentence or string to determine its underlying meaning. Underspecified--A feature structure is said to be underspecified when it does not include all information necessary to describe its features, instead using lazy links to other feature structures to indicate the omitted information. Unification--Unification is a combining operation on two disjunctive feature structures that produces a third, conjunctive feature structure. In general the conjunctive feature structure represents more information than either of the feature structures it unifies. Unspecified--A feature structure is said to be unspecified when it includes only a lazy link to another feature structure or lazy links to one or more other feature structures.
|
Same subclass Same class Consider this
|
||||||||||||||||
