Database system providing methodology for property enforcement6801905Abstract A database system providing methods for eager and opportunistic property enforcement is described. Plan fragments are generated for obtaining data requested by a query. Plan fragments are grouped together in classes based upon tables of the database covered by each plan fragment. For each class, a particular plan fragment having the lowest execution costs for obtaining the data requested by the query is determined. If grouping is not required at a given class, an operator enforcing ordering is added to this particular sub-plan. However, if grouping is required at the given class, an operator enforcing both grouping and ordering is added to this sub-plan. Claims What is claimed is: Description COPYRIGHT NOTICE
Tab Index Key
c ic1 c(c1)
ic2 c(c2)
ic12 c(c1, c2)
o io1 o(o1)
io2134 o(o2, o1, o3, o4)
io3 o(o3)
1 il2 l(l2)
il4 l(l4)
il234 l(l2, l3, l4)
The availability of all useful indices increases both the optimizer's chance to find a cheap access plan (i.e., one with low execution costs) and the complexity of its search space. The exemplary indices above include "covering" indices (i.e., indices that contain all columns needed by the query) to avoid data page versus index page access considerations. 2. Basic Optimization of Query Execution Optimization is the process of finding the best query execution plan (QEP) that has the same semantics (i.e., result) as the SQL query. This can be formalized by the search of the best physical relational expression that is equivalent to the canonical logical relational expression of the SQL query. To achieve this result, the optimizer inspects a search space by transforming the initial expression. This transformation typically includes reordering expressions using algebraic rules (associativity, commutativity, etc.), replacing the logical operators with the physical operators that implement them, and enforcing the properties needed by some algorithms. The optimizer thus gradually builds alternative total plans (i.e., plans that have the same semantics as the original SQL query and that constitute candidates to become the final QEP) and estimates the cost of each alternative using a cost model. The optimizer keeps the partial plans (referred to as plan fragments or sub-plans) in a "plan cache." The optimizer selects the total plan that is determined to be the "cheapest" one to execute and that has the properties needed by the query. For purposes of this discussion, "cheapest" means the plan with the lowest execution cost based upon the cost model. When the optimizer generates sub-plans, only the ones that are likely to be part of the best total plan are retained and the other sub-plans are eliminated or "pruned." There are a number of seminal papers that describe general optimization frameworks. These optimization frameworks include: System R (see e.g., P. G. Selinger et al, "Access Path Selection in a Relational Database Management System," SIGMOD Conference, 1979), Starburst (see e.g., G. M. Lohman, "Grammar-like Functional Rules for Representing Query Optimization Alternatives," SIGMOD Conference, 1988), EXODUS (see e.g., G. Graefe, D. J. DeWitt, "The EXODUS Optimizer Generator," SIGMOD Conference, 1987), Volcano (see e.g., Goetz Graefe, William J. McKenna, "The Volcano Optimizer Generator, Extensibility and Efficient Search," ICDE, 1993), Cascades (see e.g., G. Graefe, "The Cascades Framework for Query Optimization," Data Engineering Bulletin 18(3), 1995), and Columbia (see e.g., Y. Xu, "Efficiency in the Columbia Database Query Optimizer," M. S. Thesis, Portland State University, 1998). Ordering handling is a central topic of a few prior papers, see e.g., J. Claussen et al, "Exploiting early sorting and early partitioning for decision support query processing," VLDB Journal 9(3), 2000 and D. E. Simmen, E. J. Shekita, T. Malkemus, "Fundamental Techniques for Order Optimization," SIGMOD Conference, 1996. Ordering handling is, however, mentioned in other papers that describe general optimizer frameworks given below. For example, papers regarding the "Cascades" and "Columbia" optimizers have some relevant sections on property enforcement, see e.g., K. Billings, "A TPC-D Model for Database Query Optimization in Cascades," M. S. Thesis, Portland State University, 1997; L. D. Shapiro et al, "Exploiting Upper and Lower Bounds In Top-Down Query Optimization," IDEAS, 2001; and the above-mentioned "Efficiency in the Columbia Database Query Optimizer"). Both the "System R" and the "Starburst" bottom-up optimizers handle available orderings. The plans having "interesting orderings" are preserved. Papers regarding the System R optimizer introduce the "interesting orderings" term, and discuss handling a set of interesting orders for equi-joins and for GROUP BY and ORDER BY clauses. See e.g., the abovementioned "Fundamental Techniques for Order Optimization." However, from these papers it appears that both the System R and Starburst systems utilize "lazy, on-demand sorting." This lazy sorting method involves introducing a sorting operator for each operator that has an ordering requirement on a child. This sorting operator is introduced over the cheapest access plan implementing that child. The cost of this construct is compared with the cost of the best existing plan for the child having the required ordering available and the cheaper of the two is selected. However, when several joins are attempted over the same child, a number of sort operators may be generated. In the rule-based top-down Cascades and Columbia optimizers, ordering is handled by requesting a sub-plan with the needed property. The invocation of an "enforcer rule" produces, as an alternative, a sorting operation over the cheapest unordered sub-plan. In the Cascades optimizer, the enforcer rule is explicitly called when an ordering property is needed. As this enforcer rule is invoked several times for a group, it typically generates duplicates. See e.g., the abovementioned "Efficiency in the Columbia Database Query Optimizer." The Columbia optimizer includes an enhancement (compared to Cascades) in the handling of the enforcer rule. See e.g., the abovementioned "Efficiency in the Columbia Database Query Optimizer." In the Columbia optimizer, the enforcer is generated on demand, only once. It is considered to have the same cost for any key, and is left without a sort key definition. Its parents (i.e., its parent nodes in the access plan) will decide what it sorts on. Several other papers have covered the optimization of aggregation, under different names: early grouping (see the abovementioned "Data reduction through early grouping"), generalized projection (see e.g., A. Gupta, V. Harinarayan, D. Quass, "Aggregate-Query Processing in Data Warehousing Environments," VLDB, 1995), eager and lazy aggregation (see e.g., W. P. Yan, P. Larson, "Eager Aggregation and Lazy Aggregation," VLDB, 1995), generalized selection (see e.g., P. Goel, B. R. Iyer, "SQL Query Optimization: Reordering for a General Class of Queries," SIGMOD Conference, 1996) and aggregation moving or reordering (see e.g., S. Chaudhuri, K. Shim, "Optimizing Queries with Aggregate Views," EDBT, 1996 and the abovementioned "Orthogonal Optimization of Subqueries and Aggregation"). The search space traversal complexity is one of the main measurable quality criteria of an optimizer, relevant both to the handling of ordering and of aggregation. See e.g., K. Ono, G. M. Lohman, "Measuring the Complexity of Join Enumeration in Query Optimization," VLDB, 1990; Y. E. Ioannidis, Y. Cha Kang, "Left-Deep vs. Bushy Trees: An Analysis of Strategy Spaces and its Implications for Query Optimization," SIGMOD Conference, 1991; R. S. G. Lanzelotte, P. Valduriez, M. Zait, "On the Effectiveness of Optimization Search Strategies for Parallel Execution Spaces," VLDB, 1993; B. Vance, D. Maier, "Rapid Bushy Join-order Optimization with Cartesian Products," SIGMOD Conference, 1996; and A. Pellenkoft, C. A. Galindo-Legaria, M. L. Kersten, "The Complexity of Transformation-Based Join Enumeration," VLDB, 1997, 306-31. 3. The Plan Cache One of several published plan cache implementations may be used for implementation of the present invention. Available plan cache implementations include the System R dynamic programming "search tree" (see e.g., the abovementioned "Access Path Selection in a Relational Database Management System") and the Cascades "MEMO" structure (see e.g., the abovementioned "The Cascades Framework for Query Optimization"). The optimizer of the currently preferred embodiment, which is implemented in a Sybase.RTM. Adaptive Server.RTM. Enterprise (ASE) database system, uses a structure similar to the Cascades MEMO optimizer by grouping together sub-plans (or plan fragments) covering the same base tables in plan "equivalence classes." The main characteristics of an equivalence class ("Eqc") are that an equivalence class is identified by its set of underlying tables and that its sub-plans compete with each other to be part of the best total plan. The plan cache contains the set of created equivalence classes (Eqcs). FIG. 4 is an illustration of an exemplary plan cache state at an intermediate point during the optimization of a sample query. As shown, FIG. 4 illustrates the plan cache state during the optimization of the query described above in Example 1. This Example 1 query illustrated at FIG. 4 does not contain any sum ( ), GROUP BY, or ORDER BY clauses. The equivalence class (Eqc) of the currently preferred embodiment is similar to a MEMO equivalence class or group. Among the differences are that plans in the same equivalence class (Eqc) do not always return the same relational result; some can have grouping enforced, whereas others do not have grouping enforced. Also, the members of an Eqc are not multi-expressions but rather are the root physical node of the Eqc's sub-plans. As described later in this document, logical operators and multi-expression formalism are not needed in a bottom-up search engine. However, the present invention can also be implemented using a classical MEMO structure, provided that the properties are organized accordingly. The plan cache of the currently preferred embodiment does not have a tree structure. Rather, the plan cache of the currently preferred embodiment utilizes a directed acyclic graph (DAG). The final query execution plan built by the optimizer of the currently preferred embodiment, conversely, is a tree. Nodes are tuple stream operators that implement the open/next/close protocol. Each node has a current status with respect to the input already consumed from its children and the output already returned to its parent. The leaf nodes (i.e., the scans) have a current position on the underlying stored table. Two parents nodes cannot share a child. As the parents advance at a different pace, each node of the shared sub-tree holds two status descriptions, one per parent. This is basically equivalent to having two copies of the shared sub-tree. Likewise, each sub-plan in the plan cache is a tree rooted at some equivalence class (Eqc). However, the entire plan cache is a directed acyclic graph (DAG). Several parents can, however, share a sub-plan, provided they do not have a common ancestor. As shown at FIG. 4, the sub-plans that scan tables c and o are shared by the joins of those tables. But a total plan will usually never contain more than a join operator for c and o. 4. Logical and Physical Properties The term "property" is used in the literature to describe a quantifiable characteristic of a relational expression, sub-expression, or operator (i.e., a tree, sub-tree, or node of a query execution plan). When the property depends only on the logical expression, it is a "logical property." A logical property can have the same value for a group of plans. For instance, the set of underlying tables is a logical property shared by all plans in an equivalence class (Eqc). For queries that have no aggregation, the set of available columns is an Eqc level logical property and so is the set of projected columns, the ones needed by any parent. Indeed, all underlying table columns are available as well as all derived columns that transitively depend only on the underlying tables and can be thus computed. As for the projected columns, they are the columns needed by the SELECT list and by the set of predicates still to be applied. Likewise, other equivalence class level logical properties are the set of all non-expensive predicates applied somewhere within the plan and the set of non-expensive predicates still to be applied by any parent. Indeed, non-expensive predicates are always pushed down to the deepest possible location; if all columns involved in a given predicate are available, the predicate is applied somewhere in each sub-plan; otherwise, the closest parent that has all columns available will apply the predicate. The set of all equi-joins yet to be applied by any parent is an important logical property that is used to compute the "interesting orderings." The same holds for the result of an operator and for the statistics that describe it: the cardinality of the result, the duplicates density and histogram of each column. Indeed, the result is fully determined by the underlying tables and the applied predicates, themselves determined by the underlying tables. This layout changes when nested subqueries and the inner child of nested loop joins are also considered. The outer attribute values are stable for the whole open/next/close inner side cycle. These outer attribute values are "pushed down" and made available within the whole inner child. Likewise, predicates depending only on the outer attribute values and on attributes of the inner child are pushed down. Hence the logical properties of plans in an equivalence class (Eqc) would depend on whether a given plan is inner to a nested algorithm or not, and this is given by the set of available columns. For queries involving aggregation, the available columns are not an equivalence class level logical property. Plans that have enforced the grouping will make available the aggregation results whereas other plans will not make aggregation results available. The logical property will be shared by the subset of plans of the Eqc with the same level of grouping enforcement. Likewise, the different classes of predicates described above will still be logical properties, but will be shared by fewer plans if they rely on aggregation results. FIG. 5 is a table illustrating exemplary equivalence class (Eqc) level logical properties computed by the optimizer of the present invention for the sample plan cache shown at FIG. 4. These equivalence class level logical properties are properties without attribute pushdown or aggregation derived attributes. The properties describing ordering and aggregation are discussed below. Not all properties of a plan depend only on the logical level relational expression describing it. The ordering of the data stream produced by the plan depends on the actual algorithms implementing the physical nodes. Some orderings are produced (e.g., by an index scan or a sort), some are preserved (e.g., the outer ordering of a join) and some are destroyed (e.g., the inner ordering of a join, with some exceptions). Such properties that are known only for a given physical implementation of a logical relational expression are called "physical properties." Physical properties belong to a given plan, whereas the logical properties belong to an equivalence class (Eqc). It should be noted that, with respect to the optimization process, the logical properties have an a priori nature and the physical properties an a posteriori nature. Indeed, given a query and a logical relational expression describing a fragment of the query, one can compute the logical properties upfront, before having inspected any possible physical implementation of the expression. However the physical properties can be computed only for a given sub-plan once the sub-plan has been built. 5. Required and Available Properties Properties are also used to express the precondition of an operator. Some operators need a property to hold onto their inputs. In this event only arguments that have the required property are legal. For instance, some relational operators, such as the merge join, need their inputs to be ordered. Others operators make an ordering available, either by creating it or by propagating the ordering of its children. The index scan and the sort create an ordering. The join propagates the ordering of its outer child. The sort node holds a special position with respect to the ordering property: its result has the same relational content as its argument (it is a pure physical operator, meaning that it does not implement a logical relational operator). The only role of a sort node is to create an ordering. This is also called "enforcing" an ordering, the sort is called an order "enforcer" and the ordering is called an "enforceable" physical property. To handle such physical properties (as orderings), the optimizer requires representations for the properties available at a node, the properties that a node needs from its children, and the interesting properties at an equivalence class (i.e., the ones that might be used by some parent of any plan in that Eqc). The optimizer uses these representations to build only semantically correct plans. Plans are semantically correct when the children's properties always satisfy the parent's needs. The optimizer also uses these representations to compute the property creation and propagation as well as for pruning. While pruning, the representations are needed to enable the optimizer to retain the cheapest plan that makes available an interesting property (e.g., an ordering) in cases where there is a cheaper plan that does not have an interesting property. 6. Enforceable Properties The ordering described above is an enforceable physical property. But the same model applies also to some logical properties. For instance, the vector aggregation, or grouping, can be modeled by a logical property in which the grouping algorithms are the aggregation enforcers. For any GROUP BY query there are algebraic transforms that define the grouping columns and aggregation functions at a given Eqc. A plan will have the aggregation done (i.e., the property available or satisfied), if its root is group node or if the aggregation need was satisfied in its children and propagated up to its root. The optimizer retains in an equivalence class the cheapest plan that has the aggregation need satisfied, even when a cheaper plan exists but has no aggregation. In other words, the optimizer keeps the cheapest plan that satisfies the aggregation requirement in situations when there is a cheaper plan that does not satisfy this requirement. In some cases the root of the full plan of a GROUP BY query has an aggregation need: it must have the grouping property satisfied. As group is a logical operator, the aggregation representation does not need physical properties: it is based on logical properties, as the set of grouping columns and aggregation functions is legal at an Eqc. In this respect it is an "enforceable logical property." Thus, the concept of an "enforceable property," a physical property generalization is introduced. An enforceable property inherits some of the characteristics of a physical property. An operator or algorithm that produces an enforceable property is called an "enforcer." In addition, the approach of the present invention is to prune (i.e., discard) a plan only if it is both more expensive and weaker for each interesting enforceable property. It should be recognized that not all logical properties are enforceable. For instance, one cannot enforce the availability of a base table attribute in an equivalence class (Eqc) that does not contain that base table. Likewise, the cost is a physical property that is not enforceable. The method of the present invention focuses on the handling of two enforceable properties. The first of these enforceable properties is the ordering physical property. The second is the grouping logical property. 7. Properties Compared to Rules Enforceable properties are closely related to relational expression transformation rules. In a transformation-based optimizer, the physical property enforcers are generated by the transformation rules. Likewise, eager/lazy aggregation is handled by transformation rules. The application of these rules implies the matching of a rule pattern against a representation of the relational expression to transform, resulting in the construction of the transformed expression. The sorting columns, grouping columns, aggregation functions, etc. of the transformed expression are repetitively computed as a side effect of the transformation. They are known a posteriori, once the rule is applied. When the optimization uses the "enforceable property" model, all the above scalars needed to fully construct the sort and group nodes are pre-calculated once in the equivalence class (Eqc) as logical properties. As hereinafter described, the need of a property, be it physical or logical, can be expressed with a logical property that can be computed for a given Eqc, before the generation of any plan of the Eqc. These logical properties are known a priori. This a priori nature of the enforceable properties is the cornerstone of their usage in optimization in the system of the present invention. As hereinafter described, the optimizer uses their availability before plan enumeration in guiding the search space traversal decisions as well as the construction of query execution plans and in abstract plan (AP) validation. Before discussing these enforceable properties and the methodology of the present invention in more detail, the operations of the optimizer search engine will be described. C. Optimizer Search Engine Taxonomy In the abovementioned "Exploiting Upper and Lower Bounds In Top-Down Query Optimization," as in other papers regarding top-down optimizers, it is claimed that an advantage of top-down optimizers over bottom-up ones (e.g., System R and Starburst) is the ability of top-down optimizers to quickly obtain a complete plan and thus prune further sub-plans against the cheapest total cost. However, it is the depth-first nature of an optimizer (e.g., Cascades) that supplies this advantage, rather than top-down vs. bottom-up distinction. Optimizers are classified in the literature from the viewpoint of the search space inspection as being top-down or bottom-up. This classification is not sufficient to understand the optimizer behavior. Instead, an optimizer must be classified as both depth-first vs. breadth-first as well as top-down vs. bottom-up in nature. These classifications will now be explained. Top-down optimizers start by enumerating the implementation alternatives of the top, root equivalence class (Eqc). They divide the initial optimization problem into sub-problems. Then the top-down optimizer proceeds to go down and recursively solve each sub-problem. Bottom-up optimizers, on the other hand, start by enumerating the implementation alternatives of the bottom, leaf equivalence classes (Eqcs). Thus, these bottom-up optimizers build minimal solution fragments for the optimization problem. Then, they proceed up and gradually build larger solution fragments based on the available smaller solution fragments. Breadth-first optimizers exhaustively enumerate all implementation alternatives of an Eqc before moving to the next one. Pure breadth-first optimizers move sideways, i.e., they enumerate all peer Eqcs before passing to a different level. Depth-first optimizers, in contrast, keep the current enumeration point for each active Eqc and move to another one before completing the enumeration. Pure depth-first optimizers move as soon as they have a new alternative. FIG. 6 illustrates the operational differences in different classes of optimizer search engines. Block 610 depicts the operations of a depth-first, bottom-up search engine. Block 610 includes equivalence classes 611, 612, 613, 614, 615. Each of the circles in these equivalence classes represents a node of a query execution plan. Arrow 618 represents the main (or primary) direction in which in the search engine will move. As shown at block 610, arrow 618 indicates the primary direction of this type of optimizer is upwards from the bottom (i.e., bottom-up). Arrow 619 represents the secondary direction, which is the direction in which the search engine tries to move when it exhausts a path in the main direction. As shown, the secondary direction of this depth-first, bottom-up optimizer is sideways to enumerate all peer implementation classes. Block 620 illustrates a breadth-first, bottom-up optimizer. As shown, block 620 includes equivalence classes 621, 622, 623, 624, 625. Arrow 628 represents the main (or primary) direction in which in the search engine will move. As shown, arrow 628 indicates the primary direction of this optimizer is sideways. Arrow 629 represents the secondary direction, which in this case is upwards from the bottom (i.e., bottom-up). Blocks 630 and 640 illustrate top-down optimizers. Block 630 shows a depth-first, top-down optimizer with equivalence classes 631, 632, 633, 634, 635, 636, 637. Block 640 depicts a breadth-first, top-down optimizer containing equivalence classes 641, 642, 643, 644, 645, 646, 647. Arrows 638 and 648 represent the main (or primary) direction in which in each of these search engines will move. As shown, Arrow 638 indicates the primary direction of a depth-first, top-down optimizer is downwards from the top. Arrow 639 illustrates that the secondary direction of this search engine (block 630) is sideways. As shown by arrow 648 at block 640, the main direction of movement of a breadth-first, top-down search engine is sideways. The secondary direction is down from the top as shown by arrow 649. The search engine directions have the following implications on costing, property handling and pruning. The above theoretical concepts do not have a pure application in any working optimizer implementation. For instance, there are no pure top-down optimizers because costing and property propagation are part of finding a solution. Costing and property propagation can usually be processed only bottom-up, as one cannot compute the cost or properties of a node before having computed the cost or properties for its children (assuming a real life cost model which also estimates the input and output (I/O) costs). For example, Volcano is neither a pure depth-first nor a pure breadth-first optimizer. It fully enumerates an Eqc before moving, but it does not move sideways. Both the base Volcano and Cascades are largely top-down optimizers. An important difference consists in the depth-first nature of Cascades that avoids enumerating all logical expressions before moving to the physical ones. The System R optimizer is a dynamic programming-based pure breadth-first, bottom-up optimizer. The optimizer of the currently preferred embodiment has a (non-pure) depth-first, bottom-up search engine. It fully enumerates the sub-plans in single table equivalence classes (Eqcs). Then, at each higher-level Eqc, it adds increments by enumerating all plans based on the newly added increments of the current children Eqc. As soon as a child configuration is thus consumed, the search engine moves up to the next level. Actually, before having obtained a first full plan, the optimizer of the currently preferred embodiment adopts a pure depth-first strategy, trying to reach a reasonably cheap plan as soon as possible. With this pruning high water mark established, it switches back to the relaxed depth-first strategy. However, the plan cache of the currently preferred embodiment is designed to accommodate, when needed, any kind of search engine. As previously mentioned, both costing and available property handling are generally performed in a bottom-up direction. Costing is usually performed bottom-up because the cost of children nodes is generally needed to compute the cost of the parent. Logical cost models are an exception to this general rule in that logical costs (i.e., derived relation cardinality) can be computed in advance. However, these logical cost models have little expressive power. The costliest operations in a real life environment are typically the disk input and output (I/O) and, for clusters, the network transfer of data. Neither of these operations can be assumed proportional to the derived relation cardinality if the actual plan is not known. Available property computing (or handling) is also usually performed bottom-up. Most nodes (except, for instance, enforcer nodes) need their children's properties to compute what they should propagate up to their parent node. Breadth-first and depth-first search engines have very different approaches for pruning. Breadth-first search engines generally cannot prune sub-plans against the cheapest current full plan. For this reason they reach their first full plan relatively late in the optimization process. In contrast, depth-first, search engines do some amount of pruning of sub-plans as they proceed. Depth-first, top-down search engines reach a full plan relatively quickly. However, they cannot accurately cost their sub-plans because their children's plans are not yet known. Their pruning method is based on logical costing which is less accurate and less effective in pruning sub-plans. Depth-first bottom-up search engines are more effective in pruning sub-plans. These depth-first, bottom-up search engines typically reach a full plan relatively quickly. Moreover, each time they add a new sub-plan they essentially add a new parent node over fully costed children and the new sub-plan's cost is immediately known. If the new sub-plan's cost is too expensive, the whole search space region of its parents is pruned. For these reasons, the optimizer of the currently preferred embodiment uses a depth-first, bottom-up search engine. D. Eager Property Enforcement 1. Eager Enforcement Using Maximal Useful Property For an enforceable property, a key issue is how the optimizer should handle placing the enforcers (i.e., operators that produce enforceable properties). An optimizer may place the enforcers "lazily" (i.e., lazy, on-demand enforcement) by adding the enforcer when adding a parent node that requires a particular property. Alternatively, an optimizer may place enforcers "eagerly" when adding child nodes to a plan. In either case, only the cheapest existing plan (sub-plan) will receive the enforcer. Other enforced plans are pruned out given that both the cheapest existing plan and other more expensive plans have the same enforced property. The cheapest plan will be generated by applying the enforcer to the cheapest non-enforced child as the enforcer's cost is assumed to be the same for any child in the equivalence class (Eqc). Lazy, on-demand enforcement places an enforcer when a parent needs a child to have a given property. The cost of the enforced cheapest child is compared with the cost of the cheapest child already having the needed property, if any. The alternative with the lowest overall cost is selected. Consider, for instance, merge joins and their ordering requirements. Given the directed acyclic graph (DAG) structure of the plan cache, an enforcer will be generated and costed for each parent of a given plan. When several parents have different ordering needs, each will add its own enforcer. The method of the present invention for eager property enforcement places an enforcer in advance (a priori) when a new cheapest plan is added to an equivalence class. With this approach, no enforcer needs to be added when a parent looks for a child node, as the enforcer is already present. Eager property enforcement relies upon knowing in advance the needs of all possible parents of any plans in an Eqc. This is an enhancement of the known concept of "interesting properties" and an evolution of the Cascades keyless sort nodes. The approach of the present invention is to model the whole set of corresponding interesting physical properties with a logical property which is referred to as the "maximal useful property." As the maximal useful property is used to represent the physical property of the enforcer, it shares the same representation as the underlying physical property. The maximal useful property is used to represent an available property and can be compared, propagated, and so forth. The difference between the underlying (physical) property that is needed and its corresponding (logical) maximal useful one purely lies in their semantics. The maximal useful property is known in advance (a priori) for an equivalence class. The needed/useful property is known after the fact (a posteriori), for a given plan of that Eqc. The characteristics of a physical property for eager property enforcement can be summarized as follows. The maximal useful property for a given Eqc is a meaningful concept. The maximal useful property is inexpensive to compute, represent, and use. In addition, enforcing a stronger property is (almost) as expensive as enforcing a cheaper one. Eager enforcement is not bound to a specific search engine style (top-down vs. bottom-up, rule-based or not). Eager enforcement only needs the optimizer to implement concepts similar to equivalence classes and logical properties (which most modern optimizers do implement). Eager enforcement can be easily inserted at the place (or places) where the search engine adds new plans to an equivalence class. The application of this concept to ordering will now be described. 2. The Ordering Representation An ordering can be available on more than one column. The ordering provided by the scan of a multi-column index has a fixed, major to minor, index key column sequence. This is referred to as a "vector ordering" and identified by using the notation: (c1, c2, c3) where c.sub.i are attributes. Conversely, the ordering needed for grouping with several GROUP BY columns has no fixed major to minor sequence. A sequence must be selected for the QEP (i.e., the final query to be executed), but any of the alternatives may be used for purposes of optimization. This is referred to as a "set ordering" and identified by using the notation {c1, c2, c3}. In addition, an available set ordering can be constrained to have a subset of columns first. For instance, if the child of a group ordered operator has a {c1, c2, c3, c4} ordering and the grouping is on c2 and c4, then the result would need c2 and c4, in any sequence, to precede c1 and c3. This is called a "mixed ordering" and identified using the notation ({c2, c4}, {c1, c3}). Both set orderings and mixed orderings are compact, folded representations that are compatible with a number of vector orderings. This means that an available set ordering is sufficient if any of its "vector ordering instantiations" is needed. Likewise, when a set ordering is needed, the availability of any of its instantiations satisfies the requirement. For instance, the {c1, c2, c3} set ordering has all n! (n factorial) permutations of its three columns as vector ordering instantiations: (c1, c2, c3), (c1, c3, c2), (c2, c1, c3), (c2, c3, c1), (c3, c1, c2), (c3, c2, c1). The optimizer of the currently preferred embodiment uses a richer ordering representation that also covers ascending/descending orders and collating sequences. The system also implements other order simplification techniques, such as column equality transitive closures. For additional information on other order simplification techniques, see e.g., R. P. Kooi, "The Optimization of Queries in Relational Databases," PhD Thesis, Case Western Reserve University, 1980. See also e.g., the abovementioned "Fundamental Techniques for Order Optimization." 3. Applying Eager Enforcement to Ordering A "set ordering" is the compact representation of the "maximal useful ordering" that is needed for eager property enforcement. A set ordering is introduced as a physical property, to represent the ordering needs of the group ordered algorithm. But a set ordering also represents the ordering made available by a sort node (i.e., a sort operator), which is the order enforcer. Indeed, unlike a B-tree index that makes available the vector ordering determined by its key, the same sort can be used during the optimization to represent any ordering on its attributes. The eager enforcement requirements with respect to this ordering representation can be summarized as follows. The maximal needed ordering of an Eqc is the set of projected columns that are in an equi-join yet to be applied, in a grouping yet to be applied, in an ORDER BY clause, under a scalar min( )/max( ) aggregate, below a UNION, or in a SELECT DISTINCT select list. FIG. 5 illustrates the needed orderings for each exemplary equivalence class for the sample query shown above in Example 1. This set ordering is cheap to compute, represent and use. The cost of a sort operation is mainly given by the input/output (I/O) cost, which is based upon the volume of data to sort. The cost of the sort operation is largely insensitive to the size of the sort key. Eager enforcement can thus be applied to ordering. Its implementation in any search engine implies minimal changes to logical property computation and to adding new plans to an equivalence class. It also enables simplification of valid child lookup as this will no longer require enforcers management. 4. Components of the Currently Preferred Embodiment FIG. 7 is a block diagram 700 illustrating components of the system of the present invention, which in its currently preferred embodiment is implemented as part of the optimizer of a relational database management system. As shown, the system includes equivalence classes 710, a logical properties module 720, a physical operators module 730, a search engine 740, and a plan cache 750. The plan cache 750 stores equivalence classes created for a query. The equivalence classes 710 is a data structure that describes a set of sub-plans. Each of the sub-plans in the same equivalence class references the same set of underlying database tables. For instance, two sub-plans joining the same two tables are in the same equivalence class. Each sub-plan in an equivalence class competes with other sub-plans in the same equivalence class to be part of the total (final) plan generated by the optimizer. In the currently preferred embodiment, the equivalence classes 710 include a base (or equiv) module 711, a single table Eqc (Eqc1) module 712, and a multiple table Eqc (Eqcn) module 713. Single table Eqc module 712 contains plans over a single table (such as index scan or table scans and the enforcers of those scans of the table). The multiple table Eqc 713 holds sub-plans joining together a plurality of tables. Sub-plans retained in multiple table Eqc module 713 would typically have a join node or an enforcer at the root node (such as a sort operator). The logical properties module 720 includes a set of logical operators (or log props) which describe (a priori) the common behavior of a set of sub-plans that refer to these operators. Before enumerating a sub-plan or plan fragment, several things about the sub-plan can be determined from examination of the equivalence class containing the sub-plan. One of the characteristics of sub-plans in the plan cache 750 that can be determined in advance (a priori) is the set of underlying tables referenced by the sub-plans. Because an equivalence class is characterized by an underlying set of tables, these tables are trivially known for all of the sub-plans in each equivalence class. In addition, the available columns (or attributes) are known. The available columns can be determined as the union of all of the columns of all of the underlying tables. Of particular interest to the present invention, the logical properties can be computed (determined) a priori before sub-plan enumeration. This allows sub-plan enumeration to be based upon the available, computed logical properties. Unlike the logical properties, the physical properties are specific to a given plan--a given physical implementation of a plan fragment or sub-plan. For instance, whether or not an ordering is available in the result of a physical operator depends upon whether the physical plan has a sort or an index scan providing an ordering and the ordering is retained (not lost). In other words the physical properties are a postiori, meaning that they are only known for a given plan or sub-plan once it has been built. The physical operators module 730 describes the hierarchy of each of the physical operators. The physical operators module implements optimization time physical operators that are used for the optimization of the Volcano runtime iterators. They are not themselves runtime iterators that implement the open/next/close protocol. Rather, the physical operators module 730 contains optimization time descriptions of those runtime iterators which are used for evaluation and costing during the optimization process. These optimization time physical operators contain a description of the physical properties, such as the ordering, the partitioning, and the need for distinctness and grouping enforcement. These physical operators are nodes in a tree that have children. A tree of physical operators is a sub-plan or the full (final) plan. A plan is the full plan if it covers the total semantics of the query (i.e., if the result set of the root node of the plan tree delivers the result set of the query). In the currently preferred embodiment, the optimization time nodes and the query execution time nodes are implemented using two different types of data structures. Different data structures are used because of differing needs during query optimization and query execution. For instance, the execution time nodes implement the open/next/close interface, while the optimization time nodes (i.e., the optimization time physical operators referenced above) handle different requirements such as costing, supplying available properties, and computing available properties. In addition, the number of nodes in the final query execution tree is typically much smaller than the number of nodes that must be examined during optimization. Accordingly, it is advantageous to have the smallest possible operators (nodes) during the optimization process. The search engine module 740 inspects a search space of possible plans and plan fragments (or sub-plans) in order to determine the most cost-effective query execution plan for execution of a given query. The search engine module receives as input a logical operator tree consisting of a tree of logical relational operators. The search engine 740 generates a physical operator tree as output. The physical operator tree that is generated is the best plan determined based upon the optimizer's cost model and the area of the search space inspected by the search engine. This best plan may then be used by the code generator (not shown at FIG. 7) to generate the execution plan for execution against the database. In the currently preferred embodiment, this execution plan is a Volcano iterator tree. As previously described, the search engine of the currently preferred embodiment is a depth-first search engine. A breadth-first search engine (such as IBM System R for example) starts by processing all of the one table equivalence classes, then proceeds to processing all of the two table equivalence classes, as so on. In contrast to the operations of a breadth-first search engine which exhaustively processes each level in the plan cache 750, the depth-first search engine of the currently preferred embodiment attempts to develop a full plan (i.e., a plan covering all tables) more rapidly. After evaluating the single table equivalence classes, the depth-first search engine combines two of these equivalence classes to enumerate a single two-table equivalence class. The depth-first search engine then immediately moves up a level in the plan cache by adding another single table equivalence class to the two-table equivalence class to generate a three-table equivalence class. In this manner, the depth-first search engine proceeds to try and develop a full plan before evaluating each and every combination at each level of the plan cache. After a full plan has been developed, the depth-first search engine then goes back and considers other alternative combinations that may result in a more efficient query execution plan. The depth-first search engine of the currently preferred embodiment is a permutation-based engine that incrementally generates permutations of the base tables to be joined together. For example, if a given query references tables R, S, T, the search engine would start by joining tables R and S and then joining table T for a full permutation. Then the engine backs out by abandoning table T and looks for another permutation starting with R and S. In this example, there is no other combination starting with R and S, so the search engine backs out again to determine if there is another permutation starting with R. In this case there is another possibility, the tables R and T can be joined and then S added for a second full permutation (R, T, S). This process of permutation mapping typically continues until all permutations are enumerated. During this process, the search engine is driving the equivalence classes by joining them together using a depth-first strategy. The methods of the present invention for eager property enforcement and opportunistic property enforcement during the optimization process will now be described in more detail. 5. Method Steps for Eager Property Enforcement FIGS. 8A-B comprise a single flowchart 800 illustrating the detailed method steps of the operations of the present invention for the application of eager enforcement to ordering. In this example, an SQL query containing an ORDER BY clause is used as an example to illustrate the operations of the present invention. However, the methodology of the present invention is also useful for optimization of a wide range of queries and is not limited to use in this context. In addition, the currently preferred embodiment uses a bottom-up, depth-first search engine. However, the system and method of the present invention may also be used with other types of search engines as previously explained. The method steps described below may be implemented using computer-executable instructions, for directing operation of a device under processor control. The computer-executable instructions may be stored on a computer-readable medium, such as CD, DVD, flash memory, or the like. The computer-executable instructions may also be stored as a set of downloadable computer-executable instructions, for example, for downloading and installation from an Internet location (e.g., Web server). As shown, the method commences at step 801 with the receipt of a query (e.g., an SQL query). The SQL query may, for example, include a SELECT statement on a customer table, with a WHERE clause and an ORDER BY clause requesting ordering by customer name. At step 802, the query that is received is parsed and normalized and an internal representation of the query is generated as a logical operator tree. This logical operator tree is a tree whose nodes are pure logical relational algebra operators, such as join, group, or union for example. However, operators such as nested loops join and merge join are not included in the logical operator tree as they are not pure logical relational algebra operators. The logical operator tree that is generated is input to the optimizer. After the optimizer receives the logical operator tree as input, at step 803 the search engine module of the optimizer commences inspection of the search space of possible plans and plan fragments in order to determine the most cost-effective query execution plan. As described above, search space inspection involves transforming the initial query expression into a physical operator tree. This transformation includes reordering expressions using algebraic rules, replacing the logical operators with the physical operators that implement them, and enforcing the properties needed by some algorithms. During search space inspection, the search engine typically visits each multi-table equivalence class more than once as described below. As previously described and illustrated at FIG. 4, an equivalence class is identified by its set of underlying tables and contains sub-plans (or plan fragments) which compete with each other to be part of the best total plan. Each time a multi-table equivalence class is visited, at step 804 a sub-plan or set of sub-plans is added to the equivalence class. The same process also applies in the case of single table equivalence classes. However, each single table equivalence class is only visited once for the purpose of adding sub-plans. After one or more new sub-plans have been added to an equivalence class, the equivalence class is eagerly enforced. Eager enforcement means that the optimizer will try to place the strongest possible enforcer over the cheapest new sub-plan added to the equivalence class. In the currently preferred embodiment, the list of new sub-plans added to the equivalence class is traversed and, at step 805, the cheapest sub-plan added to the equivalence class is determined. This determination is based upon the cost of each sub-plan in the equivalence class according to the optimizer's cost model. After the cheapest new sub-plan in the equivalence class has been determined, at step 806 that cheapest sub-plan eagerly enforces itself by (if necessary) returning an operator that is the eager enforcer that is needed. The method for determining whether or not a particular sub-plan requires an eager enforcer is described below in more detail. At step 807, the operator that is returned is added to the equivalence class. The equivalence class retains both the cheapest sub-plan that is not eagerly enforced as well as the eagerly enforced sub-plan. This results in an increase in the search space. However, this search space increase cannot be avoided as some parent equivalence classes do not need the enforced property and should be able to use the cheapest sub-plan without the eagerly enforced property. At step 808, the equivalence class is pruned by deleting sub-plans that are more expensive than the cheapest sub-plan and that do not provide a needed property (e.g., a needed ordering). A sub-plan that is not the cheapest may be retained if it provides a needed property. Also, as described above, the revised equivalence class may contain both the cheapest sub-plan without the eagerly enforced property as well as one or more sub-plans which enforce a needed property. The revised equivalence class is retained in the plan cache, which is a directed acyclic graph of the set of candidate sub-plans that are competing to be included in the best total plan being generated by the optimizer. At step 809, the above steps 804-808 are repeated until the optimizer has determined a best final plan for execution of the query. At step 810, the optimizer generates a physical operator tree as output. This physical operator tree is the best plan that is determined to be available according to the optimizer's cost model and the area of the search space inspected by the optimizer. The best plan that is determined by the optimizer is then used by the code generator to generate the query execution plan, which in the currently preferred embodiment is a Volcano tree of runtime operators. The foregoing steps will now be explained and illustrated using a sample query. 6. Optimization of a Sample Query As previously described, the currently preferred embodiment of the present invention utilizes a depth-first, bottom-up search engine. When a parsed, normalized query is received by the optimizer (i.e., as a logical operator tree) the depth-first, bottom-up search engine of the currently preferred embodiment proceeds to generate a physical operator tree for execution of the query by starting with solutions (i.e., sub-plans or plan fragments) for small portions of the query and gradually using these plan fragments to construct larger solutions (i.e., larger sub-plans or the final plan). For instance, the search engine typically starts with sub-plans scanning single tables and uses these single table sub-plans to build larger plans joining two tables, and then three tables, and so forth until a final plan joining together all of the tables is completed. In any search engine, whether top-down, bottom-up, depth-first or breadth-first the search engine typically visits each equivalence class more than once during this process of search space inspection. Equivalence classes referencing only a single table are an exception to this general rule. Single table equivalence classes are not really part of the search space inspection since there are as many of these equivalence classes as there are underlying tables referenced by the query. Accordingly, the search engine usually visits single table equivalence classes only once. However, multiple table equivalence classes which join together several tables are usually visited by the search engine a number of times. For instance, in the above example of a query referencing tables R, S, T, the equivalence class of R, S will be visited at least twice. It is visited once for the R, S, T permutation. It is visited a second time for the S, R, T permutation. As a result, plans in an equivalence class that join together S and R would be visited and added to the equivalence class for both the permutation R, S, T as well as for the permutation S, R, T. In this manner, a search engine typically visits each multi-table equivalence class several times for depth-first descent into the search space for different combinations and permutations. Property enforcement is articulated at the point in the optimization process where a sub-plan or set of sub-plans is added to a multi-table equivalence class. The approach of the present invention is to eagerly enforce each equivalence class by placing the strongest possible enforcer over the cheapest new sub-plan added to the equivalence class. This involves two steps. First, the cheapest sub-plan added to the equivalence class is determined by traversal of the equivalence class. The determination of which sub-plan is the cheapest (i.e., has the lowest execution costs) is based upon the cost of each sub-plan in the equivalence class according to the optimizer's cost model. Use of a depth-first, bottom-up optimizer means that each of the sub-plans in an equivalence class has been fully costed. Accordingly, the costing decision is made based upon readily available information. To illustrate this process, consider the same example of an SQL query including a SELECT statement on a customer table, with a WHERE clause and an ORDER BY clause requesting ordering by customer name. This query results in a single table equivalence class as the above query refers to only the customer table of the database. This exemplary equivalence class also has a logical property called needed ordering. The needed ordering logical property contains a set of columns with a single column (customer name in this example) by which the information is to be ordered. This equivalence class also contains several sub-plans. For example, these sub-plans may include a table scan of the customer table. Another sub-plan might be an index scan using an index on the customer name column of the table. A third plan is an index scan on another index that is relevant to the WHERE clause of the query. In this example, the cheapest plan may be the index scan on the index relevant to the WHERE clause of the query. Each of the operators in the list of sub-plans in the equivalence class would have a physical property which is called available ordering. The available ordering physical property of the table scan would have nothing (NULL), as the table scan does not provide an ordering. The index scan on the customer name column would have an ordering on this column. Similarly, the other index scan relevant to the WHERE clause would also have an ordering. Accordingly, these two exemplary index scan sub-plans would have an available ordering property (e.g., as a set or vector of columns) that may be useful for optimization of this query. The needed ordering logical property which is required in accordance with the methodology of the present invention is always a set ordering because at optimization time the ordering (e.g., the ordering of these two columns in this example) is not relevant to optimization of the query. The decision about whether the ordering should be column 1, then column 2 or, alternatively column 2 and then column 1 can be deferred until later in the optimization and code generation process. This ability to defer this decision about ordering is an important point because of the directed acyclic graph structure of the plan cache. The same sort node may be a child of several parent nodes. Although each of these parent nodes may require a different actual ordering, they are compatible with the set ordering provided by the sort for purposes of query optimization. After the cheapest new sub-plan in the equivalence class has been determined, the cheapest sub-plan eagerly enforces itself by returning any eager enforcer that is needed. The operator that is returned is then added to the equivalence class. For example, when the sub-plan is added to the above-described equivalence class, a sort may be added over the cheapest plan in the equivalence class, which might be one of the two sub-plans containing an index scan as previously described. Eager enforcement includes identifying the cheapest sub-plan in the equivalence class and requiring that cheapest sub-plan to eagerly enforce itself if necessary. The process of determining whether or not an enforcer is necessary is described below. In this example, the sub-plan may eagerly enforce itself by returning a sort operator which is added above the cheapest sub-plan in the equivalence class. After any required eager enforcers have been added, the equivalence class is pruned by deleting sub-plans that are more expensive than the cheapest sub-plan and that do not provide a needed property (e.g., a needed ordering). A sub-plan that is not the cheapest may be retained if it provides a needed property. For instance, if the first index scan in the above example is more expensive because it cannot be used with the WHERE clause, but it does provide a needed property (e.g., an ordering that is useful) it will be retained in the equivalence class. Conversely, the table scan in this example will be discarded as it is the most expensive and it does not provide any ordering. After one or more operators are added to the equivalence class to eagerly enforce needed properties and unnecessary sub-plans are pruned, the revised equivalence class is added to the plan cache. In the currently preferred embodiment, the plan cache is implemented as a directed acyclic graph which contains the set of candidate sub-plans that are competing to be included in the best total plan being generated by the optimizer. 7. Determining Whether an Eager Enforcer is Required The method for determining whether or not a particular sub-plan (i.e., the cheapest sub-plan) needs an eager enforcer will now be described. Eager enforcement is based upon the fact that it can be determined in advance, as an equivalence class logical property, whether or not some ordering is needed in an equivalence class. If no ordering is needed for an equivalence class, then the method for determining whether an eager enforcer is required returns a NULL. In this situation there is no need to add a sort as no ordering is needed (i.e., no needed ordering) for the equivalence class. On the other hand, if a needed ordering is required, a check is made to determine whether or not the ordering is already available. If the needed ordering is already available, there is no need to add a sort operator to the sub-plan. Similarly, if the cheapest sub-plan has an ordering at least as strong as the needed ordering, then the method returns NULL as there is no need to generate an enforcer. For instance, using the same example of a query with a WHERE clause and an ORDER BY clause, the index scan on the column may be the cheapest sub-plan if both the WHERE clause and the ORDER BY clause are on this same column. If this is the cheapest sub-plan, this sub-plan is instructed to eagerly enforce itself. In this example, the available ordering property of this sub-plan is compared to the needed ordering that is required. If the available ordering satisfies the needed ordering requirement, there is no need to add a sort operator to the equivalence class. However, if the WHERE clause is on one column and the ORDER BY clause is on another column of the database, the cheapest index scan may be an index scan on the WHERE clause. In this situation, this index scan does not provide the required needed ordering, so a sort operator providing the required ordering is generated and returned as an enforcer. This sort operator is then added to the equivalence class above this sub-plan. 8. Needed Ordering Logical Property is a Set Ordering It should be noted that the needed ordering logical property is a set ordering. As previously described, the actual physical ordering that is required does not need to determined at the time of query optimization. This is one of the strengths of the present eager enforcement method. The set ordering (e.g., column 1, column 2) is enforced; however for purposes of optimization these columns can be in any order. The enforcer that is the sort is providing a set ordering and not a vector ordering. The sort physical operator can be instantiated after optimization with the ordering that is actually required in the final plan. This approach greatly simplifies the optimization process and avoids expanding the search space for each and every ordering that may be required. Instead, a single sort node with a set ordering may be used as the child of multiple parents in a query where multiple tables are being joined. For example, several join nodes in an equivalence class which compete with each other may be over the same child sort operator. The method of the present invention avoids the expansion of the search space that would result from using a different child sort node for each of the parent nodes. The use of a set ordering also enables the eager enforcement framework to be used for opportunistic enforcement of grouping as hereinafter described. 9. Parallel Queries If a query only requires the enforcement of ordering of a serial query, a maximum of one node is added to an equivalence class as the ordering enforcer. However, if a parallel query is being optimized, then in addition to the ordering enforcer, a partitioning enforcer (Xchg), or the combination of ordering and partitioning (a sort over an Xchg) may be added to the equivalence class. 10. Application to Multi-Table Equivalence Classes In the case of a multi-table equivalence class including join nodes which join a plurality of tables, nested loops joins and merge joins will also be generated. A merge join is a join algorithm that needs its children to be ordered on the merge predicate. Historically search engines typically handle merge joins by obtaining the cheapest plan that has the needed ordering while also adding a sort over the cheapest plan that has been found in the child equivalence class to make sure that the required ordering is available. As a result, for each and every merge join added by parent equivalence classes in existing search engines, a sort operator must be added to the child equivalence classes, which includes costing this operator, computing its properties, and so forth. With the present eager enforcement method, this is no longer required. With eager enforcement, the sort operator is already available in the child equivalence class, because it was placed there previously, if required, when the sub-plans were added to the child equivalence class. In the system of the present invention the merge join only requires checking to verify that the properties in the child equivalence classes match and that sufficient ordering is available to perform the merge join. This enables stronger pruning and avoids unnecessarily costing and computing a number of merge joins that will be discarded later in the optimization process. This approach also simplifies the valid child lookup process. As enforcers are already available in child equivalence classes, management of enforcers is no longer needed. E. Opportunistic Property Enforcement 1. Overview of Opportunistic Property Enforcement The opportunistic enforcement method of the present invention takes advantage of the available eager enforcement framework to handle additional property enforcement requirements in a very efficient manner. This opportunistic enforcement methodology requires only a very small increase in the search space during query optimization. The approach of the present invention is to take advantage of existing properties, such as when particular data is already ordered. For example, if data is already ordered on the grouping columns, grouping the ordered data is a very efficient operation that requires very little increase in query execution cost (compared to not grouping the ordered data). The opportunistic enforcement method also takes advantage of the enforcement process itself. For instance, when a sorting operation is being performed to obtain an ordering it is advantageous to also handle grouping at the same time if grouping is required. For example, if a sort operator is being added to an equivalence class and grouping is also needed, a group sorting operator may be used instead of the sort to obtain the required result with lower overall query execution costs. As previously described, logical properties may be computed in advance (a priori) for an equivalence class. Among the logical properties that may be computed a priori is grouping. A group needed logical property of an equivalence class indicates whether or not grouping is needed for that equivalence class. If grouping is required, the present opportunistic property enforcement methodology modifies the eager enforcement process at two points to perform the required grouping operations in an advantageous manner. The first point for application of the present opportunistic enforcement method is when new sub-plans (physical plans) are being added to an equivalence class. When new sub-plans are evaluated to determine which sub-plans are the most advantageous (e.g., cheapest) to add to an equivalence class, a determination is made as to whether any of these sub-plans provides a needed grouping. Initially, a check is made as to whether grouping is needed at the Eqc. If no grouping is needed, then the normal eager enforcement process (described above) proceeds without any modification. However, if grouping is required at this Eqc, then the sub-plans being added are next evaluated to determine if any sub-plan already provides an ordering on the grouping columns. For instance, if a sub-plan includes an index scan, this sub-plan may include a grouping on an index column which provides the required grouping. As another example, a sub-plan may include a merge join, with the merge predicate a superset of the grouping column. In these examples the needed ordering is already available and no sorting operation is required. If a sub-plan provides a needed ordering and grouping is required, a group ordered operator is added above the sub-plan. The group ordered node that is added results in little or no increase in query execution cost as this operator works on the fly without requiring any input/output operation. This newly modified sub-plan, which includes a parent property enforcer (group ordered) over the original sub-plan, replaces the original sub-plan in the Eqc. For example, if a sub-plan is a merge join providing a needed ordering, a group ordered operator is placed over the merge join and this modified sub-plan is added to the Eqc instead of the original sub-plan. Replacing the original sub-plan with the modified sub-plan provides the needed grouping while avoiding an increase in the search space that would result if both the original and the modified sub-plans were retained in the Eqc. This is different than eager enforcement method which retains the cheapest sub-plan and also adds a new sub-plan which adds a sort operator over the cheapest sub-plan. The second point during which opportunistic enforcement may be applied during the eager enforcement process is the point at which a sort operator is being added to an equivalence class. When a sort node is being added above the cheapest sub-plan in an Eqc to eagerly enforce a required ordering, a determination is made as to whether or not grouping or distinctness are also needed at this Eqc. If grouping is needed, a group sorting operator is added to the Eqc instead of the sort, thereby enabling both ordering and grouping to be enforced at the same time. One advantage of this approach is that the group sorting operator that is inserted is more efficient than the sort. Another advantage is that an increase in the search space is avoided. Because a group sorting operator is added above the cheapest sub-plan in the Eqc, there is no need to add another sub-plan which enforces grouping to the Eqc. It should be noted that the approach of the present invention may result in a slight increase in the search space by reducing pruning. As this method provides for tracking a grouping logical property, sub-plans which are not the cheapest plan in an Eqc, but which satisfy the grouping requirement, may be retaine | ||||||
