Handling null values in SQL queries over object-oriented data5905982Abstract A method, apparatus, and article of manufacture for handling NULL values in SQL queries over object oriented data. A two-phase method is used to enable a query evaluator in a two-valued logic environment to properly handle occurrences of NULL values for predicates that involve subqueries, i.e., basic subquery predicates and/or quantified subquery predicates. For basic subquery predicates, negation reduction is performed by applying logical equivalence rules and inverting basic comparators (e.g., transforming < to >=) to eliminate NOTs. Then, transformations are employed for the resulting positive predicates to include NULL value testing, i.e., NULL protection. For quantified subquery predicates, in addition to performing negation reduction, quantified subqueries are converted into existential subqueries. In most cases, this yields a predicate that can be handled using NULL protection transformations for positive predicates. The exception (i.e., a "NOT EXISTS" resulting from the conversion step of a universally quantified subquery) is handled using NULL protection transformations for negative predicates. The evaluation of the NULL tested positive and negative predicates ensures that if a predicate has a NULL value, it is not included in the query result. Claims What is claimed is: Description RESERVATION OF RIGHTS
______________________________________
typedef struct Address
char *street;
char *city;
char *zip;
};
class empClass : public person
{ public:
int no;
float sal;
Address address;
deptClass *dept;
person *spouse;
int mgr;
};
class deptClass
{ public:
root
int no;
char name›20!;
Collection<empClass*> emps;
};
Collection<empClass*> *Emp; // Database root
Collection<deptClass*> *Dept; // Database
______________________________________
In the schema above, each instance of empClass has a "dept" attribute that is a pointer to an instance of deptClass. Consider a path expression that reaches an employee's department number through the "dept" attribute. If the "dept" attribute for a particular instance of empClass is a NULL pointer, then the path expression results in a NULL value for that instance of empClass. NULL values in predicates follow the same semantics as in SQL-based RDBMSs (where bindings for variables where the condition evaluates to UNKNOWN are not included in the result, and an "IS ›NOT! NULL" predicate is provided to determine whether expressions are NULL). The present invention handles NULL values in queries using tri-valued logic in a two-valued logic system. NULL values do not exist in the object-oriented data models manipulated by object-oriented SQL queries, but are introduced due to path expressions that can contain NULL pointers. The present invention can handle path expressions of any length in which a node might be a NULL pointer. This protects a simple, two-valued, query engine from abnormal termination when traversing a NULL pointer, and provides the correct interpretation of object-oriented SQL queries in the presence of such expressions. Moreover, the present invention handles predicates that include subqueries whose interpretation can be NULL. Also, the present invention handles the correct interpretation of the IS NOT NULL predicate for general path expressions. To enable a two-valued query evaluator to properly handle occurrences of NULL values in basic path predicates (i.e., those without subqueries), an extended version of the basic GEM transformations are employed for positive and negative predicates called null protection. In particular, the present invention applies query rewrite transformations on object-oriented SQL queries so that the rewritten query is amenable to the introduction of appropriate NULL-pointer checking predicates for two-valued logic environments. To enable a two-valued query evaluator to properly handle occurrences of NULL values in both simple predicates (i.e., those without subqueries) and subquery predicates (i.e., those within subqueries), a two phase transformation is performed. For both types of predicates, negation reduction is performed by applying DeMorgan's logical equivalence rules and inverting their basic comparators (e.g., transforming < to >=) to eliminate NOTS. Additionally, to handle subquery predicates, all quantified subqueries are transformed into existential subqueries. Then, for both types of predicates, transformations are employed to include NULL value checks. ENVIRONMENT An OO-SQL query engine 100 is described with reference to FIG. 1. A preferred embodiment has an implementation of the OO-SQL query engine 100 that runs on top of System Object Model (SOM), which is well known in the art as an object architecture. SOM is described fully in Christina Lau, Object Oriented Programming Using SOM and DSOM, Van Nostrand Reinhold, an International Thomson Publishing Company, 1994. In addition, the OO-SQL query engine 100 can run on top of an OODB system. FIG. 1 presents the framework for the implementation which supports the integration of the OO-SQL query engine 100 with a user's applications 102. The application 102 issues OO-SQL queries 104 through a Call Level Interface (CLI) 106. The OO-SQL parser 108 parses the query and generates an internal representation of the query called a query graph, i.e. a data structure OQGM (object query graph model) 110. The OQGM 110 is passed to the query rewrite component 112, which applies transformations to optimize the query. This query rewrite component 112 employs standard relational query rewrite techniques that were developed for relational systems as well as the query rewrite transformations of this invention. The query rewrite transformations of this invention apply the negation reduction techniques as discussed herein. The rewrite phase component 112 outputs as its result a query graph 114 that is used by the optimizer 116. The optimizer 116 further optimizes the query by applying the null protection techniques discussed herein. Each query is then translated into an executable plan (QPL) 118. Once translated, the QPL 118 is passed to the Query Evaluation Subsystem (QES) 120, which runs it against SOM collections of objects 122, which can include objects stored in a database 124. Returned to the application 102 (which could also be a user interacting directly through an interface) is an ordered collection of tuples or objects 126. Query results can include pointers to objects in stored collections 124. These pointers are simply virtual memory addresses in the application program 102, SO they must be valid pointers in the application program 102 in order to be useable for further C++ based navigation and data manipulation. FIG. 2 shows the context of the OO-SQL query engine 100 in a processing system 200 having memory 202 and at least one cpu 204. The system 200 could be connected to other systems 206 via a network. The application could be resident on any of the systems 200 or 206 in the network. Further, the OO-SQL query engine 100 could retrieve data from a data storage device 208 connected locally or remotely to the processing system. Using the specification herein, the invention may be implemented as a machine, process, or article of manufacture by using standard programming and/or engineering techniques to produce programming software, firmware, hardware or any combination thereof. Any resulting program(s), having computer readable program code, may be embodied within one or more computer usable media such as memory devices or transmitting devices, thereby making a computer program product or article of manufacture according to the invention. As such, the terms "article of manufacture" and "computer program product" as used herein are intended to encompass a computer program existent (permanently, temporarily, or transitorily) on any computer usable medium such as on any memory device or in any transmitting device. Executing program code directly from one medium, storing program code onto a medium, copying the code from one medium to another medium, transmitting the code using a transmitting device, or other equivalent acts, may involve the use of a memory or transmitting device which only embodies program code transitorily as a preliminary or final step in making, using or selling the invention. Memory devices include, but are not limited to, fixed (hard) disk drives, diskettes, optical disks, magnetic tape, semiconductor memories such as RAM, ROM, Proms, etc. Transmitting devices include, but are not limited to, the internet, electronic bulletin board and message/note exchanges, telephone/modem-based network communication, hard-wired/cabled communication network, cellular communication, radio wave communication, satellite communication, and other stationary or mobile network systems/communication links. A machine embodying the invention may involve one or more processing systems including, but not limited to, cpus, memory/storage devices, communication links, communication/transmitting devices, servers, I/O devices, or any subcomponents or individual parts of one or more processing systems, including software, firmware, hardware or any combination or subcombination thereof, which embody the invention as set forth in the claims. One skilled in the art of computer science will easily be able to combine the software created as described with appropriate general purpose or special purpose computer hardware to create a computer system and/or computer subcomponents embodying the invention and to create a computer system and/or computer subcomponents for carrying out the method of the invention. INTERPRETATION AND OPTIMIZATION Handling NULL Values The invention is applicable to predicates having subqueries and quantified subqueries. A predicate is a truth-valued function, i.e., a function that, given appropriate arguments, returns either TRUE, FALSE, or NULL (UNKNOWN). A quantified subquery includes a quantified predicate such as the universal predicate ALL and existential predicates such as ANY, SOME, and IN. To handle object-oriented SQL, queries are transformed in two phases. The first phase performs negation reduction to simplify the treatment of NULL values including predicates that involve subqueries. Following this negation reduction phase, the second phase applies NULL protection transformations to guard the query engine against intermediate NULL pointer values. The negation reduction phase occurs at logical query rewrite time, and precedes query plan optimization. The NULL pointer protection phase occurs somewhat later, during plan optimization, as it requires predicates to have an ordered execution semantics. The invention presented herein provides techniques for implementing tri-valued logic over two-valued systems to derive the correct interpretation of path expressions with NULL pointers and queries involving subquery predicates. To properly handle NULLS in the context of path expressions, the technique of this invention extends the GEM transformations for NULLs in predicate expressions. To properly handle subqueries (including a basic predicate with a subquery and a quantified subquery), both types of subqueries are handled in a special fashion when being converted to two-valued logic since both types have specific semantics for their interpretation with respect to tri-valued logic. The invention is especially applicable for use with an underlying query evaluation engine that is itself capable of handling path expressions if, and only if, no intermediate pointer along the path is NULL, i.e., a two-valued query engine system. The problem that this invention addresses is different than the problem of NULL values addressed in GEM in the following respect: The invention does not try to enrich the types of values that a column, say t.A of type integer, can have by selecting a specific value, say zero, to represent a NULL value instead of the value zero. The invention is concerned with the correct interpretation of path expressions in which a node might be a null pointer. The problem which this invention overcomes has two aspects: 1) protecting a simple, two-valued, query engine from abnormal termination by traversing a NULL pointer, and 2) providing the correct interpretation of queries in Object-Oriented Systems (OOS) and Object-Oriented Database Systems (OODB), including those involving subquery predicates, in such an environment. These two aspects of the problem of handling NULL valued semantics in the context of queries in OOSs and OODBs are solved by this invention in the following ways. First, the invention provides a correct interpretation, by the query engine, of predicates with path expressions having NULL pointers by using NULL valued semantics. Second, the invention handles predicates that include subqueries whose interpretation can be NULL. Third, the invention provides a correct interpretation of the SQL "IS ›NOT! NULL" predicate for the detection of NULL values in systems that do not directly support such a notion. The present invention is incorporated into an OO query system that provides SQL-based query access to C++ programming language environments, Object-Oriented Systems (OOSs), and Object-Oriented Database Systems (OODBSs). The present invention is embodied in an object-oriented database query interface that provides an upward compatible extension to SQL-92, as discussed in Database Language SQL ISO/IEC 9075:1992, ISO.sub.-- ANSI, 1991, ›hereinafter referred to as "IS091"!, which is incorporated by reference herein. The motivation for this approach is twofold. First, it enables programmers familiar with SQL to write object-oriented database queries without learning an entirely new language; they must simply learn about the object extensions. Second, it enables the many tools that have been built for relational systems to access object-oriented database data via interfaces such as open database connectivity (ODBC), Microsoft Open Database Connectivity Software Development Kit, Microsoft Programmer's Reference, 1992, ›hereinafter referred to as "Mic92"!, which is incorporated by reference herein. Table 1, below, gives the interpretation of expressions in predicates.
TABLE 1
______________________________________
P Q P AND Q P OR Q
______________________________________
T T T T
T F F T
T NULL NULL T
F T F T
F F F F
F NULL F NULL
NULL T NULL T
NULL F F NULL
NULL NULL NULL NULL
______________________________________
Additionally, NOT (TRUE) is FALSE, NOT (FALSE) is TRUE, and NOT (NULL) is NULL. The SQL-92 "IS ›NOT! NULL" predicate can be applied to a column of any type to determine if it has a NULL value, as discussed in ›ISO91!. The IS NULL predicate is TRUE if the path expression has a node which is a NULL pointer. The predicate is FALSE otherwise, and inversely for IS NOT NULL. SQL-92 includes predicates of the form: e r Q, and e r .alpha. Q in which e is an expression, r is a relational operator, .alpha. is a qualifier among {ANY, ALL} and Q is a subquery. Both have specific semantics for their interpretation with respect to tri-valued logic, and therefore require careful treatment when being converted to two-valued logic. Managing NULL Values In Object-Oriented SQL Queries Using Two-Valued Logic To handle object-oriented SQL, queries are transformed in two phases. The first phase is to use negation reduction to simplify the treatment of NULL values for predicates that involve subqueries. The purpose behind this simplification is that if negation is eliminated in predicates, then predicates that evaluate either to FALSE or to NULL can be treated similarly. (This is the case because object-oriented SQL, like SQL, only returns query results for those predicates that evaluate to TRUE.) Following this negation reduction phase, the second phase applies the NULL protection transformations to guard the query engine against intermediate NULL pointer values. A two-phase method is used to enable a query evaluator in a two-valued logic environment to properly handle occurrences of NULL values for predicates that involve subqueries, i.e., basic subquery predicates and/or quantified subquery predicates. For basic subquery predicates, negation reduction is performed by applying logical equivalence rules and inverting basic comparators (e.g., transforming < to >=) to eliminate NOTS. Then, transformations are employed for the resulting positive predicates to include NULL value testing, i.e., NULL protection. For quantified subquery predicates, in addition to performing negation reduction, quantified subqueries are converted into existential subqueries. In most cases, this yields a predicate that can be handled using NULL protection transformations for positive predicates. The exception (i.e., a "NOT EXISTS" resulting from the conversion step of a universally quantified subquery) is handled using NULL protection transformations for negative predicates. The evaluation of the NULL tested positive and negative predicates ensures that if a predicate has a NULL value, it is not included in the query result. A predicate with a subquery has the form e r Q, where e is an expression, r is a relational operator, and Q is a subquery that returns at most a single value. For example, the following query selects employees who have a higher salary than their manager:
______________________________________
select *
from Emp e
where e.sal > (select m.sal
from Emp m
where m.no = e.mgr);
______________________________________
For a given Employee "e", the subquery is computed and the comparison (e.sal>m.sal) is performed. The SQL interpretation of predicates with a subquery is NULL if either operand is NULL (e.sal or m.sal in the above example), or if the subquery returns an empty result. A quantified subquery predicate is of the form e r .alpha. Q, where e is an expression, r is a relational operator, .alpha. is a qualifier among {ALL, ANY, SOME, IN}, and Q is a subquery. For example, the following query selects employees having the highest salary of employees in their department:
______________________________________
select *
from Emp e
where e . . . sal >= ALL
(select . . . e2 . . . sal
from (e . . . dept . . . emps) e2);
______________________________________
For a given Employee "e", the subquery selects the salaries of all employees in "e's" department. The interpretation of this predicate is NULL if the specified relationship is not FALSE for any values returned by the subselect and at least one comparison is unknown because of a NULL value. The following query selects employees who have in their departments, other employees with the same name:
______________________________________
select *
from Emp e
where e . . . name = ANY
(select e2 . . . name
from (e . . . dept . . . emps) e2
where e2 <> e);
______________________________________
The interpretation of this predicate is NULL if the specified relationship is not TRUE for any of the values returned by the subselect and at least one comparison is unknown because of a NULL value. As shown above, the interpretation in OOSQL of basic predicates with subqueries, of an "ALL" quantified subquery and of an "ANY" quantified subquery, is enabled to be the same as the interpretation provided by SQL-92 even though OOSQL is based upon a two-valued logic system and SQL-92 provides for tri-valued logic. The invention presented herein handles queries with such subquery predicates. As discussed above, a path expression p is of the form q.m.sub.1..m.sub.2.., . . . , ..M.sub.n in which attributes m.sub.1, . . . ,m.sub.n-1 can be NULL pointers. To protect the query engine from abnormal termination due to dereferencing a NULL pointer, each attribute m.sub.1 through m.sub.n-1 along the path must be checked for NULL values before attempting to further resolve the path p. Doing this takes care of intermediate NULLs along paths. Object-Oriented SQL Transformations FIG. 3 is a flowchart illustrating the method of initializing and optimizing SQL queries over object-oriented data according to the present invention. In step 302, each predicate in a query is selected, starting with the first predicate. In step 304, the negation reduction is performed. To perform negation reduction, logical equivalence rules are applied to the query to eliminate negative predicates and to convert quantified subqueries into existential subqueries, as will be discussed in further detail below. Once step 304 is completed, step 306 indicates that NULL protection is performed. Recall that a path expression resolves to the value of the leaf of the path expression, and that the correct interpretation of predicates is dictated by tri-valued logic. Predicates having path expressions are transformed as follows: p.sub.1 r k p.sub.1 r p.sub.2 in which p.sub.i is a path expression, k is a constant, and r is a relational operator among {=, .noteq., <, .ltoreq., >, .gtoreq.}. Positive expressions are transformed as follows: P.sub.1 r k.fwdarw.q.sub.1.m.sub.1 .noteq.NULL and q.sub.1.m.sub.1..m.sub.2 .noteq.NULL and . . . and q.sub.1.m.sub.1.., . . . , ..m.sub.n-1 .noteq.NULL and q.sub.1.m.sub.1.., . . . , ..m.sub.n r k p.sub.1 r p.sub.2 .fwdarw.q.sub.1.m.sub.1 .noteq.NULL and q.sub.1.m.sub.1..m.sub.2 .noteq.NULL and . . . and q.sub.1.m.sub.1.., . . . , ..m.sub.n-1 .noteq.NULL and q.sub.2.m.sub.1 .noteq.NULL and q.sub.2.m.sub.1..m.sub.2 .noteq.NULL and . . . and q.sub.2.m.sub.1.., . . . , ..m.sub.m-1 .noteq.NULL and q.sub.1.m.sub.1.., . . . , ..m.sub.n r q.sub.2.m.sub.1.., . . . , ..m.sub.m Note that it is assumed that the last nodes m.sub.n in the path expression above are not pointer types. Since any non-leaf node in a path expression can contain a NULL pointer, each node in a path expression must be checked for NULL values before proceeding to the successive one. For positive expressions, the NULL test ensures that bindings for predicates evaluating to NULL are not included in the result. It should be noted that if an existential subquery of the form "not exists Q" is part of the original query, then it is treated as a positive predicate as described above. However, if the existential subquery "not exists Q" is a result of being converted from an "ALL" quantified subquery, then its predicates are treated according to the following formulas for negative predicates: not (P.sub.1 r k).fwdarw.not (q.sub.1.m.sub.1 =NULL or q.sub.1.m.sub.1..m.sub.2 =NULL or . . . or q.sub.1.m.sub.1.., . . . , ..m.sub.n-1 =NULL or q.sub.1.m.sub.1.., . . . , ..m.sub.n r k) not (p.sub.1 r p.sub.2).fwdarw.not (q.sub.1.m.sub.1 =NULL or q.sub.1.m.sub.1..m.sub.2 =NULL or . . . or q.sub.1.m.sub.1.., . . . , ..m.sub.n-1 =NULL or q.sub.2.m.sub.1 =NULL or q.sub.2.m.sub.1..m.sub.2 =NULL or . . . or q.sub.2.m.sub.1.., . . . , ..m.sub.m-1 =NULL or q.sub.1.m.sub.1.., . . . , ..m.sub.n r q.sub.2.m.sub.1.., . . . , ..m.sub.m) Note that it is assumed that the last nodes m.sub.n in the path expression above are not pointer types. The NULL test ensures that bindings for predicates evaluating to NULL are not included in the result by returning TRUE if a NULL pointer is found. After reaching the scope of negation, it will become false. As shown above for the positive predicates, a NULL testing predicate is applied to each node along the path expression with each NULL testing predicate being AND'ed together. The semantic of the IS NOT NULL is interpreted as being FALSE if the path expression has a node which is a NULL pointer, and it is interpreted as being TRUE if the path expression does not have a node which is a NULL pointer. Because each node and its NULL testing predicate are AND'ed together, if there is a NULL pointer present, or if the predicate expression evaluates to FALSE, then the whole predicate expression evaluates to FALSE. As such, for positive predicates, the NULL test ensures that bindings for predicates evaluating to NULL are not included in the result. Since bindings for predicates evaluating to FALSE are not included in the result, both a NULL value and a FALSE value are handled similarly. As shown above for the negative predicate "not exist Q", a NULL testing predicate is applied to each node along the path expression with each NULL testing predicate being OR'ed together. The semantic of the IS NULL is interpreted as being TRUE if the path expression has a node which is a NULL pointer, and it is interpreted as being FALSE if the path expression does not have a node which is a NULL pointer. Because each node and its NULL testing predicate are OR'ed together, if there is a NULL pointer present, then the whole predicate expression evaluates to TRUE. After the scope of negation is applied, the expression evaluates to FALSE thereby ensuring that bindings for predicates having a NULL pointer are not included in the result. By applying the NULL transformations and interpreting the semantics as described above with respect to positive and negative predicates, the query can be processed, in a two-valued logic query system that does not support NULL values and in an environment that has NULL pointers, without an abnormal termination as a result of a traversal of a NULL pointer with a predicate having a subquery. NULL tests are added to all of the positive path predicates in the transformed predicate tree. As for negative predicates, the only negative predicates that survive the negation reduction process are predicates of the form "not exists Q." These must be treated carefully because a positive (universal) predicate is transformed into a negative (existential) one. Those that were generated by the transformations in Table 5, which convert universally quantified subqueries into negative existential subqueries, must be handled as a special case in order to produce the correct NULL semantics. It should be noted that if a "not exists Q" appears in the original query, it is handled as a positive predicate. In the case of a "not exists Q" subquery that was introduced by transformations in Table 5, the newly created predicate (a r.sup.-1 x.sub.1) was pushed into the body of Q. In the predicate, r.sup.-1 is the inverse comparator for r, and x.sub.1 is the first and only projection element. This predicate requires particular attention, as it must remain in the scope of negation with respect to NULL testing predicates. These are the only predicates to which the path transformations for negative predicates are applied. Other negative existential subqueries are handled similarly to positive ones (i.e., their predicates are considered to be positive with respect to the introduction of NULL testing predicates). For positive (universal) predicates that are transformed into negative (existential) predicates, the resulting negative existential subquery must return FALSE if the universal predicate is either FALSE or UNKNOWN. It must therefore return FALSE if (i) the specified comparison is not FALSE for any values returned by the original subquery and (ii) at least one comparison is unknown because of a NULL value. The first case is handled by the inverse predicate which is added to the negative existential subquery; this detects any bindings for which the comparison fails. The second case detects any NULL values, which cause at least one comparison to be UNKNOWN due to a NULL value. This will be clarified in the example below. Step 308 is a decision step determining whether all of the predicates have been selected. When all of the predicates have not been selected, then the next predicate is selected for negation reduction and NULL transformation. In step 310, the next subquery is selected, starting with the first subquery. In step 312, the transform query routine is performed on the subquery. This ensures that the predicates in subqueries have been properly handled. Step 314 is a decision step that determines whether all of the subqueries have been selected. When all of the subqueries have not been selected, then the next subquery is selected and transform query is performed on it. FIG. 4 is a flowchart further illustrating additional detail of the steps required to perform negation reduction (step 304 of FIG. 3) according to the present invention. Step 402 is a decision step that determines whether a predicate is in the scope of negation. In step 404, when the predicate is in the scope of negation, logical equivalence rules are applied to progressively push negation from the root of a predicate tree down to the leaves of the predicate tree. In particular, first logical equivalence rules are applied for negation reduction for predicate trees, as illustrated in Table 2 below.
TABLE 2
______________________________________
not (not (A)) A
not (A or B) (not (A) and not (B))
not (A and B) (not (A) or not (B))
______________________________________
Next, logical equivalence rules are applied for negation reduction of simple predicates, as illustrated in Table 3 below.
TABLE 3
______________________________________
not (a = b) a .noteq. b
not (a .noteq. b) a = b
not (a > b) a .ltoreq. b
not (a .gtoreq. b) a < b
not (a < b) a .gtoreq. b
not (a .ltoreq. b) a > b
not (a like b) a not like b
not (a not like b) a like b
not (a IS NULL) a IS NOT NULL
not (a IS NOT NULL b)
a IS NULL
______________________________________
Then, logical equivalence rules are applied for negation reduction for quantified subqueries, as illustrated in Table 4 below.
TABLE 4
______________________________________
not (a = ANY Q) a .noteq. ALL Q
not (a .noteq. ANY Q)
a = ALL Q
not (a > ANY Q) a .ltoreq. ALL Q
not (a .gtoreq. ANY Q)
a < ALL Q
not (a < Q) a .gtoreq. ALL Q
not (a .ltoreq. ANY Q)
a > ALL Q
not (a = ALL Q) a .noteq. ANY Q
not (a .noteq. ALL Q)
a = ANY Q
not (a > ALL Q) a .ltoreq. ANY Q
not (a .gtoreq. ALL Q)
a < ANY Q
not (a < ALL Q) a .gtoreq. ANY Q
not (a .ltoreq. ALL Q)
a > ANY Q
______________________________________
The last step involves applying logical equivalence rules to convert quantified subqueries into existential subqueries, as illustrated in Table 5 below. Note that x.sub.1 stands for the first (and only) projection element in the subquery Q.
TABLE 5
______________________________________
a = ANY Q exists (. . . and a = x.sub.1)
a .noteq. ANY Q exists (. . . and a .noteq. x.sub.1)
a > ANY Q exists (. . . and a > x.sub.1)
a .gtoreq. ANY Q exists (. . . and a .gtoreq. x.sub.1)
a < ANY Q exists (. . . and a < x.sub.1)
a .ltoreq. ANY Q exists (. . . and a .ltoreq. x.sub.1)
a = ALL Q not exists (. . . and a .noteq. x.sub.1)
a .noteq. ALL Q not exists (. . . and a = x.sub.1)
a > ALL Q not exists (. . . and a .ltoreq. x.sub.1)
a .gtoreq. ALL Q not exists (. . . and a < x.sub.1)
a < ALL Q not exists (. . . and a .gtoreq. x.sub.1)
a .ltoreq. ALL Q not exists (. . . and a > x.sub.1)
______________________________________
These transformations push the quantified predicate into the body of the subquery in order to convert the subquery from a quantified subquery into an existential subquery. A predicate of the form: "e r ANY (select x.sub.1 . . . " generates a newly created predicate of the form "e r x.sub.1 ", which is then AND'ed into the list of conjuncts in the existential subquery. Similarly, a predicate of the form: "e r ALL (select x.sub.1 . . . )" generates a newly created predicate of the form "e r.sup.-1 x.sub.1 " (where r.sup.-1 is the inverse comparator for r) which is then AND'ed into the existential subquery's list of conjuncts; in this case, the resulting existential subquery itself is negative. The details of the transformed existential subquery vary slightly from case to case. If the original subquery Q is not an aggregate form (i.e., Q has no aggregate function in its projection element and no group-by or having-clause), then the newly created predicate is AND'ed into the list of conjuncts in the existential subquery's where-clause. Otherwise, the newly created predicate is AND'ed into the having-clause (creating one if Q previously had no having-clause). Continuing with FIG. 4, step 406 is a decision step that determines whether the predicate has a left operand. In step 408, when the predicate has a left operand, the negation reduction routine is performed on the predicate and its left operand. Step 410 is a decision step that determines whether the predicate has a right operand. In step 410, when the predicate has a right operand, the negation reduction routine is performed on the predicate and its right operand. An Illustrative Example In the following example, the query selects employees who don't have any spouses with the same name as their own spouse in their departments.
______________________________________
select *
from Emp e
where not (e.spouse. .name = any (select
e1.spouse. .name from (e.dept. .emps)
e1 where e <> e1))
______________________________________
The transformations applied to this initial query are now detailed.
______________________________________
1. select *
from Emp e
where e.spouse. .name <> all (select
e1.spouse. .name from (e.dept. .emps) e1
where e <> e1))
2. select *
from Emp e
where not exists (select 1 from (e.dept. .emps)
e1
where e <> e1 and e.spouse. .name =
e1.spouse. .name)
3. select *
from Emp e
where not exists (select 1 from (e.dept. .emps)
e1
where (e <> e1 and
(espouse == NULL or
e1.spouse == NULL or
e.spouse. .name =
e1.spouse. .name)))
______________________________________
NULL values can appear in the following three places in the example query: (i) if "e.spouse" is a NULL pointer in "e.spouse..name", (ii) if "e1.spouse" is a NULL pointer in "e1.spouse..name", and (iii) if "e.dept" is a NULL pointer in "e.dept. . emps". Assume, in this example, that "name" is a character array and cannot be NULL. The first transformation eliminates negation by transforming the "ANY" quantified predicate into an "ALL" quantified predicate, and inverting the comparison operator. The second transformation converts the quantified predicate into a negative existential subquery by introducing a new predicate into the body of the subquery. The newly formed predicate is "e.spouse..name"="e1.spouse.name". Note that in the absence of NULL values, the negative existential subquery is equivalent to the "ALL" quantified predicate. The final transformation, which is actually performed later (during plan optimization), adds the NULL testing predicates that are required to implement the correct semantics of the query in the presence of NULL values (i.e., e.spouse == NULL or e1.spouse == NULL or . . . ). If "dept" in "e.dept.emps" is a NULL pointer, the interpretation of "e1" is the empty set. The Complete Method The transformations applied to an object-oriented SQL query to handle NULL values using a two-valued logic systems are presented in pseudocode below.
______________________________________
negation.sub.-- reduction (q subquery)
begin
for each conjunct p in q do
begin
reduce (p); null.sub.-- protection (p);
end
for each subquery qi in q do
begin
negation.sub.-- reduction (qi);
end
end
reduce (p predicate)
begin
if (p is in the scope of negation) then
begin
if (Table 2 transformations can be applied to
p) then
apply Table 2 transformations to p;
else if (Table 3 transformations can be
applied to p) then
apply Table 3 transformations to p;
else if (Table 4 transformations can be
applied to p) then
apply Table 4 transformations
to p;
if (Table 5 transformations can be
applied to p) then
apply Table 5 transformations to p;
end
if (p is a predicate that has a left operand)
then
reduce (left operand (p));
if (p is a predicate that has a right operand)
then
reduce (right operand (p));
end
null.sub.-- protection (predicate p)
begin
if (p is of the form q <relop> k) then
if (p is not in the scope of negation) then
p := q.m.sub.1 <> NULL and . . . and
q.m.sub.1 . . ., . . ., . . m.sub.n-1 <> NULL and
q.m.sub.i . . ., . . ., . . m.sub.n <relop> k;
else
p := q.m.sub.1 = NULL or . . . or
q.m.sub.1 . ., . . ., . . m.sub.n-1 = NULL or
q.m.sub.1 . ., . . ., . . m.sub.n <relop> k;
else if (p is of the form q1 <relop> q2) then
if (p is not in the scope of negation) then
p := q.sub.1.m.sub.1 <> NULL and . . . and
q.sub.1.m.sub.1 . ., . . ., . . m.sub.n-1 <> NULL and
q.sub.2.m.sub.1 <> NULL and . . . and
q.sub.2.m.sub.1 . ., . . ., . . m.sub.m-1 <> NULL and
q.sub.1.m.sub.1 . ., . . ., . . m.sub.n <relop>
q.sub.2.m.sub.1 . ., . . ., . . m.sub.m ;
else
p := q.sub.1.m.sub.1 = NULL or . . . or
q.sub.1.m.sub.1 . ., . . ., . . m.sub.n-1 = NULL or
q.sub.2.m.sub.1 = NULL or . . . or q.sub.2.m.sub.1 . ., . . ., . .
m.sub.m-1
= NULL or q.sub.1,m.sub.1 . ., . . ., . . m.sub.n <relop>
q.sub.2.m.sub.1 . ., . . ., . . m.sub.m ;
else
begin
if (p is a predicate that has a left operand)
then
null.sub.-- protection (left operand (p));
if (p is a predicate that has a right
operand) then
null.sub.-- protection (right operand (p));
end
end
.COPYRGT. Copyright 1997 IBM Corporation.
______________________________________
The recursive negation.sub.-- reduction procedure accepts a single parameter that is a subquery. The initial query is treated as a subquery since it is passed as a parameter to this function. The negation.sub.-- reduction procedure iterates through each AND'ed predicate in the query, and calls the reduce procedure to apply negation reduction on each predicate. For example, the predicate expression e.sal>100 and not (e.no <> 10) would generate two iterations. The first for predicate e.sal>100 and the second for predicate not (e.no <> 10). For a given subquery Q, once all predicates have been reduced, the method processes each subquery contained within Q by recursively calling negation reduction for each one. Once the reduce procedure has transformed a predicate p, the null.sub.-- protection procedure is applied to the predicate p to introduce the NULL pointer testing predicates to ensure i) that path expression traversal does not extend beyond NULL pointers, and ii) the correct interpretation of predicates that include them. The null.sub.-- protection procedure performs the null.sub.-- protection transformations described under the heading "Object-Oriented SQL Transformations". Subqueries contained within subqueries might also have predicates scoped by negation that need to be reduced in order to be interpreted correctly with respect to NULL values. The reduce procedure accepts a predicate p, over which it attempts to apply the transformations given in Tables 2 through 5. The reduce procedure handles predicates that are scoped by negation. Even if a predicate is not itself scoped by negation, it might include other predicates that are. The reduce procedure also includes recursive calls to the reduce procedure that handle operands of predicates that are also predicates. For example, calling reduce with the predicate e.sal>100 or not (e.no <> 10), would cause two recursive calls to the reduce procedure. The transformations performed using Table 2 push negation into the leaves of predicate trees. These leaves are further reduced by the transformations performed by Tables 3 to 5. However, the transformations performed using Table 4 produce subqueries that are applicable for further transformations using Table 5 (e.g., a negated "ALL" subquery is transformed into an "ANY" subquery form using Table 4 transformations, and then transformed into an "exists" subquery using Table 5 transformations). The IS ›NOT! NULL Predicate In SQL-92, discussed in ISO91, the "IS NULL" predicate, and it's negation, the "IS NOT NULL" predicate are used to test whether a column contains, or does not contain, a NULL value. The following is an interpretation of these predicates in a system that doesn't directly support such a notion. The truth value of an expression p IS ›NOT! NULL, where p is a path expression, is determined using the approach presented under the heading "Object-Oriented SQL Transformations", and is implemented using a method that is similar to the null.sub.-- protection method. For path expressions p, where the leaf of the expression is not a pointer type, the transformations are as follows: p IS NOT NULL.fwdarw.q.m.sub.1 .noteq.NULL and q.m.sub.1..m.sub.2 .noteq.NULL and . . . and q.m.sub.1.., . . . , ..m.sub.n-1 .noteq.NULL p IS NULL.fwdarw.q.m.sub.1 =NULL or q.m.sub.1..m.sub.2 =NULL or . . . or q.m.sub.1.., . . . , ..m.sub.n-1 =NULL Path expressions that are pointers types have an additional test for the leaf node as follows: p IS NOT NULL.fwdarw.q.m.sub.1 .noteq.NULL and q.m.sub.1..m.sub.2 .noteq.NULL and . . . and q.m.sub.1.., . . . , ..m.sub.n-1 .noteq.NULL and q.m.sub.1.., . . . , ..M.sub.n .noteq.NULL p IS NULL.fwdarw.q.m.sub.1 =NULL or q.m.sub.1..m.sub.2 =NULL or . . . or q.m.sub.1.., . . . . , ..m.sub.n-1 =NULL or q.m.sub.1.., . . . , ..m.sub.n =NULL Note that it is assumed that the last nodes m.sub.n in the path expression are not pointer types. CONCLUSION The foregoing description of the preferred embodiment of the invention has been presented for the purposes of illustration and description. It is not intended to be exhaustive or to limit the invention to the precise form disclosed. Many modifications and variations are possible in light of the above teaching. It is intended that the scope of the invention be limited not by this detailed description, but rather by the claims appended hereto.
|
Same subclass Same class Consider this |
||||||||||
