Method to help in optimizing a query from a relational data base management system, and resultant method of syntactical analysis5495605Abstract The method for help in optimizing a query (10) in a window (24) of a screen of a work station consists of forming a tree (26) representative of the execution plan of the query, representing the tree on the screen (25), making a syntactical analysis of the query in order to form a syntactical graph, comparing the elements of the tree with those of the graph, and completing the tree with elements contained only in the graph. The advantageous result is a method of syntactical analysis consisting of analyzing the function of the RDBMS in order to determine in particular the types of nodes of the syntactical graph and preferably the information associated with these type of nodes. Claims I claim: Description BACKGROUND OF THE INVENTION
______________________________________
typedef struct node.sub.-- {
int type;
char operation [30];
char options [30];
char object.sub.-- name [30];
char object.sub.-- owner.sub.-- [30];
int id;
int parent.sub.-- id;
int position;
struct node.sub.--
* brother;
struct node.sub.--
* child;
} Node;
______________________________________
Each time a tuple is rendered, one of the boxes 27 is created in the conventional manner, which is easy for one skilled in the art, for example using a graphical library function of the OSF/Motif software. Another exemplary embodiment adapted to the set of phases constituting the method of the invention will be seen later, with reference to the appended file named struct.h. In summary, in the tree 26 shown, the boxes 27 and their links 28 have been created by analysis of the query 10 in the table 11 that represents the execution plan. In the example shown in FIG. 3, the window 25 for showing the tree 26 is advantageously furnished by the tool DB*EXPLAIN under the Plan option and is accompanied by the name of the view and by the window 24 for editing the query 10 that have been shown in FIG. 2C. The tree 26 may incorporate all or some of the information resulting in the search for the execution plan of the query written in the RDBMS query language, such as the information contained in table 11 of FIG. 1. For example, each box 27 has as a heading the name of the operation to which this box relates. Hence in FIG. 3 the names NESTED LOOPS, TABLE ACCESS and INDEX, which are shown in the OPERATION column of table 11 in FIG. 1, are seen. If the operation constituting the heading of a box 27 has an option being executed for a given object, then the name of this option may be mentioned in the box, advantageously as a subheading. Similarly, if the operation mentioned in the heading in a box 27 has characteristics such as the name of the object and the name of the owner of the object to which the operation option applies, all or some of these characteristics may be indicated in the box. All of these advantageous options of the invention are contained in the tree 26 shown in FIG. 3. Thus in the thirteen boxes 27 of the tree 26, the names of the thirteen operations mentioned in table 11 are found, supplemented with the possible option that is executed for a given object, the name of the object, and the name of the owner of the object. The tree 26 affords the advantage of being easily understood by any user, regardless of his skill in the query language of the RDBMS, and of furnishing him a global overview of the execution plan of the query. The user seeing the tree 26 can thus more easily and quickly decide on what action to take to optimize the query that has been made. On the other hand, thanks to the tool DB*EXPLAIN, it has been seen that the user need not write the query and can easily modify it in the editing window 24 of FIG. 2C. After modification, the administrator can have the new execution tree very quickly, in order to find out how effective his modification is. The tool affords the additional advantage of saving the preceding tree and thus enables the administrator to compare the two trees easily and quickly. Naturally, these advantages may also be afforded without using the tool DB*EXPLAIN, and by using the conventional means for searching for the execution plan of a query in the query language of the RDBMS. The results of this search are shown in a table 11 in the RDBMS chosen by way of example, but it will be appreciated that they may be present in some other form, depending on the type of RDBMS. The analysis of these results will be easily adapted to their form by one skilled in the art in order to create an execution tree for the query. In another characteristic of the invention, other information besides that furnished by the analysis of the results of the request for the execution plan may be obtained. For example, it is possible to obtain additional information about certain possible characteristics of an operation. In the example shown, the name of the object on which the operation is executed, constituting the heading of a box 27 and optionally the owner of this object, are written in a button 29 inserted in the box, in such a way that by clicking on the button 29, a user can obtain additional information about this object. In the case where the operation that is executed on the object clicked on is a table (TABLE ACCESS in the boxes 27 in the example shown), the search for the additional information is made for example by means of an SQL request in the DBA.sub.-- TAB.sub.-- COLUMNS table of the Oracle dictionary. This table shows the structure of a table contained in the data base of the RDBMS, that is, the name of the columns and in particular with their types and sizes. The additional information can thus be all or some of the descriptive elements of the table mentioned in the box. It is accordingly possible to obtain the description of the table, for example, the name of the columns, and the type of the columns. By preference, the tool DB*EXPLAIN displays them in an appended window, not shown. If the operation being executed on the object clicked on is an index (INDEX in the boxes 27), then the additional information may be a histogram, like that shown in FIG. 4A. The information contained in this histogram has been obtained by the Oracle command VALIDATE INDEX. In response, Oracle creates a histogram table. On the basis of this histogram table and by means of a small program that is ordinary to one skilled in the art, a screen representing this histogram, like that shown in FIG. 4A, is formed. In the example shown in FIG. 4A, the user knows that his index includes 340 keys that are not repeated. The user immediately knows that the index chosen is very good. FIGS. 4B and 4C, respectively, show two other examples of histograms that may be obtained thanks to the invention. In the example of FIG. 4B, the user knows that the indexes used are less selective than before. FIG. 4C is a histogram representing a very unselective index which accordingly is to be changed. These examples highlight the advantages afforded by the method of the invention. Another phase of the method for helping optimize a query consists of making a syntactical analysis of the query. The syntactical analysis may be done by various methods known to one skilled in the art. In the example that will be described now, the syntactical analysis according to the invention is determined in general by analysis of the function of the RDBMS, in order in particular to determine the types of nodes of the syntactical graph, and preferably to determine the information associated with these types. Deep analysis of Oracle function makes it appear that execution of joins between tables and cartesian products is done without any order. It is possible to access a plurality of tables a plurality of times and to perform a plurality of operations for one SQL order. Sorts may be added or deleted. The projections and the tests are not marked. The only indicator that the RDBMS gives is the name of the tables and of the indexes accessed, and the order of citation of the tables in the FROM clauses of the source query. This is inadequate for full comprehension of a plan of execution that may contain several hundred nodes. This inadequacy, when it appears for a given type of RDBMS, is overcome by the method of syntactical analysis according to the invention. In that case, the syntactical analysis includes the determination of the operations executed by the RDBMS and the order in which these operations are executed. This order has the advantage of establishing a rigorous relationship between the query and the execution plan. In the case of the present RDBMS selected by way of example, the deep analysis of its operation has caused the following order to appear:
______________________________________
TABLE GROUP HAVING UNION ORDER,
______________________________________
where TABLE, for a query, represents both the access to the tables of the FROM clause, the cartesian product between these tables and all the tests of the WHERE clause; GROUP represents the GROUP BY clause and the calculations of the groups; HAVING represents the HAVING clause; UNION represents the operations UNION, MINUS and INTERSECT; and ORDER represents the ORDER clause. Oracle calculates the groups of a query and makes the GROUP BY operation when all the tables of the FROM clause have undergone a cartesian product and the restrictions of the WHERE clause. It executes the HAVING clause after having done the grouping operations at the same time as the projections. Next it performs the UNION operations, and then ORDER orders the furnishing of the final result. In accordance with another constraint, the UNION operations are executed in the order cited in the query. FIGS. 5A-5N by way of example illustrate conditions for determining the types of nodes of the syntactical graph based on the order relationship resulting from the deep analysis of the function of the RDBMS chosen. They also provide some examples of information that may be obtained from the deep analysis of the RDBMS function. By a conventional definition, a node represents a real or virtual table. In the example chosen, one can create a node: by table cited (FIG. 5A); for the grouping operations. These operations are placed in the nodes and are the groups of GROUP BY, UNIQUE (FIG. 5B); for the ORDER BY operation (FIG. 5C); by the FROM clauses, which may possess a plurality of tables. In this case, the real outgoing flow is the result of the cartesian product of these tables. The cartesian product is not an explicit operation in the syntactical graph and is shown here by the arrival of a plurality of flow arrows at the same node. However, if there are no grouping operations, then an empty node must be created (FIG. 5D); by a merging operation UNION, MINUS and INTERSECT. The type of operation is indicated (FIG. 5E). With respect to the oriented arcs: an oriented, nonlabeled arc represents the direction of the real flow of tuples (FIG. 5F); an oriented arc labeled by a test represents the direction of the logical flow of the tuples. In other words, for each tuple of the starting table, the set of tuples of the destination table that meets the condition is selected. The condition hence influences only the destination table of the flow (FIG. 5G). The result of the request is represented by a flow arrow without a destination node, leaving from the node that executed the last operation of the query (FIG. 5H). With respect to the projections: the projections of the SELECT clause are written close to the node from which the real flow leaves. If there is only one table in the FROM clause and no grouping operation, then the projections are marked under the node of the table (FIG. 5I); the projections are written under an empty node resulting from a cartesian product, or under a node created by grouping operations (FIG. 5J); the projections are written close to the arrow (flux or condition arrow) resulting from the projection (FIG. 5K). In the operations of the nodes, the projections and the condition labels, the operands originating from the entering flow are underlined (FIG. 5L). When a condition of the WHERE (or HAVING) clause includes only elements of this table and constants, then the conditions are marked under the node (FIG. 5M). Finally, the tests of a WHERE clause affect the nodes containing the name of the tables, and the tests of a HAVING clause affect the node containing the grouping operations (FIG. 5N). FIG. 6 illustrates an example of a graph 30 representing the syntactical analysis done in accordance with the criteria that have just be defined. This graph appears in the form of a network, which is the name by which this graph will henceforth be called. Now that the tree 26 of the execution plan (FIG. 3) and the network 30 of the syntactical analysis of the query (FIG. 6) have been obtained, the final step of the method of the invention consists of comparing them, to supplement the tree 26 with elements contained only in the network. In the example shown, the comparison of the tree with the network has been done in three phases: associating the nodes of the tree with the nodes of the network; placing the projections; and placing the tests. For example, to associate the nodes of the network with those of the execution plan, the operations of the DB*EXPLAIN tool, which are those of the RDBMS, have been classified into two groups depending on whether they are predictable or unpredictable. The predictable operations are those that ensue in the same order in the tree and in the network. It has been seen that the network is made up of only operations of this first group (TABLE, GROUP, UNION and ORDER node). The INDEX, TABLE ACCESS, UNION, MINUS, INTERSECT, SORT GROUP BY and SORT ORDER operations are also found in this group. FIG. 7 shows an exemplary association of predictable nodes of a network and of a tree. The operations constituting the second group are unpredictable when the order in which they appear in the query has not been preserved because a plurality of algorithms are possible each time. These are in particular the join operations NESTED LOOPS, FILTER and MERGE JOIN. The joins in the network appear in FIG. 11 as the links between the associated nodes. In general, the links between the associated predictable nodes are constituted with the unpredictable nodes. The placement of the projections consists of placing the projections on the proper nodes of the tree. A SELECT clause is not executable unless all the tables of the FROM clause are accessed, all the tests of the WHERE clause are executed, the groups are calculated, and the GROUP BY and HAVING clauses have been executed. In the network, the projections are done on the GROUP node, except if there is no cartesian product, group, or GROUP BY clause, or if the ORDER BY clause is added to the GROUP BY clause. Hence in the example of the tree 26 shown in FIG. 8 and representing the execution plan of the query shown in the margin of this drawing figure, the projection is done at the end of the SORT GROUP BY operation. The placement of the tests consists of placing the tests on the proper nodes of the tree. Several examples will illustrate this phase of associating the tree with the network. In a first example, it has been assumed that the tree 26 had the form illustrated in FIG. 9A and that the join between T1, T2, T3 and T4 was such that T1.c1+T2.c2=T3.c3+T4.c4. In this example, the remainder of the query does not matter. The location of the test depends on the algorithm used. In the first case illustrated in FIG. 9B, Op1, Op2 and Op3 represent a NESTED LOOPS operation, which has already been defined as a parametrized operation. The tuples brought back by the first son have passed to the second son. The test is then performed in sheet T4, as illustrated. In the second case illustrated in FIG. 9C, Op2, Op2 and Op3 represent a MERGE JOIN operation, which is a merging operation that does not require passage of parameters. The test is done in the MERGE JOIN Op3 node through which the flows of the four tables pass, as illustrated. It may be noted that the algorithm of Op1 has no influence on the location of the test. In the third and last case shown in FIG. 9D, Op2 is a MERGE JOIN operation, and Op3 is a NESTED LOOPS operation. The tuples of operation Op1 pass in parameter form into the lower branch of the operation Op3. Consequently, it is the MERGE JOIN operation, Op2, that performs the test. The second example relates to the SELECT nested clauses. Let it be assumed that the nesting . . . T1.c1 IN (SELECT T2.c2 . . . ) is as shown in the tree 26 of FIG. 10A. When the test is a nesting operation, the algorithm, is the FILTER clause. In that case, the IN test requires the prior execution of all the nested SELECT clauses. The test is accordingly done in FILTER. It is concluded by the fact that T1.c1 must have a value equal to a value originating in the second branch and translates as "IN flow of the second branch", as illustrated in FIG. 10B If the algorithm is NESTED LOOPS or MERGE JOIN, the RDBMS converts the nesting into normal joining in the presence of an IN or EXISTS test possessing a join between a nested SELECT clause and a non-nested SELECT clause. For example, if SELECT * FROM T1 WHERE Exists (Select * FROM T2 WHERE T1.c1 - T2.c2) then the RDBMS converts the IN or EXIST test into a join by modifying the projection: SELECT unique T1.* FROM T1, T2 WHERE T1.c1=T2.c2. Exemplary embodiments of the method of the invention will now be illustrated. These examples are based on a file written in C language to function in a Unix configuration. The file given by way of example is the file named struct.h appended to the end of the present specification and forming part of the specification. This file provides the definitions of all the structures specific to the network and presents itself in the following manner. In accordance with the C language, all the objects of the file are characterized by the same class, specified by the key word typedef specific to the construction of a new type. The type thus defined may then be used as a basic type. The file is divided into two parts relating to the two types composed, which are the enumerations and the structures of the C language. An enumeration defines a type by extension and declares itself with the help of the key word enum. The structures declare themselves with the help of the key word struct. It will also be recalled that the integral type is declared with the key word int, the character type by char, and the floating type by float. The qualifier const specifies that the object to which it is applied is a constant object which is not modifiable by the execution of the program. Comments are delimited by the sequences /* and /*. The first enumeration done in the file named struct.h has the name TYPE.sub.-- COL and defines the various types of columns that can be used in a table of the data base and by the structure s.sub.-- Field. The next enumeration, designated by TYPE.sub.-- CONST, defines the types of constants used in the structure s.sub.-- CONST. The enumeration TYPE.sub.-- EXPLAIN defines the types of operation of the RDBMS that have been summarized by the tool DB*EXPLAIN and that are used in the structure s.sub.-- Node. The enumeration TYPE.sub.-- EXPL.sub.-- OPT defines the types of operations relating to the operations enumerated in the preceding enumeration. These options are used in the s-Node structure. Certain operations of the sorting operation SORT have been defined above by example. The enumeration TYPE.sub.-- NODE defines the types of nodes of the network that are used in the structure s.sub.-- Node. The enumeration TYPE.sub.-- OPER defines the types of operation of the SQL language used in the arithmetical expressions and in the structure s.sub.-- Oper. The other enumerations relate successively to: the modes of the ORDER clause (TYPE.sub.-- ORDER) that are used in the structure s.sub.-- Order.sub.-- Col; the operations on queries (TYPE.sub.-- QUERYOP); the modes that relate to the SELECT command (TYPE.sub.-- SELECT) and are used by the s.sub.-- Node, s.sub.-- Oper and s.sub.-- SELECT structures; the types of each of the structures (TYPE.sub.-- STRUCT); the test types in a predicate (TYPE.sub.-- TEST) that are used in the s.sub.-- Test structure; the options (TYPE.sub.-- TESTOPT) of the tests of the SQL language that are used in the s.sub.-- Test and s.sub.-- Set structures; and the lists of tests in a node structure (TEST.sub.-- LIST). These enumerations are supplemented with the individual definitions indicated in the following lines. The second part of the struct.h file describes the various structures that are used. The description that follows furnishes additional comments to those of the file. The first field of each of the structures is defined in the enumeration TYPE.sub.-- STRUCT. The s.sub.-- Node structure is representative of a node of the tree that is to be supplemented with the elements of the network. The structure described in the file is an improved version of the structure mentioned above that was sufficient to construct the tree 26. The type of node may be a TABLE, a GROUP, an ORDER or an EXPLAIN or QUERY operation. The list named input.sub.-- nodes is the list of son nodes whose data flow enters into a node of the tree. The next field defines as an output node the one where the output flow from the node goes. The lists named input.sub.-- tests and output.sub.-- tests relate respectively to the input join tests, which are those influencing the result, and the output join tests, which are those using the values of the aforementioned output flow without influencing the result. The internal tests of the next list concern only that node. The information info indicates the structure of a node. For a table node, one points to an s.sub.-- Table; for a group node, one points to an s.sub.-- Group structure; for an order node, one points to an s.sub.-- Order structure; and a union node corresponds to a TYPE.sub.-- QUERYOP operation. The fields list is the list of all the pointers at the fields or projection operations. It represents the structure of the output flow from the node after the projection. The pointers of the list do not mark specific structures of that list. They mark either the s.sub.-- Field structures of the fields list of the s.sub.-- Table structure, if there is no further projection, or the structures of the proj list of the present structure. Depending on the last field nested, in the case of a node representing a semi-join between a normal SELECT and a NESTED select, it is necessary to know which of the two daughter branches of the node represents the nested SELECT. The s.sub.-- Oper structure well describes an arithmetical operation or a function call. The s.sub.-- Column structure is used for the syntactical analysis of a column. A column is what separates the commas in the SELECT (projection columns), GROUP BY (grouping columns), and ORDER BY (sorting column) clauses, and is found to the right and left of a non-nested test. For example, avg(table.field)+table.price * 1000 is a single column. A column is the root of a tree of s.sub.-- Oper, s.sub.-- Field, s.sub.-- Const and s.sub.-- Set structures. In the s.sub.-- Column structure, oper is a root operation. It may also involve a field or a constant. The funct list is that of the groups contained in the column. It has no structures of its own. This is a list of pointers to the s.sub.-- Oper structures contained in the tree, whose root is s.sub.-- Column. Finally, the fields list is that of the fields contained in the column. It too has no structures of its own. The s.sub.-- Const and s.sub.-- Field structures are well described. In the s.sub.-- From structure brought back by the syntactical analysis of the FROM clause, the tables list is the list of the table nodes that result from the syntactical analysis. There is one node per table cited. In the s.sub.-- Group structure, the funct list is the list of the groups to be calculated. This list does not have its own structures. It is a list of pointers to the s.sub.-- Oper structures that are present in the following lists of the following structures: the fields list of the s.sub.-- Group (the groups of the GROUP BY clause), the proj list of the s.sub.-- Node (the groups of the SELECT clause), and the test lists of s.sub.-- Node, if it is a grouping node (the groups of the HAVING clause). The s.sub.-- Having, s.sub.-- Order and s.sub.-- Order.sub.-- Col are well described. The s.sub.-- Select structure is brought back by the syntactical analysis of the SELECT clause. In the proj list of projection operations, each column in the projection corresponds to one element in the list. The funct list of the groups present in the projection does not have any structures of its own. They point to the s.sub.-- Oper structures present in the preceding list. Similarly, the fields list of the fields of the projection does not point to its own structures. The structure s.sub.-- Set is well defined in the file. It makes it possible to memorize a list of values. In the s.sub.-- Table structure, the object EXPLAIN gives the number of the EXPLAIN node of the execution tree. In the s.sub.-- Test structure representing a test, the fields list of the fields of the table or index does not have its own structures. These are those present in the tree of operands constituting the test. The s.sub.-- Where structure is brought back by the syntactical analysis of the WHERE clause. The nodes list is the list of total nodes created by the syntactical analysis of the nested SELECT clauses. The tests list combines the tests of the HAVING clause and the tests created by the syntactical analysis of the nested SELECT clauses. Finally, since for the syntactical analysis a s.sub.-- Where structure may be used at the same time as a syntactical analysis of the HAVING clause, the funct list contains the groups present in the tests. This list does not point to its own structures. The final structure, s.sub.-- Connect, is brought back by the syntactical analysis of the CONNECT privilege of the SQL. The first illustration of the use of the struct.h file relates to the example of a query cited in the introduction: SELECT Type FROM Type.sub.-- Car WHERE Mark=`XXX` In that case, the s.sub.-- Select structure would be used for the construction of the tree, like that described above. For the syntactical analysis of the query, the model name XXX is found in s.sub.-- Const, which would contain value=`XXX` and type=CONST.sub.-- STRING with the enumeration TYPE.sub.-- CONST. The s.sub.-- Set structure may be nested, if Mark is to be chosen between XXX, YYY and ZZZ. On the other hand, in the query, the term Mark in WHERE corresponds to the s.sub.-- Field structure. In s.sub.-- Field, table.sub.-- name=Type.sub.-- Car, name=Mark, and type would be the type of the column defined by COL.sub.-- STRING in the enumeration TYPE.sub.-- COL. In the query, WHERE Mark =`XXX` refers to s.sub.-- Test and s.sub.-- Where. In s.sub.-- Test, the type of test defined in TYPE.sub.-- TEST would be TEST.sub.-- EQ, signifying that the test consists of an equality. Otherwise, in the case of nonequality, for example, the NOT option would be chosen. The first operand would be Mark corresponding to s.sub.-- Field, and the second operand would be the constant XXX determined in s.sub.-- Const, as has been seen above. The list of test fields would include only Mark, but could also include Price, for example. In s.sub.-- Where, the test will be chosen from the list of possible tests that are contained in the WHERE clause. Similarly, in the nodes list, the table containing Mark will be chosen. The structures relating to the clauses CONNECT, FROM and HAVING are not used in the example in question. However, the previous example and the struct.h file will suffice for one skilled in the art to use them correctly without ambiguity or problems. Finally, one points to s.sub.-- Global to obtain the final graph. "Term" designates the terminal node of the final graph, that is, the box at the top of the graph at which all the other nodes begin that are found in the nodes list resulting from the tests contained in the query, that is, the test of the WHERE clause in the query that has been used as an example. The representation of the network with the data structures that have just been described will now be done with the aid of the illustrative example of the next query, which is also reproduced in FIG. 11: SELECT type avg (Price) FROM TYPE, MARK WHERE TYPE.mark=MARK.mark HAVING avg (Price)>1000 In accordance with the above teaching, the resultant network 31 has the form illustrated in FIG. 11. To more clearly represent the network with the data structures, the various fields of the structures are shown in the various FIGS. 12A-12H. By convention, the arrows in these figures represent pointers. A pointer is the field that is located at the beginning of the arrow. When the pointer is a list, each object on the list is connected to the next by an arrow. The order of the arrows represents the order of the list. FIG. 12A shows the fields marking all the nodes and all the tests of the network. In the s.sub.-- Global structure, the nodes list points successively to all the three s.sub.-- Node structures representing the three nodes of the network, while the tests list points successively to the two s.sub.-- Test structures representing two tests involved in the network, that is, the test of relative superiority to the first node, and the test of equality between the other two nodes. FIG. 12B illustrates the fields that mark the direction of flow. In the s.sub.-- Global structure, the "term" node representing the terminal node from which the result originates points to the s.sub.-- Node structure representing the terminal node. In s.sub.-- Node, the list of input nodes, which is named input.sub.-- nodes, points to the other two structures s.sub.-- Node. The output nodes, named output.sub.-- node, of these two structures point to the s.sub.-- Node structure of the terminal node, in which the output node points to s.sub.-- Global. FIG. 12C illustrates the field describing the links between tests and nodes. The test of superiority relates only to the terminal node and hence constitutes an internal test. Consequently, the list of internal tests, named intern.sub.-- tests, of the s.sub.-- Node structure of the terminal node points to the s.sub.-- Test structure, in which the nodes list, i.e., the list of nodes that have a field in the test, points to the structure s.sub.-- Node. The test of equality relates to the other two nodes. Consequently, the input.sub.-- tests list of the s.sub.-- Node structures of these two nodes point of the s.sub.-- Test, in which the nodes list points to the two s.sub.-- Node structures. FIG. 12D relates to the type of nodes. The terminal node is a grouping node and causes a projection to take place. The type of node in s.sub.-- Node is accordingly NODE.sub.-- GROUP representing a GROUP BY clause, and the info field indicating the structure of the node points to the grouping structure s.sub.-- Group, in which the "funct" list of groups to be calculated points to the calculation s.sub.-- Oper structure. On the other hand, the proj list of projections of the s.sub.-- Node structure points to the field structure s.sub.-- Field, which points to s.sub.-- Oper in the manner described in detail in conjunction with FIG. 12E. The other two nodes relate to tables and accordingly have the type NODE.sub.-- TABLE, and their info field, which indicates the structure of the node, clearly points to the corresponding table TYPE and MARK. FIG. 12E relates to the projections. The example shown in FIG. 11 causes two projections relating to the terminal node to take place. In the corresponding s.sub.-- Node structure, the list of projections proj points first to the field s.sub.-- Field structure, in which the name of the table, table.sub.-- name="TYPE", and the name of the object is name=type. Next it points to the calculation structure s.sub.-- Oper, whose type is AVG (type=OPER.sub.-- AVG), and whose first operand, named "first", points to the s.sub.-- Field structure, in which table.sub.-- name="TYPE", and the object name is name="price". Finally, FIG. 12F relates to the tests. The example shown brings about two types of test, the test of superiority (>), and the test of equality (=). For the test of superiority, in the corresponding s.sub.-- Test test structure, type=TEST.sub.-- GT, without option (option=TESTOPT.sub.-- EMPTY) and without nesting (nested=NO). In a similar way to that shown in FIG. 12E, the first operand points to the structure s.sub.-- Oper, whose operation type is AVG and whose first operand points to the s.sub.-- Field structure, in which table.sub.-- name="TYPE" and the object name is name="price", this structure having been designated by the fields list of the corresponding test structure, s.sub.-- Test. The second operand is the constant 1000 furnished by the s.sub.-- Const structure, in which the type is an integral number (type=CONST.sub.-- INT) and the value is 1000 (value=1000). For the test of equality, one refers to the test structure s.sub.-- Test, where type=TEST.sub.-- EQ, without option and without nesting. The fields list points to the s.sub.-- Field structure representing the table where table.sub.-- name="TYPE" and the object name is name="mark", and then to the s.sub.-- Field structure representing the table where table.sub.-- name="MARK" and the object is "mark". These two structures respectively constitute the first operand, first, and the second operand, second, of the test to be performed. Thanks to the network 31 of FIG. 11, one knows that projections on [or about] TYPE and AVG (price) and the restriction rest AVG (price)>1000 exist in the node terminal, and that the other two nodes are involved in the equality test (TYPE.mark)=MARK.mark. These elements of the query were incapable of appearing in the execution plan of the query, and advantageously, all or some of them can supplement the execution tree 26. FIG. 13 shows the complete tree 32 for execution of the query 10 shown in FIG. 2C resulting from employing the method of the invention. The complete tree will be compared with the tree 26 shown in FIG. 3 obtained simply by analysis of the execution plan of the table shown in FIG. 1. In the complete tree 32, the boxes 27 contain the descriptive elements of all the corresponding operations. Hence, beginning with the last box that appears on the right of the screen shown in FIG. 13, it can be seen that the operation INDEX, whose option is UNIQUE SCAN and whose object is SYS.I.sub.-- OBJ# (as indicated in FIG. 3) corresponds to the second condition of the WHERE clause in the query 10. This box is connected to the one affecting the TABLE ACCESS operation, whose option is CLUSTER and whose object is SYS.CLU# (as indicated in FIG. 3), which in the query 10 has been given the synonym (alias) C. Similarly, one knows that in the upper box connected to the same operation NESTED LOOPS, the object SYS.OBJ# has the synonym O. Continuing, from the tree 26, one also knows the following: with respect to the third and next-to-last NESTED LOOPS operation, that the object in (TABLE ACCESS) has the synonym S and is involved in the fourth test under the WHERE clause of the query, and that the daughter box bearing on INDEX is concerned with the last two tests of the WHERE clause of the query; with respect to the second NESTED LOOPS operation, that the object SYS.TS# has the synonym TS and that the INDEX operation is involved in the third test mentioned under the WHERE clause of the query; and finally, on the subject of the first NESTED LOOPS operation which appears in the lower left corner, that all the projections bearing on this operation are now known, that the object SYS.USERS# in the box has the synonym U, and that the index is subjected to the first test of the WHERE condition of the query. This example is a good illustration of the fact that the complete tree 32 can provide a complete description of any query. It is understood that the analysis may be limited to certain elements, in order to make a more or less complete description. It is understood that further variants are possible to one skilled in the art, depending on the type of RDBMS and on its environment. In a general way, the steps of forming two graphs representing the tree and the network, respectively, may be done in any arbitrary order, and that comparison could be done differently from that described, which consists of associating the nodes of the tree with the nodes of the network and placing the projections and the tests. Similarly, the criteria for syntactical analysis and comparison may vary greatly. However, the criteria described optimize the method of the invention and constitute the preferred embodiment for the RDBMS selected as an illustrative example. ##SPC1##
|
Same subclass Same class Consider this |
||||||||||
