Incremental maintenance of materialized views containing one-to-one lossless joins6134543Abstract A method and apparatus are provided for performing incremental refreshes to materialized views defined by one-to-one lossless joins. Each base table of the materialized view is selected to be processed as the current "selected table". During the processing, if the selected table is the right table of an outer join, then selected columns of rows of the materialized view are set to NULL based on rows of the selected table that have been updated or deleted in the selected table after a prior refresh operation. Then, NULL values in selected columns of the materialized view are updated to reflect new values caused by updates and inserts into the selected table that occurred after the prior refresh operation. If the selected table is not the right table of an outer join, then rows from the materialized view are deleted based on rows of the selected table that have been updated or deleted in the selected table after the prior refresh operation. Then, rows are inserted into the materialized view based on updates and inserts into the selected table that occurred after the prior refresh operation. The incremental refresh techniques are "memoryless" in that they do not require a record of the sequence of changes that were made subsequent to the prior refresh operation. Techniques are described for performing the incremental refresh steps through the use of database commands and queries. Claims What is claimed is:
______________________________________
update MV
set
columns-of(OJGraph(Ti) = null
where
MV.Ti.sub.-- rid in (select rid from Dlt.sub.-- Ti)
______________________________________
wherein MV is the materialized view, OJGraph(Ti) is the part of the join query that defines the materialized view that is reachable from the selected table by either inner joins or left outer joins, MV.Ti.sub.-- rid is a column in the materialized view that stores unique identifiers for rows from the selected table, and Dlt.sub.-- Ti is the set of changed rows from the selected table. 9. The method of claim 1 wherein the step of updating null values is performed by performing operations specified in the following statement:
______________________________________
update MV
(select
mv-columns-of(OJGraph(Ti)), columns-of(OJGraph(Ti))
from
tables-of(POJGraph(Ti)), MV
where
join-conditions-of(POJGraph(Ti)) and
Ti.rowid in (select rid from Dlt.sub.-- Ti) and
MV.Tj.sub.-- rid = Tj.rowid
) v
set
columns-of(OJGraph(Ti)) = columns-of(OJGraph(Ti))
______________________________________
wherein MV is the materialized view, Ti represents the selected table; Ti is the right side of the join Tj.fwdarw.Ti, OJGraph(Ti) is the part of the join query that defines the materialized view that is reachable from the selected table by either inner joins or left outer joins, POJGraph(Ti) is the subgraph Tj.fwdarw.OJGraph(Ti), MV.Ti.sub.-- rid is a column in the materialized view that stores unique identifiers for rows from the selected table, and Dlt.sub.-- Ti is the set of changed rows from the selected table. 10. The method of claim 1 wherein the step of deleting rows is performed consistent with the statement: delete MV where MV.Ti.sub.-- rid in (select rid from Dlt.sub.-- Ti); wherein MV is the materialized view, MV.Ti.sub.-- rid is a column in the materialized view that stores unique identifiers for rows from the selected table, and Dlt.sub.-- Ti is the set of changed rows from the selected table. 11. The method of claim 1 wherein the step of inserting rows is performed by executing a modified view definition query, wherein the modified view definition query is the view definition query for the materialized view in which references to the selected table are replaced by references to a row source that includes the rows that have been updated and inserted into the selected table since the prior refresh operation. 12. A computer-readable medium bearing instructions for performing an incremental refresh of a materialized, lossless, one-to-one join view, the instructions including instructions for performing the steps of: establishing a base table of the materialized view as a selected table; if the selected table is the right table of an outer join, then performing the steps of setting to NULL selected columns of rows of the materialized view based on rows of the selected table that have been updated or deleted in the selected table after a prior refresh operation; updating NULL values in selected columns of the materialized view to reflect new values caused by updates and inserts into the selected table that occurred after the prior refresh operation; if the selected table is not the right table of an outer join, then performing the steps of deleting rows from the materialized view based on rows of the selected table that have been updated or deleted in the selected table after the prior refresh operation; and inserting rows into the materialized view based on updates and inserts into the selected table that occurred after the prior refresh operation. 13. The computer readable medium of claim 12 wherein the steps of setting to NULL, updating NULL values, deleting rows and inserting rows are performed by executing queries on tables. 14. The computer readable medium of claim 12 further including instructions for performing the steps of: storing change data that indicates changes made to base tables of the materialized view without tracking the order of the changes, identifying a set of changed rows based on the change data, the set of changed rows including rows of the selected table that have been changed since a previous refresh operation; performing the step of deleting rows from the materialized view based on the set of changed rows; identifying a set of still-existing changed rows, said set of still-existing changed rows including those rows where changed after the prior refresh operation but still exist in the selected table; and performing the step of inserting rows into the materialized view based on the set of still-existing changed rows. 15. The computer readable medium of claim 12 further comprising instructions for performing the steps of: determining whether, since the prior refresh operation, the only changes made to the selected table were inserts; if the selected table is the right table of an outer join and the only changes made to the selected table after the prior refresh operation were inserts, then skipping the step of setting to NULL selected columns; if the selected table is not the right table of an outer join and the only changes made to the selected table after the prior refresh operation were inserts, then skipping the step of deleting rows from the materialized view. 16. The computer readable medium of claim 12 further comprising instructions for performing the steps of: determining whether, since the prior refresh operation, the only changes made to the selected table were deletes or drop partitions; if the selected table is the right table of an outer join and the only changes made to the selected table after the prior refresh operation were deletes or drop partitions, then skipping the step of updating NULL values; if the selected table is not the right table of an outer join and the only changes made to the selected table after the prior refresh operation were deletes or drop partitions, then skipping the step of inserting rows into the materialized view. 17. A database system including: a plurality of base tables; a one-to-one lossless materialized join view derived from said base tables; a mechanism for identifying changes that occur to said base tables after a prior refresh operation; a database server configured to incrementally refresh said materialized view by performing the steps of: establishing a base table of the materialized view as a selected table; if the selected table is the right table of an outer join, then performing the steps of setting to NULL selected columns of rows of the materialized view based on rows of the selected table that have been updated or deleted in the selected table after a prior refresh operation; updating NULL values in selected columns of the materialized view to reflect new values caused by updates and inserts into the selected table that occurred after the prior refresh operation; if the selected table is not the right table of an outer join, then performing the steps of deleting rows from the materialized view based on rows of the selected table that have been updated or deleted in the selected table after the prior refresh operation; and inserting rows into the materialized view based on updates and inserts into the selected table that occurred after the prior refresh operation. 18. The database system of claim 17 wherein the database server is configured to perform the steps of setting to NULL, updating NULL values, deleting rows and inserting rows are performed by executing queries on tables. 19. The database system of claim 17 wherein the database server is further configured to perform the steps of: storing change data that indicates changes made to base tables of the materialized view without tracking the order of the changes, identifying a set of changed rows based on the change data, the set of changed rows including rows of the selected table that have been changed since a previous refresh operation; performing the step of deleting rows from the materialized view based on the set of changed rows; identifying a set of still-existing changed rows, said set of still-existing changed rows including those rows where changed after the prior refresh operation but still exist in the selected table; and performing the step of inserting rows into the materialized view based on the set of still-existing changed rows. 20. The database system of claim 17 wherein the database server is further configured to perform the steps of: determining whether, since the prior refresh operation, the only changes made to the selected table were inserts; if the selected table is the right table of an outer join and the only changes made to the selected table after the prior refresh operation were inserts, then skipping the step of setting to NULL selected columns; if the selected table is not the right table of an outer join and the only changes made to the selected table after the prior refresh operation were inserts, then skipping the step of deleting rows from the materialized view. 21. The database system of claim 17 wherein the database server is further configured to perform the steps of: determining whether, since the prior refresh operation, the only changes made to the selected table were deletes or drop partitions; if the selected table is the right table of an outer join and the only changes made to the selected table after the prior refresh operation were deletes or drop partitions, then skipping the step of updating NULL values; if the selected table is not the right table of an outer join and the only changes made to the selected table after the prior refresh operation were deletes or drop partitions, then skipping the step of inserting rows into the materialized view. Description FIELD OF THE INVENTION
______________________________________
MV: select
<sel-cols>
from
T1, T2, T3, . . . , Tn
where
<join.sub.-- preds>
______________________________________
In this query definition, T1, T2, . . . , Tn are the base tables of MV. <sel-cols> are the columns in the select clause. <join.sub.-- preds> are the join predicates of the form T1.a=T2.a or T1.a=T2.a(+). Join predicates are in a conjunctive form. The MV stores rowids of the base table columns that are listed in <sel-cols> and the join columns that are listed in <join.sub.-- preds>. The notation MV.Ti.sub.-- rid refers to the column in MV that stores the rowid values from a particular table Ti. The rowid value stored in the rowid column of a given row in the MV is used to uniquely identify the row in Ti that combined to form the given row in the MV. Consequently, any other column or set of columns of Ti that is subject to uniqueness and non-null constraints may be used for this purpose instead of the rowid column. OJGraph(T1) is used to refer to the part of the join of a query Q that is reachable from T1 by either inner joins or left outer joins. For example, given the join S.rarw.L.fwdarw.O.fwdarw.C, OJGraph(O) is equal to O.fwdarw.C. When the same table is not allowed to be the right side of two different left outer joins, the following statement holds true: if there exists a Tj, such that Tj.fwdarw.Ti, then Tj.fwdarw.OJGraph(Ti) is a valid subgraph of the Join-graph of Q and furthermore this graph is unique for a given Ti. The subgraph Tj.fwdarw.OJGraph(Ti) is indicated by the notation POJGraph(Ti). For example, given the join S.rarw.L.fwdarw.O.fwdarw.C, POJGraph(O) is equal to L.fwdarw.O.fwdarw.C. If Ti is not the outer table of any outer join then POJGraph(Ti)=OJGraph(Ti) The notation .about.OJGraph(Ti) is used to refer to the complement of OJGraph(Ti). Thus, .about.OJGraph(Ti)=Join.sub.-- Graph(Q)-OJGraph(Ti). Dlt.sub.-- Tn is a table which contains the rowids of the rows in table Tn which have changed. Dlt.sub.-- Tn is referred to herein as the "delta table" for table Tn. VTn denotes a view on table Tn which produces the rows from Tn which have changed. SCN refers to a System Change Number. A system change number is a logical number assigned to transactions in commit time order. Each change is associated with the scn of the transaction that performed the change. The scn associated with a change indicates the logical time at which the change was made within the database. THE INCREMENTAL REFRESH OPERATION Referring to FIG. 5, it is a flow chart that illustrates the steps involved in an incremental refresh operation according to one embodiment of the invention. Incremental refresh is performed by applying Dlt.sub.-- Ti to MV. According to one embodiment, there is a partial order in application of Dlt.sub.-- Ti. Specifically, if Ti is a right side of an outer join Tj.fwdarw.Ti, then Dlt.sub.-- Tj is applied before Dlt.sub.-- Ti. At step 500, the leftmost unprocessed table listed in the join query that defines the materialized view to be refreshed is selected. Thus, if the join query is T1.fwdarw.T2, and neither T1 nor T2 have yet been processed, then T1 is selected during step 500. During the second iteration of step 500, T1 will have already been processed, so T2 will be selected. At step 501, the membership of the rowid set Dlt.sub.-- Ti is determined. The rowid set Dlt.sub.-- Ti includes the rowids of all rows of the selected table that have undergone any change since the last refresh operation. Various techniques may be used to establish Dlt.sub.-- Ti. For the purposes of explanation, it shall be assumed that the database system maintains a "delta table" SNLog.sub.-- Ti that stores, for each changed row of table Ti, (1) the rowid of the changed row, and (2) the scn that indicates the logical time at which the row was changed. Assuming that the last refresh of a materialized view occurred at a time "Last.sub.-- Refresh.sub.-- SCN" and that the current refresh is intended to update the materialized view to reflect the time "Current.sub.-- Start.sub.-- SCN", the membership of Dlt.sub.-- Ti may be established according the database query: Dlt.sub.-- Ti=select unique rid from SNLog.sub.-- Ti where scn>:Last.sub.-- Refresh.sub.-- SCN and scn<:Current.sub.-- Start.sub.-- SCN; After the membership of Dlt.sub.-- Ti is established, control passes to step 502. At step 502, the changed rows that still exist in the currently selected table arc retrieved. This set of these rows is referred to as VTi. VTi may be established using t he database query: VTi=select Ti.* from Ti, Dlt.sub.-- Ti where Ti.rowid=Dlt.sub.-- Ti.rid; Some changed rows may no longer exist in the selected table at the ti me the incremental refresh of the materialized view is being performed. Specifically, changed rows include rows that have been deleted, as well as rows that have been updated and inserted. The rowids of deleted rows will be returned in Dlt.sub.-- Ti, but the deleted rows themselves will not be part of VTi, since VTi is generated by executing a query on the selected table after the deleted rows have been removed from the table. After the membership of VTi is established, control passes to step 504. At step 504, it is determined whether the selected table is the right side of an outer join. If the selected table is the right side of an outer join, control passes to step 506. If the selected table is not the right side of an outer join, control passes to step 510. At step 506, the columns of MV for rows derived from the rows that are in Dlt.sub.-- Ti are set to NULL. Setting to these rows to NULL is required by semantic of the outer join. Step 506 may be performed according to the following update statement:
______________________________________
update MV
set
columns-of(OJGraph(Ti)) = null
where
MV.Ti.sub.-- rid in (select rid from Dlt.sub.-- Ti)
______________________________________
Control proceeds from step 506 to step 508. During step 508, the rows of MV that are in the delta log of the selected table are recalculated. This recalculation may be performed, for example, according to the following update statement:
______________________________________
update MV
(select
mv-columns-of(OJGraph(Ti)), columns-of(OJGraph(Ti))
from
tables-of(POJGraph(Ti)), MV
where
join-conditions-of(POJGraph(Ti)) and
Ti.rowid in (select rid from Dlt.sub.-- Ti) and
MV.Tj.sub.-- rid = Tj.rowid
) v
set
columns-of(OJGraph(Ti)) = columns-of(OJGraph(Ti))
______________________________________
This update statement uses an updatable view "v". According to one embodiment of the invention, a table T in view V is updatable, if the join of T to the rest of tables in V is one-to-one lossless. After calculating the rows in MV at step 508, control proceeds to step 514. At step 514, it is determined whether any base tables of the MV have not yet been processed. If any tables have not yet been processed, control returns to step 500 and the next unprocessed base table is processed. If all base tables have been processed, control proceeds to step 516. As mentioned above, control proceeds from step 504 to step 510 if the selected table is not a right side of an outer join. Therefore, control proceeds to step 510 when Ti is a left side on an outer join Ti.fwdarw.Tj or is either side of an inner join Ti><Tj. At step 510, rows in the materialized view that are derived from rows in the selected table that have been changed are removed from the materialized view. This removal can be performed according to the statement: delete MV where MV.Ti.sub.-- rid in (select rid from Dlt.sub.-- Ti); Control proceeds from step 510 to step 512. At step 512, the rows that were removed in step 510 are recalculated. This recalculation may be performed according to the following statement:
______________________________________
insert into MV
select
<sel-cols> . . . with all Ti replaced by VTi
from
VTi, T2, T3, . . . , Tn
where
<join-preds> ; . . . with all Ti replaced by VTi
______________________________________
Control proceeds from step 512 to step 514. At step 514, it is determined whether there are any unprocessed base tables of the materialized view. If any base tables have not yet been processed, then control returns to step 500. Otherwise, control proceeds to step 516, and the incremental refresh process is complete. EXAMPLE For the purpose of explanation, an exemplary application of the incremental maintenance process set forth above shall be given with reference to tables X and Y, illustrated in FIG. 4a. Referring to FIG. 4a, table X includes two columns x.a and x.b and a pseudo-column "X.rowid". The pseudo-column contains a unique identifier for each row in table X. The pseudo-column may, for example, simply represent the addresses of the actual storage locations at which the rows are stored. Similar to table X, table Y includes two columns y.a and y.b and a pseudo-column Y.rowid. FIG. 4a represents the tables X and Y at a particular point in time (T1). At time T1, table X has five rows, with corresponding rowids of x1 to x5, and table Y has three rows, with corresponding rowids y1 to y3. FIG. 4b illustrates a materialized join view MV 400 created by performing ajoin on tables X and Y. The defining query for MV 400 is: QUERY3 SELECT * FROM X, Y WHERE x.a (+)=y.a; The MV 400 illustrated in FIG. 4b is current as of time T1. Therefore, it accurately represents the results produced by executing Query3 on tables X and Y at time T1. MV 400 includes the columns of the base tables X and Y, as well as the pseudo-columns X.rowid and Y.rowid. Query3 is an outer join. Consequently, the join produced by applying the query to tables X and Y are lossless (all rows of table X are reflected in MV 400). In addition, no row of X combines with more than one row of Y. Therefore, the join that created MV 400 is also a one-to-one join. FIG. 4c illustrates changes made to tables X and Y between time T1 and a later time T2. Various mechanisms may be employed to identify these changes at the time an incremental refresh is to be performed. For example, snapshot logs may be used to record the changes that have been made to base tables after the most recent MV refresh. The present invention is not limited to any particular mechanism for recording changes to base tables. In the illustrated example, five changes were made to table X and three changes were made to table Y between T1 and T2. The changes to table X include two insert operations, two delete operations, and one update operation. The changes to table Y include one insert operation, one delete operation, and one update operation. FIG. 6 illustrates tables X and Y at time T2 after the changes have been made. At time T2, the technique described above is used to incrementally refresh MV 400 to accurately reflect the state of tables X and Y at time T2. The sequence of the incremental refresh would proceed as follows: SELECTED TABLE=TABLE X At step 500, the leftmost unprocessed table listed in the join query that defines the materialized view to be refreshed is selected. In the present example, none of the tables have been processed, and table X is the leftmost table in the join query, therefore table X is selected. At step 501, the membership of Dlt.sub.-- X is determined. Dlt.sub.-- X is the set of rowids of the rows that have been changed in table X since the last refresh of MV 400. For the purposes of explanation, it shall be assumed that the database system maintains a "delta table" SNLog.sub.-- X that stores, for each changed row of table X, (1) the rowid of the changed row, and (2) the scn that indicates the logical time at which the row was changed. The rowids of the changed rows of table X may be retrieved using the database query: Dlt.sub.-- X=select unique rid from SNLog.sub.-- X where scn>:T1 and scn<:T2; In this query, scn stands for the system change number associated with changes that are recorded in SNLog.sub.-- X. As mentioned above, the system change number is a logical number assigned to transactions in commit time order, and which applies to all changes made by those transactions. Executing this query returns a set of rowids Dlt.sub.-- X whose membership consists of the rowids X6, X3, X7, and X2. Once the rowids of the changed rows in table X are identified, control proceeds to step 502. In step 502, VX is calculated. VX is the set of changed rows of table X that still exist in table X. VX may be calculated using the database query: VX=select X.* from X, Dlt.sub.-- X where X.rowid=Dlt.sub.-- X.rid; This query, executed against the state of table X at time T2 (FIG. 6), returns two rows: <X3, 7, 0> and <X6, 5, 1>. No rows are returned for the rowids X7 and X2 because those rows do not exist in table X at time T2. Therefore, at the end of step 502, Dlt.sub.-- X=X2, X3, X6 and X7 VX=<X3, 7, 0> and <X6, 5, 1> After VX is generated, control passes to step 504. At step 504, it is determined whether table X is the right side of an outer join. Table X is not the right side of an outer join, so control passes to step 510. At step 510, rows in the materialized view that are derived from rows in table X that have been changed are removed from the materialized view 400. This removal can be performed according to the statement: delete MV where MV.X.sub.-- rid in (select rid from Dlt.sub.-- X); This delete statement causes the rows in MV 400 that have X2, X3, X6 or X7 in column X.sub.-- rid to be deleted. The deletion of these rows from MV 400 leaves three rows remaining in MV 400. The three rows remaining in MV 400 after this deletion are: <X1, 1, 1, Y1, 1, 0> <X4, 4, 4, Y3, 4, 1> <X5, 9, 0, NULL, NULL, NULL> Control proceeds from step 510 to step 512. At step 512, the rows that were removed from MV 400 in step 510 are recalculated. To perform this recalculation, the query that defines MV is re-executed after replacing references to table X with references to the view VX. Since the view VX will typically contain many orders of magnitude fewer rows than table X, execution of the join using VX will typically involve a much lower overhead than would be required to perform a total refresh. This recalculation may be performed according to the following statement:
______________________________________
insert into MV
select
*
from
VX, Y
where VX.a(+)= y.a
______________________________________
The two rows in VX, namely <X3, 7, 0> and <X6, 5, 1>, do not have values in column VX.a that combine with any values in column y.a of Table Y. Therefore, since the join is an outer join, the rows in VX are combined with Null rows to produce rows <X3, 7, 0, NULL, NULL, NULL> and <X6, 5, 1, NULL, NULL, NULL>. These newly produced rows are inserted into MV 400 by the insert statement specified above. After step 512, MV 400 has the following five rows: <X1, 1, 1, Y1, 1, 0> <X4, 4, 4, Y3, 4, 1> <X5, 9, 0, NULL, NULL, NULL> <X3, 7, 0, NULL, NULL, NULL> and <X6, 5, 1, NULL, NULL, NULL> Control proceeds from step 512 to step 514. At step 514, it is determined whether there are any unprocessed base tables for MV 400. In the present example, MV 400 has two base tables X and Y. Table X has been processed but table Y has not yet been processed. Control therefore returns to step 500. SELECTED TABLE=TABLE Y At step 500, the leftmost unprocessed table listed in the join query that defines the materialized view to be refreshed is selected. In the present example, the only unprocessed table is table Y. Therefore table Y is selected. At step 501, the membership of Dlt.sub.-- Y is determined. Dlt.sub.-- Y is the set of rowids of the rows that have been changed in table Y since the last refresh of MV 400. The rowids of the changed rows of table Y may be retrieved using the database query: Dlt.sub.-- Y=select unique rid from SNLog.sub.-- Y where scn>:T1 and scn<:T2; Executing this query returns a set of rowids Dlt.sub.-- Y that includes Y 1, Y3 and Y4. Once the rowids of the changed rows in table Y are identified, control proceeds to step 502. At step 502, the set of still-existing changed rows of table Y (VY) is calculated using the database query: VY=select Y.* from Y, Dlt.sub.-- Y where Y.rowid=Dit.sub.-- Y.rid; This query, executed against the state of table Y at time T2 (FIG. 6), returns two rows: <Y1, 9, 0> and <Y4, 3, 1>. No rows are returned for the rowid Y3 because the row that had rowid Y3 does not exist in table Y at time T2. Therefore, at the end of step 502, Dlt.sub.-- Y=Y1, Y3 and Y4, and VY=<Y1, 9, 0> and <Y4, 3, 1> After VY is generated, control passes to step 504. At step 504, it is determined whether table Y is the right side of an outer join. Table Y is the right side of an outer join, so control passes to step 506. At step 506, the columns of MV that are derived from the rows that are in Dlt.sub.-- Y are set to NULL. The MV columns derived from table Y include (1) the MV columns that contain values from table Y, and (2) all columns "to the right" of those columns. Using the notation described above, the MV columns derived from Y are the columns of OJGraph(Y). Step 506 may be performed according to the following update statement:
______________________________________
update MV
set
columns-of(OJGraph(Y)) = null
where
MV.Ti.sub.-- rid in (select rid from Dlt.sub.-- Y)
______________________________________
Prior to the execution of this update statement, MV 400 contains the rows: <X1, 1, 1, Y1, 1, 0> <X4, 4, 4, Y3, 4, 1> <X5, 9, 0, NULL, NULL, NULL> <X3, 7, 0, NULL, NULL, NULL> and <X6, 5, 1, NULL, NULL, NULL> Of these rows, the two rows <X1, 1, 1, Y1, 1, 0> and <X4, 4, 4, Y3, 4, 1> have columns that contain values derived from rows identified in Dlt.sub.-- Y. Specifically, the values in the last three columns <Y1, 1, 0> of row <X1, 1, 1, Y1, 1, 0> are derived from a row of table Y that has changed, and the values in the last three columns <Y3, 4, 1> in row <X4, 4, 4, Y3, 4, 1> are derived from a row of table Y that has changed. Therefore, those values are set to Null, thus changing row <X1, 1, 1, Y1, 1, 0> to <X1, 1, 1, NULL, NULL, NULL>, and changing row <X4, 4, 4, Y3, 4, 1> to <X4, 4, 4, NULL, NULL, NULL>. After execution of the update statement, MV 400 contains the rows: <X1, 1, 1, NULL, NULL, NULL> <X4, 4, 4, NULL, NULL, NULL> <X5, 9, 0, NULL, NULL, NULL> <X3, 7, 0, NULL, NULL, NULL> and <X6, 5, 1, NULL, NULL, NULL> Control proceeds from step 506 to step 508. During step 508, the MV columns that were set to NULL in step 506 are updated to new values, where appropriate. One technique for performing this update is to calculate the join X<>VY. Then, for each row produce by this join, it is determined whether there is a corresponding row in MV 400. In this context, rows correspond to each other if they have the same x.rowid value. When a correspondence between a row produced by X<>VY and a row in MV 400 is identified, the row in MV 400 is replaced by the corresponding row from X<>VY. In the present example, the join X<>VY produces a single row <X5, 9, 0, Y1, 9, 0>. The row thus produced corresponds to the row <X5, 9, 0, NULL, NULL, NULL>of MV 400 because both have the x.rowid of X5. Therefore, row <X5, 9, 0, NULL, NULL, NULL> in MV 400 is replaced by <X5, 9, 0, Y1, 9, 0>. After this replacement, MV 400 contains the rows: <X1, 1, 1, NULL, NULL, NULL> <X4, 4, 4, NULL, NULL, NULL> <X5, 9, 0, Y1, 9, 0> <X3, 7, 0, NULL, NULL, NULL> and <X6, 5, 1, NULL, NULL, NULL> Using an SQL query, step 508 may be performed according to the following update statement:
______________________________________
update MV
(select
mv-columns-of(OJGraph(Y)), columns-of(OJGraph(Y))
from
tables-of(POJGraph(Y)), MV
where
join-conditions-of(POJGraph(Y)) and
Y.rowid in (select rid from Dlt.sub.-- Y) and
MV.X.sub.-- rid = X.rowid
) v
set
columns-of(OJGraph(Y)) = columns-of(OJGraph(Y))
______________________________________
After the completion of step 508, control passes to step 514. At step 514 it is determined whether there are any unprocessed base tables of MV 400. In the present example, both of the base tables of MV 400 (tables X and Y) have been processed. Consequently, control proceeds to step 516 and the incremental update of MV 400 is complete. FIG. 6B illustrates MV 400 after the completion of the incremental updated process. The contents of MV 400 in FIG. 6B is identical to the results produced by executing the join query definition of MV 400 on the base tables X and Y at time T2, as shown in FIG. 6A. However, the cumulative overhead incurred in performing the incremental refresh using the techniques described herein will typically be significantly less than the overhead incurred by a total refresh operation. The sequence of steps illustrated in FIG. 5 are merely exemplary. The sequence of various steps may be altered without altering the outcome of the incremental refresh process. For example, step 502 may be performed between steps 510 and 512. This and various other alterations would be apparent to one skilled in the art. Consequently, the present invention is not limited to any particular sequence of steps for performing incremental refresh of a materialized join view. Under certain conditions, certain steps in the incremental refresh process described above may be skipped without affecting the result of the refresh, thus increasing the speed and decreasing the overhead of the incremental refresh. For example, if, since the last refresh, the only changes made to a base table were inserts, then steps 506 and 510 may be skipped when that base table is being processed. Similarly, if since last refresh, the only changes made to a base table were deletes or drop partitions, then steps 508 and 512 may be skipped when that base table is being processed. According to one embodiment of the invention, the rowids stored in the MV indicate the partition to which the row belongs. In such an embodiment, incremental refresh performance is improved when the only change since the last refresh is that a partition was dropped. Under these conditions, the rows in MV that include values from the dropped partition (the "rows of interest") are identified by inspecting the rowids in the MV. For inner joins, the rows of interest are deleted. For outer joins, some of the columns of the rows of interest are set to NULLs. In addition, if MV is partitioned in a way that parallels the partitioning of T, the same partition from MV could simply be drop for inner joins. In both cases, no join to master tables are required. Significantly, the incremental refresh technique described above is "memoryless". That is, it does not require information about the order of updates to the base tables. Consequently, the overhead associated with maintaining such sequencing information is avoided. In addition, the technique is idempotent in that performing an incremental refresh N times on the same data using the techniques described herein yields the same results a single incremental refresh on the data. This property is valuable when, for example, the system crashes during an incremental refresh operation. After the crash, the incremental refresh operation may be restarted from the beginning without taking into account how far the operation had progressed prior to the crash. In the foregoing specification, the invention has been described with reference to specific embodiments thereof. It will, however, be evident that various modifications and changes may be made thereto without departing from the broader spirit and scope of the invention. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense.
|
Same subclass Same class Consider this |
||||||||||
