Computer program product for optimizing data retrieval using index scanning5778353Abstract A method of index scanning involves scanning one or more selected indexes and determining the number of data transfers required to traverse all or a portion of a selected index for a selected number of buffer pool sizes. The number of page transfers to scan a whole table of interest versus the number of page transfers to scan each relevant index in accordance with the buffer pool size is determined for a query. The number of page transfers required in proportion to the selectivity of starting and stopping conditions is determined and scaled down in proportion to the selectivity of any starting and stopping conditions present in the search criteria in the query. A suitable correction factor is applied to the number of transfers to account for few rows remaining to be transferred after applying the search criteria, the buffer pool being large, or a low degree of clustering for an index,, as well as accounting for any remaining index sargable search criteria. The search procedures that results in the least number of page transfers in then implemented. Claims The embodiments of the invention in which an exclusive property or privilege is claimed are defined as follows: Description BACKGROUND OF THE INVENTION
______________________________________
Sequential scan
Data paging 50,000 pages
Total 50,000 pages
Nonclustered index scan
Index paging = 2% of 5,000 pages
100 pages
Data paging = 2% of 1,000,000 rows .times. 1 page per row
20,000 pages
Total 20,100 pages
Index scan with 90% clustering
Index paging = 2% of 50,000 pages
100 pages
Clustered data paging = 90% of 2% of 50,000 pages
90 pages
Nonclustered data paging I/O
10% of 2% of 1,000.000 rows .times. 1 page per row
2,000 pages
Total 2,190 pages
______________________________________
The optimal access path is clearly the one through the ninety percent (90%) clustered index. Each page transfer takes a certain amount of processing time so clearly an access path with fewer page transfers will take less time. Degrees of Clustering The degree of clustering is defined as the number of clustered rows in a given index divided by the total number of rows in the table. Thus, the degree of clustering is proportional to the number of rows which when in index order is in the same sequence as that sequence in which they are stored. The number of clustered rows may be determined by performing an index scan through the entire table. A row is considered to be clustered if it is physically stored immediately following the row previously specified by the index. If all rows of an index are clustered according to this definition, then the index is completely clustered. If ninety percent (90%) of the table's rows immediately follow their predecessor rows in index order, then the index is ninety percent (90%) clustered. Access Path Selection The optimizer's task, when presented with a query into a data base table, is to select the fastest access path for that query. The optimizer can choose between a sequential table scan through the entire table, or an index scan using an index corresponding to one of the query's search criteria. Where two or more search criteria have corresponding indexes, the optimizer must choose between these alternative candidate indexes. Search criteria (predicates) on the index columns which make up the index can be used to restrict the records (rows) that are fetched. Such index scans are called partial index scans as opposed to full index scans. For a partial index scan, the optimizer can estimate the selectivity, i.e., the number of records that are expected to be retrieved in the index scan. Methods for estimating the selectivity are well known ›1! and are not discussed here. For an access plan involving an index scan, the optimizer has to estimate the number of data page transfers (fetches). The number of page fetches from disk may depend on the number of buffer pool slots available to hold the pages fetched. In most relational database systems, the buffer pool is assumed to be managed using the Least Recently Used (LRU) algorithm. For a table scan, the number of page fetches is exactly the number of pages in the table T because each page is accessed exactly once. Note that this is independent of the value of the buffer pool size B. Let the number of data pages accessed during the scan of an index I be denoted by A. A data page is accessed if at least one record on the page is examined during the scan. The value of A depends on the number of records (rows) retrieved in the index scan and the placement of the retrieved records in the pages of the table. It does not depend on B. The number of pages fetched while scanning the index I is denoted by F, which depends on A and may depend on B as shown below. Recall that the placement of the retrieved records in the pages of the table determines how clustered the index is. An index is called a clustered index if the records (rows) are stored in the table in the order of the index column. When the records are retrieved in the order of values of the index column, no page is accessed more than once. Hence, F=A independent of B. Note that if this is a partial scan, A<T.gtoreq.F<T. As we have said an index is said to be unclustered if the rows (records) are not stored in the table exactly in the order of the index column. When the rows are retrieved in the order of values of the index column, a page may be accessed more than once. The page may be replaced in the buffer pool between two accesses to the page due to other page transfers (fetches) in the interim. SPECIFIC EMBODIMENT OF THE INVENTION The Effect of Buffer Pool Size We have discovered one significant parameter that affects F is the value of B. As B increases, the size of the buffer pool may be able to compensate for any lack of order in the page reference pattern. When B approaches A, disorganization in the key sequence of records becomes irrelevant. Similarly, as the buffer becomes smaller, even a slightly unclustered index will have to redo many page transfers, since the accessed pages will already have been discarded by some previous reference. In the worst case each new record will require an additional page fetch. Bounds can therefore be placed on F as follows: A.ltoreq.E.ltoreq.N, where N is the number of rows retrieved from Table T. Some indexes are greatly affected by even a small change in the size of the buffer pool. The degree to which the changes occur depends on the amount of disorganization that exists. Note that if there are multiple records (rows) on a page and the index is highly unclustered, an index scan may result in multiple accesses to pages. The result is a large value for F, if B (the buffer pool size) is sufficiently small compared to A (the number of pages accessed (ie. the number of page transfers)). In FIG. 4, we show how the number of page fetches for a full index scan varies with the buffer size. The curves are shown for indexes for columns of a sample set of tables. We refer to these curves as FPT (for Full index scan Page Transfer) curves. B (the size of the buffer pool) is expressed as a fraction of the number of pages T in the table. F (the number of page transfers) is expressed in relation to T. In FIG. 4 the page transfer multiplier (PTM) expresses the number of page fetches F normalized with respect to T. For a full index scan, the minimum value of F is T(in FIG. 4, this corresponds to a value of 1). We see that the value of F, can be quite sensitive to the buffer size B available. Note that for partial index scans, the number of page fetches can be less than T. Hence, in order to choose a good access plan, it is important to accurately estimate the number of page transfers F for every relevant index I. The index scans may be partial or full index scans. A number of factors are usually considered in selecting access paths using indexes. For instance an index can be defined on two columns a and b with a as the major column. Starting and stopping conditions frequently limit the index range that needs to be scanned. Examples of starting conditions are a>50 and a.gtoreq.25. Examples of stopping conditions are a<75 and a.ltoreq.100. Starting and stopping conditions can be combined, e.g., 40.ltoreq.a AND a<60. Let the selectivity of the starting and stopping conditions be denoted by s. We can have other predicates (search conditions) on the index columns that do not define a contiguous range of values and hence do not restrict the range of the index that needs to be scanned. Such predicates are referred to as index sargable predicates. An example of such a predicate is b=5. Let the selectivity of the index sargable predicates be denoted by S,. An index I on table T is said to be relevant if any of the following conditions are satisfied: 1. One or more of the predicates can be used to form starting and/or stopping conditions on I. 2. Records retrieved using I would be in the desired sort order. 3. Access through I can be used to separate records from the table T from other tables in the tablespace of T. Note that a full index scan is not needed if the first condition is applicable. Hence, the optimizer may have several access plans to choose from: 1. Perform a table scan, evaluate the predicates on all the records. If necessary, sort the resulting set of records. 2. Use a partial index scan on a relevant index I and evaluate the unevaluated predicates on the resulting set of records. If necessary, sort the resulting set of records. 3. Use a full scan on a relevant index I to obtain the desired sort order and evaluate the predicates on the resulting set of records. The number of basic access plans to be considered is the number of relevant indexes plus one (for the table scan). In order to choose between the different access plans, the optimizer has to determine the costs of the different access plans. A major component of the cost of an access plan is the number of page fetches from secondary storage (disk) that are required under the plan. For an access plan involving an index scan, the optimizer has to estimate the number of page fetches (transfers). Note that the number of records remaining after applying starting and stopping conditions and index sargable predicates forms an upper bound on the number of pages fetched (transferred). The number of qualifying records is given by the formula N.times.s.times.S. The example below depicts a specific embodiment of the invention herein and when suitably embodied in a program for operating oil a computer system containing a database comprised of tables and their associated indexes is capable of achieving advantages of the invention. Example 1, below, shows a program fragment written in pseudocode for determining the fastest indexed access path for a table in accordance with one embodiment of this invention. The following Notations will be used in the example:
______________________________________
Term
______________________________________
Notation
Number of pages in buffer pool
B
Number of pages in table T
Number of records in table
N
Number of distinct values in index
I
Number of pages accessed in scan on index
A
Number of pages fetched in scan on index
F
Selectivity of start and stop conditions
s
Selectivity of index sargable predicates
S
Clustering factor C or CR
______________________________________
EXAMPLE 1 Pseudocode Implementation of the Invention BASIC STEP 1 (computing the FPT relationship)
______________________________________
SUBSTEP 1
(choosing the modelling range)
111 Select a minimum buffer size to model, i.e
Bmin = max (0.01 .times. number of pages in the
table, T and the smallest reasonable buffer
pool size)
112 Select a maximum buffer size to model, i.e.
Bmax = min (the number of pages in the table, T
user-supplied value)
SUBSTEP 2
(computing the relationship between required
page transfers and the size of the buffer pool)
121 Choose some table T
122 Choose some index I that exists for table T
123 For each index page Ip of index I
123.1 Read index page Ip
124 For each index entry Ie of Ip
124.1 Obtain index entry Ie
125 Extract the page number P from the
row identifier from Ie
126 For each possible size of buffer pool
BP
127 If the page P cannot be found in
BP then increment the number of
page transfers required by a
scan using a buffer pool of this
size
128 End loop 127
129 End loop 124
1210 End loop 123
1211 Record the number of page transfers Fmin of the
smallest buffer pool size Bmin and compute a
clustering factor C using the formula C = (N -
Fmin)/(N - T)
SUBSTEP 3
(fitting a line to the curve defined by the
page transfer vs bufferpool size relationship
determined by SUBSTEP 2)
131 Approximate the curve by breaking it into a
discrete set of line segments (preferredly
using the method of least squared error)
SUBSTEP 4
(saving the line segment information)
141 Save the x, y coordinate endpoints of each line
segment for later use
142 Repeat BASIC STEP 1 for all desired indexes and
tables
BASIC STEP 2
(using the computed FPT relationship from
BASIC STEP
1, assign a cost associated with the
potential use of an index or table
scan for a particular table)
SUBSTEP 1
(table scan cost computation)
211 Compute the cost of scanning the entire table T
Comment: To do this account for both page
transfers and CPU instructions to compute an
estimate of the work required to read the
entire table. Techniques for doing this are
well known.
SUBSTEP 2
(computing the cost of using each index)
221 Establish (Bquery) the appropriate size of the
buffer pool for accessing this table by this
particular query (this is done by an external
routine of database and is not part of this
invention)
222 For each index I that exists for T
223 Using the FPT relationship that was computed in
BASIC STEP 2 determine a set of two x, y
coordinate pairs that include between their (x-
axis) endpoints the Bquery value
224 Compute the estimated page transfers (PF.sub.B)
using the index under consideration for the
current value of Bquery (a preferred method is
interpolation)
225 Take into account the effect of any relevant
starting and stopping conditions (this is a
subset or the search criteria) by multiplying
the estimated number of page transfers (PF.sub.B) by
the combined selectivity (filter factor) of all
the starting/stopping conditions to reduce the
estimated number of page transfers.
s .times. PF.sub.B
226 Apply a correction factor when
a) very few rows remain after applying search criteria
b) the buffer pool to table size (in pages) ratio is high
c) the index has a low degree of clustering
eg. by using the following formula which is well known in
the art as Cardenas' formula:
##STR1##
227 Account for any remaining search criteria that
can be applied prior to obtaining the data
page. Therefore the final estimate of the
number of page transfers required is given by:
F = (1 - (1 - 1/Q).sup.k .times. ›(s .times. PF.sub.n) + V .times.
min(1,.phi./6s)) .times.
(1 - C) .times. T .times. (1 - (1 -/T)s.sup.N))!
where k is the number of qualifying rows after
all search criteria which can be applied to
the index page have been evaluated (k = SsN)
228 Compute a complete cost estimate of using this
index using the estimated number of page
fetches as one of the dominant cost components.
229 end loop 222
SUBSTEP 3
(choosing the least expensive access method)
231 Compare all the possibie index scan costs as
well as the cost of scanning the entire table
and select the one with lowest cost.
______________________________________
A discussion of the operation of the program fragment of Example 1 follows including additional information on specifics useful for application of the invention. The invention illustrated in the pseudocode has two components. The first, BASIC STEP 1, is typically part of the statistical collection routines. It is called periodically to calculate some parameter values that can be stored. These parameter values are then used by the second component, BASIC STEP 2, which would be typically called by the optimizer during query compilation. It provides an estimate of page transfers for an index scan. We describe the two basic steps in more detail below. BASIC STEP 1 BASIC STEP 1 first determines the range of buffer pool sizes that need to be modeled. It then obtains a table of FPF data in the range of interest. It approximates the FPF curve using some appropriate number of line segments. BASIC STEP 2 then uses the approximated FPF curve in the estimation of page transfers. Determining Modeling Range (pseudocode routine 111, 112) We need to know the range of buffer pool sizes (B) that are likely to be encountered during optimization. The optimizer will need accurate estimates of F, for these values of B. If desired, the range of B can be specified by the database administrator (DBA). If the range is not specified, BASIC STEP 1 chooses the range of B values to model as follows. For the most useful ranges of the invention the minimum value of B (denoted by Bmin) is taken to be max(0.01.times.T, Bsml), where Bsml is the minimum buffer pool size modeled. We have selected Bsml=12. The maximum value of B (denoted by Bmax) is taken to be T (the number of pages in the table). BASIC STEP 1 will next determine the number of page fetches for a full index scan at selected B values in the range determined above. These B values are denoted by B1, B2, . . . Bk, with B1=Bmin and Bk=Bmax. The values B2, . . . , Bk-1 are equally spaced, with the distance between consecutive values being given by the following heuristic formula: ##EQU1## This allows an increased number of buffer size values to be modeled for larger ranges but the increase is slower than the increase in the range size. B2 is the smallest multiple of Binc greater than Bmin and Bk-1 is the largest multiple of Binc smaller than Bmax. Note that the method to be described allows us to generate useful page fetch estimates even if the buffer pool size falls outside of the range (Bmin . . . Bmax). The estimates may be less accurate outside this range but we have found that the invention can always obtain a useful estimate of the number of page transfers. The FPT relationship applies to a specific index of a specific table therefor a table and an index must be selected and then processed (pseudocode steps 121, 122). Obtaining the FPF Data(pseudocode steps 123, 123.1, . . . 1210) A full scan of all the index entries produces the sequence of page numbers as stored in the index. A scan of the index for index statistics collection has these characteristics. The sequence of page accesses can be used to determine the number of page fetches that would result from a full index scan for any B value. In the art buffer pool modeling techniques ›2! are known that can obtain page fetches for a number of buffer pool sizes in a single scan of the sequence. BASIC STEP 1 uses these techniques in order to generate the table of FPF data. For examples of FPF curves see FIG. 4. The table consists of (x, y) pairs where the x values are B1, B2, . . . Bk and the y values are the corresponding number of page transfer multipliers used to determine the total number of page transfers for a full index scan. Approximating the FPF Data(see pseudocode 131, 141) The FPF curve is used subsequently for estimation of page transfers in BASIC STEP 2. To reduce the amount of data that needs to be stored, we approximate the FPF curve. Any approximation method that permits sufficiently accurate approximation, e.g., polynomial curve fitting, splines, could be used. We use the simple but adequate method of approximating the FPF curve using line segments (see, for example, ›3!). The line segment information is captured by storing the coordinates of the end-points of the line segments. This coordinate information can be stored in a system catalog entry associated with the index for later use by BASIC STEP 2. Clearly, the larger the number of line segments, the more accurate the approximation. However, for each additional line segment, an additional pair of values needs to be stored in the catalog. If space usage in the catalog structure is of concern, it is desirable to keep the minimum number of line segments that result in acceptable errors in page fetch estimation. We performed a large number of experiments on different indexes to study the sensitivity of the estimation errors to the number of approximating line segments. The experiments show that the estimation errors do not change very much when the number of line segments is greater than five. Hence, we use six (6) line segments to approximate the FPF curves. Processing Additional Indexes (see pseudocode line 142) We have now completed the discussion of processing a single index and this procedure can be repeated for all other indexes of the table. BASIC STEP 2 As discussed in Section 1, the optimizer often needs to choose between performing a table scan or a scan on one of the relevant indexes. For a relevant index 1, the optimizer determines the applicable search criteria. The optimizer also expects that the buffer size available for the scan to be some value B. In order to estimate the number of page fetches required for the scan on 1, the optimizer uses BASIC STEP 2. BASIC STEP 2 uses the approximation to the FPF curve obtained by BASIC STEP 1. Let the number of page fetches resulting from a full scan on I, given a buffer size of B, be denoted by PF.sub.B. BASIC STEP 2 first determines which line segment contains among its x values the value B. It then uses the equation for the line segment to calculate the corresponding number of page fetches. This is the value of PF.sub.B. If B lies outside the range modeled by BASIC STEP 1, BASIC STEP 2 uses extrapolation of the appropriate boundary line segment to obtain a value for PF.sub.B. BASIC STEP 2 then scales down the value of PF.sub.B (see pseudocode step 225) appropriately in order to obtain the number of page fetches corresponding to the scan on I (possibly restricted by starting and stopping conditions if present). Thus, it estimates the page transfers by s.times.PF.sub.B Correcting for Small Selectivity (s) (pseudocode 226) The calculation of number of transfers F tends to underestimate the number of page fetches when the following conditions hold: 1. The selectivity s is small. 2. .0.=max(l,/T) is significantly greater than the selectivity s. 3. The index is not very clustered. This is reflected by a value of C not close to 1. C is a clustering factor defined by the following formula: C=(N-F.sub.min)/(N-T) Where the number of page fetches for a buffer size of B.sub.min pages is denoted by F.sub.min. BASIC STEP 1 also determined the value of F.sub.min (see pseudocode steps 1211). C is a measure of how "clustered" the index is. If C=1 the index is perfectly clustered. The "degree" of clustering tends to increase as C approaches 1. We take these observations into account by using a heuristic correction term that is calculated as follows: ##EQU2## The factor (T.times.(1-(1-1/T)sN)) is known in the literature as Cardenas' formula ›4!. The more unclustered an index is, the more likely it is that a partial index scan looks like a random selection. Note that if the index is very clustered, i.e., C is close to 1, the second factor (1-C) will be small. Hence, the factor (1-C) is a measure of how unclustered the index is and it is used to reduce Cardenas' estimate. The term s.times.PF.sub.B tends to underestimate page transfers primarily when .0. is large compared to s. The factor (1,/6s) reduces the contribution of the second term if .0. is not significantly larger than s. Note that the second term will be significant only for small values of s. For example, it is always 0 for s>0.34. If the selectivity s is small, sN is small, the term corresponding to Cardenas' formula is small, and hence the second term will be tend to be small. Thus, the contribution of the second term to F, is usually not large. Let us use the indicator variable V to denote the condition .0.>3 s. If the condition is true, then V is 1, otherwise it is 0. Then, the estimate we have for the number of page fetches is given by: F=(1-(1-1/Q).sup.k).times.((s.times.PF.sub.B)+V.times.min(1, .0./(6s)).times.(1-C).times.(T.times.(1-(1-1/T)s.sup.N))) Effect of Index Sargable Predicates (pseudocode step 227) The index sargable search criteria are applied to the index column values inspected during the (partial) index scan (step 225). Those records that qualify are then fetched. Hence, such search criteria can have the effect of reducing the number of pages fetched. We use a simple urn model to estimate the effect of index sargable search criteria. We first need to estimate the number of pages referenced after applying the starting and stopping conditions. If the index is highly clustered (C is approximately 1), the number of pages referenced is close to sT. If it is highly unclustered, the number of pages referenced is close to T. Using a simple linear model, we estimate the number of pages referenced after applying the starting and stopping conditions to be Q=CsT+(1-C)*min(T,sN) The number of qualifying records after index sargable search criteria are applied is k where k=SsN The factor by which the number of pages referenced is reduced is (1-(1-1/Q).sup.k) The number of page fetches is estimated to be reduced proportionately. The estimated number of page fetches F taking index sargable search criteria into account is: F=(1-(1-1/Q).sup.k).times.((s.times.PF.sub.B)+V.times.min(1, .0./(6s)).times.(1-C).times.(T.times.(1-(1-1/T)s.sup.N))) Having computed F the number of page transfers we compute a final complete estimate of the processing required (the page fetches plus index page transfers and associated computations) and thus the cost of using the index under consideration (step 228). This is repeated for each relevant index (step 229-222). The results from each index evaluation and the complete table scan are compared and the most cost effective one (the one with the least transfers) is chosen (step 231). Overview A useful embodiment of the invention consists of the following steps: 1. Determine the modeling range if not specified by the database administrator (DBA). 2. At statistics collection time, use LRU buffer pool modeling on the sequence of index page accesses to obtain the page fetches for different buffer pool sizes in the modeling range. 3. Approximate the table of page fetches, e.g., by a small number of line segments. 4. At query compilation time, use the line segment approximation to determine the number of page fetches for a full index scan. The buffer size is specified by the optimizer. 5. Scale down the full scan page fetches by the selectivity of the starting and stopping conditions. 6. If necessary, use the heuristic correction described above to compensate for small values of s. (eg. when only a small amount of data is being accessed from a table) 7. Account for the effect of index sargable search criteria on the number of page fetches if required. Numerical Example To better illustrate the usefulness and advantages of this invention, consider the following comparison of two indexed access paths, one (C2) having a 31% degree of clustering DC, and the other (C3) a 41% degree of clustering. Remember that in the prior art C3 would be considered a better access path than C2. Assume that the table has 150,000 rows (NR=150,000), and that the rows are distributed over 21,000 data pages (NP=21,000). Assume further that the indexes are equally selective, having a filter factor of two percent (FF=2%). Further, assume that each index has 150 leaf pages (NLP=150) in a tree of three levels (NL=3). The estimated total page transfers incurred by using each index as the access path is then calculated as follows: Comparison of Prior Art with The Invention for the Numerical Example 100 B=1,000 (ie. 0.5 T or 50% of T) 101 N=150,000 rows (number of rows in the table) 102 T=2,000 pages (number of pages in the table) 103 NLP=150 (number of index pages) 104 NL=3 (number of levels (height of tree in the indes)) 105 s=0.02 (selectivity of the search criteria) 106 DC=degree of clustering 107 (TPT) Total Page Transfers=Index Page Transfer+(DPT) Data Page Transfers 108 (IPT) Index Page Transfers=NL+s.times.NLP 109 (DPT)=s.times.DC.times.T+s.times.(1-DC).times.N Table Scan (scan table from beginning to end) ##EQU3## Prior Art Index Scan on C2 (column 2 of table C) ##EQU4## Index Scan on C3 (column 3 of table C) ##EQU5## Index Page Transfer Estimation Under This Invention ##EQU6## As can be readily seen the prior art which fails to take into account the buffer pool size greatly overestimates the number of page transfers required and also results in the optimizer making an incorrect selection of the index chosen for the access path. The prior would have chosen index C3 for which it calculated 1792.4 page transfers, but the invention shows that it would have been closer to 276 due to the large buffer pool. However, index C2 derives greater benefit from the large buffer pool and hence only requires 234 page transfering (which the prior art predicting 2088.4 did not select.) For large values of selectivity (s) (a greater fraction of rows qualifying (matching the search criteria) the correction factor would not contribute to the estimate of total number of page transfers. References 1. M. V. Mannino, P. Chu and T. Sager, Statistical Profile Estimation in Database Systems, ACM Computing Surveys, 20(3):191-221, September 1988. 2. R. L. Mattson, J. Gecsei, D. R. Slutz, and I. L. Traiger, Evaluation Techniques for Storage Hierarchies, IBM Systems Journal; 9(2):78-117, 1970. 3. B. K. Natarajan, On Piecewise Linear Approximations to Curves, Technical Report, Hewlett-Packard Laboratories, March 1991, Technical Report HPL-91-36. 4. A. F. Cardenas, Analysis and Performance of Inverted Database Structures, Communications of the ACM, 18(5):253-263, May 1975 It will be understood that this invention is not limited to relational data base queries, but can be readily applied to optimizing the access paths in joining relational data base tables. Further, the invention is considered to have value outside the field of relational data base management systems, in the broader realm of estimating page accesses in other data processing applications. It will be understood that outside the area of relational data bases, there is data commonly considered to be stored in "records", and other structures (analogous to the indexes described above) are used to access the records in sequences other than that in which they were stored. Accordingly, the scope and protection of the invention is not limited except as by the following claims.
|
Same subclass Same class Consider this |
||||||||||
