Index-only tables with nested group keys5852822Abstract A method and apparatus for building, maintaining, and using a multi-level index is provided. The multi-level index is accessed using a key. The key is divided into multiple portions referred to as sub-keys. The first level of the multi-level index is built on a first-level sub-key. Each index entry at the first-level is for a particular first-level sub-key value, and either includes sub-entries associated with second-level sub-key values or a reference to a second-level data retrieval structure. All second-level data retrieval structures are built on the portion of the key that has been designated as the second-level sub-key. As the vocabulary of the first-level sub-key becomes exhausted, fewer maintenance operations will have to be performed to maintain the first-level data retrieval structure. This decreases the overhead and increases the concurrency in a database system that uses the multiple-level index. The multi-level index structure is especially suited for queries that retrieve all values for a given first-level sub-key. The structure also has reduced storage costs compared to a single-level index structure, since first-level sub-key values are stored only once for each nested group. Claims What is claimed is: Description FIELD OF THE INVENTION
______________________________________
CREATE TABLE docindex
( token char(20),
doc.sub.-- oid integer,
token.sub.-- frequency smallint,
token.sub.-- occurrence.sub.-- data varchar(512),
CONSTRAINT pk.sub.-- docindex PRIMARY KEY (token, doc.sub.-- oid))
ORGANIZATION INDEX AREA text.sub.-- collection
PCTTHRESHOLD 20
OVERFLOW AREA text.sub.-- collection.sub.-- overflow;
______________________________________
In the above definition, the ORGANIZATION INDEX qualifier indicates that docindex is an index-only table, where the row data reside in an index defined on columns that designate the primary key (token, doc.sub.-- id) for the index-only table. The overflow clause indicates that pieces of data rows which exceed 20% of the block size will be placed in the text.sub.-- collection.sub.-- overflow area. No syntax changes are required to manipulate index-only tables. A user can use an index-only table in place of a regular table in SQL INSERT, SELECT, DELETE, and UPDATE statements. However, for index-only tables the rows are stored in the B-tree itself and these rows are identified by the primary key instead of their physical location. Users access the rows of index-only tables using the primary key. ROW OVERFLOW For index-only tables, index entries can become very large, since they are made up of <key, non.sub.-- key.sub.-- column.sub.-- values> tuples. If index entries get very large, then the leaf nodes may have to store one row or row-piece, thereby destroying the dense clustering property of the index. To overcome this problem, a Row Overflow Area clause may be used. Specifically, users specify an overflow area as well as a threshold value. The threshold is specified as a percentage of the block size. If the row size is greater than the specified threshold value, then the non-key column values for the row that exceeds the threshold are stored in the specified overflow area. When this occurs, the index entry contains <key, rowhead> pair, where the rowhead contains the beginning portion of the rest of the columns. The rowhead is like a regular row-piece, except that an overflow row-piece contains the remaining column values. EXEMPLARY INDEX-ONLY TABLE FIG. 2 is a block diagram that illustrates an index-only table 200 according to an embodiment of the invention. As with conventional B-tree indexes, index-only table 200 includes branch nodes and leaf nodes. Each branch node stores one or more values that indicate which value ranges correspond to the nodes beneath the branch node. Each leaf node stores entries for a particular range of key values. In the illustrated embodiment, a leaf node 202 includes index entries associated with the key values "happy 10", "happy 15", "happy 21", "happy 25" and "ice 1". Another leaf node 204 includes index entries associated with the key values "ice, 3", "ice, 33" and "icicle, 7". Index-only table 200 is not associated with any actual table. Rather, the index entries in the leaf nodes of index-only table 200 contain all of the non-key values that would conventionally be stored in a separate table. For example, index entry 206 not only contains the value "icicle, 7" for the primary key <word, doc.sub.-- oid> used to build the index, but also contains all of the values "3, 78, 99, 103" for the non-key column values <frequency, occurrence-list> that correspond to the primary key value in the index entry. A row overflow area 220 has been established for index-only table 200. The leaf node 202 does not have enough space to store all of the values for an index entry 208. Therefore, the rowhead 210 of index entry 208 is stored within leaf node 202, while the rowtail 212 of index entry 208 is stored within row overflow area 220. Note that the rowhead 210 contains the address 220-1 of rowtail 212. B-TREE SPLIT AND MERGE OPERATIONS The leaf nodes of an index have limited space available for storing index entries. If an insert attempts to add an entry that exceeds the free space available in an index leaf node, approximately half of the leaf node entries are moved to a new (empty) index leaf node. This operation is typically known as the splitting of a leaf node. Similarly, as more space within a leaf node gets freed due to deletions, it is possible to merge a series (typically a pair) of neighboring leaf nodes and create a single leaf node. Such splits and merges cause the corresponding branch nodes to split and merge, and these operations can potentially propagate to the root of the B-tree. Thus, mere and split operations can consume large amounts of time and processing power as the B-tree gets large. NESTED DATA RETRIEVAL STRUCTURES A nested data retrieval structure is a data retrieval structure (e.g. hash table, B-tree, linear list) that serves as a component of a larger data retrieval structure. A nested data retrieval structure may itself contain a nested data retrieval structure. For the purposes of explanation, a data retrieval structure that is not nested within another structure shall be referred to as a first-level data retrieval structure. A data retrieval structure that is nested within a first-level data retrieval structure shall be referred to as a second-level data retrieval structure. A data retrieval structure that is nested within a second-level data retrieval structure shall be referred to as a third-level data retrieval structure, etc. To reduce the frequency of and overhead associated with split and merge operations on index-only tables, an inverted index is implemented using an index-only table that includes nested data retrieval structures according to an embodiment of the invention. The index-only table, which serves as the first-level data retrieval structure, is not built on the entire key (e.g. <word, doc.sub.-- oid>) that is used to access the data within the inverted index. Rather, the key is divided into two or more portions ("sub-keys"). The first-level tree is only built on the first portion of the key (the "first-level sub-key"). Because the first-level tree is built on the first-level sub-key, each index entry in the first-level tree corresponds to a particular first-level sub-key value. Each index entry in the first-level tree is associated with a group of index sub-entries. Each index sub-entry includes a second-level sub-key value and the values for the non-key columns associated with the particular <first-level sub-key, second-level sub-key> combination. The index entry in the first-level tree for a given first-level sub-key value either contains the group of index sub-entries for the given first-level sub-key value, or a reference to a nested data retrieval structure. According to one embodiment of the invention, if the group of sub-entries for a given first-level sub-key value is not stored in the index entry for that first-level sub-key value, then the group of sub-entries is stored within a nested data retrieval structure that is referenced by the index entry for that first-level key value. Specifically, if the size of the group of sub-entries for the given first-level sub-key value is less than a predetermined threshold, then the group of sub-entries is stored in the index entry for the first-level sub-key value. On the other hand, if the size of the group of sub-entries exceeds the predetermined threshold value, then the group of sub-entries is stored in a nested data retrieval structure. When the sub-entries for a first-level sub-key are stored in a nested data retrieval structure, only information required to locate to the nested structure is maintained in the first-level B-tree index entry along with the first-level sub-key value. In the nested structure, each index entry will contain the second-level sub-key and the values of the non-key columns. MAINTAINING LEFT-MOST LEAF NODE POINTERS Often, queries request all of the occurrence information for a particular word. If a nested tree structure has been created for the sub-entries associated with the word, then the nested tree structure will have to be traversed during the processing of such queries. To avoid the overhead associated with traversing the height of the nested tree, the first level index maintains a pointer to the left-most leaf node of the nested tree. Specifically, according to one embodiment of the invention, if a nested B-tree is used to store the group of sub-entries for a first-level sub-key value, then a pointer is maintained from the first-level index entry for the first-level sub-key value to the left-most leaf node of the second-level B-tree. Typically, leaf nodes of a B-tree are linked by pointers so that once a starting leaf node is identified, the subsequent leaf nodes can be traversed by following the neighboring leaf node pointers. All of the entries for a particular first-level sub-key value may be retrieved by following the pointer from the first-level index entry to the leftmost second-level leaf node of the nested tree, and then following the pointers that form the leaf node chain. By avoiding the traversal of the second-level tree, the query processing can be performed more efficiently. EXEMPLARY TWO-LEVEL INVERTED INDEX FIG. 3 is a block diagram of an inverted index 300 in which the whole key <word, doc.sub.-- oid> of an inverted index is divided into two sub-keys, <word> and <doc.sub.-- oid>. The first-level tree 302 of inverted index 300 is built on the first-level sub-key <word>, rather on the entire key <word, doc.sub.-- oid>. Each index entry in the leaf nodes 308 and 310 of the first-level tree 302 corresponds to a particular first-level sub-key value. For example, an index entry 304 in leaf node 308 corresponds to the first-level sub-key value "happy", while an index entry 306 in leaf node 310 corresponds to the first-level sub-key value "ice". Each index entry in the first-level tree 302 is associated with a group of index sub-entries. As mentioned above, if the size of the group of sub-entries for the given first-level sub-key value is less than a predetermined threshold, then the group of sub-entries is stored in the first-level index entry itself along with the first-level sub-key value. In the illustrated embodiment, the number of sub-entries for the first-level sub-key value "happy" is less than the predetermined threshold. Therefore, the sub-entries for the first-level sub-key value "happy" are stored in the index entry 304 within the leaf node 308 of the first-level tree 302. If the size of the group of sub-entries for a first-level sub-key value exceeds the predetermined threshold value, then the group of sub-entries is stored in a nested structure. In the illustrated embodiment, the number of sub-entries for the first-level sub-key value "ice" exceeds the predetermined threshold. Therefore, the sub-entries for the first-level sub-key value "ice" are stored within a second-level B-tree 320 that is referenced within the first-level tree 302. As mentioned above, when the sub-entries for a first-level sub-key are stored in a nested structure, only information required to get to the nested structure is maintained in the first-level B-tree index entry along with the first-level sub-key value. Therefore, the index entry 306 for the first-level sub-key value "ice" contains a pointer to the root node 324 of the second-level tree 320 that stores the group of sub-entries for the value "ice". In the leaf node of the second-level tree 320, each index entry contains the second-level sub-key <doc.sub.-- id> and the non-key columns. Index entry 306 also contains a pointer to the left-most leaf node 322 of the second-level tree 320. In addition, the leaf nodes of the second-level tree 320 are linked by pointers to form a leaf node chain. Consequently, all of the entries for the first-level sub-key value "ice" may efficiently be retrieved by following the pointer from the first-level index entry 306 in the index entry 306 within the first-level leaf node 310 to the leftmost leaf node 322 of the second-level tree 320, then following the pointers that form the leaf node chain. REDUCING INDEX MAINTENANCE OVERHEAD By implementing inverted indexes using the multi-level data retrieval structure described above, the overhead associated with maintaining an index-only table is significantly reduced. Specifically, the first-level tree 302 will grow fast initially, requiring frequent balancing operations. However, such operations will involve little overhead while the first-level tree 302 is yet small. As the first-level tree 302 grows larger, the vocabulary of the first-level sub-key will approach exhaustion. That is, first-level index entries for virtually all possible first-level sub-key values will have been inserted into the first-level index. Thus, the insertion of new data into inverted index 300 will rarely involve the insertion of new entries into the first-level tree 302. Rather, the insertion of new data will only require the insertion of new sub-entries into existing first-level entries, or the insertion of new sub-entries into a second-level retrieval structure, such as second-level tree 320. In the exemplary inverted index 300 of FIG. 3, the first-level sub-key values are words from the documents in a document collection associated with inverted index 300. The growth of the first-level tree 302 will slow significantly as the contents of the first-level tree 302 approaches the entire vocabulary of the language in which the documents are written. Once the first-level tree 302 is saturated, the addition of a new document to the document collection associated with the inverted index 300 would involve adding hundreds of sub-entries to existing index entries and/or nested data retrieval structures. However, no new index entries would have to be added to the first-level tree 302. The sub-entry insertions involve traversing the first-level tree 302 based on the first-level sub-key and adding an entry to the underlying sub-entry group. If the sub-entry group is stored in a nested data retrieval structure, then the sub-entry is added to the nested data retrieval structure. For example, if the nested data retrieval structure is a B-tree, then the nested B-tree is traversed based on the second-level sub-key and the sub-entry is added as a second-level index-entry in the appropriate leaf node of the nested B-tree. Sub-entry insertions generally do not cause growth of the first-level tree 302. If a hash table is used to store the underlying group of sub-entries, then a hash function is performed on the second-level sub-key to generate a number that indicates where to store the sub-entry within the hash table. If a nested B-tree is used for the underlying group of sub-entries, then the nested B-tree may grow. The updates on a nested B-tree may result in leaf node splitting and merging. However, any splits and merges performed during index maintenance will be mostly restricted to the particular nested B-tree. Because the nested B-tree is typically a small component of the entire inverted index 300, such maintenance operations involve much less overhead than maintaining a single, one-level index-only table such as index-only table 200 illustrated in FIG. 2. ENTRY INSERTION PROCESS FIGS. 4A and 4B contain a flow chart that illustrates the process of inserting an entry into an inverted index 300 that is implemented using an index-only table with nested subgroups according to an embodiment of the invention. At step 400, the key of the entry to be inserted is divided into sub-keys. For the purposes of explanation, it shall be assumed that the key for the entry to be inserted is <word, doc.sub.-- oid>, and that the key is divided into the two sub-keys <word> and <doc.sub.-- oid> at step 400. At step 402, the first-level tree 302 is traversed based on the first-level sub-key <word>. After performing step 402, the leaf node of the first level tree 302 that corresponds to the range that includes <word> will have been identified. At step 404, it is determined whether an index entry already exists for the first-level sub-key <word>. If an index entry already exists for the first-level sub-key <word>, then control passes to step 412. For example, if the first-level sub-key is "HAPPY", then control will pass to step 412, since an index entry 304 already exists for the first-level sub-key "HAPPY". If an index entry does not already exist for the first-level sub-key <word>, then control passes to step 406. For example, if the first-level sub-key is "HAPPINESS", then control will pass to step 406 if leaf node 308 does not contain an index entry for the first-level sub-key "HAPPINESS". At step 412, it is determined whether a second-level tree exists for the first-level sub-key. If a second-level tree exists for the first-level sub-key, then control passes to step 414. For example, if the first-level sub-key is "ICE", then control would pass to step 414 because a second-level tree 320 has been built for the sub-entries associated with the first-level sub-key "ICE". If at step 412 a second-level tree does not exist for the first-level sub-key, then control passes to step 416. For example, if the first-level sub-key is "HAPPY", control would pass to step 416 because a second-level tree has not been built for the sub-entries associated with the first-level sub-key "HAPPY". Rather, all of the sub-entries associated with the first-level sub-key "HAPPY" are stored within the entry 304 of leaf node 308 for the word "HAPPY". At step 416, it is determined whether insertion of an additional sub-entry for the first-level sub-key would exceed a predetermined threshold. For example, assume that any given first-level index entry with the threshold size limitation can hold four sub-entries. The entry 304 for the word "HAPPY" already has four sub-entries. Therefore, the insertion of an additional sub-entry would exceed the predetermined threshold for a given first-level sub-key that may be stored in a leaf node, and control would pass to step 418. On the other hand, if a first-level index entry with the threshold size limitation can hold fifty sub-entries, then the insertion of a fifth sub-entry for the word "HAPPY" would not exceed the predetermined threshold, and control would pass to step 410. At step 418, a second-level tree is created to store the sub-entries associated with the first-level sub-key. The second-level tree is built on the second-level sub-key stored in the various sub-entries. In the present example, the second-level tree would be built on the <doc.sub.-- oid> sub-key. At step 420, the new sub-entry and the existing sub-entries associated with the first-level sub-key are added to the second-level tree. For example, if the second-level tree is being built for the first-level sub-key "HAPPY", then the new sub-entry and the four sub-entries stored in entry 304 of leaf node 308 would be inserted into the new second-level tree. At step 422, existing sub-entries are removed from the first-level index entry. For example, if the second-level tree is being built for the first-level sub-key "HAPPY", then the four sub-entries would be deleted from entry 304 of leaf node 308. At step 424, a root node pointer and a leftmost leaf node pointer are added to the first-level index entry. Thus, if the second-level tree is being built for the first-level sub-key "HAPPY", then a pointer to the root node of the new tree and a pointer to the leftmost leaf node of the new tree would be added to first-level index entry 304. As mentioned above, if at step 416 the insertion of the sub-entry into the first-level index entry would not exceed the predetermined threshold, then control passes to step 410. At step 410, a sub-entry is created. The sub-entry contains the second-level sub-key and all non-key column values. As mentioned above, the sub-entry may overflow into a row overflow area 220 if the non-key column values exceed a predetermined length. At step 411, the sub-entry is inserted into the entry for the first-level sub-key that resides in the leaf node of the first-level tree 302. If at step 412 it is determined that a second-level tree already exists for the first-level sub-key, then control passes to step 414. At step 414, the second-level tree is traversed based on the second-level sub-key. Thus, if the index entry to be inserted is for the key (ICE, 10), then the second-level tree 320 for the first-level sub-key "ICE" is traversed based on the second-level sub-key "10". At step 410 a sub-entry is created and at step 411 the sub-entry is inserted into the second-level tree 320. If at step 404 it is determined that no index entry exists for the first-level sub-key, then an index entry is created for the first-level sub-key at step 406. Control then proceeds to step 416, if the insertion of the first level index entry would not exceed the predetermined threshold, then control passes to step 410, where a sub-entry is created. At step 411, the sub-entry is inserted into the new entry within the leaf node of the first-level tree. If, at step 416, the insertion of the first-level index entry exceeds the predetermined threshold, then the sub-entry is inserted in a second level sub-tree as described earlier (steps 418 through 424). ENTRY DELETION PROCESS When a document is removed from the collection of documents associated with an inverted index, then the index entries for that document must be removed from the index. The removal of entries may cause the number of sub-entries in a second-level tree to fall below a predetermined threshold. Under these conditions, the sub-entries stored in the second-level tree may be moved into the index entry within the leaf node of the first-level tree, and the second level tree may be deleted. REDUCED STORAGE COSTS A two-level index structure constructed in the manner described above potentially has reduced storage costs as compared to a conventional one-level index. Storage cost may be reduced because, for each of the nested groups, the first-level sub-key value is stored only once in the top-level index. In contrast, a sub-key value will be repeated once per index entry in a single level index organization. Prefix compression can mitigate the first-level sub-key duplication costs in a single level index organization. However, the compression is applicable only for all entries in a single leaf block. Thus, if the entries with the same first-level key value span multiple blocks, the key value needs to be repeated at least once per leaf block. IMPROVED CONCURRENCY A process must typically acquire a lock that prevents other processes from accessing a node of an index before the process can modify the node. Each such locks may prevent other processes from accessing the node to process other queries. When many such locks are held on nodes of a B-tree, the number of processes that can concurrently process queries using the B-tree is reduced. Without nested data retrieval structures, the insertion of entries into a B-tree index always results in updates being made to one or more leaf nodes of the index, and possibly updates to branch nodes as a result of rebalancing operations. When nested data retrieval structures are employed as described above, many insertion operations will only require updates to the nested data retrieval structures. As the first-level tree becomes saturated, it becomes less likely an insertion will require modification of the first-level tree. Consequently, fewer locks will be acquired on the nodes of the first-level tree, and concurrency within the database system is improved. MULTI-LEVEL INDEXES In the foregoing discussions, embodiments of the present invention have been described with reference to two-level indexes for the purposes of explanation. However, the present invention is not limited to indexes with any particular number of levels. For example, the key of an index may be divided into five sub-keys. When the size of the group of sub-entries for a particular first-level sub-key value exceeds a predetermined threshold, a second-level tree is built for that first-level sub-key value based on the second-level sub-key. Similarly, when the size of the group of sub-entries for a particular second-level sub-key value exceeds a predetermined threshold, a third-level tree is built for that second-level sub-key value based on the third-level sub-key. This process may continue to support any number of levels. FIG. 5 illustrates an inverted index that has four levels. Each leaf node in the first-level tree 502 contains entries for a particular range of first-level sub-key values. The first-level tree 502 includes a leaf node 504 with entries that identify two second-level trees 506 and 508. Each leaf node in the second level trees 506 and 508 contains entries for a particular range of second-level sub-key values associated with a particular first-level sub-key value. The second-level tree 506 has leaf nodes 510 and 512 that include entries that identify two third-level trees 514 and 516, respectively. The second-level tree 508 has a leaf node 518 that includes an entry that identifies a third-level tree 520. Each leaf node in the third level trees 514, 516 and 520 contains entries for a particular range of third-level sub-key values associated with a particular <first-level sub-key, second-level sub-key> combination. The third level tree 516 has a leaf node 522 that includes an entry to a fourth-level tree 524. Each leaf node in the fourth-level tree 524 contains entries for a particular range of fourth-level sub-key values associated with a particular <first-level sub-key, second-level sub-key, third-level key> combination. In the embodiments described above, a lower-level data retrieval structure is not created until the number of sub-entries in the upper-level leaf node for a given sub-key exceeds a predetermined threshold. However, alternative embodiments of the invention may never store sub-entries in the leaf nodes of the upper-level trees. In such embodiments, every upper-level index entry would have a corresponding lower-level data retrieval structure. NON-B-TREE STRUCTURES The embodiments described above include multiple-level B-trees. However, the present invention is not limited to any particular types of data retrieval structures. Rather, each level of the multiple-level retrieval structure may include any type or any combination of types of data retrieval structures, including B-trees, linear lists, B+trees, R-trees, and hash tables. For example, the first-level sub-key value may be used to hash to an entry in a first-level hash table. The entry in the hash-table may contain a pointer to a second-level B-tree for the first-level sub-key value. The second-level B-tree is built on the second-level sub-key. The second-level B-tree may be traversed based on the second-level sub-key to locate a leaf node that contains the non-key column data for that particular <first-level sub-key, second-level sub-key> combination. MULTI-LEVEL INVERTED INDEXES WITH TABLES While it is preferred to store the non-key column data in the sub-entries of the leaf nodes, as described above, the non-key column data may alternatively be stored in a separate table. For example, FIG. 6 illustrates an inverted index implemented using a multiple-level retrieval structure 602 in association with a table 604. The multiple-level retrieval structure 602 includes one first-level B-tree 606 built on a first sub-key, and two second-level B-trees 608 and 610 that are built on second-level sub-keys that are associated with particular first-level sub-key values. The sub-entries in the leaf nodes of B-trees 606, 608 and 610 do not store non-key column values. Instead, the sub-entries either store pointers to a lower-level data retrieval structure, or a pointer to an entry in table 604 that contains the non-key column values associated with a particular key value. SPATIAL DATA APPLICATIONS In the foregoing discussion, embodiments of the invention have been described with reference to text-based information retrieval. However, the present invention is not limited to the retrieval of any particular type of information. For example, a multiple-level data retrieval structure may be used to retrieve information about spatial data. With spatial data, three-dimensional coordinates are mapped to a one dimensional value referred to as an hhcode. The hhcode for points within the same sub-region of space have a common prefix. A multi-level index may be built on the hhcode where the prefix portion of the hhcode is used as the first-level sub-key to build a first-level data retrieval structure, and the remainder of the hhcode is used as a second-level sub-key to build second-level data retrieval structures. In the foregoing specification, the invention has been described with reference to specific embodiments thereof. It will, however, be evident that various modifications and changes may be made thereto without departing from the broader spirit and scope of the invention. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense.
|
Same subclass Same class Consider this |
||||||||||
