Parallel searching technique5884303Abstract A parallel query manager accepts a list of file extents to be searched and produces a number of search lists, one for each disk to be searched. The query manager first uses a mapper to find out how the database spaces are stored on disk. It then matches the search extent list with the mapping information to determine which parts of which disks are to be searched. It then initiates several searches in parallel so that all the affected disks can be kept busy at the same time. The query manager then checks for return data on each stream, and merges the results. Claims I claim: Description BACKGROUND OF THE INVENTION
______________________________________
Level Number of Type-dependent
number components Type data
______________________________________
0 3 concat volume A, size = 8 Mb
1 0 raw disk 1, offset = 0,
length = 2.5 Mb
1 0 raw disk 2, offset = 48 k,
length = 3 Mb
1 0 raw disk 3, offset = 2 Mb,
length = 2.5 Mb
______________________________________
This mapping table represents a file which is mapped to a logical volume A, which in turn is mapped to three fragments, each stored on a separate physical disk. This mapping can be represented graphically as a tree structure, as shown in FIG. 2. When the RDBMS receives a database query, it analyses the query to discover which files and which extents within those files it requires to access in order to answer the query. From this analysis, the RDBMS generates a dataspace extent list, comprising a number of entries, one for each of the extents to be accessed. Each entry includes the following items: A file reference (file descriptor or full file name). The offset of the start of the extent relative to the start of the file. The length of the data area to be searched in this extent. The RDBMS then sends a bulk input request to the parallel search manager. The bulk input request comprises two items: the number of extents to be searched a pointer to the dataspace extent list. FIG. 3 is a flow chart showing the operation of the parallel search manager 18, when it receives a bulk input request from the RDBMS. (Step 31) The parallel search manager first scans the dataspace extent list to identify which files are referenced in this list. This step generates a file list, comprising the following information: the number of files a set of pointers to the files. (Step 32) For each file in the file list, the parallel search manager makes a request to the mapper 22 via a system call. In response to each request, the mapper returns a mapping table as described above, indicating how the file is mapped on to the physical disks. (Step 33) Using the mapping tables returned by the mapper, the parallel search manager constructs a list of the physical disks that might be involved in the search, and of the file or files that may have data on each disk. (Step 34) The parallel search manager then performs an outer loop, which selects in turn each of the physical disks that might be involved in the search. Within this outer loop, there is an inner loop, which selects in turn each of the files that may have data on the currently selected disk. The inner loop contains a "Create Search List" routine. This routine generates a search list, identifying the extents, or parts of extents, of the selected file that map on to the selected disk. The extents in the search list are identified in logical terms (that is as file, offset, length) and so the list is still valid even if a disk mirror has failed or a file system has tidied itself. The "Create Search List" routine is described in detail below, with reference to FIG. 4. (Step 35) When all the required search lists have been created, each list may be independently passed to the SCAFS driver 20, with a request for it to initiate a search through the specified extents on the specified disk. The SCAFS Driver translates the file offsets to disk offsets (disk addresses) and passes the lists to the respective SCAFS units. Several SCAFS searches are initiated in parallel, so that all the affected disks are busy at the same time. Each SCAFS unit performs the requested searches and returns a stream of selected rows or records to the host. Alternatively, a disk search could be started as soon as the first non-empty search list has been generated. It is also possible to limit the number of parallel searches and to generate new search lists as executing searches are completed. This is useful as it is better to allocate parallel disk searches to different SCSI channels and spread the load on system resources. (Step 36) The parallel search manager checks for return data on each stream, and merges the results of all the streams for the RDBMS as they become available. FIG. 4 shows the "Create Search List" routine mentioned above, for identifying the portions of the current target file that map on to the current target disk. (Step 41) The routine scans the dataspace extent list supplied by the RDBMS, and from this creates a file extent list, consisting of a list of the extents to be searched in the currently selected file, in ascending offset order. (Step 42) The routine initialises a byte count value to zero, and selects the first extent on the file extent list. It also positions a pointer at the start of the mapping table. (Step 43) The routine then advances the pointer through the mapping table, searching for the next entry relating to a physical disk fragment. (That is, an entry of the type "raw"). (Step 44) When the next entry relating to a fragment is found, the routine increments the byte count by adding the length of the fragment. The byte count therefore indicates the position of the end of the fragment. (Step 45) The routine then checks whether the byte count is greater than the offset value of the currently selected extent, i.e. whether the extent maps (at least partially) into the fragment. If so, the routine proceeds to Step 46; otherwise, it returns to Step 43 above, to search for the next fragment. (Step 46) The routine checks whether the fragment is on the target disk (i.e. the currently selected disk). If so, the routine proceeds to Step 47; otherwise it proceeds to Step 50. (Step 47) If the fragment is on the target disk, the routine identifies the overlap between the extent and the fragment. It then creates an entry in the output search list, this entry including the file name, offset and length of the overlap area. (As an added refinement the disk offset can also be included in the entry, and used to order the disk search extents and minimise disk head movement.) (Step 48) The routine then checks whether the currently selected extent has been exhausted, i.e. whether the byte count is greater than the sum of the extent's offset and length. If so, the routine proceeds to Step 49. If, on the other hand, the extent has not been exhausted (i.e. the extent continues into the next fragment), the routine returns to Step 43 above, so as to search for the next fragment. (Step 49) If the currently selected extent has been exhausted, the next extent in the file extent list is now selected, and the routine returns to Step 45 above. (Step 50) If the currently selected fragment is not on the target disk, the procedure checks whether the byte count is greater than the sum of the extent's offset and length, i.e. whether the currently selected extent terminates within the fragment. If so, the routine proceeds to Step 51; otherwise, returns to Step 43 above, so as to search for the next fragment. (Step 51) The routine selects the next extent in the file extent list, and returns to step 50. The above loop, comprising Steps 43-51, is repeated until it is found (at Step 49 or 51) that there are no more extents in the file extent list to be processed. If the end of the mapping table is reached before this, (i.e. there are no more fragments to process), an error has occurred. For mirrored volumes the search for the fragment follows one mirror only. The actual decision on the mirror to be searched is taken by the operating system. SUMMARY In summary, the parallel query manager accepts a list of extents to be searched and organises them into an efficient sequence for searching. This enables an extremely high data search rate to be achieved. This solution has the following benefits: It is architecturally simpler than multi-process or multi-threading solutions. It takes account of physical data placement so can be optimised to minimise disk head movement and maximise data input rates. The search activity can be scheduled to make best use of system resources. It does not rely on availability of multiple processors and is quite effective even when only one processor is available. SOME POSSIBLE MODIFICATIONS It will be appreciated that many modifications may be made to the system described above without departing from the scope of the present invention. For example, instead of using search processors, the searching may be done by the RDBMS itself. In this case, the parallel search manager would be used in the same way as described above, to create lists of search areas to for each disk. A bulk input manager, resident in the host processor, would then use these lists to drive a series of asynchronous block reads through the disk driver, so as to read the required data into the host, for searching by the RDBMS. This possibility is illustrated in FIG. 5.
|
Same subclass Same class Consider this |
||||||||||
