Generating database or data structure (e.g., via user interface)

System and method for use and storage of geographic data on physical media

5968109

Abstract

An improved method and system for storage of geographic data on physical storage media. The geographic data are stored in a manner that facilitates and enhances use and access of the data by various navigation application functions in navigation systems that use the data. The geographic data includes a parcelization that separates the geographic data into parcels having less than or equal to a maximum parcel size but having at least a desired fill percentage. The parcelization method also provides for a division arrangement that facilitates addressing and identification of the parcels. According to a further aspect, the geographic data includes special nodal entities that are used to collapse complex intersections, such as roundabouts, cloverleaves, and divided highways, into simpler data representations. The special nodal entities are associated with road segment data entities and used in a route calculation program in place of regular node entities. Further, the geographic data include a normalized attribute array that includes reoccurring combinations of certain selected attributes of the geographic data. Indices to the array are included in place of data corresponding to the selected attributes. When a navigation application program requests data, an entry in the normalized attribute table pointed to by an index in the data is used to return the requested data in the particular combination of attributes from the normalized attribute array. The geographic data is compiled by a method that facilitates access to the data on a physical medium. According to the compilation method, data files to be stored on the medium are organized into parcels. The data records within the data files are identified by the parcel in which they are located. An arrangement of all the data files on the medium is determined and a parcel identification related to the medium is assigned to each parcel. Cross references between data records are updated to include the assigned parcel identifications and the parcels are stored on the medium.


Claims

We claim:

1. A method of storing a plurality of records of geographic data on a storage medium, wherein each record represents a physical feature having a physical location in a geographic region, the method comprising the steps of:

separating said plurality of records into first and second groupings of records wherein the records in said first of said groupings represent physical features having geographic locations encompassed within a first sub-rectangular area and the records in said second of said groupings represent physical features having geographic locations encompassed within a second sub-rectangular area,

wherein said first and said second sub-rectangular areas are formed by a division at a position of a rectangular area that encompasses the locations of the physical features represented by the plurality of records in said first and second groupings, wherein said position of said division is determined by

evaluating a plurality of trial divisions of said rectangular area; and

selecting one of said plurality of trial divisions based upon resultant sizes of said first and second groupings.

2. The method of claim 1 wherein ends of portions of road segments located in said geographic region are represented by node data records which are included among the plurality of records of geographic data, the method further comprising:

prior to the step of separating, identifying complex intersections in said geographic data; and

representing said complex intersections as a single node data record.

3. The method of claim 1 further comprising the step of:

comparing resultant sizes of said first and second groupings derived from said evaluation of said plurality of trial divisions to a first range of sizes;

and wherein said step of separating comprises separating said plurality of records into first and second groupings based upon at least one of said groupings corresponding to said first range of sizes.

4. The method of claim 1 further comprising the step of:

comparing resultant sizes of said first and second groupings derived from said evaluation of said plurality of trial divisions to a first range of sizes;

and wherein said step of separating comprises separating said plurality of records into first and second groupings based upon both said first and second groupings corresponding to said first range of sizes.

5. The method of claim 4 wherein said first range of sizes is based upon a fill percentage of between 80 and 100 percent for a parcel formed of one of said groupings of records.

6. The method of claim 1 further comprising:

ascertaining an aspect ratio of at least one trial rectangle formed by each of said trial divisions; and

wherein said selecting step is based upon evaluating both said aspect ratio and said resultant sizes of groupings formed by each of said trial divisions.

7. The method of claim 1 wherein said trial divisions are located at a position m/2.sup.n along a side of said rectangle where n is an integer between and including 1 and 5 and m is an integer between and including 1 and (2.sup.n -1).

8. The method of claim 1 further comprising;

upon determining that a size of one of said first and second groupings formed by said separating step exceeds a predetermined threshold, separating said one grouping into further first and second groupings by applying the same separating, evaluating, and selecting steps to said one grouping that had been applied to said plurality of records.

9. The method of claim 8 further comprising the step of:

forming any of said first and second groupings of records, and the further first and second groupings of records formed therefrom, into a parcel upon evaluation of a size of said grouping not exceeding a predetermined desired maximum parcel size.

10. The method of claim 1 further comprising;

if a size of one of said first and second groupings formed by said separating step exceeds a predetermined desired maximum parcel size, continuing to apply said separating, evaluating, and selecting steps to the plurality of data records in said one grouping and to all groupings formed therefrom until each of said groupings formed therefrom has a size that does not exceed the predetermined desired maximum parcel size.

11. The method of claim 1 wherein said plurality of records represent segments of roads in said geographic region.

12. The method of claim 11 wherein said plurality of records are routing data records that include information about speed of travel and turn restrictions along each of said segments of roads.

13. The method of claim 1 wherein said plurality of records are cartographic data records that represent shapes of segments of roads and that are used to display said segments of roads on a computer display.

14. The method of claim 1 further comprising the step of:

forming one of said first and second groupings of records into a parcel upon evaluation of a size of said grouping not exceeding a predetermined desired maximum parcel size, wherein said parcel size represents a minimum quantity of data that can be accessed by a navigation system in a single access operation.

15. The method of claim 14 wherein said forming step further comprises the step of:

when forming said one of first and second groupings of records into a parcel, adding an amount of padding to said grouping of records so that said grouping and said padding together are equal to the predetermined desired maximum parcel size.

16. The method of claim 1 wherein the geographic data includes data entities that represent segments of roads in said geographic region, wherein each segment of a road is associated with two nodes located at respective endpoints thereof, the method further comprising the steps of:

prior to the step of separating,

evaluating each of the nodes of represented segments of roads with a plurality of criteria, wherein the plurality of criteria include

a first criterion that any number of segments, other than exactly 2 segments, can meet at a node, and

at least one other criterion;

determining that a node is an aggregated-significant node based upon said step of evaluating;

forming an aggregated segment data entity that represents in aggregation a plurality of road segments that connect between aggregated-significant nodes; and

storing the aggregated segment data entity as one of the plurality of records.

17. The method of claim 1 further comprising:

prior to the step of separating, identifying common attributes in a portion of the plurality of records,

forming an array containing said common attributes; and

in said portion of the plurality of records, substituting said common attributes with a reference to said array.

18. The method of claim 17 further comprising:

storing said array with said geographic data on said storage medium.

19. A database formed according to the method of claim 1, wherein said database is stored on a computer-readable medium.

20. A method of forming a geographic database that includes data entities that represent segments of roads in a geographic area, wherein each segment of a road is associated with two nodes located at respective endpoints thereof, the method comprising the steps of:

evaluating each of the nodes of represented segments of roads with a plurality of criteria, wherein the plurality of criteria include

a first criterion that any number of segments, other than exactly 2 segments, can meet at a node, and

at least one other criterion;

determining that a node is an aggregated-significant node based upon said step of evaluating;

forming an aggregated segment data entity that represents in aggregation a plurality of road segments that connect between aggregated-significant nodes;

storing the aggregated segment data entity in the geographic database,

for each aggregated segment data entity formed by the forming step, forming abbreviated segment data entities that represent the plurality of segments of roads that are represented by the aggregated segment data entity; and

storing the abbreviated segment data entities in the geographic database.

21. The method of claim 20 wherein said at least one other criterion comprises:

a second criterion that a restricted driving maneuver extends across the node.

22. The method of claim 20 wherein the geographic database is formed of a plurality of layers wherein each layer includes a separate collection of data entities and inclusion of a data entity in a given layer is based upon a functional rank of at least one road segment represented thereby, and wherein the abbreviated segment data entities formed to represent the plurality of segments of roads that are represented by an aggregated segment data entity are stored in a same layer as the aggregated segment data entity.

23. A database formed according to the method of claim 20, wherein said database is stored on a computer-readable medium.

24. A method of forming a geographic database that includes data entities that represent segments of roads in a geographic area, wherein each segment of a road is associated with two nodes located at respective endpoints thereof, the method comprising the steps of:

evaluating each of the nodes of represented segments of roads with a plurality of criteria, wherein the plurality of criteria include

a first criterion that any number of segments, other than exactly 2 segments, can meet at a node, and

at least one other criterion;

determining that a node is an aggregated-significant node based upon said step of evaluating;

forming an aggregated segment data entity that represents in aggregation a plurality of road segments that connect between aggregated-significant nodes;

storing the aggregated segment data entity in the geographic database;

for each node which is not an aggregated-significant node, forming an abbreviated node data entity and including a reference thereto in the aggregated segment data entity that represents the segments of roads connected thereto; and

storing the abbreviated node data entity in the geographic database.

25. The method of claim 24 wherein the geographic database is formed of a plurality of layers wherein each layer includes a separate collection of data entities and inclusion of a data entity in a given layer is based upon a functional rank of the at least one road segment represented thereby, and wherein each abbreviated node data entity is located in a same layer as the aggregated segment data entity that refers thereto.

26. A database formed according to the method of claim 24, wherein said database is stored on a computer-readable medium.

27. A method of manufacturing a geographic database for a geographic region for use by a navigation application program, wherein said geographic database is stored on a computer-readable medium and wherein said geographic database includes a plurality of records wherein each record corresponds to a physical feature having a physical location in the geographic region, the method comprising the steps of:

arranging said records of geographic data into a plurality of parcels according to a method that stores in each parcel records of geographic data that represent features having physical locations in proximity to each other and wherein the records stored in any one parcel represent features having locations encompassed within a corresponding rectangular area associated therewith and located in said geographic region;

wherein a size and location of each rectangular area associated with a parcel are determined by a series of divisions of a rectangular area encompassing all of said features represented by said plurality of records, wherein each division of said series of divisions subsequent to an initial division is made on a rectangular area formed from a preceding division; and

wherein each division of a rectangular area into further rectangular areas is made at a location of said rectangular area based upon a comparison of quantities of data that represent features encompassed by each of said rectangular areas formed by said division to a plurality of ranges of acceptable sizes derived from a desired fill percentage of parcels with data.

28. The method of claim 27 wherein said plurality of ranges of acceptable sizes is based upon a fill percentage of between 80 and 100 percent for a parcel formed therefrom.

29. The method of claim 27 wherein said division of a rectangular area into further rectangular areas is also based upon an evaluation of aspect ratios of the rectangular areas formed by said division.

30. The method of claim 27 wherein a division of a rectangular area into further rectangular areas is located at a position m/2.sup.n along a side of said rectangular area where n is an integer between and including 1 and 5 and m is an integer between and including 1 and (2.sup.n -1).

31. The method of claim 27 wherein said records represent segments of roads in said geographic region.

32. The method of claim 31 wherein said records comprise routing data records that include information about speed of travel and turn restrictions along each of said segments of roads.

33. The method of claim 31 wherein said records comprise cartographic data records that represent shapes of said segments of roads and that are used to display said segments of roads on a computer display.

34. The method of claim 27 wherein said desired fill percentage is based upon a parcel size not exceeding a minimum quantity of data that can be accessed by a navigation system in a single access operation.

35. The method of claim 27 further comprising the step of:

when forming a parcel of records of geographic data, if a size of said records is less than a maximum size for a parcel, adding an amount of padding equal to the difference between the size of the records of geographic data and the maximum parcel size.

36. A database formed according to the method of claim 27, wherein said database is stored on a computer-readable medium.

37. A geographic database for use with a navigation application program that uses said geographic database, wherein the geographic database comprises:

segment data entities that represent segments of roads;

aggregated segment data entities that represent aggregations of segments of roads; and

abbreviated segment data entities that are abbreviated representations of the segments of roads represented by the aggregated segment data entities,

and wherein each of the aggregated segment data entities refers to those abbreviated segment data entities that represent the same segments of roads that are represented in aggregation thereby,

and further wherein the geographic database is comprised of a plurality of separate layers of segment data entities, aggregated segment data entities and abbreviated segment data entities, wherein each of said plurality of separate layers includes a separate collection of said data entities based upon functional ranks of the represented segments of roads,

and wherein the abbreviated segment data entities that are referred to by an aggregated segment data entity and the aggregated segment data entity that refers to those abbreviated segment data entities are located in a same layer.

38. A database according to claim 37, wherein said database is stored on a computer-readable medium.

39. A geographic database for use with a navigation application program that uses said geographic database, wherein the geographic database comprises:

segment data entities that represent segments of roads;

aggregated segment data entities that represent aggregations of segments of roads; and

abbreviated segment data entities that are abbreviated representations of the segments of roads represented by the aggregated segment data entities,

abbreviated node data entities that are abbreviated representations of nodes which are endpoints of segments of roads represented in aggregations by aggregated segment data entities and wherein the nodes which are represented by abbreviated node data entities are internal of endpoints of the aggregations of road segments represented by the aggregated segment data entities,

wherein each of the aggregated segment data entities refers to those abbreviated segment data entities that represent the same segments of roads that are represented in aggregation thereby,

and wherein each of the aggregated segment data entities refers to those abbreviated node data entities that represent the nodes at the endpoints of the segments of roads represented in aggregation by the aggregated segment data entity and that are located internal of the endpoints of the aggregated segment data entity.

40. The invention of claim 39 wherein each of said aggregated segment data entities includes attributes that include at least one of:

a length of the segments of roads represented in aggregation;

an average speed along the segments of roads represented in aggregation; and

a transit time along the segments of roads represented in aggregation.

41. A database according to claim 39, wherein said database is stored on a computer-readable medium.

42. A method of formatting geographic data for storage on a computer-readable storage medium, said geographic data relating to a geographic region and said geographic data being nonuniform in density over the geographic region, the method comprising the steps of:

dividing said geographic data into a first plurality of portions, wherein each one of said first plurality of portions includes geographic data that correspond to geographic positions encompassed within a rectangular area of the geographic region, and wherein a rectangular area encompassing geographic positions corresponding to any one of said plurality of portions is separate from rectangular areas encompassing the remaining of said plurality of portions;

with respect to each portion of said first plurality of portions that exceeds a predetermined multiple of a predetermined maximum parcel amount,

forming smaller portions of said portion by repeating a regular separation process upon said portion, and the portions formed therefrom, until all resultant portions do not exceed the predetermined multiple of the maximum parcel amount;

wherein the regular separation process applied to a portion forms a pair of resultant portions, wherein each of said pair of resultant portions includes geographic data that correspond to geographic positions encompassed within a separate equal-sized rectangular area, wherein said separate equal-sized rectangular areas together correspond in size to a rectangular area encompassing the portion that had been divided to form the pair of resultant portions;

with respect to each portion of said first plurality of portions and each of the resultant portion that exceeds the maximum parcel amount but is not greater than a predetermined multiple of the maximum parcel amount,

forming smaller portions of said portion by repeating a special separation process upon said portion, and the portions formed therefrom, until all resultant portions do not exceed the maximum parcel amount;

wherein the special separation process applied to a portion forms a pair of resultant portions, wherein each of said pair of resultant portions includes geographic data that correspond to geographic positions encompassed within a separate rectangular area, wherein said separate rectangular areas together correspond in size to a rectangular area encompassing the portion that had been divided to form the pair of resultant portions, and wherein one of said portions includes an amount within ranges defined between an integer multiple of a desired fill percentage times the maximum parcel amount and the integer multiple of the maximum parcel amount; and

with respect to each portion of said first plurality of portions and said portions formed by said regular and special separation processes that is not greater than a maximum parcel amount, forming a parcel of said portion.

43. A method of storing geographic data in a computer-readable storage medium, said geographic data relating to a geographic region and said geographic data being nonuniform in density over the geographic region, the method comprising the steps of:

formatting the geographic data according to the method of claim 42;

storing said geographic data in their corresponding portions; and

storing said portions onto the computer-readable storage medium.

44. A method of storing geographic data in a computer-readable storage medium, said geographic data relating to a geographic region and said geographic data being nonuniform in density over the geographic region, the method comprising the steps of:

separating said geographic data into a first plurality of portions, wherein each of said first plurality of portions includes geographic data that corresponds to geographic positions encompassed within a separate rectangular tile, wherein a grid composed of a plurality of said separate rectangular tiles encompasses the geographic region;

with respect to each portion of said first plurality of portions that is not greater than a maximum parcel amount, forming a parcel of said portion,

with respect to each portion of said first plurality of portions that exceeds a predetermined multiple of said maximum parcel amount, separating said portion to form a pair of resultant portions, wherein each of said pair of resultant portions includes geographic data that correspond to geographic positions encompassed within a separate equal sized rectangular tile, wherein said separate equal sized rectangular tiles together correspond in area to the rectangular tile encompassing the geographic data that had been divided to form the pair of resultant portions;

with respect to each resultant portion that exceeds said predetermined multiple of said maximum parcel amount, continuing to divide said resultant portion and portions resultant therefrom to form a pair of further resultant portions, wherein each of said pair of further resultant portions includes geographic data that correspond to geographic positions encompassed within a separate equal sized rectangular tile;

with respect to each portion of said first plurality of portions that exceeds said predetermined maximum parcel amount but does not exceed said predetermined multiple of said maximum parcel amount and each resultant portion that exceeds said predetermined maximum parcel amount but does not exceed said predetermined multiple of said maximum parcel amount, separating said portion to form a pair of resultant portions, wherein each of said pair of resultant portions includes geographic data that correspond to geographic positions encompassed within a separate rectangle, wherein said separate rectangles together correspond in area to the rectangular tile encompassing the geographic data that had been divided to form the pair of resultant portions, and

wherein both said resultant rectangles have a first dimension equal to a first dimension of the tile from which said rectangles were formed;

wherein one of said rectangles has a second dimension equal to M times 2.sup.-N times a second dimension of the tile from which said rectangles were formed, and

wherein the other of said rectangles has a second dimension of (2.sup.N -M) times 2.sup.-N times the second dimension of the tile from which said rectangles were formed,

wherein N is a positive integer greater than 1 and M is a positive integer less than 2.sup.N, and

wherein M is chosen so that said first and said second portions can be divided into as few further rectangles as possible each of said further rectangles encompassing a quantity of data exceeding a minimum fill percentage of said maximum parcel amount.

45. A method of formating a geographic data base for storage on a computer-readable medium wherein said geographic data base includes geographic data corresponding to a geographic region, the method comprising the steps of:

separating said geographic data into two portions so that the two portions each include geographic data encompassed by equal sized rectangles of said geographic region;

continuing to perform said separating step upon the portions of geographic data until a portion of said geographic data formed by said separating is less than a predetermined multiple of a maximum parcel size, said portion of geographic data that is less than the predetermined multiple of the maximum parcel size corresponding to geographic data encompassed by a first rectangle of said geographic region having a first side of a first dimension and a second side of a second dimension;

separating said portion of geographic data that is less than the predetermined multiple of the maximum parcel size into two smaller portions having first and second amounts of geographic data,

wherein said first amount of geographic data exceeds an integer multiple of a minimum parcel fill percentage times the maximum parcel size but does not exceed the integer multiple times the maximum parcel size, and

wherein said first amount of geographic data is encompassed by a second rectangle of said geographic region said second rectangle having a first side corresponding in length to the first side of said first rectangle and said second rectangle having a second side having a length that is one of 1/4, 1/2, and 3/4 of said second side of said first rectangle.

46. The method of claim 45 wherein geographic data for route calculation is stored separately on said medium from geographic data for map display, and wherein said geographic data for map display is separated into parcels along the same geographic boundaries as the geographic data for route calculation.

47. An improved geographic database for a geographic region for use with a navigation system, said improved geographic database stored on a computer-readable physical storage medium, the improved database comprising:

a plurality of data entities stored on said physical storage medium, said plurality of data entities representing physical features in the geographic region;

a normalized attribute array stored on said physical storage medium, said normalized attribute array having a plurality of entries each of said entries having an index reference and a specific combination of a plurality of attributes that describe a physical feature;

wherein each of said data entities includes an index reference; and

whereby attributes describing a feature represented by a data entity having a first index reference are determined by reference to an entry in the normalized attribute array corresponding to said first index reference.

48. The improved geographic database of claim 47 wherein a first plurality of data entities have the first index reference and wherein attributes describing each of said first plurality of data entities are determined by reference to the entry in the normalized attribute array corresponding to said first index reference.

49. The improved geographic database of claim 48 wherein a second plurality of data entities have a second index reference and wherein attributes describing each of said second plurality of data entities are determined by reference to the entry in the normalized attribute array corresponding to said second index reference.

50. The invention of claim 47 wherein said normalized attribute array contains combinations of pluralities of attributes that occur most commonly in the plurality of data entities.

51. The invention of claim 47 wherein said normalized attribute array contains 256 entries.

52. The invention of claim 47 wherein the physical features represented by said plurality of data entities comprise road segments.

53. The invention of claim 47 wherein the plurality of data entities comprise cartographic data records.

54. The invention of claim 47 wherein the plurality of data entities comprise route calculation data records.

55. The invention of claim 47 wherein attributes describing a feature represented by a data entity having a second index reference are determined by reference to an entry in the normalized attribute array corresponding to said second index reference.

56. The improved geographic database of claim 47 wherein said normalized attribute array comprises a global array and a local array.

57. The improved geographic database of claim 47 wherein said normalized attribute array comprises a global array and a plurality of local arrays, and wherein said geographic database is stored in a plurality of parcels each of which includes a plurality of data entities and a corresponding one of said plurality of said local arrays.

58. An improved geographic database for a geographic region for use with a navigation system, said improved geographic database stored on a computer-readable physical storage medium, the improved database comprising:

a plurality of data entities stored on said physical storage medium, said plurality of data entities representing physical features in the geographic region;

a global normalized attribute array stored on said physical storage medium, said global normalized attribute array having a plurality of entries each of said entries having an index reference and a specific combination of a plurality of attributes that describe a physical feature;

a plurality of local normalized attribute arrays stored on said physical storage medium, said local normalized attribute arrays having a plurality of entries each of said entries having an index reference and a specific combination of a plurality of attributes that describe a physical feature; and

wherein each of said data entities includes an index reference;

whereby attributes describing a feature represented by a data entity having a first index reference are determined by reference to an entry corresponding to said first index reference in one of the global normalized attribute array and the plurality of local normalized attribute arrays.

59. The improved geographic database of claim 58 wherein said geographic database is parcelized on said storage medium and further wherein each of said local normalized attribute arrays is associated with a parcel, and wherein each of said local normalized attribute arrays includes specific combinations of a plurality of attributes that describe a physical feature represented by a data entity stored within the parcel associated therewith .


Description

REFERENCE TO MICROFICHE APPENDIX

Included with and forming part of this specification is a microfiche appendix, Appendix A, including 20 sheets of 1867 total frames and a second microfiche appendix, Appendix B, including 1 sheet of 43 total frames.

REFERENCE TO COPYRIGHTED MATERIAL

A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.

BACKGROUND OF THE INVENTION

The present invention relates to a system and method for storage of geographic information on physical media, and more particularly, the present invention relates to a system and method for providing geographic data on a physical storage medium for use in a computer-based navigation system.

Computer-based navigation systems for use on land have become available in a variety of forms and provide for a variety of useful features. One exemplary type of navigation system uses (1) a detailed data set (or map) of a geographic area or region, (2) a navigation application program, (3) appropriate computer hardware, such as a microprocessor and memory, and, optionally, (4) a positioning system. The detailed geographic data set portion of the navigation system is in the form of one or more detailed, organized data files or databases. The detailed geographic data set may include information about the positions of roads and intersections in or related to a specific geographic regional area, and may also include information about attributes, such as one-way streets and turn restrictions, as well as about street addresses, alternative routes, hotels, restaurants, museums, stadiums, offices, automobile dealerships, auto repair shops, etc.

The positioning system may employ any of several well-known technologies to determine or approximate one's physical location in a geographic regional area. For example, the positioning system may employ a GPS-type system (global positioning system), a "dead reckoning"-type system, or combinations of these, or other systems, all of which are well-known in the art.

The navigation application program portion of the navigation system is a software program that uses the detailed geographic data set and the positioning system (when employed). The navigation application program may provide the user with a graphical display (e.g. a "map") of his specific location in the geographic area. In addition, the navigation application program may also provide the user with specific directions to locations in the geographic area from wherever he is located.

Some navigation systems combine the navigation application program, geographic data set, and optionally, the positioning system into a single unit. Such single unit systems can be installed in vehicles or carried by persons. Alternatively, navigation application programs and geographic datasets may be provided as software products that are sold or licensed to users to load in their own personal computers. Personal computer-based systems may be stand alone systems or may utilize a communication link to a central or regional system. Alternatively, the navigation system may be centrally or regionally located and accessible to multiple users on an "as needed" basis, or alternatively, on line via a communications link. Navigation systems may also be used by operators of vehicle fleets such as trucking companies, package delivery services, and so on. Navigation systems may also be used by entities concerned with traffic control and traffic monitoring. In-vehicle navigation systems may use a wireless communication connection. Also, users may access a central navigation system over an on-line service such as the Internet, or over private dial-up services, such as CompuServe, Prodigy, and America Online.

Computer-based navigation systems hold the promise of providing high levels of navigation assistance to users. Navigation systems can provide detailed instructions for travelling to desired destinations, thereby reducing travel times and expenses. Navigation systems also can provide enhanced navigation features such as helping travellers avoid construction delays and finding the quickest routes to desired destinations. Navigation systems can also be used to incorporate real-time traffic information.

One potential obstacle to providing enhanced features in a navigation system is the need to provide the geographic information on a computer-readable storage medium in an efficient, versatile, economic, and flexible manner. In addition, the geographic information should be saved on the storage medium in a manner that facilitates access and use by the navigation application program portion of the navigation system. Accordingly, it is desired to provide an improved computer-readable storage medium product having geographic data stored thereon for use in navigation systems.

SUMMARY OF THE INVENTION

To achieve the foregoing and other objectives and in accordance with the purposes of the present invention, an improved method and system provides for storage of geographic data on physical storage media. The geographic data are stored in a manner that facilitates and enhances use and access of the data by various navigation application functions in navigation systems that use the data.

According one aspect, there is provided a parcelization method for dividing the geographic data into separate parcels. The parcelization method provides for parcels with data contents less than a specified maximum data content but having a desired fill percentage. The parcelization method also provides for a division arrangement that facilitates addressing and identification of the parcels.

In a further aspect of the parcelization process, a common set of boundaries is used to define all spatial parcels (including types such as data sets parcelized independently), such that for any two spatial parcels of any types, one parcel is completely contained in the other. This reduces the amount of data needed to represent spatial parcel boundaries in a global kd-tree index.

According to another aspect, the geographic data stored on a physical storage medium include special nodal entities. Each of special nodal entity represents a selected plurality of regular node entities in the geographic data. The selected plurality of regular node entities have been identified as related to complex intersections of multiple road segments, such as roundabouts, cloverleaves, and intersections of divided highways. In the geographic data, road segment data entities are associated with the special nodal entities instead of with the regular node entities represented by the special nodal entities. Then, in a route calculation program, the special nodal entities are used instead of the regular node entities. The special nodal entities collapse complex intersections of multiple road segments and regular node entities into simpler data representations thereby facilitating route calculation.

According to a still further aspect, a physical storage medium has stored thereon geographic data that includes at least one normalized attribute array. The normalized attribute array is provided as a separate table on the storage medium. The normalized attribute array includes reoccurring combinations of certain selected attributes within the geographic data. Within entity records in the geographic data, indices are included in place of data corresponding to the selected attributes. The indices refer to entries in the normalized attribute array. When a navigation application program accesses a data entity, the entry in the normalized attribute table pointed to by the index in the data entity is used to build the entire data record including the particular combination of attributes pointed to by the index. By including combinations of geographic data attributes in a normalized attribute array, storage space on the medium can be conserved and access to the data can be improved.

According to yet another aspect, there is provided a compilation method for providing data in a format that facilitates access to the data on a physical medium. According to the compilation method, data files to be stored on the medium are organized into parcels. The data records within the data files are identified by the parcel in which they are located. An arrangement of all the data files on the medium is determined and a parcel identification related to the medium is assigned to each parcel. Cross references between data records are updated to include the assigned parcel identifications. The data files are concatenated while maintaining the parcels and the concatenation is stored on the medium.

In another aspect, aggregated segment data are included in some layers of certain types of geographic data, such as data used for routing. These aggregated segments are used to represent a plurality of road segments and include sufficient information about intersections internal of the end points of the aggregated segments to allow a route calculation program to access an aggregated segment internally of its end points.

According to a still further aspect, shape points are generated for data entities that represent segments of roads. In collecting geographic data for use in navigation systems, shape points are determined for segments of roads that bend or curve so that the position of points along the road segment can be accurately determined. When road segments are straight, shape points are generally not included. When used in a navigation system, this lack of shape points for long, straight portions of a road segment may result in difficulty associating the road segment with a particular locality during map display or spatial searches. In this aspect of the disclosed system, shape point data are generated at intervals along straight road segments and associated with the respective road segment thereby providing a means to locate the road segment within the localities along its straight portions.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a diagram illustrating a navigation system including a storage medium upon which geographic data, and optionally other data, are stored.

FIG. 2 is diagram illustrating the software components in the navigation system of FIG. 1.

FIG. 3 is a diagram illustrating the types of navigation data files stored on the storage device of FIG. 1.

FIGS. 4A-4D are illustrations depicting the parcelization process for organizing the navigation data on the storage device of FIG. 1.

FIGS. 5A-5E are illustrations depicting the process for including references to related data in parcels that comprise some of the navigation data on the storage device of FIG. 1.

FIGS. 6A-6F are illustrations depicting the process for substituting references to supernodes in place of pluralities of regular nodes in a route calculation geographic data set as shown in FIG. 3.

FIGS. 7A and 7B are diagrams illustrating the use of normalized attribute arrays to represent certain data attributes in certain data entries in the geographic data set shown in FIG. 3.

FIGS. 8A-8D are diagrams illustrating a segment aggregation procedure for use in a route calculation geographic data set as shown in FIG. 3.

FIG. 8E is a diagram illustrating the relation of the aggregated segment in FIG. 8D to other data to which it is associated.

FIGS. 9A, 9B, and 9C are flow diagrams showing the components of an embodiment of a geographic dataset compiler for producing a geographic database for storage on a storage medium used in the navigation system of FIG. 1.

FIGS. 10A and 10B are illustrations showing generated shape points produced in a process in the compiler illustrated in FIGS. 9A-9C.

FIGS. 11A, 11B, and 11C are illustrations showing subdivision of cartographic data in a process in the compiler illustrated in FIGS. 9A-9C.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

I. Overview

Referring to FIG. 1, there is a diagram illustrating an exemplary configuration of a navigation system 10. The navigation system 10 is a combination of hardware and software components. In one embodiment, the navigation system 10 includes a processor 12, a drive 14 connected to the processor 12, and a memory storage device 16, such as a ROM, for storing a navigation application program 18. The navigation application program 18 is loaded from the ROM 16 into a memory 20 associated with the processor 12 in order to operate the navigation system. A storage medium 22 is installed in the drive 14. In one present embodiment, the storage medium 22 is a CD-ROM. In another alternative present embodiment, the storage medium 22 is a PC Card (PCMCIA card) in which case the drive 14 would be substituted with a PCMCIA slot. Various other storage media may be used, including fixed disks, hard disks, DVD's, as well as storage media that may be developed in the future. The storage medium 22 includes geographic data, as described more fully below.

The navigation system 10 may also include a positioning system 24. The positioning system 24 may utilize GPS technology, a dead reckoning-type system, or combinations of these, or other systems, all of which are known in the art. The positioning system 24 outputs a signal 26 to the processor 12. The signal 26 may be used by the navigation application program 18 that is run on the processor 12 to determine the location, direction, speed, etc., of the navigation system 10. The navigation system 10 uses the geographic data stored on the storage medium 22, possibly in conjunction with the output 26 from the positioning system 24, to provide various navigation application features. These navigation application features may include, for example, route calculation, map display, vehicle positioning (e.g. map matching), and maneuver generation (wherein detailed directions are provided for reaching a desired destination). These navigation application features are provided by navigation application programs (i.e. subprograms or functions) that are part of the navigation application 18. The navigation features are provided to the user (e.g., the vehicle driver) by means of a display 27, speakers 29, or other means.

Referring to FIG. 2, the navigation application program 18 typically is a software program that includes separate functions (or subprograms) including, for example, a route calculation function 28, a map display function 30, and a maneuver generation function 32. The navigation application program 18 may include other functions or subprograms 34 in addition or alternatively to these. Although these navigation application subprograms are represented as separate functions or applications within the navigation application program 18, these functions may be combined or otherwise provided.

In FIG. 2, the storage medium 22 is shown to have geographic data 40 stored on it. The geographic data 40 are in the form of one or more computer-readable data files or databases. The geographic data 40 may include information about the positions of roads and intersections in or related to a specific geographical regional area, and may also include information about the attributes of the roads and intersections, such as one-way streets and turn restrictions, as well as other information, such as street addresses, alternative routes, hotels, restaurants, museums, stadiums, offices, automobile dealerships, auto repair shops, etc. The regional area may include a metropolitan area, such as Chicago and its suburbs, New York and its suburbs, Los Angeles and its suburbs, or alternatively, the regional area may include an entire state, such as California, an entire country, such as the United States, or combinations of these. More than one region may be stored on a storage medium.

The various functions 28, 30, 32, and 34, of the navigation application program 18 use portions of the geographic data 40 from the medium 22 in order to provide useful navigation features to the user of the navigation system 10. Though not necessary for the present embodiments, it is preferred that the navigation system include an interface layer 41 located between the various application functions 28, 30, 32, and 34 and the geographic data files 40. The interface layer 41 facilitates the application programs in accessing and reading the geographic data 40. In one embodiment, the interface layer 41 is a collection of software library functions that isolate the navigation application functions from the details of the geographic data 40. The interface layer 41 is described in more detail in the copending application entitled "INTERFACE LAYER FOR NAVIGATION SYSTEM", Ser. No. 08/740,298, filed on even date herewith, the entire disclosure of which is incorporated herein by reference.

II. Types of Geographic Data

As mentioned above, the geographic data 40 includes detailed information about roads, intersections, speed limits, street names, location names, turn restrictions, street addresses, alternative routes, hotels, restaurants, museums, stadiums, offices, automobile dealerships, auto repair shops, etc. These data may be represented in various different ways. Some ways may be proprietary whereas other ways may be in the form of industry or de facto standards.

One format used for geographic data is the GDF (Geographic Data File) format. Other formats are available and are understood to be encompassed within the present embodiment. The GDF 3.0 format is described in a document issued by the CEN (European Committee for Standardization) on Oct. 12, 1995, the entire disclosure of which is incorporated herein by reference. The GDF format also is being considered for adoption outside of Europe by the ISO (International Standards Organization). The GDF format is an interchange format for geographic databases. The GDF format is especially suitable for transferring geographic data sets from another format or to another format. In order to use a geographic data set in a navigation system, the geographic data set may need to be converted from the GDF format into a more specialized format. This conversion process is discussed in more detail below.

For purposes of the discussion of the embodiments herein, the following terminology is used. It is understood that other terminology may be used without departing from the scope of the present disclosure.

For purposes of this embodiment, the terms "feature", "attribute", and "relationship" with respect to a geographic data set may have the definitions set forth in the CEN GDF standard, mentioned above. Specifically, the term "feature" may refer to a database representation of a real world object, the term "attribute" may refer to a property of a feature which is independent of other features, and the term "relationship" may refer to a property of a feature involving other features. In other data models, different terms may be used, but the similar concepts would be applied in a similar manner.

In the geographic data set, nodes and segments are features.

A node is a point representing the intersection of two or more roads, the end of a road, or a point along a road where the road attributes change.

A segment is a representation of a section of a navigable road. A segment has a node at each end and may have one or more shape points along its length.

The following attributes are associated with a segment:

(1) A shape point is a point along a segment where the segment bends, or where the segments cross each other at a different grades.

(2) A landmark is the intersection of a segment with a cartographic feature that is considered significant for explication purposes such as a river or railroad.

(3) A rank specifies the highest routing data layer in which the segment appears and may also correspond to a functional class of the segment.

(4) The speed category classifies the segment based upon average speed.

(5) The lane category classifies the segment based upon the number of lanes available for travel in a single direction.

(6) The route type specifies the type of route, for example, European, Autobahn, Bundesstrassen, Landesstrassen, or Kreisstrassen in Germany, or Federal, Interstate, State, or County in the U.S.A.

(7) The access characteristics specify restrictions on the types of traffic that are permitted to travel on the segment.

A POI (point-of-interest) is a feature such as a hotel, a restaurant, museum, etc. A facility type is an attribute of a POI and identifies the functional category of a POI, such as a hotel, restaurant, museum, etc.

A cartographic point is a representation of any point feature, such as a landmark.

A polygon is the border of some two dimensional area or cartographic feature, such as a lake.

A polyline is a representation of a linear cartographic feature, either non-navigable or navigable.

An administrative area is the region of a governmental entity, such as a city or county.

A zone is the region of a non-governmental name for an area, such as a neighborhood or an unincorporated village.

A Place can be an administrative area or zone.

Third party data (TPD) is information about additional points-of-interest. These additional points-of-interest data may be provided by a third party data vendor or may be otherwise provided in a manner such that they are not fully integrated with the rest of the geographic data.

In addition to the types of information described above, there may be additional types of data included on the storage medium. For example, soundex-type data may be included. This type of data provides for the identification of entities by words or phrases that sound like, or have a similar pattern to, a requested item. The data may be returned to the end-user in the form of a list of possible matches for a requested item.

III. Organization of Geographic Data

Referring again to FIG. 2, the geographic data 40 are organized and arranged in a manner that facilitates and/or enhances the performance of the various navigation application functions 28, 30, 32, and 34. Some of the aspects of the organization and arrangement of the geographic data 40 are specific to the particular physical medium used, but other aspects facilitate and enhance the performance of the navigation functions independent and regardless of the particular storage medium. The aspects of the organization of the geographic data that facilitate the navigation functions include parcelization and parcel identification of the geographic data, and the inclusion of normalized attributes, supernodes, and segment aggregation in the geographic data. Each of these aspects is described in more detail below.

Each navigation function application or subprogram 28, 30, 32, and 34, typically uses only a specific subset of the entire geographic data 40. Where the subsets overlap, the different applications making use of the same data may access them by different paths, in different orders, or may have different performance requirements.

Referring to FIG. 3, in a preferred embodiment, the geographic data 40 are organized as separate groups or subsets of the geographic data. Each of the groups includes different portions or collections of the data. The portion of the data included in each of the groups of the geographic data is related to the navigation application function that utilizes the specific collection of the data. In general, each of the functions 28, 30, 32, and 34 has its own subset or collection of the entire geographic dataset. Arrangement of the data in this manner may result in some duplication of portions of the entire collection of geographic data. However, the navigation application functions will generally run faster if each navigation application function accesses only a subset of the entire geographic data and if that subset includes only that portion of the entire geographic data set that the particular application needs. In general, the portion of the data used and associated with one function is grouped separately from the portions of the data associated with and used by the other functions. Further, in general, the data used by each one of the functions are collected together into physical proximity. At the same time, the subsets of data may share overall parcelization boundaries and other organizational structure, which structural similarities further enhance the efficiency and performance of the application programs.

Each of the groups of geographic data used by each of the functions is organized into parcels, as explained more fully below. Parcels are collections or groupings of data into similar or regularly-sized (though not necessarily equal) amounts. When stored on a storage medium, parcels may correspond to physically distinct locations existing on the storage medium. In one present embodiment, a parcel also represents the smallest quantity of data that is retrieved from the storage medium.

The size of parcels (i.e. the maximum amount of data to be included in a parcel) is predetermined taking into account several factors. One factor used to determine the parcel size includes the access characteristics of the storage medium upon which the data will be stored. These access characteristics include transfer speed and latencies. For indexed data, there is a balance between the average number of index parcel retrievals required to search for a specified record of data and the average size of the index parcels. To minimize average search time for a specified record of data, single-speed CD-ROM characteristics may determine an optimal parcel size of 4-16 sectors (8-32 KBytes), and furthermore that an index parcel should contain an amount of data equal to the average of the data contents of the parcels that it indexes. Indexes can include kd-trees for spatial data and B-trees for ordered (e.g. alphanumeric) data.

Another factor used to determine the size of parcels relates to spatial types of geometric data (such as routing and cartographic). For these types of data, special considerations, such as special types of data, may be needed to account for parcel boundaries. Therefore, smaller parcel data content implies larger total special parcel boundary data, thereby resulting in larger total database data content and potentially less efficient access.

Another factor used to determine parcel size includes the memory constraints of the navigation system that will use the data. Many navigation systems have limited memory, or memory that is optimized for use with certain-sized blocks of data. Accordingly, to the extent possible, these types of hardware requirements are also considered in determining the size of a data parcel.

The geographic data 40 includes one separate group 48 of parcels of data used by the route calculation function 28, another separate group 50 of parcels of data for the cartographic function (i.e. map display) 30, still another separate group 52 of parcels of data for the maneuver function 32, another group 53 of data for cartographic cross-references, yet another separate group 54 of parcels of data for points-of-interest geographic data. In one present embodiment, all of these groups of parcels are in one file, although there may be more than one file.

The subsets of geographic data for each of the functions are cross-referenced (and may include pointers) to provide interoperability among the functions. The physical organization represented in FIG. 3 is independent of the type of media used, and it is recognized that the implementation of the organization represented in FIG. 3 will take into account the specific features associated with various different types of physical media, such as CD-ROM disc, PC card, etc.

Some of the subsets of geographic data are organized spatially. Spatially-organized data are arranged so that the data that represent geographically proximate features are located physically proximate in the data set 40 and on the medium 22. For some of the navigation application functions, spatial organization of their respective data provides for reading closely related geographic data from the medium more quickly and loading related geographic data into memory where it can be used. This kind of organization minimizes accessing of the storage medium 22 and speeds up operation of certain of the navigation functions.

The subsets of the geographic data that are organized spatially include the route calculation data 48, the cartographic data (map display) 50, the maneuver data 52, the cartographic cross-reference data 53, the points-of-interest data 54. Other of the data are organized and accessed non-spatially. The non-spatially organized data 60 include navigable features 62 (e.g., street names), places 63 (e.g. administrative areas and zones), postal codes 64, crossroads/junctions 65 and cartographic features 66. Third-party data 61 are not organized spatially. Each record of the third-party data 61 is associated with a record in the points-of-interest (POI) data 54. Since points-of-interest data 54 are organized spatially, spatial access to third-party data 61 can be achieved via their associated points-of-interest data 54.

In a preferred embodiment, both the route calculation portion 48 of the data and the cartographic portion 50 of the data are layered. Each layer within each data type extends to cover the same geographic region. However, higher layers of a data type contain less detail than lower layers. For example, layer 0 of the route calculation portion of the data generally will be the most detailed and will include all the streets and intersections in the geographical region to which the geographic data set 40 corresponds. Layer 1 of the route calculation data generally will omit the slower or less important streets in the geographical region, layer 2 generally will omit the next most slower or less important streets, and so on. (The inclusion of streets in given layers can be defined by the rank attribute. For example, assuming the street segments are given ranks of I to IV depending on the street type (e.g. alley or interstate), Layer 0 can be defined to include all ranks I-IV, Layer 1 can be defined to include only ranks II-IV (omitting rank I streets), and so on.) The route calculation function may be performed by using combinations of the layers, using the higher layers to the extent possible.

The map display function 30 also benefits from having its subset 50 of the geographic data organized to facilitate rapid panning and zooming. For example, zooming may be done more efficiently if the subset 50 of geographic data is organized into layers, with greater detail at the lower layers and less detail at the higher layers. Furthermore, in the subset 50 of geographic data used by the map display function 30, if each of the data parcels contains an index of its neighboring parcels at the same layer, as well as at layers above and below it, then panning and zooming can be done more efficiently by the map display function 30 without further looking up in a separate index file. The internal index of the parcel's neighbors at the same and other layers can be generated efficiently if parcel boundaries are established using the parcelization method described below.

The route calculation function 28 benefits from having its portion 48 of the geographic data 40 organized to facilitate searches for an optimal route between two points. Also, the route calculation function 28 can run more rapidly where as much of the geographic search data as possible is kept in memory 20, especially if access to the storage medium 22 is relatively slow, such as with a CD-ROM. Accordingly, the subset 48 of the geographic data 40 used by the route calculation function 28 is preferably organized so that each parcel covers a maximum possible physical geographic area (within the constraints of a fixed buffer size or a limited number of buffer sizes). This objective can be met by limiting the data or information in route calculation data parcels to only that information specifically relevant to route calculation, and including as little irrelevant information as possible. For example, in the subset 48 of the geographic data used for route calculation, the names of streets may be omitted.

Information such as street names is used instead by the maneuver function in order to generate directions to an end-user for route guidance. Thus, street names and additional information used for the maneuver function 32 are included in separate spatial and non-spatial data parcels 52. However, street name information is not included in the route calculation subset 48 or cartographic subset 50.

The points-of-interest geographic data 54 are parcelized spatially and may be interleaved with the cartographic data 50 to facilitate both the display of points-of-interest on the screen, and point-and-click selection of points-of-interests by users mostly to obtain points-of-interest "nearby" the vehicle or the route.

As mentioned above, to facilitate access to all types of map data in all contexts, the different types of spatially organized parcels (route calculation, cartographic, maneuver, points-of-interest and cartographic cross reference) each contain pointers to the other types of parcels covering the same geographic area. For example, a route calculation parcel contains pointers to cartographic parcels with data pertaining to coextensive or overlapping coverage areas. In addition, a number of indices or cross-references on important keys or attributes allows access to map data by various paths. For instance, a point-of-interest could be located by its "name", by its "facility type" (type of points-of-interest), or by its chain ID (e.g. "McDonald's" restaurants). This location could be further qualified to be within a city, or within a specified distance of the current location. A destination could be specified as a crossroads, a street address, or by point-of-interest name, possibly being qualified as being within a particular city or town or other geographical indicator.

IV. Physical Storage Format

A. Parcelization

1. Overview

As mentioned above, the data 40 on the storage medium 22 are parcelized, that is, most or all of the data are organized into smaller portions with each parcel including a plurality of data records and other information. Parcels also correlate to physical subdivisions for storage of the data on the storage medium. In general, the full collection of data pertaining to an entire geographic region is too large to be loaded into memory at one time. Therefore, the data is organized into smaller groupings or parcels. For some map data, the parcels are spatially-organized, i.e. each parcel represents geographic data encompassed with a geographic rectangular area (including square areas) of the physical region.

The groupings of data into parcels are made for several purposes. First, data are organized into parcels in an attempt to group into one parcel, or as few parcels as possible, most or all of the data that any one of the navigation functions may require in order to perform an operation for the user. If the user wants to display a map of his location, it would be preferable that the data relating to the geographic area immediately around the user are organized so that all the necessary data could be accessed and loaded into memory quickly. The parcels including all the data needed to display a map of the user's area should therefore be grouped together into one or as few parcels as possible. In general, the larger the parcel the better. Each parcel should ideally cover as much geographical area as possible so as many operations pertaining to that geographical area can be handled. Another reason that data are parcelized is so that data are grouped into parcels with each parcel having a size that can be readily used by the navigation system applications. These sizes relate to hardware and memory constraints and may be regular multiples of 2 Kbytes, 4 Kbytes, 8 Kbytes, or 16 Kbytes, for example.

In a preferred embodiment, parcels for all types of data are constructed using the same method. As mentioned above, it is desired that each parcel contain as much data as possible. If, for example, a rectangle encompassing the entire geographic area were merely bisected again and again until all the smaller rectangles formed therefrom contained no greater than a desired maximum amount of data, it is likely that many parcels formed from such rectangles would be considerably less than full. This results from the fact that a geographic area typically does not have a uniform density of geographic data. Having a considerable portion of parcels that are substantially less than full results in waste of space and poorer performance. The parcelization method described below provides for maximizing the number of parcels that are relatively full. The method described below also provides for a parcelization arrangement that expedites identification and retrieval of the parcels.

As mentioned above, each of the navigation functions is provided with a subset of the entire geographic data set suitable for that function. Accordingly, starting with a version of the entire geographic data set, separate subsets of the entire geographic data set are generated for each of the separate functions. For example, from the entire geographic data set including all the geographic data for a particular geographic region, separate subsets of the geographic data are generated including a route calculation data subset, a cartographic data subset, a maneuver data subset, a points-of-interest data subset, a cartographic feature data subset, a navigable feature data subset, a crossroad/junction data subset, a third party data subset, a place data subset, a postal zone data subset, and so on. After, or as part of, the generation of the separate subsets, parcelization of each of the subsets is performed. The subsets can be parcelized in any order, but in a preferred embodiment, for the spatially organized data (i.e., routing, cartographic, maneuver, and so on), the type of data that is expected, in general, to be the densest set, i.e., the subset of data expected to be divided into the largest number of parcels, is parcelized first to generate a global kd-tree. For example, the route calculation data may be the densest, and accordingly, parcelization is performed on that subset of the geographic data first.

2. Determination of Initial Parcel Boundaries

Although it is possible to establish an initial parcel boundary at any location, in a preferred embodiment, initial placement of parcels boundaries is chosen as follows. First, the geographic data are examined to determine their outer geographic boundaries. Referring to FIG. 4A, there is an illustration of a map of a geographic area 100. The geographic data 40 to be stored on the storage medium 22 relates to the geographic area 100. Shown on the map of the geographic area 100 are a plurality of points 101. As mentioned above, parcelization is first performed on one of the subsets of data, for example the route calculation subset. Included in the route calculation subset of data are individual data records that identify nodes. The node records correspond to specific physical locations in the geographic area 100. Each of the nodes has a specific latitude, longitude, and relative elevation ("Z-level"). The route calculation subset of data also includes data records that identify segments. The segment records correspond to physical features that have a length, such as portions of roadways in the geographic area. Each of the segment records refers to and can be identified by nodes located at the end points of the segment. In addition, a segment may include one or more shape points between its end points that are used to identify a bend (or physical position) in the segment. Accordingly, all of the spatially-related geographic data may be identified by nodes that have a unique latitude and longitude in the geographic area 100. The points 101 shown in FIG. 4A correspond to the nodes in the geographic dataset. Each of the points 101 is shown on the map of the geographic area 100 at the location corresponding to the latitude and longitude of the node to which it corresponds in the route calculation subset of the geographic data set. FIG. 4A is used for illustration purposes only, and in a preferred embodiment, the parcelization procedures described herein are performed by a computer program operating on the appropriate subset of the data.

Referring again to FIG. 4A, the outermost nodes are identified. For example, node 102W is the node in the route calculation portion of the geographic data set that has the maximum longitude, node 102E is the node in the route calculation portion of the geographic data set that has the minimum longitude, node 102S is the node in the route calculation portion of the geographic data set that has the minimum latitude, and node 102N is the node in the route calculation portion of the geographic data set that has the maximum latitude. The nodes 102N, 102S, 102E, and 102W define a minimum bounding rectangle ("MBR") 106,as represented by dashed lines in FIG. 4B. A minimum bounding rectangle is the smallest rectangle that contains all of the geographic data.

In a preferred embodiment, dimensions of latitude and longitude are expressed in navigation dimensional units equal to 1/100,000 of a degree. Like degrees, navigation units may be absolute or may refer to a coordinate position on the surface of the earth. Furthermore, in a preferred embodiment, the dimensional units are integers. Thus, the smallest unit of measurement is "1" which represents 1/100,000 of a degree. In alternative embodiments, other-than-degree values can be choosen as units to represent dimensions, and measurement units can be chosen that include fractions.

In a preferred embodiment, the minimum bounding rectangle is adjusted outwardly at its eastern and northern edges by one additional dimensional unit from those defined by nodes 102N and 102E. For example, if 102N had a latitude of 282, then the northern side of the minimum bounding rectangle would be at the latitude 283, i.e. 282+1. Likewise, if 102E had a longitude of 90,000, then the eastern side of the minimum bounding rectangle would be at the longitude 90,001. This is illustrated in FIG. 4C.

The data set defined by such an adjusted minimum bounding rectangle can be regarded to include all data encompassed within the minimum bounding rectangle, as well as any data that intersects the western and southern edges (but not the northern and eastern edges) of the minimum bounding rectangle. The advantage of using such an adjusted minimum bounding rectangle is that each minimum bounding rectangle will consequently encompass a unique data set. In other words, a node will be found in only one minimum bounding rectangle. This holds true even where the longitude of the eastern edge of a given minimum bounding rectangle (equal to longitude+1 of 102E) is the same as the longitude of the western edge of the minimum bounding rectangle to the immediate east.

The former minimum bounding rectangle only includes data encompassed by such shared edge (but not data intersecting the shared edge), whereas the latter minimum bounding rectangle may include data intersecting the shared edge.

In a preferred embodiment, a minimum enclosing dividable-tile (or "di-tile") 107 is determined that encompasses the minimum bounding rectangle 106. A dividable tile ("di-tile") refers to an area of dimensions 2.sup.I .times.2.sup.J that includes all map data between latitudes M.times.2.sup.I navigation units and (M+1).times.2.sup.I navigation units and between longitudes N.times.2.sup.J navigation units and (N+1).times.2.sup.J navigation units, where M and N are integers, and I and J are positive integers).

One way of determining a minimum enclosing di-tile is to define acceptable intervals and to require that the minimum enclosing di-tile have as its sides only acceptable intervals. Acceptable intervals are defined in both directions of latitude and longitude. (Any arbitrary starting location may be chosen, but in a preferred embodiment, acceptable intervals conform to conventional latitude and longitude starting locations, i.e. the equator and Greenwich). Acceptable intervals may be defined to include only powers of 2, for example: 0-1, 2-3, 4-5, 6-7, . . . , 0-3, 4-7, 8-11, 12-15, . . . , 0-7, 8-15, 16-23, 24-31, . . . , 0-15, 16-31, 32-47, 48-63, . . . , and so on (in navigation units).

Referring to FIG. 4D, examples of acceptable intervals are represented. In FIG. 4D, "0" may represent either Greenwich or equator. The units 1, 2, . . . an so on, may represent navigation units equal to 1/100,000th of a degree. Accordingly, starting from Greenwich (longitude 0), acceptable intervals include any of those represented by lines in the figure. For example, if the minimum bounding rectangle has a west coordinate of 5 and an east coordinate of 12, the acceptable interval would be the interval I(0,16). A similar set of acceptable intervals relative to the equator are defined for north-south coordinates.

As mentioned above, in a present preferred embodiment, the sides of the minimum enclosing di-tile for the minimum enclosing rectangle are required to be acceptable intervals. Therefore, in a present embodiment, the east-west coordinates of the initial di-tile are multiples of 2.sup.I units, and the north-south coordinates of the initial di-tile are multiples of 2.sup.J units. (I and J are integers so that the east-west length of the initial di-tile may have a dimension in unit's of 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, or 1024, and so on, and the north-south length of the initial di-tile may have a dimension in units of 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, or 1024 and so on, for example).

One advantage of forming the minimum enclosing di-tile in this manner is that di-tiles for different map coverage areas can be merged together more easily, since the boundaries for different map coverage areas are established using the same approach.

It is appreciated that in map coverage areas including both negative and positive navigation units (i.e. coverage areas in which the minimum bounding rectangle includes either the equator or Greenwich), additional acceptable intervals are required and accordingly, intervals of length 2.sup.18 starting at coordinates which are multiples of 2.sup.17, intervals of length 2.sup.19 starting at coordinates which are multiples of 2.sup.18, etc. are acceptable; some of these intervals overlap 0.degree. longitude (Greenwich) and 0.degree. latitude (the equator).

3. Establishing Initial Parcel Sizes

Once the minimum enclosing di-tile is established, the data can be parcelized. In one alternative, the parcelization process can begin by applying the regular division procedure, described below, to the minimum enclosing di-tile 107. However, in a preferred embodiment, the data in the coverage area are first examined based upon an organization of the data into a regular grid of rectangles formed from the minimum enclosing di-tile. This is equivalent to bisecting the minimum enclosing di-tile and then the rectangular (or squares) formed therefrom a number of times until a regular grid of rectangles results. Each of the rectangles in this grid may be referred to as an "initial tile." The initial tile size is determined to be the largest geographic area allowed to be represented by one parcel at any layer of any of the types of data in the geographic data set. In one embodiment, one fixed initial tile size is defined for all regions throughout the country so that regions can be more easily merged. In one present preferred embodiment, each of the initial tiles is of a fixed, predetermined size of 2.sup.17 navigation units by 2.sup.17 navigation units.

Such initial tiles are shown as the grid 108 in FIG. 4B. The initial tiles may alternatively be defined by simply overlaying the geographic area 100 with a regular grid having the same pattern as grid 108. In either event, the grid 108 is made up of initial rectangular tiles (e.g. tiles 110.sub.a,b, tile 110.sub.a+1,b, . . . tile 110.sub.m,n).

The placement of the boundaries of the grid 108 is determined in order to enclose the minimum bounding rectangle 106, (that is, the minimum longitude (corresponding to node 102E), the minimum latitude (corresponding to node 102S), the maximum latitude (corresponding to node 102N), and the maximum longitude (corresponding to node 102W)). The grid boundaries are defined so that when the grid is overlaid on the region 100 as shown in FIG. 4B, all the spatial data are encompassed and the initial tiles have a size as described above. In a preferred embodiment, the placement of the grid boundaries also conforms to the acceptable intervals, described above.

EXAMPLE

An example for implementing the above procedure for determining the grid boundaries is as follows:

1. Start by setting MinLat, MaxLat, MinLong and MaxLong to the minimum and maximum latitudes and longitudes of the minimum bounding rectangle 106. These values are in units of 1/100,000 of a degree, so that they are in the range -18000000 to 18000000 for the longitudes and -9000000 to 9000000 for the latitudes.

2. For K=1 to 25 in increments of 1:

(a) Divide MaxLat by 2.sup.K.

(b) Multiply MaxLat by 2.sup.K. Since MaxLat is an integer, these two operations have the effect of truncating the K low-order binary digits of MaxLat. Actually, this is done by right-shifting and left-shifting MaxLat, which is more efficient than division and multiplication.

(c) If the minimum bounding rectangle's maximum latitude is greater than MaxLat, add 2.sup.K to MaxLat.

(d) Divide MinLat by 2.sup.K.

(e) Multiply Minlat by 2.sup.K.

(f) If the minimum bounding rectangle's minimum latitude is less than MinLat, subtract 2.sup.K from MinLat.

(g) If MinLat+2.sup.K is equal to MaxLat, stop. Otherwise repeat at step (a) for the next higher value of K.

3. Perform the operations of Step 2 on MinLong and MaxLong.

At this point MinLat, MaxLat, MinLong and MaxLong define the boundaries of grid 108, i.e., the di-tile, enclosing the minimum bounding rectangle 106.

4. Establishing Parcels

As mentioned above, a purpose of parcelizing the data is to include in each parcel an amount of data that is close as possible to, but not in excess of, a predetermined maximum parcel amount. For example, the predetermined maximum amount may be 16 Kilobytes.

Each one of the initial tiles in the grid 108 of FIG. 4B is examined as a "trial parcel" to see if the amount of data in it fits into a single parcel. If the data within the "trial parcel", including any parcel overhead (such as index information and headers), would (accounting for data compression, if used) be less than or equal to the maximum parcel amount, then a parcel is constructed with that initial tile and no division of that initial tile for that particular data type is performed. On the other hand, any "trial parcel" that includes an amount of data that exceeds the predetermined maximum parcel amount is divided using one of the two following procedures as a function of the amount by which the data in the "trial parcel" exceeds the desired maximum amount. (In a preferred embodiment, an estimation technique, described below, is used in determining trial parcels. The estimation technique takes into account parcel overhead and compression without actually performing all the steps necessary to form a parcel.)

Regular dividing procedure. If the amount of data in a "trial parcel" exceeds the maximum parcel amount by a predetermined multiple, the "trial parcel" is divided into two rectangles. In a preferred embodiment, the division of the "trial parcel" into two rectangles is carried out by first determining the minimum enclosing di-tile for the trial parcel (in the manner described above with the initial tile), bisecting the enclosing di-tile, and then dividing the "trial parcel" where the line of bisection of the di-tile intersects the trial parcel. Alternatively the "trial parcel" may itself simply be bisected. (It is noted that a bisection of the enclosing di-tile will not always bisect the trial parcel, but instead may divide the trial parcel at an off-center location. For case of reference herein, such division of the trial parcel will nonetheless be referred to as "bisection"). In either event, the line of bisection of the di-tile will be in either the longitudinal or latitudinal direction. In a present preferred embodiment, the di-tile is bisected in whichever of the longitudinal or latitudinal divisions minimizes the maximum aspect ratio (.gtoreq.1) of the two resulting rectangles of the trial parcel. Each of these resulting rectangles is then examined as a "trial parcel", as described above, and bisected if the data contained in it exceeds the predetermined multiple of the maximum parcel amount. Each of these sub-rectangles is also then examined as a "trial parcel", as described above, and the process continues until the amount of data in a rectangle or sub-rectangle is less than the predetermined multiple of the maximum parcel amount. In a present embodiment, the maximum parcel amount is predetermined to be 16 Kilobytes of data, and therefore the predetermined multiple is 3.2. (In alternative embodiments, the maximum amount may be predetermined to be another amount greater than or less than 16K, such as 8K or 32K, or even amounts greater or less than these). Thus, "trial parcels" are bisected according to the current procedure where their data content exceeds 51.2 kilobytes. The predetermined multiple is chosen based upon a desired minimum fill percentage for each parcel. In the present embodiment, the desired minimum fill percentage is 80%. When the amount of data in any trial parcel is less than the predetermined amount, further subdivisions of the trial parcel follow the "Custom division procedure", described below.

Custom division procedure. If the amount of data in any "trial parcel" exceeds the maximum parcel amount, but is less than the predetermined multiple of the maximum parcel amount (e.g., 16 Kb<.times.51.2 Kb), the following custom division procedure is used: Further divisions of the "trial parcel" are not necessarily bisections, but rather are made in a manner that tends to minimize the number of parcels created. This has the effect of minimizing both the space needed to store the parcels and wasted space within the parcels.

For example, given a trial parcel that contains data equal to 3.6 times the maximum parcel size or amount, it should be possible to fit this data into four parcels. However, bisection of the trial parcel may divide it into two rectangles of 1.2 times the maximum parcel size and 2.4 times the maximum parcel size, respectively, which would then end up in a minimum of five parcels if bisection were used to divide each of the rectangles. Therefore, subdivisions of the rectangle at this stage are made with the goal of minimizing the number of parcels created, but with the restriction that the division line not be arbitrary. More particularly, where a "trial parcel" has a data content that is greater than the maximum parcel size, but not in excess of the predetermined multiple thereof, the "trial parcel" is divided at a division of 2.sup.-x along one of its dimensions. In a present preferred embodiment, X={1, 2, 3, 4, 5}. Thus, the trial parcel is divided at 1/2 or 1/4 or 1/3 or 1/16 or 1/32 divisions of its width. For example, a trial parcel may be divided into two rectangles with widths equal to 5/8 and 3/8, respectively, of the width of the trial parcel. This custom division may be applied directly to the dimensions of the trial parcel, or, in a present preferred embodiment, it may be applied to the dimensions of, a minimum enclosing di-tile of the trial parcel. In the latter case, the trial parcel is divided where the division line of the di-tile intersects the trial parcel. In either event, the division line will be in either the longitudinal or latitudinal direction.

Candidate division lines are examined as follows: First, divisions are made at each of the specified 2.sup.-x divisions along both the longitudinal and latitudinal widths of the trial parcel. For each such division, the aspect ratios (defined as the ratio of the larger dimension to the smaller dimension) of each of the two resulting rectangles are determined, and the greater of the two is identified. The greatest aspect ratios identified for each of the candidate division lines are then compared, and the candidate division lines are ordered from smallest to greatest of such aspect ratios. The rectangles resulting from the candidate division lines are then examined, beginning with the first candidate line in the ordered list. The candidate division line chosen for dividing the trial parcel is the first one in the list encountered where the data content in one of its two resulting rectangles is greater than the maximum parcel size but less than or equal to a multiple (such as two times) of a minimum fill percentage times the maximum parcel amount and the maximum parcel amount. For example, the division line is chosen to include in one of the resultant rectangles an amount between 1.6 and 2.0 times the maximum parcel amount. This should enable making one more division of the rectangle to form two further sub-rectangles each with a fill percentage greater than 80% (0.8) of the maximum parcel amount. Each of these resultant sub-rectangles with a fill percentage between 80% and 100% of the maximum parcel amount is formed into parcels. If no candidate division line meets this criterion, then the first candidate division line (i.e. the one with the smallest maximum aspect ratio) is used to divide the given rectangle.

EXAMPLE

The following describes how, during parcelization, a determination is made of when to stop bisecting trial parcels, and to start the custom procedure of evaluating candidate divisions of 1/32, 1/16, 3/32, 1/8, 5/32, 3/16, 7/32, 1/4, 9/32, 5/16, 11/32, 3/8, 13/32, 7/16, 15/32, 17/32, 9/16, 19/32, 5/8, 21/32, 11/16, 23/32, 3/4, 25/32, 13/16, 27/32, 7/8, 29/32, 15/16, and 31/32 along both the longitudinal and latitudinal widths of the trial parcel.

A target parcel fill percentage F is chosen. For the sake of example, let F be 0.8 (80 percent). As mentioned above, a maximum parcel size P is also determined. P is expressed in bytes and is the maximum amount of data that can be put into a parcel. Optimally, therefore, it is desired to create parcels in the range of F.times.P bytes and P bytes in size.

If a trial parcel having a data size D is in the range P<D<1.6.times.P, then it is not possible for parcels to be created therefrom whose data sizes fall in the target range. If the trial parcel is divided such that one resulting rectangle has a data size greater than or equal to 0.8.times.P, then the other one has a data size less than 0.8.times.P.

This process can be extended to give the following list of non-acceptable data sizes:

Unacceptable Data Sizes D

0<D<0.8.times.P

P<D<1.6.times.P

2.times.P<D<2.4.times.P

3.times.P<D<3.2.times.P

The list stops at this point, because the next entry would be

4.times.P<D<4.times.P.

From the above list a complementary list of acceptable data sizes can be generated:

Acceptable Data Sizes D

0.8.times.P.ltoreq.D.ltoreq.P

1.6.times.P.ltoreq.D.ltoreq.2.times.P

2.4.times.P.ltoreq.D.ltoreq.3.times.P

3.2.times.P.ltoreq.D

The above list corresponds to a fill percentage F equal to 0.8. Other fill percentages generate different lists of acceptable or unacceptable data sizes. In general, the list of unacceptable data sizes is in the following form:

Unacceptable Data Sizes D

0<D<F.times.P

P<D<2.times.F.times.P . . .

n.times.P<D(n+1).times.F.times.P . . .

continuing until an empty range is reached.

The above is used as follows in the custom procedure for forming the parcels: The data sizes of each of the two rectangles resulting from a trial parcel division should fall within one of the acceptable ranges (whenever possible). In practice, this means that as long as the data size in a rectangle is somewhat larger than the high end of the highest unacceptable range (3.2.times.P, in the example), the rectangle can be divided according to the above-described bisection procedure. Once the high end of the highest unacceptable range is approached, custom divisions (i.e., divisions of 1/32, 1/16, 3/32, 1/8, etc. in this example) are considered.

Candidate division lines are examined and compared in the manner described above, and the one selected for division is where the following criterion is met: The amount of data falling into each of the two subrectangles of the trial parcel is of a size such that it is theoretically capable of, in turn, being subdivided into rectangles that, on the average, achieve a specified minimum parcel data fill percentage. Some cases may occur in which the above criterion cannot be met.

The data are divided at the division line chosen in the custom division procedure, and, for each of the two subrectangles created, the custom division process is repeated as necessary. As mentioned above, the bisection and custom division procedure can be applied either directly to the trial parcel or to the minimum enclosing di-tile of the trial parcel, although the latter is preferred. It is noted that in some cases the minimum enclosing di-tiles are exactly equal to trial parcel boundaries. This most notably may occur with respect to the initial tiles 108. The utility of defining the divisions in terms of the minimum enclosing tiles is that a tile can be repeatedly divided in half evenly, whereas a trial parcel rectangle of arbitrary dimensions cannot. Another advantage is that this procedure facilitates processing at boundaries between different databases. The custom division of a trial parcel at a 2.sup.-X division of the tile's side is equivalent to a sequence of from 1 to X bisections. Consequently, the division lines, and therefore the resulting subrectangles, can be represented in a minimal number of bits (5 bits for 1/32 divisions, as opposed to up to eight or sixteen bytes to define an arbitrary rectangle).

(In examining amounts of data included in rectangles, a convention is established that any data entity that is located exactly on a dividing line is included with the data in the rectangle "to the right" ("east") of the line, if the dividing line is a north-south line, and with the data "above" ("north of") the line, if the dividing line is an east-west line.)

The above procedure is performed on all the initial tiles 110 (and, where necessary, all resulting rectangles) in the grid 108.

5. Subsequent Parcelizations

As mentioned above, each subset or type of geographic data is maintained separately from subsets of other types. For example, route calculation data are located separately from cartographic data. Further, within a single subset of data, the data may be further separated into layers.

After the parcelization procedure described above is performed on one layer of one type of geographic data (i.e., the layer and type estimated to be the densest, such as layer 0 of the route calculation data), parcelization of the other layers of the same data type and other data types are performed. These other parcelizations (i.e. parcelizations of other layers of the same data type and of other data types) begin with exactly the same initial tiling grid 108 as was used in the initial parcelization. At each recursive level in each subsequent parcelization of different data types, when subdivision of a rectangle is necessary it is required to be performed at exactly the same division line (whether accomplished through bisection or custom division) that was used during the parcelization of the initial data type (or any previous parcelization).

Since higher layers are less dense, it may not be necessary to divide the data into as many parcels in order that each parcel be within the maximum parcel size. Accordingly, at higher levels of a particular data type, there may be fewer parcels and some of the parcels may represent a larger geographic area than at the lower layers. However, as mentioned above, the divisions of the data of the higher levels are made along the same dividing lines that were made in the lower levels and divisions at the higher levels proceed as far as needed until an amount of data not greater than the maximum parcel size is reached.

During a subsequent parcelization, it may be necessary for the division process to proceed further than in a prior parcelization. This occurs if the data of the subsequent data type located in any of the initial tiles or rectangles formed therefrom is more dense than the data type used during a prior parcelization. In this case, the original parcelization may not have divided the initial tile into small enough sub-rectangles so that each contains less than the maximum parcel amount of the subsequent data type. In this case, a division line is determined as it would have been in the initial parcelization procedure. This requirement means that, for parcels A and B of different layers or types, it is always the case either that the geographic area covered by parcel A is contained in that covered by parcel B, or the geographic area covered by parcel B is contained in that covered by parcel A.

In other words, after the type of data with the greatest expected data density (e.g., layer 0 of the route calculation data) has been parcelized, other layers of the same data type (e.g. layer 1 of the route calculation data) or other data types (e.g., layer 0 of the cartographic data) are parcelized in parallel to the initial parcelization, as follows:

1. The subdivision of data into initial rectangles begins with exactly the same set of tiles.

2. Whenever data in a particular rectangle exceed the maximum parcel size, and therefore has to be subdivided, one of two procedures is used (1) If that rectangle was subdivided in the initial parcelization, or in any previous parcelization, then exactly the same division line is used as was used in the previous parcelization. (2) If that rectangle was not previously divided, then its division line is determined in the manner described above for the initial parcelization.

Advantages

The parcelization method and organization described above has several advantages, including the minimizing the creation of parcels that are less than optimally-filled, and thereby maximizing storage efficiency on the storage medium. This allows more data to be kept in memory buffers at one time, given buffers of one or a few fixed sizes. In addition, the parcelization method and organization make it easy to navigate between neighboring parcels of the same type and layer, without having to include the bounding rectangles of these neighbor parcels in each parcel, or having to read a separate spatial index. Because of the regularity in the method of defining parcel boundaries, a minimally-sized spatial index of nearby parcels can be carried within each parcel. Further, because different layers and different parcel types are parcelized in parallel, this method makes it easy to navigate between parcels of different types or at different layers that cover the same or overlapping geographic areas. Furthermore, the use of di-tiles for dividing panels makes it easy to navigate between bordering but separately stored, geographic database regions.

6. Ordering of Parcels

Once all the parcels for all the data types are established, the parcels are placed in order to facilitate the eventual writing of the parcels to disk. To minimize seek time in reading data from a storage device, such as a CD-ROM disc, each spatial set of parcels is organized in depth-first order with respect to the global kd-tree index (i.e., so that for each division line, all parcels contained in the sub-rectangle with coordinates less than the coordinate of division precede all parcels contained in the sub-rectangle with coordinates greater than the coordinate of division) so that geographically neighboring parcels are likelier to be close together on a disc. The linear or spatial indices likeliest to be used to access a particular type of data are placed next to and preceding that data. Since the "global kd-tree" indexes all spatial parcels types, it may be redundantly replicated at various locations on the medium (i.e., near to and preceding each data set of spatial parcels).

Parcels are stored on disk approximately in Peano key order. The actual ordering of the parcels is based on the two-dimensional kd-tree (representing the entire database region) that is generated during parcelization, as the initial di-tile is recursively split into sub-rectangles and finally into parcels. The ordering of the parcels is generated by a depth-first traversal of this kd-tree. Due to the method (described above) by which the region's initial enclosing di-tile is chosen, and by which each division line is chosen during parcelization, this ordering would be identical to Peano key ordering were each rectangle's division line exactly to bisect that rectangle, but differs slightly from Peano key ordering at the lowest level of the kd-tree whenever division lines are not bisections. This ordering, like an exact Peano key ordering, makes geographically neighboring parcels likelier to be stored near each other on disk. An advantage of this ordering over exact Peano key ordering is that it is generated automatically by the parcelization process, so that an additional sort is not required. An index consisting of a two-dimensional kd-tree (explained in more detail below) can be used to access route calculation parcels spatially (the two dimensions are the latitudes and longitudes of the division lines used during parcel generation ). This type of indexing is useful for initial location of an arbitrary position, such as when a route guidance application initially locates the map data corresponding to a vehicle's current position. Once a starting position is known, however, all information about nearby map data can be found in indices internal to a parcel.

7. Index Tree Parcelization and the Physical Storage Format

Various index trees are used in the physical storage format. These include the 2d "neighboring parcel kd-tree" that is stored in each route calculation, cartographic and points-of-interest parcel. Route calculation and cartographic data parcels also include a 2d "internal kd-tree."

Also used are various global index trees (per database region). In general, each of these trees may not fit into a single index parcel. An index parcel may contain a complete or partial tree (a global tree or subtree), or it may store more than one complete tree (subtrees of a global tree). Theoretical considerations imply that index parcels should have a size equal to the (average) size of the record parcels they index in order to minimize search times.

Parcelization of the global tree proceeds as follows. Nodes are stored in breadth first order (to balance search times). If a parcel has been filled so that another node cannot be stored in it, then a new parcel is created and an unstored node is selected as a root for the new parcel and it is recursively stored in breadth first order. An index parcel may contain more than one root only if the complete subtrees at these roots fit into the parcel; that is, if a complete subtree has been parcelized in an index parcel then another highest level unstored node may be stored as another root in that index parcel only if its complete subtree fits into the parcel; this prevents unnecessary cross-parcel index searching.

The roots are numbered (by tree identifier) in sequence starting with 0 within each index parcel. Thus, an inter-parcel reference to a node is by parcel ID (minus region specifier) plus tree ID. Intra-parcel reference to a node is by byte offset (as with the "neighboring parcel kd-tree" and the "internal kd-tree"). Index trees are stored decompressed in the index parcels.

The index parcel has the following structure. The first data in the parcel is the IdxPclHdr.sub.-- t header (byte packed). Following the header is the offset array to the roots in the parcel, indexed by tree ID:

    ______________________________________
    Ushort 16   root.sub.-- offset[ ];
                              /*root offset array*/
    ______________________________________


Following the offset array are the trees.

8. Global Spatial Parcelization Tree

For each region there is a 2D kd-global spatial parcelization index tree. The structure of this tree corresponds to the parcelization that is a refinement of all parcelizations for all spatially parcelized data types (which is defined since parcelization proceeds in parallel with previous parcelizations for other data types). This tree is used for at least two different kinds of searches. First, the tree is used to determine in which parcel would contain a point given the point's latitude and longitude coordinates. Another type of search is used to determine the one or more parcels that would intersect with a rectangle given the rectangle's boundary coordinates. Both these types of searches may be used in various functions, including queries, map display, route calculation, and so on. A node in this kd-tree refers via its right (resp. left) descriptor identifier pair list to a data parcel if the rectangle corresponding to the node's right (resp. left) child is the bounding rectangle for that data parcel; it may refer to a data parcel of some type (e.g. layer 1 route calculation) and also have descendant nodes which refer to data parcels of another type (e.g. layer 0 cartographic).

Each "neighboring parcel kd-tree" represents a subtree of the global tree (except for the nodes representing external-to-region parcels); note that a node in the global tree may have a child stored in a different index parcel while the corresponding node in a "neighboring kd-tree" and the corresponding child are stored in the same parcel. The data structure for the global tree is similar to that of the "neighboring parcel kd-tree", with the following differences. First, value 01 for control bits 8-9 or control bits 10-11 mean that the pruned child appears as a root in a different index parcel; in this case the child offset is 4 bytes, and the first 2 bytes of the offset (between the index parcels storing the root and the child: all index parcels in the global tree have identical size and redundancy information, which allows the child parcel's ID to be constructed at runtime from this parcel ID offset), and the second 2 bytes of the offset give the tree ID. Furthermore, both left and right child offsets are required to have the same size (a 4 byte offset for a non-pruned child uses the rightmost, i.e. least significant bytes). Second, the descriptor list does not store table indices for parcel ID's but instead stores parcel ID's directly since there is no further reference to the parcel ID in the index parcel; however, these parcel ID's are not necessarily aligned in storage. Finally, the entire descriptor, identifier pair list is rounded up to a 2-byte-multiple total length because child offsets are in units of 2-bytes.

9. Route Calculation Parcel Internal Structure

The information in data records can be carried in a compressed form. After decompression, the data are still in a packed form in which there are no padding bytes and generally no unused fields. Each decompressed data record is therefore in the form of a variable-length character array. This decompressed but packed data is the source from which logical records passed to the navigation application program software by the interface layer 41 (of FIG. 2) are constructed.

For most records (notably excluding nodes and segments above the bottom layer), a table of offsets is used to locate the beginning of each block of 2.sup.K records. This reduces the number of variable-length records that have to be examined (compared with a sequential search for the desired record).

Data in the parcel header is used by the navigation application program (specifically, the interface layer 41) to navigate within the parcel. These data are therefore in a form that is immediately useable. For example, data in the parcel headers are stored in fields defined as two- or four-byte signed or unsigned integers, where applicable, instead of in character array fields like the data records.

Within a parcel at the bottom layer, node records are stored in Peano key order based on latitude and longitude, and segment records are stored in Peano key order based on the latitude and longitude of the segment's "Left" node. This Peano ordering is followed down to the level of a grid of predetermined size, and within the grid a latitude-longitude ordering is followed. Entity ID's within a bottom layer parcel are then assigned consecutively within a parcel. At higher layers, segment and node records are simply stored in entity identifier order within the parcel.

Condition records are used to store information about restrictions or additional attributes associated with a segment, e.g. "no truck access." Date-time records include information associated with a condition that indicates when conditions are in effect, e.g. "no left turn 4:00 PM-6:00 PM." Condition and Date-time records are stored in Peano key order based on the Peano key of the primary segment's left node. Storage of records in Peano key order reduces the time spent searching for records within a parcel, since entities in spatial proximity a re often represented by records near each other within a parcel.

10. Cartographic, Points-of-Interest, and Maneuver Parcels Internal Structure

Except as noted below, the cartographic, points-of-interest, and maneuver parcels may have the same or a similar internal structure as the routing parcels. The cartographic and points-of-interest data may be interleaved within each layer. This is based on the expectation that cartographic data and associated points-of-interest data are frequently accessed together. Interleaving becomes more useful at higher CD-ROM rotational speeds because the rotational delay before the desired data are reached becomes proportionally less at higher speeds in comparison with the time required to move the read head to reach the data in the non-interleaved case. This is particularly true as the volume of data increases, since the distance the head would have to move (and the time it would take) to reach the non-interleaved data would increase as some function of the data volume.

The route calculation and maneuver data need not be interleaved, even though the data is parcelized in the same way and are used together for some functions. The reason for this is to make it possible to increase the speed of the route calculation process. Interleaving the maneuver data and the route calculation data would to some extent defeat this purpose, since it could slow down access to route calculation data.

B. Redundant Data--Neighboring Parcel Information

Within each parcel is stored information about certain other parcels of interest, either those that geographically neighbor the parcel (parcels of the same type and at the same layer), or those that overlap the same geographic area (parcels of different types or at different layers). Within a particular type of parcel, some but not all parcels of other types are of interest. For example, within a route calculation parcel, overlapping maneuver parcels are referenced, but overlapping points-of-interest parcels are not. All parcels of interest are stored together in a single kd-tree structure. This structure contains implicit bounding rectangle information for each parcel--implicit because the parcelization scheme above creates parcel boundaries that can be encoded with a few bits (5 bits per division). Each entry (that is, each internal node or leaf node in the kd-tree) also contains the parcel identifiers of any parcels that correspond to the sub-rectangle defined by the division line at the entry.

The example shown in FIG. 5A includes just one type of parcel (route calculation parcels, for example), and just four layers, so that the drawing can be simplified. The example of FIG. 5A shows the parent, neighbor, and child parcel information for parcel A1.

FIGS. 5B and 5C show the structure of an entry in the kd-tree describing the geographical area surrounding a parcel. In general, each division in a rectangle is described with a kd-tree entry of two or more bytes: two bytes of control information, and 0-4 bytes to represent offsets to the left and right children (8 or 16 bits per offset). The cut can be at any 1/32 division of a rectangle's minimum enclosing tile, and therefore 5 bits of the control information is dedicated to the location of the line that cuts the parent rectangle into left and right subrectangles. Whenever a kd-tree (internal or leaf) node corresponds to a parcel, a parcel descriptor (one byte) is present in the kd-tree, containing the parcel's type and layer, and other information. A parcel descriptor is followed by either a one-byte or three-byte index into the parcel identifier table; it is expected that in most cases one byte is sufficient, but when there are more than 254 parcel ID's in the table, the index is expanded. The parcel identifier table contains four-byte entries; the one byte region specifier is not included. When the region specifier is different from that of the parcel containing the parcel identifier table, then the most significant bit of the four-byte parcel identifier in the table is set to 1 to indicate that the full parcel identifier is to be looked up in a separate table of external parcel identifiers.

EXAMPLE

Referring to FIGS. 5D and 5E, there is illustrated an example of the structure of a kd-tree and parcel identifier table. For simplicity, this example includes only a single parcel type and single layer. If multiple parcel types and layers were included in this kd-tree, then any (internal or leaf) node in the kd-tree could correspond to multiple parcel identifiers. This would mean that a given kd-tree entry could be followed by more than a single left parcel descriptor and right parcel descriptor. In this example, both the kd-tree and the parcel ID table are small enough so that all offsets and indices can be contained in one byte.

C. Compression

Referring again to FIGS. 1 and 2, in a present embodiment, parcels are maintained in their compressed form in memory 20 after they are read from disk 22, and entities within a parcel are decompressed as they are requested by the navigation application functions 28, 30, 32, and 34. In a present embodiment, the parcels are maintained in their compressed form in the interface layer 41 mentioned in connection with FIG. 2. This provides for reduced memory usage. If fully decompressed parcels were kept in a memory cache, then more cache memory would be required, especially since decompressed parcel records vary in length. For example the route calculation function 28 often accesses only a small fraction of records within a parcel each time the parcel is read from disk. Using this approach therefore reduces processing time for decompression. Since compressed records normally are variable in length, even when the decompressed records are of fixed length, the following approach can be used to locate a given record within the compressed data.

Alignment Considerations

Compressed data records, as well as the intermediate (packed) form in which data records exist immediately after decompression, are in the form of character arrays (that is, binary byte arrays, not ASCII character arrays). Therefore, no alignment considerations apply. Data in the header portion of each parcel, used to describe the parcel and navigate within it, contains data that is often most conveniently stored in the form of C-language long integers, short integers, structures or unions. Length and alignment for these data types can vary on different platforms. Since the geographic data 40 is designed to be used on a variety of platforms, it is appropriate to define conventions for alignment and length. These conventions apply to data on the medium containing the map data (e.g. the CD-ROM disc); logical record formats default to alignment and length rules for the hardware platform on which the interface layer software 41 executes. Note that the intermediate (packed) form referred to above should be the same in all physical storage formats, since it is accessible to generic independent components of navigation applications.

The following conventions are used for data types in the map data parcel headers:

    ______________________________________
    Data Type     Alignment     Length
    ______________________________________
    short integer 2-byte boundary
                                2 bytes
    long integer  4-byte boundary
                                4 bytes
    structure/union
                  4-byte boundary
                                4 byte multiple
    ______________________________________


Bytes in map data are in Big Endian form (most significant bit first), and both short integers and long integers in map data are in Big Endian form (most significant byte first).

D. Intra-parcel Record Access via Entity Identifier

Within a parcel, when each record of a particular type has a unique entity identifier, one of the two methods described below of locating such a record can be used. The first method is used to locate records when entity identifiers are assigned to records of a given type consecutively within the parcel. For example,the first record within the parcel has entity identifier 0, the second 1, etc., with no gaps. The second method is used to locate records in which gaps do exist, but in which the records are still stored in order by entity identifier. The second method is used to access segment records and node records in route calculation parcels above the bottom layer, while the first method is used to access most other record types in all parcels.

Method 1

This method of record access may be used to locate records for which entity identifiers are assigned consecutively within the parcel.

(1). The parcel header contains the following information about an entity type within the parcel:

Offset: Offset to the data from the beginning of the parcel

Count: Record count

Numblks: Number of blocks of compressed records.

Blkcnt: Number of compressed records per block (carried in exponent form, where k implies 2.sup.k records).

For those entity types whose identifiers are consecutive within the parcel but do not begin at zero, the first entity identifier in the parcel is carried. This applies to bottom layer node and segment records in the route calculation parcels.

(2). Offset points to a table of Numblks+1 table entries (each of 16 bits), where table entry N points to a block of Blkcnt records, the first of which is the record with entity ID equal to (N-1) * Blkcnt.

(3). The first byte of each record block is the beginning of a Type 1 Variable Length Unsigned Value that is an offset to the first data record within that block. Following this field are 2.sup.K more Type 1 Variable Length Unsigned Value fields, each of which is the length of a compressed record in the block. The use of these length fields allows rapid navigation through the records within a block, as in the following example:

EXAMPLE

Blocks are composed of 32 records each, and the length of each record in the block is small enough to be represented by one byte. Then, the first byte of the block contains a one byte field that contains the value 33, which is the offset from the beginning of the block to the beginning of the first compressed data record in the block. The second byte in the block contains the length of the first compressed record in the block. The third byte contains the length of the second compressed record in the block, etc. The thirty-third byte in the block contains the length of the thirty-second compressed record in the block. To find the beginning of the seventh record in the block, the following are added together: the address of the beginning of the block, the offset from the beginning of the block to the first compressed record in the block, and the lengths of the first six records. In this example, the first seven bytes of the block are added to the address of the block.

For example, if there are 1500 records of a given entity type in a parcel, and Blkcnt is 32, then a table of 47 16-bit offsets is generated (i.e. Numblks=46), the first of which points to a block of compressed records with ID's 0-31, the second of which points to a block containing records with ID's 32-63, etc. Then, to find the record with ID 100, 100 is divided by 32 (or right-shift 100 by 5). This results in 3, which is the (0-based) index of the block containing record 100. For different entity types, the best value of Numblks might be influenced by record size, record count, percent of space saved by compression, or other factors, and does not need to be the same in every parcel.

Method 2

This method of record access is used to locate records that are stored in order by their unique entity identifiers, but for which entity identifiers generally do not start at 0 and are not assigned consecutively (without gaps) within the parcel.

(1). The parcel header contains the following information about an entity type within the parcel:

Offset: Offset to the data from the beginning of the parcel

Count: Record count.

(2). Offset points to a table of Count+1 table entries (each of 16 bits plus 24 bits). Twenty-four bits of each entry is a record's entity identifier, and the remaining 16 bits of the entry is the offset from the beginning of the parcel to the compressed record corresponding to that table entry's entity identifier. The last table entry does not point to a record; it points to the first byte following the last record.

(3). The length of a compressed record is equal to the difference between the offset in its table entry and the offset in the next consecutive table entry whose offset's high order bit is not equal to 1. A table entry with an offset whose high order bit is equal to 1 is a special entry that does not point to a record.

The decompressed records are variable-length and in packed form. The size of this packed record can be minimized as follows: (1) there can be no padding bytes in this format, since it will be in the form of a byte array rather than a C structure; (2) some fields that take up a limited number of bits can be combined together into one byte; (3) finally, character string data such as street names can be compressed at the data field level using a conventional text-oriented compression method.

The logical format of a record (the format returned to application software programs 28, 30, 32, and 34 from the interface layer 41) is constructed from the above decompressed, packed record.

E. Supernodes

In a present embodiment, the following procedure is used with the route calculation subset 48 (of FIG. 3) of the geographic data.

The route calculation subset of geographic data includes feature entities for nodes, conditions and segments. The node features are in the form of node entities. Some node entities are used to store positional information (i.e., latitude and longitude) about the end points of a segment. (There may also be node entities that relate to positional information other than the end points of segments.) The positional information related to each node is stored in terms of a longitude, latitude, and a relative elevation. This node entity may also contain attributes which provide additional information about the node.

As mentioned above, the parcelization method used for the route calculation subset of the data provides for maximizing the amount of relevant route calculation information that can be kept in memory at once thereby minimizing time-consuming memory operations during route calculations. According to this aspect of the present embodiment, the route calculation function 28 (in FIG. 2) may be sped up by including single collapsed nodes (referred to as "supernodes") in the route calculation data set. Each of these collapsed nodes or supernodes represents a plurality of closely-spaced or related regular nodes and segments. Wherever a supernode is used to represent a plurality of regular nodes, the segments associated with any of the plurality of regular nodes repr