Version management

Multiple layer information object repository

6983288

Abstract

Techniques for relating data stored in one or more storage systems for an enterprise include managing information chunks in one or more storage systems. Each chunk comprises a unit of data for storage and retrieval operations. The techniques also include managing a vocabulary database. The vocabulary database includes data structures describing atomic concepts among names in an enterprise-specific vocabulary and data structures describing relationships among the atomic concepts. Content in a document is arranged based at least in part on data in the vocabulary database. The content is based at least in part on an information object or "chunk" in the storage system. Thus, content originally unrelated and authored over time by many different persons and organizations can be related using the business vocabulary concepts and relationships in the vocabulary database.


Claims

What is claimed is:

1. A method of relating data stored in one or more content management systems for an enterprise, the method comprising the steps of:

managing a first information object data structure corresponding to a first information object, wherein the first information object is data stored in a file;

managing a first concept data structure corresponding to a first concept;

managing a second concept data structure corresponding to a second concept;

managing a first relationship data structure, wherein the first relationship data structure comprises a first reference to the first concept data structure and a second reference to the second concept data structure;

managing a second relationship data structure, wherein the second relationship data structure comprises a third reference to the first concept data structure and a fourth reference to the first information object data structure;

receiving a request for a document referring to the second concept; and

generating the document referring to the second concept based on a set of information, wherein the set of information comprises the first relationship data structure, the second relationship data structure, the first information object data structure, and the first information object.

2. The method of claim 1, wherein the second concept is different than the first concept.

3. The method of claim 1, wherein the first relationship data structure further comprises a reference to a third concept data structure.

4. The method of claim 1, wherein the second relationship data structure corresponds to a child-of relationship type; wherein the first concept and second concept are each one of a set of atomic concepts; a hierarchy is a subset of the set of atomic concepts related by a series of one or more relationships of the child-of relationship type to a root concept of the set of atomic concepts; and the first concept belongs to a particular hierarchy having a particular root concept.

5. The method of claim 4, wherein the particular root concept is one of an information type root concept, a document type root concept, a product type root concept, a technology type root concept, a solution type root concept, and a user type root concept.

6. The method of claim 5, wherein the second concept belongs to a second hierarchy having a second root concept, wherein the second root concept is different from the particular root concept.

7. The method of claim 1, wherein the first information object is one of a block of text, an application, a query for a database, a vector graphic, an image, audio data, and video data.

8. The method of claim 1, wherein the fourth reference comprises one of a file name, a network resource address, a universal resource locator (URL) address, a record identification in a predetermined database, a record identification in a predetermined content management system.

9. The method of claim 1, wherein the first information object is one of a plurality of information objects; the plurality of information objects reside in a plurality of content management systems; and said step of managing the plurality of information objects further comprises employing a data integration tool to retrieve the first information object from a content management system that resides on a remote platform accessible through a network.

10. The method of claim 1, further comprising the step of storing the first information object into a content cache based at least in part on data in the second relationship data structure.

11. The method of claim 10, wherein the first concept data structure and the second concept data structure are stored in a vocabulary database; and wherein the method further comprises the step of generating and storing a subset of the vocabulary database into a concept cache based at least in part on data in the second relationship data structure.

12. The method of claim 11, wherein the set of information comprises the concept cache and the content cache.

13. The method of claim 12, wherein the document is organized based at least in part on the second relationship data structure.

14. The method of claim 12, wherein the document displays information based on at least one information object in the content cache.

15. The method of claim 11, further comprising the step of editing at least one of the concept cache and the content cache.

16. The method of claim 10, further comprising the step of generating and storing a plurality of information objects, said step of generating and storing the plurality of information objects further comprising combining at least two information objects of the plurality of information objects into a single information object in the content cache.

17. The method of claim 16, wherein a second information object corresponds to a second information object data structure, and a third relationship data structure comprises references to the second information object data structure and the second concept data structure.

18. The method of claim 11, said step of generating and storing the subset of the vocabulary database into the concept cache further comprising de-normalizing the concept cache to improve speed of retrieval by allowing a concept data structure for a concept that participates in more than one relationship to be stored more than once in the concept cache.

19. The method of claim 1, further comprising employing a first set of software tools including at least one tool for defining a type instance for the first information object data structure and a tool for defining a type and instance for the first relationship data structure.

20. The method of claim 1, further comprising the steps of generating and storing a first subset of a plurality of information objects into a first content cache for managing content of a Web page.

21. The method of claim 20, further comprising the step of managing the first subset by employing a second set of software tools including at least one of: a tool for editing the first information object data structure, a tool for editing the first relationship data structure, a tool for populating the first content cache, a tool for populating a first concept cache, a tool for retrieving from the first content cache, a tool for combining two or more information objects into a new information object, and a tool for de-normalizing the first concept cache to improve speed of retrieval by allowing a concept data structure for a concept that participates in more than one relationship to be stored more than once in the first concept cache.

22. The method of claim 20, wherein the first subset of the plurality of information objects excludes information objects that have become obsolete.

23. The method of claim 20, wherein the first subset of the plurality of information objects excludes information objects that have not been released.

24. The method of claim 20, further comprising the steps of generating and storing a second subset of the first content cache into a second content cache for staging content for the Web page.

25. The method of claim 24, further comprising the step of replicating the second content cache to one or more Web servers for providing content to a Web page generating process on each of the one or more Web servers.

26. The method of claim 24, further comprising the step of managing the second subset by employing a third set of software tools including at least one of: a tool for editing the first information object data structure, a tool for populating the second content cache, a tool for populating a second concept cache, a tool for ensuring each information object in the second content cache has an information object data structure and a relationship to another concept in the second concept cache, and a tool for forming a search index for the second content cache.

27. The method of claim 1, further comprising the step of:

arranging content in the document based at least in part on the second relationship data structure.

28. A computer-readable medium carrying one or more sequences of instructions for relating data stored in one or more content management systems for an enterprise, which instructions, when executed by one or more processors, cause the one or more processors to carry out the steps of:

managing a first information object data structure corresponding to a first information object;

managing a first concept data structure corresponding to a first concept;

managing a second concept data structure corresponding to a second concept;

managing a first relationship data structure, wherein the first relationship data structure comprises a first reference to the first concept data structure and a second reference to the second concept data structure;

managing a second relationship data structure, wherein the second relationship data structure comprises a third reference to the first concept data structure and a fourth reference to the first information object data structure;

receiving a request for a document referring to the second concept; and

generating the document referring to the second concept based on a set of information, wherein the set of information comprises the first relationship data structure, the second relationship data structure, the first information object data structure, and the first information object.

29. The computer-readable medium of claim 28, further comprising one or more sequences of instructions for relating data stored in one or more content management systems for the enterprise, which instructions, when executed by one or more processors, cause the one or more processors to carry out the steps of:

arranging content in the document based at least in part on the second relationship data structure.

30. A system for relating data stored in one or more content management systems for an enterprise, comprising:

means for managing a first information object data structure corresponding to a first information object;

means for managing a first concept data structure corresponding to a first concept;

means for managing a second concept data structure corresponding to a second concept;

means for managing a first relationship data structure, wherein the first relationship data structure comprises a first reference to the first concept data structure and a second reference to the second concept data structure;

means for managing a second relationship data structure, wherein the second relationship data structure comprises a third reference to the first concept data structure and a fourth reference to the first information object data structure;

means for receiving a request for a document referring to the second concept; and

means for generating the document referring to the second concept based on a set of information, wherein the set of information comprises the first relationship data structure, the second relationship data structure, the first information object data structure, and the first information object.

31. The system of claim 30, further comprising:

means for arranging content in the document based at least in part on the second relationship data structure.

32. A computer system for relating data stored in one or more content management systems for an enterprise, the computer system comprising:

one or more computer-readable mediums carrying one or more executable sequences of instructions for managing a first information object data structure corresponding to a first information object;

for managing a first concept data structure corresponding to a first concept;

for managing a second concept data structure corresponding to a second concept;

for managing a first relationship data structure, wherein the first relationship data structure comprises a first reference to the first concept data structure and a second reference to the second concept data structure;

for managing a second relationship data structure, wherein the second relationship data structure comprises a third reference to the first concept data structure and a fourth reference to the first information object data structure; and

one or more processors configured for receiving a request for a document referring to the second concept; and

for generating the document referring to the second concept based on a set of information, wherein the set of information comprises the first relationship data structure, the second relationship data structure, the first information object data structure, and the first information object.

33. The computer system of claim 32, further comprising:

one or more processors configured for arranging content in the document based at least in part on the second relationship data structure.


Description

FIELD OF INVENTION

The present invention generally relates to data processing in the field of electronic document creation and deployment. The invention relates more specifically to relating stored information objects using business vocabulary concepts and relationships in multiple layers increasingly tailored for producing and publishing dynamic documents, such as Web pages for a Web site.

BACKGROUND OF THE INVENTION

Through economic growth, mergers and acquisitions, business enterprises are becoming ever larger. Further, large business enterprises in the field of high technology now offer ever larger numbers of products and services that derive from an increasingly large variety of technologies.

In this environment, managing the creation, use, protection and maintenance of the company's intellectual assets, such as products and technologies, is an acute problem. As an enterprise grows, maintaining consistent usage of names of products and services throughout the enterprise becomes even more challenging. When an enterprise derives its business opportunities from research and development into new technologies or improvements of existing technologies, maintaining consistent usage of technology designations is a challenge, especially when there is disagreement or confusion about the uses, advantages or benefits of a particular technology. Such confusion can arise even without disagreement, for example when there is no adequate or timely communication between different teams within an enterprise.

The World Wide Web is one communication medium that exacerbates the problem, by showing internal information to the enterprise's partners and customers. Large enterprises that own or operate complex Web sites or other network resources that contain product and technology information face a related problem. Specifically, ensuring consistent usage of product names and technology terms across a large, complicated Web site is problematic. A particular problem involves maintaining consistent use of terms when different parts or elements of the Web site applications are created or content is authored by different individuals or groups.

Based on the foregoing, there is a clear need for improved ways to manage one or more vocabularies that cover all company business practices and all business terminology ("concepts"), including but not limited to product names and technology terms.

In particular, there is a need for a way to logically structure and correlate stored information about those concepts so that it can be located and navigated easily regardless of who authored the information and where the information physically resides.

There is also a need for a system that can rapidly and efficiently select vocabulary concepts and related information from among a large volume of stored information that is inter-related by overlapping hierarchies, and deliver the selected information to another system for use in assembling electronic documents based on the selected information.

There is also a need for a way to deliver and replicate information distributed over one or more networks that is relevant to a user query based on the vocabulary information to individuals who are distributed among many groups of a large enterprise, or who are outside the enterprise.

There is also a need for a logically oriented and related content management and deployment system that is extensible or adaptable when new business practices, products or technologies are developed by diverse, distributed groups in a large business enterprise.

SUMMARY OF THE INVENTION

The foregoing needs, and other needs and objects that will become apparent from the following description, are achieved in the present invention, which comprises, in one aspect, a method of relating data stored in one or more storage systems for an enterprise. The storage systems may comprise a database, content management system, file system, directory, etc. The method includes managing a plurality of information objects ("chunks") in one or more storage systems. Each chunk of the plurality of information chunks comprises a unit of data for storage, composition, correlation and retrieval operations. A series of vocabularies, relationships, and attributes are also managed. The vocabulary system includes data structures describing atomic concepts among names in an enterprise-specific vocabulary, and a plurality of data structures describing relationships and/or attributes among the atomic concepts. The data structures describing and relating atomic concepts include a first information object vocabulary having data indicating a first reference to a first chunk in the storage system. The data structures describing relationships include a first relationship between the first information object and a second concept of the atomic concepts.
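The data structures described in this aspect can be pictured with a minimal Python sketch. It is an illustration only, not the implementation of the invention: the class names (Concept, InformationObject, Relationship, VocabularyDatabase), the method names, and the "describes" relationship type are all assumptions introduced here. The sketch treats an information object as a concept that also carries a reference to a stored chunk, and finds the chunk references related to a given concept.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    """An atomic concept in the enterprise vocabulary (hypothetical structure)."""
    name: str


@dataclass(frozen=True)
class InformationObject(Concept):
    """A concept that also holds a reference to a chunk in a storage system."""
    reference: str  # e.g. a file name, URL, or record identifier


@dataclass(frozen=True)
class Relationship:
    """A typed relationship among two or more named concepts."""
    rel_type: str
    participants: tuple


class VocabularyDatabase:
    """In-memory stand-in for the vocabulary database (an assumption)."""

    def __init__(self):
        self.concepts = {}        # name -> Concept or InformationObject
        self.relationships = []   # list of Relationship

    def add(self, concept):
        self.concepts[concept.name] = concept
        return concept

    def relate(self, rel_type, *names):
        missing = [n for n in names if n not in self.concepts]
        if missing:
            raise KeyError(f"unknown concepts: {missing}")
        rel = Relationship(rel_type, tuple(names))
        self.relationships.append(rel)
        return rel

    def references_for(self, concept_name):
        """Return chunk references of information objects related to a concept."""
        refs = []
        for rel in self.relationships:
            if concept_name not in rel.participants:
                continue
            for name in rel.participants:
                obj = self.concepts[name]
                if isinstance(obj, InformationObject) and obj.reference not in refs:
                    refs.append(obj.reference)
        return refs


db = VocabularyDatabase()
db.add(Concept("BetaPerseus 2.0"))
db.add(InformationObject("BetaPerseus data sheet", "docs/betaperseus-2.0.txt"))
db.relate("describes", "BetaPerseus data sheet", "BetaPerseus 2.0")
print(db.references_for("BetaPerseus 2.0"))
```

Because the chunk is reached only through its reference, the chunk itself may stay in whatever storage system it was authored in; only the relationship lives in the vocabulary database.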

In an embodiment of this aspect, the first chunk is one of a block of text, an application, a query for a database, a vector graphic, an image, audio data, and video data.

In another embodiment of this aspect, the first reference comprises one of a file name, a network resource address, a universal resource locator (URL) address, a record identification in a predetermined database, a record identification in a predetermined content management system, an object in a cache, a Web service, or an application.

In another embodiment of this aspect, the method includes generating and storing a first subset of the plurality of information chunks into a first content cache for managing content of an electronic document.

In another aspect, the invention comprises a method for arranging content for an electronic document. The method includes managing information chunks in a content cache. Each chunk is retrieved by a directory address. The method also includes managing data structures describing atomic concepts among names in an enterprise-specific vocabulary and data structures describing relationships among the atomic concepts in a concept cache.

Content on the electronic document is arranged based at least in part on data in the concept cache.

In another aspect, a method for relating data stored in one or more storage systems for an enterprise includes managing information chunks in one or more storage systems. Each chunk comprises a unit of data for storage, execution, and retrieval operations. The method also includes managing a vocabulary database. The vocabulary database includes data structures describing atomic concepts among names in an enterprise-specific vocabulary and data structures describing relationships among the atomic concepts. The content in a document is based at least in part on an information chunk in the content management system. The content is arranged in the document based at least in part on data in the vocabulary database; in one embodiment, all content is based on the vocabularies.

In one feature, the vocabulary database includes data structures describing atomic concepts among names in an enterprise-specific vocabulary and data structures describing relationships among the atomic concepts. A first subset of the information chunks is generated and stored into a first content cache for managing content of an electronic document. A second subset of the first content cache is generated and stored into a second content cache for staging content for the electronic document that can be replicated to one or more Web servers. The data structures describing atomic concepts include a first information object indicating a first reference to a first chunk in the storage system. The data structures describing relationships include a first relationship between the first information object and a second concept of the atomic concepts.
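The layering of content caches in this feature can be sketched as follows. This is a hedged illustration rather than the described system: the Chunk fields, the function names, and in particular the approved_for_staging flag used to pick the second subset are assumptions. The exclusion of obsolete and unreleased information objects from the first subset follows claims 22 and 23.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    """A stored information chunk with hypothetical lifecycle flags."""
    reference: str
    content: str
    released: bool = True
    obsolete: bool = False
    approved_for_staging: bool = False  # assumed selection criterion


def build_management_cache(chunks):
    """First subset: drop chunks that are obsolete or not yet released."""
    return {c.reference: c for c in chunks if c.released and not c.obsolete}


def build_staging_cache(management_cache):
    """Second subset of the first cache, selected for staging (assumed rule)."""
    return {ref: c for ref, c in management_cache.items()
            if c.approved_for_staging}


def replicate(staging_cache, web_servers):
    """Give each Web server its own copy of the staging cache."""
    return {server: dict(staging_cache) for server in web_servers}


chunks = [
    Chunk("a.txt", "overview", approved_for_staging=True),
    Chunk("b.txt", "draft", released=False),
    Chunk("c.txt", "old spec", obsolete=True),
    Chunk("d.txt", "internal note"),
]
management = build_management_cache(chunks)
staging = build_staging_cache(management)
replicas = replicate(staging, ["web1", "web2"])
print(sorted(management), sorted(staging), sorted(replicas))
```

Each layer is a strict subset of the one before it, so the Web servers only ever hold content that has passed both filters.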

In other aspects, the invention encompasses computer readable media, and systems configured to carry out the foregoing steps.

In one aspect, this invention allows content originally unrelated and authored over time by many different persons and organizations to be related using the business vocabulary concepts and relationships in the vocabulary database.

In another aspect, this invention allows streamlined subsets of the information chunks in the content management systems and the concepts and relationships in the concept database of a vocabulary development server to be formed and managed, including being viewed and edited, to expedite the dynamic production of documents such as Web pages for a Web site.

According to other features, each information object has one or more n-ary relationships with atomic concepts in the concept vocabulary, giving maximum flexibility in using and composing larger objects. In addition, these relationships make possible other useful functions, such as distributed caching and storage, among other capabilities.

BRIEF DESCRIPTION OF THE DRAWINGS

The present invention is illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings and in which like reference numerals refer to similar elements and in which:

FIG. 1 is a block diagram that illustrates a hypothetical product type hierarchy according to one embodiment;

FIG. 2A is a block diagram that illustrates a networking solutions hierarchy including one or more concepts from the product type hierarchy of FIG. 1 according to one embodiment;

FIG. 2B is a block diagram that illustrates a non-binary relationship among concepts according to one embodiment;

FIG. 3 is a block diagram illustrating simultaneous multiple inter-related hierarchies involving a product type concept according to one embodiment;

FIG. 4A is a block diagram illustrating a vocabulary development server and external applications according to one embodiment;

FIG. 4B is a block diagram illustrating a creation layer of an information object repository and a resulting Web site according to one embodiment;

FIG. 4C is a diagram of a binary tree representation that can be modeled using one or more data structures stored in computer memory;

FIG. 4D is a diagram of a class hierarchy of an example object-oriented model;

FIG. 4E is a diagram of a data representation schema;

FIG. 4F is a block diagram of an example architecture of the VDS;

FIG. 4G is a diagram illustrating relationships among an access control list and nodes of a tree of the type shown in FIG. 4C;

FIG. 4H is a block diagram of a class hierarchy that may be used to implement an event mechanism, in one embodiment;

FIG. 5 is a block diagram that illustrates relationships involving a particular information object and other concepts in the vocabulary database;

FIG. 6A is a flow chart illustrating a method for managing an information object repository by generating and storing an information object according to one embodiment;

FIG. 6B is a flow chart illustrating a method for managing an information object repository by retrieving an information object according to one embodiment;

FIG. 6C is a flow chart illustrating a method for managing an information object repository by retrieving information content associated with an information object according to one embodiment;

FIG. 7 is a block diagram illustrating a management layer, a staging layer, and a Web server layer of an information object repository according to one embodiment;

FIG. 8A is a flow chart illustrating a method for generating a static Web page based on the Web server layer of the information object repository according to one embodiment;

FIG. 8B is a flow chart illustrating a method for generating a concept home Web page based on the Web server layer of the information object repository according to one embodiment;

FIG. 8C is a flow chart illustrating a method for generating a concept information type Web page based on the Web server layer of the information object repository according to one embodiment;

FIG. 8D is a flow chart illustrating a method for generating a concept document Web page based on the Web server layer of the information object repository according to one embodiment;

FIG. 8E is a flow chart illustrating a method for generating a concept search result Web page based on the Web server layer of the information object repository according to one embodiment;

FIG. 8F is a flow chart illustrating a method for generating an information chunk Web page based on the Web server layer of the information object repository according to one embodiment;

FIG. 9A is a flow chart illustrating a method for generating and managing a management layer of the information object repository according to one embodiment;

FIG. 9B is a flow chart illustrating a method for generating and managing a staging layer of the information object repository according to one embodiment;

FIG. 9C is a flow chart illustrating a method for preparing a Web site in a staging layer of the information object repository according to one embodiment; and

FIG. 10 is a block diagram that illustrates a computer system upon which an embodiment may be implemented.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

A method and apparatus for storing business vocabulary data using multiple inter-related hierarchies are described. In the following description, for the purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the present invention. It will be apparent, however, to one skilled in the art that the present invention may be practiced without these specific details. In other instances, well-known structures and devices are shown in block diagram form in order to avoid unnecessarily obscuring the present invention.

1.0 Business Vocabulary Data Processing

Business vocabulary terms are used to name products, product lines, technologies, people, processes, development efforts and other business activities of an enterprise. Some of the vocabulary terms are used only internally and some are used for interaction with the public to establish brand name recognition or to support precise communication of customer interests and orders. Terms related in meaning or form are used to associate related business products and activities in the minds of the users of those terms. For example, a device sold by an enterprise might be named Perseus, after a hero of Greek mythology, and a software program for executing on that device might be named Pegasus, after the winged horse Perseus rode. Similarly, different models of the Perseus device might be called AlphaPerseus and BetaPerseus, to show they are part of the same product line, while different versions of each model may be numbered, such as BetaPerseus 2.0 and BetaPerseus 2.4.

The present invention is based in part on a recognition that the business terms of an enterprise constitute an important type of business data that should be included in the automated data processing that the enterprise performs. This vocabulary data about the products, services and activities of a business is a form of metadata for the products, services and activities of the enterprise. Those terms can be used to categorize the products, services and activities and to retrieve other data about those products, services and activities. The data structures employed to store, retrieve and process this metadata should account for the associations in meaning and form and support rapid associative or inferential search and retrieval.

2.0 Vocabulary Development Framework

According to the present invention, the various terms that constitute the business vocabulary of an enterprise are modeled as nodes in a hierarchy called the MetaData Framework (MDF) or the Vocabulary Development Framework (VDF). In this framework, any business term that is derived from another particular business term is positioned in the hierarchy at a node that branches from the node of that particular business term from which it is derived. When the hierarchy is embodied in stored data with appropriate data structures and software programs, it is extremely useful in naming products and associating products with product lines.

For example, FIG. 1 shows a hypothetical product type hierarchy for a hypothetical enterprise that manufactures and sells network devices. In this hierarchy, node 102 is a root node representing network device products sold by the enterprise. Node 102 has three child nodes, 112, 114, 116 that are connected by arrows 105. The parent/child relationship is denoted by an arrow pointing from parent to child in FIG. 1. A relationship statement can be obtained reading from arrow head to arrow tail by the words "is a child of" or read in the opposite direction by the words "is a parent of." Thus node 112 is a child of node 102. Node 102 is a parent of node 112. In the product type hierarchy of FIG. 1, arrow 105 represents the product type parent/child relationship.

Node 112 represents the devices named "Perseus." In this embodiment, the name of node 112 includes "Perseus." Nodes 114, 116 represent devices named "Hercules" and "Jason," respectively. FIG. 1 shows that the Perseus device comes in three models, "AlphaPerseus," "BetaPerseus" and "GammaPerseus," represented by the three nodes 122, 124, 126, respectively. The BetaPerseus model has evolved over time through versions 1.0, 2.0 and 3.0, represented by nodes 132, 142, 154, respectively. The names of these nodes are "BetaPerseus 1.0," "BetaPerseus 2.0," and "BetaPerseus 3.0," respectively. BetaPerseus 2.0 also experienced some evolutions called "BetaPerseus 2.4" and "SuperPerseus," which are represented by nodes 152, 162, respectively.

This hierarchy consists of binary relationships; that is, each relationship requires one parent and one child. The product type relationships of FIG. 1 are constrained by a rule that each child may have only one parent. There is no rule restricting the number of children a parent may have in this hierarchy.
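As a rough illustration of the single-parent rule, the following Python sketch stores the product type hierarchy as a child-to-parent map and rejects a second parent for any child. The class and function names are hypothetical, the root name "Network devices" is invented for the example, and the exact parentage of the version nodes (versions 1.0, 2.0 and 3.0 under the BetaPerseus model, 2.4 under 2.0, SuperPerseus under 2.4) is an assumption read from the text, since FIG. 1 itself is not reproduced here.

```python
class ProductTypeHierarchy:
    """Parent/child hierarchy in which each child has exactly one parent."""

    def __init__(self, root):
        self.root = root
        self._parent = {root: None}  # child name -> parent name

    def add_child(self, parent, child):
        if parent not in self._parent:
            raise KeyError(f"unknown parent: {parent}")
        if child in self._parent:
            # Enforces the rule that a child may have only one parent.
            raise ValueError(f"{child} is already in the hierarchy")
        self._parent[child] = parent

    def children(self, name):
        # A parent may have any number of children.
        return [c for c, p in self._parent.items() if p == name]

    def ancestors(self, name):
        """Walk 'is a child of' links from a node up to the root."""
        chain = []
        node = self._parent[name]
        while node is not None:
            chain.append(node)
            node = self._parent[node]
        return chain


def build_fig1_hierarchy():
    h = ProductTypeHierarchy("Network devices")  # assumed root name
    for device in ("Perseus", "Hercules", "Jason"):
        h.add_child("Network devices", device)
    for model in ("AlphaPerseus", "BetaPerseus", "GammaPerseus"):
        h.add_child("Perseus", model)
    for version in ("BetaPerseus 1.0", "BetaPerseus 2.0", "BetaPerseus 3.0"):
        h.add_child("BetaPerseus", version)
    h.add_child("BetaPerseus 2.0", "BetaPerseus 2.4")
    h.add_child("BetaPerseus 2.4", "SuperPerseus")
    return h


hierarchy = build_fig1_hierarchy()
print(hierarchy.ancestors("SuperPerseus"))
```

Storing only the parent of each node makes the one-parent constraint structural: a second parent cannot be recorded without first removing the existing entry.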

Various applications use the information in the VDF implementation to perform different functions for the enterprise. In one application, the VDF relationships in the illustrated hierarchy are used to determine that the product named "SuperPerseus" is actually a version of the BetaPerseus model that is based on version 2.4. In another application, the VDF names are used to help provide names for products as new products are developed by automatically including the product type and model name and by preventing the re-use of an existing version number. Embodiments of this application enforce a rule that each name shall be unique. The enterprise uses the VDF with other embodiments of such an application to enforce other naming rules, such as requiring that the model name be part of the device name. In this case the ambiguous name "SuperPerseus" is not allowed, and is discarded in favor of the automatic name, "BetaPerseus 2.5", or some allowed variation of that, which is stored as the name of node 162.
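The naming rules in this example can be sketched in a few lines of Python. The function names and the version-increment policy (take the base version and advance the minor number until an unused name is found) are assumptions made for illustration; the text states only that names must be unique, that the model name must be part of the name, and that existing version numbers are not re-used.

```python
def check_name(name, model, existing_names):
    """Return a list of naming-rule violations for a proposed name."""
    problems = []
    if name in existing_names:
        problems.append("name already in use")
    if model not in name:
        problems.append("model name missing")
    return problems


def next_version_name(model, base_version, existing_names):
    """Propose '<model> <major>.<minor>' using the first unused minor number
    after the base version (an assumed policy, not one stated in the text)."""
    major, minor = (int(part) for part in base_version.split("."))
    minor += 1
    while f"{model} {major}.{minor}" in existing_names:
        minor += 1
    return f"{model} {major}.{minor}"


existing = {"BetaPerseus 1.0", "BetaPerseus 2.0",
            "BetaPerseus 2.4", "BetaPerseus 3.0"}
print(check_name("SuperPerseus", "BetaPerseus", existing))
print(next_version_name("BetaPerseus", "2.4", existing))
```

Under these assumptions "SuperPerseus" is rejected because it does not contain the model name, and the automatic proposal based on version 2.4 is "BetaPerseus 2.5", matching the example in the text.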

The vocabulary data framework (VDF) captures simultaneous multiple relationships among names, products, solutions, services, documentation and activities for an enterprise. In particular, the VDF allows other relationships to be established between nodes simultaneously with the product type relationship. Furthermore, the VDF allows any of these new relationships to involve more than the two nodes of the binary parent-child relationship already described. For example, it allows a trinary relationship among a father node, a mother node, and a child node. In general, the VDF allows N-ary relationships among nodes, where N is any integer equal to or greater than one and specifies the number of participants in the relationship.

In the more general realm of the VDF, the enterprise is considered a data domain that includes many atomic concepts that may be related. Atomic concepts include any data item involved in the enterprise that is not subdivided into separately referenced storage units. These atomic concepts include the business vocabulary for the enterprise data that is the subject of the present invention. Concepts include product type names, as in the above example, but also comprise names of projects and departments and references to paragraphs, chapters, documents, images, multimedia files, database records, database queries, network resources, citations, and network addresses, among other things. The concepts and relationships are captured in conceptual graphs which are organized primarily by a partial-order relationship, commonly known as a type hierarchy. The concepts are nodes in the graph and the relationships are connections between two or more nodes. Both concepts and relationships have enumerated characteristics in some embodiments.

The graph of FIG. 1 is an example of a conceptual graph ordered by its product type hierarchy of binary (parent-child) relationships. Whereas this is one example based on a product type hierarchy, the VDF allows for simultaneous and inter-related multiple type hierarchies, as is explained in more detail in the following sections.

2.1 Multiple Hierarchies

As seen above in FIG. 1, concepts are related in a graph depicting product types. All the concepts in this graph are associated with one category of information in the enterprise data. That category is device product types, and that hierarchy relates concepts for products that are related in development history, structure or function. However, enterprise data may include other categories or relationships. In general, multiple categories encompass the enterprise data. For example, some of the enterprise data for an enterprise that manufactures and sells network devices are related to equipment solutions for common networking problems encountered by customers of the enterprise. Products of the enterprise that are unrelated by the hierarchy of FIG. 1 nevertheless may be useful to solve the same kind of customer problem. Thus, such products relate to the same solution. To reflect these relationships, enterprise data also are placed in a category called networking solutions in one embodiment, and are organized in a solutions hierarchy that exists concurrently with the product type hierarchy.

FIG. 2A depicts an example hierarchy of concepts in a networking solutions category. In this example, three solutions expressed by the concepts "single server local net," "wide area net (2 sites)" and "private wide area net (3 to 8 sites)" are stored in the data structures representing nodes 212, 214, 216, respectively. All three nodes are children of the root node 202 having name "networking solutions" for this category of concepts. In the solutions type hierarchy of FIG. 2A, arrow 205 represents a networking solutions parent/child relationship. All the relationships represented by arrows in FIG. 2A are of this type. This relationship type differs from the product type parent/child relationship represented by arrow 105 of FIG. 1. Both relationship types are parent/child binary relationships, but they relate concepts in different categories.

As shown in the example of FIG. 2A, the product GammaPerseus, at node 232, is part of the equipment solution for single server local networks of node 212. Both AlphaPerseus, at node 234 and Jason at node 235 are part of the equipment solution for wide area networks connecting two sites, at node 214. BetaPerseus 2.0, at node 236, and Hercules, at node 237, are part of the equipment solution for private wide area networks connecting three to eight sites represented by node 216. Nodes 242 and 244 represent software products Pegasus 3.3 and a graphical user interface (GUI) upgrade that are installed on the BetaPerseus 2.0 device in addition to the default software that comes with that device.

The concepts at nodes 202, 212, 214, 216 may be placed in a category called networking solutions. The concepts 232, 234, 235, 236, 237 have already been placed in a category called enterprise device products; but they may also be placed in the category networking solutions. The concepts at nodes 242, 244 may be placed in a category called software products and also in the networking solutions category. FIG. 2A demonstrates that hierarchies of concepts in categories of enterprise data may be defined in addition to the hierarchy of concepts in the product type category, and demonstrates that categories may overlap.

Alternatively, non-overlapping categories are used in other embodiments. In such an embodiment, the relationship represented by arrow 205 is expressed as a relationship of a subcomponent to a component of a networking solution, in which the sub-component may be a different category than the component. Rules can be expressed for the relationship. One possible rule is: software can be a sub-component of hardware, but not the other way around. Similarly, a product can be a sub-component of a networking solution category but not the other way around.

2.2 Non-binary Relationships

FIG. 2B depicts a conceptual graph of an example non-binary relationship. This ternary relationship (also called a 3-ary relationship or three participant relationship) is useful for capturing the expertise of a person in the use of a product in a technology area. In this example, this relationship is used to state whether the expertise of a technician in the use of a product device within a technology area is of a quality that can assume values of "unknown," "poor," "average," "good," or "excellent."

The characteristics of the relationship type describe the number of participants and their category or categories. In this example the relationship type includes characteristics that indicate there are three participants, one from the user category, one from the technology category and one from the product device category. In addition, the characteristics of this relationship include at least one relationship value for storing the quality of expertise (unknown, poor, average, good, excellent). More details on defining and storing concepts and relationships are given in a later section.

The conceptual graph of this relationship in FIG. 2B shows three nodes 282, 284, 286 representing the three concepts, e.g., product BetaPerseus 2.0, technology private wide area network, and technician Jane, respectively. The three nodes are connected by a three-way, nondirectional link 290. The link 290 includes an attribute named "quality" that takes on a value such as "good," indicating that Jane's expertise is good for using BetaPerseus 2.0 in private, wide area networks.
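The ternary expertise relationship and its "quality" attribute can be sketched as below. This is only an illustrative sketch. The class names, the `relate` helper, the role names, and the `CATEGORY_OF` lookup are assumptions introduced for the example and are not part of the described embodiments.

```python
from dataclasses import dataclass, field

# Allowed values for the "quality" attribute, as listed in the text.
QUALITY_VALUES = ("unknown", "poor", "average", "good", "excellent")


@dataclass(frozen=True)
class RelationshipType:
    name: str
    roles: tuple  # (role, category) pairs, one per participant


@dataclass
class Relationship:
    rtype: RelationshipType
    participants: dict  # role -> concept name
    attributes: dict = field(default_factory=dict)


# Three participants: one user, one product device, one technology.
EXPERTISE = RelationshipType(
    name="expertise",
    roles=(("person", "User"), ("product", "Product"), ("technology", "Technology")),
)

# Hypothetical category lookup for the three concepts of FIG. 2B.
CATEGORY_OF = {
    "Jane": "User",
    "BetaPerseus 2.0": "Product",
    "private wide area network": "Technology",
}


def relate(rtype, participants, category_of, **attributes):
    """Build a relationship after checking roles, categories and quality."""
    expected = dict(rtype.roles)
    if set(participants) != set(expected):
        raise ValueError(f"{rtype.name} needs exactly the roles {sorted(expected)}")
    for role, concept in participants.items():
        if category_of.get(concept) != expected[role]:
            raise ValueError(f"{concept!r} cannot fill role {role!r}")
    if "quality" in attributes and attributes["quality"] not in QUALITY_VALUES:
        raise ValueError(f"invalid quality: {attributes['quality']!r}")
    return Relationship(rtype, dict(participants), dict(attributes))


jane_expertise = relate(
    EXPERTISE,
    {"person": "Jane", "product": "BetaPerseus 2.0",
     "technology": "private wide area network"},
    CATEGORY_OF,
    quality="good",
)
```

Storing the participants by role rather than by position reflects that link 290 is nondirectional: no participant is a "parent" or "child" of another.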

2.3 Documentation Category

Another category of concepts that is extremely useful to an enterprise, for both internal and external users, is documentation concepts, which encapsulate elements of electronic or tangible documents. Concepts within a documentation category include headings, sections, paragraphs, drawings, images, information type, and document type, among others. Information type concepts express the type of content in terms of what it says; for example, information type concepts include but are not limited to "Introduction," "Features & Benefits," "Product Photo," "External Article Section," etc. Documentation concepts may be organized in a document type hierarchy that facilitates automatically generating accurate, complete, up-to-date visual or printed documentation pertaining to a particular product or service. Document type hierarchies include, for example, "Data Sheet," "Product Home Page," "Press Release," "Operator's Manual," and "External Article." For example, a device, like the hypothetical BetaPerseus 2.0, can be linked by a relationship to a document type hierarchy describing the device, such as a "Perseus 2.0 Operator's Manual." As another example, a device, like the BetaPerseus 2.0, can be linked by a relationship to a section concept in a document type hierarchy describing the networking solutions of which the device is a component, such as a "Small Business Networking Press Release." More examples of document categories of concepts are given in a later section.

2.4 Multiple Inter-related Hierarchies

As seen in the above examples, a single concept, such as the device product BetaPerseus 2.0 may appear in several separate hierarchies. According to one embodiment, information defining the concept is stored only once in the VDF and relationships are defined to all other nodes to which the concept is adjacent in all the hierarchies.

Hierarchies may be implemented using a variety of programming techniques and data storage. One advantage of this approach is that changes to the concept can be made in only one location in the VDF and all hierarchies immediately become up-to-date and reflect the changes. Further, all information generated based upon the hierarchies, such as documentation or screen displays, automatically reflects the changes.

Another advantage is that applications that retrieve the data can navigate one of the hierarchies to a particular concept and then immediately find the other hierarchies in which that concept occupies a node. Thus, a customer who has purchased a particular device product for one networking solution can determine other solutions that use that same device. The customer follows the current solution to the product and then reviews the relationships with other networking solutions of interest to the customer that utilize the device. When a networking solution of interest is found using the device, the newly found solution can be navigated above and below the node representing the device concept in order to determine what software and other devices, if any, are components and sub-components of the new solution. Further, the customer can search by solution and identify multiple products that can satisfy the solution. The customer can then inspect each of the products, obtain its documentation, and determine which product is best suited to the customer's particular needs. In some embodiments, such information is synchronized with the customer's online profile so that it is available for later reference and can be personalized.

FIG. 3 is an example of a conceptual graph for multiple inter-related hierarchies that are associated with the device product BetaPerseus 2.0, based on the individual hierarchies and relationships of FIG. 1, FIG. 2A and FIG. 2B. The branch of the device product type hierarchy of FIG. 1 that includes the BetaPerseus 2.0 device concept appears as nodes 302, 304, 306, 308, 390, 310 and 312 linked by the device product type, binary parent/child relationships 301. The branch of the device networking solutions hierarchy of FIG. 2A that includes the BetaPerseus 2.0 device appears as nodes 322, 324, 390, 332 and 334 linked by the networking solutions type, binary parent/child relationships 321. The 3-participant expertise relationship 391 links the node 390 for the BetaPerseus 2.0 to the concept "Jane" at node 346 and the concept "private wide area networks" at node 356. Also shown is that the concept "Jane" at node 346 is a child of the concept "technicians" at node 344 which is a child of the concept "users" at node 342. These nodes are linked by user type, binary parent/child relationships represented by arrows 341. Also shown is that the concept "private wide area networks" at node 356 is a child of the concept "wide area networks" at node 354 which is a child of the concept "technologies" at node 352. These nodes are linked by technology type, binary parent/child relationships represented by arrows 351.

The BetaPerseus 2.0 concept at node 390 is linked to the following nodes in multiple inter-related hierarchies. The BetaPerseus 2.0 concept at node 390 is a product type child of the BetaPerseus 1.0 concept at node 308, as represented by arrow 301d. The BetaPerseus 2.0 concept at node 390 is a product type parent of the BetaPerseus 2.4 concept at node 310, as represented by arrow 301e, and the BetaPerseus 3.0 concept at node 312, as represented by arrow 301f. The BetaPerseus 2.0 concept at node 390 is further a solutions type sub-component of the private wide area net (3 to 8 sites) concept at node 324, as represented by arrow 321b. The BetaPerseus 2.0 concept at node 390 has solutions type sub-components of the Pegasus 3.3 software tools concept at node 332, as represented by arrow 321c, and the management software GUI upgrade concept at node 334, as represented by arrow 321d. The BetaPerseus 2.0 concept at node 390 has two companion expertise type participants as represented by link 391; one at Jane represented by node 346 and one at private wide area networks represented by node 356. In all, the example concept at node 390 has 6 binary relationships and one ternary relationship with eight nodes in four hierarchies (product type, equipment solutions, users and technologies). Each of the concepts and relationships may be represented using stored data in a database or appropriate programmatic data structures.
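The relationships enumerated above for node 390 can be held in a simple programmatic data structure, as in the following sketch. The tuple layout and the helper names `relationships_of` and `adjacent_concepts` are assumptions chosen for illustration. The concept is named once and every hierarchy refers to it by that name.

```python
# Each entry is (relationship type, participants). For the binary
# parent/child types the participants are ordered (child, parent).
RELATIONSHIPS = [
    ("product_child_of", ("BetaPerseus 2.0", "BetaPerseus 1.0")),
    ("product_child_of", ("BetaPerseus 2.4", "BetaPerseus 2.0")),
    ("product_child_of", ("BetaPerseus 3.0", "BetaPerseus 2.0")),
    ("solution_child_of", ("BetaPerseus 2.0", "private wide area net (3 to 8 sites)")),
    ("solution_child_of", ("Pegasus 3.3", "BetaPerseus 2.0")),
    ("solution_child_of", ("GUI upgrade", "BetaPerseus 2.0")),
    ("expertise", ("BetaPerseus 2.0", "Jane", "private wide area networks")),
]


def relationships_of(concept, relationships):
    """Return every relationship in which `concept` participates."""
    return [rel for rel in relationships if concept in rel[1]]


def adjacent_concepts(concept, relationships):
    """Return the set of other concepts linked to `concept` in any hierarchy."""
    return {
        other
        for _, participants in relationships_of(concept, relationships)
        for other in participants
        if other != concept
    }


node_390 = relationships_of("BetaPerseus 2.0", RELATIONSHIPS)
binary = [rel for rel in node_390 if len(rel[1]) == 2]
ternary = [rel for rel in node_390 if len(rel[1]) == 3]
```

With this data, `binary` holds six relationships, `ternary` holds one, and `adjacent_concepts("BetaPerseus 2.0", RELATIONSHIPS)` contains eight concepts, matching the counts stated for node 390.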

Many of the other nodes in FIG. 3 may have relationships with other hierarchies in addition to the relationships shown. These other relationships are omitted so that FIG. 3 and this discussion are clearer. Multiple relationships similar to the examples listed for node 390 may be defined for these other nodes.

2.5 Root Concepts

At the top of each hierarchy for each category is a category root node representing the category root concept from which all the other concepts in the category branch. For convenience in navigating from one category to the next, each of the category root nodes is made a child of an enterprise data root node representing a top-level pseudo-concept for the enterprise data. In one embodiment, the pseudo-concept is "Vocabulary," and every node related to the Vocabulary concept by a direct "child of" relationship is a root node representing a root concept for one category.

2.6 Implementation of the VDF

One embodiment uses a rule-base and declarative computation approach to express the concepts, relationships and rules of the VDF. This approach may be implemented using a high level computer programming language. In one embodiment, the approach is implemented using a logical processing language such as PROLOG™. The high level logical processing language translates statements declaring types and statements expressing rules about combining types into another language, such as the C programming language, that can be compiled and run on a large variety of general-purpose computer platforms.

In this embodiment, the concepts, relationships, attributes and logical implications (including integrity constraints and general computations) are expressed as logical assertions. There are two kinds of logical assertions, facts and rules. A fact is a logical assertion that is considered unconditionally true. A rule is a logical assertion whose truth or lack of truth depends on the truth or lack thereof of other assertions. In this implementation, concepts, relationships and attributes are generally represented as facts, whereas logical implications are represented using rules.

2.6.1 Defining Concepts

For example, in one embodiment, a statement declaring that the phrase BetaPerseus 2.0 is a concept is presented in a high level logical processing language by the expression:

    • ('BetaPerseus 2.0', isConcept)
      Similar expressions are used to enter the other concepts in the vocabulary.


  • The concept may have several attributes besides the phrase that defines it. For example the concept may have a creation date and an author. Attributes of a concept are presented with the following expression:
    • ('BetaPerseus 2.0', 'creation', 'Sep. 19, 2000', 'author', 'John Smith')

2.6.2 Defining Relationships


  • The relationships that constitute a hierarchy connect one concept to one or more other concepts. Relationships are defined with the following expression:
    • (r('ConceptX', 'ConceptY', 'ConceptZ'), relationship (rID))
      where r is a name for the relationship type, ConceptX, ConceptY and ConceptZ are the three concepts related by this statement, making the relationship r a ternary relationship, and this particular relationship has a unique relationship identification number rID. To ensure uniqueness, the value of rID is supplied when the relationship is defined by the system performing the logical processing. Using this expression, the "product type child of" relationship can be defined by the statement:
    • (product_child_of ('BetaPerseus 2.0', 'BetaPerseus 1.0'), relationship (rID2)).
      According to this statement, the relationship rID2 links BetaPerseus 2.0 to BetaPerseus 1.0 by a relationship of relationship type "product_child_of."


  • The ternary relationship of FIG. 2B is defined, after each of the individual concepts are defined, by the expression:
    • (expertise('BetaPerseus 2.0', 'Jane', 'wide area networks'), relationship (rID3)).
      According to this statement, the relationship rID3 links the concept BetaPerseus 2.0 with the concept 'Jane' and the concept 'wide area networks' by a relationship of type "expertise."


  • Similarly, a marketing document stored as a Web page on a network and identified by its Uniform Resource Locator (URL) address 'http://www.Enterprise.com/literature/devices/catalog/Chap2/' is related to the concept 'BetaPerseus 2.0' by the expression:
    • (marketDoc('BetaPerseus 2.0', 'http://www.Enterprise.com/literature/devices/catalog/Chap2/'), relationship (rID4))
      The system returns a unique value for rID4, which is used to reference this particular relationship of type marketDoc in later statements.


  • The relationships defined above can also be given attributes according to this embodiment. Typical relationship attributes include the author of the relationship and the date the relationship is created. These attributes are set for a relationship having a unique identification of rID1 with the expressions:
    • (rID1, 'creator', 'John Dow')
    • (rID1, 'date', 'Oct. 10, 2000').
      Relationships may have other attributes. For example, the expertise relationship defined above has an attribute for the quality of the expertise, which, in the instance of Jane on wide area networks for the BetaPerseus 2.0, is good. This attribute is expressed in this embodiment as follows:
    • (rID3, 'quality', 'good')
      where rID3 is the unique identification for the expertise relationship among Jane, BetaPerseus 2.0 and wide area networks returned by the system when the relationship was created, as described above.


  • A relationship can also be defined for other relationships. For example, a relationship of type "revision" is used to track changes in another relationship.
    • (revision (rID5, rID6), relationship (rID7))
      The use of the revision relationship is illustrated in the following. If the marketing document for the BetaPerseus 2.0 is changed to a different URL, 'http://www.Enterprise.com/Hello/Chap2/', a new relationship is formed by the statement
    • (marketDoc('BetaPerseus 2.0', 'http://www.Enterprise.com/Hello/Chap2/'), relationship (rID8))
      To show that this new relationship with identification rID8 is just a revision of the old relationship with identification rID4 (see above), the revision relationship type is used as follows:
    • (revision (rID4, rID8), relationship (rID9))
      Now, relationship rID9 associated with old relationship rID4 can be used to determine the new relationship rID8 that replaces the old relationship rID4.
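As a hedged illustration of the statements in this section, the sketch below keeps relationships in memory, lets the store issue each rID, and follows revision relationships from an old relationship to its replacement. The class `RelationshipStore` and its methods are hypothetical names for this example. The rID values it issues start at 1 and are not the rID4, rID8 and rID9 of the text.

```python
import itertools


class RelationshipStore:
    """Toy in-memory store; the store, not the caller, issues each rID."""

    def __init__(self):
        self._next_id = itertools.count(1)
        self.relationships = {}  # rID -> (type name, participants tuple)
        self.attributes = {}     # rID -> {attribute name: value}

    def define(self, rtype, *participants):
        """Record a relationship and return its newly issued unique rID."""
        rid = next(self._next_id)
        self.relationships[rid] = (rtype, participants)
        return rid

    def set_attribute(self, rid, name, value):
        """Attach an attribute such as 'creator' or 'date' to a relationship."""
        if rid not in self.relationships:
            raise KeyError(rid)
        self.attributes.setdefault(rid, {})[name] = value

    def latest(self, rid):
        """Follow revision(old, new) relationships from rid to the newest one."""
        seen = {rid}
        while True:
            newer = [
                parts[1]
                for rtype, parts in self.relationships.values()
                if rtype == "revision" and parts[0] == rid
            ]
            if not newer:
                return rid
            rid = newer[0]
            if rid in seen:
                raise ValueError("revision cycle")
            seen.add(rid)


store = RelationshipStore()
old_doc = store.define(
    "marketDoc", "BetaPerseus 2.0",
    "http://www.Enterprise.com/literature/devices/catalog/Chap2/")
new_doc = store.define(
    "marketDoc", "BetaPerseus 2.0", "http://www.Enterprise.com/Hello/Chap2/")
revision_id = store.define("revision", old_doc, new_doc)
store.set_attribute(old_doc, "creator", "John Dow")
```

Here `store.latest(old_doc)` returns `new_doc`, which mirrors using the revision relationship to find the relationship that replaces an old one.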

2.6.3 Defining Rules


  • The hierarchies that relate concepts may have to follow certain rules. For example, as stated above, the product type hierarchy requires that a child have only one parent. These rules are enforced using logical constraints defined in a high level logical processing language as rules. A constraint that detects multiple parents in a set of expressions in the high level logical processing language of one embodiment is given by the expression:
    • (constraint(ConceptC, multiparent(ConceptP1, ConceptP2)))
      • if (ConceptC, childOf, ConceptP1), (ConceptC, childOf, ConceptP2),
      • ConceptP1 ~= ConceptP2.
        which reads, ConceptC has multiple parents ConceptP1 and ConceptP2 if ConceptC is a child of ConceptP1 and ConceptC is a child of ConceptP2 and ConceptP1 is not equal to ConceptP2. A statement is inserted which throws an error if the multiparent constraint is detected.
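The same check can be sketched procedurally. The following is an illustrative analogue of the multiparent constraint, not the logical-language implementation of the embodiment. The function name and the second parent "GammaPerseus" in the usage data are assumptions made to trigger the violation.

```python
from collections import defaultdict


def multiparent_violations(child_of_facts):
    """Map each child with more than one distinct parent to its sorted parents.

    `child_of_facts` is an iterable of (child, parent) pairs, mirroring the
    (ConceptC, childOf, ConceptP) facts of the text.
    """
    parents = defaultdict(set)
    for child, parent in child_of_facts:
        parents[child].add(parent)
    return {child: sorted(ps) for child, ps in parents.items() if len(ps) > 1}


facts = [
    ("BetaPerseus 2.0", "BetaPerseus 1.0"),
    ("BetaPerseus 2.4", "BetaPerseus 2.0"),
    ("BetaPerseus 2.4", "GammaPerseus"),     # hypothetical second parent
    ("BetaPerseus 2.0", "BetaPerseus 1.0"),  # a repeated fact is not a violation
]
violations = multiparent_violations(facts)
```

Collecting parents into a set corresponds to the condition that ConceptP1 is not equal to ConceptP2: the same fact stated twice does not count as two parents.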


  • Another example of a rule that is enforced in the high level logical language as a constraint is the rule that every concept must be a descendant of a root concept. As described above, a root concept is a concept that is a child of the pseudo concept "Vocabulary." A concept is a descendant of the concept Vocabulary if the concept Vocabulary is reachable from the concept by a succession of one or more "child of" relationships. If the concept Vocabulary cannot be reached from a given concept, then the given concept is an orphan concept. Orphan concepts are a violation of the rules for the product type hierarchy and generally result from errors in concept definitions or are introduced when a parent concept is deleted from the hierarchy. This constraint depends on a definition of "reachable." Reachable is defined as follows:
    • (reachable(ConceptX,ConceptY)) if (ConceptX, childOf, ConceptY)
    • (reachable(ConceptX,ConceptY)) if (reachable(ConceptX,ConceptW)),
      • (reachable (ConceptW,ConceptY))
        which reads, ConceptX reaches ConceptY either if ConceptX is a child of ConceptY or if there is a ConceptW such that ConceptX reaches ConceptW and ConceptW reaches ConceptY. The constraint is then expressed as follows:
    • (constraint (ConceptC, orphanConcept)) if ~(reachable(ConceptC,'Vocabulary'))
      which reads, ConceptC is an orphan concept if ConceptC does not reach the pseudo concept "Vocabulary." A statement is inserted which throws an error if the orphanConcept constraint is detected.
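A procedural sketch of the reachable definition and the orphanConcept constraint follows. It replaces the recursive rule with an iterative upward traversal, which is a substituted technique chosen so that the example terminates even on cyclic data. The function names and the concepts "Stray" and "Stray 1.0" are hypothetical.

```python
def reachable(start, target, child_of_facts):
    """True if `target` is reached from `start` by one or more childOf steps."""
    parents = {}
    for child, parent in child_of_facts:
        parents.setdefault(child, set()).add(parent)
    frontier = [start]
    visited = set()
    while frontier:
        concept = frontier.pop()
        for parent in parents.get(concept, ()):
            if parent == target:
                return True
            if parent not in visited:
                visited.add(parent)
                frontier.append(parent)
    return False


def orphan_concepts(concepts, child_of_facts, root="Vocabulary"):
    """Return, sorted, every concept from which the root cannot be reached."""
    return sorted(
        c for c in concepts
        if c != root and not reachable(c, root, child_of_facts)
    )


child_of = [
    ("Product", "Vocabulary"),
    ("Network Device Products", "Product"),
    ("Perseus", "Network Device Products"),
    ("Stray 1.0", "Stray"),  # hypothetical: "Stray" has no parent at all
]
all_concepts = ["Product", "Network Device Products", "Perseus", "Stray", "Stray 1.0"]
orphans = orphan_concepts(all_concepts, child_of)
```

In this data both "Stray" and "Stray 1.0" are reported, since deleting or omitting a parent orphans every concept beneath it.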


  • As discussed above, the example expressions presented in this section are processed by the high level logical processing system to generate code, such as C language code, that implements the concepts, relationships and constraints defined in these expressions. The C language code can then be compiled and executed on any computer system with a C compiler. Further, the C language code can be incorporated in other application programs or compiled into libraries having functions that are called from separate application programs.

    3.0 Vocabulary Database

    A vocabulary database provides persistent storage for the concepts, relationships, and rules of the vocabulary data framework for the enterprise data.

    One embodiment uses a relational database to store the concepts and the relationships among concepts and the rules; however, any suitable data store can be used. In one specific embodiment, a cached data store is used. A relational database uses a schema to describe a series of tables each made up of one or more rows, each made up of one or more fields. The schema names the table and the fields of each row of the table. An example relational database schema to implement the VDF according to one embodiment is described below. In some embodiments the relational database includes a unique row identification number (rowID) for each row in each table.

    In this embodiment, a vocabulary table includes a row for each root concept in the VDF. The fields of each row include the concept name, the concept description and the creation date, as shown in Table 1. A unique rowID may also be included in each row but is not shown in the example tables. Example root concepts are included in several rows of Table 1.
    TABLE 1
    The Vocabulary Table
    Root Category Name   Description                     Creation Date
    Product              Product category                4/12/2000
    User                 User category                   4/12/2000
    Technology           Technology Category             5/15/2000
    Solution             Networking Solutions Category   1/01/2001


    Each root concept in the vocabulary table has its own table comprising one row for every concept within the category. All concepts that are descendants of the root concept via the "child of" relationship are stored in the table defined by the root concept. Table 2 is an example Table for the Product root concept.
    TABLE 2
    The Product Category Table
    Name                      Description          Creation Date
    Network Device Products   Enterprise devices   4/12/00
    Perseus                   router product       4/12/00
    Hercules                  gateway product      4/12/00
    Jason                     hub product          4/12/00
    AlphaPerseus              router product       4/12/00
    BetaPerseus               router product       6/16/00
    BetaPerseus 1.0           router product       6/16/00
    GammaPerseus              router product       9/19/00
    BetaPerseus 2.0           router product       9/19/00
    BetaPerseus 2.4           router product       12/12/00
    BetaPerseus 3.0           router product       1/01/01
    SuperPerseus              router product       2/01/01


    Several tables are employed to store relationships. These tables support N-ary relationships. The relationship type table holds one row for each relationship type, as illustrated in Table 3 for some sample relationship types described above. The table rows include fields for the name of the relationship type, as used in the high level language or conceptual graphs, a fuller description of the relationship, the number of participants and the creation date.
    TABLE 3
    The Relationship Types Table
    Relationship Type Name   Description                                       Number of Participants   Creation Date
    product_child_of         product lineage                                   2                        4/12/2000
    solution_child_of        solution lineage                                  2                        4/12/2000
    user_child_of            user categories                                   2                        4/12/2000
    technology_child_of      technology lineage                                2                        4/12/2000
    expertise                expertise of person with product in technology    3                        1/01/2001
    marketDoc                Marketing document for product                    2                        9/19/2000
    revision                 track revisions in concepts/relationships         2                        2/01/01


    The participant type table holds one row for each participant type in a relationship type, as illustrated in Table 4 for the example relationships of Table 3. This table has a row for each participant of each relationships type. Each row has fields for the name of the relationship type, the role of the participant in the relationship, and the participant type, which is the category of the concept that may fill the given role in the relationship type.
    TABLE 4
    The Participant Types Table
    Relationship Name     Role          Participant Type
    product_child_of      child         Product
    product_child_of      parent        Product
    solution_child_of     child         Networking Solution/Product
    solution_child_of     parent        Networking Solution/Product
    user_child_of         child         User
    user_child_of         parent        User
    technology_child_of   child         Technology
    technology_child_of   parent        Technology
    expertise             person        User
    expertise             product       Product
    expertise             technology    Technology
    marketDoc             product       Product
    marketDoc             document      Document
    revision              old version   Vocabulary/relationshipID
    revision              new version   Vocabulary/relationshipID


    The relationship instance table (Rinstance table) and the participant instance table (Pinstance table) have entries for every instance of the relationships as they are defined for the enterprise data. An example Rinstance table is shown in Table 5 and an example Pinstance table is shown in Table 6, for some of the relationships described above. When a particular relationship is defined between two or more concepts, a new relationship identification (rID) is generated. In one embodiment the particular relationship ID, rID, is the unique rowID corresponding to the next row in the Rinstance table.
    TABLE 5
    The Relationship Instance (Rinstance) Table
    rID    Relationship Type Name   Creation Date
    5000   product_child_of         9/19/2000
    5001   marketDoc                9/19/2000
    5002   product_child_of         9/19/2000
    5003   expertise                9/19/2000
    5004   marketDoc                9/20/2000
    5005   revision                 9/20/2000


    When a "product child of" relationship is created between the BetaPerseus 2.0 and BetaPerseus 1.0 on Sep. 19, 2000, an entry is made into a row of Table 5 and a unique rID of "5000" is generated by the system. Then two rows are added to Table 6 for the two concepts that participate in the "product child of" relationship that has just been added to Table 5. Those two rows each list in the rID field the rID value of "5000" generated for this relationship. One row is generated in Table 6 for the concept BetaPerseus 2.0 in the participant role of child for rID "5000." A second row is generated in Table 6 for the concept BetaPerseus 1.0 in the participant role of parent for rID "5000."
    TABLE 6
    The Participant Instance (Pinstance) Table
    rID    role          Participant
    5000   child         BetaPerseus 2.0
    5000   parent        BetaPerseus 1.0
    5001   product       BetaPerseus 2.0
    5001   document      http://www.Enterprise.com/literature/devices/catalog/Chap2/
    5002   child         BetaPerseus 2.4
    5002   parent        BetaPerseus 2.0
    5003   person        Jane
    5003   product       BetaPerseus 2.0
    5003   technology    private wide area net
    5004   product       BetaPerseus 2.0
    5004   document      http://www.Enterprise.com/Hello/Chap2/
    5005   old version   5001
    5005   new version   5004


    On the same date, in this example, the new product is related to its marketing document with the marketDoc relationship that gets rID "5001." Its participants are listed in Table 6 on the two rows having rID "5001." Later that day a new product_child_of relationship is generated for BetaPerseus 2.4 and receives rID "5002." Its participants are listed in the two rows of Table 6 with rID of "5002." Then the expertise relationship of Jane using the BetaPerseus 2.0 in private wide area networking is established on the same day and gets an rID of "5003." The three participants of that relationship are added to Table 6 in the three rows with an rID value of "5003." The next day, on Sep. 20, 2000, a new marketing document is associated with the product by generating a new marketDoc relationship that receives the rID of "5004." The product and document participants are added to Table 6 in the rows showing an rID value of "5004." Finally, the revision of the marketing document is memorialized with the revision relationship, which receives an rID of "5005" in Table 5. The two participants of the revision relationship are added to Table 6 as two rows having an rID value of "5005." The two participants are the old marketDoc relationship rID of "5001" and the new marketDoc relationship rID of "5004." Though participants are listed in Table 6 with increasing values in the rID field, it is not necessary that the value of rID increase monotonically for the system to operate.
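The population of the Rinstance and Pinstance tables described above can be sketched with an in-memory SQLite database. This is a minimal sketch under stated assumptions: the table and column names, the ISO date strings, and the helpers `open_store`, `define_relationship` and `participants_of` are chosen for the example, and the rID values issued here start at 1 rather than 5000.

```python
import sqlite3


def open_store():
    """Create an in-memory database with Rinstance- and Pinstance-like tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE rinstance (
            rid     INTEGER PRIMARY KEY,  -- row ID doubles as the rID
            rtype   TEXT NOT NULL,
            created TEXT NOT NULL
        );
        CREATE TABLE pinstance (
            rid         INTEGER NOT NULL,
            role        TEXT NOT NULL,
            participant TEXT NOT NULL,
            PRIMARY KEY (rid, role)
        );
    """)
    return conn


def define_relationship(conn, rtype, created, participants):
    """Insert one rinstance row plus one pinstance row per participant."""
    with conn:  # commit both inserts together, or roll both back on error
        cur = conn.execute(
            "INSERT INTO rinstance (rtype, created) VALUES (?, ?)",
            (rtype, created),
        )
        rid = cur.lastrowid
        conn.executemany(
            "INSERT INTO pinstance (rid, role, participant) VALUES (?, ?, ?)",
            [(rid, role, str(p)) for role, p in participants.items()],
        )
    return rid


def participants_of(conn, rid):
    """Return {role: participant} for one relationship instance."""
    rows = conn.execute(
        "SELECT role, participant FROM pinstance WHERE rid = ?", (rid,)
    )
    return dict(rows.fetchall())


db = open_store()
lineage_rid = define_relationship(
    db, "product_child_of", "2000-09-19",
    {"child": "BetaPerseus 2.0", "parent": "BetaPerseus 1.0"},
)
expertise_rid = define_relationship(
    db, "expertise", "2000-09-19",
    {"person": "Jane", "product": "BetaPerseus 2.0",
     "technology": "private wide area net"},
)
```

Because the number of Pinstance rows follows the number of participants, the same two tables hold binary and ternary relationships alike.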

    The "is a" relationship is a common relationship that also could be represented with entries in the Relationship Type, Participant Type, Relationship Instance and Participant Instance tables. However, better performance is achieved if all instances of "is a" relationships are placed in an "Is_A" table. For one embodiment, an example Is_A table is shown in Table 7. For this example, all "product child of" relationships are kept in this Is_A table.
    TABLE 7
    Is_A Table.
    Concept Name                        Parent Concept                      Creation Date
    Enterprise Network Device Product   Product                             4/12/2000
    Perseus                             Enterprise Network Device Product   4/12/2000
    AlphaPerseus                        Perseus                             4/12/2000


    Attributes of concepts and relationships beyond those already included in the above tables are kept in one or more attributes tables. In one embodiment, all these additional attributes of concepts are kept in a single concepts attributes table. Similarly, all the additional attributes of relationships are kept in a single relationships attributes table. Table 8 is an example concepts attributes table for the example concepts described above.
    TABLE 8
    Concepts Attributes Table.
    Concept Name Attribute Name Attribute Value
    BetaPerseus 2.0 author John Smith


    Table 9 is an example relationships attributes table for the example relationships described above. The expertise relationship was described above to include an attribute called "quality" for indicating the quality of the expertise using one of the values "unknown," "poor," "average," "good," and "excellent." This relationship type occurred in the relationship having rID of 5003 as shown above in Table 5. Therefore the corresponding entry in the relationships attributes table is given in Table 9.
    TABLE 9
    Relationships Attributes Table.
    rID Attribute Name Attribute Value
    5003 quality good


    The rules that express general computations and constraints on the relationships are also stored in tables. In this embodiment, the rules are stored as text for the high level logical processing language. In this way, the stored rules can be imported directly into a rules engine program of the high level logical processing system. Table 10 is an example rules table including the reachable rule described above.
    TABLE 10
    Rules Table
    Rule Name   Rule Statement Sequence Number   Rule Statement
    reachable   1                                reachable (ConceptX, ConceptY) if
                                                 (ConceptX, childOf, ConceptY)
    reachable   2                                reachable (ConceptX, ConceptY) if
                                                 reachable (ConceptX, ConceptW),
                                                 reachable (ConceptW, ConceptY)
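
    The two "reachable" rule statements together describe a transitive closure over child-of facts. The following is a minimal Java sketch of that logic, not the rule engine of the described embodiment, which evaluates the rule text in a logical processing language. The class name ReachableSketch and its methods are hypothetical, and the sketch assumes each concept has at most one parent.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only: the described system stores this rule as
// text and runs it in a rule engine rather than in hand-written Java.
class ReachableSketch {
    // child concept -> parent concept (assumes a single parent per concept)
    private final Map<String, String> childOf = new HashMap<>();

    void addChildOf(String child, String parent) {
        childOf.put(child, parent);
    }

    // Rule 1 covers the direct parent; rule 2 chains through intermediate
    // concepts, which here amounts to walking up the parent links.
    boolean reachable(String x, String y) {
        String parent = childOf.get(x);
        int steps = 0;
        // The step bound guards against accidental cycles in the facts.
        while (parent != null && steps++ <= childOf.size()) {
            if (parent.equals(y)) {
                return true;
            }
            parent = childOf.get(parent);
        }
        return false;
    }
}
```

    With a fact that AlphaPerseus is a child of Perseus, reachable("AlphaPerseus", "Perseus") holds by rule 1; any ancestor of Perseus is then reachable from AlphaPerseus by rule 2.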


    One embodiment of the VDF allows multiple concepts from different concept categories to have the same name. The duplicate names are converted to unique identifiers called DupIDs and the unique identifiers are used in the concept database. The duplicates table is used in the conversion process. Table 11 is an example duplicates table for an embodiment in which a product concept and a technology concept both use the name Perseus. In this case, the name inserted into the second row of Table 2 above would be "1234" instead of "Perseus."
    TABLE 11
    Duplicates Table
    DupID Name Category
    1234 Perseus Product
    2789 Perseus Technology


    One embodiment of the VDF also allows raw terms to be stored in the database. Raw terms are words or phrases that may become a concept at a later time. Raw terms can originate from a wide variety of sources, such as a trade journal article reviewing a product or a customer order. The raw terms are stored in this embodiment in a dedicated table. Table 12 is an example raw term table.
    TABLE 12
    The Raw Terms Table
    Raw Term Name   Description                                                   Creation Date   Category
    SuperPerseus    term for BetaPerseus 2.5 coined by Reviewer A. Newman         12/12/2000      Product
    P-Routers       Term for Perseus routers in customer request from Company A   9/25/2000       Product

    4.0 Vocabulary Development Server

    The Vocabulary Development Server (VDS) is one or more processes that provide management of and access to the enterprise data in the vocabulary database to other processes in an enterprise data processing system. Herein, the vocabulary database is also called the VDS Concept Database.

    In the disclosed embodiment, the VDS includes several object-oriented application program interfaces (APIs). Several of the VDS APIs use function calls that are configured to allow client processes to interact with the database application without a need to know the organization of the database implementation. This allows modifications to be made to the database organization, such as adding relationships or adding or deleting levels to one or more hierarchies, without changing the client processes. All adjustments to changes in the database are accommodated in the VDS APIs.

    FIG. 4A is a block diagram showing the architecture of the VDS 410 and its relationship to some external processes. The VDS Concept database 420 is described above. A database access API 422 provides processes to operate on the database rows and tables based on knowledge of the database schema. These processes include connecting to the database, starting a transaction, such as adding, deleting or modifying a row in a table, committing the change in the row to the persistent storage, aborting a transaction, and disconnecting from the database. The database access API 422 also provides processes for adding, deleting, and modifying a raw term in the raw term table.

    A database concept access API 424 provides processes for manipulating concepts, relationships and rules in the concept database without requiring knowledge of the actual database schema. For example, processes are included to return all the concepts in a given category, to generate and store a concept category, to add a concept to a category, to return sub-concepts (that is, concepts that are descendants of a given concept), to return child concepts, to return the parent concept of a given concept, to return ancestor concepts, to rename a given concept, to set the parent of a given concept, to delete a concept, and to return duplicate mapping. The database concept access API 424 also includes processes for manipulating relationships, such as to return all relationships, to return all relationship types, to return all "Is_A" relationships, to return all relationships of a given type, to generate and store a relationship type, to generate and store a relationship, to modify a participant or participant type in a relationship type, to modify a participant instance in a relationship instance and to delete a relationship. The database concept access API 424 includes processes for manipulating attributes, such as to return attribute information for all concepts in a given category, to set attribute information, to update attribute information, and to delete attribute information. The database concept access API 424 includes processes for manipulating rules, such as to return all rules in the rule table, to return all rules with a given name, to set the definition of a rule with a given name and sequence number, to generate and store a new rule with a given name and definition, to delete a given rule, and to delete rules with a given name.

    The VDS database concept access API 424 is used by applications that are external to the VDS 410, such as concept application 408, and servlet 403a of Web Server 402. The VDS database concept access API 424 is also used by other processes within VDS 410, such as the concept import module 426 and the concept export module 428, and the rule engine 430 of the concept access API 432. All elements of FIG. 4A that are shown outside of VDS 410 are shown by way of example, and are not required. Further, the structural elements of VDS 410 are shown as examples and the specific architecture shown is not required.

    The concept import module 426 is designed for the bulk import of a large amount of data, splitting that data into concepts, and storing the concepts in the concept database 420. The concept export module 428 is designed for the bulk export of a large number of related concepts and concept attributes to an external system, such as concept application 408, and client 404 or concept web application 406 through the Web server 402 via servlet 403b.

    The concept access API 432 provides processes for use by other applications that deal with groups of related concepts, or for responding to queries about concepts, relationships and rules that are received from external application programs. The API is used, for example, by the concept application 408 and servlet 403b of Web server 402 which are technically client processes of the VDS. Through network 401 and the Web server 402, a standalone client 404 such as a Web browser or a concept Web application 406 obtains and uses concept data. These are technically client processes of the Web server 402.

    The concept access API 432 groups related concepts based on the requests made by the client processes. The concept definitions and relationships are checked to determine that constraints are not violated. Rules that are employed to define the computations or constraints employed by the concepts and relationships are obtained from the concept database 420 through the database concept access API 424, are converted to executable statements, and are executed by the rule engine 430 of the concept access API 432.

    In one embodiment, the rule engine 430 is integrated with the concept access API 432 through the use of a foreign function facility of the PROLOG™ rule engine. This component provides service functions that enable the rule engine to access information, including rules expressed in text of a high level language, from the concept database 420 through the database concept access API 424. Rule execution functions can execute in the rule engine 430 the rules retrieved from the database 420. These functions marshal the function arguments (such as concepts/relationships/attribute) into the rule arguments, execute the PROLOG™ rule and retrieve any results, and un-marshal the rule results into a results set suitable for returning back to the client process, e.g., the calling application.

    In this arrangement the concept database can be continually updated with new concepts, new hierarchies, new levels in old hierarchies, new relationships between hierarchies, and new rules, without requiring changes in the applications such as concept application 408, Web server 402, standalone client 404, or concept Web application 406. Any changes dictated by changes in the database 420 can be accommodated by changes in one or more of the APIs of the VDS, such as database access API 422, database concept access API 424, and concept access API 432.

    4.1 Implementation of the VDF—Alternative Embodiment

    According to one embodiment, the VDF is implemented in the form of one or more software elements that programmatically observe the following rules. The desired attributes of the VDS are derived from the Ontology model wherein the real world Objects are modeled as atomic concepts and relationships among the concepts.
    • 1. A Concept is an atomic unit of a company's intellectual property, known as and represented by a Node.
    • 2. A Concept is a normalized name, that is, one and only one space character separates the words. Example: "Book title", "Author_name", "Book_written_by".
    • 3. A Concept may have zero or more properties, known as attributes.
    • 4. An Attribute is a (Name, Value) pair, where a Name cannot be duplicated within a Node.
    • 5. A set of concepts arranged in a hierarchy represents a taxonomy and may be represented logically or in memory as a Tree. The tree is composed with certain rules:
      • a. A tree has a root node, called Category node.
      • b. A Category is a concept and it has a name.
      • c. A Category cannot be duplicated in a system.
      • d. A Category has a special node called Orphan Node. When a Concept node is deleted, the children Concept nodes are moved under the Orphan node. The Concepts under the Orphan Node are known as Orphans.
      • e. The category node is directly attached to the pseudo node, Vocabulary Node.
      • f. A tree should not have duplicate names/concepts. The requirement may be to allow case-sensitivity.
    • 6. A node should have one and only one parent node, except the pseudo node that does not have a parent node.
    • 7. A Concept in a hierarchy may inherit the properties from the parent node unless the property value is set in the node or the parent attribute(s) is not exposed to its children. The inheritance goes all the way up to the pseudo node, known as Vocabulary node.
    • 8. A Concept may have relations with one or more other concepts. The relationships may exist across other taxonomy.
    • 9. A relationship is a Concept whose name is assigned by the system, except for the root relationship nodes, known as Relation Types. A relationship is also referred to as a Relationship Instance or "Instance."
    • 10. A relationship is a special node that, in addition to having the qualities of a Concept, has two or more references to other concepts, known as Relationship Participants (in short, Participants).
    • 11. A Relationship participant has a Role name and a reference. The role name is simply an identifier for the reference. A Role name cannot be duplicated within a Relationship.
    • 12. A relationship has one and only one parent, a Relation type.
    • 13. A relation type is a root node for one or more relationship nodes and is a relationship node itself. So the participants for the relation type, known as type participants, are the references to Category nodes and Relation type nodes. The relation type node is a template and dictates the possible participants for a relationship instance and the role names for the participant candidates. A relation type is another hierarchical taxonomy, mostly at a single level.
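
    Rule 7 above (attribute inheritance) can be illustrated with a short Java sketch. The class InheritingNode, its methods, and the "exposedToChildren" flag are hypothetical names chosen for this illustration. The sketch also assumes that an ancestor attribute that is not exposed stops the lookup, a detail the rules do not specify.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of rule 7: a node uses its own attribute value when one
// is set, and otherwise inherits from the nearest ancestor that exposes it.
class InheritingNode {
    private final InheritingNode parent; // null only for the Vocabulary pseudo node
    private final Map<String, Object> attributes = new HashMap<>();
    private final Set<String> hidden = new HashSet<>(); // names not exposed to children

    InheritingNode(InheritingNode parent) {
        this.parent = parent;
    }

    void setAttribute(String name, Object value, boolean exposedToChildren) {
        attributes.put(name, value);
        if (exposedToChildren) {
            hidden.remove(name);
        } else {
            hidden.add(name);
        }
    }

    Object getAttribute(String name) {
        if (attributes.containsKey(name)) {
            return attributes.get(name); // a locally set value wins
        }
        for (InheritingNode n = parent; n != null; n = n.parent) {
            if (n.attributes.containsKey(name)) {
                // Assumption: a non-exposed ancestor attribute ends the search.
                return n.hidden.contains(name) ? null : n.attributes.get(name);
            }
        }
        return null; // not defined anywhere up to the Vocabulary node
    }
}
```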


  • The rules for creation and existence of a relationship are:
    • 1. The relationship must have participants from the taxonomy of Categories or Relation types that the relation type specifies.
    • 2. The role name of the instance participant should also match with the specification in the relation type.
    • 3. In a relation type, there should not be any two instances having the same set of participants.
    • 4. If a type participant is pseudo concept Vocabulary, then the relationship instances can have concepts from any taxonomy.
    • 5. A relation type cannot be duplicated in a system.
    • 6. When a Relation instance is removed, the node is simply removed from the Relation type.
    • 7. When a Relation type is removed, all the relation instances and the type are removed from the system.


  • When a concept is removed, the following processing steps occur:
    • 1. All the relationship instances in which this concept is one of the participants are removed.
    • 2. The hierarchies of nodes below this concept node are moved to the Orphan node of the Category node.
    • 3. The Concept node is removed from the system.
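
    Steps 2 and 3 of the concept removal procedure can be sketched as follows. This is a minimal Java illustration under stated assumptions: the classes TaxNode and CategorySketch are hypothetical, and step 1 (removing the relationship instances that reference the concept) is omitted because it depends on the relationship index.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical tree node used only for this illustration.
class TaxNode {
    final String name;
    TaxNode parent;
    final List<TaxNode> children = new ArrayList<>();

    TaxNode(String name) {
        this.name = name;
    }

    void add(TaxNode child) {
        child.parent = this;
        children.add(child);
    }
}

// Hypothetical Category with its special Orphan node.
class CategorySketch {
    final TaxNode root;
    final TaxNode orphans = new TaxNode("Orphan");

    CategorySketch(String name) {
        root = new TaxNode(name);
        root.add(orphans);
    }

    void removeConcept(TaxNode concept) {
        // Step 2: move each child, together with its subtree, under the Orphan node.
        for (TaxNode child : new ArrayList<>(concept.children)) {
            concept.children.remove(child);
            orphans.add(child);
        }
        // Step 3: detach the concept node itself from the tree.
        if (concept.parent != null) {
            concept.parent.children.remove(concept);
            concept.parent = null;
        }
    }
}
```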


  • When a category is removed, the following processing steps occur:
    • 1. All the relationship instances and Relationship types in which this category is one of the participants are removed.
    • 2. All the Concept nodes including Orphans are removed from the system with relevant relationships.
    • 3. The Category node is removed from the system.


    In one embodiment, the VDS is configured in a way that offers good performance in terms of support for a large volume of simultaneous requests, extensibility and adaptability to new business requirements. The VDS provides security and internationalization support for concepts and relationships.

    One embodiment uses a rule-base and declarative computation approach to express the concepts, relationships and rules of the VDF. This approach may be implemented using a high level computer programming language. In one embodiment, the approach is implemented using a logical processing language such as PROLOG™. The high level logical processing language translates statements declaring types and statements expressing rules about combining types into another language, such as the C programming language, that can be compiled and run on a large variety of general-purpose computer platforms. This approach relies on the inference power of a declarative engine and reduces coding and implementation that may impose a performance penalty.

    In another approach, the taxonomy of hierarchical concepts and their relationships can be modeled as an in-memory tree data structure. FIG. 4C is a diagram of a binary tree representation that can be modeled using one or more data structures stored in computer memory. This model captures the business logic and is supplemented with constraints placed on the data model as programming logic. One example of such a rule is "a child concept should have one and only one parent." This approach is fast and efficient, but it has the limitation that it consumes a considerable amount of main memory. A file based or database based LRU (Least Recently Used) algorithm implementation would overcome this limitation.
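
    As one illustration of the LRU idea, the following minimal Java sketch bounds the number of nodes held in memory by using the access-order mode of java.util.LinkedHashMap. The class name LruNodeCache is hypothetical, and the sketch only drops the evicted entry; a file based or database based variant would write the evicted node to its backing store at the marked point.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: keeps at most 'capacity' entries in memory and evicts
// the least recently used entry when the bound is exceeded.
class LruNodeCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    LruNodeCache(int capacity) {
        super(16, 0.75f, true); // true selects access order rather than insertion order
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // A fuller implementation would persist 'eldest' to a file or database here.
        return size() > capacity;
    }
}
```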

    Referring now to FIG. 4C, each of the top-level nodes 491 under the Vocabulary pseudo concept node 490 is a Category node, which implements the additional business logic and facilitates fast lookup and retrieval of concept nodes. Similarly, Relation type node 492 implements additional constraints on the relationship instances and facilitates fast responses to queries of n-ary relationships. A response time of less than approximately 1 millisecond is achieved by having appropriate indices in the Category and Relation Type nodes 491, 492. A simple Hash Map or a balanced tree data structure could model the in-memory index.

    An example for retrieval could be as follows. Assume that the system receives a query getParticipants( ) with a set of arguments that identify participants in a set of relationships. The system is expected to return the matched relationship instances. One approach would be to go through each of the relationship instances and check for a match. When there are millions of instances, this would be slow. Accordingly, in a preferred approach, the following steps are followed to retrieve the information quickly and efficiently.

    The system maintains an array of relationship instances for each of the participants in the system. Given the query participants, the array of instances with the minimum length is chosen. Each element in the array is checked for a match by comparing the query participants and the relationship participants. This is quick and involves less computation. As an example, referring again to FIG. 4C, a fictitious hierarchical model is shown. Relation1 is a relation type that has an index, for each participant, of all the relation instances. Only a few example indices are provided, and the names are given as numbers for ease of explanation.
    Participant Instances
    1 54001, 54002, 54011, 54202, 54301, 54042
    4 54000, 54001, 54042
    8 54001, 54202, 54900, 54301, 56899, 63629

    Now consider a query getRelationship("Relation1", {"4", "8"}) wherein the API returns the relation instance names. A hash table lookup using the relation type name 'Relation1' returns the Relation type Object. The relation type object contains a hash table of participants and arrays of the relationship instances, as in the table. A lookup on this table using "4" and "8" returns two arrays containing relation instances. The implementation chooses the array for "4" because it has the fewest instances to compare. The system checks "54000" to determine if it is in the list of participant "8"; since it is not present, it is ignored. The system checks the value "54001" in the list of participant "8," and there is a match. The system then checks "54042," which is not in the list of participant "8" and so is ignored; this exhausts all the elements. The result set is a list with one element, "54001."
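
    The lookup walked through above can be sketched in Java as follows. This is an illustration only: the class RelationIndexSketch and its put method are hypothetical, the index is keyed by participant name within a single relation type, and membership is tested with a simple list search rather than whatever structure the actual implementation uses.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical per-relation-type index: participant -> relationship instances.
class RelationIndexSketch {
    private final Map<String, List<String>> index = new HashMap<>();

    void put(String participant, String... instances) {
        index.put(participant, Arrays.asList(instances));
    }

    // Returns the instances in which every query participant takes part.
    List<String> getRelationship(String... participants) {
        // Choose the shortest instance array among the query participants.
        List<String> shortest = null;
        for (String p : participants) {
            List<String> list = index.get(p);
            if (list == null) {
                return new ArrayList<>(); // an unknown participant matches nothing
            }
            if (shortest == null || list.size() < shortest.size()) {
                shortest = list;
            }
        }
        List<String> result = new ArrayList<>();
        if (shortest == null) {
            return result; // no participants were given
        }
        // Keep only the candidates that also appear under every other participant.
        for (String instance : shortest) {
            boolean inAll = true;
            for (String p : participants) {
                if (!index.get(p).contains(instance)) {
                    inAll = false;
                    break;
                }
            }
            if (inAll) {
                result.add(instance);
            }
        }
        return result;
    }
}
```

    Loaded with the three example index rows, a query for participants "4" and "8" scans only the three instances listed for "4" and returns the single instance "54001".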

    Embodiments also provide flexibility and adaptability to new requirements by having an Object Oriented Data Model, which can be implemented in any Object Oriented Language such as C++ or Java. FIG. 4D is a diagram of a class hierarchy of an example object-oriented model, in which a class "VDFNode" is the base class that models the tree data structure. Because all the other nodes are inherited from VDFNode, flexibility is provided. For example, causing a Relation Type to have another Relation Type as a participant could be done by declaring the type participant as a VDFNode.

    The core classes of an example implementation of the VDS in the Java language are described here. The implementation is shown in the Java language; however, it could be done in any higher-level programming language, e.g., C, C++, etc.

    A ConceptName represents the name of a Concept, as in this code example:
    public interface ConceptName extends Name {
    }
    public interface Name {
        public String getName();
        public void setName(String name);
    }
    An Attribute class encapsulates a (Name, Value) pair, as in:
    public interface Attribute {
        public Name getName();
        public void setName(Name n);
        public Object getValue();
        public void setValue(Object v);
    }


    A VDFNode class implements the tree data structure and provides the basic tree operations. It also provides all the get, add and set APIs for attributes. However, the set APIs push the action upwards in the tree hierarchy, which allows the calls to be trapped by the root level nodes, CategoryNode and RelationTypeNode, for enforcing business logic and Access Control. An addChild method goes all the way up until it finds a Node that does the checks and calls the actual add implementation, _addChild( ), in VDFNode.
    import java.util.Set;

    public abstract class VDFNode {
        private VDFNode parent = null;
        private VDFNode firstKid = null;
        private VDFNode sibling = null;
        private int nodeID;
        private Set attributes = null;

        public void changeAttribute(Attribute attr) {
            // some implementation
        }
        public void addAttribute(Attribute attr) {
            // some implementation
        }
        // more attribute related APIs

        public void addChild(VDFNode n) {
            setChild(this, n);
        }
        // Pushes the request up the tree; a root-level node overrides this
        // method, applies its checks, and performs the actual add.
        public void setChild(VDFNode parent, VDFNode child) {
            VDFNode p = getParent();
            if (p == null)
                throw new IllegalStateException("business logic not found in the hierarchy");
            // pass the original target parent along unchanged
            p.setChild(parent, child);
        }
        protected void _addChild(VDFNode n) {
            // implementation
        }
        protected void _removeChild(VDFNode child) {
            VDFNode prev = null;
            if (firstKid == child)
                firstKid = child.sibling;
            else if ((prev = child.getPrevSibling()) != null)
                prev.sibling = child.sibling;
            child.sibling = null;
            child.parent = null;
        }
        // more APIs (getParent, getPrevSibling, getType, ...)
    }


    A ConceptNode is a VDFNode and has a Normalized Name. Again a call to a set/add method pushes the call up to the root node which sets/adds the Object after the constraint checks.
    public class ConceptNode extends VDFNode {
        Concept concept = null;

        public void setConcept(Concept c) {
            setConcept(this, c);
        }
        // Pushes the request up to the root node, which performs the constraint checks.
        public void setConcept(ConceptNode node, Concept c) {
            VDFNode p = getParent();
            if (p == null)
                throw new IllegalStateException("business logic not found in the hierarchy");
            if (p.getType() != Constants.ConceptNode)
                throw new IllegalStateException("Invalid node in the hierarchy");
            ((ConceptNode) p).setConcept(node, c);
        }
        protected void _setConcept(Concept c) {
            this.concept = c;
        }
    }


    A CategoryNode is a root node in a taxonomy of concepts. It implements the business rules related to Concepts as stipulated in the VDS rules. For example, the setConcept( ) method is implemented here to check for a duplicate Concept Name and to set the concept on the target ConceptNode, node. The root node implementations in CategoryNode and RelationTypeNode use a Read/Write Lock Object for efficiency; it allows multiple reader threads to proceed concurrently, whereas Java synchronization allows only a single reader thread at a time to pass through the critical path.
    import java.util.Comparator;
    import java.util.TreeMap;

    public class CategoryNode extends ConceptNode implements Comparator {
        // ReadWriteLock is the implementation's own lock class (not shown)
        protected ReadWriteLock rwLock = new ReadWriteLock();
        protected boolean ignoreCase = true;
        protected TreeMap concepts = null; // initialization not shown

        public void setConcept(ConceptNode node, Concept c) {
            rwLock.writeLock();
            try {
                // reject a name that already exists in this category
                if (!concepts.containsKey(c.getName())) {
                    concepts.put(c.getName(), node);
                    node._setConcept(c);
                }
                else
                    throw new IllegalStateException("Concept duplicate.");
            } finally { rwLock.releaseLock(); }
        }
        // compare( ) and other methods not shown
    }


    A RelationParticipant has a reference to a VDFNode and a role name for the reference.
    public class RelationParticipant {
        private VDFNode participant = null;
        private String roleName = null;
    }


    A RelationNode is the base class of the relationship classes. It captures the set of Relation Participants with their Role Names. The field 'role_participants' is a collection of (Role name, VDFNode) pairs, held here in a HashMap. The class provides all the APIs for setting and getting the values in the collection.
    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;

    public abstract class RelationNode extends ConceptNode {
        private HashMap role_participants = null;

        public void addParticipant(RelationParticipant part) {
            setParticipant(this, part);
        }
        public void setParticipant(RelationNode node, RelationParticipant part) {
            VDFNode p = getParent();
            if (p == null)
                throw new IllegalStateException("business logic not found in the hierarchy");
            if (p.getType() != Constants.RelationNode)
                throw new IllegalStateException("Invalid node in the hierarchy");
            ((RelationNode) p).setParticipant(node, part);
        }
        protected void _setParticipant(RelationParticipant part) {
            // role_participants.put(part.getRole(), part);
        }
    }
    public class RelationTypeNode extends RelationNode {
        private ReadWriteLock rwLock = new ReadWriteLock();
        private HashMap relations = new HashMap();

        public void setParticipant(RelationNode node, RelationParticipant part) {
            rwLock.writeLock();
            try {
                List list = node.getParticipants();
                list.add(part);
                if (existsRelation(list))
                    throw new IllegalStateException("relation already exists with same participants.");
                node._setParticipant(part);
                VDFNode p = part.getParticipant();
                // build cache in advance: index the relationship instance under this participant
                List plist = (List) relations.get(p);
                if (plist == null) {
                    plist = new ArrayList();
                    relations.put(p, plist);
                }
                plist.add(node);
            } finally { rwLock.releaseLock(); }
        }
    }



    4.1.1 Defining Concepts

    In one embodiment, a statement declaring that the phrase BetaPerseus 2.0 is a concept is presented in a high level logical processing language by the expression:
    • new Concept('BetaPerseus 2.0');


    Similar expressions are used to enter the other concepts in the vocabulary.

    The concept may have several attributes besides the phrase that defines it. For example the concept may have a creation date and an author. Attributes of a concept are presented with the following expression:

    concept.addAttribute(new Attribute( 'creation', 'Sep. 19, 2000'));

    4.1.2 Defining Relationships

    The relationships that constitute a hierarchy connect one concept to one or more other concepts. Relationships are defined with the following expression:

    new RelationTypeNode("prod_can_have_doc", 2);

    where "prod_can_have_doc" is a relationship type and "2" is a value associated with the parameter type, i.e., in this example, a product can have 2 documents associated with it.
    • relationType.addChild(new RelationInstanceNode(new VDFNode[] {conceptNode1, conceptNode2}));

    4.1.3 Retrieving Relationships

    • relationType.getRelationship("marketDoc", "BetaPerseus 2.0", "http:///www.Enterprise.com/literature/devices/catalog/Chap2/");

    4.1.4 Persistent Data Storage


    Changes in the VDS system need to be recorded in a permanent store for recovery and backup. VDS uses an RDBMS for its persistent storage. FIG. 4E is a diagram of a data representation schema in the form of a fixed set of normalized tables that may be stored in persistent storage. The arrangement of FIG. 4E offers the flexibility to model n-ary relationships and m-by-n level hierarchies. The VDS system generates a unique ID for each node as it is created in the system through adding a concept or relationship. These IDs are used as the primary keys in the database tables. To enhance performance, the implementation commits changes to the persistent store at a specified interval as a batch update. This must be done with great care to avoid losing changes. It is achieved by having a separate Thread that maintains the changes and writes them to the persistent store at regular intervals. The changes are written to transaction.dat, which accumulates the events as they happen in the system, and to transaction_history.dat, which maintains the history of transaction files that are to be merged and of those that have already been merged successfully into the database. The format of transaction.dat is: command!argument[!argument]* as shown below:
  • 1018!270560607! status_date!2001 06 14 16:56:56
  • 1016!570560601!REL1!2
  • 1017!570560601!57067223
    When the Thread wakes up to synchronize the database, it moves transaction.dat under a directory, with a time stamp as part of the filename, as in 2001/6/14/1958_transaction.dat. The thread runs through the lines of the file, composes SQL prepared statements, and performs the batch updates to the database. As each batch succeeds, the lines involved in the batch update are prefixed with a '+' sign to indicate that they are merged with the database. This way, the server can merge the uncommitted changes to the database in case of error.
  • +1018!270560607!deploy_status_date!2001 06 14 16:56:56
  • +1016!570560601!Prod_PCR!2
  • +1017!570560601!57067223
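
    The line format command!argument[!argument]* and the '+' prefix for merged lines can be handled with a small parser. The following is a minimal Java sketch, assuming that arguments never contain the '!' character; the class TransactionLine and its members are hypothetical and are not taken from the described implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical representation of one line of a transaction file.
class TransactionLine {
    final boolean merged;         // true when the line carries the '+' prefix
    final String command;         // numeric command code, kept as text
    final List<String> arguments; // remaining '!'-separated tokens

    private TransactionLine(boolean merged, String command, List<String> arguments) {
        this.merged = merged;
        this.command = command;
        this.arguments = arguments;
    }

    static TransactionLine parse(String line) {
        boolean merged = line.startsWith("+");
        String body = merged ? line.substring(1) : line;
        String[] tokens = body.split("!");
        List<String> args =
                new ArrayList<>(Arrays.asList(tokens).subList(1, tokens.length));
        return new TransactionLine(merged, tokens[0], args);
    }

    // Marks a line as merged once its batch update has succeeded.
    static String markMerged(String line) {
        return line.startsWith("+") ? line : "+" + line;
    }
}
```

    On restart, a recovery pass under these assumptions would parse each line of an unfinished file and replay only those whose merged flag is false.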
    The transaction_history.dat file is a quick index that the thread uses to find the files that are not fully committed. When the thread merges the changes, it marks the appropriate entry in the history file with '+'. A typical history file looks like this:
  • +/opt/httpd/root/apps/mdf-sr/7213copy/2001/6/14/1656_yacs_tran.dat
  • +/opt/httpd/root/apps/mdf-sr/7213copy/2001/6/14/1757_yacs_tran.dat
  • /opt/httpd/root/apps/mdf-sr/7213copy/2001/6/14/1858_yacs_tran.dat
  • /opt/httpd/root/apps/mdf-sr/7213copy/2001/6/14/1_58_yacs_tran.dat



    FIG. 4F is a block diagram of an example architecture of the VDS according to another embodiment. Services provided by the VDS to clients and applications include vocabulary management and administration. Vocabulary related services are exposed to remote clients through a Metadata Access Protocol (MAP) over TCP/IP or RMI. Administration is a non-functional requirement, but it is a desirable feature that allows remote monitoring and server fine-tuning.

    MAP is designed for better performance than an RMI based approach. MAP is a language-neutral protocol in which the request and response are transmitted over TCP/IP as tokens. The client application must know how to assemble the tokens into the desired return result. The request format is:
  • Command_Identifier!Arguments_separated by_!
  • Example
  • getChildConcepts!Category!
    The response format is as follows.

    If the request succeeds, the format is:
    • +OK command_code
    • <responses in single or multiple lines>
    • <CRLF>
    If the request fails, the format is:
    • -ERR!error_code!error_message_in single line
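
    A client of this protocol has to assemble the request tokens and interpret the status line of the response. The following Java sketch shows one way to do so. The class MapMessages and its methods are hypothetical helpers, the socket handling is omitted, and the sketch assumes that arguments and error codes do not contain the '!' separator.

```java
// Hypothetical client-side helpers for the request and response formats above.
class MapMessages {
    // Builds "Command!arg1!arg2!", with each token followed by the '!' separator.
    static String formatRequest(String command, String... arguments) {
        StringBuilder sb = new StringBuilder(command).append('!');
        for (String argument : arguments) {
            sb.append(argument).append('!');
        }
        return sb.toString();
    }

    // A successful response starts with the "+OK" status token.
    static boolean isSuccess(String statusLine) {
        return statusLine.startsWith("+OK");
    }

    // Returns {error_code, error_message} for an "-ERR!..." line, or null otherwise.
    static String[] parseError(String statusLine) {
        if (!statusLine.startsWith("-ERR!")) {
            return null;
        }
        // Limit of 2 keeps any further '!' characters inside the message text.
        return statusLine.substring("-ERR!".length()).split("!", 2);
    }
}
```

    Here formatRequest("getChildConcepts", "Category") reproduces the request shown in the example above.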

    4.1.5 Security


    VDS is a knowledge repository for storing and establishing Cisco's standard for concept categorization and concept relationships. To provide controlled access to, and modification of, the vocabulary, VDS implements two levels of security: Authentication and Authorization.

    For Authentication, VDS supports a simple username/password authentication mechanism and will service MAP over SSL in the future. It can be configured to use an LDAP service to validate the user. The server also supports generic accounts (for which usernames do not exist in LDAP) through its internal authentication module.

    For Authorization, VDS supports access control on all the nodes. An Access Control List (ACL) is modeled within VDS as a set of categories and relation types. FIG. 4G is a diagram illustrating relationships among an access control list and nodes of a tree of the type shown in FIG. 4C. Permission on a node is granted to an action provided one of the following is true:
    • 1. If the access_mode on the node allows the action
    • 2. If the user is in the group that has the required permission on the node.
    • 3. If the permission on the parent of this node satisfies one of the above.
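
    The three conditions can be sketched as follows. This is an illustration only: the Node class, its fields, and the is_permitted function are hypothetical names, and the sketch assumes that the third condition is applied repeatedly up the tree rather than to the immediate parent alone.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class Node:
    name: str
    parent: Optional["Node"] = None
    # Actions that the node's access_mode allows for any user (condition 1).
    access_mode: frozenset = frozenset()
    # Group name -> actions granted to that group on this node (condition 2).
    group_permissions: Dict[str, Set[str]] = field(default_factory=dict)


def is_permitted(node: Node, user_groups: Set[str], action: str) -> bool:
    """Return True if the action is allowed on the node or on an ancestor."""
    current = node
    while current is not None:
        if action in current.access_mode:
            return True
        if any(action in current.group_permissions.get(group, ())
               for group in user_groups):
            return True
        # Condition 3: fall back to the parent (assumed here to recurse upward).
        current = current.parent
    return False


root = Node("Category", group_permissions={"editors": {"write"}})
child = Node("Routers", parent=root, access_mode=frozenset({"read"}))
```

    With these nodes, any user may read "Routers" through its access_mode, while write access is inherited from the parent and only by members of the "editors" group.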
    4.1.6 Internationalization

    The VDS stores names in a double-byte character set. This is achievable if the implementation language supports such character sets (as Java does) or, otherwise, by storing each name in an appropriate data structure in the implementation.

    4.1.7 VDS Events

    Events are the best way to communicate asynchronously with external parties such as a deploy process or client adapters. VDS uses an event mechanism to notify registered clients about any change in the vocabulary data. FIG. 4H is a block diagram of a class hierarchy that may be used to implement an event mechanism, in one embodiment.
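
    One possible shape for such an event mechanism is sketched below. It does not reproduce the class hierarchy of FIG. 4H; VocabularyEvent and EventDispatcher are hypothetical names, and delivery is synchronous for brevity, whereas an asynchronous implementation would typically queue events for a separate delivery thread.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class VocabularyEvent:
    kind: str     # hypothetical event kind, e.g. "concept_added"
    subject: str  # name of the concept or relationship that changed


class EventDispatcher:
    """Keeps a list of registered clients and notifies each one of a change."""

    def __init__(self) -> None:
        self._listeners: List[Callable[[VocabularyEvent], None]] = []

    def register(self, listener: Callable[[VocabularyEvent], None]) -> None:
        self._listeners.append(listener)

    def notify(self, event: VocabularyEvent) -> None:
        # Iterate over a copy so a listener may register others while handling.
        for listener in list(self._listeners):
            listener(event)


received: List[VocabularyEvent] = []
dispatcher = EventDispatcher()
dispatcher.register(received.append)
dispatcher.notify(VocabularyEvent("concept_added", "BetaPerseus 2.0"))
```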

    5.0 Information Object Repository (IOR)

    According to one embodiment, the concept application 408 is an information object repository application. An information object repository (IOR) holds content for documents.

    For example, in this embodiment, the marketing document described above at URL address 'http://www.Enterprise.com/Hello/Chap2/' is in the IOR. The content is stored and retrieved in units of data herein called information objects or "chunks." An IOR application produces documents, such as operating manuals, marketing documents, and Web pages for a Web site, by combining one or more information chunks in the IOR. One or more IOR processes employed by the IOR application manage the IOR by relating the content in the IOR to one or more concepts in the concept database 420 and determine the information chunks to incorporate into documents based on one or more relationships in the concept database 420.

    Using this technique, content originally unrelated and authored over time by many different persons and organizations can be related using the business vocabulary concepts and relationships in the VDS. Thus a person wishing to learn about the BetaPerseus 2.0 can use an IOR application to find all the manuals, press releases, and articles that describe it no matter when or by whom the document was written, as long as the content is registered with the IOR.

    As another example, a system put together by a joint venture can produce a system document that uses descriptions of the components originally written independently by the joint venture partners. In addition, the information chunks supplied to a requestor can be tailored to the person making the request, for example, by providing more technical information to a technical user than to a marketing user. Furthermore, information chunks can easily be reused in several documents. For example, an introductory paragraph for the BetaPerseus 2.0 written for a marketing document can be used in a press release, a data sheet, and the home page for the BetaPerseus 2.0 on the Web site of the enterprise.
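
    The selection and tailoring described above can be sketched as a filter over relationships between chunks and concepts. The Describes record, its audience field, the chunk references, and the select_chunks function are all hypothetical; the source does not specify how the intended audience is recorded.

```python
from typing import List, NamedTuple, Optional


class Describes(NamedTuple):
    chunk_reference: str
    concept: str
    audience: Optional[str] = None  # None: suitable for any requestor (assumed)


def select_chunks(relationships: List[Describes], concept: str,
                  audience: str) -> List[str]:
    """Return references of chunks that describe the concept for this audience."""
    return [r.chunk_reference for r in relationships
            if r.concept == concept and r.audience in (None, audience)]


relationships = [
    Describes("intro-technical", "BetaPerseus 2.0", "technical"),
    Describes("intro-marketing", "BetaPerseus 2.0", "marketing"),
    Describes("datasheet", "BetaPerseus 2.0"),
    Describes("jason-datasheet", "Jason"),
]
technical_view = select_chunks(relationships, "BetaPerseus 2.0", "technical")
```

    The same chunk (here "datasheet") is reused for every audience, while the introductory paragraph differs per requestor.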

    Embodiments are described herein in the context of examples involving generation of electronic documents in the form of Web pages. Embodiments are applicable to generation of any form of electronic document, and are not limited to use with Web sites or Web pages.

    5.1 IOR Creation Layer

    One set of IOR processes is used to manage the registration of information chunks into the IOR and the concept database. This set of IOR processes and the data storage for the IOR comprise the creation layer of the IOR, herein designated IOR-C. FIG. 4B is a block diagram illustrating the IOR-C of the IOR according to one embodiment.

    In this embodiment, the IOR processes are invoked through an interface 462 for the IOR-C 460. For example, an application programming interface of the IOR-C interface 462 is invoked by a content generation application 444. In another example, an IOR administrator performs administration of the IOR through an administrator user interface of the IOR-C interface 462. In other embodiments the IOR processes execute under control of a standalone IOR batch or user-interactive application.

    The IOR-C interface 462 includes methods to access the business vocabulary development server (VDS) 410 of the enterprise through the concept access API 432. As shown in FIG. 4B, this embodiment of the VDS 410b has an external concept access API 432 which uses a concept cache server 440 to speed retrievals from the VDS 410b. The concept cache server 440 uses a cache memory to temporarily store a subset of the concepts and relationships in the concept database (420 in FIG. 4A) of the VDS 410b.
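
    A read-through cache of this kind can be sketched as follows. ConceptCache and its fetch callback are hypothetical names, and least-recently-used eviction is an assumption; the source states only that a subset of concepts and relationships is held temporarily in cache memory.

```python
from collections import OrderedDict
from typing import Any, Callable


class ConceptCache:
    """Holds a bounded subset of concepts; misses are fetched from the VDS."""

    def __init__(self, fetch: Callable[[str], Any], capacity: int = 128) -> None:
        self._fetch = fetch          # stands in for a call to the concept database
        self._capacity = capacity
        self._entries: "OrderedDict[str, Any]" = OrderedDict()

    def get(self, name: str) -> Any:
        if name in self._entries:
            # Cache hit: mark the entry as most recently used.
            self._entries.move_to_end(name)
            return self._entries[name]
        value = self._fetch(name)
        self._entries[name] = value
        if len(self._entries) > self._capacity:
            # Evict the least recently used entry (assumed policy).
            self._entries.popitem(last=False)
        return value


database_calls = []


def fetch_from_database(name: str) -> dict:
    database_calls.append(name)
    return {"name": name}


cache = ConceptCache(fetch_from_database, capacity=2)
cache.get("BetaPerseus 2.0")
cache.get("BetaPerseus 2.0")  # served from the cache, no second database call
```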

    The IOR-C interface 462 includes methods to store and retrieve information chunks in a content management system (CMS) such as in a local CMS 452 or over the network 401 in a remote CMS 458. A CMS includes persistent storage where an information chunk is stored. For example, persistent content store 454 includes information chunk 464.

    A CMS is capable of managing a variety of types of information in each information chunk. For example, an information chunk may comprise a block of text, an application program, a query for a database, a vector graphic, an image, audio data, video data, and other binary data. The block of text may be text that represents code for a compiler, such as C code, and formatted text, such as text in the Hypertext Markup Language (HTML) or in the Extensible Markup Language (XML), as well as unformatted text using one of several character codes, such as ANSI one-byte and Unicode four-byte codes.

    In some embodiments, the CMS comprises the local operating system directory structure. For example, different information chunks are simply kept in different files with different file extensions for the different types of data, and the files are organized into one or more directories in a hierarchy of directories and files. In another embodiment, the CMS is a database server for managing a database of information chunks.
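
    A directory-based CMS of the kind described above can be sketched in a few lines. DirectoryChunkStore is a hypothetical name, and the sketch assumes that the chunk reference is a path relative to a root directory; it does not validate references or guard against concurrent writers.

```python
import tempfile
from pathlib import Path


class DirectoryChunkStore:
    """Keeps each information chunk in its own file under a root directory."""

    def __init__(self, root: str) -> None:
        self._root = Path(root)

    def store(self, reference: str, data: bytes) -> str:
        path = self._root / reference
        # Create intermediate directories of the hierarchy as needed.
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)
        return reference

    def retrieve(self, reference: str) -> bytes:
        return (self._root / reference).read_bytes()


with tempfile.TemporaryDirectory() as root:
    store = DirectoryChunkStore(root)
    reference = store.store("devices/Perseus/Intro5.txt", b"Introductory text")
    content = store.retrieve(reference)
```

    The file extension carries the data type, as in the text above, and the relative path serves as the chunk's unique reference within this store.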

    It is not necessary that all the information chunks be in a single CMS on one computer device. Data integration tools 456 are commercially available for associating data in one data store or content management system, such as CMS 452, with data in another data store or CMS, such as remote CMS 458. The role of tools 456 is to be a mediator that at run time resolves the semantic and syntax integration issues between data stores that have similar or related data. Appropriate data integration tools also can associate data that is in any other location that can be referenced, i.e., any object that exists, whether it is in a CMS or not, e.g., LDAP directories, Web services, application versioning, network addresses from DNS, physical objects such as bar codes, etc. In the depicted embodiment, the methods of the IOR-C interface access the data integration tools 456. In an embodiment with all the information chunks stored in a single local CMS, the data integration tools 456 are not included, and the methods of the IOR-C interface access the local CMS 452 directly.

    Each information chunk in the CMS is identified uniquely by an information chunk reference 466. Depending on the CMS employed, the reference may be a file name, a file name including one or more directories in the hierarchy of directories, a network resource address, a uniform resource locator (URL) address, a record identification in a predetermined database, or a record identification in a predetermined content management system.

    FIG. 4B also shows a process 470 for generating pages 480 for a Web site on Web server 402 using the IOR-C interface to access the VDS 410 and the persistent content store 454. The process 470 is described in more detail in a later section.

    The IOR-C interface 462 includes methods to manage the IOR by relating the information chunks in the CMS to one or more concepts in the concept database 420. The IOR-C interface includes methods to generate and retrieve information object concepts in the concept database associated with the information chunks. The IOR-C interface also includes methods to generate and retrieve relationships between the information object concepts and other concepts in the concept database.

    5.2 Information Objects and Relationships

    For each information chunk that is registered in the IOR 460 by a method of the IOR-C interface 462, a particular information object concept is added to the concept database of the VDS 410b. In one embodiment, an information object category is added to the Vocabulary Table (such as the sample Vocabulary Table listed in Table 1). The particular information object is a child of the information object category and is represented as a new row in an Information Object Table. The concept cache server 440 or concept access API 432 is invoked by the IOR-C method to add this concept to the database.
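
    The registration step can be sketched as follows. The ConceptDatabase structure, the category name "InformationObject", the "child_of" relation label, and the register_chunk function are hypothetical stand-ins for the Vocabulary Table, the Information Object Table, and the IOR-C method; each row is keyed here by the chunk's reference.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple

INFORMATION_OBJECT_CATEGORY = "InformationObject"  # hypothetical category name


@dataclass
class ConceptDatabase:
    vocabulary: Set[str] = field(default_factory=set)
    information_objects: Dict[str, dict] = field(default_factory=dict)
    relationships: List[Tuple[str, str, str]] = field(default_factory=list)


def register_chunk(db: ConceptDatabase, reference: str, description: str,
                   creation_date: str) -> str:
    """Add an information object concept for a chunk already stored in the CMS."""
    # Ensure the information object category exists in the vocabulary.
    db.vocabulary.add(INFORMATION_OBJECT_CATEGORY)
    # The particular information object becomes a new row in its table.
    db.information_objects[reference] = {
        "description": description,
        "creation_date": creation_date,
    }
    # Record the particular object as a child of the category.
    link = (reference, "child_of", INFORMATION_OBJECT_CATEGORY)
    if link not in db.relationships:
        db.relationships.append(link)
    return reference


db = ConceptDatabase()
register_chunk(db, "http://Enterprise.com/datasheets/DS33/",
               "BetaPerseus 2.0 data sheet table", "4/12/00")
```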

    Table 13 lists sample entries in a hypothetical Information Object Table according to this embodiment. In this embodiment, the information object concept has a name that is the unique reference for the corresponding information chunk in the CMS. As shown in Table 13, the unique reference is a URL in this embodiment.
    TABLE 13
    The Information Object Table

    Name                                                          Description                                                             Creation Date
    http://www.Enterprise.com/literature/devices/catalog/Chap2/   marketing document for Perseus routers                                  9/19/00
    http://www.Enterprise.com/Hello/Chap2/                        marketing document for Perseus routers                                  9/20/00
    ftp://Enterprise.com/literature/devices/Perseus/Intro17.txt/  BetaPerseus introductory paragraph for silver partner marketing person  12/12/00
    ftp://Enterprise.com/literature/devices/Perseus/Intro5.txt/   BetaPerseus 2.0 introductory paragraph for technical person             4/12/00
    http://Enterprise.com/datasheets/DS33/                        BetaPerseus 2.0 data sheet table                                        4/12/00
    http://Enterprise.com/datasheets/DS12/                        Jason data sheet table                                                  4/12/00