Access augmentation or optimizing

Dynamic value mechanism for computer storage container manager enabling access of objects by multiple application programs

5652879

Abstract

Computer apparatus stores a subject value and a chain of sequentially associated value handlers for the subject value. The chain includes a top value handler and a bottom value handler, each of the value handlers in the chain except the bottom value handler invoking the respective next value handler when invoked, the bottom value handler performing an operation on the subject value when invoked. The value operations can be data read operations, data write operations, etc., and the value handlers in the chain can perform data transformations and/or data redirections, transparently to its caller.


Claims

We claim:

1. A computer system, comprising:

storage means for storing a software object and an indication of whether the software object is registered as a real value or a dynamic value, the storage means also for storing a data access handler and for storing a chain of dynamic value handlers if the software object is the dynamic value wherein each dynamic value handler calls a subsequent dynamic value handler in the chain and wherein a bottom dynamic value handler in the chain calls the data access handler;

processor that determines whether the software object is registered as a real value or a dynamic value, the processor executing the chain of dynamic value handlers if the software object is registered as the dynamic value such that the data access handler returns one or more data values from the software object to the chain of dynamic value handlers and each dynamic value handler in the chain performs a data conversion operation on the data values, the processor executing only the data access handler to obtain the data values from the software object if the software object is registered as the real value.

2. The computer system of claim 1, wherein the processor assembles the chain of dynamic value handlers for the software object into the storage means in response to a request for access to the software object.

3. The computer system of claim 2, wherein the chain of dynamic value handlers is specified by a hierarchical tree structure that corresponds to the software object.

4. The computer system of claim 3, wherein the processor executes the chain of dynamic value handlers and the data access routine according to a calling sequence specified by the hierarchical tree structure.

5. The computer system of claim 4, wherein the processor determines the calling sequence by performing a depth-first post-order walk on the hierarchical tree structure.

6. The computer system of claim 1, wherein one of the data conversion operations comprises a data decompression operation on the data values obtained from the software object.

7. The computer system of claim 1, wherein one of the data conversion operations comprises a format conversion operation on the data values obtained from the software object.

8. The computer system of claim 1, wherein the storage means comprises a disk and the data access handler comprises a read file routine for accessing a file system on the disk.

9. The computer system of claim 1, wherein the storage means comprises a memory and the data access handler causes the processor to perform a read operation to the memory.

10. The computer system of claim 1, wherein the storage means comprises a persistent storage means and the data access handler causes the processor to perform an input/output to the persistent storage means.

11. A method for accessing a software object, comprising the steps of:

registering the software object as a dynamic value if a data access operation and a chain of one or more data conversion operations are required to access the software object;

registering the software object as a real value if only the data access operation is required to access the software object;

determining whether the software object is registered as a real value or a dynamic value in response to a request to access the software object;

if the software object is registered as the dynamic value, then performing the data access operation on the software object to obtain one or more data values from the software object and then performing each data conversion operation in the chain on the data values;

if the software object is registered as the real value, then performing the data access operation on the software object to obtain the data values from the software object.

12. The method of claim 11, wherein each data conversion operation in the chain is performed by a corresponding dynamic value handler of a chain of dynamic value handlers for the software object.

13. The method of claim 12, wherein the step of registering the software object as a dynamic value includes the step of defining a hierarchical tree structure corresponding to the software object such that the hierarchical tree structure specifies the chain of dynamic value handlers.

14. The method of claim 13, wherein the step of performing each data conversion operation in the chain includes the step of determining a calling sequence for the chain of dynamic value handlers wherein the calling sequence is specified by the hierarchical tree structure.

15. The method of claim 14, wherein the step of determining the calling sequence for the chain of dynamic value handlers includes the step of performing a depth-first post-order walk on the hierarchical tree structure.

16. The method of claim 11, wherein the data conversion operations include a data decompression operation on the data values from the software object.

17. The method of claim 11, wherein the data conversion operations include a format conversion operation on the data values from the software object.

18. The method of claim 11, wherein the data access operation comprises a file system access operation.

19. The method of claim 11, wherein the data access operation comprises a memory read operation.

20. The method of claim 11, wherein the data access operation comprises an input/output operation.

21. An apparatus for accessing a software object, comprising:

means for registering the software object as a dynamic value if a data access operation and a chain of one or more data conversion operations are required to access the software object;

means for registering the software object as a real value if only the data access operation is required to access the software object;

means for determining whether the software object is registered as a real value or a dynamic value in response to a request to access the software object;

means for performing the data access operation on the software object to obtain one or more data values from the software object and then performing each data conversion operation in the chain on the data values if the software object is registered as the dynamic value;

means for performing the data access operation on the software object to obtain the data values from the software object if the software object is registered as the real value.

22. The apparatus of claim 21, wherein each data conversion operation in the chain is performed by a corresponding dynamic value handler of a chain of dynamic value handlers for the software object.

23. The apparatus of claim 22, wherein the means for registering the software object as a dynamic value includes means for defining a hierarchical tree structure corresponding to the software object such that the hierarchical tree structure specifies the chain of dynamic value handlers.

24. The apparatus of claim 23, wherein the means for performing each data conversion operation in the chain includes means for determining a calling sequence for the chain of dynamic value handlers wherein the calling sequence is specified by the hierarchical tree structure.

25. The apparatus of claim 24, wherein the means for determining the calling sequence for the chain of dynamic value handlers includes means for performing a depth-first post-order walk on the hierarchical tree structure.

26. The apparatus of claim 21, wherein the means for performing each data conversion operation includes means for performing a data decompression operation on the data values from the software object.

27. The apparatus of claim 21, wherein the means for performing each data conversion operation includes means for performing a format conversion operation on the data values from the software object.

28. The apparatus of claim 21, wherein the means for performing the data access operation comprises means for performing a file system access operation.

29. The apparatus of claim 21, wherein the means for performing the data access operation comprises means for performing a memory read operation.

30. The apparatus of claim 21, wherein the means for performing the data access operation comprises means for performing an input/output operation.


Description

BACKGROUND

Increasingly, documents and other collections of stored information are made up of multiple content elements, such as text, tables, images, formatting information, mathematical equations and graphs. Often content is created using one application program and then included in documents created by other applications. Subsequently, content elements may be copied out of a document and used in yet another document, and so on.

In the past, different applications typically had no way to exchange multiple content elements, unless they had a "private contract" about the format to be used. Furthermore, one application typically had no way to find the content elements in another application's document, so typically it was not able to obtain content elements from the other application's documents even if it knew the format. Moreover, every application developer who wanted to store multiple content elements in a document typically had to develop a proprietary object storage mechanism.

The use of multiple content elements in a document implicates at least two difficult issues: where each element is located and what the format of the data is. Regarding the first of these issues, it would be desirable if the data in a particular element could be stored in memory, in a local persistent storage device, across the network, or even created dynamically, all in a manner which is transparent to the application program which is operating on an element. In this way the limited resources available to application program developers can be directed toward enhancement of functionality rather than dealing with multiple types of storage devices.

Similarly, with regard to the second issue, it would be desirable if each different content element could have stored in association with it all of the routines which are needed to manipulate it, again, transparently to the application program. This, too, would free up developers' resources for more useful purposes.

In a general way, an individual developer might obtain some of the transparency described above by programming the application using an object-oriented programming language such as C++. Object-oriented programming is described in many references, including, for example, G. Booch, "Object-Oriented Design With Applications" (Benjamin/Cummings Publishing Company: 1991), incorporated herein by reference. Such languages often support the grouping together of both an item of data and a set of "methods" to manipulate the data, in a single "object". These languages also often include, through a mechanism known as "classing" and "sub-classing" of objects, a way to define inheritance relationships. In an inheritance relationship, if a routine to perform a certain type of manipulation is not defined for a particular object in a particular class, then the corresponding routine in the parent class is used instead.

While these languages can be used to address the problems described above for handling multiple content elements, it is not clear how that can be done. Certainly the languages themselves do not provide guidance on how they can be used for such purposes. For example, the inheritance mechanism in C++ is a compile-time mechanism.

The languages also permit a method for a given object (whether the method is defined specifically for the object's own class or is inherited from a superclass) to invoke other methods for operating on the given object. The method can also invoke whatever corresponding method is defined for the object's superclass, for operating on the given object, and need not know exactly which routine that might be. While these features permit some degree of hiding or encapsulation, they do not provide enough flexibility for easy development of application programs which manipulate multiple content elements since to some degree, the application program still needs to know the type of the object it is operating on.

SUMMARY OF THE INVENTION

According to the invention, roughly described, a set of procedures is defined which permit substantially arbitrary composability of chains of handlers. The procedures follow rules which render them independent of the "type" of the value for which they are called, as viewed by the caller. Thus application programs can be written at only a high level of functionality, without needing to be concerned with the differences in the way different types of values need to be handled.

The procedures determine which handler to call to perform a given operation in dependence upon the type of the object for which the procedure was invoked. The handlers, too, are relatively easy to write because like the application program, the rules permit the handlers to call the very same set of procedures (recursively if the very same procedure is called) as are available to the application program. Thus like the application program, handlers too can be written without knowledge of any characteristics of the object for which they are invoked other than the characteristics defining the type for which the handler is specifically written. For example, a read handler for a type which defines a data compression/decompression algorithm need never know where the data is physically located since it merely calls the predefined read value procedure to obtain the data to decompress.
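The decompression example above can be sketched in C as follows. This is only an illustrative sketch under stated assumptions: the names Value, read_value_data and rle_read_handler, the run-length encoding, and the fixed scratch buffer are all hypothetical and are not taken from the Container Manager API. The point shown is that the handler obtains its input by calling the same read procedure the application calls, so it never learns where the data physically resides.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical value record. A real value holds bytes directly; a dynamic
   value has a handler and is layered on a base value. */
typedef struct Value Value;
typedef size_t (*ReadHandler)(const Value *v, unsigned char *buf, size_t n);
struct Value {
    ReadHandler handler;          /* NULL for a real value */
    const Value *base;            /* value this dynamic value is layered on */
    const unsigned char *bytes;   /* data of a real value */
    size_t length;
};

/* Hypothetical stand-in for the predefined read procedure: the same entry
   point is used by the application and by handlers. */
static size_t read_value_data(const Value *v, unsigned char *buf, size_t n)
{
    assert(v != NULL && buf != NULL);
    if (v->handler != NULL)
        return v->handler(v, buf, n);      /* dynamic value: run its handler */
    size_t len = v->length < n ? v->length : n;
    memcpy(buf, v->bytes, len);            /* real value: plain data access */
    return len;
}

/* Read handler for an assumed run-length-encoded type. The stored form is a
   sequence of (count, byte) pairs. The handler does not know where the
   packed bytes live; it just reads its base value. */
static size_t rle_read_handler(const Value *v, unsigned char *buf, size_t n)
{
    unsigned char packed[64];              /* sketch only: small fixed scratch */
    size_t got = read_value_data(v->base, packed, sizeof packed);
    size_t out = 0;
    for (size_t i = 0; i + 1 < got; i += 2)
        for (unsigned char k = 0; k < packed[i] && out < n; k++)
            buf[out++] = packed[i + 1];
    return out;
}
```

In this sketch an application would call read_value_data() on the dynamic value and receive the expanded bytes; replacing the base value with one that is itself dynamic (for example, one that redirects to a file) would require no change to the handler.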

Types can be defined in a tree structure. This further simplifies the writing of handlers since the different characteristics of a type can be divided into many small components, each defined by a different sub-type on the tree. Thus each handler can be written to accomplish only a limited objective (for example an I/O redirection or a data transformation). The predefined procedures automatically follow the chain of handlers defined by a type tree, by knowing where in the chain a given caller of the procedure is. Neither the application program nor the handlers themselves need keep track of this information.

Additionally, the predefined procedures make no assumptions about the types in a type tree. An application developer can define novel types as required by dividing them into subtypes (if desired) and writing handlers for each subtype. As mentioned, the complexity of the handlers depends only on the complexity of the transformation or redirection which they are to individually perform, not on the complexity of either the type tree or the procedures which implement the present invention. So long as the handlers follow certain rules of good behavior, the predefined procedures will be able to follow any such user-defined type tree. Additional procedures are provided for associating the individual handlers to their corresponding types (subtypes), and for building the type trees themselves.

To implement the above procedures, computer apparatus stores a subject value and a chain of sequentially associated value handlers for the subject value. The chain includes a top value handler and a bottom value handler, each of the value handlers in the chain except the bottom value handler invoking the respective next value handler when invoked, the bottom value handler performing an operation on the subject value when invoked. The value operations can be data read operations, data write operations, etc., and the value handlers in the chain can perform data transformations and/or data redirections, transparently to its caller.
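The top-to-bottom chain just described might be sketched in C as below. The sketch rests on assumptions: ValueHandler, read_bottom, read_upcase and read_value are hypothetical names, and the upper-casing step merely stands in for any data transformation. The caller invokes only the top handler; each handler other than the bottom one invokes the next, and the bottom handler alone touches the subject value.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical handler record: one link in a chain of value handlers. */
typedef struct ValueHandler ValueHandler;
struct ValueHandler {
    /* Reads up to n bytes of the subject value into buf; returns bytes read. */
    size_t (*read)(const ValueHandler *self, char *buf, size_t n);
    const ValueHandler *next;   /* next handler down the chain; NULL at bottom */
    const char *subject;        /* subject value; used only by the bottom handler */
};

/* Bottom handler: performs the operation on the subject value itself. */
static size_t read_bottom(const ValueHandler *self, char *buf, size_t n)
{
    assert(self->next == NULL);
    size_t len = strlen(self->subject);
    if (len > n)
        len = n;
    memcpy(buf, self->subject, len);
    return len;
}

/* Transforming handler: invokes the next handler, then converts the bytes. */
static size_t read_upcase(const ValueHandler *self, char *buf, size_t n)
{
    size_t got = self->next->read(self->next, buf, n);
    for (size_t i = 0; i < got; i++)
        buf[i] = (char)toupper((unsigned char)buf[i]);
    return got;
}

/* The caller's entry point always invokes the top handler of the chain. */
static size_t read_value(const ValueHandler *top, char *buf, size_t n)
{
    return top->read(top, buf, n);
}
```

Because every link exposes the same read signature, further transforming handlers could be inserted between the top and bottom links without changing the caller.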

The dynamic value chain is not stored in persistent storage; rather it is created when an application program desires to perform a value operation on the subject value. The subject value has a type associated with it which determines the value handlers to be placed in the chain. The chain can have more than one value handler in it for a given value operation if the type associated with the subject value is made up of a hierarchy of sub-types.
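As a rough illustration of how a hierarchy of sub-types could yield an ordered sequence of handlers, the following C sketch performs the depth-first post-order walk recited in claims 5, 15 and 25 over a small type tree. The TypeNode layout, the MAX_BASES limit and the type names used with it are assumptions made for the example; the sketch only records the visit order and does not claim how the embodiment maps that order onto the top and bottom of the chain.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_BASES 4

/* Hypothetical type-tree node: a type and the sub-types it is built from. */
typedef struct TypeNode {
    const char *name;
    const struct TypeNode *bases[MAX_BASES];
    size_t base_count;
} TypeNode;

/* Depth-first post-order walk: visit every base type first, then the type
   itself. Appends the visited names to out[] and returns the new count. */
static size_t post_order(const TypeNode *t, const char **out,
                         size_t count, size_t cap)
{
    for (size_t i = 0; i < t->base_count; i++)
        count = post_order(t->bases[i], out, count, cap);
    assert(count < cap);            /* caller must supply enough room */
    out[count++] = t->name;
    return count;
}
```

For a hypothetical "Text" type built on "Compressed", which is in turn built on "ExternalFile", the walk visits ExternalFile, then Compressed, then Text.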

BRIEF DESCRIPTION OF THE DRAWINGS

The invention will be described with respect to particular embodiments thereof, and reference will be made to the drawings, in which:

FIGS. 1 and 2 illustrate type trees;

FIG. 3 is a block diagram of a hardware computer system platform on which the invention might be used;

FIG. 4 is an overall block diagram of major data structures which are created in main memory of the computer system of FIG. 3 during the pendency of a session;

FIG. 5 illustrates the structure of in-memory objects which are created by the dynamic value mechanism according to an embodiment of the invention;

FIG. 6 illustrates the same structure as FIG. 5 using a simplified notation;

FIG. 7 illustrates a type object using the notation of FIG. 6; and

FIG. 8 is a flowchart of a CMReadValueData() routine used in an embodiment of the invention.

DETAILED DESCRIPTION

The embodiment described herein takes the form of a Container Manager and its associated data structures which can be used by developers of a wide variety of types of application programs. The Container Manager includes a number of C language type definitions and a number of procedures for implementing the functionality provided by the Container Manager. Together they provide a common application program interface (API) for the different types of application programs.

The structures are described first with respect to their logical organization and subsequently their physical organization in the storage apparatus managed by the container manager. That is, they will be described first with respect to the view which the container manager software provides to an application programmer via the API, and subsequently with respect to the way that logical organization is actually implemented in the present embodiment. While many of the advantages of the present invention derive from the logical organization, it will be apparent that such logical organization implies certain physical structures which are required to maintain the metaphor as viewed by the application developer. The physical organization described hereinafter includes many inventive aspects of the invention, but it is by no means the only physical structure which can support the logical organization presented to the application developer.

TABLE OF CONTENTS

I. GENERAL OVERVIEW

A. Overview of Container Manager Entities

B. Overview of Types and Dynamic Values

C. Format Overview

D. Format Definition

II. IMPLEMENTATION

A. Hardware

B. In-Memory Data Structures

C. Routines

1. Session Operations

2. Object Operations

3. Type and Property Operations

4. Value Operations

III. DYNAMIC VALUE HANDLERS

A. Sample Session Flow

B. Sample Value Handlers

APPENDIX A

APPENDIX B

APPENDIX C

APPENDIX D

I. GENERAL OVERVIEW

In the present embodiment, an object is a collection of data that "hangs together" and that can be referenced by other data. Objects can be simple or complex, small (a few bytes) or large (up to 2^64 bytes). Compared with objects in languages such as C++, objects of the Container Manager are typically larger and more complex, because they represent user meaningful content elements, rather than the atoms and molecules used to build this content.

For example, a sequence of bytes of data would not by itself be an object, because we can only understand the bytes if we know how they will be used. A paragraph, an image, etc. can be an object if it contains enough information so that we know how to interpret it. Typically an object contains information about what kind of object it is, and some data, which provides the content of the object. In this description, the information "about" the object is called metadata, and the content of the object is called its value.

The Container Manager groups objects in an object container, which is some form of data storage or transmission (such as a file, a piece of RAM, or an inter-application message) that is used to hold one or more objects (both their metadata and their values). These containers are defined by a set of rules for storing multiple objects in such a container, so that software that understands the rules can find the objects, figure out what kind of objects they are, and use them correctly. The rules accommodate a wide variety of different kinds of objects, different ways that applications want to use objects, and system considerations about how data can be stored.

The Container Manager provides a container definition that can conveniently, efficiently, and reliably hold all the different kinds of objects that users and applications want to group together, store, and exchange. The Container Manager does not define how any given object is structured internally (within its value) so as not to limit the formats which an application developer may want to define. Objects stored in a container can have proprietary or standard formats, they can be designed to use the Container Manager mechanisms or they can be completely ignorant of the existence of the Container Manager.

A. Overview of Container Manager Entities

The Container Manager manipulates and stores data using primary and secondary entities. The primary entities used by the Container Manager are containers, objects, properties, values, and types.

Containers. Every object is in some container. An object consists of a set of properties. The properties are not in any particular order. Each property consists of a set of values with distinct types. The values are not in any particular order. Every object must have at least one property, and that property must have at least one value. Each value consists of a variable length sequence of bytes.
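The logical model in the preceding paragraph can be pictured with a few C structures. This is a sketch under assumptions, not the Container Manager's own type definitions: the struct and field names and the object_is_well_formed() check are hypothetical, and the check reads the text as requiring every property to hold at least one value, with no two values of one property sharing a type.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical in-memory picture of the logical model. */
typedef struct {
    unsigned type_id;             /* stands in for a reference to a type */
    const unsigned char *bytes;   /* variable-length sequence of bytes */
    size_t length;
} Value;

typedef struct {
    unsigned property_id;         /* stands in for a reference to a property */
    const Value *values;          /* unordered set of values */
    size_t value_count;
} Property;

typedef struct {
    unsigned object_id;
    const Property *properties;   /* unordered set of properties */
    size_t property_count;
} Object;

/* Checks the stated rules: at least one property, at least one value per
   property, and distinct types among the values of a property. */
static bool object_is_well_formed(const Object *obj)
{
    assert(obj != NULL);
    if (obj->property_count == 0)
        return false;
    for (size_t p = 0; p < obj->property_count; p++) {
        const Property *prop = &obj->properties[p];
        if (prop->value_count == 0)
            return false;
        for (size_t i = 0; i < prop->value_count; i++)
            for (size_t j = i + 1; j < prop->value_count; j++)
                if (prop->values[i].type_id == prop->values[j].type_id)
                    return false;
    }
    return true;
}
```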

The Container Manager knows very little about a container beyond the objects in it. However, the container always contains a distinguished object, and applications can add arbitrary properties to that object, so applications can specify further information about the container if they wish.

Containers are often files, but they can also be many other forms of storage. For example, in various applications developers already support the following types of containers: blocks of memory, the clipboard, network messages, and Container Manager values. Undoubtedly other types of containers will be useful as well.

Objects. Each Container Manager object has a persistent ID which is unique within its container. Other than that, objects don't really exist independent of their properties. An object contains no information beyond what is stored in its properties.

Properties. A property defines a role for a value. Properties are like field names in a record or struct, with two differences. First, properties can be added freely to an object, so an application should never assume an object only has the properties it knows about. Second, property names are globally unique, so that they can never collide when various different applications add properties to the same object. This also means that the same property name always means the same thing, no matter what object it is in. Properties are distinct from types, just as field names are distinct from the data type of the field.

For example, different properties of an object might indicate the name of an object, the author of the object, a comment, a copyright notice, and so on. These different properties could all have values of the same type: string.

Conversely, a property indicating the date created might have a string, Julian day, or OSI standard date representation. These different formats would not be indicated by the property, but by the type (see below).

Values. Values are where the data is actually stored. In terms of physical location, this data might actually be stored anywhere in a container. In fact, it can be broken up into any number of separate pieces, and the pieces can be stored anywhere. (See the discussion of value segments below.)

Each value may range in size from 0 bytes to 2^64 bytes, although that range can differ in a different embodiment. The overhead per value varies depending on the circumstances. For an object with a single value, the typical overhead will be 21 bytes. For a small value which is one of several values associated with a property, the overhead can be as low as five bytes.

Types. The type of a value describes the format of that value. Types record the structure of a value, whether it is compressed, what its byte ordering is, and so on. The Container Manager provides an open-ended mechanism, so that types can be extended to include whatever metadata is required.

To continue the example above, the type of a string value could indicate the alphabet, whether it was null terminated, and possibly other information (such as the intended language). It might also indicate that the string was stored in a compressed form, and could indicate the compression technique, and the dictionary if one was required. If the string used multi-byte characters, and the byte-ordering was not defined by the alphabet, the type could indicate the byte-ordering within the characters.

The Container Manager defines an inheritance mechanism to make building complex types like this efficient. The structure of types is tied into the mechanism for accessing values, so that the type associated with a value causes the appropriate code to be invoked to access the value, decompress it, byte-swap it, and so on. The specific mechanism for doing this is referred to herein as Dynamic Values.

Secondary Entities. In addition to the primary entities manipulated by the Container Manager, there are several additional entities that play supporting roles in the Container Manager design. These entities are important to fully understand how the Container Manager works, but they do not significantly change the picture given above.

Type and property descriptions. Each property associated with a value is actually a reference to a property description. Similarly, the type of a value is actually a reference to a type description. These type and property descriptions are objects, and their IDs are drawn from the same name-space as other object IDs.

Many type and property descriptions will simply consist of the globally unique name of the type or property. To continue the example above further, the type of a string of 7-bit ASCII, not compressed or otherwise transformed, would simply be described by a globally unique name. This would allow applications to recognize the type.

References to type and property descriptions are distinct from references to ordinary objects in the API to allow language type checking to catch errors in the manipulation of type and property references. However, type and property references can still be passed to the Container Manager routines which manipulate user-defined objects and values, so that value manipulation can be done on types and properties in the same manner as it can be done on user-defined objects.

Globally unique names. Globally unique names are public or private identifiers in a format defined by the ISO 9070 Public Text Identifier standard. They are simply strings written in a subset of 7-bit ASCII. They begin with a name that is assigned by a naming authority designated by ISO (companies can easily register as naming authorities). After this come additional segments, as determined by the naming authority, each of which is unique in the context of the previous segments.

The most common globally unique names will be generated by system vendors or commercial application developers, and may be registered. However, in many cases names will be generated by vertical application developers to record their local types and properties. To meet this need, the naming rules allow for local creation of unregistered unique names, for example by using a product serial number as one of the name segments.

IDs. The Container Manager assigns each object a persistent ID that is unique within the container in which the object is created. These IDs are never reused once they have been assigned, so even if an object is deleted, its ID will never be reassigned.

These IDs are obviously essential to the functioning of the Container Manager format, but they do not appear directly in the API. The only point at which an application actually deals with anything corresponding to an ID is when it needs to store an object reference into a value, or find the object corresponding to a reference retrieved from a value. Even in this case, however, the API does not give the application direct access to an object ID, but only to a token that corresponds to the ID in the context of that particular value. This hiding of actual IDs permits the Container Manager to perform reference tracking.

Refnums. In the API, types, properties, and objects are referred to using opaque reference numbers provided by the Container Manager. The refnums are much more convenient to use than IDs because they are unique within the session, while an ID would need to be used together with a container reference. Since they are opaque, they allow implementations of the API that support caching schemes in which only portions of the container metadata are in memory at any given time.

Refnums have no persistent meaning, so they cannot be stored in values as references to other values. The tokens provided by the reference calls must always be used for persistent references.

Dynamic values. As mentioned above in the discussion on "Types", a Container Manager value can be compressed, encrypted, byte-swapped, etc. during read/write. Furthermore, these transformations can be composed together to form a chain of transformations.

In addition to data transformation, the same mechanism also supports I/O redirection. In this case a value actually stored in a container is a description of how to find the data, rather than the data itself. Such descriptions can be as simple as references to files, or to objects in another container, or as complex as queries that cause data to be retrieved from a database.

Both I/O transformations and I/O redirection are carried out implicitly by the Container Manager library, using "handlers" determined by the type of the value. These handlers are attached to temporary entities called dynamic values created by the library. Dynamic values are never visible to the application, and have no persistent meaning. The Dynamic Value mechanism is described in more detail below.

Value segments. To support interleaving and other uses that require breaking a value up into pieces, the Container Manager allows a value to be stored as multiple segments stored at different locations in the container. These segments are not visible at the API, since the Container Manager routines concatenate them to create a single stream of bytes.

The Container Manager also takes advantage of value segments to represent insertions, deletions, and overwrites of contiguous bytes in a value. This allows the Container Manager to represent these operations directly in recording updates, rather than having to create a new copy of the value.
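How scattered segments can be presented as one stream of bytes might be sketched as follows. The Segment record and read_segments() are hypothetical, and the container is modeled as a flat byte array purely for illustration; the actual on-disk representation is not described in this section.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical segment record: one contiguous piece of a value. */
typedef struct {
    size_t offset;   /* where the piece starts within the container */
    size_t length;   /* number of bytes in the piece */
} Segment;

/* Reads up to n bytes of the logical value, starting at logical position
   pos, by walking the segments in order. Returns the bytes copied. */
static size_t read_segments(const unsigned char *container,
                            const Segment *segs, size_t seg_count,
                            size_t pos, unsigned char *buf, size_t n)
{
    assert(container != NULL && buf != NULL);
    size_t copied = 0;
    for (size_t i = 0; i < seg_count && copied < n; i++) {
        if (pos >= segs[i].length) {       /* start lies beyond this segment */
            pos -= segs[i].length;
            continue;
        }
        size_t avail = segs[i].length - pos;
        size_t take = (n - copied < avail) ? n - copied : avail;
        memcpy(buf + copied, container + segs[i].offset + pos, take);
        copied += take;
        pos = 0;                           /* later segments start at byte 0 */
    }
    return copied;
}
```

In this picture an insertion into the middle of a value would amount to splitting one segment entry and adding a new entry between the halves, with no existing bytes copied.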

Handlers. The Container Manager makes use of dynamically linked handlers supplied by the execution environment for two reasons: portability and extensibility. The use of handlers means that the Container Manager library is almost trivially portable, since all the system dependencies are in the handlers. The Container Manager library is also easily extensible, with the addition of newly written handlers, since the handler interfaces are carefully designed to provide cleanly encapsulated abstractions.

The Container Manager employs session handlers, container handlers and value handlers. Session handlers are global to the session as a whole. These include allocating and de-allocating memory, and reporting errors. Container handlers perform all of the actual I/O to containers. These handlers map I/O to the underlying storage in a way that depends on the container type. Container handlers basically provide a stream I/O interface to the container storage.

Value handlers implement both I/O transformations and value indirection. These handlers are determined by the type of each value. New handlers to carry out new types of data transformations or support new types of indirect values can be written at any time.

These handlers are invoked entirely by the library. The accessing application does not need to know that it is using handlers to access the value. Of the three kinds of handlers used by the Container Manager, only the value handlers are described in detail herein since they are the only ones which are important to an understanding of the invention.

B. Overview of Types and Dynamic Values

The Container Manager provides a very powerful mechanism for transforming values during I/O, and for following indirect references. The Container Manager type mechanisms are probably best explained in terms of some usage examples.

Usage example 1--External File. Suppose an application developer would like to have a value that represents a file. When the application calls the Container Manager's Write Value Data procedure (CMWriteValueData) for writing data to the value, we want to actually perform I/O to the file.

The mechanisms described herein allow us to store a reference to the file in a value. When the value is used, an I/O redirection is set up, without the application being aware of it.

Note that this raises the thorny problem of platform-independent file references. The Container Manager avoids this problem. It allows any number of different types of references, implemented by handlers.

Usage Example 2--Compressed Value. Suppose an application developer would like to compress data as it is written to the value, and decompress it as it is read out. In addition to maintaining the data in the value itself, this compression may depend on a dictionary associated with the type of value. Furthermore, the compression routine may need to maintain a state, since the compression at any point may depend on what has already been written.

The mechanisms described herein allow us to give the value a type that causes the compression/decompression handler to be transparently invoked when the application does I/O. Again, this is an extensible mechanism, so that new compression algorithms (or more generally, arbitrary transformations) can be added without modifying the library.

Usage Example 3--Compressed, Format Converted Array. Suppose the value which an application is dealing with is actually an array of pixels. In addition to decompressing it, on a given platform we want to convert each pixel to a different format.

The mechanisms described herein allow us to take two (or more) data transformations, such as compression and format conversion, and compose them together. Just as the application does not need to be aware of the underlying transformations, the individual transformations do not need to be aware of each other.

Usage Example 4--All of the Above. The next step is to put the compressed pixel array out in a file, and convert it to a different format when it is read in. This is all supported using exactly the same composition as used in the previous example. The interfaces to data transformations and I/O redirection are the same, so no special mechanism is required.

Other Usage Examples. To briefly illustrate further where this leads, here are some further examples:

A value contains a query that is used to look information up in a database. The "I/O redirection" provides access to a table retrieved from the database.

A value contains a file reference that is encrypted because it also holds the file-server password. A decryption stage is required before the I/O redirector can be applied to the file reference.

A value contains a query that is used to generate a file reference, which then becomes the basis for a second level of I/O redirection.

Numerous other usages can be developed which can take advantage of the mechanisms described herein.

All of the above examples are based on the types associated with the values involved. The examples depend on two aspects of Container Manager types.

First, every value handler is bound to values only indirectly through the name of a type. Handlers are associated with type names through the CMSetMetaHandler Container Manager operation. This association is session-wide. Then the handler is bound to a particular type in a given container through the name of that type. This binding is done when the container is opened.

Second, even in the simplest examples above, such as the value that is merely an indirection to a file, or the value that is merely compressed, the value essentially has two types: the type visible to the application, which encodes the format of the data from the application's point of view, and the type used to find the appropriate handler for compression, I/O redirection, etc.

As the more complex examples show, multiple types of a value need to be independent. This leads to a view of a value as having multiple, independent types. By analogy with C++ (an analogy which is not perfect, as described below) we call these "base types" of the value's type. Base types can be added to and removed from any Container Manager type using the Container Manager CMAddBaseType() and CMRemoveBaseType() operations.

Base types are normal types, and themselves may have base types. This could be useful, for example, when the combination of file access and decompression is used in a variety of different contexts. The two could be made base types of a new type, and then that new type could be used in various ways, including making it a base type of the "all of the above" type which adds format conversion.

To illustrate the concept of base types, FIG. 1 is a symbolic diagram of a tree having three types 102, 104 and 106. A value may have a "compressed file type" 102 associated with it, but the compressed file type 102 has two base types: a "file access type" 104 and a "compression type" 106. The complex "compressed file type" 102 can be created by first defining the compressed file type 102 object, then calling the Container Manager procedure to add a base type 104 to the type 102, and then by calling the procedure again to add the base type 106 to the compressed file type 102.

FIG. 2 illustrates a more complex type tree. As shown in FIG. 2, the type "format converted compressed file type" 202 has two base types, "compressed file type" 204 and "format conversion type" 206. As with compressed file type 102 in FIG. 1, compressed file type 204 has two base types, "file access type" 208 and "compression" 210.

The addition of base types will always form a tree rooted in the original type. If the same type is used as a base type in more than one place in the tree, the separate uses are treated as entirely separate types.

To understand how a given tree of types will behave, the tree is flattened into a linear "chain" of types. In the present embodiment, this is done by performing a depth-first, post-order walk on the tree. Thus, in the case of FIG. 1, the resulting sequence is file access, compression, then compressed file. If an application program calls the Container Manager routine to read data from a value (CMReadValueData), and the value has the type, "compressed file", then the Container Manager will first call the read handler for the compressed file type 102. The read handler for the compressed file type 102 will then (through another call to CMReadValueData) call the read handler for the compression type 106, which in turn calls (through yet another call to CMReadValueData) the read handler for file access type 104. The read handler for file access type 104 obtains (through yet another call to CMReadValueData) the information which is stored in the container in the storage area allocated to the value which the application desires to read, and uses this information to access the actual data on, for example, a hard disk. This data, obtained from the hard disk, is the return value of the read handler for file access type 104. This data gets decompressed by the read handler for compression type 106, and then returned to the caller by the read handler for the compressed file type 102.

The chain formed by the flattened type tree is considered herein to have a "top" and a "bottom" type which are, respectively in FIG. 1, compressed file type 102 and file access type 104. This means that the first handler to be called for any value operation is the value handler associated with the "top" type on the chain. That handler invokes the next handler on the chain, which in turn invokes the next handler on the chain, and so on down to the "bottom" handler on the chain. The handlers then return one by one to their respective calling handlers, until the "top" handler returns to the application program.
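The top-to-bottom invocation just described can be sketched in C as follows. This is a minimal model under stated assumptions, not the Container Manager implementation: the names Layer, layer_read, bottom_read, transform_read and passthrough_read are hypothetical, layer_read merely plays the role that CMReadValueData plays in the description, the "file" is an in-memory array, and a trivial XOR with 0x20 (which toggles ASCII letter case) stands in for decompression.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct Layer Layer;
typedef size_t (*ReadFn)(Layer *self, unsigned char *buf,
                         size_t offset, size_t max);

struct Layer {
    ReadFn read;                /* this layer's read handler */
    Layer *next;                /* next layer toward the bottom, or NULL */
    const unsigned char *data;  /* backing bytes (bottom layer only) */
    size_t size;                /* length of data (bottom layer only) */
};

/* Generic entry point: every caller, whether the application or a
   handler, reads through the same routine and sees a "normal" value. */
size_t layer_read(Layer *v, unsigned char *buf, size_t offset, size_t max)
{
    assert(v != NULL && v->read != NULL);
    return v->read(v, buf, offset, max);
}

/* Bottom handler: performs the actual access to the stored bytes. */
size_t bottom_read(Layer *self, unsigned char *buf, size_t offset, size_t max)
{
    if (offset >= self->size)
        return 0;
    size_t n = self->size - offset;
    if (n > max)
        n = max;
    memcpy(buf, self->data + offset, n);
    return n;
}

/* Middle handler: reads from the next layer, then transforms the
   returned bytes on the way back up (stand-in transformation). */
size_t transform_read(Layer *self, unsigned char *buf, size_t offset, size_t max)
{
    size_t n = layer_read(self->next, buf, offset, max);
    for (size_t i = 0; i < n; i++)
        buf[i] ^= 0x20;
    return n;
}

/* Top handler with no transformation of its own: it only forwards. */
size_t passthrough_read(Layer *self, unsigned char *buf, size_t offset, size_t max)
{
    return layer_read(self->next, buf, offset, max);
}
```

A caller holding only the top layer cannot tell how many layers lie beneath it, which is the transparency property the chain is meant to provide.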

In the type tree of FIG. 2, the depth-first, post-order walk of the tree flattens it into the following linear chain: file access type 208, compression type 210, compressed file type 204, format conversion type 206, and format converted compressed file type 202. Format converted compressed file type 202 is the "top" type on the chain, and file access type 208 is the "bottom" type on the chain. Note that if compressed file type 204 and format converted compressed file type 202 do not have handlers associated with them (let us assume that they do not), they will not have any effect on the value.
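The flattening of a type tree into a chain can be sketched as a short recursive routine. The structure TypeNode and the routine flatten are hypothetical names introduced only for this illustration; the sketch assumes that base types are visited in the order in which they were added and that the type itself is appended after all of its base types.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_BASES 4

/* Hypothetical in-memory model of a type and its base types. */
typedef struct TypeNode {
    const char *name;
    const struct TypeNode *bases[MAX_BASES];
    size_t nbases;
} TypeNode;

/* Depth-first, post-order walk: append every base type (recursively),
   then the type itself. chain[0] ends up as the "bottom" of the chain
   and the last entry as the "top". Returns the new entry count. */
size_t flatten(const TypeNode *t, const TypeNode **chain,
               size_t count, size_t cap)
{
    assert(t != NULL);
    for (size_t i = 0; i < t->nbases; i++)
        count = flatten(t->bases[i], chain, count, cap);
    if (count < cap)
        chain[count++] = t;
    return count;
}
```

Applied to a model of the tree of FIG. 2, the routine yields file access, compression, compressed file, format conversion, and format converted compressed file, in that order.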

In order to support the above examples, the present embodiment assumes two design constraints. First, the application, and each handler, must always think that it is dealing with a "normal" value (i.e. one without redirection or transformations); that is, any redirection or transformation must be completely transparent to the caller. Second, in several cases we saw that handlers might have a non-trivial amount of state to manage.

We address these constraints by giving each handler its own "private" value, called a dynamic value. Dynamic values are transient (i.e. not persistent); they are created just to provide an environment for the handlers, and they are never written to the container, saved in the container's Table Of Contents (TOC), etc. However, they do have refnums and from the "outside" (i.e. from any application code or handler code except the handler that "owns" them) they look exactly like normal values. It will be seen that dynamic values have the same "value header" as real values, except that instead of pointing to storage locations which contain actual value data, they point to a vector of "handlers", one for each of a predefined set of "value operations", to be called when a prior caller desires to use the value.

The following value operations are supported by the Container Manager. The Container Manager routines which implement these operations first check whether the specified value is real or dynamic. If real, then the routine simply operates on the real data. If dynamic, then the routine calls the handler which is associated with the specified value for the specified value operation. Thus for a given dynamic value, a handler can be provided to support each of the following value operations:

    ______________________________________
    © 1992 Apple Computer, Inc.
    ______________________________________
    CMSize  CMGetValueSize(CMValue value);
    CMSize  CMReadValueData(CMValue value, CMPtr buffer,
            CMCount offset, CMSize maxSize);
    void   CMWriteValueData(CMValue value, CMPtr buffer,
            CMCount offset, CMSize size);
    void   CMInsertValueData(CMValue value, CMPtr buffer,
            CMCount offset, CMSize size);
    void   CMDeleteValueData(CMValue value, CMCount offset,
            CMSize size);
    void   CMGetValueInfo(CMValue value, CMContainer *container,
            CMObject *object, CMProperty *property,
            CMType *type, CMGeneration *generation);
    void   CMSetValueType(CMValue value, CMType type);
    void   CMSetValueGeneration(CMValue value,
            CMGeneration generation);
    void   CMReleaseValue(CMValue);
    ______________________________________


As an aside, the present description often uses C-language notation as a shorthand way of describing the steps performed by, or other characteristics of, a Container Manager routine. In this notation, all module names and external data that can possibly be visible to an application programmer begin with the letters "CM" or "cm". The upper case "CM" prefixes all API visible routines and macros. The prefix "kCM" is used for constants. The lower case "cm" is used for all inter-module references within the Container Manager. All other data and modules follow no particular naming convention and should not be visible outside of the file in which they occur. Macros used within the Container Manager do not follow these conventions since they are never visible in the generated object modules. Thus names beginning with "CM" (upper or lower case) are reserved by the API and should not be used by applications using the API.

Also as an aside, routines or code portions which are not described herein are considered self-documenting either due to commenting or due to the use of self-documenting symbol names. For example, it will be apparent to the reader without further explanations that the CMGetValueSize() operation mentioned above returns the size of the specified value.

Returning again to the above Container Manager value routines, none can be called for a particular value until one of the following preparatory routines is called for that value: CMNewValue() or CMUseValue(). As described below, if the desired value is a dynamic value, these routines set up the chains of dynamic value handlers needed to support the above routines.

When a dynamic value is spawned by CMNewValue() or CMUseValue(), the pointer to the top-most dynamic value header is returned as the refNum. Then, whenever the user passes a refNum to an API value routine, the routine checks whether the refNum refers to a dynamic value. If it does, the routine initiates the call to the corresponding value handler. That may cause a search up the base value chain to look for an "inherited" value routine. In the limit, we end up using the original API value routine if no handler is supplied and we reach the "real" value in the chain. Thus the handler must be semantically identical to the corresponding API call.

These dynamic values only exist from creation during the CMUseValue() until they are released by CMReleaseValue(). A dynamic value can have its own data, but this data is stored in the value's refCon rather than in the value data itself. Dynamic values do not have associated data in the normal sense.

A dynamic value is created when a value is created by CMNewValue() or used by CMUseValue(), and the following two conditions occur:

1. The type of the value, or any of its base types, has a metahandler which has been registered by the Container Manager CMSetMetaHandler() routine in a session-wide metahandler symbol table (CMSetMetaHandler() is usually called when a container is first opened); and

2. The metahandlers support a Use Value Handler, and in addition for CMNewValue(), a New Value Handler.

The New Value Handlers are used to save initialization data for the Use Value Handlers. The Use Value Handlers are called to set up and return a refCon. Another metahandler address is also returned. This is used to get the address of the value operation handlers corresponding to the standard API CM . . . value routines mentioned above.

When a CMNewValue() or CMUseValue() is almost done, a check is made on the value's type, and all of its base types (if any) to see if it has an associated registered metahandler. If it does it is called with a Use value operation type to see if a Use Value Handler exists for the type. If it does, we spawn the dynamic value.

The spawning is done by calling the Use Value Handler. The Use Value Handler is expected to set up a refCon to pass among the value handlers and a pointer to another metahandler. These are returned to CMNewValue() or CMUseValue() which does the actual creation of the dynamic value. The extensions are initialized, the metahandler pointer and refCon are saved. The pointer to the created dynamic value header is what CMNewValue() or CMUseValue() returns to the user as the refNum.

Now, when the user attempts to do a value operation using this refNum, we will use the corresponding handler routine in its place. The vector entries are set on first use of a value operation. If a handler for a particular operation is not defined for a value, its "base value" is used to get the "inherited" handler. This continues up the chain of base values, up to the original "real" value that spawned the base values from the CMNewValue() or CMUseValue(). Once found, we save the handler in the top layer vector (associated with the refNum) so we don't have to do the search again. Thus, as in C++, dynamic values may be "subclassed" via their (base) types.

Note that if we indeed do have to search up the base value chain then we must save the dynamic value refNum (pointer) along with the handler address. This is very much like C++ classes, where inherited methods are called and the appropriate "this" must also be passed.
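The search for an "inherited" handler, and the saving of both the handler and its owning value, can be sketched as follows. All names here (Value, Slot, resolve, get_size, and the two example handlers) are hypothetical, only two operations are modelled, and the "real" value is represented as a layer whose handler stands in for the original API routine.

```c
#include <assert.h>
#include <stddef.h>

enum { OP_GET_SIZE, OP_READ, OP_COUNT };

typedef struct Value Value;
typedef size_t (*OpFn)(Value *self);

/* A resolved handler together with the value it must be called on,
   analogous to a C++ method and its "this" pointer. */
typedef struct {
    OpFn fn;
    Value *self;
} Slot;

struct Value {
    Value *base;           /* next value toward the real value, or NULL */
    OpFn own[OP_COUNT];    /* handlers this layer defines; NULL = inherit */
    Slot cache[OP_COUNT];  /* filled in on first use of each operation */
    size_t real_size;      /* used only by the real value in this sketch */
};

/* Find the handler for `op`, starting at v and moving toward the real
   value, and remember the result in v so the search is done only once. */
Slot resolve(Value *v, int op)
{
    assert(v != NULL && op >= 0 && op < OP_COUNT);
    if (v->cache[op].fn == NULL) {
        Value *p = v;
        while (p != NULL && p->own[op] == NULL)
            p = p->base;
        if (p != NULL) {
            v->cache[op].fn = p->own[op];
            v->cache[op].self = p;
        }
    }
    return v->cache[op];
}

/* Stand-in for an API value routine such as CMGetValueSize(). */
size_t get_size(Value *v)
{
    Slot s = resolve(v, OP_GET_SIZE);
    assert(s.fn != NULL);
    return s.fn(s.self);
}

/* Stand-in for the original routine operating on the real value. */
size_t real_get_size(Value *self)
{
    return self->real_size;
}

/* Example handler of a layer that reports twice the size of its base
   value (an arbitrary, purely illustrative transformation). */
size_t doubling_get_size(Value *self)
{
    return 2 * get_size(self->base);
}
```

Storing the owning value next to the handler address is what allows a later call through the top layer to invoke the inherited handler on the correct layer.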

The Container Manager supports layering of dynamic values. The best way to describe layering is in terms of C++. Say we have the following class types (using a somewhat abbreviated notation):

    ______________________________________
    © 1992 Apple Computer, Inc.
    ______________________________________
    class Layer1 {             // a base class
      <layer1 data>            // possible data (fields)
      Layer1(<layer1 args>);   // constructor to init the data
      other methods...         // value operations in our case
    };
    class Layer2 {             // another base class
      <layer2 data>            // possible data (fields)
      Layer2(<layer2 args>);   // constructor to init the data
      other methods...         // value operations in our case
    };
    class T: Layer1, Layer2 {  // the class of interest!
      <T data>                 // possible data (fields)
      T(<T args>, <layer1 args>, <layer2 args>);
                               // constructor to init the data and bases
      other methods...         // value operations in our case
    };
    ______________________________________


In Container Manager terminology, T is to be a registered type with other registered types as base types (classes). All type objects are created using the standard API call CMRegisterType(). Base types can be added to a type by using CMAddBaseType(). This defines a form of inheritance like the C++ classes.

Type T would be registered with its base types as follows:

    ______________________________________
    © 1992 Apple Computer, Inc.
    ______________________________________
    layer1 = CMRegisterType(container, "Layer1");
    layer2 = CMRegisterType(container, "Layer2");
    t = CMRegisterType(container, "T");
    CMAddBaseType(t, layer1);
    CMAddBaseType(t, layer2);
    ______________________________________


For the t object, the global name property and value are created as usual by CMRegisterType (container, "T"). The CMAddBaseType () calls add the base types. These are recorded as the object ID's for each base type in the order created as separate value segments for a special "base type" property belonging to the type object.

As mentioned above, CMNewValue() or CMUseValue() spawn dynamic values if the original type or any of its base types have an associated Use Value Handler. Assume that was done for "T" in the above example. What happens is that CMNewValue() or CMUseValue() will look at its type object (t here) to see if the base type property is present. If it is, it will follow each type "down" to leaf types using a depth-first search.

In the example, "layer1" will be visited, then "layer2", and finally the original type "T" itself. If the "layer1" type object had base types of its own, they would be visited before using "layer1" itself. Hence the depth-first search down to the leaf types.

For each type processed, if it has a Use Value Handler of its own, it will be called to get a refCon and value handler metahandler.

Note that this scheme allows total freedom for the user to mix types. For example, type T1 could have base types T2 and T3. Alternatively, T1 could just have base type T2 and T2 have T3 as its base type.

In the C++ class types shown above, note that each class could have its own data along with its own constructor. The T class has a constructor that calls the constructors of all of its base classes. We can carry this analogy with the Container Manager just so far. Here is where it starts to break down.

The problem here is that C++ class types are declared statically. A C++ compiler can see all the base classes and can tell what data gets inherited and who goes with what class. In the Container Manager, all "classes" (i.e., our type objects) are created dynamically. So the problem is we need some way to tell what data "belongs" to what type.

The solution is yet another special handler, which returns a format specification called metadata. The handler is the Metadata Handler whose address is determined by the Container Manager from the same metahandler that returns the New Value and Use Value Handler addresses.

Metadata is very similar to C-language printf() format descriptions, and is used for similar purposes. The next section will describe the metadata in detail. For now, it is sufficient to know that it tells CMNewValue() how to interpret its ". . . " parameters. The rest of this section will discuss how this is done to dynamically create data.

As with C++ classes, the data is created when a new value is created, i.e., with a CMNewValue() call. The data will be saved in the container, so CMUseValue() uses the type format descriptions to extract the data for each dynamic value layer.

CMNewValue() is defined with the following prototype:

    ______________________________________
    CMValue
           CMNewValue(CMObject object, CMProperty property,
    CMType type, ...);
    ______________________________________


The ". . . " is an arbitrary number of parameters used to create the data. Metadata, accessed from the Metadata Handler, tells CMNewValue() how to interpret the parameters just like a printf() format tells printf() how to use its arguments.

The order of the parameters is important. Because base types are done with a depth-first search through the types down to their leaves, the CMNewValue() ". . . " parameters must be ordered with the parameters for the first type in the chain occurring first in the parameter list. Note that what is happening here is that the user is supplying all the constructor data, just as for the constructor of class T in the example above.
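The way a format description can drive the consumption of a variable parameter list, one layer at a time and in chain order, can be sketched with the standard C stdarg facility. The format used here is an assumption made purely for illustration ('i' for an int, 's' for a string); it is not the Container Manager metadata syntax. The names Packet, build_packet and new_value_sketch are likewise hypothetical.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical data packet holding the parameters of one layer. */
typedef struct {
    int ints[4];
    size_t nints;
    const char *strs[4];
    size_t nstrs;
} Packet;

/* Consume from *ap exactly the arguments that `metadata` describes.
   The list is passed by pointer so that the caller can continue with
   the remaining arguments afterwards. */
void build_packet(Packet *p, const char *metadata, va_list *ap)
{
    memset(p, 0, sizeof *p);
    for (const char *m = metadata; *m != '\0'; m++) {
        if (*m == 'i') {
            assert(p->nints < 4);
            p->ints[p->nints++] = va_arg(*ap, int);
        } else if (*m == 's') {
            assert(p->nstrs < 4);
            p->strs[p->nstrs++] = va_arg(*ap, char *);
        }
    }
}

/* Build one packet per layer. layer_metadata[0] belongs to the first
   type in the chain, so its parameters must come first in the list. */
void new_value_sketch(Packet *packets, const char *const *layer_metadata,
                      size_t nlayers, ...)
{
    va_list ap;
    va_start(ap, nlayers);
    for (size_t i = 0; i < nlayers; i++)
        build_packet(&packets[i], layer_metadata[i], &ap);
    va_end(ap);
}
```

For example, with layer metadata "s" followed by "ii", the call new_value_sketch(pk, meta, 2, "data.bin", 9, 4096) would place the string in the first packet and the two integers in the second.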

The way the data gets written is with a special handler, called the New Value Handler. After CMNewValue() calls the Metadata Handler, it uses the metadata to extract the next set of CMNewValue() ". . . " parameters. CMNewValue() then passes the parameters along in the form of a data packet to the New Value Handler. The New Value Handler is then expected to use this data, which it can extract with the Container Manager CMScanDataPacket() routine. Once it has the data, it can compute initialization values to write to its base value. It is the data written by the New Value Handler that the Use Value Handler will read to create its refCon.

Only CMNewValue() does this. The New Value Handler is only for new values, but the Use Value Handler is used by both CMNewValue() and CMUseValue().

In the simplest case, with only one dynamic value, it can be seen that the data is written to the "real" value. Now if you layer another dynamic value on to this, the next chunk of data is written using that layer's base value and hence its handlers. The second layer will thus use the first layer's handlers. That may or may not end up writing to the "real" value depending on the kind of layer it is. If it's some sort of I/O redirection handler (i.e., it reads and writes somewhere else), the second layer data will probably not go to the "real" value.

The Use Value Handler is called both for CMNewValue() and CMUseValue(). The Use Value Handler reads the data from its base value to create its refCon. If the user comes back the next day and does a CMUseValue(), only the Use Value Handler is called. Again it reads the data from its base value to construct the refCon and we're back as we were before in the CMNewValue() case.

It should be pointed out here that the Metadata and New Value Handlers will always be executed with a Container Manager running on some particular hardware (obviously). The data packet built from the CMNewValue() ". . . " parameters is stored as a function of the hardware implementation on which it is run (i.e., whatever the sizes are for bytes, words, longs, etc.). How it is stored is a function of the metadata returned from the Metadata Handler. In other terms, the New Value Handler has a contract with both the Container Manager and the Metadata Handler on the meaning of the parameter data.

Note, however, it is not required that you be on the same hardware when you come back the next day and do the CMUseValue() that leads to the Use Value Handler call. The handler writer must keep this in mind. Specifically, the Use Value Handler must know the attributes (byte size, big/little endian, etc.) of the data written out by the New Value Handler so it knows how to use that info. In other words, the Use Value Handler has a (separate) "contract" with its own New Value Handler on the meaning of the data written to the base value.

There is another, relatively minor, thing to keep in mind. That is that the value handlers for any one layer must take into account the size of its own data when manipulating additional data created by the handlers for CMReadValueData(), CMWriteValueData(), etc. This simply offsets the write and read value data operations by the proper amount. Remember all operations are on their base values. So if a New Value Handler writes data, this basically prefixes the "real" stuff being written by the handler operations.
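The offsetting just described can be sketched as follows, assuming a layer whose New Value Handler wrote a fixed-size block of initialization data at the start of its base value. The names BaseValue, LayerRefCon, base_read, prefixed_read and prefixed_size are hypothetical, and the base value is modelled as an in-memory buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical base value: a small in-memory byte buffer. */
typedef struct {
    unsigned char bytes[64];
    size_t size;
} BaseValue;

/* Hypothetical refCon: remembers the base value and how many bytes at
   its start belong to this layer's own initialization data. */
typedef struct {
    BaseValue *base;
    size_t header_size;
} LayerRefCon;

/* Plain read on the base value. */
size_t base_read(const BaseValue *b, unsigned char *buf,
                 size_t offset, size_t max)
{
    if (offset >= b->size)
        return 0;
    size_t n = b->size - offset;
    if (n > max)
        n = max;
    memcpy(buf, b->bytes + offset, n);
    return n;
}

/* Read handler of the layer: the caller's offset 0 corresponds to the
   first byte after the layer's own data in the base value. */
size_t prefixed_read(const LayerRefCon *rc, unsigned char *buf,
                     size_t offset, size_t max)
{
    return base_read(rc->base, buf, rc->header_size + offset, max);
}

/* Size handler of the layer: the layer's own data is not counted. */
size_t prefixed_size(const LayerRefCon *rc)
{
    assert(rc->base->size >= rc->header_size);
    return rc->base->size - rc->header_size;
}
```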

The Metadata Handler is only needed for CMNewValue() so that the proper number of CMNewValue() ". . . " parameters can be placed into a data packet for the New Value Handler. The Metadata Handler must follow the prototype,

    ______________________________________
    CMMetaData metaData_Handler(CMType type);
    ______________________________________


where "type" is the (base) type layer whose metadata is to be defined.

The Metadata Handler simply returns a C string containing the metadata using the format descriptions described above.

The type is passed as a convenience. It may or may not be needed. It is possible for a type object to contain other data for other properties. Types, after all, are ordinary objects. There is nothing prohibiting the creation of additional properties and their values. This fact could be used to add additional (static and private) information to a type to be used elsewhere. For example, the type could contain a compression dictionary.

The New Value Handler must follow the prototype,

    ______________________________________
    CMBoolean newValue_Handler(CMValue baseValue,
              CMType type,
              CMDataPacket dataPacket);
    ______________________________________


where

baseValue=the base value which is to be used to write the refCon data for the Use Value Handler

type=the type corresponding to this New Value Handler

dataPacket=the pointer to the data packet, created from the CMNewValue() ". . . " parameters according to the type's metadata format description.

The type is passed again as a convenience just as in the Metadata Handler. It can also be used here to pass to CMScanDataPacket() to extract the dataPacket back into variables that exactly correspond to that portion of the CMNewValue() ". . . " parameters that correspond to the type. It is not required, however that CMScanDataPacket() be used.

The Use Value Handler is called for both the CMUseValue() and CMNewValue() cases. If its companion New Value Handler wrote data to its base value, the Use Value Handler will probably read the data to create its refCon. The refCon will be passed to all value handlers. The Use Value Handler returns its refCon along with another metahandler address that is used to get the value handler addresses. These are used to create the dynamic value.

The Use Value Handler should follow the prototype,

    ______________________________________
    CMBoolean useValue_Handler(CMValue baseValue,
              CMType type,
              CMMetaHandler *metahandler,
              CMRefCon *refCon);
    ______________________________________


where

baseValue=the base value from which the Use Value Handler reads the data used to build its refCon

type=the type corresponding to this Use Value Handler

metahandler=a pointer to the value operations metahandler which is returned by the Use Value Handler to its caller

refCon=a reference constant built by the Use Value Handler and returned to its caller.

The baseValue and type are identical to the ones passed to the New Value Handler. The type may or may not be needed in the Use Value Handler. As in the New Value Handler, it could be used to supply additional information from other properties.

It is expected that the Use Value Handler will read data from its base value to construct its refCon. The refCon is then returned along with a pointer to another metahandler that is used by the Container Manager to get the addresses of the value operations.

Note that both the New Value and Use Value Handlers return a CMBoolean to indicate success or failure. Failure means (or it is assumed) that the handlers reported some kind of error condition or failure. As documented, error reporters are not supposed to return. But in case they do, we use the CMBoolean to know what happened. A handler should return 0 to indicate failure and non-zero for success.

Value Operation Handlers. The value operation handler routines can do a Container Manager CMGetValueRefCon() call on the value which was passed, in order to get at the refCon set up by the Use Value Handler. This provides a communication path among the value handlers. Further, the value handler should usually do its operations in terms of their base value, which can be accessed using the Container Manager CMGetBaseValue() call.

The release handler is an exception to this rule. A set of one or more dynamic value layers are spawned as a result of a single CMUseValue() or CMNewValue(). The layers result from the specified type having base types. From the caller's point of view s/he is doing one CMUseValue() or CMNewValue() with no consideration of the base types. That implies that the returned dynamic value should have a single CMReleaseValue() done on it. The handlers have no business doing CMReleaseValue() on their base value. This is detected and treated as an error.

A count is kept by the Container Manager of every CMUseValue() and CMNewValue(). Calling CMReleaseValue() reduces this count by one. When the last release is done on the dynamic value (its count goes to 0), the release handler will be called. It is the Container Manager who calls the release handler for all the layers, not the handler. The Container Manager created them as a result of the original type; it is therefore responsible for releasing them.

The reason the Container Manager is so insistent on forcing a release for every use of a dynamic value is mainly to enforce cleanup. Most value operation handlers will, at a minimum, use a refCon that was memory allocated by the Use Value Handler. Release handlers are responsible for freeing that memory. In another example, if any files were opened by the Use Value Handler, the release handlers would close those files.
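The use-count bookkeeping described above can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the Container Manager's actual code: the Layer and DynamicValue structures and the releaseValue() and freeRefCon() routines are hypothetical names introduced only for this sketch.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical record for one dynamic value layer. */
typedef struct Layer {
    struct Layer *base;              /* next layer toward the real value */
    void *refCon;                    /* allocated by the layer's Use Value Handler */
    void (*release)(struct Layer *); /* release handler for this layer */
} Layer;

/* Hypothetical record for the dynamic value returned to the caller. */
typedef struct {
    Layer *top;             /* top layer of the chain */
    unsigned long useCount; /* one per CMUseValue()/CMNewValue() */
} DynamicValue;

static int releasedLayers = 0; /* demo counter of release-handler calls */

/* Example release handler: frees the refCon set up at use time. */
static void freeRefCon(Layer *layer) {
    free(layer->refCon);
    layer->refCon = NULL;
    releasedLayers++;
}

/* Sketch of a release. Returns 0 on an over-release, 1 otherwise.
   Only the last release triggers the handlers, and it is this routine
   (standing in for the manager), not a handler, that walks the layers. */
static int releaseValue(DynamicValue *v) {
    if (v->useCount == 0) return 0;
    if (--v->useCount > 0) return 1;
    for (Layer *l = v->top; l != NULL; l = l->base)
        if (l->release != NULL) l->release(l);
    return 1;
}
```

In this sketch a value used twice must be released twice before any layer's release handler runs, and a third release is reported as an error.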

A trivial value handler might merely get its base value and use it to recursively call the Container Manager value procedure which initially invoked the handler to do its operation (again except for the release handler). In this case what it is basically doing is invoking the "inherited" value operation. In this case, the value operation could be omitted entirely by having the metahandler for the value's type return NULL when asked for that value operation. The Container Manager uses that as the signal to search up the dynamic value inheritance chain to find the first metahandler that does define the operation. In the limit, it will end up using the original "real" value.
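The search for an "inherited" value operation might look like the following sketch. The ValueLayer structure, the ReadOp signature, and the findReadLayer() and realRead() routines are assumptions made for illustration; the actual metahandler interface is not reproduced here.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical signature of a read value operation. */
typedef long (*ReadOp)(const void *refCon, char *buf, long offset, long n);

/* Hypothetical layer: a NULL read pointer models a metahandler that
   returned NULL when asked for the read operation. */
typedef struct ValueLayer {
    struct ValueLayer *base; /* base value; NULL for the real value */
    ReadOp read;             /* NULL means "inherit from the base value" */
    const void *refCon;
} ValueLayer;

/* Walk toward the real value until a layer defines the operation. */
static const ValueLayer *findReadLayer(const ValueLayer *v) {
    while (v != NULL && v->read == NULL)
        v = v->base;
    return v;
}

/* Stand-in for the "real" value's read: refCon is a C string of data. */
static long realRead(const void *refCon, char *buf, long offset, long n) {
    const char *data = (const char *)refCon;
    long len = (long)strlen(data);
    if (offset < 0 || offset >= len) return 0;
    if (n > len - offset) n = len - offset;
    memcpy(buf, data + offset, (size_t)n);
    return n;
}
```

When no dynamic layer defines the operation, the search in this sketch ends at the real value, which corresponds to the limiting case described above.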

Possible Limitations On Value Operations. Value I/O operations are basically stream operations. That is, you read or write information linearly from a specified offset. In addition, the Container Manager provides insert and delete value data API calls CMInsertValueData() and CMDeleteValueData().

Insert and delete can cause problems because base types may want to do certain transformations on their data that depend on what has occurred previously in that stream of data. For example, encryption using a cyclic key, or compression generally cannot be done simply by looking at a chunk of data starting at some random offset. A cyclic key encryption can be deterministic if you can always determine where to start in the key as a function of offset. But you can see that inserts and deletes will change the offsets of following data. You would not know where to start in the key.

What all this means is that certain data transformations only make sense if you are willing to refuse to support the insert/delete operations. Basically only data transformations that are position independent can be supported with the full set of value operations.
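The position dependence of a cyclic key transformation can be illustrated with a small sketch. The cyclicXor() routine below is a hypothetical example (a plain XOR against a repeating key), chosen only because it makes the dependence on stream offset easy to see; it is not a transformation prescribed by the Container Manager.

```c
#include <stddef.h>

/* XOR n bytes with a cyclic key. The starting key position is a
   function of the absolute offset of the data within the stream. */
static void cyclicXor(unsigned char *data, size_t n, size_t streamOffset,
                      const unsigned char *key, size_t keyLen) {
    for (size_t i = 0; i < n; i++)
        data[i] ^= key[(streamOffset + i) % keyLen];
}
```

A chunk read from offset 5 decodes correctly when streamOffset is 5. If one byte is inserted earlier in the stream, the same encoded bytes sit at offset 6, and decoding them with the new offset selects the wrong key bytes. This is why such a handler would have to refuse insert and delete operations, or re-encode all following data.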

Even simple I/O to a file may create problems, since most file systems do not support inserts and deletes in the middle of a file. If you do want to support inserts and deletes, then you should consider the potential for data intensive and/or computationally intensive operations.

C. Format Overview

A conceptual description of the Container Manager data format is now presented. As an overview, certain caveats and tricks are omitted at this level which are covered in more detailed parts of this description.

Five key ideas underlie the Container Manager format:

1) everything in a container is an object,

2) objects have persistent IDs,

3) all the metadata lives in the TOC (Table of Contents),

4) objects consist entirely of values, and

5) each value knows its own property, type, and data location.

The five ideas will each be discussed in turn.

Everything is an object. In a Container Manager container, every accessible byte is part of a value of some object. Even the metadata that defines the structure of the container, and the label of the container, are values of an object. Type descriptions are objects, property descriptions are objects, etc. We will exploit this fact in various ways below.

Objects have persistent IDs. Every Container Manager object is designated by a persistent ID which is unique within the scope of its container. Objects may have additional IDs and/or names that are unique in larger scopes, but this is not required.

Object IDs provide a compact, convenient way to refer to an object. An efficient mechanism is provided to get from any object ID to information about that object.

All the metadata lives in the TOC. This is a difference between the Container Manager and most other container formats, such as ASN.1, formats derived from IFF (such as Microsoft/IBM's RIFF), etc. In these other formats, the metadata is associated with the chunks of data that it describes, a design approach that we call internally tagged. There are three reasons for this difference from other formats:

a) The Container Manager embodiment described herein is designed to support very flexible layouts, such as multi-media interleaving, and internal tags would be inconvenient and even harmful for this.

b) Applications inspecting an object can make decisions about it more efficiently if all of its metadata is concentrated in one place, rather than being spread out over the container with its values.

c) We want to be able to assimilate existing formats that contain collections of objects without forcing them to change. This implies that we must be able to designate regions within the existing structure as values, without forcing them to somehow retrofit internal tags.

This approach to metadata does impose one significant design constraint. A Container Manager container can only be read by starting with the TOC. This raises two questions: (1) how do we find the TOC, and (2) how do we access the TOC when we need information?

1) In standard Container Manager containers the container label points to the TOC. Possibly some non-standard containers will exist that require other mechanisms. However, these will be exotic cases.

2) Since we need to access the information in the TOC whenever we want to read a value, we have to have it available at all times. This normally means that the container needs to be on a random access device.

If a container needs to be read on a device that does not support efficient random access (such as a CD-ROM) the TOC can be split up into sub-TOCs that sit in front of the groups of objects they describe, and then the container can be accessed largely in stream order.

Objects consist entirely of values. In the Container Manager, an object has no value as such. Each object has properties, and each property has values. The Container Manager format provides no information about an object except its ID.

Of course, an object can have a single value; in that case the value of the property "is" the value of the object. Thus the Container Manager format can easily accommodate this "normal" case.

Each value knows its own property, type, and data location. Each value consists of a property ID (or role), a type (or format), and data. For example, a graphic object might have a value that describes its "clip mask"; the property ID would specify what role the value plays, but not what format it is stored in. The type would define how the mask is represented: rectangle, bit mask, path, Mac region, PostScript path, etc. The data would be the representation of the mask itself.

At the level of the container standard itself, there are no restrictions on what values an object can have, how many values it can have, etc. However, individual object formats may dictate rules in this area. In general, applications should be prepared to encounter additional values that they do not understand; these can be ignored. This allows other applications to annotate objects with additional values that may not be generally understood. Typically, these values will be associated with properties that are unknown to the application.

The data of a value is an uninterrupted sequence of bytes which may be from 0 to 2^64 bytes long, although these limits may vary in a different embodiment. This sequence of bytes has no format requirements or restrictions. Furthermore, the byte sequences representing the data for various values of various objects can be placed anywhere in the container. Thus there are no strong data format requirements for the container as a whole, although it must contain the metadata to define its structure somewhere.

Special Cases. All of the mechanisms above are consistent across all the uses of objects. However, there are two special cases that need to be considered.

First, the Container Manager format allows a single object to have multiple values with the same property ID. All the values must have different types. Such multiple values are intended to be used as alternative representations of the same information.

Second, the table of contents can contain multiple entries for a single value. These entries mean that the value represented by the entry is actually stored in multiple segments. This permits values to be broken up into chunks and interleaved, without creating problems for applications that view them as single values. In addition, it allows an application to build TOC entries that "synthesize" a value out of separate parts, as is required in retrofitting some file formats.

Note that these two special cases can be mixed freely. A property can have multiple values, and one or more of the values can be composed of multiple segments.
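Reading a value that is stored in multiple segments as if it were a single value could be sketched as follows. The Segment structure and the readSegmented() routine are hypothetical; they assume only that each TOC entry for the value supplies a container offset and a length, in order.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical view of one TOC entry's data location. */
typedef struct {
    size_t offset; /* byte offset from the start of the container */
    size_t length; /* number of bytes in this segment */
} Segment;

/* Copy up to n bytes of the logical value, starting at valueOffset,
   into buf. Returns the number of bytes actually copied. */
static size_t readSegmented(const unsigned char *container,
                            const Segment *segs, size_t nSegs,
                            size_t valueOffset,
                            unsigned char *buf, size_t n) {
    size_t copied = 0;
    for (size_t i = 0; i < nSegs && copied < n; i++) {
        if (valueOffset >= segs[i].length) {
            valueOffset -= segs[i].length; /* skip whole segment */
            continue;
        }
        size_t avail = segs[i].length - valueOffset;
        size_t take = (n - copied < avail) ? n - copied : avail;
        memcpy(buf + copied, container + segs[i].offset + valueOffset, take);
        copied += take;
        valueOffset = 0; /* later segments are read from their start */
    }
    return copied;
}
```

The caller sees one contiguous value even though its bytes are interleaved with other data in the container.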

Other Issues--Globally Unique Names. To fulfill the requirement for locally generated unique names for types and properties, the Container Manager embodiment described herein supports identifiers defined in ISO 9070. These are names that begin with a naming authority (assigned to a system vendor or an application vendor), and then continue with a series of more and more specific segments, until they end in a specific type or property name.

While another embodiment can use a different naming convention, names generated according to ISO 9070 are both unique and self-documenting. Individual users can generate unique names using this approach. For example, a user developing educational stackware might want to create properties, or even types, to use in scripts. The stackware development environment could automatically generate a unique prefix for the user, based on the serial number of the development tool, and then append the user generated property or type name.

This ensures that if that user's scripts and data are combined in a container with other information generated by other users, no naming conflicts can occur.

Note that globally unique names are not limited to property and type descriptions. Any object can be given a unique name using exactly the same mechanism, and such object names may be useful in some applications.

Note also that objects can be given short names that are only locally unique, as in the RIFF TOC. These would be a different type than Globally Unique Names.

Recall that type and property descriptions are objects as well. Since types and properties need to have globally unique names, so that applications can recognize them, type and property descriptions will typically have a globally unique name value. In many cases, this may be the only contents of a type or property description object.

In some situations, however, we may wish to put more information into a description. Here are some examples of useful information that can be attached to types or properties:

Base types. As previously mentioned, base types allow inheritance of semantics from other existing types for composition into more complex types. Such base type information is intended to include uses such as encryption, compression, I/O redirection, etc.

Encoding information. A type definition may indicate the default encoding of its values. Typically, all of the values with the same basic format in a container will have the same encoding, so this new subtype can be shared by all these values. In this case the encoding can be indicated directly in the type description for the format.

If values with the same basic format but multiple encodings exist in the same container, a more complex solution is required. In this case, a subtype may be created just to record the encoding. Such a subtype will typically not need a globally unique name.

Compression information. In addition to the compression technique, typically recorded via a base type, the type can record compression parameters, the codebook used (if applicable), etc. As with encoding information, a type that exists just to record compression information typically will not need a globally unique name. It will refer to the underlying format type and the compression technique, both of which will have globally unique names.

A template or grammar for a type. This allows applications that have never seen this type before to parse values of that type and potentially get some useful information out of them. Examples of description mechanisms that could be used in this way are ASN.1 and SGML. The more general type will be indicated as the super-type. For example, a given SGML DTD as a type will have a specific SGML definition of the DTD. The super-type of this type would be SGML itself, which defines the basic encoding conventions.

Method descriptions for a type. A type could have properties that provide method definitions. Providing methods in the container would allow fully encapsulated use of values.

D. Format Definition

The concrete format of the table of contents (TOC) of a Container Manager container will now be described. The TOC consists of a sequence of entries. Each entry corresponds to a single segment of a value of some object.

TOC entries are sorted by object ID, and within a single object they are sorted by property ID. Thus all the entries for a given object are contiguous in the TOC, and all the entries for a given property are contiguous within the object. Also, an object can be found within the TOC, or a property can be found within an object, by binary search. If an object ID or a property is not defined, we can quickly determine that it is not defined.
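Because the entries are sorted by object ID and then by property ID, a lookup can use an ordinary binary search. The following sketch assumes a flat array of hypothetical TocKey records; the actual TOC is a stream and an in-memory index, so this illustrates only the ordering property.

```c
#include <stddef.h>

/* Hypothetical sort key of a TOC entry. */
typedef struct {
    unsigned long objectID;
    unsigned long propertyID;
} TocKey;

/* Order by object ID first, then by property ID. */
static int compareKeys(TocKey a, TocKey b) {
    if (a.objectID != b.objectID) return a.objectID < b.objectID ? -1 : 1;
    if (a.propertyID != b.propertyID) return a.propertyID < b.propertyID ? -1 : 1;
    return 0;
}

/* Return the index of the first entry matching key, or -1 if the
   object/property pair is not defined. */
static long findEntry(const TocKey *entries, long count, TocKey key) {
    long lo = 0, hi = count;
    while (lo < hi) {
        long mid = lo + (hi - lo) / 2;
        if (compareKeys(entries[mid], key) < 0)
            lo = mid + 1;
        else
            hi = mid;
    }
    return (lo < count && compareKeys(entries[lo], key) == 0) ? lo : -1;
}
```

Returning the first matching index lets the caller step forward through the contiguous entries (segments or alternative types) for that property.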

Thus, each object in the container is represented in the TOC by a sequence of entries, one for each segment of a value of the object. The Container Manager has no way to represent an object without at least one value.

Since each TOC entry defines a value, we know immediately that it must indicate the object ID, property, type, and data of the value. In addition, it indicates the generation number of the value in order to allow applications to check consistency between different properties. The TOC entry may also contain bookkeeping information for the value.

The object ID field in a TOC entry identifies the object that this value is part of. The property field identifies the value's property by the object ID of a property description. The type field indicates the value's type by the object ID of a type description.

The entry indicates the value's data by the offset and length of the sequence of bytes representing the value. The offset is a 0 origin byte offset from the beginning of the container. The length is a byte count, and may be 0, indicating a 0 length value. If the data is four bytes long or less, it may be included directly in the TOC as an immediate value, rather than being referenced by offset and length.

A TOC entry could simply be defined by putting all the information above in a record. This record would be relatively large, however, and would be very likely to contain redundant and/or unused information. The presently described Container Manager embodiment therefore uses an approach in which each TOC entry contains only the information that is new or different compared with the previous TOC entry. This results in a TOC that is organized as a stream rather than a table, and is parsed as it is read in. The actual format of the TOC is not important for an understanding of the invention.

Note that every TOC contains a standard object that is used to describe the TOC itself. In particular, it is object ID 1, so the TOC entries for the TOC itself always come at the beginning of the TOC. (Object ID 0 is never used). Additional TOC properties can be useful. For example, an index to speed access to the entries by ID could be attached to the TOC through another property. Potentially several such indexes, using different formats, could be attached.

Object IDs other than IDs of standard objects are generated by sequentially incrementing a counter from 0x00010000. Object IDs are never reused in later generations of a container if an object is deleted. The last ID number generated is kept as a property of object #1 to allow generating further IDs without reuse.

II. IMPLEMENTATION

A. Hardware

The Container Manager of the present embodiment is implemented entirely as software instructions and data, to be executed on general purpose computer hardware. No specific hardware platform is required. For completeness, however, FIG. 3 illustrates a typical hardware computer system platform on which the Container Manager might run.

The computer system of FIG. 3 comprises a CPU 302, main memory 304, which may be volatile, an I/O subsystem 306, and a display 308, all coupled to a CPU bus 310. The I/O subsystem 306 communicates with peripheral devices including persistent storage devices, such as a disk 312. In typical operation, an application program, together with at least those Container Manager routines which are used by the application program, are retrieved from the disk 312 into main memory 304 for execution by the CPU 302. All of the data structures described below are also created in the main memory 304, in the sense that memory space is allocated for the information to be contained in the data structures, and all of the software routines which read or write to such memory locations do so according to some known definition of fields. In addition, pointers are written to certain of the allocated main memory storage areas, which pointers refer to other structures in memory in a known manner which is defined by the data structure. Thus a data structure, as used herein, is an abstract description of the organization of data in main memory 304; when the data structure is "created" in main memory 304, this description is imposed on regions of main memory 304 so that specific items of information can be found and/or interpreted according to the data structure. The term "pointer", as used herein, is a well-known shorthand for physical signals which are stored as charge, current or voltage levels in the memory cells which implement the main memory 304. These signals "identify" an item of information in memory 304 in the sense that, when applied to the memory 304 as an address (either directly or via an address translation mechanism), they cause the memory 304 to read out data from the item pointed to or identified by such signals.

Also, it will be understood that even though different types of computer systems implement schemes such as caching and virtual memory, in which some of the data may not actually be located in main memory 304 itself at various times, these mechanisms are transparent to the Container Manager embodiment described herein. Thus, the data is referred to herein as being located "in" main memory 304, even if it is actually, transparently, located elsewhere.

B. In-Memory Data Structures

FIG. 4 is an overall block diagram of the major data structures which are created in main memory 304 during the pendency of a "session". Data block 402 is a "session global data" block containing all of the session-wide data for a given Container Manager session. There is no static global data in the code. All open containers are tied to the session on a doubly linked list whose head and tail pointers are contained in the session global data. The root of a metahandler table 404 (described below) is kept here as well along with the session handler pointers for malloc, free, and error reporting.

Containers are identified in the session global data block 402 by a pointer to the container's Container Control Block (CCB). Each time a container is opened with CMOpenContainer() or CMOpenNewContainer(), a new container control block 406 is created. The pointer to the container control block 406 is what is returned to the user as a container "refNum" (reference number).

There are five primary data structures tied to the container. Four are shown in the diagram and are discussed later. The fifth is the "touched chain", used for recording updates. The "touched chain" is not important for an understanding of the invention and is not described herein.

The four shown main data structures pointed to from the container control block 406 are the list 408 of deleted values (TOCValueHdr(s)), a list 410 of embedded container pointers, the global name symbol table 412, and a pointer to a TOC 414 control block 416.

The table of contents (TOC) 414 is the set of related data structures that organize objects by object IDs. The requirement that objects be kept in sorted order (sorted by object ID) puts certain constraints on its organization. Further, the fact that the IDs are generated sequentially in new containers must also be taken into account (for example, binary trees would not be a good choice in such a situation).

The method used in the Container Manager of the embodiment described herein is an index table algorithm. It is somewhat memory intensive but allows objects to be accessed linearly in time and keeps the objects in the required sorted order. The index tables correspond to "powers" of a chosen index table size. For example, if the table size is 256 and the maximum ID is 0xFFFFFFFF (32-bits unsigned on MC68XXX machines) the access depth will be 4 for any ID.

To illustrate this, if we had ID 0x00123456 we would have 4 indices: 0x00, 0x12, 0x34, and 0x56. Four index tables would exist, each corresponding to the indices 0x00 to 0xFF, i.e., mod the size of an index table. Each index is used to index into its corresponding index table. Thus, in this example, the first table would have its 0x00 entry pointing to the next table. That next table would have its 0x12 entry pointing to the third table. The third table would have its 0x34 entry pointing to the last table. The 0x56 entry in the last table would point to the actual object with ID 0x00123456.

Continuing with this example, if every ID possible were represented, then there would still be only one top level table. But there would be 256 second level tables corresponding to the 256 level-one indices. Each of those 256 level-two tables would have pointers to 256 level three tables and so on down to level 4.

Fortunately, new containers are generated with sequential IDs so that only the minimum number of tables is required. But if a new nonsequential number is needed the requisite new tables are generated as needed to go from the top level table to the lowest level table.
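The index table scheme can be sketched as follows for the example table size of 256 and a depth of 4. The IndexTable structure and the indexAt(), lookupObject(), and insertObject() routines are hypothetical names; the sketch fixes the table size at 256 (one byte of the ID per level), whereas the actual routines are generalized to other sizes.

```c
#include <stddef.h>
#include <stdlib.h>

#define TABLE_SIZE 256 /* assumed index table size */
#define LEVELS 4       /* access depth for 32-bit IDs */

/* A slot points to the next-level table or, at the lowest level,
   to the object itself. */
typedef struct IndexTable {
    void *slot[TABLE_SIZE];
} IndexTable;

/* Index used at a given level (0 = top): one byte of the ID. */
static unsigned indexAt(unsigned long id, int level) {
    return (unsigned)((id >> (8 * (LEVELS - 1 - level))) & 0xFFu);
}

/* Follow the four indices; NULL means the ID is not defined. */
static void *lookupObject(const IndexTable *top, unsigned long id) {
    const IndexTable *t = top;
    for (int level = 0; level < LEVELS - 1; level++) {
        t = (const IndexTable *)t->slot[indexAt(id, level)];
        if (t == NULL) return NULL;
    }
    return t->slot[indexAt(id, LEVELS - 1)];
}

/* Create intermediate tables only as needed. Returns 0 if out of memory.
   calloc is assumed to yield NULL pointers (true on common platforms). */
static int insertObject(IndexTable *top, unsigned long id, void *object) {
    IndexTable *t = top;
    for (int level = 0; level < LEVELS - 1; level++) {
        unsigned i = indexAt(id, level);
        if (t->slot[i] == NULL) {
            t->slot[i] = calloc(1, sizeof(IndexTable));
            if (t->slot[i] == NULL) return 0;
        }
        t = (IndexTable *)t->slot[i];
    }
    t->slot[indexAt(id, LEVELS - 1)] = object;
    return 1;
}
```

With sequentially generated IDs, consecutive objects share the same upper-level tables, so in this sketch a new lowest-level table is allocated only once per 256 objects.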

The routines that maintain this data structure are generalized to support any size table (within limits). There are trade-offs between table size and access time, which are apparent to a person of ordinary skill.

Because of this generalization, a TOC has associated with it all the variables that are needed to manipulate the index tables. This is kept in TOC control block 416, pointed to from the container control block. The TOC control block 416 is to TOC object access, what the container control block 406 is to the entire container.

The TOC control block points to another data structure not shown here to keep the drawings simple. It is a set of three head/tail list pointers to doubly linked lists of the TOCObject(s). The three lists are for all the objects, property objects, and type objects in the container. Thus the type and property lists are subsets of the object list. These lists are used only by the CMGetNextxxx() routines. They are kept as part of the TOC since there can be only one TOC and one of these list sets. Note that for updating, there can be multiple containers using the same TOC, so putting these data structures here is the most convenient way to deal with them during updating.

Note, that since there can be multiple users of a TOC, a TOC requires a "use count" to prevent premature release of the TOC.

The lowest level of the TOC index tables 418 contains pointers to the container objects themselves instead of to other index tables. These objects are TOCObjects 420. The TOC entries for an object are linked off of their TOCObject. TOCObjects are returned to the user as object refNums (CMObject, CMType, and CMProperty).

The properties, TOCProperties 422, for a TOCObject are contained on a doubly linked list off the TOCObject. The values for each property are on a doubly linked list of value headers, TOCValueHdrs 424, off of each TOCProperty. Finally, a specific real (as opposed to dynamic) value, such as one of the TOCValues 426, is linked to its TOCValueHdr.

The reason the values are linked to a value header is because of continued (multi-segment) values. A multi-segment value can have more than one value entry. Hence the chain. Also, it is pointers to value headers that are returned to the user as value refNums (CMValue).

As used herein, a "header" for an item or items of information is a logical collection of information which applies generally to the item or items. The header need not be physically located in a contiguous region of memory, nor must it be contiguous with any of the items themselves.

Each TOCValue 426 can be either immediate, non-immediate, or a global name. Immediate values contain 1, 2, or 4-byte value data encoded directly in the entry. Non-immediates contain a container offset to the value data and its length. Non-immediates can also represent dynamic values (discussed below). Global name values, such as 428, contain a pointer to a global name symbol table entry (discussed shortly) and, once the value data has been written to the container, the container offset.

Note the diagram shows, in addition to the doubly linked list structures, a pointer for each TOCValue back to its value header. Similarly, each TOCValueHdr has a pointer back to its TOCProperty. Finally, each TOCProperty has a pointer back to its TOCObject. Not shown is a pointer from each TOCObject and each TOCValueHdr back to its container control block. The result is that anything can be accessed from almost anywhere and in any direction.

When a CMRegisterType() or CMRegisterProperty() is done, a check must be made to see if the specified global name already exists. For this, a simple binary tree symbol table 412 is used. Since a global name is itself a type or property object value, there is also a pointer from a TOCValue to the name in the global name symbol table. Each global name symbol table is unique to its container. Hence the container control block has the root to its tree of global names.

Whenever a container is opened a set of predefined global names is generated. Basically the equivalent of CMRegisterType() and CMRegisterProperty() is done but the object IDs are standard rather than user IDs.

Note, global names are not written to the container at the time they are created. Instead they are kept in the global name symbol table. When a container that was open for writing is closed, the global name symbol table is "walked" and all user defined names written to the container. At that point the TOCValues associated with global names are set with the container offsets for those names. This is done using the back pointer from each global name entry to its TOCValue. The TOC is then written followed by the label. Since the TOC is written after the global names, all the global name offsets will be set by that time. Thus everything is correct when the container is to be read.

The Container Manager of the presently described embodiment supports embedded containers. Embedded containers are treated just about like any other. The main difference is that they require a special handler that writes or reads (CMWriteValueData() and CMReadValueData()) to a value that belongs to the parent container. The handler keeps track of offsets within the value that it is treating as a container.

The effect is to write or read a parent value as if it was a container. All the data for the parent value is created as a container, complete with its own TOC and label. The offsets in the TOC are relative to the start of the value, offset 0, just as in the non-embedded case. This means that a parent value could be read to copy the container as is.

Aside from the special handlers, most of the other stuff needed to open and close a container is independent of whether it is embedded or not. There are a few wrinkles, however. First, a container can have any number of embedded containers open at the same time. Each of those could also, and so on. The result is essentially a tree of open embedded containers. Since the data for a parent value is its embedded container, then if there are any more deeply embedded containers, they would also be part of the parent's value. This gets very confusing if you try to think of it more than two levels deep.

In all cases, when a parent is closed, we want to close all of its descendants.

The embedded container list 410 pointed to from the container control block is used so that a parent container can keep track of all of its immediate descendants. Each entry in the list is simply a pointer to a descendant container control block. At open time an entry for the embedded container is created in its parent embedded container list. At close time CMCloseContainer() will go through its list of embedded containers (i.e., the list of its immediate descendants) and recursively call CMCloseContainer() to close those. The net result is the desired one of closing all the descendants of the parent in the tree of embedded containers. An embedded container being closed is responsible for removing itself from its parent's embedded container list so that it won't be "seen" again if a parent further up the tree is closed.
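The recursive close described above might be sketched as follows. The Container structure here is a simplified, hypothetical stand-in for the container control block (a fixed-size array replaces the embedded container list), and openEmbedded(), removeFromParent(), and closeContainer() are illustrative names rather than the actual API.

```c
#include <stddef.h>

#define MAX_EMBEDDED 8 /* arbitrary limit for this sketch */

typedef struct Container {
    struct Container *parent;                 /* NULL for a top-level container */
    struct Container *embedded[MAX_EMBEDDED]; /* immediate descendants only */
    int embeddedCount;
    int isOpen;
} Container;

/* At open time, record the child in its parent's list. */
static int openEmbedded(Container *parent, Container *child) {
    if (parent->embeddedCount == MAX_EMBEDDED) return 0;
    child->parent = parent;
    child->embeddedCount = 0;
    child->isOpen = 1;
    parent->embedded[parent->embeddedCount++] = child;
    return 1;
}

/* A closing container removes itself from its parent's list. */
static void removeFromParent(Container *c) {
    Container *p = c->parent;
    if (p == NULL) return;
    for (int i = 0; i < p->embeddedCount; i++) {
        if (p->embedded[i] == c) {
            p->embedded[i] = p->embedded[--p->embeddedCount];
            break;
        }
    }
    c->parent = NULL;
}

/* Close all descendants first, then the container itself. */
static void closeContainer(Container *c) {
    while (c->embeddedCount > 0)
        closeContainer(c->embedded[c->embeddedCount - 1]);
    removeFromParent(c);
    c->isOpen = 0;
}
```

Because each child removes itself from its parent's list as it closes, the loop in closeContainer() terminates once the list is empty, and a container closed earlier is never visited again when an ancestor is closed later.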

Note, the functionality of embedded containers can also be done using dynamic values. However, the Container Manager, not being aware of this use of dynamic values, will not maintain the embedded containers list for it. Thus each dynamic value corresponding to an embedded container must be explicitly "closed" using CMReleaseValue().

When CMDeleteObject() is called, an object is to be deleted. When CMDeleteValue() is called, a value for a property of an object is to be deleted. As mentioned above, the refNums for objects (CMObject, CMProperty, and CMType) are pointers to TOCObjects. Values (CMValue) are pointers to TOCValueHdrs. Thus we cannot truly delete the items (i.e., free their memory) these point to because there is no reliable way to verify that the pointers are valid.

The solution adopted is to put all deleted objects and values on a list of deleted items associated with the container. There are two lists: list 430 for objects pointed to from the TOC control block, and list 408 for deleted values pointed to from the container control block itself.

Note, since object refNums are TOCObjects, and value refNums are TOCValueHdrs, the only things needed on these lists are those data structures. TOCProperties and TOCValues can be freed. The TOCObjects and TOCValueHdrs are flagged as "deleted". Whenever any object or value is passed to the API it is checked for the flag. It is an error to use a deleted item.

CMSetMetaHandler() is called by the user to record metahandler/type name associations. These are maintained in binary tree symbol table 404. The root of this tree is a "global" in the session data. It is not tied to any one container. When a container is opened, a type name is passed. This is used to look it up in the metahandler symbol table. This yields a metahandler function address which in turn is used to get actual handler routine addresses.
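The two-step lookup (type name to metahandler, then metahandler to handler address) can be sketched with a simple binary tree. Everything named below is hypothetical: the MetaNode structure, the setMetaHandler() and findMetaHandler() routines, and the operation names "open" and "close" are assumptions for illustration and do not reproduce the signature of CMSetMetaHandler().

```c
#include <stddef.h>
#include <string.h>

typedef void (*HandlerAddr)(void);                       /* generic handler address */
typedef HandlerAddr (*MetaHandler)(const char *operation); /* maps name to handler */

/* One node of the metahandler symbol table, keyed by type name. */
typedef struct MetaNode {
    const char *typeName;
    MetaHandler metaHandler;
    struct MetaNode *left, *right;
} MetaNode;

/* Insert a caller-supplied node, or replace the metahandler of an
   existing type name. Returns the (possibly new) root of the tree. */
static MetaNode *setMetaHandler(MetaNode *root, MetaNode *node) {
    if (root == NULL) {
        node->left = node->right = NULL;
        return node;
    }
    int cmp = strcmp(node->typeName, root->typeName);
    if (cmp == 0)
        root->metaHandler = node->metaHandler;
    else if (cmp < 0)
        root->left = setMetaHandler(root->left, node);
    else
        root->right = setMetaHandler(root->right, node);
    return root;
}

/* Look up the metahandler registered for a type name, or NULL. */
static MetaHandler findMetaHandler(const MetaNode *root, const char *typeName) {
    while (root != NULL) {
        int cmp = strcmp(typeName, root->typeName);
        if (cmp == 0) return root->metaHandler;
        root = (cmp < 0) ? root->left : root->right;
    }
    return NULL;
}

/* Example metahandler with two made-up operations. */
static void demoOpen(void) {}
static void demoClose(void) {}
static HandlerAddr demoMetaHandler(const char *operation) {
    if (strcmp(operation, "open") == 0) return demoOpen;
    if (strcmp(operation, "close") == 0) return demoClose;
    return NULL; /* operation not provided by this type */
}
```

A NULL result from the metahandler in this sketch corresponds to an operation the type does not supply.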

The following C-language struct defines the layout of all in-memory TOCObjects. The objects are accessed by their object ID.

    __________________________________________________________________________
    .COPYRGT. 1992 Apple Computer, Inc.
    __________________________________________________________________________
    struct TOCObject { /* Layout of a TOC object: */
    CMObjectID  objectID;
                       /* the object's ID (keep first for debugging) */
    ListHdr propertyList;
                       /* list of object property entries */
    struct Container *container;
                       /* ptr to "owning" container control block */
    struct TOCObject *nextObject;
                       /* chain to next object by increasing ID */
    struct TOCObject *prevObject;
                       /* chain to previous object by decreasing ID */
    struct TOCObject *nextTypeProperty;
                       /* chain of next type/property by increasing ID */
    struct TOCObject *prevTypeProperty;
                       /* chain of previous type/property by decr. ID */
    unsigned short objectFlags;
                       /* info flags about the object */
    CMRefCon  objectRefCon;
                       /* user's object refCon */
    unsigned long useCount;
                       /* count of nbr of times "used" */
    struct TOCObject *nextTouchedObject;
                       /* link to next touched object */
    ListHdr touchedList;
                       /* values/properties touched IN this object */
    };
    typedef struct TOCObject TOCObject, *TOCObjectPtr;
    __________________________________________________________________________


The following object flags are defined:

    __________________________________________________________________________
    © 1992 Apple Computer, Inc.
    __________________________________________________________________________
    #define UndefinedObject      0x0001U  /* 1 ==> object created but undefined */
    #define ObjectObject         0x0002U  /*   ==> object is a base object */
    #define PropertyObject       0x0004U  /*   ==> object is a property descriptor */
    #define TypeObject           0x0008U  /*   ==> object is a type descriptor */
    #define DeletedObject        0x0010U  /*   ==> object has been deleted */
    #define DynamicValuesObject  0x0800U  /*   ==> object "owns" dynamic values */
    #define TouchedObject        0x1000U  /*   ==> object has been "touched" */
    #define ProtectedObject      0x2000U  /*   ==> object is locked/protected */
    #define LinkedObject         0x4000U  /*   ==> object linked to master lists */
    #define UndefObjectCounted   0x8000U  /*   ==> object counted as undefined */
    __________________________________________________________________________


Note that the properties 502 and 504 in the object of FIG. 5 are described by property descriptors which are themselves objects which follow the above layout. The layout of each of the properties 502, 504 and 542 is defined as follows:

    __________________________________________________________________________
    © 1992 Apple Computer, Inc.
    __________________________________________________________________________
    struct TOCProperty {
                  /* Layout of a TOC object property: */
    ListLinks propertyLinks;
                  /* links to next/prev property (must be first) */
    TOCObjectPtr theObject;
                  /* ptr to "owning" object */
    CMObjectID propertyID;
                  /* the property's ID */
    ListHdr  valueHdrList;
                  /* list of the property's values */
    };
    typedef struct TOCProperty TOCProperty, *TOCPropertyPtr;
    __________________________________________________________________________


Types, too, are described using objects of the TOCObject form set out above. The structures of TOCValueHdrs 424 and TOCValues are set forth hereinafter.

As previously mentioned, the Container Manager routines CMNewValue() and CMUseValue() create a dynamic value chain for each type that has a "UseValue" and a "NewValue" handler. FIG. 5 illustrates the structure of in-memory objects which are created by the dynamic value mechanism.

Referring to FIG. 5, it is assumed that one of the TOCObjects 420 (FIG. 4) has a series of properties 502, 504, and so on (corresponding to 422 in FIG. 4). It is assumed further that property 502 has two values associated with it, indicated by value headers 506 and 508 (424 in FIG. 4). These values are of different types, as will be seen from the fact that different dynamic value chains are created for these values. Property 504 also has values associated with it, but these are shown only in the abbreviated form of an arrow 510.

Associated with real value header 506 is a segment 512 of real value data, and associated with value header 508 are two segments 514 and 516 of real value data. If the values for the property 502 were not of types which require creation of dynamic values, then the actual data of the values would be stored in segments 512, 514 and 516. Since the types of these values call for dynamic value creation, however, the data stored in real value data segments 512, 514 and 516 may instead be transformed versions of the actual data and/or may contain only indirection information.

The value header structure includes a pointer to the top dynamic value header 518 in a chain of dynamic value headers 518, 520 and 522. Each of the dynamic value headers 518, 520 and 522 has a format which is identical to that of the value header (also called a "real value header") 506, except that the field which, in real value header 506, points to dynamic value header 518 is redefined in dynamic value header 518 to point to a set of dynamic value header extensions 524. The extensions 524 include an entry which points to the base value of the dynamic value header 518, which in the case of this chain is simply the second dynamic value header 520 on the chain. Dynamic value header 520 in turn points to its own dynamic value header extensions 526, whose base value field points to dynamic value header 522. Dynamic value header 522 also points to its dynamic value header extensions 528. But since dynamic value header 522 is at the bottom of the chain, its base value is the real value data stored in segment 512. Thus, the "base value" field of extensions 528 points back to the real value header 506.

Recall that the purpose for creating a chain of dynamic value headers 518, 520 and 522 is to implement a complex value type which transparently handles data transformations and redirections. Each of the dynamic value headers 518, 520 and 522 corresponds to a respective one of the types on the tree defining the complex type of the value headed by real value header 506. Thus, each of the dynamic value headers 518, 520 and 522 maintains its own vector of value handlers to be used when a higher level caller desires to invoke a value operation. These dynamic value vectors are illustrated in FIG. 5 as 530, 532 and 534, pointed to respectively by extensions 524, 526 and 528. The dynamic value vectors 530, 532 and 534 contain a series of pointers to the respective value handlers to be called. The pointers are in predefined locations in the vector; for example, the third entry in each vector contains the pointer to the WriteValueData handler to be called for a value data write operation.

The value header 508 in FIG. 5 is for a value whose type spawned only a single dynamic value header 536. Thus, the value header 508 points to dynamic value header 536, which in turn points to its extensions 538, which in turn points both to a dynamic value vector 540 and, for the base value, back to the value header 508.

When a real value spawns dynamic values, a special dynamic value property 542 is created solely to contain the dynamic value headers. Only the bottommost layer of each dynamic value chain (the layer whose base value is the "real" value) is on the dynamic property chain. Higher layers are not part of the dynamic property chain. The dynamic value property chain is used to simplify deletion of dynamic values, for example when the container is to be closed.

Dynamic value headers never have value segment lists. No data is ever written to a dynamic value because these headers are removed when the value is released using CMReleaseValue(). If there is any data, it must be associated with a "real" value: either the real value associated with the dynamic value, or one stored somewhere else.

In each value header there is a pointer (a union called "dynValueData" with alternative fields called "dynValue" and "extensions") that takes one of three possible values:

1. dynValueData is NULL for "real" value headers that don't have a corresponding dynamic value.

2. dynValueData.dynValue is a pointer to the top-most layer if it is a "real" value that does have a corresponding dynamic value header.

3. dynValueData.extensions is a pointer to the extensions if it is itself a dynamic value header.

The value header's flags determine how to interpret the pointer.
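The three cases, together with the base value links described for FIG. 5, can be sketched as below. The structures are pared-down, hypothetical stand-ins (SketchValueHdr, SketchExt) that keep only the fields needed here; sketchClassify() and sketchFindRealValue() are illustrative helpers, not Container Manager routines. Only the ValueDynamic flag value and the dynValueData field names come from the listings.

```c
#include <assert.h>
#include <stddef.h>

#define ValueDynamic 0x1000U   /* value taken from the valueFlags listing */

struct SketchExt;

/* Hypothetical, reduced value header: flags plus the dynValueData union. */
typedef struct SketchValueHdr {
    unsigned short valueFlags;
    union {
        struct SketchValueHdr *dynValue;   /* real header: top-most layer or NULL */
        struct SketchExt *extensions;      /* dynamic header: its extensions */
    } dynValueData;
} SketchValueHdr;

/* Hypothetical, reduced extensions: only the base value link. */
typedef struct SketchExt {
    SketchValueHdr *baseValue;
} SketchExt;

typedef enum {
    SketchRealPlain,         /* case 1: real value, no dynamic value */
    SketchRealWithDynamic,   /* case 2: real value with a dynamic value chain */
    SketchDynamicLayer       /* case 3: the header is itself a dynamic value */
} SketchHdrKind;

/* The flags decide which union member is meaningful. */
SketchHdrKind sketchClassify(const SketchValueHdr *v)
{
    assert(v != NULL);
    if (v->valueFlags & ValueDynamic)
        return SketchDynamicLayer;
    return (v->dynValueData.dynValue == NULL) ? SketchRealPlain
                                              : SketchRealWithDynamic;
}

/* Follow base value links from any header down to the real value header.
   If layers is non-NULL it receives the number of dynamic layers crossed. */
SketchValueHdr *sketchFindRealValue(SketchValueHdr *v, int *layers)
{
    int n = 0;
    assert(v != NULL);
    while (v->valueFlags & ValueDynamic) {
        v = v->dynValueData.extensions->baseValue;
        n++;
    }
    if (layers != NULL)
        *layers = n;
    return v;
}
```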

When a dynamic value is spawned by CMNewValue() or CMUseValue(), the pointer to the top-most dynamic value header is returned as the refNum. Whenever the user passes that refNum to an API value routine, the routine checks whether the refNum is a dynamic value. If it is, the routine initiates the call to the corresponding value handler using the vector in the extensions. That may cause a search up the base value chain to look for the inherited value routine. In the limit, if no handler is supplied and the "real" value in the chain is reached, the original API value routine is used. That is how data can come to be written to the "real" value.

FIG. 6 illustrates the same structure as FIG. 5 using a simplified notation. This notation will make it easier to describe how dynamic values are spawned and layers created. Here "O" is an object, "P" a property, "VH" a real value header, "DVP" the dynamic value property, and "DVH" a dynamic value header. The value segments are omitted.

As previously mentioned, when a CMNewValue() or CMUseValue() is almost done, a check is made on the value's type, and all of its base types (if any) to see if it has an associated registered metahandler. If it does, it is called with a "use value" operation type to see if a "use value" handler exists for the type. If it does, the dynamic value is spawned. Thus if CMNewValue() or CMUseValue() sees any (base) type that has an associated "use value" handler, it will spawn a dynamic value.

The spawning is done essentially by calling the "use value" handler. It is expected to set up a refCon to pass among the value handlers and a pointer to another metahandler. These are returned to CMNewValue() or CMUseValue() which uses newDynamicValue() to do the actual creation of the dynamic value. The extensions are initialized, the metahandler pointer saved, and the refCon is also saved. The pointer to the created dynamic value header is what CMNewValue() or CMUseValue() returns to the user as the refNum.

When the user attempts to do a value operation using this refNum, a check is made that the refNum is for a dynamic value. If it is, the corresponding handler routine will be called. The vector entries are set on first use of a value operation. It may mean searching up the base value chain, but once found, the handler address is saved in the top layer vector (associated with the refNum) so the search doesn't have to be done again.

Note that if the search must be done up the base value chain, then the dynamic value refNum (pointer), in addition to the handler address, must be saved. This is very much like C++ classes, where inherited methods are called and the appropriate "this" must also be passed. The "this" in this case is the refNum.
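A minimal sketch of this first-use resolution, for a single operation (getting the value size), is given below. SketchValue and its fields, sketchResolve(), sketchGetValueSize(), sketchDoubleSize() and the sketchSearches counter are all hypothetical; in particular, the "provided" field stands in for whatever address a layer's metahandler would return for the operation.

```c
#include <assert.h>
#include <stddef.h>

typedef struct SketchValue SketchValue;
typedef long (*SketchGetSize)(SketchValue *self);

struct SketchValue {
    int isDynamic;           /* stands in for the ValueDynamic flag */
    SketchValue *baseValue;  /* base value link (unused for a real value) */
    SketchGetSize provided;  /* handler this layer supplies, or NULL */
    SketchGetSize handler;   /* cached vector entry: handler address */
    SketchValue *thisValue;  /* cached vector entry: the "this" to pass */
    long realSize;           /* only meaningful for a real value */
};

int sketchSearches = 0;      /* counts chain searches, for demonstration only */

/* On first use, search the base value chain for a layer that supplies the
   operation, and cache both the handler and the layer it belongs to. */
void sketchResolve(SketchValue *top)
{
    SketchValue *v = top;
    if (top->thisValue != NULL)
        return;                          /* resolved on an earlier call */
    sketchSearches++;
    while (v->isDynamic && v->provided == NULL)
        v = v->baseValue;
    top->handler = v->isDynamic ? v->provided : NULL;
    top->thisValue = v;                  /* a dynamic layer, or the real value */
}

/* Stand-in for an API value routine. */
long sketchGetValueSize(SketchValue *value)
{
    assert(value != NULL);
    if (!value->isDynamic)
        return value->realSize;          /* ordinary path for a real value */
    sketchResolve(value);
    if (value->handler != NULL)
        return value->handler(value->thisValue);
    return value->thisValue->realSize;   /* no layer supplied a handler */
}

/* Example handler: reports twice the size of its base value. */
long sketchDoubleSize(SketchValue *self)
{
    return 2 * sketchGetValueSize(self->baseValue);
}
```

With a top layer that supplies no handler and a lower layer that supplies sketchDoubleSize(), the first call searches the chain once; later calls reuse the cached handler and "this" value without searching again.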

Previously there was described a layered type T which was registered in the Container Manager with its two base types Layer1 and Layer2 as follows:

    ______________________________________
    layer1     = CMRegisterType(container, "Layer1");
    layer2     = CMRegisterType(container, "Layer2");
    t          = CMRegisterType(container, "T");
    CMAddBaseType(t, layer1);
    CMAddBaseType(t, layer2);
    ______________________________________


Internally, the t object can be represented as shown in FIG. 7 (using the notation of FIG. 6). The value data segments are shown here with the data the segment will point to in the container.

For the t object 702, the global name property 704 and value 706 are created, as usual, by calling CMRegisterType(). The CMAddBaseType() calls add the base types. These are recorded as the object IDs for each base type, in the order created, as separate value segments 708, 710 for a special "base type" property 712 belonging to the type object 702. The value segments 708, 710 store only the object IDs of the base types; the global names of the base types are stored as values such as 706 in respective type objects similar to 702.

As mentioned above, CMNewValue() or CMUseValue() spawns dynamic values if the original type or any of its base types has an associated "use value" handler. Assume that was done for T in the above example. What happens is that CMNewValue() or CMUseValue() will look at its type object (t here) to see if the base type property is present. If it is, it will follow each type "down" to leaf types using a depth-first search.

In the example, layer1 will be visited, then layer2, and finally the original type T itself. If the layer1 type object had base types of its own, they would be visited before using layer1 itself. Hence the depth-first search down to the leaf types.
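The visiting order can be illustrated with a short recursive sketch. SketchType, SKETCH_MAX_BASES and sketchVisitTypes() are hypothetical; the real type objects record base types as object IDs in the "base type" property rather than as direct pointers.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_BASES 4   /* arbitrary limit for this sketch */

/* Hypothetical stand-in for a type object and its base type property. */
typedef struct SketchType {
    const char *name;
    const struct SketchType *bases[SKETCH_MAX_BASES];
    int baseCount;
} SketchType;

/* Depth-first visit: every base type (and its own bases) is visited before
   the type itself. Visited types are appended to out[]; the new count is
   returned. Types beyond capacity are silently dropped. */
int sketchVisitTypes(const SketchType *t, const SketchType **out,
                     int count, int capacity)
{
    int i;
    assert(t != NULL && out != NULL);
    for (i = 0; i < t->baseCount; i++)
        count = sketchVisitTypes(t->bases[i], out, count, capacity);
    if (count < capacity)
        out[count++] = t;
    return count;
}
```

For T with base types Layer1 and Layer2, the resulting order is Layer1, Layer2, T; if Layer1 had a base type of its own, that base type would come first.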

For each type processed, if it has a "use value" handler of its own, it will be called to get a refCon and value handler metahandler. These are passed to newDynamicValue() to create a dynamic value for the original "real" value. newDynamicValue() always returns the refNum of the dynamic value it created. The first layer will create the dynamic value property and put the dynamic value header on its value header list. All further calls to newDynamicValue() will pass the most recent refNum returned from it. newDynamicValue() then chains these off the first dynamic value header. This produces the desired layering result.
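The layering that results from repeated calls can be sketched as follows. This is a simplified model, not the actual newDynamicValue(): SketchHdr, its fields and sketchNewDynamicValue() are hypothetical, the refCon, metahandler and dynamic value property are omitted, and the assumption that each call updates the real header's top-layer pointer is ours.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, reduced value header. */
typedef struct SketchHdr {
    int isDynamic;                  /* stands in for the ValueDynamic flag */
    unsigned long typeID;           /* type that spawned this layer */
    struct SketchHdr *baseValue;    /* dynamic layers: next layer down */
    struct SketchHdr *topDynValue;  /* real header: top-most dynamic layer */
} SketchHdr;

/* Create one dynamic value layer on top of base. base is the real value on
   the first call and the most recently returned layer on later calls.
   Returns the new layer, or NULL if allocation fails; the caller frees it. */
SketchHdr *sketchNewDynamicValue(SketchHdr *base, unsigned long typeID)
{
    SketchHdr *real = base;
    SketchHdr *layer;
    assert(base != NULL);
    layer = calloc(1, sizeof *layer);
    if (layer == NULL)
        return NULL;
    layer->isDynamic = 1;
    layer->typeID = typeID;
    layer->baseValue = base;
    while (real->isDynamic)         /* find the real value under the chain */
        real = real->baseValue;
    real->topDynValue = layer;      /* newest layer becomes the top */
    return layer;
}
```

Calling this once per visited type, each time passing the previous result, leaves the layer for the original type on top and the layer for the first visited base type directly above the real value.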

The following C-language code defines TOCValue, the format of one of the TOCValue data segments 426 (FIG. 4) or 512, 514, 516 (FIG. 5):

    __________________________________________________________________________
    © 1992 Apple Computer, Inc.
    __________________________________________________________________________
    union TOCValueBytes {      /* Layout of value/length fields: */
    struct {                   /* value if not immediate (not explicitly
                               here): */
    CM_ULONG value;            /* offset to value if not immediate */
    CM_ULONG valueLen;         /* value length if not immediate */
    } notImm;
    struct {                   /* value for a global name */
    CM_ULONG offset;           /* offset to value in container */
    struct GlobalName *globalNameSymbol;
                               /* ptr value for a global name (in memory) */
    } globalName;
    union {                    /* actual value if immediate placed here: */
    CM_UCHAR ucharsValue[2*sizeof(CM_ULONG)];
                               /* value if immediate unsigned char(s) */
    CM_ULONG ulongValue;
                               /* value if immediate unsigned long */
    CM_USHORT ushortValue;
                               /* value if immediate unsigned short */
    CM_UCHAR ubyteValue;
                               /* value if immediate unsigned byte */
    void *ptrValue;            /* value if immediate pointer */
    } imm;
    };
    typedef union TOCValueBytes TOCValueBytes, *TOCValueBytesPtr;
    struct TOCValue {          /* Layout of a TOC type's value: */
    ListLinks valueLinks;      /* links to next/prev value (must be first)
                               */
    struct TOCValueHdr *theValueHdr;
                               /* ptr back to ValueHdr "owning" this value
                               */
    ContainerPtr container;    /* ptr to "owning" container control block */
    CMValueFlags  flags;       /* flags */
    TOCValueBytes  value;      /* value and length or immediate value */
    unsigned long LogicalOffset;
                               /* original (unedited) logical offset */
    };
    typedef struct TOCValue TOCValue, *TOCValuePtr;
    enum ConstValueType {      /* Data types used to copy data into
                               TOCValue's: */
    Value_NotImm,              /* not immediate ==> value and valueLen */
    Value_GlobalName,          /* global name ptr ==> in-memory str ptr */
    Value_Imm_Chars,
                               /* immediate, chars ==> ucharsValue */
    Value_Imm_Long,
                               /* immediate, long ==> ulongValue */
    Value_Imm_Short,
                               /* immediate, short ==> ushortValue */
    Value_Imm_Byte
                               /* immediate, byte ==> ubyteValue */
    };
    typedef enum ConstValueType ConstValueType;
    __________________________________________________________________________


The following C-language code defines the format of a Value Header (both real value headers and dynamic value headers).

    __________________________________________________________________________
    © 1992 Apple Computer, Inc.
    __________________________________________________________________________
    struct TOCValueHdr {
                       /* Layout of a TOC property type: */
    ListLinks valueHdrLinks;
                       /* Links to next/prev value hdr (must be first) */
    struct TOCProperty *theProperty;
                       /* ptr to "owning" property */
    ListHdr valueList; /* list of actual values */
    CMObjectID typeID; /* the value's type ID */
    ContainerPtr container;
                       /* ptr to "owning" container control block */
    unsigned long size;
                       /* total current size of the value data */
    unsigned long logicalSize;
                       /* original (unupdated) size of the value data */
    unsigned short valueFlags;
                       /* flags indicating stuff about the value */
    CMGeneration generation;
                       /* generation number */
    unsigned long useCount;
                       /* count of nbr of times "used" */
    CMRefCon valueRefCon;
                       /* user's value refCon */
    TouchedListEntryPtr touch;
                       /* ptr to updating touched list entry */
    union {            /* this field depends on kind of value hdr: */
    struct TOCValueHdr *dynValue;
                       /* ptr to dynamic value hdr or NULL */
    struct DynValueHdrExt *extensions;
                       /* ptr to dynamic value hdr extensions */
    } dynValueData;    /* [extensions only when it's a dynamic value] */
    union {            /* references recorded by this value */
    TOCObjectPtr refDataObject;
                       /* associated ref object; NULL if no refs */
    ListHdrPtr refShadowList;
                       /* or shadow list of the actual data */
    } references;      /* (refShadowList only in recording value) */
    };
    typedef struct TOCValueHdr TOCValueHdr, *TOCValueHdrPtr;
    struct DynValueHdrExt {
                       /* Extensions to TOCValueHdr for a dynamic value: */
    TOCValueHdrPtr  baseValue;
                       /* ptr to base value of this dynamic value */
    DynamicValueVector dynValueVector;
                       /* dynamic value handler vector */
    CMMetaHandler metaHandler;
                       /* metahandler to get handler addresses*/
    };
    typedef struct DynValueHdrExt DynValueHdrExt, *DynValueHdrExtPtr;
    /* Some of the following valueFlags echo the flags field of a TOCValue entry.
    That is because a CMValue "refNum" that an API user is given and in turn
    gives back to us is a pointer to a TOCValueHdr. It is sometimes more
    convenient therefore to check the kind of value we have by looking at the
    header than by "going out" to the value. In all but continued values there
    is only one TOCValue entry on the valueList anyway. So echoing is more
    efficient than always going after the tail or head (they're the same) of a
    valueList just to see the flags and the kind of value. */
    #define ValueDeleted       0x0001U  /* valueFlags: 1 ==> deleted value */
    #define ValueContinued     0x0002U  /*   ==> continued */
    #define ValueGlobal        0x0004U  /*   ==> global name */
    #define ValueImmediate     0x0008U  /*   ==> immediate value */
    #define ValueOffPropChain  0x0800U  /*   ==> dynamic value off prop chain */
    #define ValueDynamic       0x1000U  /*   ==> dynamic value */
    #define ValueUndeletable   0x2000U  /*   ==> can't be deleted */
    #define ValueProtected     0x4000U  /*   ==> locked/protected value */
    #define ValueDefined       0x8000U  /*   ==> fully defined (in read only) */
    /* ValueUndeletable and ValueProtected are levels of protection bits. */
    /* In order to make dealing with dynamic values easier, the following
    macros are provided. IsDynamicValue(v) is a more self-documenting test to
    see if a TOCValueHdr is indeed a dynamic value, while DYNEXTENSIONS(v)
    allows simpler notational access to a dynamic value header's extension
    fields. */
    #define IsDynamicValue(v) ((((TOCValueHdrPtr)(v))->valueFlags & ValueDynamic) != 0)
    #define DYNEXTENSIONS(v)  /* to make access to extensions a "little" easier */ \
    (((TOCValueHdrPtr)(v))->dynValueData.extensions)
    /* The dynamic value vectors are defined as follows */
    struct DynamicValueVectorEntries {
                      /* Layout of a dynamic value vector entry: */
    CMHandlerAddr handler;
                      /* the handler address */
    CMValue thisValue;
                      /* the handler's value (C++ "this") */
    Boolean active;   /* true ==> handler is in calling chain */
    };
    typedef struct DynamicValueVectorEntries DynamicValueVectorEntries,
             *DynamicValueVectorEntriesPtr;
    struct DynamicValueVector {
    DynamicValueVectorEntries  cmGetValueSize;
    DynamicValueVectorEntries  cmReadValueData;
    DynamicValueVectorEntries  cmWriteValueData;
    DynamicValueVectorEntries  cmInsertValueData;
    DynamicValueVectorEntries  cmDeleteValueData;
    DynamicValueVectorEntries  cmGetValueInfo;
    DynamicValueVectorEntries  cmSetValueType;
    DynamicValueVectorEntries  cmSetValueGen;
    DynamicValueVectorEntries  cmReleaseValue;
    };
    typedef struct DynamicValueVector DynamicValueVector;
    __________________________________________________________________________


When a handler is called, it is expected to do its operations on the "base value" of the value passed to it. It gets its base value using CMGetBaseValue(). However, we don't want to allow recursive use of the API for the same value. That would call the handler again and we would be in an infinite loop. Thus the active switch in each vector entry is provided so that this situation can be detected and reported as an error.
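The role of the active switch can be sketched as below. SketchVectorEntry, sketchInvoke(), the two example handlers and the error codes are hypothetical; the actual vector entry also carries thisValue, and the actual API reports the error through its own mechanism.

```c
#include <assert.h>
#include <stddef.h>

typedef struct SketchVectorEntry SketchVectorEntry;
typedef int (*SketchHandler)(SketchVectorEntry *entry);

/* Hypothetical, reduced vector entry: handler address plus active switch. */
struct SketchVectorEntry {
    SketchHandler handler;
    int active;              /* true ==> handler is in calling chain */
};

#define SKETCH_OK            0
#define SKETCH_ERR_RECURSIVE (-1)

/* Stand-in for an API value routine dispatching to a dynamic value handler.
   A second entry for the same value while the handler is still running is
   refused instead of looping forever. */
int sketchInvoke(SketchVectorEntry *entry)
{
    int result;
    assert(entry != NULL && entry->handler != NULL);
    if (entry->active)
        return SKETCH_ERR_RECURSIVE;
    entry->active = 1;
    result = entry->handler(entry);
    entry->active = 0;
    return result;
}

/* A well-behaved handler: it would operate on its base value here. */
int sketchGoodHandler(SketchVectorEntry *entry)
{
    (void)entry;
    return SKETCH_OK;
}

/* A misbehaving handler: it re-enters the API for the same value. */
int sketchBadHandler(SketchVectorEntry *entry)
{
    return sketchInvoke(entry);
}
```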

The dynamic value vector is initialized with each handler address and thisValue set to NULL. On first use we use the metahandler which was returned from the "use value" handler (the metahandler address is saved in the value header extensions) to get the proper value handler address. It is saved in the handler field of the vector entry. Remember that we may have to search up through a dynamic value chain to find an "inherited" value handler operation. Thus the handler used may correspond to a different dynamic value. We must therefore save the dynamic value refNum along with the handler address (in the thisValue field). It is similar to the C++ "this" pointer for the value handler operation.

Of course, in the simplest case, where the handler is provided for the original value, the thisValue will point to its own dynamic value header. At the other extreme no handlers are supplied for the operation and we end up using the "real" value that spawned the dynamic value(s). In that case the handler pointer in the vector entry remains NULL and the thisValue will be the "real" value refNum. With no handler we use the actual API routine to process the real value.

As with standard handlers, to simplify this description, some macros are defined for calling the dynamic value handlers. These macros will require the following typedefs as casts to convert the generic handler typedef, HandlerAddr (the type used to store the addresses in the vector), to the actual function type:

    __________________________________________________________________________
    © 1992 Apple Computer, Inc.
    __________________________________________________________________________
    typedef CMSize (*TcmGetValueSize)(CMValue value);
    typedef CMSize (*TcmReadValueData)(CMValue value, CMPtr buffer, CMCount offset,
       CMSize maxSize);
    typedef void (*TcmWriteValueData)(CMValue value, CMPtr buffer, CMCount offset,
       CMSize size);
    typedef void (*TcmInsertValueData)(CMValue value, CMPtr buffer, CMCount offset,
       CMSize size);
    typedef void (*TcmDeleteValueData)(CMValue value, CMCount offset, CMSize size);
    typedef void (*TcmGetValueInfo)(CMValue value, CMContainer *container,
       CMObject *object, CMProperty *property, CMType *type,
       CMGeneration *generation);
    typedef void (*TcmSetValueType)(CMValue value, CMType type);
    typedef void (*TcmSetValueGen)(CMValue value, CMGeneration generation);
    typedef void (*TcmReleaseValue)(CMValue);
    __________________________________________________________________________


Here now are the macros used to do the calls using the vector.

    __________________________________________________________________________
    © 1992 Apple Computer, Inc.
    __________________________________________________________________________
    #define CMDynGetValueSize(v) \
    (*(TcmGetValueSize)DYNEXTENSIONS(v)->dynValueVector.cmGetValueSize.handler) \
    ((CMValue)(v))
    #define CMDynReadValueData(v, b, x, m) \
    (*(TcmReadValueData)DYNEXTENSIONS(v)->dynValueVector.cmReadValueData.handler) \
    ((CMValue)(v), (CMPtr)(b), (CMCount)(x), (CMSize)(m))
    #define CMDynWriteValueData(v, b, x, n) \
    (*(TcmWriteValueData)DYNEXTENSIONS(v)->dynValueVector.cmWriteValueData.handler) \
    ((CMValue)(v), (CMPtr)(b), (CMCount)(x), (CMSize)(n))
    #define CMDynInsertValueData(v, b, x, n) \
    (*(TcmInsertValueData)DYNEXTENSIONS(v)->dynValueVector.cmInsertValueData.handler) \
    ((CMValue)(v), (CMPtr)(b), (CMCount)(x), (CMSize)(n))
    #define CMDynDeleteValueData(v, x, n) \
    (*(TcmDeleteValueData)DYNEXTENSIONS(v)->dynValueVector.cmDeleteValueData.handler) \
    ((CMValue)(v), (CMCount)(x), (CMSize)(n))
    #define CMDynGetValueInfo(v, c, obj, p, t, g) \
    (*(TcmGetValueInfo)DYNEXTENSIONS(v)->dynValueVector.cmGetValueInfo.handler) \
    ((CMValue)(v), (CMContainer*)(c), (CMObject*)(obj), (CMProperty*)(p), \
    (CMType*)(t), (CMGeneration*)(g))
    #define CMDynSetValueType(v, t) \
    (*(TcmSetValueType)DYNEXTENSIONS(v)->dynValueVector.cmSetValueType.handler) \
    ((CMValue)(v), (CMType)(t))
    #define CMDynSetValueGen(v, g) \
    (*(TcmSetValueGen)DYNEXTENSIONS(v)->dynValueVector.cmSetValueGen.handler) \
    ((CMValue)(v), (CMGeneration)(g))
    #define CMDynReleaseValue(v) \
    (*(TcmReleaseValue)DYNEXTENSIONS(v)->dynValueVector.cmReleaseValue.handler) \
    ((CMValue)(v))
    __________________________________________________________________________


As mentioned earlier, each corresponding API value operation must check to see if it has a dynamic value and, if so, call the corresponding handler which does the operation. It must get the proper handler address on first use. It must set switches to mark the handler as active. It must also set a switch to allow CMGetBaseValue() calls, which are only allowed from dynamic value handlers. Thus the algorithm for calling a value handler looks something like this (ignoring all errors for the moment):

    ______________________________________
    © 1992 Apple Computer, Inc.
    ______________________________________
    if (IsDynamicValue(v)) {
    v = GetDynHandlerAddress(v, h, g);
    if (IsDynamicValue(v)) {
    SignalDynHandlerInUse(v, h);
    AllowCMGetBaseValue(container);
    Call the proper dynamic value handler with one of the
    above macros definitions. The macro will pass the
    appropriate value corresponding to a possibly in