Modular scalable system for managing data in a heterogeneous environment with generic structure for control repository access transactions

6654747

Abstract

A Data Management System has a plurality of data managers and is of a layered architecture. The system performs, with a data manager and with user input via an API, a plurality of processes on data residing in heterogeneous data repositories of said computer system, including promotion, check-in, check-out, locking, library searching, setting and viewing process results, tracking aggregations, and managing parts, releases and problem fix data under management control of a virtual control repository having one or more physical heterogeneous repositories. The system provides for storing, accessing, and tracking data residing in said one or more data repositories managed by the virtual control repository. User Interfaces provide a combination of command line, scripts, GUI, Menu, Web Browser, and other interactive means which maps the user's view to a PFVL paradigm. Configurable Managers include a query control repository for existence of peer managers and provide logic switches to dynamically interact with peers. A control repository access layer provides a common process interface across all managers, which utilizes a virtual table paradigm to standardize communication with the control repository. Command translators map the generic control repository accesses into the appropriate format for interfacing with the underlying physical embodiment of the control repository.

Claims

What is claimed is:

Description

Trademarks: S/390 and IBM are registered trademarks of International Business Machines Corporation, Armonk, N.Y., U.S.A. Other names may be registered trademarks or product names of International Business Machines Corporation or other companies.
Package    An arbitrary grouping of data objects that has some relationship or common bond with each other. Each package contains one or more variances.
Variance   One or more objects within a package that, when combined with the remaining objects in the same Variance or from one or more dependent Variances, comprise a coherent and meaningful collection of objects.
Level      A collection of objects, within a Variance, that have achieved some arbitrary degree of quality.
Filetype   A collection of objects sharing the same data type or format.
Version    An iteration of a data object.
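As a reading aid, the attributes defined above can be pictured as a small record type. The following Python sketch is only an illustration under assumed names: the classes PFVL and DataObject, their field names, and the sample values are hypothetical and are not taken from the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PFVL:
    """Hypothetical key naming where an object lives in the DMS."""
    package: str
    filetype: str
    variance: str
    level: str


@dataclass(frozen=True)
class DataObject:
    """One iteration (Version) of a named object at a given PFVL."""
    name: str
    pfvl: PFVL
    version: int


# The simplest case: a single Version of a single Filetype at a single
# Level within a single Variance of a single Package (sample values only).
example = DataObject("mpeg", PFVL("dsgn_lib", "schematic", "USB", "Test"), version=1)
```

Making the records immutable lets a PFVL serve directly as a dictionary key, which is one plausible way to index tables by these attributes.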
As an example, FIG. 2A depicts Package "A" (20) comprised of two Variances. Within each Variance are one or more data objects (21) of a given Filetype, residing at one or more Levels, with one or more Versions of the object. In the simplest case, a single Version of a single Filetype exists at a single Level within a single Variance of a single Package. Our invention achieves tremendous flexibility by allowing any of these attributes to be expanded n ways. By varying the dimensions of the cube, and the number of cubes in the Package, one can create a DMS capable of managing data in almost any environment.

The present invention also permits Packages to be arranged hierarchically. This is illustrated at the bottom of FIG. 2A where Package "A" (20) is embedded within a higher level Package (22). The higher level Package may also contain its own data objects (21) as shown in the figure. This is possible because each Package in the hierarchy has its own set of PFVL attributes. For example, a printed circuit board could be considered a high level Package comprised of various ASICs, resistors, capacitors and connectors. The ASICs on the board could be considered Packages themselves, where each ASIC Package is comprised of the underlying circuit designs.

FIG. 2B contemplates two examples of how the PFVL Paradigm can be implemented in actual applications. The first table (23) demonstrates a typical electrical engineering design environment comprised of design objects dispersed in the DMS. The primary design object is an MPEG design consisting of multiple versions of a schematic residing in the "dsgn_lib" design library. This library also contains a VHDL object for the MPEG design. It can also be seen that the dsgn_lib library contains two Levels, Test and Prod. Versions of the MPEG schematic simultaneously exist at both Levels. Most of the objects are classified under the Universal Serial Bus (USB) Variance, except for a PCI Variant of the MPEG schematic.
Our invention allows Variances to be completely independent or dependent upon other Variances. In this example, if the PCI Variance is based on the USB Variance, then all objects in the USB Variance can be picked up and used in the PCI Variance, unless they need to be modified. DMS Table 23 also illustrates an additional object, the Bus Controller, which also resides in the PCI Variance of the dsgn_lib library. Finally, the diagram illustrates an MPEG Layout which resides in a separate Package known as the Circuits library. The second DMS Table (24) in FIG. 2B shows how the same PFVL paradigm can be used to track objects and sub-assemblies in an automotive environment. In this case, Packages are used to denote the Cooling and Engine sub-assemblies as well as the Electro-Mechanical main assembly. Within each sub-assembly are one or more components described in the form of schematics, layouts and VHDL, and residing at quality levels QA1, and QA2. Also, some components exist under distinct Variances in order to accommodate two different automobile models. Returning to the overall architectural diagram identified as FIG. 1, the top layer is the User Interface Layer (10). This layer makes possible such scenarios as sharing electrical and mechanical design information by acting as an environmental adapter. An example of such an adaptation is present in a large electronic design organization where several design groups need to share data among several libraries. A common DMS application in this scenario would be a Check-In operation which allows data to enter the DMS from a user's private work space. Since the DMS accommodates several design groups using numerous libraries, the DMS Check-In application's API requires one of the invocation parameters to be the Package. 
If the methodology requires all the designers on a team to check their data into a single library, the User Interface Layer may employ a local "wrapper" or user utility which only requires the user to enter the name and type of design object being checked in. This wrapper then passes this information to the DMS Check-In application. It also supplies the sole library name as the Package as well as a hard-coded Level and Variance. To further demonstrate the advantage of the User Interface Layer, consider a second design group which also uses the same DMS to manage their data. Unlike the first design team, this one designs sub-assemblies in which each sub-assembly is treated as a Package. Since this team requires access to multiple packages, their Check-In function may consist of a "wrapper" in the User Interface Layer which invokes a menu that permits the user to specify a Sub-Assembly name. The wrapper then calls the same DMS Check-In application used by the aforementioned design group. However, this wrapper passes the Sub-Assembly name as the Package rather than hard-coding it like the first wrapper. One skilled in the art could easily envision how the User Interface Layer can employ several methods such as, but not restricted to, wrappers, shell scripts, batch files, command line interfaces, graphical user interfaces, web browsers, menus, or voice activated systems, which would be customized to the user's environment or methodology. The advantage to this approach is it allows different methodologies or processes to utilize the same underlying Data Management System. In addition, if an existing methodology changes, the underlying DMS functions remain intact. Only the functions in the User Interface Layer need to be modified to accommodate the new methodology. Returning to FIG. 1, our preferred embodiment contemplates the use of several layers which comprise the core architecture of the DMS. Spanning three of the layers are the DMS Managers (11). 
These are comprised of a plurality of functions, some of which belong to the DMS Application, Client/Server and Control Repository Access layers. By grouping these functions into isolated Managers with standardized interfaces, a great deal of modularity is achieved. Furthermore, these functions can be combined to form larger, more complex, applications. Consider the following portion of an example promotion application which illustrates one way to deploy a modular DMS:

    if (Lock_Manager_Installed) {
        query Control Repository for any locks that exist on the file
        if (locks_exist) fail the promote
    }
    if (Authority_Manager_Installed) {
        query Cntl Repos to see if user has authority to do the promote
        if (user_not_authorized) fail the promote
    }
    if (Process_Manager_Installed) {
        query Cntl Repos to see if any Library Processes need to run
        if (library_processes_exist) invoke them and wait for completion
    }
    Check Promotion Criteria
    Tell Control Repository to update level of the file
    Perform update to Data Repos (move file, update link, etc.)

Within each code branch one or more Manager functions are invoked to perform the necessary DMS operations. By combining these functions together in an algorithmic way, one can achieve highly complex DMS applications. Furthermore, one can see how modularity can be achieved using the if statements to test the Control Repository for existence of a particular Manager. This permits Managers to be installed or configured in a "plug-n-play" manner simply by setting switches in the Control Repository.

One could also envision an alternate embodiment where all the functions within each manager are compiled into independent objects. A DMS vendor or supplier could then construct customized DM systems based on the customer's needs, simply by linking together the required modules.
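The promotion fragment above can also be sketched as runnable code. The Python below is a minimal illustration, not the disclosed implementation: the dictionary layout standing in for the Control Repository, the key names, and the PromoteError exception are all assumptions made for the example, and the promotion criteria check is reduced to "a higher level exists".

```python
class PromoteError(Exception):
    """Raised when a promotion must fail."""


def promote(file_id, user, control_repo, data_repo):
    """Promote one file, consulting only the Managers that are installed.

    control_repo is a hypothetical dict standing in for the Control
    Repository; data_repo is a dict standing in for the Data Repository.
    """
    installed = control_repo["installed_managers"]

    # Lock Manager branch: any lock on the file fails the promote.
    if "lock" in installed and control_repo["locks"].get(file_id):
        raise PromoteError("file is locked")

    # Authority Manager branch: the user must hold promote authority.
    if "authority" in installed and user not in control_repo["promoters"]:
        raise PromoteError("user not authorized")

    # Process Manager branch: run each Library Process to completion.
    if "process" in installed:
        for process in control_repo["library_processes"].get(file_id, []):
            process(file_id)

    # Simplified promotion criteria: a next level must exist.
    levels = control_repo["levels"]  # ordered, lowest first
    index = levels.index(control_repo["file_level"][file_id])
    if index + 1 >= len(levels):
        raise PromoteError("already at highest level")
    target = levels[index + 1]

    control_repo["file_level"][file_id] = target  # update the level
    data_repo[file_id] = target  # stands in for moving the file or a link
    return target
```

Removing "lock" from the installed set is enough to skip the lock check, which mirrors the switch-based configuration described above.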
For example, customer "A" may only require basic data management services so the DMS provider would only link the object code from the Library, Package and Lock Managers into a "lite" version of the DMS. Customer "B", on the other hand, may require use of applications involving aggregations (configurations) and Library Processing. This customer's DMS would link the object code from the Library, Package, Lock, Aggregation and Process Managers. Regardless of the implementation method, one skilled in the art can clearly envision the advantages afforded by such a system since enhancements or changes to functions in one Manager don't require the entire DMS to be recompiled or redistributed.

FIG. 1 also depicts the DMS Applications layer (12) which contains all the standard utilities that a user needs in order to interact with the DMS. This includes things like Check-In, Check-Out, Promotion, Locking, Library Searching, creating and tracking an aggregation or configuration, and setting or viewing process results. These utilities are described further in this disclosure as either functions residing within a particular Manager, or applications which consist of one or more functions, confined to a single Manager or involving a plurality of Managers. All functions and applications within this layer follow a consistent, standardized Application Program Interface which allows them to remain isolated from any user environment or methodology. This feature of the invention allows a single DMS to be deployed through several user groups performing similar or disparate work, yet having the need to share data between them. In the preferred embodiment, all functions and applications communicate with the Control and Data Repositories through the Client/Server Interface (13) layer.
This is an expandable or contractible layer designed to allow either communication between the various layers in a client-only environment or between clients and one or more servers existing anywhere in a global enterprise. The same set of Manager functions, DMS applications and Control Repository Access routines are utilized regardless of the client/server topology. All communication into the Client/Server interface layer is directed to either the Control Repository Access Layer (14) or the Data Repository (15).

The Control Repository Access Layer consists of one or more "transactions" which perform simple or complex operations against the Control Repository (CR) itself. These can typically be categorized as adding information to the CR, modifying existing information in the CR, deleting information from the CR, or extracting (and potentially filtering) information out of the CR. Regardless of the type of operation, all transactions in this layer are written as if the Control Repository is a single virtual repository consisting of tables organized around the PFVL paradigm. This approach allows different physical implementations of the Control Repository. It even permits a plurality of physically different implementations to appear as a single virtual Control Repository.

Our invention further contemplates a virtual Data Repository (15) comprised of one or more physical repositories. The underlying repositories can be a simple file management system such as the Distributed File System (DFS) or a simple directory structure organized on a hard or floppy disk. Correspondingly, the data repository could be constructed using proprietary or commercially available storage engines or PDM products such as RCS, Sherpa, MetaPhase, SCCS, CMVC, and ClearCase. Furthermore, the present invention permits Automated Library Machines to be employed as Data Repositories. As shown in FIG.
1, all communication with the Data Repository is performed through the Client/Server Interface layer, which permits the Data Repository to be locally accessible to the client, or distributed anywhere in the global enterprise on a remotely accessible server.

FIG. 3 depicts a complex Data Repository comprised of Data Repository "A" (30) which is a simple Unix directory where the files in the DMS may reside. One skilled in the art can see how a similar structure can be employed on other file systems such as DOS, Windows NT, Linux, etc. Additional data may be stored in Data Repository "B" (31) which is a commercially available PDM such as RCS or Sherpa. Although these storage engines automatically handle revision control whenever a user checks data into or out of the system, the preferred embodiment maintains its own unique file identifier in the form of a File Reference number within the Control Repository. The main reason for this is that it allows all data in the DMS to be tracked in a similar fashion regardless of the physical storage method employed. Furthermore, if the data ever needs to be transplanted from one storage engine to a completely different one, the operation can be accomplished by checking the data out of the old storage engine, checking it into the new one, and updating the associated Control Repository table which maps the File Reference number into a revision number. Since all information associated with the object is tracked by PFVL and File Reference number, the information is kept completely intact even if the old and new storage engines use completely different revision control methods. One can also envision a simpler alternate embodiment wherein the revision number of the commercial storage engine plays the role of the File Reference number.

Returning to FIG. 3, Data Repository "C" (32) could be a physical location on a server accessible via a Universal Resource Locator (URL) on the World Wide Web (WWW).
Although all data in this system is stored using a variety of means, the PFVL Paradigm serves as the common storage model such that any client (33) can interact with the data. Furthermore, data is directed to the appropriate Data Repository through the use of the Data Repository Table (34). It clearly illustrates how the PFVL attributes can be used in any combination to segregate the data into one or more physical repositories. For example, all VHDL in the MPEG design library will be stored in Repository "B" which represents one of the commercial revision control engines such as RCS or Sherpa. Wiring Layouts for the Rel_1 Level of the Base Variant of the MPEG design library are stored in a DFS directory represented by Repository "A", and customer documentation for the MPEG design is stored in a publicly accessible URL on the World Wide Web (WWW) represented by Repository "C."

One skilled in the art will also note that the use of wildcards in conjunction with the PFVL attributes permits a great deal of granularity in storage partitioning. The example shows a wildcard (*) in the Filename field, but this could also be filled in with a specific file or a family of files matching a certain pattern. Additional fields could also be added to the table such as a Version field to allow data to be physically segregated by revision number, File Reference numbers, or any pattern of said version control mechanisms. This approach offers the advantage of being able to not only use different storage methods for different types of data, but also solves problems associated with large, incompressible files filling up physical storage media. This problem is prevalent in many commercially available data management systems which require either entire libraries or entire releases of data to be physically stored using the same means under a common directory structure. Returning to FIG.
1, the bottom of the diagram shows the Control Repository (17) which can be implemented using a multitude of methods, including, but not limited to, Table Formatted Files, Relational or Object Oriented Databases, or Meta-Data files in any format. Our invention also permits one or more of the above implementations to be used simultaneously to comprise a single virtual Control Repository. Regardless of the physical implementation of the Control Repository, all information is organized under the PFVL paradigm such that any entry in the repository directly or indirectly maps to one or more PFVLs. This permits users to access information about any object residing in any Package or library, at any Level or Variance regardless of whether that piece of information exists in a relational database, a simple ASCII file or a binary encoded MetaData file. Information can be freely reorganized or transplanted between different Control Repository implementations without the need to modify any DMS Applications, Manager functions or Control Repository Access transactions. Tables support underlying Manager functions and DMS Applications.

A key player in enabling the aforementioned feature is the set of Command Translators (16) which interface between the Control Repository Access Layer and the Control Repository (17). Each physical implementation of the Control Repository would employ a unique Command Translator to map the generic Control Repository Access transactions into the appropriate command to satisfy the physical repository. Our invention contemplates the use of any syntax structure for the Control Repository Access (CRA) transactions. The syntax can be chosen to accommodate the physical embodiment of the DMS. The only requirement is that the syntax adheres to the PFVL paradigm.
For example, in a homogeneous environment where the entire Control Repository is implemented as a relational database, the CR Access transaction syntax might be structured in a manner similar to SQL commands. Thus, only a minor translation may be required prior to interfacing with the relational database. On the other hand, a heterogeneous environment with several physical implementations of the Control Repository may employ a much more generic CRA syntax based on a flexible programming structure better suited to multiple translations.

In a similar manner to the Data Repository, this approach also enables a great deal of flexibility in upgrading the Control Repositories or permitting data from disparate sources to appear as one logical repository. For example, a SQL database may be employed as the primary Control Repository which includes all information necessary to track each object in the DMS by File Reference, PFVL, physical location, etc. This repository may also contain a Part Number table for all the manufactured pieces of a product. Off to the side might exist a Lotus Notes database containing service call or defect repair information organized by Part Number for the same product. Our invention would allow Control Repository Access transactions to be written, using an identical generic syntax, to extract design information about the part from the SQL database and repair actions from the Lotus Notes database. This permits someone with no knowledge of the underlying Control Repository structure to write a DMS Application to invoke said functions and create a customized report containing information from both databases. The Command Translators would be responsible for mapping the generic transactions for the design information into a true SQL query, and the repair action transaction into a Notes extraction.
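To make the role of the Command Translators more concrete, the sketch below shows one generic transaction being mapped into two different physical forms. It is a hedged illustration only: the Transaction class, the SqlTranslator and RecordTranslator names, and the use of "*" as an "any value" marker on whole attributes are assumptions for the example. The second translator targets an in-memory list of records as a stand-in for any non-relational store; it does not model the Lotus Notes interface.

```python
from dataclasses import dataclass

PFVL_FIELDS = ("package", "filetype", "variance", "level")


@dataclass(frozen=True)
class Transaction:
    """Generic Control Repository Access request (hypothetical shape)."""
    table: str
    package: str = "*"
    filetype: str = "*"
    variance: str = "*"
    level: str = "*"

    def criteria(self):
        """Return only the PFVL attributes that are not whole-field wildcards."""
        return {name: getattr(self, name) for name in PFVL_FIELDS
                if getattr(self, name) != "*"}


class SqlTranslator:
    """Maps a generic transaction into a parameterized SQL query."""

    def translate(self, txn):
        # Table names cannot be bound as parameters, so validate them.
        if not txn.table.isidentifier():
            raise ValueError("unsafe table name")
        criteria = txn.criteria()
        sql = f"SELECT * FROM {txn.table}"
        if criteria:
            sql += " WHERE " + " AND ".join(f"{name} = ?" for name in criteria)
        return sql, tuple(criteria.values())


class RecordTranslator:
    """Maps the same transaction into a filter over plain dict records."""

    def translate(self, txn):
        criteria = txn.criteria()

        def matches(record):
            return all(record.get(name) == value
                       for name, value in criteria.items())

        return matches
```

The caller builds the same Transaction in both cases; only the translator chosen for the physical repository differs, which is the separation the text describes.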
DMS Application Layer

Our invention contemplates an architectural layer dedicated to the various DMS Functions and Utilities that a user invokes to manipulate the Data Management System. Common functions found in this layer include, but aren't limited to, Check-In, Check-Out, Promote, Setting Locks, Checking Authorities, etc. Furthermore, these functions share a consistent application program interface (API) following the PFVL paradigm, which allows this layer to remain methodology and environment independent.

FIG. 4 conveys the preferred embodiment of this layer. The DMS Applications Layer (41) is comprised of all the applications that enable a user to interact directly with the DMS. Each application consists of one or more application modules (42) which may or may not interact with the various Managers (44). FIG. 4 depicts various scenarios involving the interaction with the application modules:

Non-Manager Interaction (45)
An application may desire to interface directly to the Control Repository (46) without the need to interact with any Managers. An example might be a function which extracts project management data from the Control Repository and displays it in a formatted report.

Single Manager Interaction (47)
An application may only need to interface with a single Manager in order to execute all the steps in the application's algorithm. For example, an application which associates an object to a problem fix number only requires functions within the Problem Fix/EC/PN Manager.

Multiple Manager Interaction (48)
Often application algorithms require interaction with a plurality of Managers. For instance, a promotion algorithm may interface with the Authority Manager to determine if the user has the proper promote authorization. Next, it may execute Process Manager functions to determine if the object meets the necessary promotion criteria. Finally, it may interface with the Library Manager to perform the actual promotion to the next level.
Control Repository Coupled with Manager Interaction (49)
Any combination of the above methods may be used to construct an application which interacts with one or more Managers in addition to the Control Repository. For instance, an application may query the Control Repository to see which Managers are currently installed in a user environment, and use that information to branch through various parts of the algorithm which interface with the Managers.

Within each Manager, FIG. 4 depicts one or more Manager Functions (43) which combine to form a library of utilities upon which applications can be constructed. For example, the Problem Fix/EC/PN Manager contains:

Functions to associate objects to Problem numbers
Functions to associate Problem numbers to releases
Functions to associate Part Numbers to objects

Together these modules form a library of functions within each Manager, upon which application developers can create more complex utilities.

FIG. 4 also depicts an Application Program Interface (40) common to all applications and functions in the DMS Application Layer. The API is based on the PFVL paradigm. By requiring all the functions to conform to the PFVL paradigm they remain methodology independent while retaining the flexibility to be adapted to any user environment through the use of the User Interface Layer. Our preferred embodiment requires all functions in this layer to be invoked by passing Package, Filetype, Variance, and Level as the minimum amount of information. Additional information such as filename, iteration, or run-time options may also be supplied. Our embodiment also permits the wildcard character (*) to be used on any combination of PFVL attributes. For instance, if a wildcard is passed in place of the Level, then all information matching the remaining PFVL attributes at all levels is accessible. The wildcard can be combined with a partial PFVL attribute in a similar manner.
In this case, a level attribute of prod* would access all information matching the remaining PFVLs at any level beginning with prod. Finally, a placeholder such as the percent (%) character can be used to ignore any attribute. Certain DMS applications may not require information regarding all the PFVL attributes, so use of the % character allows every DMS application to use an identical API to facilitate interaction with the user environment. For example, the following API could be used to interface with all DMS applications regardless of their underlying function: DMS_App_Name <filename> <filetype> <package> <variance> <level> <options> If a user environment doesn't necessitate the use of all the PFVL attributes, the user interface layer can suppress or hard-code them prior to invoking the underlying DMS application. For example, a user environment may exist such that variances aren't applicable and data only resides in two levels of a single package (library). Furthermore, the current process only permits users to check data into the DMS at the lowest level. The corresponding user interface may be a simple menu where the only two fields the user enters are the file name and file type. The underlying user interface code would automatically pass the sole package and level, and hard-code or suppress the variance to the DMS check-in API. The advantage to the present invention is in the event the use model changes to allow users to perform additional actions, such as checking data into the second level, only the user interface needs to be updated. Neither the underlying DMS applications nor the information in the Control Repository need to be updated. Our invention permits any combination of these functions, or any other functions in this or any other Manager, to be referenced by any application or by other functions in this or any other Manager. The modularity afforded by this approach results in novel implementation techniques. 
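The wildcard (*) and placeholder (%) conventions for PFVL attributes described earlier in this section can be sketched with Python's standard fnmatch module. This is an assumed reading, not the disclosed matching logic: the function names are hypothetical, attributes are passed as tuples in the order package, filetype, variance, level, and fnmatch additionally gives ? and [...] special meaning, which the disclosure does not mention.

```python
from fnmatch import fnmatchcase


def attribute_matches(pattern, value):
    """Match one PFVL attribute against a pattern.

    "%" means the attribute is ignored; "*" behaves as a wildcard,
    alone or combined with a partial value such as "prod*".
    """
    if pattern == "%":
        return True
    return fnmatchcase(value, pattern)


def pfvl_matches(pattern, entry):
    """Match (package, filetype, variance, level) tuples attribute by attribute."""
    return all(attribute_matches(p, v) for p, v in zip(pattern, entry))


# Any level beginning with "prod", with the variance ignored.
query = ("dsgn_lib", "schematic", "%", "prod*")
hit = pfvl_matches(query, ("dsgn_lib", "schematic", "USB", "prod_a"))
miss = pfvl_matches(query, ("dsgn_lib", "schematic", "USB", "test"))
```

A user interface wrapper that hard-codes or suppresses an attribute would simply fill the corresponding tuple position with a fixed value or with the placeholder before calling the application.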
For example, the present invention allows the various steps of our promotion algorithm to be broken down into modular pieces of code whereby some constitute Manager Functions while others comprise the promotion application. Although the Manager Functions tend to be small enough to lend themselves to conventional procedural programming techniques, the application code itself may or may not. Although conventional procedural code can be used to implement any application in our DMS, the length and complexity can lead to difficulties in debugging or maintaining the code. One of the advantages of our modular DMS is it permits alternative techniques for implementing complex application algorithms.

FIGS. 5A and 5B describe an alternative method using state tables. FIG. 5A shows an example promotion algorithm implemented using State Table (51) which is comprised of 12 steps. This table illustrates a full promotion where each step interacts with one or more Manager functions, including steps 6, 7 and 9 which interact with the Process Manager to invoke Library Processing. Each step has 2 Next State columns associated with it. The primary Next State column (labeled GOOD) indicates the next step which should be undertaken if the result of the present step is successful. The secondary Next State column (labeled BAD) indicates the state transition if the present step results in an unacceptable result. In addition to making it very simple to follow the algorithmic flow of the promote, this method also presents the advantage of simplified customization. State Table (52) illustrates the ease by which the promotion application can be updated to eliminate the Library Processing. This is accomplished by simply changing the Next State columns for steps 4, 5 and 8 to bypass steps 6, 7 and 9. Furthermore, State Table (53), in FIG. 5B, demonstrates how the DMS application can be modified to completely omit steps 6, 7 & 9 and renumber the remaining steps for simplicity.
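The state table technique described above can be illustrated with a very small interpreter. The sketch below is an assumption-laden example rather than the tables of FIGS. 5A and 5B: the four-step table, the action names, and the convention that a next state of None ends the run are all hypothetical.

```python
def run_state_table(table, actions, start=1, max_steps=1000):
    """Walk a state table and return the list of steps that were executed.

    table maps step number -> (action name, GOOD next step, BAD next step).
    actions maps action name -> callable returning True (good) or False (bad).
    A next step of None ends the run.
    """
    visited = []
    step = start
    while step is not None:
        if len(visited) >= max_steps:
            raise RuntimeError("state table did not terminate")
        action_name, good, bad = table[step]
        visited.append(step)
        step = good if actions[action_name]() else bad
    return visited


# Hypothetical four-step promotion; step 4 is the failure exit.
full_table = {
    1: ("check_lock", 2, 4),
    2: ("run_library_process", 3, 4),
    3: ("update_level", None, 4),
    4: ("report_failure", None, None),
}

# Customization by editing Next State entries only: step 1 now
# bypasses the library processing in step 2.
bypass_table = dict(full_table)
bypass_table[1] = ("check_lock", 3, 4)

always_ok = {name: (lambda: True) for name, _, _ in full_table.values()}
```

Because the table is plain data, it could equally be loaded from a text file at run time, which is the property the interpreter-based deployment relies on.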
An additional advantage of the State Table method for DMS applications is code maintenance and deployment. Since these tables can exist in a simple ASCII format, one could make changes to the application flow without the need for formal programming education. Furthermore, one skilled in the art could envision how the state tables could be executed by an interpreter such that the application code could be changed dynamically by the user without the need to recompile any code modules in the DMS. This novel approach to implementing complex data management applications yields an improved method for deploying a general purpose DMS to a large customer base, yet allowing more sophisticated users to customize the applications. This improves upon the current state of the art by eliminating the need for the DMS provider to maintain a plurality of customized DM systems to satisfy a multitude of customer environment variations.

Client/Server Interface

The present invention contemplates the use of a Client/Server Interface embedded between the DMS Application and Control Repository Access layers. All communication between the DMS applications and the Control Repository functions is performed through special interface routines. These routines are responsible for locating the proper Control Repository, making the connection, and passing the appropriate information to the underlying Control Repository Access function. This feature allows a completely scalable DMS ranging from a low-end DMS where the Control Repository is directly accessible from the user's client to a high-end enterprise DMS where the Control Repository can be literally spread across a plurality of worldwide servers. For the low-end implementation, the client/server routines simply pass the required information from the DMS application to the CR Access function, much like a parent module invoking an external function or subroutine.
In the high-end scenario, the routine would locate the server where the appropriate CR resides, make the appropriate connection and pass the information to the CR function.

In addition to controlling the interface between the DMS applications and Control Repository, the client/server interface also controls access to the Data Repository. Once again, a low-end system may exist whereby the data resides in a file system directly updateable by the user's client. For example, during a Check-In process, the client would physically copy the data from the source location to the actual data repository. This could be accomplished by providing write access to the data repository for all users, or writing a client/server routine which utilizes techniques such as Unix setuid bits to ensure that data can only be written to the repository via the proper DMS applications. In the high-end scenario, the client/server routines could establish a connection with the server where the data repository resides, and employ the server to perform the appropriate file operation. This implementation lends itself to a more secure DMS since access to the data repository can be very tightly controlled, and user clients cannot directly update the data repository outside of the DMS either intentionally or accidentally.

Referring to FIG. 1, the DMS applications (12) and the various Managers (11) all communicate with the Data Repository (15) and the Control Repository (17) via the Client/Server Interface (13). This interface is depicted in FIG. 7, where the DMS applications, the various Managers and the CR interface layer are shown (71). Depending on the location of the server, i.e. local or remote, the respective Communications Services (72) are invoked. These services support a variety of protocols including but not limited to those depicted in the (73) layer. Some of these services communicate either directly with the Data Servers (75) or through the network and/or the servers depicted at the server layer (74).
As an alternate embodiment for this layer, Automated Library Machines (ALMs) are employed in a "batch" environment which permits a large number of DMS operations to be queued and processed by these virtual machines. The Client/Server routines are responsible for creating work requests on the user's client and transmitting them to the appropriate ALM for processing. Use of ALMs also provides the advantage of breaking up large complex DMS applications into foreground and background pieces. The foreground portion runs on the user's client, then a work request is created and transmitted to the ALM through the Client/Server Interface. Upon receipt of the work request, the ALM processes the background portion of the DMS operation, including all file manipulations. Since the foreground portion tends to comprise a series of checks as opposed to intensive computing, improved client throughput can be achieved by offloading the more compute intensive portions of the DMS application to the ALM.
Control Repository Access Layer
Our invention contemplates the use of a separate Control Repository Access Layer comprised of a library of functions or transactions which extract, add, modify or delete information from the Control Repository. There are several main advantages to separating this code from the functions comprising the DMS Application Layer:
1. Many transactions can be used in multiple DMS applications, so in an effort to modularize the code and prevent duplication, one skilled in the art could envision how these transactions can be instantiated in DMS applications much like a logic designer instantiates circuit macros.
2. In larger DM systems where performance is a critical issue, it is frequently prudent to combine several smaller transactions into "macro" transactions. This is best performed by someone with intimate knowledge of the internal organization of the Control Repository.
By separating the CR Access functions from the DMS applications, the end users can readily modify the DMS applications without acquiring the aforementioned knowledge.
3. This approach readily lends itself to a plurality of physical embodiments of the Control Repository since all the transactions can employ the same format, and only the command translation code needs to be personalized for its associated physical Control Repository.
4. One physical embodiment of a Control Repository can be replaced with a different physical embodiment without the need to alter any of the DMS applications or underlying Control Repository Access transactions. The administrator of the Control Repository need only update the command translation code to reflect the new physical embodiment.
FIG. 6 illustrates an example whereby our invention employs command translators in a heterogeneous environment with two disparate physical Control Repositories which together comprise a single virtual Control Repository. The virtual Control Repository contains an authorization table and a financial results table. In this example, the Get_Auth (61) transaction is used to query information from the authorization table. Similarly, the Get_Rslt (62) transaction is used to query financial results. One will immediately notice that both transactions utilize a similar syntax with the only difference being the quantity of information supplied. In both cases, all PFVL (Package, Filetype, Variance, Level) information is supplied as well as additional parameters. It is also important to note that one can't deduce the manner by which the control information is physically organized or stored, nor can one deduce whether the authorization and financial results are stored in the same or different Control Repositories.
The Get_Auth (61) transaction is then processed through Command Translator "A" (63) which performs a relatively simple remap of the authorization query into the appropriate SQL queries (64) to interface with the Relational Database (65) where the authority table resides. Conversely, the Get_Rslt (62) transaction is processed through Command Translator "B" (66). Unlike the authorization table, the financial results table may reside in an extremely secure physical Control Repository such as a cryptographic data storage product. Thus, Command Translator "B" would need to locate the proper Meta Data file, perform the necessary file read operation, decrypt the data, format it properly and return it to the calling application. One of the key advantages contemplated by this invention is that the DMS application writer only needs to reference the available transactions and their parameter lists. Once the application is written, it can remain intact, even if the underlying physical Control Repositories are further distributed, combined or in any other way reorganized. The aforementioned example also demonstrates another advantage of the present invention. One could envision a scenario whereby the users initially interact with both physical control repositories via a traditional computer system employing those elements typically found in said system, such as a keyboard, monitor, central processing unit, memory, mouse, etc. In such an environment, the users would obtain authorization to query financial data by manually entering their employee identification. Suppose the desire exists to improve the authorization process by introducing a pervasive device such as a badge reader or a biometric device such as a retina or iris scanner.
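The routing described for FIG. 6 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the in-memory dictionaries stand in for the relational database and the secure store, base64 decoding stands in for real decryption, and the names `cra`, `translate_auth` and `translate_rslt` are hypothetical and do not appear in the disclosure.

```python
import base64

# Hypothetical stand-in for the relational authority table (65).
AUTH_ROWS = {("DSGN_LIB", "SCHEMATIC", "USB", "TEST", "alice"): "UPDATE"}

# Hypothetical stand-in for the secure results store; base64 is a
# placeholder for encryption and offers no actual protection.
RESULT_FILES = {
    ("FIN", "REPORT", "Q1", "PROD"): base64.b64encode(b"revenue=100").decode("ascii"),
}


def translate_auth(pkg, ftype, var, lvl, userid):
    """Translator "A": remap the generic query onto a table lookup."""
    return AUTH_ROWS.get((pkg, ftype, var, lvl, userid), "NONE")


def translate_rslt(pkg, ftype, var, lvl):
    """Translator "B": locate the stored record, decode it, return text."""
    blob = RESULT_FILES[(pkg, ftype, var, lvl)]
    return base64.b64decode(blob).decode("ascii")


# The application only knows transaction names and parameter lists.
TRANSLATORS = {"Get_Auth": translate_auth, "Get_Rslt": translate_rslt}


def cra(transaction, *args):
    """Dispatch a generic transaction to whichever translator serves it."""
    return TRANSLATORS[transaction](*args)
```

In this sketch a caller supplies only the transaction name and its parameters, so swapping either translator for a different physical store would leave the calling code unchanged.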
By using a generic API for the Control Repository Access functions, one could appreciate how much easier it would be to interface with nontraditional devices such as badge readers or retina scanners since they only need to provide information to be passed to the Command Translator. Since the device doesn't directly communicate with the Control Repository it can employ a relatively simple protocol such as TCP/IP or RS232, and doesn't need to generate complex commands such as SQL queries. One might conclude from the previous example that our invention implies a one-to-one correspondence between Control Repository Access functions and Command Translators. However, our invention permits any combination of Control Repository Access transactions to interact with any combination of Command Translators to communicate with any combination of physical Control Repositories. FIGS. 8A and 8B illustrate how the various architectural layers can be used to achieve greater overall processing efficiency. This example calls for a single Control Repository Access transaction that performs multiple repository accesses. The left side of FIG. 8A depicts the control flow of a simple DMS application designed to establish an ownership or File Check Out for Update lock on a data object. This is a typical requirement for any data management system that permits multiple users to access and update the same piece of data. Step 80 displays the File Lock menu which can be any type of text based or graphical menu in which the user enters the necessary Package, Filetype, Variance, Level and Filename of the object they wish to update. One can appreciate how this step could utilize code similar to that previously disclosed to test for the existence of various Managers and tailor the menu accordingly. In this example, the application permits the user to select multiple files so Step 81 sets up a File Loop to perform the desired action for each file.
Step 82 represents the Control Repository Access transaction for setting an ownership lock. Step 83 displays the results of the lock setting operation back to the user. To further illustrate the advantage of this invention, assume that the example employs a client/server implementation such that the DMS application submits the lockset CRA transaction through one of the client/server interface means described in FIG. 7. Also assume that the example methodology requires that only authorized users of the DMS may establish ownership locks. The layered architecture disclosed in the present invention permits a very efficient implementation of such an example environment. The right side of FIG. 8A (84) depicts the internal steps comprising the CRA lockset transaction (82). Our invention contemplates the use of several different types of locks on data objects in the DMS, therefore the Lock Menu (80) may offer the user a choice of locks. Therefore, the first step in the internal lockset transaction (84) is Step 85 which tests to see if the type of lock desired is an ownership lock (file check-out). Step 86 then queries the Authority table to ensure the user is authorized to update the requested file. The virtual Authority table (88) is shown in FIG. 8B. Step 87 then performs the necessary updates to the Lock table (89) also shown in FIG. 8B. Since steps 86 and 87 are both required for all File Check Out operations, it's more efficient to combine them into a single Control Repository Access transaction. This way, all the overhead associated with the client/server communication is only incurred once per file. It is worth noting that both tables are structured identically, but this does not imply they reside in the same physical Control Repository. In fact, it's not possible using FIG. 8B to discern how these tables are physically organized, nor if they reside in the same or separate physical embodiments. 
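The internal lockset flow of Steps 85 through 87 might look like the following Python sketch. It is a minimal illustration, assuming in-memory stand-ins for the virtual Authority and Lock tables; the function name `cra_lockset`, the `"OWNER"` lock type label and the return strings are hypothetical, and the handling of other lock types is deliberately left out because the text does not describe it.

```python
# Hypothetical virtual Authority table (88): who may update which file.
AUTHORITY = {("DSGN_LIB", "SCHEMATIC", "USB", "TEST", "MPEG", "alice")}

# Hypothetical virtual Lock table (89), keyed by PFVL plus filename.
LOCKS = {}


def cra_lockset(pkg, ftype, var, lvl, filename, lock_type, userid):
    """One transaction that performs two repository accesses."""
    key = (pkg, ftype, var, lvl, filename)

    # Step 85: is the requested lock an ownership (check-out) lock?
    if lock_type != "OWNER":
        # Assumption: other lock types are outside this sketch.
        raise NotImplementedError("only ownership locks are sketched")

    # Step 86: query the Authority table.
    if key + (userid,) not in AUTHORITY:
        return "DENIED: user is not authorized to update this file"

    # Step 87: update the Lock table.
    LOCKS[key] = (lock_type, userid)
    return "LOCKED"
```

Because both accesses happen inside one function, a client/server deployment of this sketch would pay the communication overhead once per file rather than once per table.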
The present invention permits the data comprising a virtual Control Repository to be organized in any physical arrangement desirable, and furthermore, one or more of these physical Control Repositories can be accessed from the same CR access transaction. Conversely, multiple CR access transactions can access data from the same physical Control Repository. The existence of Command Translators in our invention permits any conceivable arrangement of Control Repository Access transactions to interact with any organization of one or more physical Control Repositories. Additionally, this example demonstrates a further advantage of having the DMS applications architecturally segregated from the CRA transactions. If the user desires to set ownership locks on all the files of a given Type in a given Level and Variance of a particular Package, then the File Loop (81) could recognize this and rather than initiating a multitude of CRA transactions for each selected file, it could generate a single transaction substituting a * for the Filename. The aforementioned example shows how a single CRA transaction may require a plurality of control repository accesses. However, our invention does not mandate the quantitative relationship between accesses and the underlying command translators. For example, the authority check in Step 86 and the lock table update in Step 87 will both employ command translators, but said translation code can be implemented in any desirable embodiment. Each step may call an independent translator implemented as an independent entity, or both translators could be embodied together within the same entity. Our invention even permits the translation code to be incorporated directly into the Control Repository Access transaction code as subroutines, methods, etc. 
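The wildcard optimization mentioned for the File Loop (81) can be illustrated with a short Python sketch. It assumes the application already knows every filename in the selected Package, Filetype, Variance and Level; the function name `plan_lock_transactions` and the tuple layout (which mirrors the positional craSetLock parameter order) are hypothetical.

```python
def plan_lock_transactions(pfvl, selected, all_files, lock_type, userid):
    """Return the argument tuples for the CRA transactions to submit.

    pfvl is a (Package, Filetype, Variance, Level) tuple. If the user
    selected every file in that PFVL, a single transaction with "*" as
    the filename replaces one transaction per file.
    """
    if all_files and set(selected) == set(all_files):
        return [(*pfvl, "*", lock_type, userid)]
    return [(*pfvl, name, lock_type, userid) for name in selected]
```

For example, selecting both of two files in a library would yield one tuple with `"*"`, while selecting only one of them would yield a single per-file tuple.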
One skilled in the art could appreciate how various programming techniques such as dynamic link libraries, subroutines, modules, and features found in object oriented programming languages can work in concert with the flexible architecture disclosed herein to produce the most efficient means of packaging the numerous CRA transactions and command translators which might comprise a typical data management system. Turning our attention to FIG. 9A, the illustration depicts a heterogeneous physical Control Repository comprised of an Authorization table physically stored on a server (91) using a conventional database application such as DB/2, Oracle, Access, Notes, or even a flat file. One advantage to using this type of implementation is that the virtual Authorization table can be physically manipulated using the traditional row and field paradigm. On the other hand, the Lock table is physically implemented as a plurality of .LCK files resident in a directory structure (92) mapped to the PFVL architecture. The directory structure (92) shows a typical engineering design library where the Package=DSGN_LIB, Variance=USB, Level=TEST, Filetype=SCHEMATIC, and Filename=MPEG. The MPEG.LCK file denotes the existence of a lock on this particular file. The contents of the MPEG.LCK file contain other information such as the identity of the lock owner, and the type of lock. One can see how this physical arrangement can be derived from a virtual Lock table such as the one depicted in FIG. 8B. Although this specific example only depicts seven fields of information in the virtual Lock table, one can see how this can be expanded to add additional information such as time/date when the lock was set, reason for the lock, or any other desirable meta data, and how this additional information can be easily added to the contents of the .LCK file.
In addition, one skilled in the art can appreciate how other files in this example system can contain their own .LCK files, or how alternate embodiments of this system could use various other means of implementation including but not limited to symbolic links, flat files which contain information on a plurality of locks, multiple lock files per data object, etc. Once again, the preferred embodiment only conveys a small subset of the possibilities afforded to the user by the present invention. Continuing with the figures, FIG. 9B depicts an example Control Repository Access transaction (93). The syntax of the transaction is purely arbitrary and can be chosen to accommodate the environment. The Perl code Command Translator (94) on the left side of FIG. 9B shows how CR Access transaction (93), which treats the information as if it's located in rows and fields of a table, is translated into the file I/O routines necessary to manifest the physical embodiment depicted in the PFVL-based directory structure (92) of FIG. 9A. Although this represents a simple subset of a realistic Command Translator, it illustrates the four minimum steps required to accomplish the task. First, the arguments are parsed according to their position in the entry argument list. Second, the PFVL portion of the argument list is used to construct the path down to the file being locked. The third step takes into account the possibility that a PFVL attribute (such as the Variance) may be absent in the physical embodiment of the DMS. As stated earlier in this disclosure, absence of a PFVL attribute is denoted with a key character such as "%". If a "%" is passed in, it means the directory corresponding to that PFVL attribute is missing. Finally, the fourth step writes the .LCK file containing the type of lock and identity of the owner.
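The four steps above can be sketched in Python as follows. This is not the Perl code of FIG. 9B, which is not reproduced in the text. The function name `cra_set_lock_files`, the `root` parameter, the Package/Variance/Level/Filetype directory nesting (inferred from the order in which the directory structure (92) is described), the creation of missing directories, and the one-line file layout are all assumptions made for illustration.

```python
import os


def cra_set_lock_files(root, *args):
    """File-system command translator for a craSetLock-style transaction."""
    # Step 1: parse the arguments by position.
    pkg, ftype, var, lvl, filename, lock_type, userid = args

    # Steps 2 and 3: build the path from the PFVL attributes, skipping
    # any attribute passed as "%" (absent in the physical embodiment).
    parts = [p for p in (pkg, var, lvl, ftype) if p != "%"]
    directory = os.path.join(root, *parts)
    os.makedirs(directory, exist_ok=True)  # assumption: create if missing

    # Step 4: write the .LCK file with the lock type and owner.
    lock_path = os.path.join(directory, filename + ".LCK")
    with open(lock_path, "w") as handle:
        handle.write(f"{lock_type} {userid}\n")
    return lock_path
```

Calling it with the example values from FIG. 9A would place `MPEG.LCK` under `DSGN_LIB/USB/TEST/SCHEMATIC` beneath the chosen root, and passing `"%"` for the Variance would simply drop the `USB` directory from the path.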
One skilled in the art can see how the example Perl code could be easily implemented in virtually any programming language such as C, Java, Basic, Rexx, Pascal, etc. and it should be noted that Perl was selected for illustration purposes only. One of the key advantages of the present invention is the ability to easily replace the physical embodiment of the Lock table shown in the PFVL-based directory structure (92) with a more centralized physical embodiment such as the traditional database server (91) managing the Authority table. This is accomplished by simply replacing the Perl Command Translator (94) with the appropriate database translator such as the SQL Translator (95) shown on the right side of FIG. 9B. In this example, the SQL Translator (95) performs the same function as the Perl Translator (94) by using three steps. The first two are combined into an atomic database operation which updates the reference ID of the Lock Table by incrementing the last known reference ID and returning the newly incremented value into the ref variable. The third SQL statement uses this ref variable to insert the PFVL, owner and lock type information into the newly created row of the table. Notice how the same CR Access transaction (93) is used, but the underlying actions are entirely different. Rather than performing file system operations, the new command translator must perform traditional database table modifications. Although this may be a radical alteration to the command translator, it is completely hidden from the CR Access transaction and any DMS application which uses the craSetLock function. It is in this manner that our invention permits a low-end simplistic data management system such as that shown in the right side of FIG. 9A to grow into a more sophisticated high-end system.
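A database-backed translator with the same call signature might be sketched in Python using the standard `sqlite3` module. The SQL of FIG. 9B is not reproduced in the text, so the schema, the table names `lock_ref` and `lock_table`, and the helper names below are hypothetical; the sketch only mirrors the described sequence of incrementing a reference ID, reading it into `ref`, and inserting the new row.

```python
import sqlite3


def make_lock_db():
    """Create an in-memory database with a hypothetical Lock table schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE lock_ref (last_ref INTEGER NOT NULL)")
    conn.execute("INSERT INTO lock_ref VALUES (0)")
    conn.execute(
        "CREATE TABLE lock_table (ref INTEGER PRIMARY KEY, pkg TEXT, "
        "ftype TEXT, var TEXT, lvl TEXT, filename TEXT, "
        "lock_type TEXT, owner TEXT)"
    )
    conn.commit()
    return conn


def cra_set_lock_sql(conn, pkg, ftype, var, lvl, filename, lock_type, userid):
    """SQL command translator for a craSetLock-style transaction."""
    with conn:  # one transaction: commit on success, roll back on error
        # Steps 1 and 2: bump the reference ID and read it back.
        conn.execute("UPDATE lock_ref SET last_ref = last_ref + 1")
        (ref,) = conn.execute("SELECT last_ref FROM lock_ref").fetchone()
        # Step 3: insert the PFVL, lock type and owner under that ID.
        conn.execute(
            "INSERT INTO lock_table VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
            (ref, pkg, ftype, var, lvl, filename, lock_type, userid),
        )
    return ref
```

The argument list after `conn` is identical to that of a file-based translator, which is the point of the architecture: the caller need not know which physical embodiment sits behind the transaction.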
In addition it provides a means of replacing one physical embodiment of a Control Repository with another embodiment that may be very similar or completely different, without impact to the end users. This is a strategic advantage in a business environment where mergers and consolidations require constant modifications to their computer and information systems. As previously stated, the syntax of the example Control Repository Access transaction (93) is purely arbitrary. Our invention does not dictate the format of the Control Repository Access functions; it merely requires that the syntax allow for inclusion of any and all PFVL attributes necessary to define a data object, along with any additional information pertinent to the corresponding virtual Control Repository table. In this particular example:
craSetLock(Pkg, Type, Var, Lvl, FileName, LockType, Userid)
the syntax resembles a typical subroutine or function call where the information is passed as a series of arguments or parameters where the order of the parameters is determined in advance. One should note that this same transaction could use any other imaginable syntax including but not limited to the following examples:
craSetLock(Package=Pkg, FileType=Type, Variance=Var, Level=Lvl, FileName=FileName, LockType=LockType, Owner=Userid)
craSetLock(?Package Pkg?FileType Type?Variance Var ?Level Lvl?FileName FileName?LockType LockType ?Owner Userid)
craSetLock→Package→FileType→Variance→Level→FileName→LockType→Owner=(Pkg Type Var Lvl FileName LockType Userid)
SELECT(Lock, Pkg, Type, Var, Lvl, FileName, LockType, Userid)
The present invention affords the opportunity to select the syntax of the Control Repository Access transactions to best accommodate the implementation of the DMS. For example, a DMS which is predominantly implemented as SQL databases would likely choose a different syntax from a DMS largely constructed out of C code.
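To illustrate that the choice of syntax is independent of the information carried, the following Python sketch parses the positional form and the keyword form of craSetLock into the same record. The parser itself, the function name `parse_cra_set_lock` and the field-name tuple are assumptions made for illustration; only the two surface syntaxes come from the text.

```python
import re

# Field names taken from the keyword form of the example transaction.
FIELDS = ("Package", "FileType", "Variance", "Level",
          "FileName", "LockType", "Owner")


def parse_cra_set_lock(text):
    """Normalize a positional or keyword craSetLock call into a dict."""
    match = re.fullmatch(r"\s*craSetLock\((.*)\)\s*", text)
    if match is None:
        raise ValueError("not a craSetLock transaction")
    items = [item.strip() for item in match.group(1).split(",")]

    if all("=" in item for item in items):
        # Keyword form: order does not matter, names identify the fields.
        pairs = (item.split("=", 1) for item in items)
        record = {name.strip(): value.strip() for name, value in pairs}
    else:
        # Positional form: the order of the parameters is fixed in advance.
        if len(items) != len(FIELDS):
            raise ValueError("expected %d parameters" % len(FIELDS))
        record = dict(zip(FIELDS, items))

    if set(record) != set(FIELDS):
        raise ValueError("missing or unknown PFVL attributes")
    return record
```

Both forms yield an identical dictionary, so a command translator written against the normalized record would be unaffected by which syntax a given DMS adopts.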
Furthermore, our invention doesn't mandate that all Control Repository Access transactions follow the same syntax. Although the preferred embodiment demonstrates the advantages of using a single syntax to create a uniformity and consistency across the entire DMS, our invention recognizes that there may be circumstances that warrant use of a plurality of syntaxes for different groups of CR access functions. For instance, consider the case where the virtual Control Repository is comprised of three physical repositories such that some of the information is stored in a relational database, some of it is organized as simple ASCII files in a file system and the remainder resides on a web server. If the situation is such that the environment is unlikely to change, our invention permits the administrator to have three sets of CR access transactions. The first could use a "SQL-like" syntax, the second a simple parameterized list, and the third might use Extensible Markup Language (XML). The disadvantage to this approach is that it requires the DMS application developers to use a multitude of CR access syntaxes. However, this may not be a major concern if the DMS applications tend to access data only within a particular physical Control Repository. On the other hand, a considerable performance advantage may be obtained by simplifying the Command Translation code. As previously stated, the present invention simply requires that whatever syntax is used permits any and all PFVL attributes to be expressed. While the preferred embodiment of the invention has been described, it will be understood that those skilled in the art, both now and in the future, may make various improvements and enhancements which fall within the scope of the claims which follow. These claims should be construed to maintain the proper protection for the invention first described.
