Archiving or backup

Asynchronous information sharing system

6889231

Abstract

Techniques for sharing information in a wide variety of contexts allows both an explicit capture process and an implicit capture process to add information items to a staging area. An information sharing system supports both implicit and explicit consumption of information items that are stored in the staging area. A rules engine allows users to create and register rules that customize the behavior of the capture processes, the consuming processes, and propagation processes that propagate information from the staging areas to designated destinations. Exactly-once handling of sequence of items is achieved for items maintained in volatile memory. DDL operations are recorded, and operations are asynchronously performed based on the previously-performed DDL operations.


Claims

1. A method for sharing information, the method comprising the steps of:

an explicit capture process adding a first set of one or more information items to a staging area by making an explicit call through an API associated with the staging area;

an implicit capture process automatically adding a second set of one or more information items to said staging area based on events that occur in a system associated with said implicit capture process; and

a consumer process consuming information items that are stored in said staging area.

2. The method of claim 1 where the step of automatically adding a second set of one or more information items is performed by a capture process that inserts information items into the staging area based on the content of log files generated within a database system.

3. The method of claim 2 wherein the staging area is managed by said database system.

4. The method of claim 2 wherein:

the log identifies changes made within said database system; and

the set of one or more information items includes records that reflect a set of said changes.

5. The method of claim 4 wherein:

the set of said changes is a subset of all changes reflected in said logs; and

the capture process selects which changes to reflect in said records based on rules that are indicated in metadata stored within said database system.

6. The method of claim 1 wherein:

the explicit capture process, the implicit capture process and the consuming process are components in an information sharing system; and

the method further comprises the steps of:

the information sharing system receiving rules data that specifies one or more rules that indicate how at least one of said components is to operate; and

registering said rules data by storing, within said information sharing system, metadata that represents said one or more rules;

said at least one component reading said metadata and operating in a manner specified in said metadata.

7. The method of claim 6 wherein the rules data includes rules that indicate how said at least one component is to transform said information items.

8. The method of claim 6 wherein:

the information sharing system includes a database system;

the step of registering said rules data includes storing, within the database system, metadata that represents said one or more rules; and

said at least one component is a process that executes within said database system.

9. The method of claim 1 further comprising the steps of:

the implicit capture process storing, within each information item of said second set of one or more information items, a tag value that indicates that the information item to corresponds to an event that occurred in said system; and

using said tag values to avoid a cycle that would otherwise cause said event to be, re-applied within said system.

10. The method of claim 9 further comprising the steps of:

propagating the information item from said staging area to a second system;

making a change in said second system by applying the event associated with the information item in said second system;

wherein the second system is configured to propagate changes in said second system to said first system; and

wherein the step of using said tag values to avoid a cycle includes, based on the tag value in said information item, preventing said change from being propagated to said first system.

11. The method of claim 1 wherein:

said system is a first system; and

the an implicit capture process executes in a second system that is remote relative to said first system.

12. The method of claim 11 wherein:

the method further comprises the step of communicating a log from the first system to the second system; and

the implicit capture process generating said second set of information items based on information contained in said log.

13. The method of claim 12 wherein:

the first system is a first database system;

the second system is a second database system;

the log is a redo log generated by the first database system; and

the second set of information items identifies changes that were made in said first database system.

14. The method of claim 13 wherein:

the staging area resides in said second database system; and

the method further comprises the step of using an apply process to read said second set of information items from said staging area and to make changes in said second database system based on said second set of information items.

15. A method for sharing information, the method comprising the steps of:

an capture process automatically adding a set of one or more information items to said staging area based on events that occur in a first system associated with said capture process; and

an capture process automatically adding a set of one or more information items to said staging area based on events that occur in a first system associated with said capture process; and

the capture process storing, within each information item of said set of one or more information items, a tag value that indicates that the information item corresponds to an event that occurred in said first system;

propagating an information item from said staging area to a second system;

making a change in said second system by applying in said second system the event associated with the information item;

wherein the second system is configured to propagate changes in said second system to said first system; and

using said tag values to avoid a cycle by preventing said change from being propagated to said first system, based on the tag value in said information item.

16. A method for sharing information, the method comprising the steps of

a capture process automatically performing the steps of inspecting redo log files in a first database system;

adding a set of one or more information items to a staging area based on events that are indicated in said redo log files; and

a consuming process automatically processing information items from said staging area by reading information items in said staging area and causing changes to be made in a second database system based on events that are indicated in said redo log files.

17. The method of claim 16 wherein the step of causing changes to be made in said second database includes maintaining in said second database system replicas of one or more database objects in said first database system, wherein said one or more database objects are a subset of replicable objects in said first database system.

18. The method of claim 16 wherein the step of causing changes includes the steps of:

constructing a database command to cause said changes; and

submitting said database command to said second database system.

19. The method of claim 16 wherein the first database system is a different type of database system than said second database system.

20. The method of claim 16 further comprising the steps of:

receiving subscription data that indicates subscribers and information in which the subscribers are interested;

a second consuming process automatically reading information items from said staging area and, based on said subscription data, notifying subscribers that are interested in events associated with the information items.

21. A method for sharing information, the method comprising the steps of:

registering a set of capture rules within a database system;

registering a set of propagation rules within said database system;

based on said set of capture rules, determining which events that occur within said database system are to be captured;

capturing said events by storing information about said events in a staging area;

based on said set of propagation rules, determining how to propagate information from said staging area; and

propagating information from said staging area based on said set of propagation rules.

22. The method of claim 21 wherein further comprising the steps of:

registering a set of apply rules;

receiving said information propagated from said staging area; and

applying said information received from said staging area based on said set of apply rules.

23. The method of claim 21 wherein the step of capturing said events includes reading logs generated by said database system, and selectively storing in said staging area information about events identified in said logs.

24. The method of claim 21 wherein:

the step of propagation includes propagating said information to a second staging area; and

the method further includes the step of applying changes identified in said second staging area to a second database system.

25. The method of claim 24 further wherein:

the method further comprises the step of registering a set of apply rules; and

the step of applying changes is performed based on said set of apply rules.

26. The method of claim 24 further wherein:

the method further comprises the step of registering a user procedure; and

the step of applying changes is performed based on said user procedure.

27. A method for sharing information, the method comprising the steps of:

a capture process adding a first set of one or more information items to a staging are;

an apply process automatically reading information items from said staging are and selectively consuming said information items; and

an explicit dequeue process consuming an information item from said staging area by making a explicit call to an API associated with said staging area.

28. A computer-readable medium having stored thereon instructions which, when executed by one or more processors, cause the processors to perform the method recited in any one of claims 1-26 and 27.


Description

FIELD OF THE INVENTION

The present invention relates to information sharing systems.

BACKGROUND OF THE INVENTION

The ability to share information easily and in a timely fashion is a crucial requirement for any business environment. Consequently, information sharing has been supported by many mechanisms, such as discussions, mail, books, periodicals, and computer technology. Many computer-based technologies have evolved to promote the goal of information sharing, such as reports/statements, replication and messaging.

Unfortunately, most information sharing is still handled through applications, which represent a relatively expensive solution due to the costs associated with developing, deploying, operating and maintaining the applications that provide the information sharing services. In addition, the services provided by such applications often lack desired functionality, such as support for ad-hoc requests, customization, as well as timely and flexible delivery.

An important feature of any database management system is the ability to share information among multiple databases and applications. Traditionally, this has involved users and applications pulling information from the database using various overlapping technologies. Today, new efficiencies and business models require a more comprehensive and automatic approach. Many information sharing solutions are targeted to specific information sharing problems. While such solutions may solve the specific information sharing problem to which they are directed, they may not be applicable to, and may even be incompatible with, other information sharing problems.

Based on the foregoing, it is desirable to provide a system and techniques for, sharing electronic information in a manner that is more flexible than current problem-specific solutions.

BRIEF DESCRIPTION OF THE DRAWINGS

The present invention is illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings and in which like reference numerals refer to similar elements and in which:

FIG. 1 is a block diagram of an information sharing system configured according to an embodiment of the invention;

FIG. 2 is a block diagram illustrating three general phases experienced by data items as they flow through an information sharing system, according to an embodiment of the invention;

FIG. 3 is a block diagram illustrating the automated capture of changes in a database, according to an embodiment of the invention;

FIG. 4 is a block diagram illustrating events that are propagated from a source queue to a destination queue according to an embodiment of the invention;

FIG. 5 is a block diagram illustrating a directed networks environment, implemented according to an embodiment of the invention;

FIG. 6 is a block diagram illustrating the explicit enqueue and dequeue of events in a single queue, according to an embodiment of the invention;

FIG. 7 is a block diagram illustrating an explicit enqueue, propagation and dequeue of events, according to an embodiment of the invention;

FIG. 8 is a block diagram illustrating an apply process according to an embodiment of the invention;

FIG. 9 is a block diagram illustrating a transformation during an apply operation, according to an embodiment of the invention;

FIG. 10 is a block diagram illustrating the use of an information sharing system to share data from an Oracle database system to a non-Oracle database system;

FIG. 11 is a block diagram illustrating the use of an information sharing system to share data from a non-Oracle database system to an Oracle database system;

FIG. 12 is a block diagram that illustrates an information sharing system implemented within a single database, according to an embodiment of the invention;

FIGS. 13A and 13B are block diagrams illustrating an information sharing system used to share information between multiple databases, according to an embodiment of the invention;

FIG. 14 is a block diagram illustrating stages in a rule set evaluation process, according to an embodiment of the invention;

FIG. 15 is a block diagram illustrating that one rule set can be used by multiple clients of a rules engine, according to an embodiment of the invention;

FIG. 16 is a block diagram illustrating transformation during capture, according to an embodiment of the invention;

FIG. 17 is a block diagram illustrating transformation during propagation, according to an embodiment of the invention;

FIGS. 18A, 18B and 18C are block diagrams illustrating a multiple-node system in which each database is both a source and a destination database;

FIG. 19 is a block diagram illustrating the use of tags when each database is a source and destination database;

FIG. 20 is a block diagram illustrating a primary database sharing data with several secondary databases;

FIG. 21 is a block diagram illustrating tags used at the primary database;

FIG. 22 is a block diagram illustrating tags used at a secondary database;

FIG. 23 is a block diagram illustrating a primary database and several extended secondary databases;

FIG. 24 is a block diagram illustrating the in-memory streaming of change information from a source site to a destination site through one intermediary site, according to an embodiment of the invention;

FIG. 25 is a flowchart illustrating steps performed by an apply engine, according to an embodiment of the invention, that uses a persistently stored LOW WATERMARK, persistently stored data that identifies ABOVE-MARK APPLIED transactions, and non-persistently stored HIGHEST SO FAR CSNs, to achieve exactly-once behavior; and

FIG. 26 is a block diagram of a computer system on which embodiments of the invention may be implemented.

DETAILED DESCRIPTION OF THE INVENTION

A method and system are described for sharing electronic information. In the following description, for the purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the present invention. It will be apparent, however, that the present invention may be practiced without these specific details. In other instances, well-known structures and devices are shown in block diagram form in order to avoid unnecessarily obscuring the present invention.

Chains of Triggered Activities

Conventional database system technology frequently treats the manipulation of data as an isolated action. However, in many real-world scenarios, this is not the case. Specifically, the manipulation of data often triggers a series or "chain" of activities. The activities thus triggered my fall into various categories, including but not limited to:

    • Information creation, modification, deletion, or the passage of time: activities in this category may constitute a "business event".
    • Evaluation of information requirements: determining who needs/likes to be informed about a business event.
    • Creation of desired information: the information is created in a mutually agreed format, using applications, views, and/or transformations.
    • Transfer of information to the desired location via the desired transport.
    • Modification of the data at a target location: absorption of new information in the target environment organized according to the needs of the recipient.
    • Notification of new state: provides low latency knowledge for recipients or programs; notification may activate applications.
    • Access to information: potentially for a reaction, creating and/or modifying information (thereby causing another "business event").


  • According to one embodiment, rules may be established for the various activities to automatically carry out the chain of activities that are desired for certain data modification events. Of course, the specific chain of activities that is triggered by any given data manipulation event will vary based on the nature of the event and the rules that have been established.

    Functional Overview

    A flexible asynchronous information sharing system is described hereafter. The system provides numerous features that can be used alone or in combination to solve wide varieties of information sharing problems. According to one embodiment, the information sharing system includes one or more staging areas for storing information that is to be shared. One set of software processes, referred to herein as "capture processes", place information in the staging areas. Another set of software processes, referred to herein as "consuming processes", consume information from the staging areas.

    According to one embodiment, the information sharing performed through the staging areas is asynchronous. Specifically, the processes that generate changes that are captured by capture processes do not pause execution to await the capture of the changes by the capture processes. Conversely, the capture processes need not report back to the processes that generated the changes. Similarly, the capture processes do not pause execution to await the further processing of the information that they add to the staging areas. Similarly, the consuming processes need not report back to the capture processes to prompt the capture processes to continue execution.

    According to one aspect, the information sharing system supports a wide variety of capture processes, including implicit capture processes and explicit capture processes. An implicit capture process is a process that adds the information to one or more staging areas based on events that occur in a system associated with said implicit capture process. A log capture process is an example of an implicit capture process. A log capture process reads logs, such as logs generated by a database system in response to events that occur within the database system, and places information into a staging area based on the contents of the logs. An explicit capture process is a process the adds information to a staging area by making an explicit function call, through an API associated with a staging area, to add information to the staging area.

    According to another aspect, the information sharing system supports a wide variety of consuming processes, including apply processes, propagation processes and explicit dequeue processes. An apply process is a process that automatically dequeues and acts upon information contained in a staging area. A propagation process automatically dequeues and moves information from one staging area to a specified destination. The specified destination may be, for example, another staging area. An explicit dequeue processes retrieves information from a staging area by making an explicit call, through an API associated with the staging area, to retrieve the information from the staging area.

    Consuming processes may be configured to perform a wide variety of operations with the information they consume. For example, a consuming process may be configured to deliver messages that are extracted from the queue to "subscriber processes" that have previously registered an interest in receiving, or being notified about, certain types of information or events. In another context, the extracted information may represent changes that have been made within one database system, and the consuming process may be configured to make corresponding changes in another database system.

    System Overview

    FIG. 1 is a block diagram of a system 100 for asynchronously sharing information according to an embodiment of the invention. Referring to FIG. 1, it includes a plurality of staging areas 102, 104, 106. Information is added to each of staging areas 102, 104, 106 by capture processes 112, 114 and 116, respectively. Information is consumed from each of staging areas 102, 104, 106 by consuming processes 122, 124 and 126, respectively. Capture processes 112, 114 and 116 may include implicit capture processes and/or explicit capture processes. Consuming processes 122, 124 and 126 may include apply processes and explicit dequeue processes.

    System 100 further includes a propagation process 118 configured to extract information from one staging area 106 and add the information to another staging area 102. As shall be described in greater detail hereafter, the source and target of a propagation process 118 need not always be a staging area. For example, a propagation process may be configured to selectively extract information from a staging area and send the extracted information to another process that is interested in the information. The other process may be, for example, a process running in a system that is remote relative to system 100.

    According to one embodiment, staging areas 102, 104 and 106 are implemented as queues that are not type-specific. Because the staging areas 102, 104 and 106 are not type-specific, the same staging area can be used to store numerous different types of data. Consequently, various pieces of information may be stored together within a staging area in a sequence or arrangement that reflects a relationship between the pieces of information, even when the pieces of information correspond to different types of data. In alternative embodiments, the staging areas may be type specific, where each staging area is designed to store a particular type of information item.

    Information sharing system 100 enables users to share data and events. The information sharing system 100 can propagate this information within a database or from one database to another. The information sharing system 100 routes specified information to specified destinations. The result is a new feature that provides greater functionality and flexibility than traditional solutions for capturing and managing events, and sharing the events with other databases and applications. Information sharing system 100 enables users to break the cycle of trading off one solution for another. Information sharing system 100 provides the capabilities needed to build and operate distributed enterprises and applications, data warehouses, and high availability solutions. Users can use all the capabilities of information sharing system 100 at the same time. If needs change, then users can implement a new capability of information sharing system 100 without sacrificing existing capabilities.

    Using information sharing system 100, users control what information is put into the information sharing system 100, how the information flows or is routed from staging area to staging area or from database to database, what happens to events in the information sharing system 100 as they flow into each database, and how the information sharing system 100 terminates. By configuring specific capabilities of information sharing system 100, users can address specific requirements. Based on user specifications, information sharing system 100 can capture, stage, and manage events in the database automatically, including, but not limited to, data manipulation language (DML) changes and data definition language (DDL) changes. Users can also put user-defined events into the information sharing system 100. Then, information sharing system 100 can propagate the information to other databases or applications automatically. Again, based on user specifications, information sharing system 100 can apply events at a destination database. FIG. 2 shows the phases through which information typically flows when being shared through information sharing system 100.

    Information Sharing Options

    As mentioned above, the chain of activities that can be carried out by system 100 in response to an event may take many forms. In general, the chain of activities may involve one or more of: Data capture, out-bound staging, propagation, in-bound staging and consumption. According to one embodiment, system 100 provides mechanisms to perform each of these activities in a variety of ways. Table 1 lists various options for some of the characteristics for each of the various activities.

    TABLE 1 COMPONENT ELEMENT OPTION COMMENTS Data Capture Mode E - Explicit One to choose I - Implicit Data Type S - Schema One to choose B - Business Object Constraints N - None Any combination S - Sequence CY - Cycle CO - Conflict P - Process D - Data TR - Transactional Staging: Out- N - None One to choose Bound J - Journal B - Basic S - SQL Propagation Delivery B - Best Effort One to choose E - Exactly Once Security C - Confidential Any combination S - Signed Addressing O - Open One to choose C - Closed Constraints Same options as data capture Staging: In- Same options as Bound Staging Out-bound, except J Consumption Same options as data capture

    With respect to the data type in which information is captured, the "schema" option refers to a schema-oriented view of the data. Conversely, the "B" option refers to a business document oriented view of the data.

    The list of activities, elements, and corresponding options given in Table 1 is not exhaustive. The information sharing framework described herein may be implemented in a manner that provides numerous other activities, elements and options. For example, another option for the delivery element of the propagation activity may be "at least once". Thus, Table 1 is merely intended to illustrate the flexibility of the information sharing system described herein.

    Table 2 illustrates how the flexibility of the information system described herein may be exploited to accomplish information sharing tasks in a diversity of contexts. Specifically, Table 2 lists a context in which information sharing is desirable or required, and lists the options that might be used when using system 100 to carry out the information sharing activities in that context.

    TABLE 2 DATA CAPTURE AND OUTBOUND INBOUND CONSUMPTION STAGING PROPAGATION STAGING CONTEXT OPTIONS OPTIONS OPTIONS OPTIONS Messaging - E, B, TR B/S N/A N/A Local Messaging - E, B, TR B/S *, *, *, TR B/S Remote Application to E, B, P, TR S E, C, O, TR S Application Replication - I, S, S, CY, CO, S E, C, C, TR B Standard TR Replication - I, S, S, CY, CO, J E, C, C, TR B Journal TR Replication - I, B, S, CY, CO, S/J E, C, C, TR B Semantic or TR B2B HA I, S, TR J E, C, C, TR B HA - Semantic I, B, TR J E, C, C, TR B B2B Messag- E, B, TR S E, *, O, TR S ing B2B Protocols E, B, P, TR S E, *, O, TR S

    Operational Overview of Information Sharing System 100

    According to one embodiment, users can use information sharing system 100 to capture changes at a database, enqueue events into a queue, propagate events from one queue to another, dequeue events, apply events at a database, implement directed networks, perform automatic conflict detection and resolution, perform transformations, and implement heterogeneous information sharing.

    With respect to capturing changes, users can configure a background log capture process to capture changes made to tables, schemas, or the entire database. According to one embodiment, a log capture process captures changes from the redo log and formats each captured change into a "logical change record" (LCR). The database where changes are generated in the redo log is called the source database.

    With respect to placing events into a queue, at least two types of events may be staged in a queue of information sharing system 100: LCRs and user messages. A capture process enqueues events into a queue that users specify. The queue can then share the events within the same database or with other databases. Users can also enqueue user events explicitly with a user application. These explicitly enqueued events can be LCRs or user messages.

    With respect to propagating events from one queue to another, the queues may be in the same database or in different databases.

    With respect to dequeueing events, a background apply process can dequeue events. Users can also dequeue events explicitly with a user application.

    With respect to applying events at a database, users can configure an apply process to apply all of the events in a queue or only the events that users specify. Users can also configure an apply process to call user-created subprograms (e.g. subprograms written in the PL/SQL language) to process events. The database where events are applied and other types of events are processed is called the destination database. In some configurations, the source database and the destination database may be the same.

    Typical Applications of Information Sharing System 100

    Information sharing system 100 is flexible enough to achieve a virtually unlimited number of information sharing objectives. Consequently, the number of applications to which information sharing system 100 may be put is equally great. For the purpose of illustrating the utility and versatility of information sharing system 100, details shall be given as to how information sharing system may be applied to implement message queuing and data replication.

    With respect to message queuing, information sharing system 100 allows user applications to enqueue messages of different types, propagate the messages to subscribing queues, notify user applications that messages are ready for consumption, and dequeue messages at the destination database. A rule-based message notification consuming process may be used in conjunction with a log capture process. With this combination of components, the capture process may add to a staging area LCRs that reflect events reflected in the log files of a database, and the consuming process may send out notifications to those subscribers that have indicated an interest in particular types of database events. The specific events in which subscribers are interested may be stored as subscription data, which may identify the data in which a subscriber is interested using one or more SQL statements. Significantly, such notifications may be sent directly to subscribers, to subscribers through remote but compatible messaging systems, or to subscribers through message gateways to messaging systems that are otherwise incompatible with the system in which the LCRs were originally generated.

    According to one embodiment, information sharing system 100 implements staging areas 102, 104 and 106 using a type of queue that stages messages of SYS.AnyData type. Messages of almost any type can be wrapped in a SYS.AnyData wrapper and staged in SYS.AnyData queues. Information sharing system 100 interoperates with a queuing mechanism that supports all the standard features of message queuing systems, including multiconsumer queues, publishing and subscribing, content-based routing, Internet propagation, transformations, and gateways to other messaging subsystems.

    With respect to data replication, information sharing system 100 can efficiently capture both Data Manipulation Language (DML) and Data Definition Language (DDL) changes made to database objects and replicate those changes to one or more other databases. A capture process (e.g. capture process 116) captures changes made to source database objects and formats them into LCRs, which can be propagated to destination databases (e.g. via propagation process 118) and then applied by an apply processes (e.g. consuming process 122).

    The destination databases can allow DML and DDL changes to the same database objects, and these changes may or may not be propagated to the other databases in the environment. In other words, users can configure information sharing system 100 with one database that propagates changes, or users can configure an environment where changes are propagated between databases bidirectionally. Also, the tables for which data is shared need not be identical copies at all databases. Both the structure and the contents of these tables can differ at different databases, and the information in these tables can be shared between these databases.

    Core Services

    The components of system 100 provide a set of core services. According to one embodiment, those core services include event capturing, event distribution and event consumption.

    Event capturing generally refers to establishing a record of events that occur in a system of interest. For example, the system of interest may be a database system, and the event capturing may be performed by a set of capture processes, as shall be described in greater detail hereafter.

    Event distribution generally refers to distributing information about the events to the entities that are interested in the events. Such entities may reside within the system that created the event of interest, or external to the system. For example, event distribution may involve sending information about the changes that are made in one database system to another database system.

    Event consumption generally refers to reading the captured event information. Frequently, the consuming process will perform some action, or initiate some chain of activities, based upon the captured events. For example, a process in a target database system that receives change information from a source database system may read the change information from the source database system and initiate changes in the target database system based on corresponding changes made in the source database system.

    Implicit Capture Process Example

    As mentioned above, system 100 supports both explicit and implicit capture processes. A log capture process is an example of an implicit capture process. According to one embodiment, a log capture process is a process configured to read information stored in the log files of a database server, and to store information into one or more staging areas based on the information in the log files. Such log files may include, for example, the redo log files that are generated by the database system to record changes that are being made by the database system.

    A redo log file may, for example, include a redo record that indicates that, at a particular point in time, the database server changed the value in a particular column of a particular row of a particular table from X to Y. The information contained in such redo records is typically used by the database server to ensure that no committed changes are lost when failures occur. However, the use of a log capture process to selectively share the information contained in the redo records with other processes, by placing the information in one or more staging areas accessible to consuming processes, allows the information to be used in a wide variety of ways beyond the recovery purpose for which the logs were originally generated. For example, consuming processes may selectively provide the change information from the staging area to processes that reside external to the database server that produced the logs.

    According to one embodiment, the log capture process selectively captures information from a log file. For example, an asynchronous trigger may be defined to fire in response to a particular type of change made to a particular table. Consequently, when a transaction makes the particular type of change to the particular table (1) the database server will generate a redo record in response to the change, and (2) the trigger will fire and a capture process will capture the new redo record. Because the trigger is asynchronous, the execution of the capture process will not be performed as part of the transaction that caused the change. Thus, the transaction may proceed without waiting for the capture process, and the capture process may capture the new redo record some time after the change was made.

    Executing the capture process in response to the firing of an asynchronous trigger is merely one example of capture process operation. Alternatively, the log capture process may simply be programmed to check the appropriate logs for new records on a periodic basis. As another alternative, the log capture process may be executed in response to a synchronous trigger. When a synchronous trigger is used, the capturing operation is performed by the capture process as part of the transaction that made the change that caused the trigger to fire. Thus, the capture of the change is "synchronous" relative to the transaction that caused the change. However, any other activities in the chain of activities associated with the chain (e.g. staging, propagation, consumption) may still be performed asynchronous relative to that transaction.

    According to one embodiment, the capture process retrieves the change data extracted from the redo log, and formats the change data into an LCR. The capture process places the LCR into a staging area for further processing. In one embodiment, support is provided for both hot mining an online redo log, and mining archived log files. When hot mining is performed, the redo stream may be mined for change data at the same time it is, written, thereby reducing the latency of capture.

    As mentioned above, changes made to database objects in a typical database are logged in the redo log to guarantee recoverability in the event of user error or media failure. In one embodiment, an implicit capture process is a background process, executing within the database server that is managing a database, that reads the database redo log to capture DML and DDL changes made to database objects. After formatting these changes into LCRs, the implicit capture process enqueues them into a staging area.

    According to one embodiment, there are several types of LCRs, including: row LCRs contain information about a change to a row in table resulting from a DML operation, and DDL LCRs contain information about a DDL change to a database object. Users use rules to specify which changes are captured. FIG. 3 shows an implicit capture process capturing LCRs.

    As shall be explained in greater detail hereafter, users can specify "tags" for redo entries generated by a certain session or by an apply process. These tags then become part of the LCRs captured by a capture process. A tag can be used to determine whether a redo entry or an LCR contains a change that originated in the local database or at a different database, so that users can avoid sending LCRs back to the database where they originated. Tags may be used for other LCR tracking purposes as well. Users can also use tags to specify the set of destination databases for each LCR. Depending on the rules that have been established for the various components of information sharing system 100, the tag values associated with an LCR may be set, modified and/or transformed at various points as the LCR flows through the system. For example, for an LCR created for a change identified in a log file, a tag value may be set by the capture process to indicate the database in which the change originated. As another example, a tag value for an LCR may be set by a propagation process to indicate the system from which the propagation process is propagating the LCR.

    A capture process that mines logs for changes may reside either locally (in the system whose logs are being mined) or remotely (outside the system whose logs are being mined). Where the capture process is executing remotely, the logs may be exported from the system that generated them to the system in which the capture process is executing. For example, a capture process may be configured to mine the logs of a first database, and to store into a staging area LCRs for the various events represented in the logs. The capture process may actually be executing in a second database system. In this scenario, the log files may be communicated from the first database system to the second database system, for processing by the capture process in the second database system. The staging area into which the capture process stores the LCRs may also reside within the second database system. The ability to "offload" the overhead associated with the capture process in this matter may be useful for the purposes of load and resource balancing.

    Staging Areas

    As illustrated in FIG. 1, staging areas may be used to temporarily hold information between capture, distribution and consumption of the information. The nature of the staging area that is used to hold the information may vary depending on the nature of the information and the chain of activities triggered by the information. For example, the staging area used to hold information between capture, distribution and consumption of the information may take any of the following forms:
    • None: captured information is passed directly to a propagation or consumption process.
    • Journal: information in a recovery journal is used to find the captured events.
    • Basic: the information is held in a memory area that does not itself provide a recovery mechanism.
    • SQL: the information is stored, but not necessarily retained, in a data container that can be queried using a database language, such as SQL.
    • Documented: the same as the SQL option, except that the information is retained in the data container.


  • Staging areas with the characteristics described above may be implemented in a variety of ways, and the present invention in not limited to any specific implementation. For example, the SQL and Documented options may be implemented using the Advanced Queuing mechanisms in the Oracle 9iR2 database system currently available for Oracle Corporation. Further, the Advanced Queuing functionality may be used in conjunction with Oracle Workflow 2.6, also available for Oracle Corporation, to attain the ability to check events in the context of other events. For example, an explicit event (e.g. a message received from an application in a call made by the application through an API) can be seen in the context of other explicit events (e.g. other messages received from the same application). Similarly, an implicitly captured event (e.g. a change to data managed by a database server) can be seen in the context of other implicitly captured events (e.g. other database changes).

    In one embodiment, information sharing system 100 uses queues to stage events for propagation or consumption. Users can use information sharing system 100 to propagate events from one queue to another, and these queues can be in the same database or in different databases. The queue from which the events are propagated is called the source queue, and the queue that receives the events is called the destination queue. There can be a one-to-many, many-to-one, or many-to-many relationship between source and destination queues.

    Events that are staged in a queue can be consumed by one or more consuming processes, such as an apply processes or a user-defined subprogram. If users configure a propagation process (e.g. propagation process 118) to propagate changes from a source queue to a destination queue, then users can use rules to specify which changes are propagated. FIG. 4 shows propagation from a source queue to a destination queue.

    Directed Networks Overview

    Information sharing system 100 enables users to configure an environment where changes are shared through directed networks. A directed network is a network in which propagated events may pass through one or more intermediate databases before arriving at a destination database. The events may or may not be processed at an intermediate database. Using information sharing system 100, users can choose which events are propagated to each destination database, and users can specify the route events will traverse on their way to a destination database.

    FIG. 5 shows an example directed networks environment. In the example shown in FIG. 5, the queue at the intermediate database in Chicago is both a source queue and a destination queue.

    Explicit Enqueue and Dequeue of Events

    User applications can explicitly enqueue events into a staging area of information sharing system 100. User applications can format these events as LCRs, which allows an apply process to apply them at a destination database. Alternatively, these events can be formatted as user messages for consumption by another user application, which either explicitly dequeues the events or processes the events with callbacks from an apply process. Events that were explicitly enqueued into a queue can be explicitly dequeued from the same queue. FIG. 6 shows explicit enqueue of events into and dequeue of events from the same queue.

    When events are propagated between queues, events that were explicitly enqueued into a source queue can be explicitly dequeued from a destination queue by a user application without any intervention from an apply process. FIG. 7 shows explicit enqueue of events into a source queue, propagation to a destination queue, and then explicit dequeue of events from the destination queue.

    While many of the examples given herein involve the capture, propagation and application of LCRs, the techniques illustrated in those examples are equally applicable to any form of shared data. Such shared data may, for example, take the form of explicitly enqueued user messages, or even implicitly captured information that is organized in a format that differs from LCRs.

    Apply Process Overview

    According to one embodiment, an apply process is a background process, running within a database server, that dequeues events from a queue and either applies each event directly to a database object or passes the event as a parameter to a user-defined procedure called an apply handler. These apply handlers can include message handlers, DML handlers, and DDL handlers.

    According to one embodiment, an apply process is designed to be aware of transaction boundaries. For example, an apply process is aware of which changes, represented in the LCRs that the apply process is consuming, were initially made as part of the same transaction. The apply process assembles the changes into transactions, and applies the changes in a manner that takes into account the dependencies between the transactions. According to one embodiment, the apply process applies the changes in parallel, to the extent permitted by the dependencies between the transactions.

    Typically, an apply process applies events to the local database where it is running, but, in a heterogeneous database environment, it can be configured to apply events at a remote database that is a different type of database than the local database. For example, the local database may be a database created by a database server produced by one company, and the remote database may be a database created by a database server produced by another company. Users use rules to specify which events in the queue are applied. FIG. 8 shows an apply process processing LCRs and user messages.

    According to one embodiment, an apply process detects conflicts automatically when directly applying LCRs. Typically, a conflict results when the same row in the source database and destination database is changed at approximately the same time. When conflicts occur, users need a mechanism to ensure that the conflict is resolved in accordance with user-specified business rules. According to one embodiment, information sharing system 100 includes a variety of prebuilt conflict resolution handlers. Using these prebuilt handlers, users can define a conflict resolution system for each of the users' databases that resolves conflicts in accordance with user-specified business rules. If users have a unique situation that the prebuilt conflict resolution handlers cannot resolve, then users can build custom conflict resolution handlers. According to one embodiment, if a conflict is not resolved, or if a handler procedure raises an error, then all events in the transaction that raised the error are saved in an exception queue for later analysis and possible reexecution.

    As mentioned above, LCRs are merely one example of the type of shared information that may be handled by an apply process. Apply processes may be configured to "apply" any form of shared information, including explicitly enqueued user messages and automatically captured data that is not organized as an LCR.

    Rules-Driven Information Sharing

    As explained above, each of the activities in a chain of activities may be performed in a variety of ways. For example, propagation may be performed with "Best Effort" and "Open" characteristics, or "Exactly Once" and "Closed" characteristics. According to one embodiment of the invention, a rule registration mechanism is provided to allow users to register rules that specify:
    • a chain of activities to perform in response to a particular event, and
    • how each activity in the chain of activities is to be performed.


  • According to one embodiment, the registration mechanism is implemented within a database system. When an information sharing rule is registered with the database system, the database system generates and stores metadata (referred to herein as "rules metadata") that reflects the rule. In addition, the database system generates any mechanisms required to execute the rule. For example, assume that a user wants to use system 100 to replicate at a target database a table that exists in a source database. To program system 100 to carry out the replication, the user could register a set of rules that:
    • identify the database table that is to be replicated
    • identify the target database, and
    • specify the data capture, staging, propagation and consumption options for performing the replication.


  • In response to receipt of this set of rules, the database system would generate metadata to record the rules, and generate any supporting mechanisms to implement the rules. Such supporting mechanisms may include, for example, an asynchronous trigger for triggering execution of a capture process in response to modifications performed on the database table. The metadata might include, for example, (1) metadata that instructs the capture process about which log to capture information from, which information to capture, the capture options to use, and where to stage the captured information; (2) metadata that instructs a propagation process which information to propagate, how the information is to be transformed prior to propagation, where to propagate the data, etc. (3) metadata that instructs an apply process in the target database system where to receive the propagated information, how to process the propagated information, how to apply the propagated information to keep a table in the target database system in sync with the changes reflected in the propagated, information, etc.

    Rules Overview

    Information sharing system 100 enables users to control which information to share and where to share it using rules. A rule is specified as a condition that is similar to the condition in the WHERE clause of a SQL query, and users can group related rules together into rule sets. According to one embodiment, a rule includes a rule condition, a rule evaluation context, and a rule action context.

    The rule condition combines one or more expressions and operators and returns a Boolean value, which is a value of TRUE, FALSE, or NULL (unknown), based on an event.

    The rule evaluation context defines external data that can be referenced in rule conditions. The external data can either exist as external variables, as table data, or both.

    The rule action context is optional information associated with a rule that is interpreted by the client of the rules engine when the rule is evaluated.

    For example, the following rule condition may be used in information sharing system 100 to specify that the schema name that owns a table must be hr and the table name must be departments for the condition to evaluate to TRUE:
    • :dml.get_object_owner( )=‘hr’ AND :dml.get_object_name( )=‘departments’.


  • Within information sharing system 100, this rule condition may be used in the following ways:
    • To instruct a capture process to capture DML changes to the hr. departments table
    • To instruct a propagation to propagate DML changes to the hr. departments table
    • To instruct an apply process to apply DML changes to the hr. departments table.


  • Information sharing system 100 performs tasks based on rules. These tasks include capturing changes with a capture process, propagating changes with a propagation, and applying changes with an apply process. According to one embodiment, users can define rules for these tasks at three different levels: table rules, schema rules, and global rules.

    When users define a table rule, the task is performed when a change is made to the table that the users specify. For example, users can define a rule that instructs a capture process to capture changes to the hr.employees table. Given this rule, if a row is inserted into the hr. employees table, then the capture process captures the insert, formats it into an LCR, and enqueues the LCR into a queue.

    When users define a schema rule, the task is performed when a change is made to the database objects in the schema users specify, and any database objects added to the schema in the future. For example, users can define two rules that instruct a propagation to propagate DML and DDL changes to the hr schema from a source queue to a destination queue. Given these rules, suppose the source queue contains LCRs that define the following changes:

    The hr. loc city_ix index is altered.

    A row is updated in the hr. jobs table.

    The propagation propagates these changes from the source queue to the destination queue, because both changes are to database objects in the hr schema.

    When users define a global rule, the task is performed when a change is made to any database object in the database. If it is a global DML capture rule, then a capture process captures all DML changes to the database objects in the database. If it is a global DDL propagation or apply rule, then the task is performed for all DDL changes in a queue.

    The Rules Engine

    As mentioned above, the various components of system 100 may be designed with a default behavior that can be overridden by registering rules with system 100. When a rule is registered, metadata is generated within system 100 to reflect the rule. The various components of system 100 are configured to read the metadata and modify their behavior according to any rules reflected therein which (1) apply to them, and (2) apply to the context in which they are currently operating.

    For example, a particular user may register a rule that changes the propagation policy from a default "Exactly once" to a new value "Best effort" when the item being propagated is a particular type of message. The process responsible for propagating that particular type of message is configured to read the metadata and use a "Best effort" propagation technique when processing that particular type of message for that particular user. However, when propagating the same type of message for other users, the propagation process may continue to use the default "Exactly once" technique.

    In addition to overriding the default behavior of components, rules may be used to supplement the behavior. For example, a particular capture process may be configured to capture certain types of information and add the information to a staging area. Rules may be registered with system 100 which specify several additional tasks for the capture process to perform before, during, and/or after performing the task addressed by its default behavior. For example, the capture process may, based upon registered rules, be configured to perform numerous additional tasks when adding information to the staging area, such as (1) adding tags to the information before placing it in the staging area, and (2) sending out notifications to various entities after placing the information in the staging area.

    The various processes involved in registering and managing the rules used by the components of system 100 are collectively referred to herein as the "rules engine".

    Transformations Overview

    A rule-based transformation is any modification to an event that results when a rule evaluates to TRUE. For example, users can use a rule-based transformation when users want to change the datatype of a particular column in a table for an event. In this case, the transformation can be a PL/SQL function that takes as input a SYS.AnyData object containing a logical change record (LCR) with a NUMBER datatype for a column and returns a SYS.AnyData object containing an LCR with a VARCHAR2 datatype for the same column.

    According to one embodiment, a transformation can occur at the following times:
    • During enqueue of an event, which can be useful for formatting an event in a manner appropriate for all destination databases
    • During propagation of an event, which may be useful for subsetting data before it is sent to a remote site
    • During dequeue of an event, which can be useful for formatting an event in a manner appropriate for a specific destination database


  • FIG. 9 shows a rule-based transformation during apply.

    Heterogeneous Information Sharing Overview

    In addition to information sharing between databases produced by the same company, information sharing system 100 supports information sharing between databases from different companies. Typically, the features supported by the database system offered by one company differ from the features supported by database systems offered by other companies. Consequently, the task of sharing information between two different types of database systems can be quite challenging. As shall be described in greater detail hereafter, information sharing system 100 may be employed to significantly facilitate information sharing among such heterogeneous database systems.

    For the purpose of describing how information sharing system 100 may be used to share data among heterogeneous databases, it shall be assumed that data is to be shared between an Oracle database server and a non-Oracle database server. However, the techniques described herein are not limited to such a context. Thus, the actual types of databases within the heterogeneous systems in which these techniques are applied may vary from implementation to implementation.

    For the purpose of explanation, the database system that originally produces the information that is to be communicated to the other database system is referred to herein as the "source" database. Conversely, the database system that received the shared information is referred to as the "destination" database. If an Oracle database is the source and a non-Oracle database is the destination, then the non-Oracle database destination will typically lack the following components of information sharing system 100: a queue to receive events, and an apply process to dequeue and apply events.

    To share DML changes from an Oracle source database with a non-Oracle destination database, the Oracle database functions as a proxy and carries out some of the steps that would normally be done at the destination database. That is, the events intended for the non-Oracle destination database are dequeued in the Oracle database itself, and an apply process at the Oracle database uses Heterogeneous Services to apply the events to the non-Oracle database across a network connection through a gateway. FIG. 10 shows an Oracle database sharing data with a non-Oracle database.

    According to one embodiment, a custom application is used to capture and propagate changes from a non-Oracle database to an Oracle database. This application gets the changes made to the non-Oracle database by reading from transaction logs, using triggers, or some other method. The application assembles and orders the transactions and converts each change into a logical change record (LCR). Then, the application enqueues the LCRs into a queue in an Oracle database by using the PL/SQL interface, where they can be processed by an apply process. FIG. 11 shows a non-Oracle databases sharing data with an Oracle database.

    FIG. 12 shows how information sharing system 100 might be configured to share information within a single database, while FIGS. 13A and 13B show how information sharing system 100 might be configured to share information between two different databases.

    It should be noted that each of the various components involved in the information sharing operation shown in FIGS. 13A and 13B may operate according to rule sets stored in a rules engine. For example, the capture process used to capture changes made at the source database may operate according to rules registered by a user. The rules may dictate, among other things, which changes to capture, how to transform the changes, and how to generate and tag the LCRs that represent those changes. Similarly, the propagation process, the apply process, and the various handler procedures may all be rules-driven.

    According to one embodiment, these various components are designed with a default behavior that they perform in the absence of any registered rule set.

    Replication Example

    As mentioned above, information from the redo logs of a database server (hereinafter the "source server") may be selectively added to a staging area by a capture process. A consuming process may then selectively provide this information from the staging area to a process external to the source server. The change information may be, for example, provided to a process in a different database server (hereinafter the "target" database server). The process in the target database server may then use the change information from the source database server to maintain information that resides in the target database in sync with corresponding information in the source database server. For example, the process may update a table T1 in the target database server based on changes that were made to a table T2 in the source database server, so that T1 may serve as a replica of T2.

    An Oracle-Based Example of the Redo Log and Capture Process

    Every Oracle database has a set of two or more redo log files. The redo log files for a database are collectively known as the database's redo log. The primary function of the redo log is to record all changes made to the database.

    Redo logs are used to guarantee recoverability in the event of human error or media failure. According to one embodiment, a capture process of information sharing system 100 is implemented as an optional Oracle background process that reads the database redo log to capture DML and DDL changes made to database objects. When a capture process is configured to capture changes from a redo log, the database where the changes were generated is called the source database.

    Logical Change Records (LCRS)

    A capture process reformats changes captured from the redo log into LCRs. An LCR is an object that describes a database change. According to one embodiment, a capture process captures multiple types of LCRs, including row LCRs and DDL LCRs.

    After capturing an LCR, a capture process enqueues an event containing the LCR into a queue. A capture process is always associated with a single SYS.AnyData queue, and it enqueues events into this queue only. Users can create multiple queues and associate a different capture process with each queue. FIG. 3 shows a capture process capturing LCRs.

    A row LCR describes a change to the data in a single row or a change to a single LOB column in a row. The change results from a data manipulation language (DML) statement or a piecewise update to a LOB. For example, a DML statement may insert or merge multiple rows into a table, may update multiple rows in a table, or may delete multiple rows from a table. So, a single DML statement can produce multiple row LCRs. That is, a capture process creates an LCR for each row that is changed by the DML statement. Further, the DML statement itself may be part of a transaction that includes many DML statements.

    A captured row LCR may also contain transaction control statements. These row LCRs contain directives such as COMMIT and ROLLBACK. These row LCRs are internal and are used by an apply process to maintain transaction consistency between a source database and a destination database.

    According to one embodiment, each row LCR contains the following information:
    • The name of the source database where the row change occurred
    • The type of DML statement that produced the change, either INSERT, UPDATE, DELETE, LOB ERASE, LOB WRITE, or LOB TRIM
    • The schema name that contains the table with the changed row
    • The name of the table that contains the changed row
    • A raw tag that can be used to track the LCR
    • The identifier of the transaction in which the DML statement was run
    • The system change number (SCN) when the change was written to the redo log
    • The old values related to the change. If the type of the DML statement is UPDATE or DELETE, then these old values include some or all of the columns in the changed row before the DML statement. If the type of the DML statement INSERT, then there are no old values.
    • The new values related to the change. If the type of the DML statement is UPDATE or INSERT statement, then these new values include some or all of the columns in the changed row after the DML statement. If the type of the DML statement DELETE, then there are no new values.


  • A DDL LCR describes a data definition language (DDL) change. A DDL statement changes the structure of the database. For example, a DDL statement may create, alter, or drop a database object.

    According to one embodiment, each DDL LCR contains the following information:
    • The name of the source database where the DDL change occurred
    • The type of DDL statement that produced the change, for example ALTER TABLE or CREATE INDEX
    • The schema name of the user who owns the database object on which the DDL statement was run
    • The name of the database object on which the DDL statement was run
    • The type of database object on which the DDL statement was run, for example TABLE or PACKAGE
    • The text of the DDL statement
    • The logon user, which is the user whose session executed the DDL statement
    • The schema that is used if no schema is specified for an object in the DDL text
    • The base table owner. If the DDL statement is dependent on a table, then the base table owner is the owner of the table on which it is dependent.
    • The base table name. If the DDL statement is dependent on a table, then the base table name is the name of the table on which it is dependent.
    • A raw tag that can be used to track the LCR
    • The identifier of the transaction in which the DDL statement was run
    • The SCN when the change was written to the redo log.


  • Capture Rules

    According to one embodiment, a capture process within information sharing system 100 (e.g. capture process 116) captures changes based on rules that users define. Each rule specifies the database objects for which the capture process captures changes and the types of changes to capture. In one embodiment, users can specify capture rules at the following levels:
    • A table rule captures either DML or DDL changes to a particular table.
    • A schema rule captures either DML or DDL changes to the database objects in a particular schema.
    • A global rule captures either all DML or all DDL changes in the database.


  • Capture Process Rule Evaluation

    A running capture process completes the following series of actions to capture changes:

    1. Finds changes in the redo log.

    2. Performs prefiltering of the changes in the redo log. During this step, a capture process evaluates rules in its rule set at the object level and schema level to place changes found in the redo log into two categories: changes that should be converted into LCRs and changes that should not be converted into LCRs.

    Prefiltering is a safe optimization done with incomplete information. This step identifies relevant changes to be processed subsequently, such that:

    A change is converted into an LCR if one or more rules may evaluate to TRUE after conversion.

    A change is not converted into an LCR if the capture process can ensure that no rules would evaluate to TRUE after conversion.

    3. Converts changes that may cause one or more rules to evaluate to TRUE into LCRs based on prefiltering.

    4. Performs LCR filtering. During this step, a capture process evaluates rules regarding information in each LCR to separate the LCRs into two categories: LCRs that should be enqueued and LCRs that should be discarded.

    5. Discards the LCRs that should not be enqueued based on the rules.

    6. Enqueues the remaining captured LCRs into the queue associated with the capture process.

    For example, suppose the following rule is defined for a capture process: Capture changes to the hr. employees table where the department-id is 50. No other rules are defined for the capture process, and the parallelism parameter for the capture process is set to 1.

    Given this rule, suppose an UPDATE statement on the hr. employees table changes 50 rows in the table. The capture process performs the following series of actions for each row change:

    1. Finds the next change resulting from the UPDATE statement in the redo log.

    2. Determines that the change resulted from an UPDATE statement to the hr. employees table and must be captured. If the change was made to a different table, then the capture process ignores the change.

    3. Captures the change and converts it into an LCR.

    4. Filters the LCR to determine whether it involves a row where the department id is 50.

    5. Either enqueues the LCR into the queue associated with the capture process if it involves a row where the department-id is 50, or discards the LCR if it involves a row where the department-id is not 50 or is missing.

    Event Staging and Propagation Overview

    Information sharing system 100 uses queues of type SYS.AnyData to stage events. There are two types of events that can be staged in a queue: logical change records (LCRs) and user messages. LCRs are objects that contain information about a change to a database object, while user messages are custom messages created by users or applications. Both types of events are of type SYS.AnyData and can be used for information sharing within a single database or between databases.

    Staged events can be consumed or propagated, or both. These events can be consumed by an apply process or by a user application that explicitly dequeues them. Even after an event is consumed, it may remain in the queue if users have also configured information sharing system 100 to propagate the event to one or more other queues or if message retention is specified. These other queues may reside in the same database or in different databases. In either case, the queue from which the events are propagated is called the source queue, and the queue that receives the events is called the destination queue. There can be a one-to-many, many-to-one, or many-to-many relationship between source and destination queues. FIG. 4 shows propagation from a source queue to a destination queue.

    According to one embodiment, the ordering of information items is maintained during the propagation of the data items. Maintaining the order is particularly useful when the order of the items has functional ramifications. For example, if the items being propagated are changes made to a database system, it is important to maintain the order so that propagated changes are made in the target system after the propagated changes on which they depend.

    Users can create, alter, and drop a propagation, and users can define propagation rules that control which events are propagated. The user who owns the source queue is the user who propagates events. This user must have the necessary privileges to propagate events. These privileges include the following:
    • Execute privilege on the rule set used by the propagation
    • Execute privilege on all transformation functions used in the rule set
    • Enqueue privilege on the destination queue if the destination queue is in the same database.


  • Captured and User-Enqueued Events

    According to one embodiment, events can be enqueued in two ways:
    • A capture process enqueues captured changes in the form of events containing LCRs. An event containing an LCR that was originally captured and enqueued by a capture process is called a captured event.


  • A user application enqueues user messages of type SYS . AnyData. These user messages can contain LCRs or any other type of message. Any user message that was explicitly enqueued by a user or an application is called a user-enqueued event. Events that were enqueued by a user procedure called from an apply process are also user-enqueued events.

    Thus, each captured event contains an LCR, but a user-enqueued event may or may not contain an LCR. Propagating a captured event or a user-enqueued event enqueues the event into the destination queue.

    According to one embodiment, events can be dequeued in two ways:
    • An apply process dequeues either captured or user-enqueued events. If the event contains an LCR, then the apply process can either apply it directly or call a user-specified procedure for processing. If the event does not contain an LCR, then the apply process can invoke a user-specified procedure called a message handler to process it.
    • A user application explicitly dequeues user-enqueued events and processes them. Captured events cannot be dequeued by a user application; they must be dequeued by an apply process. However, if a user procedure called by an apply process explicitly enqueues an event, then the event is a user-enqueued event and can be explicitly dequeued, even if the event was originally a captured event.


  • The dequeued events may have originated at the same database where they are dequeued, or they may have originated at a different database.

    Event Propagation Between Queues

    Users can use information sharing system 100 to configure event propagation between two queues, which may reside in different databases. Information sharing system 100 uses job queues to propagate events.

    According to one embodiment, a propagation is between a source queue and a destination queue. Although propagation is between two queues, a single queue may participate in many propagations. That is, a single source queue may propagate events to multiple destination queues, and a single destination queue may receive events from multiple source queues. According to one embodiment, only one propagation is allowed between a particular source queue and a particular destination queue. Also, a single queue may be a destination queue for some propagations and a source queue for other propagations.

    A propagation may propagate all of the events in a source queue to the destination queue, or a propagation may propagate only a subset of the events. Also, a single propagation can propagate both captured and user-enqueued events. Users can use rules to control which events in the source queue are propagated to the destination queue.

    Depending on how users set up the information sharing system 100 environment, changes could be sent back to the site where they originated. Users need to ensure that the environment is configured to avoid cycling the change in an endless loop. Users can use tags to avoid such a change cycling loop.

    Propagation Rules

    A propagation propagates events based on rules that users define. For events, each rule specifies the database objects for which the propagation propagates changes and the types of changes to propagate. Users can specify propagation rules for events at the following levels:
    • A table rule propagates either DML or DDL changes to a particular table.
    • A schema rule propagates either DML or DDL changes to the database objects in a particular schema.
    • A global rule propagates either all DML or all DDL changes in the source queue.


  • For non-LCR events and for LCR events with special needs, users can create their own rules to control propagation.

    A queue subscriber that specifies a condition causes the system to generate a rule. The rule sets for all subscribers to a queue are combined into a single system-generated rule set to make subscription more efficient.

    Apply Process Overview

    According to one embodiment, an apply process is a background process that; dequeues logical change records (LCRs) and user messages from a specific queue and either applies each one directly or passes it as a parameter to a user-defined procedure. The LCRs dequeued by an apply process contain data manipulation language (DML) changes or data definition language (DDL) changes that an apply process can apply to database objects in a destination database. A user-defined message dequeued by an apply process is of type SYS.AnyData and can contain any user message, including a user-created LCR.

    Events applied by an apply process are applied by an apply user. The apply user is the user who applies all DML statements and DDL statements and who runs user-defined apply handlers.

    Apply Rules

    An apply process applies changes based on rules that users define. Each rule specifies the database objects to which an apply process applies changes and the types of changes to apply. Users can specify apply rules at the following levels:
    • A table rule applies either DML or DDL changes to a particular table. Subset rules are table rules that include a subset of the changes to a particular table.
    • A schema rule applies either DML or DDL changes to the database objects in a particular schema.
    • A global rule applies either all DML or all DDL changes in the queue associated with an apply process.


  • For non-LCR events and for LCR events with special needs, users can create their own rules to control apply process behavior.

    Event Processing with an Apply Process

    An apply process is a flexible mechanism for processing the events in a queue. Users have options to consider when users configure one or more apply processes for your environment. This section discusses the types of events that an apply process can apply and the ways that it can apply them.

    According to one embodiment, a single apply process can apply either captured events or user-enqueued events, but not both. If a queue at a destination database contains both captured and user-enqueued events, then the destination database must have at least two apply processes to process the events.

    According to one embodiment, when users create an apply process, users use an apply captured parameter to specify whether the apply process applies captured or user-enqueued events.

    The database where an event originated is important to an apply process for captured events but not for user-enqueued events. For a captured event, the source database is the database where the change was generated in the redo log. According to one embodiment, for a user-enqueued event, an apply process ignores information about the database where the event originated, even if the event is a user-enqueued LCR. A single apply process can apply user-enqueued events that originated at different databases.

    Event Processing Options

    Options for event processing depend on the kind of event received by an apply process. FIG. 8 shows the event processing options for an apply process.

    Captured LCRs from multiple databases may be sent to a single destination queue. If a single queue contains captured LCRs from multiple databases, then one or more apply processes may be used to retrieve these LCRs. When multiple apply processes are used, each of these apply processes may be configured to receive captured LCRs from exactly one source database using rules.

    If there are multiple capture processes running on a source database, and LCRs from more than one of these capture processes are applied at a destination database, then one or more apply processes may be used to apply the changes.

    Users can configure an apply process to process a captured or user-enqueued event that contains an LCR in the following ways: directly apply the event or pass the event as a parameter to a user procedure for processing. The following sections explain these options.

    Apply the LCR Event Directly: If users use this option, then an apply process applies the event without running a user procedure. The apply process either successfully applies the change in the LCR to a database object or, if a conflict or an apply error is encountered, tries to resolve the error with a conflict handler or a user-specified procedure called an error handler.

    If a conflict handler can resolve the conflict, then it either applies the LCR or it discards the change in the LCR. If the error handler can resolve the error, then it should apply the LCR, if appropriate. An error handler may resolve an error by modifying the LCR before applying it. If the error handler cannot resolve the error, then the apply process places the transaction, and all LCRs associated with the transaction, into an exception queue.

    Call a User Procedure to Process the LCR Event: If users use this option, then an apply process passes the event as a parameter to a user procedure for processing. The user procedure can then process the event in a customized way.

    A user procedure that processes row LCRs resulting from DML statements is called a DML handler, while a user procedure that processes DDL LCRs resulting from DDL statements is called a DDL handler. An apply process can have many DML handlers and DDL handlers.

    For each table associated with an apply process, users can set a separate DML handler to process each of the following types of operations in row LCRs:

    Insert Update Delete Lob Update.

    For example, the hr. employees table may have one DML handler to process INSERT operations and a different DML handler to process UPDATE operations.

    A user procedure can be used for any customized processing of LCRs. For example, if users want each insert into a particular table at the source database to result in inserts into multiple tables at the destination database, then users can create a user procedure that processes INSERT operations on the table to accomplish this. Or, if users want to log DDL changes before applying them, then users can create a user procedure that processes DDL operations to accomplish this.

    Non-LCR User Message Processing

    A user-enqueued event that does not contain an LCR is processed by the message handler specified for an apply process, if the user-enqueued event satisfies at least one rule in the rule set for the apply process. A message handler is a user-defined procedure that can process non-LCR user messages in a customized way for your environment.

    The message handler offers advantages in any environment that has applications that need to update one or more remote databases or perform some other remote action. These applications can enqueue user messages into a queue at the local database, and information sharing system 100 can propagate each user message to the appropriate queues at destination databases. If there are multiple destinations, then information sharing system 100 provides the infrastructure for automatic propagation and processing of these messages at these destinations. If there is only one destination, then information sharing system 100 still provides a layer between the application at the source database and the application at the destination database, so that, if the application at the remote database becomes unavailable, then the application at the source database can continue to function normally.

    For example, a message handler may format a user message into an electronic mail message. In this case, the user message may contain the attributes users would expect in an electronic mail message, such as from, to, subject, text-of-message, and so on. A message handler could convert these user messages into electronic mail messages and send them out through an electronic mail gateway.

    Apply Process Components

    According to an embodiment of the invention, an apply process includes a reader server, a coordinator process, and one or more apply servers.

    The reader server dequeues events. The reader server is a parallel execution server that computes dependencies between LCRs and assembles events into transactions. The reader server then returns the assembled transactions to the coordinator, which assigns them to idle apply servers.

    The coordinator process gets transactions from the reader and passes them to apply servers. The apply servers apply LCRs to database objects as DML or DDL statements or that pass the LCRs to their appropriate handlers. For non-LCR messages, the apply servers pass the events to the message handler. Each apply server is a parallel execution server. If an apply server encounters an error, it then tries to resolve the error with a user-specified error handler. If an apply server cannot resolve an error, then it rolls back the transaction and places the entire transaction, including all of its events, in an exception queue.

    When an apply server commits a completed transaction, this transaction has been applied. When an apply server places a transaction in an exception queue and commits, this transaction also has been applied.

    If a transaction being handled by an apply server has a dependency with another transaction that is not known to have been applied, then the apply server contacts the coordinator and waits for instructions. The coordinator monitors all of the apply servers to ensure that transactions are applied and committed in the correct order.

    For example, consider these two transactions:

    1. A row is inserted into a table.

    2. The same row is updated to change certain column values.

    In this case, transaction 2 is dependent on transaction 1, because the row cannot be updated until after it is inserted into the table. Suppose these transactions are captured from the redo log at a source database, propagated to a destination database, and applied at the destination database. Apply server A handles the insert transaction, and apply server B handles the update transaction.

    If apply server B is ready to apply the update transaction before apply server A has applied the insert transaction, then apply server B waits for instructions from the coordinator. After apply server A has applied the insert transaction, the coordinator process instructs apply server B to apply the update transaction.

    The Components of a Rule

    According to one embodiment, a rule is a database object that enables a client to perform an action when an event occurs and a condition is satisfied. Rules are evaluated by a rules engine which, according to one embodiment, is built into a database server that manages information sharing system 100. Both user-created applications and information sharing system 100, can be clients of the rules engine. According to one embodiment, a rule consists of the following components:
    • Rule Condition
    • Rule Evaluation Context (optional)
    • Rule Action Context (optional).


  • Each rule is specified as a condition that is similar to the condition in the WHERE clause of a SQL query. Users can group related rules together into rule sets. A single rule can be in one rule set, multiple rule sets, or no rule sets.

    A rule condition combines one or more expressions and operators and returns a Boolean value, which is a value of TRUE, FALSE, or NULL (unknown). An expression is a combination of one or more values and operators that evaluate to a value. A value can be data in a table, data in variables, or data returned by a SQL function or a PL/SQL function. For example, the following condition consists of two expressions (department-id and 30) and an operator (-):
    • department id=30.


  • This logical condition evaluates to TRUE for a given row when the department-id column is 30. Here, the value is data in the department id column of a table.

    A single rule condition may include more than one condition combined with the AND, OR, and NOT conditional operators to form compound conditions. For example, consider the following compound condition:
    • department id=30 OR job_title=‘Programmer’.


  • This rule condition contains two conditions joined by the OR conditional operator. If either condition evaluates to TRUE, then the rule condition evaluates to TRUE. If the conditional operator were AND instead of OR, then both conditions would have to evaluate to TRUE for the entire rule condition to evaluate to TRUE.

    Variables in Rule Conditions

    Rule conditions may contain variables. According to one embodiment, variables in rule conditions are preceded with a colon (:). The following is an example of a variable used in a rule condition:
    • :x=55.


  • Variables enable users to refer to data that is not stored in a table. A variable may also improve performance by replacing a commonly occurring expression. Performance may improve because, instead of evaluating the same expression multiple times, the variable is evaluated once.

    A rule condition may also contain an evaluation of a call to a subprogram. These conditions are evaluated in the same way as other conditions. That is, they evaluate to a value of TRUE, FALSE, or unknown. The following is an example of a condition that contains a call to a simple function named is_Manager that determines whether an employee is a manager:
    • is_manager(employee id)=‘Y’.


  • Here, the value of employee id is determined by data in a table where employee_id is a column.

    Users can use user-defined types for variables. Therefore, variables can have attributes. When a variable has attributes, each attribute contains partial data for variable. In rule conditions, users specify attributes using dot notation. For example, the following condition evaluates to TRUE if the value of attribute z in variable y is 9:
    • :y.z=9.


  • Simple Rule Conditions

    A simple rule condition is a condition that has either of the following forms:
    • simple-rule-expression operator constant
    • constant operator simple-rule-expression.


  • The Components of a Rule

    In a simple rule condition, a simple rule expression is one of the following:
    • Table column
    • Variable
    • Variable attribute
    • Method result where the method takes no arguments and the method result can be returned by the variable method function, so that the expression is either a numerical or character type.


  • For table columns, variables, and variable attributes, all numeric (NUMBER, FLOAT, DOUBLE, INTEGER) and character (CHAR, VARCHAR2) types are supported. Use of other types of expressions results in non-simple rule conditions.

    In a simple rule condition, an operator is one of the following:
    • =, <=, or >=.


  • Use of other operators results in non-simple rule conditions. A constant is a fixed value. A constant can be:

    A number, such as 12 or 5. 4 A character, such as x or $.

    A character string, such as "this is a string" Therefore, the following conditions are simple rule conditions: tabl.col=5
    • :v1>‘aaa’
    • :v2.a1<10.01
    • :v3.m( )=10.


  • Rule Set Evaluation

    The rules engine evaluates rule sets based on events. An event is an occurrence that is defined by the client of the rules engine. The client initiates evaluation of an event by calling the DBMS-RULE. EVALUATE procedure. The information specified by the client when it calls the DBMS-RULE. EVALUATE procedure includes the following:

    The name of the rule set that contains the rules to use to evaluate the event The evaluation context to use for evaluation. Only rules that use the specified evaluation context are evaluated.

    Table values and variable values: The table values contain rowids that refer to the data in table rows, and the variable values contain the data for explicit variables. Values specified for implicit variables override the values that might be obtained using a variable value evaluation function. If a specified variable has attributes, then the client can send a value for the entire variable, or the client can send values for any number of the variable's attributes. However, clients cannot specify attribute values if the value of the entire variable is specified.

    An optional event context: An event context is a variable-length array of type SYS. RE$NV_LIST that contains name-value pairs that contain information about the event. This optional information is not directly used or interpreted by the rules engine. Instead, it is passed to client callbacks, such as an evaluation function, a variable value evaluation function (for implicit variables), and a variable method function.

    The client can also send other information about the event and about how to evaluate the event using the DBMS-RULE. EVALUATE procedure. For example, the caller may specify if evaluation must stop as soon as the first TRUE rule or the first MAYBE rule (if there are no TRUE rules) is found.

    The rules engine uses the rules in the specified rule set to evaluate the event. Then, the rules engine returns the results to the client. The rules engine returns rules using the two OUT parameters in the EVALUATE procedure: true-rules and maybe_rules. That is, the true rules parameter returns rules that evaluate to TRUE, and, optionally, the maybe_rules parameter returns rules that may evaluate to TRUE given more information.

    FIG. 14 shows the rule set evaluation process:

    1. A client-defined event occurs.

    2. The client sends the event to the rules engine by running the DBMS_RULE.EVALUATE procedure.

    3. The rules engine evaluates the event based on rules in the rule set and the relevant evaluation context. The client specifies both the rule set and the evaluation context in the call to the DBMS_RULE.EVALUATE procedure. Only rules that are in the specified rule set and use the specified evaluation context are used for evaluation.

    4. The rules engine obtains the results of the evaluation. Each rule evaluates to either TRUE, FALSE, or NULL (unknown).

    5. The rules engine returns rules that evaluated to TRUE to the client. Each returned rule is returned with its entire action context, which may contain information or may be NULL.

    6. The client performs actions based on the results returned by the rules engine. The rules engine does not perform actions based rule evaluations.

    Overview of how Ruins are Used in Information Sharing System 100

    In information sharing system 100, each of the following mechanisms is a client of a rules engine, when the mechanism is associated with a rule set: a capture process, a propagation, and an apply process.

    In one embodiment, each of these mechanisms can be associated with at most one rule set. However, a single rule set can be used by multiple capture processes, propagations, and apply processes within the same database. FIG. 15 illustrates how multiple clients of a rules engine can use one rule set.

    Specifically, users use rule sets in Information sharing system 100 to do the following:

    (1) Specify the changes a capture process captures from the redo log. That is, if a change found in the redo log causes any rule in the rule set associated with a capture process to evaluate to TRUE, then the change is captured by the capture process.

    (2) Specify the events a propagation propagates from one queue to another. That is, if an event in a queue causes any rule in the rule set associated with a propagation to evaluate to TRUE, then the event is propagated by the propagation.

    (3) Specify the events an apply process retrieves from a queue. That is, if an event in a queue causes any rule in the rule set associated with an apply process to evaluate to TRUE, then the event is retrieved and processed by the apply process.

    In the case of a propagation or an apply process, the events evaluated against the rule sets can be captured events or user-enqueued events.

    If there are conflicting rules associated with a mechanism, then the mechanism performs the task if either rule evaluates to TRUE. For example, if a rule set associated with a capture process contains one rule that instructs the capture process to capture DML changes to the hr. employees table, but another rule in the rule set instructs the capture process not to capture DML changes to the hr.employees table, then the capture process captures DML changes to the hr. employees table.

    System-Created Rules

    Information sharing system 100 performs three tasks based on rules: Capturing changes with a capture process, propagating changes with a propagation, and applying changes with an apply process. Both user-created and system-created rules can be used to govern how each of these tasks is performed. Further, for any one of these tasks may be governed by a single rule set that includes both system-created rules and user-created rules.

    A system-created rule specifies one of the following levels of granularity for a task: table, schema, or global. This section describes each of these levels. Users can specify more than one level for a particular task. For example, users can instruct a single apply process to perform table-level apply for specific tables in the oe schema and schema-level apply for the entire hr schema.

    Table 6-1 shows what each level of rule means for each Information sharing system 100 task.

    Types of Tasks and Rule Levels
    Task Table Rule Schema Rule Global Rule Capture Capture the Capture the changes Capture the changes changes in the in the redo log for to all the database redo log for the the database objects objects in the data- specified table, in the specified base, convert them convert them into schema, convert into LCRs, and logical change them into LCRs, enqueue them. records (LCRs), and enqueue them. and enqueue them. Propagate Propagate the Propagate the LCRs Propagate all of the LCRs relating to related to the data- changes in the the specified table base objects in the source queue to the in the source specified schema in destination queue. queue to the the source queue to destination queue. the destination queue. Apply Apply all or a Apply the LCRs in Apply all of the subset of the the queue relating to LCRs in the queue. LCRs in the the database objects queue relating to in the specified the specified schema. table.

    Rule-Based Transformations and a Capture Process

    If a capture process uses a rule set, then both of the following conditions must be met in order for a transformation to be performed during capture:

    A rule evaluates to TRUE for a particular change found in the redo log.

    An action context containing a name-value pair with a particular, system-recognized name.

    A TRANSFORM FUNCTION is returned to the capture process when the rule is evaluated.

    Given these conditions, the capture process completes the following steps:
    • 1. Formats the change in the redo log into an LCR


  • 2. Converts the LCR into a SYS.AnyData object
    • 3. Runs the PL/SQL function in the name-value pair to transform the SYS.AnyData object
    • 4. Enqueues the transformed SYS.AnyData object into the queue associated with the capture process.


  • FIG. 16 shows a transformation during capture. For example, if an event is transformed during capture, then the transformed event is enqueued into the source queue. Therefore, if such a captured event is propagated from the dbs1.net database to the dbs2.net and the dbs3.net databases, then the queues at dbs2 net and dbs3.net will contain the transformed event after propagation.

    The advantages of performing transformations during capture are the following:

    Security can be improved if the transformation removes or changes private information, because this private information does not appear in the source queue and is not propagated to any destination queue.

    Space consumption may be reduced, depending on the type of transformation performed. For example, a transformation that reduces the amount of data results in less data to enqueue, propagate, and apply.

    Transformation overhead is reduced when there are multiple destinations for a transformed event, because the transformation is performed only once at the source, not at multiple destinations.

    The possible disadvantages of performing transformations during capture are the following:
    • All sites receive the transformed event.
    • The transformation overhead occurs in the source database.
    • Rule-Based Transformation Errors During Capture.


  • If an error occurs when the transformation function is run during capture, then the change is not captured, the error is returned to the capture process, and the capture process is disabled. Before the capture process can be enabled, users must either change or remove the rule-based transformation to avoid the error.

    Rule-Based Transformations and Propagation

    If a propagation uses a rule set, then both of the following conditions must be met in order for a transformation to be performed during propagation:
    • A rule evaluates to TRUE for an event in the source queue for the propagation. This event can be a captured or a user-enqueued event.


  • An action context containing a name-value pair with a particular, system-recognized name
    • A TRANSFORM-FUNCTION is returned to the propagation when the rule is evaluated.


  • Given these conditions, the propagation completes the following steps:
    • 1. Starts dequeuing the event from the source queue
    • 2. Runs the PL/SQL function in the name-value pair to transform the event
    • 3. Completes dequeuing the transformed event
    • 4. Propagates the transformed event to the destination queue.


  • FIG. 17 shows a transformation during propagation. In several of the examples given hereafter, the information being transformed is in the form of an LCR. However, as explained above, LCRs are only one type of information that can be shared using system 100. Thus, the various techniques described herein, including rule-based transformations, apply equally regardless of the form of the information that is being shared.

    Referring again to FIG. 17, suppose users use a rule-based transformation for a propagation from the dbs 1.net database to the dbs2 .net database, but users do not use a rule-based transformation for a propagation from the dbs1.net database to the dbs3 .net database. In this case, an event in the queue at dbs1.net can be transformed before it is propagated to dbs2.net, but the same event can remain in its original form when it is propagated to dbs3.net. In this case, after propagation, the queue at dbs2.net contains the transformed event, and the queue at dbs3.net contains the original event.

    The advantages of performing transformations during propagation are the following:

    Security can be improved if the transformation removes or changes private information before events are propagated.

    Some destination queues can receive a transformed event, while other destination queues can receive the original event.

    Different destinations can receive different variations of the same event. The possible disadvantages of performing transformations during propagation are the following:

    Once an event is transformed, any database to which it is propagated after the first propagation receives the transformed event. For example, if dbs2.net propagates the event to dbs4. net, then dbs4.net receives the transformed event.

    When the first propagation in a directed network performs the transformation, the transformation overhead occurs on the source database.

    The same transformation may be done multiple times when multiple destination databases need the same transformation.

    If an error occurs when the transformation function is run during propagation, then the event that caused the error is not dequeued, the event is not propagated, and the error is returned to the propagation. Before the event can be propagated, users must change or remove the rule-based transformation to avoid the error.

    Rule-Based Transformations and an Apply Process

    If an apply process uses a rule set, then both of the following conditions must be met in order for a transformation to be performed during apply:
    • A rule evaluates to TRUE for an event in the queue associated with the apply process. This event can be a captured or a user-enqueued event.


  • An action context containing a name-value pair with a particular, system-recognized name
    • A TRANSFORM_FUNCTION is returned to the apply process when the rule is evaluated.


  • Given these conditions, the apply process completes the following steps:
    • 1. Starts to dequeue the event from the queue
    • 2. Runs the PL/SQL function in the name-value pair to transform the event during dequeue
    • 3. Completes dequeuing the transformed event
    • 4. Applies the transformed event.


  • For example, suppose an event is propagated from the dbs1.net database to the dbs2.net database in its original form. When the apply process dequeues the event from a queue at dbs2. net, the event is transformed.

    The possible advantages of performing transformations during apply are the following:

    Any database to which the event is propagated after the first propagation can receive the event in its original form. For example, if dbs2.net propagates the event to dbs4. net, then dbs4.net can receive the original event.

    The transformation overhead does not occur on the source database when the source and destination database are different.

    The possible disadvantages of performing transformations during apply are the following:

    Security may be a concern if the events contain private information, because all databases to which the events are propagated receive the original events.

    The same transformation may be done multiple times when multiple destination databases need the same transformation.

    Rule-Based Transformation Errors During Apply Process Dequeue

    If an error occurs when the transformation function is run during apply process dequeue, then the event that caused the error is not dequeued, the transaction containing the event is not applied, the error is returned to the apply process, and the apply process is disabled. Before the apply process can be enabled, users must change or remove the rule-based transformation to avoid the error.

    Integration with Gateways

    According to one embodiment, an apply process may be configured to "apply" a set of LCRs to a database by (1) reading the LCRs to identify the changes reflected in the LCRs, (2) constructing a database command (e.g. a SQL command) that will cause the desired changes, and (3) executing the database command against the database.

    According to one embodiment, the apply process may be configured to construct a remote SQL statement for a database other than the database that the originally made the change reflected in the LCR. When executed within a remote database, the SQL statement will cause the desired changes to be made at the remote database.

    Once such a remote SQL statement is constructed, the SQL statement may be sent to the remote database through a gateway. The gateway may be configured, for example, to transform the query as necessary when the remote database is a different type of database than the source database. For example, a set of LCRs may be created in response to changes made in an Oracle database. Based on the LCRs, an apply process may construct a remote SQL query, and send the SQL query to a gateway. The gateway may then transform the SQL as necessary prior to forwarding the query to a non-Oracle data store. The non-Oracle data store may then execute the query to effect changes, asynchronously and remotely, in response to the changes, made to the Oracle database, upon which the LCRs were originally based.

    Integration with Flashback

    Various database languages, such as SQL (Structured Query Language), support special-purpose constructs referred to herein as "cursors". Prior to retrieving the results of a specific query statement, the DBMS may perform a significant amount of preliminary work for the statement, such as parsing, semantic analysis, and query plan generation. A cursor stores the results of much of this preliminary work. Consequently, when a query statement arrives, the DBMS first attempts to match the statement to statements for which cursors have already been created. If a match is found, the cursor is shared by the query statements, and the overhead work is avoided.

    A "flashback cursor" is a particular type of cursor that is used to access past data. A flashback cursor is created in response to receipt of a "flashback query". Unlike conventional queries, flashback queries specify a flashback time, and return data as it existed at the specified flashback time. One technique for handling flashback queries is described in patent application Ser. No. 09/676,305, filed Sep. 29, 2000, entitled SYSTEM AND METHOD FOR PROVIDING FINE-GRAINED TEMPORAL DATABASE ACCESS, by JONATHAN D. KLEIN, et al, the contents of which are incorporated herein by this reference.

    According to one embodiment, flashback queries and cursors can be used in conjunction with information sharing system 100 to make decisions about how to handle a change in a manner that is both (1) asynchronous to the change, and (2) takes into account the state of the system at the time of the change.

    For example, assume that a user makes a change to a source database at time T10. The change is reflected in the redo log at the source database. Eventually, a capture process reads the log and generates an LCR that corresponds to the change. The LCR is then stored in a staging area.

    According to one embodiment, the time at which the change was made permanent (committed) at the source database is stored in the LCR. Eventually, an apply process reads the LCR and passes the LCR to an update handler. By the time the update handler receives the LCR, the state of the system may have significantly changed relative to the state of the system at time T10. The update handler may read the change time T10 from the LCR and execute a flashback query to see the state in which the database system existed at the time the change was originally made (at time T10). The update handler may then determine what actions to take in response to the change based on the condition of the database system at T10.

    Flashback queries are generally able to specify the same types of operations as standard queries. Thus, the flashback queries used by the update handler to see the previous state of the system may involve performing complex operations using values that existed at that previous time. For example, the flashback query could perform complex joins and comparisons, all of which would be performed on the data values that existed at the previous point in time, in order to determine what actions to take in response to an LCR that identifies a change made at that previous, point in time.

    Tags and Cycle Avoidance

    As mentioned above, the various components of information sharing system 100 may be configured such that a particular event main initiate a complex chain of activities. Because each activity in a chain (e.g. the propagation of the event from one staging area to another) may itself initiate another chain of activities, it is possible for cycles to form. For example, assume that the components to information sharing system 100 are configured to propagate changes made to a first database to a second database, and to propagate changes made to the second database to the first database. In this scenario, the event associated with a change in the first database would be propagated to and applied at the second database. However, the application of the event at the second database would constitute a change to the second database. The event for that change at the second database would (without a mechanism for cycle avoidance) be propagated back to and applied at the first database. The application of the event at the second database would constitute a "change" to the first database, which would cause the entire process to repeat itself. According to one embodiment, the various components of information sharing system 100 set tags and inspect tags in a manner that avoids perpetuating such cycles.

    Introduction to Tags

    According to one embodiment, every redo entry in the redo log has a tag associated with it. The datatype of the tag is RAW. By default, when a user or application generates redo entries, the value of the tag is NULL for each redo entry, and a NULL tag consumes no space in the redo entry.

    Mechanisms are provided to allow users to configure to components of information sharing system 100 to customize how the components (1) set tag values, (2) inspect tag values, and (3) interpret and use the tag values, at various stages in an information sharing operation. For example, a tag can be used to determine whether an LCR contains a change that originated in the local database or at a different database, so that users can avoid change cycling (sending an LCR back to the database where it originated). Tags may be used for other LCR tracking purposes as well. Users can also use tags to specify the set of destination databases for each LCR.

    According to one embodiment, a variety of mechanisms are provided to allow users to control the value of the tags generated in the redo log. These mechanisms include, but are not limited to procedures referred to hereafter as SET_TAG, CREATE_APPLY, and ALTER_APPLY.

    The SET_TAG procedure is used to specify the value of the redo tags generated in the current session. When a database change is made in the session, the tag becomes part of the redo entry that records the change. Different sessions can have the same tag setting or different tag settings.

    The CREATE_APPLY and ALTER_APPLY procedures are used to control the value of the redo tags generated when an apply process runs. All sessions coordinated by the apply process coordinator use this tag setting. By default, redo entries generated by an apply process have a tag value that is the hexadecimal equivalent of ‘00’ (double zero).

    These tags become part of the LCRs captured by a capture process retrieving changes from the redo log. Based on the rules in the rule set for the capture process, the tag value in the redo entry for a change may determine whether or not the change is captured.

    Similarly, once a tag is part of an LCR, the value of the tag may determine whether a propagation propagates the LCR and whether an apply process applies the LCR. The behavior of a transformation, DML handler, or error handler can also depend on the value of the tag. In addition, users can set the tag value for an existing LCR using the SET TAG member procedure for the LCR. For example, users may set a tag in an LCR during a transformation.

    According to one embodiment, users create rules, by default each rule contains a condition that evaluates to TRUE only if the tag is NULL. In DML rules, the condition is the following:
    • :dml.is null tag( )=‘Y’.


  • In DDL rules, the condition is the following:
    • :ddl.is null_tag( )=‘Y’.


  • Consider a rule set with a single rule and assume the rule contains such a condition. In this case, capture processes, propagations, and apply processes behave in the following way:
    • A capture process captures a change only if the tag in the redo log for the change is NULL and the rest of the rule conditions evaluate to TRUE for the change.
    • A propagation propagates an event containing an LCR only if the tag in the LCR is NULL and the rest of the rule conditions evaluate to TRUE for the LCR.
    • An apply process applies an event containing an LCR only if the tag in the LCR is NULL and the rest of the rule conditions evaluate to TRUE for the LCR.


  • Specifically, the following procedures are provided to create rules that contain one of these conditions by default:
    • ADD_GLOBAL_PROPAGATION_RULES
    • ADD_GLOBAL_RULES
    • ADD-SCHEMA_PROPAGATION_RULES
    • ADD_SCHEMA_RULES
    • ADD_SUBSET_RULES
    • ADD_TABLE_PROPAGATION_RULES
    • ADD_TABLE_RULES


  • If users do not want the created rules to contain such a condition, then they may set the include_tagged_lcr parameter to true when users run these procedures. This setting results in no conditions relating to tags in the rules. Therefore, rule evaluation of the LCR does not depend on the value of the tag.

    For example, consider a table-level rule that evaluates to TRUE for all DML changes to the hr.locations table that originated at the dbs1. net source database. Assume the ADD_TABLE_RULES procedure is run to generate this rule:

    BEGIN
    DBMS_STREAMS_ADM.ADD_TABLE_RULES(
    Table_name => ‘hr.locations’, streams_type => ‘capture’, streams_name => ‘capture’, queue_name => ‘streams_queue’, include_tagged_lcr => false, -- Note parameter setting source_database => ‘dbsl.net’, include_dml => true, include_ddl => false);
    END;

    Notice that the include_tagged_lcr parameter is set to false, which is the default. The ADD-TABLE-RULES procedure generates a rule with a rule condition similar to the following:
    • (((:dml.get_object_owner( )=‘HR’ and :dml.get_object_name( )=‘LOCATIONS’)) and :dml.is_null_tag( )=‘Y’ and :dml.get_source_database_name( )=‘DBS1.NET’).


  • If a capture process uses a rule set that contains this rule, then the rule evaluates to FALSE if the tag for a change in a redo entry is a non-NULL value, such as ‘0’ or ‘1’. So, if a redo entry contains a row change to the hr.locations table, then the change is captured only if the tag for the redo entry is NULL.

    However, suppose the include_tagged_lcr parameter is set to true when ADD_TABLE_RULES is run:

    BEGIN
    DBMS_STREAMS_ADM.ADD_TABLE_RULES(
    table_name => ‘hr.locations’, streams_type => ‘capture’, streams_name => ‘capture’, queue_name => streams_queue include_tagged_lcr => true, -- Note parameter setting source_database => ‘dbsl.net’, include_dml => true, include_ddl => false);
    END;
    .
    • In this case, the ADD_TABLE_RULES procedure generates a rule with a rule condition similar to the following:
    • (((:dml.get_object_owner( )=‘HR’ and :dml.get_object_name( )=‘LOCATIONS’)) and :dml.get_source_database_name( )=‘DBS1.NET’).


  • Notice that there is no condition relating to the tag. If a capture process uses a rule set that contains this rule, then the rule evaluates to TRUE if the tag in a redo entry for a DML change to the hr. locations table is a non-NULL value, such as ‘0 ’or ‘1’. The rule also evaluates to TRUE if the tag is NULL. So, if a redo entry contains a DML change to the hr. locations table, then the change is captured regardless of the value for the tag.

    If users are using global rules to capture and apply DDL changes for an entire database, then online backup statements will be captured, propagated, and applied by default. Typically, database administrators do not want to replicate online backup statements. Instead, they only want them to run at the database where they are executed originally. To avoid replicating online backup statements, users can use one of the following strategies:
    • Include one or more calls to the SET TAG procedure in users' online backup procedures, and set the session tag, to a value that will cause the online backup statements to be ignored by a capture process.
    • Use a DDL handler for an apply process to avoid applying the online backup statements.


  • Tags and an Apply Process

    An apply process generates entries in the redo log o