Method and apparatus for moving subtrees in a distributed network directory5956718Abstract A method and apparatus for reorganizing a distributed directory is disclosed. A source logical group, such as a partition, is identified for reorganization. A destination object is selected to which the source logical group will become subordinate. Multiple simultaneous reorganizations of the distributed directory can be handled. Claims We claim: Description TECHNICAL FIELD
TABLE 1
______________________________________
Begin Move Entry Structure (42 (0x2A))
Offset Content Type
______________________________________
Request Format
32 Version = 0 Int4
36 Flags = 0 Int4
40 Destination Parent Entry ID
Int4
36 New RDN Ustring
. . . Align4
. . . Source Server's DN
Ustring
Reply Format
16 Completion Code Int4
______________________________________
*Int4 a byte integer transmitted in LowHigh order
*Ustring a nullterminated Unicode string. Unicode is a fixedlength
character encoding scheme, 16 bits per character. It defines encodings
from all the world's languages. The representation was chosen to have
fixed width to facilitate processing. In Unicode, the range from 0x0000
through 0x007F is sevenbit ASCII (that is, ANSI X3.4).
*Align4 is a pad field of zero to three bytes making the next field star
on twobyte boundary.
* . . . When a variablelength field occurs, the subsequent fields are no
at a fixed offset. Ellipses appear in the offset column to indicate this.
*Completion Codes Success = 0
*Distinguished Name or DN is a representation of the sequence of
hierarchical components. An NDS object is identified by its name and by
the names of the objects in which it is contained, in a hierarchical tree
structure. The object's own name is called its partial name, or RDN (for
Relative Distinguished Name). Proceeding up the hierarchy, each containin
object has its own RDN. For # example, CN = Jan. O = Acme. C = US has
three partial names (RDNs). The Common Name is "Jan." The Organization
Name is "Acme." And, the Country Name "US.
This request is addressed to the server holding the master replica of the
destination container. The new parent of the NDS object is identified by
Destination Parent Entry ID. Within that container, its relative
distinguished name will be New RDN. The client also identifies the server
holding the master replica of the existing entry, by sending the Source
Server's DN.
3. If no anomalies are detected, the destination server replies with Success. At the same time it records the details of the move operation and starts a ten-minute timer. If the timer expires before the operation completes, it purges its record of the move and step 5 will not complete successfully. 4. The client makes a Finish Move Entry request to the source server. The Finish Move Entry NDS Protocol verb has the structure identified in Table 2.
TABLE 2
______________________________________
Finish Move Entry (43(0x2B))
Offset Content Type
______________________________________
Request Format
32 Version = 0 Int4
36 Flags Int4
40 Source Entry ID Int4
44 Destination Parent Entry ID
Int4
48 New RDN Ustring
. . . Align4
. . . Destination Server's DN
Ustring
Reply Format
16 Completion Code Int4
______________________________________
Flags
0x00000001 Remove Old Name Values
Completion Codes Success = 0
Remarks
This request is addressed to the server holding the master replica of the
object being moved. The Source Entry ID identifies the object on that
server. The client identifies the server holding the master replica of th
destination container by sending the Destination Server's DN. The
Destination Parent Entry ID identifies the parent container itself The ne
parent of the NDS object is identified by Destination Parent Entry ID.
Within that container, its relative # distinguished name will be New RDN
If the Remove Old Name Values flag is set, old values of the naming
attribute remain as multiple values of the attribute (but not as part of
the RDN). This choice is unavailable if the naming attribute is
singlevalued. If the flag is zero, all prior values of the naming
attribute are deleted before New RDN is added.
5. The source server makes a Restore Entry request to the destination server to transfer the complete object information. This can take several iterations. If there is a temporary anomaly, this step is retried several times before completing or being abandoned. The structure of the Restore Entry NDS Protocol verb is provided in Table 3.
TABLE 3
______________________________________
Restore Entry (46 (0x2E))
Offset Content Type
______________________________________
Request Format
32 Version = 0 Int4
36 Request Flags Int4
40 Iteration Handle Int4
44 Parent Entry ID Int4
48 Relative Distinguished Name
Int4
. . . Align4
. . . Source Distinguished Name|
Ustring
. . . Align
. . . Data Size = N Int4
. . . Entry Data Byte ›N!
Reply Format . . . Moving = 0
16 Completion Code Int4
20 Iteration Handle Int4
Reply Format . . . Moving = 1 and More = 1
16 Complete Code Int4
20 Iteration Handle Int4
24 Reserved Field = 0 Int4
Reply Format . . . Moving = 1 and More = 0
16 Completion Code Int4
20 Reply Flags = 0x00000400
Int4
24 New Distinguished Name
Ustring
. . . Align4
. . . New Tuned Name Tuned Name
______________________________________
*Request Flags 0x00000001 More, 0x00000002 Moving
*Reply Flags 0x00000400 Reply includes the New Tuned Name
*Completion Codes Success = 0
|Note: The Source Distinguished Name field is present if and only if the
Moving request flag is set to one.
Remarks
This operation serves two purposes.
(a) Restoring an entry previously backed up to an external medium.
(b) Conveying an entry's information to its new location when moving an
NDS leaf entry. The Moving flag indicates which case it is; zero for (b);
one for (b). In case (b), collision with an existing name is considered a
error.
The Parent Entry ID indicates the immediate parent of the entry being
restored.
The Relative Distinguished Name identifies the entry itself.
The Source Distinguished Name identifies the entry's former name, in case
of a move operation.
The Iteration Handle is used differently here from elsewhere. In other
situations, the amount of data returned from the server is (potentially)
larger than a single NDS message can accommodate. Here, the opposite
holds. The request can be larger than the largest NDS message. When the
More bit of the Request Flags field is set to one, the Restore Entry
request is incomplete, and is to be continued in another Restore Entry
request. If the bit is reset to zero, the client is indicating the #
completion of a series of Restore Entry requests. Only on completion does
the server process the request. On the first NDS request of the series,
the client sets the Iteration Handle to 0xFFFFFFFF; on subsequent
requests, to the value returned by the server in the preceding reply. The
reply format depends on the Request Flags, as indicated above. When movin
an entry, the last reply conveys information about the entry in its new
location; its new distinguished name (in # typed form), and its new Tune
Name.
6. If step 5 was successful, the source server removes the entry from its active database. It creates a moved obituary for the entry, identifying the destination location. The obituary propagates to replicas of the source partition through the synchronization channel. 7. The source server sends a Finish Move Entry reply to the client. FIG. 7 illustrates the three-party exchange. The additional steps that follow show the interaction of a wider group of network servers. 8. If another server has an external reference to the old copy of the moved object, the source server holds a Back Link attribute for the object identifying the other server. Using information in the Back Link, it notifies the other server to update the external reference. 9. This uses the Synch External Reference operation. The source uses a "Back Link. . . Moved" obituary for each other server to keep track of which ones have been notified. If new back links appear while this operation progresses, corresponding "Back Link. . . Moved" obituaries are created. The structure of the Synch External Reference NDS Protocol verb is provided in Table 4.
TABLE 4
______________________________________
Synch External Reference
Offset Content Type
______________________________________
Request Format
32 Version = 0 Int4
36 Flags = 0 or Purge obituary
Int4
40 Remote ID (hint) Int4
44 Entry Name Ustring
. . . Align4
. . . Parent Tuned Name
. . . Align4
______________________________________
Obituary Information
______________________________________
1) Restored
2) Dead
3) Moved
4) New RDN
Common Parameters
Type Int2
Flags Int2
Unused Int4
Creation Time Time Stamp
Data Parameters
Restored Creation Time
Restored CTS
Dead NULL
Moved Moved Destination Name - Tuned
New RND RDN - Name
______________________________________
10. Meanwhile, starting at step 3, the destination object has an Inhibit Move obituary attribute attached, indicating that a move is under way. As long as this attribute exists, the object cannot be moved again or deleted. This prevents race conditions as things settle down. Replica synchronization propagates the new object (with its Inhibit Move obituary) throughout replicas of the destination partition. 11. When (a) the deletion of the source object has been propagated throughout the source partition, and (b) the notifications of step 6 have been completed, the object is about to be purged. The source server notifies the destination server using the Release Moved Entry operation. At this point, the destination server removes the Inhibit Move obituary attribute from the new object. Through replica synchronization, the attribute removal propagates to other replicas of the destination partition. When this has occurred, and the destination server purges the obituary, the moved object becomes eligible to be moved again. Moving a Subtree As indicated above, from the client's viewpoint, moving a subtree looks the same as moving a single entry. The same Begin Move Entry and Finish Move Entry operations apply, as illustrated in FIG. 3. The exchange among servers is quite different, however. FIG. 8 shows the move of a subtree. In the example, partition C is being moved under object G. (As customary, the partition is named by its root-most object.) G is in partition F. Partition A is the parent of partition C. Three partitions participate in the operation. Each has a master replica. In the following detailed discussion of the operation to move a subtree the following terminology is used. The discussions assumes that some of the three partitions can be the same, that some of the servers can be the same, and that: S is the server holding the master replica of partition A. T is the server holding the master replica of partition C. U is the server holding the master replica of partition F. V is the server holding the master replica of server U's object. 1. The client sends a Begin Move Entry request to U. U enforces its access control to permit or deny the move operation. 2. If all is well, U replies with Success. At the same time, it records the details of the operation and starts a ten-minute timer. If the timer expires before T responds, U purges its record of the move details and the operation will not complete. 3. The client sends a Finish Move Entry request to T. T enforces its access control and the directory schema rules. Also, T locates the object representing server U in the name space, and identifies the server, V, holding the master replica of U's object. It sends V a "Control. . . Get Entry Move State" request to determine if U's object is itself moving. If it is moving, the subtree move operation cannot proceed. If any of these checks reveal a problem, T sends an error reply to the client and the operation terminates. The structure of the Control. . . Get Entry Move State NDS Protocol verb is provided in Table 5.
TABLE 5
______________________________________
Control . . . Get Entry Move State
Offset Content Type
______________________________________
Request Details
40 Verb = 2 Int4
44 Entry ID Int4
Reply Details
20 Parent Entry ID Int4
______________________________________
This operation reports if an entry is being moved or not. The entry is
indicated by Entry ID. If the entry is being moved, the completion code
"Move in Progress" is returned, and the Parent Entry ID reports the new
parent of the object.
4. T sends a Start Move Tree request to U. U checks the request against its expected details. It also checks that its software version--and the versions of servers identified in the back links of the destination partition root object (F)--are high enough that they support moving a subtree. If all is well, it sends a Success reply to T. In the reply, the Partition Overlap flag is set if partitions A and F are the same partition. The structure of the Start Move Tree NDS Protocol verb is provided in Table 6.
TABLE 6
______________________________________
Start Move Tree
Offset Content Type
______________________________________
Request Details
40 Version = 0 Int4
44 Flags Int4
48 Revision Int4
52 Destination ID Int4 (on destination server)
. . . Source Name Tuned Name
. . . Align4
. . . New RND Ustring
Reply Details
20 Version Int4
24 Flags Int4
28 Source ID Int4 (on destination server)
32 Destination Root ID
Int4 (on destination server)
______________________________________
5. U sets F's partition operation to Move Subtree Destination. It sets the partition state and the replica states to Move State 0, and the Partition Control Distinguished Name to identify C. If the leaf name of the object is being changed in the course of the move, it also adds a Tree Old RND obituary recording the prior name. (With this information, a server can do efficient lookups even if packets arrive from not-yet synchronized servers using an unexpected name.) It starts propagating these changes to the replicas of partition F. 6. T sets C's partition operation to Move Subtree Source. It sets the replica states to Move State 0. It also creates three partition control attributes. Each of the three has State=Moved State 0 and Operation=Move Subtree Source. The Distinguished Name depends on the Type, as follows:
______________________________________
Type Distinguished Name
______________________________________
0 Identifies G (new parent object).
1 Identifies B (old parent object).
2 Empty string in the Partition overlap case; otherwise, identifies A
(root object of partition immediately above C).
______________________________________
It starts propagating these changes to the replicas of partition C. 7. If the leaf name (relative distinguished name) of the object is being changed in the course of the move, it also adds a Tree New RND obituary recording the new name. (With this information, a server can do efficient lookups even if packets arrive from not-yet synchronized servers using an unexpected name). T makes a list of servers to be notified of the operation. The following servers are included in the list (duplicates are suppressed): Servers holding replicas of partition C. Servers holding replicas of partition F. Servers holding external references to objects in C (as identified by back links on the objects). this is the "Notification List." It is recorded as a Move Subtree obituary for each server on the list. T starts propagating all these changes to the replicas of partition C. 8. If the Partition Overlap flag was not set in step 4, T sends a Control request to S, indicating Lock Partition for partition A (C's parent). This prevents other partition operations on partition A while the move is in progress. For moves within the same partition, it is unnecessary to lock the parent. 9. T sends a Finish Move Entry reply to the client. The client is out of the picture from this point onward. 10. T drives completion of the operation. It sends a Move Tree request to every server in the Notification List, driven by the secondary obituaries. The structure of the Move Tree NDS Protocol request is provided in Table 7.
TABLE 7
______________________________________
Move Tree
Offset Content Type
______________________________________
Request Details
40 Version = 0 Int4
44 Flags Int4
48 Parent Name Tuned Name
. . . Align4
. . . Name Ustring
. . . Align4
. . . Creation Time Time Stamp
. . . Destination Parent
Tuned Name
. . . Align4
Name Flags Int4
New Name Int4
. . . Align4
. . . Replica Pointer for Master
Reply Details
20 Version Int4
24 Flags Int4
28 Replica Root ID Int4
______________________________________
It persists with periodic retries until all have been contacted. As each server is contacted successfully, T sets the corresponding obituary's Notified flag. The request conveys: The Tuned Name of C (the source) The Tuned Name of G (the destination) The Replica Pointer for T (the master server of partition C) 11. T adds a moved obituary to its entry for C, so that any requests about an object in partition C using its old name can be treated correctly while the operation is in progress. When a server, W, on the Notification List receives the request, its action depends on what replicas it holds of partitions C and F. In the following table: R means the server holds a replica of the partition. E means the server holds an external reference to the partition's root object. N means the server holds neither of the above.
______________________________________
Case 1 2 3 4 5 6 7 8 9
______________________________________
Partition C
R R R E E E N N N
Partition F
R E N R E N R E N
______________________________________
In Cases 1, 2, 3 and 5:
W locally switches its record of C's parent
from B to G.
In Cases 4 and 7:
W locally creates a subordinate reference
for partition C. Its reply has the Created
Subordinate Reference flag set, informing
T to add the subordinate reference to C's
replica list.
In Case 6: W locally creates an external reference for
G. Its reply has the Created External
Reference flag set, informing T to create a
back link to W.
In Cases 8 and 9:
These do not occur. Such servers would
not be on the Notification List.
______________________________________
12. Once the servers on the Notification List have been successfully contacted, T sends another request to the same Notification List: End Move Subtree. This causes the Moved obituaries to be purged. Every server has seen the new name, so it is no longer necessary to deal with requests that use objects' old names. As each request completes successfully, the corresponding obituary has the Purgeable flag set. 13. Once all the obituaries are marked Purgeable, T sends a Control request to U and (in the non-Partition Overlap case) to S, indicating Unlock Partition for A and F (respectively). A server receiving this request sets the partition state and the replicas' states to On, and propagates the change to the partition's replicas through the synchronization channel. Finally, T performs the Unlock Partition operation itself for C. Handling Multiple Simultaneous Moves In another embodiment of the invention, multiple simultaneous moves of logical groups in the distributed directory are handled. This functionality is particularly desirable where access to and the services provided by the distributed directory will be continued with minimal interruption. This preferred embodiment builds on the foregoing embodiment of move subtree by changing the manner names are handled during background server operations. In the foregoing embodiment of move subtree, only one component, the Relative Distinguished Name ("RDN") of an object's Distinguished Name ("DN"), may be in transition due to a move at any given time. Therefore, if a partition is in the process of being moved, it is preferred that a separate partition or move tree operation will not be accepted for that partition. This is enforced by serializing the partitioning and move subtree operations. In addition, if only leaf partitions are being moved, a move subtree operation will result in only one RDN in any object's DN to be effected/remapped. FIG. 9 illustrates the foregoing move subtree embodiment. Assume partition C (the source partition) is being moved from being directly subordinate to the parent object X in partition B (the parent partition), to being directly subordinate to the destination object Y in partition D (the destination partition). Object C's DN will change from C.X.B.A to C.Y.D.A. Assume that there are two servers holding replicas of object C: S1 and S2. The following table summarizes portions of the processing performed for referencing the source object C.
TABLE 8
__________________________________________________________________________
Transmitter (S1)
Receiver (S2)
Comments
__________________________________________________________________________
Has seen new name
Has seen new name
A local MapNameToID (LookForTunedRDN
or FineTuneRDN) can locate the object on
the receiver. This is the simplest scenario.
Has seen new name
Has NOT seen new
The transmitter sends New Name to the
name receiver. In this case C.Y.D.A. Since the
receiver has not seen the new name it creates
it as a reference object. When the MOVED
obituary is skulked to the receiver the old
object will be collapsed with this reference as
a part of the move.
Has NOT seen New
Has seen New Name
A local MapNameToID (LookForTunedRDN
Name or FineTuneRDN) can locate the object using
the forward reference placed at the old object
location. This case is similar to the first case
listed above. In both of these cases a
reference for the object is not created but the
actual object is found.
__________________________________________________________________________
The various rows of the table indicate the state of the object as perceived by server S1 and S2. The state of the object refers to whether the server is aware of the move or not, i.e., whether it internally refers to the object by its old name (C.X.B.A) or by its new name (C.Y.D.A). It should be noted that the name component effected by the move in this case is the RDN of the moved object. During the move process, each server that performs the move operation creates a forward reference (in the form of an OBT.sub.-- MOVED obituary) from the previous location of the object to the new location. Only the former and final names are known to the server. In this example a forward reference is placed at location C.X.B.A to point to C.Y.D.A This may be due to a reference synchronization or a MapNameToID request. DN transitions during a move subtree operation are handled in a similar fashion. Since an object is not allowed to be moved until any prior moves on it have completed, each component can have at most two names: a source name and a destination name. The handling of multiple simultaneous moves builds on the basic architecture of a standard move tree operation. Consider FIG. 10, where for illustrative purposes only the root objects are depicted, which are assumed directly subordinate to the parent partition's root object. The following simultaneous moves have been requested: D.C.B.A moves to D.F.E.A and B.A moves to B.H.A. In this example we are allowing a non-leaf partition to move (i.e. B.A) as well as a leaf-partition (i.e. D.C.B.A). This introduces multiple components in the name path for D.C.B.A to change simultaneously. Depending on the name states known to the different servers, several references may be created representing various stages of the move operation. After the move is completed, these references will need to be collapsed successfully. When servers on the notification list communicate to one another, they refer to and identify an object affected by a move, not only by the final name state of an object, but also by any and all intermediate name states and variations that object may have based on the multiple objects in its name path being moved. This applies for objects within the partition being moved as well as all objects subordinate to the partition. For instance, the following table summarizes the name states for object D.C.B.A seen by a server depending on whether the move has been performed on the server or not.
TABLE 9
______________________________________
variations of name states
D.C.B.A -> D.F.E.A
B.A -> B.H.A Name States
______________________________________
No No D.C.B.A
Yes No D.C.B.A
D.F.E.A
No Yes D.C.B.A
D.C.B.H.A
Yes Yes D.C.B.A
D.F.E.A
D.C.B.H.A
______________________________________
The number of name states are increased as more components are being moved and as moves are executed on a server. The receiver uses the names in the name state list to locate the object. Preferably, no references should be created unless requested by the transmitter. This will make it possible to locate a partition root even if multiple components in its name path are in the process of being moved. In order to construct the names in the name state list, a backward pointer is maintained when an object moves. This backward pointer will be used to construct the old name of a moved component. If during the process of the move a transmitter sends a Distinguished Name to a receiver, it constructs a list of names that include the current name of the entry as known to the transmitter followed by any previous names the entry may have had. These previous names (or name states) may vary based on moves that are currently in progress for objects in the name path of this entry. For example, in the table above multiple names exist for entry D.C.B.A because one of its components, namely B.A is moving. Depending on the move state on the receiver the receiver may "know" the entry for D.C.B.A as one of the following:
TABLE 10
______________________________________
Name known as
State of Moves on receiver
______________________________________
D.C.B.A None of the moves have occurred on the receiver
D.F.E.A D.C.B.A -> D.F.E.A
D.C.B.H.A B.A -> B.H.A
D.F.E.A D.C.B.A -> D.F.E.A
B.A -> B.H.A
______________________________________
It can be seen from the above table that the sequence of moves (i.e, if D or B moves first) dictate which "name" the receiver will know the entry by. Regardless, the receiver and the transmitter will know the entry by one of the names in the name list. As such, information about the object being transmitted can be processed on the fly while the move is in process without interrupting distributed directory services. The transmitter constructs a list of names that the receiver may know the entry by. This list acts as a hint for the receiver in locating the correct entry. Once the correct entry has been located processing can resume normally. If no moves are in progress for any component in the name path for the object then the list will consist of only one name, which is the current state of the entry. The receiver tries to locate the entry by processing each of the names in the list. Once a match has been found the entry is identified and the rest of the names in the name list may be ignored. Preferably, the receiver will not create a reference for an entry it has not seen unless explicitly requested by the transmitter. Therefore, the issue of collapsing a reference object with the real object when the move completes may not arise. The following functions can be used for are used to build a list of names for a specified entry. An entry is a complete object identifier. It is constructed by starting with the relative identifier of an object and adding the relative identifier of each ancestor object until the relative identifier of the root of the object hierarchy is added. All parameters are called by value and the called routine may modify the values in any way. AddComponent()--This function adds a component to a specified name. For example, if the name is "a.b.c" then AddComponent(name, d) will result in name becoming "a.b.c.d". Calling convention for this function is: name=AddComponent(name, entry). GetMovingFromComponent()--This function returns the moved from component for an object that may have moved. It uses the MOVED.sub.-- FROM indicator and returns the moved from name and creation time. (Other data may be returned that identify an object uniquely. Refer to the tuned name patent). If entry has no MOVED.sub.-- FROM indicator then component is null. The calling convention for this function is: component=GetMovingFromComponent(entry). RemoveComponent()--This function removes a component from a specified name. For example, if name is "a.b.c.d", then RemoveComponent(name) returns name as "a.b.c". The following calling convention is used for this operation is: name=RemoveComponent(name). AddToNameList()--This function adds a specified name to a list of names. The list may be an array or some other data structure. The following call convention may be used for this operation is: AddToNameList(list, name). GetRootMostComponent()--This function gets the root most component of a name. It does not modify the name. In other words, the root most component returned is not removed or modified from the name. For instance, if name is "a.b.c.d" then this function returns "d". The calling convention for this function is: entry=GetRootMostComponent(name). GetParentOfEntry()--This function returns the parent (ancestor) of the specified entry. The calling convention for this operation is: parent=GetParentOfEntry(entry). The following code segment illustrates how the foregoing functions can be used to build a list of name variations:
______________________________________
BuildNameList(namelist, partialName)
// Get the root most component of the partial name.
entry = GetRootMostComponent(partialName);
while (entry |= root)
{
fromEntry = GetMovingFromComponent(entry);
if (fromEntry |= null)
{
partialName = RemoveComponent(partialName);
partialName = AddComponent(partialName, fromEntry);
BuildNameList(nameList, partialName);
partialName = RemoveComponent(partialName);
partialName = AddComponent(partialName, entry);
}
entry = GetParentOfEntry(entry);
partialName = AddComponent(partialName, entry);
}
AddNameToList(nameList, partialName);
}
______________________________________
The ability to handle multiple simultaneous moves additionally facilitates moving non-leaf partitions. While the previous embodiments of move subtree could handle moving a non-leaf partition, it is preferred that only leaf partitions be moved. This is because it is relatively simple to check to see if the source and destination partitions are involved in such an operation and lock such partitions from other move tree or partition operations. As a move of a non-leaf partition could effect any or all partitions up or down the tree relative to the parent, source or destination partitions, the steps of checking and locking these partitions could be an expensive and time consuming operation, particularly with large and complicated trees spanning many different servers. If a system can handle multiple simultaneous moves, the need for locking partitions would be eliminated and non-leaf partitions can be moved without having to go through the process of traversing the distributed directory and locking relevant partitions. Moving a Non-Root Container Object In the several embodiments for move subtree, discussed above, the logical group being moved is a partition in the distributed directory. In other words, the root-most object defining a partition boundary along with its subordinate objects within the partition are being moved. One reason to enforce this aspect is that move subtree does not move any physical data but only remaps the distinguished names of the objects that have been effected. A non-root object could be moved, but such a move would traditionally require moving the object physically by performing a backup and restore operation on the object. The non-root object being moved would require data in the entire tree to be physically moved, a slow and burdensome process. In another embodiment of the invention, a non-root object in a partition can be moved using the foregoing functionality of move subtree. This is achieved by creating a partition boundary as a part of the move. This new logical group is then moved using the foregoing move subtree operations. The new logical group may or may not be left as a partition after the move. Assuming the schema rules and access requirements are satisfied, the move of a non-root container is achieved by first performing a partition split operation, preferably automatically, in the background before performing the actual move. The split is performed only if the object being moved is not a partition root or a leaf object. A partition root would use any of the move subtree embodiments and the move of a leaf object would use move leaf object. This introduces the concept of a multi-step partition operation. All current partition operations may be classified as single step multi-state operation. For example, a Join Partition performs only one step, i.e., the Join Operation although it may transition through several states before completing the operation. In this context a move subtree becomes a multi-step operation since it may first perform a split operation followed by a move operation. Each of the steps in a multi-step operation are self contained and carry enough information to complete the operation in their respective Partition Control attributes. These Partition Control attributes are set up when the user issues the original request. The Partition Control type field are redefined to include a sequence number field as described below:
______________________________________
typedef struct PARTCNTL
{
uint32
type;
uint32
function;
uint32
state;
uint32
partnerPartID;
} PARTCNTL;
______________________________________
Consider the proposed move depicted in FIG. 11. The non-root object F (the source object) is being moved from the parent object C to the destination object E. Two distinct partition operations are used, using the following Partition Control values: 1. Partition Split Operation Function PC.sub.-- SPLITTING, Sequence Number=0 2. Move Subtree operation, as discussed above Function PC.sub.-- MOVE.sub.-- SUBTREE.sub.-- SRC, Sequence Number=1 After the move subtree operation is completed, Partition F will remain in the distributed directory. If any of the operations are aborted, the entire sequence will be aborted. A sequence is started only after the previous sequence has been successfully completed. This approach could be used to build a queue of unrelated partition operations that the system would perform in sequence. The Split Operation is initiated by sending a DSASplitPartition request to the server holding the partition's master replica. In the request, the container where the partition is splitting is identified in the New Partition Root Entry ID field. The server holding the master replica calls SetRingState to set the Replica State to RS.sub.-- SS.sub.-- 0 for each replica. This server then calls SetPartitionControl to set the Partition State to Split State 0, and stores in its Partition Control attribute the split container's Distinguished Name and the current operation (Split Partition). The changes are propagated to all the partition's replicas and obituaries on the new partition root entry are processed. The Partition Control state is then set to RS.sub.-- SS.sub.-- 1. The server holding the master replica sends a DSALowLevelSplit request to each of the partition's replicas (except subordinate references). Servers holding replicas of the partition process the DSALowLevelSplit request by taking the partition attributes and values and copying them to the container to be split. These servers return a DSALowLevelSplit reply indicating success or failure. The server holding the master replica sets each replica's state to RS.sub.-- SS.sub.-- 1, and then advances the partition state and all replica's states to RS.sub.-- ON for both the parent partition and new partition. Lastly, the changes are propagated to all replicas through the synchronization process. The DSASplitPartition request contains a completion code indicating success or failure, and includes the following information:
TABLE 12
______________________________________
Field Type Description
______________________________________
Version uint32 0.
Flags uint32 0.
New Partition Root Entry
ID uint32 The new partition root entry is
Entry ID.
______________________________________
The DSALowLevelSplit reply contains a completion code and the Entry ID of the new child partition's root entry, and includes the following information:
TABLE 13
______________________________________
Field Type Description
______________________________________
Version uint32 2. DNs are Ustrings.
3. DNs are Tuned Names.
Flags uint32 0.
Iteration Handle
uint32
Parent Distinguished Name
Tuned Name
The Distinguished Name of
the parent partition of the
partition being split.
Align4
Child Distinguished Name
Tuned Name
The entry that will be the
partition root for the
new partition.
______________________________________
As shown in FIG. 12, a non-root container object can be moved within a partition. In such a move, the same first two partition operations (i.e., split and move) will occur. In addition, a third operational partition operation, join, may also be employed as follows: 3. Join operation (for moves within a partition only) Function PC.sub.-- JOINING.sub.-- UP, Sequence Number=2 Preferable, this embodiment includes a Partition Control type to keep track of the new Base Class of the moved object. If the base class is not being changed as a result of the move this value will not be created. This Partition Control value would contain the following information:
TABLE 14
__________________________________________________________________________
Field Length
Values Description
__________________________________________________________________________
type 4 Bytes
a) Sequence Number (2)
The sequence number
b) type (2) PCT.sub.-- 3
would be the number
corresponding to the Move
operation PCT.sub.-- 0 Partition
Control value.
function
4 Bytes
PC.sub.-- MOVE.sub.-- SUBTREE.sub.-- SRC,
Specifies Move operation
state 4 Bytes
N.A. N.A.
partnerPartID
4 Bytes
class ID Specifies the new base class
of the moved object
__________________________________________________________________________
A Join Operation is initiated by sending a DSAJoinPartition request to the server holding the child partition's master replica. The Partition Root Entry ID field in the request identifies the partition root entry of the child partition. The Server holding the child's master replica calls DSAResolveName to locate the parent partition's master replica. Next, the server preferably checks that: (i) the request is valid, (ii) the child partition is not involved in another partition operation, and (iii) the software versions on the servers involved support joining the partitions. If all is okay, the server sends a DSAStartjoin request to the server holding the master replica of the parent partition. The server holding the parent's master replica calls SetRingState and SetPartitionControl to set the parent replica's replica state and partition control state to RS.sub.-- JS.sub.-- 0. Next, the replica state to RS.sub.-- JS.sub.-- 0 is set for each of its replicas. The server sets its partition state to Join State 0, and stores in its Partition Control attribute: (i) the child partition's Distinguished Name, and (ii) the current operation (in this case, PC.sub.-- JOINING.sub.-- DOWN). The server returns a DSAStartJoin reply to the server holding the child partition's master replica. The server holding the child's master replica sets the replica state to RS.sub.-- JS.sub.-- 0 for each partition replica, and sets the partition state to RS.sub.-- JS.sub.-- 0. The server then stores in its Partition Control attribute: (i) the parent partition's Distinguished Name, and (ii) the current operation (in this case, PC.sub.-- JOINING.sub.-- UP). The server returns a DSAJoinPartitions reply indicating success or failure. The synchronization process then propagates these changes to all replicas of both partitions. Both servers compare replica rings to determine which servers they must add replicas to. DSAAddReplica requests are sent to other servers as follows: (i) servers containing replicas of the child partition, but not the parent, receive new replicas of the parent with new replica numbers, and (ii) servers containing replicas of the parent partition, but not the child, receive new replicas of the child partition (except subordinate references) with new replica numbers. The new replicas' states are set to RS.sub.-- JS.sub.-- 1. The server holding the parent's master replica checks that the child partition's state is PC.sub.-- JOINING.sub.-- UP. If the state is not PC.sub.-- JOINING.sub.-- UP, the server calls SetRing State and SetPartitionControl to set the partition and replica states to RS.sub.-- ON. This stops joining the partitions. If the state is PC.sub.-- JOINING.sub.-- UP the server sends a DSALowLeveljoin request to all servers holding replicas of the two partitions. Servers holding non-master replicas erase the boundary between the partitions and put all entries from both partitions into one partition. Server holding the parent's master replica performs the LowLevelJoin operation on its replicas. The server sets the replica state of the new partition's replicas to RS.sub.-- JS.sub.-- 2. The synchronization process propagates changes to all replicas. The server holding parent's master replica sets all replica states to RS.sub.-- ON, and sets the replica types as follows: (i) the parent's master replica becomes the master replica of the new partition, (ii) the child's master replica and all read/write replicas become read/write replicas of the new partition, and (iii) read-only replicas become read-only replicas of the new partition. The synchronization process propagates these changes to all replicas. Partition operations can now resume. The DSAJoinPartitions reply contains a completion code that indicates success or failure, and includes the following information:
TABLE 15
______________________________________
Field Type Description
______________________________________
Version uint32 0.
Flags uint32 0.
Partition Root Entry ID
uint32 The Entry ID of the child partition's
root entry.
______________________________________
The DSAStartJoin reply contains a completion code that indicates success or failure, and includes the following information:
TABLE 16
______________________________________
Field Type Description
______________________________________
Version uint32 2. DNs are returned as Ustrings.
3. DNs are returned as Tuned Names.
Flags uint32 0.
Iteration Handle
uint32
Parent DN Tuned Name
The DN of the parent partition.
Align4
Child DN Tuned Name
The DN of the child partition's root
entry.
______________________________________
The DSALowLevelJoin reply contains a completion code indicating success or failure, and includes the following information:
TABLE 17
______________________________________
Field Type Description
______________________________________
Version uint32 2. DNs are Ustrings.
3. DNs are Tuned Names.
Flags uint32 0.
Iteration Handle
uint32
Parent DN Tuned Name
The Distinguished Name of the
parent partition.
Align4
Child DN Tuned Name
The Distinguished Name of the
child partition's root entry.
______________________________________
With the present invention any logical group in a directory tree may be moved either within a directory or to another directory. With the invention, ease of providing administration of distributed network directories increases. Accordingly, use of distributed network directories will also increase, making pervasive network computing possible. Any of the above described embodiments may be implemented in a computer readable medium storing software, programs, data, files and the like. As shown in FIG. 9, the computer readable medium is a floppy diskette. However, one with ordinary skill in the art will readily appreciate that computer readable medium can take a variety of forms, including magnetic storage (such as hard disk drives, floppy diskettes, etc.), optical storage (such as laser discs, compact discs, etc.), electronic storage (such as random access memory "RAM", read only memory "ROM", etc.), and the like. The foregoing description of the preferred embodiment of the invention has been presented for purposes of illustration and description. It is not intended to exhaustive nor to limit the invention to the precise form disclosed. Many alternatives, modifications and variations will be apparent to those skilled in the art in light of the above teaching. Accordingly, this invention is intended to embrace all such alternatives, modifications and variations that fall within the spirit and broad scope of the appended claims.
|
Same subclass Same class Consider this |
||||||||||
