Application-directed variable-granularity caching and consistency management
6021413
Abstract
A system and method for application-directed variable-granularity consistency management, in one embodiment, carries out the steps of: predefining a template specifying a structure of a file; imposing the template on the file including registering fields/records within the file for consistency; creating an index table for the file; detecting a write to the file, at one of a file system server and a file system client; and queuing, upon detecting the write to the file and in the event a portion of the file to which the write occurs is registered for consistency, the write for propagation to another of the file system server and the file system client. The system and method may employ an application program that predefines a template specifying a structure of a file and imposes the template on the file including registering fields/records within the file for consistency; and further employs a file system that creates an index table for the file; detects a write to the file, at one of a file system server and a file system client; and queues, upon detecting the write to the file and in the event a portion of the file to which the write occurs is registered for consistency, the write for propagation to another of the file system server and the file system client.
Claims
What is claimed is:
Description
BACKGROUND OF THE INVENTION
TABLE 1
______________________________________
pconsistency (file_name, template_name, consistency_fields, fallback)
______________________________________
Operation:
The pconsistency () call ensures that consistency_fields as
defined in template_name from file file_name are
consistent with the server. If the server cannot be
reached, then the fallback behavior is followed.
Fallback Behavior:
NOTIFY: Notify the user of a failure to maintain
consistency and return error.
LOCAL: Return error.
______________________________________
The embodiment of the file system described supports multiple levels of reads and writes at the file system client (the portable computer) 100, enabling applications to override the default consistency behavior for the file on a per request basis. The multi-level reads and writes are performed through special library calls referred to herein as PREAD and PWRITE, which take the following arguments: file descriptor, buffer, number of bytes to read or write, level, and fallback behavior. The level field specifies the consistency level of the read/write, and the fallback behavior field specifies the action if the requested level of read/write fails, e.g., "notification" or "local" (see above). Tables 2 and 3 summarize the types of PREAD and PWRITE calls available in accordance with the present embodiment.
TABLE 2
______________________________________
pread (file_descriptor, buffer, length, level, fallback)
______________________________________
Operation:
The pread () call reads length bytes from the file
identified by file_descriptor into buffer. The level
option specifies the consistency level of the read
operation. For consistent reads, the fallback behavior
controls what to do when the server cannot be reached.
Level = CONSISTENT_READ; Fallback Behavior:
Read from the server, cache locally, then return. Fallback
can be:
ABORT: Return error.
BLOCK: Block until server can be reached.
LOCAL: Read from local copy and return.
Level = LOCAL_READ:
Read from the local copy and return.
______________________________________
A PREAD consistency level may be one of the following: (a) a local read, or (b) a consistent read. A local read is performed on the local cache copy at the file system client (the portable computer) 100, while a consistent read checks for consistency between the file system client (the portable computer) 100 and the file system server/file client 102, and returns the consistent copy. A side effect of a consistent read is that the local copy is updated (if needed) at the file system client (the portable computer) 100. A PWRITE level may be one of the following: (a) local write, (b) writeback, or (c) writethrough. A local write only writes to the local cache copy at the file system client (the portable computer) 100; a writeback updates the local cache copy and queues the write for transfer to the file system server/file client 102 (every sixty seconds the queued writes are flushed to the file system server/file client 102); and a writethrough synchronously updates both the local cache copy and the file system server copy.
A PREAD or PWRITE library call may fail for a number of reasons, including disconnectedness (for consistent read, writethrough, writeback), an uncached file (for local read), no disk space (for writes), etc. In each case, the fallback behavior specifies the action to be performed upon failure of the library call and may be one of the following: abort, block or local. Abort returns an error upon failure of the request. Block returns an error if the failure was due to an uncached file (for local read) or lack of disk space (for writes), or blocks (by suspending the current operation and waiting) until network connectivity is reestablished if the failure was due to disconnectedness. Local is only applicable for consistent read, writethrough and writeback, wherein upon disconnectedness, it converts consistent reads to local reads, and writethroughs to writebacks.
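The interaction between read level, connectivity, and fallback behavior described above can be summarized as a small decision function. The following is only an illustrative sketch in C: the names `plan_read`, `read_level`, `fallback`, and `read_action` are hypothetical and do not appear in the described embodiment, and the function only decides which action to take rather than performing any I/O.

```c
#include <stdbool.h>

/* Hypothetical names: the text defines the levels and fallback
 * behaviors, not these identifiers or this function. */
typedef enum { LOCAL_READ, CONSISTENT_READ } read_level;
typedef enum { FB_ABORT, FB_BLOCK, FB_LOCAL } fallback;
typedef enum { DO_SERVER_READ, DO_LOCAL_READ, DO_WAIT, DO_ERROR } read_action;

/* Decide what a PREAD-style call should do, given the requested level,
 * the fallback behavior, whether the server is reachable, and whether
 * the requested data is present in the local cache. */
read_action plan_read(read_level level, fallback fb,
                      bool connected, bool cached)
{
    /* A local read never touches the network; an uncached file is an
     * error regardless of the fallback behavior. */
    if (level == LOCAL_READ)
        return cached ? DO_LOCAL_READ : DO_ERROR;

    /* Consistent read with the server reachable: read from the server
     * and refresh the local copy as a side effect. */
    if (connected)
        return DO_SERVER_READ;

    /* Consistent read while disconnected: apply the fallback. */
    switch (fb) {
    case FB_BLOCK:
        return DO_WAIT;                           /* wait for connectivity */
    case FB_LOCAL:
        return cached ? DO_LOCAL_READ : DO_ERROR; /* degrade to local read */
    case FB_ABORT:
    default:
        return DO_ERROR;
    }
}
```

For instance, under these assumptions a consistent read issued while disconnected with the local fallback is planned as a local read, while the same request with the abort fallback is planned as an error.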
Note that during disconnection, the writeback effectively becomes a log file. By default, the file system maps the standard read library call to a PREAD with local read level and abort fallback, and the standard write library call to a PWRITE with local write level and abort fallback. When either the whole file or relevant parts of the file are kept consistent between the portable and the home, the underlying consistency mechanism ends up overriding the local read/write. Advantageously, during disconnection, and for parts of the file that are not kept consistent during partial connectedness, the standard read/write call does not incur network overhead. It is assumed that applications typically use the standard read/write library calls in conjunction with the application directed consistency policy. The PREAD and PWRITE library calls provide a mechanism to override the standard read/write behavior and serve to enforce consistent reads/writes on a per call basis. It is further assumed that partial connectivity, in accordance with the present embodiment, is the common mode of operation, and it is also assumed that applications are typically "smart" enough to optimize their file consistency. Thus, support for disconnected operation in the present embodiment is rudimentary. The present embodiment does not have any special default mechanisms for hoarding files. A user may explicitly hoard a file using a PHOARD library call, which will cache the file in all directories in its path. However, as has been noted previously, a user may not be able to predict all of the files required for disconnected operation in advance, because he/she may not be aware of all the files required by an application for execution. Intelligent mechanisms for predictively caching both user data and system/resource files are generally required during file hoarding, however, such predictive systems are flawed at best. 
Also, as mentioned above, hoarded files are very susceptible to concurrent writes, thus rendering them vulnerable to inconsistencies. Fallback behavior determines the action to be taken upon disconnection for the PREAD, PWRITE and the PCONSISTENCY library calls. Upon disconnection, the PCONSISTENCY call may either go into local mode, or notify the user of the disconnection (i.e., "notification" or "local" fallback modes). Standard reads and writes are mapped by default to local read/writes, and are not affected by disconnection. The consistent PREAD library call and writethrough/writeback PWRITE library calls will (as mentioned above) return an error, and may either abort, or block, or fall back on the local copy. A local read on an uncached file will fail and return an error.
The present embodiment can detect concurrent write/write conflicts, but not read/write conflicts. Upon reconnection the present embodiment will detect write/write conflicts by comparing version vectors at the granularity of a whole file, and at the finer granularity of a record or a field, depending on the current consistency semantics. Conflicts are aggregated (i.e., all conflicts are listed in a file, and all changes that are discarded are also written to a file) and the user is notified. While the present embodiment does not resolve conflicts, it is envisioned by the inventors that a more intelligent future embodiment will not only detect but also resolve conflicts, based on an approach similar to that described in Demers, "The Bayou Architecture: Support for Data Sharing Among Mobile Users", IEEE Workshop on Mobile Computing Systems and Applications, 1994; and/or in Terry, "Managing Update Conflicts in Bayou, a Weakly Connected Replicated Storage System", Proceedings of the Fifteenth ACM Symposium on Operating System Principles, December 1995. Both the Demers and Terry references are hereby incorporated by reference herein as if set forth in their entirety.
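The text does not specify the layout of the version vectors or the comparison rule. As a minimal sketch, assuming one two-entry vector per registered field (one counter for writes made at the client, one for writes made at the server), write/write conflict detection on reconnection might look like the following in C. The names `vvec`, `vv_compare`, and `count_conflicts` are hypothetical.

```c
#include <stddef.h>

/* Assumed layout (not given in the text): one two-entry version vector
 * per registered field, counting the writes each side knows about. */
typedef struct {
    unsigned client;   /* writes originating at the client */
    unsigned server;   /* writes originating at the server */
} vvec;

typedef enum { VV_EQUAL, VV_CLIENT_NEWER, VV_SERVER_NEWER, VV_CONFLICT } vv_order;

/* Compare the client's vector c with the server's vector s for one
 * field. If each side has seen a write the other has not, the writes
 * were concurrent and a write/write conflict is reported. */
vv_order vv_compare(vvec c, vvec s)
{
    int c_ahead = c.client > s.client || c.server > s.server;
    int s_ahead = s.client > c.client || s.server > c.server;

    if (c_ahead && s_ahead) return VV_CONFLICT;
    if (c_ahead)            return VV_CLIENT_NEWER;
    if (s_ahead)            return VV_SERVER_NEWER;
    return VV_EQUAL;
}

/* Count the conflicting fields among n registered fields, for example
 * to build the aggregated conflict list reported to the user. */
size_t count_conflicts(const vvec *client, const vvec *server, size_t n)
{
    size_t conflicts = 0;
    for (size_t i = 0; i < n; i++)
        if (vv_compare(client[i], server[i]) == VV_CONFLICT)
            conflicts++;
    return conflicts;
}
```

The same comparison also indicates which fields need to be transferred: a field whose result is `VV_SERVER_NEWER` would be fetched, and one whose result is `VV_CLIENT_NEWER` would be propagated.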
The template file not only contains the structure of records and fields, but also optional procedures for conflict detection and conflict resolution for each field/record/template structure. Upon reconnection or change of the consistency fields in PCONSISTENCY, the conflict resolution procedure is invoked in order to resolve detected conflicts.
Referring to FIG. 3, a high level block diagram is shown of various components used in the file system client and the file system server. A shared library 300 is linked into applications 302 at run time and provides the support routines for the PREAD, PWRITE, PCONSISTENCY and PHOARD library calls and performs name space mapping. A client daemon 304 runs on the file system client (the portable computer) 100 and coordinates all remote file accesses by client applications. The server daemon 306 runs on the file system server/file client 102 and services all remote file accesses by the client daemon 304. Applications apply data format templates to both the client daemon 304 and server daemon 306 to support application-directed consistency control. In accordance with the present embodiment, both the file system client (portable computer) 100 and the file system server/file client 102 are user level processes. Communication is implemented through BSD sockets but could be implemented through RPC. The implementation architecture is simple and highly portable, making porting of the software to, for example, Windows 95, Windows NT, Windows CE, Macintosh, or other operating systems, possible without architectural changes. All configuration information for the file system is handled through environment variables, and therefore can be adjusted on a per-application basis. For example, the file system client cache directory is passed to the shared library using the PFS_ENABLE environment variable.
The shared library provides both mobility-aware and mobility-unaware applications with access to the mobility support features of the file system. Mobility-unaware applications are supported by linking the shared library into the mobility-unaware applications at run time. As mentioned above, the shared library defines the PREAD, PWRITE, PCONSISTENCY and PHOARD library calls. In addition, the shared library also includes a set of low level file input/output routines that override (overload) the low level file input/output routines normally provided by the operating system. The overridden system routines can be grouped into four categories: non-mutating directory operations, such as: opendir, stat, lstat, readlink, access, chdir, and chroot; mutating directory operations, such as: mkdir, chown, chmod, rmdir, unlink, symlink, link, rename, utime and utimes; non-mutating file operations, such as: open, execl, execle, execv, execve, execvp, and read; and mutating file operations, such as: create, truncate and write. All of the overridden system routines are modified to parse the PFS_ENABLE environment variable the first time any one of them is called in order to perform name space mapping of files accessed in the file system client's cache. Note that by using an environment variable to configure the host, port number, and cache directory, each application can potentially work out of its own cache, or share a global cache. All mutating operations log their actions to an output queue for sending to the file system server/file client 102. (Note that when disconnected, the output queue essentially becomes a log file.) If operating in stand alone mode, then all of the routines operate by default on the local copy of the file and no network traffic is generated.
If operating in connected mode, then mutating and non-mutating directory operations will first fetch file metadata from the file system server/file client 102, and then operate on the locally cached data. Mutating operations log their actions to the output queue. A TRUNCATE library call and all non-mutating file operations, with the exception of PREAD, use PHOARD to cache the whole file prior to accessing the local copy. Additionally, the TRUNCATE library call writes an entry to the output queue. Regardless of the current connection mode, all read and write operations mapped to PREAD and PWRITE have cache only access. The CREATE library call always creates the new file directly in the local cache.
The client daemon 304 is the background process that runs on the file system client (the portable computer) 100 and "listens" for PREAD, PWRITE, PHOARD and PCONSISTENCY requests from client applications. The client daemon 304 acts as the central contact point for the file system client (the portable computer) 100, maintaining state and managing consistency for every application on the system. File data from the file system server/file client 102 is cached by the client daemon 304 using the file system client's (the portable's) native file system. The client daemon 304 also creates and maintains auxiliary files that contain index and version vectors for the consistency fields of registered files. These version vectors are used to check for fine grain changes to files, and for the detection of write/write conflicts between the portable and the file system server. When the client daemon 304 receives a PCONSISTENCY request from an application, it first checks to see if the file has already been registered. If not, a check is made to see if the file has been cached. If the file has not been cached, then the server daemon 306 is contacted and the fields for which consistency is desired are transferred to the client cache.
Once cached, the client daemon 304 registers the fields or records for which consistency is being requested, so that the changes made to the server copy of the file will cause the server daemon 306 to notify the client daemon 304 of such change. Whenever a new file is fetched from the file system server/file client 102, the index and version vector information relative to the specified template are also sent to the client daemon 304. This index file is updated by the client daemon 304 whenever the file is modified. Version vectors are compared between the client and server to identify which portions of the file must be transferred upon update, and also to detect write/write conflicts. If the client daemon 304 is notified of a conflict, then changes made to the file on the client are saved, replaced with the server version, and the user is notified of the conflict (assuming "notification" is the fallback behavior selected). In accordance with the present embodiment, no effort is made to resolve the conflict automatically, although it is contemplated that such could be accomplished using heretofore known techniques or improvements thereon. The role of the server daemon 306 is to handle read/write and consistency requests from one or more client daemons 304. The server daemon 306 runs as a background operation on the file system server/file client 102 and serves files from any of one or more readable or writable partitions mounted on its file system, e.g., such as network volumes from the file server 104. Thus, the file system client (the portable computer) 100 can access files stored on machines other than the file system server/file client 102 as long as the files are directly accessible from the file system server/file client 102, such as through its local area network. 
Since the server daemon 306 runs with privileges granted to the user (rather than root), only those files that are normally accessible to the user of the file system server/file client 102 are made available to the file system client (the portable computer) 100. When a file is registered by a client daemon 304, the server daemon 306 monitors the metadata of that file at 60 second intervals to detect server side changes. If a change is detected on the server side, the index file is recomputed, the registered portions are queued, and the queue is flushed to the client daemon 304. Before responding to a PCONSISTENCY request from a client daemon 304, the server first consults the index and version vector file corresponding to the specified template. If the index file is outdated or does not exist, the server daemon 306 precomputes all of the indices and assigns them a unique version vector. In subsequent operations with the client daemon 304, the version vector is used for the detection of write/write conflicts. When conflicts are detected the client daemon 304 is notified immediately. As mentioned above, in accordance with the present embodiment, the server copy is always assumed to be correct, leaving it up to the client daemon 304 to "resolve" any file conflicts. Applications inform the file system of their data file semantics through data format templates. All templates are stored in a single configuration file identified by distinct names and are used by both the client daemon 304 and the server daemon 306. 
In order to handle a wide variety of data files, three distinct forms of templates are employed: fixed length templates specify field lengths in bytes; variable length contiguous templates specify strings that identify record and field boundaries in data files in which all fields must occur in the specified order, cannot be nested, and cannot be missing; variable length generic templates are specified using regular expressions and consequently have fewer restrictions than the variable length contiguous templates or fixed length templates in that, for example, they can have fields that nest, overlap, are missing, or occur in any order. Variable length generic templates are the most processor intensive of these three forms of templates. Index computation is trivial in the case of fixed length templates but may be involved in the case of a variable length generic template. The basic format of a template expression is shown below. Colons are used to separate different depths of field nesting, and the backslash can be used to extend the template onto more than one line. Note that templates cannot use combinations of fixed, variable length contiguous and variable length generic forms in the same definition.
______________________________________
Template_Name
: (Record Separators) [\
: Id1 = (Field Separators), . . . [\
: Id2 = (Subfield Separators), . . . ]]
______________________________________
Templates consist of a Template_Name, followed by a colon, followed by a record separator. If records can be broken down into individual fields or subfields, these can be specified using additional colon-separated entries in the template. Records within a file are uniquely identified by their location relative to the beginning of the file. Fields and subfields within a record are identified by the Id values specified in the template file. For fixed length templates, the separator is simply the record or field length. For variable length contiguous templates, the separator is a ("marker string", index) pair. The marker string is a search string that identifies the record or field boundary, and the index is the number of bytes from the beginning of the marker string to the true boundary point. In the case of variable length generic templates, the separator is described using two regular expressions, one that matches the start of the field, and another that matches the end of the field. As with contiguous templates, each regular expression has an index to precisely identify the boundary location within the matching expression. The following are examples of a fixed length template, a variable length contiguous template, and a variable length generic template for an e-mail spool file, such as might be used with the e-mail system described below in the example.
______________________________________
Fixed_Sized_Example \
: (1024) \
: .1=(16), .2=(512), .3=(496) \
: .1.1=(8), .1.2=(8), .2.1=(256), .2.2=(256)
______________________________________
This template defines 1,024 byte records with three subfields of lengths 16, 512, and 496 bytes respectively. Sub field one consists of two eight byte fields and sub field two has two 256 byte fields.
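Because every length in a fixed length template is known in advance, locating a field reduces to arithmetic. The following C sketch computes the byte offset of a top-level field for the layout of Fixed_Sized_Example (1,024-byte records with fields of 16, 512, and 496 bytes). The function name `field_offset` and the convention of returning `(size_t)-1` for an invalid identifier are assumptions made for illustration; records and fields are numbered from one.

```c
#include <stddef.h>

/* Layout taken from Fixed_Sized_Example: 1,024-byte records holding
 * three top-level fields of 16, 512, and 496 bytes. */
static const size_t RECORD_LEN = 1024;
static const size_t FIELD_LEN[] = { 16, 512, 496 };
enum { FIELD_COUNT = 3 };

/* Return the byte offset, from the start of the file, of top-level
 * field `field` within record `record` (both numbered from one).
 * Returns (size_t)-1 if either number is out of range. */
size_t field_offset(size_t record, size_t field)
{
    if (record == 0 || field == 0 || field > FIELD_COUNT)
        return (size_t)-1;

    /* Skip all earlier records, then all earlier fields in this record. */
    size_t offset = (record - 1) * RECORD_LEN;
    for (size_t i = 0; i + 1 < field; i++)
        offset += FIELD_LEN[i];
    return offset;
}
```

With this layout, field .2 of the first record starts at byte 16 and field .3 at byte 528; the same fields in the second record start 1,024 bytes later.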
______________________________________
Variable_Sized_Contiguous_Example \
: (*\n\nFrom *,2) : .1=(*\nFrom: *,1), \
.2=(*\nSubject: *,1), .3=(*\n\n*,2)
______________________________________
For this template, records are separated by the marker string "\n\nFrom " and contain three subfields. The first subfield starts with "From:", the second subfield starts with "Subject:", and the third starts after two newlines. For string matches, field lengths are not computed; each field is assumed to span the region between separators.
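The ("marker string", index) pair used by contiguous templates can be illustrated with a short C sketch: search for the marker, then add the index to obtain the true boundary. The function name `next_boundary` is hypothetical, the text is assumed to be NUL-terminated, and the `*` wildcards that surround the marker in the template syntax are not interpreted here; only the literal marker is searched for.

```c
#include <stddef.h>
#include <string.h>

/* Find the next boundary at or after byte offset `from` in the
 * NUL-terminated buffer `text`. The boundary lies `index` bytes past
 * the start of the first occurrence of `marker`. Returns the boundary
 * offset, or -1 if the marker does not occur. The caller must ensure
 * that `from` does not exceed the length of `text`. */
long next_boundary(const char *text, size_t from,
                   const char *marker, size_t index)
{
    const char *hit = strstr(text + from, marker);
    if (hit == NULL)
        return -1;
    return (long)(hit - text) + (long)index;
}
```

Applied to a spool buffer with the record separator from the template above, a call such as `next_boundary(spool, 0, "\n\nFrom ", 2)` would return the offset of the "F" that begins the next message, since the index of 2 skips the two newline characters.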
______________________________________
Variable_Fixed_Generic_Example \
: (/F/,0 - /\n\nFrom /,1) \
: .1=(/F/,0 - /\n\n/,0), \
.2=(/\n/,0 - /\n\nFrom *,0) \
: .1.1=(/\nSubject: *,1 - /\,0), \
.1.2=(/\nFrom: /,1 - /\n/,0)
______________________________________
The regular expressions used in the generic template are specified using surrounding slashes (i.e., /F/). For fields defined by regular expressions, field indices are computed in a recursive manner, starting with records and working down to the deeper fields. Subfields are only searched for within the higher level fields to which their Id belongs. For example, in the above example, the "Subject:" and "From:" fields are only searched for within the e-mail headers. Since the higher level boundaries are already known, this makes it possible to identify missing and out of order subfields.
In the argument list for PCONSISTENCY library calls, the fields for which consistency is desired must be listed. Since records are in numerical order starting with record one at the beginning of the file, consistency fields can be specified using number ranges and wild cards. In addition, the special character $ can be used to represent the number of the last record in the file. For example, consistency for the "Subject:" fields of the last ten e-mail messages can be denoted by: $-10.1.1-$.1.1 for the example below. Consistency of all the "Subject:" and "From:" fields is denoted by: *.1.*.
EXAMPLE
The following example relates to an e-mail system employing the teachings of the above-described embodiment. There are two processes involved in the e-mail system: the e-mail backend, which keeps desired parts of a mail spool file consistent between the portable and the file system server, and an e-mail frontend, which allows the user to browse e-mail headers and selectively read e-mail messages. The backend runs continuously and illustrates application-directed consistency depending on the available quality of service, while the frontend is a user invoked process that illustrates the use of PREAD to override the default consistency provided by the backend. The e-mail backend of the present example is designed to handle two possible quality of service classes: Ethernet and Ram.
In the Ethernet class, the backend keeps the entire mail spool file consistent, while in the Ram class, the backend keeps only the "From" and the "Subject" fields consistent. The template structure used for the spool file is as follows:
______________________________________
Variable_Fixed_Generic_Example \
: (/F/,0 - /\n\nFrom /,1) \
: .1=(/F/,0 - /\n\n/,0), \
.2=(/\n/,0 - /\n\nFrom *,0) \
: .1.1=(/\nSubject: *,1 - /\,0), \
.1.2=(/\nFrom: /,1 - /\n/,0)
______________________________________
The following are relevant parts of e-mail backend code with five points of interest indicated using parenthetical numbers in the margin.
______________________________________
email_backend() {
. . .
(1) make_QoS_Option(&QoS_Struct, 2);
set_QoS_Option(&QoS_Struct, 0, ETHERNET, proc_ether,
ROLLBACK, ROLLBACK);
(2) set_QoS_Option(&QoS_Struct, 1, RAM, proc_ram,
ROLLBACK, ROLLBACK);
(3) get_QoS(&QoS_Struct);
. . .
}
proc_ether() {
. . .
/* whole-file consistency */
(4) pconsistency(mailspool, MAIL_TML, "*", LOCAL);
(5) pause();
. . .
}
proc_ram() {
/* only the "From" and "Subject" fields */
pconsistency(mailspool, MAIL_TML, "*.1.*", LOCAL);
pause();
. . .
}
______________________________________
Items (1), (2) and (3) above relate to adaptive run time support. (1) The make_QoS_Option call specifies the number of options provided for the subsequent get_QoS call. (2) The set_QoS_Option call takes six parameters: the pointer to the QoS structure (&QoS_Struct), the index of the option (0), the desired QoS class (Ethernet), the procedure to execute if that class is satisfied (proc_ether), the action to perform if the QoS goes below the negotiated class (ROLLBACK), and the action to perform if the QoS goes above the negotiated class (ROLLBACK). (3) The get_QoS call handles both QoS negotiation and reaction to notification. When get_QoS is called, the run time system finds the highest QoS class among the options that it can satisfy, and executes the corresponding procedure (e.g., if Ethernet can be satisfied, it will execute proc_ether). If, during execution of the procedure, the QoS goes below the specified QoS class, then the action corresponding to the decrease of QoS class (ROLLBACK) is performed. Note that four possible actions are allowed: rollback, abort, block or ignore. Rollback will go to the start of the get_QoS call, and start the procedure corresponding to the currently highest QoS class. In the above example, if during execution of the proc_ether call the QoS comes down from Ethernet to Ram, the effect is to abort proc_ether and to start the execution of proc_ram. (4) In proc_ether, the application specifies whole file consistency (which is the default for the file system in the connected mode). The PCONSISTENCY function is invoked with the file name (mailspool), template name (mail.tml), consistency fields (*), and fallback behavior (local). (5) The pause then suspends the process until a SIGUSR1 signal (by appropriately setting the mask) from the runtime system signifies a change in the QoS class.
If the QoS class is now Ram, proc_ram is invoked, and its effect is to change the consistency policy from whole file to only the "From" and "Subject" fields. The above description of the backend illustrates two points: (a) PCONSISTENCY and get_QoS have close interaction, and (b) the consistency is completely independent of opens and closes. The e-mail frontend is designed to let the user browse through the headers of all e-mail messages ("From" and "Subject" fields), and then read specific e-mail messages selected by an index number. The relevant parts of the e-mail frontend are given below, with points of interest indicated by parenthetical numbers located in the margins.
______________________________________
email_frontend() {
. . .
(1) fd = open(mailspool, O_RDONLY);
. . .
operation = get_input();
switch(operation) {
case BROWSE_HEADERS:
. . .
index = 0;
while (get_email(index, &pos, &len)) {
index++;
lseek(fd, pos, SEEK_SET);
(2) read(fd, buffer, len);
. . .
}
. . .
break;
case GET_EMAIL_MESSAGE:
. . .
index = get_email_index();
if (get_email(index, &pos, &len)) {
lseek(fd, pos, SEEK_SET);
(3) pread(fd, buffer, len,
CONSISTENT_READ, ABORT);
. . .
}
break;
. . .
}
. . .
}
______________________________________
By way of example, consider the QoS class to be Ram. The backend thus keeps the "From" and "Subject" fields consistent for the messages, but not the body of the messages. (1) The file open is independent of the consistency policy. In fact, in the present example, the application that maintains consistency (i.e., the e-mail backend) is different from the application that uses the consistency for its reads (i.e., the e-mail frontend). The frontend allows the user to select the operation using get_input. If the operation is to browse the headers, the frontend repeatedly gets the start position and length of the next e-mail message (it may use an index table for this purpose); seeks to the specified location in the file; and reads the message. (2) The read call automatically maps to a local read in the file system. Since the headers are consistent, the read call returns the latest (within the 60 second propagation delay) e-mail headers. However, the body of the messages may be inconsistent, and may be composed of all zeros at the file system client (portable computer) 100 if they have not yet been retrieved from the file system server/file client 102. (3) If the user desires to read a specific e-mail message, a consistent read is performed through a PREAD library call, which specifies the CONSISTENT_READ option and abort fallback. This read synchronously accesses the file system server and aborts the read request upon failure to access the file system server.
While the invention herein disclosed has been described by means of specific embodiments and applications thereof, numerous modifications and variations could be made thereto by those skilled in the art without departing from the scope of the invention set forth in the claims.
