Declarative automatic class selection filter for dynamic file reclassification
5495603
Abstract
A system for associating application or system information with data files according to data file attributes. The system employs an Automatic Classification Selection (ACS) filter having an ordered sequence of rule-based declarations, each of which specifies a range of values for selected data file attributes. Each rule-based declaration includes specifications for data file attributes, any of which can be specified using wild-cards. Each data file is tested against the ordered declarations and the first declaration that matches the data file attributes is enabled to assign a classification to that data file. Because the ACS filter is declarative, it may be easily modified without programming expertise. Because any data file can be quickly sieved through the ACS filter, the data file class linkages need not be stored and thus are always dynamically updated in response to changes in data file attributes over time.
Claims
We claim:
Description
BACKGROUND OF THE INVENTION
TABLE 1
__________________________________________________________________________
Storage  Directory   File    File    File (DOS)   Existing     Assigned
Group    Path        Name    Size    Attributes   Management   Management
Name                                              Class        Class
__________________________________________________________________________
SYSTEM   \OS2\*\     *.*     *       *            *            NOBACKUP
DATA     \ACCOUNTS   *.WKS   >300k   *            *            BACKUPNOW
TOOL*    \*          *.SYS   *       SH           *            MONTHLY
TOOL*    \*          *.INI   *       *            MONTHLY      DAILY
TOOL*    \*          *.INI   *       *            *            DAILY
TOOL*    \*          *.*     *       R            *            RUN (CRITMC)
*        \*          *       *       *            *            DEFAULT
. . .
__________________________________________________________________________
The ACS filter specification in Table 1 is not a detailed example but serves to illustrate the basic layout and operation of such a data object. Table 1 can be viewed essentially as a drop-through "sieve". The declaratory rules are generated by the user to assign management classes to categories of data files whose attributes match the expressions specified. The declaratory rules are referenced by the general-purpose ACS routine 26 in FIG. 2, which differs from ACS routine 20 in FIG. 1 because it is neither customized nor user-written. That is, ACS routine 26 can be provided as an element of the storage system itself.

Referring to FIG. 2, ACS routine 26 references the declaratory rules in the ACS filter 28 (Table 1) to derive class assignment 24 from the attributes of data file 22. The declaratory rule list (Table 1) is searched from top to bottom and left to right using the actual data file attributes 22 (FIG. 2). The first match found specifies the management class that is assigned to the data file. The "Assigned Management Class" column must provide a specific entry without wild-cards. Although more than one declaratory rule (row) may match the incoming data file specifications 22, routine 26 selects the first matching row as the class assignment. The ordering of the rows within ACS filter 28 is user-selected.

FIG. 3 provides a simple block diagram of an illustrative storage system of this invention containing a plurality of data objects. It can be appreciated from FIG. 3 that the above discussion in connection with FIG. 2 is applicable to the like-named data objects shown in FIG. 3.

Referring to Table 1, declaratory rules may use wild-card notation for matching file attributes. For example, character attribute matching rules may use "?" and "*" to match a single character or a string of characters, respectively, and numeric attribute matching rules may use comparison operators (>, < or =) to compare the actual data file attribute value with the value specified by the declaratory rule. In Table 1, for instance, a data file larger than 300K bytes will match column 4 of the second row.

Where additional processing is needed to associate information with the file, the application information assigned by the declaratory rule specification may include a reference to a particular executable routine. In Table 1, files having attributes matching the specification in the sixth row are passed to an executable routine CRITMC, as indicated by the RUN(. . . ) specification in the Assigned Management Class column. The CRITMC routine may be a user-customized ACS routine of the type illustrated in FIG. 1, for example. Such a routine is then responsible for assigning additional management class or other information to the data file in whatever manner is provided for by the customizing user.

The declaratory attribute matching rules shown in Table 1 can be extended to include other constructs, such as range checking or set membership/nonmembership. Although the types and numbers of file attributes vary extensively across different data systems, the ACS system of this invention can be adapted to any such system merely by modifying the rule declarations illustrated in Table 1.

Note that the ACS system shown in FIG. 3 does not require storage of class assignment 24. This is because general ACS routine 26 can be automatically invoked whenever class assignment 24 information is desired by the system. The advantages of this are several. First, class assignment 24 is always determined for the most recent version of data file 22, and so always reflects dynamic changes in the characteristics and attributes of data file 22. As an example, consider how class assignment 24 changes under ACS filter 28 of Table 1 when data file 22 grows in size by 300K bytes. Secondly, because class assignment 24 need not be saved, memory and processor efficiency is improved over the prior art. Finally, no user programming expertise is required. Modification of ACS filter 28 is accomplished merely by changing the declarative rules illustrated in Table 1. ACS routine 26 need never be modified when using the ACS filter 28 of this invention.

It can be appreciated that the ACS method of this invention may be used to associate constructs other than Management Class with data files. The filtering mechanism can be used to bind constructs such as storage class and the like. Attributes such as data file format, allocated size and the like may be used as filtering specifications to accomplish this task. In general, the filtering approach may be used to associate any policy constructs with any data object based on attributes of the data object in a straightforward declarative manner. For instance, the ACS filter in Table 1 may be extended to provide for "custom attributes", as illustrated in Table 2.
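The top-to-bottom, first-match lookup described for Table 1, and the way a class assignment changes when it is recomputed after a file grows, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the routine of the invention: the names TABLE_1, size_matches and classify are hypothetical, Python's fnmatch.fnmatchcase stands in for the "*" and "?" wild-cards, "300k" is read as 300 * 1024 bytes, DOS attributes are compared as plain strings, and a RUN(. . . ) entry is simply returned rather than executed.

```python
from fnmatch import fnmatchcase

# Rows of Table 1 in their listed order. Columns: storage group, directory
# path, file name, file size, DOS attributes, existing class, assigned class.
TABLE_1 = [
    ("SYSTEM", "\\OS2\\*\\", "*.*",   "*",     "*",  "*",       "NOBACKUP"),
    ("DATA",   "\\ACCOUNTS", "*.WKS", ">300k", "*",  "*",       "BACKUPNOW"),
    ("TOOL*",  "\\*",        "*.SYS", "*",     "SH", "*",       "MONTHLY"),
    ("TOOL*",  "\\*",        "*.INI", "*",     "*",  "MONTHLY", "DAILY"),
    ("TOOL*",  "\\*",        "*.INI", "*",     "*",  "*",       "DAILY"),
    ("TOOL*",  "\\*",        "*.*",   "*",     "R",  "*",       "RUN (CRITMC)"),
    ("*",      "\\*",        "*",     "*",     "*",  "*",       "DEFAULT"),
]

K = 1024  # assumption: "300k" is read as 300 * 1024 bytes


def size_matches(spec, size):
    """Evaluate '*', '>n', '<n' or '=n' (n may carry a 'k' suffix)."""
    if spec == "*":
        return True
    number = spec[1:].lower()
    limit = int(number[:-1]) * K if number.endswith("k") else int(number)
    return {">": size > limit, "<": size < limit, "=": size == limit}[spec[0]]


def classify(rules, f):
    """Return the assigned class from the first row matching every column."""
    for group, path, name, size, attrs, existing, assigned in rules:
        if (fnmatchcase(f["group"], group)
                and fnmatchcase(f["path"], path)
                and fnmatchcase(f["name"], name)
                and size_matches(size, f["size"])
                and fnmatchcase(f["attrs"], attrs)
                and fnmatchcase(f["existing"], existing)):
            return assigned
    return None  # no row matched; Table 1 itself ends with a catch-all row


# A hypothetical spreadsheet in the DATA storage group.
ledger = {"group": "DATA", "path": "\\ACCOUNTS", "name": "LEDGER.WKS",
          "size": 100 * K, "attrs": "", "existing": ""}
before = classify(TABLE_1, ledger)   # only the final catch-all row matches
ledger["size"] += 300 * K            # the file grows by 300K bytes
after = classify(TABLE_1, ledger)    # the second row now matches
print(before, "->", after)           # DEFAULT -> BACKUPNOW
```

Because nothing is cached, the second call sees the new size and returns a different class without any stored assignment having to be updated.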
TABLE 2
__________________________________________________________________________
Storage  Directory   File    File    File (DOS)   Custom    Custom    Input        Assigned
Group    Path        Name    Size    Attributes   Attr #1   Attr #2   Management   Management
                                                                      Class        Class
__________________________________________________________________________
SYSTEM   \OS2\*\     *.*     *       *            ACY*      *         *            NOBACKUP
DATA     \ACCOUNTS   *.WKS   >300k   *            ENG*      *         *            BACKUPNOW
TOOL*    \*          *.SYS   *       S            *         *         *            MONTHLY
TOOL*    \*          *.INI   *       *            *         *         MONTHLY      DAILY
TOOL*    \*          *.INI   *       *            *         *         *            DAILY
TOOL*    \*          *.*     *       *            *         TED       *            YEARLY
TOOL*    \*          *.*     *       *            *         JOHN      *            YEARLY
*        \*          *       *       *            *         *         *            RUN (MCASSIGN)
. . .
__________________________________________________________________________
The columns labelled "Custom Attr #1" and "Custom Attr #2" denote file attributes that may be assigned and specified by the user for inclusion in ACS filters. For example, the first may be accounting information while the second may refer to ownership of the data object. After data files are assigned the custom attribute values, the ACS filter in Table 2 operates as discussed in connection with Table 1, assigning management class based on wild-card matching, including matching of user-defined custom attributes.

ACS Filter Wild-Card Specifications

The following wild-card specifications are suitable for use in the "columns" of the ACS filter specification table to match data file attributes passed into the ACS routine.
______________________________________
*         When used in a string specification, the "*"
          character matches 0 or more characters of the
          attribute value passed into the filter
          routine. For example, SYS* matches SYS,
          SYSTEM and SYS01, and does not match STS5.
?         When used in a string specification, the "?"
          character matches one character of the
          respective attribute value passed into the
          filter routine. For example, the
          specification SYS?E? matches SYSTEM and
          SYSGEN, but not SYSB01.
string    When used in a string specification, the
          specification matches ONLY the string value
          specified. For example, the specification
          SYSTEM matches SYSTEM only.
!string   When used in a string specification, the
          specification matches ALL strings BUT the
          string value specified. For example, the
          specification !SYSTEM matches SYSGEN, but
          not SYSTEM.
!NULL     When used in a string specification, the
          specification matches ALL non-null strings
          (a null string being 0-length) for the
          respective attribute specified.
>n        For a numeric attribute, this specification
          matches any value that is greater than the
          numeric value "n". For example, the
          specification >5000 matches 10000 and 5001,
          but not 5000.
<n        For a numeric attribute, this specification
          matches any value that is less than the
          numeric value "n". For example, the
          specification <5000 matches 100 and 50, but
          not 50000.
=n        For a numeric attribute, this specification
          matches only a value that is exactly equal
          to the numeric value "n".
(str1, str2, . . . )
          When used in a string specification, the
          specification matches any string in the list
          specified. The string list may also contain
          "*" and "?" wildcards (see above).
!(str1, str2, . . . )
          When used in a string specification, the
          specification matches any string NOT in the
          list specified. The string list may also
          contain "*" and "?" wildcards (see above).
______________________________________
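The specifications listed above can be read as two small matching functions, one for string attributes and one for numeric attributes. The following Python sketch is one possible reading, offered as an assumption rather than as the matching logic of the invention; the function names wildcard_match, string_spec_matches and numeric_spec_matches are hypothetical, and "*" is also accepted as a numeric specification because the filter tables use it in the File Size column.

```python
import re


def wildcard_match(pattern: str, value: str) -> bool:
    """'*' matches zero or more characters, '?' exactly one; the rest is literal."""
    regex = "".join(
        ".*" if ch == "*" else "." if ch == "?" else re.escape(ch) for ch in pattern
    )
    return re.fullmatch(regex, value, flags=re.DOTALL) is not None


def string_spec_matches(spec: str, value: str) -> bool:
    """Evaluate a string specification against one attribute value."""
    spec = spec.strip()
    if spec == "!NULL":                       # any non-empty string
        return value != ""
    if spec.startswith("!"):                  # !string or !(str1, str2, ...)
        return not string_spec_matches(spec[1:], value)
    if spec.startswith("(") and spec.endswith(")"):
        items = [item.strip() for item in spec[1:-1].split(",")]
        return any(wildcard_match(item, value) for item in items)
    return wildcard_match(spec, value)        # plain string, possibly with * or ?


def numeric_spec_matches(spec: str, value: int) -> bool:
    """Evaluate '*', '>n', '<n' or '=n' against a numeric attribute value."""
    spec = spec.strip()
    if spec == "*":
        return True
    operator, n = spec[0], int(spec[1:])
    if operator == ">":
        return value > n
    if operator == "<":
        return value < n
    if operator == "=":
        return value == n
    raise ValueError(f"unsupported numeric specification: {spec!r}")


# The examples given in the list above.
print(string_spec_matches("SYS*", "SYS01"))                       # True
print(string_spec_matches("SYS?E?", "SYSB01"))                    # False
print(string_spec_matches("!SYSTEM", "SYSGEN"))                   # True
print(string_spec_matches("!(*.CMD, *.SYS, *.BAT)", "CONFIG.SYS"))  # False
print(numeric_spec_matches(">5000", 5000))                        # False
```

Negation is handled by evaluating the remainder of the specification and inverting the result, so "!" applies equally to a single string and to a parenthesized list.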
An Example ACS Filter

An exemplary ACS rule definition is now discussed. The ACS filter definition provides parameters for a SYSTEM storage group, assumed to contain the OS/2 operating system files that are normally on the C: drive, and for other storage groups containing spreadsheets, drawings and miscellaneous data files. The example ACS definition is specified in Table 3.
TABLE 3
__________________________________________________________________________
DEFINE_ACS_RULE
  ACS_NAME(EXAMPLE)
  DESCRIPTION(Example ACS Rule definition for an OS/2 User)
  MATCH( DESCRIPTION(Do not backup the OS/2 Operating System)
         STORAGE_GROUP(SYSTEM)
         DIRECTORY_PATH((\OS2*, \DOS\*, \SPOOL\*, \MUGLIB\*,
                         \CMLIB\*, \SQLLIB\*, \IBMLAN\*))
         FILE_NAME(*)
         FILE_SIZE(*)
         FILE_ATTRIBUTES(*)
         ASSIGN_MGMT_CLASS(NOBACKUP))
  MATCH( DESCRIPTION(Do NOT backup selected Files in system Root directory)
         STORAGE_GROUP(SYSTEM)
         DIRECTORY_PATH(\)
         FILE_NAME(!(*.CMD, *.SYS, *.BAT))
         FILE_SIZE(*)
         FILE_ATTRIBUTES(*)
         ASSIGN_MGMT_CLASS(NOBACKUP))
  MATCH( DESCRIPTION(Backup selected Files in system Root directory)
         STORAGE_GROUP(SYSTEM)
         DIRECTORY_PATH(\)
         FILE_NAME((*.CMD, *.SYS, *.BAT))
         FILE_SIZE(*)
         FILE_ATTRIBUTES(*)
         ASSIGN_MGMT_CLASS(WEEKLY))
  MATCH( DESCRIPTION(Backup Product Programs monthly)
         STORAGE_GROUP(*)
         DIRECTORY_PATH(*)
         FILE_NAME((*.EXE, *.COM, *.SYS, *.DLL))
         FILE_SIZE(*)
         FILE_ATTRIBUTES(*)
         ASSIGN_MGMT_CLASS(MONTHLY))
  MATCH( DESCRIPTION(Backup User's Daily - important)
         STORAGE_GROUP(*)
         DIRECTORY_PATH(*)
         FILE_NAME((*.CDR, *.WKS))
         FILE_SIZE(*)
         FILE_ATTRIBUTES(*)
         ASSIGN_MGMT_CLASS(DAILY))
  MATCH( DESCRIPTION(Backup Large Files Weekly)
         STORAGE_GROUP(*)
         DIRECTORY_PATH(*)
         FILE_NAME(*)
         FILE_SIZE(>500000)
         FILE_ATTRIBUTES(*)
         ASSIGN_MGMT_CLASS(WEEKLY))
  MATCH( DESCRIPTION(Backup all other files twice a week)
         STORAGE_GROUP(*)
         DIRECTORY_PATH(*)
         FILE_NAME(*)
         FILE_SIZE(*)
         FILE_ATTRIBUTES(*)
         ASSIGN_MGMT_CLASS(TWICEAWEEK));
__________________________________________________________________________
Table 3 assumes that the management classes are named, in general, for the backup frequency specified in the management class itself. That is, management class "DAILY" has a backup frequency of one day.

The OS/2 operating system files are not backed up because the management class is NOBACKUP. This is because the operating system itself must be reinstalled if a failure occurs. The system will be restored from a control archive. OS/2 must be up and operational on a node before any data files can be restored. Thus, such installation is required before file recovery. The OS/2 operating system includes files in the directories \, \OS2, \DOS, \SPOOL, \MUGLIB, \CMLIB, \SQLLIB and \IBMLAN.

Files with extensions .BAT, .CMD and .SYS are backed up from the root of the system storage group on a weekly basis because they contain information that users will likely customize and may need at a later date. Examples include STARTUP.CMD, AUTOEXEC.BAT and CONFIG.SYS.

Files with the .EXE, .COM, .DLL and .SYS extensions are only backed up monthly wherever they reside. This is because these files usually represent programs or program products that are available for reinstallation if a failure occurs. The ACS filter may be differently specified if the workstation user is a software developer, where the selected extensions may represent integral elements of a system under development.

User data files such as *.WKS and *.CDR are backed up on a daily basis because they represent primary information used regularly on the workstation.

Large files (greater than 500,000 bytes) are backed up on a weekly basis. The reasoning for this is to minimize the daily backup time by deferring movement of larger files to weekly backups.

All other files that fall through the upper declarations are associated with the TWICEAWEEK management class.
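To see how the ordering of the MATCH declarations in Table 3 decides the outcome, the rules can be transcribed into data and a few sample files run through them. The Python sketch below is an illustration only: the helper names any_of, none_of, RULES and assign_class are hypothetical, as are the sample file names other than CONFIG.SYS; fnmatch.fnmatchcase is swapped in for the "*" and "?" wild-cards (it additionally accepts bracket expressions, which the filter description does not mention); FILE_ATTRIBUTES is omitted because every declaration specifies "*" for it; and the SQLLIB path is written with a leading backslash, as in the prose description.

```python
from fnmatch import fnmatchcase


def any_of(*patterns):
    """Predicate: value matches at least one wild-card pattern."""
    return lambda value: any(fnmatchcase(value, p) for p in patterns)


def none_of(*patterns):
    """Predicate: value matches none of the patterns (the '!(...)' form)."""
    return lambda value: not any(fnmatchcase(value, p) for p in patterns)


ANY_SIZE = lambda size: True

# Each rule: (storage group, directory path, file name, file size, class),
# listed in the same order as the MATCH declarations of Table 3.
RULES = [
    (any_of("SYSTEM"),
     any_of("\\OS2*", "\\DOS\\*", "\\SPOOL\\*", "\\MUGLIB\\*",
            "\\CMLIB\\*", "\\SQLLIB\\*", "\\IBMLAN\\*"),
     any_of("*"), ANY_SIZE, "NOBACKUP"),
    (any_of("SYSTEM"), any_of("\\"),
     none_of("*.CMD", "*.SYS", "*.BAT"), ANY_SIZE, "NOBACKUP"),
    (any_of("SYSTEM"), any_of("\\"),
     any_of("*.CMD", "*.SYS", "*.BAT"), ANY_SIZE, "WEEKLY"),
    (any_of("*"), any_of("*"),
     any_of("*.EXE", "*.COM", "*.SYS", "*.DLL"), ANY_SIZE, "MONTHLY"),
    (any_of("*"), any_of("*"), any_of("*.CDR", "*.WKS"), ANY_SIZE, "DAILY"),
    (any_of("*"), any_of("*"), any_of("*"), lambda size: size > 500000, "WEEKLY"),
    (any_of("*"), any_of("*"), any_of("*"), ANY_SIZE, "TWICEAWEEK"),
]


def assign_class(group, path, name, size):
    """Return the management class of the first declaration that matches."""
    for group_ok, path_ok, name_ok, size_ok, mgmt_class in RULES:
        if group_ok(group) and path_ok(path) and name_ok(name) and size_ok(size):
            return mgmt_class
    return None


samples = [
    ("SYSTEM", "\\OS2\\DLL", "DOSCALL1.DLL", 80000),
    ("SYSTEM", "\\", "CONFIG.SYS", 2000),
    ("SYSTEM", "\\", "README.TXT", 500),
    ("DATA", "\\TOOLS", "EDITOR.EXE", 700000),
    ("DATA", "\\ACCOUNTS", "BUDGET.WKS", 900000),
    ("DATA", "\\SCANS", "PHOTO.BMP", 900000),
    ("DATA", "\\NOTES", "TODO.TXT", 1200),
]
for sample in samples:
    print(sample[2], "->", assign_class(*sample))
```

With these declarations CONFIG.SYS in the system root is classed WEEKLY rather than MONTHLY, and a 900,000-byte .WKS file is classed DAILY rather than WEEKLY, because in each case the earlier declaration matches first.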
Clearly, other embodiments and modifications of this invention will occur readily to those of ordinary skill in the art in view of these teachings. Therefore, this invention is to be limited only by the following claims, which include all such embodiments and modifications when viewed in conjunction with the above specification and accompanying drawing.
