Method for classification of year-related data fields in a program
5794048
Abstract
The method of the invention enables a computer to examine a software application, which includes operands and operators, and to identify operand fields which include a year value. An operand association table is provided for each operator and indicates, based upon inter-relationships of operands associated with the operator, whether an associated operand that has been classified as a year field or a probable year field, should be assigned a revised classification and what that revised classification should be. The method reviews the application to identify each operand which can be initially classified as a year field or a probable year field and lists each such operand in an operand table. The method also reviews the application to identify every operator listed therein and lists every operator and any associated operands in an operator table. Thereafter, the method determines, for operator entries located in the operator table, and from operands associated therewith and an operand association table, whether the classification for each respective operand in the operand table should or should not be altered.
Claims
I claim:
Description
FIELD OF THE INVENTION
______________________________________
10 YEAR-OF-BIRTH
15 YEAR-OF-BIRTH PIC 99
15 MONTH-OF-BIRTH PIC 99
15 DAY-OF-BIRTH PIC 99
______________________________________
The variable "year-of-birth" includes 4 labels that indicate year, two format elements that indicate year, and one ordering that indicates year. The format elements are the six numeric digits that make up "year-of-birth", and the subdivision of the six digits into three groups of two digits. The ordering is the sequence of labels that are in a common year order, namely, "year", "month", and "day".
If a voting procedure is employed to enable the initial classification, the classification score for the variable shown above is 7 and its sub-fields would also have a score of 7. This is the maximum score a field can achieve, using such a scoring process. If a score exceeds 7, it is set to 7. For other formats, the score will be less than 7, which can be normalized to a value between 0.1 and 0.9 to provide a probability indicator that the operand comprises a probable year field.
As will be hereafter understood, a plurality of further tables are employed in the further analysis of the operands to enable a more precise classification to be assigned. A selection of the tables will be described hereinbelow.
Table A is indicative of the fact that certain constants are associated with date calculations. Table A includes examples of most of these. The use of such a constant in certain kinds of calculations is an indicator that a variable is year-related.
TABLE A
______________________________________
Constant  Usage                     Operators Involved
______________________________________
1-11      Month of Fiscal Year End  Comparison, Addition
4         Leap Year Test            Division
7         Days in a week            Comparison, Addition, Subtraction
8         Days in a week + 1        Comparison
12        Months in a year          Comparison, Addition, Subtraction
13        Months in a year + 1      Comparison
28        Days in a Month (Feb)     Comparison, Addition, Subtraction
29        Days in a Month (Feb)     Comparison, Addition, Subtraction
30        Days in a Month           Comparison, Addition, Subtraction
31        Days in a Month           Comparison, Addition, Subtraction
100       Leap Year Test            Division
365       Days in a Year            Comparison, Addition, Subtraction
366       Days in a Year            Comparison, Addition, Subtraction
367       Days in a Year + 1        Comparison
400       Leap Year Test            Division
4000      Leap Year Test            Division
______________________________________
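The constant table can be read as a lookup keyed on both the constant and the operator it appears with. The following is a minimal Python sketch of that idea, not the patent's implementation; the names `TABLE_A` and `date_usages` and the lowercase operator strings are assumptions introduced for illustration.

```python
# Hypothetical encoding of Table A: (constants, usage, operators involved).
# Names and operator spellings are illustrative assumptions.
CAS = frozenset({"comparison", "addition", "subtraction"})

TABLE_A = [
    (range(1, 12), "Month of fiscal year end", frozenset({"comparison", "addition"})),
    ((4, 100, 400, 4000), "Leap year test", frozenset({"division"})),
    ((7,), "Days in a week", CAS),
    ((8,), "Days in a week + 1", frozenset({"comparison"})),
    ((12,), "Months in a year", CAS),
    ((13,), "Months in a year + 1", frozenset({"comparison"})),
    ((28, 29), "Days in a month (Feb)", CAS),
    ((30, 31), "Days in a month", CAS),
    ((365, 366), "Days in a year", CAS),
    ((367,), "Days in a year + 1", frozenset({"comparison"})),
]


def date_usages(constant, operator):
    """Return every Table A usage suggested by `constant` used with `operator`."""
    return [usage for values, usage, ops in TABLE_A
            if constant in values and operator in ops]


# A constant only counts as a date indicator with the listed operators.
print(date_usages(400, "division"))   # leap year test
print(date_usages(400, "addition"))   # no indication
```

Note that a small constant such as 7 can match more than one row (a fiscal month and the days in a week), so the sketch returns a list rather than a single usage.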
As can be noted from an examination of Table A, if a numeral, for example "30", is associated with a comparison, addition or subtraction operator, an initial presumption can be made that the variable is year-related. As a further example, the value "400" is often used to test whether a year is a leap year (in a division action). Thus, the presence of one of the constants indicated in Table A provides one indicator that an associated operand field is year-related.
Comparison operator association Table B below covers the situation in which a pair of fields is associated with a comparison command. If Field 1 and Field 2 are initially classified as year fields, then a post-analysis classification, as judged by the association of the fields, will always indicate year fields for both Field 1 and Field 2. If, however, Field 1 is classified as a probable year field and Field 2 is classified as a year field, then the post analysis classification of Field 1 changes from probable year to year, as there are few, if any, situations where any value other than another year is compared with a year value.
TABLE B
______________________________________
Comparison Operator
Initial Classification    Post Analysis Classification
Field 1     Field 2       Field 1               Field 2
______________________________________
Year        Year          Year                  Year
Year        Probable      Year                  Year
Year        Not Year      Year                  Year
Probable    Year          Year                  Year
Probable    Probable      F_1 = (F_1 + F_2)/2   F_2 = (F_1 + F_2)/2
Probable    Not Year      Not Year              Not Year
Not Year    Year          Year                  Year
Not Year    Probable      Not Year              Not Year
Not Year    Not Year      Not Year              Not Year
______________________________________
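The comparison table can be expressed as a small function over probability values. The sketch below is illustrative only: it assumes that a year field is encoded as 1.0, a non-year field as 0.0, and a probable year field as a value strictly between the two (the description only states that probable scores are normalized to 0.1 through 0.9). The names `category` and `compare_rule` are hypothetical.

```python
def category(p):
    """Map a probability to a Table B label (encoding is an assumption)."""
    if p >= 1.0:
        return "year"
    if p <= 0.0:
        return "not_year"
    return "probable"


def compare_rule(f1, f2):
    """Post-analysis values for two operands of a comparison operator (Table B)."""
    c1, c2 = category(f1), category(f2)
    if "year" in (c1, c2):
        # Any comparison against a year field makes both operands year fields.
        return 1.0, 1.0
    if c1 == c2 == "probable":
        # Two probable fields share the average of their probabilities.
        avg = (f1 + f2) / 2
        return avg, avg
    # Remaining rows (at least one non-year, no year) become non-year.
    return 0.0, 0.0


print(compare_rule(0.6, 1.0))    # probable compared with year
print(compare_rule(0.25, 0.75))  # two probable fields are averaged
```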
In the classification noted in Table B, and other tables to follow, Field 1 is denoted as (F_1); Field 2 as (F_2); and Field 3 as (F_3). Note also from Table B that "not year" classifications can be changed to a "year" classification when a comparison operator requires such a field to be compared with a year field.
Association Tables C and D below are, respectively, used with addition operators and subtraction operators and logically relate initial classifications of associated operand fields to post analysis classifications of the same fields. It is important to understand that while Fields 1 and 2 may be present in a first component program within the application, Field 3 is often present in another component program within the application. Any analysis of individual component program listings, without taking into account the interrelationships between the listings, overlooks valuable information which aids in achieving substantially higher accuracies of year field classification. The logical association of all of the inter-related fields that result from the utilization of operand table 26, operator table 28 and the various association tables (to be further described below) provides a high probability that all year-related fields are discovered and are properly classified.
It is further to be noted that, in the main, the use of the association tables enables probable year field initial classifications to be changed to either a non-year classification or to a year classification and a non-year classification to be changed to either a probable year or a year classification. The logic which enables the construction of the addition association table and subtraction association table shown below is readily apparent to those skilled in the art. Each line of each table is logically constructed in accordance with common usages in known programming environments.
TABLE C
__________________________________________________________________________
Addition Operator
Initial Classification          Post Analysis Classification
Field_1   Field_2   Field_3     Field_1                     Field_2                     Field_3 If Present
__________________________________________________________________________
Year      Year      --          Year                        Year                        --
Year      Probable  --          Year                        Year                        --
Year      Not Year  --          Year                        Year                        --
Probable  Year      --          Probable                    Year                        --
Probable  Probable  --          Probable                    Probable                    --
Probable  Not Year  --          Not Year                    Not Year                    --
Not Year  Year      --          Not Year                    Year                        --
Not Year  Probable  --          Not Year                    Probable                    --
Not Year  Not Year  --          Not Year                    Not Year                    --
Year      Year      Any         Year                        Year                        Year
Year      Probable  Year        Year                        Not Year                    Year
Year      Probable  Probable    Year                        Not Year                    Year
Year      Probable  Not Year    Year                        Not Year                    Year
Year      Not Year  Year        Year                        Not Year                    Year
Year      Not Year  Probable    Year                        Not Year                    Year
Year      Not Year  Not Year    Year                        Not Year                    Year
Probable  Year      Year        Not Year                    Year                        Year
Probable  Year      Probable    Probable                    Year                        Year
Probable  Year      Not Year    Probable                    Year                        Year
Probable  Probable  Year        F_1 = (F_1 + F_2)/2         F_2 = (F_1 + F_2)/2         Year
Probable  Probable  Probable    Probable                    Probable                    Probable
Probable  Probable  Not Year    F_1 = F_1/2                 F_2 = F_2/2                 Not Year
Probable  Not Year  Year        Year                        Not Year                    Year
Probable  Not Year  Probable    Probable                    Not Year                    Probable
Probable  Not Year  Not Year    Not Year                    Not Year                    Not Year
Not Year  Year      Year        Not Year                    Year                        Year
Not Year  Year      Probable    Not Year                    Year                        Year
Not Year  Year      Not Year    Not Year                    Year                        Year
Not Year  Probable  Year        Not Year                    Year                        Year
Not Year  Probable  Probable    Not Year                    MAX (F_2, F_3)              MAX (F_2, F_3)
Not Year  Probable  Not Year    Not Year                    Not Year                    Not Year
Not Year  Not Year  Year        Probable Year, F_1 = 0.5    Probable Year, F_2 = 0.5    Year
Not Year  Not Year  Probable    Not Year                    Not Year                    Not Year
Not Year  Not Year  Not Year    Not Year                    Not Year                    Not Year
__________________________________________________________________________
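An association table of this kind maps naturally onto a dictionary keyed by the initial classifications. The sketch below encodes only the first nine rows of Table C (the two-operand case, where Field 3 is absent) and reports whether a lookup changed anything, which is the signal the reclassification flag relies on. The names `ADD_TWO_OPERANDS` and `reclassify_addition` are hypothetical.

```python
Y, P, N = "Year", "Probable", "Not Year"

# Two-operand rows of Table C: (initial F_1, initial F_2) -> (post F_1, post F_2).
ADD_TWO_OPERANDS = {
    (Y, Y): (Y, Y), (Y, P): (Y, Y), (Y, N): (Y, Y),
    (P, Y): (P, Y), (P, P): (P, P), (P, N): (N, N),
    (N, Y): (N, Y), (N, P): (N, P), (N, N): (N, N),
}


def reclassify_addition(f1, f2):
    """Look up the post-analysis pair and report whether it differs from the input."""
    post = ADD_TWO_OPERANDS[(f1, f2)]
    return post, post != (f1, f2)


print(reclassify_addition(Y, N))  # both operands become year fields
print(reclassify_addition(P, Y))  # row leaves both classifications unchanged
```

The three-operand rows could be added the same way with a three-element key; the rows containing formulas would then need probability values rather than labels.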
As shown below, a division operator association Table E is utilized. It is generally used only for fields classified as year fields and Field 3 is the numerical remainder of the operation. The test for Field 3 is an IF statement that is executed subsequent to the divide operation.
TABLE E
______________________________________
Division Operator
Initial Classification                                          Post Analysis Classification
Field 1                           Field 2        Field 3        Field 2
______________________________________
Constant of 4, 100, 400 or 4000   Year           Tested for 0   Year
Constant of 4, 100, 400 or 4000   Probable Year  Tested for 0   Year
Constant of 4, 100, 400 or 4000   Non-year       Tested for 0   = MAX (.5, 1-((1-F_2)/2))
______________________________________
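The three rows of Table E, including the formula in the last row, can be sketched as a single function. This is an illustration under stated assumptions: year is encoded as 1.0, non-year as 0.0 and probable year as a value in between, and a division that matches no row leaves Field 2 unchanged (the table does not say what happens in that case). The name `division_post_f2` is hypothetical.

```python
LEAP_TEST_CONSTANTS = {4, 100, 400, 4000}


def division_post_f2(divisor, f2, remainder_tested_for_zero):
    """Post-analysis value of Field 2 (the dividend) under Table E."""
    if divisor not in LEAP_TEST_CONSTANTS or not remainder_tested_for_zero:
        # No matching row in Table E; leaving f2 unchanged is an assumption.
        return f2
    if f2 > 0.0:
        # Year and probable year rows: Field 2 is classified as a year.
        return 1.0
    # Non-year row: MAX(.5, 1 - ((1 - F_2) / 2)).
    return max(0.5, 1 - (1 - f2) / 2)


print(division_post_f2(400, 0.6, True))  # probable year promoted to year
print(division_post_f2(100, 0.0, True))  # non-year raised to 0.5
```

With non-year encoded as exactly 0.0 the formula always evaluates to 0.5, so a field that looked unrelated to dates becomes a probable year once it is seen in a leap-year style division.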
To illustrate the use of Table E, the following code listing for a leap year determination is provided. The leap year test is determined in Field 3.
______________________________________
SET Leap-Year EQUAL TO FALSE
DIVIDE 4 INTO Year GIVING Temp REMAINDER Leap-Year-Test
IF Leap-Year-Test EQUAL 0 THEN
    SET Leap-Year TO TRUE
    DIVIDE 100 INTO Year GIVING Temp REMAINDER Leap-Year-Test
    IF Leap-Year-Test EQUAL 0 THEN
        SET Leap-Year TO FALSE
        DIVIDE 400 INTO Year GIVING Temp REMAINDER Leap-Year-Test
        IF Leap-Year-Test EQUAL 0 THEN
            SET Leap-Year TO TRUE
            DIVIDE 4000 INTO Year GIVING Temp REMAINDER Leap-Year-Test
            IF Leap-Year-Test EQUAL 0 THEN
                SET Leap-Year TO FALSE
            END IF
        END IF
    END IF
END IF
______________________________________
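For readers less familiar with the DIVIDE ... REMAINDER form, the listing above can be transcribed into Python as follows. This is a direct translation of the listing, including its divisible-by-4000 exception; that exception comes from the listing itself and is not applied by Python's standard `calendar.isleap`. The function name `is_leap_year` is an assumption.

```python
def is_leap_year(year):
    """Nested remainder tests, mirroring the listing step by step."""
    leap = False
    if year % 4 == 0:                  # DIVIDE 4 ... REMAINDER tested for 0
        leap = True
        if year % 100 == 0:            # DIVIDE 100
            leap = False
            if year % 400 == 0:        # DIVIDE 400
                leap = True
                if year % 4000 == 0:   # DIVIDE 4000 (rule taken from the listing)
                    leap = False
    return leap


for y in (1996, 1900, 2000, 4000):
    print(y, is_leap_year(y))
```

Each `%` test corresponds to one division whose remainder is compared with zero, which is exactly the pattern Table E looks for.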
Association Table F is used when a move operator is present.
TABLE F
______________________________________
Move Operator
Initial Classification    Post Analysis Classification
Field_1     Field_2       Field_1           Field_2
______________________________________
Year        Year          Year              Year
Year        Probable      Year              Year
Year        Not Year      Year              Year
Probable    Year          Year              Year
Probable    Probable      MAX (F_1, F_2)    MAX (F_1, F_2)
Probable    Not Year      Not Year          Not Year
Not Year    Year          Not Year          Year
Not Year    Probable      Not Year          Not Year
Not Year    Not Year      Not Year          Not Year
______________________________________
Turning now to FIGS. 4a-4c, the detailed operation of the method of the invention will be described. Initially, as shown in FIG. 4a, memory space is allocated for operand table 26 and operator table 28 in program memory 14 (box 100). Thereafter, the data definition segment of application 16 (and any programs associated therewith) is scanned and each operand listed therein is identified, and its format is determined. From the label and its format, an initial classification is assigned, i.e., as to whether it initially appears to be a year field, a probable year field, or a non-year field (box 102). The flags in reclassification column 35 are reset. As described above, the initial classifications are accomplished through use of known methods of analysis. The results of the scan of the data definition segment are inserted into operand table 26 (box 104).
Thereafter, the procedure portion of application 16 and its associated programs is parsed; and every field with an operator is identified, as is the operator (box 106). Next (box 108), operator table 28 is built by listing every operator found in the procedure portion of the application, along with the operands identified therewith. The initial classifications of each operand from operand table 26 are also inserted into operator table 28. At this stage, operand table 26 and operator table 28 are complete, with operand table 26 including a listing of each operand and an initial field classification thereof, and operator table 28 including a listing of every operator in application 16 and its associated operands, along with initial classification indications thereof.
Turning to FIG. 4b, the procedure now turns to a refinement of the initial classification of each of the operands to arrive at a more accurate classification thereof. The procedure commences by examining all operators of a first type, e.g., move operators, and the associated operands found in operator table 28.
In the first portion of the procedure, operator entries whose Field 1 initial classification indicates a year classification are processed. Once all such operator entries of the first type have been examined, all entries of a next operator type with a Field 1 year classification are searched, etc., until all operator types and their associated operands have been analyzed. Thereafter, the procedure is repeated for operators having a Field 1 initial classification of probable year.
In further detail, operator table 28 is initially searched for a first entry therein of the chosen operator type with a Field 1 classification of "year" (box 110). Then (box 112), the operand association table for the operator is accessed and its "initial" classification column is searched for an entry whose operand classifications are identical to those of the operands associated with the chosen operator entry from operator table 28. When such an entry is found, the post analysis classification portion thereof is examined and, if a field reclassification is indicated (i.e., by an entry for an operand in a post-analysis classification column that is different from the initial classification portion), the indicated operand reclassification is entered in both operator table 28 and operand table 26 (box 116), the associated reclassification flag is set, and the procedure moves to box 118. There, steps 110, 112, 114 and 116 are repeated for each first operator type entry in operator table 28 having Field 1=year. Once no further operator entries of the first type remain to be analyzed, and only if no operand reclassifications were performed on the last iteration, does the procedure move to box 124. Otherwise (box 122), steps 110, 112, 114, 116 and 118 are repeated for each first operator type entry in operator table 28 having Field 1=year.
This repetition occurs because any reclassification of an operand can affect other related operands and may cause further reclassification actions on a succeeding iteration. Note from the above that each classification of an operand is dependent upon both the operator with which it is associated and other operands present in the same operator entry. Those operands may reside in the same program or may be from another program in the application. As a result of the steps shown in FIGS. 4a and 4b, there is a high probability that errors in the initial classification of operands associated with first operator entries will be corrected and that proper classifications will be reconfirmed.
Turning to FIG. 4c, steps 110-122 are now repeated for the next operator type with a Field 1=year, using an association table for the next operator type (box 124). Thereafter, when all operator types having a Field 1=year operand have been analyzed (box 126), the procedure repeats for all operator entries in operator table 28 having a Field 1 operand=probable year (box 128). This procedure will not discover more year classifications but will change the probabilities assigned to certain probable year entries to enable a more accurate decision to be made with respect to which entries require further analysis. In this regard, any probable year entry with a probability level below a set threshold value is listed in a "programmer attention" table for further analysis by the user.
The above-noted procedure enables identification of year fields with high probability, and thereafter enables each of those year fields to be automatically accessed and to have its year designation altered in such a manner as to remove the ambiguity which arises from the onset of the year 2000 and forward. A preferred method for revising the year format and for enabling the year conversion procedure is described in U.S. patent application Ser. No. 08/657,657 (attorney docket 835.0001 USU) entitled "Date Format and Date Conversion Procedure" to Brady. The disclosure of the aforementioned patent application is incorporated herein by reference.
Hereafter is a pseudo-code listing which carries out the procedure shown in the flow diagrams of FIGS. 4a-4c.
______________________________________
Loop Once On Each Operator (Move, Comparison, Subtraction, Addition, Division)
    Set Current_First_Table_Entry to the first First_Table_Entry marked as Year
    Loop Until Year_State_Change=False
        Set Year_State_Change to False
        Loop Until All Second_Table_Entries using the
                Current_First_Table_Entry's field label or aliases are processed
            Set Current_Second_Table_Entry to the next (first)
                Second_Table_Entry using the
                Current_First_Table_Entry's label or alias
            If Current_Second_Table_Entry_Operator=Current_Operator
                Then Process according to the corresponding operator table
                and if there is a change in a year's status,
                mark it as changed and
                Set Year_State_Change=True
            End if
        End Loop
    End Loop
    Comment: The following loop will not discover more years but will
    change the probability assigned to a probable year.
    Set Current_First_Table_Entry to the first First_Table_Entry marked as Probable year
    Loop Until Year_State_Change=False
        Set Year_State_Change to False
        Loop Until All Second_Table_Entries using the
                Current_First_Table_Entry's field label or aliases are processed
            Set Current_Second_Table_Entry to the next (first)
                Second_Table_Entry using the
                Current_First_Table_Entry's label or alias
            If Current_Second_Table_Entry_Operator=Current_Operator
                Then Process according to the corresponding operator table
                and if there is a change in a year's probability,
                mark it as changed and
                Set Year_State_Change=True
            End if
        End Loop
    End Loop
End Loop Once
______________________________________
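The repeat-until-no-change structure of the listing above can be sketched in Python as follows. This is a simplified illustration, not the patented procedure: it tracks labels only (so the MAX entry of Table F leaves two probable fields as probable), it uses only the move operator table, and the names `MOVE`, `refine`, and the sample field names are hypothetical.

```python
Y, P, N = "Year", "Probable", "Not Year"

# Table F (move operator), labels only.
MOVE = {
    (Y, Y): (Y, Y), (Y, P): (Y, Y), (Y, N): (Y, Y),
    (P, Y): (Y, Y), (P, P): (P, P), (P, N): (N, N),
    (N, Y): (N, Y), (N, P): (N, N), (N, N): (N, N),
}


def refine(operands, operators, tables):
    """Apply each operator's association table until no classification changes.

    operands:  dict of field name -> classification (the operand table)
    operators: list of (operator, field_1, field_2) (the operator table)
    tables:    dict of operator -> association table
    """
    for op_type, table in tables.items():
        for wanted in (Y, P):              # year entries first, then probable
            changed = True
            while changed:                 # repeat while the last pass changed something
                changed = False
                for op, a, b in operators:
                    if op != op_type or operands[a] != wanted:
                        continue
                    before = (operands[a], operands[b])
                    after = table[before]
                    if after != before:
                        operands[a], operands[b] = after
                        changed = True
    return operands


fields = {"BIRTH-YY": Y, "WS-TEMP": P, "WS-COPY": N}
moves = [("move", "WS-TEMP", "WS-COPY"), ("move", "BIRTH-YY", "WS-TEMP")]
print(refine(fields, moves, {"move": MOVE}))
```

In this example the first pass promotes WS-TEMP to a year field because it is moved together with BIRTH-YY; only on the second pass does WS-COPY follow, which is why the loop has to repeat after any reclassification.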
It should be understood that the foregoing description is only illustrative of the invention. Various alternatives and modifications can be devised by those skilled in the art without departing from the invention. Accordingly, the present invention is intended to embrace all such alternatives, modifications and variances which fall within the scope of the appended claims.
