Method and apparatus for cross-referencing text
6295542
Abstract
A method and apparatus for the automatic insertion of hypertext links into a passage or document of encoded text is disclosed. A program, resident on a personal computer, for example, receives and parses input text in HTML format. In a first part of the processing, label strings identifying each paragraph number are located in the read-in document. These are converted into an unambiguous format. Next, the text is re-read, with the paragraphs/section headers masked off, to locate text strings within the body of the text which cross-reference the section headers, or term definitions, or external links. These are also placed in an unambiguous format. Finally, the cross-references are matched up as far as possible with section/paragraph headers and the original HTML text is marked up automatically with hyperlinks, using the unambiguous section labels and cross-references as HTML anchors and destinations.
Claims
What is claimed is:
Description
FIELD OF THE INVENTION
TABLE 1
Input Pattern               Result
NUM/                        ignore
KEY NUM TEXT                ignore
NUM TAB=                    ignore
NUM TEXT)                   ignore
QTE NUM QTE TAB             ignore
NUM NUM TAB=                ignore
KEY NUM NUM TEXT            ignore
NUM TAB TEXT TAB            ignore
QTE NUM NUM QTE TAB         ignore
KEY NUM : CTEXT             Section number
KEY NUM NUM                 Section number
KEY NUM CTEXT               Section number
KEY NUM                     Section number
NUM) TAB NUM) TAB           Section number
NUM) NUM) QTE TEXT          Section number
NUM TAB TAB NUM) TAB        Section number
NUM TAB NUM) TAB            Section number
NUM) NUM) TEXT              Section number
NUM) NUM) CTEXT             Section number
NUM) NUM) TAB               Section number
NUM NUM NUM TAB             Section number
NUM TAB NUM TAB             Section number
NUM NUM) CTEXT              Section number
NUM NUM) TAB                Section number
NUM TAB TAB CTEXT           Section number
NUM NUM TAB                 Section number
NUM TAB                     Section number
NUM CTEXT                   Section number
(Note NUM is either of NUM, SA or RN defined above)
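As an illustration of how Table 1 might be applied, the following Python sketch matches the leading word types of a paragraph against an ordered pattern list and returns the first result that fits. Only a few rows of Table 1 are reproduced. The tuple representation, the first-match-wins ordering and the function name `classify_paragraph_start` are assumptions made for this example and are not stated in the text.

```python
# Hypothetical encoding of a few rows of Table 1: each pattern is a tuple of
# word types that must appear, in order, at the start of the paragraph.
PATTERNS = [
    (("KEY", "NUM", "TEXT"), "ignore"),
    (("NUM", "TAB", "TEXT", "TAB"), "ignore"),
    (("KEY", "NUM", "NUM"), "section number"),
    (("KEY", "NUM"), "section number"),
    (("NUM", "NUM", "TAB"), "section number"),
    (("NUM", "TAB"), "section number"),
    (("NUM", "CTEXT"), "section number"),
]


def classify_paragraph_start(word_types):
    """Return the result of the first pattern that prefixes word_types, else None."""
    for pattern, result in PATTERNS:
        if tuple(word_types[:len(pattern)]) == pattern:
            return result
    return None
```

With these assumed rows, a paragraph beginning KEY NUM TEXT is skipped, while one beginning NUM TAB CTEXT is treated as carrying a section number.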
Two further rules are applied. An RN is classified as an SA if it is next in the alphabetic sequence from the last recognized section number. This prevents incorrect recognition of, for example, "Section 6(i)" as "Section 6, Roman 1" if the previous section number was "Section 6(h)". Similarly, an SA is classified as an RN if the previously identified section number was an RN. This prevents, for example, "Section 6(v)" being recognized as "Section 6 letter v" when the previously identified section number was "Section 6(iv)".

Sequences of NUM, SA or RN are extracted from the paragraph following the above rules and concatenated together with a decimal point separator, as will be explained below. For example, the text in a paragraph might read "see sub-section 6.2(c)". Often, that subsection will itself be labelled only with the letter `c)` at its beginning. The anchor number assigned to it must distinguish between a `c)` in this section and in some other section or appendix. To do this, a document hierarchy is defined, similar to that specified in the outline file:

document.section.sub-section.sub-sub-section . . .
document.appendix.section.sub-section . . .
document.appendix.annex.section . . .

and so on. Each level is expressed in its full form (step 290). In this example section 6.2(c) will be:

6.2.c

where 6 is the section, 2 the sub-section and c is the sub-sub-section. Note from this example that the word used in a reference ("sub-section") rarely corresponds directly to the level actually addressed (here, sub-sub-section 6.2.c). Often the words section and paragraph are used interchangeably and this can lead to ambiguity when referring to paragraphs in an appendix and sections in the main part of the document. Local cross-referencing in an appendix (or other forms of attachment) usually does not include the name of the attachment.
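The two reclassification rules described above can be sketched as follows. This is a minimal Python illustration, not the program's actual code. The function names, the string tags "SA" and "RN", and the extra check that a label reclassified as RN consists only of roman-numeral letters are assumptions.

```python
ROMAN_LETTERS = set("ivxlcdm")  # assumed test for "could be a roman numeral"


def is_next_letter(prev_label, label):
    """True when both labels are single letters and label directly follows prev_label."""
    return (len(prev_label) == 1 and len(label) == 1
            and prev_label.isalpha() and label.isalpha()
            and ord(label.lower()) == ord(prev_label.lower()) + 1)


def reclassify(label, kind, prev_label, prev_kind):
    """Apply the two rules to a label first classified as kind ("SA" or "RN")."""
    # Rule 1: an RN that is next in the alphabetic sequence becomes an SA (h -> i).
    if kind == "RN" and prev_label and is_next_letter(prev_label, label):
        return "SA"
    # Rule 2: an SA that follows an RN becomes an RN (iv -> v).
    if kind == "SA" and prev_kind == "RN" and set(label.lower()) <= ROMAN_LETTERS:
        return "RN"
    return kind
```

Under these assumptions, "i" following "h" is kept as a letter, whereas "v" following "iv" is read as a roman numeral.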
III--Generation of Anchor Labels for Each Section
Based upon the above rules, the program starts at the beginning of the HTML file (or at the <!--++start--> comment), and generates a list of paragraph numbers in the above form. Passages marked not to be processed are skipped. An example of the section numbers seen towards the end of a document, and the standardized format they are placed into after recognition, is shown in Table 2. Here, the left-hand column contains the section numbers as seen by a user, and the right-hand column contains the encoded section number anchor labels:
TABLE 2
Section No. as actually     Anchor label generated
labelled in document        by program
Section 6                   6
6.1                         6.1
a)                          6.1.a
b                           6.1.b
(i)                         6.1.b.i
(ii)                        6.1.b.ii
6.2                         6.2
6.3(a)                      6.3.a
7                           7
Schedule A                  SchA
1.1                         SchA.1.1
a)                          SchA.1.1.a
Annex I                     SchA.AnxI
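The label generation shown in Table 2 can be sketched in Python as below. The sketch assumes that each heading has already been split into typed parts, and that the nesting order of the styles is fixed as attachment, annex, number, letter, roman numeral. In the described program the nesting comes from the outline file, so the `ORDER` list, the part names and the function `anchor_labels` are illustrative assumptions only.

```python
# Assumed nesting order of label styles, outermost first.
ORDER = ["attachment", "annex", "num", "alpha", "roman"]


def anchor_labels(entries):
    """Turn a sequence of headings into full dotted anchor labels.

    Each heading is a list of (kind, value) parts, e.g. [("num", "6.3"), ("alpha", "a")].
    """
    current = {}
    labels = []
    for parts in entries:
        for kind, value in parts:
            depth = ORDER.index(kind)
            # A new value at one level discards everything nested below it.
            for deeper in ORDER[depth + 1:]:
                current.pop(deeper, None)
            current[kind] = value
        labels.append(".".join(current[k] for k in ORDER if k in current))
    return labels


headings = [
    [("num", "6")], [("num", "6.1")], [("alpha", "a")], [("alpha", "b")],
    [("roman", "i")], [("roman", "ii")], [("num", "6.2")],
    [("num", "6.3"), ("alpha", "a")], [("num", "7")],
    [("attachment", "SchA")], [("num", "1.1")], [("alpha", "a")],
    [("annex", "AnxI")],
]
labels = anchor_labels(headings)
```

Run on the headings of Table 2, the sketch yields the right-hand column of that table, from "6" through "SchA.AnxI".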
In addition to generating a database of the Section numbers together with their full anchor labels, the corresponding first line of text may also be stored. This can then be used to generate an automatic index if desired.

IV--Recognition of Cross-References
Once the program has identified and generated anchor labels for the paragraph headings, it must then process the full text to mark cross-references within the body of the text with these anchor labels, so that hyperlinks may be created. This procedure will now be described with reference to steps 300-330 of FIG. 5.

It will be understood that truncating the paragraph for the purposes of locating section headings reduces the risk of cross-references in the body of the text accidentally being processed as part of a header. To prevent double processing, words identified as part of a header are reclassified (i.e., ring fenced) at step 310 so that the algorithm does not attempt to identify words already identified as parts of a header as cross-references as well.

The input paragraph from the HTML input file (at step 300) is read in and parsed using the same technique as in the identification of section numbering (described in I above). The read-in file is filtered to remove all HTML code. Only TEXT, NUM and PUNC are left after filtering. TEXT is then reclassified (as with the headings as described above). In addition to identifying roman and short alpha words, word patterns are sought to help identify where in the text cross-references may be found. The program looks in particular for key words ("KEY"), CONJUNCTIONs (cf. CONJ, which defines the "&" character) and PREPOSITIONs. These are defined as:

KEY: "clause", "paragraph", "section", "article", "schedule", "appendix", "annex", "table", "note", "part", "chapter", "sub-section", "subparagraph", "subclause", "exhibit", "directive", "condition", "attachment".
CONJUNCTION: "and", "or"
PREPOSITION: "of", "the", "to".

The words "Act" or "Rules" are special cases.
When these are located, all Capitalized words, CONJUNCTIONs and PREPOSITIONs, as well as words in parentheses, that precede the words "Act" or "Rules" are included. In the following, therefore, ACT is defined accordingly.

The program also records the different numbering styles. NUMBER is defined by:
(1) all digits (e.g. 12.5)
(2) lower case short alpha length 1 (SWL1): (e.g. a)
(3) lower case short alpha length 2 (SWL2): (e.g. aa)
(4) upper case short alpha length 1 (SWV1): (e.g. B)
(5) upper case short alpha length 2 (SWV2): (e.g. BB)
(6) roman lower case (ROMAN L): (e.g. vii)
(7) roman upper case (ROMAN V): (e.g. XI).

In addition to identification of internal cross-references, the program must also identify, separately, external references. These are labelled differently using the EXT tag. To do this correctly, PREPOSITIONs must be further parsed into "of", "to" and "the" respectively. The following rules are applied:

(d) a REF (either defined in (a) above or otherwise pre-tagged), followed by a PREPOSITION, followed by CTEXT (a word having initial capitalization as explained above). The PREPOSITION is further parsed into "of", "to" and "the". An EXT is only found if either "of" or "to" is present, and is followed by "the". For example: "Section 4(a) of the Housing Act".
(e) an EXT (either defined by (d) above or when an EXT has already been found and tagged) followed by a CTEXT, a NUMBER or a KEY.
(f) an EXT (again either defined by (d) above or a previously determined and tagged EXT) followed by a CONJUNCTION, followed by a PREPOSITION (specifically, "of" then "the") followed by another PREPOSITION (specifically, "the") followed by CTEXT.
(g) an EXT followed by a PREPOSITION (specifically, "of"), followed by a NUMBER, then CTEXT.

Having applied the above rules to identify strips, the start of a reference is identified as any word tagged as type KEY.
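To make the word classes and rule (d) concrete, the following Python sketch tags individual words and tests whether the words following a REF indicate an external reference. The word lists are taken from the definitions above. The helper names `tag_word` and `is_external_by_rule_d`, and the simplified tests for NUM and CTEXT, are assumptions; plural key words such as "sections" are not handled.

```python
KEY = {"clause", "paragraph", "section", "article", "schedule", "appendix",
       "annex", "table", "note", "part", "chapter", "sub-section",
       "subparagraph", "subclause", "exhibit", "directive", "condition",
       "attachment"}
CONJUNCTION = {"and", "or"}
PREPOSITION = {"of", "the", "to"}


def tag_word(word):
    """Assign a coarse word type to a non-empty word (NUM/CTEXT tests simplified)."""
    lower = word.lower()
    if lower in KEY:
        return "KEY"
    if lower in CONJUNCTION:
        return "CONJUNCTION"
    if lower in PREPOSITION:
        return "PREPOSITION"
    if word[0].isdigit():
        return "NUM"
    if word[0].isupper():
        return "CTEXT"
    return "TEXT"


def is_external_by_rule_d(words_after_ref):
    """Rule (d): after a REF, expect "of" or "to", then "the", then a CTEXT word."""
    if len(words_after_ref) < 3:
        return False
    first, second, third = words_after_ref[:3]
    return (first.lower() in {"of", "to"}
            and second.lower() == "the"
            and tag_word(third) == "CTEXT")
```

For the phrase "Section 4(a) of the Housing Act", the words after the REF "Section 4(a)" are "of", "the", "Housing", so the sketch reports an external reference.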
The end of a continuous sequence of tagged words following the KEY word is identified using a finite state machine whose state transition rules are defined in Table 3.
TABLE 3
CURRENT STATE    NEXT STATE
0-KEY 1-NUM;2-'('
1-NUM 0-KEY;1-NUM;2-'(';3-')';4-OF;5-CONJ;10-to;6-','
2-'(' 1-NUM;13-TEXT;13-CTEXT;13-inclusive
3-')' 1-NUM;2-'(';4-of;5-CONJ;5-to;6-',';8-this
4-of 0-KEY;1-NUM;7-det;9-EXT;9-CTEXT
5-CONJ 0-KEY;1-NUM;2-'('
6-',' 0-KEY;1-NUM;2-'(';5-CONJ
7-det 1-NUM;9-EXT;9-CTEXT;9-act
8-this 0-KEY
9-EXT 9-EXT;9-OF;14-'(';9-CTEXT;9-CONJ;9-act;15-NUM;15-'-'
10-to 0-KEY;1-NUM;2-'(';7-det;9-EXT;9-CTEXT;8-this
11 not used
12-act 12-act;16-'(';15-NUM;12-CONJ;12-of
13-'(' 1-')';13-{any other word type}
14-'(' 9-')';14-{any other word type}
15-NUM 15-NUM;0-KEY;12-of;12-act;15-'-';15-EXT
16-'(' 12-')';16-{any other word type}
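A transition table of this kind can be held as a dictionary and walked word by word. The Python sketch below encodes only states 0, 1, 2, 3, 5 and 6 of Table 3, and only those transitions that lead to one of these states. It assumes the words have already been reduced to the word types used in the table. The names `TRANSITIONS` and `end_of_reference` are hypothetical.

```python
# Subset of Table 3: state -> {type of the next word: next state}.
# States 4 and 7-16, and transitions into them, are omitted for brevity.
TRANSITIONS = {
    0: {"NUM": 1, "(": 2},                                        # 0-KEY
    1: {"KEY": 0, "NUM": 1, "(": 2, ")": 3, "CONJ": 5, ",": 6},   # 1-NUM
    2: {"NUM": 1},                                                # 2-'('
    3: {"NUM": 1, "(": 2, "CONJ": 5, "to": 5, ",": 6},            # 3-')'
    5: {"KEY": 0, "NUM": 1, "(": 2},                              # 5-CONJ
    6: {"KEY": 0, "NUM": 1, "(": 2, "CONJ": 5},                   # 6-','
}


def end_of_reference(word_types, start):
    """Return the index of the last word reachable from the KEY word at start."""
    state = 0
    index = start
    while index + 1 < len(word_types):
        next_state = TRANSITIONS[state].get(word_types[index + 1])
        if next_state is None:
            break  # no next state in the table: the reference ends here
        state = next_state
        index += 1
    return index
```

For a phrase such as "sections 5 and 7 ( b ) shall", tagged KEY NUM CONJ NUM ( NUM ) TEXT, the walk stops at the closing parenthesis, which is where the end of reference tag would be placed.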
Starting from such a KEY word, subsequent WORDs (defined above) are read until there is no next state in Table 3 above. In other words, the WORDs in the sequence following a KEY, REF or EXT are each read by the program, and when a WORD which is not covered by the state transition table of Table 3 is encountered, an end of reference (EOR) tag is placed on the last word in the sequence. There may be more than one reference in a paragraph. Using the above rules, the different reference patterns in a given paragraph are extracted and tagged separately, in turn.

V--Generation of Standard Form Cross-References from Reference Patterns
A phrase such as "Paragraphs 1, 2 and 4 of Part I and 3(a) and (c) of Part II of Schedule 6" should now have been identified using the techniques described in IV above. A phrase in this form is, however, still ambiguous and must be converted into a form which can be understood before cross-references can be hyperlinked to the appropriate paragraph/section.

To do this, the various parts (WORDS) of a given phrase are first labelled according to their generic family. For example, KEY words may be given upper case alphabetical labels, with P="Paragraph"; Q="Part"; S="Schedule". Similarly, different numbering styles may be given different lower case alphabetical character labels. For example:

x=NUMBER (as previously defined)
y=RN (as previously defined)
z=SA (as previously defined).

On this basis, the above example can be rewritten as "Px_1, x_2 and x_3 of Qy_1 and x_4 z_1 and z_2 of Qy_2 of Sx_5" where

x_1=1, x_2=2, x_3=4, x_4=3, x_5=6
y_1=I, y_2=II
z_1=a, z_2=c.

One common source of ambiguity is the word "to". Depending upon context, "to" may mean "of" or "and", e.g.: "Appendix 6 to Schedule 3"=Appendix 6 of Schedule 3 but "Sections 4 to 8 inclusive"=Sections 4 and 8. Thus, a further set of rules applied in sequence is needed. These are set out below in Table 4:
TABLE 4
Rule No:  Rule:
(1)   "x" → x (singleton)
(2)   "xyz" → (xyz)
      [Groups are formed, reading from left to right]
(3)   "x to y" → &(x,y)
      [The word "to" is interpreted as a Boolean AND.]
(4)   "x and y" → &(x,y)
(5)   "x or y" → &(x,y)
(6)   "(x y_1 z_1) and (y_2 z_2)"
      or "(x y_1 z_1) or (y_2 z_2)"
      or "(x y_1 z_1) to (y_2 z_2)" → x(y_1 z_1, y_2 z_2)
      [The program assumes that x refers to both parts that are separated by "and", "or" or "to"--for example "3a(i) and c(ii)" is interpreted as "3a(i) and 3c(ii)"]
(7)   "(x y) and/or/to (rs)" → &((x y), (rs))
      [If the first WORD in the first part is not also found in the second part after the PREPOSITION/CONJUNCTION, then it is assumed that the two parts are separate. For example: "3(a) and II(i)" are not understood as "3(a) and 3(II(i))"]
(8)   "x_1, x_2, &(x_3, x_4)" → &(x_1, x_2, x_3, x_4)
      [This rule governs lists]
(9)   "&(x_1, x_2) and x_3" → &(x_1, x_2, x_3)
(10)  "Px" → P(x)
      [Formation of prefix groups]
(11)  "P(x y_1 z_1) and/or/to P(y_2 z_2)" → P(x(y_1 z_1, y_2 z_2))
      [The document outline defines when Q is subordinate to P]
(13)  "P(x) Q(y) R(z) and Q(r) T(s)" → P(x) Q(yR(z), rT(s))
(14)  "x of Q" or "x to Q" → Qx
(15)  "P of Q" or "P to Q" → QP
(16)  "PQ of R" or "PQ to R" → RPQ
(17)  "PE" → E:P
      [E is an external reference]

The following structures are implied by the nomenclature employed above:

(x y z) is a tree structure:
x
|-y
  |-z

x(a,b) is a branch structure:
x
|-a
|-b

Thus, for example, P(x) Q(yR(z), rT(s)) represents:
[structure diagram STR1 not reproduced]

The rules are applied cyclically until only one root remains.

EXAMPLE 1
As an example, the reference pattern "paragraphs 1, 2 and 4 of Part I and 3(a) and (c) of Part II of Schedule 6" →
Px_1, x_2 and x_3 of Qy_1 and x_4 z_1 and z_2 of Qy_2 of Sx_5.
Then, from Rule (2) →
Px_1, x_2 and x_3 of Qy_1 and (x_4 z_1) and z_2 of Qy_2 of Sx_5
The next applicable rule is Rule (4):
Px_1, &(x_2, x_3) of Qy_1 and (x_4 z_1) and z_2 of Qy_2 of Sx_5
Under Rule (6):
Px_1, &(x_2, x_3) of Qy_1 and x_4(z_1, z_2) of Qy_2 of Sx_5
Rule (8):
P &(x_1, x_2, x_3) of Qy_1 and x_4(z_1, z_2) of Qy_2 of Sx_5
Rule (10):
P(x_1, x_2, x_3) of Q(y_1) and x_4(z_1, z_2) of Q(y_2) of S(x_5)
Rule (14):
P(x_1, x_2, x_3) of Q(y_1) and Q(y_2)x_4(z_1, z_2) of S(x_5)
Rule (15):
Q(y_1) P(x_1, x_2, x_3) and Q(y_2)x_4(z_1, z_2) of S(x_5)
Rule (16):
Q(y_1) P(x_1, x_2, x_3) and S(x_5) Q(y_2) x_4(z_1, z_2)

This expression still contains more than one root so the rules are re-applied from (1) onwards. It will be seen that the applicable rule is Rule (13):
Q(y_1) P(x_1, x_2, x_3) and S(x_5) Q(y_2) x_4(z_1, z_2) → S(x_5) Q(y_1 P(x_1, x_2, x_3), y_2 x_4(z_1, z_2))
END

The reference tree S(x_5) Q(y_1 P(x_1, x_2, x_3), y_2 x_4(z_1, z_2)) can be traversed to give five link anchors:
Sx_5.Qy_1.Px_1 ("Schedule 6.Part I.Paragraph 1")
Sx_5.Qy_1.Px_2 ("Schedule 6.Part I.Paragraph 2")
Sx_5.Qy_1.Px_3 ("Schedule 6.Part I.Paragraph 4")
Sx_5.Qy_2.Px_4.z_1 ("Schedule 6.Part II.Paragraph 3.a")
Sx_5.Qy_2.Px_4.z_2 ("Schedule 6.Part II.Paragraph 3.c")

It will be noted that, if y_1 were of type x, then ambiguity would exist, for which see below.
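The traversal of the reference tree obtained in Example 1 can be sketched as a depth-first walk that emits one dotted path per leaf. In this Python sketch the tree is written out by hand as nested (label, children) pairs. That representation and the function name `link_anchors` are assumptions made for illustration, and the rule-based reduction that produces the tree is not shown.

```python
def link_anchors(node, prefix=()):
    """Yield one dotted anchor per leaf of a (label, children) reference tree."""
    label, children = node
    path = prefix + (label,)
    if not children:
        yield ".".join(path)
    for child in children:
        yield from link_anchors(child, path)


# Reference tree of Example 1, written out by hand.
tree = ("Schedule 6", [
    ("Part I", [("Paragraph 1", []), ("Paragraph 2", []), ("Paragraph 4", [])]),
    ("Part II", [("Paragraph 3", [("a", []), ("c", [])])]),
])
anchors = list(link_anchors(tree))
```

The resulting list contains the five anchors given above, in the same order.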
EXAMPLE 2
"Clauses 6 or 7 of Annexes A, B and C"
→ "Cx_1 or x_2 of Ay_1, y_2 and y_3"
→ "Cx_1 or x_2 of Ay_1, &(y_2, y_3)" (Rule 4)
→ "C&(x_1, x_2) of Ay_1, &(y_2, y_3)" (Rule 5)
→ "C&(x_1, x_2) of A&(y_1, y_2, y_3)" (Rule 8)
→ "C(x_1, x_2) of A(y_1, y_2, y_3)" (Rule 10)
→ A(y_1, y_2, y_3) C(x_1, x_2) (Rule 15)

This may again be traversed to produce the 6 link anchors:
Ay_1.Cx_1 (AnnexA.Clause6)
Ay_1.Cx_2 (AnnexA.Clause7)
Ay_2.Cx_1 (AnnexB.Clause6)
Ay_2.Cx_2 (AnnexB.Clause7)
Ay_3.Cx_1 (AnnexC.Clause6)
Ay_3.Cx_2 (AnnexC.Clause7)

The outline file previously generated, which sets global rules for interpretation and minimizes ambiguities, is consulted to help identify local references in attachments and references elsewhere within the document. As will be seen from the outline file, the reference phrase "paragraph 7", when encountered in a paragraph in Appendix A, will be assigned the full address ".ApxA.7" provided that the entry "appendix.paragraph" is contained within the outline file. By default, the address locality is prefixed.

VI--Recognition of Defined Terms and the Creation of Destination Anchors for Them
In addition to the generation of links between section headings and references to them in the text, the program can also insert links between defined terms and their definitions. Any group of words enclosed in double quotes or mixed "66 99" style (curly) quotes is treated as a defined term. Its position in the file is recorded so that matching word groups in the text can be associated with it and links inserted. If the input text contains a mistake and only has an unpaired quote then a limit of ten consecutive words is taken as the quoted term and the term is ignored. Quoted terms are not allowed over paragraph boundaries.
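The detection of quoted terms within a single paragraph can be approximated with a regular expression, as in the Python sketch below. It is a simplification of the behaviour described above. An opening quote with no closing quote in the paragraph simply produces no match, and any quoted run longer than ten words is discarded. The pattern, the constant `MAX_WORDS` and the function name `defined_terms` are assumptions.

```python
import re

# A term enclosed in straight double quotes or in curly "66 99" quotes.
QUOTED = re.compile('["\u201c]([^"\u201c\u201d]+)["\u201d]')
MAX_WORDS = 10  # assumed cut-off, borrowed from the ten-word limit in the text


def defined_terms(paragraph):
    """Return (term, start offset) pairs for quoted terms found in one paragraph."""
    terms = []
    for match in QUOTED.finditer(paragraph):
        term = match.group(1).strip()
        if term and len(term.split()) <= MAX_WORDS:
            terms.append((term, match.start(1)))
    return terms
```

Because the function is called on one paragraph at a time, a quoted term can never span a paragraph boundary. The recorded offsets are what a later pass would use to place destination anchors.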
Further, the program does not look for definitions in recognized cross-reference phrases. The position in the text of the defined terms is found and link anchors are inserted between them and the definitions (Box 56 in FIG. 3).

Furthermore, automatic links may be established between a term considered to be undefined in the text, and a table of such undefined terms. Terms are usually characterised in contracts and agreements by words with initial capitalisation or all capitals. Where a single capitalised word not at the beginning of a sentence, or two or more capitalised words in sequence, are found and are not a defined term, they are classified as an undefined term. A table of such terms is generated, and links are established between that table (which may either be a separate document or may be appended to the document being read) and the undefined term located.

VII--Collation of References with Paragraph Numbers
This procedure is shown in the form of a flow diagram in FIG. 6 with steps 58 and 400-490. Once the whole HTML input file has been read, the reference addresses (tags) are collated with the section/paragraph tags (step 410). Some of the reference tags may not match the section/paragraph tags (step 420), and if so, they are compared with a list of entries in a "link database". This is a file containing associations between named anchors in the processed document and URLs to anchors elsewhere or also in the processed document. It can be provided before processing commences (at step 400 of FIG. 6) so that external links can be made on the first processing of the document, or with a post-processing step. The link database typically has a file format:

;comment
NAME URL
NAME URL NEWLINKTEXT
NAME !

where NAME is the paragraph anchor name, URL is a known uniform resource location (e.g. an external Internet address) and NEWLINKTEXT is (if specified) alternative text to replace text before the </A> HTML tag. If "!" is specified, then the link is ignored.
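A reader for a link database in this format might look like the following Python sketch. It assumes one entry per line with whitespace-separated fields, which is not stated explicitly above. The function name `parse_link_database` and the returned dictionary layout are likewise hypothetical.

```python
def parse_link_database(text):
    """Parse link database text into {NAME: (URL, NEWLINKTEXT or None)}.

    A value of None marks a link that is to be ignored ("NAME !").
    """
    links = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(";"):
            continue  # blank line or ;comment
        fields = line.split()
        name = fields[0]
        if len(fields) >= 2 and fields[1] == "!":
            links[name] = None
        elif len(fields) == 2:
            links[name] = (fields[1], None)
        elif len(fields) >= 3:
            links[name] = (fields[1], fields[2])
    return links
```

An unmatched reference tag would then be looked up by NAME in the returned dictionary before being written to the missing links file.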
Any references which are still unmatched (step 440) are written to a missing links file (step 450) and also to a connections file for post processing by another program. All external references are written to a separate external file as well as to this connections file.

Next, a list of all the paragraphs which reference a given paragraph is collated (step 58, again also referring to FIG. 3). In the following example, numbers between < and > are the paragraph anchor labels obtained as set out in II and III above:

<3.1>(3.1) Heading [Text]
<3.2>(3.2) Heading [Text . . . "as defined in <3.1>3.1 above" . . . ]
<4.6.a.i>4.6(a)(i) Heading [Text . . . "further details in 3.1 are . . . "]
Appendix A
<ApxA.II.c>PartII(c) Heading [Text . . . "the definitions in Section 4.6(a)(i) of the main body . . . "]

would cause the following table to be generated:
TABLE 5
Paragraph       Paragraph Anchor Label     Paragraph Anchor Label
Anchor Label    of those paragraphs        of those paragraphs
                referencing Col. 1         referencing Col. 2
                                           (indirect references)
. . .           . . .                      . . .
3.1             3.2                        ApxA.II.c
                4.6.a.i
3.2             NONE                       NONE
3.3             1.4                        NONE
                ApxA.i
. . .           . . .                      . . .
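A table of this shape can be derived by inverting the "paragraph references paragraph" relation once for the direct referrers and once more for the indirect ones. The Python sketch below does this for the 3.1 example given before Table 5. The input dictionary layout and the function name `reference_table` are assumptions made for illustration.

```python
def reference_table(references):
    """Build {target: (direct referrers, indirect referrers)} from {source: [targets]}."""
    referenced_by = {}
    for source, targets in references.items():
        for target in targets:
            referenced_by.setdefault(target, []).append(source)
    table = {}
    for target, direct in referenced_by.items():
        indirect = []
        for referrer in direct:
            # Paragraphs that reference a direct referrer are indirect references.
            for second in referenced_by.get(referrer, []):
                if second not in indirect:
                    indirect.append(second)
        table[target] = (direct, indirect)
    return table


# Each paragraph anchor mapped to the anchors it refers to.
references = {
    "3.2": ["3.1"],
    "4.6.a.i": ["3.1"],
    "ApxA.II.c": ["4.6.a.i"],
}
table = reference_table(references)
```

Here paragraph 3.1 is referenced directly by 3.2 and 4.6.a.i, and indirectly by ApxA.II.c, which matches the 3.1 row of Table 5. Paragraphs that nothing refers to, such as 3.2, get no entry in this sketch.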
The dependencies tree set out in Table 5 is written to a further file, with the indirect references (see Table 5, column 3 and step 470 of FIG. 6) also listed. To save file space, paragraphs that are not referenced elsewhere are preferably not listed. To address problems of circularity (that is, where clause A, for example, uses clause B as a definition and clause B then uses clause A as a definition), a list of such circular clauses is also output (step 480).

VIII--Insertion of Link HTML Tags
The paragraphs/sections have been identified and named as explained in I, II and III above, and the cross-references have been identified and tagged as explained in IV and V. Having generated the various information files as set out in VII, the positions of the start and end of the reference phrases and of the term definitions (VI above) and their uses are recorded. When the output file is written the appropriate HTML anchors are written at those positions (step 490).

The paragraph address is written on a line before the original text. It forms a hyperlink to the contents list of the document. Also on this line is a list of links to clauses which reference this clause (backward dependent references).

The cross-reference links are written after the reference phrase, rather than using the phrase as the text of the link itself, because the phrase can refer to multiple targets. For example, using the text of the phrase "sections 5 and 7(b)" as the link would be inappropriate as there are two targets. Instead the text will be augmented to read "sections 5 and 7(b) [5] [7.b]" with the "[5]" and "[7.b]" forming the text of two separate hyperlinks.

Finally, once the text has been marked up, an index file (FIG. 2) is generated. This holds the (optional) table of contents, as well as any diagnostic information obtained during the processing of the original (input) HTML file.
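The augmentation of a reference phrase with one bracketed hyperlink per target can be sketched in Python as follows. The exact markup written by the program is not given in the text, so the `<A HREF="#...">` form used here, with the anchor label as the fragment identifier, is an assumption, as is the function name `append_reference_links`.

```python
from html import escape


def append_reference_links(phrase, anchors):
    """Return the phrase followed by one bracketed hyperlink per target anchor."""
    links = " ".join(
        '<A HREF="#{0}">[{0}]</A>'.format(escape(anchor, quote=True))
        for anchor in anchors
    )
    return "{} {}".format(phrase, links) if links else phrase


marked_up = append_reference_links("sections 5 and 7(b)", ["5", "7.b"])
```

The original wording of the phrase is left untouched, so a reference with several targets stays readable while each target receives its own link.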
The index file is preferably linked to the original HTML file, as well as to other utility programs such as a spell checker/thesaurus, a search tool and to the other generated files containing the list of missing references, external references, term definitions, undefined terms and circular references.

Depending upon how the output file format 60 (FIGS. 2 and 3) has been defined by the user, the output file 60 may either consist of one file or a plurality of separate files numbered sequentially, e.g. "part1.htm", "part2.htm" etc. Each separate file relates to a corresponding top level section of the contract, for example.

The purpose of producing an output file 60 with a list of references which are not linked to a known paragraph anchor name either in the main document or to external publications is to allow the user to edit this file. This allows associations between unconnected links and their actual URLs to be defined. The user can also delete output files and list files found in error in the document. The format of the missing references file is typically:

;comment
NAME URL                ;anchor NAME href replaced with URL, anchor text unchanged
NAME URL NEWLINKTEXT    ;anchor NAME href replaced with URL, anchor text replaced with NEWLINKTEXT
NAME URL NEWLINKTEXT*   ;anchor NAME href replaced with URL, anchor text replaced with NEWLINKTEXT, entry in missing or external files removed
NAME URL*               ;anchor NAME href replaced with URL, anchor text unchanged, entry in missing or external files removed
NAME !                  ;anchor text removed, NAME entry in missing or external files removed

Note that there should be no spaces in the URL and NEWLINKTEXT strings.

The program can be re-run, at which point manual URL amendments are read (at 510, see FIG. 3). Thus, a new set of output files is generated, which files have the corrected links. Entries in the "missing links" and "external" files (see VII) are automatically removed if the "*" or "!" codes are specified.
Although the foregoing has described a technique for linking cross-references to section headings, or defined terms to their definition, within a single document, it will be understood that the technique is equally applicable to a suite of documents located upon the same computer 10, or indeed to a suite of documents located across a Local Area Network (a LAN), or a wide area network (a WAN) including the Internet 70. The anchor labels have a further hierarchy of labelling in such a case, to identify which of the suite of documents is being referred to. For example, an anchor label may read "Doc1:6.3.a.i" or "Doc3:ApxA.7.3". Note that a colon separator is used between the document identifier tag (Doc1 and Doc3 in this example) and the intra-document reference tags. This allows the program to associate a remote file with the label "Doc1", for example, then if present, look in a separate anchor index file to find the actual file in which that section (6.3.a.i) will be found. Whenever a contract, for example, is processed, an anchor index file is generated containing the section number link anchor and the name of the file it was found in, so that other contracts can be linked to it.

Whilst the foregoing has been described in terms of HTML, it will be understood that the method could be applied to word processor files in any format. Preferably, input word processor files in other formats would be pre-processed into HTML format to facilitate cross-referencing, but this is not essential. Furthermore, the skilled reader will understand that the techniques disclosed for linking sections within the same document can readily be extended to two or more documents at different locations on a local area network or even documents at different locations across a wide area network. The scope of invention is therefore to be limited only by the following claims.
