Method for extracting knowledge from online documentation and creating a glossary, index, help database or the like
6212494
Abstract
A method involving computer-mediated linguistic analysis of online technical documentation to extract and catalog from the documentation knowledge essential to, for example, creating an online help database useful in providing online assistance to users in performing a task. The method comprises stripping markup tags from the documentation, linguistically analyzing and annotating the text, including the steps of morphologically and lexically analyzing the text, disambiguating between possible parts-of-speech for each word, and syntactically analyzing and labeling each word. The method further comprises the steps of combining the linguistically analyzed, annotated, and labeled text and previously stripped markup information into a merged file, mining the merged file for domain knowledge, including the steps of identifying and creating a list of technical terminology, mining the merged file for manifestations of domain primitives and maintaining a list of manifestations of such domain primitives in an observations file, analyzing the discourse context of each sentence or phrase in the merged file, analyzing the frequency of manifestations of domain primitives in the observations file to determine those that are important, expanding the list of key terms by searching for terms sanctioned by a domain primitive deemed important in the previous step, and searching the merged file for larger relations by searching for particular lexico-syntactic patterns involving key terms and manifestations of domain primitives previously identified. The method further comprises the steps of structuring the knowledge thus mined and building a domain catalog.
Claims
I claim:
Description
BACKGROUND OF THE INVENTION
uses -> use
ground -> ground
ground -> grind
A morphological processor further looks at the formation of each word in the document and attempts to perform a mapping of a derivational word (i.e., a word modified from its base form as by the addition of a noninflectional affix) to its base form, for example:
reinitialize -> initialize
reinitializing -> initialize
The ability to perform morphological processing enables the present invention to, for example, derive: [initialize] [disk] from "reinitializing the disk". The morphological processor further returns the possible part-of-speech tags for a word, for example:
uses -> use+[NounPlural]
uses -> use+[Verb]
ground -> ground+[Verb]
ground -> ground+[NounSingular]
ground -> grind+[VerbPast]
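The mapping from a surface form to its possible base forms and part-of-speech tags can be sketched as a simple table lookup. The following is only a minimal illustration built from the "ground" readings listed above; the table layout and the names `READINGS` and `possible_readings` are assumptions made for this sketch and are not part of the described morphological processor.

```python
# Toy table built only from the "ground" readings listed above; a real
# morphological processor would consult a full lexicon and affix rules.
# The table layout and the function name are illustrative assumptions.
READINGS = {
    "ground": [
        ("ground", "Verb"),
        ("ground", "NounSingular"),
        ("grind", "VerbPast"),
    ],
}


def possible_readings(word: str) -> list[str]:
    """Return every 'base+[Tag]' reading recorded for a surface form."""
    return [f"{base}+[{tag}]" for base, tag in READINGS.get(word.lower(), [])]


if __name__ == "__main__":
    for reading in possible_readings("ground"):
        print(f"ground -> {reading}")
```

A word missing from the table simply yields an empty list, which leaves the choice among the returned readings to a later disambiguation step.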
Part of Speech Tagger (For Lexical Disambiguation)
The primary function of such a component is to disambiguate among sets of part-of-speech annotations, i.e., syntactic tags. For example, while every content word in the phrase "to display files, view by size" would be lexically analyzed and marked both as a noun and a verb, local syntactic context is sufficient to disambiguate between the individual parts-of-speech:
to/[Inf] display/[Verb] files/[NounPlural] view/[Verb] by/[Prep] size/[Noun]
Shallow Syntactic Analyzer
Syntactic analysis (parsing) is the process of resolving a sentence into component parts of speech, describing them grammatically, and identifying structural relationships between words and phrases in a sentence. For example, a noun phrase might be made up from a determiner followed by a noun; a verb phrase might be identified as a verb optionally followed by a noun phrase, and a possible sentence structure might be a noun phrase functioning as the subject, followed by a verb phrase, wherein the verb is the main verb of the sentence, and the noun phrase within the verb phrase is the object.
Presently, full syntactic analysis over real instances of text is not feasible for a number of reasons, including the complexity of the parsing process, the high degree of lexical ambiguity, failure to cope robustly with unfamiliar input items, and inadequate coverage of existing grammatical descriptions of natural language. However, present technology does make it possible, on the basis of locally defined rules for syntactically allowed contexts, to perform a shallow form of syntactic analysis in which certain linguistic annotations of text are possible. Assuming part-of-speech disambiguation analysis has already been performed by, for example, the part-of-speech tagger described above, valid sequences of syntactic tags can be identified; for example, the grammatical sequence [[Det] [Adj]] is highly common, in contrast to [[Adj] [Det]].
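The use of local syntactic context to choose among candidate tags, as described above for the part-of-speech tagger, can be illustrated with a small sketch. The rule table below is a hypothetical, hand-written assumption chosen only to reproduce the tagging of "to display files, view by size"; it does not represent the rules of any actual tagger, and the `Comma` tag and the function name are likewise invented for the illustration.

```python
NOUN_TAGS = {"Noun", "NounPlural"}

# Hypothetical local-context rules: tag of the previous token -> tags allowed next.
ALLOWED_AFTER = {
    "Inf": {"Verb"},      # an infinitive marker is followed by a verb
    "Verb": NOUN_TAGS,    # a verb is followed by its (noun) object
    "Prep": NOUN_TAGS,    # a preposition is followed by a noun
    "Comma": {"Verb"},    # assumption: a new imperative clause starts here
}


def disambiguate(tokens):
    """tokens: list of (word, set of candidate tags). Returns a list of (word, tag)."""
    result = []
    previous_tag = None
    for word, candidates in tokens:
        allowed = ALLOWED_AFTER.get(previous_tag)
        remaining = (candidates & allowed) if allowed else candidates
        if not remaining:
            # The rules eliminated every reading; fall back to all candidates.
            remaining = candidates
        tag = sorted(remaining)[0]  # deterministic pick if still ambiguous
        result.append((word, tag))
        previous_tag = tag
    return result


PHRASE = [
    ("to", {"Inf"}),
    ("display", {"Noun", "Verb"}),
    ("files", {"NounPlural", "Verb"}),
    (",", {"Comma"}),
    ("view", {"Noun", "Verb"}),
    ("by", {"Prep"}),
    ("size", {"Noun", "Verb"}),
]

if __name__ == "__main__":
    print(" ".join(f"{word}/[{tag}]" for word, tag in disambiguate(PHRASE)))
```

The sketch looks only at the immediately preceding tag, which is the simplest possible notion of local context; a practical tagger would use a much richer set of constraints.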
Furthermore, it is possible to associate the words in a sentence with the syntactic functions they play within the particular context ("@Subject", "@Object", "@Complement-Modifier", etc.), as well as indicate the structural constraints between words. For example, the [Adj] preceding a [Noun] is dominated by that noun; in a sequence of two or more [Noun]s, the rightmost one acts as the head; etc. Shallow syntactic analysis differs from full syntactic analysis in that a complete parse tree representation is not constructed: phrase boundaries are not identified, nor are relationships between phrases recovered. However, individual lexical items are assigned, where appropriate, syntactic functions. For example, as a result of processing the sentence
"The application requires the use of a separate type of layout window for modifying user templates."
"application" would be analyzed as the main subject; "layout" would be tagged as a [Noun] and associated as a dependent (premodifier) to "window"; both "window" and "templates" would be identified as nouns in complement positions, with "use" and "modify" marked as the dominant heads to which, respectively, those nouns act as direct objects. The significance of being able to identify these relationships will be discussed below.
Thus, because natural language is highly complex and ambiguous, full syntactic analysis for an entire language is impossible given present technology. Shallow syntactic analysis, however, is possible. While not developing a complete parse tree, shallow syntactic analysis attempts to identify and generate a pointer to different structures in a sentence, including, but not limited to, for example, a subject, verb, object, complement, adjunct, etc.
DETAILED DESCRIPTION
Referring now to FIG. 3, a detailed description of an embodiment of the present invention follows.
Linguistic Analysis Stage
A data file stored on, for example, mass storage device 206 and containing technical documentation may have been created by any one of a number of commercially available desktop publishing or word processing software applications. The internal representation of the data file is, for the most part, governed by the software application that created the file, e.g., Microsoft Word. Commonly, the various desktop publishing or word processing software applications have their own proprietary internal data representation for keeping track of the various features of a document, e.g., the typographical, visual, and layout characteristics of the document. Natural language processing software cannot adequately deal with the arbitrary format of documents created by different software applications. Thus, the present invention assumes a uniform framework for representing, storing and accessing the document in a way which preserves the majority of typographical, visual and layout information in the data file containing the document. This is accomplished by mapping, or exporting, the document into a stream of ASCII text to which the natural language technologies of the present invention can be applied. This prerequisite is fulfilled according to application-specific means outside the scope of the present invention. Essentially, this means that word processing or desktop publishing application software must create an ASCII-based representation of the internal data format. In doing so, typically all that occurs is that the internal representation of the file is converted from binary format to ASCII format; markup tags providing information regarding document structure may still be in a proprietary format, e.g., Microsoft Word Rich Text Format (RTF).
Furthermore, in addition to extracting ASCII text from, e.g., a Microsoft Word file containing a document having a proprietary internal representation for document structure (a process which will yield a text corpus), it is essential when exporting the document, for reasons discussed below, that this text corpus retain the markup information contained therein concerning the logical and physical structure of the document. Not all markup information is necessarily important. The key in retaining markup information is to strip formatting information that is not important but maintain that which is, along with the text to which it applies. Markup information is information in the form of tags interspersed throughout the document which is used to (conceptually) drive a typesetting machine. To the extent there is important textual information in a sentence, there is equally important information in the way the text is visually organized and presented on a page. As will be seen, the fact that a phrase appears in a subject or chapter heading makes it much more important than if it were embedded in the middle of a long paragraph of text, and thus more likely to merit incorporation in the domain catalog. Unlike prior art text processing technologies, the present invention seeks to appreciate the context, in linguistic, document layout and structure terms, in which a phrase, e.g., a noun phrase, appears, and thus, markup information should be maintained in the ASCII text stream.
At step 301 in the cascade of individual linguistic processing modules, the present invention translates the data file containing the ASCII text of a document having a proprietary internal representation for document structure (created by application software outside the scope of the present invention as discussed above) to a data file containing the ASCII text of a document having a standard internal representation for document structure, which may, for example, generally conform to SGML (Standard Generalized Markup Language). The purpose of this step is to provide a standard internal representation for document structure information (i.e., markup tags) in the file containing the document, one which is understood by the natural language processes of the present invention. Thus, subsequent modules in the cascade need only understand one standard file format. An example of a standard file format is set forth below, hereinafter referred to as example A. It can be seen that markup tags containing information such as the beginning and ending of chapter headings, lists and the items listed therein, subsections, paragraphs, and different text typefaces, such as bold or italics, are interspersed throughout the ASCII text. The example is taken from a portion of an online copy of the Apple Macintosh Reference guide, Chapter 1. Additional references herein related to this text document represent output generated at various stages of the cascade of linguistic processing modules using the same source of technical documentation.
</chapter>Setting Up Your Programs </echapter>
</para> This chapter describes how to set up the programs that you use when
you work with your computer. </epara>
</section> Installing your application programs </esection>
</para> Most application programs come on floppy disks, and you install them
by copying them from the floppy disks to your hard disk. Some programs have
special installation instructions. See the documentation that came with your
programs. </epara>
</para> To use your programs most effectively: </epara>
</list>
</item> Put only one copy of each program on your hard disk. Having more than
one copy can cause errors. </eitem>
</item> Whenever you copy a program disk to your hard disk, be careful not to
copy a System Folder. Always check to see what you've copied, and drag any
extra System Folders to the Trash. </eitem>
</item> If a program malfunctions consistently, try installing a fresh copy.
If that does not help, find out from the software manufacturer whether your
version of the program is compatible with the system software you're using.
</eitem>
</item> Put frequently used programs (or aliases for those programs) in the
Apple menu so you can open the programs more conveniently. See Chapter 5,
"Adapting Your Computer to Your Own Use." </eitem>
</item> To open a program automatically each time you start up, you can put
the program (or its alias) into the Startup Items folder. See Chapter 5,
"Adapting Your Computer to Your Own Use." </eitem>
</elist>
EXAMPLE A
As part of this translation process, visual characteristics of the text are mapped to the logical function that the characteristics perform, e.g., red text may indicate a chapter heading, bold text may mean a subsection heading, a string of 12-point Helvetica text may indicate a paragraph of text. This logical function information representing markup information is retained in the form of a tag at the beginning of each record in the data file. Maintaining such logical function information entails, for example, identification of chapter, section, subsection and other headings, as well as parsing of lists and sublists. To reiterate, the rationale behind this is not only that, for example, section and subsection headings are good places to identify technical terms, but more interestingly, the structure of a running discourse of technical text is itself quite revealing with respect to offering clues to information that describe the domain or application to which the content of the document is directed. For example, definitions of terms are typically found at the beginning of introductory paragraphs, section units typically are concerned with describing the functionality of closely related components, and phrases that are emphasized (e.g., by bold or italic font) are clearly important.
Like step 301, step 302 is primarily a preparatory step in anticipation of step 303 and later steps in the cascade of linguistic processes. Linguistic analysis of text at step 303 assumes the document contains only ASCII text. The markup tags, therefore, must be stripped, as demonstrated below in example B:
Setting Up Your Programs **
This chapter describes how to set up the programs that you use when
you work with your computer.
Installing your application programs **
Most application programs come on floppy disks, and you install them
by copying them from the floppy disks to your hard disk. Some programs have
special installation instructions. See the documentation that came with your
programs.
To use your programs most effectively:
Put only one copy of each program on your hard disk. Having more than
one copy can cause errors.
Whenever you copy a program disk to your hard disk, be careful not to
copy a System Folder. Always check to see what you've copied, and drag any
extra System Folders to the Trash.
If a program malfunctions consistently, try installing a fresh copy.
If that does not help, find out from the software manufacturer whether your
version of the program is compatible with the system software you're using.
Put frequently used programs (or aliases for those programs) in the
Apple menu so you can open the programs more conveniently. See Chapter 5,
"Adapting Your Computer to Your Own Use."
To open a program automatically each time you start up, you can put
the program (or its alias) into the Startup Items folder. See Chapter 5,
"Adapting Your Computer to Your Own Use."
EXAMPLE B
However, as was previously mentioned, the markup information is subsequently used by the present invention, so it is not discarded, but saved in a temporary file and merged back into the text stream at a later step in the cascade, as will be discussed below.
The ASCII text free of markup information produced at step 302 is next analyzed with respect to its lexical and morphological content at step 303. Each word is annotated to include its lexical and morphological features, including a part-of-speech tag for each morphological context, and a possible syntactic label obtained by way of shallow syntactic analysis. For example, with reference to FIG. 4, text phrase 401, after analysis and annotation, appears as annotated phrase 400. Below is an example, hereinafter referred to as example C, of such analysis and annotation as performed on the ASCII text provided in example B. Each record comprises a word of text and its annotations. Each word appears in its original form as used in the document and in its base form, both in double quotes. Lexical annotations follow and are enclosed in < >. The morphological annotation follows, in uppercase. Where more than one possible part-of-speech tag exists, each tag is shown on a separate row. For example, the word "set" in the sentence "this chapter describes how to set up the programs that you use when you work with your computer", as set forth in the example below, has 6 possible part-of-speech tags: it may be interpreted as, among other things, a past tense verb in finite form (V PAST VFIN), a normal present tense, non third person singular finite verb (V PRES -SG3 VFIN), and a verb in its infinitive form (V INF). The possible syntactic function is provided in the form of a syntactic label, if present, and is the last annotation affixed to each word.
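The record layout just described (base form in double quotes, lexical annotations in angle brackets, uppercase morphological annotations, and a trailing syntactic label beginning with "@") can be read back with a few lines of code. The following is only a sketch under the assumption that each reading occupies one line in the form used by example C; the function name and the dictionary keys are hypothetical.

```python
import re


def parse_reading(line: str) -> dict:
    """Split one reading line into its base form and annotation groups."""
    match = re.match(r'\s*\(?\s*"([^"]*)"(.*)', line)
    if match is None:
        raise ValueError(f"not a reading line: {line!r}")
    base, rest = match.group(1), match.group(2)
    # Parentheses only group the record; drop them before tokenizing.
    tokens = rest.replace("(", " ").replace(")", " ").split()
    reading = {"base": base, "lexical": [], "morphological": [], "syntactic": []}
    for token in tokens:
        if token.startswith("@"):
            reading["syntactic"].append(token)       # e.g. @+FMAINV
        elif token.startswith("<") and token.endswith(">"):
            reading["lexical"].append(token)         # e.g. <SVO>
        else:
            reading["morphological"].append(token)   # e.g. V, PRES, SG3
    return reading


if __name__ == "__main__":
    sample = '("describe" <as/SVOC/A> <SVO> V PRES SG3 VFIN (@+FMAINV) ))'
    print(parse_reading(sample))
```

Tokens such as `INFMARK>` end in ">" but do not begin with "<", so this sketch files them with the morphological annotations rather than with the lexical ones.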
("<*setting>"
("set" <*> <SVOC/A> <SVO> <SVOO> <SV> <P/on> PCP1 ))
("<*up>"
("up" <*> PREP)
("up" <*> ADV ADVL (@ADVL)))
("<*your>"
("you" <*> PRON PERS GEN SG2/PL2 (@GN> ) ))
("<*programs>"
("program" <*> <SVO> V PRES SG3 VFIN (@+FMAINV))
("program" <*> N NOM PL))
("<$HEAD>")
("<*this>"
("this" <*> DET CENTRAL DEM SG (@DN>))
("this" <*> ADV AD-A> (@AD-A>))
("this" <*> PRON DEM SG ))
("<chapter>"
("chapter" N NOM SG))
("<describes>"
("describe" <as/SVOC/A> <SVO> V PRES SG3 VFIN (@+FMAINV) ))
("<how>"
("how" <**CLB> ADV WH ))
("<to>"
("to" PREP)
("to" INFMARK> (@INFMARK>) ))
("<set>"
("set" <SVOC/A> <SVO> <SVOO> <SV> <P/on> PCP2)
("set" <SVOC/A> <SVO> <SVOO> <SV> <P/on> V PAST VFIN (@+FMAINV))
("set" <SVOC/A> <SVO> <SVOO> <SV> <P/on> V SUBJUNCTIVE VFIN (@+FMAINV))
("set" <SVOC/A> <SVO> <SVOO> <SV> <P/on> V IMP VFIN (@+FMAINV))
("set" <SVOC/A> <SVO> <SVOO> <SV> <P/on> V INF)
("set" <SVOC/A> <SVO> <SVOO> <SV> <P/on> V PRES -SG3 VFIN (@+FMAINV))
("set" N NOM SG))
("<up>"
("up" PREP)
("up" ADV ADVL (@ADVL)))
("<the>"
("the" <Def> DET CENTRAL ART SG/PL (@DN>) ))
("<programs>"
("program" <SVO> V PRES SG3 VFIN (@+FMAINV) )
("program" N NOM PL))
("<that>"
("that" <**CLB> CS (@CS) )
("that" DET CENTRAL DEM SG (@DN>) )
("that" ADV AD-A> (@AD-A>) )
("that" PRON DEM SG )
("that" <NonMod> <**CLB> <Rel> PRON SG/PL))
("<you>"
("you" <NonMod> PRON PERS NOM SG2/PL2)
("you" <NonMod> PRON PERS ACC SG2/PL2))
("<use>"
("use" N NOM SG)
("use" <as/SVOC/A> <SVO> <SV> V SUBJUNCTIVE VFIN (@+FMAINV) )
("use" <as/SVOC/A> <SVO> <SV> V IMP VFIN (@+FMAINV) )
("use" <as/SVOC/A> <SVO> <SV> V INF )
("use" <as/SVOC/A> <SVO> <SV> V PRES -SG3 VFIN (@+FMAINV) ))
("<when>"
("when" <**CLB> ADV WH (@ADVL) ))
("<you>"
("you" <NonMod> PRON PERS NOM SG2/PL2)
("you" <NonMod> PRON PERS ACC SG2/PL2))
("<work>"
("work" <SV> <SVO> <P/in> <P/on> V SUBJUNCTIVE VFIN (@+FMAINV) )
("work" <SV> <SVO> <P/in> <P/on> V IMP VFIN (@+FMAINV) )
("work" <SV> <SVO> <P/in> <P/on> V INF )
("work" <SV> <SVO> <P/in> <P/on> V PRES -SG3 VFIN (@+FMAINV) )
("work" N NOM SG ))
("<with>"
("with" PREP ))
("<your>"
("you" PRON PERS GEN SG2/PL2 (@GN>) ))
("<computer>"
("computer" <DER:er> N NOM SG ))
("<$.>")
("<*installing>"
("instal" <*> <SVO> PCP1 ))
("<your>"
("you" PRON PERS GEN SG2/PL2 (@GN>) ))
("<application>"
("application" N NOM SG ))
("<programs>"
("program" <SVO> V PRES SG3 VFIN (@+FMAINV) )
("program" N NOM PL ))
("<$HEAD>")
("<*most>"
("much" <*> ADV SUP)
("much" <*> <Quant> PRON SUP SG)
("much" <*> <Quant> DET POST SUP SG (@QN>))
("many" <*> <Quant> PRON SUP PL)
("many" <*> <Quant> DET POST SUP PL (@QN>)))
("<application>"
("application" N NOM SG ))
("<programs>"
("program" <SVO> V PRES SG3 VFIN (@+FMAINV) )
("program" N NOM PL ))
("<come>"
("come" <SVC/A> <SV> <P/for> PCP2)
("come" <SVC/A> <SV> <P/for> V SUBJUNCTIVE VFIN (@+FMAINV) )
("come" <SVC/A> <SV> <P/for> V IMP VFIN (@+FMAINV) )
("come" <SVC/A> <SV> <P/for> V INF)
("come" <SVC/A> <SV> <P/for> V PRES -SG3 VFIN (@+FMAINV) ))
("<on>"
("on" PREP)
("on" ADV ADVL (@ADVL ) ))
("<floppy_disks>"
("floppy_disk" N NOM PL ))
("<$,>")
("<and>"
("and" CC (@CC ) ))
("<you>"
("you" <NonMod> PRON PERS NOM SG2/PL2 )
("you" <NonMod> PRON PERS ACC SG2/PL2 ))
("<install>"
("install" <SVO> V SUBJUNCTIVE VFIN (@+FMAINV) )
("install" <SVO> V IMP VFIN (@+FMAINV) )
("install" <SVO> V INF)
("install" <SVO> V PRES -SG3 VFIN (@+FMAINV) ))
("<them>"
("they" <NonMod> PRON PERS ACC PL3 ))
("<by>"
("by" PREP )
("by" ADV ADVL (@ADVL) ))
("<copying>"
("copy" <SVO> <SV> <P/of> PCP1 ))
("<them>"
("they" <NonMod> PRON PERS ACC PL3 ))
("<from>"
("from" PREP ))
("<the>"
("the" <Def> DET CENTRAL ART SG/PL (@DN>) ))
("<floppy_disks>"
("floppy_disk" N NOM PL ))
("<to>"
("to" PREP)
("to" INFMARK> (@INFMARK>) ))
("<your>"
("you" PRON PERS GEN SG2/PL2 (@GN>) ))
("<hard_disk>"
("hard_disk" N NOM SG ))
("<$.>")
("<*some>"
("some" <*> <Quant> DET CENTRAL SG/PL (@QN>) )
("some" <*> ADV )
("some" <*> <NonMod> <Quant> PRON SG/PL ))
("<programs>"
("program" <SVO> V PRES SG3 VFIN (@+FMAINV) )
("program" N NOM PL ))
("<have>"
("have" <SVO> <SVOC/A> V SUBJUNCTIVE VFIN (@+FMAINV) )
("have" <SVO> <SVOC/A> V PRES -SG3 VFIN)
("have" <SVO> <SVOC/A> V INF )
("have" <SVO> <SVOC/A> V IMP VFIN (@+FMAINV) ))
("<special>"
("special" A ABS ))
("<installation>"
("installation" N NOM SG ))
("<instructions>"
("instruction" N NOM PL ))
("<$.>")
("<*see>"
("see" <*> <as/SVOC/A> <SVO> <SV> <InfComp> V SUBJUNCTIVE VFIN (@+FMAINV) )
("see" <*> <as/SVOC/A> <SVO> <SV> <InfComp> V IMP VFIN (@+FMAINV) )
("see" <*> <as/SVOC/A> <SVO> <SV> <InfComp> V INF)
("see" <*> <as/SVOC/A> <SVO> <SV> <InfComp> V PRES -SG3 VFIN (@+FMAINV) ))
("<the>"
("the" <Def> DET CENTRAL ART SG/PL (@DN>) ))
("<documentation>"
("documentation" <-Indef> N NOM SG ))
("<that>"
("that" <**CLB> CS (@CS) )
("that" DET CENTRAL DEM SG (@DN>) )
("that" ADV AD-A> (@AD-A>) )
("that" PRON DEM SG )
("that" <NonMod> <**CLB> <Rel> PRON SG/PL ))
("<came>"
("come" <SVC/A> <SV> <P/for> V PAST VFIN (@+FMAINV) ))
("<with>"
("with" PREP ))
("<your>"
("you" PRON PERS GEN SG2/PL2 (@GN>) ))
("<programs>"
("program" <SVO> V PRES SG3 VFIN (@+FMAINV) )
("program" N NOM PL ))
("<$.>")
("<*to>"
("to" <*> PREP)
("to" <*> INFMARK> (@INFMARK>) ))
("<use>"
("use" N NOM SG)
("use" <as/SVOC/A> <SVO> <SV> V SUBJUNCTIVE VFIN (@+FMAINV) )
("use" <as/SVOC/A> <SVO> <SV> V IMP VFIN (@+FMAINV) )
("use" <as/SVOC/A> <SVO> <SV> V INF )
("use" <as/SVOC/A> <SVO> <SV> V PRES -SG3 VFIN (@+FMAINV) ))
("<your>"
("you" PRON PERS GEN SG2/PL2 (@GN>) ))
("<programs>"
("program" <SVO> V PRES SG3 VFIN (@+FMAINV) )
("program" N NOM PL ))
("<most>"
("much" ADV SUP )
("much" <Quant> PRON SUP SG )
("much" <Quant> DET POST SUP SG (@QN>) )
("many" <Quant> PRON SUP PL )
("many" <Quant> DET POST SUP PL (@QN>) ))
("<effectively>"
("effective" <DER:ive> <DER:ly> ADV ))
("<$:>")
("<*put>"
("put" <*> <SVO> PCP2 )
("put" <*> <SVO> V PAST VFIN (@+FMAINV) )
("put" <*> <SVO> V SUBJUNCTIVE VFIN (@+FMAINV) )
("put" <*> <SVO> V IMP VFIN (@+FMAINV) )
("put" <*> <SVO> V INF )
("put" <*> <SVO> V PRES -SG3 VFIN (@+FMAINV) ))
("<only>"
("only" ADV)
("only" A ABS ))
("<one>"
("one" NUM CARD )
("one" PRON NOM SG ))
("<copy>"
("copy" <SVO> <SV> <P/of> V SUBJUNCTIVE VFIN (@+FMAINV) )
("copy" <SVO> <SV> <P/of> V IMP VFIN (@+FMAINV) )
("copy" <SVO> <SV> <P/of> V INF )
("copy" <SVO> <SV> <P/of> V PRES -SG3 VFIN (@+FMAINV) )
("copy" N NOM SG ))
("<of>"
("of" PREP ))
("<each>"
("each" <Quant> DET CENTRAL SG (@QN>))
("each" <NonMod> <Quant> PRON SG ))
("<program>"
("program" <SVO> V SUBJUNCTIVE VFIN (@+FMAINV) )
("program" <SVO> V IMP VFIN (@+FMAINV) )
("program" <SVO> V INF )
("program" <SVO> V PRES -SG3 VFIN (@+FMAINV) )
("program" N NOM SG ))
("<on>"
("on" PREP)
("on" ADV ADVL (@ADVL) ))
("<your>"
("you" PRON PERS GEN SG2/PL2 (@GN>) ))
("<hard_disk>"
("hard_disk" N NOM SG ))
("<$.>")
("<*having>"
("have" <*> <SVO> <SVOC/A> PCP1 ))
("<more=than>"
("more=than" <CompPP> PREP )
("more=than" <**CLB> CS (@CS) )
("more=than" ADV ))
("<one>"
("one" NUM CARD)
EXAMPLE C
Each word of text is further analyzed to disambiguate, if appropriate, between the part-of-speech possibilities, and to determine the syntactic function of each word, as set forth in the example below, hereinafter referred to as example D (wherein part-of-speech disambiguation analysis is accomplished). Each of the linguistic processes is discussed in turn below.
"<*setting>" "set" <*> <SVOC/A><SVO><SVOO><SV><P/on>PCP1
"<*up>" "up" <*> ADV ADVL @ADVL
"<*your>" "you" <*> PRON PERS GEN SG2/PL2 @GN>
"<*programs>" "program" <*> N NOM PL
"<$HEAD>"
"<*this>" "this" <*> DET CENTRAL DEM SG @DN>
"<chapter>" "chapter" N NOM SG
"<describes>" "describe" <as/SVOC/A><SVO> V PRES SG3 VFIN @+FMAINV
"<how>" "how" <**CLB> ADV WH
"<to>" "to" INFMARK> @INFMARK>
"<set>" "set" <SVOC/A> <SVO> <SVOO> <SV> <P/on> V INF
"<up>" "up" ADV ADVL @ADVL
"<the>" "the" <Def> DET CENTRAL ART SG/PL @DN>
"<programs>" "program" N NOM PL
"<that>" "that" <**CLB> CS @CS "that" <NonMod> <**CLB> <Rel> PRON SG/PL
"<you>" "you" <NonMod> PRON PERS NOM SG2/PL2
"<use>" "use" N NOM SG "use" <as/SVOC/A> <SVO> <SV> V PRES -SG3 VFIN @+FMAINV
"<when>" "when" <**CLB> ADV WH @ADVL
"<you>" "you" <NonMod> PRON PERS NOM SG2/PL2
"<work>" "work" <SV> <SVO> <P/in> <P/on> V PRES -SG3 VFIN @+FMAINV
"<with>" "with" PREP
"<your>" "you" PRON PERS GEN SG2/PL2 @GN>
"<computer>" "computer" <DER:er> N NOM SG
"<$.>"
"<*installing>" "instal" <*> <SVO> PCP1
"<your>" "you" PRON PERS GEN SG2/PL2 @GN>
"<application>" "application" N NOM SG
"<programs>" "program" N NOM PL
"<$HEAD>"
"<*most>" "much" <*> <Quant> DET POST SUP SG @QN> "many" <*> <Quant> DET POST SUP PL @QN>
"<application>" "application" N NOM SG
"<programs>" "program" N NOM PL
"<come>" "come" <SVC/A> <SV> <P/for> V PRES -SG3 VFIN @+FMAINV
"<on>" "on" PREP "on" ADV ADVL @ADVL
"<floppy_disks>" "floppy_disk" N NOM PL
"<$,>"
"<and>" "and" CC @CC
"<you>" "you" <NonMod> PRON PERS NOM SG2/PL2
"<install>" "install" <SVO> V PRES -SG3 VFIN @+FMAINV
"<them>" "they" <NonMod> PRON PERS ACC PL3
"<by>" "by" PREP
"<copying>" "copy" <SVO> <SV> <P/of> PCP1
"<them>" "they" <NonMod> PRON PERS ACC PL3
"<from>" "from" PREP
"<the>" "the" <Def> DET CENTRAL ART SG/PL @DN>
"<floppy_disks>" "floppy_disk" N NOM PL
"<to>" "to" PREP
"<your>" "you" PRON PERS GEN SG2/PL2 @GN>
"<hard_disk>" "hard_disk" N NOM SG
"<$.>"
"<*some>" "some" <*> <Quant> DET CENTRAL SG/PL @QN>
"<programs>" "program" N NOM PL
"<have>" "have" <SVO> <SVOC/A> V PRES -SG3 VFIN
"<special>" "special" A ABS
"<installation>" "installation" N NOM SG
"<instructions>" "instruction" N NOM PL
"<$.>"
"<*see>" "see" <*> <as/SVOC/A> <SVO> <SV> <InfComp> V IMP VFIN @+FMAINV
"<the>" "the" <Def> DET CENTRAL ART SG/PL @DN>
"<documentation>" "documentation" <-Indef> N NOM SG
"<that>" "that" <NonMod> <**CLB> <Rel> PRON SG/PL
"<came>" "come" <SVC/A> <SV> <P/for> V PAST VFIN @+FMAINV
"<with>" "with" PREP
"<your>" "you" PRON PERS GEN SG2/PL2 @GN>
"<programs>" "program" N NOM PL
"<$.>"
"<*to>" "to" <*> INFMARK> @INFMARK>
"<use>" "use" <as/SVOC/A> <SVO> <SV> V INF
"<your>" "you" PRON PERS GEN SG2/PL2 @GN>
"<programs>" "program" N NOM PL
"<most>" "much" ADV SUP "much" <Quant> PRON SUP SG "many" <Quant> PRON SUP PL
"<effectively>" "effective" <DER:ive> <DER:ly> ADV
"<$:>"
"<*put>" "put" <*> <SVO> PCP2
"<only>" "only" ADV
"<one>" "one" NUM CARD
"<copy>" "copy" N NOM SG
"<of>" "of" PREP
"<each>" "each" <Quant> DET CENTRAL SG @QN>
"<program>" "program" N NOM SG
"<on>" "on" PREP
"<your>" "you" PRON PERS GEN SG2/PL2 @GN>
"<hard_disk>" "hard_disk" N NOM SG
"<$.>"
"<*having>" "have" <*> <SVO> <SVOC/A> PCP1
"<more=than>" "more=than" ADV
"<one>" "one" NUM CARD
"<copy>" "copy" N NOM SG
"<can>" "can" V AUXMOD VFIN @+FAUXV
"<cause>" "cause" <SVO> <SVOO> V INF
"<errors>" "error" N NOM PL
"<$.>"
"<*whenever>" "whenever" <*> <**CLB> ADV WH @ADVL
"<you>" "you" <NonMod> PRON PERS NOM SG2/PL2
"<copy>" "copy" <SVO> <SV> <P/of> V PRES -SG3 VFIN @+FMAINV
"<a>" "a" <Indef> DET CENTRAL ART SG @DN>
"<program>" "program" N NOM SG
"<disk>" "disk" N NOM SG
"<to>" "to" PREP
"<your>" "you" PRON PERS GEN SG2/PL2 @GN>
"<hard_disk>" "hard_disk" N NOM SG
"<$,>"
"<be>" "be" <SV> <SVC/N> <SVC/A> V SUBJUNCTIVE VFIN
"<careful>" "careful" A ABS
"<not>" "not" NEG-PART @NEG
"<to>" "to" INFMARK> @INFMARK>
"<copy>" "copy" <SVO> <SV> <P/of> V INF
"<a>" "a" <Indef> DET CENTRAL ART SG @DN>
"<*system>" "system" <*> N NOM SG
"<*folder>" "folder" <*> <DER:er> N NOM SG
"<$.>"
"<*always>" "always" <*> ADV ADVL @ADVL
"<check>" "check" <SVO> <SV> <P/for> <P/with> <P/on> V IMP VFIN @+FMAINV
"check" N NOM SG
"<to>" "to" INFMARK> @INFMARK>
"<see>" "see" <as/SVOC/A> <SVO> <SV> <InfComp> V INF
"<what>" "what" <NonMod> <**CLB> PRON WH SG/PL
"<you_>" "you_" <NonMod> PRON PERS NOM SG2/PL2 SUBJ @SUBJ
"<_'ve>" "have" <SVO> V PRES -SG3 VFIN
"<copied>" "copy" <SVO> <SV> <P/of> PCP2
"<$,>"
"<and>" "and" CC @CC
"<drag>" "drag" <SVO> <SV> V IMP VFIN @+FMAINV "drag" <SVO> <SV> V INF
"drag" <SVO> <SV> V PRES -SG3 VFIN @+FMAINV
"<any>" "any" <Quant> DET CENTRAL SG/PL @QN>
"<extra>" "extra" A ABS
"<*system>" "system" <*> N NOM SG
"<*folders>" "folder" <*> <DER:er> N NOM PL
"<to>" "to" PREP
"<the>" "the" <Def> DET CENTRAL ART SG/PL @DN>
"<*trash>" "trash" <*> <-Indef> N NOM SG
"<$.>"
"<*if>" "if" <*> <**CLB> CS @CS
"<a>" "a" <Indef> DET CENTRAL ART SG @DN>
"<program>" "program" N NOM SG
"<malfunctions>" "malfunction" <SV> V PRES SG3 VFIN @+FMAINV
"<consistently>" "consistent" <DER:ly> ADV
"<$,>"
"<try>" "try" <SVO> <SV> <P/for> V IMP VFIN @+FMAINV "try" N NOM SG
"<installing>" "instal" <SVO> PCP1
"<a>" "a" <Indef> DET CENTRAL ART SG @DN>
"<fresh>" "fresh" A ABS
"<copy>" "copy" N NOM SG
"<$.>"
"<*if>" "if" <*> <**CLB> CS @CS
"<that>" "that" PRON DEM SG
"<does>" "do" <SVO> <SVOO> <SV> V PRES SG3 VFIN
"<not>" "not" NEG-PART @NEG
"<help>" "help" <SVO> <SV> <InfComp> <P/with> V INF
"<$,>"
"<find>" "find" <SVOO> <SVOC/N> <SVOC/A> <SVO> <SV> <P/for> V IMP VFIN @+FMAINV
"find" <SVOO> <SVOC/N> <SVOC/A> <SVO> <SV> <P/for> V INF
"<out>" "out" ADV ADVL @ADVL
"<from>" "from" PREP
"<the>" "the" <Def> DET CENTRAL ART SG/PL @DN>
"<software>" "software" <-Indef> N NOM SG
"<manufacturer>" "manufacturer" <DER:er> N NOM SG
"<whether>" "whether" <**CLB> CS @CS
"<your>" "you" PRON PERS GEN SG2/PL2 @GN>
"<version>" "version" N NOM SG
"<of>" "of" PREP
"<the>" "the" <Def> DET CENTRAL ART SG/PL @DN>
"<program>" "program" N NOM SG
"<is>" "be" <SV> <SVC/N> <SVC/A> V PRES SG3 VFIN
"<compatible>" "compatible" <DER:ble> A ABS
"<with>" "with" PREP
"<the>" "the" <Def> DET CENTRAL ART SG/PL @DN>
"<system>" "system" N NOM SG
"<software>" "software" <-Indef> N NOM SG
"<you_>" "you_" <NonMod> PRON PERS NOM SG2/PL2 SUBJ @SUBJ
"<_'re>" "be" <SV> <SVC/N> <SVC/A> V PRES -SG1,3 VFIN
"<using>" "use" <as/SVOC/A> <SVO> <SV> PCP1
"<$.>"
"<*put>" "put" <*> <SVO> PCP2 "put" <*> <SVO> V IMP VFIN @+FMAINV
"<frequently>" "frequent" <DER:ly> ADV
"<used>" "use" <as/SVOC/A> <SVO> <SV> PCP2
"<programs>" "program" N NOM PL
"<$(>"
"<or>" "or" CC @CC
"<aliases>" "alias" N NOM PL
"<for>" "for" PREP
"<those>" "that" DET CENTRAL DEM PL @DN>
"<programs>" "program" N NOM PL
"<$)>"
"<in>" "in" PREP
"<the>" "the" <Def> DET CENTRAL ART SG/PL @DN>
"<*apple>" "apple" <*> N NOM SG
"<menu>" "menu" N NOM SG
"<so>" "so" <**CLB> CS @CS
"<you>" "you" <NonMod> PRON PERS NOM SG2/PL2
"<can>" "can" V AUXMOD VFIN @+FAUXV
"<open>" "open" <SVO> <SV> V INF
"<the>" "the" <Def> DET CENTRAL ART SG/PL @DN>
"<programs>" "program" N NOM PL
"<more>" "much" ADV CMP
"<conveniently>" "convenient" <DER:ly> ADV
"<$.>"
"<*see>" "see" <*> <as/SVOC/A> <SVO> <SV> <InfComp> V IMP VFIN @+FMAINV
"<*chapter>" "chapter" <*> N NOM SG
"<5>" "5" NUM CARD
"<$,>"
"<$">"
"<*adapting>" "adapt" <*> <SVO> <SV> <P/for> PCP1
"<*your>" "you" <*> PRON PERS GEN SG2/PL2 @GN>
"<*computer>" "computer" <*> <DER:er> N NOM SG
"<to>" "to" PREP
"<*your>" "you" <*> PRON PERS GEN SG2/PL2 @GN>
"<*own>" "own" <*> A ABS
"<*use>" "use" <*> N NOM SG
"<$.>"
"<$.>"
"<*to>" "to" <*> INFMARK> @INFMARK>
"<open>" "open" <SVO> <SV> V INF
"<a>" "a" <Indef> DET CENTRAL ART SG @DN>
"<program>" "program" N NOM SG
"<automatically>" "automatical" <DER:ic> <DER:al> <DER:ly> ADV
"<each>" "each" <Quant> DET CENTRAL SG @QN>
"<time>" "time" N NOM SG
"<you>" "you" <NonMod> PRON PERS NOM SG2/PL2
"<start>" "start" <SV> <SVO> <P/on> V PRES -SG3 VFIN @+FMAINV
"<up>" "up" ADV ADVL @ADVL
"<$,>"
"<you>" "you" <NonMod> PRON PERS NOM SG2/PL2
"<can>" "can" V AUXMOD VFIN @+FAUXV
"<put>" "put" <SVO> V INF
"<the>" "the" <Def> DET CENTRAL ART SG/PL @DN>
"<program>" "program" N NOM SG
"<$(>"
"<or>" "or" CC @CC
"<its>" "it" PRON GEN SG3
"<alias>" "alias" N NOM SG
"<$)>"
"<into>" "into" PREP
"<the>" "the" <Def> DET CENTRAL ART SG/PL @DN>
"<*startup>" "startup" <*> N NOM SG
"<*items>" "item" <*> N NOM PL
"<folder>" "folder" <DER:er> N NOM SG
"<$.>"
"<*see>" "see" <*> <as/SVOC/A> <SVO> <SV> <InfComp> V IMP VFIN @+FMAINV
"<*chapter>" "chapter" <*> N NOM SG
"<5>" "5" NUM CARD
"<$,>"
"<$">"
"<*adapting>" "adapt" <*> <SVO> <SV> <P/for> PCP1
"<*your>" "you" <*> PRON PERS GEN SG2/PL2 @GN>
"<*computer>" "computer" <*> <DER:er> N NOM SG
"<to>" "to" PREP
"<*your>" "you" <*> PRON PERS GEN SG2/PL2 @GN>
"<*own>" "own" <*> A ABS
"<*use>" "use" <*> N NOM SG
EXAMPLE D
At step 303, the linguistic analysis begins by morphologically and lexically analyzing each word of the text to determine its possible morphological and lexical features. Morphological analysis involves, among other things, mapping a word to its base form. Morphological analysis takes each word and, either through derivation (e.g., "re-initialize" maps to "initialize") or inflection (e.g., "initializing" maps to "initialize"), reduces it to its base form. For example, with reference to FIG. 4, a lookup in the lexicon of the word "initializing" 403 is performed by lexical analysis and fails. Morphological analysis reduces the word to its base form "initialize" by stripping the "ing" ending and adding "e", as indicated by annotation 402 ("</base "initialize">"). Using this base form of the word, lexical analysis then performs a successful lookup in the lexicon to determine the lexical features of the word. Lexical analysis determines the word is a verb. More specifically, lexical analysis determines the word's part-of-speech is a verb present participle, as indicated by part-of-speech annotation 405 ("</pos PCP1>"); it also determines the word participates in a subject-verb-object construction, as indicated by lexical features annotation 404 ("</lfeats <SVO>>"). Furthermore, the morphological features, e.g., part-of-speech, are determined. Morphological features provide an inference of the linguistic properties of a word (e.g., tense, person, mood) based on how the word is used in a particular context. In the case of the word "initializing" 403, there are no morphological features, as indicated by an empty morphological features annotation 406. However, morphological features may be inferred from the fact that lexical analysis has identified the word as a present participle (due to the "ing" ending), as is indicated by part-of-speech annotation 405.
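The lookup-then-reduce behavior described above can be pictured with a minimal sketch. It is an illustration only, not the analyzer used in the embodiment: the toy lexicon, the two suffix rules, and the names LEXICON, candidate_bases and analyze are all assumptions introduced here.

```python
# Hypothetical toy lexicon: base form -> (part of speech, lexical features).
LEXICON = {
    "initialize": ("V", ["<SVO>"]),
    "use": ("V", ["<as/SVOC/A>", "<SVO>", "<SV>"]),
}

def candidate_bases(word):
    """Yield the word itself, then guesses made by undoing simple inflection."""
    w = word.lower()
    yield w
    if w.endswith("ing") and len(w) > 4:
        stem = w[:-3]
        yield stem        # e.g. "opening" -> "open"
        yield stem + "e"  # e.g. "initializing" -> "initialize"
    if w.endswith("s") and len(w) > 2:
        yield w[:-1]      # e.g. "uses" -> "use"

def analyze(word):
    """Return base form, part of speech and lexical features, or None."""
    w = word.lower()
    for base in candidate_bases(w):
        if base in LEXICON:
            pos, lfeats = LEXICON[base]
            # A verb reached by stripping "ing" is labeled a present participle.
            if pos == "V" and w.endswith("ing") and base != w:
                pos = "PCP1"
            return {"word": word, "base": base, "pos": pos, "lfeats": lfeats}
    return None  # no lexicon entry reachable with these toy rules
```

Calling analyze("initializing") fails on the direct lookup, succeeds on the guess "initialize", and reports PCP1 with the <SVO> feature, mirroring annotations 402, 404 and 405.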
Finally, it should be noted that since lexical analysis determined "initialize" is a verb in a subject-verb-object construction, it will search for an object following the verb. In this way, syntactic labeling is possible, at least to the extent that an association of the words in a sentence with the syntactic functions they play within the particular context ("@Subject", "@Object") may be determined. As another example, "use" 407 may be a noun, as indicated by part-of-speech annotation 408 ("</pos N>"), in which case it takes on the morphological features of NOM (nominal) and SG (singular, as opposed to plural in the case of "uses"), as indicated by morphological features annotation 409 ("</mfeats NOM SG>"). However, "use" 407 may also be a verb, as indicated by part-of-speech annotation 410 ("</pos V>"), in which case it has the morphological features of an imperative (IMP), i.e., "you use", and a finite verb (VFIN), as indicated by morphological features annotation 411 ("</mfeats IMP VFIN>"). "[U]se" may also be used as a normal present tense, non third person singular finite verb, i.e., "they use", as indicated by morphological features annotation 412 ("</mfeats PRES-SG3 VFIN>"). Thus, "use" 407 has three possible uses: it may be a noun or either of two verb readings. As will be seen, it is important to identify the correct use of each word in order to perform knowledge mining. For example, knowledge mining attempts to identify technical terms to be included in the domain catalog by searching for particular syntactic patterns representative of technical terms, e.g., a noun phrase. If it sees that "disk", "repair" and "program" in phrase 401 are nouns, then it recognizes the three nouns as constituting a noun phrase, and thus, potentially a technical term. However, "repair" and "program" can also be verbs, so lexical analysis must first determine that the words are, in this context, nouns. Notice, however, with reference to FIG.
4, that "repair" has a noun analysis, as indicated by part-of-speech annotation 413 ("</pos N>"). "[P]rogram" has a verb and a noun analysis, as indicated by part-of-speech annotations 414 and 415, respectively. What has happened here already is a certain amount of part-of-speech disambiguation analysis, i.e., the lexical and morphological analyses have together determined on the basis of local constraints and knowledge about how combinations of words are formed, the proper part-of-speech for certain terms. Using phrase 401 as an example, the analysis, at a certain level of abstraction, proceeds as follows: the first occurrence of "disk" is unambiguously a noun; "program", however, can be a noun or a verb, but because it precedes a preposition ("for") and follows a verb sequence ("use" "commercial"), it is very likely a noun; if "disk" is a noun and "program" is a noun, then "repair" is most likely a noun as well. The linguistic analysis performed at step 303 does not generate a complete syntactic analysis of the sentence, but it is able to, in some instances, identify components of sentence structure, e.g., subject, verb, and object. In this way, extraction of semantically important terms and conceptually interesting data from the document is possible on the basis of their syntactic identity without requiring full syntactic analysis. The lexical, morphological, part-of-speech disambiguation and syntactic label processing performed in the linguistic analysis stage are not concurrent processes, nor do they function sequentially with respect to each other. Lexical analysis and morphological analysis are performed essentially in parallel. Part-of-speech disambiguation is coupled closely to morphological analysis. Determination of syntactic functions and syntactic labeling follows closely behind part-of-speech disambiguation. Part-of-speech disambiguation decides between two or more possible analyses of a word. 
For example, with reference to example C, as a result of lexical and morphological analysis, the word "up" in the sentence "this chapter describes how to set up the programs that you use when you work with your computer" is determined to be either a preposition (PREP) or an adverb (ADV ADVL). A syntactic label of adverbial (@ADVL) is affixed to the latter possibility. Given the context of the sentence in which the word appears, part-of-speech disambiguation analysis is able to determine "up" functions as an adverbial. Thus, as set forth in example D, the part-of-speech which "up" functions as is unambiguously that of an adverb. Thus, while lexical and morphological analysis generally operate at the word level, part-of-speech disambiguation analysis is concerned with a phrase or sentence, and reduces, to the extent possible, each word of the phrase or sentence to a single, and therefore, unambiguous analysis. Part-of-speech disambiguation analysis looks at a string of words, and on the basis of certain knowledge about the construction of some of those words (namely, that knowledge acquired through lexical and morphological analysis), and the order in which they occur in the sentence, infers likely construction of other words in the sentence. For example, given the sequence of three words "disk repair program" found in phrase 401, once it has been determined that "disk" is a noun and that "program" is a noun, part-of-speech disambiguation analysis recognizes "repair" must also be a noun. Once part-of-speech disambiguation is completed, syntactic analysis determines and labels each word of text with an appropriate syntactic function. 
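The local reasoning applied to "disk repair program" can be sketched as follows. This is a deliberately crude illustration under stated assumptions, not the disambiguation engine of the embodiment: the two rules, the candidate tag sets, and the names disambiguate and noun_runs are hypothetical, and the first rule simplifies the "precedes a preposition" cue described for phrase 401.

```python
def disambiguate(tagged):
    """tagged: list of (word, set of candidate parts of speech).

    Repeatedly applies two toy local rules until nothing changes.
    """
    result = [(w, set(c)) for w, c in tagged]
    changed = True
    while changed:
        changed = False
        for i, (word, cands) in enumerate(result):
            if len(cands) < 2 or "N" not in cands:
                continue  # already unambiguous, or cannot be a noun
            prev = result[i - 1][1] if i > 0 else set()
            nxt = result[i + 1][1] if i + 1 < len(result) else set()
            # Toy rule 1: an ambiguous word directly before a preposition
            # is taken to be a noun.
            before_prep = nxt == {"PREP"}
            # Toy rule 2: an ambiguous word between two unambiguous nouns
            # is taken to be a noun.
            between_nouns = prev == {"N"} and nxt == {"N"}
            if before_prep or between_nouns:
                result[i] = (word, {"N"})
                changed = True
    return result

def noun_runs(tagged):
    """Return runs of two or more unambiguous nouns as candidate terms."""
    runs, current = [], []
    for word, cands in tagged + [("", set())]:  # sentinel flushes last run
        if cands == {"N"}:
            current.append(word)
        else:
            if len(current) >= 2:
                runs.append(" ".join(current))
            current = []
    return runs

phrase = [("disk", {"N"}), ("repair", {"N", "V"}),
          ("program", {"N", "V"}), ("for", {"PREP"})]
resolved = disambiguate(phrase)
terms = noun_runs(resolved)
```

With this input, "program" is resolved first because it precedes "for", after which "repair" sits between two nouns and is resolved in the next pass, so terms holds the single candidate "disk repair program".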
In an embodiment of the present invention, implementation of the foregoing linguistic analyses and annotations, including lexical and morphological analysis, part-of-speech disambiguation analysis and syntactic labeling, may be accomplished by way of commercially available application software from, for example, LingSoft, Incorporated, of Helsinki, Finland. As can be seen with reference to example D, each record of linguistically analyzed, annotated and disambiguated text, i.e., each word of the text and its associated linguistic annotations, comprises an arbitrary number of tokens. In some cases, a certain annotation may be missing altogether, e.g., morphological features may not be discerned or present for a particular word. Furthermore, any one field of the record may further comprise an arbitrary number of tokens, e.g., it is not uncommon for lexical analysis to generate an annotation comprising anywhere from zero to five tokens. Thus, at step 304, the linguistically analyzed, annotated and disambiguated text, as well as the annotations themselves, are explicitly labeled to indicate which tokens refer to which annotations, thereby facilitating subsequent mining. Referring to the example below, hereinafter referred to as example E, the annotations to which the tokens belong are more readily discernible than in the case of example D.
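Before turning to example E, the explicit labeling of step 304 can be pictured with a minimal sketch. The classification heuristics below (quoted token is the base form, angle-bracketed tokens are lexical features, "@" tokens are syntactic labels, the first remaining token is the part of speech, and the rest are morphological features) are assumptions inferred from the shape of examples D and E, and the names label_reading and render are hypothetical.

```python
def label_reading(tokens):
    """Sort the unlabeled tokens of one reading into named annotation fields."""
    fields = {"base": [], "lfeats": [], "pos": [], "mfeats": [], "syn": []}
    for tok in tokens:
        if tok.startswith('"'):
            fields["base"].append(tok)      # quoted base form
        elif tok.startswith("<"):
            fields["lfeats"].append(tok)    # lexical features such as <SVO>
        elif tok.startswith("@"):
            fields["syn"].append(tok)       # syntactic function labels
        elif not fields["pos"]:
            fields["pos"].append(tok)       # first plain token: part of speech
        else:
            fields["mfeats"].append(tok)    # remaining plain tokens
    return fields

def render(word, fields):
    """Write one record in the explicitly labeled style; empty fields are kept."""
    parts = [word]
    for name, values in fields.items():
        parts.append("</" + name + " " + " ".join(values) + ">")
    return " ".join(parts)

reading = '"up" ADV ADVL @ADVL'.split()
record = render("up", label_reading(reading))
```

Here record is the string `up </base "up"> </lfeats > </pos ADV> </mfeats ADVL> </syn @ADVL>`. Keeping an empty field in the output, rather than dropping it, means a later mining step can address each annotation by name even when a reading supplies no tokens for it.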
Setting </base "set"> </lfeats <*> <SVOC/A> <SVO> <SVOO> <SV> <P/on>> </pos
PCP1>
</mfeats > </syn @NPHR @-FMAINV>
Up </base "up"> </lfeats <*>> </pos ADV> </mfeats ADVL> </syn @ADVL>
Your </base "you"> </lfeats <*>> </pos PRON> </mfeats PERS GEN SG2/PL2>
</syn @NPHR @OBJ>
Programs </base "program"> </lfeats <*>> </pos N> </mfeats NOM PL> </syn
@NPHR @OBJ>
HEAD
This </base "this"> </lfeats <*>> </pos DET> </mfeats CENTRAL DEM SG> </syn
@DN>>
chapter </base "chapter"> </lfeats > </pos N> </mfeats NOM SG> </syn @SUBJ>
describes </base "describe"> </lfeats <as/SVOC/A> <SVO>> </pos V> </mfeats
PRES SG3
VFIN> </syn @+FMAINV>
how </base "how"> </lfeats <**CLB>> </pos ADV> </mfeats WH> </syn @ADVL>
to </base "to"> </lfeats > </pos INFMARK>> </mfeats > </syn @INFMARK>>
set </base "set"> </lfeats <SVOC/A> <SVO> <SVOO> <SV> <P/on>> </pos V>
</mfeats INF>
</syn @-FMAINV>
up </base "up"> </lfeats > </pos ADV> </mfeats ADVL> </syn @ADVL>
the </base "the"> </lfeats <Def>> </pos DET> </mfeats CENTRAL ART SG/PL>
</syn @DN>>
programs </base "program"> </lfeats > </pos N> </mfeats NOM PL> </syn @OBJ
@I-OBJ>
that </base "that"> </lfeats <**CLB>> </pos CS> </mfeats > </syn @CS>
</base "that"> </lfeats
<NonMod> <**CLB> <Rel>> </pos PRON> </mfeats SG/PL> </syn @SUBJ @OBJ @I-OBJ
@PCOMPL-O>
you </base "you"> </lfeats <NonMod>> </pos PRON> </mfeats PERS NOM SG2/PL2>
</syn
@SUBJ>
use </base "use"> </lfeats > </pos N> </mfeats NOM SG> </syn @OBJ> </base
"use"> </lfeats
<as/SVOC/A> <SVO> <SV>> </pos V> </mfeats PRES -SG3 VFIN> </syn @+FMAINV>
when </base "when"> </lfeats <**CLB>> </pos ADV> </mfeats WH> </syn @ADVL>
you </base "you"> </lfeats <NonMod>> </pos PRON> </mfeats PERS NOM SG2/PL2>
</syn
@SUBJ>
work </base "work"> </lfeats <SV> <SVO> <P/in> <P/on>> </pos V> </mfeats
PRES -SG3 VFIN>
</syn @+FMAINV>
with </base "with"> </lfeats > </pos PREP> </mfeats > </syn @ADVL>
your </base "you"> </lfeats > </pos PRON> </mfeats PERS GEN SG2/PL2> </syn
@GN>>
computer </base "computer"> </lfeats <DER:er>> </pos N> </mfeats NOM SG>
</syn @<P>
.
Installing </base "instal"> </lfeats <*> <SVO>> </pos PCP1> </mfeats >
</syn @NPHR @-FMAINV>
your </base "you"> </lfeats > </pos PRON> </mfeats PERS GEN SG2/PL2> </syn
@GN>>
application </base "application"> </lfeats > </pos N> </mfeats NOM SG>
</syn @NPHR @NN>>
programs </base "program"> </lfeats > </pos N> </mfeats NOM PL> </syn @NPHR
@OBJ>
HEAD
Most </base "much"> </lfeats <*> <Quant>> </pos DET> </mfeats POST SUP SG>
</syn
@QN>> </base "many"> </lfeats <*> <Quant>> </pos DET> </mfeats POST SUP PL>
</syn
@QN>>
application </base "application"> </lfeats > </pos N> </mfeats NOM SG>
</syn @NN>>
programs </base "program"> </lfeats> </pos N> </mfeats NOM PL> </syn @SUBJ>
come </base "come"> </lfeats <SVC/A> <SV> <P/for>> </pos V> </mfeats PRES
-SG3 VFIN>
</syn @+FMAINV>
on </base "on"> </lfeats > </pos PREP> </mfeats > </syn @ADVL> </base "on">
</lfeats > </pos
ADV> </mfeats ADVL> </syn @ADVL>
floppy_disks </base "floppy_disk"> </lfeats > </pos N> </mfeats NOM PL>
</syn @<P>
,
and </base "and"> </lfeats > </pos CC> </mfeats> </syn @CC>
you </base "you"> </lfeats <NonMod>> </pos PRON> </mfeats PERS NOM SG2/PL2>
</syn
@SUBJ>
install </base "install"> </lfeats <SVO>> </pos V> </mfeats PRES -SG3 VFIN>
</syn
@+FMAINV>
them </base "they"> </lfeats <NonMod>> </pos PRON> </mfeats PERS ACC PL3>
</syn
@OBJ>
by </base "by"> </lfeats> </pos PREP> </mfeats> </syn @ADVL>
copying </base "copy"> </lfeats <SVO> <SV> <P/of>> </pos PCP1> </mfeats >
</syn @<P-FMAINV>
them </base "they"> </lfeats <NonMod>> </pos PRON> </mfeats PERS ACC PL3>
</syn
@OBJ>
from </base "from"> </lfeats > </pos PREP> </mfeats > </syn @ADVL>
the </base "the"> </lfeats <Def>> </pos DET> </mfeats CENTRAL ART SG/PL>
</syn @DN>>
floppy_disks </base "floppy_disk"> </lfeats > </pos N> </mfeats NOM PL>
</syn @<P>
to </base "to"> </lfeats > </pos PREP> </mfeats> </syn @<NOM @ADVL>
your </base "you"> </lfeats > </pos PRON> </mfeats PERS GEN SG2/PL2> </syn
@GN>>
hard_disk </base "hard_disk"> </lfeats > </pos N> </mfeats NOM SG> </syn
@<P>
.
Some </base "some"> </lfeats <*> <Quant>> </pos DET> </mfeats CENTRAL
SG/PL> </syn
@QN>>
programs </base "program"> </lfeats> </pos N> </mfeats NOM PL> </syn @SUBJ>
have </base "have"> </lfeats <SVO> <SVOC/A>> </pos V> </mfeats PRES -SG3
VFIN> </syn
@+FMAINV>
special </base "special"> </lfeats > </pos A> </mfeats ABS> </syn @AN>>
installation </base "installation"> </lfeats > </pos N> </mfeats NOM SG>
</syn @OBJ @NN>>
instructions </base "instruction"> </lfeats > </pos N> </mfeats NOM PL>
</syn @OBJ>
.
See </base "see"> </lfeats <*> <as/SVOC/A> <SVO> <SV> <InfComp>> </pos V>
</mfeats IMP
VFIN> </syn @+FMAINV>
the </base "the"> </lfeats <Def>> </pos DET> </mfeats CENTRAL ART SG/PL>
</syn @DN>>
documentation </base "documentation"> </lfeats <-Indef>> </pos N> </mfeats
NOM SG> </syn
@OBJ>
that </base "that"> </lfeats <NonMod> <**CLB> <Rel>> </pos PRON> </mfeats
SG/PL> </syn
@SUBJ>
came </base "come"> </lfeats <SVC/A> <SV> <P/for>> </pos V> </mfeats PAST
VFIN> </syn
@+FMAINV>
with </base "with"> </lfeats > </pos PREP> </mfeats> </syn @ADVL>
your </base "you"> </lfeats > </pos PRON> </mfeats PERS GEN SG2/PL2> </syn
@GN>>
programs </base "program"> </lfeats > </pos N> </mfeats NOM PL> </syn @<P>
.
To </base "to"> </lfeats <*>> </pos INFMARK>> </mfeats > </syn @INFMARK>>
use </base "use"> </lfeats <as/SVOC/A> <SVO> <SV>> </pos V> </mfeats INF>
</syn @-FMAINV>
your </base "you"> </lfeats > </pos PRON> </mfeats PERS GEN SG2/PL2> </syn
@GN>>
programs </base "program"> </lfeats > </pos N> </mfeats NOM PL> </syn @OBJ>
most </base "much"> </lfeats > </pos ADV> </mfeats SUP> </syn @ADVL @AD-A>>
</base
"much"> </lfeats <Quant>> </pos PRON> </mfeats SUP SG> </syn @OBJ
@PCOMPL-O>
</base "many"> </lfeats <Quant>> </pos PRON> </mfeats SUP PL> </syn @OBJ
@PCOMPL-O>
effectively </base "effective"> </lfeats <DER:ive> <DER:ly>> </pos ADV>
</mfeats > </syn
@ADVL>
:
Put </base "put"> </lfeats <*> <SVO>> </pos PCP2> </mfeats > </syn @NPHR
@PCOMPL-O>
only </base "only"> </lfeats > </pos ADV> </mfeats > </syn @AD-A>>
one </base "one"> </lfeats > </pos NUM> </mfeats CARD> </syn @QN>>
copy </base "copy"> </lfeats > </pos N> </mfeats NOM SG> </syn @NPHR @OBJ>
of </base "of"> </lfeats> </pos PREP> </mfeats > </syn @<NOM-OF>
each </base "each"> </lfeats <Quant>> </pos DET> </mfeats CENTRAL SG> </syn
@QN>>
program </base "program"> </lfeats > </pos N> </mfeats NOM SG> </syn @<P>
on </base "on"> </lfeats > </pos PREP> </mfeats> </syn @<NOM @ADVL>
your </base "you"> </lfeats > </pos PRON> </mfeats PERS GEN SG2/PL2> </syn
@GN>>
hard_disk </base "hard_disk"> </lfeats > </pos N> </mfeats NOM SG> </syn
@<P>
.
Having </base "have"> </lfeats <*> <SVO> <SVOC/A>> </pos PCP1> </mfeats >
</syn @-FMAINV>
more=than </base "more=than"> </lfeats > </pos ADV> </mfeats > </syn @ADVL
@AD-A>>
one </base "one"> </lfeats > </pos NUM> </mfeats CARD> </syn @QN>>
copy </base "copy"> </lfeats > </pos N> </mfeats NOM SG> </syn @SUBJ>
can </base "can"> </lfeats > </pos V> </mfeats AUXMOD VFIN> </syn @+FAUXV>
cause </base "cause"> </lfeats <SVO> <SVOO>> </pos V> </mfeats INF> </syn
@-FMAINV>
errors </base "error"> </lfeats> </pos N> </mfeats NOM PL> </syn @OBJ>
.
Whenever </base "whenever"> </lfeats <*> <**CLB>> </pos ADV> </mfeats WH>
</syn @ADVL>
you </base "you"> </lfeats <NonMod>> </pos PRON> </mfeats PERS NOM SG2/PL2>
</syn
@SUBJ>
copy </base "copy"> </lfeats <SVO> <SV> <P/of>> </pos V> </mfeats PRES -SG3
VFIN> </syn
@+FMAINV>
a </base "a"> </lfeats <Indef>> </pos DET> </mfeats CENTRAL ART SG> </syn
@DN>>
program </base "program"> </lfeats > </pos N> </mfeats NOM SG> </syn @NN>>
disk </base "disk"> </lfeats> </pos N> </mfeats NOM SG> </syn @OBJ>
to </base "to"> </lfeats > </pos PREP> </mfeats > </syn @<NOM @ADVL>
your </base "you"> </lfeats > </pos PRON> </mfeats PERS GEN SG2/PL2> </syn
@GN>>
hard_disk </base "hard_disk"> </lfeats> </pos N> </mfeats NOM SG> </syn
@<P>
,
