Identifier vocabulary data access method and system
6826566
Abstract
A method of organizing, managing, and providing interactive access to data in a database is disclosed, along with a program and a system for implementing the method. Associations between each data Item and at least one ItemSelector are established and stored. A predefined (but modifiable) Vocabulary of ItemSelectors sufficient to describe each Item of the database is created. Presently selected Items are described by a combination of associations defined by an appropriate Boolean combination of each ItemSelector in a presently selected set of ItemSelectors. A user controls the presently selected set by adding an ItemSelector from a presented group, or by removing a previously selected ItemSelector. The system ideally makes available to the user all relevant ItemSelectors--those which, if added to the presently selected set, would result in a set that describes at least one extant data Item. The system ideally makes all presently selected Items available to the user.
Claims
We claim:
1. A method of creating a Boolean expression for identifying data Items in a database, comprising:
a) initiating controlled-vocabulary formation of a query by presenting to a user a plurality of ItemSelectors each having a Boolean property associated therewith;
b) accepting a plurality of ItemSelectors selected by the user from among the presented plurality of ItemSelectors, the accepted plurality of ItemSelectors including:
i) at least a first ItemSelector having a first Boolean property associated therewith, and
ii) at least a second ItemSelector having a different second Boolean property associated therewith; and
c) deriving from the accepted ItemSelectors a Boolean expression encompassing the first ItemSelector and the second ItemSelector and reflecting the Boolean property of the first ItemSelector and the Boolean property of the second ItemSelector;
wherein each ItemSelector presented for selection by the user has previously been determined to describe at least one data Item in the database, when combined according to the corresponding Boolean properties with other ItemSelectors presently selected by the user.
2. The method of claim 1, wherein step (a) comprises presenting to the user a plurality of groups of ItemSelectors including:
i) a first group consisting of ItemSelectors associated with the first Boolean property, and
ii) a second group consisting of ItemSelectors associated with the second Boolean property.
3. The method of claim 2, wherein the first Boolean property is disjunctive, and wherein step (b)(i) comprises accepting a plurality of ItemSelectors belonging to the first group of ItemSelectors.
4. The method of claim 3, wherein step (c) comprises:
i) disjunctively joining the plurality of chosen ItemSelectors that belong to the first group within a parenthetical expression, and
ii) conjunctively joining the parenthetical expression to a Boolean expression containing the second ItemSelector.
5. The method of claim 4, wherein step (b)(ii) comprises accepting a plurality of ItemSelectors belonging to the second group of ItemSelectors; and further comprising an act (c)(iii) of joining the accepted plurality of ItemSelectors that belong to the second group according to the second Boolean property to form the Boolean expression containing the second ItemSelector.
6. The method of claim 1, wherein the first Boolean property is disjunctive or exclusive-disjunctive; and step (b)(i) comprises accepting a plurality of ItemSelectors that are associated with the first Boolean property.
7. The method of claim 6, wherein step (c) comprises:
i) disjunctively joining the plurality of accepted ItemSelectors that are associated with the first Boolean property within a parenthetical expression, and
ii) conjunctively joining the parenthetical expression to a Boolean expression containing the second ItemSelector.
8. The method of claim 1, wherein the ItemSelectors are each of a type selected from a group consisting of words, phrases, position-independent alphanumeric characters, position-dependent alphanumeric characters, numbers of alphanumeric characters in text of a data Item, value ranges, alphabetical ranges, graphical symbols, and pictures.
9. The method of claim 1, further comprising:
d) creating a set of previously chosen ItemSelectors consisting of all ItemSelectors previously chosen by the user that have not been withdrawn;
e) adding an ItemSelector chosen by the user to the set of previously chosen ItemSelectors to form a set of presently chosen ItemSelectors;
f) presenting, responsive to the set of presently chosen ItemSelectors, a modified collection of ItemSelectors to the user for further selection.
10. A method of creating a Boolean expression for identifying data Items in a database, comprising:
a) initiating controlled-vocabulary formation of a query by presenting to a user a plurality of ItemSelectors each having a Boolean property associated therewith;
b) accepting a plurality of ItemSelectors selected by the user from among the presented plurality of ItemSelectors, the accepted plurality of ItemSelectors including:
i) at least a first ItemSelector having a first Boolean property associated therewith, and
ii) at least a second ItemSelector having a different second Boolean property associated therewith; and
c) deriving from the accepted ItemSelectors a Boolean expression encompassing the first ItemSelector and the second ItemSelector and reflecting the Boolean property of the first ItemSelector and the Boolean property of the second ItemSelector;
d) creating a set of previously chosen ItemSelectors consisting of all ItemSelectors previously chosen by the user that have not been withdrawn;
e) adding an ItemSelector chosen by the user to the set of previously chosen ItemSelectors to form a set of presently chosen ItemSelectors;
f) presenting, responsive to the set of presently chosen ItemSelectors, a modified collection of ItemSelectors to the user for further selection; and
g) identifying as relevant ItemSelectors that have a property whereby addition of such ItemSelector to the set of presently chosen ItemSelectors creates a set of ItemSelectors that describe at least one data Item within the database.
11. The method of claim 10, wherein step (f) comprises restricting the modified collection of ItemSelectors to ItemSelectors that are identified relevant according to step (g).
12. The method of claim 11, wherein step (f) further comprises making available to the user all ItemSelectors preassociated with the database that are identified as relevant according to step (g).
13. The method of claim 1, wherein each ItemSelector presented to the user is a member of a predefined vocabulary of ItemSelectors developed for the particular database.
14. The method of claim 1, wherein step (a) comprises graphically displaying to the user a representation of each presented ItemSelector.
15. The method of claim 14, wherein some graphically displayed representations differ from the presented ItemSelector represented.
16. A controlled vocabulary method of interactively creating a Boolean expression for identifying data Items in a database, the method comprising:
a) assigning a multiplicity of ItemSelectors among a plurality of ItemSelector groups including a first group and a second group;
b) implicitly associating each ItemSelector assigned to the first group with a Boolean property associated with the first group, and implicitly associating each ItemSelector assigned to the second group with a Boolean property associated with the second group;
c) initiating a query formation by presenting to a user ItemSelectors assigned to the first group and ItemSelectors assigned to the second group;
d) accepting a plurality of ItemSelectors chosen by the user including at least one ItemSelector assigned to the first group and at least one ItemSelector assigned to the second group;
e) forming a first parenthetical Boolean expression including one or more chosen ItemSelectors assigned to the first group that are joined to each other according to the Boolean property of the first group;
f) forming a second parenthetical Boolean expression including one or more chosen ItemSelectors assigned to the second group that are joined to each other according to the Boolean property of the second group; and
g) joining the first and second parenthetical Boolean expressions as a Boolean conjunction to create the Boolean expression identifying one or more data Items in the database;
wherein each ItemSelector presented to the user for selection has been previously associated with at least one data Item in the database.
17. The method of claim 16, wherein the Boolean property associated with the first group is different from the Boolean property associated with the second group.
18. The method of claim 16, wherein the Boolean property associated with the first group is disjunctive or exclusive-disjunctive.
19. The method of claim 16, wherein the Boolean property associated with each group is a member of a set of Boolean properties consisting of conjunctive, disjunctive, exclusive-disjunctive, and negative Boolean properties.
20. The method of claim 16, wherein step (c) comprises presenting to a user at least a third ItemSelector that is not assigned to the first group or to the second group, step (d) comprises accepting the third ItemSelector after it is chosen by the user, and step (g) comprises conjoining a Boolean expression containing the third ItemSelector to create the Boolean expression identifying one or more data Items in the database.
21. The method of claim 16, further comprising (h) creating a vocabulary of ItemSelectors sufficient to describe each data Item in the database.
22. A method of interactively creating a Boolean expression for identifying data Items in a database, the method comprising:
a) assigning a multiplicity of ItemSelectors among a plurality of ItemSelector groups including a first group and a second group;
b) implicitly associating each ItemSelector assigned to the first group with a Boolean property associated with the first group, and implicitly associating each ItemSelector assigned to the second group with a Boolean property associated with the second group;
c) presenting to a user ItemSelectors assigned to the first group and ItemSelectors assigned to the second group;
d) accepting from the user only a plurality of ItemSelectors chosen by the user from among ItemSelectors presented to the user, including at least one ItemSelector assigned to the first group and at least one ItemSelector assigned to the second group;
e) forming a first parenthetical Boolean expression including one or more chosen ItemSelectors assigned to the first group that are joined to each other according to the Boolean property of the first group;
f) forming a second parenthetical Boolean expression including one or more chosen ItemSelectors assigned to the second group that are joined to each other according to the Boolean property of the second group;
g) joining the first and second parenthetical Boolean expressions as a Boolean conjunction to create the Boolean expression identifying one or more data Items in the database; and
h) determining relevant ItemSelectors as those ItemSelectors in a predefined vocabulary of ItemSelectors which, when further combined with an existing set of ItemSelectors previously chosen by the user, will create a set of ItemSelectors that describe at least one data Item in the database.
23. The method of claim 22, wherein step (c) comprises restricting ItemSelectors presented to the user to those determined relevant according to step (h).
24. The method of claim 23, wherein step (c) comprises making available to the user all ItemSelectors of the predefined vocabulary that are relevant according to step (h).
25. The method of claim 16, further comprising (h) determining presently selected data Items as those data Items identified by the Boolean expression that is based upon an existing set of all ItemSelectors previously chosen and not withdrawn by the user.
26. The method of claim 25, further comprising (i) presenting data Items determined in step (h) to the user as selectable data Items.
27. The method of claim 25, further comprising (i) determining relevant ItemSelectors in each simple group, from which no ItemSelector with a disjunctive Boolean property has been chosen, as those that are related to at least one selected Item, and (j) determining relevant ItemSelectors in each complex group, from which at least one disjunctive ItemSelector has been chosen, as those that are related to at least one Item in a set of Items that is described by the existing set of ItemSelectors reduced by removing therefrom all ItemSelectors in the complex group.
28. The method of claim 27, wherein step (c) further comprises restricting ItemSelectors presented to be chosen by the user to those determined relevant in step (i).
29. The method of claim 28, wherein step (c) further comprises presenting all ItemSelectors that are members of a predefined vocabulary of ItemSelectors and are relevant according to step (i).
30. A computer program for implementing interactive procedures to aid a user searching for predefined data Items, the data Items existing in a database and being preassociated with one or more of a predefined vocabulary of ItemSelectors, the program configured to direct a computer system to perform operations comprising:
a) initiating controlled-vocabulary formation of a query by presenting graphically for selection by a user a collection of ItemSelectors from the vocabulary that each define at least one data Item in the database when combined with a set of ItemSelectors previously chosen by the user;
b) accepting an ItemSelector selected by the user from among the presented collection of ItemSelectors;
c) incorporating the ItemSelector selected in step (b) with the previously chosen set of ItemSelectors to establish a presently selected set of ItemSelectors;
d) forming a Boolean expression involving each ItemSelector in the presently selected set to describe data Items;
e) determining data Items of the database described by the Boolean expression of step (d); and
f) presenting, responsive to step (b), data Items determined in step (e) for selection by the user.
31. The program of claim 30, further configured to direct a computer system to perform operations comprising (g) creating an ItemSelector look-up table having a name and a unique identifier for each ItemSelector in the database.
32. The program of claim 30, further configured to direct a computer system to perform operations comprising (g) creating an Item lookup table having a name, a location specification, and a unique identifier for each Item in the database.
33. The program of claim 30, further configured to direct a computer system to perform operations comprising (g) storing associations between the ItemSelectors and the Items preassociated therewith in an array of ItemSelector Vectors that are configured to contain Item identifiers as components.
34. The program of claim 33, wherein the components are stored in each ItemSelector Vector as an ordered set.
35. The program of claim 33, wherein the array index of each ItemSelector Vector within the array is the identifier of a corresponding ItemSelector.
36. The program of claim 30, further configured to direct a computer system to perform operations comprising (g) storing associations between the ItemSelectors and the Items preassociated therewith in an array of ItemSelector Vectors that are configured to contain ItemSelector identifiers as components.
37. The program of claim 36, further configured to direct a computer system to perform operations comprising (h) storing associations between the ItemSelectors and the Items preassociated therewith in an array of ItemSelector Vectors that are configured to contain Item identifiers as components.
38. The program of claim 36, wherein the components are stored in each ItemSelector Vector as an ordered set.
39. The program of claim 36, wherein the array index of each ItemSelector Vector within the array is the identifier of a corresponding Item.
40. The program of claim 30, further configured to direct a computer system to perform operations comprising (g) storing associations between the ItemSelectors and the Items preassociated therewith in a binary matrix wherein non-zero elements identify the associations by their position within the binary matrix.
41. A method of identifying Data Items ("DIs") in a database on the basis of a Boolean combination of associated ItemSelectors ("ISs"), comprising:
a) establishing a plurality of ItemSelector ("IS") Groups that each impose Group-specific properties on all IS members of such Group, the properties including:
i) an IS-DI association property that defines a necessary relationship between an IS and content of an associated DI,
ii) an intra-Group Boolean property, and
iii) a pre-defined Group Title that limits a scope of semantic meaning of IS Group members, the plurality of Groups including different Groups having corresponding different intra-Group Boolean properties;
b) presenting to a user a plurality of ISs belonging to one or more of said plurality of IS Groups;
c) accepting a plurality of ISs chosen by the user from among the presented plurality of ISs; and
d) identifying one or more DIs associated with the chosen ISs according to the corresponding Group-specific IS-DI property, in a combination that also satisfies the corresponding Group-specific intra-Group Boolean property for ISs from a common Group, and an inter-Group Boolean property.
42. The method of claim 41, wherein the pre-defined Group Title of each IS Group limits the semantic meaning of all ISs in such IS Group irrespective of a literal meaning of any IS member of such IS Group.
43. The method of claim 41, wherein the established plurality of IS Groups further includes different IS Groups having corresponding different IS-DI association properties.
44. The method of claim 43, wherein the ISs accepted in step (c) include a plurality of ISs belonging to one of the plurality of IS Groups, and at least one IS belonging to a different one of the plurality of IS Groups, and wherein the one or more DIs identified in step (d) are associated with such portion of all IS Groups from which an IS has been chosen as required to satisfy the inter-Group Boolean property, and are associated with each such IS Group by virtue of an association between the one or more DIs and so many of the ISs chosen from such IS Group as required to satisfy the intra-Group Boolean property of such IS Group.
45. A method of interactively creating a Boolean expression for identifying DataItems ("DIs") in a database, the method comprising:
a) assigning each of a multiplicity of ItemSelectors to one of a plurality of ItemSelector ("IS") Groups, each IS member of each such IS Group implicitly including:
i) an intra-Group Boolean property corresponding to the IS Group,
ii) a Boolean IS-DI association property corresponding to the IS Group, and
iii) a contextual semantic meaning that is limited, irrespective of a literal semantic meaning of any IS member of such IS Group, in accordance with a pre-defined Title of the IS Group;
b) presenting, to a user, a plurality of ISs assigned to a plurality of such IS Groups;
c) accepting a plurality of ISs chosen by the user from the presented ISs as a chosen combination of ISs;
d) effectively generating a Boolean DI selection equation that reflects
i) the IS-DI association property corresponding to each of the chosen ISs,
ii) the intra-Group Boolean property corresponding to each of the chosen ISs belonging to an IS Group from which more than one IS has been chosen, and
iii) an inter-Group Boolean property corresponding to all IS Groups from which ISs have been chosen; and
e) identifying, as selected, one or more DIs that satisfy the Boolean DI selection equation effectively generated in step (d).
46. The method of claim 45, wherein the plurality of IS Groups includes different IS Groups having corresponding different intra-Group Boolean properties.
47. The method of claim 46, wherein the plurality of IS Groups includes different IS Groups having corresponding different IS-DI association properties.
48. The method of claim 45, wherein the plurality of IS Groups includes different IS Groups having corresponding different IS-DI association properties.
Description
FIELD OF THE INVENTION
This invention relates to the field of computers, more particularly to computer information storage and retrieval, and still more particularly to information organizational structures such as databases.
BACKGROUND
Data access is becoming increasingly important, as the extent of information sources that are available to computers increases with the exponential growth of networks, such as the Internet. Unfortunately, current database designs are inflexible and impose severe demands on user effort and computing power during unplanned queries.
Inflexibilities and high processing demands result from the current structure of known databases. Such structures generally seek to achieve quick access to records within the database by calculating the precise location of the record within the whole database. Inconvenient structural limitations are often imposed to facilitate this common database goal. For example, each record may be required to be the same size. This limitation may be avoided by using pointers, but a pointer structure requires user foresight and decisions at the outset, if database restructuring is to be avoided.
A fixed record size requirement only assures quick access when the record number is known. To have quick access when searching on field values, indexing needs to be performed linking those values with the record ID. In a typical database many index tables are needed. Maintenance of such tables requires an update of all of them whenever anything requires a change in the record identifiers--which in practice happens too often.
Numerous legacy databases need to be integrated with newer database systems. Normally this is done by converting them all to a single, modern relational database. This is an extremely difficult and time-consuming task under present systems, requiring a great deal of work to reconcile the different legacy structures into one new structure. Such integrations often incur extremely large costs, take a very long time, disrupt business, and yet produce only partly satisfactory outcomes.
Accordingly, there is a need for a method and system that facilitates queries for data from data sources. Because of the wide range of different organizational structures for the data sources that are available to many computers, it is desirable that improved data access be capable of operation across a range of computing platforms and organizational structures.
SUMMARY
In response to the needs identified above, a new approach is described herein that is based on a universal data structure, and is developed and applied to structured databases. Some foundations for this approach may be found in U.S. Pat. No. 5,544,360 (Lewak et al.). This approach uses a generalized Vocabulary of Identifiers (called ItemSelectors) to describe each data fragment (called an Item). It will be referred to as software Technology for Information Engineering™, or TIE, and is applicable to most or all information systems. TIE databases eliminate inflexibilities associated with current databases, and reduce processing demands. They allow virtually any number, and any organization, of fields for each record. Moreover, they significantly enhance the effective speed of query responses.
TIE databases typically provide an intuitive Guided Information Access (GIA) interface to the user that is based upon Vocabulary terms. As the user selects presented Vocabulary terms, the portion of the Vocabulary that is presented thereafter may be constrained, dynamically and in real time, by such previous selection, such that only ItemSelectors that will yield viable (non-null) results remain available to be selected. Such dynamic constraints are difficult or impossible to achieve in known technologies.
Associations resulting from choosing ItemSelectors are immediately apparent to the user, are easy to implement and edit, and facilitate search queries. Associations between the Identifiers and the individual data Items (which may be, for example, Records or Linked Records) are organized in a binary matrix that facilitates quick access. With such organization, a substantial change in the relationship between fields (or Items), even a disruptive change, typically requires a change of data within just one universal data structure, generally implemented in TIE systems as a Universal Matrix Structure (UMS).
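The binary matrix organization can be illustrated with a minimal sketch. It assumes one row per ItemSelector, with each row held as a Python integer whose bit i is 1 when Item i is associated with that ItemSelector. The Item names, ItemSelector names, and function names below are hypothetical and are not taken from any TIE implementation.

```python
# Hypothetical Items and ItemSelectors; the names are illustrative only.
ITEMS = ["invoice-17", "invoice-18", "memo-3", "photo-9"]
SELECTORS = ["invoice", "memo", "2001", "urgent"]

# One row per ItemSelector. Bit i of a row is 1 when Item i is associated
# with that ItemSelector, so the rows together form a binary matrix.
matrix = [0] * len(SELECTORS)


def associate(selector: str, item: str) -> None:
    """Set the matrix element linking an ItemSelector to an Item."""
    row = SELECTORS.index(selector)
    matrix[row] |= 1 << ITEMS.index(item)


def dissociate(selector: str, item: str) -> None:
    """Clear the matrix element linking an ItemSelector to an Item."""
    row = SELECTORS.index(selector)
    matrix[row] &= ~(1 << ITEMS.index(item))


def items_for(selectors: list[str]) -> list[str]:
    """Return Items associated with every listed ItemSelector (AND of rows)."""
    bits = (1 << len(ITEMS)) - 1  # start with all Items selected
    for name in selectors:
        bits &= matrix[SELECTORS.index(name)]
    return [item for i, item in enumerate(ITEMS) if (bits >> i) & 1]


associate("invoice", "invoice-17")
associate("invoice", "invoice-18")
associate("2001", "invoice-17")
associate("2001", "memo-3")
associate("memo", "memo-3")
associate("urgent", "invoice-18")
```

In this sketch, changing how an Item is described touches only the one matrix: a call to `associate` or `dissociate` flips a single element, and no separate index tables need to be rebuilt.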
The Items in a TIE database may be referenced through a path, URL, or any other suitable identifier. The references themselves may be hidden to avoid confusion. The actual data may be located anywhere that can be accessed by a computing system employing TIE, sometimes even across a Wide Area Network such as the Internet. Such flexible referencing techniques, particularly in combination with a universal structure described further below, facilitate an easy, non-invasive integration of disparate legacy databases. The TIE system permits conversion of legacy databases into a new database structure in an intuitive manner that need not disrupt the legacy system, which can continue to be used in parallel.
BRIEF DESCRIPTION OF THE FIGURES
FIG. 1 is a block diagram showing typical information flow in a TIE system.
FIG. 2 illustrates derivation of a Boolean expression from ItemSelectors in groups.
FIG. 3 illustrates Boolean derivation for ItemSelectors differing from those of FIG. 2.
FIG. 4 represents an initial condition for an interactive GUI during a user search.
FIG. 5 represents a modified condition of the GUI during the user search of FIG. 4.
FIG. 6 represents a GUI as further modified during the user search of FIG. 4.
FIG. 7 shows a matrix providing associations between Items and ItemSelectors in a bitmap.
FIG. 8 is a graph illustrating element estimation using straight line interpolation.
DETAILED DESCRIPTION
Introduction
Extremely flexible databases can be achieved by employing a universal matrix structure ("UMS"). For background on such structures, see U.S. Pat. No. 5,544,360 (Lewak et al.) ("the '360 patent").
In TIE (Technology for Information Engineering™) system databases described herein, each element of information is called an Item, and each Item has its own unique identifier (typically an ID number). Each Item may be described using a set of one or more descriptors (ItemSelectors), each of which represents an attribute of the Item. Some combination of meaningful ItemSelectors (which may be key words, phrases, or other descriptors, each uniquely identified within the system), will suffice to distinguish a particular Item within the constellation of Items available in a database. Such meaningful descriptive key words or phrases may therefore be used to select an Item. The meaningful descriptive key words or phrases will be referred to as "ItemSelectors." (Note that in the Provisional Application upon which this application is based, these key words or phrases were referred to as "Categories." The terminology is substantially arbitrary, and, though different, is internally consistent within each document.) As with any search, a set of ItemSelectors will typically describe a first set of Items consistent therewith. One or more additional ItemSelectors may be needed to uniquely describe a single Item from the first set of Items. Conversely, Items may be described as belonging to, or associated with, one or more ItemSelectors.
Because as many ItemSelectors as needed may be associated with each Item, relationships between Items may be as complicated as will be helpful. In a TIE database, associations between records, and between fields within such records, need not be restricted to a fixed hierarchy such as is imposed by known legacy databases, but may have much greater flexibility due to association via a virtually unlimited number of ItemSelectors. The relationships between Items may thus mimic those that naturally form in the mind of the user, through associations conveyed by meanings of the name given to each ItemSelector associated with such Items.
One TIE database described herein uses a single universal table, referred to as a Matrix because of its theoretical (and, in some implementations, physical) structure. This Matrix may be maintained in readily accessible memory for quick access. A two-matrix alternative TIE database approach is also described, which may permit increased response speed under some circumstances.
The Matrix holds all associations between Items and ItemSelectors. Changes in the ItemSelectors, or in the relationships between the ItemSelectors and information Items, require the update of just this one universal matrix and so are relatively easy and quick to achieve. Each TIE database is characterized by a Vocabulary of ItemSelectors that are sufficient to describe each information Item in the database. The Vocabulary is typically structured into Groups of ItemSelectors, and sometimes into Subgroups.
The use of ItemSelectors as Language-Based Identifiers (or descriptors) of each field and record, along with an appropriate software implementation, reduces processing demands while making the database organization extremely flexible. A TIE database may contain any number of Items, and may effectively permit a user to select virtually any organization of "fields" for each "record." Moreover, query responses may be almost instantaneous. A TIE database typically employs a GUI that allows users to both view and interrogate the data intuitively, by selection ("point and click") of descriptors (ItemSelectors) that are presented. The Associations resulting from the use of such descriptors are immediately apparent to the user, and yet permit the software underlying the organization to be simple and fast.
Many other advantages result from the TIE approach. In particular, it is easy to combine legacy databases across any number of platforms and any number of different data types, into one uniform, intuitive interface, without the need to disturb the current legacy databases. The only decisions that need to be made when merging databases involve the Vocabulary of ItemSelectors and their properties. Such decisions are orders of magnitude easier than the complicated structure decisions required when current databases, each with its own structure or data model, must be merged into a single new structure or data model.
A TIE user interface is preferably uniform, and typically may be customized. The user interface generally allows users to view portions or representations of the available data, by displaying the structured Vocabulary (of ItemSelectors, which are descriptors/identifiers) for such data, even before initiating any actual search. Thus, the interface permits users to search through the data interactively, generally by adding an ItemSelector (descriptor or identifier) to, or removing one from, the present search query. After each such modification of a search query, the TIE interface may incrementally adjust both the data Items that are available in view of the modified query and the further ItemSelectors (descriptors) that are available to further narrow the query. Such incremental adjustment may indicate to the user the new scope of available data, without a need to actually retrieve the data specified by the search query. By thus incrementally indicating the scope of data specified to the current point, a TIE interface may guide a user through to the completion of each search. Moreover, the interface may prevent the user from selecting combinations of descriptors (ItemSelectors) that lead to a null set of data Items, by presenting to the user only that subset of the ItemSelector Vocabulary which, when added to the present query, will still identify at least one data Item. Consequently, no actual search need ever encounter zero hits, because an absence of data may be seen before the search is even performed, which eliminates the frustration and wasted time of "dead end" searches.
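The incremental adjustment described above can be sketched in a few lines. This is a simplified illustration that assumes every chosen ItemSelector is joined by plain conjunction. Groups with disjunctive properties, which the claims also contemplate, are not modeled. The association data and function names are hypothetical.

```python
# Hypothetical associations: each ItemSelector maps to the set of Item IDs
# it describes. The data and names are illustrative only.
ASSOCIATIONS = {
    "contract": {1, 2, 3},
    "letter": {4, 5},
    "2001": {1, 4},
    "2002": {2, 3, 5},
    "signed": {1, 2},
}
ALL_ITEMS = set().union(*ASSOCIATIONS.values())


def selected_items(chosen: set[str]) -> set[int]:
    """Items described by every chosen ItemSelector (simple conjunction)."""
    items = set(ALL_ITEMS)
    for selector in chosen:
        items &= ASSOCIATIONS[selector]
    return items


def relevant_selectors(chosen: set[str]) -> set[str]:
    """Unchosen ItemSelectors that would still leave at least one Item."""
    current = selected_items(chosen)
    return {
        selector
        for selector, items in ASSOCIATIONS.items()
        if selector not in chosen and current & items
    }
```

With only "contract" chosen, this sketch offers "2001", "2002", and "signed" but withholds "letter", because no Item is both a contract and a letter. Adding "2001" leaves only "signed" on offer, and withdrawing "contract" widens the offered set again. At no point can the user assemble a query that matches nothing.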
Integration of existing databases using TIE requires only a relatively easy choice of an ItemSelector Vocabulary. A user may define the ItemSelectors (and their properties) that are associated with data Items, thereby establishing ItemSelector relationships to data Items as the data is entered. To integrate two different TIE databases, the user may choose a starting Vocabulary (of ItemSelectors) that is simply the union of the individual Vocabularies for each database, accounting for synonyms. Such a selection of Vocabulary requires virtually no decisions at all. However, a more optimized Vocabulary is recommended, and can readily be developed, which could reduce the number of ItemSelectors in the Vocabulary. All of the associations between Items and ItemSelectors are established by the final, united Vocabulary.
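As a rough illustration of the starting Vocabulary for two merged databases, the following sketch takes the union of two sets of ItemSelectors after mapping equivalent terms to one canonical name. The example terms and the `synonyms` mapping are assumptions made for illustration only, and "synonym" is used here in its ordinary sense of an equivalent term.

```python
def merge_vocabularies(vocab_a, vocab_b, synonyms):
    """Union two Vocabularies, mapping synonyms to one canonical ItemSelector."""
    return {synonyms.get(s, s) for s in vocab_a | vocab_b}

vocab_a = {"Customer", "Zip", "City"}
vocab_b = {"Client", "Postal Code", "City", "State"}
# Hypothetical mapping from a term in one database to its equivalent in the other.
synonyms = {"Client": "Customer", "Postal Code": "Zip"}

merged = merge_vocabularies(vocab_a, vocab_b, synonyms)
print(sorted(merged))
```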
FIG. 1 is a block diagram illustrating information flow in a typical TIE system. Interaction with the user takes place at a graphical user interface 102, with the user choosing ItemSelectors from those offered by the system to describe information that is sought. The selections of ItemSelectors and/or Items entered by the user are passed on to a Boolean expression generator 104, where a Boolean search expression is created from the entered information. This important step is described subsequently in much more detail. The Boolean search expression may be passed to a query engine 106 (which may, of course, be part of the same computing hardware as item 104). The query engine may access data Item information from any number of different locations, represented here by just two: Storage A 108 and Storage B 110. The query engine accesses relationship data, such as association tables in storage 112, which information may be organized in a TIE system as a Universal Matrix System.
Definitions and Usage
Database users have evolved a language specific to database tasks. In order to describe the TIE system, it is necessary to extend this language. The following is a glossary of terms relevant to TIE systems. Some definitions explain methods used within TIE, and thus provide a description of some TIE procedures.
Some of these definitions relate to current, structured databases, while others relate to the TIE database and to unstructured databases.
Item: Information Items are the elementary data objects stored in a database. Users may choose to define Items in different ways, according to their previous experience and needs. Thus, the user of a particular TIE database may treat a traditional "Field" as an Item, or may treat a collection of Field Components (i.e., Subfields), or Records, or any other identifiable data entity, as an Item by simply providing the appropriate ItemSelectors (descriptors) related to such data entity, thereby permitting it to be accessed by the system. In general, Items may constitute any type of data, such as Text, Graphics, Sound Recordings, Movies, and so on. Users may define, and then later redefine, what data entities constitute an Item. Thus, when converting an existing database to the TIE system it may be convenient to first define a record or a row of the existing database as an Item, and then to change the initial definition upon determining a more convenient linking of records, to form more appropriate or useful Items. Quite often Items are concurrently defined in a plurality of different ways, which is to say that what constitutes an Item may be easily changed. Such flexibility flows naturally from the TIE system.
Derived Item: These are special Items that are not contained within the database, but are derived from the information contained within the database. For example, in a Police Department's Overtime Database, records of work hours and pay are kept for each individual and each occasion. When converting from a traditional structured database, it may be convenient to derive new fields having totals of both the pay and the hours, for each person, for each department subdivision, for each kind of activity etc. In the TIE database, such totals may be added as explicit new data Items, or may alternatively be made available indirectly as Derived Items by simply defining the treatment of explicit Items that will produce each Derived Item. For example, graphical plots and associated tables of total spending for each kind of activity and each department subdivision may be Derived Items that are produced as needed from underlying data Items, rather than being maintained within the database as explicit Items (which, of course, require storage space). Such Derived Items may be defined when converting to a TIE system, or, more flexibly, may be created upon user request. Such Derived Items are accessed using the overall TIE database Vocabulary, which accordingly must be amended to include any terms needed to define the desired Derived Items.
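One possible way to produce a Derived Item on request, rather than storing it, is sketched below for the overtime example. The record layout, the sample values, and the function name `derived_totals` are hypothetical.

```python
from collections import defaultdict

# Hypothetical explicit Items: one overtime record per person and occasion.
RECORDS = [
    {"person": "Adams", "division": "Patrol", "hours": 4.0, "pay": 120.0},
    {"person": "Adams", "division": "Patrol", "hours": 2.5, "pay": 75.0},
    {"person": "Baker", "division": "Traffic", "hours": 3.0, "pay": 96.0},
]

def derived_totals(records, key):
    """Compute total hours and pay per value of `key`, on request."""
    totals = defaultdict(lambda: {"hours": 0.0, "pay": 0.0})
    for rec in records:
        totals[rec[key]]["hours"] += rec["hours"]
        totals[rec[key]]["pay"] += rec["pay"]
    return dict(totals)

by_person = derived_totals(RECORDS, "person")
by_division = derived_totals(RECORDS, "division")
print(by_person["Adams"])
print(by_division["Traffic"])
```

Because the totals are computed from the explicit Items each time, they require no storage space of their own and always reflect the current data.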
Field: This term belongs to the terminology of previous databases, and is used somewhat loosely in the context of TIE databases. A Field is generally the smallest fragment of information having a separate meaning within a database, but different database organizations will generally have different Field definitions. "Address" records in one database, for example, may be defined to contain a field "street address" that includes a street number and street name, and such "street address" information would not be subdivided into further fields in such database. However, another database may define separate "Street Name" and "Street Number" fields within the "Address" record.
Subfield: This term again is appropriate to previous databases, and is loosely used with respect to TIE databases. A Subfield is not a separate entity within a database, but refers to a portion of a Field. For example, if a Field "Address" contains both street name and number, then the street name and the street number may each be considered a subfield of the "Address" Field. While subfields are not formally maintained as separate information fragments within a database, it may be a simple matter to either enter such subfields separately, or to separate the information from a particular field into subfields. With a TIE database, the distinction between Subfields and Fields is rarely significant, as either may be defined by the user as an Item for direct access.
ItemSelector: A TIE ItemSelector is simply a descriptor, or identifier, of information. Words, phrases, letters and numbers may all be used to specify a particular ItemSelector. A single letter or number may be an ItemSelector, as may "Sick and Vacation Time." Like a name, an ItemSelector may be indicated by any unique (within a Group, see below) combination of symbols. Though the symbols are typically simple alphanumerics and spaces, they may be mathematical expressions, symbols associated with chemical expressions, or icons, or graphics or pictures of any sort. Also like a name, a particular ItemSelector may refer to a single entity (e.g., Frederic B. Remington, Exxon Corporation), or may encompass many entities (e.g., Fred, Corporation). Due to this broad usage within TIE systems, it is useful to further define many different kinds of ItemSelectors. A partial list follows:
AlphaSelector: individual letter or number values (a special case of SingularSelector, below). For example, "House Number Digit 1" and "House Number Digit 2" are names of ItemSelector Groups. The individual digits 0-9 are ItemSelectors that belong to such group, and are AlphaSelectors because they are single alphanumeric characters. Thus, when a user is searching and selects the AlphaSelector "3" from the Group "House Number Digit 1" and the AlphaSelector "4" from the Group "House Number Digit 2," the Vocabulary choices thereafter presented will typically be limited to the available AlphaSelectors for any as-yet unspecified position Group. If other information that has already been selected in a search process (e.g., the street name) narrows the possible range of "House Number Digit 1," then it is possible that only one or two such AlphaSelectors will then be available for selection by the user. On a short street, for example, all of the house numbers may begin with either 7 or 8, and thus only the AlphaSelectors "7" and "8" will be presented to the user as selectable Vocabulary choices (within the Group "House Number Digit 1") after such street name has been selected.
RangeSelector: (or ValueRangeSelector) a range of values sharing a common descriptor (which is the RangeSelector). For example, "180-185 lbs." is a RangeSelector that describes all weight values between 180 and 185 lbs.
ImpreciseSelector: a descriptor that is not precise, and thus conveys some potentially ambiguous scope of equivalents. Colors are good examples of this type of ItemSelector; for example, "Brown" is an ImpreciseSelector that generally encompasses light brown, dark brown, brunette, etc.
SingularSelector: Some ItemSelectors (descriptors) identify just a single value. For example, phone numbers may be split up into "area code," "prefix," and "last four." "Area codes" can only take on certain values (specifically, between 200 and 999). Each value of an area code, such as "601" or "503," is a SingularSelector.
FieldSelector (Field ItemSelector): Terms used to describe Fields, such as Billing Address, Shipping Address, and Costs. A FieldSelector is an ItemSelector (and thus a descriptor) of a Group of ItemSelectors that have a logical association with each other. For example, an "Area Code" is a descriptor (ItemSelector) of an entity that is often considered a "Field." Because it describes a Field, "Area Code" is a FieldSelector.
SubfieldSelector: This is a descriptor of a subfield. Last Name, First Name, Street Name, and Number may all be SubfieldSelectors for a Field such as "Mailing Address" that encompasses all of this information (or more).
GroupSelector: a descriptor or identifier (ItemSelector) of a Group of ItemSelectors that are, perhaps arbitrarily, included in such group; see Group, below.
WildSelector: a class of ItemSelectors, specifically a descriptor of a data value that is position independent. It is most commonly used with AlphaSelectors, such as "LicensePlateCharacter." However, it can also be a descriptor of a DNA sequence of a number of amino acids, and within a gene it may be searched for irrespective of position.
PositionDependentSelector (PD ItemSelector): describes any class of ItemSelectors that happen to be position dependent, such as "LicensePlateCharacter1" or "Area Code" (which, of course, is the first three digits of a phone number). Both Wild and PD AlphaSelectors are useful, for example, in a Police Department crime database that includes data on license plates. There, Wild AlphaSelectors may classify each license plate by all of its character components, independently of character position within the license plate sequence. In contrast, there may be a separate set of PD AlphaSelectors that apply to each character position within a license plate. When searching for a partially known license plate, selections of characters whose position is known may be made from PD AlphaSelectors, while Wild AlphaSelectors may be used for characters whose position is unknown.
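The combined use of PD and Wild AlphaSelectors for a partially known license plate may be sketched as follows. The plate values and the function names `wild_selectors`, `pd_selectors`, and `search` are hypothetical, and all selections are treated as Conjunctive for simplicity.

```python
def wild_selectors(plate):
    """Position-independent AlphaSelectors: the set of characters in the plate."""
    return set(plate)

def pd_selectors(plate):
    """Position-dependent AlphaSelectors: (position, character) pairs, 1-based."""
    return {(i, ch) for i, ch in enumerate(plate, start=1)}

def search(plates, known_positions, known_anywhere):
    """Find plates matching characters at known positions and characters anywhere."""
    return [
        p for p in plates
        if known_positions <= pd_selectors(p) and known_anywhere <= wild_selectors(p)
    ]

PLATES = ["ABC123", "AXC921", "BAC321"]  # hypothetical data
# Witness recalls "A" first, "C" third, and a "9" somewhere on the plate.
hits = search(PLATES, known_positions={(1, "A"), (3, "C")}, known_anywhere={"9"})
print(hits)
```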
Group: In practice, ItemSelectors are usually organized into logical Groups of ItemSelectors for easier access by a user. Upon user selection, Group membership defines the query Boolean that is used internally. ItemSelectors need only be unique within a Group; that is, a particular Group may contain an ItemSelector that has the same name as a different ItemSelector in a different Group. For example, a Group "Licensed Drivers" may contain an ItemSelector "Hair color," but it would not be the same as an identically-named ItemSelector "Hair color" in a Group "Registered Owners." Thus, membership in a Group effectively distinguishes an ItemSelector from a same-named ItemSelector that is in another Group, or in no Group. This hierarchical structure within the organization of the Vocabulary will be familiar to most users of computers, due to its similarity to typical directory organization. Indeed, although most TIE databases need zero, one or two levels of such hierarchy within the Vocabulary, TIE system software generally may readily be extended to utilize any number of hierarchical levels as may suit the needs for a particular database Vocabulary. No hierarchy is typically required in the organization of associations between Items and ItemSelectors.
Vocabulary: This is simply the union of all ItemSelectors, and forms the entire scope of descriptors (ItemSelectors) that may be selected by a user to locate or describe each and every Item in a TIE database. The available Vocabulary is intuitively similar to words that may be used to describe a desired Item. A TIE Vocabulary is a limited set of descriptors (ItemSelectors) that is sufficient to describe all Items within a TIE database. During a search, a user initially may select any ItemSelector from the entire TIE database Vocabulary, and thereafter the TIE interface typically presents only that subset of the Vocabulary of ItemSelectors which, if any one is selected to make a further combination with those ItemSelectors already chosen, describes at least one data Item.
Boolean "Conjunctive" ItemSelectors are those that a TIE system treats as if they are invoked with a Boolean "AND" between such descriptors (ItemSelectors that have the Conjunctive attribute). Accordingly, Items so described must contain the attributes of all of the Conjunctive ItemSelectors chosen. A plurality of Conjunctive ItemSelectors may be assigned to an Item, so that they act in an overlapping fashion to identify the Item. For example, the ItemSelectors "Software," "Development," "Tools," "C++" may be overlapped or conjoined to describe a set of Items, and thus they may each be Conjunctive ItemSelectors.
Boolean "Disjunctive" ItemSelectors are those which, when selected by a user in the process of describing an Item, are treated by a TIE system as having an "OR" disjunction between them. ItemSelectors are often divided into several separate Disjunctive Groups. When a plurality of ItemSelectors is chosen from within a Disjunctive Group, they are combined with the "OR" disjunction between themselves. The resulting disjunctive combination of ItemSelectors from such Disjunctive Group, however, is "parenthesized" and combined, using the conjunctive "AND," with selected Conjunctive ItemSelectors and with any other parenthetical expressions of ItemSelectors, in accordance with Boolean logic rules (e.g., distribution of the "AND" operation that is external to a parenthetical expression over those ItemSelectors that are "OR'd" within such parenthetical expression).
ItemSelectors that would not normally be assigned in plurality to any Item (that is, would normally be assigned only one at a time) are good candidates for a Disjunctive Group. Consider a database of events that is catalogued according to the particular date and time at which they begin. Various date-related ItemSelector Groups (such as Year, Month, Day, and Day-of-Week ItemSelector Groups) are disjunctive because an event cannot begin at two different times or dates.
Boolean "Bijunctive" ItemSelectors are those that are used in both Conjunctive and Disjunctive contexts. For example, when considering or searching on towns in the US, the ItemSelector group "State" (in which each town is located) is a disjunctive ItemSelector because each town is located in only one state. However, when considering or searching on other geographical features (that overlap states), such as lakes, national parks, rivers, etc., the ItemSelector group "State" may need to be conjunctive. Thus, the same Group is sometimes conjunctive (e.g., when searching for rivers) and disjunctive (e.g., when searching for towns). One way to manage the bijunctive nature of such a Group is to start with disjunctive search rules, and then to automatically switch to conjunctive search rules when the user chooses any ItemSelector indicating Items that are described by more than one ItemSelector within the Group. Another way is to have two parallel Groups of ItemSelectors: "States for Towns" and "States for Lakes," in this example.
Boolean "Exclusive, Disjunctive" ("ED") ItemSelectors are treated by a TIE system as connected by a Boolean exclusive "OR" or "XOR" operator. Groups of ItemSelectors that share this property are very useful in minimizing the number of disjunctive ItemSelectors in a Boolean query when ranges of values are selected by the user. For example, in a database regarding persons, it is often useful to have an age Group of ItemSelectors in which each age is represented in years. A user searching for someone between 30 and 40 could select each of the Disjunctive ItemSelectors 30, 31, 32, . . . 40. However, each ValueSelector, such as "30," may be interpreted (particularly if more than one is chosen) to indicate an age of up to 30 years, that is, an age of 0 to 30 years old. If, moreover, the ItemSelectors in "Age" are all ED ItemSelectors, then simply selecting "30" and "40" defines the range between these two (because that is the "XOR" of the defined ranges). Thus, "ED" properties may reduce the selection actions from eleven separate "clicks" to just two.
Negative ItemSelectors: Sometimes it is convenient to invoke a Boolean negative of certain ItemSelectors. For example, in a database of people where the race of each person is stored, it may be necessary to search for non-Europeans. If "European" is an ItemSelector, using its negation would serve the purpose. This could, of course, be implemented by adding an ItemSelector "Non-European" whose synonyms are all the non-European ItemSelectors, but may be more conveniently implemented by using a modifier key (such as the Control key) while clicking on the ItemSelector to indicate that a negative or inverse of the ItemSelector is being selected.
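A negated selection can be pictured as the set complement of the Items that the ItemSelector describes. The sketch below assumes a small in-memory table; the table `PEOPLE` and the `negate` flag (standing in for the modifier key) are hypothetical.

```python
# Hypothetical association table: Item id -> ItemSelectors assigned to it.
PEOPLE = {
    "p1": {"European"},
    "p2": {"Asian"},
    "p3": {"African"},
}

def select(selector, negate=False):
    """Return Items described by `selector`, or by its Boolean negation."""
    hits = {item for item, sels in PEOPLE.items() if selector in sels}
    # Negation is the complement with respect to all Items in the table.
    return set(PEOPLE) - hits if negate else hits

print(sorted(select("European")))
print(sorted(select("European", negate=True)))
```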
ItemSelector Group Properties: Each Group of ItemSelectors has a Boolean property that is associated with each ItemSelector in the Group. When a user selects an ItemSelector, a query Boolean is automatically created by the TIE software which then executes an Item search based on this Boolean query and evaluates the Item hits and the remaining Available ItemSelectors. In creating this Boolean query, the TIE software takes its cue from the Boolean property of the group to which the selected ItemSelector belongs. Exemplary Boolean properties are described below, but are best understood in the context of TIE system functions, which are set forth in a functional description that follows these definitions.
Conjunctive/Disjunctive/ED Decisions: The decision as to which ItemSelectors to treat as Conjunctive and which as Disjunctive is a matter of choice and meaning, based on the specific type of data and the types of searches required through the data. Guidance for handling these decisions in a TIE system is provided below.
A TIE system may assign the ED (Exclusive-Disjunctive) property to appropriate ItemSelectors, and may present them graphically to a user with instructions to select endpoints of a range. As one alternative, a TIE system may present (or permit to be entered) values for certain Groups of ItemSelectors, such as "Age." The system may then interpret a first selected ValueSelector as ED with the immediately succeeding ValueSelector in the Group (effectively treating it as identifying a unique ValueSelector), but, upon selection by the user of a second ValueSelector, treat the two as ED with each other.
A Bijunctive Group may be expanded into two separate groups: one treated Conjunctively and one Disjunctively, each displayed so as to make the treatment clear. ItemSelectors that are never assigned together to the same Item are always Disjunctive, because if used Conjunctively, they would find zero Items.
ValueRangeSelectors, when users may need varying ranges, may be presented as Exclusive Disjunctive, so that any range can be selected by choosing the two boundary ranges. Below is a further example of an appropriate use of Exclusive Disjunctive (ED) properties with ValueRangeSelectors:
Suppose Items are described with the following ValueRangeSelectors:
$0->$10 $10.01->$20 $20.01->$30
Such ItemSelectors could appropriately be designated as "Disjunctive," whereupon each range could be selected individually, or ranges could be combined to create broader ranges. Thus, if the range $0-$30 was desired, all three ItemSelectors could be chosen.
Now suppose instead, the same data was described by the following, alternative ItemSelectors:
$0->$10 $0->$20 $0->$30
and these were all designated as ED ItemSelectors (for example, by attaching ED attribute to the entire Group of ItemSelectors). With this change, any contiguous range may be chosen by selecting one range, or by combining just two ItemSelectors. Combining the first and the last yields the range $10->$30.
A better way to present such an Exclusive disjunctive Group would be:
$0 $10 $20 $30
with instructions that a user pick the two range boundaries.
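The effect of the ED property on cumulative ranges can be sketched with set-level exclusive OR (symmetric difference). The price data and the function names `up_to` and `ed_select` are hypothetical; each boundary value is interpreted as the cumulative ItemSelector "$0->$bound" described above.

```python
# Hypothetical Items with prices in dollars.
PRICES = {"a": 5.00, "b": 15.00, "c": 25.00, "d": 30.00}

def up_to(bound):
    """Items described by the cumulative ItemSelector "$0->$bound"."""
    return {item for item, price in PRICES.items() if 0 <= price <= bound}

def ed_select(*bounds):
    """Combine the chosen cumulative ItemSelectors with exclusive OR."""
    result = set()
    for bound in bounds:
        result ^= up_to(bound)  # symmetric difference is set-level XOR
    return result

print(sorted(ed_select(30)))      # one selection: the whole $0->$30 range
print(sorted(ed_select(10, 30)))  # two boundaries: above $10, up to $30
```

Selecting the boundaries 10 and 30 excludes Items priced at $10 or less, because those Items are described by both cumulative ItemSelectors and so cancel under XOR.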
ItemSelector Groups and Group Properties: When designing the GUI, the various types of ItemSelectors are usually logically grouped into ItemSelector Groups. It has not been found convenient to combine ItemSelectors having different Boolean properties within the same group. Therefore, ItemSelector groups are typically divided into (Boolean) Disjunctive, Conjunctive, Exclusive Disjunctive, and sometimes Negative types. It is convenient to refer to each group by name (i.e., the GroupSelector for such Group), and to describe attributes of each group by a property called Kind. "Kind" itself is a name (GroupSelector) for a Group of ItemSelectors that determines the rules by which ItemSelectors are assigned to Items.
FIGS. 2 and 3 illustrate rules that may be used for creating a Boolean expression from ItemSelectors chosen from different groups having different Boolean properties associated therewith. FIG. 2 indicates that a presently chosen set of ItemSelectors 202 includes a pair 204 of ItemSelectors C1 and C2 that are from a Conjunctive group (or are otherwise associated with the conjunctive Boolean property). A pair 206 of Disjunctive ItemSelectors D1 and D2 are also in the chosen set, as are a pair 208 of ItemSelectors E1 and E2 that are associated with the Boolean Exclusive-Disjunctive property (at least as to each other). The fact that pairs of such items are shown is merely for convenience; any number may be selected. The resulting Boolean Expression is created by first relating chosen ItemSelectors having the same Boolean property with respect to each other (such as D1 and D2, or E1 and E2) according to such Boolean property, within a parenthetical expression. In this case the result is parenthetical expressions (D1+D2) and (E1-E2), where "+" indicates "OR," "-" indicates "XOR," and "*" indicates "AND." The resulting parenthetical expressions are then conjunctively combined with each other, generally irrespective of the Boolean property associated with the ItemSelectors. Due to the nature of Boolean logic, it does not matter if C1 and C2 are originally parenthesized or not, because they in any event are eventually related to the rest of the overall Boolean search expression conjunctively.
FIG. 3 illustrates a slightly different situation than FIG. 2. The same Conjunctive pair C1 and C2 (304) is present, but also two different disjunctive pairs, 1st Disjunctive ItemSelectors 306 and 2nd Disjunctive ItemSelectors 308. These different pairs are presumably from different disjunctive groups; in any event, they are disjunctive only as to the other member of the pair (or larger group). Accordingly, the parenthetical expressions that result include (1D1+1D2), as well as (2D1+2D2). As in FIG. 2, the resulting parenthetical expressions are conjunctively associated with all other parts of the Boolean search expression, and again it is not necessary to actually put C1 and C2 within a parenthesis, because they will be related conjunctively with or without such parenthetical.
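The rules of FIGS. 2 and 3 may be sketched as a small expression builder. This is an illustrative reading of the rules, not the Boolean expression generator 104 itself; the function name `build_expression` and the property labels are hypothetical, while the operator symbols follow the text ("+" for OR, "-" for XOR, "*" for AND).

```python
def build_expression(chosen):
    """Build a Boolean search string from chosen ItemSelectors.

    `chosen` is a list of (boolean_property, [selectors]) pairs, one per Group.
    """
    inner_op = {"conjunctive": "*", "disjunctive": "+", "exclusive": "-"}
    parts = []
    for prop, selectors in chosen:
        if not selectors:
            continue
        joined = inner_op[prop].join(selectors)
        # Conjunctive terms need no parentheses; other multi-term groups do.
        if prop != "conjunctive" and len(selectors) > 1:
            joined = f"({joined})"
        parts.append(joined)
    # All group-level expressions are related to each other conjunctively.
    return "*".join(parts)

fig2 = build_expression([
    ("conjunctive", ["C1", "C2"]),
    ("disjunctive", ["D1", "D2"]),
    ("exclusive", ["E1", "E2"]),
])
fig3 = build_expression([
    ("conjunctive", ["C1", "C2"]),
    ("disjunctive", ["1D1", "1D2"]),
    ("disjunctive", ["2D1", "2D2"]),
])
print(fig2)
print(fig3)
```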
ItemSelectors may be assigned to Items in diverse, selectable ways. The desirable rules will generally be selected depending upon the nature of the data contained within the associated Items. For example, consider a Group of ItemSelectors broadly described by the term (GroupSelector) "Address." Subgroups of "Address" may be identified by the following ItemSelectors:
House Number; Street Name; Street Type; City; Zip; State
Exact Kind Groups. Consider the House Number. A list of all the house numbers in the database may be designated as the House Number group. An appropriate Kind designation for that group would be Exact, because the assignment of each ItemSelector from the list of house numbers would occur only if the ItemSelector matched exactly the data in the House Number field.
Alpha Wild Groups. Presenting a list of every possible house number for the user to choose from is usually too cumbersome. So an easy alternative, though very much less precise, is to list a single column of digits from 0 through 9, each of which is an ItemSelector, and is assigned to an Item whenever it is contained in any position of the house number. For example, if a house number was 3421, the Item containing this number would be assigned the four ItemSelectors 1, 2, 3 and 4. When the user chooses these four ItemSelectors from the available list (in any order), all numbers that contain these digits, in any order, would be selected. In addition, any house number that contains other digits in addition to these would also be selected. The Kind property of this group is referred to as Alpha Wild--that is, Alpha-Numeric and Wild. The Alpha Wild designation does not distinguish between purely numeric ItemSelectors and those including letters.
Although an ItemSelector from an Alpha Wild Selector group does not narrow down a search as much as those from an Exact group, ItemSelectors of this type are useful in many applications, particularly when only partial information is known. In combination with other ItemSelectors, they are very effective at narrowing down possibilities when searching or browsing through data.
Digit Number Groups. ItemSelectors describing the number of digits in numbers, such as house numbers, can also form a useful ItemSelector group. In combination with an Alpha Wild group, for example, an ItemSelector from such a group can considerably narrow down the possible matches. A group of number ItemSelectors that designates the number of Digits in a house number that is the target of a search would be described as being of Digit Number Kind.
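The assignment rules for Alpha Wild and Digit Number Kinds, and the way the two narrow a search together, may be sketched as follows. The house numbers and the function names are hypothetical.

```python
def alpha_wild(value):
    """Alpha Wild ItemSelectors: each distinct character, regardless of position."""
    return set(value)

def digit_number(value):
    """Digit Number ItemSelector: how many characters the value contains."""
    return len(value)

HOUSE_NUMBERS = ["3421", "1234", "12345", "342"]  # hypothetical data

def search(chars, length=None):
    """Find house numbers containing all `chars`, optionally with a given length."""
    return [
        n for n in HOUSE_NUMBERS
        if set(chars) <= alpha_wild(n)
        and (length is None or digit_number(n) == length)
    ]

print(sorted(alpha_wild("3421")))  # the four ItemSelectors assigned to 3421
print(search("1234"))              # also matches numbers with extra digits
print(search("1234", length=4))    # Digit Number selector removes those
```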
Alpha Position Groups. One precise way of classifying a house number (or indeed any number, name or word) is to select ItemSelectors from an appropriate set of ItemSelectors of Alpha Position Kind. A "set" of such groups is needed, the complete set including one group for each digit position. Each group consists of all possible AlphaPosition ItemSelectors for its associated digit position, which for house numbers (for example) is generally limited to the numerals 0-9. An ItemSelector Group Set of this Kind is designated Alpha Position n, where n is the number of character positions, and therefore is also the number of Groups within the set.
Subfield Values Groups: The abstraction of a Subfield, such as for example the Last Name, is instantiated with a Subfield Value when the data is entered. So for example if the name Smith is entered into the Last Name Subfield, then Smith is the Value of that Subfield.
Subfield Value Types: The following broad three Value Types can be easily identified: Text, Numbers, and Dates. Other Value Types can be introduced as the need arises in specific applications. The words used to describe the Value Types can also be implemented as ItemSelectors. When choosing Subfield Value Selectors, each of these Value Types can be treated differently by the software. In some cases, the individual Values can be used as ItemSelectors. In other cases ranges of values can be defined as ItemSelectors.
Subfield Derived ItemSelectors: For example, when the database contains product sales information about a very large number of products, the individual sales prices could be used as ItemSelectors or alternatively price ranges, optionally defined by the user, can be used as ItemSelectors, or both these sets of ItemSelectors can be used.
The Price Range Selectors would be the derived ItemSelectors. Another, less obvious example is the day-of-week ItemSelector in a database where the entries are dates, because the day-of-week can be derived from the date. Yet another example: the first letter of the last name in a long list of names can be a (Disjunctive) ItemSelector to help narrow down the list of name ItemSelectors.
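The three derived ItemSelectors mentioned above can each be computed from the underlying Subfield Value. In this sketch the function names are hypothetical, and the price ranges are assumed to be fixed-width and half-open (a price of exactly $10 falls in "$10-$20"), which is only one of several possible conventions.

```python
from datetime import date

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]

def day_of_week_selector(d):
    """Derive a day-of-week ItemSelector from a date value."""
    return DAYS[d.weekday()]  # date.weekday() returns 0 for Monday

def price_range_selector(price, width=10):
    """Derive a price-range ItemSelector such as "$10-$20" from a price."""
    low = int(price // width) * width
    return f"${low}-${low + width}"

def first_letter_selector(last_name):
    """Derive a first-letter ItemSelector from a last name."""
    return last_name[:1].upper()

print(day_of_week_selector(date(2000, 1, 1)))
print(price_range_selector(14.99))
print(first_letter_selector("smith"))
```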
ItemSelector Synonyms: ItemSelector Synonyms are useful in many different contexts. The ItemSelector Synonym here is used with a broader meaning than the dictionary synonym. A synonym normally means a word with a similar meaning. ItemSelector Synonym includes that meaning but additionally includes any word whose meaning is narrower than, but contained within, that of the ItemSelector.
So for example, if the original ItemSelector is Correspondence, then Email, Letter, Fax, and Voice Mail, could be that ItemSelector's Synonyms. When considering Subfield Value Selectors and using ItemSelector ranges, the values within a range are that range's synonyms.
ItemSelector Synonyms are not symmetrically related. So in the Correspondence ItemSelector example above, every Item that has the Email ItemSelector would also have assigned to it the ItemSelector Correspondence, but the converse could not be asserted: not every Correspondence is an Email.
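The one-directional nature of ItemSelector Synonyms can be sketched as an expansion step applied when ItemSelectors are assigned to an Item. The table `BROADER` and the function `assign` are hypothetical names introduced for illustration.

```python
# Hypothetical synonym table: narrower ItemSelector -> broader ItemSelector.
BROADER = {
    "Email": "Correspondence",
    "Letter": "Correspondence",
    "Fax": "Correspondence",
    "Voice Mail": "Correspondence",
}

def assign(selectors):
    """Expand assigned ItemSelectors with every broader ItemSelector they imply."""
    expanded = set(selectors)
    pending = list(selectors)
    while pending:
        broader = BROADER.get(pending.pop())
        if broader and broader not in expanded:
            expanded.add(broader)
            pending.append(broader)  # follow chains of broader terms
    return expanded

print(sorted(assign({"Email"})))           # Email implies Correspondence
print(sorted(assign({"Correspondence"})))  # the converse does not hold
```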
Synonyms of ItemSelector Booleans: More generally the Synonym of an ItemSelector Boolean is the ItemSelector equivalent to a Boolean expression of other ItemSelectors. The example of Subfield Value Selectors, which represent ranges of values, can equivalently be considered as the Synonym of the disjunctive Boolean of all the detailed Value Selectors within the range.
Conventional & TIE DB Designs Compared: There are two levels of description of databases: the Logical Level and the Physical Level. At the Logical Level, a conventional Relational Database is described in terms of a logical Schema within a data definition language. The purpose of the Schema is to specify those properties (such as relationships, value types etc.) of a database that are permanently true, regardless of the particular data details or situation that applies at any particular time. The data dictionary is used to catalog the various data attributes and relations.
In contrast, the TIE system does not care how or where the data is stored because it is based entirely on data about data--usually called "Meta-Data"--not directly on the data itself. This allows total flexibility in the storage and the type of data stored. We will call this data about data the "Data BLOBS" because Meta-Data is already being used with a completely different meaning in the database context and its use here could cause misunderstanding. (BLOBS stands for Binary Linked and Organized Binary System.)
It is well known in other contexts (particularly in programming data structures) that it is much easier to track dynamic data when only references to the data are used. A very simple example of this is the use of pointers to data elements in databases when each data element can be stored anywhere, can be of any size and can be changed without in any way affecting the pointer.
All the data associations and descriptions are abstracted to the BLOBS. It is shown here that an appropriate logical optimal data structure of the BLOBS is a Binary Matrix. Its equivalent optimal physical data structure depends on the hardware and compiler implementations, but for current off-the-shelf hardware and compilers, an array of vectors (of varying dimensions) with integer (id) components is usually optimal.
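The relation between the logical Binary Matrix and an array of integer-id vectors can be sketched as below. The sketch assumes one vector per Item holding the ids of its ItemSelectors; the text does not fix this orientation, and the data and function names are hypothetical.

```python
ITEMS = ["item0", "item1", "item2"]
SELECTORS = ["Software", "Tools", "C++", "Hardware"]  # id = list index

# Logical view: MATRIX[i][s] == 1 when Item i is described by ItemSelector s.
MATRIX = [
    [1, 1, 1, 0],
    [1, 0, 0, 0],
    [0, 1, 0, 1],
]

def to_id_vectors(matrix):
    """Physical view: for each Item, the vector of its ItemSelector ids."""
    return [[s for s, bit in enumerate(row) if bit] for row in matrix]

def items_with_all(vectors, selector_ids):
    """Return indices of Items whose vectors contain every requested id."""
    wanted = set(selector_ids)
    return [i for i, vec in enumerate(vectors) if wanted <= set(vec)]

VECTORS = to_id_vectors(MATRIX)
print(VECTORS)                          # vectors of varying length
print(items_with_all(VECTORS, [0, 1]))  # Items with "Software" AND "Tools"
```

The vectors store only the positions of the 1 entries, so a sparse matrix with few ItemSelectors per Item occupies far less space than the full matrix.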
Users of databases need to be able to see the data to help them form a question or search query. In forming the search query, they need to be guided to the available data only, to protect them from fruitless searches. To be useful to the average user, a database should not require the knowledge of complex query languages nor the knowledge and understanding of Boolean query expressions.
None of these requirements are met by current state-of-the-art databases. The TIE system, however, fulfils all these requirements and, in addition, makes the merging of disparate legacy databases relatively easy.
We begin with a simple example, describing a possible conventional approach and the TIE approach, and then follow with a generalization, describing a common implementation. One of the properties of the TIE system is that a sufficiently general implementation will cover almost all the features needed in almost all implementations, with differences being confined to the GUI. Any small additional features that may become desirable can be easily added without affecting the main application.
The Conventional Approach: Consider a relational database containing customer and product information. In current databases, this would normally be handled with three types of records: one for the customer information, another for the product information, and a third for the purchase orders, tied together through defined hierarchical relations. For example, the data of each product purchased by a customer will be linked to that customer's record and to a purchase order record.
Under current inflexible, structured databases, we have to decide what fields to assign to each record in the Customer database. As an example, we would naturally define a set of address fields for the Shipping Address, and another set of address fields for the Billing Address. Suppose now that the customer for some reason has two shipping addresses. (Perhaps one is for one type of product, the other for other products.) We are now faced with the prospect of adding another set of address fields, but with no space originally allocated for them. Current databases would normally require us to add another address field to all customer records, even though only a very small fraction of the customers may need it. In addition, if indexing is used, any tables we have created will all require updating when we do add such an extra field set.
Of course, with foresight at the outset, a possible need for a different number of Address Fields for each customer would have been recognized, and this would have allowed creation of a table of addresses that would solve this particular problem of inflexibility. However, it is hard to determine at the outset which Fields will need a plurality of alternatives. The overhead of having each and every field be a table of fields is too great to make that approach practical.
Consider now how such a database could be organized and in particular how it could grow, using the TIE technology.
Descriptive Overview of TIE: The TIE deals with two classes of objects: Information Items (referred to simply as Items) and ItemSelectors (which are the individual descriptors within the TIE system vocabulary).
In implementations of the TIE technology the user may be presented with the entire vocabulary of ItemSelectors. The organization by which the vocabulary is presented will vary, but typically follows the general approach illustrated in FIGS. 4, 5 and 6. As shown in FIG. 4, a user may be presented on a graphical screen 402 with lists of ItemSelectors and Items. ItemSelector list 404, for example, is a group of Person Description ItemSelectors (although not always true, we may assume for the moment that the ItemSelector is the same as the name by which it is represented). FIG. 4 shows an initiation of a search, before the user has chosen any ItemSelectors at all. A list 406 of (twelve) possible ItemSelectors for a "month observed" is presented, as well as a list 408 of (seven) possible ItemSelectors for Day Of Week Observed. Finally, an ItemSelector list 410 presents all possible ItemSelectors for Day (of the month). Thirty-one such ItemSelectors are possible, but the situation represented is one in which there is not sufficient room for all possible ItemSelectors to be directly presented to the user. Any graphical technique may be used; shown here is a "scroll bar" 412 by which the user, with the aid of a mouse, can quickly scroll through the ItemSelectors that are not immediately visible.
Because no ItemSelectors have yet been selected to narrow the field of described Items, all Items belong to the Selected Item listing 414. Here, another technique for displaying less than all possibilities is illustrated: listing some of the Items, and indicating how many there are. More typically, the number of selected Items that are NOT displayed would be indicated. Here, a representative sample of eight selected Items is shown for illustration.
Turning to FIG. 5, it can be seen that the overall graphic presentation 502 has changed, as has the list 504 of possible Person Descriptions. That is because the user has chosen (and added to the present ItemSelector set) two ItemSelectors. One, "September," is clearly indicated as selected in list 506. The other, "Saturday," is indicated in the Day Of The Week list 508. Due to these selections, the number of possible ItemSelectors in the Day list 510 is reduced to just the days of the four Saturdays in September, i.e., to 7, 14, 21 and 28 (a single year is assumed for descriptive convenience). During Saturdays in September, a smaller set of Person Descriptions was recorded (and thus exists as ItemSelectors, or descriptors, in the database). In this example, four such ItemSelectors are applicable to persons observed on the limited days defined by the present set of ItemSelectors. Moreover, the Item listing of actual Items (persons, in this case, represented by some of their salient characteristics) is much reduced, as well, to just four that are described by the present set of ItemSelectors.
FIG. 6 reflects the next user choice from Items and ItemSelectors presented in GUI 602. In the list 604, the user has chosen "Boisterous." This does not affect the Month Observed list 506 or Day Of Week Observed list 508. In some embodiments the user is permitted to choose an additional ItemSelector from these lists, which would INCREASE (typically) the number of selected Items shown in the list 614. Given the three ItemSelectors that have been chosen and thus constitute the present set of ItemSelectors, the Day list 610 reflects that a Boisterous person was observed only on the 7th of September. The Selected Items list 614 is also reduced by this further choice, now reflecting only two persons. unchosen possibilities that selects something further in the GUI 602, as is reflected in FIG. 6.
A user searches for Items of interest by selecting combinations of particular ItemSelectors from the vocabulary. This is typically done one at a time, either using a mouse click or by using the keyboard. Although multiple simultaneous selections are possible, they are either avoided or constrained in order to prevent "null hits" in which no Items are consistent with the selected combination of ItemSelectors. The simplest way to avoid such null hits is to renew that portion of the vocabulary that is presented as a selectable option to the user after each single ItemSelector choice entered by the user.
Thus, as each ItemSelector is chosen by a user, the remaining ItemSelector vocabulary that is made available to the user adjusts itself in such a way that at each stage any choice of an available ItemSelector will always result in at least one Item that matches all of the ItemSelectors selected thus far, or in other words that fits the description that has been entered to such point.
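The adjustment of the available vocabulary described above might be sketched as follows. The sketch assumes, for simplicity, that all chosen ItemSelectors are combined with a plain "and"; the table `associations` and the functions `selected_items` and `available_selectors` are hypothetical names introduced for illustration.

```python
# Hypothetical association table: ItemSelector -> set of Item ids it describes.
associations = {
    "September": {1, 2, 3},
    "October": {4},
    "Saturday": {1, 2, 4},
    "Sunday": {3},
    "Boisterous": {2},
}
all_items = {1, 2, 3, 4}

def selected_items(chosen):
    """Items matching every chosen ItemSelector (plain AND, an assumption here)."""
    result = set(all_items)
    for selector in chosen:
        result &= associations[selector]
    return result

def available_selectors(chosen):
    """ItemSelectors that, if added, would still leave at least one Item."""
    current = selected_items(chosen)
    return sorted(
        s for s, items in associations.items()
        if s not in chosen and current & items
    )

print(available_selectors([]))
print(available_selectors(["September", "Saturday"]))
```

Because only ItemSelectors sharing at least one Item with the current selection are offered, any single further choice yields a non-empty result, which is how "null hits" are avoided in this sketch.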
Each ItemSelector chosen further describes a target that the user is seeking.
ItemSelectors in some Groups of ItemSelectors (descriptors) are mutually exclusive when describing the target Item(s); that is, if an Item is described by one of such mutually exclusive ItemSelectors, then it cannot be described by another. Such groups are called "disjunctive." An example of this is a group of ItemSelectors that will be called "gender." The ItemSelectors (descriptors) within this group include only "male," "female," "unknown," or "none." These ItemSelectors, as can be seen, are mutually exclusive as applied to any particular Item, and may be referred to as "disjunctive." Groups of ItemSelectors may be used in the manner of disjunctive Groups even if not all ItemSelectors within such Group are truly mutually exclusive; this will be a matter of choice and convenience for the purposes of a particular database.
In other Groups, the ItemSelectors are mostly NOT mutually exclusive, but instead have a large degree of overlap. Such Groups would be called conjunctive. A "Products" group of ItemSelectors is likely to contain the following ItemSelectors (descriptors), in addition to others: appliance, furniture, electrical, kitchen, outdoor, major, small, large, etc. Many of these descriptors can apply to a single product, and thus such a group would be treated as a conjunctive group.
A TIE system typically makes decisions about the Items selected by applying rules that take into consideration whether a particular ItemSelector selected by a User belongs to a conjunctive group of ItemSelectors, or a disjunctive Group. In some cases special rules apply to ItemSelectors belonging to groups that are bijunctive, meaning that ItemSelectors in such groups are often useful both conjunctively and disjunctively.
The precise Boolean Algebraic combination of the chosen ItemSelectors depends on the groups from which the ItemSelectors were chosen.
For example, the most common group type is termed Disjunctive, because selections of more than one ItemSelector from such a group implies the disjunctive "or" between them. Such ItemSelector selection increases (or in rare cases leaves unchanged) the number of selected Items and the available ItemSelectors.
The second most common group type is termed Conjunctive, because selection of more than one ItemSelector from such a group implies the conjunctive "and" between them. Such ItemSelector selection narrows down, i.e. decreases (or in rare cases leaves unchanged) the number of selected Items and the available ItemSelectors.
Other ItemSelector group types comprise the Exclusive Disjunctive (implying an exclusive "or") and the Negated Disjunctive (implying "or not") and the Negated Conjunctive (implying "and not"). Other, more complicated types are also useful and will be described.
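How the group type could determine the Boolean combination may be sketched as follows for the two most common group types. The sketch assumes that the results of different groups are combined with "and"; that assumption, the sample data, and the names `group_of`, `group_type`, and `evaluate` are illustrative only, and the other group types are not covered.

```python
# Hypothetical data: ItemSelector -> Item ids, and ItemSelector -> group.
associations = {
    "Monday": {1, 2}, "Tuesday": {3},
    "Electrical": {1, 3, 4}, "Kitchen": {1, 2, 3},
}
group_of = {"Monday": "Day", "Tuesday": "Day",
            "Electrical": "Products", "Kitchen": "Products"}
group_type = {"Day": "disjunctive", "Products": "conjunctive"}
all_items = {1, 2, 3, 4}

def evaluate(chosen):
    """OR within disjunctive groups, AND within conjunctive groups.
    Combining the per-group results with AND is an assumption of this sketch."""
    by_group = {}
    for selector in chosen:
        by_group.setdefault(group_of[selector], []).append(associations[selector])
    result = set(all_items)
    for group, item_sets in by_group.items():
        if group_type[group] == "disjunctive":
            result &= set().union(*item_sets)
        else:
            result &= set.intersection(*item_sets)
    return result

print(evaluate(["Monday"]))
print(evaluate(["Monday", "Tuesday"]))
print(evaluate(["Monday", "Tuesday", "Electrical", "Kitchen"]))
```

Adding "Tuesday" to "Monday" enlarges the selected set, whereas adding the two Products ItemSelectors narrows it, matching the behavior described for Disjunctive and Conjunctive groups.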
In situations where said ItemSelector vocabulary is large it can be divided into a number of groups and group sets, logically organized to make navigation to the appropriate vocabulary terms easy for the user. In cases where the size of the vocabulary is such that even this arrangement makes navigation cumbersome, a higher level vocabulary can be created for the sole purpose of controlling the display of the various vocabulary groups and subgroups.
For example, the display may present the Items either in one frame or window, or in a set of frames or windows, each accessible using tabs, together with listings (usually in several and sometimes in many list groups) of various descriptive ItemSelectors. Some of these ItemSelectors may be presented as buttons of various kinds, while others are presented as lists in columns, divided into tabs when necessary to accommodate larger numbers. Some ItemSelector groups may be initially hidden and only displayed under certain conditions, such as when the user makes appropriate choices of ItemSelectors and/or of control elements.
The Items are listed using some suitable identifiers or names as determined by the particular data. When no selection of ItemSelectors is made, all Items are available to be listed, their number is displayed, and a small subset of them is usually listed at any time.
As the user chooses ItemSelectors that describe the Items of interest, the number of listed Items is updated (usually reduced). These listed Items are the ones that match the description and will be here referred to as the Selected Items. The remaining available ItemSelector lists are also updated (also usually reduced) to show only those ItemSelectors that are related to the already selected set through any Item. These ItemSelectors will be referred to here as the Available ItemSelectors. When the number of Selected Items is small enough, the user selects from the Item listing by name those Items to be viewed in detail. Each such Item chosen may be presented in its entirety in a new window.
In addition, in preferred implementations of TIE, a user option is provided enabling the display of Item counts associated with each ItemSelector. These Item counts reflect the number of Items, from amongst the current selected Item set, associated with each of the available ItemSelectors. Each time the user changes the selected ItemSelectors, these counts are updated. This gives the user an immediate "View" of the data in the database. In addition of course, the listing of ItemSelectors and their updating provides a continuously updating view of the data. No such views of the data in a traditional structured database are possible. Thus when using the TIE system, new, useful queries often suggest themselves to the user--something impossible under current systems, both structured and unstructured.
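The Item counts described above might be computed as in the following sketch. The table `associations` and the function `item_counts` are hypothetical, and omitting ItemSelectors with a zero count is a design choice of this sketch (such ItemSelectors would not be available in any case).

```python
# Hypothetical association table: ItemSelector -> Item ids.
associations = {
    "September": {1, 2, 3},
    "Saturday": {1, 2, 4},
    "Boisterous": {2},
    "Quiet": {1, 3, 4},
}

def item_counts(selected_item_ids, chosen):
    """For each not-yet-chosen ItemSelector, count the currently selected
    Items it describes. Zero counts are omitted (not available)."""
    counts = {}
    for selector, items in associations.items():
        if selector in chosen:
            continue
        n = len(items & selected_item_ids)
        if n:
            counts[selector] = n
    return counts

chosen = ["September"]
current = associations["September"]
print(item_counts(current, chosen))
```

Recomputing the counts after every change to the chosen set gives the continuously updated "View" of the data referred to above.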
The ItemSelectors form a Vocabulary in terms of which the user can create descriptions of Items to be listed. The dynamic updating of the ItemSelector lists to show only available ItemSelectors means that zero returns to any query (or search) never occur.
In many interesting applications, the ItemSelector Vocabulary consists of ItemSelectors that have different Boolean Properties. It is then necessary to divide these ItemSelectors into groups, each group defining the property. So for example, in a database using ItemSelectors with a large variety of properties, there would be groups with all the possible Boolean properties and all the possible value properties.
Applying TIE: Using the TIE technology we can begin implementing a database by deciding on the fields needed for each record we enter--just as in the current, old technology. Each record or Item in TIE, however, is free to have any number of fields, without burdening other records in any way. Additional fields can be added at any time--it is not necessary to know at the outset the number or kind of fields needed.
Each customer would be uniquely identified (as is currently usual also) with an ID number and each product and purchase order would likewise be so identified. Using the TIE technology, however, we could also decide precisely which groups of sub-fields we wish to list as separate Items and identify with a Record Type ItemSelector. In this example, we will assume, similarly to a standard database, that we have decided to describe the data groupings as three types of Items: the Customers, the Products, and the Purchase Orders. We could then use the linking number
ItemSelector Identification of Items: Each Customer, Product, and Purchase Order would be assigned a number of descriptive attributes or ItemSelectors whose combined meaning identifies it. The type, name, and other attributes of the customer constitute the customer data; the type, description, price range, and other attributes of the product constitute the product data; and the product identifiers, descriptions, and other data constitute the purchase order data.
Automatic ItemSelector Association: When a customer purchases a product, the data entry automatically also assigns, to that customer, and to the purchase order, the descriptive ItemSelectors or attributes of the product, which would also include the product name, and the product ID, as ItemSelectors. This is done automatically when the data entry of the purchase order is created. Such an assignment automatically associates the product with all its ItemSelectors, the customer and all customer ItemSelectors, and the purchase order with its ItemSelectors, plus those of the product and those of the customer.
Therefore, when the user subsequently chooses an ItemSelector describing a product, all customers who purchased that product are also listed. To see a listing of only the products, and not the customers, the user would choose the Record Type ItemSelector Products. Similarly, to see only the Customers the user would choose the Record Type ItemSelector Customers, and to see only the purchase orders, the Record Type ItemSelector Purchase Orders.
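The automatic association at data entry, and the resulting listings, might be sketched as follows. The record ids, the ItemSelector names, and the functions `enter_purchase_order` and `listing` are hypothetical. The sketch also assumes that Record Type ItemSelectors are not propagated between Items, so that they can still be used to restrict the listing to one record type.

```python
# Hypothetical records: Item id -> set of ItemSelectors.
items = {
    "C1": {"Customers", "Customer ID C1", "Smith"},
    "P1": {"Products", "Product ID P1", "Kitchen", "Electrical"},
}

def enter_purchase_order(po_id, customer_id, product_id):
    """Create a Purchase Order Item and propagate ItemSelectors.
    Record Type ItemSelectors are not propagated (an assumption)."""
    product = items[product_id] - {"Products"}
    customer = items[customer_id] - {"Customers"}
    # The customer also receives the product's descriptive ItemSelectors.
    items[customer_id] |= product
    # The order carries its own ItemSelectors plus those of product and customer.
    items[po_id] = {"Purchase Orders", f"PO {po_id}"} | product | customer

def listing(chosen):
    """Ids of all Items described by every chosen ItemSelector."""
    return sorted(i for i, sels in items.items() if set(chosen) <= sels)

enter_purchase_order("O1", "C1", "P1")
print(listing(["Kitchen"]))
print(listing(["Kitchen", "Customers"]))
```

Choosing "Kitchen" alone lists the product, the customer who bought it, and the purchase order; adding the Record Type ItemSelector "Customers" restricts the listing to the customer.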
Other Record Associations: In this example, the details of each purchase order would normally constitute another record in a traditional database. Each purchase order would also have an identifying number. In the TIE database each such order would be just another, though differently classified, data Item, linked to the customer through all the customer ItemSelectors including the customer ID number, to the product through the product ItemSelectors, and to the Purchase order through both sets of ItemSelectors and possibly new, PO specific ItemSelectors.
Here is how the TIE database would be used to search data.
The major Record Type ItemSelectors: Customer, Product, Purchase Order, would be displayed either as buttons or on some separate list. Such broad Data Type ItemSelectors will typically be used in two ways: to restrict the display to only one Data Type, and to include more than one Data Type in the display. They may also be used to describe any new field or subfield needed for a particular record and so automatically associate it with the appropriate Items.
It is important to realize that in the TIE system, a new ItemSelector can be added at any time, as needed, without necessarily affecting ItemSelector assignments for any current Item.
It is also possible that an added ItemSelector may need to be assigned to some subset of already entered Items. When this happens, a possible interface would have the user first choose existing ItemSelectors to narrow down the listed Items to those, or mostly those needing the new ItemSelector. Then, through suitable controls, the user would indicate which of the listed Items are to have assigned which of the new ItemSelectors. One simple implementation of this interface allows the user to mouse-drag the ItemSelector to the selected Items.
Conjunctive, Disjunctive and Bijunctive ItemSelectors: Examples. Many ItemSelectors are Conjunctive, but some are Disjunctive. For example, in the customer-product-purchase order database we are discussing, price ranges of products and totals of each purchase order would be appropriate useful ItemSelectors, but they would be disjunctive, that is, they would automatically be included with an "OR" between them when more than one of their kind is selected. This is because it is not useful to search for products or purchase orders that are in two price ranges: in fact there should be none! Similarly, if days-of-week (on which the purchase order was initiated) are used as ItemSelectors, they too would be Disjunctive, because an order can only be initiated on one day. Descriptive ItemSelectors are usually Conjunctive. So for example ItemSelectors describing a product, such as "Electrical, Appliances, Kitchen" are three words that are usually used together to form a description, so they are Conjunctive ItemSelectors.
Distinctive Display of ItemSelector Types: One way to implement the distinction between the Conjunctive, Disjunctive, and Bijunctive ItemSelectors is to list them distinctively. For example, in one implementation the ItemSelector types are in separate lists. In another, the Disjunctive ones are buttons whereas the Conjunctive ones are on lists. Bijunctive ItemSelectors can be displayed either in two displays, in separate lists, or in one display and a control can be provided to switch between the types.
As a third alternative, or in addition, the display could use a modified word or phrase to represent each ItemSelector. For example, after the first ItemSelector in a group is chosen, the disjunctive "or" could be prefixed to each subsequent ItemSelector in the disjunctive display (or as a prefix to a listing) and the conjunctive "and" to those in the conjunctive display. Other ways to distinguish the two displays are possible and are a matter of interfaces, to be decided by any special needs of the particular application.
It is also possible to provide a way to enter explicitly the "AND," the "OR," and the "NOT" between the ItemSelectors. The user could explicitly enter the conjunction, disjunction, or negation with the aid of a control or using the keyboard, or the entry could be effected by using a modifier key while clicking on an ItemSelector.
Item Names: Usually the user decides, at the outset, which Subfields are to be used to identify a record in a listing display--that is how to name each Item. This decision can be left as a preference for the user of the TIE Database, with a default of the most likely choice.
For example, for the Customer database, the last and first names plus the zip code of the customer's shipping address would be possible choices. The display of Items could then be ordered alphabetically by last name or numerically by zip code, at user's option. In general, it is possible to choose any combination of Subfields as the Item name.
Similarly, the user can choose the identifiers to use in a display of the Products and Purchase Orders data.
For example, product Name and product ID number could be useful identifiers for the Products data, while the Purchase Order Number and Customer last Name and Product Name could be useful displays for the Purchase Order data.
Interface for Choosing Item Names: Users would be given the choice of which Subfield combinations to use as Item names for the display. A list of the ItemSelector names of all Subfields would be provided and the user would choose from that list the combination to use as the Item name.
Data Entry Interface: When entering data, the user would describe each data Field (alternatively in a more detailed mode, Data Subfield) by selecting those ItemSelectors from lists that describe the Field (or Subfield). Each selection would immediately list the fields that have in common the currently selected description. The user would continue adding ItemSelectors to the description until just one field was available. That would ensure that each field is uniquely identified through its ItemSelectors.
If a Subfield, described by the selected ItemSelectors, has not yet been defined, the user is allowed to create a new Subfield using those ItemSelectors to identify it, and add it to the list of Subfields. In this way new fields can be added, because they are made up of particular subgroups of individual Subfields.
Example Adding a Field: In the customer database, suppose we have defined two address fields with the following two ItemSelector sets (Commas separate ItemSelectors):
1 Customer, Shipping, Address.
2 Customer, Billing, Address.
Suppose that we now need to add another address for some customer and that there is no descriptive ItemSelector to distinguish it from the two addresses already used.
In that case we introduce a new ItemSelector, using any appropriate descriptive terms. A possible ItemSelector might be: Large Products. (An ItemSelector may use any number of words.) Having created such an ItemSelector by typing it in, it would appear in our list of ItemSelectors and we would be able to choose it to create a new, unique Field described by the following ItemSelectors: Customer, Shipping, Large Products, Address.
In this example, the Field defined by the ItemSelectors in (1) is referred to as the Parent Field of the new Field, which may be numbered (3) and is defined by the ItemSelectors Customer, Shipping, Large Products, Address.
The Large Products ItemSelector then becomes available for use in combination with any other ItemSelectors and for assignment by the user to any Item, as may be appropriate.
Automatic ItemSelector Assignment: After adding a new ItemSelector, it may be useful to assign it to the appropriate existing Items. This can, of course, always be done manually, picking each relevant Item and through suitable controls, assigning the ItemSelector. But such manual assignment may not be practical when the number of relevant Item groups is large.
In that case a feature can be provided to automatically assign the new ItemSelector. The conditions selecting the appropriate Items for such an assignment will then be specified by the user and the automatic assignment process put into place.
The conditions for such an assignment can be dependent on data content and/or existing assigned ItemSelector combinations. When data content is the criterion, the automatic assignment process involves a search of content and so can use the current conventional optimized search techniques.
When a combination of ItemSelectors is included in the criteria, the Matrix can be used to quickly access the relevant Items.
When both criteria are used, the Matrix may be used first to reduce the number of relevant Items and then a conventional search performed through the reduced set of items.
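The two-stage process, in which existing ItemSelector associations first reduce the candidate Items and a content search is then run over the reduced set, might be sketched as follows. The function `auto_assign`, the sample data, and the use of a simple substring test in place of an optimized conventional search are assumptions made for illustration.

```python
# Hypothetical data: ItemSelector -> Item ids, and Item id -> free-text content.
associations = {"Products": {1, 2, 3}, "Customers": {4}}
content = {
    1: "Refrigerator, 36 inch, stainless",
    2: "Toaster, two slice",
    3: "Refrigerator, compact",
    4: "Customer note: asked about a refrigerator",
}

def auto_assign(new_selector, required_selectors, keyword):
    """Assign new_selector to Items that carry all required ItemSelectors
    (at least one must be given) and whose content mentions keyword.
    A substring test stands in for a conventional content search."""
    # Step 1: use the ItemSelector associations to reduce the candidates.
    candidates = set.intersection(*(associations[s] for s in required_selectors))
    # Step 2: search content only within the reduced set.
    matches = {i for i in candidates if keyword.lower() in content[i].lower()}
    associations.setdefault(new_selector, set()).update(matches)
    return matches

print(sorted(auto_assign("Large Products", ["Products"], "refrigerator")))
```

Item 4 mentions the keyword but is never searched, because it is excluded in the first step by lacking the required ItemSelector.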
For example, in the already cited example when adding the Large Products ItemSelector, it may be useful to classify all the large products by assigning that ItemSelector to them. A simple specification would be a list of product IDs or names that are considered Large. If product names are unique and are used as ItemSelectors, the user could assign the new ItemSelector manually by selecting the Disjunctive set of product ItemSelectors and indicating by some means that the new ItemSelector is to be assigned to all the listed products. One possible such indication would be a drag and drop of the new ItemSelector to the listing.
Union Set of Subfields Defines New Field: On a more detailed level, each Subfield is defined using descriptive ItemSelectors. When a new Field is added, it automatically contains the union set of all the currently selected Subfields, each with its corresponding relevant ItemSelector Description, defined by the selected ItemSelectors before the new ItemSelector was added--that is the Parent Field. However, any Subfield can be removed, and any new Subfield can be added to a newly defined Field. This frees completely every defined field from all restrictions of its Parent Field.
For example, if the Parent Field comprises Subfields that include the Last Name, the First Name, Street, City, State, Zip, but has no Subfield for the Country (not needed for mail in the US), such a component may be added simply by choosing (or if not present adding and then choosing) the additional ItemSelector Country. Adding Country as a subfield implies that the address is not for US customers, so the subfield "State" is not exactly appropriate and so may be removed from the Field and from the Field ItemSelector Descriptions.
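The Country example might be sketched as follows, simplified to a single Parent Field. The dictionary `fields`, the function `add_field`, and the Field name "Foreign Shipping Address" are hypothetical and are used only for illustration.

```python
# Hypothetical Field definitions: Field name -> (ItemSelectors, Subfields).
fields = {
    "Shipping Address": (
        {"Customer", "Shipping", "Address"},
        ["Last Name", "First Name", "Street", "City", "State", "Zip"],
    ),
}

def add_field(name, parent, new_selector, add=(), remove=()):
    """Define a new Field that starts from the Parent Field's Subfields,
    then drops and appends Subfields as requested."""
    parent_selectors, parent_subfields = fields[parent]
    subfields = [s for s in parent_subfields if s not in remove] + list(add)
    fields[name] = (parent_selectors | {new_selector}, subfields)
    return fields[name]

selectors, subfields = add_field(
    "Foreign Shipping Address", "Shipping Address", "Country",
    add=["Country"], remove=["State"],
)
print(sorted(selectors))
print(subfields)
```

The Parent Field itself is left unchanged, so existing records that rely on it are unaffected by the new Field.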
ItemSelector Uses: ItemSelectors can be used for defining, describing, accessing and associating Records, Fields and even Subfields, as well as for defining and creating new Records, Fields and Subfields.
In general ItemSelectors are to be regarded as a vocabulary to be used in descriptions of Items, Fields and Subfields and other, more specific ItemSelectors.
Relations Automatic: In a traditional, Relational Database the various relations have to be defined by the user, usually through a hierarchical structure. In a TIE Database, all relations are automatic through the ItemSelectors. In essence they are also defined by the user, but naturally, implicitly, by use of language--through the use of descriptive ItemSelectors and not restricted by the hierarchy.
For example, when a Customer Order is entered in the TIE Database, the new Record so defined is automatically (clearly with optional user override) classified with the ItemSelectors of the particular Customer and those of the particular Product, or Products ordered.
Example Scenario: Here is how the TIE Database system might be used.
Suppose the user selects ItemSelectors describing a set of products. These ItemSelectors could be one or more of the following types:
1 product description ItemSelectors (for example: Electrical, Small, Appliances, Kitchen)
2 product price range ItemSelectors
3 product name ItemSelectors
The listing will contain all products matching the ItemSelector descriptions plus all Customers who have bought any of these plus all Purchase Orders associated with them.
When choosing these ItemSelectors, the remaining available ItemSelector vocabulary is displayed and as individual ItemSelectors are chosen, the vocabulary is updated, showing only the related or available ItemSelectors. This process guides the user to the available information and simultaneously shows the user, through the ItemSelector display, the information within the database. At each step of the process the user can actually see into the database and so be better informed. All this is in great contrast to all present database possibilities.
The user can choose to narrow down the listing by choosing more ItemSelectors of any kind, and/or by choosing ItemSelectors describing the type of Records to view, that is, choosing from the Disjunctive set of ItemSelectors: Customers, Products, Purchase Orders. (Usually, all are shown when no choice is made.)
Once the Item list has been sufficiently narrowed to show only the desired Items, the user can obtain information about them, open them individually to see the details, note the counts of the various Items, or extract specified data from all Items or the selected items in the listing.
There are many different interfaces for selecting data to extract. They can be described generally as follows.
Extracting Data & Creating Reports: Assuming the user has narrowed down the listing of Items to those of interest, the user then selects the Items of interest from the listing, either individually or in groups. Then by choosing a menu or using a button control in a window, the user indicates the desire to extract data. The resulting window frames may show, in one, a listing of ItemSelectors describing each Field and Subfield within the selected Items, and in another a listing of the selected Subfields.
The user chooses the set of ItemSelectors describing the Subfields desired, narrowing or enlarging the list of selected Subfields. The user then picks, from the resulting list, those Subfields needed for the extracted data report. One GUI for doing this is to drag each Subfield to a Report window, locating each where desired and even adding descriptive text to each as appropriate.
Individual subfields selected can further offer the user the choice to insert in the report various statistics evaluated from the values of these subfields within the chosen set of Items. Another option can allow the user to create a formula involving the subfields, said formula to be evaluated for each Item selected and its specified statistics inserted in the chosen location in the report.
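The statistics and formula options might be sketched as follows. The Subfield names, the sample values, and the functions `subfield_stats` and `formula_stats` are hypothetical; the choice of count, sum, and mean as the reported statistics is likewise only an example.

```python
import statistics

# Hypothetical selected Items with their Subfield values.
selected = [
    {"Quantity": 2, "Unit Price": 10.0},
    {"Quantity": 1, "Unit Price": 30.0},
    {"Quantity": 4, "Unit Price": 5.0},
]

def subfield_stats(items, subfield):
    """Statistics over one Subfield across the selected Items."""
    values = [item[subfield] for item in items]
    return {"count": len(values), "sum": sum(values), "mean": statistics.mean(values)}

def formula_stats(items, formula):
    """Evaluate a user-supplied formula per Item, then summarize the results."""
    values = [formula(item) for item in items]
    return {"sum": sum(values), "mean": statistics.mean(values)}

print(subfield_stats(selected, "Quantity"))
print(formula_stats(selected, lambda item: item["Quantity"] * item["Unit Price"]))
```

Here the formula multiplies two Subfields for each Item, and the summary of the per-Item results is what would be placed in the report.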
A final menu command or other control executes the data extraction, creating the report to be viewed on the screen for final editing and allowing the user to save it to a file. HTML or more generally XML may be a convenient file format to use, but any file format can be used.
Handling Field Values: Field values can be of four types: Text, Numbers, Dates (including time), and Mixed. The first three are obvious; the last needs some explanation. Mixed type means that the Field contains a mixture of more than one of the other three types. Such a Mixed type can be parsed and split into its components and each component can then be treated as a separate type. The splitting can be defined by the user.
Often it is convenient to use Number Ranges as ItemSelectors rather than the actual numerical values; however, there may be applications in which the actual values would be convenient ItemSelectors also. In those cases each of the possible values could be an ItemSelector, or position dependent Alpha-ItemSelectors could be used. The user can be allowed to choose how to convert the Field Values to ItemSelectors. A suitable interface would display the list of individual values, together with the frequency of occurrence of each, which can be grouped into ranges, allowing for the adjustment of these ranges. When groupings of the values are created, the interface should also display the cumulative frequencies associated with each group, to allow for balancing the groups by adjusting the ranges.
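The frequency and cumulative-frequency display that supports balancing the ranges might be sketched as follows. The sample values, the half-open range boundaries, and the functions `value_frequencies` and `range_table` are assumptions made for illustration.

```python
from collections import Counter

# Hypothetical Field values (for example, purchase order totals).
values = [5, 12, 12, 18, 25, 25, 25, 40, 75, 120]

# Hypothetical range boundaries chosen by the user: low <= value < high.
ranges = [(0, 20), (20, 50), (50, 200)]

def value_frequencies(vals):
    """Frequency of each individual value, in ascending order of value."""
    return sorted(Counter(vals).items())

def range_table(vals, bounds):
    """Per-range frequency plus cumulative frequency, for balancing ranges."""
    table, cumulative = [], 0
    for low, high in bounds:
        freq = sum(1 for v in vals if low <= v < high)
        cumulative += freq
        table.append((f"{low}-{high}", freq, cumulative))
    return table

print(value_frequencies(values))
for row in range_table(values, ranges):
    print(row)
```

A user who sees that the last range holds only two of the ten values could widen it or merge it with its neighbor, then recompute the table to check the balance.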
TIE Implementation in General: The application described here is very general and the particular details are determined by the specific application and specifics of the data.
As already mentioned, the application implementing TIE can be a single piece of software, referred to as the stand-alone implementation, or two separate pieces of software: the Server and the Client. The Client, in turn can also be of two types: a separate application, or a browser-based Client, implemented in any of the practical ways using either an automatically downloadable Java Applet, or some addition, plug-in or modification of the browser. All these possibilities are envisioned in what follows, although the two-piece, Client-Server implementation will be described. If the Stand-Alone implementation is used, it can still be built in similar fashion to the Client-Server, though more optimizations of response times to queries may then be possible and a communication protocol is unnecessary--making all data on the server side immediately accessible to the client side.
In the preferred implementation of the Client-Server version, the communication between the two can use any of the common protocols, such as HTTP or TCP, or a custom protocol. TCP generally allows for a better communication time, but has the disadvantage of being blocked by certain firewalls.
In certain applications it is convenient to develop a combined type application. This is a stand-alone application that also communicates with the same server as a Client. The mode of communication however, is adjustable. For example it can act as an ordinary TIE Client, keeping locally only the minimum ItemSelector information, or it can be a stand-alone application keeping all information contained in the Matrix and even possibly additionally all Item contents. In the event of the second possibility, periodic connections to the server would keep the local data up to date, as each connection would verify the time of the last change of each piece of data and send any needed new data.
Stateless Communication: The Client-Server implementation to be described assumes stateless communications, that is, each request from the Client is dealt with by the Server, independently of any previous or future requests from the same or different clients. Although a stateless implementation is not necessary, it has the advantage of not requiring the Server to keep track of concurrent Clients. Its principal disadvantage is that because each request is independent of prior requests, calculations of Booleans may sometimes not be as efficient as they could be--in some small additions to a Boolean query it may be advisable to require its complete re-evaluation. However, in most cases, Boolean evaluations can be made incrementally by having the client pass back the results of previous evaluations.
When using any application of TIE, we speak of user actions sending a "Query" to the server and the server responding, said response being processed and presented to the user by the Client.
TIE Applications Overview: The command flow of control in a Client-Server or stand-alone application implementing the TIE system will be outlined next. Following that, details of the various parts will be presented.
As usual, assuming the application is structured as a Client Server system (alternatively as a Client part and a Server part of a stand-alone application) the user interacts with the Client, which is the vehicle of the GUI. Many GUI implementations of the TIE technology are possible.
The objective of the TIE technology is to present the user choices to use to describe, in small steps, the information Item they want to find. After every such step in said description process, said user choices are updated to show only the available remaining choices.
One way to present the user with said choices is to display sets of words, phrases and/or graphics, described as the Vocabulary, using which the user composes a description of the Item of interest. For example, an implementation that uses only text as the Vocabulary may display descriptive key-words or phrases in lists, on buttons, as checkboxes, radio buttons or in other ways which allow user choices. This may be effected by a simple system that displays the Vocabulary in one or more alphabetized lists of key-word or phrase descriptions.
There are also many ways enabling the use of such lists in making up the particular Item description. One way to begin is to have the user mouse-click on any one appropriate word or phrase, and then to immediately update the Vocabulary display to indicate the remaining available Vocabulary, allowing further additional choices. In addition it is often convenient to also display the total number of matching Items and to display the first 10 or 20 of these matching Items by name. Another useful feature is to display, next to each member of the Vocabulary (that is, next to each ItemSelector), the current number of Items to which that ItemSelector is assigned. None of these displays is essential for the functioning of TIE, but they all add to its usefulness. As the user adds to the description, the list of matching Items usually shrinks, eventually becoming a sufficiently small number for the user to be able to choose from the Item listing.
The final step in the user search process is a request to get the Item or Items. This can also be done in many ways. One simple customary way is to let the user double-click on a listed Item or selected Items. Another is to click on a "Get Items" button, having selected the Items of interest in the listing. Other possibilities parallel other methods of selecting the ItemSelectors.
Once the Item or Items are requested, the detailed data can be presented in separate windows. That detailed data can be stored in any conventional database system or it can be stored in conventional computer files. The data held by the TIE system, includes either the detailed data for each item, or preferably the URL, the path or other reference data identifying the location of the Item, enabling the Item details to be displayed without a delaying search.
Examples of other possible implementations of ItemSelector and/or Item selection include the use of Speech recognition, the use of simple remote controls where each ItemSelector and/or Item has displayed a number identifier, where the user selects an ItemSelector or Item by said number, and use of the eyes to control selections. The latter possibility is particularly useful for the severely handicapped. If a means is provided for the detection of which ItemSelector or Item the eyes are focused on or directed at, then a pause of a minimum predetermined duration on an ItemSelector or Item could be used to indicate a selection.
It is often convenient to use whatever method of selection of ItemSelectors or Items is implemented as a "Toggle" that is, as a method of both selecting and deselecting the ItemSelector or Item. This makes it unnecessary to provide an additional control for deselecting individual ItemSelectors or Items, although it is still useful to provide a control that clears all selections.
Program Steps: Having outlined the general user-driven functionalities enabled by an implementation of the TIE technology, we now proceed with a list of the steps that the software program implementing TIE might make. (This assumes a Client-Server implementation, but the steps for a stand-alone implementation are similar, replacing the communications over a connection steps with communications internal to the program.)
1 The user starts the program or Client.
2 The Client sends first request to the Server.
3 The server responds with the Time Stamp (unless the Client's Time Stamp is current) with a listing of the ItemSelector names, Group numbers (if groups used), ID numbers, with the first Item Names, and with the number count of Items, number count of ItemSelectors, and if requested, the number of Items associated with each ItemSelector.
4 The Client receives response from Server and draws the display that includes the ItemSelector Vocabulary and the list of the alphabetically first 10 or 20 Items by name.
5 The user selects an ItemSelector (or deselects one already selected).
6 The Client sends a Boolean request, based on user selections of ItemSelectors, to the Server.
7 The server sends a response listing the available ItemSelectors, the number of Items Selected, and the alphabetically first Item names and ID numbers, and the Number Counts if requested. Such counts include the number of Items, from the Selected Set, which have each of the Available ItemSelectors assigned. That is, a count is associated with each Available ItemSelector.
8 The Client updates the display of the ItemSelector Vocabulary, Item counts of each ItemSelector and the list of the first Items from the Selected Items.
The above steps, from step 5, are repeated until the user selects an Item or Items and requests them, at which point the following happens:
9 User selects an Item and requests its contents.
10 Client sends request to Server for the contents of an Item. These contents can be the full Item data but more often are simply a URL or a path to the Item.
11 The server responds with the Item contents, no matter what these contents are. The type designation of the contents is also returned to the Client so the Client will know how to deal with the data. If the data contains the Item contents, the Client presents that to the user to read. If the data is a URL to the Item, the Client sends the URL to the Browser to be opened. If the Item contains some other reference to the Item data, it is dealt with by the Client who gets the data and presents it to the user.
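The Client's handling of the returned type designation in step 11 might look like the sketch below. The type labels "inline", "url" and "path", and the function name handle_item_response, are hypothetical; the text says only that a type designation accompanies the contents, not how it is encoded.

```python
import webbrowser
from pathlib import Path


def handle_item_response(content_type, payload, open_url=webbrowser.open):
    """Dispatch on the (hypothetical) type designation returned with an Item.

    Returns text to present to the user, or None when the Item was handed
    to the browser instead.
    """
    if content_type == "inline":
        # The response already holds the full Item contents.
        return payload
    if content_type == "url":
        # The response holds a URL; pass it to the browser to be opened.
        open_url(payload)
        return None
    if content_type == "path":
        # Some other reference: the Client fetches the data itself.
        return Path(payload).read_text(encoding="utf-8")
    raise ValueError(f"unknown content type: {content_type!r}")
```

Passing the opener as a parameter is a design choice made here so that the dispatch can be exercised without launching a real browser.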
The user can now go on to other searches, choose to start over from the beginning, or deselect an already selected ItemSelector; in both cases the steps start over from step 5. At any time, the user can select from the listed Items, or select all the chosen Items and perform a standard Content search using a conventional text or other data matching engine.
Other features can be implemented and these may need other controls. For example, controls may be provided for the following features:
Display Item counts for each ItemSelector
Display ItemSelectors in alternative orders, such as in order of Item counts or in order of frequency of use by user or in some other ordering.
Select an Item and request a listing of the ItemSelectors assigned to that Item. This requires the Client to send that request to the Server and then to act accordingly. The result of this is also a display of all Items with the same ItemSelectors.
Remember a filter--that is a combination of ItemSelectors. All remembered filters can be listed for the user to choose from in future quick searches. This does not require the intervention of the Server, although it could be remembered on the server. The Client can save these filters as combinations of ItemSelectors, in a file on the Client computer.
The organization of the ItemSelectors on the screen is used to make their relative location logical and selection easier. Screen organization is useful in displaying to the user the ItemSelector Groups that determine the translation of the ItemSelector selections to the Boolean query sent to the Server.
Building the ItemSelector Boolean: Overview: As already described, the more advanced and feature rich implementations of TIE divide the ItemSelectors into a number of Groups. Each group contains only one ItemSelector Type, that is, Groups are used to keep the Disjunctive and Conjunctive, Bijunctive and Negated ItemSelectors quite separate and to group different types of ItemSelectors together. The Boolean created from the user selections is determined by the ItemSelector Type and Group membership of each selected ItemSelector. The following example illustrates the relationship between the ItemSelector Type and the contribution the selection of that ItemSelector makes to the query Boolean.
Suppose A, B, C, D, stand for Conjunctive ItemSelectors. Suppose further that a, b, c, d, represent Disjunctive ItemSelectors in one Group and e, f, g represent Disjunctive ItemSelectors in a different Group. The following table shows the Booleans which result from the selection of the corresponding ItemSelectors:
Selected ItemSelectors    Boolean Sent to Server
A,B                       A*B
A,B,a                     A*B*a
A,B,a,b                   A*B*(a+b)
A,B,a,b,e                 A*B*(a+b)*e
A,B,a,b,e,f               A*B*(a+b)*(e+f)
A,!B                      A*!B
A,!B,!a,!b                A*!B*!a*!b
A,!B,!a,!b,c,d            A*!B*!a*!b*(c+d)
It is important to understand that the calculation of the available ItemSelectors (the IRV) involves more than one Boolean query when disjunctive ItemSelectors are involved. Thus the IRV resulting from the Boolean query A*B*(a+b) determines the available ItemSelectors in all groups other than the Disjunctive group (a,b) in which all ItemSelectors remain available.
Likewise, the IRV resulting from the query A*B*(a+b)*e determines the available ItemSelectors in all groups except those containing the Disjunctive ItemSelectors (a,b,e). To determine the available ItemSelectors in the (a,b) group, the modified query A*B*e must be sent to the server, whereas all ItemSelectors remain available in the group containing the Disjunctive ItemSelector e.
If any of the Disjunctive Groups are Exclusive, the "OR" operator is replaced with the "XOR" operator, but otherwise the procedure follows similar steps.
Finally, when negated Disjunctive ItemSelectors are selected, they become Conjunctive (DeMorgan's Law) but negated Conjunctive ItemSelectors remain Conjunctive.
This clearly illustrates that Disjunctive ItemSelectors sharing the same Group are parenthesized together when creating the Boolean to be sent to the Server. Furthermore, when determining the IRV (available ItemSelectors) resulting from a Boolean containing Disjunctive ItemSelectors, modified Booleans need to be used. Therefore it is necessary to track the ItemSelector Group to which each selected Disjunctive ItemSelector belongs, though this is not necessary for Conjunctive groups.
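The modified Booleans can be generated mechanically by omitting each Disjunctive Group in turn. The following is a sketch only; the function names and the group labels G1 and G2 are hypothetical. It also emits an omit-the-group query for the Group containing e, which is one reading of the statement that the ItemSelectors in that Group remain as available as they were before e was selected.

```python
def build_query(conjunctive, disjunctive_groups, skip_group=None):
    """Join the selections with '*', parenthesizing multi-member Disjunctive Groups."""
    terms = list(conjunctive)
    for group, members in disjunctive_groups.items():
        if group == skip_group or not members:
            continue
        if len(members) == 1:
            terms.append(members[0])
        else:
            terms.append("(" + "+".join(members) + ")")
    return "*".join(terms)


def irv_queries(conjunctive, disjunctive_groups):
    """Return (full query, {group: modified query omitting that group})."""
    full = build_query(conjunctive, disjunctive_groups)
    per_group = {
        group: build_query(conjunctive, disjunctive_groups, skip_group=group)
        for group, members in disjunctive_groups.items()
        if members
    }
    return full, per_group


# Selections A, B (Conjunctive), a, b (one Disjunctive Group), e (another).
full, per_group = irv_queries(["A", "B"], {"G1": ["a", "b"], "G2": ["e"]})
```

The full query governs availability in every Group that has no Disjunctive selection, and each per-group query governs availability inside the Group it omits.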
The interpretation of user choices and their conversion is normally done by the Client, though of course it could be done by the server. We have found it better to make the server as general and as simple as possible, so that it is not burdened with such details as which ItemSelectors are Disjunctive and which Conjunctive. However, when performance is an issue, the server should track the different groups, because the calculation of the available ItemSelectors (the IRV) involves multiple Boolean requests to the server and these can be optimized when the server knows the types of all groups.
Converting Selections to a Boolean: To interpret the user ItemSelector selections and convert them to a Boolean string, a function is needed in the Client, which accepts each selection and returns a Boolean string which is then passed to the server. Let us call this the boolean_selection function. This in turn can be divided into two steps (and so probably two functions). First is the conversion of the user selections to data in an array. Second is the conversion of this array to a Boolean string. The click location determines the ID number of the selected ItemSelector and the number of its Group. The boolean_selection function holds the current selection in an array. When the user makes a selection of an ItemSelector "j" from Group "i", its ID "j" is added to any other ItemSelectors, if present, in Group "i". Then the array is passed to the boolean_selection function, which returns the Boolean string. We first detail functions that store the ItemSelector selections in the Boolean array. Then we follow with the details of the conversion of this array into the Boolean query string.
Structure of the Boolean Array: An easy data structure to use to track and store the current ItemSelector selections is an array of structs, where each struct is an integer plus two strings. The integer stores the Group number of the ItemSelector, except for the Conjunctive ItemSelectors and negated Disjunctives that are all treated the same way, independently of groups. The first string holds the Boolean operator defining the group type, and the second holds the current Boolean accumulated expression for that ItemSelector Group, in the form of a string consisting of ItemSelector IDs and Boolean operators. Each Group type is either Conjunctive, Disjunctive, negated Conjunctive, or negated Disjunctive. The Group's type determines how ItemSelector IDs are added to the current Boolean expression. When the Boolean Array is completed, the boolean_selection function converts it to the Boolean query string.
All Conjunctive ItemSelectors are stored in the first element of the array. All the negated Conjunctive and negated Disjunctive ItemSelectors are stored in the second element, and the Group number part of the struct is not necessary for those two elements.
The three Boolean operators corresponding to each type of group are: "*" for Conjunctive groups, "+" for Disjunctive groups, and "*!" for Negated Conjunctive and negated Disjunctive Groups. Designated ItemSelectors can be negated by virtue of belonging to a group. Any other ItemSelectors can be negated by the choice of the user.
For example, an ItemSelector selection when a modifier key is pressed can mean the negative of the (normally non-negated) ItemSelector. Negated ItemSelectors, even when they belong to a Disjunctive group are added Conjunctively--because that is the most likely intuitive meaning the user intends and can easily understand.
As another example, in a TV Guide application, the days of the week are normally Disjunctive ItemSelectors--the user wants to know which programs are on Tuesday OR Wednesday (not Tuesday AND Wednesday). If the user chooses the ItemSelector Tuesday but negated and then chooses Wednesday also negated, clearly the meaning must be to find programs that are not on Tuesday AND not on Wednesday. A further choice of Thursday and the additional ItemSelector Friday must mean that the program is not on Tuesday AND not on Wednesday AND on (Thursday OR Friday).
To account for this, the Disjunctive ItemSelector that is negated is automatically placed into the array element for the negated Conjunctive Group. (Applications where this is not appropriate are free to interpret user choices in other ways and can even provide interfaces for the user to decide to override any automatic such choice.)
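The TV Guide interpretation can be checked with a small worked example. The program names and schedules below are invented for illustration, and the helper name matches is hypothetical.

```python
# Hypothetical schedule: program name -> days on which it is shown.
programs = {
    "News": {"Tuesday", "Thursday"},
    "Quiz": {"Thursday"},
    "Drama": {"Friday", "Saturday"},
    "Film": {"Wednesday", "Friday"},
    "Sport": {"Saturday"},
}


def matches(days, negated, any_of):
    """Negated days combine with AND NOT; non-negated days in the group combine with OR."""
    excluded = bool(days & negated)
    included = not any_of or bool(days & any_of)
    return not excluded and included


# !Tuesday * !Wednesday * (Thursday + Friday)
selected = sorted(
    name
    for name, days in programs.items()
    if matches(days, {"Tuesday", "Wednesday"}, {"Thursday", "Friday"})
)
```

News and Film are excluded by the negated days, and Sport is excluded because it is shown on neither Thursday nor Friday, leaving Drama and Quiz.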
It is useful to standardize on a convention. For example, that the first element always holds all the ItemSelectors from all Conjunctive groups, the second one all ItemSelectors which are negated Conjunctive or Disjunctive, and the subsequent series of elements holds all the Disjunctive ItemSelectors, one element for each distinct Group.
As is obvious from the examples, and as previously stated, Conjunctive ItemSelectors from different groups together with any negated ItemSelectors of any type are all combined together in one element, because it makes no difference to the resulting Boolean which group they come from. However, Disjunctive (non-negated) ItemSelectors have to retain their Group origin to the extent that the ItemSelectors from each group are grouped together and parenthesized to be Conjunctively added, as a group, to the output Boolean string. In addition, modified Booleans, omitting ItemSelectors from each Disjunctive group in turn, are needed to determine the IRV appropriate for the Disjunctive Groups.
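One possible rendering of the Boolean Array and its element convention is sketched below. The names BooleanElement and new_boolean_array are hypothetical, and using None for the unused Group number of the first two elements is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BooleanElement:
    group: Optional[int]   # Group number; None for the two shared elements
    operator: str          # "*", "*!" or "+", as defined by the group type
    expression: str = ""   # accumulated ItemSelector IDs and operators


def new_boolean_array(disjunctive_groups):
    """Element 0: all Conjunctive; element 1: all negated; then one per Disjunctive Group."""
    array = [BooleanElement(None, "*"), BooleanElement(None, "*!")]
    array.extend(BooleanElement(group, "+") for group in disjunctive_groups)
    return array


array = new_boolean_array([3, 7])  # hypothetical Group numbers
```

Clearing the array for a "Clear All" command then amounts to resetting every expression to the empty string.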
Adding Selected ItemSelectors to the Boolean Array: Normally when an ItemSelector is selected or deselected by the user the following program actions are triggered:
1 The selected ItemSelector is added to the Boolean Array or the deselected ItemSelector is removed from the array.
2 The Boolean query string is created from the Boolean Array
3 The Boolean Query string is sent to the server.
4 The server responds, the Client parses the response.
5 The Client updates all displays in accordance with the response from the Server.
When a "Clear All" or a "Start Over" command is issued by the user, the Boolean Array is cleared of all its data. We now detail the first of these steps.
Adding a selected ItemSelector. The location of the selected ItemSelector determines its Group number and its ID number. An ItemSelector is identified by its ID alone. Its Group number can be looked up in the Group Table.
If the selected ItemSelector is from a Conjunctive Group, or from a negated Conjunctive or negated Disjunctive Group, the Boolean Array first element (or second element for the negated case) is checked for the presence of a string in its second string component. If the string is present, the Boolean operator for the group is added to that string followed by the ID number of the selected ItemSelector. If there is no string in the second component, the ID number without the Boolean operator is assigned to that string.
If the selected ItemSelector is from a Disjunctive group, its Group number is first looked-up and the above procedure is followed with the array element in question being the one corresponding to the specific ItemSelector Group as identified by the Group number.
When an ItemSelector is deselected, a similar procedure is followed, but this time a search needs to be made within the appropriate Boolean string and then a deletion performed of the found ID number. This deletion must also delete any Boolean operator that precedes it in the string.
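The string manipulation for selecting and deselecting can be sketched as below. The function names add_id and remove_id are hypothetical. Instead of searching for a substring, the sketch splits the stored expression on the element's operator, which also covers the case where the first ID is removed and the operator to delete follows it rather than precedes it. The leading "!" of the negated element is assumed to be supplied later, when the query string is assembled, since the text does not say where it is added.

```python
def add_id(expression, operator, item_id):
    """Append an ItemSelector ID, preceded by the operator unless the string is empty."""
    if expression:
        return f"{expression}{operator}{item_id}"
    return str(item_id)


def remove_id(expression, operator, item_id):
    """Delete an ItemSelector ID together with its adjoining operator.

    Raises ValueError if the ID is not present in the expression.
    """
    ids = expression.split(operator) if expression else []
    ids.remove(str(item_id))
    return operator.join(ids)


expr = add_id("", "+", 4)       # first selection in a Disjunctive Group
expr = add_id(expr, "+", 9)
expr = add_id(expr, "+", 12)
after_removal = remove_id(expr, "+", 4)
```

Splitting on the operator compares whole IDs, so removing ID 4 cannot accidentally damage an ID such as 14.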
Creating Boolean Query String from Boolean Array: The following describes the details of the second step triggered by the user selection of an ItemSelector.
After the user completes the selection of an ItemSelector, the ItemSelector Boolean expression to be sent to the server is put together from the Boolean Array. The Conjunctive Boolean operator "*" is used between non-null strings from the Array, enclosing in parentheses only those elements associated with Disjunctive groups when more than one ItemSelector in a group has been selected.
The Boolean selection function uses the following steps to create the accumulating Boolean string by combining each non-null string element in the Boolean Array with other non-null elements in that Array.
Let the current string element be current Boolean. Then accumulating Boolean=current Boolean at initialization, and output Boolean=accumulating Boolean at completion.
If the current Boolean does not contain a Boolean operator or the current Boolean is a Conjunctive grouping (first or second element of the array), accumulating Boolean=accumulating Boolean*current Boolean
else
parenthesize the current Boolean first, giving:
accumulating Boolean=accumulating Boolean*(current Boolean)
When all non-null elements have been processed in the array the resulting Boolean string is the accumulating Boolean. It is sent to the server as the next query. The queries sent to the Server require the Server to respond in the minimum of time to a general Boolean expression linear in ItemSelectors. The information that associates each Item with its ItemSelectors is usually held in memory (RAM) for quicker access, and is referred to as the ItemSelector Matrix, or simply as the Matrix.
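The accumulation steps above can be sketched as a single function. For brevity the Boolean Array is reduced here to a list of expression strings following the stated convention (index 0 Conjunctive, index 1 negated, index 2 onward one per Disjunctive Group); the function name boolean_query is hypothetical, and prefixing "!" to the negated element at this stage is an assumption of the sketch.

```python
def boolean_query(array):
    """Combine the non-null elements of the Boolean Array into the query string."""
    parts = []
    for index, current in enumerate(array):
        if not current:
            continue  # skip null elements
        if index == 1:
            # Assumption: the stored negated string lacks its leading "!".
            current = "!" + current
        elif index >= 2 and "+" in current:
            # Disjunctive Group with more than one selection: parenthesize.
            current = f"({current})"
        parts.append(current)
    return "*".join(parts)


query = boolean_query(["A*B", "", "a+b", "e"])
```

With the negated element holding B*!a*!b and one Disjunctive Group holding c+d, the same function reproduces the last row of the table of Booleans given earlier.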
Server Parses Query: The TIE Server receives the query as a Boolean string. The following steps describe in overview the Server actions that follow.
The query string is parsed as is customary using a simplified arithmetic parser (because the rules for Booleans are the same as those for arithmetic expressions involving only multiply and add) that results in a parse tree structure of ID numbers and operators. The evaluation of these is a simple, well-known process, once we have detailed the evaluation of the elementary operator actions. These involve the use of the ItemSelector Matrix.
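A simplified arithmetic-style parser of the kind described can be sketched as a small recursive-descent routine in which "*" binds tighter than "+" and "!" is a prefix operator. The names tokenize, parse and evaluate are hypothetical, the parse tree is represented as nested tuples, and evaluation here uses plain sets of Item IDs rather than any particular Matrix representation.

```python
import re

TOKEN = re.compile(r"\s*([A-Za-z0-9_]+|[*+!()])")


def tokenize(query):
    tokens, pos, query = [], 0, query.strip()
    while pos < len(query):
        match = TOKEN.match(query, pos)
        if match is None:
            raise ValueError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse(query):
    """Return a parse tree: an ID string, ('!', x), ('*', x, y) or ('+', x, y)."""
    tokens, pos = tokenize(query), 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def expr():  # expr := term ('+' term)*
        node = term()
        while peek() == "+":
            take()
            node = ("+", node, term())
        return node

    def term():  # term := factor ('*' factor)*
        node = factor()
        while peek() == "*":
            take()
            node = ("*", node, factor())
        return node

    def factor():  # factor := '!' factor | '(' expr ')' | ID
        token = take()
        if token == "!":
            return ("!", factor())
        if token == "(":
            node = expr()
            if take() != ")":
                raise ValueError("missing ')'")
            return node
        if token is None or token in {"*", "+", ")"}:
            raise ValueError("expected an ItemSelector ID")
        return token

    tree = expr()
    if peek() is not None:
        raise ValueError("unexpected trailing tokens")
    return tree


def evaluate(node, item_sets, universe):
    """Evaluate a parse tree against {ID: set of Items}; '!' complements within universe."""
    if isinstance(node, str):
        return item_sets.get(node, set())
    if node[0] == "!":
        return universe - evaluate(node[1], item_sets, universe)
    left = evaluate(node[1], item_sets, universe)
    right = evaluate(node[2], item_sets, universe)
    return left & right if node[0] == "*" else left | right


tree = parse("A*!B*(c+d)")
item_sets = {"A": {1, 2, 3, 4}, "B": {2}, "c": {1}, "d": {4, 5}}
result = evaluate(tree, item_sets, universe={1, 2, 3, 4, 5})
```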
The ItemSelector Matrix: In what follows, the several implementations of the Matrix are described and the details of the evaluation of the elementary ItemSelector and Item Booleans are presented. It is convenient to regard the ItemSelector Matrix as a binary matrix of n by N bit elements (where n=number of ItemSelectors and N=number of Items), even when the implementation uses the ItemSelector Vector approach.
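The n by N binary view of the Matrix can be illustrated with one integer bitmask per ItemSelector, bit i being set when Item i has that ItemSelector assigned. This is a sketch under that assumption only; the function names and the sample assignments are hypothetical and are not the Vector or Bitmap implementations detailed in the text.

```python
def build_matrix(assignments):
    """Return {ItemSelector: bitmask}; bit i is set when Item i has the ItemSelector."""
    return {
        selector: sum(1 << i for i in set(items))
        for selector, items in assignments.items()
    }


def select(matrix, required, n_items):
    """AND together the rows of the required ItemSelectors (a purely conjunctive filter)."""
    mask = (1 << n_items) - 1  # start with every Item selected
    for selector in required:
        mask &= matrix[selector]
    return mask


def available(matrix, selected_mask):
    """Count, for each ItemSelector, the selected Items it is assigned to; omit zero counts."""
    counts = {}
    for selector, row in matrix.items():
        hits = bin(row & selected_mask).count("1")
        if hits:
            counts[selector] = hits
    return counts


# Hypothetical data: four Items (0..3) and three ItemSelectors.
matrix = build_matrix({"news": [0, 1, 2], "sports": [2, 3], "live": [1, 3]})
mask = select(matrix, ["news", "live"], n_items=4)
counts = available(matrix, mask)
```

The counts returned correspond to the per-ItemSelector Number Counts that the server may include in its response, and an ItemSelector with a zero count is no longer available.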
ItemSelector & Item Choice Features: The following describes first the details of the ItemSelector Vector implementation, and then follows with the details of the Bitmap implementation. There are two classes of features in implementations of TIE: Those based on user choices of ItemSelectors, and those based on user choices of Items.
Any means can be used to choose ItemSelectors or Items. For example, choices from displayed lists can be by mouse pointer and click, or by keyboard using any suitable keys. How such choices are made is a matter of user interface design, and will depend on the particular application, the specific type of data, and the number of possible ItemSelectors to choose from. When the number of ItemSelectors listed is too large for easy practical presentation on screen, a special TIE method of access, using the keyboard, uses the herein described TIE technology in an independent, new technique, using a completely separate Matrix. This is described in Appendix II.
The TIE method comprises consecutive incremental ItemSelector choices, in which it is important for the user to see displayed, an updated list of available further choices immediately after making each ItemSelector choice in the sequence. Each user choice sends a query to the server, which, in turn, responds and the Client uses the response to update the display.
ItemSelector Filter: Some selected ItemSelectors can be Conjunctive others Disjunctive, while others can be Negated, that is, preceded by the Boolean operator NOT. The set of selected ItemSelectors comprises, what is referred to as an ItemSelector Filter, because it filters out all information other than that described by the ItemSelector Boolean in the filter. The ItemSelector Filter is built up incrementally by the user until the time the user decides to choose to access an Item. At that point we can say the user has defined the first Used Filter.
It is convenient to allow the user to save certain such filters so they can be accessed through a single mouse click or key. Sometimes it is also convenient for the client or the server to automatically save all such defined filters ever used and to keep frequency of use data. The user's most frequently used filters could then be displayed for easy access. This is the Frequently Used Filters (FUF) feature.
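Keeping frequency-of-use data for Used Filters can be sketched with a simple counter keyed on the filter's Boolean string. The class name FilterHistory and its methods are hypothetical, and keying on the query string is an assumption; an implementation might instead normalize the filter so that the same selections made in a different order count as one filter.

```python
from collections import Counter


class FilterHistory:
    """Record each Used Filter and report the most frequently used ones."""

    def __init__(self):
        self._uses = Counter()

    def record(self, query):
        # Called when the user accesses an Item with this filter in effect.
        self._uses[query] += 1

    def most_frequent(self, limit=5):
        return [query for query, _ in self._uses.most_common(limit)]


history = FilterHistory()
for used in ["A*B", "A*(a+b)", "A*B", "A*B", "A*(a+b)", "C"]:
    history.record(used)
top_two = history.most_frequent(2)
```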
It is clear therefore that any Used Filter is arrived at through a number of user choices, following each one of said choices, an Intermediate Filter having been defined, and the Client having sent queries to the Server, based on each of said Intermediate Filters, and received responses from the Server.
Vector Boolean Algebra: Each said query Filter is in the form of a general Boolean expression, linear in ItemSelectors, but because of the incremental build of said expression, it can be evaluated incrementally in steps, where each step involves the evaluation of a very simple Boolean expression, consisting of two ItemSelector vectors and a Boolean operator between them. Either ItemSelector, or both ItemSelectors can be negated, and the only possible Boolean operators are the conjunctive AND and the two possible disjunctives OR and XOR. (For most applications only the first disjunctive is used.)
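A single incremental step, combining two ItemSelector vectors with one operator and optional negation of either side, can be sketched as follows. Vectors are modelled as frozensets of Item ID numbers; the function name combine, the use of "^" as a symbol for XOR, and the sample vectors are assumptions of this sketch.

```python
def combine(left, right, op, universe, negate_left=False, negate_right=False):
    """Evaluate one elementary Boolean step on two vectors of Item IDs."""
    a = universe - left if negate_left else left
    b = universe - right if negate_right else right
    if op == "*":   # conjunctive AND
        return a & b
    if op == "+":   # disjunctive OR
        return a | b
    if op == "^":   # exclusive disjunctive XOR (symbol assumed here)
        return a ^ b
    raise ValueError(f"unknown operator: {op!r}")


universe = frozenset(range(1, 9))   # Item IDs 1..8
C1 = frozenset({1, 2, 3, 5})
C2 = frozenset({2, 3, 5, 8})
C3 = frozenset({3})

D1 = combine(C1, C2, "*", universe)                      # C1*C2
D2 = combine(D1, C3, "*", universe, negate_right=True)   # D1*!C3, reusing D1
```

Because D2 is computed from the previous result D1, adding one more ItemSelector to the filter costs one elementary step rather than a re-evaluation of the whole expression.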
For example, using the star "*" to designate the Boolean AND operator, the plus "+" for the Boolean OR, and the exclamation point "!" for the Boolean prefix NOT, we can develop a simple symbolic algebra with very useful shorthand meanings within the TIE framework. (The development could also be presented using the theory of ordered sets.)
For example, the Boolean equation:
D1=C1*C2=(I1,I2,I3, . . . ) Eq. 1
defines the Derived ItemSelector vector D1, whose components are the ID numbers (I1, I2, I3 . . . ,) of Items filtered by the C1*C2 filter. Using a more descriptive l |