Supporting intuitive decision in complex multi-attributive domains using fuzzy, hierarchical expert models
5983220
Abstract
A database evaluation system provides for intuitive end user analysis and exploration of large databases of information through real time fuzzy logic evaluation of utility preferences and nearest neighbor exploration. The system provides for domain modeling of various types of information domains using attribute mappings to database fields, and utility value weightings, allowing multiple different domain models to be coupled with a same database of information. User interaction with the evaluation system is through an interactive key generator interface providing immediate, iterative visual feedback as to which candidate items in the database match the user's partial query. A proximity searcher user interface provides for nearest neighbor navigation and allows the user to determine which items in the database are closest to a given item along each independent attribute of the items, and selectively navigate through such nearest neighbors. A fractal proximity searcher simultaneously displays multiple levels of nearest neighbors for user selected attributes.
Claims
I claim:
Description
III. MICROFICHE APPENDIX
__________________________________________________________________________
Listing 1
__________________________________________________________________________
begin
    Comment: to normalize weights, set total weight to zero first
    total_weight = 0.0
    Comment: for each slider, add the weight (setting) of this slider to the total weight
    for each slider do
        total_weight = total_weight + weight[slider]
    Comment: do the next slider if any
    next slider
    Comment: now have the total weight; divide the weight of each slider by the total
    for each slider do
        weight[slider] = weight[slider] / total_weight
    Comment: this makes sure that the total of all weights is 1.0
    next slider
    Comment: now, for each item execute the following loop
    for each item do
        Comment: set the total utility of this item to zero
        total_utility[item] = 0.0
        Comment: for each attribute, do the following
        for each attribute do
            Comment: calculate the weighted utility for the current attribute as the product of
            the utility as provided by value_function and the relative weight of this attribute
            weighted_utility[item][attribute] = utility[item][attribute] * weight[attribute]
            Comment: add this weighted attribute utility to the total utility
            total_utility[item] = total_utility[item] + weighted_utility[item][attribute]
        Comment: go to the next attribute if any
        next attribute
    Comment: go to the next item if any
    next item
    Comment: for each item, do the following
    for each item do
        Comment: display the item's utility (as a bar graph, printed number, etc.)
        display (total_utility[item])
        Comment: or display each component in a stacked bar chart or pie chart
        display (weighted_utility[item])
    Comment: go to the next item, if any
    next item
    Comment: end
end
__________________________________________________________________________
The source code in Appendix A implements the above pseudo code, as well as related functionality, in one of many possible languages, in this case, C++.
Second, the domain expert identifies 3.2 which fields of the database 2.4, in the opinion of the domain expert, are relevant to the definition of each of the attributes in the vocabulary. This process may involve the definition of intermediate terms. Performance, for example, may be defined as a combination of Acceleration, Top Speed, Handling and Braking. There is no theoretical limit to the number of intermediate levels in the resulting hierarchy. FIG. 5 illustrates a conceptual hierarchy of attributes, showing that Performance is defined by the attributes of Acceleration, Handling, Braking, Passing, and MPH. Acceleration is in turn defined by the times needed to accelerate to particular speeds, e.g. 0-30 mph, 0-60 mph, and so forth. Handling is likewise defined by attributes for Lateral acceleration, Slalom speed, and Balance. These attributes at the lowest level match individual fields in the underlying raw data 2.8 of the database.
Third, the domain expert then defines for each attribute a mapping between the attribute's values and a "util" value, a fuzzy, dimensionless, theoretical utility value ranging from a minimum of 0.0, which signifies "absolutely worthless", to a maximum of 1.0, equivalent to "couldn't be better". The domain expert defines 3.6 the attribute value having the minimum utility, and the attribute value 3.8 having the maximum utility. The domain expert also defines 3.4 the attribute value below which an item will have zero utility regardless of the values of any of its other attributes. For example, a car that gets only 5 mpg may be considered of zero utility overall.
The dialog 6.0 shown in FIG. 6 shows one mechanism that allows the domain expert to set the minimum and maximum values 6.6, as well as defining 3.10 a transfer function 6.10 which maps the input values (raw data 2.8 in the database) to the util value. The dialog 6.0 provides information on the range 6.2 of input values for all records (for a given attribute), and for a selected set 6.4. The domain expert may also specify how out of range values are treated 6.8.
FIG. 7 shows how the relationship between an attribute's value and that attribute's utility, as defined by the transfer function 6.10, is not necessarily linear. In this example, for all practical purposes a car incapable of traveling at least 60 mph scores a zero in top speed utility, as shown by the fact that an input of 60 in input column 7.2 maps to an output util value of 0.0 in column 7.14. From that speed on, though, the util value increases rapidly until it tops out at 120 mph (having a maximum utility value of 100), any improvement upon which is purely academic and does not translate into real-world utility, according to the domain expert defining this attribute. The window 7.0 in FIG. 7 also shows minimum 7.6 and maximum 7.4 util values on the y axis, and minimum 7.8 and maximum 7.10 attribute values on the x axis. Curve 7.12 shows the util values corresponding to any given point between the minimum and maximum attribute values on the x axis. Individual values may be edited manually at edit output 7.16 in order to precisely adjust the transfer function.
Besides monotonic functions such as the one shown in FIG. 7, double-valued ones are common, e.g. a car's minimum desirable weight might be determined by accident behavior, its maximum by fuel economy and handling, with a maximum utility somewhere in between.
The domain expert fine-tunes each of the definitions by assigning 3.12 relative weights, i.e. the degree of importance, to each of the constituent inputs, i.e. database fields or sub-terms, of a term.
This is done by adjusting the sliders in the equalizer panel, such as shown in FIG. 4, and updating the weights, as described in Listing 1.
Fourth, the domain expert identifies 3.12 which pairs of attributes may be considered compensatory, i.e. the shortcomings of one attribute may be compensated for by the strength of another. In addition, the degree of compensation at various combinations of input values is defined here. FIG. 8 illustrates a compensation matrix 8.0 that enables the domain expert to identify which attributes are compensatory with which other attributes by placing an "X" 8.2 at the appropriate location in the matrix 8.0. As illustrated, attributes A and B (e.g. Price and Maintenance Costs) are compensatory, as are B and D, B and F, and D and G. This particular set of relationships illustrates the potential multiple interdependencies between attributes.
Each relationship between compensatory attributes is modeled individually using a compensation map 9.0 as shown in FIG. 9. This 2-D representation of a three dimensional surface defines the combined utility of any combination of Attributes A and B. The surface is interpolated from values 9.2 (0.0 to 1.0) that the domain expert "drops" onto the map, resulting in contour lines, e.g. line 9.4. Relationships that share a common attribute are combined into surfaces of higher dimensionality.
Next, it is determined 3.14 whether or not there are more attributes. Referring to FIG. 3, this process of defining attributes is repeated 3.18 for each attribute that is selected by the domain expert for the domain model. When no more attributes remain 3.16, the domain model 2.10 is complete 3.20.
The domain model 2.10 contains a list of attribute names, associated database fields, and the various attribute value definitions (transfer function, minimum and maximum values). The resulting domain model 2.10 is stored as an external file.
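The hierarchy described above (terms built from weighted sub-terms, bottoming out in database fields with transfer functions) can be pictured with a short Python sketch. It is only an illustration under stated assumptions: the class names `Leaf` and `Term`, the field names, the linear transfer functions and all numbers are hypothetical and are not taken from Appendix B, and compensation maps are left out.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple, Union


@dataclass
class Leaf:
    """Lowest-level attribute, mapped to a single database field."""
    name: str
    field_name: str                      # database field this attribute reads
    transfer: Callable[[float], float]   # raw value -> util

    def utility(self, record: Dict[str, float]) -> float:
        util = self.transfer(record[self.field_name])
        return min(1.0, max(0.0, util))  # keep utils on the 0.0..1.0 scale


@dataclass
class Term:
    """Intermediate term defined by weighted constituent inputs."""
    name: str
    inputs: List[Tuple[float, Union["Term", Leaf]]] = field(default_factory=list)

    def utility(self, record: Dict[str, float]) -> float:
        total_weight = sum(w for w, _ in self.inputs)
        # Normalize the weights to sum to 1.0, then form the weighted sum.
        return sum((w / total_weight) * node.utility(record)
                   for w, node in self.inputs)


def linear(lo: float, hi: float) -> Callable[[float], float]:
    """Linear transfer function: lo -> 0.0, hi -> 1.0 (lo may exceed hi)."""
    return lambda x: (x - lo) / (hi - lo)


# Hypothetical two-level model: Performance = Acceleration + Top Speed.
acceleration = Term("Acceleration",
                    [(1.0, Leaf("0-60 mph", "t_0_60", linear(12.0, 4.0)))])
top_speed = Leaf("Top Speed", "top_speed_mph", linear(60.0, 120.0))
performance = Term("Performance", [(2.0, acceleration), (1.0, top_speed)])

car = {"t_0_60": 8.0, "top_speed_mph": 105.0}
print(round(performance.utility(car), 3))
```

Because each node only needs a `utility` method, additional intermediate levels can be nested without changing the evaluation code, which mirrors the statement that the hierarchy has no theoretical depth limit.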
Since explicit references to field names are stored within this model 2.10, it will only work in concert with the database file that was used to create the domain model. The standalone file format makes it convenient to assemble libraries of domain models 2.10 authored by different domain experts with different points of view.
How well a database item "performs" with regard to a given domain model, i.e. its utility, is determined according to the algorithm in Listing 2, below:
__________________________________________________________________________
Listing 2
__________________________________________________________________________
000 begin
    Comment: this loop executes once for each item in the database
001 for each item do
        Comment: initialize the utility value for this item to zero
002     utility[item] = 0.0
        Comment: this loop executes once for each attribute in the key query
003     for each attribute do
            Comment: get the value for this attribute for this item from the database
004         value = field_name[item]
            Comment: test if the value is "catastrophic" (so bad nothing could
            possibly compensate for this shortcoming)
005         if value < fail_value
                Comment: set the item's utility to zero since item is unusable
006             utility[item] = 0.0
                Comment: go to the end of the loop
007             go to line 015
            Comment: if the value is not catastrophic, but below a certain minimum
008         else if value < min_value
                Comment: then add no utility for this attribute to the current overall
                utility total of the item and go to the end of the attribute loop
009             go to line 014
            Comment: if the value exceeds a certain maximum value
010         else if value > max_value
                Comment: then add the highest score (1.0) to the current utility total
011             utility[item] = utility[item] + 1.0
            Comment: if the value is between the minimum and maximum
012         else
                Comment: then calculate the utility as the product of the criterion
                function return result (as described below) and the attribute's relative weight
013             utility[item] = utility[item] + value_function (value) * weight[attribute]
        Comment: end of loop for this attribute. If there are more attributes to
        process, jump back up to line 003
014     next attribute
    Comment: end of loop for this item. If there are more items to process,
    jump back up to line 001
015 next item
    Comment: all attributes for all items are processed
016 end
__________________________________________________________________________
Generally, the algorithm evaluates each item in the database (line 001), and for each attribute of the item (line 003), obtains the value of the attribute from the appropriate field in the database (line 004). The value is checked (line 005) to see if it is below the fail value, and if so, the item's utility is set to zero (line 006). Otherwise, the value is checked (line 008) to determine if it is below the minimum value, and if so there is no increase in the utility of the item due to this attribute. If the value exceeds the maximum value (line 010), then 1 unit of utility is added to the item's utility (line 011). If the attribute's value falls between the minimum and maximum, then the utility for the item is updated (line 013) as a function of the value of the attribute and the weight of the attribute.
The value_function in line 013 translates an input value of an attribute in the legal range between the minimum value and maximum value into a utility value using the transfer function 7.10. This process begins 10.0 and is illustrated in FIG. 10. Here, the value is determined 10.2 to be either discrete 10.6 or not 10.4. If it is discrete, then the utility is assigned 10.8 given the input value and the transfer function. It is then determined 10.10 whether or not there are more input values. If yes 10.12, then a utility is assigned 10.8 to the value. If not 10.14, then the criterion function 10.16 has been defined.
If the value is determined 10.4 not to be discrete, a predefined function may be used 10.18. If it is determined 10.20 to use a predefined function, it may be selected 10.24 from a list of predefined functions. One or more options for the function may also be selected 10.28. The outcome is a defined criterion function 10.16. If it is determined 10.22 not to use a predefined function, a custom function may be defined 10.26 to produce a defined criterion function 10.16.
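The evaluation loop of Listing 2 can be sketched in Python as follows. This is a minimal illustration rather than the Appendix B code: the dictionary layout, the helper name `make_value_function`, the piecewise-linear interpolation between expert-supplied points and every number in the example are assumptions. As in Listing 2, a value above the maximum adds an unweighted 1.0.

```python
from bisect import bisect_right
from typing import Callable, Dict, List, Sequence, Tuple


def make_value_function(points: Sequence[Tuple[float, float]]) -> Callable[[float], float]:
    """Piecewise-linear transfer function through sorted (input, util) points."""
    xs = [x for x, _ in points]
    us = [u for _, u in points]

    def value_function(value: float) -> float:
        if value <= xs[0]:
            return us[0]
        if value >= xs[-1]:
            return us[-1]
        i = bisect_right(xs, value)          # xs[i - 1] <= value < xs[i]
        x0, x1, u0, u1 = xs[i - 1], xs[i], us[i - 1], us[i]
        return u0 + (u1 - u0) * (value - x0) / (x1 - x0)

    return value_function


def evaluate(items: Dict[str, Dict[str, float]], attributes: List[dict]) -> Dict[str, float]:
    """Return one utility total per item, following the branches of Listing 2."""
    utility = {}
    for name, record in items.items():
        total = 0.0
        for attr in attributes:
            value = record[attr["field"]]
            if value < attr["fail"]:         # catastrophic: item is unusable
                total = 0.0
                break                        # skip the remaining attributes
            if value < attr["min"]:          # below minimum: adds no utility
                continue
            if value > attr["max"]:          # above maximum: highest score
                total += 1.0
            else:
                total += attr["value_function"](value) * attr["weight"]
        utility[name] = total
    return utility


# Hypothetical attribute definitions and records.
attributes = [
    {"field": "top_speed", "fail": 40.0, "min": 60.0, "max": 120.0, "weight": 0.6,
     "value_function": make_value_function([(60.0, 0.0), (90.0, 0.5), (120.0, 1.0)])},
    {"field": "mpg", "fail": 10.0, "min": 15.0, "max": 40.0, "weight": 0.4,
     "value_function": make_value_function([(15.0, 0.0), (40.0, 1.0)])},
]
items = {
    "A": {"top_speed": 90.0, "mpg": 27.5},
    "B": {"top_speed": 130.0, "mpg": 8.0},   # fails on mpg
    "C": {"top_speed": 50.0, "mpg": 40.0},   # top speed below minimum
}
result = evaluate(items, attributes)
print(result)
```

Item B illustrates the fail value: its top speed exceeds the maximum, yet the catastrophic fuel economy resets its total to zero.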
The utility for each attribute is calculated according to the algorithm outlined in Listing 3, which further describes one embodiment of FIG. 10:
__________________________________________________________________________
Listing 3
__________________________________________________________________________
begin
    Comment: discrete attributes can be enumerated (e.g. degrees), continuous ones can't
    if attribute = discrete
        Comment: user enters number of possible input values (e.g. binary -> two)
        num_values = user input (number of possible input values)
        Comment: this loop executes once for each possible input value
        for each input_value do
            Comment: user enters a value between 0 and 1 for this input value
            utility[input_value] = user input (utility value)
        Comment: go to next input value, if any
        next input_value
    Comment: otherwise, i.e. we're dealing w/ a continuous attribute (e.g. salary)
    else
        Comment: predefined functions include linear, logarithmic, exponential, etc.
        if use_predefined_function
            Comment: user enters name or reference number of predefined function
            attribute.function_name = user input (function name)
            Comment: user selects one or more options (such as inverse operation)
            attribute.function_option = user input (option name)
        Comment: otherwise, i.e. if user wants to define own function
        else
            Comment: one possible user interface for defining the curve is shown in a separate figure
            attribute.curve = user input (curve shape)
    Comment: we're all done
end
__________________________________________________________________________
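The two main branches of Listing 3 can be illustrated with the Python sketch below. It is an assumption-laden example, not the patent's implementation: the function names, the particular formulas chosen for the "logarithmic" and "exponential" shapes, and the sample attributes are hypothetical, and the custom-curve branch is omitted.

```python
import math
from typing import Callable, Dict

# Hypothetical predefined shapes; each maps a normalized input t in [0, 1]
# to a util in [0, 1], with shape(0) = 0 and shape(1) = 1.
PREDEFINED: Dict[str, Callable[[float], float]] = {
    "linear": lambda t: t,
    "logarithmic": lambda t: math.log1p(9.0 * t) / math.log(10.0),
    "exponential": lambda t: (math.exp(t) - 1.0) / (math.e - 1.0),
}


def discrete_criterion(utilities: Dict[str, float]) -> Callable[[str], float]:
    """Discrete attribute: one user-entered utility per enumerated input value."""
    for value, util in utilities.items():
        if not 0.0 <= util <= 1.0:
            raise ValueError(f"utility for {value!r} must lie between 0 and 1")
    return lambda value: utilities[value]


def continuous_criterion(lo: float, hi: float,
                         function_name: str = "linear",
                         inverse: bool = False) -> Callable[[float], float]:
    """Continuous attribute: a predefined function plus an optional inverse option."""
    shape = PREDEFINED[function_name]

    def criterion(value: float) -> float:
        t = min(1.0, max(0.0, (value - lo) / (hi - lo)))  # clamp to legal range
        util = shape(t)
        return 1.0 - util if inverse else util

    return criterion


# Hypothetical usage: a two-valued attribute and two continuous ones.
transmission = discrete_criterion({"manual": 0.4, "automatic": 1.0})
salary = continuous_criterion(20000.0, 100000.0, "logarithmic")
price = continuous_criterion(5000.0, 50000.0, "linear", inverse=True)

print(transmission("manual"), round(salary(60000.0), 3), price(27500.0))
```

The inverse option is modeled here as `1.0 - util`, which suits attributes such as price where lower input values are more desirable.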
The source code listing in Appendix B contains exemplary routines for manipulating the data structures necessary for the storage of domain models, as described above with respect to FIGS. 3-9, and deriving utility values from them, as described with respect to FIG. 10 and Listings 2-3. This source code example implements a version of the Database Evaluation System 1.0 for the selection of automobiles, with the exception of routine overhead functions (file and menu handling, OS calls, and so forth), which are of general application.
A second use of the authoring tools 2.2 is the creation of a script 2.12 that controls a few operational details of the DES engine 2.14 execution, such as: 1) which field of the database is used as a thumbnail preview of the item (see FIG. 13), if any; 2) which field contains a full-size image of the item, if any; 3) which attributes are presented in which order to the user for generating a short list of database items for further evaluation; 4) what are the user interface elements used for determining those attributes, and what are minimum and maximum values for each attribute; 5) what are the relative weights of those attributes. The scripting feature is discussed in more detail in the user interface sections below.
2. Selection and Inspection of Database Items
Referring to FIG. 11, there is shown the overall process 11.0 by which the user of the DES 1.0 selects and evaluates items in the database. The DES is started 11.2 by the user or other operator. The DES engine 2.14 loads 11.4 the domain model 2.10 and script 2.12 used to control evaluation of the model. A shortlist is initialized 11.6, which will contain a listing or index of items the user has selected or retrieved. The database and query structures are also initialized 11.6. An audit trail is initialized 11.8 for tracking the user's queries.
The DES engine 2.14 then generates 11.10 a visual key which allows the user to quickly evaluate items in the database. FIGS. 13-16, further described below, illustrate the operation of one type of visual key, referred to herein as the Data Viewer. From the visual key and additional query inputs, the user obtains the shortlist of database items for further evaluation, and compares 11.12 these items for their various attributes. This process of evaluation and input is interactive, with the DES engine 2.14 evaluating each partial query input and immediately updating the visual key with the partial item matches.
The user may decide to individually inspect 11.16 the attributes of an individual item displayed in the visual key, obtaining detailed information about the item. The user may also explore 11.14 neighbors of the item in the database using a proximity searcher, herein referred to as the Navigator. A neighbor is an item having similar attribute values to another item. Exploration of neighbors is done by a nearest neighbor evaluation of attributes of other database items. FIG. 12 provides a detailed flowchart of the selection process including generation of the shortlist and visual key by iterative selection of criteria.
C. User Interface Elements
The process of selection, comparison, inspection and navigation of database items by the user is handled by three user interface modules:
A. Data Viewer
B. Attribute Equalizer
C. Navigator
1. Data Viewer
Most database front ends display the result of a search in summary form, either textually as "32 items out of 256 found", or in the form of a graph indicating the same. This is usually the result of the user initiating a search against her query. Improperly constructed, or narrowly defined, queries often result in no matches at all, at times after considerable effort has been put into the query definition.
An example from the domain discussed here, automobiles, is the construction of a query that searches for rotary-engined minivans for under $9,000 that have a top speed exceeding 150 mph. Many conventional front ends do not alert the user after the second criterion is defined that any further elaboration is moot since the selection list (the list of database items matching the query) is already empty.
In contrast, the Data Viewer displays, at all times during query construction, a window with thumbnail graphical representations of each item in the database currently matching the query. Initially, each item is considered a match when no query is specified. With each step in the process of defining the selection criteria (i.e. a query), a search is run in the background and the resulting score for each item is reflected in one of several, user-selectable ways: the thumbnail corresponding to a particular item changes in size, hue or color according to the score. In the case of only a few hundred database items, performance even on low-end machines like 386s allows real-time screen updates, giving the user instant feedback as the selection criteria are changed.
FIG. 13 illustrates one embodiment of this graphical representation of data by the Data Viewer 13.0. The Data Viewer 13.0 includes a results window 13.8 that contains a plurality of thumbnail images 13.2. Each thumbnail 13.2 represents one of the database items, here one of the 223 cars in the database. In this particular online application, vehicles are visually differentiated only by bodystyle so as to minimize the number of graphic images that need to be downloaded from the database. FIG. 13 shows the initial state of every item matching a yet-to-be-defined query or set of selection criteria. The results window 13.8 depicts each item, in this case automobile models, in iconic form (the number of vehicles in the database was insufficient to fill the last row, thus there are two cars "missing").
For each attribute of an item, the user may establish a criterion for evaluating the attribute, essentially providing a user weighting to the attribute. The controls window 13.10 displays the slider controls 13.4 for each criterion used in the query, in this case price. The slider control 13.4 in this example allows the user to define four points of weighting for any attribute, which can then be internally translated into a utility function 14.0, as shown in FIG. 14. The ability to adjust the weighting of an attribute introduces the element of fuzziness. Since the result of the search is not a simple match or non-match, but a suitability rating on a sliding scale of 1 to 10 (or 0.0 to 1.0, or any other arbitrary, continuous scale), it can be represented as a partial match such as illustrated in FIG. 15. A pane 13.6 may also be included in the control window 13.10 to allow a user to screen items according to specific features of items. In the example shown, the features are vehicle options.
In FIG. 15 the size of a thumbnail of an item is proportional to the score of the item, i.e. how well the item fits the selection range (i.e. between 18K and 30K: full size; from 30K up to 50K the size shrinks, as well as from 18K down to 5K), given the slider position in the control window 13.10 and the corresponding fuzzy weighting. An alternative method of reflecting the match index is by fading out the thumbnail. This is shown in FIG. 16.
FIG. 12 provides a detailed flowchart 12.0 of the operation of the DES in evaluating database items. The program is started 12.2 and initialized 12.4. The user may then select 12.6 an action. The actions may include, for example: selecting 12.8 an item in the database, and subsequently inspecting 12.7 and/or adding 12.9 the item to a shortlist 12.10. The actions may also include comparing items on the shortlist to each other.
Referring to FIG. 12, the user selects 12.12 various attributes, and defines 12.14 a criterion for each attribute using the control window 13.10 slider control. The DES engine modifies 12.16 or expands 12.16 the current query 12.26 to incorporate the new criterion 12.14. The DES engine 2.14 then evaluates 12.18 each item in the database until all items have been evaluated 12.24: it calculates 12.20 the item's score, based on the item's attribute values and upon the criterion and utility function defined by the user, using a fuzzy logic evaluation, updates 12.22 the corresponding item in the display, and creates an array 12.28 of utility scores for the items. The DES engine 2.14 updates the Data Viewer 13.0 or other visual key display. The user may add 12.9 individual items to her short list 12.10 for further evaluation and inspection.
Successive, iterative selection of criteria (such as, in this example, bodystyle, engine and safety features) further reduces the number of database items on the shortlist 12.10. These criteria, and the sequence in which they are presented to the user, are determined by the aforementioned script 2.12 that is produced by the authoring tools 2.2. Listing 4 is the pseudo-code for one embodiment of the Data Viewer routine.
__________________________________________________________________________
Listing 4
__________________________________________________________________________
begin
    Comment: we keep track of the current score for each item (every item starts with a score of 1.0)
    create (total_score_array)
    Comment: it's up to the user to select attributes, so we don't know how often to loop
    while (user selects attribute) do
        Comment: we store the score of each item for this attribute in this array
        create (attribute_score_array)
        Comment: depending on the attribute type, we expect different return values from UI
        case attribute type of
            Comment: this is usually implemented as a simple check box (yes or no)
            binary:
                Comment: user interface returns a boolean value
                get criterion = boolean from user_interface
            Comment: usually implemented as a radio button
            one_of_many:
                Comment: user interface returns a single integer indexing one of the possible choices
                get criterion = integer from user_interface
            Comment: usually implemented as a series of check boxes or multi-selection list
            many_of_many:
                Comment: UI returns an array of integers (first element indicating array length)
                get criterion = integer_array from user_interface
            Comment: this can be implemented as a slider, a knob, a data field, etc.
            single_number:
                Comment: UI returns a single value indicating a maximum, min, ideal, etc. value
                get criterion = real_number from user_interface
            Comment: usually implemented as a dual thumbed slider, two sliders, two fields . . .
            number_range:
                Comment: UI returns two floating point values
                get criterion = real_range from user_interface
            Comment: usually implemented as a graph/slider combo w/ bell curve over target value
            fuzzy_number:
                Comment: UI returns three value pairs describing curve
                get criterion = fuzzy_number from user_interface
            Comment: usually implemented as dual-point (min, max) graph/slider combo
            fuzzy_range:
                Comment: UI returns four value pairs describing max range w/ fall-off
                get criterion = fuzzy_range from user_interface
        Comment: end of possible data types
        end of case
        Comment: for each item in the database we do the following loop
        for each item do
            Comment: get the value for the attribute of this item from the database
            get item_attribute_value from database
            Comment: convert item's value to a score from 0.0 to 1.0
            item_score = value_to_score (item_attribute_value, criterion)
            attribute_score_array[item] = item_score
            Comment: multiply individual score w/ total score so far for this item (note that if
            the score for this item is 0.0, the total score is 0.0 as well, i.e. one strike and the
            item is out)
            total_score_array[item] = total_score_array[item] * item_score
            Comment: display item modified according to its score
            display (item, total_score_array[item])
        Comment: go to next item, if any
        next item
    Comment: let user choose another attribute
    next attribute
    Comment: end of short key definition
end
__________________________________________________________________________
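As an illustration of the `fuzzy_range` case of Listing 4, the Python sketch below turns the four slider points of the price example (5K, 18K, 30K, 50K) into a score and multiplies the scores of successive criteria together. The linear fall-off, the function names, the 64-pixel thumbnail size and the sample cars are assumptions made for this example; the text above does not fix the shape of the utility function 14.0.

```python
from typing import Callable, Dict


def fuzzy_range_score(value: float, lo_zero: float, lo_full: float,
                      hi_full: float, hi_zero: float) -> float:
    """Trapezoidal score: 1.0 inside [lo_full, hi_full], fading to 0.0 outside."""
    if value <= lo_zero or value >= hi_zero:
        return 0.0
    if value < lo_full:
        return (value - lo_zero) / (lo_full - lo_zero)
    if value <= hi_full:
        return 1.0
    return (hi_zero - value) / (hi_zero - hi_full)


def apply_criterion(total_score: Dict[str, float],
                    items: Dict[str, dict], field_name: str,
                    score_fn: Callable[[object], float]) -> None:
    """Multiply each item's running score by its score for one more criterion."""
    for name, record in items.items():
        total_score[name] *= score_fn(record[field_name])


def thumbnail_size(score: float, full_size: int = 64) -> int:
    """Thumbnail edge length proportional to the item's score."""
    return round(full_size * score)


# Hypothetical records; every item starts out as a full match.
cars = {
    "A": {"price": 24000, "airbag": True},
    "B": {"price": 40000, "airbag": True},
    "C": {"price": 11500, "airbag": False},
}
total_score = {name: 1.0 for name in cars}

# First criterion: fuzzy price range taken from the example in the text.
apply_criterion(total_score, cars, "price",
                lambda p: fuzzy_range_score(p, 5000, 18000, 30000, 50000))
after_price = dict(total_score)

# Second criterion: a binary feature; a score of 0.0 removes the item.
apply_criterion(total_score, cars, "airbag", lambda has: 1.0 if has else 0.0)

for name in cars:
    print(name, total_score[name], thumbnail_size(total_score[name]))
```

Because scores are multiplied rather than added, a single zero score removes an item from the visual key regardless of how well it matches the other criteria.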
The source code listing of Appendix C provides an exemplary embodiment of the Data Viewer.
a) Critique Mode
No matter how well chosen the attributes of the domain model are, there will always be cases of non-analytical, indecisive users who are unable to define their criteria tightly enough to end up with a manageably-sized short list. This problem is addressed by a critique mode of the Data Viewer which is invoked either by the user or automatically by the DES engine 2.14 under certain conditions, such as by elapsed time, short list size, and the like.
In the critique mode, one of the items in the database is selected for critique. The goal is to identify an item 17.4 that the user would not consider as a candidate for the shortlist. Either the DES engine 2.14 automatically selects an item and confirms with the user that it is indeed ineligible, or the user manually picks such an item. In order to "back into" a criterion list (that is, to establish the criteria from the user's selections), the user is then presented with a dialog that prompts the user to establish why the selected item would not be included in the shortlist. One such dialog is shown in FIG. 17. The dialog 17.0 includes various menus 17.2 of reasons why the user would not select the item. The user selects reasons from the menus which best describe why the item was not selected.
If the user is able to select one or several reasons from the dialog choices, she is then asked if that criterion can be generalized and applied to all other cars to which it applies equally. If the user confirms, the program will have ascertained one more criterion and based on it will be able to eliminate other items from the selection list that share the same deficiency by having equal or less utility value for the attribute that is associated with the selected reason as does the presented item.
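The generalization step of the critique mode can be sketched as a simple filter. The function name `generalize_critique`, the data layout and the utility values are hypothetical; the only rule taken from the text is that items with equal or less utility for the criticized attribute are eliminated.

```python
from typing import Dict, List


def generalize_critique(candidates: List[str],
                        utilities: Dict[str, Dict[str, float]],
                        rejected: str, attribute: str) -> List[str]:
    """Keep only candidates that beat the rejected item on the criticized attribute."""
    threshold = utilities[rejected][attribute]
    return [c for c in candidates if utilities[c][attribute] > threshold]


# Hypothetical utilities; the user rejects item "A" because of its Economy.
utilities = {
    "A": {"Economy": 0.20},
    "B": {"Economy": 0.15},
    "C": {"Economy": 0.60},
    "D": {"Economy": 0.20},
}
remaining = generalize_critique(["A", "B", "C", "D"], utilities, "A", "Economy")
print(remaining)
```

Note that the rejected item itself is dropped as well, since its utility is equal to the threshold.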
This process repeats until a manageable short list 12.10 is arrived at, or the user creates a short list by manually selecting the items of choice. At this point, program operation proceeds to the following comparison phase.
2. Attribute Equalizer
In traditional multi-attribute comparisons such as printed consumer product reviews and tests, it is the editor's task to define the relevant attributes and their relative weights. A final score for each contender is determined by adding the products of the item's rating for each attribute times that attribute's weight:
$$S = \sum_{i=1}^{n} U_i W_i$$
where
S = total Score
n = Number of attributes
U = Utility value
W = relative Weight of an attribute
i = attribute Index
The Attribute Equalizer makes that process dynamic and puts users in control by allowing them to: 1) select from a variety of attributes the ones most relevant to them; 2) change the definition of those attributes; and 3) change their relative weights in real-time. These added degrees of freedom demonstrate an item's sensitivity to particular weight settings and how different preferences change the outcome of the comparison, while increasing a user's awareness of what considerations go into a decision.
The Attribute Equalizer utilizes a two-part graphical user interface, as shown in FIG. 18. The control part 18.1 is analogous to an audio equalizer, with each slider 18.4 controlling the relative weight of a particular attribute 18.2. The upper display part 18.6 is a bar chart that displays the rating of each item in real-time, using a variety of display modes.
The example of FIG. 18 lets a car buyer compare a number of cars on the basis of attributes such as Safety, Performance, Luxury, Utility, Economy (which would be derived from a database) and Fun (which is a subjective value determined by the editor, or manually entered by the user). The particular setting of sliders 18.4, i.e. weighting of attributes, in FIG. 18 results in the Volvo scoring highest.
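The weighted-sum scoring behind the Attribute Equalizer can be sketched in a few lines of Python: slider settings are normalized to weights that sum to 1.0, each utility is multiplied by its weight, and the products are added per item. The function names, the two cars and their utility values are invented for illustration and do not reproduce the data of FIG. 18.

```python
from typing import Dict, Tuple


def normalize(sliders: Dict[str, float]) -> Dict[str, float]:
    """Scale slider settings so that the resulting weights sum to 1.0."""
    total = sum(sliders.values())
    if total <= 0.0:
        raise ValueError("at least one slider must have a positive setting")
    return {name: setting / total for name, setting in sliders.items()}


def score(utilities: Dict[str, Dict[str, float]],
          sliders: Dict[str, float]) -> Tuple[Dict[str, Dict[str, float]], Dict[str, float]]:
    """Return per-attribute weighted utilities and the total score S per item."""
    weights = normalize(sliders)
    weighted = {item: {attr: u[attr] * w for attr, w in weights.items()}
                for item, u in utilities.items()}
    totals = {item: sum(parts.values()) for item, parts in weighted.items()}
    return weighted, totals


# Hypothetical utilities on the 0.0..1.0 scale.
utilities = {
    "Sedan":    {"Safety": 0.9, "Performance": 0.4, "Fun": 0.3},
    "Roadster": {"Safety": 0.5, "Performance": 0.9, "Fun": 0.9},
}

# Two slider settings that lead to different winners.
_, safety_first = score(utilities, {"Safety": 8, "Performance": 1, "Fun": 1})
_, fun_first = score(utilities, {"Safety": 1, "Performance": 4, "Fun": 5})

print(max(safety_first, key=safety_first.get), safety_first)
print(max(fun_first, key=fun_first.get), fun_first)
```

The `weighted` dictionary holds the per-attribute components that a stacked bar chart would display, while `totals` corresponds to the single bar of the default display mode.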
A different combination of preferences such as those in FIG. 19 can change the outcome significantly.
A car's rating for a given attribute is expressed as a value between 0 (no utility, i.e. lowest possible score) and 1.0 (highest utility, i.e. perfect score). Each bar in the graph above is composed of the weighted utility of the car for each attribute. A stacked bar chart 20.2 as in the Excel example in FIG. 20 illustrates that fact, where each shaded region of a bar corresponds to the utility contribution of a single attribute.
The flowchart in FIG. 21 demonstrates the process 21.0 of changing weights. The process is started 21.2 and the user moves 21.4 one of the sliders 18.4. The Attribute Equalizer normalizes 21.6 the weights of all attributes based on the new slider setting, and then calculates 21.8 weighted attribute scores for each attribute. The sum of the weighted attribute scores for each item in the database is calculated 21.10, and the results displayed 21.12 in the display panel 18.6 of the Attribute Equalizer. The Attribute Equalizer monitors 21.14 for any more changes in the slider positions, and if so 21.16, responds accordingly with another iteration. Otherwise 21.18, the attribute weighting process is terminated 21.20. Weights and utility values are calculated as defined in the pseudo code of Listing 1, above.
Additional features supported by the Attribute Equalizer include:
Predefined evaluation models: Domain models may be contributed from a variety of authors for the user's selection, modification or combination.
Customizability: Hierarchical pulldowns that reflect the entire model hierarchy down to the database fields let users construct their own set of relevant attributes.
Reverse Operation: Allows a user to select the desired "winner". The program will attempt to reverse-engineer a set of corresponding weights to support the "decision", if such a set does indeed exist.
"Explain mode" Definitions of attributes can be revealed to the user in layers of progressively more detail. Multiple display modes The default mode aggregates the individual scores of each attribute into a single total utility value and displays such as a bar. Alternative modes include: Stacked bar chart Each attribute's contribution is color-coded and keyed to the corresponding slider, as shown in FIG. 20. Pie/bar combination The bar references a pie chart the slices of which recursively reference constituent pie charts down to the database level. Variable-size pie The bar is replaced by a pie the size of which is relative to the score. Absolute slider mode In this mode, the slider setting does not indicate the relative weight of an attribute, but a desired level of utility for that attribute. Scoring is done by adding the surpluses and deficiencies separately and displaying the results in a bipolar bar chart. 3. Navigator The Navigator is a proximity searcher. At any stage of the program, this user interface module can be invoked by selecting a database item. The selected item will become the reference to which other items are compared using a proprietary proximity metric for proximity based searching. The purpose of this tool is to find the "nearest neighbor" items that are most similar to the reference item overall, but differ from it according to a particular attribute. Significant elements of this methodology include: Proximity metric The degree of similarity between two database items is defined as the n.sup.th root of the weighted n.sup.th powers of the difference between the items attribute utilities. n-dimensionality The Navigator supports as many dimensions as can be reasonably resolved by the display device. Similarity indicator The degree of similarity can optionally be indicated by the distance between the reference item and the nearest neighbor. 
Quick comparison: At user request, a list of pluses and minuses is automatically generated from the differences between two neighbors.
Fractal nature: Depending on the resolution of the display device, multiple generations of neighbors may be displayed in successive Navigators, allowing one-click access to neighbors more than once removed.

The process of traversing the nodes of the "nearest neighbor web" is shown in the flowchart in FIG. 22. To demonstrate the usefulness of this navigational tool, assume a case in which the user tries to visually locate items in three dimensions. In this case of just three attributes (e.g. Price, Horsepower, and Acceleration), a 2D or 3D scatter graph is adequate to situate each item according to its ratings and relative to the other items in the database for individual attributes, such as in FIG. 23. A display like this illustrates how similar a given car is to each of its neighbors: the farther the distance between two cars, the more different they are.

The Navigator extends the concept of similarity into more than three dimensions and offers a visual interface that allows a user to move from car to car in n-dimensional space, as illustrated in FIG. 24. The Navigator 24.0 includes a center pane 24.2 in which the currently selected item is shown. Surrounding the center pane is one attribute pane 24.4 for each additional attribute or dimension of an item. In FIG. 24, there are eight attribute dimensions, and thus eight attribute panes 24.4 surrounding the center pane 24.2. The example of FIG. 24 represents a slice out of an 8-dimensional space and shows the user those cars that are most similar to the currently selected one at the center pane 24.2, but which also differ, as indicated above the images. In each surrounding pane, the "nearest neighbor" item in the database to the currently selected item is displayed.
For example, from the currently selected car in the center pane 24.2, the Infiniti Q45, the nearest neighbor with respect to the Better Acceleration attribute is the BMW 540i, which is displayed in the attribute pane 24.4 for acceleration. The other attribute panes 24.4 show the nearest neighbor items in each of the other attributes. Note that some attributes may not have a nearest neighbor to display; for example, the Higher Top Speed pane is empty because there is no item in the database that has a higher top speed than the selected car.

Going back to the spatial paradigm in FIG. 13, the Navigator 24.0 "measures the distances" between the current item and all of the other items in the database with respect to each attribute. It then places the closest neighbor in the respective pane for the attribute. Thus, the car in the top left corner is most like the one in the center in all respects except Top Speed. The one at the top is most like the center one in all respects except Luxury, and so on.

The general formula for computing the partial derivative, i.e. the distance D, or similarity, of an item (e.g. a car) from the reference item, is

$$D = \sqrt[n]{\sum_{j=1}^{n} W_j \left(U_j - U_{c,j}\right)^{n}}$$

where n=number of attributes, U=utility value, or "score" (U_j for the item being compared, U_{c,j} for the reference item), W=relative weight of an attribute, c=index of reference item, j=index of current attribute.

The proximity metric measures the distance between the reference and other items in the database by summing the weighted nth powers of the differences between the attribute values and taking the nth root. This approach provides an accurate assessment of the similarity of an item to another item with respect to all of the attributes for the items. The pseudo code listing in Listing 5 provides one implementation of the formula above.
__________________________________________________________________________
Listing 5
__________________________________________________________________________
000 begin
    Comment: create an array of floating point numbers to store the score of each item
001 create (diff_array)
    Comment: loop for all items
002 for each item in database do
    Comment: don't compare reference item to itself
003   if item == reference_item
    Comment: do the next item
004     do next item
    Comment: initialize the sum total of all differences to zero
005   total_difference = 0
    Comment: loop for each attribute
006   for each attribute do
    Comment: difference is between the attribute scores of selected and reference items
007     difference = attribute_score[item] - attribute_score[reference_item]
    Comment: if this is the attribute that we're searching
008     if attribute == target_attribute
    Comment: store the value
009       attribute_diff = difference
    Comment: otherwise . . .
010     else
    Comment: take the nth power of the difference and add it to the total
011       total_difference = total_difference + (difference power (num_attributes - 1))
    Comment: do the next attribute
012   next attribute
    Comment: take the nth root of the total and normalize it
013   total_difference = total_difference root (num_attributes - 1) / (num_attributes - 1)
    Comment: value_of returns a value (0 to 1) that indicates how close the attribute
    difference is to an ideal one (this is a non-linear function). Multiply that by the
    similarity (= 1 - difference) and get a value that indicates how ideal a neighbor
    this item is
014   diff_array[item] = value_of (attribute_diff) * (1 - total_difference)
    Comment: do the next item
015 next item
    Comment: sort the score array
016 sort (diff_array)
    Comment: the first item in the array (if sorted in descending order) is the one to
    display, unless it has a negative value (which indicates that there exist no "better"
    neighbors in the desired search direction)
017 display (diff_array[first_item])
    Comment: all done
018 end
__________________________________________________________________________
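For readers who prefer runnable code, the steps of Listing 5 can be sketched in Python as follows. This is an illustrative reading of the pseudo code, not the Appendix D source. Several details are assumptions made here: value_of is a hypothetical bell-shaped function that peaks at a desired increment (ideal_step) and returns a negative value when an item offers no gain in the search direction; absolute differences are used so that odd powers cannot produce negative sums; and all item names and utilities are invented.

```python
import math


def value_of(attribute_diff, ideal_step=0.2):
    """Hypothetical stand-in for value_of in Listing 5.

    Peaks at 1.0 when the gain equals ideal_step, decays for smaller or
    larger gains, and is negative when there is no gain at all.
    """
    if attribute_diff <= 0.0:
        return -1.0
    return math.exp(-((attribute_diff - ideal_step) / ideal_step) ** 2)


def nearest_neighbor(scores, reference, target, ideal_step=0.2):
    """Return the best neighbor of `reference` along `target`, or None.

    `scores` maps item name -> {attribute name -> utility in [0, 1]}.
    """
    ref = scores[reference]
    others = [a for a in ref if a != target]
    m = len(others)  # corresponds to num_attributes - 1 in Listing 5
    if m == 0:
        raise ValueError("need at least one attribute besides the target")
    ranked = []
    for item, utils in scores.items():
        if item == reference:
            continue  # don't compare the reference item to itself
        # Sum the m-th powers of the differences on all non-target attributes.
        # abs() is an assumption, see the lead-in above.
        total = sum(abs(utils[a] - ref[a]) ** m for a in others)
        total = total ** (1.0 / m) / m  # m-th root, then normalize
        attribute_diff = utils[target] - ref[target]
        ranked.append((value_of(attribute_diff, ideal_step) * (1.0 - total), item))
    if not ranked:
        return None
    best_score, best_item = max(ranked)
    # A non-positive best score means no "better" neighbor exists.
    return best_item if best_score > 0.0 else None


# Invented utilities: B gains the ideal amount and is otherwise identical to R;
# E gains only slightly and differs a lot elsewhere; A has less Performance.
example = {
    "R": {"Performance": 0.50, "Luxury": 0.6, "Economy": 0.5},
    "B": {"Performance": 0.70, "Luxury": 0.6, "Economy": 0.5},
    "E": {"Performance": 0.52, "Luxury": 0.2, "Economy": 0.9},
    "A": {"Performance": 0.40, "Luxury": 0.6, "Economy": 0.5},
}
neighbor_of_r = nearest_neighbor(example, "R", "Performance")
neighbor_of_b = nearest_neighbor(example, "B", "Performance")
```

In this sketch the search from R selects B rather than E, because B combines the desired increment with the highest similarity on the remaining attributes, while the search from B finds no neighbor because no item offers more Performance.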
The source code in Appendix D implements the pseudo code of Listing 5 in C++. Line 14 weights the difference between a reference item and another item with respect to a given attribute by the overall similarity between the item and the reference item for all other attributes.

The scatter chart in FIG. 25 shows, as an example, the results of the above calculation for cars A to G in reference to car R, with respect to a single attribute, Performance. The distance D, or similarity between R and the other cars with respect to all other attributes, is mapped onto the y-axis, and the utility value, or score, U onto the x-axis. In a conventional proximity searcher, car E would be selected as the nearest neighbor because it has just slightly more performance than car R. In contrast, in the preferred embodiment of the Navigator, a heuristic function searches for the car that is "vertically closest" to the reference car R, and "a bit" to the right or left (depending on the search direction). Depending on the particular trade-off characteristics, either car B or C would be chosen to be most similar to car R, but with somewhat more Performance. Note that car A is not closer to car R with this proximity metric.

In FIG. 25, if car C had a performance value at the maximum, then car B would be selected, using the heuristic of Listing 5, as the nearest neighbor. The reason that car B would be preferable is that there is an optimum desired increment in the user's quest for more performance, as defined by the utility function 14.0 constructed from the user's specification of criteria in the control window 13.10. That is, the user is not looking to maximize that attribute, but just to get "somewhat" more of it. That ideal "somewhat", which may be user- and case-dependent, represents some distance along the x-axis.
A car at exactly that spot on the x-axis, and level with car R on the y-axis, would be considered ideal, since it is 100% similar to the reference car R and presents the desired increment in the search attribute.

FIG. 22 provides a flowchart of the process of using the Navigator 24.0. The process is started 22.0 and the user selects 22.2 one of the neighbors in an attribute pane 24.4 by clicking on its image or name. The Navigator 24.0 checks 22.4 to make sure the item is a valid selection, and if so, makes that item the current choice, i.e. moves 22.8 it to the center pane 24.2. If the item is not valid 22.6, another item may be selected 22.2 for a validity determination. For each attribute, the Navigator 24.0 determines 22.10, 22.12 the nearest neighbor to the current item. If there is no such neighbor 22.16, a message is displayed 22.28, or the attribute pane 24.4 is left empty. If there is such a neighbor 22.14, the image of the nearest neighbor is displayed 22.18. The entire "web" of closest neighbors is thus re-computed relative to the new reference item. For a particular reference item, the Navigator then considers whether there are more dimensions 22.20. If there are more dimensions 22.26, the Navigator determines 22.10 the nearest neighbor for the current reference item. If there are no more dimensions 22.24, the Navigator allows a new item to be selected 22.20. By repeatedly moving along the same or different axes (=attributes or dimensions), the user can explore the entire database while incrementally changing the desired mix of attributes.

FIG. 26 illustrates the fractal nature of the web structure into which the Navigator organizes the database items. In this embodiment, for a user selected attribute, the Navigator 24.0 expands the attribute by displaying additional sub-Navigators, each with a center pane 26.2 and surrounding attribute panes 26.4, for 1, 2, or more levels of data.
For each level, the Navigator 24.0 replaces one attribute pane by taking the item in the attribute pane as the reference item, determining the nearest neighbors to that item, and populating a set of attribute panes for that item. Using this fractal expansion, the user can search large portions of the database while having a very intuitive model of the relationships of each item to other items. Given unlimited display resolution, the entire web could be displayed and made randomly accessible to the user.

D. Implementation Issues

All components of the Database Evaluation System 1.0 including the Authoring Tools may be written in Symantec C++ on the Macintosh platform. Porting to Microsoft Corp.'s Windows operating system has not yet been performed; however, the design of the software minimizes such effort, since as much of the code as possible (about 75%) has been written without reliance on platform-specifics such as O/S calls and class libraries.

1. Authoring Tools

The Authoring Tools (2.2 of FIG. 2) is a stand-alone application. The only interface requirements are to the external database 2.0. One currently implemented interface is to xBase and dBase format database files. The API comprises the following calls:
1. int DB_Init (void)
   Initialize the external database engine 2.6.
2. FILE* DB_OpenFile (char* fileName)
   Open specified database file.
3. int DB_GetNumFields (FILE* file)
   Returns the number of fields in the database file.
4. int DB_GetValue (FILE* file, char* fieldName, int fieldNum)
   Returns a pointer to the name of the specified field.
5. int DB_GetFieldType (FILE* file, int fieldnum)
   Returns the data type for the specified database field.
6. void DB_Close (FILE* file)
   Close the database file.

The Authoring Tools 2.2 outputs two types of files: domain models 2.10 and key scripts 2.12.
a) Domain Model

The output of the Authoring Tool 2.2 is a domain model 2.10 that structures the data in a hierarchical fashion as illustrated in FIG. 5. A domain model 2.10 file consists of three sections:
1. Database filename: Since a domain model is specific to a particular database file, each domain model has its own name. A domain model may be applied to any number of different database files having the proper field names for matching to the domain model components listed below.
2. Domain definition list: This is a list of the names of the top-level attributes. These are the attributes presented to the user in the Attribute Equalizer.
3. Term list: A term (such as Economy or Braking in FIG. 5 above) is defined as two or more terms or database fields related as described below.

All terms are compound variables and, as a consequence of the incommensurability of their constituents, can be measured only in arbitrary units which are referred to here as "utils". These can be considered the equivalent of a 1 to 10 scale. The actual range is from 0.0 to 1.0. The calculation that leads to a score for a term is an expansion on qualitative and numerical sum and weight models used in multiple attribute utility theory, specifically:

$$S = \sum_{i=1}^{n} W_i \, f_i(U_i)$$

where S=total Score, n=Number of attributes, f=attribute-specific criterion function, U=Utility value, W=relative Weight of an attribute, i=attribute Index.

The parameters stored with each term in the domain model thus are:
1. Normalized weight: where 0.0<Weight<1.0.
2. Criterion function: Up to 16 value pairs relating input value to output "utils", with intermediate values being linearly interpolated.
3. Out-of-range handling: Treatment mode (clip or alert) for values below minimum and above maximum.

b) Key Script

The second file type generated by the Authoring Tool 2.2 is the key script file 2.12. This file contains the following information:
Thumbnail field: Name of the database field which contains a bitmap to be used as a thumbnail, if any.
Image field: Name of the database field which contains a full-size image of the item, if any.
Key attribute list: A list of criteria that are presented to the user for generating the short list 12.10 (FIG. 12). Each item in the key attribute list contains the following information:
UI controls: Which are the UI controls used for setting those criteria, and what are the minimum and maximum values. The supported types of UI controls are:
   Boolean: Checkboxes
   Enums: Radio buttons
   Floating point values: Single-thumbed slider
   Ranges: Dual-thumbed slider
   Custom functions: x-y diagram
Weight: What are the relative weights of those criteria (used for calculating scores for partial matches).

2. DES Engine

The DES Engine 2.14 communicates with the following external modules as described in more detail below: A. Key Script B. Domain Model C. User Interface D. Database

a) Key Script

The DES engine 2.14 to key script 2.12 ("KS") interface is not application-specific and therefore requires no modification. The API comprises the following calls:
1. FILE* KS_Create (char* fileName)
   Create a new key script with given filename (used only by Authoring Tool 2.2).
2. FILE* KS_Open (char* fileName)
   Open the specified key script (in typical applications, only one default script is used).
3. int KS_GetAttribute (FILE* file, attributeptr attr, int attrNum)
   Reads an attribute structure from the file for the specified attribute.
4. int KS_GetThumbnailFieldName (FILE* file, char* fieldname)
   Returns the name of the database field to be used for a thumbnail.
5. int KS_GetImageFieldName (FILE* file, char* fieldname)
   Returns the name of the database field that contains a picture of the item.
6. int KS_Save (FILE* file)
   Saves the key script referenced by file (used only by Authoring Tools 2.2).
7. void KS_Close (FILE* file)
   Closes the specified script file.
b) Domain Model

Like the DES/KS interface, the DES to domain model 2.10 ("DM") interface is not application- or platform-specific. The interface calls are:
1. FILE* DM_Create (char* fileName)
   Create a new domain model with given filename.
2. FILE* DM_Open (char* fileName)
   Open the specified domain model file.
3. int DM_GetDatabaseFileName (FILE* file, char* fieldname)
   Returns the name of the database to be used with this model.
4. int DM_GetNumTerms (void)
   Returns the number of names in the list of top level domain attributes.
5. int DM_GetTermName (FILE* file, char* termName, int termNum)
   Returns the name of the specified term.
6. int DM_GetTermData (FILE* file, TermPtr pTerm, int termNum)
   Returns information about the specified term.
7. void DM_Close (FILE* file)
   Closes the specified domain model.

c) User Interface

Since the user interface is highly implementation-specific and provides extensive functionality, the DES/UI API contains by far the most calls. A user interface implementation for the Macintosh utilizes API calls organized into the following groups:

1) Data Viewer Module

As described above, the primary function of the Data Viewer module is to define various "hard" criteria, and to monitor their effect on the database items. The following functions make up the core of the API.

i) Control functions:
1. critptr UI_GetCriterionValue (char* criterionName)
   Returns a pointer to a criterion type-specific structure. Such types range from simple booleans to fuzzy sets.
2. int UI_SetCriterionValue (char* criterionName, critptr criterion)
   Sets a criterion definition to the value(s) contained in the criterion parameter.

ii) Display functions:
1. int UI_GetThumbnail (thumbptr* thumb, int itemNum)
   Returns a pointer to a bitmap containing a thumbnail of the requested item that is modified to reflect the item's score, i.e. the degree to which it matches the query.
2. int UI_GetItemScore (double* score, int itemNum)
   Returns an item's score. This call allows the UI to handle the modification of bitmaps locally, instead of retrieving a bitmap every time.
3. int UI_CreateSLDisplay (PortHandle hport, Rect* area, SLDisplayPtr* ppDisplay)
   This is a higher-level display function call. Together with the following SLDisplay calls, it handles the management of a graph port that displays and updates the thumbnails in response to changes in selection criteria.
4. int UI_UpdateSLDisplay (SLDisplayPtr pDisplay)
   Updates the specified display after a change in selection criteria has occurred.
5. int UI_CloseSLDisplay (SLDisplayPtr pDisplay)
   Closes the display and tidies up after itself.

iii) Miscellaneous functions:
1. int UI_InitShortList (listptr* ppList)
   Creates a new short list and returns a pointer to it.
2. int UI_AddToShortList (listptr pList, int itemNum)
   Adds an item to the specified short list.
3. int UI_RemoveFromShortList (listptr pList, int itemNum)
   Removes an item from the specified short list.

2) Attribute Equalizer

The Attribute Equalizer allows a user to change the weights of the given attributes in real-time, as well as to load alternative domain models (DM_Open as described above). API functions include:

i) Control functions:
1. int UI_GetAttrWeight (char* attrName, double* attrValue)
   Returns the current value for the weight of the specified attribute, ranging from 0.0 to 1.0.
2. int UI_SetAttrWeight (char* attrName, double attrvalue)
   The inverse of UI_GetAttrWeight.

ii) Display functions:
1. int UI_GetItemScore (int itemNum, double* itemScore)
   Returns the current score of a database item. This value is used to display an appropriate bar chart.
2. int UI_CreateEQDisplay (PortHandle hport, Rect* area, listptr shortList, EQDisplayPtr* ppDisplay)
   This is a higher-level display function call that creates a bar graph in the specified rectangle of the specified graph port. The DES engine displays the scores for the items in the specified short list.
3. int UI_UpdateEQDisplay (EQDisplayPtr pDisplay)
   Updates the specified display after a change in weights has occurred.
4. int UI_CloseEQDisplay (EQDisplayPtr pDisplay)
   Closes the graph and tidies up after itself.

iii) Miscellaneous functions:
1. int UI_SelectAEMode (int mode)
   Selects from various Attribute Equalizer modes. A discussion of these modes is not contained in the current version of this document.

3) Navigator

The Navigator is, from a user interface point of view, a very simple device to allow the user to explore the most similar neighbors of a selected database item. Its simplicity keeps the list of API calls short:
1. int UI_SetNVModel (char* modelName)
   Specifies which domain model to use for proximity calculations.
2. int UI_SetNVRefItem (int itemNum)
   Specifies which database item becomes the navigator's reference item.
3. int UI_GetNVNeighbor (int attrNum)
   Returns the database item that is the closest neighbor in reference to the given attribute.
4. int UI_CreateNVDisplay (PortHandle hport, Rect* area, NVDisplayPtr* ppDisplay)
   This is a higher-level display function call that creates a navigator display in the specified rectangle of the specified graph port. The DES engine 2.14 handles the display of bitmaps, attribute names, proximity scores, and so forth.
5. int UI_updateNVDisplay (NVDisplayPtr pDisplay)
   Updates the specified display after a new reference item has been selected.
6. int UI_CloseNVDisplay (NVDisplayPtr pDisplay)
   Closes the navigator display and tidies up after itself.
d) Database The interface between the DES and the underlying database is the same as the API for the Authoring Tools, described above. Since the DES engine 2.14 does its own searching, the only frequent call to the database is the retrieval of values of a particular field for all records.
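To illustrate how field values retrieved for all records might feed the domain model, the following Python sketch applies a term's criterion function (value pairs with linear interpolation, plus clip or alert handling for out-of-range inputs) and then forms the weighted sum for a compound term. It is a minimal sketch under stated assumptions, not the engine's code: the names make_criterion and term_score, the field names, the value pairs and the weights are all hypothetical.

```python
from bisect import bisect_right


def make_criterion(pairs, out_of_range="clip"):
    """Build a criterion function from (input value, utils) pairs.

    Inputs between pairs are linearly interpolated. Inputs outside the
    covered range are clipped to the end points ("clip") or rejected
    with an exception ("alert").
    """
    if not 1 <= len(pairs) <= 16:
        raise ValueError("expected between 1 and 16 value pairs")
    points = sorted(pairs)
    xs = [x for x, _ in points]
    ys = [y for _, y in points]

    def criterion(value):
        if value < xs[0] or value > xs[-1]:
            if out_of_range == "alert":
                raise ValueError(f"value {value} outside [{xs[0]}, {xs[-1]}]")
            value = min(max(value, xs[0]), xs[-1])
        if value == xs[-1]:
            return ys[-1]
        hi = bisect_right(xs, value)  # first point strictly above value
        lo = hi - 1
        t = (value - xs[lo]) / (xs[hi] - xs[lo])
        return ys[lo] + t * (ys[hi] - ys[lo])

    return criterion


def term_score(record, constituents):
    """Weighted sum of criterion outputs: S = sum of W_i * f_i(value_i)."""
    return sum(weight * criterion(record[field])
               for field, weight, criterion in constituents)


# Hypothetical compound term built from two invented database fields.
mpg_to_utils = make_criterion([(10, 0.0), (30, 0.5), (50, 1.0)])
price_to_utils = make_criterion([(10000, 1.0), (50000, 0.0)])
economy = [("mpg", 0.6, mpg_to_utils), ("price", 0.4, price_to_utils)]

# Score one invented record, then map a whole column of field values.
economy_score = term_score({"mpg": 40, "price": 20000}, economy)
mpg_column_utils = [mpg_to_utils(v) for v in [5, 20, 30, 60]]
```

Because a criterion function depends only on a single field, an engine following this pattern could retrieve each field once for all records and map the whole column to utils before combining terms.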