Predictive modeling of consumer financial behavior
6430539
Abstract
Predictive modeling of consumer financial behavior is provided by application of consumer transaction data to predictive models associated with merchant segments. Merchant segments are derived from consumer transaction data based on co-occurrences of merchants in sequences of transactions. Merchant vectors representing specific merchants are clustered to form merchant segments in a vector space as a function of the degree to which merchants co-occur more or less frequently than expected. Each merchant segment is trained using consumer transaction data in selected past time periods to predict spending in subsequent time periods for a consumer based on previous spending by the consumer. Consumer profiles describe summary statistics of consumer spending in and across merchant segments. Analysis of consumers associated with a segment identifies selected consumers according to predicted spending in the segment or other criteria, and the targeting of promotional offers specific to the segment and its merchants.
Claims
We claim:
Description
BACKGROUND
TABLE 1
Customer Summary File
Description                  Sample Format
Account_id                   Char[max 24]
Pop_id                       Char (`1`-`N`)
Account number               Char[max 16]
Credit bureau score          Short int as string
Internal credit risk score   Short int as string
Ytd purchases                Int as string
Ytd_cash adv                 Int as string
Ytd_int purchases            Int as string
Ytd int cash adv             Int as string
State code                   Char[max 2]
Zip_code                     Char[max 5]
Demographic 1                Int as string
...
Demographic N                Int as string
Note the additional, optional demographic fields for containing demographic information about each consumer. In addition to demographic information, various summary statistics of the consumer's account may be included. These include any of the following:
TABLE 2
Example Demographic Fields for Customer Summary File
Description                        Explanation
Cardholder zip code
Months on books or open date
Number of people on the account    Equivalent to number of plastics
Credit risk score
Cycles delinquent
Credit line
Open to buy
Initial month statement balance    Balance on the account prior to the first month of transaction data pull
Last month statement balance       Balance on the account at the end of the transaction data pulled
Monthly payment amount             For each month of transaction data contributed or the average over last year.
Monthly cash advance amount        For each month of transaction data contributed or the average over last year.
Monthly cash advance count         For each month of transaction data contributed or the average over last year.
Monthly purchase amount            For each month of transaction data contributed or the average over last year.
Monthly purchase count             For each month of transaction data contributed or the average over last year.
Monthly cash advance interest      For each month of transaction data contributed or the average over last year.
Monthly purchase interest          For each month of transaction data contributed or the average over last year.
Monthly late charge                For each month of transaction data contributed or the average over last year.
Consumer transaction file 406. The consumer transaction file 406 contains transaction level data for the consumers in the consumer summary file. The shared key is the account_id. In a preferred embodiment, the transaction file has the following description.
TABLE 3
Consumer Transaction File
Description            Sample Format
Account_id             Quoted char(24) - [0-9]
Account_number         Quoted char(16) - [0-9]
Pop_id                 Quoted char(1) - [0-128]
Transaction code       Integer
Transaction_amount     Float
Transaction_time       HH:MM:SS
Transaction_date       YYYYMMDD
Transaction type       Char(5)
SIC code               Char(5) - [0-9]
Merchant descriptor    Char(25)
SKU Number             Variable length list
Merchant zip code      Char[max 5]
The SKU and merchant zip code data are optional, and may be used for more fine-grained filtering of which transactions are considered as co-occurring. The output for the DPM is the collection of master files 408 containing a merged file of the account information and transaction information for each consumer. The master file is generated as a preprocessing step before inputting data to the profiling engine 412. The master file 408 is essentially the customer summary file 404 with the consumer's transactions appended to the end of each consumer's account record. Hence the master file has variable length records. The master files 408 are preferably stored in a database format allowing for SQL querying. There is one record per account identifier. In a preferred embodiment, the master files 408 have the following information:
TABLE 4
Master File 408
Description                  Sample Format
Account id                   Char[max 24]
Pop_id                       Char (`1`-`N`)
Account number               Char[max 16]
Credit bureau score          Short int as string
Ytd purchases                Int as string
Ytd_cash_advances            Int as string
Ytd_interest on purchases    Int as string
Ytd_interest on cash advs    Int as string
State_code                   Char[max 2]
Demographic_1                Int as string
...
Demographic N                Int as string
<transactions>
The transactions included for each consumer include the various data fields described above, and any other per-transaction optional data that the financial institution desires to track. The master file 408 preferably includes a header that indicates last update and number of updates. The master file may be incrementally updated with new customers and new transactions for existing customers. The master file database is preferably updated on a monthly basis to capture new transactions by the financial institution's consumers.
The DPM 402 creates the master file 408 from the consumer summary file 404 and consumer transaction file 406 by the following process:
a) Verify minimum data requirements. The DPM 402 determines the number of data files it is handling (since there may be many physical media sources), and the length of the files to determine the number of accounts and transactions. Preferably, a minimum of 12 months of transactions for a minimum of 2 million accounts is used to provide fully robust models of merchants and segments. However, there is no formal lower bound to the amount of data on which system 400 may operate.
b) Data cleaning. The DPM 402 verifies valid data fields, and discards invalid records. Invalid records are records that are missing any of the required fields for the customer summary file or the transaction file. The DPM 402 also indicates missing values for optional fields that have corrupt or missing data. Duplicate transactions are eliminated using account ID, account number, transaction code, transaction amount, date, and merchant description as a key.
c) Sort and merge files. The consumer summary file 404 and the consumer transaction file 406 are both sorted by account ID; the consumer transaction file 406 is further sorted by transaction date.
Additional sorting of the transaction file, for example on time, type of transaction, or merchant zip code, may be applied to further influence the determination of merchant co-occurrence. The sorted files are merged into the master file 408, with one record per account, as described above. Due to the large volume of data involved in this stage, compression of the master files 408 is preferred, where on-the-fly compression and decompression is supported. This often improves system performance due to decreased I/O. In addition, as illustrated in FIG. 4a, the master file 408 may be split into multiple subfiles, such as by population ID or another variable, again to reduce the amount of data being handled at any one time.
E. Predictive Model Generation System
Referring to FIG. 4b, the predictive model generation system 440 takes as its inputs the master file 408 and creates the consumer profiles and consumer vectors, the merchant vectors and merchant segments, and the segment predictive models. This data is used by the profiling engine to generate predictions of future spending by a consumer in each merchant segment using inputs from the data postprocessing module 410. FIG. 5 illustrates one embodiment of the predictive model generation system 440 that includes three modules: a merchant vector generation module 510, a clustering module 520, and a predictive model generation module 530.
1. Merchant Vector Generation
Merchant vector generation is the application of a context vector type analysis to the account data of the consumers, and more particularly to the master files 408. The operations for merchant vector generation are managed by the merchant vector generation module 510. In order to obtain the initial merchant vectors, additional processing of the master files 408 precedes the analysis of which merchants co-occur in the master files 408. Two sequential processes are applied to the merchant descriptions: stemming and equivalencing.
These operations normalize variations of an individual merchant's name to a single common merchant name to allow for consistent identification of transactions at the merchant. This processing is managed by the vector generation module 510.
Stemming is the process of removing extraneous characters from the merchant descriptions. Examples of extraneous characters include punctuation and trailing numbers. Trailing numbers are removed because they usually indicate the particular store in a large chain (e.g. Wal-Mart #12345). It is preferable to identify all the outlets of a particular chain of stores as a single merchant description. Stemming optionally converts all letters to lower case, and replaces all space characters with a dash. This causes all merchant descriptions to be an unbroken string of non-space characters. The lower case constraint has the advantage of making it easy to distinguish non-stemmed merchant descriptions from stemmed descriptions.
Equivalencing is applied after stemming, and identifies various different spellings of a particular merchant's description as being associated with a single merchant description. For example, the "Roto-Rooter" company may occur in the transaction data with the following three stemmed merchant descriptions: "ROTO-ROOTER-SEWER-SERV", "ROTO-ROOTER-SERVICE", and "ROTO-ROOTER-SEWER-DR". An equivalence table is set up containing a root name and a list of all equivalent names. In this example, ROTO-ROOTER-SEWER-SERV becomes the root name, and the latter two of these descriptions are listed as equivalents. During operation, such as generation of subsequent master files 408 (e.g. the next monthly update), an identified equivalenced name is replaced with its root name from the equivalence table.
In one embodiment, equivalencing proceeds in two steps, with an optional third step. The first equivalencing step uses a fuzzy trigram matching algorithm that attempts to find merchant descriptions with nearly identical spellings.
This method collects statistics on all the trigrams (sets of three consecutive letters in a word) in all the merchant descriptions, and maintains a list of the trigrams in each merchant description. The method then determines a closeness score for any two merchant names that are supplied for comparison, based on the number of trigrams the merchant names have in common. If the two merchant names are scored as being sufficiently close, they are equivalenced. Appendix I, below, provides a novel trigram matching algorithm useful for equivalencing merchant names (and other strings). This algorithm uses a vector representation of each trigram, based on trigram frequency in the data set, to construct trigram vectors, and judges closeness based on vector dot products.
Preferably, equivalencing is applied only to merchants that are assigned the same SIC code. This constraint is useful since two merchants may have a similar name, but if they are in different SIC classifications there is a good chance that they are, in fact, different businesses.
The second equivalencing step consists of fixing a group of special cases. These special cases are identified as experience is gained with the particular set of transaction data being processed. There are two broad classes that cover most of these special cases: a place name is used instead of a number to identify specific outlets in a chain of stores, and some department stores append the name of the specific department to the name of the chain. An example of the first case is U-Haul, where stemmed descriptions look like U-HAUL-SAN-DIEGO, U-HAUL-ATLANTA, and the like. An example of the second case is Robinsons-May department stores, with stemmed descriptions like ROBINSONMAY-LEE-WOMEN, ROBINSONMAY-LEVI-SHORT, ROBINSONMAY-TRIFARI-CO, and ROBINSONMAY-JANE-ASHLE. In both cases, any merchant description in the correct SIC codes that contains the root name (e.g. U-HAUL or ROBINSONMAY) is equivalenced to the root name.
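The stemming and the first two equivalencing steps described above can be sketched roughly as follows. This is only an illustrative sketch under several assumptions: stemmed names are upper-cased to match the examples in the text, the closeness score is a plain Dice overlap of trigram sets (a simplified stand-in, not the frequency-weighted trigram vector algorithm of Appendix I), and the function names, the `special_roots` table, and the SIC code value are hypothetical.

```python
import re

def stem(description):
    """Stem a raw merchant descriptor into an unbroken, dash-joined string."""
    text = description.upper()                  # assumption: upper case, as in the examples
    text = re.sub(r"[^A-Z0-9 ]+", " ", text)    # punctuation becomes a space
    text = re.sub(r"[\s\d]+$", "", text)        # drop trailing store numbers
    return "-".join(text.split())               # remaining spaces become dashes

def trigrams(stemmed):
    """Set of three-letter runs taken within each dash-separated word."""
    grams = set()
    for word in stemmed.split("-"):
        grams.update(word[i:i + 3] for i in range(len(word) - 2))
    return grams

def closeness(a, b):
    """Dice overlap of the two trigram sets, in [0, 1] (simplified score)."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return 2 * len(ta & tb) / (len(ta) + len(tb))

def to_root(stemmed, sic, special_roots):
    """Special-case step: map a descriptor containing a known root to that root,
    but only when the SIC code is one listed for the root."""
    for root, sic_codes in special_roots.items():
        if sic in sic_codes and root in stemmed:
            return root
    return stemmed

# Hypothetical special-case table; the SIC code is for illustration only.
special_roots = {"U-HAUL": {"7513"}}

print(stem("Wal-Mart #12345"))
print(round(closeness("ROTO-ROOTER-SEWER-SERV", "ROTO-ROOTER-SERVICE"), 2))
print(to_root(stem("U-Haul San Diego"), "7513", special_roots))
```

In practice the threshold at which two names count as "sufficiently close" would have to be tuned on the transaction data at hand; no particular value is implied here.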
A third, optional step includes a manual inspection and correction of the descriptions for the highest frequency merchants. The number of merchants subjected to this inspection varies, depending upon the time constraints in the processing stream. This step catches the cases that are not amenable to the two previous steps. An example is Microsoft Network, with merchant descriptions like MICROSOFT-NET and MSN-BILLING. With enough examples from the transaction data, these merchant descriptors can also be added to the special cases in step two, above.
Preferably, at least one set of master files 408 is generated before the equivalencing is determined. This is desirable in order to compile statistics on frequencies of each merchant description within each SIC code before the equivalencing is started. Once the equivalencing table is constructed, the original master files 408 are re-built using the equivalenced merchant descriptions. This step replaces all equivalenced merchant descriptors with their associated root names, thereby ensuring that all transactions for the merchant are associated with the same merchant descriptor. Subsequent incoming transaction data can be equivalenced before it is added to the master files, using the original equivalence table.
Given the equivalence table, a merchant descriptor frequency list can be determined describing the frequency of occurrence of each merchant descriptor (including its equivalents). Once the equivalence table is defined, an initial merchant vector is assigned to each root name. The merchant vector training based on co-occurrence is then performed, processing the master files by account ID and then by date as described above.
2. Training of Merchant Vectors: The UDL Algorithm
As noted above, the merchant vectors are based on the co-occurrence of merchants in each consumer's transaction data.
The master files 408, which are ordered by account and within account by transaction date, are processed by account, and then in date order to identify groups of co-occurring merchants. The co-occurrence of merchant names (once equivalenced) is the basis of updating the values of the merchant vectors.
The training of merchant vectors is based upon the unexpected deviation of co-occurrences of merchants in transactions. More particularly, an expected rate at which any pair of merchants co-occur in the transaction data is estimated based upon the frequency with which each individual merchant appears in co-occurrence with any other merchants, and a total number of co-occurrence events. The actual number of co-occurrences of a pair of merchants is determined. If a pair of merchants co-occur more frequently than expected, then the merchants are positively related, and the strength of that relationship is a function of the "unexpected" amount of co-occurrence. If the pair of merchants co-occur less frequently than expected, then the merchants are negatively related. If a pair of merchants co-occur in the data about as often as expected, then there is generally no relationship between them. Using the relationship strengths of each pair of merchants as the desired dot product between the merchant vectors, the values of the merchant vectors can be determined in the vector space. This process is the basis of the Unexpected Deviation Learning algorithm or "UDL".
This approach overcomes the problems associated with conventional vector based models of representation, which tend to be based on overall frequencies of terms relative to the database as a whole.
Specifically, in a conventional model, the high frequency merchants, that is, merchants for which there are very many purchases, would co-occur with many other merchants, and would either falsely suggest that these other merchants are related to the high frequency merchants, or simply be so heavily down-weighted as to have very little influence at all. That is, high frequency merchant names would be treated like high frequency English words such as "the" and "and", which are given very low weights in conventional vector systems specifically because of their high frequency.
However, the present invention takes account of the high frequency presence of individual merchants, and instead analyzes the expected rate at which merchants, including high frequency merchants, co-occur with other merchants. High frequency merchants are expected to co-occur more frequently. If a high frequency merchant and another merchant co-occur even more frequently than expected, then there is a positive correlation between them. The present invention thus accounts for the high frequency merchants in a manner that conventional methodologies cannot.
The overall process of modeling the merchant vectors using unexpected deviation is as follows:
1. First, count the number of times that the merchants co-occur with one another in the transaction data. The intuition is that related merchants occur together often, whereas unrelated merchants do not occur together often.
2. Next, calculate the relationship strength between merchants based on how much the observed co-occurrence deviated from the expected co-occurrence. The relationship strength has the following characteristics:
Two merchants that co-occur significantly more often than expected are positively related to one another.
Two merchants that co-occur significantly less often than expected are negatively related to one another.
Two merchants that co-occur about the number of times expected are not related.
3. Map the relationship strength onto vector space; that is, determine the desired dot product between the merchant vectors for all pairs of items given their relationship strength. The mapping results in the following characteristics:
The merchant vectors for positively related merchants have a positive dot product.
The merchant vectors for negatively related merchants have a negative dot product.
The merchant vectors for unrelated merchants have a zero dot product.
4. Update the merchant vectors from their initial assignments, so that the dot products between them at least closely approximate the desired dot products.
The next sections explain this process in further detail.
a) Co-occurrence Counting
Co-occurrence counting is the procedure of counting the number of times that two items, here merchant descriptions, co-occur within a fixed size co-occurrence window in some set of data, here the transactions of the consumers. Counting can be done forwards, backwards, or bi-directionally. The best way to illustrate co-occurrence counting is to give an example for each type of co-occurrence count:
Example: Consider the sequence of merchant names:
M1 M3 M1 M3 M3 M2 M3
where M1, M2 and M3 stand for arbitrary merchant names as they might appear in a sequence of transactions by a consumer. For the purposes of this example, intervening data, such as dates of transactions, amounts, transaction identifiers, and the like, are ignored. Further assume a co-occurrence window with a size=3. Here, the co-occurrence window is based on a simple count of items or transactions, and thus the co-occurrence window represents a group of three transactions in sequence.
i) Forward Co-occurrence Counting
The first step in the counting process is to set up the forward co-occurrence windows. FIG. 6a illustrates the co-occurrence windows 602 for forward co-occurrence counting of this sequence of merchant names.
By definition, each merchant name is a target 604, indicated by an arrow, for one and only one co-occurrence window 602. Therefore, in this example there are seven forward co-occurrence windows 602, labeled 1 through 7. The other merchant names within a given co-occurrence window 602 are called the neighbors 606. In forward co-occurrence counting, the neighbors occur after the target. For window size=3 there can be at most three neighbors 606 within a given co-occurrence window 602. Obviously, the larger the window size, the more merchants (and transactions) are deemed to co-occur at a time. The next step is to build a table containing all co-occurrence events. A co-occurrence event is simply a pairing of a target 604 with a neighbor 606. For the co-occurrence window #1 in FIG. 6a, the target is M1 and the neighbors are M3, M1, and M3. Therefore, the co-occurrence events in this window are: (M1, M3), (M1, M1), and (M1, M3). Table 5 contains the complete listing of co-occurrence events for every co-occurrence window in this example.
TABLE 5
Forward co-occurrence event table
Co-occurrence window  Target  Neighbor
1 M1 M3
1 M1 M1
1 M1 M3
2 M3 M1
2 M3 M3
2 M3 M3
3 M1 M3
3 M1 M3
3 M1 M2
4 M3 M3
4 M3 M2
4 M3 M3
5 M3 M2
5 M3 M3
6 M2 M3
The last step is to tabulate the number of times that each unique co-occurrence event occurred. A unique co-occurrence event is the combination (in any order) of two merchant names. Table 6 shows this tabulation in matrix form. The rows indicate the targets and the columns indicate the neighbors. For future reference, this matrix will be called the forward co-occurrence matrix.
TABLE 6
Forward co-occurrence matrix (rows: target; columns: neighbor)
Target   M1   M2   M3   Total
M1        1    1    4       6
M2        0    0    1       1
M3        1    2    5       8
Total     2    3   10      15
ii) Backward Co-occurrence Counting
Backward co-occurrence counting is done in the same manner as forward co-occurrence counting except that the neighbors precede the target in the co-occurrence windows. FIG. 6b illustrates the co-occurrence windows for the same sequence of merchant names for backward co-occurrence counting. Once the co-occurrence windows are specified, the co-occurrence events can be identified and counted.
TABLE 7
Backward co-occurrence event table
Co-occurrence window  Target  Neighbor
1 M3 M2
1 M3 M3
1 M3 M3
2 M2 M3
2 M2 M3
2 M2 M1
3 M3 M3
3 M3 M1
3 M3 M3
4 M3 M1
4 M3 M3
4 M3 M1
5 M1 M3
5 M1 M1
6 M3 M1
The number of times that each unique co-occurrence event occurred is then recorded in the backward co-occurrence matrix.
Backward co-occurrence matrix (rows: target; columns: neighbor)
Target   M1   M2   M3   Total
M1        1    0    1       2
M2        1    0    2       3
M3        4    1    5      10
Total     6    1    8      15
Note that the forward co-occurrence matrix and the backward co-occurrence matrix are the transpose of one another. This relationship is intuitive, because backward co-occurrence counting is the same as forward co-occurrence counting with the transaction stream reversed. Thus, there is no need to do both counts; either count can be used, and the transpose of the resulting co-occurrence matrix then taken to obtain the other.
iii) Bi-directional Co-occurrence Counting
The bi-directional co-occurrence matrix is just the sum of the forward co-occurrence matrix and the backward co-occurrence matrix. The resulting matrix will always be symmetric. In other words, the co-occurrence between merchant names A and B is the same as the co-occurrence between merchant names B and A. This property is desirable because this same symmetry is inherent in vector space; that is, for merchant vectors $V_A$ and $V_B$ for merchants A and B, $V_A \cdot V_B = V_B \cdot V_A$. For this reason, the preferred embodiment uses the bi-directional co-occurrence matrix.
Bi-directional co-occurrence matrix (rows: target; columns: neighbor)
Target   M1   M2   M3   Total
M1        2    1    5       8
M2        1    0    3       4
M3        5    3   10      18
Total     8    4   18      30
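The counting procedure for the example sequence can be sketched as below. This is a minimal illustration, not the module's actual implementation: the function names are hypothetical, and counts are held in a `Counter` keyed by (target, neighbor) pairs rather than in a dense matrix. It reproduces the forward and bi-directional counts tabulated above, obtaining the backward counts by transposition.

```python
from collections import Counter

def forward_counts(sequence, window=3):
    """Count (target, neighbor) events where the neighbor follows the target
    within `window` transactions."""
    counts = Counter()
    for pos, target in enumerate(sequence):
        for neighbor in sequence[pos + 1 : pos + 1 + window]:
            counts[(target, neighbor)] += 1
    return counts

def transpose(counts):
    """Swap target and neighbor; turns forward counts into backward counts."""
    return Counter({(b, a): n for (a, b), n in counts.items()})

def bidirectional_counts(sequence, window=3):
    """Sum of forward and backward counts; symmetric by construction."""
    fwd = forward_counts(sequence, window)
    return fwd + transpose(fwd)

seq = ["M1", "M3", "M1", "M3", "M3", "M2", "M3"]
fwd = forward_counts(seq)
bwd = transpose(fwd)
bi = bidirectional_counts(seq)

for target in ("M1", "M2", "M3"):
    print(target, [bi[(target, n)] for n in ("M1", "M2", "M3")])
```

Because only one directional pass is made, the cost of the bi-directional count is essentially that of the forward count alone.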
FIGS. 7a and 7b illustrate the above concepts in the context of consumer transaction data in the master files 408. In FIG. 7a there is shown a portion of the master file 408 containing transactions of a particular customer. This data is prior to the stemming and equivalencing steps described above, and so includes the original names of the merchants with spaces, store numbers and locations and other extraneous data. FIG. 7b illustrates the same data after stemming and equivalencing. Notice that the two transactions at STAPLES which previously identified a store number are now equivalenced. The two car rental transactions at ALAMO which transactions previously included the location are equivalenced to ALAMO, as are two hotel stays at HILTON which also previously included the hotel location. Further note that the HILTON transactions specified the location prior to the hotel name. Finally, the two transactions at NORDSTROMS which previously identified a department have been equivalenced to the store name itself. Further, a single forward co-occurrence window 700 is shown with the target 702 being the first transaction at the HILTON, and the next three transactions being neighbors 704. Accordingly, following the updating of the master files 408 with the stemmed and equivalenced names, the merchant vector generation module 510 performs the following steps for each consumer account: 1. Read the transaction data in date order. 2. Forward count the co-occurrences of merchant names in the transaction data, using a predetermined co-occurrence window. 3. Generate the forward co-occurrence, backward co-occurrence and bi-directional co-occurrence matrixes. One preferred embodiment uses a co-occurrence window size of three transactions. This captures the transactions as the co-occurring events (and not the presence of merchant names within three words of each other) based only on sequence. 
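The per-account steps above might be sketched as follows. This is a rough illustration under stated assumptions: transactions arrive as hypothetical `(account_id, date, merchant)` tuples with `YYYYMMDD` date strings and already stemmed and equivalenced merchant names, the function name is invented, and the sample dates are made up. Windows never span two accounts, since each account is processed separately.

```python
from collections import Counter, defaultdict

def count_across_accounts(transactions, window=3):
    """Accumulate forward, backward and bi-directional co-occurrence counts
    over all accounts, reading each account's transactions in date order."""
    by_account = defaultdict(list)
    for account_id, date, merchant in transactions:
        by_account[account_id].append((date, merchant))

    forward = Counter()
    for rows in by_account.values():
        rows.sort(key=lambda row: row[0])      # YYYYMMDD strings sort by date
        merchants = [merchant for _, merchant in rows]
        for pos, target in enumerate(merchants):
            for neighbor in merchants[pos + 1 : pos + 1 + window]:
                forward[(target, neighbor)] += 1

    # The backward counts are the transpose of the forward counts.
    backward = Counter({(n, t): c for (t, n), c in forward.items()})
    return forward, backward, forward + backward

# Hypothetical sample data; merchant names follow the FIG. 7b discussion.
transactions = [
    ("A1", "19980103", "HILTON"),
    ("A1", "19980101", "STAPLES"),
    ("A1", "19980102", "ALAMO"),
    ("A2", "19980101", "NORDSTROMS"),
    ("A2", "19980105", "STAPLES"),
]
forward, backward, bidirectional = count_across_accounts(transactions)
print(sorted(forward.items()))
```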
In an alternate embodiment the co-occurrence window is time-based, using a date range in order to identify co-occurring events. For example, with a co-occurrence window of 1 week, given a target transaction, a co-occurring neighbor transaction occurs within one week of the target transaction. Yet another date approach is to define the target not as a transaction, but rather as a target time period, and then the co-occurrence window as another time period. For example, the target period can be a three month block and so all transactions within the block are the targets, and then the co-occurrence window may be all transactions in the two months following the target period. Thus, each merchant having a transaction in the target period co-occurs with each merchant (same or other) having a transaction in the co-occurrence period. Those of skill in the art can readily devise alternate co-occurrence definitions which capture the sequence and/or time related principles of co-occurrence in accordance with the present invention.
b) Estimating Expected Co-occurrence Counts
In order to determine whether two merchants are related, the UDL algorithm uses an estimate of the number of times transactions at such merchants would be expected to co-occur. Suppose the only information known about transaction data is the number of times that each merchant name appeared in co-occurrence events. Given no additional information, the correlation between any two merchant names, that is, how strongly they are related, cannot be determined. In other words, we would be unable to determine whether the occurrence of a transaction at one merchant increases or decreases the likelihood of occurrence of a transaction at another merchant.
Now suppose that it is desired to predict the number of times two arbitrary merchants, $\mathrm{merchant}_i$ and $\mathrm{merchant}_j$, co-occur. In the absence of any additional information we would have to assume that $\mathrm{merchant}_i$ and $\mathrm{merchant}_j$ are not correlated.
In terms of probability theory, this means that the occurrence of a transaction at $\mathrm{merchant}_i$ will not affect the probability of the occurrence of a transaction at $\mathrm{merchant}_j$:
$$P_{j|i} = P_j \qquad [1]$$
The joint probability of $\mathrm{merchant}_i$ and $\mathrm{merchant}_j$ is given by
$$P_{ij} = P_i P_{j|i} \qquad [2]$$
Substituting $P_j$ for $P_{j|i}$ into equation [2] gives
$$P_{ij} = P_i P_{j|i} = P_i P_j \qquad [3]$$
However, the true probabilities $P_i$ and $P_j$ are unknown, and so they must be estimated from the limited information given about the data. In this scenario, the maximum likelihood estimates $\hat{P}_i$ and $\hat{P}_j$ for $P_i$ and $P_j$ are
$$\hat{P}_i = T_i / T \qquad [4]$$
$$\hat{P}_j = T_j / T \qquad [5]$$
where $T_i$ is the number of co-occurrence events that $\mathrm{merchant}_i$ appeared in, $T_j$ is the number of co-occurrence events that $\mathrm{merchant}_j$ appeared in, and $T$ is the total number of co-occurrence events. These data values are taken from the bi-directional co-occurrence matrix. Substituting these estimates into equation [3] produces
$$\hat{P}_{ij} = \hat{P}_i \hat{P}_j = T_i T_j / T^2 \qquad [6]$$
which is the estimate for $P_{ij}$. Since there are a total of $T$ independent co-occurrence events in the transaction data, the expected number of co-occurring transactions of $\mathrm{merchant}_i$ and $\mathrm{merchant}_j$ is
$$\hat{T}_{ij} = T \hat{P}_{ij} = T_i T_j / T \qquad [7]$$
This expected value serves as a reference point for determining the correlation between any two merchants in the transaction data. If two merchants co-occur significantly more often than the expected count $\hat{T}_{ij}$, the two merchants are positively related. Similarly, if two merchants co-occur significantly less often than expected, the two merchants are negatively related. Otherwise, the two merchants are practically unrelated.
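As a worked illustration of the expected-count estimate, the sketch below applies it to the bi-directional matrix of the earlier M1/M2/M3 example. It assumes that each merchant's event count is read off as its row total in that matrix and that the total is the grand total of 30; this reading, along with the function name and the nested-dictionary layout, is an assumption made for illustration.

```python
def expected_counts(bi_matrix):
    """Expected co-occurrence count for every merchant pair, computed as
    (row total of i) * (row total of j) / (grand total)."""
    row_totals = {m: sum(row.values()) for m, row in bi_matrix.items()}
    total = sum(row_totals.values())
    return {
        i: {j: row_totals[i] * row_totals[j] / total for j in bi_matrix}
        for i in bi_matrix
    }

# Bi-directional co-occurrence counts from the example sequence.
bi_matrix = {
    "M1": {"M1": 2, "M2": 1, "M3": 5},
    "M2": {"M1": 1, "M2": 0, "M3": 3},
    "M3": {"M1": 5, "M2": 3, "M3": 10},
}
expected = expected_counts(bi_matrix)

for i in bi_matrix:
    for j in bi_matrix:
        print(i, j, "actual", bi_matrix[i][j], "expected", round(expected[i][j], 2))
```

For M1 and M3, for instance, the estimate is 8 x 18 / 30 = 4.8 against an observed count of 5, so in this tiny example the pair co-occurs about as often as expected.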
Also, given the joint probability estimate $\hat{P}_{ij}$ and the number of independent co-occurrence events $T$, the estimated probability distribution function for the number of times that $\mathrm{merchant}_i$ and $\mathrm{merchant}_j$ co-occur can be determined. It is well known, from probability theory, that an experiment having $T$ independent trials (here transactions) and a probability of success $\hat{P}_{ij}$ for each trial (success here being co-occurrence of $\mathrm{merchant}_i$ and $\mathrm{merchant}_j$) can be modeled using the binomial distribution. The total number of successes $k$, which in this case represents the number of co-occurrences of merchants, has the following probability distribution:
$$P(t_{ij} = k) = \binom{T}{k} \hat{P}_{ij}^{\,k} \left(1 - \hat{P}_{ij}\right)^{T-k}$$
This distribution has mean:
$$E[t_{ij}] = T \hat{P}_{ij} = T_i T_j / T$$
which is the same value as was previously estimated using a different approach. The distribution has variance:
$$\mathrm{Var}[t_{ij}] = T \hat{P}_{ij} \left(1 - \hat{P}_{ij}\right)$$
The variance is used indirectly in UDL1, below. The standard deviation of $t_{ij}$, $\sigma_{ij}$, is the square root of the variance $\mathrm{Var}[t_{ij}]$. If $\mathrm{merchant}_i$ and $\mathrm{merchant}_j$ are not related, the difference between the actual and expected co-occurrence counts, $T_{ij} - \hat{T}_{ij}$, should not be much larger than $\sigma_{ij}$.
c) Desired Dot-Products Between Merchant Vectors
To calculate the desired dot product ($d_{ij}$) between two merchant vectors, the UDL algorithm compares the number of observed co-occurrences (found in the bi-directional co-occurrence matrix) to the number of expected co-occurrences. First, it calculates a raw relationship measure ($r_{ij}$) from the co-occurrence counts, and then it calculates a desired dot product $d_{ij}$ from $r_{ij}$.
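The comparison of observed and expected counts under the binomial model can be illustrated numerically. The sketch below uses the standard binomial mean and variance and expresses the gap between actual and expected counts in units of the standard deviation; the function name is hypothetical, and the inputs (an actual count of 5, event counts of 8 and 18, and 30 events in total) are taken from the M1/M3 pair of the earlier example as an assumed reading of its bi-directional matrix.

```python
import math

def deviation_in_sigmas(actual, t_i, t_j, total):
    """Return (expected count, standard deviation, deviation in sigmas) for a
    merchant pair under the independence (binomial) model."""
    p_ij = (t_i / total) * (t_j / total)             # estimated joint probability
    expected = total * p_ij                          # binomial mean
    sigma = math.sqrt(total * p_ij * (1.0 - p_ij))   # binomial standard deviation
    return expected, sigma, (actual - expected) / sigma

expected, sigma, z = deviation_in_sigmas(actual=5, t_i=8, t_j=18, total=30)
print(f"expected={expected:.2f} sigma={sigma:.2f} deviation={z:.2f} sigma")
```

A deviation that is a small fraction of one standard deviation, as here, is what one would expect for an unrelated pair.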
There are at least three different ways that the relationship strength and desired dot product can be calculated from the co-occurrence data:
Method: UDL1 ##EQU4##
Method: UDL2 ##EQU5##
Method: UDL3 ##EQU6##
where $T_{ij}$ is the actual number of co-occurrence events for $\mathrm{merchant}_i$ and $\mathrm{merchant}_j$, and $\sigma_r$ is the standard deviation of all the $r_{ij}$. In UDL2 and UDL3, the log-likelihood ratio, $\ln \lambda$, is given by: ##EQU7##
Each technique calculates the unexpected deviation, that is, the deviation of the actual co-occurrence count from the expected co-occurrence count. In terms of the previously defined variables, the unexpected deviation is:
$$D_{ij} = T_{ij} - \hat{T}_{ij} \qquad [16]$$
where $\hat{T}_{ij}$ denotes the expected co-occurrence count. Thus, $D_{ij}$ may be understood as a raw measure of unexpected deviation. As each method uses the same unexpected deviation measure, the only difference between the techniques is the formula used to calculate $r_{ij}$ from $D_{ij}$. (Note that other calculations of the dot product may be used.)
The first technique, UDL1, defines $r_{ij}$ to be the unexpected deviation $D_{ij}$ divided by the standard deviation of the predicted co-occurrence count. This formula for the relationship measure is closely related to chi-squared ($\chi^2$), a significance measure commonly used by statisticians. In fact ##EQU8##
For small-count situations, i.e. when $T_i \ll 1$, UDL1 gives overly large values for $r_{ij}$. For example, in a typical retail transaction data set, which has more than 90% small counts, values of $r_{ij}$ on the order of $10^9$ have been seen. Data sets having such a high percentage of large relationship measures can be problematic, because in these cases $\sigma_r$ also becomes very large. Since the same $\sigma_r$ is used by all co-occurrence pairs, large values of $\sigma_r$ cause ##EQU9## to become very small for pairs that do not suffer from small counts.
Therefore in these cases d.sub.ij becomes ##EQU10## This property is not desirable, because it forces the merchant vectors of two merchants to be orthogonal, even when the two merchants co-occur significantly more often than expected. The second technique, UDL2, overcomes the small count problem by using log-likelihood ratio estimates to calculate r.sub.ij. It has been shown that log-likelihood ratios have much better small count behavior than .chi..sup.2, while at the same time retaining the same behavior as .chi..sup.2 in the non-small count regions. The third technique, UDL3, is a slightly modified version of UDL2. The only difference is that the log likelihood ratio estimate is scaled by ##EQU11## This scaling removes the ##EQU12## bias from the log likelihood ratio estimate. The preferred embodiment uses UDL2 in most cases. Accordingly, the present invention generally proceeds as follows:
1. For each pair of root merchant names, determine the expected number of co-occurrences of the pair from the total number of co-occurrence transactions involving each merchant name (with any merchant) and the total number of co-occurrence transactions.
2. For each pair of root merchant names, determine a relationship strength measure based on the difference between the expected number of co-occurrences and the actual number of co-occurrences.
3. For each pair of root merchant names, determine a desired dot product between the merchant vectors from the relationship strength measure.
d) Merchant Vector Training
The goal of vector training is to position the merchant vectors in a high-dimensional vector space such that the dot products between them closely approximate their desired dot products. (In a preferred embodiment, the vector space has 280 dimensions, though more or fewer could be used.) Stated more formally: Given a set of merchant vectors V={V.sub.1, V.sub.2, . . . , V.sub.N }, and the set of desired dot products for each pair of vectors D={d.sub.12, d.sub.13, . . . 
, d.sub.1N, d.sub.21, d.sub.23, . . . , d.sub.2N, d.sub.31, . . . , d.sub.N(N-1) }, position each merchant vector such that a cost function is minimized, e.g.: ##EQU13## In a typical master file 408 of transaction data, the set of merchant vectors contains ten thousand or more vectors. This means that finding the optimal solution would require solving a system of ten thousand or more high-dimensional linear equations. This calculation is normally prohibitive given the time frames in which the information is desired. Therefore, alternative techniques for minimizing the cost function are preferred. One such approach is based on gradient descent. In this technique, the desired dot product is compared to the actual dot product for each pair of merchant vectors. If the dot product between a pair of vectors is less than desired, the two vectors are moved closer together. If the dot product between a pair of vectors is greater than desired, the two vectors are moved farther apart. Written in terms of vector equations, this update rule is: ##EQU14## This technique converges as long as the learning rate (.alpha.) is sufficiently small (it is determined by analysis of the particular transaction data being used, and is typically in the range 0.1-0.5); however, the convergence may be very slow. An alternative methodology uses averages of merchant vectors. In this embodiment, the desired position of a current merchant vector is determined with respect to each other merchant vector given the current position of the other merchant vector, and the desired dot product between the current and other merchant vector. An error weighted average of these desired positions is then calculated, and taken as the final position of the current merchant vector.
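Before turning to the equations for the averaging methodology, the gradient-descent approach described above can be sketched, by way of illustration only, as follows. The update used here is an assumption: a plain gradient step on a squared-error cost between desired and actual dot products, which moves two vectors together when their dot product is too small and apart when it is too large. It is not represented to be the update rule of the specification, and the names train_vectors and cost are hypothetical.

```python
import random


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def cost(vectors, desired):
    """Sum of squared differences between desired and actual dot products."""
    return sum((d - dot(vectors[i], vectors[j])) ** 2
               for (i, j), d in desired.items())


def train_vectors(desired, n_vectors, dim, alpha=0.1, epochs=500, seed=0):
    """Position vectors so that their dot products approach the desired values.

    desired maps an index pair (i, j), i != j, to the desired dot product d_ij.
    alpha is the learning rate; it must be small for the updates to settle.
    """
    rng = random.Random(seed)
    vectors = [[rng.uniform(-0.1, 0.1) for _ in range(dim)]
               for _ in range(n_vectors)]
    for _ in range(epochs):
        for (i, j), d_ij in desired.items():
            err = d_ij - dot(vectors[i], vectors[j])
            vi, vj = vectors[i][:], vectors[j][:]
            # err > 0: dot product too small, move the pair closer together.
            # err < 0: dot product too large, move the pair farther apart.
            vectors[i] = [a + alpha * err * b for a, b in zip(vi, vj)]
            vectors[j] = [b + alpha * err * a for a, b in zip(vi, vj)]
    return vectors


# Hypothetical example: two merchants with a desired dot product of 0.5.
targets = {(0, 1): 0.5}
trained = train_vectors(targets, n_vectors=2, dim=2)
print(dot(trained[0], trained[1]), cost(trained, targets))
```

With many thousands of merchants, each pass visits every pair that has a desired dot product, which is consistent with the observation above that convergence of this approach may be slow.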
Written in terms of vector equations, the update rule for this error weighted averaging is: ##EQU15## where V.sub.ij (n+1) is the updated position of the current merchant vector V.sub.i, and U.sub.ij is the desired position of current merchant vector V.sub.i with respect to each other merchant vector V.sub.j. U.sub.ij may be calculated using the formula: ##EQU16## where d.sub.ij is the desired dot product between V.sub.i and V.sub.j, and .epsilon..sub.ij is the current dot product between V.sub.i and V.sub.j. Since U.sub.ij is a linear combination of merchant vectors V.sub.i and V.sub.j, it will always be in the plane of these vectors V.sub.i and V.sub.j. The result of any of these various approaches is a final set of merchant vectors for all merchant names. Appendix II, below, provides a geometrically derived algorithm for the error weighted update process. Appendix III provides an algebraically derived algorithm of this process, which results in an efficient code implementation, and which produces the same results as the algorithm of Appendix II. Those of skill in the art will appreciate that the UDL algorithm, including its variants above, and the implementations in the appendices, may be used in contexts outside of determining merchant co-occurrences. This aspect of the present invention may be used for vector representation and co-occurrence analysis in any application domain, for example, where there is a need for representing high frequency data items without exclusion. Thus, the UDL algorithm may be used in information retrieval, document routing, and other fields of information analysis.
3. Clustering Module
Following generation and training of the merchant vectors, the clustering module 520 is used to cluster the resulting merchant vectors and identify the merchant segments. Various different clustering algorithms may be used, including k-means clustering (MacQueen).
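By way of illustration only, a minimal k-means clustering of merchant vectors can be sketched as follows. The sketch assumes squared Euclidean distance and a deterministic initialization from the first k vectors; both are simplifying assumptions rather than features of the clustering module 520, and the function name is hypothetical.

```python
def kmeans(vectors, k, iterations=20):
    """Cluster vectors into k groups; returns (centroids, assignment)."""
    # Assumed initialization: the first k vectors serve as starting centroids.
    centroids = [list(v) for v in vectors[:k]]
    assignment = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: attach each vector to its nearest centroid.
        for n, v in enumerate(vectors):
            assignment[n] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2
                                  for a, b in zip(v, centroids[c])))
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for n, v in enumerate(vectors) if assignment[n] == c]
            if members:
                centroids[c] = [sum(col) / len(members)
                                for col in zip(*members)]
    return centroids, assignment


# Hypothetical two-dimensional merchant vectors forming two obvious groups.
merchants = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.1, 0.9]]
segment_vectors, membership = kmeans(merchants, k=2)
print(segment_vectors, membership)
```

Each returned centroid plays the role of a merchant segment vector, and the assignment list gives the segment of each merchant.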
The output of the clustering is a set of merchant segment vectors, each being the centroid of a merchant segment, and a list of merchant vectors (thus merchants) included in the merchant segment. There are two different clustering approaches that may be usefully employed to generate the merchant segments. First, clustering may be done on the merchant vectors themselves. This approach looks for merchants having merchant vectors which are substantially aligned in the vector space, and clusters these merchants into segments and computes a cluster vector for each segment. Thus, merchants for whom transactions frequently co-occur and have high dot products between their merchant vectors will tend to form merchant segments. Note that it is not necessary for all merchants in a cluster to all co-occur in many consumers' transactions. Instead, co-occurrence is associative: if merchants A and B co-occur frequently, and merchants B and C co-occur frequently, A and C are likely to be in the same merchant segment. A second clustering approach is to use the consumer vectors. For each account identifier, a consumer vector is generated as the summation of the vectors of the merchants at which the consumer has purchased in a defined time interval, such as the previous three months. A simple embodiment of this is: ##EQU17## where C is the consumer vector for an account, N is the number of unique root merchant names in the customer account's transaction data within a selected time period, and V.sub.i is the merchant vector for the i.sup.th unique root merchant name. The consumer vector is then normalized to unit length. A more interesting consumer vector takes into account various weighting factors to weight the significance of each merchant's vector: ##EQU18## where W.sub.i is a weight applied to the merchant vector V.sub.i. 
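By way of illustration only, the simple and weighted consumer vectors described above can be sketched as follows. The sketch assumes that the weighted form is a weighted sum of merchant vectors and that it is normalized to unit length in the same way as the simple form; the function name and the example weights are hypothetical.

```python
import math


def consumer_vector(merchant_vectors, weights=None):
    """Sum (optionally weighted) merchant vectors and normalize to unit length.

    merchant_vectors holds one vector per unique root merchant name in the
    account's transactions for the selected period; weights, if given, holds
    one weight W_i per merchant vector.
    """
    if weights is None:
        weights = [1.0] * len(merchant_vectors)
    dim = len(merchant_vectors[0])
    total = [0.0] * dim
    for w, v in zip(weights, merchant_vectors):
        for d in range(dim):
            total[d] += w * v[d]
    norm = math.sqrt(sum(x * x for x in total))
    # A zero sum cannot be normalized; it is returned unchanged.
    return [x / norm for x in total] if norm > 0 else total


# Hypothetical account with two merchants, unweighted and weighted.
print(consumer_vector([[1.0, 0.0], [0.0, 1.0]]))
print(consumer_vector([[1.0, 0.0], [0.0, 1.0]], weights=[3.0, 4.0]))
```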
For example, a merchant vector may be weighted by the total (or average) purchase amount by the consumer at the merchant in the time period, by the time since the last purchase, by the total number of purchases in the time period, or by other factors. However computed, the consumer vectors can then be clustered, so that similar consumers, based on their purchasing behavior, form a merchant segment. This defines a merchant segment vector. The merchant vectors which are closest to a particular merchant segment vector are deemed to be included in the merchant segment. With the merchant segments and their segment vectors, the predictive models for each segment may be developed. Before discussing the creation of the predictive models, the training data used in this process is described.
F. Data Postprocessing Module
Following identification of merchant segments, a predictive model of consumer spending in each segment is generated from past transactions of consumers in the merchant segment. Using the past transactions of consumers in the merchant segment provides a robust base on which to predict future spending, and since the merchant segments were identified on the basis of the actual spending patterns of the consumers, the arbitrariness of conventional demographic-based predictions is minimized. Additional non-segment specific transactions of the consumer may also be used to provide a base of transaction behavior. To create the segment models, the consumer transaction data is organized into groups of observations. Each observation is associated with a selected end-date. The end-date divides the observation into a prediction window and an input window. The input window includes a set of transactions in a defined past time interval prior to the selected end-date (e.g. 6 months prior). The prediction window includes a set of transactions in a defined time interval after the selected end-date (e.g. the next 3 months).
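By way of illustration only, the division of an account's transactions into an input window and a prediction window about a selected end-date can be sketched as follows. The sketch assumes that the end-date falls on the first day of a month and that a transaction dated on the end-date itself belongs to the prediction window; the function names and the transaction layout of (date, amount) pairs are hypothetical.

```python
from datetime import date


def add_months(d, months):
    """Shift a first-of-month date by a whole number of months."""
    index = d.year * 12 + (d.month - 1) + months
    return date(index // 12, index % 12 + 1, 1)


def split_observation(transactions, end_date,
                      input_months=6, prediction_months=3):
    """Split (date, amount) transactions into input and prediction windows."""
    input_start = add_months(end_date, -input_months)
    prediction_end = add_months(end_date, prediction_months)
    input_window = [t for t in transactions
                    if input_start <= t[0] < end_date]
    prediction_window = [t for t in transactions
                         if end_date <= t[0] < prediction_end]
    return input_window, prediction_window


# Hypothetical account; a July 1 end-date with 4 input and 2 prediction months.
history = [(date(1999, 3, 15), 20.0), (date(1999, 6, 30), 35.0),
           (date(1999, 7, 1), 10.0), (date(1999, 8, 31), 5.0),
           (date(1999, 9, 1), 60.0)]
inputs, targets = split_observation(history, date(1999, 7, 1), 4, 2)
print(inputs, targets)
```

Sliding the end-date forward one month at a time produces a series of overlapping observations from the same transaction history.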
The prediction window transactions are the source of the dependent variables for the prediction, and the input window transactions are the source of the independent variables for the prediction. More particularly, the inputs for the observation generation module 530 are the master files 408. The output is a set of observations for each account. Each account receives three types of observations. FIG. 8 illustrates the observation types. The first type of observations are training observations, which are used to train the predictive models that predict future spending within particular merchant segments. If N is the length (in months) of the window over which observation inputs are computed then there are 2N-1 training observations for each segment. In FIG. 8, there are shown 16 months of transaction data, from March of one year to June of the next. Training observations are selected prior to the date of interest, November 1. The input window includes the 4 months of past data to predict the next 2 months in the prediction window. The first input window 802a thus uses a selected date of July 1 and includes March-June to encompass the past transactions; transactions in July-August form the prediction window 803a. The next input window 802b uses August 1 as the selected date, with transactions in April-July as the past transactions, and August-September as prediction window 803b. The last input window for this set is 802d, which uses November 1 as its selected date, with a prediction window 803d of observations in November-December. The second type of observations are blind observations. Blind observations are observations where the prediction window does not overlap any of the time frames for the prediction windows in the training observations. Blind observations are used to evaluate segment model performance. In FIG. 8, the blind observations 804 include those from September to February, as illustrated.
The third observation type is action observations, which are used in a production phase. Action observations have only inputs (past transactions given a selected date) and no target transactions after the selected date. These are preferably constructed with an input window that spans the final months of available data. These transactions are the ones on which the actual predictions are to be made. Thus, they should be the transactions in an input window that extends from a recent selected date (e.g. the most recent end of month) back the length of the input window used during training. In FIG. 8, the action observations 806 span November 1 to the end of February, with the period of actual prediction being from March to the end of May. FIG. 8 also illustrates that at some point during the prediction window, the financial institution sends out promotions to selected consumers based on their predicted spending in the various merchant segments. Referring to FIG. 4b again, the DPPM takes the master files 408 and a given selected end-date, and constructs for each consumer, and then for each segment, a set of training observations and blind observations from the consumer's transactions, including transactions in the segment, and any other transactions. Thus, if there are 300 segments, for each consumer there will be 300 sets of observations. If the DPPM is being used during production for prediction purposes, then the set of observations is a set of action observations. For training purposes, the DPPM computes transaction statistics from the consumer's transactions. The transaction statistics serve as independent variables in the input window, and as dependent variables from transactions in the prediction window. In a preferred embodiment, these variables are as follows:
Prediction window: The dependent variables are generally any measure of amount or rate of spending by the consumer in the segment in the prediction window.
A simple measure is the total dollar amount that was spent in the segment by the consumer in the transactions in the prediction window. Another measure may be the average amount spent at merchants (e.g. total amount divided by number of transactions).
Input window: The independent variables are various measures of spending in the input window leading up to the end date (though some may be outside of it). Generally, the transaction statistics for a consumer can be extracted from various groupings of merchants. These groups may be defined as: 1) merchants in all segments; 2) merchants in the merchant segment being modeled; 3) merchants whose merchant vector is closest to the segment vector for the segment being modeled (these merchants may or may not be in the segment); and 4) merchants whose merchant vector is closest to the consumer vector of the consumer. One preferred set of input variables includes:
(1) Recency. The amount of time in months between the current end date and the most recent transaction of the consumer in any segment. Recency may be computed over all available time and is not restricted to the input window.
(2) Frequency. The number of transactions by a consumer in the input window preceding the end-date for all segments.
(3) Monetary value of purchases. A measure of the amount of dollars spent by a customer in the input window preceding the end-date for all segments. The total or average, or other measures may be used.
(4) Recency_segment. The amount of time in months between the current end date and the most recent transaction of the consumer in the segment. Recency may be computed over all available time and is not restricted to the input window.
(5) Frequency_segment. The number of transactions in the segment by a customer in the input window preceding the current end date.
(6) Monetary_segment. The amount of dollars spent in the segment by a customer in the input window preceding the current end date.
(7) Recency nearest profile merchants.
The amount of time in months between the current end date and the most recent transaction of the consumer in a collection of merchants that are nearest the consumer vector of the consumer. Recency may be computed over all available time and is not restricted to the input window.
(8) Frequency nearest profile merchants. The number of transactions in a collection of merchants that are nearest the consumer vector of the consumer by the consumer in the input window preceding the current end date.
(9) Monetary nearest profile merchants. The amount of dollars spent in a collection of merchants that are nearest the consumer vector of the consumer by the consumer in the input window preceding the current end date.
(10) Recency nearest segment merchants. The amount of time in months between the current end date and the most recent transaction of the consumer in a collection of merchants that are nearest the segment vector. Recency may be computed over all available time and is not restricted to the input window.
(11) Frequency nearest segment merchants. The number of transactions in a collection of merchants that are nearest the segment vector by the consumer in the input window preceding the current end date.
(12) Monetary nearest segment merchants. The amount of dollars spent in a collection of merchants that are nearest the segment vector by the consumer in the input window preceding the current end date.
(13) Segment probability score. The probability that a consumer will spend in the segment in the prediction window given all merchant transactions for the consumer in the input window preceding the end date. A preferred algorithm estimates combined probability using a recursive Bayesian method.
(14) Seasonality variables. It is assumed that the fundamental period of the cyclic component is known. In the case of seasonality, it can be assumed that the cycle is twelve months. Two variables are added to the model related to seasonality.
The first variable codes the sine of the date and the second variable codes the cosine of the date. The calculations for these variables are:
Sin Input=sin(2.0*PI*(sample day of year)/365)
Cos Input=cos(2.0*PI*(sample day of year)/365).
(15) Segment Vector-Consumer Vector Closeness. As an optional input, the dot product of the segment vector for the segment and the consumer vector is used as an input variable.
In addition to these transaction statistics, variables may be defined for the frequency of purchase and monetary value for all cases of segment merchants, nearest profile merchants, and nearest segment merchants for the same forward prediction window in the previous year(s).
G. Predictive Model Generation
The training observations for each segment are input into the segment predictive model generation module 530 to generate a predictive model for the segment. FIG. 9 illustrates the overall logic of the predictive model generation process. The master files 408 are organized by accounts, based on account identifiers, here illustratively, accounts 1 through N. There are M segments, indicated by segments 1 through M. The DPPM generates, for each combination of account and merchant segment, a set of input and blind observations. The respective observations for each merchant segment M from the many accounts 1 . . . N are input into the respective segment predictive model M during training. Once trained, each segment predictive model is tested with the corresponding blind observations. Testing may be done by comparing, for each segment, a lift chart generated from the training observations with the lift chart generated from the blind observations. Lift charts are further explained below. The predictive model generation module 530 is preferably a neural network, using a conventional multi-layer organization, and backpropagation training.
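Returning to the input variables enumerated above, the recency, frequency, and monetary statistics and the two seasonality inputs can be sketched, by way of illustration only, as follows. The sketch assumes that recency is counted in whole calendar months, that the monetary measure is a simple total, and that both seasonality inputs use the day of the year; the function names and the transaction layout of (date, amount) pairs are hypothetical. The same function may be applied to any of the merchant groupings by first filtering the transactions to that grouping.

```python
import math
from datetime import date


def rfm(transactions, end_date, input_start):
    """Recency, frequency, and monetary value for one group of merchants.

    Recency looks at all transactions before end_date; frequency and
    monetary value are restricted to the input window.
    """
    prior = [t for t in transactions if t[0] < end_date]
    in_window = [t for t in prior if t[0] >= input_start]
    if prior:
        last = max(t[0] for t in prior)
        recency = (end_date.year - last.year) * 12 + (end_date.month - last.month)
    else:
        recency = None  # no earlier transaction exists for this group
    frequency = len(in_window)
    monetary = sum(amount for _, amount in in_window)
    return recency, frequency, monetary


def seasonality_inputs(sample_date):
    """Sine and cosine of the day of year over an assumed 365-day cycle."""
    day_of_year = sample_date.timetuple().tm_yday
    angle = 2.0 * math.pi * day_of_year / 365
    return math.sin(angle), math.cos(angle)


# Hypothetical account: input window of March through June, end date July 1.
history = [(date(1999, 1, 10), 20.0), (date(1999, 4, 5), 30.0),
           (date(1999, 6, 20), 50.0)]
print(rfm(history, date(1999, 7, 1), date(1999, 3, 1)))
print(seasonality_inputs(date(1999, 7, 1)))
```

Coding the date as a sine and cosine pair lets a model treat late December and early January as neighboring dates, which a single day-of-year number would not.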
In a preferred embodiment, the predictive model generation module 530 is provided by HNC Software's Database Mining Workstation, available from HNC Software of San Diego, Calif. While the preferred embodiment uses neural networks for the predictive models, other types of predictive models may be used. For example, linear regression models may be used.
H. Profiling Engine
The profiling engine 412 provides analytical data in the form of an account profile about each customer whose data is processed by the system 400. The profiling engine is also responsible for updating consumer profiles over time as new transaction data for consumers is received. The account profiles are objects that can be stored in a database 414 and are used as input to the computational components of system 400 in order to predict future spending by the customer in the merchant segments. The profile database 414 is preferably ODBC compliant, thereby allowing the accounts provider (e.g. financial institution) to import the data to perform SQL queries on the customer profiles. The account profile preferably includes a consumer vector, a membership vector describing a membership value for the consumer for each merchant segment, such as the consumer's predicted spending in each segment in a predetermined future time interval, and the recency, frequency, and monetary variables as previously described for predictive model training. The profiling engine 412 creates the account profiles as follows.
1. Membership Function: Predicted Spending in Each Segment
The profile of each account holder includes a membership value with respect to each segment. The membership value is computed by a membership function. The purpose of the membership function is to identify the segments with which the consumer is most closely associated, that is, which best represent the group or groups of merchants at which the consumer has shopped, and is likely to shop in the future.
In a preferred embodiment, the membership function computes the membership value for each segment as the predicted dollar amount that the account holder will purchase in the segment given previous purchase history. The dollar amount is projected for a predicted time interval (e.g. 3 months forward) based on a predetermined past time interval (e.g. 6 months of historical transactions). These two time intervals correspond to the time intervals of the input window and prediction window used during training of the merchant segment predictive models. Thus, if there are 300 merchant segments, then a membership value set is a list of 300 predicted dollar amounts, corresponding to the respective merchant segments. Sorting the list by the membership value identifies the merchant segments at which the consumer is predicted to spend the greatest amounts of money in the future time interval, given their historical spending. To obtain the predicted spending, certain data about each account is input in each of the segment predictive models. The input variables are constructed for the profile consistent with the membership function of the profile. Preferably, the input variables are the same as those used during model training, as set forth above. An additional input variable for the membership function may include the dot product between the consumer vector and the segment vector for the segment (if the models are so trained). The output of the segment models is a predicted dollar amount that the consumer will spend in each segment in the prediction time interval.
2. Segment Membership Based on Consumer Vectors
A second, alternate membership aspect of the account profiles is membership based upon the consumer vector for each account profile. The consumer vector is a summary vector of the merchants that the account has shopped at, as explained above with respect to the discussion of clustering.
In this aspect, the dot product of the consumer vector and segment vector for the segment defines a membership value. In this embodiment, the membership value list is a set of 300 dot products, and the consumer is a member of the merchant segment(s) having the highest dot product(s). With either one of these membership functions, the population of accounts that are members of each segment (based on the accounts having the highest membership values for each segment) can be determined. From this population, various summary statistics about the accounts can be generated, such as cash advances, purchases, debits, and the like. This information is further described below.
3. Updating of Consumer Profiles
As additional transactions of a consumer are received periodically (e.g. each month), the merchant vectors associated with the merchants in the new transactions can be used to update the consumer vector, preferably using averaging techniques, such as exponential averaging over the desired time interval for the update. Updates to the consumer vector are preferably a function of dollars spent, perhaps relative to the mean of the dollars spent at the merchant. Thus, merchant vectors are weighted in the new transaction period by both the time and the significance of transactions for the merchant by the consumer (e.g. weighted by dollar amount of transactions by consumer at merchant). One formula for weighting merchants is: $W_i = S_i e^{\lambda t}$ [28] where W.sub.i is the weight to be applied to merchant i's merchant vector; S.sub.i is the dollar amount of transactions at merchant i in the update time interval; t is the amount of time since the last transaction at merchant i; and .lambda. is a constant that controls the overall influence of the merchant. The profiling engine 412 also stores a flag for each consumer vector indicating the time of the last update.
I. Reporting Engine
The reporting engine 426 provides various types of segment and account specific reports.
The reports are generated by querying the profiling engine 412 and the account database for the segments and associated accounts, and tabulating various statistics on the segments and accounts.
1. Basic Reporting Functionality
The reporting engine 426 provides functionality to:
a) Search by merchant names, including raw merchant names, root names, or equivalence names.
b) Sort merchant lists by merchant name, frequency of transactions, transaction amounts and volumes, number of transactions at merchant, or SIC code.
c) Filter contents of report by number of transactions at merchant.
The reporting engine 426 provides the following types of reports, responsive to these input criteria:
2. General Segment Report
For each merchant segment, a very detailed and powerful analysis of the segment can be created in a segment report. This information includes:
a) General Segment Information
Merchant Cohesion: A measure of how closely clustered the merchant vectors in this segment are. This is the average of the dot products of the merchant vectors with the centroid vector of this segment. Higher numbers indicate tighter clustering.
Number of Transactions: The number of purchase transactions at merchants in this segment, relative to the total number of purchase transactions in all segments, providing a measure of how significant the segment is in transaction volume.
Dollars Spent: The total dollar amount spent at merchants in this segment, relative to the total dollar amount spent in all segments, providing a measure of dollar volume for the segment.
Most Closely Related Segments: A list of other segments that are closest to the current segment. This list may be ranked by the dot products of the segment vectors, or by a measure of the conditional probability of purchase in the other segment given a purchase in the current segment. The conditional probability measure M is as follows: P(A|B) is the probability of purchase in segment A in the next time interval (e.g.
3 months) given purchases in segment B in the previous time interval (e.g. 6 months). P(A|B)/P(A)=M. If M>1, then a purchase in segment B positively influences the probability of purchase in segment A, and if M<1 then a purchase in segment B negatively influences a purchase in segment A. This is because if there is no information about the probability of purchases in segment B, then P(A|B)=P(A), so M=1. The values for P(A|B) are determined from the co-occurrences of purchases at merchants in the two segments, and P(A) is determined from the relative frequency of purchases in segment A compared to all segments. A farthest segments list may also be provided (e.g. with the lowest conditional probability measures).
b) Segment Members Information
Detailed information is provided about each merchant which is a member of a segment. This information comprises: Merchant Name and SIC code; Dollar Bandwidth: The fraction of all the money spent in this segment that is spent at this merchant (percent); Number of transactions: The number of purchase transactions at this merchant; Average Transaction Amount: The average value of a purchase transaction at this merchant; Merchant Score: The dot product of this merchant's vector with the centroid vector of the merchant segment (a value of 1.0 indicates that the merchant vector is at the centroid); SIC Description: The SIC code and its description. This information may be sorted along any of the above dimensions.
c) Lift Chart
A lift chart is useful for validating the performance of the predictive models by comparing predicted spending in a predicted time window with actual spending. Table 10 illustrates a sample lift chart for a merchant segment:
TABLE 10
A sample segment lift chart
Bin   Cumulative segment lift   Cumulative segment lift in $   Cumulative population
1 5.56 $109.05 50,000
2 4.82 $94.42 100,000
3 3.82 $74.92 150,000
4 3.23 $63.38 200,000
5 2.77 $54.22 250,000
6 2.43 $47.68 300,000
7 2.20 $43.20 350,000
8 2.04 $39.98 400,000
9 1.88 $36.79 450,000
10 1.75 $34.35 500,000
11 1.63 $31.94 550,000
12 1.52 $29.75 600,000
13 1.43 $28.02 650,000
14 1.35 $26.54 700,000
15 1.28 $25.08 750,000
16 1.21 $23.81 800,000
17 1.16 $22.65 850,000
18 1.10 $21.56 900,000
19 1.05 $20.57 950,000
20 1.00 $19.60 1,000,000
Base-line -- $19.60
Lift charts are created generally as follows: As before, there is a defined input window and prediction window, for example 6 and 3 months respectively. Data from the total length of these windows, relative to the end of the most recent spending data available, is taken. For example, if data on actual spending in the accounts is available through the end of the current month, then the prior three months of actual data will be used as the prediction window, and the data for the six months prior to that will be the data for the input window. The input data is then used to "predict" spending in the three month prediction window, for which in fact there is actual spending data. The predicted spending amounts are now compared with the actual amounts to validate the predictive models. For each merchant segment then, the consumer accounts are ranked by their predicted spending for the segment in the prediction window period. Once the accounts are ranked, they are divided into N (e.g. 20) equal-sized bins so that bin 1 has the accounts with the highest predicted spending, and bin N has the lowest ranking accounts. This identifies the account holders that the predictive model for the segment indicates are expected to spend the most in this segment. Then, for each bin, the average actual spending per account in this segment in the past time period and the average predicted spending are computed. The average actual spending over all bins is also computed. This average actual spending for all accounts is the baseline spending value (in dollars), as illustrated in the last line of Table 10. This number describes the average that all account holders spent in the segment in the prediction window period. The lift for a bin is the average actual spending by accounts in the bin divided by the baseline spending value.
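By way of illustration only, the lift computation described above can be sketched as follows. The sketch also accumulates the cumulative lift, which is defined in the paragraph that follows, so that both columns of a lift chart are produced in one pass. It assumes that the number of accounts divides evenly into the number of bins; the function name and the example figures are hypothetical and are not the data of Table 10.

```python
def lift_chart(accounts, n_bins):
    """Build per-bin lift rows from (predicted, actual) spending pairs.

    Accounts are ranked by predicted spending, so bin 1 holds the accounts
    the model expects to spend the most. Returns (baseline, rows).
    """
    ranked = sorted(accounts, key=lambda a: a[0], reverse=True)
    # Baseline: average actual spending over all accounts.
    baseline = sum(actual for _, actual in ranked) / len(ranked)
    size = len(ranked) // n_bins  # assumes an even split into bins
    rows, running_total = [], 0.0
    for b in range(n_bins):
        members = ranked[b * size:(b + 1) * size]
        bin_total = sum(actual for _, actual in members)
        running_total += bin_total
        rows.append({
            "bin": b + 1,
            "lift": (bin_total / size) / baseline,
            "cumulative_lift": (running_total / ((b + 1) * size)) / baseline,
        })
    return baseline, rows


# Hypothetical accounts as (predicted dollars, actual dollars), two bins.
sample = [(100.0, 30.0), (80.0, 10.0), (20.0, 0.0), (50.0, 0.0)]
baseline, chart = lift_chart(sample, n_bins=2)
print(baseline)
for row in chart:
    print(row)
```

Because the last bin accumulates every account, its cumulative lift equals the baseline divided by itself, which gives a quick consistency check on any chart produced this way.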
If the predictive model for the segment is accurate, then those accounts in the highest ranked bins should have a lift greater than 1, and the lift should generally increase toward bin 1, which should have the highest lift. Where this is the case, as for example in bin 1 of Table 10, this shows that those accounts in bin 1 in fact spent several times the baseline, thereby confirming the prediction that these accounts would in fact spend more than others in this segment. The cumulative lift for a bin is computed by taking the average spending by accounts in that bin and all higher ranking bins, and dividing it by the baseline spending (i.e. the cumulative lift for bin 3 is the average spending per account in bins 1 through 3, divided by the baseline spending). The cumulative lift for bin N is always 1.0. The cumulative lift is useful to identify a group of accounts which are to be targeted for promotional offers. The lift information allows the financial institution to very selectively target a specific group of accounts (e.g. the accounts in bin 1) with promotional offers related to the merchants in the segment. This level of detailed, predictive analysis of very discrete groups of specific accounts relative to merchant segments is not believed to be currently available by conventional methods.
d) Population Statistics Tables
The reporting engine 426 further provides two types of analyses of the financial behavior of a population of accounts that are associated with a segment based on various selection criteria. The Segment Predominant Scores Account Statistics table and the Segment Top 5% Scores Account Statistics table present averaged account statistics for two different types of populations of customers who shop, or are likely to shop, in a given segment. The two populations are determined as follows.
Segment Predominant Scores Account Statistics Table: All open accounts with at least one purchase transaction are scored (predicted spending) for all of the segments.
Within each segment, the accounts are ranked by score, and assigned a percentile ranking. The result is that for each account there is a percentile ranking value for each of the merchant segments. The population of interest for a given segment is defined as those accounts which have their highest percentile ranking in this segment. For example, if an account has its highest percentile ranking in segment #108, that account will be included in the population for the statistics table for segment #108, but not in any other segment. This approach assigns each account holder to one and only one segment.
Segment Top 5% Scores Account Statistics Table: For the Segment Top 5% Scores Account Statistics table, the population is defined as the accounts with a percentile ranking of 95% or greater in a current segment. These are the 5% of the population that is predicted to spend the most in the segment in the predicted future time interval following the input data time window. These accounts may appear in this population in more than one segment, so that high spenders will show up in many segments; concomitantly, those who spend very little may not be assigned to any segment. The number of accounts in the population for each table is also determined and can be provided as
