System and method for forecasting intermittent demand6205431Abstract A system and method for forecasting intermittent demand. The forecasting technique utilizes sample reuse techniques to build a distribution of predicted cumulative lead time demand values that can be analyzed using statistical methods, be used as input for an inventory control system, be used as input for a sales planning system, etc. Claims We claim: Description BACKGROUND ART
TABLE 1
MONTH: 1 2 3 4 5 6 7 8 9 10 11 12
.vertline. 13 14 15 16
DEMAND: 0 0 0 4 0 0 1 0 1 0 2 0
.vertline. ? ? ? ?
Of particular interest is forecasting a distribution of possible LTD values for a given lead time. In a traditional bootstrap methodology, random samples are taken from the historical data and assigned to months 13-16 to form a series of lead time demand values. This is then repeated or replicated N times to yield N series, each having a cumulative lead time demand value or LTD sum indicated in Table 2 as LTD(n). Table 2 provides an LTD distribution with five replications, i.e., N=5.
TABLE 2
n Series LTD(n)
1 0 0 1 2 3
2 0 0 0 0 0
3 0 1 0 0 1
4 2 4 0 4 10
5 0 0 2 0 2
In a typical real world application, the number of replications N would be a much higher number, for example N=1000. A distribution of the N demand values can then be analyzed to generate statistical information and/or be fed into an inventory control program. Other variations of bootstrapping can likewise be implemented. In addition to bootstrapping, the present invention discloses a second technique for forecasting intermittent data, referred to as the subseries method. The basis of this approach is straightforward. In order to predict what kinds of total demand can arise over a lead time of a predetermined period (e.g., 3), the overlapping subseries sums are examined. For example, the sums of all sets of three consecutive values recorded in the demand history would be utilized as forecast values. For instance, the sum of observations 1-3, the sum of observations 2-4, etc. would be computed. The results of each sum provide the basis of a distribution of possible lead time demand values. II. Detailed Description--The Smart Bootstrap.TM. One of the preferred embodiments of the invention utilizes a variant of the traditional bootstrap method in order to simulate the distribution of demand over a fixed lead time. Traditional bootstrapping samples randomly from the historical time series. This preferred embodiment, referred to as the Smart Bootstrap, provides two improvements over traditional bootstrapping. First, the method incorporates a Markov modeling technique to better predict if the next value is going to be zero or non-zero. This improves performance because it is recognized that nonzero demand often comes in bunches or runs (i.e., it is positively autocorrelated). Likewise, it is also possible, although less likely, for demand to almost never have two or more consecutive nonzero values (i.e., to be negatively autocorrelated). Second, the method incorporates a novel jittering technique. Jittering is a process in which selected historical values are randomly changed to "neighboring" values in order to enlarge the set of possible forecast values. A neighborhood, or set of neighboring values can be defined as a range of values centered around a historical value. This process likewise improves performance since it is recognized that the set of possible future values is larger than the particular set of values observed in the history. For example, if the history only records values of 0, 1, and 3, the forecasted values should include other values, such as 2's, 4's, 5's, etc. Referring now to FIG. 1, a computer system 10 is shown depicting a forecasting program 22 residing in memory system 18. The computer system comprises a CPU 12, and input/output (I/O) 14, and a bus 16. It is recognized that computer system 10 could comprise additional and related components, and that each of the computer system components could reside in a single computer system or a distributed network, such as a local area network (LAN) the world wide web. Furthermore, the invention could be implemented with multiple CPU's in a parallel processing system, or by special purpose hardware. As can be seen, memory system 18 comprises intermittent data 20 and forecasting program 22. Intermittent data 20, as described above, is generally characterized as a set of time series data values containing frequent zero values intermixed with random nonzero values. Although the preferred embodiment is primarily concerned with intermittent data 20, it is understood that other types of data sets could likewise be used by the software program described herein. Forecasting program 22 is shown to comprise a data input mechanism 24, a lead time value generator 26, a replication mechanism 33, and an analyzing and reporting mechanism 34. The data input mechanism 24 is primarily used to input intermittent data 20. Intermittent data 20 will generally be a collection of the user's historical demand data over a period of time for one or more parts or items. Lead time value generator 26 is primarily used to generate a series of forecasted lead time demand values (i.e., an "LTD series") based on the intermittent data. Replication mechanism 33 is essentially a looping mechanism used to generate "replicates" of LTD data, i.e., a plurality of LTD series. Analyzing and reporting mechanism 34 provides tabular and graphical data based on the generated LTD data, and provides lead time demand calculations based upon a statistical analysis of the LTD data. The lead time value generator 26 calculates each lead time value for a LTD series utilizing a Markov mechanism 28, an instantiation mechanism 30, and a jittering mechanism 32. Once a LTD series is generated, the values can be summed to generate an LTD sum. The process is generally then repeated N times to generate N LTD sums. The actual number of replications N can be user defined or defined by some other constraint. Once the N LTD sums are collected, the analyzing and reporting mechanism 34 is used to provide statistical results that can be examined by the user, or fed into a subsequent process, such as an inventory control system 35. The particular methods used by the forecasting program 22 will be described in more detail below. It is understood that the forecasting program 22 can be implemented using some or all of its components and/or sub-components 28, 30, 32, without departing from the scope of the invention. For example, LTD forecasting could still be achieved pursuant to this invention without one or more of the Markov mechanism 28, the instantiation mechanism 30, or the jittering mechanism 32. Moreover, it is understood that forecasting program 22 can reside in any computer readable format suitable for storing computer instructions, including magnetic media, optical media, random access memory, read only memory, transmission media, etc. Referring now to FIG. 2, an overview of the flow of the forecasting program 22 is depicted. Specifically, the forecast program 22 comprises three inputs; a lead time 54, a number N for specifying the number of replications 56, and the historical intermittent demand data 64. The forecast program outputs a distribution 58 of N LTD sums, which are either presented in a graphical or tabular display 66, or statistically analyzed 60 to provide summary statistics 62. The statistical analysis may examine a mean, median, standard deviation, minimum, fixed percentile (e.g., 99th), and/or a maximum. Alternatively, the user could preselect demand levels of interest and then show the estimated probability of exceeding those levels. It is understood that although the preferred embodiments describe inputting a lead time 54 that is fixed, the term lead time 54 may also encompass a random lead time. Specifically, the random lead time could comprise a distribution of possible lead times. In this case, each replicate (or series of lead time data) would be of varying size, depending upon a random selection from the inputted distribution of lead times. Also shown in FIG. 2 is an inventory control system 35 that receives the distribution of lead time demand values 58 along with user supplied data 59. The user supplied data 59 may include holding, ordering, and shortage costs for inventory items. The data 59 may also include service level percentages that the user would like to meet. These may include percentage of units supplied from inventory, percentage chance of no stock out during lead time, and percentage chance of replenishment shortfalls. The output of the inventory control system generally comprises reorder information 55 and performance measures 57. The reorder information 55 may include an economic order quantity (EOQ) and reorder point for a continuous review model and a review interval and order up-to level for a periodic review model. The performance measures 57 provide percentages for the above mentioned service levels. Referring now to FIG. 3, a flow chart for implementing any bootstrap methodology in accordance with this invention is depicted. As described above, the goal is to generate a distribution of LTD series and/or sums based upon historical demand data. The first step 36 is to choose "a next LTD value(s)" as a function of some sampling and/or random selection from the historical data. Thus, assuming that the historical data is made up of 12 data values, such as that shown in Table 3,
TABLE 3
MONTHS: 1 2 3 4 5 6 7 8 9 10 11 12
0 0 0 4 0 0 1 0 1 0 2 0
a sampling and/or random selection of some or all of the historical values is utilized as part of the algorithm for determining each next LTD value(s). The exact method and type of sampling/random selection used will depend on the specific bootstrap methodology implemented. In a traditional bootstrap methodology, the next lead time value will be a straight forward random selection from the data set. In the Smart Bootstrap, the selection will be made in accordance with the methodology described below with regard to FIG. 4. Other bootstrap variants, such as the moving block bootstrap, the stationary bootstrap, and the threshold bootstrap provide alternative methods for choosing next LTD values. Details of these variations can be found in the following references, which are herein incorporated by reference. Efron, B., and Tibshirani, R. J., An Introduction to the Bootstrap, Chapman and Hall, New York, 1993; Kunsch, H., "The Jackknife And The Bootstrap For General Stationary Observations," Annals of Statistics 17 (1989), 1217-1241; Politis, D. N. and Romano, J. P., "The Stationary Bootstrap," Journal of the American Statistical Association 89 (1994), 1303-1313; Park, D. and Willemain, T. R.: "The Threshold Bootstrap and Threshold Jackknife," Technical Report 37-97-461, Department of Decision Sciences and Engineering Systems, Rensselaer Polytechnic Institute, Troy, N.Y., 1997. Next, the forecasting process 36 is repeated until an LTD series is complete 38. The size of the LTD series is equal to the number of months or other time units that make up the lead time, and will generally be user defined or could be random, as explained above. For example, if a user wanted to know how many widgets would be required over the next four time periods, e.g., months, the LTD series size would equal four. Once the LTD series is complete, the lead time values making up this series are summed 40 to generate a first lead time demand sum, e.g., LTD(1). Next 42, the entire process is replicated N times to create N LTD sums, LTD(1), LTD(2) . . . LTD(N). Once the N LTD sums are created, a histogram and/or other type of distribution can be stored and/or displayed 44. The distribution can then be analyzed 46 to generate summary statistics, or be used as input to an inventory control system 35. Referring now to FIG. 4, the step 36 of forecasting a next LTD value in accordance with the Smart Bootstrap is described. The forecasting method has three stages: 1. (1a) Estimation of a two-state Markov model of demand values, with states "zero" and "nonzero," and (1b) Prediction of the occurrence of zero and nonzero values over the lead time; 2. Instantiation of nonzero values with randomly selected nonzero values from the history; and 3. Jittering of the nonzero values. The methodology is explained using the following notation: X(t)=demand value at time t X*=a randomly selected nonzero value of demand T=number of historical demand values L=fixed lead time or forecast horizon N=a nonzero demand, for whatever value S=a jittered value of historical demand Poo=Prob[next demand is zero, given last demand was zero] Pon=Prob [next demand is nonzero, given last demand was zero] Pno=Prob [next demand is zero, give last demand was nonzero] Pnn=Prob [next demand is nonzero, given last demand was nonzero] U=random number uniformly distributed between 0 and 1 Z=random number with a standard normal (i.e., Gaussian) distribution (mean=0, variance=1). The first step 48 is to determine if the next LTD value is zero or nonzero. The preferred methodology for performing this step is to utilize a Markov model as explained below. 1a. Estimation of Markov model A Markov model allows the chance that the next demand value will be nonzero to depend on whether the last demand was zero or nonzero. Commonly, nonzero demands often come in spurts, so the probability of a nonzero demand is often greater just after observing a nonzero demand than after observing a zero demand, i.e., Pnn>Pon. This effect can be important over short lead times. The Markov model has four transition probabilities, but two of them are determined once the other two are known, since Poo+Pon=1 (1) and Pno+Pnn=1. (2) Poo and Pnn can be estimated from the history, and the other two values, Pon, Pno, can be estimated using the relationships above. A naive estimate of Poo is given by Poo={# of 0 to 0 transitions}/{# of 0 to 0 and # of 0 to N transitions}. (3) However, if either all or none of the transitions are 0 to 0, this would give Poo=1 or 0, respectively. These estimates would be wrong, since a transition from 0 to 0 would be neither a certain nor an impossible event. Accordingly, the preferred method uses "started counts" in the estimates, such as: Poo=[1/6+{# of 0 to 0 transitions}]/[1/3+{# of 0 to 0+# of 0 to N transitions} (4) Pnn=[1/6+{#of N to N transitions}]/[1/3+{# of N to 0+# of N to N transitions}. (5) 1b. Prediction of zero and nonzero values Once estimates of the Markov transition probabilities are made, they can be used to generate strings of zero and nonzero values over the lead time. Temporarily, the value 1 may be used to stand for any nonzero value. For each time period t from T+1 to T+L, the logic is as follows: IF X(t-1)=0 (6) THEN (generate a transition from state 0) IF U<Poo THEN X(t)=0 ELSE X(t)=1 ELSE (generate a transition from state N) IF U<Pnn THEN X(t)=1 ELSE X(t)=0 This generates strings that look as shown below in Table 4. This example is based on historical data that ended with a zero value ("Final State=0"). The forecasts are made for 12 periods into the future (i.e., a lead time of 12), and four replications (Rep. 1-4) of the bootstrap are shown. Each replication comprises a string of zero/non-zero data.
TABLE 4
Final State = 0
Zero vs Non-Zero State Values:
Lead Time: 1 2 3 4 5 6 7 8 9 10 11 12
Rep#: 1 1 1 1 1 0 0 1 0 0 1 0 0
2 0 0 1 1 1 1 0 0 0 0 0 1
3 0 0 0 1 0 1 0 0 0 1 1 0
4 0 1 0 1 1 0 0 0 0 0 0 0
The next step 50 is to replace the 1's with appropriate values from the history as described below. 2. Instantiation of nonzero values Wherever the string of future values contains a 1, it is substituted with one of the historical nonzero values, selected at random. For example, if the nonzero values in the history were {1, 2, 1, 11, 7, 1, 1, 1}, each of those eight values would have a 1/8 chance of being selected. Note that the five instances of a demand of 1 are treated as five separate values. This process is bootstrapping, and these bootstrap values are referred to as X*. The result of this step is a set of forecast values that looks realistic as shown in Table 5. For example, in the first replication, the forecast for 2 periods ahead becomes 7 and the forecast for seven periods ahead becomes 2.
TABLE 5
1 2 3 4 5 6 7 8 9 10 11 12
Raw Demand 1 7 1 1 0 0 2 0 0 1 0 0
Values: 0 0 11 1 2 1 0 0 0 0 0 0
0 0 0 1 0 1 0 0 0 1 11 0
0 7 0 1 2 0 0 0 0 7 0 0
Next, the nonzero lead time values are jittered 52 in accordance with the following procedure. 3. Jittering the nonzero values Often, the historical data will only contain a few distinct nonzero values. However, one could expect different values to appear in the future, so jittering is used to generate some other integers beyond the set in the history. This is done in such a way that there is more variation around larger numbers than smaller. Thus, if the historical value selected in the previous stage is a 2, the jittering process might substitute a "neighboring value," such as a 1, a 3 or maybe even a 4 in place of the 2. In contrast, if the historical value is a 25, the range of neighboring values is chosen to be much greater, so that the jittering process might substitute a 36 or an 18 for the 25. The jittering process smears out the distribution of nonzero demands from a few observed values to a set of normal curves centered on the observed values. The jittering process introduces a slight upward bias into the forecasts. However, experiments revealed that jittering generally improves accuracy, especially for short lead times, for which there is less variety in the distribution of bootstrap results. The jittering process may be implemented as follows: S=1+INT {X*+ZX*} (7) IF S.ltoreq.0, THEN S=X*. (8) Equation (7) creates a new integer value S (i.e., a "neighboring value") centered on the historical value X*. Equation (8) aborts the jittering process if it creates a demand that is non-positive. An alternative way to handle the bad luck of a non-positive S would be to repeat the random sampling process in equation (7) until S.gtoreq.0. However, equation (8) seems to work well and will be faster. Now the final results include some values not seen in the history. For instance, the value of 7 in the second period of the first replication in Table 5 becomes a 10 in Table 6.
TABLE 6
Jittered Demand Values (positive or zero):
1 2 3 4 5 6 7 8 9 10 11 12
3 10 1 1 0 0 4 0 0 1 0 0
0 0 15 3 2 1 0 0 0 0 0 0
0 0 0 2 0 1 0 0 0 2 8 0
0 6 0 1 1 0 0 0 0 7 0 0
While, this embodiment describes a preferred method for jittering, other similar methods could be used and therefore fall within the scope of this invention. III. Subseries Method An additional set of embodiments for providing an LTD distribution according to this invention involves the use of subseries concepts. Here the notion is to simply approximate the distribution of cumulative lead time demand by the distribution of sums of overlapping subseries in the demand history that are as long as the lead time. Three possible approaches are considered, one nonparametric and two parametric. The simple nonparametric version of this approach does not provide a probability for every possible integer value. The parametric versions provide a complete set of probabilities by utilizing discretized normal or lognormal distributions to fit the observed nonzero sums. The resulting forecasts have been found to be more accurate than those of exponential smoothing and Croston's method. In addition, because of the relative simplicity, subseries forecasts can be computed faster than the Smart Bootstrap forecast. Referring now to FIG. 5, the first step 70 of any subseries implementation is to calculate the overlapping subseries sums. Consider an example with a lead time L=3 and a demand history consisting of T=28 monthly values. Subseries 1 consists of observations 1-3, subseries 2 overlaps subseries 1 and consists of observations 2-4, and so forth. Summing the values in each of the overlapping subseries gives the T-L+1=26 subseries sums. The second step 72 is to estimate a cumulative distribution function (CDF) to provide a particular distribution model. Next 74, a parametric model can be fit to the CDF. Finally 76, the CDF of the subseries sums can be utilized to generate the LTD distribution. For example, suppose the 26 sums take 9 distinct values: 0 (fourteen times), 1, 2, 5 (twice), 6 (three times), 7, 10 (twice), 12, and 54. There are 12 nonzero values. F(X), the cumulative distribution function (CDF) of LTD, can be estimated as either a parametric or non parametric model. For the nonparametric model, the following empirical distribution function can be used: F (0)=(1/2+#{=0})/(1+# total) F (X)=F (0)+[1-F (0)].multidot.[(1/2+#{.ltoreq.X and >0})/(1+#{>0})], X>0. So, for the above example, F (0)=(1/2+14)/(1+26)=0.537 F (1)=0.537+(1-0.537).multidot.(1/2+1)/(1+12)=0.590 F (2)=0.537+(1-0.537).multidot.(1/2+2)/(1+12)=0.626 F (5)=0.537+(1-0.537).multidot.(1/2+4)/(1+12)=0.697 and so forth for F (6), F (7), F (10), and F (12) up to F (54)=0.537+(1-0.537).multidot.(1/2+12)/(1+12)=0.982. Note that the nonparametric subseries model has increments in its estimate of the CDF only at those values of LTD that appeared in the subseries sums. To provide a more complete distribution, a parametric model can be utilized. This involves the use of a "mixture model," so named because the model is a mixture of two components. One component is the probability of a zero result. The other is a parametric model for the nonzero values. (A "parametric" model has an explicit expression, or formula, for the probability distribution. That expression will have parameters in it that must be adjusted to fit the data at hand.) The mixture model is: F (X)=F (0)+[1-F (0)]*G (X), X>0 where F (0)=(1/2+#{=0})/(1+#total). G (X) is the estimated, discretized CDF of either a normal or lognormal distribution. The parametric subseries models are continuous, so they can be discretize in any number of ways, e.g., G (1) includes all the area under the normal or lognormal density function to the left of 1. In contrast to the nonparametric estimate F (X), the parametric estimates G (X) have probability increments at every possible value of LTD. The parameters of the normal and lognormal models are estimated by matching the sample mean and sample variance calculated from those subseries sums with nonzero values. A normal probability density function is calculated as follows: g(X)={1/(Sqrt[2*Pi]*Sigma)}*exp(-0.5*[(X-Mu)/Sigma] 2) for: -.infin.<X<.infin., where Mu and Sigma are the estimated parameters representing the mean and standard deviation, respectively. A lognormal probability density function is provided as follows: g(X)={1/(Sqrt[2*Pi]*Sigma*X)}*exp(-0.5*[(Ln(X)-Mu)/Sigma] 2) for 0<=X<.infin.. For the lognormal case, the mean is exp[Mu+0.5*Sigma 2] and the standard deviation is Sqrt{exp(2*Mu+Sigma 2 )*[exp(Sigma 2)-1]}. Note that these formulas give g(X) not G(X). G(X) is the cumulative distribution function, obtained by integrating the probability density function from -infinity to X. That is, G(X)=Integral[g(u), u=-.infin. . . . X]. G (X) is an approximation of G(X), given that the parameters Mu and Sigma are estimated from from the data. For the lognormal, the mean and standard deviation are both functions of Mu and Sigma. Standard statistical methods, e.g., the method of moments estimation, can be used to solve two simultaneous nonlinear equations in the two unknowns Mu and Sigma to match the observed sample mean and sample standard deviation. IV. Forecast Accuracy Assessed Using Industrial Data In order to test the embodiments described above, over 28,000 data series of actual intermittent data were studied. The data is summarized in the table shown in FIG. 6. A. Measures Of Forecast Accuracy Assessing the performance of forecasting methods for intermittent demand required the development of new measures. When assessing forecasts of smooth demand, it is conventional to use such accuracy measures as the mean squared error (MSE) or the mean absolute percentage error (MAPE). This latter is generally preferred when one wants to assess the performance of alternative forecasting methods over many items, because its relative, scale-free measure can be averaged naturally across series with different mean levels. However, when one or more of the observed demands is zero, as is often true with intermittent demand, one cannot compute the MAPE. This is one problem with assessing accuracy. The other problem is the need to assess the quality not of a point forecast but of a predicted distribution. This is because inventory control models require as input not a point forecast but a forecast of the entire distribution of LTD. Given historical data on item demand, one can hold out the final L periods, sum their demands, and thereby create a point estimate of actual LTD. The problem is how to use this single holdout value to assess the quality of an entire predicted distribution of LTD for that item. It was found that industrial data sets are generally "short and wide," e.g., there were only a few dozen monthly observations on each of hundreds or thousands of inventory items. This means that one can afford to create only one holdout value of LTD for each item, making it impossible to compare two forecast distributions on an item-specific basis. However, there are data for many items, so some form of pooling across items has potential as a way to assess the quality of forecast distributions. To respond on these two problems, two new ways to assess the forecast distributions were devised. A(1). Accuracy test based on uniform distribution of percentiles The first method is based on observed percentiles of the forecast LTD distribution. To understand this approach, first consider the easier problem of assessing the predicted distribution of a continuous random variable. Among the methods used to solve such problems are the chi-square test and tests based on the probability-integral transformation; the present solution uses both. The probability-integral transformation converts an observed value of a random variable X into its corresponding fractile using the CDF, F(X). For example, if the median value is 2.35, then the transformation converts 2.35 into 0.5. If F(X) is known exactly, the transformation converts the observed values of the random variable into values that have a uniform distribution. In the more realistic case of having to use an estimate F (X), sampling variability leads to departures from uniformity. Nevertheless, one can take the degree of conformance of the estimated fractiles to the uniform ideal as a basis for assessing the relative quality of the estimated CDF. However, the present invention does not necessarily deal with a continuous random variable, since LTD is discrete. Because the CDF is defined only at integer values in this case, only a discrete set of fractiles are observed. To convert this problem back into one in which departures from uniformity can be assessed, the possible fractiles are first divided into 20 equal bins of width 0.05. When a LTD count of X is observed, F (X)-F (X-1) is calculated and that probability is distributed proportionally over the relevant bins. Then we compare the distribution of counts in the 20 bins to a uniform distribution, summarizing the departure from uniformity using the chi-square statistic. The choice of 20 bins is an appropriate compromise between resolution and statistical stability, given the large numbers of items in our industrial data sets. Other numbers of bins would be appropriate if there were more or fewer items available. To illustrate this method of fractional allocation of counts, consider a distribution of LTD that is estimated to be Poisson with mean 1.3. In this case, F (0)=0.273 and F (1)=0.627. Now suppose that the observed LTD is 1. The present method distributes F (1)-F (0)=0.354 uniformly over the range from 0.273 to 0.627. Thus we credit 0.076=(0.30-0.273)/0.354 of a count to the bin spanning 0.25 to 0.30, 0.141=0.05/0.354 of a count to each of the six bins spanning the range from 0.30 to 0.60, and 0.076=(0.627-0.60)/0.354 of a count to the bin spanning 0.60 to 0.65. These fractional counts are accumulated across items, then the uniformity of the total counts is assessed across the 20 bins using the chi-square statistic. A(2). Accuracy test based on log likelihood ratios The second method of assessing the forecast distribution of LTD is based on log likelihood ratios. The likelihood of an event is defined to be the probability of the occurrence of that event, as computed using a specific probability model. If there are two competing models, the ratio of the two likelihoods is a measure of relative performance. The model whose likelihood forms the denominator of the ratio is referred to as the "reference" and the other method is referred to as the "alternative." Because likelihoods are often small numbers and their ratios have very skewed distributions, it is customary to work with the logarithm of the likelihood ratio. The reasoning behind using the log likelihood ratio is simple. If a model predicts LTD well, then an observed holdout value of LTD for an item will, on average, have a high likelihood of occurring. Conversely, if the model predicts poorly, the observed LTD value will have a low likelihood. If the mean log likelihood ratio, averaged over inventory items, is greater than zero, then the alternative is more accurate than the reference. Because exponential smoothing is the most used statistical forecasting method for intermittent demand, is was chosen as the reference method. Four alternatives were evaluated: Croston's method, the bootstrap method, and the two parametric subseries methods. The nature of the industrial data forced an elaboration to the definition of an event and its likelihood. The specific problem was that the data frequently included holdout values of LTD that were extremely unlikely under either the reference or the alternative or both, with one or both likelihoods being essentially zero. In such cases, the log likelihood ratio could not be computed. In turn, this made it difficult to compute summary statistics for the log likelihood ratios computed across data sets. Our response was to divide the range of possible LTD values into random-width bins defined by the empirical distribution of subseries sums. To illustrate, consider again the example cited earlier in which the observed subseries values were: 0 (fourteen times), 1, 2, 5 (twice), 6 (three times), 7, 10 (twice), 12, and 54. In this case, likelihoods were computed for the following events: 0, 1, 2, 3-5, 6, 7, 8-10, 11-12, 13-54 and 55+. This approach greatly reduced the frequency of difficulty in computing the log likelihood ratios, though it did not eliminate it. When there were difficulties, we used the following logic to assign values to the log likelihood ratio. If the likelihoods under both the reference and the alternative are 0, assign 0, indicating a tie. If only the likelihood under the reference is 0, assign +99. If only the likelihood under the alternative is 0, assign -99. Then, when computing sample means of log likelihood ratios, we Winsorize the data, replacing any +99 (-99) values with the next largest (smallest) value. B. Empirical Results B(1). Comparisons based on uniform distribution of percentiles We compared exponential smoothing and Croston's method with the bootstrap and two parametric subseries methods on the basis of the uniformity of observed LTD percentiles. For each of the nine industrial data sets, we tested the four methods using lead times of 1, 3 and 6 months. Because the distribution of chi square values was highly skewed, it simplified the analysis to work with the logarithm of the chi square statistics. Note that a smaller value of the logarithm of the chi square statistics is preferable, since it indicates a flatter, i.e., more uniform, distribution of counts across the 20 bins. The table shown in FIG. 7 and the graph of FIG. 9 show how data set, lead time and forecasting method influenced the quality of the LTD forecasts. The following conclusions were drawn: 1. The bootstrap method performed best overall. 2. The subseries methods were slightly less accurate than the bootstrap, but the differences were not statistically significant. The lognormal variant of the subseries method was slightly more accurate than the normal variant. 3. Exponential smoothing and Croston's method were the least accurate methods, with Croston's method slightly less accurate than exponential smoothing. 4. The accuracy of the new methods decreased notably with the length of the lead time, although the new methods remained more accurate than exponential smoothing and Croston's method. Lead time, however, had little effect on the accuracy of the two variants of exponential smoothing. Overall, the new forecasting methods achieved higher accuracy. B(2). Comparisons based on log likelihood ratios The second approach to assessing the performance of the forecast distribution of LTD used the mean values of the log likelihood computed using the various forecasting methods. In this case, we compared the exponential smoothing reference against four alternatives: Croston's method, the bootstrap, and the subseries method with a fitted normal and lognormal distribution. The table shown in FIG. 8 and the graph shown in FIG. 10 show the details of the comparisons. Positive values of the mean log likelihood indicate that the alternative performed better than exponential smoothing for the particular data set and lead time. Our conclusions are essentially the same as with the previous measure of accuracy: 1. The bootstrap generally performed best. 2. The subseries methods performed better than Croston's method; the lognormal variant of the subseries method was slightly more accurate than the normal variant. 3. Croston's method generally performed worse than exponential smoothing. The foregoing description of the preferred embodiments of the invention have been presented for purposes of illustration and description. They are not intended to be exhaustive or to limit the invention to the precise form disclosed, and obviously many modifications and variations are possible in light of the above teachings. Such modifications and variations that may be apparent to a person skilled in the art are intended to be included within the scope of this invention as defined by the accompanying claims.
|
Same subclass Same class Consider this |
||||||||||
