System and method for historical database training of support vector machines
6944616
Abstract
A system and method for historical database training of a support vector machine (SVM). The SVM is trained with training sets from a stream of process data. The system detects availability of new training data, and constructs a training set from the corresponding input data. Over time, many training sets are presented to the SVM. When multiple presentations are needed to effectively train the SVM, a buffer of training sets is filled and updated as new training data becomes available. Once the buffer is full, a new training set bumps the oldest training set from the buffer. The training sets are presented one or more times each time a new training set is constructed. A historical database of time-stamped data may be used to construct training sets for the SVM. The SVM may be trained retrospectively by searching the historical database and constructing training sets based on the time-stamped data.
Claims
1. A computer implemented method for training a support vector machine, the method comprising:
Description
BACKGROUND OF THE INVENTION
The explanation which follows explains the problems associated with meeting and optimizing these five steps. C. The Measurement Problem As shown above, the second and fourth steps or aspects of process control involve measurement 1224 of process conditions 1906 and measurement 1304 of product properties 1904, respectively. Such measurements may be sometimes very difficult, if not impossible, to effectively perform for process control. For many products, the important product properties 1904 relate to the end use of the product and not to the process conditions 1906 of the process 1212. One illustration of this involves the manufacture of carpet fiber. An important product property 1904 of carpet fiber is how uniformly the fiber accepts the dye applied by the carpet maker. Another example involves the cake example set forth above. An important product property 1904 of a baked cake is how well the cake resists breaking apart when the frosting is applied. Typically, the measurement of such product properties 1904 is difficult and/or time consuming and/or expensive to make. An example of this problem may be shown in connection with the carpet fiber example. The ability of the fiber to uniformly accept dye may be measured by a laboratory (lab) in which dye samples of the carpet fiber are used. However, such measurements may be unreliable. For example, it may take a number of tests before a reliable result may be obtained. Furthermore, such measurements may also be slow. In this example, it may take so long to conduct the dye test that the manufacturing process may significantly change and be producing different product properties 1904 before the lab test results are available for use in controlling the process 1212. It should be noted, however, that some process condition measurements 1224 may be inexpensive, take little time, and may be quite reliable. Temperature typically may be measured easily, inexpensively, quickly, and reliably. 
For example, the temperature of the water in a tank may often be easily measured. But oftentimes process conditions 1906 make such easy measurements much more difficult to achieve. For example, it may be difficult to determine the level of a foaming liquid in a vessel. Moreover, a corrosive process may destroy measurement sensors, such as those used to measure pressure. Regardless of whether or not measurement of a particular process condition 1906 or product property 1904 is easy or difficult to obtain, such measurement may be vitally important to the effective and necessary control of the process 1212. It may thus be appreciated that it would be preferable if a direct measurement of a specific process condition 1906 and/or product property 1904 could be obtained in an inexpensive, reliable, timely and effective manner. D. Conventional Computer Models as Predictors of Desired Measurements As stated above, the direct measurement of the process conditions 1906 and the product properties 1904 is often difficult, if not impossible, to do effectively. One response to this deficiency in process control has been the development of computer models (not shown) as predictors of desired measurements. These computer models may be used to create values used to control the process 1212 based on inputs that may not be identical to the particular process conditions 1906 and/or product properties 1904 that are critical to the control of the process 1212. In other words, these computer models may be used to develop predictions (estimates) of the particular process conditions 1906 or product properties 1904. These predictions may be used to adjust the controllable process state 2002 or the process condition setpoint 1404. Such conventional computer models, as explained below, have limitations. To better understand these limitations and how the present invention overcomes them, a brief description of each of these conventional models is set forth. 1. 
Fundamental Models A computer-based fundamental model (not shown) uses known information about the process 1212 to predict desired unknown information, such as process conditions 1906 and product properties 1904. A fundamental model may be based on scientific and engineering principles. Such principles may include the conservation of material and energy, the equality of forces, and so on. These basic scientific and engineering principles may be expressed as equations which are solved mathematically or numerically, usually using a computer program. Once solved, these equations may give the desired prediction of unknown information. Conventional computer fundamental models have significant limitations, such as:
These problems result in computer fundamental models being practical only in some cases where measurement is difficult or impossible to achieve. 2. Empirical Statistical Models Another conventional approach to solving measurement problems is the use of a computer-based statistical model (not shown). Such a computer-based statistical model may use known information about process 1212 to determine desired information that may not be effectively measured. A statistical model may be based on the correlation of measurable process conditions 1906 or product properties 1904 of the process 1212. To use an example of a computer-based statistical model, assume that it is desired to be able to predict the color of a plastic product 1216. This is very difficult to measure directly, and takes considerable time to perform. In order to build a computer-based statistical model which will produce this desired product property 1904 information, the model builder would need to have a base of experience, including known information and actual measurements of desired unknown information. For example, known information may include the temperature at which the plastic is processed. Actual measurements of desired unknown information may be the actual measurements of the color of the plastic. A mathematical relationship (i.e., an equation) between the known information and the desired unknown information may be created by the developer of the empirical statistical model. The relationship may contain one or more constants (which may be assigned numerical values) which affect the value of the predicted information from any given known information. A computer program may use many different measurements of known information, with their corresponding actual measurements of desired unknown information, to adjust these constants so that the best possible prediction results may be achieved by the empirical statistical model. Such a computer program, for example, may use non-linear regression. 
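As a rough illustration of how a program might adjust the constants of such a relationship from paired measurements, the sketch below fits a straight line by ordinary least squares. This is a simplification: the text mentions non-linear regression, and a linear fit is swapped in here only for brevity. The function name fit_line and the temperature and color values are invented for illustration and do not come from the source.

```python
def fit_line(known, measured):
    """Return (intercept, slope) minimizing the squared error of
    measured ~ intercept + slope * known (ordinary least squares)."""
    n = len(known)
    if n < 2 or n != len(measured):
        raise ValueError("need at least two paired measurements")
    mean_x = sum(known) / n
    mean_y = sum(measured) / n
    # Spread of the known information around its mean.
    sxx = sum((x - mean_x) ** 2 for x in known)
    if sxx == 0:
        raise ValueError("known values must not all be identical")
    # Co-variation of known information and measured property.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(known, measured))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data: processing temperature (known information) and
# lab-measured color value (desired unknown information).
temperatures = [180.0, 190.0, 200.0, 210.0]
colors = [11.0, 11.5, 12.0, 12.5]

intercept, slope = fit_line(temperatures, colors)
# Predict the color at a temperature that was never measured in the lab.
predicted = intercept + slope * 205.0
```

Once the constants are fitted, the prediction at a new temperature requires no laboratory measurement, which is the practical appeal of an empirical model.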
Computer-based statistical models may sometimes predict product properties 1904 which may not be well described by computer fundamental models. However, there may be significant problems associated with computer statistical models, which include the following:
The result of these deficiencies is that computer-based empirical statistical models may be practical in only some cases where the process conditions 1906 and/or product properties may not be effectively measured. E. Deficiencies in the Related Art As set forth above, there are considerable deficiencies in conventional approaches to obtaining desired measurements for the process conditions 1906 and product properties 1904 using conventional direct measurement, computer fundamental models, and computer statistical models. Some of these deficiencies are as follows:
Although the above limitations have been described with respect to process control, it should be noted that these arguments apply to other application domains as well, such as plant management, quality control, optimized decision making, e-commerce, financial markets and systems, or any other field where predictive modeling may be used. Therefore, improved systems and methods for historical database training of a support vector machine are desired. SUMMARY OF THE INVENTION A system and method are presented for historical database training of a support vector machine. The support vector machine may train by retrieving training sets from a stream of process data. The support vector machine may detect the availability of new training data, and may construct a training set by retrieving the corresponding input data. The support vector machine may be trained using the training set. Over time, many training sets may be presented to the support vector machine. The support vector machine may detect training input data in several ways. In one approach, the support vector machine may monitor for changes in data values of training input data. A change may indicate that new data is available. In a second approach, the support vector machine may compute changes in raw training input data from one cycle to the next. The changes may be indicative of the action of human operators or other actions in the process. In a third mode, a historical database may be used and the support vector machine may monitor for changes in a timestamp of the training input data. Often laboratory data may be used as training input data in this approach. When new training input data is detected, the support vector machine may construct a training set by retrieving input data corresponding to the new training input data. Often, the current or most recent values of the input data may be used. 
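The change-based detection described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the names Sample and NewDataDetector are hypothetical, and treating the very first sample as new is an assumption the text does not state.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Sample:
    """One reading of training input data (hypothetical structure)."""
    timestamp: float
    value: float


class NewDataDetector:
    """Flags new training input data when the monitored field changes.

    use_timestamp=True  watches the timestamp (historical database mode).
    use_timestamp=False watches the data value itself.
    """

    def __init__(self, use_timestamp: bool = True) -> None:
        self.use_timestamp = use_timestamp
        self._last: Optional[Sample] = None

    def is_new(self, sample: Sample) -> bool:
        if self._last is None:
            # Assumption: the first sample ever seen counts as new.
            changed = True
        elif self.use_timestamp:
            changed = sample.timestamp != self._last.timestamp
        else:
            changed = sample.value != self._last.value
        self._last = sample
        return changed
```

Watching the timestamp rather than the value catches the case where a new lab result happens to equal the previous one, which may be why the historical database mode keys on timestamps.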
When a historical database provides both the training input data and the input data, the input data is retrieved from the historical database for a time period selected using the timestamps of the training input data. For some support vector machines or training situations, multiple presentations of each training set may be needed to effectively train the support vector machine. In this case, a buffer of training sets (e.g., a FIFO—first in, first out—buffer) is filled and updated as new training data becomes available. The size of the buffer may be selected in accordance with the training needs of the support vector machine. Once the buffer is full, a new training set may bump the oldest training set from the buffer. The training sets in the buffer may be presented one or more times each time a new training set is constructed. It is noted that the use of a buffer to store training sets is but one example of storage means for the training sets, and that other storage means are also contemplated, including lists (such as queues and stacks), databases, and arrays, among others. If a historical database is used, the support vector machine may be trained retrospectively. Training sets may be constructed by searching the historical database over a time span of interest for training input data. When training input data are found, an input data time is selected using the training input data timestamps, and the training set is constructed by retrieving the input data corresponding to the input data time. Multiple presentations may also be used in the retrospective training approach. In one embodiment, the method may include building a first training set using training data, where the training data includes one or more timestamps indicating a chronology of the training data and one or more process parameter values corresponding to each timestamp. The first training set may include process parameter values corresponding to a first time period in the chronology. 
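The first-in, first-out buffer of training sets described above might look like the following sketch. The class name TrainingSetBuffer and the train callback are hypothetical; the callback stands in for whatever routine actually presents one training set to the support vector machine.

```python
from collections import deque


class TrainingSetBuffer:
    """Fixed-size FIFO buffer of training sets.

    Each time a new training set arrives, every buffered set is
    presented `presentations` times to a caller-supplied train routine.
    """

    def __init__(self, capacity: int, presentations: int = 1) -> None:
        # A deque with maxlen silently drops the oldest item when full.
        self._buffer = deque(maxlen=capacity)
        self.presentations = presentations

    def add_and_present(self, training_set, train) -> None:
        self._buffer.append(training_set)
        for _ in range(self.presentations):
            for buffered in self._buffer:
                train(buffered)

    def contents(self) -> list:
        """Return the buffered training sets, oldest first."""
        return list(self._buffer)
```

The buffer capacity and the number of presentations are the two tuning knobs; the text leaves both to the training needs of the particular support vector machine.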
In one embodiment, building the first training set may include retrieving the training data from a historical database, selecting a training data time period based on the one or more timestamps, and retrieving the process parameter values from the training data indicated by the training data time period. Thus, the first training set includes retrieved process parameter values in chronological order over the selected training data time period. The support vector machine may then be trained using the first training set. Then, a second training set may be generated by removing at least a subset of the parameter values of the first training set, preferably the oldest parameter values of the training set, and adding new parameter values from the training data based on the timestamps to generate a second training set. Thus, the second training set corresponds to a second time period in the chronology. The support vector machine may then be trained using the second training set. The process may then be repeated, successively updating the training set to generate new training sets by removing old data and adding new data based on the timestamps and training the support vector machine with each training set. The historical database trained support vector machine may be used for process measurement, manufacturing, supervisory control, regulatory control functions, optimization, real-time optimization, decision-making systems, e-marketplaces, e-commerce, data analysis, data mining, financial analysis, stock and/or bond analysis/management, as well as any other field or domain where predictive or classification models may be useful. Using data pointers, easy access to many process data systems may be achieved. A modular approach with natural language configuration of the support vector machine may be used to implement the support vector machine. 
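The successive training sets described in this embodiment can be pictured as a window sliding over time-stamped records. The sketch below is one possible reading, under the assumptions that each record is a (timestamp, value) pair and that the window holds a fixed number of records; the function names and the train callback are hypothetical.

```python
def sliding_training_sets(records, size, step=1):
    """Yield successive training sets from time-stamped records.

    records: iterable of (timestamp, value) pairs in any order.
    size:    number of records per training set.
    step:    how many of the oldest records are dropped (and how many
             newer ones are added) between consecutive training sets.
    """
    # Put the records in chronological order using their timestamps.
    ordered = sorted(records, key=lambda record: record[0])
    for first in range(0, len(ordered) - size + 1, step):
        yield ordered[first:first + size]


def train_over_history(records, size, train, step=1):
    """Present each successive training set to a caller-supplied routine."""
    count = 0
    for training_set in sliding_training_sets(records, size, step):
        train(training_set)
        count += 1
    return count
```

With a step of one, each new training set differs from the previous one by a single record: the oldest is removed and the next one in the chronology is added.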
Expert system functions may be provided in the modular support vector machine to provide decision-making functions for use in control, analysis, management, or other areas of application. BRIEF DESCRIPTION OF THE DRAWINGS Other objects and advantages of the invention will become apparent upon reading the following detailed description and upon reference to the accompanying drawings in which: FIG. 1 illustrates an exemplary computer system according to one embodiment of the present invention; FIG. 2 is an exemplary block diagram of the computer system illustrated in FIG. 1, according to one embodiment of the present invention; FIG. 3 is a nomenclature diagram illustrating one embodiment of the present invention at a high level; FIG. 4 is a representation of the architecture of an embodiment of the present invention; FIG. 5 is a high level block diagram of the six broad steps included in one embodiment of a support vector machine process control system and method according to the present invention; FIG. 6 is an intermediate block diagram of steps and modules included in the store input data and training input data step or module 102 of FIG. 5, according to one embodiment; FIG. 7 is an intermediate block diagram of steps and modules included in the configure and train support vector machine step or module 104 of FIG. 5, according to one embodiment; FIG. 8 is an intermediate block diagram of input steps and modules included in the predict output data using support vector machine step or module 106 of FIG. 5, according to one embodiment; FIG. 9 is an intermediate block diagram of steps and modules included in the retrain support vector machine step or module 108 of FIG. 5, according to one embodiment; FIG. 10 is an intermediate block diagram of steps and modules included in the enable/disable control step or module 110 of FIG. 5, according to one embodiment; FIG. 
11 is an intermediate block diagram of steps and modules included in the control process using output data step or module 112 of FIG. 5, according to one embodiment; FIG. 12 is a detailed block diagram of the configure support vector machine step or module 302 of the relationship of FIG. 7, according to one embodiment; FIG. 13 is a detailed block diagram of the new training input data step or module 306 of FIG. 7, according to one embodiment; FIG. 14 is a detailed block diagram of the train support vector machine step or module 308 of FIG. 7, according to one embodiment; FIG. 15 is a detailed block diagram of the error acceptable step or module 310 of FIG. 7, according to one embodiment; FIG. 16 is a representation of the architecture of an embodiment of the present invention having the additional capability of using laboratory values from a historical database 1210; FIG. 17 is an embodiment of controller 1202 of FIGS. 4 and 16 having a supervisory controller 1408 and a regulatory controller 1406; FIG. 18 illustrates various embodiments of controller 1202 of FIG. 17 used in the architecture of FIG. 4; FIG. 19 is a modular version of block 1502 of FIG. 18 illustrating the various different types of modules that may be utilized with a modular support vector machine 1206, according to one embodiment; FIG. 20 illustrates an architecture for block 1502 having a plurality of modular support vector machines 1702-1702n with pointers 1710-1710n pointing to a limited set of support vector machine procedures 1704-1704n, according to one embodiment; FIG. 21 illustrates an alternate architecture for block 1502 having a plurality of modular support vector machines 1702-1702n with pointers 1710-1710n to a limited set of support vector machine procedures 1704-1704n, and with parameter pointers 1802-1802n to a limited set of system parameter storage areas 1806-1806n, according to one embodiment; FIG. 
22 is a high level block diagram illustrating the key aspects of a process 1212 having process conditions 1906 used to produce a product 1216 having product properties 1904 from raw materials 1222, according to one embodiment; FIG. 23 illustrates the various steps and parameters which may be used to perform the control of process 1212 to produce products 1216 from raw materials 1222, according to one embodiment; FIG. 24 is an exploded block diagram illustrating the various parameters and aspects that may make up the support vector machine 1206, according to one embodiment; FIG. 25 is an exploded block diagram of the input data specification 2204 and the output data specification 2206 of the support vector machine 1206 of FIG. 24, according to one embodiment; FIG. 26 is an exploded block diagram of the prediction timing control 2212 and the training timing control 2214 of the support vector machine 1206 of FIG. 24, according to one embodiment; FIG. 27 is an exploded block diagram of various examples and aspects of controller 1202 of FIG. 4, according to one embodiment; FIG. 28 is a representative computer display of one embodiment of the present invention illustrating part of the configuration specification of the support vector machine block 1206, according to one embodiment; FIG. 29 is a representative computer display of one embodiment of the present invention illustrating part of the data specification of the support vector machine block 1206, according to one embodiment; FIG. 30 illustrates a computer screen with a pop-up menu for specifying the data system element of the data specification, according to one embodiment; FIG. 31 illustrates a computer screen with detailed individual items of the data specification display of FIG. 29, according to one embodiment; FIG. 32 is a detailed block diagram of an embodiment of the enable control step or module 602 of FIG. 10; FIG. 33 is a detailed block diagram of embodiments of steps and modules 802, 804 and 806 of FIG. 
12; and FIG. 34 is a detailed block diagram of embodiments of steps and modules 808, 810, 812 and 814 of FIG. 12. While the invention is susceptible to various modifications and alternative forms, specific embodiments thereof may be shown by way of example in the drawings and will herein be described in detail. It should be understood, however, that the drawing and detailed description thereto are not intended to limit the invention to the particular form disclosed, but on the contrary, the intention is to cover all modifications, equivalents and alternatives falling within the spirit and scope of the present invention as defined by the appended claims. DETAILED DESCRIPTION OF SEVERAL EMBODIMENTS Incorporation by Reference U.S. Pat. No. 5,950,146, titled "Support Vector Method For Function Estimation", whose inventor is Vladimir Vapnik, and which issued on Sep. 7, 1999, is hereby incorporated by reference in its entirety as though fully and completely set forth herein. U.S. Pat. No. 5,649,068, titled "Pattern Recognition System Using Support Vectors", whose inventors are Bernard Boser, Isabelle Guyon, and Vladimir Vapnik, and which issued on Jul. 15, 1997, is hereby incorporated by reference in its entirety as though fully and completely set forth herein. U.S. Pat. No. 5,058,043, titled "Batch Process Control Using Expert Systems", whose inventor is Richard D. Skeirik, and which issued on Oct. 15, 1991, is hereby incorporated by reference in its entirety as though fully and completely set forth herein. U.S. Pat. No. 5,006,992, titled "Process Control System With Reconfigurable Expert Rules and Control Modules", whose inventor is Richard D. Skeirik, and which issued on Apr. 9, 1991, is hereby incorporated by reference in its entirety as though fully and completely set forth herein. U.S. Pat. No. 4,965,742, titled "Process Control System With On-Line Reconfigurable Modules", whose inventor is Richard D. Skeirik, and which issued on Oct. 
23, 1990, is hereby incorporated by reference in its entirety as though fully and completely set forth herein. U.S. Pat. No. 4,920,499, titled "Expert System With Natural-Language Rule Updating", whose inventor is Richard D. Skeirik, and which issued on Apr. 24, 1990, is hereby incorporated by reference in its entirety as though fully and completely set forth herein. U.S. Pat. No. 4,910,691, titled "Process Control System with Multiple Module Sequence Options", whose inventor is Richard D. Skeirik, and which issued on Mar. 20, 1990, is hereby incorporated by reference in its entirety as though fully and completely set forth herein. U.S. Pat. No. 4,907,167, titled "Process Control System with Action Logging", whose inventor is Richard D. Skeirik, and which issued on Mar. 6, 1990, is hereby incorporated by reference in its entirety as though fully and completely set forth herein. U.S. Pat. No. 4,884,217, titled "Expert System with Three Classes of Rules", whose inventors are Richard D. Skeirik and Frank O. DeCaria, and which issued on Nov. 28, 1989, is hereby incorporated by reference in its entirety as though fully and completely set forth herein. U.S. Pat. No. 5,212,765, titled "On-Line Training Neural Network System for Process Control", whose inventor is Richard D. Skeirik, and which issued on May 18, 1993, is hereby incorporated by reference in its entirety as though fully and completely set forth herein. U.S. Pat. No. 5,408,586, titled "Historical Database Training Method for Neural Networks", whose inventor is Richard D. Skeirik, and which issued on Apr. 18, 1995, is hereby incorporated by reference in its entirety as though fully and completely set forth herein. U.S. Pat. No. 5,640,493, titled "Historical Database Training Method for Neural Networks", whose inventor is Richard D. Skeirik, and which issued on Jun. 17, 1997, is hereby incorporated by reference in its entirety as though fully and completely set forth herein. U.S. Pat. No. 
5,826,249, titled "Historical Database Training Method for Neural Networks", whose inventor is Richard D. Skeirik, and which issued on Oct. 20, 1998, is hereby incorporated by reference in its entirety as though fully and completely set forth herein.
Table of Contents
Computer System Diagram
Computer System Block Diagram
I. Overview of Support Vector Machines
A. Introduction
B. How Support Vector Machines Work
C. An SVM Learning Rule
D. Classification of Linearly Separable Data
E. Classification of Nonlinearly Separable Data
F. Nonlinear Support Vector Machines
G. Kernel Functions
H. Construction of Support Vector Machines
I. Support Vector Machine Training
J. Advantages of Support Vector Machines
II. Brief Overview
III. Use in Combination with Expert Systems
IV. One Method of Operation
A. Store Input Data and Training Input Data Step or Module 102
B. Configure and Train Support Vector Machine Step or Module 104
C. Predict Output Data Using Support Vector Machine Step or Module 106
D. Retrain Support Vector Machine Step or Module 108
E. Enable/Disable Control Module or Step 110
F. Control Process Using Output Data Step or Module 112
V. One Structure (Architecture)
VI. User Interface
FIG. 1—Computer System
FIG. 1 illustrates a computer system 82 operable to execute a support vector machine for performing modeling and/or control operations. One embodiment of a method for creating and/or using a support vector machine is described below. The computer system 82 may be any type of computer system, including a personal computer system, mainframe computer system, workstation, network appliance, Internet appliance, personal digital assistant (PDA), television system or other device. In general, the term "computer system" can be broadly defined to encompass any device having at least one processor that executes instructions from a memory medium. As shown in FIG. 1, the computer system 82 may include a display device operable to display operations associated with the support vector machine. The display device may also be operable to display a graphical user interface of process or control operations. The graphical user interface may comprise any type of graphical user interface, e.g., depending on the computing platform. The computer system 82 may include a memory medium(s) on which one or more computer programs or software components according to one embodiment of the present invention may be stored. For example, the memory medium may store one or more support vector machine software programs (support vector machines) which are executable to perform the methods described herein. Also, the memory medium may store a programming development environment application used to create and/or execute support vector machine software programs. The memory medium may also store operating system software, as well as other software for operation of the computer system. 
The term "memory medium" is intended to include an installation medium, e.g., a CD-ROM, floppy disks, or tape device; a computer system memory or random access memory such as DRAM, SRAM, EDO RAM, Rambus RAM, etc.; or a non-volatile memory such as a magnetic media, e.g., a hard drive, or optical storage. The memory medium may comprise other types of memory as well, or combinations thereof. In addition, the memory medium may be located in a first computer in which the programs are executed, or may be located in a second different computer which connects to the first computer over a network, such as the Internet. In the latter instance, the second computer may provide program instructions to the first computer for execution. As used herein, the term "support vector machine" refers to at least one software program, or other executable implementation (e.g., an FPGA), that implements a support vector machine as described herein. The support vector machine software program may be executed by a processor, such as in a computer system. Thus the various support vector machine embodiments described below are preferably implemented as a software program executing on a computer system. FIG. 2—Computer System Block Diagram FIG. 2 is an embodiment of an exemplary block diagram of the computer system illustrated in FIG. 1. It is noted that any type of computer system configuration or architecture may be used in conjunction with the system and method described herein, as desired, and FIG. 2 illustrates a representative PC embodiment. It is also noted that the computer system may be a general purpose computer system such as illustrated in FIG. 1, or other types of embodiments. The elements of a computer not necessary to understand the present invention have been omitted for simplicity. The computer system 82 may include at least one central processing unit or CPU 160 which is coupled to a processor or host bus 162. 
The CPU 160 may be any of various types, including an x86 processor, e.g., a Pentium class, a PowerPC processor, a CPU from the SPARC family of RISC processors, as well as others. Main memory 166 is coupled to the host bus 162 by means of memory controller 164. The main memory 166 may store one or more computer programs or libraries according to the present invention. The main memory 166 also stores operating system software as well as the software for operation of the computer system, as well known to those skilled in the art. The host bus 162 is coupled to an expansion or input/output bus 170 by means of a bus controller 168 or bus bridge logic. The expansion bus 170 is preferably the PCI (Peripheral Component Interconnect) expansion bus, although other bus types may be used. The expansion bus 170 may include slots for various devices such as a video display subsystem 180 and hard drive 182 coupled to the expansion bus 170, among others (not shown). I. Overview of Support Vector Machines FIG. 3 may provide a reference of consistent terms for describing an embodiment of the present invention. FIG. 3 is a nomenclature diagram which shows the various names for elements and actions used in describing various embodiments of the present invention. In referring to FIG. 3, the boxes may indicate elements in the architecture and the labeled arrows may indicate actions. As discussed below in greater detail, one embodiment of the present invention essentially utilizes support vector machines to provide predicted values of important and not readily obtainable process conditions 1906 and/or product properties 1904 to be used by a controller 1202 to produce controller output data 1208 used to control the process 1212. As shown in FIG. 4, a support vector machine 1206 may operate in conjunction with a historical database 1210 which provides input sensor(s) data 1220. 
It should be noted that the embodiment described herein relates to process control, such as of a manufacturing plant. It should be understood, however, that the drawings and detailed description thereto are not intended to limit the invention to process control, but on the contrary, various embodiments of the invention may be contemplated to be applicable in many other areas as well, such as process measurement, manufacturing, supervisory control, regulatory control functions, optimization, real-time optimization, decision-making systems, data analysis, data mining, e-marketplaces, e-commerce, financial analysis, stock and/or bond analysis/management, as well as any other field or domain where predictive or classification models may be useful. Thus, specific steps or modules described herein which apply only to process control embodiments may be different, or omitted as appropriate or desired. It should also be noted that in various embodiments of the present invention, components described herein as sensors or actuators may comprise software constructs or operations which provide or control information or information processes, rather than physical phenomena or processes. Referring now to FIGS. 4 and 5, input data and training input data may be stored in a historical database with associated timestamps as indicated by a step or module 102. In parallel, the support vector machine 1206 may be configured and trained in a step or module 104. The support vector machine 1206 may be used to predict output data 1218 using input data 1220, as indicated by a step or module 106. The support vector machine 1206 may then be retrained in a step or module 108, and control using the output data may be enabled or disabled in a step or module 110. In parallel, control of the process using the output data may be performed in a step or module 112. 
Thus, the system may collect and store the appropriate data, may configure and may train the support vector machine, may use the support vector machine to predict output data, and may enable control of the process using the predicted output data. Various embodiments of the present invention utilize a support vector machine 1206, and are described in detail below. In order to fully appreciate the various aspects and benefits produced by the various embodiments of the present invention, an understanding of support vector machine technology is useful. For this reason, the following section discusses support vector machine technology as applicable to the support vector machine 1206 of various embodiments of the system and method of the present invention. A. Introduction Historically, classifiers have been determined by choosing a structure, and then selecting a parameter estimation algorithm used to optimize some cost function. The structure chosen may fix the best achievable generalization error, while the parameter estimation algorithm may optimize the cost function with respect to the empirical risk. There are a number of problems with this approach, however. These problems may include: 1. The model structure needs to be selected in some manner. If this is not done correctly, then even with zero empirical risk, it is still possible to have a large generalization error. 2. If it is desired to avoid the problem of over-fitting, as indicated by the above problem, by choosing a smaller model size or order, then it may be difficult to fit the training data (and hence minimize the empirical risk). 3. Determining a suitable learning algorithm for minimizing the empirical risk may still be quite difficult. It may be very hard or impossible to guarantee that the correct set of parameters is chosen. The support vector method is a recently developed technique which is designed for efficient multidimensional function approximation. 
The basic idea of support vector machines (SVMs) is to determine a classifier or regression machine which minimizes the empirical risk (i.e., the training set error) and the confidence interval (which corresponds to the generalization or test set error), that is, to fix the empirical risk associated with an architecture and then to use a method to minimize the generalization error. One advantage of SVMs as adaptive models for binary classification and regression is that they provide a classifier with minimal VC (Vapnik-Chervonenkis) dimension, which implies a low expected probability of generalization errors. SVMs may be used to classify linearly separable data and nonlinearly separable data. SVMs may also be used as nonlinear classifiers and regression machines by mapping the input space to a high dimensional feature space. In this high dimensional feature space, linear classification may be performed. In the last few years, a significant amount of research has been performed in SVMs, including the areas of learning algorithms and training methods, methods for determining the data to use in support vector methods, and decision rules, as well as applications of support vector machines to speaker identification and time series prediction. Support vector machines have been shown to have a relationship with other recent nonlinear classification and modeling techniques such as radial basis function networks, sparse approximation, PCA (principal component analysis), and regularization. Support vector machines have also been used to choose radial basis function centers. A key to understanding SVMs is to see how they introduce optimal hyperplanes to separate classes of data in the classifiers. The main concepts of SVMs are reviewed in the next section. B.
How Support Vector Machines Work

The following describes support vector machines in the context of classification, but the general ideas presented may also apply to regression, or curve and surface fitting.

1. Optimal Hyperplanes

Consider an m-dimensional input vector $x = [x_1, \ldots, x_m]^T \in X \subset \mathbb{R}^m$ and a one-dimensional output $y \in \{-1, 1\}$. Let there exist n training vectors $(x_i, y_i)$, $i = 1, \ldots, n$. Hence we may write $X = [x_1\ x_2\ \ldots\ x_n]$ or

$$X = \begin{bmatrix} x_{11} & \cdots & x_{n1} \\ \vdots & \ddots & \vdots \\ x_{1m} & \cdots & x_{nm} \end{bmatrix} \qquad (1)$$

where $x_{ij}$ is the j-th component of the training vector $x_i$. A hyperplane capable of performing a linear separation of the training data is described by

$$w^T x + b = 0 \qquad (2)$$

where $w = [w_1\ w_2\ \ldots\ w_m]^T$, $w \in W \subset \mathbb{R}^m$.

The concept of an optimal hyperplane was proposed by Vladimir Vapnik. For the case where the training data is linearly separable, an optimal hyperplane separates the data without error and the distance between the hyperplane and the closest training points is maximal.

2. Canonical Hyperplanes

A canonical hyperplane is a hyperplane (in this case we consider the optimal hyperplane) in which the parameters are normalized in a particular manner. Consider (2) which defines the general hyperplane. It is evident that there is some redundancy in this equation as far as separating sets of points. Suppose we have the following classes

$$y_i (w^T x_i + b) \ge 1, \quad i = 1, \ldots, n \qquad (3)$$

where $y_i \in \{-1, 1\}$. One way in which we may constrain the hyperplane is to observe that on either side of the hyperplane, we may have $w^T x + b > 0$ or $w^T x + b < 0$. Thus, if we place the hyperplane midway between the two closest points to the hyperplane, then we may scale w, b such that

$$\min_{i = 1, \ldots, n} |w^T x_i + b| = 1 \qquad (4)$$

Now, the distance d from a point $x_i$ to the hyperplane denoted by (w, b) is given by

$$d(w, b; x_i) = \frac{|w^T x_i + b|}{\|w\|} \qquad (5)$$

where $\|w\|^2 = w^T w$. By considering two points on opposite sides of the hyperplane, the canonical hyperplane is found by maximizing the margin

$$\rho(w, b) = \min_{i:\, y_i = 1} d(w, b; x_i) + \min_{j:\, y_j = -1} d(w, b; x_j) \qquad (6)$$

$$= \frac{2}{\|w\|} \qquad (7)$$

This implies that the minimum distance between two classes i and j is at least $2/\|w\|$.
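As a worked illustration of the canonical scaling and the margin, the following minimal sketch rescales an arbitrarily scaled separating hyperplane so that the closest training points satisfy $|w^T x_i + b| = 1$, and then checks numerically that the margin equals $2/\|w\|$. The toy data, the starting hyperplane, and the helper names (`dot`, `distance_to_hyperplane`, `canonical_form`) are assumptions made for illustration only and do not appear in the description above.

```python
import math

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def distance_to_hyperplane(w, b, x):
    """Distance |w^T x + b| / ||w|| from point x to the hyperplane (w, b)."""
    return abs(dot(w, x) + b) / math.sqrt(dot(w, w))

def canonical_form(w, b, points):
    """Rescale (w, b) so that the closest point satisfies |w^T x + b| = 1."""
    scale = min(abs(dot(w, x) + b) for x in points)
    return [wi / scale for wi in w], b / scale

# Illustrative, hand-made data: two points per class.
positives = [(2.0, 2.0), (3.0, 3.0)]
negatives = [(0.0, 0.0), (-1.0, 0.0)]

# The hyperplane x1 + x2 = 2 lies midway between the closest points
# (0, 0) and (2, 2); it is written here with an arbitrary scale factor.
w, b = [3.0, 3.0], -6.0

w_c, b_c = canonical_form(w, b, positives + negatives)

# Margin: closest positive distance plus closest negative distance.
margin = (min(distance_to_hyperplane(w_c, b_c, x) for x in positives)
          + min(distance_to_hyperplane(w_c, b_c, x) for x in negatives))

print(w_c, b_c)                                 # canonical parameters
print(margin, 2 / math.sqrt(dot(w_c, w_c)))     # the two values agree
```

In this example the rescaling only changes the representation of the hyperplane, not the hyperplane itself, which is the redundancy that the canonical form removes.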
Hence an optimization function which we seek to minimize to obtain canonical hyperplanes is

$$J(w) = \frac{1}{2}\|w\|^2 \qquad (8)$$

Normally, to find the parameters, we would minimize the training error and there are no constraints on w, b. However, in this case, we seek to satisfy the inequality in (3). Thus, we need to solve the constrained optimization problem in which we seek a set of weights which separates the classes in the usually desired manner while also minimizing J(w), so that the margin between the classes is also maximized. Thus, we obtain a classifier with optimally separating hyperplanes.

C. An SVM Learning Rule

For any given data set, one possible method to determine $w_0, b_0$ such that (8) is minimized would be to use a constrained form of gradient descent. In this case, a gradient descent algorithm is used to minimize the cost function J(w), while constraining the changes in the parameters according to (3). A better approach to this problem, however, is to use Lagrange multipliers, which are well suited to the nonlinear constraints of (3). Thus, we introduce the Lagrangian equation:

$$L(w, b, \alpha) = \frac{1}{2}\|w\|^2 - \sum_{i=1}^{n} \alpha_i \left[ y_i (w^T x_i + b) - 1 \right] \qquad (9)$$

where $\alpha_i$ are the Lagrange multipliers and $\alpha_i \ge 0$. The solution is found by maximizing L with respect to $\alpha_i$ and minimizing it with respect to the primal variables w and b. This problem may be transformed from the primal case into its dual and hence we need to solve

$$\max_{\alpha}\ \min_{w, b}\ L(w, b, \alpha) \qquad (10)$$

At the solution point, we have the following conditions

$$\frac{\partial L(w_0, b_0, \alpha_0)}{\partial b} = 0 \qquad (11)$$

$$\frac{\partial L(w_0, b_0, \alpha_0)}{\partial w} = 0 \qquad (12)$$

where solution variables $w_0, b_0, \alpha_0$ are found. Performing the differentiations, we obtain respectively,

$$\sum_{i=1}^{n} \alpha_{0i}\, y_i = 0 \qquad (13)$$

$$w_0 = \sum_{i=1}^{n} \alpha_{0i}\, y_i\, x_i \qquad (14)$$

and in each case $\alpha_{0i} \ge 0$, $i = 1, \ldots, n$. These are properties of the optimal hyperplane specified by $(w_0, b_0)$. From (14) we note that given the Lagrange multipliers, the desired weight vector solution may be found directly in terms of the training vectors. To determine the specific coefficients of the optimal hyperplane specified by $(w_0, b_0)$ we proceed as follows.
Substitute (13) and (14) into (9) to obtain

$$L_D(\alpha) = \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j\, y_i y_j\, x_i^T x_j \qquad (15)$$

It is necessary to maximize the dual form of the Lagrangian equation in (15) to obtain the required Lagrange multipliers. Before doing so however, consider (3) once again. We observe that for this inequality, there will only be some training vectors for which the equality holds true. That is, only for some $(x_i, y_i)$ will the following equation hold:

$$y_i (w^T x_i + b) = 1 \qquad (16)$$

The training vectors for which this is the case are called support vectors. Since we have the Karush-Kuhn-Tucker (KKT) conditions that $\alpha_{0i} \ge 0$, $i = 1, \ldots, n$ and that given by (3), from the resulting Lagrangian equation in (9), we may write a further KKT condition

$$\alpha_{0i} \left[ y_i (w_0^T x_i + b_0) - 1 \right] = 0, \quad i = 1, \ldots, n \qquad (17)$$

This means that, since the Lagrange multipliers $\alpha_{0i}$ are nonzero only for the support vectors as defined in (16), the expansion of $w_0$ in (14) is with regard to the support vectors only. Hence we have

$$w_0 = \sum_{i \in S} \alpha_{0i}\, y_i\, x_i \qquad (18)$$

where S is the set of all support vectors in the training set. To obtain the Lagrange multipliers $\alpha_{0i}$, we need to maximize (15) only over the support vectors, subject to the constraints $\alpha_{0i} \ge 0$, $i = 1, \ldots, n$ and that given in (13). This is a quadratic programming problem and may be readily solved. Having obtained the Lagrange multipliers, the weights $w_0$ may be found from (18).

D. Classification of Linearly Separable Data

A support vector machine which performs the task of classifying linearly separable data is defined as

$$f(x) = \operatorname{sgn}(w^T x + b) \qquad (19)$$

where w, b are found from the training set. Hence (19) may be written as

$$f(x) = \operatorname{sgn}\left( \sum_{i \in S} \alpha_{0i}\, y_i\, x_i^T x + b_0 \right) \qquad (20)$$

where $\alpha_{0i}$ are determined from the solution of the quadratic programming problem in (15) and $b_0$ is found as

$$b_0 = -\frac{1}{2} \left( w_0^T x_i^{+} + w_0^T x_i^{-} \right) \qquad (21)$$

where $x_i^{+}$ and $x_i^{-}$ are any input training vector examples satisfying (16), i.e., support vectors, from the positive and negative classes respectively. For greater numerical accuracy, we may also average over k such pairs:

$$b_0 = -\frac{1}{2k} \sum_{i=1}^{k} \left( w_0^T x_i^{+} + w_0^T x_i^{-} \right) \qquad (22)$$

E. Classification of Nonlinearly Separable Data

For the case where the data is nonlinearly separable, the above approach can be extended to find a hyperplane which minimizes the number of errors on the training set.
This approach is also referred to as soft margin hyperplanes. In this case, the aim is to satisfy

$$y_i (w^T x_i + b) \ge 1 - \xi_i, \quad i = 1, \ldots, n$$

where $\xi_i \ge 0$, $i = 1, \ldots, n$. In this case, we seek to minimize

$$J(w, \xi) = \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{n} \xi_i$$

where C is the cost of constraint violations.

F. Nonlinear Support Vector Machines

For some problems, improved classification results may be obtained using a nonlinear classifier. Consider (20) which is a linear classifier. A nonlinear classifier may be obtained using support vector machines as follows. The classifier is obtained by the inner product $x_i^T x$ where $i \in S$, the set of support vectors. However, it is not necessary to use the explicit input data to form the classifier. Instead, all that is needed is to use the inner products between the support vectors and the vectors of the feature space. That is, by defining a kernel

$$K(x_i, x) = \phi(x_i)^T \phi(x)$$

where $\phi$ denotes the mapping from the input space to the feature space, a nonlinear classifier can be obtained as

$$f(x) = \operatorname{sgn}\left( \sum_{i \in S} \alpha_{0i}\, y_i\, K(x_i, x) + b_0 \right)$$

G. Kernel Functions

A kernel function may operate as a basis function for the support vector machine. In other words, the kernel function may be used to define a space within which the desired classification or prediction may be greatly simplified. Based on Mercer's theorem, as is well known in the art, it is possible to introduce a variety of kernel functions, including:

1. Polynomial

The pth order polynomial kernel function is given by

$$K(x_i, x) = (x_i^T x + 1)^p$$

2. Radial Basis Function

$$K(x_i, x) = e^{-\gamma \|x - x_i\|^2} \qquad (25)$$

where $\gamma > 0$.

3. Multilayer Networks

A multilayer network may be employed as a kernel function as follows. We have

$$K(x_i, x) = \sigma\left( \kappa\, (x_i^T x) + \theta \right)$$

where $\sigma$ is a sigmoid function and $\kappa$, $\theta$ are kernel parameters.

Note that the use of a nonlinear kernel permits a linear decision function to be used in a high dimensional feature space. We find the parameters following the same procedure as before. The Lagrange multipliers may be found by maximizing the functional

$$L_D(\alpha) = \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j\, y_i y_j\, K(x_i, x_j)$$

When support vector methods are applied to regression or curve-fitting, a high-dimensional "tube" with a radius of acceptable error is constructed which minimizes the error of the data set while also maximizing the flatness of the associated curve or function.
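The "tube" can be made concrete with a small sketch. It assumes the commonly used epsilon-insensitive loss, which the text above does not name explicitly: residuals smaller than the tube radius cost nothing, and larger residuals are penalized linearly. The fitted line `predict`, the sample values, and the name `tube_radius` are illustrative assumptions.

```python
def tube_loss(y_true, y_pred, tube_radius):
    """Epsilon-insensitive loss: zero inside the tube, linear outside it."""
    return max(0.0, abs(y_true - y_pred) - tube_radius)

def boundary_samples(samples, predict, tube_radius):
    """Samples whose residual reaches or exceeds the tube radius.

    In the standard formulation these are the samples that shape the fit.
    """
    return [(x, y) for x, y in samples if abs(y - predict(x)) >= tube_radius]

def predict(x):
    """Hypothetical fitted curve, here simply the line y = 2x + 1."""
    return 2.0 * x + 1.0

tube_radius = 0.5
samples = [(0, 1.2), (1, 3.5), (2, 4.0), (3, 7.1)]  # (input, measured output)

losses = [tube_loss(y, predict(x), tube_radius) for x, y in samples]
print(losses)                                   # only (2, 4.0) is penalized
print(boundary_samples(samples, predict, tube_radius))
```

Samples well inside the tube contribute nothing to the loss, so only the samples at or beyond its boundary constrain the fitted function.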
In other words, the tube is an envelope around the fit curve, defined by a collection of data points nearest the curve or surface, i.e., the support vectors. Thus, support vector machines offer an extremely powerful method of obtaining models for classification and regression. They provide a mechanism for choosing the model structure in a natural manner which gives low generalization error and empirical risk. H. Construction of Support Vector Machines Support vector machine 1206 may be built by specifying a kernel function, a number of inputs, and a number of outputs. Of course, as is well known in the art, regardless of the particular configuration of the support vector machine, some type of training process may be used to capture the behaviors and/or attributes of the system or process to be modeled. The modular aspect of one embodiment of the present invention as shown in FIG. 19 may take advantage of this way of simplifying the specification of a support vector machine. Note that more complex support vector machines may require more configuration information, and therefore more storage. Various embodiments of the present invention contemplate other types of support vector machine configurations for use with support vector machine 1206. In one embodiment, all that is required for support vector machine 1206 is that the support vector machine be able to be trained and retrained so as to provide the needed predicted values utilized in the process control. I. Support Vector Machine Training The coefficients used in support vector machine 1206 may be adjustable constants which determine the values of the predicted output data for given input data for any given support vector machine configuration. Support vector machines may be superior to conventional statistical models because support vector machines may adjust these coefficients automatically. 
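To illustrate how a kernel function together with a set of coefficients determines the predicted output for a given input, the following minimal sketch evaluates a kernel decision function of the form sgn(sum of alpha_i * y_i * K(x_i, x) + b0). The polynomial and radial basis function kernels use their common textbook forms, which are assumed here; the support vectors, multipliers, and offset are hand-picked for illustration and are not the result of any training procedure.

```python
import math

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def polynomial_kernel(p):
    """p-th order polynomial kernel (x_i^T x + 1)^p (assumed common form)."""
    return lambda xi, x: (dot(xi, x) + 1.0) ** p

def rbf_kernel(gamma):
    """Radial basis function kernel exp(-gamma * ||x - x_i||^2), gamma > 0."""
    return lambda xi, x: math.exp(
        -gamma * sum((a - b) ** 2 for a, b in zip(xi, x)))

def decide(x, support_vectors, labels, alphas, b0, kernel):
    """Return +1 or -1 from the kernel expansion over the support vectors."""
    total = b0 + sum(a * y * kernel(sv, x)
                     for sv, y, a in zip(support_vectors, labels, alphas))
    return 1 if total >= 0 else -1

# Hand-picked, illustrative coefficients (not trained).
support_vectors = [(0.0, 0.0), (2.0, 2.0)]
labels = [-1, 1]
alphas = [1.0, 1.0]      # chosen so that sum(alpha_i * y_i) = 0
b0 = 0.0
kernel = rbf_kernel(gamma=0.5)

print(decide((1.9, 2.1), support_vectors, labels, alphas, b0, kernel))   # 1
print(decide((0.1, -0.2), support_vectors, labels, alphas, b0, kernel))  # -1
```

Changing only the `kernel` argument changes the shape of the decision boundary, while the coefficients `alphas` and `b0` are the adjustable constants that training would determine.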
Thus, support vector machines may be capable of building the structure of the relationship (or model) between the input data 1220 and the output data 1218 by adjusting the coefficients. While a conventional statistical model typically requires the developer to define the equation(s) in which adjustable constant(s) are used, the support vector machine 1206 may build the equivalent of the equation(s) automatically. The support vector machine 1206 may be trained by presenting it with one or more training set(s). The one or more training set(s) are the actual history of known input data values and the associated correct output data values. As described below, one embodiment of the present invention may use the historical database with its associated timestamps to automatically create one or more training set(s). To train the support vector machine, the newly configured support vector machine is usually initialized by assigning random values to all of its coefficients. During training, the support vector machine 1206 may use its input data 1220 to produce predicted output data 1218. These predicted output data values 1218 may be used in combination with training input data 1306 to produce error data. These error data values may then be used to adjust the coefficients of the support vector machine. It may thus be seen that the error between the output data 1218 and the training input data 1306 may be used to adjust the coefficients so that the error is reduced. J. Advantages of Support Vector Machines Support vector machines may be superior to computer statistical models because support vector machines do not require the developer of the support vector machine model to create the equations which relate the known input data and training values to the desired predicted values (i.e., output data). In other words, support vector machine 1206 may learn the relationships automatically in the training step or module 104. 
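One way a training set might be created automatically from time-stamped records is sketched below. The pairing rule used here (each training input datum is matched with the most recent input data stored at or before its timestamp) is an assumption made for illustration; this passage does not specify the rule. The function name `build_training_sets` and the record layout are likewise hypothetical.

```python
from bisect import bisect_right

def build_training_sets(input_history, training_history):
    """Pair each training input datum with input data by timestamp.

    input_history: list of (timestamp, input_values), sorted by timestamp.
    training_history: list of (timestamp, training_value).
    Returns a list of (input_values, training_value) pairs.
    """
    times = [t for t, _ in input_history]
    training_sets = []
    for t_train, target in training_history:
        # Count of input records stored at or before the training timestamp.
        k = bisect_right(times, t_train)
        if k == 0:
            continue  # no input data existed yet at that time; skip
        training_sets.append((input_history[k - 1][1], target))
    return training_sets

# Illustrative records: sensor inputs every 10 time units, sparse lab values.
input_history = [(0, [1.0]), (10, [2.0]), (20, [3.0])]
training_history = [(12, 0.5), (-5, 0.1), (20, 0.9)]

print(build_training_sets(input_history, training_history))
# [([2.0], 0.5), ([3.0], 0.9)]
```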
However, it should be noted that support vector machine 1206 may require the collection of training input data with its associated input data, also called a training set. The training set may need to be collected and properly formatted. The conventional approach for doing this is to create a file on a computer on which the support vector machine is executed. In one embodiment of the present invention, in contrast, creation of the training set is done automatically using a historical database 1210 (FIG. 4). This automatic step may eliminate errors and may save time, as compared to the conventional approach. Another benefit may be significant improvement in the effectiveness of the training function, since automatic creation of the training set(s) may be performed much more frequently. II. Brief Overview Referring to FIGS. 4 and 5, one embodiment of the present invention may include a computer implemented support vector machine which produces predicted output data values 1218 using a trained support vector machine supplied with input data 1220 at a specified interval. The predicted data 1218 may be supplied via a historical database 1210 to a controller 1202, which may control a process 1212 which may produce a product 1216. In this way, the process conditions 1906 and product properties 1904 (as shown in FIGS. 22 and 23) may be maintained at a desired quality level, even though important process conditions and/or product properties may not be effectively measured directly, or modeled using conventional fundamental or conventional statistical approaches. One embodiment of the present invention may be configured by a developer using a configure and train support vector machine step or module 104. Various parameters of the support vector machine may be specified by the developer by using natural language without knowledge of specialized computer syntax and training.
For example, parameters specified by the user may include the type of kernel function, the number of inputs, the number of outputs, as well as algorithm parameters such as cost of constraint violations, and convergence tolerance (epsilon). Other possible parameters specified by the user may depend on which kernel is chosen (e.g., for gaussian kernels, one may specify the standard deviation, for polynomial kernels, one may specify the order of the polynomial). In one embodiment, there may be default values (estimates) for these parameters which may be overridden by user input. In this way, the system may allow an expert in the process being measured to configure the system without the use of a support vector machine expert. The support vector machine may be automatically trained on-line using input data 1220 and associated training input data 1306 having timestamps (for example, from clock 1230). The input data and associated training input data may be stored in a historical database 1210, which may supply this data (i.e., input data 1220 and associated training input data 1306) to the support vector machine 1206 for training at specified intervals. The (predicted) output data value 1218 produced by the support vector machine may be stored in the historical database. The stored output data value 1218 may be supplied to the controller 1202 for controlling the process as long as the error data 1504 between the output data 1218 and the training input data 1306 is below an acceptable metric. The error data 1504 may also be used for automatically retraining the support vector machine. This retraining may typically occur while the support vector machine is providing the controller with the output data, via the historical database. The retraining of the support vector machine may result in the output data approaching the training input data as much as possible over the operation of the process. 
In this way, an embodiment of the present invention may effectively adapt to changes in the process, which may occur in a commercial application. A modular approach for the support vector machine, as shown in FIG. 19, may be utilized to simplify configuration and to produce greater robustness. In essence, the modularity may be broken out into specifying data and calling subroutines using pointers. In configuring the support vector machine, as shown in FIG. 24, data pointers 2204 and/or 2206 may be specified. A template approach, as shown in FIGS. 29 and 30, may be used to assist the developer in configuring the support vector machine without having to perform any actual programming. The present invention in various embodiments is an on-line process control system and method. The term "on-line" indicates that the data used in various embodiments of the present invention is collected directly from the data acquisition systems which generate this data. An on-line system may have several characteristics. One characteristic may be the processing of data as the data is generated. This characteristic may also be referred to as real-time operation. Real-time operation in general demands that data be detected, processed, and acted upon fast enough to effectively respond to the situation. In a process control context, real-time may mean that the data may be responded to fast enough to keep the process in the desired control state. In contrast, off-line methods may also be used. In off-line methods, the data being used may be generated at some point in the past and there typically is no attempt to respond in a way that may affect the situation. It should be understood that while one embodiment of the present invention may use an on-line approach, alternate embodiments may substitute off-line approaches in various steps or modules.
As noted above, the embodiment described herein relates to process control, such as of a manufacturing plant, but is not intended to limit the application of the present invention to that domain, but rather, various embodiments of the invention are contemplated to be applicable in many other areas, as well, such as e-commerce, data analysis, stocks and bonds management and analysis, business decision-making, optimization, e-marketplaces, financial analysis, or any other field of endeavor where predictive or classification models may be useful. Thus, specific steps or modules described herein which apply only to process control embodiments may be different, or omitted as appropriate or as desired. III. Use in Combination with Expert Systems The above description of support vector machines and support vector machines as used in various embodiments of the present invention, combined with the description of the problem of making measurements in a process control environment given in the background section, illustrate that support vector machines add a unique and powerful capability to process control systems. SVMs may allow the inexpensive creation of predictions of measurements that may be difficult or impossible to obtain. This capability may open up a new realm of possibilities for improving quality control in manufacturing processes. As used in various embodiments of the present invention, support vector machines serve as a source of input data to be used by controllers of various types in controlling a process. Of course, as noted above, the applications of the present invention in the fields of manufacturing and process control may be illustrative, and are not intended to limit the use of the invention to any particular domain. For example, the "process" being controlled may be a financial analysis process, an e-commerce process, or any other process which may benefit from the use of predictive models. 
Expert systems may provide a completely separate and completely complementary capability for predictive model based systems. Expert systems may be essentially decision-making programs which base their decisions on process knowledge which is typically represented in the form of if-then rules. Each rule in an expert system makes a small statement of truth, relating something that is known or could be known about the process to something that may be inferred from that knowledge. By combining the applicable rules, an expert system may reach conclusions or make decisions which mimic the decision-making of human experts. The systems and methods described in several of the United States patents and patent applications incorporated by reference above use expert systems in a control system architecture and method to add this decision-making capability to process control systems. As described in these patents and patent applications, expert systems provide a very advantageous function in the implementation of process control systems. The present system adds a different capability of substituting support vector machines for measurements which may be difficult to obtain. The advantages of the present system may be both consistent with and complementary to the capabilities provided in the above-noted patents and patent applications using expert systems. The combination of support vector machine capability with expert system capability in a control system may provide even greater benefits than either capability provided alone. For example, a process control problem may have a difficult measurement and also require the use of decision-making techniques in structuring or implementing the control response. By combining support vector machine and expert system capabilities in a single control application, greater results may be achieved than using either technique alone.
It should thus be understood that while the system described herein relates primarily to the use of support vector machines for process control, it may very advantageously be combined with the expert system inventions described in the above-noted patents and patent applications to give even greater process control problem solving capability. As described below, when implemented in the modular process control system architecture, support vector machine functions may be easily combined with expert system functions and other control functions to build such integrated process control applications. Thus, while various embodiments of the present invention may be used alone, these various embodiments of the present invention may provide even greater value when used in combination with the expert system inventions in the above-noted patents and patent applications. IV. One Method of Operation One method of operation of one embodiment of the present invention may store input data and training data, may configure and may train a support vector machine, may predict output data using the support vector machine, may retrain the support vector machine, may enable or may disable control using the output data, and may control the process using output data. As shown in FIG. 5, more than one step or module may be carried out in parallel. As indicated by the divergent order pointer 120, the first two steps or modules in one embodiment of the present invention may be carried out in parallel. First, in step or module 102, input data and training input data may be stored in the historical database with associated timestamps. In parallel, the support vector machine may be configured and trained in step or module 104. Next, two series of steps or modules may be carried out in parallel as indicated by the order pointer 122. First, in step or module 106, the support vector machine may be used to predict output data using input data stored in the historical database. 
Next, in step or module 108, the support vector machine may be retrained using training input data stored in the historical database. Next, in step or module 110, control using the output data may be enabled or disabled in parallel. In step or module 112, control of the process using the output data may be carried out when enabled by step or module 110. A. Store Input Data and Training Input Data Step or Module 102 As shown in FIG. 5, an order pointer 120 indicates that step or module 102 and step or module 104 may be performed in parallel. Referring now to step or module 102, it is denoted as "store input data and training input data". FIG. 6 may show step or module 102 in more detail. Referring now to FIGS. 5 and 6, step or module 102 may have the function of storing input data 1220 and storing training input data 1306. Both types of data may be stored in a historical database 1210 (see FIG. 4 and related structure diagrams), for example. Each stored input data and training input data entry in historical database 1210 may utilize an associated timestamp. The associated timestamp may allow the system and method of one embodiment of the present invention to determine the relative time that the particular measurement or predicted value or measured value was taken, produced or derived. A representative example of step or module 102 is shown in FIG. 6, which is described as follows. The order pointer 120, as shown in FIG. 6, indicates that input data 1220 and training input data 1306 may be stored in parallel in the historical database 1210. Specifically, input data from sensors 1226 (see FIGS. 4 and 16) may be produced by sampling at specific time intervals the sensor signal 1224 provided at the output of the sensor 1226. This sampling may produce an input data value or number or signal. Each of these data points may be called input data 1220 as used in this application.
The input data may be stored with an associated timestamp in the historical database 1210, as indicated by step or module 202. The associated timestamp that is stored in the historical database with the input data may indicate the time at which the input data was produced, derived, calculated, etc. Step or module 204 shows that the next input data value may be stored by step or module 202 after a specified input data storage interval has lapsed or timed out. This input data storage interval realized by step or module 204 may be set at any specific value (e.g., by the user). Typically, the input data storage interval is selected based on the characteristics of the process being controlled. As shown in FIG. 6, in addition to the sampling and storing of input data at specified input data storage intervals, training input data 1306 may also be stored. Specifically, as shown by step or module 206, training input data may be stored with associated timestamps in the historical database 1210. Again, the associated timestamps utilized with the stored training input data may indicate the relative time at which the training input data was derived, produced or obtained. It should be understood that this usually is the time when the process condition or product property actually existed in the process or product. In other words, since it typically takes a relatively long period of time to produce the training input data (because lab analysis and the like usually has to be performed), it is more accurate to use a timestamp which indicates the actual time when the measured state existed in the process rather than to indicate when the actual training input data was entered into the historical database. This produces a much closer correlation between the training input data 1306 and the associated input data 1220. 
This close correlation is needed, as is discussed in detail below, in order to more effectively train and control the system and method of various embodiments of the present invention. The training input data may be stored in the historical database 1210 in accordance with a specified training input data storage interval, as indicated by step or module 208. While this may be a fixed time period, it typically is not. More typically, it is a time interval which is dictated by when the training data is actually produced by the laboratory or other mechanism utilized to produce the training input data 1306. As is discussed in detail herein, this often takes a variable amount of time to accomplish depending upon the process, the mechanisms being used to produce the training data, and other variables associated both with the process and with the measurement/analysis process utilized to produce the training input data. What is important to understand here is that the specified input data storage interval of step or module 204 is usually considerably shorter than the specified training input data storage interval of step or module 208. As may be seen, step or module 102 thus results in the historical database 1210 receiving values of input data and training input data with associated timestamps. These values may be stored for use by the system and method of one embodiment of the present invention in accordance with the steps and modules discussed in detail below. B. Configure and Train Support Vector Machine Step or Module 104 As shown in FIG. 5, the order pointer 120 shows that a configure and train support vector machine step or module 104 may be performed in parallel with the store input data and training input data step or module 102. The purpose of step or module 104 may be to configure and train the support vector machine 1206 (see FIG. 4).
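The storage behavior described above for step or module 102 may be sketched as follows. This is a minimal in-memory illustration, not the historical database 1210 itself: the class name `HistoricalDatabase`, its methods, and the numeric values are hypothetical. The point it shows is that a training input datum is stamped with the time at which the measured state existed in the process, even though the laboratory result is entered much later.

```python
from dataclasses import dataclass, field

@dataclass
class HistoricalDatabase:
    """Hypothetical in-memory store of time-stamped records."""
    input_data: list = field(default_factory=list)           # (timestamp, value)
    training_input_data: list = field(default_factory=list)  # (timestamp, value)

    def store_input(self, sampled_at, value):
        """Store a sensor sample with the time at which it was sampled."""
        self.input_data.append((sampled_at, value))

    def store_training_input(self, state_existed_at, value):
        """Store a lab result stamped with the time the state existed.

        The entry time is deliberately not used as the timestamp.
        """
        self.training_input_data.append((state_existed_at, value))
        self.training_input_data.sort(key=lambda record: record[0])

db = HistoricalDatabase()

# Sensor input data stored at a short, fixed storage interval (60 s here).
for t in range(0, 300, 60):
    db.store_input(sampled_at=t, value=20.0 + 0.01 * t)

# A lab result for a sample drawn at t = 120 s arrives roughly two hours
# later, but is stored with the time the sample was drawn.
db.store_training_input(state_existed_at=120, value=0.83)

print(len(db.input_data), db.training_input_data)
```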
Specifically, the order pointer 120 may indicate that the step or module 104 plus all of its subsequent steps and/or modules may be performed in parallel with the step or module 102. FIG. 7 shows a representative example of the step or module 104. As shown in FIG. 7, this representative embodiment is made up of five steps and/or modules 302, 304, 306, 308 and 310. Referring now to FIG. 7, an order pointer 120 shows that the first step or module of this representative embodiment is a configure support vector machine step or module 302. Configure support vector machine step or module 302 may be used to set up the structure and parameters of the support vector machine 1206 that is utilized by the system and method of one embodiment of the present invention. As discussed below in detail, the actual steps and/or modules utilized to set up the structure and parameters of support vector machine 1206 may be shown in FIG. 12. After the support vector machine 1206 has been configured in step or module 302, an order pointer 312 indicates that a wait training data interval step or module 304 may occur or may be utilized. The wait training data interval step or module 304 may specify how frequently the historical database 1210 is to be looked at to determine if any new training data to be utilized for training of the support vector machine 1206 exists. It should be noted that the training data interval of step or module 304 may not be the same as the specified training input data storage interval of step or module 208 of FIG. 6. Any desired value for the training data interval may be utilized for step or module 304. An order pointer 314 indicates that the next step or module may be a new training input data step or module 306. This new training input data step or module 306 may be utilized after the lapse of the training data interval specified by step or module 304. 
The purpose of step or module 306 may be to examine the historical database 1210 to determine if new training data has been stored in the historical database since the last time the historical database 1210 was examined for new training data. The presence of new training data may permit the system and method of one embodiment of the present invention to train the support vector machine 1206 if other parameters/conditions are met. FIG. 13 discussed below shows a specific embodiment for the step or module 306. An order pointer 318 indicates that if step or module 306 indicates that new training data is not present in the historical database 1210, the step or module 306 returns operation to the step or module 304. In contrast, if new training data is present in the historical database 1210, the step or module 306, as indicated by an order pointer 316, continues processing with a train support vector machine step or module 308. Train support vector machine step or module 308 may be the actual training of the support vector machine 1206 using the new training data retrieved from the historical database 1210. FIG. 14, discussed below in detail, shows a representative embodiment of the train support vector machine step or module 308. After the support vector machine has been trained in step or module 308, the step or module 104, as indicated by an order pointer 320, may move to an error acceptable step or module 310. Error acceptable step or module 310 may determine whether the error data 1504 produced by the support vector machine 1206 is within an acceptable metric, indicating that the support vector machine 1206 is providing output data 1218 that is close enough to the training input data 1306 to permit the use of the output data 1218 from the support vector machine 1206. 
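One pass through steps or modules 306, 308 and 310 may be sketched as follows. This is a minimal illustration under stated assumptions: the function name `training_pass`, the three callables supplied by the caller, the returned status strings, and the example error values and threshold are all hypothetical. The wait of step or module 304 is omitted, and the sketch simply assumes that the caller repeats the pass until the error is acceptable.

```python
def training_pass(has_new_training_data, train, error_is_acceptable):
    """Run steps 306, 308 and 310 once; the caller supplies each step.

    Returns "no_new_data" when step 306 finds nothing to train on,
    otherwise "trained" or "keep_training" according to step 310.
    """
    if not has_new_training_data():        # step 306
        return "no_new_data"               # order pointer 318: back to step 304
    error = train()                        # step 308 (order pointer 316)
    if error_is_acceptable(error):         # step 310 (order pointer 320)
        return "trained"
    return "keep_training"


# Hypothetical stand-ins: the error shrinks on each training pass.
errors = iter([0.40, 0.15, 0.04])
result = None
passes = 0
while result != "trained":
    result = training_pass(
        has_new_training_data=lambda: True,
        train=lambda: next(errors),
        error_is_acceptable=lambda e: e < 0.05,
    )
    passes += 1
```

With these illustrative values the third pass is the first whose error falls below the assumed threshold.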
In other words, an acceptable error may indicate that the support vector machine 1206 has been "trained" as training is specified by the user of the system and method of one embodiment of the present invention. A representative example of the error acceptable step or module 310 is shown in FIG. 15, which is discussed in detail below. If an unacceptable error is determined by error acceptable step or module 310, an order pointer 322 indicates that the step or module 104 returns to the wait training data interval step or module 304. In other words, when an unacceptable error exists, the step or module 104 has not completed training the support vector machine 1206. Because the support vector machine 1206 has not completed being trained, training may continue before the system and method of one embodiment of the present invention may move to a step or module 106 discussed below. In contrast, if the error acceptable step or module 310 determines that an acceptable error from the support vector machine 1206 has been obtained, then the step or module 104 has trained support vector machine 1206. Since the support vector machine 1206 has now been trained, step or module 104 may allow the system and method of one embodiment of the present invention to move to the steps or modules 106 and 112 discussed below. The specific embodiments for step or module 104 are now discussed.
1. Configure Support Vector Machine Step or Module 302
Referring now to FIG. 12, a representative embodiment of the configure support vector machine step or module 302 is shown. This step or module 302 may allow the users of one embodiment of the present invention to both configure and re-configure the support vector machine. As shown in FIG. 12, the order pointer 120 indicates that the first step or module may be a specify training and prediction timing control step or module 802. 
Step or module 802 may allow the person configuring the system and method of one embodiment of the present invention to specify the training interval(s) and the prediction timing interval(s) of the support vector machine 1206. FIG. 33 shows a representative embodiment of the step or module 802. Referring now to FIG. 33, step or module 802 may be made up of four steps and/or modules 3102, 3104, 3106, and 3108. Step or module 3102 may be a specify training timing method step or module. The specify training timing method step or module 3102 may allow the user configuring one embodiment of the present invention to specify the method or procedure to be followed to determine when the support vector machine 1206 is being trained. A representative example of this may be when all of the training data has been updated. Another example may be the lapse of a fixed time interval. Other methods and procedures may be utilized. An order pointer indicates that a specify training timing parameters step or module 3104 may then be carried out by the user of one embodiment of the present invention. This step or module 3104 may allow for any needed training timing parameters to be specified. It should be realized that the method or procedure of step or module 3102 may result in zero or more training timing parameters, each of which may have a value. This value may be a time value, a module number (e.g., in the modular embodiment of the present invention of FIG. 19), or a data pointer. In other words, the user may configure one embodiment of the present invention so that considerable flexibility may be obtained in how training of the support vector machine 1206 may occur, based on the method or procedure of step or module 3102. An order pointer indicates that once the training timing parameters 3104 have been specified, a specify prediction timing method step or module 3106 may be configured by the user of one embodiment of the present invention. 
This step or module 3106 may specify the method or procedure that may be used by the support vector machine 1206 to determine when to predict output data values 1218 after the SVM has been trained. This is in contrast to the actual training of the support vector machine 1206. Representative examples of methods or procedures for step or module 3106 may include: execute at a fixed time interval, execute after the execution of a specific module, and execute after a specific data value is updated. Other methods and procedures may also be used. An order indicator in FIG. 33 shows that a specify prediction timing parameters step or module 3108 may then be carried out by the user of one embodiment of the present invention. Any needed prediction timing parameters for the method or procedure of step or module 3106 may be specified. For example, the time interval may be specified as a parameter for the execute at a specific time interval method or procedure. Another example may be the specification of a module identifier when the execute after the execution of a particular module method or procedure is specified. Another example may be a data pointer when the updating of a data value method or procedure is used. Other operation timing parameters may be used. Referring again to FIG. 12, after the specify training and prediction timing control step or module 802 has been specified, a specify support vector machine size step or module 804 may be carried out. This step or module 804 may allow the user to specify the size and structure of the support vector machine 1206 that is used by one embodiment of the present invention. Specifically, referring to FIG. 33 again, a representative example of how the support vector machine size may be specified by step or module 804 is shown. An order pointer indicates that a specific number of inputs step or module 3110 may allow the user to indicate the number of inputs that the support vector machine 1206 may have. 
Note that the source of the input data for the specific number of inputs in the step or module 3110 is not specified. Only the actual number of inputs is specified in the step or module 3110. In step or module 3112, a kernel function may be determined for the support vector machine. The specific kernel function chosen may determine the kind of support vector machine (e.g., radial basis function, polynomial, multi-layer network, etc.). Depending upon the specific kernel function chosen, additional parameters may be specified. For example, as mentioned above, for gaussian kernels, one may specify the standard deviation, for polynomial kernels, one may specify the order of the polynomial. In one embodiment, there may be default values (estimates) for these parameters which may be overridden by user input. It should be noted that in other embodiments, various other training or execution parameters of the SVM not shown in FIG. 33 may be specified by the user (e.g., algorithm parameters such as cost of constraint violations, and convergence tolerance (epsilon)). An order pointer indicates that once the kernel function has been specified in step or module 3112, a specific number of outputs step or module 3114 may allow the user to indicate the number of outputs that the support vector machine 1206 may have. Note that the storage location for the outputs of the support vector machine 1206 is not specified in step or module 3114. Instead, only the actual number of outputs is specified in the step or module 3114. As discussed herein, one embodiment of the present invention may contemplate any form of presently known or future developed configuration for the structure of the support vector machine 1206. Thus, steps or modules 3110, 3112, and 3114 may be modified so as to allow the user to specify these different configurations for the support vector machine 1206. Referring again to FIG. 
12, once the support vector machine size has been specified in step or module 804, the user may specify the training and prediction modes in a step or module 806. Step or module 806 may allow both the training and prediction modes to be specified. Step or module 806 may also allow for controlling the storage of the data produced in the training and prediction modes. Step or module 806 may also allow for data coordination to be used in training mode. A representative example of the specific training and prediction modes step or module 806 is shown in FIG. 33. It is made up of step or modules 3116, 3118, and 3120. As shown, an order pointer indicates that the user may specify prediction and train modes in step or module 3116. These prediction and train modes may be yes/no or on/off settings, in one embodiment. Since the system and method of one embodiment of the present invention is in the train mode at this stage in its operation, step or module 3116 typically goes to its default setting of train mode only. However, it should be understood that various embodiments of the present invention may contemplate allowing the user to independently control the prediction or train modes. When prediction mode is enabled or "on," the support vector machine 1206 may predict output data values 1218 using retrieved input data values 1220, as described below. When training mode is enabled or "on," the support vector machine 1206 may monitor the historical database 1210 for new training data and may train using the training data, as described below. An order pointer indicates that once the prediction and train modes have been specified in step or module 3116, the user may specify prediction and train storage modes in step or module 3118. These prediction and train storage modes may be on/off, yes/no values, similar to the modes of step or module 3116. 
The prediction and train storage modes may allow the user to specify whether the output data produced in the prediction and/or training may be stored for possible later use. In some situations, the user may specify that the output data is not to be stored, and in such a situation the output data will be discarded after the prediction or train mode has occurred. Examples of situations where storage may not be needed include: (1) if the error acceptable metric value in the train mode indicates that the output data is poor and retraining is necessary; (2) in the prediction mode, where the output data is not stored but is only used. Other situations may arise where no storage is warranted. An order pointer indicates that a specify training data coordination mode step or module 3120 may then be specified by the user. Oftentimes, training input data 1306 may be correlated in some manner with input data 1220. Step or module 3120 may allow the user to deal with the relatively long time period required to produce training input data 1306 from when the measured state(s) existed in the process. First, the user may specify whether the most recent input data is to be used with the training data, or whether prior input data is to be used with the training data. If the user specifies that prior input data is to be used, the method of determining the time of the prior input data may be specified in step or module 3120. Referring again to FIG. 12, once the specified training and prediction modes step or module 806 has been completed by the user, steps and modules 808, 810, 812 and 814 may be carried out. Specifically, the user may follow specify input data step or module 808, specify output data step or module 810, specify training input data step or module 812, and specify error data step or module 814. 
Essentially, these four steps and/or modules 808-814 may allow the user to specify the source and destination of input and output data for both the (run) prediction and training modes, and the storage location of the error data determined in the training mode. FIG. 34 shows a representative embodiment used for all of the steps and/or modules 808-814 as follows. Steps and/or modules 3202, 3204, and 3206 essentially may be directed to specifying the data location for the data being specified by the user. In contrast, steps and/or modules 3208-3216 may be optional in that they allow the user to specify certain options or sanity checks that may be performed on the data as discussed below in more detail. Turning first to specifying the storage location of the data being specified, step or module 3202 is called specify data system. For example, typically, in a chemical plant, there is more than one computer system utilized with a process being controlled. Step or module 3202 may allow for the user to specify which computer system(s) contains the data or storage location that is being specified. Once the data system has been specified, the user may specify the data type using step or module 3204: specify data type. The data type may indicate which of the many types of data and/or storage modes is desired. Examples may include current (most recent) values of measurements, historical values, time averaged values, setpoint values, limits, etc. After the data type has been specified, the user may specify a data item number or identifier using step or module 3206. The data item number or identifier may indicate which of the many instances of the specify data type in the specified data system is desired. Examples may include the measurement number, the control loop number, the control tag name, etc. These three steps and/or modules 3202-3206 may thus allow the user to specify the source or destination of the data (used/produced by the support vector machine) being specified. 
Once this information has been specified, the user may specify the following additional parameters. The user may specify the oldest time interval boundary using step or module 3208, and may specify the newest time interval boundary using step or module 3210. For example, these boundaries may be utilized where a time weighted average of a specified data value is needed. Alternatively, the user may specify one particular time when the data value being specified is a historical data point value. Sanity checks on the data being specified may be specified by the user using steps and/or modules 3212, 3214 and 3216 as follows. The user may specify a high limit value using step or module 3212, and may specify a low limit value using step or module 3214. Since sensors sometimes fail, for example, this sanity check may allow the user to prevent the system and method of one embodiment of the present invention from using false data from a failed sensor. Other examples of faulty data may also be detected by setting these limits. The high and low limit values may be used for scaling the input data. Support vector machines may be typically trained and operated using input, output and training input data scaled within a fixed range. Using the high and low limit values may allow this scaling to be accomplished so that the scaled values use most of the range. In addition, the user may know that certain values will normally change a certain amount over a specific time interval. Thus, changes which exceed these limits may be used as an additional sanity check. This may be accomplished by the user specifying a maximum change amount in step or module 3216. Sanity checks may be used in the method of one embodiment of the present invention to prevent erroneous training, prediction, and control. Whenever any data value fails to pass the sanity checks, the data may be clamped at the limit(s), or the operation/control may be disabled. 
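The limit checks, the maximum change check, and the limit-based scaling described above may be sketched together. The function name `check_and_scale`, the choice of [0, 1] as the fixed range, and the policy of clamping on a limit violation while rejecting on a rate-of-change violation are assumptions made for illustration; the description above leaves the choice between clamping and disabling open.

```python
def check_and_scale(value, low, high, previous=None, max_change=None):
    """Apply limit and rate-of-change checks, then scale to [0, 1].

    Returns (scaled_value, ok). `ok` is False when the maximum change
    check fails, in which case the caller may disable operation/control.
    """
    if previous is not None and max_change is not None:
        if abs(value - previous) > max_change:
            return None, False             # assumed policy: reject the value
    clamped = min(max(value, low), high)   # assumed policy: clamp at the limit(s)
    return (clamped - low) / (high - low), True


in_range = check_and_scale(150.0, low=100.0, high=200.0)
too_high = check_and_scale(250.0, low=100.0, high=200.0)
jumped = check_and_scale(150.0, low=100.0, high=200.0,
                         previous=100.0, max_change=20.0)
```

Because the same high and low limits drive both the check and the scaling, a reading from a failed sensor cannot push a scaled value outside the range on which the support vector machine is trained.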
These tests may significantly increase the robustness of various embodiments of the present invention. It should be noted that these steps and/or modules in FIG. 34 apply to the input, output, training input, and error data steps and/or modules 808, 810, 812 and 814. When the support vector machine is fully configured, the coefficients may be normally set to random values in their allowed ranges. This may be done automatically, or it may be performed on demand by the user (for example, using softkey 2616 in FIG. 28).
2. Wait Training Input Data Interval Step or Module 304
Referring again to FIG. 7, the wait training data interval step or module 304 is now described in greater detail. Typically, the wait training input data interval is much shorter than the time period (interval) when training input data becomes available. This wait training input data interval may determine how often the training input data will be checked to determine whether new training input data has been received. Obviously, the more frequently the training input data is checked, the shorter the time interval will be from when new training input data becomes available to when retraining has occurred. It should be noted that the configuration for the support vector machine 1206 and specifying its wait training input data interval may be done by the user. This interval may be inherent in the software system and method which contains the support vector machine of one embodiment of the present invention. Preferably, it is specifically defined by the entire software system and method of one embodiment of the present invention. Next, the support vector machine 1206 is trained.
3. New Training Input Data Step or Module 306
An order pointer 314 indicates that once the wait training input data interval 304 has elapsed, the new training input data step or module 306 may occur. FIG. 13 shows a representative embodiment of the new training input data step or module 306. Referring now to FIG. 
13, a representative example of determining whether new training input data has been received is shown. A retrieve current training input timestamp from historical database step or module 902 may first retrieve from the historical database 1210 the current training input data timestamp(s). As indicated by an order pointer, a compare current training input data timestamp to stored training input data timestamp step or module 904 may compare the current training input data timestamp(s) with saved training input data timestamp(s). Note that when the system and method of one embodiment of the present invention is first started, an initialization value may be used for the saved training input data timestamp. If the current training input data timestamp is the same as the saved training input data timestamp, this may indicate that new training input data does not exist. This situation of no new training input data may be indicated by order pointer 318. Step or module 904 may function to determine whether any new training input data is available for use in training the support vector machine. It should be understood that, in various embodiments of the present invention, the presence of new training input data may be detected or determined in various ways. One specific example is where only one storage location is available for training input data and the associated timestamp. In this case, detecting or determining the presence of new training input data may be carried out by saving internally in the support vector machine the associated timestamp of the training input data from the last time the training input data was checked, and periodically retrieving the timestamp from the storage location for the training input data and comparing it to the internally saved value of the timestamp. Other distributions and combinations of storage locations for timestamps and/or data values may be used in detecting or determining the presence of new training input data. 
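The single-storage-location example above may be sketched as a small detector that remembers the last timestamp it saw. The class name `NewTrainingDataDetector`, its `check` method, and the use of `None` as the initialization value are hypothetical choices made for illustration.

```python
class NewTrainingDataDetector:
    """Detect new training input data by comparing timestamps."""

    def __init__(self, initial_timestamp=None):
        # Initialization value used before any timestamp has been saved.
        self.saved_timestamp = initial_timestamp

    def check(self, current_timestamp):
        """Return True if the timestamp differs from the saved one.

        A differing timestamp is remembered for the next check.
        """
        if current_timestamp == self.saved_timestamp:
            return False                   # no new training input data
        self.saved_timestamp = current_timestamp
        return True


detector = NewTrainingDataDetector()
# Timestamps as they might be read from the storage location on each check;
# None stands for "nothing stored yet".
results = [detector.check(t) for t in [None, 10, 10, 25]]
```

Only a change in the stored timestamp triggers training, so checking more often than the laboratory produces results costs nothing beyond the comparison.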
However, if the comparison of step or module 904 indicates that the current training input data timestamp is different from the saved training input data timestamp, this may indicate that new training input data has been received or detected. This new training input data timestamp may be saved by a save current training input data timestamp step or module 906. After this current timestamp of training input data has been saved, the new training data step or module 306 is completed, and one embodiment of the present invention may move to the train support vector machine step or module 308 of FIG. 7 as indicated by the order pointer.
4. Train Support Vector Machine Step or Module 308
Referring again to FIG. 7, the train support vector machine step or module 308 may be the step or module where the support vector machine 1206 is trained. FIG. 14 shows a representative embodiment of the train support vector machine step or module 308. Referring now to step or module 308 shown in FIG. 14, an order pointer 316 indicates that a retrieve current training input data from historical database step or module 1002 may occur. In step or module 1002, one or more current training input data values may be retrieved from the historical database 1210. The number of current training input data values that is retrieved may be equal to the number of outputs of the support vector machine 1206 that is being trained. The training input data is normally scaled. This scaling may use the high and low limit values specified in the configure and train support vector machine step or module 104. An order pointer shows that a choose training input data time step or module 1004 may be carried out next. Typically, when there are two or more current training input data values that are retrieved, the data time (as indicated by their associated timestamps) for them is different. 
The reason for this is that typically the sampling schedule used to produce the training input data is different for the various training input data. Thus, current training input data often has varying associated timestamps. In order to resolve these differences, certain assumptions have to be made. In certain situations, the average between the timestamps may be used. Alternatively, the timestamp of one of the current training input data may be used. Other approaches also may be employed. Once the training input data time has been chosen in step or module 1004, the input data at the training input data time may be retrieved from the historical database 1210 as indicated by step or module 1006. The input data is normally scaled. This scaling may use the high and low limit values specified in the configure and train support vector machine step or module 104. Thereafter, the support vector machine 1206 may predict output data from the retrieved input data, as indicated by step or module 406. The predicted output data from the support vector machine 1206 may then be stored in the historical database 1210, as indicated by step or module 408. The output data is normally produced in a scaled form, since all the input and training input data is scaled. In this case, the output data may be de-scaled. This de-scaling may use the high and low limit values specified in the configure and train support vector machine step or module 104. Thereafter, error data may be computed using the output data from the support vector machine 1206 and the training input data, as indicated by step or module 1012. It should be noted that the term error data 1504 as used in step or module 1012 may be a set of error data values for all of the predicted outputs from the support vector machine 1206. However, one embodiment of the present invention may also contemplate using a global or cumulative error data for evaluating whether the predicted output data values are acceptable. 
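Two pieces of the sequence above lend themselves to a short sketch: choosing a single training input data time from differing timestamps (step or module 1004) and computing per-output error data (step or module 1012). The function names, the `"latest"` option as one way of using the timestamp of a single training input, and the sign convention of predicted value minus training value are assumptions made for illustration.

```python
from datetime import datetime, timedelta


def choose_training_time(timestamps, method="average"):
    """Resolve differing training input timestamps into one data time."""
    if method == "average":
        reference = min(timestamps)
        offsets = [(t - reference).total_seconds() for t in timestamps]
        return reference + timedelta(seconds=sum(offsets) / len(offsets))
    if method == "latest":
        return max(timestamps)
    raise ValueError(f"unknown method: {method}")


def compute_error_data(predicted, training):
    """Per-output error: predicted output minus training input value."""
    return [p - t for p, t in zip(predicted, training)]


t0 = datetime(2000, 1, 1, 8, 0)
stamps = [t0, t0 + timedelta(minutes=10)]
data_time = choose_training_time(stamps)            # midpoint of the two stamps
error_data = compute_error_data([1.0, 2.5], [0.5, 3.0])
```

The chosen `data_time` would then be used to retrieve the input data that existed at that moment, so that the prediction and the training values describe the same process state.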
After the error data 1504 has been computed or calculated in step or module 1012, the support vector machine 1206 may be retrained using the error data 1504 and/or the training input data 1306, as indicated by step or module 1014. One embodiment of the present invention may contemplate any method of training the support vector machine 1206. After the training step or module 1014 is completed, the error data 1504 may be stored in the historical database 1210 in step or module 1016. It should be noted that the error data 1504 shown here may be the individual data for each output. These stored error data 1504 may provide a historical record of the error performance for each output of the support vector machine 1206. The sequence of steps described above may be used when the support vector machine 1206 is effectively trained using a single presentation of the training set created for each new training input data 1306. However, in using certain training methods or for certain applications, the support vector machine 1206 may require many presentations of training sets to be adequately trained (i.e., to produce an acceptable metric). In this case, two alternate approaches may be used to train the support vector machine 1206, among other approaches. In the first approach, the support vector machine 1206 may save the training sets (i.e., the training input data and the associated input data which is retrieved in step or module 308) in a database of training sets, which may then be repeatedly presented to the support vector machine 1206 to train the support vector machine. The user may be able to configure the number of training sets to be saved. As new training data becomes available, new training sets may be constructed and saved. When the specified number of training sets has been accumulated (e.g., in a list or buffer), the next training set created based on new data may "bump" the oldest training set from the list or buffer. This oldest training set may then be discarded. 
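The first approach may be sketched with a fixed-capacity buffer. The class `TrainingSetBuffer`, its method names, and the `train_once` callable are hypothetical; the sketch relies on Python's `collections.deque` with `maxlen`, which discards the item at the opposite end when a new item is appended to a full deque.

```python
from collections import deque


class TrainingSetBuffer:
    """Fixed-size buffer in which a new training set bumps the oldest one."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._sets = deque(maxlen=capacity)

    def add(self, input_values, training_values):
        """Append a training set; return the bumped (oldest) set, if any."""
        full = len(self._sets) == self._sets.maxlen
        bumped = self._sets[0] if full else None
        self._sets.append((input_values, training_values))
        return bumped

    def present(self, train_once, presentations=1):
        """Present every buffered training set the given number of times."""
        for _ in range(presentations):
            for input_values, training_values in self._sets:
                train_once(input_values, training_values)

    def __len__(self):
        return len(self._sets)


buffer = TrainingSetBuffer(capacity=3)
bumped = [buffer.add([i], [i * 0.1]) for i in range(4)]

seen = []
buffer.present(lambda x, y: seen.append(x[0]), presentations=2)
```

In this example the fourth training set bumps the first, and each of the three retained sets is presented twice, oldest first.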
Conventional support vector machine training creates training sets all at once, off-line, and would continue using all the training sets created. It is noted that the use of a buffer to store training sets is but one example of storage means for the training sets, and that other storage means are also contemplated, including lists (such as queues and stacks), databases, and arrays, among others. A second approach which may be used is to maintain a time history of input data and training input data in the historical database 1210 (e.g., in a list or buffer), and to search the historical database 1210, locating training input data and constructing the corresponding training set by retrieving the associated input data. It should be understood that the combination of the support vector machine 1206 and the historical database 1210 containing both the input data and the training input data with their associated timestamps may provide a very powerful platform for building, training and using the support vector machine 1206. One embodiment of the present invention may contemplate various other modes of using the data in the historical database 1210 and the support vector machine 1206 to prepare training sets for training the support vector machine 1206.
5. Error Acceptable Step or Module 310
Referring again to FIG. 7, once the support vector machine 1206 has been trained in step or module 308, a determination of whether an acceptable error exists may occur in step or module 310. FIG. 15 shows a representative embodiment of the error acceptable step or module 310. Referring now to FIG. 15, an order pointer 320 indicates that a compute global error using saved global error step or module 1102 may occur. The term global error as used herein means the error over all the outputs and/or over two or more training sets (cycles) of the support vector machine 1206. The global error may reduce the effects of variation in the error from one training set (cycle) to the next. 
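One way a global error could be computed from a saved global error is sketched below. The specific formulas are assumptions and are not taken from the description: the error over all outputs is taken as a root-mean-square value, and successive cycles are blended with a fixed weight. The function names and the default weight are likewise hypothetical.

```python
import math


def cycle_error(error_data):
    """Root-mean-square error over all outputs for one training cycle."""
    return math.sqrt(sum(e * e for e in error_data) / len(error_data))


def update_global_error(saved_global_error, error_data, weight=0.25):
    """Blend this cycle's error into the saved global error."""
    current = cycle_error(error_data)
    if saved_global_error is None:         # first cycle: nothing saved yet
        return current
    return (1.0 - weight) * saved_global_error + weight * current


first = update_global_error(None, [3.0, -3.0])
blended = update_global_error(1.0, [3.0, -3.0], weight=0.25)
```

A smaller weight makes the global error respond more slowly to any single cycle, which is the smoothing effect the global error is meant to provide.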
One cause for the variation is the inherent variation in tests used to generate the training input data. Once the global error has been computed or estimated in step or module 1102, the global error may be saved in step or module 1104. The global error may be saved internally in the support vector machine 1206, or it may be stored in the historical database 1210. Storing the global error in the historical database 1210 may provide a historical record of the overall performance of the support vector machine 1206. Thereafter, if an appropriate history of global error is available (as would be the case in retraining), step or module 1106 may be used to determine if the global error is statistically diffe
