System and method for uniformly administering parameters for a load distribution and load control of a computer platform

6889377

Abstract

A system and method for uniformly administering parameters for a load distribution and load control of a computer platform includes a processor and a storage mechanism. Software components run on the processor, and include a manager component, which is a load model manager, that uniformly administers parameters for a load distribution and load control of the system. The platform further includes a catalog stored in the storage mechanism via which the load model manager administers the parameters. The catalog includes a plurality of tables. Each of the tables respectively includes a load model which is a complete, consistent set of parameters that influence the load distribution of the load control of the computer system.

Claims

1. A computer system platform, comprising:

Description

BACKGROUND OF THE INVENTION
The totality of software present on a platform for the above tasks, including basic software, is also referred to as the "load type". Beyond these "pure" load types, platforms having "combined load types" must also be offered so that minimal configurations can be offered in customized and cost-effective form. It is precisely these platforms that pose a challenge for the careful matching of the guaranteed minimum run time budget of the applications, since the competition for processor run time, particularly among different types of user programs, is greatest in this situation. The operating system and the load control are explained in greater detail below.

Process Term and CHILL

The applications discussed above are realized as processes and are also referred to as such below. A process is understood here as a run time-related unit. A process is a runnable software unit that has attributes; those that influence its access right to the processor are of primary interest. These attributes are statically implemented. The process and its attributes are described at the language level in CHILL (the High Level Language of the Comité Consultatif International Téléphonique et Télégraphique, CCITT) and are compiler-supported. Upon system start-up, the processes are provided in a form suited to the operating system that takes their attributes into consideration.

Two attributes of the processes are important for the real-time behavior:

The task group. A task group comprises those units for which a minimum access time to the processor of the platform on which they reside is guaranteed.

The category of the delay time demands of a process within a task group. This category determines how processes that belong to the same task group on the same platform compete for calculating time when they are simultaneously ready to run. This attribute, which distinguishes the processes within a task group, is referred to as the process priority (see the section "scheduling" below).
Operating System Part: Service Addressing

When a node is greatly expanded, it is desirable to distribute the requests across the cluster such that, ideally, no individual platform is overloaded. The operating system part called "service addressing" assumes this task. The operating system of each platform knows all functions that are offered and addressable in the entire cluster as "services". A service can be offered on the platform on which it is requested and/or on one or more other platforms of the cluster. In order to avoid communication overhead between the platforms, the requesting user is given the response address of the service on its own platform, if the service is present there. When, however, this platform is in an overload condition (see also the section "load control" below), a different platform in the cluster that also offers this service is selected, provided that such a platform is in a lower load condition. This is possible because a current load image of the overall cluster is available on every platform.

Operating System Part: Scheduling

For a minimal configuration of the cluster, the computing capacity of an individual processor must be carefully distributed among the applications. To that end, the operating system of each platform must support the processes that compete for calculating time on one and the same platform, particularly under high load, such that the real-time demands at the switching nodes can be met. In order to enable this, the processes are combined into task groups. FIG. 2 shows an example of the scheduling of the processes. A guaranteed minimum share of the processor time is defined as a "run time budget" (i.e., net calculating time) for each of the task groups by load control parameters. These task groups are therefore also referred to as "virtual CPUs" (VCPU). The processes are assigned to a FIFO waiting list per VCPU and priority, and wait there for allocation of the CPU.
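The service-addressing rule described above (prefer the requester's own platform, and divert to another platform only when the own platform is overloaded and a less loaded one offers the same service) can be illustrated with a minimal Python sketch. The names `Platform`, `resolve_service`, and the numeric `load_level` scale are hypothetical and are not taken from the patent; the tie-breaking rule (take the least loaded candidate) is likewise an assumption.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set


@dataclass
class Platform:
    name: str
    services: Set[str]
    load_level: int = 0       # hypothetical scale: higher means more loaded
    overloaded: bool = False  # set by load control, replicated cluster-wide


def resolve_service(service: str, local: str,
                    cluster: Dict[str, Platform]) -> Optional[str]:
    """Return the name of the platform that should serve the request."""
    own = cluster[local]
    offers_locally = service in own.services
    if offers_locally and not own.overloaded:
        return local  # avoids inter-platform communication overhead
    # Other platforms offering the service; if the service also exists
    # locally, a candidate must be in a lower load condition than we are.
    candidates = [
        p for p in cluster.values()
        if p.name != local and service in p.services
        and (not offers_locally or p.load_level < own.load_level)
    ]
    if candidates:
        # Assumption: pick the least loaded candidate.
        return min(candidates, key=lambda p: p.load_level).name
    # No better platform: stay local if possible, otherwise unresolved.
    return local if offers_locally else None
```

Because every platform holds the cluster load image, a lookup of this kind needs no remote query at request time.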
The scheduling routine works with time slices. At the beginning of each time slot, it evaluates a time credit account per VCPU by comparing the sum of all measured running times of the processes of a VCPU to the guaranteed time budget. The result of this comparison is a VCPU-individual time credit. In the VCPU having the greatest time budget, the process waiting at the head of the highest-priority waiting list is the first to receive calculating time (see FIG. 2). When the process is suspended during this time slot, the next process from the same VCPU (and, potentially, from the same waiting list) follows. When no other process from this VCPU is waiting, the scheduling algorithm proceeds as it did at the beginning of the time slot. At the beginning of the next time slot, the scheduling algorithm is applied anew.

Load Control

The load control SW cyclically compares the current load situation of its own platform to platform-individual thresholds and, where applicable, identifies the processor as being in an overload condition. The overload condition is established for a processor when the total offered load is higher than the load thresholds that are established for this processor. This load information is distributed system-wide, so that the service addressing is in a position to "spare" overloaded platforms. As soon as a processor is in the overload condition, the load control also determines, on the basis of the run time budget of the VCPUs, which of these VCPUs has exceeded its budget. These VCPUs are then referred to as overloaded. Suitable processes within these overloaded VCPUs are informed and requested to reduce their load offering to the processor.
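The per-slot selection and the budget check performed by load control can be sketched together in Python. This is a minimal illustration under stated assumptions, not the patented implementation: the time credit is taken to be the guaranteed budget minus the measured running time, the VCPU with the greatest credit among those with waiting processes is served first, and a larger number means a higher priority. All names (`VCPU`, `pick_next`, `overloaded_vcpus`) and units are hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Dict, List, Optional, Tuple


@dataclass
class VCPU:
    name: str
    budget: float      # guaranteed run time per accounting window (assumed ms)
    used: float = 0.0  # sum of measured running times; updated by the caller
    queues: Dict[int, Deque[str]] = field(default_factory=dict)

    def enqueue(self, process: str, priority: int) -> None:
        # One FIFO waiting list per VCPU and priority.
        self.queues.setdefault(priority, deque()).append(process)

    def credit(self) -> float:
        # Assumption: credit = guaranteed budget - measured running time.
        return self.budget - self.used

    def has_waiting(self) -> bool:
        return any(self.queues.values())

    def pop_next(self) -> str:
        # Head of the non-empty waiting list with the highest priority.
        top = max(p for p, q in self.queues.items() if q)
        return self.queues[top].popleft()


def pick_next(vcpus: List[VCPU]) -> Optional[Tuple[str, str]]:
    """Select (VCPU name, process) to run next, or None if nothing waits."""
    ready = [v for v in vcpus if v.has_waiting()]
    if not ready:
        return None
    chosen = max(ready, key=VCPU.credit)
    return chosen.name, chosen.pop_next()


def overloaded_vcpus(vcpus: List[VCPU], platform_load: float,
                     threshold: float) -> List[str]:
    """VCPUs over budget, reported only while the platform is overloaded."""
    if platform_load <= threshold:
        return []
    return [v.name for v in vcpus if v.used > v.budget]
```

Calling `pick_next` again after a process is suspended reproduces the behavior described above: the same VCPU continues to be served while it has waiting processes and the greatest credit, and the selection falls through to another VCPU otherwise.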
According to the above-presented MP operating system structure of the ATM switching node, both the load control and the process scheduling in the operating system work with a number of load control parameters in order to create the preconditions for an optimum real-time behavior. Since the operating system and the load control are generic (i.e., uniform) for all MP platforms, these data must be available on each MP for all possible function configurations of a platform. Moreover, they must be interactively "activatable" on the platform (e.g., during the configuration) in order to take the function configuration into account. Furthermore, the data must be adapted to the respective operating condition (e.g., recovery) of the platform in order to respond optimally to the various demands of these operating conditions.

Load Model Management

The demands and mechanisms illustrated above, such as scalability of the network node, distribution of the load offering among the platforms, division of the computing capacity of a platform among the various users, a process concept with static process attributes, and the load control, are still not all of the mechanisms that are related to the load control parameters. In addition, there is clock information, as well as run time limitations for the internal protocol, considerations related to the basic load, and further sub-budget divisions and thresholds that are not discussed here. When one considers the quantity and complexity of the load control parameters as well as the request profile and functional structure in which they reside, it becomes clear that it is necessary to implement them in a surveyable fashion and to have them controlled by their own software. All parameters that influence the load control and the load distribution of the system are therefore stored in a catalog replicated on each platform.
This creates a surveyability that is particularly important in the design phase, when a change to one of the parameters affects the values of other parameters. At run time, only a single management function (the LM manager) still has access to this catalog; this function also maintains the interfaces to the parameter users. The management function is also receptive to triggers that can request changes of the currently valid parameters via defined interfaces at run time (for example, via interactive commands by the user). It bears the responsibility for informing all users affected by the changes. The structure of the parameter catalog is selected such that online changes of the valid parameters are meaningfully limited. Engineering of the complex influencing quantities is possible, and can be implemented error-free, only on the basis of this form of management and structuring. The load management model structures the complex and extensive multitude of load control parameters in a way that, on the one hand, assures surveyability for the purpose of engineerability, including error-free implementability. On the other hand, it also provides a flexibility that creates the preconditions for a situation-suited adaptation (in the configuration and at run time) of the currently valid parameter values for the purpose of optimum performance in view of load distribution and load control. Interactively, only a single table can be selected and "activated" from the parameter catalog at one time, and thus no value from another table of the catalog can be employed as long as this table is "active". Changing operating conditions of the platform can in turn select only among pre-fabricated data sets and change between these. FIG. 3 shows the relationships of the load model manager to various components within the system. There are three different types of users of the LM manager.
A first user type (user 1) is only in communication with the LM manager. The recovery component is cited as an exemplary user of this type. A second user type (user i) is in communication both with the LM manager and with the load control; the scheduling component is cited as an exemplary user of this type. Finally, a third user type (user n) only receives values from the load control; the switching technology component is cited as an exemplary user of this type.

Operator Interface

Each individual platform of the cluster must be configured, for example upon initialization of the switching node. This means:

The type of organization of the central load control parameters in the load model catalog is discussed in greater detail in the section "load model catalog". A load model table from the load model catalog is uniquely allocated to a load type, although a plurality of tables can be present for one load type (for example, for different load expectations or operating conditions). After the selection of a table, only the values recited in it, which represent a consistent set of parameters, are valid until the interruption-free replacement of this table. This is a meaningful limitation that facilitates operation of the node for the user and provides the user with the security of always placing parameters matched to one another into operation. The ability to select among various load model tables gives the user the possibility of tuning the platforms. The operator can establish suitable VCPU budgets at the various VCPUs dependent on the anticipated load. Since the operator can replace the currently valid load model table interruption-free during operation, the budgets can be adapted without difficulty to modified conditions (for example, due to a HW or SW upgrade in the node). Values within a table can be changed by a SW patch that can be checked for consistency and supplied by the manufacturer.
An operator command can then select the patched table without interrupting operation (see FIG. 3), and the control software (load model management) takes care of informing the affected applications. For coordination of the selection, see the section "control software" below and FIG. 4.

Load Model Catalog

The load model catalog represents the organizational structure of those load model parameters that are centrally implemented. In contrast to this, process attributes are declared decentrally at every individual process; see the above section "process term". The load model catalog is composed of individual load model tables that can be interactively allocated to each platform (see the section "operator interface"). Each load model table from the load model catalog contains, first, a plurality of load models that contain the budgets to be guaranteed for the VCPUs. Various load models are required in order to be able to take the different demands of the platform operating conditions (such as startup or normal operation) into account. Second, the load model table also contains the clock information, overload limits and run time limitations for sequencing the internal protocol, the size of the load reserve, the basic load, as well as a table that supports the allocation of the processes to the VCPUs.

Control Software (Load Model Manager)

The control SW (load model management or management component) is an operating system-proximate process that sees to the readout of the correct parameter data upon initialization or platform status changes. The load model (LM) manager enables the interruption-free replacement of the currently valid load model parameters. To that end, it makes interfaces available for the events to be triggered. Furthermore, it informs the SW affected by such an event. The operating system as well as processes such as the load control belong to this SW.
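The organization of a load model table described in the load model catalog section can be pictured as a small data structure. The following Python sketch is only an illustration: the field names, the percent units, and in particular the consistency rule (VCPU budgets plus load reserve plus basic load must not exceed 100 percent, and every process must map to a budgeted VCPU) are assumptions, since the patent does not state how consistency is checked.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class LoadModel:
    # VCPU name -> guaranteed share of processor time (assumed percent)
    vcpu_budgets: Dict[str, int]


@dataclass(frozen=True)
class LoadModelTable:
    load_type: str
    load_models: Dict[str, LoadModel]  # operating condition -> load model
    overload_limit: int                # assumed percent of processor load
    load_reserve: int                  # assumed percent
    basic_load: int                    # assumed percent
    process_to_vcpu: Dict[str, str]    # allocation of processes to VCPUs

    def is_consistent(self) -> bool:
        """Hypothetical consistency rule, applied to every load model."""
        vcpus_used = set(self.process_to_vcpu.values())
        for model in self.load_models.values():
            total = (sum(model.vcpu_budgets.values())
                     + self.load_reserve + self.basic_load)
            if total > 100:
                return False
            if not vcpus_used <= set(model.vcpu_budgets):
                return False
        return True


# The catalog itself is then a mapping from table name to table.
Catalog = Dict[str, LoadModelTable]
```

Keeping each table immutable mirrors the idea that values are changed only by replacing or patching a whole table, never by mixing values from different tables.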
When, due to the replacement of the load model table, the budget is completely withdrawn from a VCPU, then the pertinent service must also be withdrawn. Depending on the internal operating condition of the pertinent platform, one or the other load model is applied. When, for example, a recovery occurs during operation (see also FIG. 3 or 4), recovery can turn to load management so that a switch is made from the normal operating load model to a specific recovery load model, so that the modified run time demands of recovery are taken into account. Load model management independently informs the SW affected by this event, such as, for example, scheduling and the load control.
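The behavior of the load model manager described above (one active table, a switch between pre-fabricated load models on a change of operating condition, and notification of the affected software) can be sketched in Python as follows. The class and method names are hypothetical, the catalog is reduced to plain dictionaries of VCPU budgets, and the callback-based notification is an assumption about how "informing" might be realized.

```python
from typing import Callable, Dict, List

Budgets = Dict[str, int]                 # VCPU name -> budget (assumed percent)
Catalog = Dict[str, Dict[str, Budgets]]  # table -> operating condition -> budgets


class LoadModelManager:
    """Sole run-time accessor of the catalog; notifies parameter users."""

    def __init__(self, catalog: Catalog, table: str,
                 condition: str = "normal") -> None:
        self._catalog = catalog
        self._table = table
        self._condition = condition
        self._listeners: List[Callable[[Budgets], None]] = []

    def subscribe(self, listener: Callable[[Budgets], None]) -> None:
        # E.g. scheduling and load control register here (assumption).
        self._listeners.append(listener)

    def current(self) -> Budgets:
        # A copy, so users cannot modify the catalog.
        return dict(self._catalog[self._table][self._condition])

    def activate_table(self, table: str) -> None:
        """Operator command: replace the active table as a whole."""
        if table not in self._catalog or self._condition not in self._catalog[table]:
            raise KeyError(table)
        self._table = table
        self._notify()

    def set_condition(self, condition: str) -> None:
        """Trigger from e.g. recovery: switch load model within the table."""
        if condition not in self._catalog[self._table]:
            raise KeyError(condition)
        self._condition = condition
        self._notify()

    def withdrawn_vcpus(self) -> List[str]:
        # VCPUs with no budget; their services would have to be withdrawn.
        return [v for v, b in self.current().items() if b == 0]

    def _notify(self) -> None:
        budgets = self.current()
        for listener in self._listeners:
            listener(budgets)
```

In this sketch a failed activation leaves the previously active table in force, so users always see one complete set of parameters.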
