Exception handling method and apparatus for a microkernel data processing system
5606696
Abstract
The floating point hardware register set is not given to any user level thread unless the thread needs it to perform floating point operations. Thus, for any non-floating-point thread, the context does not include the floating point hardware state. This reduces the amount of information to be handled when threads are swapped in the processor. During the course of a thread's execution, at the first attempt by the thread to execute a floating point instruction, the "float-unavailable" exception occurs. This, in turn, invokes the microkernel's floating point exception handler. The function of this exception handler is to make floating point available to the thread that requires it. The exception handler dynamically allocates space for saving the thread's floating point registers, initializes the registers, and turns on the "float-available" bit in the thread's machine state register. Once a thread obtains floating point context, it continues to have it for the remainder of its life.
Claims
What is claimed is:
Description
FIELD OF THE INVENTION
______________________________________
typedef struct ppc_kernel_state {
    int ks_ss;         /* preallocated ppc_saved_state */
    int ks_sp;         /* kernel stack pointer */
    int ks_lr;         /* link register */
    int ks_cr;         /* condition code register */
    int ks_reg13[19];  /* nonvolatile registers r13 - r31 */
    int ks_pad;        /* double word boundary */
};
______________________________________
cpu_vars
This structure holds all of the per-CPU global variables.
______________________________________
typedef struct cpu_vars {
    /* these fields are read/write */
    struct fh_save_area cv_fast_save;     /* fast save area */
    ppc_state_t         cv_next_ss;       /* next exception save area */
    ppc_state_t         cv_user_ss;       /* user mode exception save area */
    vm_offset_t         cv_kernel_stack;  /* per cpu stack */
    /* these fields are read-only after initialization */
    uint                cv_toc;           /* TOC value */
    vm_offset_t         cv_call_slih;     /* address of common call_slih() routine */
    vm_offset_t         cv_dsisr_jt;      /* physical address of DSISR jump
                                             table for alignment exc. handler */
    int                 cv_cache_bs;      /* cache block size in bytes */
    int                 cv_cpu_number_ix; /* cpu number index */
    int                 cv_cpu_number;    /* cpu number */
    struct cpu_vars    *cv_panic_slih;    /* cpu_vars on which to run panic_slih() */
    int                 cv_pad[6];        /* cache line alignment */
} cpu_vars_t;
______________________________________
This structure is initialized in the ppc_init_stacks() routine. The fields cv_cpu_number, cv_cpu_number_ix, cv_toc, cv_call_slih and cv_panic_slih all hold constant values and can be thought of as read-only after ppc_init_stacks() is complete. The other fields are dynamic. At the time of ppc_init_stacks(), there is no notion of user or kernel stacks. To handle exceptions that may come in during this time, the kernel uses the panic stack as its run-time and exception stack. This data structure is accessible to each CPU, with each CPU looking at its own copy. At initialization, the SPRG3 register is made to point to cpu_vars in the ppc_init_stacks() routine. The ppc_saved_state is preallocated and cv_next_ss always points to this area. In user mode, it always points to the current thread's process control block (pcb). The cv_user_ss always points to the current thread's pcb. A pointer to the bottom of the kernel stack of the thread is maintained in cv_kernel_stack.
ppc_saved_state
This structure describes the machine state as saved upon kernel entry. One structure lives in the pcb of the thread and holds the user state saved at the initial transition from user to kernel mode. Additional structures represent nested exceptions or interrupts and live on the kernel stack, the first of which lives just above the ppc_kernel_state. The state save structures are preallocated. The variable cv_next_ss in the per-CPU structure always points to the save area that will be used at the next fault or interrupt. While running in user mode, it points to the pcb. The state save structures are linked in a chain to enable stack tracking.
______________________________________
typedef struct ppc_saved_state {
    int regs[32];   /* user's GPRs */
    int iar;        /* user's instruction address register */
    int msr;        /* user's machine state register */
    int cr;         /* user's condition register */
    int lr;         /* user's link register */
    int ctr;        /* user's count register */
    int xer;        /* user's fixed-point exception register */
    int mq;         /* user's mq register */
    int ss_chain;   /* pointer to previous exception in chain */
    int ss_reason;  /* argument to pr_slih() */
    int ss_vaddr;
    int ss_extra;   /* padding bytes to double word boundary */
} *ppc_state_t;

/* floatsave - floating point state structure */
struct floatsave {
    double fp_regs[32];  /* 32 64-bit floating point user registers */
    long   fp_dummy;     /* 32 bits of padding so fp_scr can be stfd/lfd */
    long   fp_scr;       /* floating point status and control register */
};
______________________________________
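The chaining of save areas described above can be illustrated with a minimal sketch. It assumes a simplified stand-in for ppc_saved_state that keeps only the instruction address and the chain link; the type saved_state_sketch, the helper ss_chain_depth() and the bound SS_CHAIN_MAX_DEPTH are hypothetical names introduced for illustration and do not appear in the listing above. The link is modeled as a pointer rather than an int so that the sketch also compiles on hosts where pointers are wider than 32 bits.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bound, used only to catch a corrupted (cyclic) chain. */
#define SS_CHAIN_MAX_DEPTH 64

/* Simplified stand-in for ppc_saved_state (assumption: only two fields). */
struct saved_state_sketch {
    int iar;                              /* faulting instruction address */
    struct saved_state_sketch *ss_chain;  /* previous save area, or NULL */
};

/* Walk from the newest save area back to the oldest one and return the
 * number of save areas visited, i.e. the current nesting depth. */
int ss_chain_depth(const struct saved_state_sketch *newest)
{
    int depth = 0;
    const struct saved_state_sketch *s;

    for (s = newest; s != NULL; s = s->ss_chain) {
        depth++;
        assert(depth <= SS_CHAIN_MAX_DEPTH);
    }
    return depth;
}
```

A debugger or panic routine could walk the chain in the same way to print the iar of every nested exception.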
pcb_t
This structure holds the user-mode machine state associated with a particular thread. The ppc_saved_state structure is filled in on transition from user to kernel mode. The floatsave structure is filled in lazily when some other thread needs the floating point unit.
______________________________________
typedef struct pcb {
    struct ppc_saved_state   pcb_ss;
    struct floatsave         pcb_fp;
    struct ppc_machine_state ims;
} *pcb_t;
______________________________________
fh_save_area
This structure provides the scratch area for storing registers GPR25-31, the state save and restore registers SRR0 and SRR1, and the registers LR, CR and XER. It is allocated in the CPU_VARS structure to be used by all the fast handlers that do not use the four-step exception processing scheme. The alignment exception handler is the only fast handler that uses this area in its own FLIH.
______________________________________
struct fh_save_area {
    long fh_scratch0;
    long fh_scratch1;
    long fh_scratch2;
    long fh_scratch3;
    long fh_gpr25;
    long fh_gpr26;
    long fh_gpr27;
    long fh_gpr28;
    long fh_gpr29;
    long fh_gpr30;
    long fh_gpr31;
    long fh_srr0;
    long fh_srr1;
    long fh_lr;
    long fh_cr;
    long fh_xer;
};
______________________________________
Global variables
The following system global variables are used primarily in exception processing:
1. active_threads[ ]
2. active_stacks[ ]
They have one element for each CPU on the system. Each element points to the current thread and stack on that CPU, e.g. active_threads[0] is the current thread on CPU 0 (refer to the current_thread() macro definition).
[Note: Floating point exception handlers make use of another kernel variable called "float_thread", which points to the thread that has access to the floating point hardware.]
Floating Point Exceptions
Float Unavailable
Introduction
This section briefly explains the floating point related exception scenarios in the PowerPC architecture. It also explains how the microkernel perceives such exceptions in the context of an executing thread, and furnishes PowerPC architecture specific details, such as the bit settings, for each of the exception types.
PowerPC Information
A floating point unavailable exception occurs when no higher priority exception exists, an attempt is made to execute a floating-point instruction (including floating-point load, store and move instructions), and the floating point available bit in the MSR is disabled (MSR[FP]=0). The register settings for floating point unavailable exceptions are given below.
______________________________________
SRR0   Set to the effective address of the instruction that caused
       the exception
SRR1   0-15   cleared
       16-31  loaded from bits 16-31 of the MSR
MSR    EE   0
       PR   0             FE1  0
       FP   0             EP   not altered
       ME   not altered   IT   0
       FE0  0             DT   0
______________________________________
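The register settings in the table can be expressed as a small sketch. It assumes the 32-bit PowerPC convention of numbering bits from the most significant end (bit 0 is the MSB). The bit positions of FE0 and FE1 (20 and 23) are stated later in this description; the positions used for EE, PR, FP, IT and DT are taken from the PowerPC architecture as commonly documented, not from this text, and the second floating-point exception mode bit in the table is taken here to be FE1. The type fp_unavail_entry and the function fp_unavailable_entry() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* 32-bit PowerPC numbers bits from the most significant end (bit 0 = MSB). */
#define PPC_BIT(n)  (UINT32_C(1) << (31 - (n)))

#define MSR_EE   PPC_BIT(16)  /* external interrupt enable (assumed position) */
#define MSR_PR   PPC_BIT(17)  /* problem state (assumed position) */
#define MSR_FP   PPC_BIT(18)  /* floating point available (assumed position) */
#define MSR_FE0  PPC_BIT(20)  /* FP exception mode 0 */
#define MSR_FE1  PPC_BIT(23)  /* FP exception mode 1 */
#define MSR_IT   PPC_BIT(26)  /* instruction translation (assumed position) */
#define MSR_DT   PPC_BIT(27)  /* data translation (assumed position) */

/* Hypothetical container for the registers the hardware sets on entry. */
struct fp_unavail_entry {
    uint32_t srr0;
    uint32_t srr1;
    uint32_t msr;
};

/* Model the register state produced when a floating point unavailable
 * exception is taken with the given MSR at the given instruction address. */
struct fp_unavail_entry fp_unavailable_entry(uint32_t msr, uint32_t fault_addr)
{
    struct fp_unavail_entry e;

    assert((msr & MSR_FP) == 0);          /* only raised when MSR[FP] = 0 */
    e.srr0 = fault_addr;                  /* address of faulting instruction */
    e.srr1 = msr & UINT32_C(0x0000FFFF);  /* bits 0-15 cleared, 16-31 from MSR */
    e.msr  = msr & ~(MSR_EE | MSR_PR | MSR_FP | MSR_FE0 | MSR_FE1 |
                     MSR_IT | MSR_DT);    /* ME and EP are not altered */
    return e;
}
```

Because SRR1 keeps the old low half of the MSR, the handler can later restore the interrupted context by copying SRR1 back into the MSR on return.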
This exception type is vectored at 0x0800 in the exception vector table. When a floating point unavailable exception is taken, instruction execution resumes at offset 0x00800 from the physical base address indicated by MSR[EP].
Microkernel Information
lazy context restore policy
The floating point hardware register set is not given to any user level thread unless the thread needs it to perform floating point operations. Thus, for any non-floating-point thread, the context does not include the floating point hardware state. This reduces the amount of information to be handled at each context switch. There are 32 64-bit floating point registers and a 32-bit floating point status and control register in 32-bit PowerPC processor implementations. These add up to 260 bytes of information that would otherwise be saved and restored during a context switch even if the threads do not use them. A thread, when it is created, is given a context save area addressed as its PCB. The PCB consists of integer context and float context save areas. A newly created thread scheduled for execution does not have a float save area addressed by its pcb. The thread's MSR (machine state register) has a bit that indicates the availability of the floating point hardware to the thread. It is initially set to zero. During the course of a thread's execution, at the first attempt by the thread to execute a floating point instruction, the float unavailable exception occurs. This in turn causes the microkernel's floating point exception handler to be invoked. The function of this exception handler is to make floating point available to the thread that required it. The exception handler dynamically allocates space for saving the thread's floating point registers, initializes the registers, and sets the float-available bit to 1 in the thread's machine state register (MSR). Once a thread obtains floating point context, it continues to have it for the remainder of its life.
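The first-use grant described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the microkernel's handler: thread_sketch, floatsave_sketch and grant_float_context() are hypothetical names, calloc() stands in for the kernel allocator, and the MSR[FP] mask assumes bit 18 counted from the most significant end. Note that 32 registers of 8 bytes plus a 4-byte FPSCR give the 260 bytes quoted above; the sketch, like the floatsave structure, adds 4 bytes of padding.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define MSR_FP  (UINT32_C(1) << 13)  /* MSR bit 18 counted from the MSB (assumed) */

/* Stand-in for the floatsave structure, using fixed-width types. */
struct floatsave_sketch {
    double   fp_regs[32];  /* 32 64-bit floating point registers */
    uint32_t fp_dummy;     /* padding so fp_scr can be moved as a doubleword */
    uint32_t fp_scr;       /* floating point status and control register */
};

/* Hypothetical thread record: saved MSR image plus a lazily allocated area. */
struct thread_sketch {
    uint32_t msr;
    struct floatsave_sketch *fp;  /* NULL until the first FP instruction */
};

/* Give the thread a floating point context on its first FP instruction.
 * Returns 0 on success, -1 if the save area could not be allocated. */
int grant_float_context(struct thread_sketch *t)
{
    assert(t != NULL);
    if (t->fp == NULL) {
        t->fp = calloc(1, sizeof *t->fp);  /* zero-filled save area */
        if (t->fp == NULL)
            return -1;
    }
    t->msr |= MSR_FP;  /* the thread keeps FP for the rest of its life */
    return 0;
}
```

After the grant, the faulting floating point instruction can simply be re-executed, since the saved MSR image now has the float-available bit set.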
The flow chart of FIG. 6 illustrates the floating point exception handler 192, which is part of the PowerPC exception handler 190. The flow diagram of FIG. 6 starts by creating a thread in the memory 102 without the floating point context indication in the thread's process control block (pcb). In accordance with the invention, this prevents the copying of the floating point registers of the processor 110 on which the thread has been running, when its execution is terminated after a fault or interrupt. While executing during a first session, only fixed point (integer) operations are carried out by the thread in the processor, using the plurality of fixed point registers of the processor 110. When a fault or an interrupt occurs terminating the first session (context switch time), the thread is removed from execution in the processor 110 and the contents of the fixed point registers are stored in the thread's process control block. The contents of the processor's machine state register (MSR), including the state of the current floating point context in the processor, are stored in the thread's process control block. In response to the stored indication of no floating point context, the contents of the plurality of floating point registers in the processor are not stored in the thread's process control block. This significantly improves the overall performance of the system. Later, when the thread's execution is restored during a second session, either in the same processor 110 or in an alternate processor 112, the contents of the process control block are examined to determine the state of the floating point context indication. Since the indication is that the thread does not have the floating point context, only fixed point operations are to be carried out with the thread in the processor using the plurality of fixed point registers.
Thus, there is no attempt to copy back values from the thread's process control block to load into the processor's floating point registers. This provides a significant improvement in the overall performance of the system. The indication that the thread does not have the floating point context is copied back from the thread's process control block to the processor's machine state register. If the sequence of program instructions being run by the thread attempts to execute a floating point instruction during the second session, the floating point exception handler 192 is called by the processor (the instruction is trapped by the microkernel 120). The exception handler 192 stores an alternate indication in the processor's machine state register that the floating point context is available for the thread. This enables the thread to perform floating point operations. The thread then resumes execution of the floating point instruction. If another fault or interrupt occurs, forcing a termination of the execution of the thread in the processor (context switch time), the thread is removed from the processor, terminating the second session. This time, the contents of both the plurality of fixed point registers and the plurality of floating point registers in the processor are stored in the thread's process control block, in response to the alternate indication that the thread is enabled for floating point operations. Thus, only those threads that are performing floating point operations have the floating point registers copied at the termination of the thread's execution session in the processor. The machine state register is copied to the thread's process control block, including the floating point context status.
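The save and restore decisions walked through in this flow can be modeled with a short sketch. It is an illustration only: cpu_sketch and pcb_sketch are hypothetical simplified models of the processor registers and the process control block, and switch_out() and switch_in() are hypothetical names. Each function returns the number of register bytes it copied, which makes the saving for non-floating-point threads visible.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MSR_FP  (UINT32_C(1) << 13)  /* MSR bit 18 counted from the MSB (assumed) */

struct cpu_sketch {   /* hypothetical model of the processor registers */
    uint32_t gpr[32];
    double   fpr[32];
    uint32_t msr;
};

struct pcb_sketch {   /* hypothetical model of the process control block */
    uint32_t gpr[32];
    double   fpr[32];
    uint32_t msr;
};

/* Save state when a thread leaves the processor: fixed point registers and
 * the MSR always, floating point registers only if MSR[FP] is set. */
size_t switch_out(const struct cpu_sketch *cpu, struct pcb_sketch *pcb)
{
    size_t copied = sizeof cpu->gpr;

    assert(cpu != NULL && pcb != NULL);
    memcpy(pcb->gpr, cpu->gpr, sizeof cpu->gpr);
    pcb->msr = cpu->msr;
    if (cpu->msr & MSR_FP) {
        memcpy(pcb->fpr, cpu->fpr, sizeof cpu->fpr);
        copied += sizeof cpu->fpr;
    }
    return copied;
}

/* Restore state when a thread resumes, on the same or another processor. */
size_t switch_in(struct cpu_sketch *cpu, const struct pcb_sketch *pcb)
{
    size_t copied = sizeof pcb->gpr;

    assert(cpu != NULL && pcb != NULL);
    memcpy(cpu->gpr, pcb->gpr, sizeof pcb->gpr);
    cpu->msr = pcb->msr;
    if (pcb->msr & MSR_FP) {
        memcpy(cpu->fpr, pcb->fpr, sizeof pcb->fpr);
        copied += sizeof pcb->fpr;
    }
    return copied;
}
```

Because the floating point state travels with the pcb rather than with a processor, the same switch_in() applies whether the thread resumes on processor 110 or processor 112.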
Later, when the thread's execution is restored during a third session, either in the same processor 110 or in an alternate processor 112, the contents of the process control block are examined to determine the state of the floating point context indication. Since the indication is that the thread does have the floating point context, both floating point and fixed point operations are to be carried out with the thread in the processor using the plurality of floating point and fixed point registers. Thus, the microkernel 120 copies back from the thread's process control block the values to load into the processor's floating point registers, in addition to the values to load into the processor's fixed point registers. The indication that the thread does have the floating point context is copied back from the thread's process control block to the processor's machine state register. Thus, only those threads that are performing floating point operations have values copied out of their process control blocks to load into the processor's floating point registers at the restoration of execution of the thread in the processor. In an alternate embodiment of the invention, if a processor A 110 in FIG. 1 has only one thread executing within it that has the floating point context, then the contents of that processor's floating point registers do not need to be saved when that thread is removed from the processor. If all other threads executing within that processor 110 are not using the floating point registers, the values loaded into those registers by the sole floating point thread remain untouched. In accordance with the invention, each processor maintains a data structure, 196A for processor A 110 and 196B for processor B 112, in the memory 102 of FIG. 1. The data structure 196A stores the name of the sole floating point thread that is executing in the respective processor A 110.
Similarly, the data structure 196B stores the name of the sole floating point thread that is executing in the respective processor B 112. Then, when a second thread having a floating point context is to begin execution in the processor A 110, the processor A 110 calls the floating point exception handler 192. The floating point exception handler 192 then copies the contents of the floating point registers of processor A 110, gets the name of the first thread from the data structure 196A, and saves the copied values in the process control block for the named first thread. Then the second thread can begin execution in the processor A 110, and can load its own values into the floating point registers of processor A 110. In this manner, the contents of the floating point registers of the processor A 110 need not be saved at all if there is only one floating point thread executing in that processor A 110. For multiprocessor configurations, when the first thread is to resume execution in a different processor B 112, the floating point exception handler 192 is called to copy the contents of the floating point registers of the first processor A 110 to those of the second processor B 112, if the first thread was the sole floating point thread that was executing in the first processor A 110.
Multiprocessing and performance
The lazy context restore policy is multiprocessor enabled, in the sense that the floating point context is associated with the executing thread as opposed to being tied to a processor. Earlier systems, by contrast, adopted a lazy float context switch policy whereby only one thread owns the floating point hardware at any time. In such a scheme, when a thread traps into the kernel to get the float context, the trap handler allocates the floating point save area and provides it to the thread. It also designates the thread as being the float thread of this processor.
In the event that another thread needs to use the floating point hardware, it traps into the kernel. This time the trap handler designates the new thread as the float thread for this processor, after saving the old float thread's floating point registers and restoring the new thread's floating point registers. Even in uniprocessor systems, lazy context switch can be expensive, particularly in float intensive applications, since the overall context switch time is also increased by the exception handling path. With the lazy context restore policy, by contrast, threads keep the float context once they have obtained it, so all subsequent context switches include both integer and floating point state. For a multiprocessor system, the concept of tying a float thread to a processor makes it difficult to obtain and move the state information across processors. With the lazy float restore policy, it is guaranteed that if a thread is a float thread, it has its latest floating point state information when it is ready to run on any processor.
External Interface Details
This section explains the interface details of the SLIH for the float_unavailable exception. The name of the SLIH is float_unavailable(). It is invoked as follows:
void float_unavailable(struct ppc_saved_state *state)
It expects the ppc_saved_state to be passed to it by the pre-second level interrupt handler call_slih() routine.
Data structures
The following global data structure is affected by this routine: the thread data structure of the current thread in which this fault has occurred. This routine essentially changes the thread's machine state by changing the MSR bit settings in the thread's pcb. It also restores the thread's floating point context by loading the floating point registers from the thread's float save area in the pcb.
Functional Description
Float Unavailable
Function name: float_unavailable()
Purpose: To handle the float_unavailable exception that occurred in a thread.
Prototype: void float_unavailable(struct ppc_saved_state *state);
Input: The machine state as saved upon kernel entry.
Output: none
Return values: none
Error codes:
Routines invoked: panic(), float_load(), float_store()
Logic:
If the exception happened in privileged/supervisor mode, then panic and quit.
Fetch the current thread.
Allocate the float save area (260 bytes) and make the thread's float save area pointer point to it.
Initialize all the registers.
Turn on MSR[FP] in the thread's pcb.
Load the floating point registers from the current thread's float save area.
Errors and Messages
1) Floating point unavailable in kernel mode. Since the kernel does not make use of floating point, this fault is not expected to occur in kernel mode.
Floating Point Program Exceptions
Introduction
This section describes all the floating point program exceptions that can occur in the PowerPC architecture and how those exceptions are processed in the microkernel. It provides functional descriptions of all the routines that are related to floating point enabled program exception handling.
PowerPC information
Control over enabling and disabling the floating point program exceptions is provided in the PowerPC hardware both in the machine state register and in the Floating Point Status and Control Register (FPSCR). Both registers have floating point exception enable bits that need to be set to recognize and process these exceptions. FIG. 7 illustrates the bit significance of the FPSCR register. A floating point program exception occurs when no higher priority exception exists and the following condition, which corresponds to bit settings in SRR1, occurs during execution of an instruction.
A system floating point enabled exception is generated when the following condition is met: (MSR[FE0] | MSR[FE1]) & FPSCR[FEX] is 1. FPSCR[FEX] is set by the execution of a floating point instruction that causes an enabled exception, or by the execution of a "move to FPSCR" type instruction that sets an exception bit when its corresponding enable bit is set. In the MPC601, all floating point enabled exceptions taken clear SRR1[15] to indicate that the address in SRR0 points to the instruction that caused the exception, because all floating point enabled exceptions are handled in a precise manner on the MPC601. Floating point exceptions are signalled by condition bits set in the floating point status and control register. They can cause the system floating point enabled exception error handler to be invoked. The following conditions that can cause program exceptions are detected by the processor. These conditions may occur during execution of floating point arithmetic instructions. The corresponding bits set are indicated in parentheses.
I) Invalid floating point operation exception (VX)
   i) sNaN (VXSNAN)
   ii) Inf - Inf (VXISI)
   iii) Inf/Inf (VXIDI)
   iv) zero/zero (VXZDZ)
   v) Inf*zero (VXIMZ)
   vi) Illegal compare (VXVC)
II) Software request condition (VXSOFT)
III) Illegal integer convert
IV) Zero divide
V) Overflow
VI) Underflow
VII) Inexact
The exception bit indicates occurrence of the corresponding condition. If a floating point exception occurs, the corresponding enable bit governs the results produced by the instruction and, in conjunction with bits FE0 and FE1, whether and how the system floating point enabled exception handler is invoked. When an exception occurs, the instruction execution may be suppressed or a result may be delivered, depending on the exception type and on whether the exception is enabled.
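The enabling condition stated at the start of this passage, (MSR[FE0] | MSR[FE1]) & FPSCR[FEX], can be written as a small predicate. This is a sketch under stated assumptions: the FE0 and FE1 positions (bits 20 and 23) come from this description, while the position of FEX (FPSCR bit 1, counted from the most significant end) is taken from the PowerPC architecture as commonly documented and is not stated in this text. The function name fp_enabled_exception_pending() is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* 32-bit PowerPC numbers bits from the most significant end (bit 0 = MSB). */
#define PPC_BIT(n)  (UINT32_C(1) << (31 - (n)))

#define MSR_FE0    PPC_BIT(20)  /* FP exception mode 0 */
#define MSR_FE1    PPC_BIT(23)  /* FP exception mode 1 */
#define FPSCR_FEX  PPC_BIT(1)   /* FP enabled exception summary (assumed position) */

/* True when a system floating point enabled exception would be generated:
 * at least one of FE0/FE1 is set in the MSR and FEX is set in the FPSCR. */
bool fp_enabled_exception_pending(uint32_t msr, uint32_t fpscr)
{
    bool mode_on = (msr & (MSR_FE0 | MSR_FE1)) != 0;
    bool fex_set = (fpscr & FPSCR_FEX) != 0;

    return mode_on && fex_set;
}
```

With both FE0 and FE1 cleared the predicate is always false, which corresponds to the mode in which floating point exceptions are ignored and default results are delivered.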
Instruction execution is suppressed for
i) enabled illegal floating point operation
ii) enabled zero divide
A default result is generated and written to the destination specified by the instruction causing the exception for
i) disabled and enabled overflow
ii) disabled and enabled underflow
iii) disabled and enabled inexact
iv) disabled zero divide
v) disabled illegal floating point instruction
In the PowerPC architecture, setting an enable bit causes the generation of the result value specified in the IEEE standard for the "trap enabled" case; if the enable bit is 0, the default value specified for the "trap disabled" case is generated. The "trap disabled" case is when both FE0 and FE1 are cleared in the MSR and all the enable bits are cleared in the FPSCR. If the program exception handler should notify the software that a given exception condition has occurred, the corresponding FPSCR enable bit must be set and a mode other than Ignore exception mode should be selected. In the MPC601, FE0 and FE1 are ORed. Unless both are cleared, the MPC601 operates in precise mode. The MSR register bits FE0 and FE1 (bit positions 20 and 23) both need to be on to enable the processor to execute in "Synchronous precise mode". This ensures that all the floating point program exceptions are recognized and the floating point exception handler is invoked if they are individually enabled through the control bits of the FPSCR. The standard default results may be satisfactory under most circumstances. This, coupled with the performance optimization objectives, makes the Synchronous precise mode optional, to be used only for debugging and specialized applications. The program exceptions are vectored at 0x0700 in the vector table. SRR0 has the effective address of the instruction that caused the exception.
______________________________________
SRR1   0-10   cleared
       11     set to indicate a floating point enabled program exception
       12-15  cleared
       16-31  loaded from bits 16-31 of the MSR at the time the
              exception occurred
______________________________________
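The SRR1 layout in the table lends itself to a short sketch of how a handler could recognize a floating point enabled program exception. Bits are numbered from the most significant end, so bit 11 corresponds to the mask 0x00100000. The names program_exception_srr1() and is_fp_enabled_program_exception() are hypothetical and introduced for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* 32-bit PowerPC numbers bits from the most significant end (bit 0 = MSB). */
#define PPC_BIT(n)  (UINT32_C(1) << (31 - (n)))

#define SRR1_FP_ENABLED  PPC_BIT(11)  /* floating point enabled program exception */

/* Build the SRR1 image for a floating point enabled program exception:
 * bits 0-10 and 12-15 cleared, bit 11 set, bits 16-31 copied from the MSR. */
uint32_t program_exception_srr1(uint32_t msr)
{
    return SRR1_FP_ENABLED | (msr & UINT32_C(0x0000FFFF));
}

/* Test whether a saved SRR1 value flags a floating point enabled exception. */
bool is_fp_enabled_program_exception(uint32_t srr1)
{
    return (srr1 & SRR1_FP_ENABLED) != 0;
}
```

A first level handler could use such a test to choose the reason code it passes on to the second level handler.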
Microkernel information
Once a thread attains floating point capability, it can, while executing floating point instructions, cause synchronous floating point program exceptions if it is enabled for such faults. The system pr_slih handler is invoked by the FLIH for many exception conditions, including program exceptions. The floating point enabled exceptions are such exceptions and are handled by the pr_slih routine. Since these exceptions are caused by program instructions, it is adequate, at the kernel level, for the system pr_slih handler to obtain the current floating point status of the faulting thread, format a floating point enabled program exception message, and report it to the exception server. Additionally, a kernel interface is provided to applications in order to set and get the hardware state for a specific thread within a task. These calls allow individual threads to control and manipulate the register settings and to fetch the status information. These calls are machine specific, since they directly read and write the thread's machine state save area. Actual details of the interface are explained in the following sections.
External Interface
pr_slih function interface
The system pr_slih handler is invoked in case of a floating point program exception as follows:
______________________________________
pr_slih (struct ppc_saved_state *state,
         long srr1,
         long dsisr,
         long dar)
______________________________________
where
state is the machine state as saved upon kernel entry
srr1 is the save/restore register SRR1
dsisr is the DSISR register setting when the exception occurred
dar is the data address register
The pr_slih routine formats an exception message and raises an exception to the exception server by calling the exception routine
exception(exc, codes, code_size)
where
exc is the generic exception type
codes is an array of values including register settings and so on
code_size is the number of elements in the codes array
kernel-thread interface
The kernel interface comprises two state related routines, namely
thread_set_state(thread_t thread, int flavor, thread_state_t new_state, uint new_state_count)
where
thread: thread for which the state is to be altered
flavor: machine specific flavor
   PPC_THREAD_STATE: refers to the thread's machine context except FP
   PPC_FLOAT_STATE: refers to the thread's FP context
new_state: new state
count: number of natural storage units for the state set
thread_get_state(thread_t thread, int flavor, thread_state_t new_state, int *new_state_count)
where
thread: thread the state of which is to be obtained
flavor: machine specific flavor
   PPC_STATE_FLAVOR_LIST: list of flavors supported by the ppc implementation
   PPC_THREAD_STATE: refers to the thread's machine context except FP
   PPC_FLOAT_STATE: refers to the thread's FP context
new_state: new state
count: number of natural storage units for the state set
Data structures
floating point program exception handler: pr_slih routine
The floating point program exception handling portion of the pr_slih handler deals with the following data structures:
codes: the code array passed to the exception call
code_size: number of elements that are present in the code array
The code array is filled as follows:
codes[0]=EXC_FLOAT_ARITHMETIC; /* defined in machine specific exception.h include file */
codes[1]=EA; /* effective address of the instruction that caused the exception */
code_size=2;
floating point kernel interface [thread_set_state() & thread_get_state()]
1. thread_set_state with ppc_thread_state flavor
This sets the thread's machine state in the thread's PCB (thread->pcb->pcb_ss). This is the ppc_saved_state structure of the thread's pcb, and it is modified with the state information that the user has provided. The ppc_thread_state structure that is used as a handle to pass the state information is defined in machine specific include files.
2. thread_set_state with ppc_float_state flavor
This sets the thread's machine float state in the thread's PCB (thread->pcb->pcb_fp). This is the floatsave area of the thread's pcb, which is set to the user provided state information. The ppc_float_state structure that is used as a handle to pass the state information is defined in machine specific include files.
3. thread_get_state with ppc_thread_state flavor
This does not alter the thread's data structure. It simply copies the thread's machine state information from its pcb to the structure passed by the user.
4. thread_get_state with ppc_float_state flavor
This routine in turn calls the float_get_state() routine, which synchronizes the floating point state information if the requesting thread is the floating thread. That is, it stores the floating point hardware registers into the thread's pcb floatsave area before it passes that information to the user, consistent with the lazy floatsave policy. It turns the "FP available" bit in the MSR off.
[Note: All the above routines call thread_hold() to suspend the thread temporarily while modifying the thread's data structures, and call thread_release() after they have finished modifying the state information.]
Functional Description
pr_slih handler
Function name: pr_slih()
Purpose: The pr_slih handler is invoked for multiple exception conditions, so its control flow is altered based on the reason passed to it by the FLIH. This section describes the floating point program exception specific logic of the pr_slih handler.
______________________________________
Prototype: void pr_slih(struct ppc_saved_state *state,
                        long srr1,
                        long dsisr,
                        long dar)
______________________________________
Input: state: the machine state as saved upon kernel entry
srr1: the machine status save/restore register SRR1
dsisr: DSISR register settings for the exception
dar: the data address register
output: none return values: none error codes: routines invoked: panic(), float.sub.-- read.sub.-- fpscr(), exception() Logic
______________________________________
begin
if (problem state is supervisor mode)
then
panic();
end
else
begin
switch (reason)
begin
case:
. . .
case:
. . .
case FP.sub.-- PROGRAM.sub.-- EXCEPTION:
begin
set exception to EXC.sub.-- ARITHMETIC;
set codes[0] to
EXC.sub.-- PPC.sub.-- FLOAT.sub.-- ARITHMETIC;
set codes[1] = state->iar;
code.sub.-- size = 2;
break;
end
default:
end
end
call exception(exception,codes,code.sub.-- size);
/* to raise an exception to the exception server in
the exception port
*/
end
______________________________________
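The floating point branch of the pr.sub.-- slih logic above can be sketched in C as follows. This is a minimal illustration only: the numeric constants, the one-field `ppc_saved_state_sketch` structure, the `exc_report` structure, and the function name `classify_program_exception` are assumptions made for the example (identifiers are written with plain underscores so the code compiles), and the actual call to exception() is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical values: the real constants come from the machine specific
 * exception header, which the text does not reproduce. */
enum { EXC_ARITHMETIC = 3, EXC_PPC_FLOAT_ARITHMETIC = 1 };
enum { FP_PROGRAM_EXCEPTION = 1 };

/* Only the field used by this sketch; the real saved state is much larger. */
struct ppc_saved_state_sketch {
    uint32_t iar;               /* address of the faulting instruction */
};

/* What the handler would pass on to exception(). */
struct exc_report {
    int      exception;
    uint32_t codes[2];
    int      code_size;
};

/* Fills 'out' and returns 1 when 'reason' is the floating point program
 * exception; returns 0 for reasons this sketch does not cover. */
static int classify_program_exception(int reason,
                                      const struct ppc_saved_state_sketch *state,
                                      struct exc_report *out)
{
    assert(state != NULL && out != NULL);

    switch (reason) {
    case FP_PROGRAM_EXCEPTION:
        out->exception = EXC_ARITHMETIC;
        out->codes[0]  = EXC_PPC_FLOAT_ARITHMETIC;
        out->codes[1]  = state->iar;    /* codes[1] = state->iar */
        out->code_size = 2;
        return 1;
    default:
        return 0;
    }
}
```

Returning the report in a structure rather than calling exception() directly keeps the sketch testable outside a kernel; a real handler would forward the three values to the exception server.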
Kernel Interface The kernel interface comprises the following major routines in the thread library:
1. thread.sub.-- get.sub.-- state: to get the current state information for the thread for a machine specific flavor
2. thread.sub.-- set.sub.-- state: to set the current state information for the thread for a machine specific flavor
These two calls provide a generic interface to the outside world by taking specific machine flavors and the corresponding state information as parameters. They in turn call machine specific routines that alter the pcb structure for the thread. These are 1. thread.sub.-- setstatus() and 2. thread.sub.-- getstatus(). They are explained in the following sections.
thread.sub.-- set.sub.-- state purpose: To provide a generic thread interface to deal with machine dependent hardware specific flavors and set the required state of the thread according to the flavor. prototype: kern.sub.-- return.sub.-- t thread.sub.-- set.sub.-- state(thread.sub.-- t thread,int flavor,thread.sub.-- state.sub.-- t new.sub.-- state,uint new.sub.-- state.sub.-- count)
______________________________________
Input
thread: current thread's data structure
flavor: machine flavor
PPC.sub.-- FLOAT.sub.-- STATE
PPC.sub.-- THREAD.sub.-- STATE
(These are the only two flavors
that are currently supported)
State: The machine state corresponding to
the machine flavor
count: byte count of state information (fixed for each flavor)
______________________________________
output: none return values: KERN.sub.-- SUCCESS if successful KERN.sub.-- INVALID.sub.-- VALUE if the flavor passed is not legal flavor value error codes: none routines invoked:thread.sub.-- setstatus Logic
______________________________________
Begin
if (thread eq NULL OR thread is the current
thread executing)
return (KERN.sub.-- INVALID.sub.-- ARGUMENT);
call thread.sub.-- hold; /* the thread is suspended */
call thread.sub.-- do.sub.-- wait; /* wait until thread
enters `STOPPED` state */
call thread.sub.-- setstatus;/* call machine specific
setstatus routine */
call release.sub.-- thread;
end
______________________________________
thread.sub.-- get.sub.-- state purpose: To provide generic thread interface to deal with machine dependent hardware specific flavors and get the required state of the thread according to the flavor prototype: kern.sub.-- return.sub.-- t thread.sub.-- get.sub.-- state(thread.sub.-- t thread,int flavor,thread.sub.-- state.sub.-- t new.sub.-- state ,uint *old.sub.-- state.sub.-- count) Input: thread: current thread's data structure
______________________________________
flavor: machine flavor
PPC.sub.-- FLOAT.sub.-- STATE
PPC.sub.-- THREAD.sub.-- STATE
(These are the only two flavors
that are currently supported)
______________________________________
State: The machine state corresponding to the machine flavor count: byte count of state information (fixed for each flavor) output: none return values: KERN.sub.-- SUCCESS if successful KERN.sub.-- INVALID.sub.-- VALUE if the flavor passed is not legal flavor value error codes: none routines invoked:thread.sub.-- getstatus Logic
______________________________________
begin
if (thread eq NULL OR thread is the current
thread executing)
return (KERN.sub.-- INVALID.sub.-- ARGUMENT);
call thread.sub.-- hold; /* the thread is suspended */
call thread.sub.-- do.sub.-- wait; /* wait until thread
enters `STOPPED` state */
call thread.sub.-- getstatus;/* call machine specific
getstatus routine */
call release.sub.-- thread;
end
______________________________________
thread.sub.-- setstatus purpose: The thread.sub.-- setstatus routine based on the flavor requested, would appropriately set the registers in the machine state associated with the thread. Since this section particularly dwells on the floating point state, it provides only the floating point pertinent information Prototype: kern.sub.-- return.sub.-- t thread.sub.-- setstatus(thread.sub.-- t thread, int flavor,thread.sub.-- state.sub.-- t tstate,uint count)
______________________________________
Input:
thread: current thread's data structure
flavor: machine flavor
PPC.sub.-- FLOAT.sub.-- STATE
PPC.sub.-- THREAD.sub.-- STATE
(These are the only two flavors
that are currently supported)
State: The machine state corresponding
to the machine flavor
count: byte count of state information (fixed for each flavor)
______________________________________
output: none return values: KERN.sub.-- SUCCESS if successful KERN.sub.-- INVALID.sub.-- VALUE if the flavor passed is not legal flavor value error codes: none routines invoked: float.sub.-- set.sub.-- state Logic
______________________________________
begin
switch (flavor)
begin
case PPC.sub.-- THREAD.sub.-- STATE:
. . .
case
PPC.sub.-- FLOAT.sub.-- STATE:
begin
if (count is not equal to
PPC.sub.-- FLOAT.sub.-- STATE.sub.-- COUNT)
return (KERN.sub.-- INVALID.sub.-- VALUE);
return (float.sub.-- set.sub.-- state
(thread,(struct PPC.sub.-- float.sub.-- state *)tstate));
end
default:
. . .
end
end
______________________________________
thread.sub.-- getstatus purpose: The thread.sub.-- getstatus routine based on the flavor requested, would appropriately get the registers in the machine state associated with the thread. Since this section particularly dwells on the floating point state, it provides only the floating point pertinent information Prototype: kern.sub.-- return.sub.-- t thread.sub.-- getstatus(thread.sub.-- t thread, int flavor,thread.sub.-- state.sub.-- t tstate,uint* count)
______________________________________
Input:
thread: current thread's data structure
flavor: machine flavor
PPC.sub.-- STATE.sub.-- FLAVOR.sub.-- LIST
PPC.sub.-- FLOAT.sub.-- STATE
PPC.sub.-- THREAD.sub.-- STATE
(These are the only flavors that are currently
supported)
State: The machine state corresponding to the machine
flavor
count: byte count of state information (fixed for each flavor)
______________________________________
output: The state information requested the byte count of the state information return values: KERN.sub.-- SUCCESS if successful KERN.sub.-- INVALID.sub.-- VALUE if the flavor passed is not legal flavor value error codes: none routines invoked: float.sub.-- get.sub.-- state Logic
______________________________________
begin
switch (flavor)
begin
case THREAD.sub.-- STATE.sub.-- FLAVOR.sub.-- LIST:
if (count <1)
return (KERN.sub.-- INVALID.sub.-- ARGUMENT);
tstate[0] = PPC.sub.-- THREAD.sub.-- STATE;
tstate[1] = PPC.sub.-- FLOAT.sub.-- STATE;
*count = 2;
break;
case PPC.sub.-- THREAD.sub.-- STATE:
. . .
case PPC.sub.-- FLOAT.sub.-- STATE:
begin
if (count is < PPC.sub.-- FLOAT.sub.-- STATE.sub.-- COUNT)
return (KERN.sub.-- INVALID.sub.-- VALUE);
*count = PPC.sub.-- FLOAT.sub.-- STATE.sub.-- COUNT;
return (float.sub.-- get.sub.-- state(thread,(struct PPC.sub.-- float.sub.-- state *)tstate));
end
default:
. . .
end
end
______________________________________
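The flavor dispatch in thread.sub.-- getstatus can be sketched as below. This is an illustration under stated assumptions rather than the kernel's code: the return code and flavor values, the layout of `ppc_float_state_sketch`, and the `pcb_sketch` type are hypothetical, and the count is taken in units of `unsigned int` because the flavor list case writes one entry per unit. The sketch also requires room for both list entries, which is stricter than the `count < 1` test in the listing.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int kern_return_t;

/* Hypothetical numeric values for the return codes and flavors. */
enum { KERN_SUCCESS = 0, KERN_INVALID_ARGUMENT = 4, KERN_INVALID_VALUE = 18 };
enum { THREAD_STATE_FLAVOR_LIST = 0, PPC_THREAD_STATE = 1, PPC_FLOAT_STATE = 2 };

/* Assumed layout of the float state handed to the user. */
struct ppc_float_state_sketch {
    double       fpregs[32];
    unsigned int fpscr;
    unsigned int pad;
};

#define PPC_FLOAT_STATE_COUNT \
    (sizeof(struct ppc_float_state_sketch) / sizeof(unsigned int))

/* Stand-in for the part of the pcb that holds the floatsave area. */
struct pcb_sketch {
    struct ppc_float_state_sketch fp;
};

static kern_return_t getstatus_sketch(const struct pcb_sketch *pcb, int flavor,
                                      unsigned int *tstate, unsigned int *count)
{
    assert(pcb != NULL && tstate != NULL && count != NULL);

    switch (flavor) {
    case THREAD_STATE_FLAVOR_LIST:
        if (*count < 2)                     /* need room for both entries */
            return KERN_INVALID_ARGUMENT;
        tstate[0] = PPC_THREAD_STATE;
        tstate[1] = PPC_FLOAT_STATE;
        *count = 2;
        return KERN_SUCCESS;

    case PPC_FLOAT_STATE:
        if (*count < PPC_FLOAT_STATE_COUNT) /* caller's buffer is too small */
            return KERN_INVALID_VALUE;
        memcpy(tstate, &pcb->fp, sizeof pcb->fp);
        *count = (unsigned int)PPC_FLOAT_STATE_COUNT;
        return KERN_SUCCESS;

    default:                                /* unsupported flavor */
        return KERN_INVALID_VALUE;
    }
}
```

The PPC.sub.-- THREAD.sub.-- STATE case is omitted here, as it is in the listing.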
float.sub.-- set.sub.-- state purpose: The float.sub.-- set.sub.-- state routine would appropriately set the floating point registers in the machine state associated with the thread Prototype: kern.sub.-- return.sub.-- t float.sub.-- set.sub.-- state(thread.sub.-- t thread, thread.sub.-- state.sub.-- t tstate) Input: thread: current thread's data structure State: The machine state corresponding to the machine flavor Output: modified thread structure Return values: KERN.sub.-- SUCCESS if successful KERN.sub.-- FAILURE otherwise Error codes: none Routines invoked: none Logic
______________________________________
begin
copy new floating point state information tstate
to the floatsave area
of the thread's pcb:
return (SUCCESS);
end
______________________________________
float.sub.-- get.sub.-- state purpose: The float.sub.-- get.sub.-- state routine gets the floating point machine state associated with the thread. This routine calls the float.sub.-- sync.sub.-- thread() routine to force a lazy save of the floating point state if the thread is the float thread. Prototype: kern.sub.-- return.sub.-- t float.sub.-- get.sub.-- state(thread.sub.-- t thread, thread.sub.-- state.sub.-- t tstate) Input: thread: current thread's data structure State: The machine state corresponding to the machine flavor output: requested tstate return values: KERN.sub.-- SUCCESS if successful KERN.sub.-- FAILURE otherwise error codes: none routines invoked: float.sub.-- sync.sub.-- thread() logic
______________________________________
begin
if the thread is the floating thread
begin
call float.sub.-- sync.sub.-- thread();
end
copy new floating point state information from the floatsave
area to tstate;
return (SUCCESS);
end
______________________________________
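The lazy save performed by float.sub.-- get.sub.-- state can be illustrated with the following C sketch. Everything in it is a simplified assumption: `hw_fpu` stands in for the hardware floating point registers, `float_thread` for the record of which thread currently owns them, and the structure layouts and `_sketch` names are invented for the example. Clearing the availability flag mirrors the text's statement that the "FP available" bit in the MSR is turned off after the synchronization.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed shape of the floating point context. */
struct float_state_sketch {
    double       fpr[32];
    unsigned int fpscr;
};

/* Simplified thread: its floatsave area plus a copy of the FP-available bit. */
struct thread_sketch {
    struct float_state_sketch floatsave;
    bool                      fp_available;
};

static struct float_state_sketch hw_fpu;        /* stands in for the FPU registers */
static struct thread_sketch     *float_thread;  /* current float thread, or NULL   */

/* Force the lazy save: copy the live registers into the floatsave area. */
static void float_sync_thread_sketch(struct thread_sketch *t)
{
    t->floatsave    = hw_fpu;
    t->fp_available = false;    /* the next FP instruction traps again */
}

/* Returns 0 on success, mirroring KERN_SUCCESS. */
static int float_get_state_sketch(struct thread_sketch *t,
                                  struct float_state_sketch *tstate)
{
    assert(t != NULL && tstate != NULL);

    if (t == float_thread)      /* saved copy may be stale: sync it first */
        float_sync_thread_sketch(t);

    *tstate = t->floatsave;     /* hand the saved state to the caller */
    return 0;
}
```

A thread that is not the float thread already has an up to date floatsave area, so the copy alone is sufficient in that case.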
Errors and Messages 1) Program floating point enabled fault in kernel mode: Since the kernel does not make use of floating point, this fault is not expected to occur in kernel mode.
Alignment Exceptions Overview This section illustrates various scenarios associated with an alignment exception in the PowerPC architecture. It deals with alignment exceptions occurring in both little-endian and big-endian modes. It also highlights the differences between the MPC601 processor implementation and the PowerPC architecture, including the PowerPC instructions that are not supported by the 601 processor, and it provides functional descriptions of the alignment exception handler.
MPC-601 Information On the 601 processor, alignment exceptions occur under the following conditions:
i) Any floating-point transfer with a non-memory forced I/O segment
ii) Any transfer that crosses a segment or BAT boundary
iii) A dcbz to a write-through or cache-inhibited area
iv) A lscbx transfer that crosses a page boundary
v) Any misaligned transfer that crosses a page boundary
A misaligned transfer is one in which the data is transferred to an address that is not an integer multiple of the size of the data. A string or multiple transfer is considered aligned if the transfer starts on a word boundary. When operating in big-endian mode, the 601 processor handles all misaligned transfers transparently, except as listed above, by internally breaking the transfer up into several smaller sized transfers. Note that single byte transfers never cause an alignment exception.
Additionally, when the 601 processor is operating in little-endian mode, the following conditions will cause an alignment exception to occur:
i) Any misaligned transfer
ii) Any load or store multiple or string operation
PowerPC Information In addition to the conditions that may cause an alignment exception on the 601 processor, the PowerPC architecture specifies that the following conditions may cause an alignment exception to occur:
i) Any floating-point transfer that is not word-aligned
ii) Any fixed-point doubleword transfer that is not word-aligned
iii) Any lmw, stmw, lwarx, or stwcx. transfer that is not word-aligned
iv) Any ldarx or stdcx. transfer that is not doubleword-aligned
v) Any string transfer that crosses a page boundary
Support for operations not supported by the 601 processor is provided by the exception handler to provide full PowerPC compatibility. This involves adding branch out routines into the dsisr jump table for the new instructions. See Appendix B for a list of PowerPC instructions that may cause alignment exceptions that are not supported by the 601 microprocessor. Code in support of quadword floating-point loads and stores exists but will be conditionally compiled out in the 601 processor implementation. In addition to inserting the appropriate branch out routines into the dsisr jump table, new modules will have to be written to deal with fixed-point doubleword operands and to handle the stfiwx, lwa, lwaux, and lwax instructions. Some instructions are also interpreted differently by the 601 implementation than by a strict PowerPC processor. These differences will have to be determined and analyzed in full detail when moving to a strict PowerPC architecture. As an example, load multiple and load string operations in which the source register is within the range of the destination are permitted on the 601 processor but are considered invalid operations under a strict PowerPC implementation.
Also, non-word-aligned load or store multiples are invalid under the PowerPC architecture but are permitted by the 601 processor. Finally, the lscbx instructions implemented by the 601 processor are not part of the PowerPC architecture, and future implementations will have to decide whether to treat these instructions as illegal instructions or to emulate them to remain backwards compatible. If it is decided that the lscbx instructions will be emulated, then the alignment exception handler code may be used for this purpose.
Microkernel info The goal of the alignment exception handler is to emulate the transfer for the user completely transparently and as quickly as possible. The alignment exception handler will break up the transfer into smaller sized transfers that will not cause alignment exceptions. In the process of emulation, memory protection mechanisms will be enforced as if the user-level program were performing the transfer rather than the supervisor-level exception handler. To enforce this restriction, the exception handler will check for and prevent access to the kernel segments. The exception handler will raise a data access exception for any such potential access. Also, it will be assumed, and verified through a code review of the virtual memory support code, that the Kp and Ks bits for the user segments will always be set to the same value. Note that any and all multiple and string operations will invoke an alignment exception when operating in little-endian mode. As such, these instructions should never be produced by any little-endian PowerPC compiler. These instructions will not be emulated in little-endian mode and will raise an illegal instruction exception instead. Areas of code that are big-endian specific will be enclosed in the following conditional inclusion preprocessor statements:
______________________________________
#if (BYTE.sub.-- ORDER == BIG.sub.-- ENDIAN)
. . .
#endif /* (BYTE.sub.-- ORDER == BIG.sub.-- ENDIAN) */
______________________________________
Areas of code that are little-endian specific will be enclosed in the following conditional inclusion preprocessor statements:
______________________________________
#if (BYTE.sub.-- ORDER == LITTLE.sub.-- ENDIAN)
. . .
#endif /* (BYTE.sub.-- ORDER == LITTLE.sub.-- ENDIAN) */
______________________________________
The BYTE.sub.-- ORDER token will be defined as a compiler/preprocessor command line argument. The value used for BYTE.sub.-- ORDER will be determined through Makefile target selection. The tokens BIG.sub.-- ENDIAN and LITTLE.sub.-- ENDIAN are defined in the header file mach/endian.h. [As indicated in "PowerPC Operating Environment Architecture, Book III", software should not attempt to obtain a reservation for unaligned lwarx (or ldarx) operands, nor to simulate an unaligned stwcx. (or stdcx.). For this reason these events will not be emulated and will raise an alignment exception instead.]
Alignment exception handling--user choice Sometimes specific application and system scenarios require that the system not handle alignment exceptions every time they occur but simply notify the application of them. This is done primarily for performance reasons. The application then has the ability to choose the best way to handle the alignment problems, as opposed to trapping into the kernel. To facilitate this, functionality is provided such that a thread can register itself to be notified by the system when an alignment exception occurs. From then on, the application may choose to switch to byte memory accesses, which will not cause alignment exceptions.
External interface Since the goal of the alignment exception handler is to provide transparent resolution of the exception, no external interface is required. Putting this aside, it may be desirable to provide a mechanism for informing the developer of code that produces misaligned transfers. There are two mechanisms which would be useful for relaying this information to the developer. The first is to insert trace hooks into the exception handler when PowerPC assembly language trace hook macros become available. The second is to implement a special flavor of thread.sub.-- state that indicates that misaligned transfers are to raise an exception.
Only misaligned transfers, not boundary crossings, would cause an exception to be raised. This mechanism will not be implemented as part of this design, and is only mentioned here as a possible future enhancement.
Functional Description specifications The low memory vector address for the alignment handler is at offset 0x600 from the base address indicated by the setting of the MSR[IP] bit. Upon entry to the alignment handler, the machine is in the following state:
i) External interrupts are disabled.
ii) Processor is privileged to execute any instruction.
iii) Processor cannot execute any floating point instructions, including floating-point loads, stores, and moves.
iv) Floating point exceptions are disabled.
v) Instruction address translation is off.
vi) Data address translation is off.
vii) SRR0 contains the address of the instruction causing the exception.
viii) SRR1 contains bits 16-31 of the MSR.
ix) DAR contains the starting transfer address for the operation that caused the exception.
x) DSISR contains selected bits of the instruction for decoding the type of instruction that caused the exception.
Alignment exceptions will be treated as non-context switching events which are only invoked from user-level (problem mode) programs. To expedite processing and to prevent nesting, the following policies will be implemented:
i) The alignment exception handler will avoid a full state save and will only save those registers used or affected by the exception handler code.
ii) External interrupts will remain disabled.
iii) Instruction translations will remain disabled.
iv) Data translations will remain disabled except as necessary to perform the unaligned load or store.
v) AST checks will not be performed on return from the exception handler.
vi) The only exception that should occur during alignment handler execution is a data access exception while performing the unaligned load or store.
vii) Handler code segment and private cpu save area must be accessed in real mode (translations off).
viii) An exception will be raised immediately for the following cases: effective address within kernel segment (EXC.sub.-- BAD.sub.-- ACCESS/KERN.sub.-- INVALID.sub.-- ADDRESS); unaligned lwarx, ldarx, stwcx., stdcx. operands (EXC.sub.-- HW.sub.-- EMULATION/EXC.sub.-- PPC.sub.-- ALIGNMENT); attempted execution of lswi, lswx, stswi, stswx, lscbx, lscbx., lmw, or stmw while in little-endian mode (EXC.sub.-- BAD.sub.-- INSTRUCTION/EXC.sub.-- PPC.sub.-- BEOPONLY).
Handler Design It is possible for the alignment handler to cause a data access exception due to a page fault or protection violation. This is handled with a special dependence on the data access exception handler. The data access exception handler must determine if the exception was caused by the alignment exception handler by checking the MSR[IT] bit in the SRR1 register. If this bit is clear, then the data access exception handler resolves the fault condition, backtracks to the original machine state prior to the alignment exception by restoring the state saved by the alignment exception handler, and restarts the original instruction. This will result in another alignment exception, but this time no data access exception should be generated since the page fault condition has been resolved. FIG. 8 is a flow diagram of the alignment exception handler 194, which is part of the PowerPC exception handler 190. The steps are as follows:
1) Entry at physical address 0x600.
2) Temporarily save a work register into SPR.sub.-- GO.
3) Get address of cpu.sub.-- vars.fh.sub.-- save.sub.-- area from the SPR.sub.-- CPU register.
4) Convert virtual address of fh.sub.-- save.sub.-- area into a physical address.
5) Save registers used or affected by exception handler (GPR25 through GPR31, LR, CR, XER, SRR0, and SRR1).
6) Move copies of DSISR, DAR, and MSR into work registers.
7) Assert that processor was in problem mode at time of exception.
8) Check address bounds of operation against kernel virtual address space.
9) Move DSISR into CR for bit tests.
10) Branch into instruction decode (dsisr) table based on DSISR[15-21].
11) Execute appropriate submodule (submodule descriptions are given in the following submodules section).
12) Restore saved state and return to user mode.
Alignment Handler--Sub modules
Fixed Point Load Module: This module handles all of the fixed point load instructions. The appropriate number of bytes (2 or 4) are loaded individually and reassembled into a scratch register, then manipulated as necessary for a byte-reverse or algebraic operation. Then, the load table is used to move the data to the appropriate target register. Finally, a check for update form is performed and the address register is updated with the effective address of the instruction as appropriate.
Fixed Point Store Module: This module handles all of the fixed point store instructions. The store table is used to move the data from the source register to a scratch register. Then, the data (2 or 4 bytes) is stored to the target address one byte at a time, manipulating the data as necessary for byte-reversed operations. Finally, a check for update form is performed and the address register is updated with the effective address of the instruction as appropriate.
Floating Point Load Module: This module handles all of the floating point load instructions. The appropriate number of bytes (4, 8, or 16) are loaded from the source address individually, reassembled into scratch register(s), and written to the local save area. The floating point table is then used to move the data from the save area to the appropriate target floating point register(s). Finally, a check for update form is performed and the address register is updated with the effective address of the instruction as appropriate.
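Step 10 and the Fixed Point Load Module can be sketched in C as shown below. The real handler is assembly code running with translation off; this fragment only illustrates the arithmetic. The function names are hypothetical, and the shift in `dsisr_table_index` assumes IBM bit numbering for a 32-bit register, in which bit 0 is the most significant bit, so that DSISR[15-21] is the 7-bit field whose lowest bit sits 10 positions above the least significant bit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Extract DSISR[15-21] (IBM numbering, bit 0 = MSB) as a 0..127 table index. */
static unsigned dsisr_table_index(uint32_t dsisr)
{
    return (unsigned)((dsisr >> 10) & 0x7Fu);
}

/* Reassemble a 2- or 4-byte big-endian operand one byte at a time, so that
 * no access is wider than a byte and none can be misaligned. */
static uint32_t load_bytes_be(const uint8_t *ea, size_t nbytes)
{
    uint32_t value = 0;

    assert(ea != NULL && (nbytes == 2 || nbytes == 4));
    for (size_t i = 0; i < nbytes; i++)
        value = (value << 8) | ea[i];
    return value;
}

/* Byte-reverse form: the first byte in memory becomes the least significant. */
static uint32_t load_bytes_reversed(const uint8_t *ea, size_t nbytes)
{
    uint32_t value = 0;

    assert(ea != NULL && (nbytes == 2 || nbytes == 4));
    for (size_t i = 0; i < nbytes; i++)
        value |= (uint32_t)ea[i] << (8 * i);
    return value;
}

/* Algebraic form: sign-extend a halfword to 32 bits. */
static int32_t sign_extend_halfword(uint32_t halfword)
{
    int32_t v = (int32_t)(halfword & 0xFFFFu);

    return (v & 0x8000) ? v - 0x10000 : v;
}
```

Reading through a byte pointer works at any address, which is why the handler can use the same loop whether or not the operand crosses a word boundary.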
Floating Point Store Module: This module handles all of the floating point store instructions. The floating point table is used to move the appropriate number of bytes (4,8, or 16) from the floating point source register to the local save area. Then, the data is written to the target address 1 byte at a time. Finally, a check for update form is performed and the address register updated with the effective address of the instruction as appropriate. Load Multiple and Load String Module: This module handles the move assist load string instructions as well as the load multiple instruction. The length of data to be transferred is acquired, and then the data is loaded a byte at a time and reassembled into a scratch register. When the scratch register is full, the load table is used to move the data to the appropriate target register. If the target register ever overlaps the address register, the 4 bytes targeted for that register are ignored. NOTE: In the case of the load string immediate, the actual instruction will have to be fetched in order to determine the length of the operation. Store Multiple and Store String Module: This module handles the move assist store string instructions as well as the store multiple instruction. The length of data to be transferred is acquired, and then the data is moved 4-bytes at a time via the store table to a scratch register, which is then written 1 byte at a time to the target address. NOTE: In the case of the store string immediate, the actual instruction will have to be fetched in order to determine the length of the operation. Load String and Compare Module: This module handles only the load string and compare byte instruction. Bytes are loaded 1 at a time and compared against the match byte of the XER. When a match is found, or the maximum length as specified in the XER is reached, the resulting length field of the XER is updated and if this instruction was a record form, the appropriate Condition Register field is updated. 
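The byte loop of the Load String and Compare Module can be sketched as follows. This is a simplified model, not the handler itself: the match byte and maximum length are passed as plain arguments instead of being read from XER fields, bytes go to a destination buffer instead of being packed into registers, and the function and structure names are hypothetical. The sketch assumes that the resulting length counts the matching byte itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Outcome of the emulated transfer. */
struct lscbx_result {
    unsigned length;    /* bytes transferred; would be written back to the XER */
    bool     matched;   /* would set the Condition Register for a record form  */
};

/* Copy bytes one at a time until the match byte has been copied or
 * max_len bytes have been transferred, whichever comes first. */
static struct lscbx_result lscbx_sketch(const uint8_t *src, unsigned max_len,
                                        uint8_t match, uint8_t *dst)
{
    struct lscbx_result r = { 0, false };

    assert(src != NULL && dst != NULL);
    while (r.length < max_len) {
        uint8_t b = src[r.length];

        dst[r.length] = b;
        r.length++;
        if (b == match) {       /* stop right after the matching byte */
            r.matched = true;
            break;
        }
    }
    return r;
}
```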
NOTE: The actual instruction will have to be fetched in order to determine the setting of the record mode bit.
Data Cache Block Zero Module: This module handles only the data cache block zero instruction. The cache block boundaries are determined from the target address, and the resulting block of memory is cleared.
Data Structures Data structures required to support the alignment handler are all accessed through the system special purpose register cpu data pointer. The design requires modification of the cpu.sub.-- vars structure to include the fast exception save area and the physical addresses of the various alignment handler jump tables (dsisr, update, load, store, floating-point ops). Each CPU must have its own private fast handler save area. The size of the fast handler save area is 64 bytes, and it must be quadword aligned. The fast handler save area will be at the beginning of the private cpu data structure cpu.sub.-- vars, referenced as element fh.sub.-- save.sub.-- area. The layout of the fast handler save area is as follows:
______________________________________
struct fh.sub.-- save.sub.-- area {
unsigned long fh.sub.-- scratch1;
unsigned long fh.sub.-- scratch2;
unsigned long fh.sub.-- scratch3;
unsigned long fh.sub.-- scratch4;
unsigned long fh.sub.-- gpr25;
unsigned long fh.sub.-- gpr26;
unsigned long fh.sub.-- gpr27;
unsigned long fh.sub.-- gpr28;
unsigned long fh.sub.-- gpr29;
unsigned long fh.sub.-- gpr30;
unsigned long fh.sub.-- gpr31;
unsigned long fh.sub.-- srr0;
unsigned long fh.sub.-- srr1;
unsigned long fh.sub.-- lr;
unsigned long fh.sub.-- cr;
unsigned long fh.sub.-- xer;
};
______________________________________
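The stated 64-byte size follows from sixteen 4-byte fields. The C11 sketch below checks that arithmetic at compile time. It is an illustration with assumptions: fixed-width `uint32_t` fields replace `unsigned long` so the size does not depend on the host, the scratch and GPR fields are grouped into arrays, "quadword" is taken to mean 16 bytes, and the structure name is hypothetical.

```c
#include <assert.h>     /* static_assert (C11) */
#include <stddef.h>
#include <stdint.h>

/* Same sixteen 32-bit slots as the listing above, with the scratch and
 * GPR25-GPR31 slots written as arrays. */
struct fh_save_area_sketch {
    _Alignas(16) uint32_t fh_scratch[4];    /* fh_scratch1 .. fh_scratch4 */
    uint32_t fh_gpr[7];                     /* fh_gpr25 .. fh_gpr31       */
    uint32_t fh_srr0;
    uint32_t fh_srr1;
    uint32_t fh_lr;
    uint32_t fh_cr;
    uint32_t fh_xer;
};

/* 16 fields * 4 bytes = 64 bytes, with 16-byte (quadword) alignment. */
static_assert(sizeof(struct fh_save_area_sketch) == 64,
              "fast handler save area must be 64 bytes");
static_assert(_Alignof(struct fh_save_area_sketch) == 16,
              "fast handler save area must be quadword aligned");
```

Checking the layout at compile time is one way to catch a field added later that would push the save area past the size the handler assumes.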
The cpu.sub.-- vars private cpu data structure will also be modified to contain the physical addresses of the five alignment handler jump tables. The five alignment handler jump tables are comprised of: the initial dsisr jump table which determines the instruction to be emulated; the fixed-point load table indexed by target register; the fixed-point store table indexed by source register; the update table used to update the rA register of the instruction; and the floating-point operation table which is indexed by instruction and the target or source floating-point register. Errors/Messages Any error condition encountered during processing of the alignment exception will be considered a catastrophic system failure which will result in a panic. The only anticipated source of error is possibly kernel code making unaligned accesses which is to be considered a bug. An assert check for kernel-level invocation will be used to identify this condition. Unused jump entries in the dsisr table will point to panic code, but these entries will only be accessed in the event of a processor micro-code failure. The resulting exception handling method and apparatus invention provides improved efficiency in the operation of a PowerPC processor running a microkernel operating system. Although a specific embodiment of the invention has been disclosed, it will be understood by those having skill in the art that changes can be made to that specific embodiment without departing from the spirit and scope of the invention.