Multiprocessor system shaving processor based idle state detection and method of executing tasks in such a multiprocessor system5867704Abstract The present invention provides a method for executing tasks in a multiprocessor system including a plurality of processors, each processor taking either an "idle" state and a "run" state, wherein the method includes the steps of: detecting, when a first processor among the plurality of processors that is executing a first task newly generates a second task, whether or not a second processor taking the "idle" state exists among the plurality of processors; assigning the second task, if a second processor taking the "idle" state is detected, to the second processor, so as to begin execution of the second task by the second processor, change the state of the second processor from the "idle" state to the "run" state, and store a flag having a first value indicating that the execution of the first task has not been suspended; and suspending, if no second processor taking the "idle" state is detected, the execution of the first task by the first processor, beginning execution of the second task by the first processor, and storing a flag having a second value indicating that the execution of the first task has been suspended. Claims What is claimed is: Description BACKGROUND OF THE INVENTION
TABLE 1
__________________________________________________________________________
S REQ RESET
next5
ID0 NMP0 ID1 NMP1 ID2 NMP2
__________________________________________________________________________
001 001 000 011 01 0 -- -- -- --
010 010 000 011 -- -- 00 0 -- --
100 100 000 101 -- -- -- -- 00 0
011 001 000 111 10 0 -- -- -- --
011 010 000 111 -- -- 10 0 -- --
011 011 000 111 10 0 -- 1 -- --
101 001 000 111 01 0 -- -- -- --
101 100 000 111 -- -- -- -- 01 0
101 101 000 111 01 0 -- -- -- 1
110 010 000 111 -- -- 00 0 -- --
110 100 000 111 -- -- -- -- 00 0
110 110 000 111 -- -- 00 0 -- 1
111 001 000 111 -- 1 -- -- -- --
111 010 000 111 -- -- -- 1 -- --
111 100 000 111 -- -- -- -- -- 1
111 011 000 111 -- 1 -- 1 -- --
111 101 000 111 -- 1 -- -- -- 1
111 110 000 111 -- -- -- 1 -- 1
111 111 000 111 -- 1 -- 1 -- 1
011 000 001 010 -- -- -- -- -- --
011 000 010 001 -- -- -- -- -- --
011 001 010 101 10 0 -- -- -- --
011 010 001 110 -- -- 10 0 -- --
101 000 001 100 -- -- -- -- -- --
101 000 100 001 -- -- -- -- -- --
101 001 100 011 01 0 -- -- -- --
101 100 001 110 -- -- -- -- 01 0
110 000 010 100 -- -- -- -- -- --
110 000 100 010 -- -- -- -- -- --
110 010 100 011 -- -- 00 0 -- --
110 100 010 101 -- -- -- -- 00 0
111 000 001 110 -- -- -- -- -- --
111 000 010 101 -- -- -- -- -- --
111 000 100 011 -- -- -- -- -- --
111 000 011 100 -- -- -- -- -- --
111 000 101 010 -- -- -- -- -- --
111 000 110 001 -- -- -- -- -- --
111 001 010 101 -- 1 -- -- -- --
111 001 100 011 -- 1 -- -- -- --
111 001 110 001 -- 1 -- -- -- --
111 010 001 110 -- -- -- 1 -- --
111 010 100 011 -- -- -- 1 -- --
111 010 101 010 -- -- -- 1 -- --
111 100 001 110 -- -- -- -- -- 1
111 100 010 101 -- -- -- -- -- 1
111 100 011 100 -- -- -- -- -- 1
111 011 100 011 -- 1 -- 1 -- --
111 101 010 101 -- 1 -- -- -- 1
111 110 001 110 -- -- -- 1 -- 1
__________________________________________________________________________
In FIG. 3, S represents the current state, and NextS represents the next state. These states represent the respective states of the processors 30 to 32. For example, S=001 indicates that the processor 30 is in a "run" state, and the states qf the processors 31 and 32 are in an "idle" state. The same also applies to NextS. In FIG. 3, REQ0 to REQ2 represent requests which are input from the processors 30 to 32 to the processor state management device 22. These requests are input in order to request the processor state management device 22 for the identifier of an "idle" processor. In Table 1, REQ0 to REQ2 are collectively represented as REQ. For example, REQ=101 indicates that REQ0 is "1" (i.e., asserted); REQ1 is "0" (i.e., negated); and REQ2 is "1" (i.e., asserted). In FIG. 3, RESET0 to RESET2 represent resets which are input from the processors 30 to 32 to the processor state management device 22. These resets are input in order to request the processors 30 to 32 to change the states of the processors 30 to 32 retained in the processor state management device 22 from "run" to "idle". In Table 1, RESET0 to RESET2 are collectively represented as RESET. For example, RESET=010 indicates that RESET0 is "0" (i.e., negated); RESET1 is "1" (i.e., asserted); and RESET2 is "0" (i.e., negated). In FIG. 3, ID0 to ID2 each represent a signal which informs the identifier of an "idle" processor in response to a request made by the processor 30, 31, or 32. These signals are output from the processor state management device 22 in response to requests made by the processors 30 to 32. The meanings of values of ID0 to ID2 are as follows: 00: The processor 30 is "idle". 01: The processor 31 is "idle". 10: The processor 32 is "idle". In FIG. 3, NMP0 to NMP2 represent signals which inform that "no idle processor exists" in response to requests made by the processors 30 to 32. These signals are output from the processor state management device 22 in response to requests made by the processors 30 to 32. The meanings of values of NMP0 to NMP2 are as follows: 0: An "idle" processor exists. The identifier of the "idle" processor is indicated by the values of ID0 to ID02. 1: No "idle" processor exists. In this case, the values of ID0 to ID02 are not important (i.e., "don't care"). Hereinafter, the function and operation of the processor state management device 22 will be described with reference to FIGS. 4A, 4B, 5A and 5B. The processor state management device 22 manages the states of all the processors included in the multiprocessor system. Specifically, the processor state management device 22 retains pairs each consisting of the identifier of a processor and the state of that processor. The identifiers of the processors are used for identifying the processors from one another. The identifiers of processors are typically expressed in the form of integers. Each processor takes either a "run" state or an "idle" state. The processor state management device 22 determines whether or not an "idle" processor exists in response to a request made by a processor. If an "idle" processor exists, the processor state management device 22 returns the identifier of the "idle" processor to the processor which made the request. If no "idle" processor exists, the processor state management device 22 returns a message that "no idle processor exists" to the processor which made the request. In the case where a plurality of "idle" processors exist, the processor state management device 22 returns the identifier of the "idle" processor that has the highest priority to the processor which made the request. In the case where requests by a plurality of processors simultaneously reach the processor state management device 22, the above-described process is performed for the processors which made such requests in the descending order of their priorities. FIGS. 4A and 4B show an exemplary operation of the processor state management device 22. The processor state management device 22 manages the states of four processors 0 to 3. In the example shown in FIG. 4A, the processors 0 and 1 are in a "run" state, and the processors 2 and 3 are in an "idle" state. A request made by the processor 0 and a request made by the processor 1 are input to the processor state management device 22. The processor state management device 22 returns the identifier of the "idle" processor 2 in response to the request made by the processor 0, and returns the identifier of the "idle" processor 3 in response to the request made by the processor 1 (see FIG. 4B). The identifiers of the "idle" processors are returned in accordance with the priorities of the processors. Moreover, the processor state management device 22 changes the state of the processor 2 stored in the processor state management device 22 from "idle" to "run", and the state of the processor 3 stored in the processor state management device 22 from "idle" to "run". FIGS. 5A and 5B show another exemplary operation of the processor state management device 22. The processor state management device 22 manages the states of four processors 0 to 3. In the example shown in FIG. 5A, the processors 0, 1, and 2 are in a "run" state, and the processor 3 is in an "idle" state. A request made by the processor 0 and a request made by the processor 1 are input to the processor state management device 22. The processor state management device 22 returns the identifier of the "idle" processor 3 in response to the request made by the processor 0, and returns a message that "no idle processor exists" in response to the request made by the processor 1 (see FIG. 5B). The message that "no idle processor exists" is expressed, for example, by the value of a return code output from the processor state management device 22. The identifiers of the "idle" processors are returned in accordance with the priorities of the processors. Moreover, the processor state management device 22 changes the state of the processor 3 stored in the processor state management device 22 from "idle" to "run". In the examples shown in FIGS. 4A, 4B, 5A, and 5B, four processors are managed by the processor state management device 22. However, this is for the sake of conciseness; the present invention is not limited to a multiprocessor system which includes four processors. The present invention can be applied to a multiprocessor system including any number of processors. FIG. 6 shows the structure of a packet 50. The packet 50 includes a lock bit region 51 for storing a lock bit, a return bit region 52 for storing a return bit, and a return address region 53 for storing a return address, a parameter region 54 for storing a parameter, and a return value region 55 for storing a return value. Such a packet 50 is secured in the shared memory 20 for each task, the packet 50 being owned by the task. Hereinafter, a "packet owned by a task" will simply be referred to as a "packet of a task". The packet 50 is used for exchanging data between tasks and retaining the information of tasks. A lock bit is stored in the lock bit region 51 of the packet 50. The lock bit indicates whether or not other tasks are inhibited from accessing a task which owns the packet 50 while the task owning the packet 50 is being executed. If the lock bit is "1", it indicates that such access is inhibited. If the lock bit is "0", it indicates that such access is not inhibited. A return bit is stored in the return bit region 52 of the packet 50. Return bit indicates whether or not any other task was suspended before executing a task which owns the packet 50. If the return bit is "0", it indicates that "no other task was suspended before executing the task which owns the packet 50". This corresponds to a case where the task which owns the packet 50 is a spigned to an "idle" proocssor. If the return bit is "1", it indicates that "another task was suspended before executing the task which owns the packet 50". This corresponds to a case where, since no "idle" processor exists, another task which was being executed in a processor was suspended in order for that processor to execute the task which owns the packet 50. A return address is stored in the return address region 53 of the packet 50. The return address is referred to only when the return bit is "1". The return address indicates a return address to a suspended task. A parameter for the task which owns the packet 50 is stored in the parameter region 54 of the packet 50. A return value, which is the result of the task which owns the packet 50, is stored in the return value region 55 of the packet 50. FIG. 7 shows a procedure in which the processors 30 to 32 interpret and execute a fork instruction. The processors 30 to 32 read instructions in an instruction set stored in the main storage device 2. If an instruction which has been read is a fork instruction, the processors 30 to 32 perform a process shown in FIG. 7. Hereinafter, the procedure in which the processor 30 interprets and executes a fork instruction will be described with respect to the respective steps thereof, with reference to FIG. 7. It will be appreciated that the same applies to the case where other processors 31 and 32 interpret and execute a fork instruction. A fork instruction utilizes the top address (hereinafter simply referred to as an "instruction address") of an instruction sequence indicating the content of the process performed by a new task and the address of the packet 50 of the new task (hereinafter simply referred to as a "packet address") as operands thereof. Step (a): the processor 30 Inquires to the processor state management device 22 whether or not an "idle" processor exists. Such an inquiry is made by, for example, the processor 30 sending a request (REQ0=1) to the processor state management devicei 22. The processor state management device 22 determines whether or not an "idle" processor exists in response to that request. If an "idler" processor exists, the processor state management device 22 returns the identifier of that "idle" processor to the processor 30. The identifier of an "idle" processor is obtained by, for example, the processor 30 referring to the value of ID0 output from the processor state management device 22. In the case where a plurality of "idle" processors exist, the identifier of the processor having the highest priority is obtained. In the case where a plurality of processors simultaneously interpret and exocuto a fork instruction, the processors interpret and execute the fork instruction in the descending order of their priorities. Thus, the processor 30 obtains the identifier of an "idle" processor. If no "idle" processor exists, the processor state management device 22 returns a message that "no idle processor exists" to the processor 30. The message that "no idle processor exists" is obtained by, for example, the processor 30 referring to the value of NMP0 output from the processor state management device 22. Step (b): If an "idle" processor exists, the processor 30 performs the process from Steps (c) to (e). If no "idle" processor exists, the processor 30 performs the process from Steps (f) to (g). Step (c): Assuming that the "idle" processor is the processor 31, the processor 30 transfers the instruction address of a task and the packet address of the task which are provided as operands of the fork instruction to the processor 31 via the network 21. Step (d): The processor 30 writes "1" in the lock bit region 51 of the packet 50 which is designated by the packet address of the task which is provided as an operand of the fork instruction, and writes "0" in the return bit region 52. Thereafter, the processor 30 completes the process of the fork instruction, and performs the process of a next instruction. Step (e): The processor 31 receives the instruction address of a task and the packet address of the task from the processor 30 via the network 21. The processor 31, while referring to the packet 50 which is designated by the received packet address, begins processing an instruction which is designated by the received instruction address. Through the above-mentioned Steps (a) to (e), the processor 30 independently performs a process which is different from the process performed by the processor 31. In other words, the processors 30 and 31 begin parallel processing. Thus, the process of the fork instruction ends here. Step (f): The processor 30 writes "1" in the lock bit region 51 of the packet 50 which is designated by the packet address of the task which is provided as an operand of the fork instruction, and writes "1" in the return bit region 52. Moreover, the processor 30 writes the address of a next instruction of the fork instruction in the return address region 53. The processor 30 suspends the currently executed task. Step (g): The processor 30, while referring to the packet 50 which is designated by the packet address of the task which is provided as an operand of the fork instruction, begins processing an instruction which is designated by the instruction address of the task provided as an operand of the fork instruction. Thus, the process of the fork instruction ends here. Hereinafter, the procedure in which the processor 30 interprets and executes an unlock instruction will be described with respect to the respective steps thereof, with reference to FIG. 8. It will be appreciated that the same applies to the case where other processors 31 and 32 interpret and execute an unlock instruction. Step (h): The processor 30 determines whether or not the value of the return bit region 52 of the packet 50 owned by the currently executed task is "0". If the value of the return bit region 52 is "0", it indicates that the processor 30 did not have any tasks suspended. Accordingly, when the value of the return bit region 52 is "0", the processor 30 performs the process of step (i). If the value of the return bit region 52 is "1", it indicates that the task of the processor 30 was suspended. Accordingly, when the value of the return bit region 52 is "1", the processor 30 performs the process of step (j). Step (i): The processor 30 writes "0" in the lock bit region 51 of the packet 50 owned by the currently executed task, and changes the state of the processor 30 to "idle". The processor 30, which is now "idle", performs no further process. Thus, the process of the unlock instruction ends here. Step (j): The processor 30 writes "0" in the lock bit region 51 of the packet 50 owned by the currently executed task. Furthermore, the processor 30 processes instructions from the address stored in the return address region 53, thereby restarting the suspended task. Thus, the process of the unlock instruction ends here. Table 2 shows the manner in which the state of the multiprocessor system changes in response to the interpretation and execution of a fork instruction and an unlock instruction. In the example shown in Table 2, it is assumed that the multiprocessor system includes processors P1 and P2.
TABLE 2
__________________________________________________________________________
##STR1##
__________________________________________________________________________
As shown in FIG. 9, the states of the multiprocessor system are categorized into states of processors and states of tasks. Each processor can take two states: an "idle" state (IDLE) and a "run" state (RUN). These states correspond to the states which are managed by the processor state management device 22. When a processor is in a "RUN" state, that processor is currently executing a task. Each task can take three states: a "stopped" state (STOP), a 1st "executed" state (EX1), and a 2nd "executed" state (EX2). The "stopped" state (STOP) is a state where a processor is waiting to execute the task or a processor has finished execution of the task. The 1st "executed" state (EX1) is a state where no other task was suspended before the execution of a task which is currently executed. The 2nd "executed" state (EX2) is a state where another task was suspended before the execution of a task which is currently executed. When a processor is in a "run" state (RUN), the task which is being executed by that processor is in either the 1st "executed" state (EX1) or the 2nd "executed" state (EX2). Referring back to Table 2, it will be described how the state of the multiprocessor system changes. The states of the multiprocessor system changes to a next state in response to the occurrence of an event and in accordance with the event and the current state. In Table 2, "Px. fork" indicates that an event "Processor Px has executed a fork instruction" occurred, and "Px. unlock" indicates that an event "Processor Px has executed an unlock instruction" occurred. The first row of Table 2 illustrates that, in the case where the processor P1 is in a "run" state (i.e., executing a task T1) and the processor P2 is in a "idle" state, the task T1 being in a 1st "executed" state and the task T2 being in a "stopped" state, the state of the processor P2 changes from "idle" to "run" (i.e., executing the task T2) and the state of the task 2 changes from a "stopped" state to a 1st "executed" state, in response to an event "the processor P1 has executed a fork instruction". The reason for such changes of states is that as soon as the new task T2 is generated the task T2 is assigned to the "idle" processor P2. The second row of Table 2 illustrates that, in the case where the processors P1 and P2 and the tasks T1 and T2 are currently in the "next states" as indicated in the first row, the state of the processor P2 changes from "run" (i.e., executing the task T2) to "idle" and the state of the task T2 changes from a 1st "executed" state to a "stopped" state, in response to an event "the processor P2 has executed an unlock instruction". The third row of Table 2 illustrates that, in the case where the processor P1 is in a "run" state (i.e., executing the task T1) and the processor P2 is in a "run" state (i.e., executing another task), the task T1 being in a 1st "executed" state and the task T2 being in a "stopped" state, the state of the processor P1 changes from "run" (i.e., executing the task T1) to "run" (i.e., executing the task T2) and the state of the task T2 changes from a "stopped" state to a 2nd "executed" state, in response to an event "the processor P1 has executed a fork instruction". The reason for such changes of states is that as soon as the new task T2 is generated the execution of the task T1 is suspended and the execution of the task T2 is started. The fourth row of Table 2 illustrates that, in the case where the processors P1 and P2 and the tasks T1 and T2 are currently in the "next states" as indicated in the third row, the state of the processor P1 changes from "run" (i.e., executing the task T2) to "run" (i.e., executing the task T1) and the state of the task T2 changes from a 2nd "executed" state to a "stopped" state, in response to an event "the processor P1 has executed an unlock instruction". Hereinafter, the operation of the multiprocessor system 1 in the case of parallel-processing a program including a fork instruction and an unlock instruction will be described. FIG. 10 shows the procedure of a program for calculating a sum of 1 to 4 (i.e., 1+2+3+4) based on a binary tree. This program is divided into two portions main and sum; main is a main program, and sum is a subroutine which is capable of being recursively called and parallel-processed. The subroutine sum takes two parameters n and m so as to derive a sum of n+1 to m. The main program main calls sum with n-0 and m=4 as parameters. First, it is assumed that the processor 30 is executing main. The processor 30 is in a "run" state. It is assumed that the processors 31 and 32 are in an "idle" state. Hereinafter, the operation of the multiprocessor system 1 will be described in detail with respect to Steps (A) to (H) of the program. Step (A): The processor 30 executes the subroutine sum with n=0 and m=4 as parameters thereof. Specifically, the processor 30 secures the packet 50 (Pk1) in the shared cache memory 20, and stores a value "0" and a value "4" in the parameter region 54 of that packet 50 (Pk1). Next, the processor 30 executes an exec instruction with the top address of the sum instructions and the top address of the packet 50 (Pk1) as operands thereof. Herein, an exec instruction is an instruction corresponding to only Steps (f) and (g) in the procedure for the fork instruction as shown in FIG. 7. The exec instruction utilizes the instruction address of a task and the packet address of the task as operands thereof, as in the case of a fork instruction. The processor 30 writes "1" in the lock bit region 51 of the packet 50 (Pk1) and "1" in the return bit region 52 of the packet 50 (Pk1), and stores the address of a next instruction of the exec instruction in the return address region 53 (see Step (f) in FIG. 7). The processor 30 begins execution of the sum instructions while referring to the packet 50 (Pk1) (see Step (g) in FIG. 7). Step (B): The processor 30 reads the parameters n and m from the packet 50 and compares (n+1) and m. If (n+1) and m are equal, the process proceeds to Step (G); otherwise the process proceeds to Step (C). When the subroutine sum was first called by main, (n+1) and m are not equal, so the process proceeds to Step (C). Step (C): The processor 30 calculates k=(n+m) div 2. Since (n+m)=4, k equals 2. Step (D): The processor 30 executes the subroutine sum by using n and k as parameters thereof. Specifically, the processor 30 secures the packet 50 (Pk2) in the shared cache memory 20, and stores the value n (=0) and the value k (=2) in the parameter region 54 of that packet 50 (Pk2). Next, the processor 30 executes a fork instruction with the top address of the sum instructions and the top address of the packet 50 (Pk2) as operands thereof. The processors 31 and 32 are both "idle". The processor 30 obtains the identifier of the "idle" processor 31 in accordance with the respective priorities of the processors (see Stop (a) in FIG. 7). The processor 30 transfers the instruction address of a task and the packet address of the task to the processor 31 (see Step (b) in FIG. 7). The processor 30 writes "1" in the lock bit region 51 of the packet 50 (Pk2) and "0" in the return bit region 52 of the packet 50 (Pk2) (see Step (d) in FIG. 7). Furthermore, the processor 31 begins execution of the sum instructions while referring to the packet 50 (Pk2) (see Step (e) in FIG. 7). Thus, the processors 30 and 31 execute the subroutine sum in parallel. Step (E): The processor 30 executes the subroutine sum by using k and m as parameters thereof. Specifically, the processor 30 secures the packet 50 (Pk3) in the shared cache memory 20, and stores the value k (=2) and the value m (=4) in the parameter region 54 of that packet 50 (Pk3). Next, the processor 30 executes an exec instruction with the top address of the sum instructions and the top address of the packet 50 (Pk3) as operands thereof. Before the processor 30 begins the execution of the exec instruction, the packet 50 (Pk1) is saved in a stack region. The processor 30 writes "1" in the lock bit region 51 of the packet 50 (Pk3) and "1" in the return bit region 52 of the packet 50 (Pk3) (see Step (f) in FIG. 7). Furthermore, the processor 30 begins execution of the sum instructions while referring to the packet 50 (Pk3) (see Step (g) in FIG. 7). Step (F): The processor 30 loads the packet 50 (Pk1) saved in the stack region after completing the execution of the subroutine sum, which was called in Step (E). Thereafter, the processor 30 adds s1 with s2, where s1 represents the result obtained by the subroutine sum executed in Step (D) (therefore s1is stored in the return value region 55 of the packet 50 (Pk2)), and s2 represents the result obtained by the subroutine sum executed in Step (E) (therefore s2 is stored in the return value region 55 of the packet 50 (Pk3)). There is a possibility that the task which owns the packet 50 (Pk2) is still in a "run" state when the processor 30 has just finished executing the subroutine sum called in Step (E). The processor 30, after completing the execution of the task which owns the packet 50 (Pk2), reads the value stored in the return value region 55 of the packet 50 (Pk2), and substitutes that value in s1. Herein, s1is 3. It is determined by referring to the value in the lock bit region 51 of the packet 50 (Pk2) whether or not the execution of the task which owns the packet 50 (Pk2) has been completed. If the value of the lock bit region 51 of the packet 50 (Pk2) is "0", it indicates that the execution of the task which owns the packet 50 (Pk2) has been completed. Similarly, the processor 30, after completing the execution of the task which owns the packet 50 (Pk3), reads the value stored in the return value region 55 of the packet 50 (Pk3), and substitutes that value in s2. Herein, s2 is 7. The processor 30 calculates s1+s2, whereby s=10 is obtained. Step (H): The processor 30 stores the value of s in the return value region 55 of the packet 50 (Pk1). Thereafter, the processor 30 executes the unlock instruction. The processor 30 determines whether or not the value stored in the return bit region 52 of the packet 50 (Pk1) is "1" (see Step (h) in FIG. 8). Herein, the value stored in the return bit region 52 of the packet 50 (Pk1) is "1". Therefore, the processor 30 stores "0" in the lock bit region 51 of the packet 50 (Pk1) and executes instructions from the address stored in the return address regions 53 (see Step (j) in FIG. 8). In this case, the process is restarted from the next instruction of Step (A) of main. Step (G): If n+1=m was determined to be true in Step (B), the process proceeds to Step (G). The processor 30 substitutes the value of the parameter m in s. Thereafter, the process proceeds to Step (H). It should be noted that the above-described Steps (B) to (H) are executed also in the subroutine sum called in Step (D) and the subroutine sum called in Step (E) because the subroutine sum is a subroutine which is capable of being recursively called. By thus recursively calling the subroutine sum, it becomes possible to parallel-calculating the sum of 1 to 4 (i.e., 1+2+3+4). In this example, two tasks are generated by the fork instruction in Step (D) and the exec instruction in Step (E). The fork instruction is an instruction which is used for assigning a task to any "idle" processor, as long as "idle" processors exist. The exec instruction is an instruction which is strictly used by a processor for assigning a task to itself. FIG. 11 schematically shows the content of the above-mentioned process. As shown in FIG. 11, two tasks sum (0, 2) and sum (2, 4) are generated from a task sum (0, 4) by the fork instruction and the exec instruction. The task sum (0, 2) is assigned to the processor 31, and the task sum (2, 4) is assigned to the processor 30. Similarly, two further tasks are generated from each of the above two tasks. The tasks are assigned to other processors which are "idle", as long as such "idle" processors exist. Tasks sum (2, 3) and sum (3, 4) are generated from the task sum (2, 4). However, both tasks sum (2, 3) and sum (3, 4) are assigned to the processor 30 because no "idle" processors exist any longer when assigning the task sum (2, 3). As described above, each of the processors 30 to 32 of the multiprocessor system 1 according to the present invention interprets and executes a fork instruction, so as to assign a task to any "idle" processor that exists. When no "idle" processors exist, each processor suspends the execution of a currently executed task and assigns a task to itself. Thus, as soon as a task to be processed is generated, the generated task is assigned to either an "idle" processor or the processor which generated the task, thereby enabling the generated task to be immediately executed. As a result, any mechanism which conventional multiprocessor systems required for storing tasks to be processed becomes unnecessary, and any mechanism which conventional multiprocessor systems required for scheduling the execution order of tasks becomes unnecessary. In the case where an "idle" processor exists, the task to be processed is always assigned to that "idle" processor, thereby resulting in a high utilization efficiency of processors. Furthermore, the fork instruction and the unlock instruction can be implemented by using simple hardware, which makes for high-speed processing. Accordingly, the task execution method of the present invention is very effective in the case of parallel-processing a program in the multiprocessor system 1 implemented on an integrated circuit that requires a relatively small amount of time for processing the task itself as compared to the amount of time required for scheduling or managing tasks, e.g., the exemplified program for obtaining a sum of 0 to 4. When an interruption is made from outside the integrated circuit, deterioration in the performance due to such an interruption can be prevented by detecting "idle" processors by means of the processor state management device 22 and by ensuring that the "idle" processor having the lowest priority handles the interruption process The processor state management device 22 is capable of detecting all the processors of the integrated circuit being in an "idle" state. In this case, a possible deadlock can be prevented by making one of the processors perform an exception handling. As described above, according to the present invention, a task which is newly generated by a processor can be immediately begun by that processor or another processor. This makes unnecessary any mechanism for storing tasks to be processed or scheduling the execution order of tasks. This also makes unnecessary any process of selecting a task which is waiting to be executed and assigning the selected task to an "idle" processor. As a result, a relatively small time is required for the assigning of tasks to processors as compared to the actual execution time of tasks. Thus, the process speed of parallel processing of a small coarseness degree in a multiprocessor system can be enhanced. Various other modifications will be apparent to and can be readily made by those skilled in the art without departing from the scope and spirit of this invention. Accordingly, it is not intended that the scope of the claims appended hereto be limited to the description as set forth herein, but rather that the claims be broadly construed.
|
Same subclass Same class Consider this |
||||||||||
