Method and apparatus for management of mapped and unmapped regions of memory in a microkernel data processing system
5729710
Abstract
A memory management method for a microkernel architecture, and the microkernel itself, feature template regions that the microkernel defines in memory as special objects. In the memory management method, after the microkernel is loaded into the memory of a data processing system, it begins creating task containers in the memory. It does this by forming template regions as special objects in the memory, each template region having a set of attributes. Then, when the microkernel forms a task in the memory, it does so by mapping the template region into the task. The microkernel defines a virtual address space for the task based upon the template region. Later, when the microkernel conducts virtual memory operations on the template regions, the effect of the virtual memory operations is manifested in the task by means of the mapping relationship. In this manner, a single template region can be mapped into multiple tasks simultaneously. Because virtual memory operations are directed to the template region on which they take effect, sharing their results is much easier to accomplish: the changes are made to the template region, not to the mapping of the template region within each task.
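The mapping relationship described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the class names (`MemoryObject`, `TemplateRegion`, `Task`), the `resolve` helper, the addresses, and the protection strings are all hypothetical, chosen only to show one template region mapped into two tasks, with a single operation on the template becoming visible in both.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryObject:
    """Stand-in for a backing memory object (hypothetical name)."""
    name: str


@dataclass
class TemplateRegion:
    """Template region: a size, a set of attributes, and a template pointer."""
    size: int
    attributes: dict
    memory_object: MemoryObject  # plays the role of the "template pointer"


@dataclass
class Task:
    """Task container holding task pointers to mapped template regions."""
    name: str
    mappings: dict = field(default_factory=dict)  # base address -> TemplateRegion

    def map_region(self, base: int, region: TemplateRegion) -> None:
        # Store a reference, not a copy, so later changes made to the
        # template are seen through this mapping.
        self.mappings[base] = region

    def resolve(self, address: int):
        """Follow the task pointer, then the template pointer, for one address."""
        for base, region in self.mappings.items():
            if base <= address < base + region.size:
                return region.memory_object, region.attributes
        raise LookupError(f"address {address:#x} is not mapped in {self.name}")


code_v1 = MemoryObject("code_v1")
code_v2 = MemoryObject("code_v2")
shared = TemplateRegion(size=0x1000,
                        attributes={"protection": "r-x"},
                        memory_object=code_v1)

# The same template region is mapped into two tasks at different addresses.
task_a, task_b = Task("A"), Task("B")
task_a.map_region(0x4000, shared)
task_b.map_region(0x8000, shared)

# One operation on the template; neither task's mapping is touched.
shared.memory_object = code_v2
shared.attributes["protection"] = "r--"
```

After the last two statements, `task_a.resolve(0x4010)` and `task_b.resolve(0x8010)` both return the new memory object and the new protection. Each task holds only a reference to the template, which is why the change does not have to be repeated per task.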
Claims
What is claimed is:
1. A memory management method for a microkernel architecture data processing system, comprising the steps of:
loading a microkernel into a memory of a data processing system, for creating task containers in said memory;
forming with said microkernel a template region as a special object in said memory, said template region having a set of attributes defining a virtual address space and having a template pointer to a memory object;
forming with said microkernel a task container in said memory having said set of attributes and having a task pointer to said template region, by means of mapping said template region into said task container;
said template pointer and said task pointer establishing a first relationship between said task container and said memory object;
performing virtual memory operations on said template region to modify said template pointer to said memory object, thereby establishing a second relationship between said task container and said memory object.
2. The memory management method for a microkernel architecture data processing system of claim 1, wherein said memory object is a map of virtual addresses to physical addresses in said memory, and includes a first address translation to a first memory object and a second address translation to a second memory object, the method further comprising:
said template pointer pointing to said first address translation prior to said virtual memory operations, and establishing said first relationship between said task container and said first memory object;
said template pointer pointing to said second address translation after said memory operations, and establishing said second relationship between said task container and said second memory object.
3. The memory management method for a microkernel architecture data processing system of claim 1, wherein said memory object is a second template region, the method further comprising:
said template pointer pointing to said second template prior to said memory operations, and establishing said first relationship between said task container and said second template;
said template pointer pointing to a third template region after said memory operations, and establishing said second relationship.
4. The memory management method for a microkernel architecture data processing system of claim 1, wherein said step of forming a task container in said memory, further comprises the steps of:
forming with said microkernel a port name space in said memory for said task container, for use as a communication channel;
defining with said microkernel, access rights for said port name space of said task container, using said set of attributes of said template region.
5. The memory management method for a microkernel architecture data processing system of claim 1, wherein said step of forming a task container in said memory, further comprises the steps of:
forming with said microkernel a thread object in said memory for said task container, for fetching instructions from said address space of said task container.
6. The memory management method for a microkernel architecture data processing system of claim 1, wherein said template region includes a base region and a user region, said step of forming a template region in said memory, further comprising the steps of:
forming with said microkernel a template base region as a special object in said memory, said template base region having a set of base attributes;
forming with said microkernel a template user region as a special object in said memory, said template user region having a set of user attributes;
said step of forming a task container in said memory, further comprising the steps of:
forming with said microkernel a task base container in said memory having a base virtual address space and said set of base attributes, and having a task base pointer to said template base region, by means of mapping said template base region into said task base container;
forming with said microkernel a task user container in said memory having a user virtual address space and said set of user attributes, and having a task user pointer to said template user region, by means of mapping said template user region into said task user container; and
performing virtual memory operations on said template base region;
said virtual memory operations operative in said task base container by means of said task base pointer.
7. The memory management method for a microkernel architecture data processing system of claim 5, which further comprises:
performing virtual memory operations on said template user region,
said virtual memory operations operative in said task user container by means of said task user pointer.
8. The memory management method for a microkernel architecture data processing system of claim 6, wherein said step of forming a task base container in said memory, further comprises the steps of:
forming with said microkernel a port name space in said memory for said task base container, for use as a communication channel;
defining with said microkernel, access rights for said port name space of said task base container, using said set of base attributes of said template base region.
9. The memory management method for a microkernel architecture data processing system of claim 6, wherein said step of forming a task base container in said memory, further comprises the steps of:
forming with said microkernel a thread object in said memory for said task base container, for fetching instructions from said address space of said task base container.
10. A memory management method for a microkernel architecture data processing system, comprising the steps of:
loading a microkernel into a memory of the data processing system, for creating task containers in said memory;
forming with said microkernel a template base region as a special object in said memory, said template base region having a first size and a first set of attributes; forming with said microkernel a task base container in said memory having said first size and said first set of attributes, by mapping said template base region into said task base container;
forming with said microkernel a template user region as a special object in said memory, said template user region having a second size and a second set of attributes; forming with said microkernel a task user container within said task base container in said memory, having said second size and said second set of attributes, by mapping said template user region into said task user container at a location within said task base container;
performing virtual memory operations on said template base region;
said virtual memory operations operative in said task base container by means of said mapping.
11. A microkernel architecture data processing system, comprising:
a microkernel in a memory of the data processing system, for creating task containers in said memory;
a template region as a special object in said memory, said template region having a set of attributes defining a virtual address space and having a template pointer to a memory object;
a task container in said memory having said set of attributes and having a task pointer to said template region, formed by means of mapping said template region into said task container;
said template pointer and said task pointer establishing a first relationship between said task container and said memory object;
means for performing virtual memory operations on said template region to modify said template pointer to said memory object, thereby establishing a second relationship between said task container and said memory object.
12. The microkernel architecture data processing system of claim 11, that further comprises:
said memory object is a map of virtual addresses to physical addresses in said memory, and includes a first address translation to a first memory object and a second address translation to a second memory object;
said template pointer pointing to said first address translation prior to said memory operations, and establishing said first relationship between said task container and said first memory object;
said template pointer pointing to said second address translation after said memory operations, and establishing said second relationship between said task container and said second memory object.
13. The microkernel architecture data processing system of claim 11, wherein said memory object is a second template region, further comprising:
said template pointer pointing to said second template prior to said memory operations, and establishing said first relationship between said task container and said second template;
said template pointer pointing to a third template region after said memory operations, and establishing said second relationship.
14. The microkernel architecture data processing system of claim 11, wherein said task container in said memory, further comprises:
a port name space in said memory for said task container, for use as a communication channel, having access rights for said port name space of said task container, defined using said set of attributes of said template region.
15. The microkernel architecture data processing system of claim 11, wherein said task container in said memory, further comprises:
a thread object in said memory for said task container, for fetching instructions from said address space of said task container.
16. A method of running a selected, operating system personality program in a microkernel architecture data processing system, comprising the steps of:
loading a microkernel into a memory of a data processing system, for creating task containers in said memory;
loading a selected, operating system personality program into said memory of said data processing system;
sending a call to said microkernel from said selected, operating system personality program, to form a template region as a special object in said memory, said template region having a set of attributes defining a virtual address space and having a template pointer to a memory object;
forming with said microkernel a task container in said memory having said set of attributes and having a task pointer to said template region, by means of mapping said template region into said task container, said template pointer and said task pointer establishing a first relationship between said task container and said memory object;
performing virtual memory operations on said template region to modify said template pointer to said memory object, thereby establishing a second relationship between said task container and said memory object.
17. The method of running a selected, operating system personality program for a microkernel architecture data processing system of claim 16, wherein said memory object is a map of virtual addresses to physical addresses in said memory, and includes a first address translation to a first memory object and a second address translation to a second memory object, the method further comprising:
said template pointer pointing to said first address translation prior to said memory operations, and establishing said first relationship between said task container and said first memory object;
said template pointer pointing to said second address translation after said memory operations, and establishing said second relationship between said task container and said second memory object.
18. The method of running a selected, operating system personality program for a microkernel architecture data processing system of claim 16, wherein said memory object is a second template region, the method further comprising:
said template pointer pointing to said second template prior to said memory operations, and establishing said first relationship between said task container and said second template;
said template pointer pointing to a third template region after said memory operations, and establishing said second relationship.
19. The method of running a selected, operating system personality program for a microkernel architecture data processing system of claim 16, wherein said step of forming a task container further comprises the step of:
sending a second call to said microkernel from said selected, operating system personality program, to form a task container in said memory having said set of attributes and having a task pointer to said template region, by means of mapping said template region into said task container.
20. A microkernel architecture data processing system, comprising:
a microkernel in a memory of the data processing system, for creating task containers in said memory;
a selected operating system personality program in said memory of said data processing system, for sending a call to said microkernel to form a template region as a special object in said memory;
a template region in said memory formed in response to said call, said template region having a set of attributes defining a virtual address space and having a template pointer to a memory object;
a task container in said memory having said set of attributes and having a task pointer to said template region, formed by means of mapping said template region into said task container;
said template pointer and said task pointer establishing a first relationship between said task container and said memory object;
means for performing virtual memory operations on said template region to modify said template pointer to said memory object, thereby establishing a second relationship between said task container and said memory object.
21. A method of running a personality-neutral services program in a microkernel architecture data processing system, comprising the steps of:
loading a microkernel into a memory of a data processing system, for creating task containers in said memory;
loading a personality-neutral services program into said memory of said data processing system;
sending a call to said microkernel from said personality-neutral services program, to form a template region as a special object in said memory, said template region having a set of attributes defining a virtual address space and having a template pointer to a memory object;
forming with said microkernel a task container in said memory having said set of attributes and having a task pointer to said template region, by means of mapping said template region into said task container;
said template pointer and said task pointer establishing a first relationship between said task container and said memory object;
performing virtual memory operations on said template region to modify said template pointer to said memory object, thereby establishing a second relationship between said task container and said memory object.
22. The method of running a personality-neutral services program for a microkernel architecture data processing system of claim 21, wherein said memory object is a map of virtual addresses to physical addresses in said memory, and includes a first address translation to a first memory object and a second address translation to a second memory object, the method further comprising:
said template pointer pointing to said first address translation prior to said memory operations, and establishing said first relationship between said task container and said first memory object;
said template pointer pointing to said second address translation after said memory operations, and establishing said second relationship between said task container and said second memory object.
23. The method of running a personality-neutral services program for a microkernel architecture data processing system of claim 21, wherein said memory object is a second template region, the method further comprising:
said template pointer pointing to said second template prior to said memory operations, and establishing said first relationship between said task container and said second template;
said template pointer pointing to a third template region after said memory operations, and establishing said second relationship.
24. The method of running a personality-neutral services program for a microkernel architecture data processing system of claim 21, wherein said step of forming a task container further comprises the step of:
sending a second call to said microkernel from said personality-neutral services program, to form a task container in said memory having said set of attributes and having a task pointer to said template region, by means of mapping said template region into said task container.
25. A microkernel architecture data processing system, comprising:
a microkernel in a memory of the data processing system, for creating task containers in said memory;
a personality-neutral services program in said memory of said data processing system, for sending a call to said microkernel to form a template region as a special object in said memory;
a template region in said memory formed in response to said call, said template region having a set of attributes defining a virtual address space and having a template pointer to a memory object;
a task container in said memory having said set of attributes and having a task pointer to said template region, formed by means of mapping said template region into said task container;
said template pointer and said task pointer establishing a first relationship between said task container and said memory object;
means for performing virtual memory operations on said template region to modify said template pointer to said memory object, thereby establishing a second relationship between said task container and said memory object.
26. A method of running an application program in a microkernel architecture data processing system, comprising the steps of:
loading a microkernel into a memory of a data processing system, for creating task containers in said memory;
loading an application program into said memory of said data processing system;
sending a call to said microkernel from said application program, to form a template region as a special object in said memory, said template region having a set of attributes defining a virtual address space and having a template pointer to a memory object;
forming with said microkernel a task container in said memory having said set of attributes and having a task pointer to said template region, by means of mapping said template region into said task container;
said template pointer and said task pointer establishing a first relationship between said task container and said memory object;
performing virtual memory operations on said template region to modify said template pointer to said memory object, thereby establishing a second relationship between said task container and said memory object.
27. The method of running an application program for a microkernel architecture data processing system of claim 26, wherein said memory object is a map of virtual addresses to physical addresses in said memory, and includes a first address translation to a first memory object and a second address translation to a second memory object, the method further comprising:
said template pointer pointing to said first address translation prior to said memory operations, and establishing said first relationship between said task container and said first memory object;
said template pointer pointing to said second address translation after said memory operations, and establishing said second relationship between said task container and said second memory object.
28. The method of running an application program for a microkernel architecture data processing system of claim 26, wherein said memory object is a second template region, the method further comprising:
said template pointer pointing to said second template prior to said memory operations, and establishing said first relationship between said task container and said second template;
said template pointer pointing to a third template region after said memory operations, and establishing said second relationship.
29. The method of running an application program for a microkernel architecture data processing system of claim 26, wherein said step of forming a task container further comprises the step of:
sending a second call to said microkernel from said application program, to form a task container in said memory having said set of attributes and having a task pointer to said template region, by means of mapping said template region into said task container.
30. A microkernel architecture data processing system, comprising:
a microkernel in a memory of the data processing system, for creating task containers in said memory;
an application program in said memory of said data processing system, for sending a call to said microkernel to form a template region as a special object in said memory;
a template region in said memory formed in response to said call, said template region having a set of attributes defining a virtual address space and having a template pointer to a memory object;
a task container in said memory having said set of attributes and having a task pointer to said template region, formed by means of mapping said template region into said task container;
said template pointer and said task pointer establishing a first relationship between said task container and said memory object;
means for performing virtual memory operations on said template region to modify said template pointer to said memory object, thereby establishing a second relationship between said task container and said memory object.
31. A microkernel architecture data processing system, comprising:
an auxiliary storage means for storing programmed instructions;
a memory means coupled to said auxiliary storage means, for storing a cache object, said auxiliary storage means paging into said cache object said programmed instructions;
a template region in said memory means, said template region having a set of attributes defining a virtual address space and having a template pointer toward said cache object;
a microkernel means in said memory means, for creating tasks in said memory means;
a task in said memory means having said set of attributes and having a task pointer to said template region, formed by means of said microkernel means mapping said template region into said task;
said template pointer and said task pointer establishing a first relationship between said task and said cache object;
a processor means coupled to said memory means, for executing said programmed instructions;
a thread object in said memory means associated with said task, for fetching said programmed instructions from said cache object using said first relationship, for execution in said processor means;
means for performing virtual memory operations on said template region to modify said template pointer toward said cache object, thereby changing the relationship between said task and said cache object.
32. The microkernel architecture data processing system of claim 31, that further comprises:
a map of virtual addresses to physical addresses in said memory means, said map being pointed to by said template pointer, said map including a first address translation to said cache object and a second address translation to a second memory object;
said template pointer pointing to said first address translation prior to said memory operations, and establishing said first relationship between said task and said cache object;
said template pointer pointing to said second address translation after said memory operations, and establishing a second relationship between said task and said second memory object.
33. The microkernel architecture data processing system of claim 31, that further comprises:
an operating system personality program means in said memory means, for sending a call to said microkernel means to form said template region as a special object in said memory means.
34. The microkernel architecture data processing system of claim 33, that further comprises:
said operating system personality program means in said memory means, sending a call to said microkernel means to form said task in said memory means.
35. The microkernel architecture data processing system of claim 31, that further comprises:
a personality-neutral services program means in said memory means, sending a call to said microkernel means to form said template region as a special object in said memory means.
36. The microkernel architecture data processing system of claim 35, that further comprises:
said personality-neutral services program means in said memory means, for sending a call to said microkernel means to form said task in said memory means.
37. The microkernel architecture data processing system of claim 31, that further comprises:
an application program means in said memory means, for sending a call to said microkernel means to form said template region as a special object in said memory means.
38. The microkernel architecture data processing system of claim 37, that further comprises:
said application program means in said memory means, for sending a call to said microkernel means to form said task in said memory means.
39. A microkernel architecture data processing system, comprising:
a memory means, for storing a cache object containing programmed instructions;
a template region in said memory means, having a set of attributes and having a template pointer toward said cache object;
a microkernel means in said memory means, for creating tasks in said memory means by mapping said template region into a task;
a task in said memory means formed by said microkernel means, having said set of attributes and having a task pointer to said template region;
said template pointer and said task pointer establishing a first relationship between said task and said cache object;
a processor means coupled to said memory means, for executing said programmed instructions;
a thread in said memory means associated with said task, for fetching said programmed instructions from said cache object using said first relationship, for execution in said processor means.
40. The microkernel architecture data processing system of claim 39, that further comprises:
an operating system personality program means in said memory means, for sending a call to said microkernel means to form said template region as a special object in said memory means.
41. The microkernel architecture data processing system of claim 39, that further comprises:
an operating system personality program means in said memory means, sending a call to said microkernel means to form said task in said memory means.
42. The microkernel architecture data processing system of claim 39, that further comprises:
a personality-neutral services program means in said memory means, sending a call to said microkernel means to form said template region as a special object in said memory means.
43. The microkernel architecture data processing system of claim 39, that further comprises:
a personality-neutral services program means in said memory means, for sending a call to said microkernel means to form said task in said memory means.
44. The microkernel architecture data processing system of claim 39, that further comprises:
an application program means in said memory means, for sending a call to said microkernel means to form said template region as a special object in said memory means.
45. The microkernel architecture data processing system of claim 39, that further comprises:
an application program means in said memory means, for sending a call to said microkernel means to form said task in said memory means.
46. A microkernel architecture data processing system, comprising:
an auxiliary storage means for storing programmed instructions;
a memory means coupled to said auxiliary storage means, for storing a cache object, said auxiliary storage means paging into said cache object said programmed instructions;
a template region in said memory means, having a set of attributes and having a template pointer toward said cache object;
a microkernel means in said memory means, for creating tasks in said memory means by mapping said template region into a task;
a task in said memory means formed by said microkernel means, having said set of attributes and having a task pointer to said template region;
said template pointer and said task pointer establishing a first relationship between said task and said cache object;
a processor means coupled to said memory means, for executing said programmed instructions;
a thread in said memory means associated with said task, for fetching said programmed instructions from said cache object using said first relationship, for execution in said processor means.
47. A microkernel architecture data processing system, comprising:
an auxiliary storage means for storing programmed instructions;
a memory means coupled to said auxiliary storage means, for storing a cache object, said auxiliary storage means paging into said cache object said programmed instructions;
a template region in said memory means, having a set of attributes and having a first template pointer and a second template pointer toward said cache object;
a microkernel means in said memory means, for creating tasks in said memory means by mapping said template region into a task;
a first task in said memory means formed by said microkernel means, having said set of attributes and having a first task pointer to said template region;
said first template pointer and said first task pointer establishing a first relationship between said first task and said cache object;
a first processor means coupled to said memory means, for executing said programmed instructions;
a first thread in said memory means associated with said first task, for fetching said programmed instructions from said cache object using said first relationship, for execution in said first processor means;
a second task in said memory means formed by said microkernel means, having said set of attributes and having a second task pointer to said template region;
said second template pointer and said second task pointer establishing a second relationship between said second task and said cache object;
a second processor means coupled to said memory means, for executing said programmed instructions;
a second thread in said memory means associated with said second task, for fetching said programmed instructions from said cache object using said second relationship, for execution in said second processor means.
48. A microkernel architecture data processing system, comprising:
an auxiliary storage means for storing programmed instructions;
a memory means coupled to said auxiliary storage means, for storing a cache object, said auxiliary storage means paging into said cache object said programmed instructions;
a template region in said memory means, having a set of attributes and having a template pointer toward said cache object;
a microkernel means in said memory means, for creating tasks in said memory means by mapping said template region into a task;
a first task in said memory means formed by said microkernel means, having said set of attributes and having a first task pointer to said template region;
said template pointer and said first task pointer establishing a relationship between said first task and said cache object;
a first processor means coupled to said memory means, for executing said programmed instructions;
a first thread in said memory means associated with said first task, for fetching said programmed instructions from said cache object using said relationship, for execution in said first processor means;
a second task in said memory means formed by said microkernel means, having said set of attributes and having a second task pointer to said template region;
said template pointer and said second task pointer establishing said relationship between said second task and said cache object;
a second processor means coupled to said memory means, for executing said programmed instructions;
a second thread in said memory means associated with said second task, for fetching said programmed instructions from said cache object using said relationship, for execution in said second processor means.
49. An article of manufacture for use in a computer system, comprising:
a computer useable medium having computer readable program code means embodied therein for providing a memory management method for a microkernel architecture data processing system, the computer readable program code means in said article of manufacture comprising:
computer readable program code means for causing a computer to load a microkernel into a memory of a data processing system, for creating task containers in said memory;
computer readable program code means for causing a computer to form with said microkernel a template region as a special object in said memory, said template region having a set of attributes defining a virtual address space and having a template pointer to a memory object;
computer readable program code means for causing a computer to form with said microkernel a task container in said memory having said set of attributes and having a task pointer to said template region, by means of mapping said template region into said task container;
said template pointer and said task pointer establishing a first relationship between said task container and said memory object; and
computer readable program code means for causing a computer to perform virtual memory operations on said template region to modify said template pointer to said memory object, thereby establishing a second relationship between said task container and said memory object.
50. The article of manufacture for use in a computer system of claim 49, wherein said memory object is a map of virtual addresses to physical addresses in said memory, and includes a first address translation to a first memory object and a second address translation to a second memory object, the computer readable program code means in said article of manufacture further comprising:
said template pointer pointing to said first address translation prior to said memory operations, and establishing said first relationship between said task container and said first memory object;
said template pointer pointing to said second address translation after said memory operations, and establishing said second relationship between said task container and said second memory object.
51. The article of manufacture for use in a computer system of claim 49, wherein said memory object is a second template region, the computer readable program code means in said article of manufacture further comprising:
said template pointer pointing to said second template prior to said memory operations, and establishing said first relationship between said task container and said second template;
said template pointer pointing to a third template region after said memory operations, and establishing said second relationship.
52. The article of manufacture for use in a computer system of claim 49, wherein said computer readable program code means for causing a computer to form a task container in said memory, further comprises:
computer readable program code means for causing a computer to form with said microkernel a port name space in said memory for said task container, for use as a communication channel;
computer readable program code means for causing a computer to define with said microkernel, access rights for said port name space of said task container, using said set of attributes of said template region.
53. The article of manufacture for use in a computer system of claim 49, wherein said computer readable program code means for causing a computer to form a task container in said memory, further comprises:
computer readable program code means for causing a computer to form with said microkernel a thread object in said memory for said task container, for fetching instructions from said address space of said task container.
54. The article of manufacture for use in a computer system of claim 49, wherein said template region includes a base region and a user region, said computer readable program code means for causing a computer to form a template region in said memory, further comprising:
computer readable program code means for causing a computer to form with said microkernel a template base region as a special object in said memory, said template base region having a set of base attributes;
computer readable program code means for causing a computer to form with said microkernel a template user region as a special object in said memory, said template user region having a set of user attributes;
said computer readable program code means for causing a computer to form a task container in said memory, further comprising:
computer readable program code means for causing a computer to form with said microkernel a task base container in said memory having a base virtual address space and said set of base attributes, and having a task base pointer to said template base region, by means of mapping said template base region into said task base container;
computer readable program code means for causing a computer to form with said microkernel a task user container in said memory having a user virtual address space and said set of user attributes, and having a task user pointer to said template user region, by means of mapping said template user region into said task user container; and
computer readable program code means for causing a computer to perform virtual memory operations on said template base region, said virtual memory operations operative in said task base container by means of said task base pointer.
55. The article of manufacture for use in a computer system of claim 53, which further comprises:
computer readable program code means for causing a computer to perform virtual memory operations on said template user region, said virtual memory operations operative in said task user container by means of said task user pointer.
56. The article of manufacture for use in a computer system of claim 54, wherein said computer readable program code means for causing a computer to form a task base container in said memory, further comprises:
computer readable program code means for causing a computer to form with said microkernel a port name space in said memory for said task base container, for use as a communication channel;
computer readable program code means for causing a computer to define with said microkernel, access rights for said port name space of said task base container, using said set of base attributes of said template base region.
57. The article of manufacture for use in a computer system of claim 54, wherein said computer readable program code means for causing a computer to form a task base container in said memory, further comprises:
computer readable program code means for causing a computer to form with said microkernel a thread object in said memory for said task base container, for fetching instructions from said address space of said task base container.
Description
FIELD OF THE INVENTION
The invention disclosed broadly relates to data processing systems and more particularly relates to improvements in operating systems for data processing systems.
BACKGROUND OF THE INVENTION
The operating system is the most important software running on a computer. Every general purpose computer must have an operating system to run other programs. Operating systems typically perform basic tasks, such as recognizing input from the keyboard, sending output to the display screen, keeping track of files and directories on the disc, and controlling peripheral devices such as disc drives and printers. For more complex systems, the operating system has even greater responsibilities and powers. It makes sure that different programs and users running at the same time do not interfere with each other. The operating system is also typically responsible for security, ensuring that unauthorized users do not access the system.
Operating systems can be classified as multi-user operating systems, multi-processor operating systems, multi-tasking operating systems, and real-time operating systems. A multi-user operating system allows two or more users to run programs at the same time. Some operating systems permit hundreds or even thousands of concurrent users. A multi-processing operating system allows a single user to run two or more programs at the same time. Each program being executed is called a process. Most multi-processing systems support more than one user. A multi-tasking system allows a single process to run more than one task. In common terminology, the terms multi-tasking and multi-processing are often used interchangeably even though they have slightly different meanings. Multi-tasking is the ability to execute more than one task at the same time, a task being a program. In multi-tasking, only one central processing unit is involved, but it switches from one program to another so quickly that it gives the appearance of executing all of the programs at the same time. There are two basic types of multi-tasking, preemptive and cooperative. In preemptive multi-tasking, the operating system parcels out CPU time slices to each program. In cooperative multi-tasking, each program can control the CPU for as long as it needs it. If a program is not using the CPU, however, it can allow another program to use it temporarily. For example, the OS/2 (TM) and UNIX (TM) operating systems use preemptive multi-tasking, whereas the MultiFinder (TM) operating system for Macintosh (TM) computers uses cooperative multi-tasking. Multi-processing refers to a computer system's ability to support more than one process or program at the same time. Multi-processing operating systems enable several programs to run concurrently. Multi-processing systems are much more complicated than single-process systems because the operating system must allocate resources to competing processes in a reasonable manner.
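The cooperative case described above can be illustrated with a minimal sketch. It assumes Python generators stand in for programs, with each `yield` marking the point where a program voluntarily gives up the CPU. The names `program` and `run_cooperative` are hypothetical and do not come from any operating system discussed here; preemptive multi-tasking would additionally need a timer that interrupts a program, which is not shown.

```python
from collections import deque

def program(name, steps, log):
    """A hypothetical program that performs `steps` units of work."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield  # voluntarily hand the CPU back to the scheduler

def run_cooperative(programs):
    """Round-robin scheduler: each program runs until it chooses to yield."""
    ready = deque(programs)
    while ready:
        current = ready.popleft()
        try:
            next(current)          # run until the program's next yield
            ready.append(current)  # it still has work, so requeue it
        except StopIteration:
            pass                   # the program has finished

log = []
run_cooperative([program("A", 2, log), program("B", 3, log)])
```

In this sketch a program that never yields would hold the CPU indefinitely, which is the weakness of the cooperative approach.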
A real-time operating system responds to input instantaneously. General purpose operating systems such as DOS and UNIX are not real-time.
Operating systems provide a software platform on top of which application programs can run. The application programs must be specifically written to run on top of a particular operating system. The choice of the operating system therefore determines to a great extent the applications which can be run. For IBM compatible personal computers, example operating systems are DOS, OS/2 (TM), AIX (TM), and XENIX (TM).
A user normally interacts with the operating system through a set of commands. For example, the DOS operating system contains commands such as COPY and RENAME for copying files and changing the names of files, respectively. The commands are accepted and executed by a part of the operating system called the command processor or command line interpreter.
There are many different operating systems for personal computers such as CP/M (TM), DOS, OS/2 (TM), UNIX (TM), XENIX (TM), and AIX (TM). CP/M was one of the first operating systems for small computers. CP/M was initially used on a wide variety of personal computers, but it was eventually overshadowed by DOS. DOS runs on all IBM compatible personal computers and is a single user, single tasking operating system. OS/2, a successor to DOS, is a relatively powerful operating system that runs on IBM compatible personal computers that use the Intel 80286 or later microprocessor. OS/2 is generally compatible with DOS but contains many additional features, for example it is multi-tasking and supports virtual memory. UNIX and UNIX-based AIX run on a wide variety of personal computers and work stations. UNIX and AIX have become standard operating systems for work stations and are powerful multi-user, multi-processing operating systems.
In 1981 when the IBM personal computer was introduced in the United States, the DOS operating system occupied approximately 10 kilobytes of storage. Since that time, personal computers have become much more complex and require much larger operating systems. Today, for example, the OS/2 operating system for the IBM personal computers can occupy as much as 22 megabytes of storage. Personal computers become ever more complex and powerful as time goes by and it is apparent that the operating systems cannot continually increase in size and complexity without imposing a significant storage penalty on the storage devices associated with those systems.
It was because of this untenable growth rate in operating system size that the MACH project was conducted at the Carnegie Mellon University in the 1980s. The goal of that research was to develop a new operating system that would allow computer programmers to exploit emerging modern hardware architectures and yet reduce the size and the number of features in the kernel operating system. The kernel is the part of an operating system that performs basic functions such as allocating hardware resources. In the case of the MACH kernel, five programming abstractions were established as the basic building blocks for the system. They were chosen as the minimum necessary to produce a useful system on top of which the typical complex operations could be built externally to the kernel. The Carnegie Mellon MACH kernel was reduced in size in its release 3.0, and is a fully functional operating system called the MACH microkernel. The MACH microkernel has the following primitives: the task, the thread, the port, the message, and the memory object.
The task is the traditional UNIX process which is divided into two separate components in the MACH microkernel. The first component is the task, which contains all of the resources for a group of cooperating entities. Examples of resources in a task are virtual memory and communications ports. A task is a passive collection of resources; it does not run on a processor.
The thread is the second component of the UNIX process, and is the active execution environment. Each task may support one or more concurrently executing computations called threads. For example, a multi-threaded program may use one thread to compute scientific calculations while another thread monitors the user interface. A MACH task may have many threads of execution, all running simultaneously. Much of the power of the MACH programming model comes from the fact that all threads in a task share the task's resources. For instance, they all have the same virtual memory (VM) address space. However, each thread in a task has its own private execution state. This state consists of a set of registers, such as general purpose registers, a stack pointer, a program counter, and a frame pointer.
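The split between the passive task and the active thread can be sketched as follows. This is an illustrative model only, not the MACH data structures: the class names, the dictionary used as an address space, and the register names are assumptions made for the example.

```python
from dataclasses import dataclass, field

@dataclass
class Task:
    """Passive container of resources; it does not execute anything itself."""
    address_space: dict = field(default_factory=dict)  # virtual address -> value
    ports: list = field(default_factory=list)

@dataclass
class Thread:
    """Active execution environment with its own private register state."""
    task: Task
    registers: dict = field(
        default_factory=lambda: {"pc": 0, "sp": 0, "fp": 0}
    )

    def store(self, vaddr, value):
        # Writes land in the task's address space, shared by all its threads.
        self.task.address_space[vaddr] = value

    def load(self, vaddr):
        return self.task.address_space[vaddr]

task = Task()
compute = Thread(task)   # e.g. a thread doing calculations
ui = Thread(task)        # e.g. a thread monitoring the user interface

compute.store(0x1000, 42)    # visible to every thread in the task
ui.registers["pc"] = 0x2000  # private to the ui thread
```

Here `ui.load(0x1000)` sees the value written by `compute`, while changing the program counter of `ui` leaves the registers of `compute` untouched.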
A port is the communications channel through which threads communicate with each other. A port is a resource and is owned by a task. A thread gains access to a port by virtue of belonging to a task. Cooperating programs may allow threads from one task to gain access to ports in another task. An important feature of ports is that they are location transparent. This capability facilitates the distribution of services over a network without program modification.
The message is used to enable threads in different tasks to communicate with each other. A message contains collections of data which are given classes or types. This data can range from program specific data such as numbers or strings to MACH related data such as transferring capabilities of a port from one task to another.
A memory object is an abstraction which supports the capability to perform traditional operating system functions in user level programs, a key feature of the MACH microkernel. For example, the MACH microkernel supports virtual memory paging policy in a user level program. Memory objects are an abstraction to support this capability.
All of these concepts are fundamental to the MACH microkernel programming model and are used in the kernel itself. These concepts and other features of the Carnegie Mellon University MACH microkernel are described in the book by Joseph Boykin, et al, "Programming Under MACH", Addison-Wesley Publishing Company, Incorporated, 1993.
Additional discussions of the use of a microkernel to support a UNIX personality can be found in the article by Mike Accetta, et al, "MACH: A New Kernel Foundation for UNIX Development", Proceedings of the Summer 1986 USENIX Conference, Atlanta, Ga. Another technical article on the topic is by David Golub, et al, "UNIX as an Application Program", Proceedings of the Summer 1990 USENIX Conference, Anaheim, Calif.
One problem with current microkernel embodiments is that all of the virtual memory operations are based on the task. This is an inappropriate requirement, since many of the virtual memory calls are not necessarily limited to a single task. This also introduces a significant security exposure. A task's control port must be given to other tasks in order for another task to be able to manipulate portions of the virtual memory sub-system on a task's behalf. Once given a task's control port, the other task can breach the security of the first task. In addition, by having all of the virtual memory functions operating on the task, the task interfaces are overly complicated with all of the virtual memory calls that are not necessarily part of the task manipulation. Another problem with the current virtual memory interfaces for microkernels is that many of them operate on a range of memory. Since a task can be composed of many memory ranges, there is nothing in the interface definitions that naturally confines an operation to a single memory range. For example, where there is a range of memory that has been reserved, there is no obvious limitation to prevent an operation such as memory allocation from spanning the reserved range and adjacent unreserved ranges. This creates the unwanted consequence of corrupting the contents of objects in the memory.
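The range problem can be made concrete with a short sketch. It assumes a flat list of address ranges, one of them reserved, and contrasts a range-based request that silently overlaps two ranges with a hypothetical region-scoped allocation that rejects anything leaving its region. The names `Region`, `regions_touched`, and `allocate_in_region` are illustrative assumptions and are not interfaces defined by the source.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Region:
    start: int
    size: int
    reserved: bool = False

    @property
    def end(self):
        return self.start + self.size

def regions_touched(regions, start, size):
    """Return every region overlapping the range [start, start + size)."""
    end = start + size
    return [r for r in regions if r.start < end and start < r.end]

def allocate_in_region(region, offset, size):
    """Region-scoped allocation: refuse requests that leave the region."""
    if offset < 0 or size <= 0 or offset + size > region.size:
        raise ValueError("allocation exceeds region bounds")
    return (region.start + offset, size)

layout = [
    Region(0x0000, 0x1000),
    Region(0x1000, 0x1000, reserved=True),
    Region(0x2000, 0x1000),
]

# A range-based request starting in the first range spills into the
# reserved range; nothing in a pure (start, size) interface prevents it.
spanning = regions_touched(layout, 0x0800, 0x1000)

# The same request expressed against a single region is rejected,
# while a request that fits inside the region succeeds.
fits = allocate_in_region(layout[0], 0x0800, 0x0800)
```

The design point is that an operation addressed to one region has an obvious boundary to check against, whereas an operation addressed to a bare address range does not.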
OBJECT OF THE INVENTION
It is therefore an object of the invention to provide an improved microkernel architecture for a data processing system.
It is another object of the invention to provide an improved microkernel architecture for a data processing system that is more simplified in its virtual memory operations than has been possible in the prior art.
It is a further object of the invention to provide an improved microkernel architecture for a data processing system, that has enhanced security for tasks.
It is still a further object of the invention to provide an improved microkernel architecture for a data processing system, that has greater resistance to corruption of objects defined in the memory.
SUMMARY OF THE INVENTION
These and other objects, features and advantages are accomplished by the method and apparatus for management of mapped and unmapped regions in accordance with the invention described herein.
The invention is a memory management method for a microkernel architecture and the microkernel structure, itself. It features template regions that are defined by the microkernel in the memory as special objects. In the memory management method, after the microkernel is loaded into the memory of a data processing system, it can begin creating task containers in the memory. It does this by first forming template regions as special objects in the memory, each template region having a set of attributes to define corresponding task containers. The attributes can specify the resources available to a task for use by its threads, such as virtual memory, data, and communications ports. Then, the microkernel can form the task in the memory, by mapping the attributes specified by the template region into the task. The microkernel defines a virtual address space for the task based upon the template region. Later, when the microkernel conducts virtual memory operations on the template regions, the effect of the virtual memory operations is manifested in the task by means of the mapping relationship.
The microkernel defines a data structure representing a task at a virtual address, using a size attribute from the template region and a starting virtual address for the task. The microkernel defines a virtual address space for the task over which the task's threads can conduct their operations, using an attribute from the template region. The microkernel also defines a task name for the task and forms a port name space in the memory for the task to use as a communications channel. The microkernel defines access rights for the port name space for the task using the set of attributes from the template region. The microkernel can then form thread objects in the memory for the task, for fetching instructions from the virtual address space of the task.
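The task-forming steps in the paragraph above can be sketched as follows. This is a simplified model under stated assumptions: the attribute names `size` and `port_rights`, the `form_task` helper, and the use of a dictionary for the port name space are all hypothetical and chosen only to mirror the prose.

```python
from dataclasses import dataclass, field

@dataclass
class TemplateRegion:
    size: int                # size attribute used to lay out the task
    port_rights: frozenset   # hypothetical access-rights attribute

@dataclass
class Task:
    name: str
    start: int               # starting virtual address of the task
    size: int                # taken from the template region
    region: TemplateRegion   # pointer back to the template region
    port_rights: frozenset   # access rights for the port name space
    port_name_space: dict = field(default_factory=dict)
    threads: list = field(default_factory=list)

def form_task(name, region, start):
    """Form a task by mapping the template region's attributes into it."""
    return Task(
        name=name,
        start=start,
        size=region.size,
        region=region,
        port_rights=region.port_rights,
    )

template = TemplateRegion(size=0x4000, port_rights=frozenset({"send", "receive"}))
task_a = form_task("T(A)", template, start=0x10000)
task_b = form_task("T(B)", template, start=0x20000)
```

Both tasks take their size and port access rights from the same template region and keep a pointer to it, while each receives its own port name space and thread list.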
In accordance with the invention, the microkernel defines a first pointer for the task that points to the template region. Within the template region, there are second pointers that point directly or indirectly to a mapping table called a PMAP. The PMAP converts the virtual address value of the second pointers, into a physical address of a cache object in the memory that contains a page of data to be used by the task. From time to time, changes are desired to be made in the data resources of a task. This is accomplished by changing the virtual address value represented by the second pointers in the template region. The changed second pointers can point to different translation values in the PMAP, resulting in the addressing of different pages or cache objects, as is desired. But, no changes are necessary to the contents of the task, itself. Task resources, such as data in cache objects, are addressed by the task through the second pointers in the template region. If the microkernel has defined a plurality of tasks from the template region, then changing the second pointers in the template region will result in a global change in the resources available to the plurality of tasks pointing to that template region. The data pages addressed by the tasks can be changed with a single change to the second pointers in the template region, instead of changing the contents of each one of the tasks. In this manner, a single template region can be mapped into multiple tasks, simultaneously. Each task sees all of the changes that are made to the template region. This allows for the sharing of properties by the tasks that were generated from the same template region.
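The two levels of indirection described above can be illustrated with a minimal sketch. It assumes the PMAP is a plain dictionary from virtual to physical addresses and that a template region holds a single second pointer; the class and attribute names are hypothetical, and the page labels D0 and D1 are used only as example cache object contents.

```python
class TemplateRegion:
    def __init__(self, second_pointer):
        # Virtual address value that selects a translation in the PMAP.
        self.second_pointer = second_pointer

class Task:
    def __init__(self, region):
        # First pointer: the task refers to the template region,
        # never directly to the data page.
        self.first_pointer = region

    def read(self, pmap, memory):
        vaddr = self.first_pointer.second_pointer
        paddr = pmap[vaddr]      # PMAP: virtual address -> physical address
        return memory[paddr]     # cache object page at that physical address

pmap = {0x1000: 0xA000, 0x2000: 0xB000}
memory = {0xA000: "D0", 0xB000: "D1"}

region = TemplateRegion(second_pointer=0x1000)
task_a = Task(region)
task_b = Task(region)

before = (task_a.read(pmap, memory), task_b.read(pmap, memory))

# One change to the template region; neither task is modified.
region.second_pointer = 0x2000

after = (task_a.read(pmap, memory), task_b.read(pmap, memory))
```

A single assignment to the region's second pointer switches both tasks from page D0 to page D1, which is the global change the paragraph describes.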
The template region is the object to which all virtual memory operations are directed. In the past, with the MACH microkernel, a task was the object to which the virtual memory operations were directed. By directing virtual memory operations to the template region on which they will take effect, in accordance with the invention, the sharing of the virtual memory operations is much easier to accomplish since the changes are made to a region, not to the mapping of the region within each task.
DESCRIPTION OF THE FIGURES
These and other objects, features and advantages will be more fully appreciated with reference to the accompanying figures.
FIG. 1 is a functional block diagram of the Microkernel System 115 in the memory 102 of the host multiprocessor 100, showing how the microkernel and personality-neutral services 140 run multiple operating system personalities on a variety of hardware platforms.
FIG. 2 shows the client visible structure associated with a thread.
FIG. 3 shows the client visible task structures.
FIG. 4 shows a typical port, illustrating a series of send rights and the single receive right.
FIG. 5 shows a series of port rights, contained in a port name space or in transit in a message.
FIG. 6 shows the client visible virtual memory structures.
FIG. 7A shows three template regions formed in the microkernel address space, a nestable base region R0, and two leaf regions R1 and R2, whose address spaces are within that of R0. Template regions R0, R1, and R2 have second pointers that point to the PMAP translations into the cache object data pages D0 and D1.
FIG. 7B shows a first task T(A) formed by the microkernel using the attributes of template region R0. Task T(A) has first pointers that point to template region R0. The second pointers of the template regions R0, R1, and R2 provide task T(A) with access to the cache object data pages D0 and D1.
FIG. 7C shows a second task T(B) formed by the microkernel using the attributes of template region R0. Task T(B) has first pointers that point to template region R0. The second pointers of the template regions R0, R1, and R2 provide task T(B) with access to the cache object data pages D0 and D1, in the same manner as that for task T(A).
FIG. 8A shows mapping a template region into two tasks.
FIG. 8B shows mapping a template region into a task twice.
FIG. 9 shows overlapping template regions.
FIG. 10 shows memory allocations and overlapping template regions.
FIG. 11 shows condensing template regions.
FIG. 12 shows mapping memory objects.
FIG. 13 shows incorrect mapping of memory objects.
FIG. 14 shows virtual memory components.
FIG. 15 shows a task's template regions.
FIG. 16 shows tasks with different kernel interface libraries.
FIG. 17 shows a kernel task's template regions.
DISCUSSION OF THE PREFERRED EMBODIMENT
Part A. The Microkernel System
Section 1. Microkernel Principles
FIG. 1 is a functional block diagram of the Microkernel System 115, showing how the microkernel 120 and personality-neutral services 140 run multiple operating system personalities 150 on a variety of hardware platforms.
The host multi-processor 100 shown in FIG. 1 includes memory 102 connected by means of a bus 104 to an auxiliary storage 106 which can be for example a disc drive, a read only or a read/write optical storage, or any other bulk storage device. Also connected to the bus 104 is the I/O adaptor 108 which in turn may be connected to a keyboard, a monitor display, a telecommunications adaptor, a local area network adaptor, a modem, multi-media interface devices, or other I/O devices. Also connected to the bus 104 is a first processor A, 110 and a second processor B, 112. The example shown in FIG. 1 is of a symmetrical multi-processor configuration wherein the two uni-processors 110 and 112 share a common memory address space 102. Other configurations of single or multiple processors can be shown as equally suitable examples. The processors can be, for example, an Intel 386 (TM) CPU, Intel 486 (TM) CPU, a Pentium (TM) processor, a Power PC (TM) processor, or other uni-processor devices.
The memory 102 includes the microkernel system 115 stored therein, which comprises the microkernel 120, the personality neutral services (PNS) 140, and the personality servers 150. The microkernel system 115 serves as the operating system for the application programs 180 stored in the memory 102.
An objective of the invention is to provide an operating system that behaves like a traditional operating system such as UNIX or OS/2. In other words, the operating system will have the personality of OS/2 or UNIX, or some other traditional operating system.
The microkernel 120 contains a small, message-passing nucleus of system software running in the most privileged state of the host multi-processor 100, that controls the basic operation of the machine. The microkernel system 115 includes the microkernel 120 and a set of servers and device drivers that provide personality neutral services 140. As the name implies, the personality neutral servers and device drivers are not dependent on any personality such as UNIX or OS/2. They depend on the microkernel 120 and upon each other. The personality servers 150 use the message passing services of the microkernel 120 to communicate with the personality neutral services 140. For example, UNIX, OS/2 or any other personality server can send a message to a personality neutral disc driver and ask it to read a block of data from the disc. The disc driver reads the block and returns it in a message. The message system is optimized so that large amounts of data are transferred rapidly by manipulating pointers; the data itself is not copied.
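The idea of transferring data by manipulating pointers rather than copying can be sketched in miniature. This is only an analogy under stated assumptions: a Python `memoryview` stands in for a pointer to the data block, and the `Port` and `Message` classes are hypothetical. A real microkernel would typically achieve the same effect by remapping pages between address spaces, which is not modeled here.

```python
from collections import deque

class Message:
    def __init__(self, payload):
        self.payload = payload  # a reference to the data, not a copy

class Port:
    """Hypothetical message queue standing in for a microkernel port."""
    def __init__(self):
        self.queue = deque()

    def send(self, message):
        self.queue.append(message)

    def receive(self):
        return self.queue.popleft()

# The disc driver has read a block into this buffer.
block = bytearray(4096)
block[0] = 0x7F

port = Port()
port.send(Message(memoryview(block)))  # only a view of the block is queued
reply = port.receive()
```

The receiver reads the driver's buffer through the view, so the 4096 bytes are never duplicated on the way through the port.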
By virtue of its size and ability to support standard programming services and features as application programs, the microkernel 120 is simpler than a standard operating system. The microkernel system 115 is broken down into modular pieces that are configured in a variety of ways, permitting larger systems to be built by adding pieces to the smaller ones. For example, each personality neutral server 140 is logically separate and can be configured in a variety of ways. Each server runs as an application program and can be debugged using application debuggers. Each server runs in a separate task and errors in the server are confined to that task.
FIG. 1 shows the microkernel 120 including the interprocess communications module (IPC) 122, the virtual memory module 124, tasks and threads module 126, the host and processor sets 128, I/O support and interrupts 130, and machine dependent code 125.
The personality neutral services 140 shown in FIG. 1 include the multiple personality support 142, which includes the master server, initialization, and naming; the default pager 144; the device support 146, which includes multiple personality support and device drivers; and other personality neutral products 148, including a file server, network services, database engines and security.
The personality servers 150 are for example the dominant personality 152 which can be, for example, a UNIX personality. It includes a dominant personality server 154 which would be a UNIX server, and other dominant personality services 155 which would support the UNIX dominant personality. An alternate dominant personality 156 can be for example OS/2. Included in the alternate personality 156 are the alternate personality server 158 which would characterize the OS/2 personality, and other alternate personality services for OS/2, 159.
Dominant personality applications 182 shown in FIG. 1, associated with the UNIX dominant personality example, are UNIX-type applications which would run on top of the UNIX operating system personality 152. The alternate personality applications 186 shown in FIG. 1, are OS/2 applications which run on top of the OS/2 alternate personality operating system 156.
FIG. 1 shows that the Microkernel System 115 carefully splits its implementation into code that is completely portable from processor type to processor type and code that is dependent on the type of processor in the particular machine on which it is executing. It also segregates the code that depends on devices into device drivers; however, the device driver code, while device dependent, is not necessarily dependent on the processor architecture. Using multiple threads per task, it provides an application environment that permits the use of multi-processors without requiring that any particular machine be a multi-processor. On uni-processors, different threads run at different times. All of the support needed for multiple processors is concentrated into the small and simple microkernel 120.
This section provides an overview of the structure of the Microkernel System 115. Later sections describe each component of the structure in detail and describe the technology necessary to build a new program using the services of the Microkernel System 115.
The Microkernel System 115 is a new foundation for operating systems. It provides a comprehensive environment for operating system development with the following features:
Support for multiple personalities
Extensible memory management
Interprocess communication
Multi-threading
Multi-processing
The Microkernel System 115 provides a concise set of kernel services implemented as a pure kernel and an extensive set of services for building operating system personalities implemented as a set of user-level servers.
Objectives of the Microkernel System 115 include the following:
Permit multiple operating system personalities to work together in harmony;
Provide common programming for low-level system elements, such as device drivers and file systems;
Exploit parallelism in both operating system and user applications;
Support large, potentially sparse address spaces with flexible memory sharing;
Allow transparent network resource access;
Be compatible with existing software environments, such as OS/2 and UNIX; and
Be portable (to 32-bit and 64-bit platforms).
The Microkernel System 115 is based on the following concepts:
User mode tasks performing many traditional operating system functions (for example, file system and network access);
A basic set of user-level run time services for creating operating systems;
A simple, extensible communication kernel;
An object basis with communication channels as object references; and
A client/server programming model, using synchronous and asynchronous inter-process communication.
The basis for the Microkernel System 115 is to provide a simple, extensible communication kernel. It is an objective of the Microkernel System 115 to permit the flexible configuration of services in either user or kernel space with the minimum amount of function in the kernel proper. The kernel must provide other support besides task-to-task communication, including:
Management of points of control (threads);
Resource assignment (tasks);
Support of address spaces for tasks; and
Management of physical resources, such as physical memory, processors, interrupts, DMA channels, and clocks.
User mode tasks implement the policies regarding resource usage. The kernel simply provides mechanisms to enforce those policies. Logically above the kernel is the Personality-Neutral services 140 (PNS) layer. The PNS provide a C runtime environment, including such basic constructs as string functions, and a set of servers which include:
Name Server--Allows a client to find a server
Master Server--Allows programs to be loaded and started
Kernel Abstractions
One goal of the Microkernel System 115 is to minimize abstractions provided by the kernel itself, but not to be minimal in the semantics associated with those abstractions. Each of the abstractions provided has a set of semantics associated with it, and a complex set of interactions with the other abstractions. This can make it difficult to identify key ideas. The main kernel abstractions are:
Task--Unit of resource allocation, large address space and port rights
Thread--Unit of CPU utilization, lightweight (low overhead)
Port--A communication channel, accessible only through the send/receive capabilities or rights
Message--A collection of data objects
Memory object--The internal unit of memory management
(Refer to Section 2, Architectural Model, for a detailed description of the task, thread, port, message and memory object concepts).
Tasks and Threads
The Microkernel System 115 does not provide the traditional concept of process because:
All operating system environments have considerable semantics associated with a process (such as user ID, signal state, and so on). It is not the purpose of the microkernel to understand or provide these extended semantics.
Many systems equate a process with an execution point of control. Some systems do not.
The microkernel 120 supports multiple points of control separately from the operating system environment's process. The microkernel provides the following two concepts:
Task
Thread
(Refer to Section 2, Architectural Model, for a detailed description of the task and thread concepts).
Memory Management
The kernel provides some memory management. Memory is associated with tasks. Memory objects are the means by which tasks take control over memory management. The Microkernel System 115 provides the mechanisms to support large, potentially sparse virtual address spaces. Each task has an associated address map that is maintained by the kernel and controls the translation of virtual addresses in the task's address space into physical addresses. As in other virtual memory systems, the contents of the entire address space of any given task might not be completely resident in physical memory at the same time, and mechanisms must exist to use physical memory as a cache for the virtual address spaces of tasks. Unlike traditional virtual memory designs, the Microkernel System 115 does not implement all of the caching itself. It gives user mode tasks the ability to participate in these mechanisms. The PNS include a user task, the default pager 144, that provides paging services for memory.
Unlike other resources in the Microkernel System 115, virtual memory is not referenced using ports. Memory can be referenced only by using virtual addresses as indices into a particular task's address space. The memory and the associated address map that defines a task's address space can be partially shared with other tasks. A task can allocate new ranges of memory within its address space, de-allocate them, and change protections on them. It can also specify inheritance properties for the ranges. A new task is created by specifying an existing task as a base from which to construct the address space for the new task. The inheritance attribute of each range of the memory of the existing task determines whether the new task has that range defined and whether that range is virtually copied or shared with the existing task. Most virtual copy operations for memory are achieved through copy-on-write optimizations. A copy-on-write optimization is accomplished by protected sharing. The two tasks share the memory to be copied, but with read-only access. When either task attempts to modify a portion of the range, that portion is copied at that time. This lazy evaluation of memory copies is an important performance optimization performed by the Microkernel System 115 and important to the communication/memory philosophy of the system.
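The copy-on-write optimization described above can be illustrated with a minimal Python sketch. It is a toy model under stated assumptions, not microkernel code: the names Page, Task, virtual_copy, read and write are hypothetical, every range is treated as virtually copied (the inheritance attribute is not modeled), and a reference count stands in for the read-only protection that triggers the copy.

```python
class Page:
    """A physical page; refs counts how many address maps point at it."""
    def __init__(self, data):
        self.data = bytearray(data)
        self.refs = 1


class Task:
    """Toy task: an address map from virtual page number to a Page."""
    def __init__(self):
        self.pages = {}

    def virtual_copy(self):
        """Create a new task that shares every page (lazy copy)."""
        child = Task()
        for vpn, page in self.pages.items():
            page.refs += 1              # share the page instead of copying it
            child.pages[vpn] = page
        return child

    def read(self, vpn):
        return bytes(self.pages[vpn].data)

    def write(self, vpn, offset, value):
        page = self.pages[vpn]
        if page.refs > 1:               # still shared: copy this page only, and only now
            page.refs -= 1
            page = Page(page.data)
            self.pages[vpn] = page
        page.data[offset] = value


parent = Task()
parent.pages[0] = Page(b"aaaa")
parent.pages[1] = Page(b"bbbb")
child = parent.virtual_copy()
child.write(0, 0, ord("z"))             # only page 0 is physically copied
```

In this sketch the untouched page 1 remains a single physical page referenced by both tasks, which is the saving the lazy evaluation is meant to provide.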
Any given region of memory is backed by a memory object. A memory manager task provides the policy governing the relationship between the image of a set of pages while cached in memory (the physical memory contents of a memory region) and the image of that set of pages when not cached (the abstract memory object). The PNS has a default memory manager or pager that provides basic non-persistent memory objects that are zero-filled initially and paged against system paging space.
Task to Task Communication
The Microkernel System 115 uses a client/server system structure in which tasks (clients) access services by making requests of other tasks (servers) through messages sent over a communication channel. Since the microkernel 120 provides very few services of its own (for example, it provides no file service), a microkernel 120 task must communicate with many other tasks that provide the required services. The communication channels of the interprocess communication (IPC) mechanism are called ports. (Refer to Section 2, Architectural Model, for a detailed description of a Port). A message is a collection of data, memory regions, and port rights. A port right is a name by which a task, that holds the right, names the port. A task can manipulate a port only if it holds the appropriate port rights. Only one task can hold the receive right for a port. This task is allowed to receive (read) messages from the port queue. Multiple tasks can hold send rights to the port that allow them to send (write) messages into the queue. A task communicates with another task by building a data structure that contains a set of data elements, and then performing a message-send operation on a port for which it holds a send right. At some later time, the task holding the receive right to that port performs a message-receive operation.
Note: This message transfer is an asynchronous operation. The message is logically copied into the receiving task (possibly with copy-on-write optimizations). Multiple threads within the receiving task can be attempting to receive messages from a given port, but only one thread will receive any given message.
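The relationship between a port, its single receiver and its multiple senders can be sketched as follows. This is an illustrative Python model only; the Port class, its method names and the task names are assumptions made for the example and are not the kernel interface.

```python
from collections import deque


class Port:
    """Toy port: one message queue, one receive-right holder, many senders."""
    def __init__(self, receiver):
        self.queue = deque()
        self.receiver = receiver        # only one task holds the receive right
        self.senders = set()            # any number of tasks may hold send rights

    def grant_send(self, task):
        self.senders.add(task)

    def send(self, task, message):
        if task not in self.senders:
            raise PermissionError(f"{task} holds no send right")
        self.queue.append(message)      # asynchronous: the sender does not wait

    def receive(self, task):
        if task != self.receiver:
            raise PermissionError(f"{task} holds no receive right")
        return self.queue.popleft() if self.queue else None


port = Port(receiver="file_server")
port.grant_send("client_a")
port.grant_send("client_b")
port.send("client_a", "open request")
port.send("client_b", "read request")
first = port.receive("file_server")     # received at some later time
```

A task without a send right cannot enqueue a message, and only the holder of the receive right can dequeue one.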
Section 2. Architectural Model
The Microkernel System 115 has, as its primary responsibility, the provision of points of control that execute instructions within a framework. These points of control are called threads. Threads execute in a virtual environment. The virtual environment provided by the kernel contains a virtual processor that executes all of the user space accessible hardware instructions, augmented by user-space PNS and emulated instructions (system traps) provided by the kernel. The virtual processor accesses a set of virtualized registers and some virtual memory that otherwise responds as does the machine's physical memory. All other hardware resources are accessible only through special combinations of memory accesses and emulated instructions. Note that all resources provided by the kernel are virtualized. This section describes the top level elements of the virtual environment as seen by threads.
Elements of the Personality Neutral Services (PNS)
The PNS 140 portion of the Microkernel System 115 consists of services built on the underlying microkernel 120. This provides some functions that the kernel itself depends on, as well as a basic set of user-level services for the construction of programs. These programs can serve requests from multiple operating system personality clients and are used to construct the operating system personalities themselves. In addition, there is an ANSI C run time environment for the construction of PNS programs in standard C and some supplemental functions that have definitions taken from the POSIX standard. Besides the libraries that define the PNS themselves, there are many libraries that exist within the PNS that are a part of the microkernel proper. These libraries represent the interfaces that the microkernel exports and the support logic for the Message Interface Generator (MIG) which is used with the Microkernel System's 115 interprocess communications facilities.
The structure of the PNS environment library hides the details of the implementation of each service from its callers. Some libraries, such as one of the C run time libraries, implement all of their functions as local routines that are loaded into the address space of the caller, while other libraries consist of stubs that invoke the microkernel's IPC system to send messages to servers. This architecture permits the flexible implementation of function: servers can be replaced by other servers and services can be combined into single tasks without affecting the sources of the programs that use them. A key element of the PNS environment is that it does not constitute a complete operating system. Instead, the PNS depend on the existence of a personality. The dominant personality 152, which is loaded first during system start-up, is the operating system personality which provides the user interface on the system and provides services to its clients and to elements of the PNS. Thus, the dominant personality is a server of "last resort". The dominant personality implements whatever services are defined by the PNS libraries but are not implemented by another server.
The microkernel 120 is also dependent on some elements of the PNS. There are cases when it sends messages to personality-neutral servers to complete internal kernel operations. For example, in resolving a page fault, the microkernel 120 may send a message to the default pager 144. The default pager 144 then reads in the page that the kernel needs from a hard disk. Although the page fault is usually being resolved on behalf of a user task, the kernel is the sender of the message.
Run Time
The PNS run time provides a set of ANSI C and POSIX libraries that are used to support a standard C programming environment for programs executing in this environment. The facilities include typical C language constructs. Like all systems, the Microkernel System 115 has, as its primary responsibility, the provision of points of control that execute instructions within a framework. In the microkernel 120, points of control are called threads. Threads execute in a virtual environment. The virtual environment provided by the microkernel 120 consists of a virtual processor that executes all of the user space accessible hardware instructions, augmented by emulated instructions (system traps) provided by the kernel; the virtual processor accesses a set of virtualized registers and some virtual memory that otherwise responds as does the machine's physical memory. All other hardware resources are accessible only through special combinations of memory accesses and emulated instructions. Note that all resources provided by the microkernel are virtualized. This section describes the top level elements of the virtual environment seen by the microkernel threads.
Elements of the Kernel
The microkernel 120 provides an environment consisting of the elements described in the following list of Kernel Elements:
Thread:
An execution point of control. A thread is a lightweight entity. Most of the state pertinent to a thread is associated with its containing task.
Task:
A container to hold references to resources in the form of a port name space, a virtual address space, and a set of threads.
Security Token:
A security feature passed from the task to a server, which performs access validations.
Port:
A unidirectional communication channel between tasks.
Port Set:
A set of ports which can be treated as a single unit when receiving a message.
Port Right:
Allows specific rights to access a port.
Port Name Space:
An indexed collection of port names that names a particular port right.
Message:
A collection of data, memory regions and port rights passed between two tasks.
Message Queue:
A queue of messages associated with a single port.
Virtual Address Space:
A sparsely populated, indexed set of memory pages that can be referenced by the threads within a task. Ranges of pages might have arbitrary attributes and semantics associated with them through mechanisms implemented by the kernel and external memory managers.
Abstract Memory Object:
An abstract object that represents the non-resident state of the memory ranges backed by this object. The task that implements this object is called a memory manager. The abstract memory object port is the port through which the kernel requests action of the memory manager.
Memory Object Representative:
The abstract representation of a memory object provided by the memory manager to clients of the memory object. The representative names the associated abstract memory object and limits the potential access modes permitted to the client.
Memory Cache Object:
A kernel object that contains the resident state of the memory ranges backed by an abstract memory object. It is through this object that the memory manager manipulates the clients' visible memory image.
Processor:
A physical processor capable of executing threads.
Processor Set:
A set of processors, each of which can be used to execute the threads assigned to the processor set.
Host:
The multiprocessor as a whole.
Clock:
A representation of the passage of time. A time value incremented at a constant frequency.
Many of these elements are kernel implemented resources that can be directly manipulated by threads. Each of these elements is discussed in detail in the paragraphs that follow. However, since some of their definitions depend on the definitions of others, some of the key concepts are discussed in simplified form so that a full discussion can be understood.
Threads
A thread is a lightweight entity. It is inexpensive to create and requires low overhead to operate. A thread has little state (mostly its register state). Its owning task bears the burden of resource management. On a multiprocessor it is possible for multiple threads in a task to execute in parallel. Even when parallelism is not the goal, multiple threads have an advantage because each thread can use a synchronous programming style, instead of asynchronous programming with a single thread attempting to provide multiple services.
A thread contains the following features:
1. a point of control flow in a task or a stream of instruction execution;
2. access to all of the elements of the containing task;
3. executes in parallel with other threads, even threads within the same task; and
4. minimal state for low overhead.
A thread is the basic computational entity. A thread belongs to only one task that defines its virtual address space. To affect the structure of the address space, or to reference any resource other than the address space, the thread must execute a special trap instruction. This causes the kernel to perform operations on behalf of the thread, or to send a message to an agent on behalf of the thread. These traps manipulate resources associated with the task containing the thread. Requests can be made of the kernel to manipulate these entities: to create and delete them and affect their state. The kernel is a manager that provides resources (such as those listed above) and services. Tasks may also provide services, and implement abstract resources. The kernel provides communication methods that allow a client task to request that a server task (actually, a thread executing within it) provide a service. In this way, a task has a dual identity. One identity is that of a resource managed by the kernel, whose resource manager executes within the kernel. The second identity is that of a supplier of resources for which the resource manager is the task itself.
A thread has the following state:
1. Its machine state (registers, etc.), which changes as the thread executes and which can also be changed by a holder of the kernel thread port;
2. A small set of thread specific port rights, identifying the thread's kernel port and ports used to send exception messages on behalf of the thread;
3. A suspend count, non-zero if the thread is not to execute instructions; and
4. Resource scheduling parameters.
A thread operates by executing instructions in the usual way. Various special instructions trap to the kernel, to perform operations on behalf of the thread. The most important of these kernel traps is the mach_msg_trap. This trap allows the thread to send messages to the kernel and other servers to operate upon resources. This trap is almost never directly called; it is invoked through the mach_msg library routine. Exceptional conditions, such as "floating point overflow" and "page not resident", that arise during the thread's execution, are handled by sending messages to a port. The port used depends on the nature of the condition. The outcome of the exceptional condition is determined by setting the thread's state and/or responding to the exception message. The following operations can be performed on a thread:
Creation and destruction;
Suspension and resumption (manipulating the suspend count);
Machine state manipulation;
Special port (such as exception port) manipulation; and
Resource (scheduling) control.
Tasks
A task is a collection of system resources. These resources, with the exception of the address space, are referenced by ports. These resources can be shared with other tasks if rights to the ports are so distributed.
Tasks provide a large, potentially sparse address space, referenced by machine address. Portions of this space can be shared through inheritance or external memory management. Note: A task has no life of its own. It contains threads which execute instructions. When it is said "a task Y does X" what is meant is "a thread contained within task Y does X". A task is an expensive entity. All of the threads in a task share everything. Two tasks share nothing without explicit action, although the action is often simple. Some resources such as port receive rights cannot be shared between two tasks. A task can be viewed as a container that holds a set of threads. It contains default values to be applied to its contained threads. Most importantly, it contains those elements that its contained threads need to execute, namely, a port name space and a virtual address space.
The state associated with a task is as follows:
The set of contained threads;
The associated virtual address space;
The associated port name space, naming a set of port rights, and a related set of port notification requests;
A security token to be sent with messages from the task;
A small set of task specific ports, identifying the task's kernel port, default ports to use for exception handling for contained threads, and bootstrap ports to name other services;
A suspend count, non-zero if no contained threads are to execute instructions;
Default scheduling parameters for threads; and
Various statistics, including statistical PC samples.
Tasks are created by specifying a prototype task which specifies the host on which the new task is created, and which can supply by inheritance various portions of its address space.
The following operations can be performed on a task:
Creation and destruction
Setting the security token
Suspension and resumption
Special port manipulation
Manipulation of contained threads
Manipulation of the scheduling parameters
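The interaction of the thread suspend count and the task suspend count listed above can be sketched in Python. This is a toy model: the class and function names are hypothetical, and the rule that a thread may execute only when both counts are zero is read directly from the two state descriptions rather than from any kernel source.

```python
class Task:
    """Toy task: a container with its own suspend count."""
    def __init__(self):
        self.suspend_count = 0
        self.threads = []


class Thread:
    """Toy thread: executes only if no suspension applies to it."""
    def __init__(self, task):
        self.task = task
        self.suspend_count = 0
        task.threads.append(self)

    def may_execute(self):
        return self.suspend_count == 0 and self.task.suspend_count == 0


def suspend(entity):
    """Increment the suspend count of a task or a thread."""
    entity.suspend_count += 1


def resume(entity):
    """Decrement the suspend count; an unmatched resume is an error here."""
    if entity.suspend_count == 0:
        raise ValueError("not suspended")
    entity.suspend_count -= 1


task = Task()
t1, t2 = Thread(task), Thread(task)
suspend(t1)      # only t1 stops
suspend(task)    # now no contained thread may execute
resume(task)     # t2 may run again; t1 still carries its own count
```

Because the counts nest, each suspension must be matched by one resumption before the thread executes again.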
Security Port
All tasks are tagged with a security token, an identifier that is opaque from the kernel's point of view. It encodes the identity and other security attributes of the task. This security token is included as an implicit value in all messages sent by the task. Trusted servers can use this sent token as an indication of the sender's identity for use in making access mediation decisions. A task inherits the security token of its parent. Because this token is to be used as an unforgeable indication of identity, privilege is required to change this token. This privilege is indicated by presenting the host security port.
A reserved value indicates the kernel's identity. All messages from the kernel carry the kernel identity, except exception messages, which carry the excepting task's identity.
Port
A port is a unidirectional communication channel between a client that requests a service and a server that provides the service. A port has a single receiver and potentially multiple senders. The state associated with a port is as follows:
Its associated message queue
A count of references (rights) to the port
Settable limits on the amount of virtual copy memory and port rights that can be sent in a message through the port.
Kernel services exist to allocate ports. All system entities other than virtual memory ranges are named by ports; ports are also created implicitly when these entities are created. The kernel provides notification messages upon the death of a port upon request. With the exception of the task's virtual address space, all other system resources are accessed through a level of indirection known as a port. A port is a unidirectional communication channel between a client who requests service and a server who provides the service. If a reply is to be provided to such a service request, a second port must be used. The service to be provided is determined by the manager that receives the message sent over the port. It follows that the receiver for ports associated with kernel provided entities is the kernel. The receiver for ports associated with task provided entities is the task providing that entity. For ports that name task provided entities, it is possible to change the receiver of messages for that port to a different task. A single task might have multiple ports that refer to resources it supports. Any given entity can have multiple ports that represent it, each implying different sets of permissible operations. For example, many entities have a name port and a control port that is sometimes called the privileged port. Access to the control port allows the entity to be manipulated. Access to the name port simply names the entity, for example, to return information.
There is no system-wide name space for ports. A thread can access only the ports known to its containing task. A task holds a set of port rights, each of which names a (not necessarily distinct) port and which specifies the rights permitted for that port. Port rights can be transmitted in messages. This is how a task gets port rights. A port right is named with a port name, which is an integer chosen by the kernel that is meaningful only within the context (port name space) of the task holding that right. Most operations in the system consist of sending a message to a port that names a manager for the object being manipulated. In this document, this is shown in the form:
object → function
which means that the function is invoked (by sending an appropriate message) to a port that names the object. Since a message must be sent to a port (right), this operation has an object basis. Some operations require two objects, such as binding a thread to a processor set. These operations show the objects separated by commas. Not all entities are named by ports, and this is not a pure object model. The two main non-port-right named entities are port names/rights themselves, and ranges of memory. Event objects are also named by task local IDs. To manipulate a memory range, a message is sent to the containing virtual address space named by the owning task. To manipulate a port name/right, and often, the associated port, a message is sent to the containing port name space named by the owning task. A subscript notation,
object [id] → function
is used here to show that an id is required as a parameter in the message to indicate which range or element of object is to be manipulated. The parenthetic notation,
object (port) → function
is used here to show that a privileged port, such as the host control port, is required as a parameter in the message to indicate sufficient privilege to manipulate the object in the particular way.
Port Sets
A port set is a set of ports that can be treated as a single unit when receiving a message. A mach_msg receive operation is allowed against a port name that names either a receive right or a port set. A port set contains a collection of receive rights. When a receive operation is performed against a port set, a message is received from one of the ports in the set. The received message indicates from which member port it was received. A message cannot be received directly from a port that is a member of a port set. There is no concept of priority for the ports in a port set; there is no control provided over the kernel's choice of the port within the port set from which any given message is received.
Operations supported for port sets include:
Creation and deletion
Membership changes and membership queries
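A receive operation against a port set can be sketched as follows. The PortSet class and its method names are hypothetical, and the scan order used to choose a member is an arbitrary choice of this sketch, consistent with the statement that no priority or control over the choice is provided.

```python
from collections import deque


class PortSet:
    """Toy port set: a collection of receive rights received from as a unit."""
    def __init__(self):
        self.members = {}                   # member port name -> message queue

    def add_member(self, port_name):
        self.members[port_name] = deque()

    def remove_member(self, port_name):
        return self.members.pop(port_name)

    def enqueue(self, port_name, message):
        self.members[port_name].append(message)

    def receive(self):
        """Return (member port name, message), or None if every queue is empty."""
        for port_name, queue in self.members.items():
            if queue:
                return port_name, queue.popleft()
        return None


ports = PortSet()
ports.add_member("pager_requests")
ports.add_member("name_requests")
ports.enqueue("name_requests", "lookup request")
result = ports.receive()
```

The returned pair carries the member port name, mirroring the requirement that the received message indicates from which member port it was received.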
Port Rights
A port can only be accessed by using a port right. A port right allows access to a specific port in a specific way. There are three types of port rights, as follows:
receive right--Allows the holder to receive messages from the associated port.
send right--Allows the holder to send messages to the associated port.
send-once right--Allows the holder to send a single message to the associated port. The port right self-destructs after the message is sent.
Port rights can be copied and moved between tasks using various options in the mach_msg call, and also by explicit command. Other than message operations, port rights can be manipulated only as members of a port name space. Port rights are created implicitly when any other system entity is created, and explicitly through port creation.
The kernel will, upon request, provide notification to a port of one's choosing when there are no more send rights to a port. Also, the destruction of a send-once right (other than by using it to send a message) generates a send-once notification sent to the corresponding port. Upon request, the kernel provides notification of the destruction of a receive right.
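The self-destructing behavior of a send-once right can be illustrated with a short Python sketch. The Port and SendOnceRight classes are hypothetical stand-ins; notifications are not modeled.

```python
class Port:
    """Toy port holding only a message queue."""
    def __init__(self):
        self.queue = []


class SendOnceRight:
    """Toy send-once right: usable for exactly one message."""
    def __init__(self, port):
        self.port = port

    def send(self, message):
        if self.port is None:
            raise PermissionError("send-once right already used")
        self.port.queue.append(message)
        self.port = None                # the right self-destructs after one send


reply_port = Port()
right = SendOnceRight(reply_port)
right.send("reply: ok")
```

A second call to send on the same right fails in this sketch, which is why such a right is a natural fit for a single reply to a request.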
Port Name Space
Ports and port rights do not have system-wide names that allow arbitrary ports or rights to be manipulated directly. Ports can be manipulated only through port rights, and port rights can be manipulated only when they are contained within a port name space. A port right is specified by a port name which is an index into a port name space. Each task has associated with it a single port name space.
An entry in a port name space can have the following four possible values:
MACH_PORT_NULL--No associated port right.
MACH_PORT_DEAD--A right was associated with this name, but the port to which the right referred has been destroyed.
A port right--A send-once, send or receive right for a port.
A port set name--A name which acts like a receive right, but that allows receiving from multiple ports.
Acquiring a new right in a task generates a new port name. As port rights are manipulated by referring to their port names, the port names are sometimes themselves manipulated. All send and receive rights to a given port in a given port name space have the same port name. Each send-once right to a given port has a different port name from any other and from the port name used for any send or receive rights held. Operations supported for port names include the following:
Creation (implicit in creation of a right) and deletion
Query of the associated type
Rename
Upon request, the kernel provides notification of a name becoming unusable.
Since port name spaces are bound to tasks, they are created and destroyed with their owning task.
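The naming rules above (send and receive rights to one port share a name, while each send-once right receives its own name) can be sketched in Python. The PortNameSpace class, the insert_right method and the choice of small integers starting at 1 are assumptions made for illustration only.

```python
import itertools


class PortNameSpace:
    """Toy per-task port name space mapping integer names to port rights."""
    def __init__(self):
        self._next = itertools.count(1)   # name values are an arbitrary choice here
        self._by_port = {}                # port -> name shared by send/receive rights
        self.entries = {}                 # name -> (port, set of right kinds)

    def insert_right(self, port, kind):
        """Record a right of the given kind and return the name it is known by."""
        if kind == "send-once":           # every send-once right gets a fresh name
            name = next(self._next)
            self.entries[name] = (port, {"send-once"})
            return name
        name = self._by_port.get(port)
        if name is None:                  # first send or receive right to this port
            name = next(self._next)
            self._by_port[port] = name
            self.entries[name] = (port, set())
        self.entries[name][1].add(kind)   # send and receive coalesce under one name
        return name


space = PortNameSpace()
send_name = space.insert_right("port_A", "send")
recv_name = space.insert_right("port_A", "receive")
once_1 = space.insert_right("port_A", "send-once")
once_2 = space.insert_right("port_A", "send-once")
```

The names are meaningful only within this one name space; another task holding rights to the same port would have its own, unrelated names.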
Message
A message is a collection of data, memory regions and port rights passed between two entities. A message is not a system object in its own right. However, since messages are queued, they are significant because they can hold state between the time a message is sent and when it is received. This state consists of the following:
Pure data
Copies of memory ranges
Port rights
Sender's security token
Message Queues
A port consists of a queue of messages. This queue is manipulated only through message operations (mach_msg) that transmit messages. The state associated with a queue is the ordered set of messages queued, and a settable limit on the number of messages.
Virtual Address Space
A virtual address space defines the set of valid virtual addresses that a thread executing within the task owning the virtual address space is allowed to reference. A virtual address space is named by its owning task.
A virtual address space consists of a sparsely populated indexed set of pages. The attributes of individual pages can be set as desired. For efficiency, the kernel groups virtually contiguous sets of pages that have the same attributes into internal memory regions. The kernel is free to split or merge memory regions as desired. System mechanisms are sensitive to the identities of memory regions, but most user accesses are not so affected, and can span memory regions freely.
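The grouping of virtually contiguous pages with identical attributes into memory regions can be sketched as a small Python function. The function name coalesce, the representation of attributes as strings, and the half-open (start, end) page ranges are assumptions of this sketch.

```python
def coalesce(pages):
    """Group pages into regions.

    pages maps a virtual page number to its attributes. The result is a
    list of (start, end, attributes) tuples, where end is exclusive and
    every page in a region is contiguous and has the same attributes.
    """
    regions = []
    for vpn in sorted(pages):
        attrs = pages[vpn]
        if regions and regions[-1][1] == vpn and regions[-1][2] == attrs:
            start = regions[-1][0]
            regions[-1] = (start, vpn + 1, attrs)   # extend the current region
        else:
            regions.append((vpn, vpn + 1, attrs))   # gap or attribute change
    return regions


pages = {0: "rw", 1: "rw", 2: "r", 3: "r", 10: "r", 11: "rw"}
regions = coalesce(pages)
# regions == [(0, 2, "rw"), (2, 4, "r"), (10, 11, "r"), (11, 12, "rw")]
```

Changing the attributes of a single page in the middle of a region would split that region into up to three on the next pass, which corresponds to the kernel being free to split or merge regions as attributes change.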
A given memory range can have distinct semantics associated with it through the actions of a memory manager. When a new memory range is established in a virtual address space, an abstract memory object is specified, possibly by default, that represents the semantics of the memory range, by being associated with a task (a memory manager) that provides those semantics.
A virtual address space is created when a task is created, and destroyed when the task is destroyed. The initial contents of the address space are determined from various options to the task_create call, as well as the inheritance properties of the memory ranges of the prototype task used in that call.
Most operations upon a virtual address space name a memory range within the address space. These operations include the following:
Creating or allocating, and de-allocating a range
Copying a range
Setting special attributes, including "wiring" the page into physical memory to prevent eviction
Setting memory protection attributes
Setting inheritance properties
Directly reading and writing ranges
Forcing a range flush to backing storage
Reserving a range (preventing random allocation within the range)
Abstract Memory Object
The microkernel allows user mode tasks to provide the semantics associated with referencing portions of a virtual address space. It does this by allowing the specification of an abstract memory object that represents the non-resident state of the memory ranges backed by this memory object. The task that implements this memory object and responds to messages sent to the port that names the memory object is called a memory manager.
The kernel should be viewed as using main memory as a directly accessible cache for the contents of the various memory objects. The kernel is involved in an asynchronous dialog with the various memory managers to maintain this cache, filling and flushing this cache as the kernel desires, by sending messages to the abstract memory object ports. The operations upon abstract memory objects include the following:
Initialization
Page reads
Page writes
Synchronization with force and flush operations
Requests for permission to access pages
Page copies
Termination
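The view of main memory as a cache over memory objects can be sketched in Python as follows. This is a simplified, synchronous model under stated assumptions: the classes `MemoryManager` and `KernelCache` and their methods are hypothetical, and the real dialog between kernel and memory manager is asynchronous and carried by messages to ports rather than by direct method calls.

```python
class MemoryManager:
    """Hypothetical user-mode manager holding the non-resident state of one memory object."""

    def __init__(self, backing):
        self.backing = dict(backing)  # page number -> contents

    def page_read(self, page):
        return self.backing.get(page, b"")

    def page_write(self, page, data):
        self.backing[page] = data


class KernelCache:
    """Toy model of main memory used as a cache over a memory object."""

    def __init__(self, manager):
        self.manager = manager
        self.resident = {}   # page number -> contents held in main memory
        self.dirty = set()   # pages modified since they were filled

    def read(self, page):
        if page not in self.resident:          # miss: fill from the manager
            self.resident[page] = self.manager.page_read(page)
        return self.resident[page]

    def write(self, page, data):
        self.resident[page] = data
        self.dirty.add(page)

    def flush(self):
        # Write modified pages back, then drop everything from the cache.
        for page in sorted(self.dirty):
            self.manager.page_write(page, self.resident[page])
        self.dirty.clear()
        self.resident.clear()


manager = MemoryManager({0: b"old"})
cache = KernelCache(manager)
before = cache.read(0)           # filled from the manager on first touch
cache.write(0, b"new")
unflushed = manager.backing[0]   # the manager has not seen the write yet
cache.flush()
```

The manager sees the modified page only after the flush, which corresponds to the page write and synchronization operations listed above.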
Memory Object Representative
The abstract memory object port is used by the kernel to request access to the backing storage for a memory object. Because of the protected nature of this dialog, memory managers do not typically give access to the abstract memory object port to clients. Instead, clients are given access to memory object representatives. A memory object representative is the client's representation of a memory object. There is only one operation permitted against such a port and that is to map the associated memory object into a task's address space. Making such a request initiates a protocol between the mapping kernel and the memory manager to initialize the underlying abstract memory object. It is through this special protocol that the kernel is informed of the abstract memory object represented by the representative, as well as the set of access modes permitted by the representative.
Memory Cache Object
The portion of the kernel's main memory cache that contains the resident pages associated with a given abstract memory object is referred to as the memory cache object. The memory manager for a memory object holds send rights to the kernel's memory cache object. The memory manager is involved in an asynchronous dialog with the kernel to provide the abstraction of its abstract memory object by sending messages to the associated memory cache object. The operations upon memory cache objects include the following:
Set operational attributes
Return attributes
Supply pages to the kernel
Indicate that pages requested by the kernel are not available
Indicate that pages requested by the kernel should be filled by the kernel's default rules
Force delayed copies of the object to be completed
Indicate that pages sent to the memory manager have been disposed
Restrict access to memory pages
Provide performance hints
Terminate
Processor
Each physical processor that is capable of executing threads is named by a processor control port. Although significant in that they perform the real work, processors are not very significant in the microkernel, other than as members of a processor set. It is a processor set that forms the basis for the pool of processors used to schedule a set of threads, and that has scheduling attributes associated with it. The operations supported for processors include the following:
Assignment to a processor set
Machine control, such as start and stop
Processor Set
Processors are grouped into processor sets. A processor set forms a pool of processors used to schedule the threads assigned to that processor set. A processor set exists as a basis to uniformly control the schedulability of a set of threads. The concept also provides a way to perform coarse allocation of processors to given activities in the system. The operations supported upon processor sets include the following:
Creation and deletion
Assignment of processors
Assignment of threads and tasks
Scheduling control
Host
Each machine (uniprocessor or multiprocessor) in a networked microkernel system runs its own instantiation of the microkernel. The host multiprocessor 100 is not generally manipulated by client tasks. But, since each host does carry its own microkernel 120, each with its own port space, physical memory and other resources, the executing host is visible and sometimes manipulated directly. Also, each host generates its own statistics. Hosts are named by a name port which is freely distributed and which can be used to obtain information about the host and a control port which is closely held and which can be used to manipulate the host. Operations supported by hosts include the following:
Clock manipulation
Statistics gathering
Re-boot
Setting the default memory manager
Obtaining lists of processors and processor sets
Clock
A clock provides a representation of the passage of time by incrementing a time value counter at a constant frequency. Each host or node in a multicomputer implements its own set of clocks based upon the various clocks and timers supported by the hardware as well as abstract clocks built upon these timers. The set of clocks implemented by a given system is set at configuration time. Each clock is named by both a name and a control or privileged port. The control port allows the time and resolution of the clock to be set. Given the name port, a task can perform the following:
Determine the time and resolution of the clock.
Generate a memory object that maps the time value.
Sleep (delay) until a given time.
Request a notification or alarm at a given time.
Section 3. Tasks and Threads
This section discusses the user visible view of threads and tasks. Threads are the active entities in the Microkernel System 115. They act as points of control within a task, which provides them with a virtual address space and a port name space with which other resources are accessed.
Threads
A thread is the basic computational entity. A thread belongs to only one task that defines its virtual address space. A thread is a lightweight entity with a minimum of state. A thread executes in the way dictated by the hardware, fetching instructions from its task's address space based on the thread's register values. The only actions a thread can take directly are to execute instructions that manipulate its registers and read and write into its memory space. An attempt to execute privileged machine instructions, though, causes an exception. The exception is discussed later. To affect the structure of the address space, or to reference any resource other than the address space, the thread must execute a special trap instruction which causes the kernel to perform operations on behalf of the thread, or to send a message to some agent on behalf of the thread. Also, faults or other illegal instruction behavior cause the kernel to invoke its exception processing.
FIG. 2 shows the client visible structure associated with a thread. The thread object is the receiver for messages sent to the kernel thread port. Aside from any random task that holds a send right for this thread port, the thread port is also accessible as the thread's thread self port, through the containing processor set or the containing task.
Actions by Threads
This section describes the details of the actions that a thread can take directly. A thread can do anything if it can gain rights to the correct ports and send messages to them. The various things it can do are discussed under the sections describing the object manipulated.
Scheduling Support Traps
The microkernel preemptively schedules threads. The way in which this is done is related to various factors. For now, it is sufficient to say that threads have scheduling priority associated with them which is used to select which threads should execute within a given processor set.
thread_switch causes a context switch with various options. It is provided for cases, such as software lock routines, that want to give up the processor so that other threads can make progress. The options have to do with selecting the appropriate new thread to run, when this information is available. One of the options of thread_switch causes the scheduling priority of the thread to be depressed to the lowest possible value so that other threads will run, and complete the work that blocks this depressed thread.
This priority depression is canceled when the given time expires, the thread is run despite the depression, thread→thread_abort is called, or thread→thread_depress_abort is called. Finally, the clock_sleep trap causes the thread to be delayed until a specified time. This delay can be aborted by thread_abort, causing the clock_sleep to generate an error return.
Identity Traps
Other than the few traps mentioned in this section, all other requests for services require a port right. Even requests upon the kernel that manipulate the current thread or task need a port right (naming the current thread or task). To bootstrap this process, a thread needs a way, without any port right, to get the port right for itself and its task. These rights are obtained through the mach_thread_self and mach_task_self traps, respectively.
The port rights returned are actually the THREAD_KERNEL_PORT and TASK_KERNEL_PORT special ports last set through the thread→thread_set_special_port and task→task_set_special_port message calls. The default values for these special ports are the actual kernel thread and task ports, respectively. The creator of a task or thread can set these special port values before starting the thread or task so that the thread or task does not have access to its own kernel ports, but instead invokes some intermediate port when requesting services to be done to itself. The kernel also provides a trap, mach_host_self, which returns a send right to the host's name port for the task.
Bootstrap Reply Port Trap
The mach_reply_port trap is also used for bootstrap purposes. As mentioned earlier, if a service request is to return a reply, a second port is needed. This trap is used to create an initial reply port (a receive right) that can then be used for all other port related calls. If the task's self port is null, which deactivates use of microkernel services against the task, this call returns null as well.
Message Send and Receive Trap
The final, and most important, trap is mach_msg_trap. This trap is invoked by the mach_msg library routine. It provides access to all other system services. It sends and/or receives a message to/from a port named by a given right. The semantics of this call are involved, and are described in detail in the Kernel Programming Reference document, and also in various sections in this document.
Exception Processing
When an exception occurs in a thread, the thread executes in kernel context and sends a message whose contents describe the exception to an exception port. The exceptions are listed under catch_exception_raise in the Kernel Programming Reference document. A successful reply to this message causes the thread to continue in a state possibly altered by thread_set_state. For any given exception, there are two exception ports that apply:
A thread specific port for the specific type of exception.
A task port for the specific type of exception.
The thread specific ports are set with thread→thread_set_exception_ports and read with thread→thread_get_exception_ports. The task ports are set with task→task_set_exception_ports and read with task→task_get_exception_ports. The thread→thread_swap_exception_ports and task→task_swap_exception_ports calls set and return the previous exception ports, performing an atomic swap.
The kernel selects the first of these ports in the order listed, as the destination of the exception message, if it is defined. Whereas a successful reply causes the thread to continue, an unsuccessful reply causes the kernel to send an exception message to the second port. If neither exception message receives a successful reply, the thread is terminated.
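The selection order described above (thread specific port first, then the task port, then termination) can be sketched in Python. The function `deliver_exception` is hypothetical: each port is modelled as a callable that returns True for a successful reply, and an undefined port is modelled as None and skipped, which is an assumption about how an undefined port is treated.

```python
def deliver_exception(thread_port, task_port):
    """Return which handler resolved the exception, or "terminated".

    Each port is either None (not defined) or a callable that returns True
    for a successful reply and False for an unsuccessful one.
    """
    # Try the thread specific port first, then the task port.
    for name, port in (("thread", thread_port), ("task", task_port)):
        if port is not None and port():
            return name
    # Neither message received a successful reply.
    return "terminated"


handled_by = deliver_exception(lambda: False, lambda: True)
```

Here the thread specific handler replies unsuccessfully, so the task level handler receives the exception message and resolves it.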
The kernel can send various exception message formats, as selected when the exception port was set:
exception_port→catch_exception_raise: A message indicating the identity of the faulting thread, the task and thread self ports, and the type of exception and status codes.
exception_port→catch_exception_raise_state: A message indicating the type of exception, status codes, and a flavor of thread state with register values selected when the exception port was set. The thread state is both input and output, allowing the reply to change the thread state.
exception_port→catch_exception_raise_state_identity: The catch_exception_raise_state message with the inclusion of the task and thread ports.
Not every exceptional condition that a thread encounters is handled in this way. A page not resident fault does not send a message to the exception port. Instead, a message is sent to the external memory manager associated with the memory page in which the faulting address lies. This is discussed as part of virtual memory. The general exception rule does not always apply to the system call instruction(s). First of all, several of the possible system call numbers are subsumed for microkernel calls. The remaining system call numbers are initially undefined. An attempt to execute them results in an exception of EXC_SW_EMULATION as described above.
Actions on Threads
The following section lists the various functions that can be done to a thread, given a send right to the kernel's thread port.
Life and Death
A thread is created via task→thread_create and destroyed via thread→thread_terminate. Since a thread belongs to a given task, thread creation is actually an operation performed upon a task. The result is a send right to the kernel's thread port for the new thread. A list of the kernel thread ports for all of the threads in a given task can be obtained with task→task_threads. A newly created thread is in the suspended state. This is the same as if thread→thread_suspend had been called prior to its executing its first instruction. A suspended thread does not execute. A thread is created in the suspended state so that its machine state can be properly set before it is started. To remove a thread from the suspended state (to decrement its suspend count), thread→thread_resume is used.
As an optimization, the sequence of steps necessary to create a running thread (thread_create, thread_set_state and thread_resume) is combined in the task→thread_create_running call.
Thread State
A thread has two main sets of state, its machine state and a set of special ports. The machine state for a thread is obtained using thread→thread_get_state and set using thread→thread_set_state. The result of setting a thread's state at a random point is undefined. Various steps are needed to obtain a deterministic result.
thread→thread_suspend is used to stop the thread. This, and the following step, are unnecessary if the thread has just been created and has yet to run. They are needed, though, for exception processing or for asynchronous interruption such as signal delivery.
thread→thread_abort is called. This causes any system call (really, mach_msg or any related message call, such as exception or page missing messages) to be aborted. Aborting a message call sets the thread's state to be at the point after the system call, with a return value indicating interruption of the call. Aborting a page fault or exception leaves the thread at the point of the page fault or exception; resuming the thread causes it to retake the page fault or exception. thread_abort aborts non-recoverable system calls such as multi-page memory management operations. thread→thread_set_state can then be safely used.
thread→thread_resume restarts the thread.
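The suspend, abort, set state, resume sequence listed above can be sketched with a toy Python model. `ToyThread` and `set_state_safely` are hypothetical names. Raising an error when the state is set at an unsafe point is a device of this sketch only; per the text, the real result of setting state at a random point is simply undefined.

```python
class ToyThread:
    """Toy thread tracking only what the sequence above depends on."""

    def __init__(self):
        self.suspend_count = 0
        self.in_message_call = True   # pretend the thread is blocked in a message call
        self.state = {"pc": 0}

    def suspend(self):
        self.suspend_count += 1

    def abort(self):
        # Models aborting any outstanding message call.
        self.in_message_call = False

    def set_state(self, new_state):
        # Sketch-only guard: flag the cases the text calls undefined.
        if self.suspend_count == 0 or self.in_message_call:
            raise RuntimeError("result of setting state here is undefined")
        self.state = dict(new_state)

    def resume(self):
        if self.suspend_count > 0:
            self.suspend_count -= 1


def set_state_safely(thread, new_state):
    thread.suspend()             # step 1: stop the thread
    thread.abort()               # step 2: abort any outstanding call
    thread.set_state(new_state)  # now deterministic
    thread.resume()              # step 3: restart the thread


thread = ToyThread()
set_state_safely(thread, {"pc": 0x1000})
```

Skipping the first two steps in this model triggers the guard, which stands in for the undefined behavior the text warns about.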
A thread currently has only one "special" port associated with it. This is the value for the thread to use to request operations upon itself. This is normally the same as the kernel thread port, but can be different if so set, most likely by the creator of the thread. This port is returned by thread→thread_get_special_port and set by thread→thread_set_special_port.
Various pieces of kernel thread state, such as the suspend count and scheduling information, can be obtained using thread→thread_info.
Scheduling Control
The following functions affect the scheduling of a thread. They are described under physical resource management.
thread, processor_set_control→thread_assign
thread→thread_assign_default
thread→thread_get_assignment
thread (processor_set_control)→thread_set_policy
thread→thread_policy
thread (host_control)→thread_wire
thread→thread_priority
thread (processor_set_control)→thread_max_priority
The thread_wire call marks the thread as "wired", which means privileged with respect to kernel resource management. A "wired" thread is always eligible to be scheduled and can consume memory even when free memory is scarce. This property is assigned to threads within the default page-out path. Threads not in the default page-out path should not have this property to prevent the kernel's free list of pages from being exhausted.
Tasks
A task can be viewed as a container that holds a set of threads. It contains default values to be applied to its contained threads. Most importantly, it contains those elements that its contained threads need to execute, namely, a port name space and a virtual address space.
FIG. 3 shows the client visible task structures. The task object is the receiver for messages sent to the kernel task port. Aside from any random task that may hold a send right to the task port, the task port can be derived from the task's task self port, the contained threads or the containing processor set.
Life and Death
A new task is created with task→task_create. Note that task creation is an operation requested of an existing prototype task. The new task can either be created with an empty virtual address space, or one inherited from the parent task. The new task's port name space is empty. The new task inherits the parent task's PC sampling state, special and exception ports. A task is destroyed with task→task_terminate. This operation is requested of the task to be destroyed, not the parent specified in its creation. The task's virtual address space and port name space are destroyed.
Various statistics about the task can be obtained with task→task_info.
Special Ports
Aside from its associated port name space, a task also has a small set of special ports.
A task has the following special ports associated with it:
A port the task uses to request operations upon itself. This is normally the same as the kernel task port, but can be different if so set.
A bootstrap port, which can be used for anything, but which is intended as the initial port a task holds to something other than itself, for use in locating other services.
A port used to request information of the containing host, normally the same as the host name port.
These ports are returned by task→task_get_special_port and set by task→task_set_special_port. The values of these ports in a new task are inherited from the task that was the target of the task_create call, with the exception of the task self port. A task also has exception ports, as described under exception processing, inherited from the parent task.
Thread Management
A thread belongs to one and only one task. Threads are created with task→thread_create. The set of threads present in a task can be found with task→task_threads.
Although a task does not itself execute, some execution properties can be set for a task which will then apply to its contained threads. All of the threads in a task can be suspended or resumed together by task→task_suspend and task→task_resume. These operations do not affect the threads' suspend counts; they affect the task's suspend count. A thread can execute only if both its and its task's suspend counts are zero. The default scheduling properties for threads can be set with the following:
task, processor_set_control→task_assign
task→task_assign_default
task→task_get_assignment
task→task_priority
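The suspend count rule stated above (a thread can execute only if both its own and its task's suspend counts are zero) can be sketched in Python. The `Task` and `Thread` classes here are hypothetical models, not kernel structures. The sketch also assumes that a newly created thread starts with a suspend count of one, following the earlier statement that threads are created suspended.

```python
class Task:
    """Toy task holding its own suspend count and a list of threads."""

    def __init__(self):
        self.suspend_count = 0
        self.threads = []

    def create_thread(self):
        thread = Thread(self)
        self.threads.append(thread)
        return thread

    def suspend(self):
        self.suspend_count += 1   # thread suspend counts are left untouched

    def resume(self):
        self.suspend_count = max(0, self.suspend_count - 1)


class Thread:
    """Toy thread belonging to exactly one task."""

    def __init__(self, task):
        self.task = task
        self.suspend_count = 1    # assumption: created in the suspended state

    def resume(self):
        self.suspend_count = max(0, self.suspend_count - 1)

    def can_execute(self):
        # Runnable only when both counts are zero.
        return self.suspend_count == 0 and self.task.suspend_count == 0


task = Task()
worker = task.create_thread()
initially = worker.can_execute()       # thread still suspended
worker.resume()
after_resume = worker.can_execute()
task.suspend()
during_task_suspend = worker.can_execute()
task.resume()
after_task_resume = worker.can_execute()
```

Keeping the two counts separate means that suspending and resuming a task cannot accidentally start a thread that was individually suspended.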
Identity
Each task is labeled with a security token which is not interpreted by the kernel. It is sent in messages sent by the task for use by trusted servers in access mediation decisions concerning the requestor of service. This security token is inherited from the parent task by task_create. It can be changed only with the privileged task (security)→task_set_security_token call. The security port is a privileged port provided to the bootstrap task whose sole purpose is to support the setting or changing of task identity.
Section 4. IPC
With the exception of its shared memory, a microkernel task interacts with its environment purely by sending messages and receiving replies. These messages are sent using ports. A port is a communication channel that has a single receiver and can have multiple senders. A task holds rights to these ports that specify its ability to send or receive messages.
Ports
A port is a unidirectional communication channel between a client who requests a service and a server who provides the service. A port has a single receiver and can have multiple senders. A port that represents a kernel supported resource has the kernel as the receiver. A port that names a service provided by a task has that task as the port's receiver. This receivership can change if desired, as discussed under port rights.
The state associated with a port is:
The associated message queue
A count of references or rights to the port
Port right and out-of-line memory receive limits
Message sequence number
Number of send rights created from receive right
Containing port set
Name of no-more-sender port if specified
FIG. 4 shows a typical port, illustrating a series of send rights and the single receive right. The associated message queue has a series of ordered messages. One of the messages is shown in detail, showing its destination port, reply port reference, a send-and-receive right being passed in the message, as well as some out-of-line or virtual copy memory.
Few operations affect the port itself. Most operations affect port rights or a port name space containing those rights, or affect the message queue. Ports are created implicitly when any other system entity is created. Also, mach_reply_port creates a port. Ports are created explicitly by port_name_space [port_name]→mach_port_allocate and port_name_space [port_name]→mach_port_allocate_name. A port cannot be explicitly destroyed. It is destroyed only when the receive right is destroyed.
The attributes of a port are assigned at the time of creation. Some of these attributes, such as the limit on the number of port rights or amount of out-of-line memory that can be received in a message, can be changed with port_name_space [port_name]→mach_port_set_attributes. These attributes can be obtained with port_name_space [port_name]→mach_port_get_attributes.
The existence of ports is of obvious importance to all involved. As such, many tasks using a port may wish to be notified, through a message, when the port dies. Such notifications are requested with an option to mach_msg (MACH_RCV_NOTIFY), as well as with port_name_space [port_name]→mach_port_request_notification. The resultant dead name notification indicates that a task's port name has gone dead because of the destruction of the named port. The message indicates the task's name for the now dead port. (This is discussed under port name spaces.)
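The port lifetime rule and the dead name notification described above can be sketched with a toy Python model. The `Port` class and its methods are hypothetical and do not reproduce the mach_port_request_notification interface. A plain list stands in for the port on which the requesting task would receive the notification message.

```python
class Port:
    """Toy port: one receiver, any number of senders, optional dead-name watchers."""

    def __init__(self, receiver):
        self.receiver = receiver
        self.senders = set()
        self.alive = True
        self._watchers = []  # (task, name, inbox) triples awaiting notification

    def add_send_right(self, task):
        self.senders.add(task)

    def request_dead_name_notification(self, task, name, inbox):
        # `name` is the requesting task's own name for this port.
        self._watchers.append((task, name, inbox))

    def destroy_receive_right(self):
        # The port dies only when its receive right is destroyed.
        self.alive = False
        for task, name, inbox in self._watchers:
            inbox.append(("dead-name", task, name))
        self._watchers.clear()


port = Port(receiver="server")
port.add_send_right("client")
client_inbox = []
port.request_dead_name_notification("client", 7, client_inbox)
port.destroy_receive_right()
```

After the receive right is destroyed, the client's inbox holds one notification carrying the client's own name (7 in this example) for the dead port.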
Messages
A message is a collection of data, out-of-line memory regions, and port rights passed between two entities. A message is not a manipulable system object in its own right. However, because messages are queued, they are significant because they can hold state between the time a message is sent and the time it is received. Besides pure data, a message can also contain port rights. This is significant. In this way a task obtains new rights, by receiving them in a message.
A message consists of an Interprocess Communication (IPC) subsystem parsed control section and a data section. In addition, the message may point to regions of data to be transferred which lie outside the message proper. These regions may contain port rights (out of line port arrays). A message also carries |