Engineering system for modeling computer programs
5325533
Abstract
A human oriented object programming system provides an interactive and dynamic modeling system to assist in the incremental building of computer programs. It facilitates the development of complex computer programs such as operating systems and large applications with graphic user interfaces (GUIs). A program is modeled as a collection of units called components. A component represents a single compilable language element such as a class or a function. The three major functional parts are the database, the compiler and the build mechanism. The database stores the components and properties. The compiler, along with compiling the source code of a property, is responsible for calculating the dependencies associated with a component. The build mechanism uses properties of components along with the compiler generated dependencies to correctly and efficiently sequence the compilation of components during a build process.
Claims
What is claimed is:
Description
BACKGROUND OF THE INVENTION
The pseudocode for the function CreateCompileLists is as follows:
______________________________________
CreateCompileLists(){
  for each A in ChangeList{
    if( A.PropertyName == Interface ){
      InterfaceCompileList.Add( A );
      AddClients( A );
    }
    else if( A.PropertyName == Implementation ){
      if( A.IsInline == True ){
        InterfaceCompileList.Add( A );
        AddClients( A );
      }
      else if( A.IsInline == False ){
        ImplementationCompileList.Add( A );
      }
    }
  }
}
______________________________________
The function AddClients, for each reference in the parameter reference's Clients property, examines the reference and, if its BuildState is Compiled, sets the reference's BuildState to Uncertain, adds a copy of the reference to the appropriate CompileList, and calls AddClients on the reference. This process is called creating the Client Closure of the ChangeList. The Client Closure represents the subset of components that may need to be recompiled as the result of a build. In practice, dependencies and changes generated by the compiler as the build progresses are used to avoid compiling as many components in the Client Closure as possible. The following is the pseudocode for the AddClients function:
______________________________________
AddClients( A ){
  for each B in A.ClientList{
    if( B.BuildState == Compiled ){
      B.SetBuildState( Uncertain );
      if( B.PropertyName == Interface ){
        InterfaceCompileList.Add( B );
        AddClients( B );
      }
      else if( B.PropertyName == Implementation ){
        ImplementationCompileList.Add( B );
        AddClients( B );
      }
    }
  }
}
______________________________________
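As a rough illustration of the Client Closure, the two functions above can be sketched in Python. The `Ref` class, the string-valued BuildStates, and the pair of lists returned by `create_compile_lists` are assumptions made for this sketch only; they are not part of the HOOPS database interface.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Ref:
    """Hypothetical stand-in for a reference to one property of a component."""
    name: str
    property_name: str                 # "Interface" or "Implementation"
    is_inline: bool = False
    build_state: str = "Compiled"
    clients: list = field(default_factory=list)


def create_compile_lists(change_list):
    """Return (interface_list, implementation_list) for the changed refs."""
    interface_list, implementation_list = [], []

    def add_clients(a):
        for b in a.clients:
            # Only Compiled clients are visited, and each is marked Uncertain
            # before recursing, so a cyclic client graph still terminates.
            if b.build_state == "Compiled":
                b.build_state = "Uncertain"
                if b.property_name == "Interface":
                    interface_list.append(b)
                else:
                    implementation_list.append(b)
                add_clients(b)

    for a in change_list:
        # Interfaces and inline Implementations go on the interface list.
        if a.property_name == "Interface" or a.is_inline:
            interface_list.append(a)
            add_clients(a)
        else:
            implementation_list.append(a)
    return interface_list, implementation_list


# Shape's interface changed; Circle's interface uses Shape, Draw uses Circle.
shape = Ref("Shape", "Interface", build_state="NeedToCompile")
circle = Ref("Circle", "Interface")
draw = Ref("Draw", "Implementation")
shape.clients = [circle]
circle.clients = [draw]
interfaces, implementations = create_compile_lists([shape])
```

In this example the closure contains Circle and Draw, both marked Uncertain, even though neither was edited.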
Method of Processing Interfaces
This is the second stage of the Build process. The possible BuildStates for items on the InterfaceCompileList are Compiled, BeingCompiled, NeedToCompile, Uncertain, CompileError or UncertainError. The InterfaceCompileList is processed until it is empty as shown in the flowchart of FIG. 7. The process is entered at block 701 where a reference is chosen from the front of the InterfaceCompileList. If there are no more references on the list, processing is complete at block 702. If the Interface BuildState of the component associated with the reference is Compiled, CompileError or UncertainError, as indicated in block 703, the reference is removed from the front of the list and processing continues in block 701. If the Interface BuildState of the component associated with the reference is BeingCompiled or NeedToCompile, as indicated in block 704, the BuildState of the component is set to BeingCompiled in function block 705. Then the Compile function (which invokes the compiler 42) is called on the Interface of the component. This function will return one of the values Abort, Done and Error. If the value returned is Abort at block 706, then processing continues at block 701. If the value returned is Done at block 707, then the Interface BuildState of the component is set to Compiled and the reference is removed from the front of the list at block 708 before processing continues with block 701. If the value returned is Error at block 709, then the Interface BuildState of the component is set to CompileError, the reference is removed from the front of the list, and the function PropagateError is called on the component in function block 710 before processing continues at block 701. If the Interface BuildState of the component associated with the reference is Uncertain, as determined at block 711, the BuildState of the component is set to BeingCompiled at function block 712. Then the ConditionallyCompile function (which may or may not call the compiler 42) is called on the Interface of the component. This function will also return one of the values Abort, Done and Error. If the value returned is Abort, then processing continues at block 701. If the value returned is Done at block 713, then the reference is removed from the front of the list at function block 708, and processing continues at block 701. If the value returned is Error at block 714, then the reference is removed from the front of the list and the function PropagateError is called on the component in function block 715 before processing continues at block 701. The pseudocode for the ProcessInterfaces function is as follows:
______________________________________
ProcessInterfaces(){
  until( ( A = InterfaceCompileList.First ) == NIL ){
    state = A.BuildState;
    if( state == Compiled || state == CompileError || state == UncertainError ){
      InterfaceCompileList.RemoveFirst();
    }
    else if( state == BeingCompiled || state == NeedToCompile ){
      A.SetBuildState( BeingCompiled );
      value = Compile( A );
      if( value == Abort ){
        continue;
      }
      else if( value == Done ){
        A.SetBuildState( Compiled );
        InterfaceCompileList.RemoveFirst();
      }
      else if( value == Error ){
        A.SetBuildState( CompileError );
        InterfaceCompileList.RemoveFirst();
        PropagateError( A );
      }
    }
    else if( state == Uncertain ){
      A.SetBuildState( BeingCompiled );
      value = ConditionallyCompile( A );
      if( value == Abort ){
        continue;
      }
      else if( value == Done ){
        A.SetBuildState( Compiled );
        InterfaceCompileList.RemoveFirst();
      }
      else if( value == Error ){
        A.SetBuildState( UncertainError );
        InterfaceCompileList.RemoveFirst();
        PropagateError( A );
      }
    }
  }
}
______________________________________
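The abort-and-retry behavior of this loop is easier to see in a small runnable sketch. The Python below is an illustration under stated assumptions: `Component`, `fake_compile`, and the use of a `deque` for the InterfaceCompileList are hypothetical, and the compiler is replaced by a stub that aborts once when a needed declaration is not yet compiled.

```python
from collections import deque


class Component:
    """Hypothetical component with just the fields this sketch needs."""
    def __init__(self, name, build_state):
        self.name = name
        self.build_state = build_state


def process_interfaces(compile_list, compile_fn, conditionally_compile_fn,
                       propagate_error):
    """Drain compile_list (a deque), always working on the front reference."""
    while compile_list:
        a = compile_list[0]
        state = a.build_state
        if state in ("Compiled", "CompileError", "UncertainError"):
            compile_list.popleft()
            continue
        if state in ("BeingCompiled", "NeedToCompile"):
            run, error_state = compile_fn, "CompileError"
        else:                                   # Uncertain
            run, error_state = conditionally_compile_fn, "UncertainError"
        a.build_state = "BeingCompiled"
        value = run(a)
        if value == "Abort":
            # A dependency was pushed onto the front; `a` is retried later.
            continue
        compile_list.popleft()
        if value == "Done":
            a.build_state = "Compiled"
        else:                                   # Error
            a.build_state = error_state
            propagate_error(a)


base = Component("Base", "NeedToCompile")
derived = Component("Derived", "NeedToCompile")
queue = deque([derived])
log = []


def fake_compile(c):
    # Stand-in for compiler 42: Derived needs Base's declaration first.
    if c is derived and base.build_state != "Compiled":
        queue.appendleft(base)
        return "Abort"
    log.append(c.name)
    return "Done"


process_interfaces(queue, fake_compile, fake_compile, lambda c: None)
```

Derived is attempted first, aborts, and is compiled again only after Base has been compiled and removed from the front of the list.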
The function PropagateError adds a reference corresponding to the component to the Project's InternalErrorList and carries out the following for every reference on the component's Client list: If the reference's BuildState is CompileError or UncertainError, the process continues with the next reference. If the reference's BuildState is NeedToCompile, the process sets its BuildState to CompileError, adds the reference to the InternalErrorList, and calls PropagateError on the reference before continuing with the next reference. If the reference's BuildState is Uncertain, the process sets its BuildState to UncertainError, adds the reference to the InternalErrorList, and calls PropagateError on the reference before continuing with the next reference. The pseudocode of the function PropagateError is as follows:
______________________________________
PropagateError( A ){
  for each B in A.ClientList{
    state = B.BuildState;
    if( state == CompileError || state == UncertainError ){
      continue;
    }
    else if( state == NeedToCompile ){
      B.SetBuildState( CompileError );
      InternalErrorList.Add( B );
      PropagateError( B );
    }
    else if( state == Uncertain ){
      B.SetBuildState( UncertainError );
      InternalErrorList.Add( B );
      PropagateError( B );
    }
  }
}
______________________________________
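A minimal Python sketch of error propagation follows. The `Component` class and the explicit `internal_error_list` parameter are assumptions of the sketch. Following the prose description, the failing component itself is also recorded; the membership check that keeps entries unique is an addition of this sketch, not something the pseudocode specifies.

```python
class Component:
    """Hypothetical component: name, BuildState, and its Client list."""
    def __init__(self, name, build_state, clients=()):
        self.name = name
        self.build_state = build_state
        self.clients = list(clients)


def propagate_error(a, internal_error_list):
    """Mark every not-yet-built client of `a`, transitively, as failed."""
    if a not in internal_error_list:
        internal_error_list.append(a)
    for b in a.clients:
        if b.build_state == "NeedToCompile":
            b.build_state = "CompileError"
        elif b.build_state == "Uncertain":
            b.build_state = "UncertainError"
        else:
            # Already in error, or in a state the pseudocode leaves alone.
            continue
        # The state was changed before recursing, so cycles terminate.
        propagate_error(b, internal_error_list)


leaf = Component("Leaf", "Uncertain")
middle = Component("Middle", "NeedToCompile", [leaf])
finished = Component("Finished", "Compiled")
root = Component("Root", "CompileError", [middle, finished])
errors = []
propagate_error(root, errors)
```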
Method of Processing Implementations
This is the third stage of the Build process. Each reference in the ImplementationCompileList is processed as shown in the flowchart of FIG. 8. The process is entered at block 801 where a reference is chosen from the front of the ImplementationCompileList. If there are no more references on the list, processing is complete at block 802. If the BuildState of the reference is Uncertain, as determined in block 803, the BuildState is set to Compiled in function block 804 before processing continues in block 801. If the BuildState of the reference is NeedToCompile, as determined in block 805, the component is compiled in function block 806. The possible values returned from the compiler 42 are Done and Error. If the value returned is Done at block 807, the BuildState of the reference is set to Compiled in function block 804 before processing continues in block 801. If the value returned is Error in block 808, the BuildState of the reference is set to CompileError and the function PropagateError is called on the component in function block 809 before processing continues in block 801. If the BuildState of the reference is CompileError or UncertainError, nothing is done. Note that the processing of Implementations is order independent at this stage because dependencies can only be on Interfaces or Implementations whose IsInline attribute is True, and these have already been processed. The pseudocode for ProcessImplementations is as follows:
______________________________________
ProcessImplementations(){
  for each A in ImplementationCompileList{
    state = A.BuildState;
    if( state == Uncertain ){
      A.SetBuildState( Compiled );
    }
    else if( state == NeedToCompile ){
      value = Compile( A );
      if( value == Done ){
        A.SetBuildState( Compiled );
      }
      else if( value == Error ){
        A.SetBuildState( CompileError );
        PropagateError( A );
      }
    }
    else if( state == CompileError || state == UncertainError ){
    }
  }
}
______________________________________
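The third stage can be sketched in Python as a single pass over the list. `Component`, the lambda used as a stand-in compiler, and the `failed` list are hypothetical names introduced for the example.

```python
class Component:
    """Hypothetical component carrying only a name and a BuildState."""
    def __init__(self, name, build_state):
        self.name = name
        self.build_state = build_state


def process_implementations(compile_list, compile_fn, propagate_error):
    for a in compile_list:
        if a.build_state == "Uncertain":
            # Still Uncertain after the interface stage: no propagated change
            # marked it NeedToCompile, so it is accepted without compiling.
            a.build_state = "Compiled"
        elif a.build_state == "NeedToCompile":
            if compile_fn(a) == "Done":
                a.build_state = "Compiled"
            else:
                a.build_state = "CompileError"
                propagate_error(a)
        # CompileError and UncertainError entries are left untouched.


parts = [Component("Draw", "Uncertain"),
         Component("Area", "NeedToCompile"),
         Component("Broken", "NeedToCompile"),
         Component("Stale", "UncertainError")]
failed = []
process_implementations(
    parts,
    lambda c: "Error" if c.name == "Broken" else "Done",   # stand-in compiler
    lambda c: failed.append(c.name))
states = {c.name: c.build_state for c in parts}
```

Because the list is visited once with no reordering, the sketch reflects the order independence noted in the text.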
Compiler Which Supports Build Process
The compiler 42 is called via the Compile function, and these two may be used as synonyms. The compiler 42 processes the source text and identifies the names of possible external components. The compiler 42 next obtains a list of references to all components. The compiler may eliminate references from the list using language specific knowledge such as component kinds. The compiler then calls the function called GetDeclaration for each external component identified in the text. The Compile function clears any existing errors on a component before invoking the compiler 42. This will clear any error messages from the Errors property and remove any references from the Project's ErrorList property. The compiler first calls the GetDeclaration function, which is illustrated by the flowchart of FIG. 9. The GetDeclaration function returns one of the values Abort, Done, Circulardependency or Error and may additionally return the data of the Declaration. The process is entered at block 901 where each reference is examined for its BuildState. If there are no more references to process, as indicated by block 902, processing is complete and a return is made. If the BuildState of the component is Compiled, as indicated at block 903, the function returns Done at function block 904, and the stored Declaration data is also returned, before processing continues at block 901. If the BuildState of the component is NeedToCompile or Uncertain, as indicated at block 905, a reference corresponding to the component is added to the front of the InterfaceCompileList in function block 906 and the function returns Abort in function block 907 before processing continues at block 901. Declaration data is not returned in this case. If the BuildState of the component is BeingCompiled, as indicated by block 908, then the function returns Circulardependency at function block 909 before processing continues at block 901. 
Declaration data is not returned for this case either. If the BuildState of the component is CompileError or UncertainError, as indicated in block 910, then the function returns Error in function block 911 before processing continues at block 901. Again, declaration data is not returned.
The pseudocode for the GetDeclaration function is as follows:
______________________________________
value GetDeclaration( A, Declaration ){
  Declaration = NIL;
  state = A.BuildState;
  if( state == Compiled ){
    Declaration = CurrentDeclaration();
    return( Done );
  }
  else if( state == NeedToCompile || state == Uncertain ){
    InterfaceCompileList.AddToFront( A );
    return( Abort );
  }
  else if( state == BeingCompiled ){
    return( Circulardependency );
  }
  else if( state == CompileError || state == UncertainError ){
    return( Error );
  }
}
______________________________________
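The four outcomes of GetDeclaration can be exercised with a short Python sketch. Returning a `(value, declaration)` tuple instead of an output parameter, the `Component` class, and the `declaration` attribute are choices made for this illustration only.

```python
from collections import deque


class Component:
    """Hypothetical component with a BuildState and a stored Declaration."""
    def __init__(self, name, build_state, declaration=None):
        self.name = name
        self.build_state = build_state
        self.declaration = declaration


def get_declaration(a, interface_compile_list):
    """Return (value, declaration); declaration is None unless value is Done."""
    state = a.build_state
    if state == "Compiled":
        return "Done", a.declaration
    if state in ("NeedToCompile", "Uncertain"):
        # Schedule the component ahead of whatever is being compiled now.
        interface_compile_list.appendleft(a)
        return "Abort", None
    if state == "BeingCompiled":
        return "Circulardependency", None
    return "Error", None                 # CompileError or UncertainError


queue = deque()
ready = Component("Ready", "Compiled", declaration="class Ready;")
pending = Component("Pending", "Uncertain")
busy = Component("Busy", "BeingCompiled")
broken = Component("Broken", "CompileError")
results = [get_declaration(c, queue) for c in (ready, pending, busy, broken)]
```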
After calling GetDeclaration, the compiler continues as follows. If the value returned was Abort, the compiler must terminate processing and return the value Abort. An alternative implementation would be for the compiler to suspend compilation, to be restarted or abandoned after compiling the returned component. This would require the compiler to be reentrant but otherwise requires no essential change to the procedure as described. If the value returned was Done, the compiler can continue processing. If the Declaration is used, this will constitute a SourceReference dependency, and the compiler should keep track of both the dependency and its nature. If the value returned was Circulardependency or Error, then the compiler must terminate processing, call the SetError function on the component, and return the value Error. The compiler may optionally continue processing to possibly find more errors before terminating. If all the calls to GetDeclaration return Done, the compiler will continue processing the source text in a conventional manner. If any error is encountered in the processing, the compiler will call the SetError function on the component and return the value Error. If no errors are encountered, the compiler then returns the value Done. If the compiler has been processing an interface, then it will store the new value of the Declaration property.
Method for Processing Errors
Before the compiler is called to compile an Interface or Implementation, any existing Errors are cleared. This will ensure that all error messages are up to date. Because of the built-in dependency between Interfaces and Implementations and the fact that the errors are propagated, it is never possible to get compiler errors on both the Interface and the Implementation on the same build. 
When the compiler encounters an error, it calls the function SetError which communicates information about the error, including the location of the error and a message describing the error, back to the erroneous component. This information is stored in the Errors property and the appropriate source property (Interface or Implementation) of the component. Also a reference is stored in a global error list maintained by the Project which allows convenient access to all errors. The error will be propagated to any dependent component so that these components need not be compiled later, since it is known that these compiles will fail. Furthermore, the build will continue after errors are encountered and will correctly build as many components as possible that neither are themselves explicitly in error nor depend on components with errors. The SetError function takes the error message passed to it by the compiler 42 and creates an entry in the component's Errors property corresponding to the appropriate property (Interface or Implementation). It also creates an entry in the Project's ErrorList property corresponding to the error. The two entries created in this way share the same key so that they remain "linked". The function also typically records the position of the error in the program source using a "sticky marker" which remains attached to the same range of characters during later user editing. If the compiler successfully completes processing of the source text, it will produce object code and pass that to the Linker function to incrementally link. Alternatively, the object code could be stored until the end of the build process and linked in a traditional fashion. The compiler will now update the SourceReferences property of the component and the Clients properties of each SourceReference. 
For each reference to, say, component B in the SourceReferences property of, say, component A, there will need to be a corresponding reference (which has the same dependency information) to component A in the Clients property of component B. The compiler will create a change describing the ways in which the Declaration has changed from its previous value. The compiler will call the function PropagateChange on the component passing it the calculated change. The compiler will then set the new value of the Declaration. The function PropagateChange matches the change against the dependency of each reference in the component's Client List. If the match indicates that the referenced component has been affected by the change and its BuildState is not CompileError or UncertainError, its BuildState is set to NeedToCompile. It is possible for the compiler to use the SetError function to issue warning messages or suggestions of various forms. In this case, if only warning messages are returned, the Compile function should return Done. The warning messages will be added to the Errors property and references will be added to the Project's ErrorList property. However, otherwise the compile is treated as successful. The appropriate BuildState will be set to Compiled and no errors will be propagated. If only warnings or suggestions are issued, then the program will be completely and correctly built.
Process for Conditionally Compiling a Component
The flowchart for the function ConditionallyCompile is shown in FIGS. 10A and 10B, to which reference is now made. Each component B in a component A's SourceReferences is processed in block 1001. If all components B have been processed, as indicated by block 1002, then processing is complete as to the components B, and the process goes to FIG. 10B to compile component A. 
If the BuildState of component B is BeingCompiled or NeedToCompile, as indicated at block 1003, the BuildState of the component is set to BeingCompiled and the component is compiled in function block 1004. The Compile function may return one of the values Done, Abort or Error. If the value Done is returned in block 1005, processing continues in block 1001. If the value returned is Abort in block 1006, the function is terminated and Abort is returned in function block 1007. If the value returned is Error in block 1008, the original component's BuildState is set to UncertainError, the function is terminated, and Error is returned in function block 1009. If the BuildState of component B is Uncertain, as indicated at block 1010, then the BuildState is set to BeingCompiled and the component is conditionally compiled in function block 1011. Again, the ConditionallyCompile function may return one of the values Done, Abort or Error. If the value Done is returned in block 1005, processing continues in block 1001. If Error is returned in block 1012, the component's BuildState is set to UncertainError, the component A is removed from the InterfaceCompileList, and the PropagateError function is called in function block 1014 before the function is terminated. If Abort is returned in block 1015, Abort is returned in function block 1007 before the function is terminated. Turning now to FIG. 10B, if all the references have been processed, then they all have the BuildState Compiled. However, one of the SourceReferences may have propagated a change to the component during the processing to this point, and so its BuildState may now be either BeingCompiled or NeedToCompile. Therefore, the BuildState of component A is determined in block 1016. If the BuildState is NeedToCompile, as indicated at block 1017, then the BuildState is set to BeingCompiled and component A is compiled in function block 1018. The compiler can return either Error or Done. 
Note that Abort should never occur because all the SourceReferences are Compiled at this stage. If Error is returned in block 1019, then the BuildState is set to CompileError and Error is returned in function block 1020. If Done is returned in block 1021, then the BuildState is set to Compiled and Done is returned in function block 1023. If the BuildState of component A is BeingCompiled, as indicated at block 1024, then the BuildState is set to Compiled and Done is returned in function block 1023. The pseudocode for the function ConditionallyCompile is as follows:
______________________________________
value ConditionallyCompile( A ){
  for each B in A.SourceReferences{
    state = B.BuildState;
    if( state == NeedToCompile || state == BeingCompiled ){
      value = Compile( B );
      if( value == Done ){
        continue;
      }
      else if( value == Abort ){
        return( Abort );
      }
      else if( value == Error ){
        A.SetBuildState( UncertainError );
        return( Error );
      }
    }
    else if( state == Uncertain ){
      B.SetBuildState( BeingCompiled );
      value = ConditionallyCompile( B );
      if( value == Done ){
        continue;
      }
      else if( value == Abort ){
        return( Abort );
      }
      else if( value == Error ){
        A.SetBuildState( UncertainError );
        InterfaceCompileList.Remove( A );
        PropagateError( A );
        return( Error );
      }
    }
  }
  state = A.BuildState;
  if( state == NeedToCompile ){
    A.SetBuildState( BeingCompiled );
    value = Compile( A );
    if( value == Done ){
      A.SetBuildState( Compiled );
      return( Done );
    }
    else if( value == Error ){
      A.SetBuildState( CompileError );
      return( Error );
    }
  }
  A.SetBuildState( Compiled );
  return( Done );
}
______________________________________
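The point of ConditionallyCompile is that a component is recompiled only if a change actually reaches it. The Python sketch below illustrates that under several assumptions: `Component` and `fake_compile` are hypothetical, PropagateChange is simulated by the stub compiler setting a client's state directly, a SourceReference is marked Compiled here after a successful compile, and the list removal and PropagateError calls of the error path are omitted for brevity.

```python
class Component:
    """Hypothetical component: name, BuildState, and its SourceReferences."""
    def __init__(self, name, build_state, source_references=()):
        self.name = name
        self.build_state = build_state
        self.source_references = list(source_references)


def conditionally_compile(a, compile_fn):
    """Compile `a` only if building its SourceReferences shows it must be."""
    for b in a.source_references:
        if b.build_state in ("NeedToCompile", "BeingCompiled"):
            b.build_state = "BeingCompiled"
            value = compile_fn(b)
            if value == "Done":
                b.build_state = "Compiled"      # assumption of this sketch
        elif b.build_state == "Uncertain":
            b.build_state = "BeingCompiled"
            value = conditionally_compile(b, compile_fn)
        else:
            continue                            # other states: nothing to do
        if value == "Abort":
            return "Abort"
        if value == "Error":
            a.build_state = "UncertainError"
            return "Error"
    if a.build_state == "NeedToCompile":        # a change reached `a`
        a.build_state = "BeingCompiled"
        if compile_fn(a) == "Error":
            a.build_state = "CompileError"
            return "Error"
    a.build_state = "Compiled"
    return "Done"


compiled = []


def fake_compile(c):
    compiled.append(c.name)
    # Stand-in for PropagateChange: recompiling Point affects its client Line.
    if c.name == "Point":
        line.build_state = "NeedToCompile"
    return "Done"


point = Component("Point", "NeedToCompile")
line = Component("Line", "BeingCompiled", [point])
color = Component("Color", "Compiled")
label = Component("Label", "BeingCompiled", [color])
result_line = conditionally_compile(line, fake_compile)
result_label = conditionally_compile(label, fake_compile)
```

Line is recompiled because Point's change reached it, while Label is marked Compiled without invoking the compiler at all.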
Method for Post Processing Errors
The method for post processing errors is the fourth stage of the Build process. If any errors occurred during the build, then the function PostProcessErrors is called at the end of the build. For each reference in the InternalErrorList, if the reference's BuildState is CompileError, the BuildState is changed to NeedToCompile. If the reference's BuildState is UncertainError, the BuildState is changed to Compiled. When all the references on the InternalErrorList have been processed, the list is cleared of all entries. As a convenience to the programmer, if the Project's ErrorList contains any entries, a window or the Browser is opened on the Project's ErrorList. The pseudocode for the PostProcessErrors function is as follows:
______________________________________
PostProcessErrors(){
  for each A in InternalErrorList{
    state = A.BuildState;
    if( state == CompileError ){
      A.SetBuildState( NeedToCompile );
    }
    else if( state == UncertainError ){
      A.SetBuildState( Compiled );
    }
  }
  InternalErrorList.ClearAll();
  if( ErrorList.Count != 0 ){
    OpenErrorWindow();
  }
}
______________________________________
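For completeness, the state reset performed in this stage can be sketched in Python. The `Component` class and the `open_error_window` callback are hypothetical stand-ins, and the error list is modeled as a plain list of messages.

```python
class Component:
    """Hypothetical component carrying only a name and a BuildState."""
    def __init__(self, name, build_state):
        self.name = name
        self.build_state = build_state


def post_process_errors(internal_error_list, error_list, open_error_window):
    for a in internal_error_list:
        if a.build_state == "CompileError":
            a.build_state = "NeedToCompile"
        elif a.build_state == "UncertainError":
            a.build_state = "Compiled"
    internal_error_list.clear()
    if error_list:                       # any user-visible errors recorded?
        open_error_window()


bad = Component("Bad", "CompileError")
client = Component("Client", "UncertainError")
internal = [bad, client]
opened = []
post_process_errors(internal, ["Bad: syntax error"],
                    lambda: opened.append(True))
```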
Using HOOPS
The Human Oriented Object Programming System (HOOPS) according to the invention can be started on the computer by entering either a new project name or an existing project name, depending on whether a new program is to be built or an existing program is to be edited. When HOOPS is started, a window is opened and an initial screen similar to the one shown in FIG. 11 is displayed. The initial window that HOOPS opens displays the Members property of the Project component and its immediate members. Although it initially only displays the immediate members, the same window is used to display every component starting at the project component. In the example shown in FIG. 11, a Project called "Payroll" has been imported. Every window in HOOPS is a browser. Browsers are temporary viewing and editing tools for looking at information in the Project. They can be deleted at any time by clicking on the close icon in the window. Any changes made to the Project while in the browser are automatically saved. A browser has an input component that is specified when it is opened. A property of the input component is displayed in a pane, and each pane displays one property viewer or is blank, as shown in FIG. 12. New panes are added to a browser by choosing one of the split icons in the upper right corner of a pane. When a new pane is created, default wiring is created from the pane being split to the new pane. Wiring is the logical relationship between panes. A pane can have zero or one wire as input and zero or more wires as output, but wiring cannot form a loop. When a component is selected in a pane, the selection is converted into a reference to a component in the project and becomes a new input to the destination of any wires emanating from that pane. The wiring can be turned on by choosing Turn on Wiring from the Browser menu selected from the menu bar, resulting in the display shown in FIG. 13. 
Using this display, it is possible to change the wiring between two panes by clicking down with the mouse on the new input location and dragging to the target pane. In many viewers, such as Members, Clients and References, components can be distinguished by their names and their icons, which differ by component kind. In other viewers, a component's name simply appears in the text, such as in Source or Documentation. The component hierarchy can be browsed by expanding and collapsing container components in the Members property viewer, producing a Tree view, an example of which is shown in FIG. 14. One level of a component's subtree can be expanded or collapsed by clicking the component's circular toggle switch. When a component is selected in a viewer, either by clicking on its icon if it has one or by selecting its name in a text display, the Property menu in the global menu bar is adjusted to list the properties for that type of component. Any property of any component can be viewed by selecting the component in a viewer and then choosing a property from the Property menu. This opens a new browser containing a single viewer which displays the chosen property of the selected component. Components are created from within either a Members or Interface viewer by specifying where the new component is to be created, and the kind of component it will be. The location of the new component is specified by either selecting an existing component or by placing an insertion point between components. The kind of component created is determined by which menu item is selected from the New viewer menu. All editing is automatically stored. Only changed components, and their clients affected by the change, are compiled. The recompiled components can be viewed by choosing the Show Components Built menu item from the Build menu. To see the components changed since the last build, the Show Components Changed from the Build menu is chosen. 
A program is compiled, and linked, by choosing Build from the Build menu. The Build & Run menu item also runs the program. FIGS. 15 to 18 illustrate some of the screens displayed in the process of editing a component. FIG. 15 shows the display of the source code of an Implementation of a function called "main". In FIG. 16, the function "main" has been edited by changing numberdisks from "7" to "9". If the programmer now chooses Show Components Changed from the Build menu shown in FIG. 17, a browser like that shown in FIG. 18 appears. In the "Implementation Changes" viewer (on the right), the function "main" is displayed indicating that it has been changed.
Object Oriented Linking
This description lists the important features of the HOOPS linker, then it provides background on the runtime environment of a preferred embodiment, and the HOOPS database to provide the context in which linking occurs. Finally, a discussion of component linkage, and the interaction of components with the HOOPS compiler, the HOOPS database, and the system loader is provided with reference to a preferred embodiment.
Linker Features
Linking occurs during the compilation process. There is no extra linking pass. During a build, only newly compiled functions and data are re-linked. During incremental development, some shared library space is traded for speed. The compiler interacts with components and properties to produce all object code and other linking information. When a program is ready for release, a "publish" step will remove extra space and information used during incremental development, and separate the application from HOOPS. A "QuickPublish" step will be available for quickly separating the application from HOOPS for sharing with others, or moving to another machine. The linker is extensible because the compiler may specify new fixups that the linker doesn't normally handle. A suspended program may be modified and then resume execution without being reloaded. 
(Some changes will require a reload.)
Background
The linker operates inside HOOPS, and creates files that are used by the loader. To understand the linker strategy, it is important to understand the unique aspects of both the runtime system and HOOPS. An executable file interacts with the runtime much differently than in other runtime systems. Normally, a loader program must understand the executable file format. The executable file has known fields that describe various aspects of the program such as the amount of memory needed, the address of main, any relocation information if that is needed at load time, and any debugger information that is packaged in the executable. In a runtime of a preferred embodiment, the loader interacts with the executable file through an abstract TLoadModule class interface. The TLoadModule provides protocols for all the loading operations. For example, operations such as specifying memory requirements, building meta data information, and linking with other shared libraries are all provided by methods of TLoadModule. With this approach, there can be many different ways in which a load module can respond to the loading requests. The runtime definition provides shared libraries, and allows for cross-library calls to be resolved at load time. Since libraries may be loaded at any memory location, all code must be either position independent, or must be patched at load time. In addition to position independent code, calls to other shared libraries must be resolved at load time. This is because the static linker does not know what the location, or the relative offset, of the external library will be in memory. While each TLoadModule class may implement cross-library calls in many different ways, the standard method is to jump through a linkage area that is patched at load time. The linkage area serves as an indirect jump table between libraries. 
An external call will JSR to the linkage area, and the linkage area will then JMP to the called function. Internal calls can JSR directly to the called function. An example of an internal and cross-library call is shown in FIG. 19 and described below. The call to f1() 1900 is an internal call, so the JSR goes directly to f1() 1910. The call to f2() 1920 is a cross-library call; therefore, the call goes to the external linkage area 1930 that is patched at load time. The HOOPS environment also provides a unique context for the linker. A program is represented as a collection of components. Each component has an associated set of properties. During the compilation of each component, the compiler will generate and store properties applicable to that component. The HOOPS build process orders the building of components so that all interfaces (declarations) are compiled before implementations (definitions). A HOOPS project may consist of several library components. All source components are members of one of these library components. Each library component represents a shared library build.
Overview
To support incremental linking, and allow a final application to be as small and fast as possible, two different types of load modules are created. During development, HOOPS generates and modifies a TIncrementalLoadModule. There is a second load file, TStandardLoadModule, that is created when publishing applications. A preferred embodiment discloses an approach for building and updating code during development. Converting a TIncrementalLoadModule into a TStandardLoadModule involves an extra "publish" step. This step will be much like a normal link step, in that each function or data item will be relocated and patched. However, external references are not resolved until load time.
Compiler Interaction
As the compiler generates code for a component, it passes the code to the object code property with a set of fixups that are used to patch the object code. 
Each compiled component has its object code property filled. The compiler uses an "object group" model. That is, a component can be made up of multiple types of object code. For example, a function could also have a private static data area associated with it, along with a destructor sequence for that static data area. A static data item could have a constructor and destructor sequence associated with it to initialize it at runtime. For example, suppose the following component was compiled:
______________________________________
void TFoo::Print()
{
static int timesCalled = 0;
cout << "Hello world:" << timesCalled << "\n";
timesCalled++;
}
______________________________________
The compiler will generate two pieces of object code and associate them with the component TFoo::Print. There will be the object code for the function, and 4 bytes of private data for the static variable timesCalled. This might look something like the following:
______________________________________
Object code Property of TFoo::Print - code:
0x0000: LINK A6,#0
0x0004: MOVE.L A5,-(A7)
0x0006: PEA L1
0x000A: MOVE.L <timesCalled>,-(A7)
0x000E: PEA L2
0x0012: MOVE.L cout,-(A7)
0x0016: BSR <operator<<(char*)>
0x001C: ADDQ.L #8,A7
0x001E: MOVE.L D0,-(A7)
0x0020: BSR <operator<<(int)>
0x0026: ADDQ.L #8,A7
0x0028: MOVE.L D0,-(A7)
0x002A: BSR <operator<<(char*)>
0x0030: ADDQ.L #8,A7
0x0032: ADDQ.L #1,<timesCalled>
0x0034: UNLK A6
0x0036: RTS
L1: DB "\n"
L2: DB "Hello world:"
______________________________________
Object code property of TFoo::Print - data:
00000000:
0000 0000
______________________________________
Along with the object code, the compiler will specify different fixups that must be applied as the code is relocated. These might look something like:

reference to timesCalled @ offset 0x0c
reference to cout @ offset 0x14
reference to ostream::operator<<(const char *) @ offset 0x18
reference to ostream::operator<<(int) @ offset 0x22
reference to ostream::operator<<(const char *) @ offset 0x2c
reference to timesCalled @ offset 0x34

Notice that the fixups may specify references to the other pieces of objects associated with this same component (the private static variable timesCalled), or to other components (such as cout). When the compiler has completely specified the full set of objects and fixups associated with a component, the object code property relocates all of its pieces, and links itself at the same time. There is no second link pass performed after all the components are compiled. As each component is compiled, it is also fully linked.

Fixup Lists

Linking is essentially a matter of iterating through the list of fixups and patching the code in an appropriate manner. Different types of fixups are specified through a class hierarchy, with each fixup knowing how to calculate the patch value. For example, a pc-relative fixup knows that it must calculate the difference between the address of its location and the address of the component which it references. An absolute fixup knows that it must delay calculations until load time. While the linker specifies a set of fixup classes, new compilers may specify new types of fixups. FIG. 20 illustrates a set of fixup classes in accordance with a preferred embodiment.

Address Calculation

The main problem with linking each component as it is compiled is that some components it references may not yet have been compiled. Each source component is a member of exactly one library component. Associated with each library component is a load module property.
The load module property works as the clearing house for all components that belong to the shared library. As a fixup prepares to calculate a patch value, it queries the load module property for the address of a component. The load module property checks to see if the component has been compiled. If it has, then it returns the address of the component. However, if the component has not yet been compiled, the load module property performs one of two actions depending on the type of the component. If the type of the component is a data component, then it just returns a constant address. If the type of the component is a function component, then it creates a linkage area for that function, and returns the address of the linkage area.

Object Placement

As mentioned before, as each component is compiled, it is allocated a position in the shared library. As this is done, some extra work must be done so that all references are consistent. If the component is a data component, all its clients are notified of the position. Some clients may have initially been linked with bogus addresses, so this process cleans up all the clients and provides them with the right address. If the component is a function component, then the linkage area for that function is updated with the new address. Notice that this two-style approach provides indirect access to functions, and direct access to data. In addition, extra space is allocated so that future updates of the object code have a higher probability of being able to use the same area. 12% extra is provided for functions and 25% extra is provided for large data objects.

Linkage Area

As mentioned above, when the load module property is asked for the address of a function, it will give the address of the linkage area. This means that every function reference is indirect. FIG. 21 illustrates a linkage area in accordance with a preferred embodiment.
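The address query and placement steps described above can be sketched in C++ as follows. This is a minimal illustration under stated assumptions, not the implementation of the preferred embodiment: the names LoadModuleSketch, AddressOf, Place, and the three numeric constants are hypothetical, addresses are modeled as plain integers, and the re-patching of data clients is only indicated by a comment.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// All names and numeric values below are hypothetical, chosen for illustration.
enum class Kind { Function, Data };

constexpr std::uint32_t kPlaceholderDataAddress = 0xFFFFFFF0u; // assumed "constant address"
constexpr std::uint32_t kLinkageBase = 0x1000u;                // assumed start of the linkage area
constexpr std::uint32_t kLinkageEntrySize = 6u;                // assumed size of one JMP entry

struct LoadModuleSketch {
    std::map<std::string, std::uint32_t> dataAddress;   // placed data components
    std::map<std::string, std::uint32_t> linkageEntry;  // function -> its linkage entry
    std::map<std::uint32_t, std::uint32_t> jumpTarget;  // linkage entry -> target (0 = none yet)

    // Address a fixup should use when it references `name`.
    std::uint32_t AddressOf(const std::string& name, Kind kind) {
        if (kind == Kind::Data) {
            auto it = dataAddress.find(name);
            if (it != dataAddress.end()) return it->second;  // direct access to data
            return kPlaceholderDataAddress;                  // corrected once the data is placed
        }
        // Functions are always reached through a linkage entry, created on demand.
        auto it = linkageEntry.find(name);
        if (it == linkageEntry.end()) {
            const std::uint32_t entry = kLinkageBase +
                kLinkageEntrySize * static_cast<std::uint32_t>(linkageEntry.size());
            it = linkageEntry.emplace(name, entry).first;
            jumpTarget[entry] = 0;
        }
        return it->second;
    }

    // Record the position allocated to a freshly compiled component.
    void Place(const std::string& name, Kind kind, std::uint32_t address) {
        assert(address != 0);
        if (kind == Kind::Function) {
            jumpTarget[AddressOf(name, kind)] = address;  // only the linkage entry changes
        } else {
            dataAddress[name] = address;  // clients holding the placeholder would be re-patched here
        }
    }
};
```

In this sketch a caller that asked for a function before it was compiled keeps the same linkage entry address afterwards, which is why only data clients need to be revisited on placement.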
Notice that not only do internal library calls pass indirectly through the internal linkage area, but cross-library calls to functions also go indirectly through a library's internal linkage area (i.e., the call to f2 in Library B, 2100, 2110, 2115, 2120). This must be done so that f2 may change position without updating both its internal and external clients, and also for consistency so that items such as function pointers work correctly. In addition, all virtual table function pointers will also point to the internal linkage area. Any functions that are referenced, but not defined, will point to a common Unimplemented() function. Having all uncompiled functions point to Unimplemented() makes it possible to load and run partial applications without forcing the programmer to create stub functions.

Another benefit of having the internal linkage area is that it provides a bottleneck to all functions. During development, the internal linkage area can be useful for activities that require function tracing such as debugging or performance monitoring.

Incremental Linking

The previous discussion has laid the foundation for a detailed discussion of incremental linking. When a component is recompiled, the new component size is compared to the old component size to determine if the new component fits in the current location. If it does, then it is stored there, and its fixup list is iterated through. Linking is then complete.

If the object code for the new component must be relocated, then the old space is marked as garbage, and the new object code is relocated to a new area. Then the fixup list is iterated through. If the component is a function, the linkage entry is updated. Linking is then complete. However, if the component is a data item, then the component must iterate over the list of clients and update their references to this component. Linking is then complete for the data.

Notice that the initial link and incremental link follow the exact same steps.
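The relink decision described above may be summarized by the following C++ sketch. It is illustrative only: the types Slot, RelinkAction, and RelinkResult and the function Relink are hypothetical names, and the sketch reports which follow-up step is required rather than performing the patching itself. In every case the component's own fixup list is still iterated.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical types for illustration; sizes and addresses are plain integers.
enum class Kind { Function, Data };

struct Slot {
    std::uint32_t address;   // where the object code currently lives
    std::uint32_t capacity;  // bytes reserved there, including any extra growth space
};

enum class RelinkAction {
    PatchInPlace,        // new code fits: store it and iterate the fixup list
    UpdateLinkageEntry,  // function moved: repoint its linkage entry
    UpdateClients        // data moved: every client reference must be re-patched
};

struct RelinkResult {
    Slot slot;               // slot used by the recompiled component
    RelinkAction action;     // extra step needed after iterating the fixup list
    bool oldSpaceIsGarbage;  // true when the previous slot was abandoned
};

RelinkResult Relink(Kind kind, Slot current, std::uint32_t newSize,
                    std::uint32_t nextFreeAddress) {
    if (newSize <= current.capacity) {
        return { current, RelinkAction::PatchInPlace, false };
    }
    // Does not fit: mark the old space as garbage and relocate to a new area.
    const Slot moved{ nextFreeAddress, newSize };
    const RelinkAction action = (kind == Kind::Function)
        ? RelinkAction::UpdateLinkageEntry
        : RelinkAction::UpdateClients;
    return { moved, action, true };
}
```

For example, a function whose slot holds 112 bytes and whose new code is 100 bytes stays in place, while the same function growing to 200 bytes moves and needs only its linkage entry changed.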
The only extra step done in incremental updates is handling the case when a data item must change location.

Object Code Storage

The object code and load module property are normal component properties, and as such, are stored like all other properties in the HOOPS database. However, the object code property describes the object code, but does not contain the actual bits. The actual bits are stored in segments owned by the load module property. The load module property maintains four different segments. These segments include: code, uninitialized data, initialized data, and linkage. FIG. 22 illustrates the storage of object code in accordance with a preferred embodiment. Each of the graphic objects 2200 has an associated load module property 2250 containing the individual object code associated with the graphic objects 2210, 2220, 2230 and 2240.

Since all code is linked as it is compiled, and support is provided for changing and incremental building, the load module property maintains a map of all the objects allocated in each segment. It also tries to keep extra space available for growth. This extra space wastes some virtual memory space, but does not occupy backing store or real memory. If during the process of repeatedly changing and building an application, the extra space is exhausted, additional space will be allocated, affected segments must be relocated, and all references into and out of that segment must be updated.

FIG. 23 illustrates a loaded library in accordance with a preferred embodiment. The white sections 2300, 2310, 2320 and 2330 represent free space. Four sections are provided for uninitialized data 2340, initialized data 2350, code 2360 and a linkage area 2370. In HOOPS, the segments have no spatial relationship. Linking uses what will be the loaded relationship, not the relationship that they might have within HOOPS itself.

Loading

To run a program, the loader must be given a streamed TLoadModule class.
During program building, a streamed TLoadModule class is created. When loaded, it loads the segments created in HOOPS. The segments are shared between the loaded application and HOOPS. This provides two benefits: first, it greatly reduces the amount of copying that must be done, and second, it allows for incremental updates while the program is loaded.

Streams must be written from start to finish. Since the loader requires a streamed TLoadModule class, the TIncrementalLoadModule attempts to reduce the amount of information streamed. This means that for most changes in a program, the TIncrementalLoadModule will not have to be restreamed. The TIncrementalLoadModule gets all the mapping information from HOOPS through the use of a shared heap. Otherwise, any change in data location, or function size would require a new TIncrementalLoadModule to be built and streamed. FIG. 24 is a memory map of a load module in accordance with a preferred embodiment.

Incremental Updates

Incremental linking facilitates modification of a loaded library without removing it from execution. This requires changes made in HOOPS to be reflected in the address space of the running application. This will be handled by loading the library as a shared segment. Any modifications made on the HOOPS side will be reflected on the running application side. Remember that on the HOOPS side, the segment is interpreted as a portion of the HOOPS database; on the application side, it is just a segment that contains object code.

The model for active program modification is as follows. The debugger first stops execution; modified functions are compiled and placed at different locations even if they fit in their current location; the internal linkage area is updated; and the program is continued. If a modified function was active on the stack, the old version will execute until the next invocation of that function. An alternative is to kill the program if active functions are modified.
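The effect of updating the internal linkage area while a program is loaded can be illustrated with a small C++ sketch. It is an analogy under stated assumptions rather than the mechanism of the preferred embodiment: a vector of function pointers stands in for the table of JMP entries, and the names LinkageAreaSketch, ScaleV1, and ScaleV2 are hypothetical. Unimplemented() here merely mimics the common function that referenced but undefined functions point to.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Fn = int (*)(int);

// Hypothetical stand-in for the internal linkage area: one slot per function.
// Callers hold a slot index, never the address of the function itself.
struct LinkageAreaSketch {
    std::vector<Fn> slots;

    std::size_t Add(Fn target) {
        slots.push_back(target);
        return slots.size() - 1;
    }
    // Repoint a slot at a newly compiled version; callers are left untouched.
    void Update(std::size_t slot, Fn newTarget) {
        assert(slot < slots.size());
        slots[slot] = newTarget;
    }
    // Every call is indirect through the slot.
    int Call(std::size_t slot, int arg) const {
        assert(slot < slots.size());
        return slots[slot](arg);
    }
};

int Unimplemented(int) { return -1; }  // placeholder target for uncompiled functions
int ScaleV1(int x) { return x * 2; }   // first compiled version
int ScaleV2(int x) { return x * 3; }   // modified version placed at a new location
```

A caller that obtained its slot before the function was compiled first reaches Unimplemented(), then ScaleV1 after the initial build, then ScaleV2 after the modification, without the caller being patched at any point.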
Publishing a Program

When an application is published, the linker will copy all object code to a file outside of the database. As the segments are copied to an external file, the linker will relocate and patch all the functions. In addition, all internal calls will become direct calls, and the internal linkage area will be removed. Besides just relocating and linking the object code, the linker must include the meta data necessary for virtual table creation. Notice that this step is essentially a relink; the compiler is not involved.

A second style of publishing, referred to as a quick publish, is also required. A quick publish copies the required segments from the database to an external file. The purpose of this second publish is to support quick turn-around for cross development, or shared work.
______________________________________
Implementation details
Class Definitions
______________________________________
enum EObjectKind { kCode, kData, kStaticCtor, kStaticDtor };

class TFixup;  // defined below

class TObjectProperty : public TProperty {
public:
    TObjectProperty();
    virtual ~TObjectProperty();

    // Compiler Interface
    virtual void WriteBits(EObjectKind whichOne, LinkSize length,
                           void* theBits, unsigned short alignment);
    virtual void AdoptFixup(EObjectKind whichOne, TFixup* theFixup);

    // Getting/Setting
    void* CopyBits(EObjectKind whichOne) const;
    LinkOffset GetOffset(EObjectKind whichOne) const;
    LinkSize GetLength(EObjectKind whichOne) const;
    ELinkSegment GetLinkSegment(EObjectKind whichOne) const;
    Boolean Contains(EObjectKind whichOne) const;
    virtual EObjectKind GetPublicKind() const = 0;

    // Linking
    virtual void GetLocation(EObjectKind whichOne,
                             TLocation& fillInLocation) const;
    TIterator* CreateFixupIterator() const;
};

The object code property delegates the fixup work to individual fixup objects.

class TFixup {
public:
    // Each derived fixup computes and applies its own patch value.
    virtual void DoFixup(void* moduleBase) = 0;
private:
    TComponent* fReference;  // component the fixup refers to
    long fOffset;            // offset of the location to patch
};
______________________________________
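The delegation of patching to fixup objects, as described earlier for pc-relative and absolute fixups, can be sketched as follows. This is a simplified, self-contained illustration and not the classes of the preferred embodiment: the names FixupSketch, PCRelativeFixupSketch, AbsoluteFixupSketch, and Link are hypothetical, the patched field is assumed to be 32 bits in host byte order, and the displacement is taken relative to the patched location itself, which ignores the exact 68K program counter convention.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <memory>
#include <vector>

// Hypothetical simplification: a fixup patches one 32-bit field in a byte buffer.
class FixupSketch {
public:
    FixupSketch(std::uint32_t offset, std::uint32_t target)
        : fOffset(offset), fTarget(target) {}
    virtual ~FixupSketch() = default;
    // Returns true if patched now, false if the work is deferred to load time.
    virtual bool DoFixup(std::vector<std::uint8_t>& code,
                         std::uint32_t codeBase) const = 0;
protected:
    void Write(std::vector<std::uint8_t>& code, std::int32_t value) const {
        assert(fOffset + sizeof value <= code.size());
        std::memcpy(code.data() + fOffset, &value, sizeof value);
    }
    std::uint32_t fOffset;  // offset of the field to patch within the code
    std::uint32_t fTarget;  // address of the referenced component
};

class PCRelativeFixupSketch : public FixupSketch {
public:
    using FixupSketch::FixupSketch;
    bool DoFixup(std::vector<std::uint8_t>& code,
                 std::uint32_t codeBase) const override {
        // Patch value is the distance from the patched location to the target.
        Write(code, static_cast<std::int32_t>(fTarget - (codeBase + fOffset)));
        return true;
    }
};

class AbsoluteFixupSketch : public FixupSketch {
public:
    using FixupSketch::FixupSketch;
    bool DoFixup(std::vector<std::uint8_t>&, std::uint32_t) const override {
        return false;  // absolute addresses are unknown until load time
    }
};

// Linking: iterate the fixup list; returns how many fixups were deferred.
std::size_t Link(std::vector<std::uint8_t>& code, std::uint32_t codeBase,
                 const std::vector<std::unique_ptr<FixupSketch>>& fixups) {
    std::size_t deferred = 0;
    for (const auto& fixup : fixups) {
        if (!fixup->DoFixup(code, codeBase)) ++deferred;
    }
    return deferred;
}
```

Because the caller of Link only sees the base class, a new fixup kind introduced by a new compiler would need no change to the linking loop in this sketch.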
Derived from TFixup are the classes TPCRelativeFixup, TAbsoluteFixup, and TDataRelativeFixup. Each fixup class understands how to perform the appropriate patching for its type. This is completely different from the normal compiler/linker interaction where the linker must interpret different bits to decide what action to take. Another advantage of this approach is that a new compiler for a new architecture doesn't have to worry about a fixup type not being supported in the linker.

Reference Types

The linker must handle four types of references. They are code-to-code, code-to-data, data-to-code, and data-to-data. The way each type of reference is handled (for 68K) is described below:
______________________________________
Code-to-Code
______________________________________
Example:
Foo();
______________________________________
The compiler handles this case in two different ways depending on the context. It can either go pc-relative to Foo(), or it can load the address of Foo(), and go indirect through a register. Any internal call can use either style. The linker will always report the address of the linkage area. Cross-library calls must use the load-address style. These will use absolute addresses that will be patched at load time.
______________________________________
Code-to-Data
______________________________________
Example: gValue = 1;
______________________________________
The compiler will generate a pc-relative access to gValue. However, if gValue is in a different shared library, the compiler will automatically generate an indirection. The linker will catch the indirect reference and provide a local address which will be patched with the external address at load time.
______________________________________
Data-to-Code & Data-to-Data
______________________________________
Example (Data-to-Code):
void (*pfn)() = Foo;
Example (Data-to-Data):
int& pi = i;
______________________________________
Since both of these references require absolute addresses, they will be handled during loading. The patching of data references at load time will be handled just like the patching of external references. FIG. 25 shows what happens in each type of reference. All of these cases show the internal usage case. If an external library references these same components, this library will receive several GetExportAddress() calls at load time. In response to GetExportAddress(), a library will return the internal linkage area address for functions, and the real address for data. This allows the functions to move around while the library is loaded.

Linkage Areas

The internal linkage area is completely homogeneous (each entry is: JMP address). The external area has different types of entries. A normal function call will have a jump instruction in the linkage area, while a virtual function call will have a thunk that indexes into the virtual table. Pointers to member functions have a different style of thunk.

While the invention has been described in terms of a preferred embodiment in a specific programming environment, those skilled in the art will recognize that the invention can be practiced with modification within the spirit and scope of the appended claims.
