Development system and methods with direct compiler support for detecting invalid use and management of resources and memory at runtime
5909580
Abstract
A development system having a compiler, a linker, and an interface for detecting invalid use of resources is described. When the system is (optionally) operating in a "code guarded" mode, the linker links the object modules with a CodeGuard® Library to generate "code guarded" program(s). The API (Application Programming Interface) calls to the runtime libraries are "wrappered" by CodeGuard wrapper functions. Additionally in this mode, operation of the compiler is altered to insert calls from the user code (e.g., compiled code in object modules) to the CodeGuard Runtime Library, for checking runtime use of resources and memory. As a result, the system can identify a programming error at the location where it occurs, that is, at the original line of code which gave rise to the problem in the first place. Errors are found immediately as they happen, so that the original offending site can be determined. In this manner, the task of creating computer programs is simplified.
Claims
What is claimed is:
Description
COPYRIGHT NOTICE
__________________________________________________________________________
void foo( int iParam )
{
    int* pMyInt;            // ptr to int
    if ( iParam > 0 ) {
        int MyInt;          // declare int within "if" block
        pMyInt = &MyInt;
                            // assign addr of MyInt to pMyInt
    }
    *pMyInt = 5;            // wrong -- dereference through pMyInt; MyInt no longer valid!
}
__________________________________________________________________________
Here, when execution of the function passes out of the "if" block, the local variable (i.e., MyInt) technically no longer exists--it does not represent legitimate memory any more. The fact that there exists a pointer pointing to that memory block, however, is potentially problematic. If, for instance, the pointer is dereferenced for accessing the now-gone integer (e.g., for an assignment operation, as shown), an error would occur. The stack descriptors are designed to encompass information about multiple nested scopes. In this manner, the descriptors can indicate the visibility or lifetime of nested local variables.
The above description of strcpy illustrates that the general sequence of validation steps is applicable to memory objects generally. In other words, the foregoing represented a generic sequence of validation. Specific validation sequences--specific to a particular RTL function--will now be described.
2. Specific Approach to Memory Objects--strcpy
Construction of wrapper functions for string RTL functions is exemplified by strcpy, which may be constructed as follows (using the familiar C/C++ programming language).
__________________________________________________________________________
 1: /*
 2:  * strcpy wrapper
 3:  */
 4:
 5: char _FAR * _export _cg_strcpy(
 6:     char _FAR * (*_org_strcpy) (char _FAR *_dest,
 7:         const char _FAR * _src), unsigned int prevEBP,
 8:     void * retEIP,
 9:     char _FAR * _dest, const char _FAR * _src)
10: {
11:     static FUNCTION func_strcpy
12:         = {"strcpy", _cg_strcpy, F_FUNC_ALL, "ps=p", 0};
13:     char _FAR * r;
14:     size_t srcLen;
15:
16:     if (off || inRTL() || isDisabled(&func_strcpy))
17:     {
18:         _SWITCHTORTLDS;
19:         return _org_strcpy(_dest, _src);
20:     }
21:
22:     srcLen = validateString(_src, 0, &func_strcpy, rtlEBP);
23:     validate(_dest, srcLen, &func_strcpy, rtlEBP);
24:
25:     enterRTL();
26:     _SETRTLDS;
27:     r = _org_strcpy(_dest, _src);
28:     _RESTOREDS;
29:     leaveRTL();
30:
31:     apiRet(&func_strcpy, rtlEBP, (unsigned long) r);
32:     return r;
33: }
(line numbers added to facilitate description)
__________________________________________________________________________
The strcpy wrapper is substituted for each call to strcpy in the "code guarded" program. A stub named "strcpy" is employed which passes the "real" strcpy address along to the above wrapper for strcpy. For strcpy, the system provides the strcpy, _cg_strcpy, and _org_strcpy functions. Calls to the function strcpy() in the user's program code get resolved to the CodeGuard® version of strcpy, which pushes the address of _org_strcpy (the RTL strcpy) and then calls _cg_strcpy. The prevEBP and retEIP parameters are already on the stack from the call to the strcpy wrapper.
The specific steps performed by the wrapper function or method are as follows. At line 11, descriptor information is stored for the function, including: (1) name of the function, (2) relevant CodeGuard function, (3) flags (e.g., indicating what to check), (4) string descriptor describing the parameters (i.e., what the parameter types are, with respect to checking--used for parsing the function arguments when displaying warnings or errors), and (5) extra data member. After declaring two other local variables at lines 13-14, the function proceeds to line 16 to test two conditions. First, the function tests whether CodeGuard checking has been disabled (in which case the wrapper function is to do no checking). Second, the function tests whether execution is currently inside an RTL function which has then called the strcpy function. If the strcpy function has been called from another RTL function, no further checking should occur (i.e., by this wrapper), as checking has already occurred. If either of these cases holds true, the method simply calls into the original version of the strcpy function and then returns, as shown at line 19.
If, on the other hand, this is a call from some user code, then the wrapper function performs certain checks. First, at line 22, the function validates that the source being passed in is a valid string.
This is done by invoking a validateString function, passing the pointer to the source string together with the descriptor for this function (i.e., the strcpy function). Also passed in is the stack frame pointer, which indicates where to start looking (in the instance where the source string lives on the stack). As shown, this returns the length of the source string to a local variable, srcLen. This length is, in turn, employed to validate the destination, which is done by a call to a generic validate function at line 23. The function takes as its first parameter a pointer to the memory block to validate (i.e., the destination string) together with a size for the block (here, the same number as the length of the source string). The last two arguments to the validate function are the same as those previously described for the validateString function: a strcpy descriptor and a pointer to the stack frame. The validate routine makes sure that the destination is sufficiently large to receive a string of size srcLen. Appended herewith as Appendix B are source listings demonstrating exemplary embodiments of validate and validateString.
At line 25, the method invokes a routine, enterRTL, for setting a flag indicating that execution is currently within the RTL (i.e., to avoid re-entry). This is followed by a call to the original strcpy function, at line 27. In other words, the "real" strcpy function call into the runtime library is _org_strcpy, which occurs at line 19 and line 27. For 16-bit implementations, macros are added for switching the data segment (DS) register to the DS of the runtime library, as shown at line 26. When the call returns from the original strcpy, the data segment is restored, as shown at line 28. Thereafter, at line 29, the function calls leaveRTL for indicating that execution has now returned from the RTL library (i.e., resetting the re-entry flag).
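The two validation steps described for the strcpy wrapper can be illustrated with a minimal sketch. It is not the Appendix B implementation: the region table and the names add_region, check_string, check_block, and checked_strcpy are hypothetical, and the string length is assumed here to count the terminating null character.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record of a memory block the checker knows to be valid. */
typedef struct {
    const char *base;
    size_t      size;
} Region;

#define MAX_REGIONS 16
static Region regions[MAX_REGIONS];
static size_t region_count = 0;

/* Register a block as valid; returns 0 when the table is full. */
static int add_region(const void *base, size_t size) {
    if (region_count == MAX_REGIONS) return 0;
    regions[region_count].base = (const char *)base;
    regions[region_count].size = size;
    region_count++;
    return 1;
}

/* Bytes available from p to the end of its region, or 0 if p is untracked. */
static size_t bytes_available(const void *p) {
    uintptr_t a = (uintptr_t)p;
    for (size_t k = 0; k < region_count; k++) {
        uintptr_t lo = (uintptr_t)regions[k].base;
        if (a >= lo && a - lo < regions[k].size)
            return regions[k].size - (size_t)(a - lo);
    }
    return 0;
}

/* In the spirit of validateString: length including the terminator,
 * or 0 when no terminator lies inside the tracked region. */
static size_t check_string(const char *s) {
    size_t avail = bytes_available(s);
    const char *end = avail ? memchr(s, '\0', avail) : NULL;
    return end ? (size_t)(end - s) + 1 : 0;
}

/* In the spirit of validate: can the block at p hold `needed` bytes? */
static int check_block(const void *p, size_t needed) {
    return needed > 0 && bytes_available(p) >= needed;
}

/* Checked copy: returns dest on success, NULL when either check fails. */
static char *checked_strcpy(char *dest, const char *src) {
    size_t n = check_string(src);
    if (n == 0 || !check_block(dest, n)) return NULL;
    return strcpy(dest, src);
}
```

After registering a source buffer and two destination buffers with add_region, a copy into a destination smaller than the source string is refused, while a copy into a large enough destination goes through to the ordinary strcpy.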
At line 31, the return value (r) is logged, for keeping a record of the result of the function call. This provides the system with the ability to log the results of all function returns. Finally, the wrapper function returns to the client the result, r, at line 32. Corresponding wrappers for other string functions, such as strncpy, are essentially identical to the above except that the function descriptor is set to that for the particular RTL function. Collectively, these are wrappers for items which access or touch resources--reading from and writing to resources.
3. General Approach to Allocators and Destroyers
The allocators and destroyers (de-allocators) follow a slightly different model. The general approach for these wrappers is as follows. First, they check any parameters (e.g., FILE pointer to fopen). Next, these wrappers call into the original (corresponding) RTL functions. In the instance of an allocation (e.g., malloc), the call is made with the number of bytes required. The result obtained--a pointer or a handle to a resource--is logged as a new resource (of a particular resource type). Then, these wrappers check to see whether a failure occurred during the actual API call into the RTL library. This is reported to the user (if desired), in the manner previously described. Finally, the resource (e.g., pointer or handle) is returned to the user.
4. Specific Approach to Resource (File) Allocators and Destroyers
These principles may be illustrated by examining implementations of wrapper functions for fopen and fclose. In contrast to items which use resources (e.g., string functions), things which allocate and free resources require more checking and, thus, larger wrapper functions. In an exemplary embodiment, a fopen wrapper function may be constructed as follows.
__________________________________________________________________________
 1: /*
 2:  * fopen wrapper
 3:  */
 4:
 5: FILE * _export _cg_fopen(
 6:     FILE * (*_org_fopen) (const char * _path, const char * _mode),
 7:     int (*_org_fclose) (FILE * _stream),
 8:     int (*_org_errno_help) (int setErrno),
 9:     int (*_org_close) (int _handle),
10:     unsigned int prevEBP,
11:     void * retEIP,
12:     const char * _path, const char * _mode)
13: {
14:     static FUNCTION func_fopen = {"fopen", _cg_fopen, F_FUNC_ALL, "ss=p", 0};
15:     FILE * r;
16:
17:     if (off || inRTL())
18:     {
19:         _SWITCHTORTLDS;
20:         return _org_fopen(_path, _mode);
21:     }
22:
23:     validateString(_path, 0, &func_fopen, rtlEBP);
24:     validateString(_mode, 0, &func_fopen, rtlEBP);
25:
26:     enterRTL();
27:     do
28:     {
29:         _SETRTLDS;
30:         r = _org_fopen(_path, _mode);
31:         _RESTOREDS;
32:     } while (r==NULL && geterrno==EMFILE && freeDelayed(&rsrc_fstream));
33:     leaveRTL();
34:
35:     if (r!=NULL)
36:     {
37:         ITEM _FAR * i;
38:         ITEM _FAR * ih;
39:
40:         i = newResource( (unsigned long) r,
41:             &rsrc_fstream, _org_fclose, &func_fopen,
42:             rtlEBP );
43:
44:         if (i==NULL)
45:         {
46:             _SETRTLDS;
47:             _org_fclose(r);
48:             _RESTOREDS;
49:             seterrno(ENOMEM);
50:             apiFail(&func_fopen, rtlEBP, NULL);
51:             return NULL;
52:         }
53:
54:         ih = newResource( (unsigned long) _fileno(r),
55:             &rsrc_handle, _org_close, &func_fopen,
56:             rtlEBP );
57:         if (ih==NULL)
58:         {
59:             _SETRTLDS;
60:             _org_fclose(r);
61:             _RESTOREDS;
62:             if (i->par1) free((char*) i->par1);
63:             freeResource(i);
64:             seterrno(ENOMEM);
65:             apiFail(&func_fopen, rtlEBP, NULL);
66:             return NULL;
67:         }
68:
69:         i->par1 = (unsigned long) strdup(_path);
70:         i->par2 = (unsigned long) ih;
71:         ih->par1 = (unsigned long) strdup(_path);
72:         apiRet(&func_fopen, rtlEBP, (unsigned long) r);
73:     }
74:     else
75:     {
76:         apiFail(&func_fopen, rtlEBP, NULL);
77:     }
78:
79:     return r;
80: }
__________________________________________________________________________
At line 14, the wrapper function declares a (static) descriptor for the function, in a manner similar to that previously described for the strcpy function. The descriptor at line 14 indicates that execution is currently in the "fopen" function. At line 15, a local FILE pointer variable is declared. The function then proceeds to check, at line 17, whether it should continue checking or simply call into the original fopen function. In a manner similar to that previously done for the strcpy wrapper function, if CodeGuard checking is turned off or execution already is within the RTL, as tested at line 17, the method simply calls the original fopen function at line 20 and returns. Otherwise, the wrapper function continues with checking as follows. At line 23, it validates the string for the "path" (i.e., the path to the file). In a similar manner, at line 24, the wrapper function validates the "mode" string (i.e., the "mode" specified for the standard C fopen function). At line 26, the wrapper function indicates that it is entering the RTL, by calling the enterRTL subroutine which sets a flag as previously described.
Thereafter, the wrapper function enters a "do/while" loop for calling the original fopen function. The call to the original fopen function can fail for a number of reasons. It can fail for a "legitimate" reason (i.e., non-programmer error), such as when the intended file no longer exists (on the storage device). On the other hand, it may fail because of an error in program logic in how the resource is used. A particular problem can arise where erroneous code continues to execute because a handle which is freed is then immediately returned for use upon the next request for a handle. To bring such a problem to light, the system of the present invention employs a "delayed free" approach. To keep the user from getting the same file handle back, when a file is "closed" in the user program the file is actually closed.
Then, however, the CodeGuard Library immediately reopens the file and places it on the "delayed free" list, thus keeping it from the RTL. In other words, the file is reopened in a way which the RTL does not know about. As a result, a subsequent call to fopen will not return the just-freed file handle. With this approach, the call to fopen will fail in instances where user code inappropriately relies on the return of the same handle. Since the system forces a different file handle to be used, the user code no longer works properly, thus bringing the problem (of inappropriate file handle use) to the forefront.
Since closed files placed on the "delayed free" list are not available, a call to fopen may fail because all the file handles are in use. However, the system does not want to fail simply because it has kept items on its delayed free list. Therefore, the "while" statement at line 32 will continue to free up resources on the delayed free list, so long as such resources exist and the original fopen function fails. This is done only in the case, however, where the failure occurs for lack of file handles. Eventually, either a valid file handle or NULL is returned by the call to the original fopen function. At this point, the function pops out of the loop and proceeds to line 33. There, the routine leaveRTL is invoked for resetting the RTL entry flag.
If a legitimate file handle is obtained (i.e., it is not equal to NULL), tested at line 35, the function then records the resource in its database of legal objects, as shown at line 40. Specifically, the wrapper function calls a routine, newResource, for recording the resource. The routine is passed the result "r", a descriptor indicating what type of resource it is (here, a "stream"), and the "owner" of the resource. The owner of the resource is the function which is responsible for closing the resource; for file handles, the owner is fclose.
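The interplay between the delayed-free list and the retry loop can be sketched with a toy handle pool. This is a simplified model under stated assumptions, not the CodeGuard implementation: the three-slot pool and the names raw_open, free_one_delayed, guarded_close, and guarded_open are hypothetical, and the errno test for EMFILE is folded into the single "no handle" result.

```c
#include <stddef.h>

#define POOL_SIZE 3        /* pretend the runtime owns only three handles */
#define NO_HANDLE (-1)

static int    in_use[POOL_SIZE];    /* 1 = slot held by user code or delay list */
static int    delayed[POOL_SIZE];   /* FIFO of handles parked on the delay list */
static size_t delayed_count = 0;

/* Stand-in for the original open call: lowest free slot, or NO_HANDLE. */
static int raw_open(void) {
    for (int h = 0; h < POOL_SIZE; h++) {
        if (!in_use[h]) { in_use[h] = 1; return h; }
    }
    return NO_HANDLE;
}

/* Stand-in for freeDelayed: really release the oldest parked handle.
 * Returns 0 when the delay list is empty. */
static int free_one_delayed(void) {
    if (delayed_count == 0) return 0;
    int h = delayed[0];
    for (size_t k = 1; k < delayed_count; k++) delayed[k - 1] = delayed[k];
    delayed_count--;
    in_use[h] = 0;
    return 1;
}

/* Guarded close: the slot stays occupied and is parked on the delay list,
 * so the next open cannot hand the same handle straight back.
 * Assumes h is a live handle previously returned by guarded_open. */
static void guarded_close(int h) {
    if (delayed_count < POOL_SIZE) delayed[delayed_count++] = h;
}

/* Guarded open: retry while the open fails for lack of handles and the
 * delay list can still give one back. */
static int guarded_open(void) {
    int h;
    do {
        h = raw_open();
    } while (h == NO_HANDLE && free_one_delayed());
    return h;
}
```

In this model a handle that was just closed is not reused by the next open; it is reused only once the pool is otherwise exhausted, at which point the loop releases it instead of failing.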
The fclose function is now the function expected to destroy the resource. In other words, the "owner" is the "closer." Finally, the newResource routine is also passed a descriptor for the RTL function (i.e., fopen) and the pointer to the stack frame (as previously described). The stack pointer is used here so that the system can save away a call tree, for recording the fact that a particular file handle is allocated at a particular code position with a particular calling sequence. At a later point, upon occurrence of an error, it is this information which is used to walk up the tree for determining exactly where this resource was allocated. If the system cannot log the file handle as a new resource, tested at line 44, the wrapper function gives back the file handle (by calling the original fclose at line 47), sets the operation to fail at line 50, and then returns NULL at line 51. This scenario generally will not happen; however, the test is included for defensive purposes.
The fopen function, because of the way it is implemented in the C programming language, is actually treated as creating two resources. A file stream contains two things: a file stream (proper) and a file handle. In the RTL, the file handle can be extracted and passed to other functions. For this reason, fopen is treated as creating two resources. The new resource which was recorded at line 40 was actually for the file stream (FILE*). The file handle, on the other hand, is recorded as a second resource, at line 54, by the second call to newResource. The parameters for this call to newResource are similar to those previously described, except that the first two parameters pass information specific to the file handle. The sequence of steps to validate this resource is similar to that just described at lines 34-52 for the file stream resource. After this second resource is checked, the wrapper function stores in a local buffer useful information about the function call.
For instance, the "path" string is stored, as indicated at lines 69 and 71 (for the respective resources). Although this information is ordinarily lost after a call to the standard library fopen function, the CodeGuard library remembers the information, so that it can provide the user with a better description of the resource in the event of an error. Similarly, the return value is logged at line 72. Once the information is recorded, the wrapper can return the result, as indicated by line 79. Lines 75-77 simply handle the case of an API failure--that is, when the RTL call fails for "legitimate" reasons. At line 76, the wrapper function simply logs the failure. Thereafter, the method proceeds to line 79 to return the result (here, NULL).
Complementing the fopen wrapper function is the fclose wrapper function. In an exemplary embodiment, the wrapper function may be constructed as follows:
__________________________________________________________________________
  1: /*
  2:  * fclose wrapper
  3:  */
  4:
  5: int _export _cg_fclose(
  6:     int (*_org_fclose) (FILE * _stream),
  7:     int (*_org_errno_help) (int setErrno),
  8:     int (*_org_close) (int _handle),
  9:     FILE * (*_org_freopen) (const char * _path, const char * _mode,
 10:         FILE * _stream),
 11:     unsigned int prevEBP,
 12:     void * retEIP,
 13:     FILE * _stream)
 14: {
 15:     static FUNCTION func_fclose = {"fclose", _cg_fclose, F_FUNC_ALL, "p=i", 0};
 16:     ITEM _FAR * i, *ih;
 17:     FILE * h;
 18:
 19:     if (off || inRTL())
 20:     {
 21:         _SWITCHTORTLDS;
 22:         return _org_fclose(_stream);
 23:     }
 24:
 25:     // checking parameter errors
 26:
 27:     if (NULL==( i=isGoodRscParam((unsigned long) _stream, &rsrc_fstream,
 28:         _org_fclose, &func_fclose, rtlEBP)))
 29:     {
 30:         if ( findDelayFreed((unsigned long) _stream, &rsrc_fstream,
 31:             _org_fclose))
 32:         {
 33:             seterrno(EBADF);
 34:             apiFail(&func_fclose, rtlEBP, NULL);
 35:             return EOF;
 36:         }
 37:         else // try to fclose (we can't fall thru to delay)
 38:         {
 39:             int r;
 40:             enterRTL();
 41:             _SETRTLDS;
 42:             r = _org_fclose(_stream);
 43:             _RESTOREDS;
 44:             leaveRTL();
 45:             if (r!=EOF)
 46:                 apiRet(&func_fclose, rtlEBP, (unsigned long) r);
 47:             else
 48:                 apiFail(&func_fclose, rtlEBP, EOF);
 49:             return r;
 50:         }
 51:     }
 52:
 53:     // fclose and reopen it so others will have access to it!
 54:
 55:     enterRTL();
 56:     _SETRTLDS;
 57:     h = _org_freopen("null", "rb", _stream);
 58:     _RESTOREDS;
 59:     leaveRTL();
 60:
 61:     ih = (ITEM *) i->par2;
 62:
 63:     // If we have a valid handle resource (fdopen may not) free it.
 64:     if (ih)
 65:     {
 66:         if (ih->par1)
 67:             free((char*) ih->par1);
 68:         freeResource(ih);
 69:     }
 70:
 71:     if (h==NULL && geterrno==EBADF)
 72:     {
 73:         if (i->par1) free((char*) i->par1);
 74:         freeResource(i);
 75:         apiFail(&func_fclose, rtlEBP, EOF);
 76:         return EOF;
 77:     }
 78:
 79:     // try to take hold of the same handle for delay stuff
 80:
 81:     if (h!=_stream)
 82:     {
 83:         if (i->par1) free((char*) i->par1);
 84:         freeResource(i);
 85:         if (h)
 86:         {
 87:             _SETRTLDS;
 88:             _org_fclose(h);
 89:             _RESTOREDS;
 90:         }
 91:         apiRet(&func_fclose, rtlEBP, 0);
 92:         return 0;
 93:     }
 94:
 95:     // resource free (delay free)
 96:
 97:     delayFreeResource( i, &func_fclose, rtlEBP,
 98:         (unsigned long) _stream, _org_fclose
 99:         _PASS_USERDS );
100:
101:     apiRet(&func_fclose, rtlEBP, 0);
102:     return 0;
103: }
__________________________________________________________________________
At the outset, at line 15, the wrapper declares a (static) descriptor data member, which stores information describing the fclose wrapper function. At lines 16-17, other local variables are declared. At line 19, the wrapper function performs the previously-described determination of whether CodeGuard is turned off or execution is already within the runtime library (i.e., a re-entrant scenario). In either case, the wrapper function simply calls into the original fclose function and returns, at line 22. Otherwise, CodeGuard checking is indicated and the method proceeds to line 27.
At line 27, the function performs a generic resource parameter check: is this a good resource parameter? This check is analogous to the validateString function (as previously described for string RTL functions) and the validate function (as previously described for memory RTL functions). For the present case of fclose, the resource is a generic resource; it is not memory but, instead, simply a "resource." Here, therefore, the determination is whether it is a good resource, specifically a good fstream resource for fclose. Thus, at line 27, the wrapper function looks up the resource to determine if it is a valid resource. If it is not determined to be a valid resource (i.e., the subroutine call at line 27 returns NULL), the function proceeds to line 30 to make an additional determination. Specifically, at line 30, the function determines if the resource has been previously freed, by calling a findDelayFreed subroutine, passing the resource and function descriptor (i.e., fclose descriptor) as arguments. If the resource is in fact on the delay freed list, a user error has occurred (i.e., the user is attempting to use a resource which has already been freed). In such a case, the wrapper function logs the error and returns a failure value (here, EOF--end of file--for fclose). If the resource is not on the delay freed list, the function proceeds to the "else" statement at line 37.
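The three-way decision made on the resource parameter (known and live, already freed, or unknown) can be sketched as follows. The two small tables and the names track, classify, and guarded_release are hypothetical stand-ins for the lookups performed by isGoodRscParam and findDelayFreed; the return codes are likewise illustrative only.

```c
#include <stddef.h>
#include <string.h>

typedef enum { RSC_LIVE, RSC_DELAY_FREED, RSC_UNKNOWN } RscState;

#define MAX_TRACKED 8
static unsigned long live_list[MAX_TRACKED];    /* resources currently valid   */
static size_t        live_count = 0;
static unsigned long freed_list[MAX_TRACKED];   /* resources on the delay list */
static size_t        freed_count = 0;

/* Index of id in list, or n when it is absent. */
static size_t index_of(const unsigned long *list, size_t n, unsigned long id) {
    size_t k = 0;
    while (k < n && list[k] != id) k++;
    return k;
}

/* Record a newly allocated resource; returns 0 when the table is full. */
static int track(unsigned long id) {
    if (live_count == MAX_TRACKED) return 0;
    live_list[live_count++] = id;
    return 1;
}

/* Live resources are checked first, then the delayed-free list. */
static RscState classify(unsigned long id) {
    if (index_of(live_list, live_count, id) < live_count) return RSC_LIVE;
    if (index_of(freed_list, freed_count, id) < freed_count) return RSC_DELAY_FREED;
    return RSC_UNKNOWN;
}

/* Returns 0 when the resource was live and is now parked on the delay list,
 * -1 when it had already been released (a user error), and 1 when it is
 * unknown and the request should be deferred to the real runtime call. */
static int guarded_release(unsigned long id) {
    switch (classify(id)) {
    case RSC_LIVE: {
        size_t k = index_of(live_list, live_count, id);
        live_list[k] = live_list[--live_count];     /* drop from the live list */
        if (freed_count == MAX_TRACKED) {           /* forget the oldest entry */
            memmove(freed_list, freed_list + 1,
                    (MAX_TRACKED - 1) * sizeof freed_list[0]);
            freed_count--;
        }
        freed_list[freed_count++] = id;
        return 0;
    }
    case RSC_DELAY_FREED:
        return -1;
    default:
        return 1;
    }
}
```

Releasing the same tracked resource twice yields 0 and then -1, which corresponds to the double-close case that the wrapper reports with EBADF; an identifier that was never tracked yields 1.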
Within the "else" statement, the function actually calls the original fclose function, passing the resource (stream). Here, particularly, is an example of where the wrapper function attempts to perform what the user wanted even though the resource cannot be validated. This is a rarely used case, but it still allows the CodeGuard library to defer the matter to the actual runtime library (i.e., the original fclose function). The result returned by the original fclose function is, in turn, employed at this point to determine whether the API call succeeded or failed. This result is returned, at line 49.
If the resource parameter was, on the other hand, valid at line 27, the method proceeds to line 55. Here, the resource has been determined to be valid. The function will, therefore, close it and reopen it, by calling the RTL function freopen, at line 57. As shown, the file is actually reopened into the NULL stream. The file handle is not returned to the user, but, instead, is maintained on the delayed-free list. As previously described, the actual RTL call (i.e., to freopen) is sandwiched between the enterRTL function and leaveRTL function (lines 55 and 59, respectively). At line 61, the file resource handle is saved to a local variable, ih. Then, if the resource handle is valid (tested at line 64), the file handle itself is freed, specifically at lines 66-68. At line 71, if the file could not be reopened (freopen returns NULL), the wrapper function simply frees up the resource altogether and returns fail (lines 73-76). Otherwise, the function proceeds to line 81 where it will attempt to hold onto the handle for the delayed-free list. In particular, at line 81, the wrapper function tests to determine whether the handle returned by freopen (i.e., at line 57) is the original handle for the file stream. If this condition does not hold true at line 81, then delayed free processing is not attempted.
The wrapper function only wants to delay freeing the resource; it does not want to tie up two resources. Therefore, if the wrapper function is not able to place the resource on the delayed free list, it closes the file handle (call to original fclose at line 88) and returns (line 92). If the same stream is returned by freopen, the wrapper function can place the resource on the delayed free list. This is done, in particular, by calling delayFreeResource, at line 97. This places the resource on a list of objects which are free and illegal (for present use). Once this is done, the wrapper can return the result, at line 102.
5. Specific Approach to Memory Allocators and Destroyers
In an exemplary embodiment, a malloc (memory allocation) wrapper function may be constructed as follows.
__________________________________________________________________________
 1: /*
 2:  * malloc wrapper
 3:  */
 4:
 5: void _FAR * _export _cg_malloc(
 6:     void _FAR * (*_org_malloc) (size_t _size),
 7:     void (*_org_free) (void _FAR * _block),
 8:     unsigned int prevEBP,
 9:     void * retEIP,
10:     size_t _size)
11: {
12:     static FUNCTION func_malloc = { "malloc", _cg_malloc,
13:         F_FUNC_ALL, "i=p", 0 };
14:     void _FAR * r;
15:
16:     if (off || inRTLM())
17:     {
18:         _SWITCHTORTLDS;
19:         return _org_malloc(_size);
20:     }
21:
22:     if (deref(_org_malloc)==dll_malloc)
23:     {
24:         // RTLDLL
25:         _org_malloc = new_dll_malloc;
26:         _org_free = new_dll_free;
27:     }
28:
29:     enterRTLM();
30:     do
31:     {
32:         _SETRTLDS;
33:         r = _org_malloc(_size);
34:         _RESTOREDS;
35:     } while (r==NULL && freeDelayed(&rsrc_memory));
36:     leaveRTLM();
37:
38:     if (r!=NULL)
39:     {
40:         ITEM _FAR * i;
41:
42:
43:         // if logging regions for access validation, do it
44:
45:         if (!insertRegion(r, _size, 2, 2))
46:         {
47:             _SETRTLDS;
48:             _org_free(r);
49:             _RESTOREDS;
50:             apiFail(&func_malloc, rtlEBP, 0);
51:             return NULL;
52:         }
53:
54:         // if log resources, do it
55:
56:         i = newResource( (unsigned long) r, &rsrc_memory,
57:             _org_free, &func_malloc, rtlEBP );
58:
59:         if (i==NULL)
60:         {
61:             _SETRTLDS;
62:             _org_free(r);
63:             _RESTOREDS;
64:             deleteRegion(r);
65:             apiFail(&func_malloc, rtlEBP, 0);
66:             return NULL;
67:         }
68:         i->par2 = _size;
69:         i->par1 = (unsigned long) r;
70:         uninitFillBlock(i);
71:
72:         apiRet(&func_malloc, rtlEBP, (unsigned long) r);
73:
74:     }
75:     else
76:     {
77:         // if log API errors, do it here
78:
79:         if (_size==0)
80:             apiRet(&func_malloc, rtlEBP, (unsigned long) r);
81:         else
82:             apiFail(&func_malloc, rtlEBP, 0);
83:     }
84:     return r;
85: }
__________________________________________________________________________
As shown at line 12, the function declares a "static" descriptor for identifying itself as the "malloc" function. At line 14, a local variable is declared; it will later be employed to store the result. At line 16, the wrapper function performs the previously-described test to determine whether CodeGuard checking is turned off or current execution is already in the runtime library. If either of these conditions holds true, the method, at line 19, calls into the original malloc runtime library function and then returns. Otherwise, execution continues to line 22. There, DLL (dynamic link library) housekeeping is undertaken, for determining whose malloc function is to be invoked. At line 29, the wrapper function indicates that it is entering the runtime library memory manager (RTLM).
At lines 30-35, the now-familiar "do/while" loop is established. Here, the function attempts an allocation, by calling into the original malloc (line 33). The call will be repeated, however, if the result of the call is unsuccessful and delayed free memory exists. Ultimately, the loop will either obtain a memory block (valid pointer), or fail (pointer set equal to NULL). After completing the loop, the wrapper function resets the flag indicating that it was in the runtime library memory manager, at line 36.
At line 38, the function tests whether a pointer to a valid memory block was obtained (i.e., pointer not equal to NULL). The resource is logged differently than was previously described for a file (handle) resource. Memory has a variable size. As a result, the size of the region must also be tracked. According to the present invention, therefore, the method at this point logs that there is a particular region of memory of a particular size which is legitimate. The recorded length is used, for instance, in the previously-described strcpy function call. The region is then inserted or logged as a valid region, at line 45.
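Because memory blocks vary in size, the allocator wrapper has to remember each block's length along with its address. A minimal sketch of that bookkeeping follows; the fixed-size table and the names insert_region, lookup_size, and tracked_malloc are hypothetical, and the rollback on a failed insert mirrors what the listing does at lines 45-51.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical record of one valid heap region: address plus length. */
typedef struct {
    void  *base;
    size_t size;
} MemRegion;

#define MAX_MEM_REGIONS 4           /* deliberately tiny so overflow is easy to see */
static MemRegion mem_regions[MAX_MEM_REGIONS];
static size_t    mem_region_count = 0;

/* Log a region as valid; returns 0 when the table is full. */
static int insert_region(void *base, size_t size) {
    if (mem_region_count == MAX_MEM_REGIONS) return 0;
    mem_regions[mem_region_count].base = base;
    mem_regions[mem_region_count].size = size;
    mem_region_count++;
    return 1;
}

/* Look up the recorded length of the block starting at base.
 * Returns 1 and stores the length in *out when the block is tracked. */
static int lookup_size(const void *base, size_t *out) {
    for (size_t k = 0; k < mem_region_count; k++) {
        if (mem_regions[k].base == base) {
            *out = mem_regions[k].size;
            return 1;
        }
    }
    return 0;
}

/* Allocate and log in one step. If the region cannot be logged, the block
 * is handed back to the allocator and the call is reported as a failure. */
static void *tracked_malloc(size_t size) {
    void *p = malloc(size);
    if (p == NULL) return NULL;
    if (!insert_region(p, size)) {
        free(p);
        return NULL;
    }
    return p;
}
```

A later check such as the destination validation for strcpy can then compare the number of bytes it needs against the length recorded for the block.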
If the logging operation fails, the memory block is freed (line 48) and the method returns NULL (line 51). In addition to the just-described logging of a "region" of memory, the function, at line 56, logs the resource as a valid new resource. If, for some reason, the logging of the resource fails (i.e., the local variable, i, is set equal to NULL), the method frees the memory block (line 62), deletes the region from its list of valid regions (line 64), asserts an API failure (line 65), and returns NULL (line 66). On the other hand, if the resource is obtained and logged, at line 68 the size of the resource is recorded and at line 69 the pointer (result) to the resource is recorded. This is followed by filling the memory block with garbage information, at line 70. As previously described, this subroutine call includes method steps for placing unique identifiers during the fill, so that the region can be uniquely identified. After recording the API return value (line 72), the method returns the result, at line 84. In the instance that the resource was not obtained (tested at line 38), the method enters the "else" statement at line 75. Here, the method records a failure (unless the requested size was zero, in which case the result is logged as an ordinary return at line 80) and, thereafter, returns.
Complementing malloc is free. In an exemplary embodiment, a free wrapper function may be constructed as follows:
__________________________________________________________________________
 1: /*
 2:  * free wrapper
 3:  */
 4:
 5: void _export _cg_free(
 6:     void (*_org_free)(void _FAR *_block),
 7:     /* size_t (*_org_msize)(void _FAR *_block), */
 8:     unsigned int prevEBP,
 9:     void *retEIP,
10:     void _FAR *_block)
11: {
12:     static FUNCTION func_free = {"free", _cg_free, F_FUNC_ALL, "p", 0};
13:     ITEM _FAR *i;
14:
15:     if (off || inRTLM())
16:     {
17:         _SWITCHTORTLDS;
18:         _org_free(_block);
19:         return;
20:     }
21:
22:     if (deref(_org_free) == dll_free) _org_free = new_dll_free;
23:
24:     if (_block == NULL) return;
25:
26:     // checking parameter errors
27:
28:     if (NULL == (i = isGoodRscParam((unsigned long)_block,
29:             &rsrc_memory, _org_free, &func_free, rtlEBP)))
30:     {
31:         // at this point we just want to emit a message
32:         // (via findDelayFreed()) and then quit
33:         findDelayFreed((unsigned long)_block, &rsrc_memory, _org_free);
34:         apiRet(&func_free, rtlEBP, 0);
35:         return;
36:     }
37:
38:     // resource free (delay free)
39:
40:     delayFreeRegion(_block);
41:     delayFreeResource(i, &func_free, rtlEBP, (unsigned long)_block,
42:             _org_free _PASS_USERDS);
43:     apiRet(&func_free, rtlEBP, 0);
44: }
__________________________________________________________________________
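Before the line-by-line walkthrough, the delayed-free idea behind this wrapper can be modeled in a few lines of C. This is a minimal sketch under stated assumptions, not the CodeGuard implementation: the slot table, the names guarded_malloc, guarded_free and release_delayed, and the FreeResult codes are all hypothetical. The point it illustrates is that a freed block is only marked, not handed back to the allocator, so a second free of the same pointer can still be recognized and reported.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical bookkeeping for illustration; not the patent's data structures. */
enum { MAX_BLOCKS = 64 };

typedef enum { SLOT_EMPTY, SLOT_LIVE, SLOT_DELAY_FREED } SlotState;
typedef enum { FREE_OK, FREE_NULL, FREE_TWICE, FREE_UNKNOWN } FreeResult;

typedef struct {
    void *ptr;
    SlotState state;
} Slot;

static Slot slots[MAX_BLOCKS];  /* zero-initialized: every slot starts SLOT_EMPTY */

static Slot *find_slot(const void *p)
{
    for (size_t k = 0; k < MAX_BLOCKS; k++)
        if (slots[k].state != SLOT_EMPTY && slots[k].ptr == p)
            return &slots[k];
    return NULL;
}

/* Allocate a block and log it; if it cannot be logged, give it back. */
void *guarded_malloc(size_t size)
{
    void *p = malloc(size);
    if (p == NULL)
        return NULL;
    for (size_t k = 0; k < MAX_BLOCKS; k++) {
        if (slots[k].state == SLOT_EMPTY) {
            slots[k].ptr = p;
            slots[k].state = SLOT_LIVE;
            return p;
        }
    }
    free(p);
    return NULL;
}

/* Mark the block as freed without releasing it, so a repeat free is caught. */
FreeResult guarded_free(void *p)
{
    Slot *s;

    if (p == NULL)
        return FREE_NULL;          /* freeing NULL is valid and does nothing */
    s = find_slot(p);
    if (s == NULL)
        return FREE_UNKNOWN;       /* never logged as a memory resource */
    if (s->state == SLOT_DELAY_FREED)
        return FREE_TWICE;         /* already freed: do not pass on to free() */
    s->state = SLOT_DELAY_FREED;
    return FREE_OK;
}

/* Really release everything on the delayed-free list; returns the count. */
size_t release_delayed(void)
{
    size_t released = 0;

    for (size_t k = 0; k < MAX_BLOCKS; k++) {
        if (slots[k].state == SLOT_DELAY_FREED) {
            assert(slots[k].ptr != NULL);
            free(slots[k].ptr);
            slots[k].ptr = NULL;
            slots[k].state = SLOT_EMPTY;
            released++;
        }
    }
    return released;
}
```

In this model the caller decides when to invoke release_delayed; the malloc wrapper described earlier instead retries a failed allocation while delayed-free memory exists.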
In a manner similar to that previously described, the function declares a (static) descriptor at line 12, and declares a local variable, at line 13, for storing a result. At line 15, the function performs the previously-described test for determining whether CodeGuard checking is turned off or execution is already within the runtime library memory manager. If either of these conditions holds true, the method calls the original free API function (line 18) and then returns (line 19). At line 22, the function performs DLL housekeeping. At line 24, if the block (pointer) is equal to NULL, the function simply returns. As defined by the C and C++ programming languages, it is a valid operation to "free" a NULL pointer; see e.g., Ellis M. and Stroustrup B., The Annotated C++ Reference Manual, Addison-Wesley, 1990, the disclosure of which is hereby incorporated by reference. The wrapper function can treat this as a special case, for shortcutting further checking.

At line 28, the function looks at the actual resource to determine whether it is valid. If it is not legitimate, the function determines whether it is a resource which has already been freed (determined by the call at line 33 to the findDelayFreed subroutine). If it is an already freed resource, the subroutine call will immediately signal the error. Thereafter, the function returns, at line 35. Note that, at this point, the requested free operation is not passed on to the original free RTL function, since freeing a resource twice leads to an unstable system. If the resource is valid, at line 28, the method proceeds to line 40, where it will place the block (i.e., region) on its delayed-free list of regions. In a corresponding manner, the method places the resource on its delayed-free list of resources, at line 41. Thereafter, the function may return.

Compiler Modification

A. General

Certain programming errors can be detected at runtime if direct compiler support is provided. Consider the following example:
______________________________________
1: int i;
2: char a[10], b[10];
3: char *p, *q;
4:
5: a[i + 3] = 0;     // valid operation for 0 <= i + 3 < 10 only
6: p = &a + i + 3;   // valid operation for 0 <= i + 3 < 10 only
______________________________________
The above sets forth simple C definitions, including declaration of arrays of characters and pointers to characters. Consider the access to the array a at index [i+3]. Here, i+3 must evaluate such that the boundaries of the array are not violated. Specifically, i+3 must be within the range of 0 to 10 (exclusive of 10). Error checking of this requires compiler support, for specifying the appropriate boundaries of a. In the statement shown at line 6, a similar problem occurs. Here, the statement must evaluate within the range of the array. Note, however, that here the offset i+3 can be equal to 10 since, in the C programming language, a pointer can point to the next element after an array. Although it is not permissible to access this element (because it does not exist), it is permissible to have a pointer pointing to it.

Consider a third type of problem using the above-described data members.

i = p - q; // valid operation if p and q point into the same block

This represents the third type of problem--addition and/or subtraction of pointers--which requires checking. Given the expression p-q, p and q must point into the same block of memory. Consider the following problem which arises in C, as shown in FIG. 3A. Suppose two arrays, a and b, are allocated, with a being allocated just in front of b. Consider the following operation:

p = &a + 10;

This operation is legal, because it points to the next element after a (which, recall, is legal in C). Consider further that q is initialized in the middle of b as follows:

q = &b + 5;

Taking the pointer difference p-q appears legal, since p points to the first element of b and q points to the middle of b. Note, however, that p was obtained by adding an offset to a, so the difference actually spans two distinct blocks, which is not legal. In order to detect such a mistake, the compiler in the system of the present invention inserts "padding" bytes between variables. This is illustrated diagrammatically in FIG. 3B.
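The effect of the padding on the p-q check can be sketched in C as follows. This is an illustrative model only: the Block record, the share_block function, and the numeric addresses in the usage note are assumptions made for the example, not part of the CodeGuard Library. Blocks are modeled as integer address ranges, and an address counts as belonging to a block if it lies inside the block or exactly one element past its end.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one tracked memory block (address range). */
typedef struct {
    uintptr_t start;  /* first byte of the block */
    size_t size;      /* number of bytes in the block */
} Block;

/* An address belongs to a block if it is inside it or one past its end. */
static int in_or_one_past(const Block *b, uintptr_t addr)
{
    return addr >= b->start && addr - b->start <= b->size;
}

/* Returns 1 if some single block can own both p and q, else 0. */
int share_block(const Block *blocks, size_t count, uintptr_t p, uintptr_t q)
{
    assert(blocks != NULL || count == 0);
    for (size_t k = 0; k < count; k++)
        if (in_or_one_past(&blocks[k], p) && in_or_one_past(&blocks[k], q))
            return 1;
    return 0;
}
```

With a at address 100 and b directly behind it at 110 (both 10 bytes), the one-past-the-end address of a coincides with the start of b, so share_block cannot reject the pair (110, 115). With one padding byte, b starts at 111, the address 110 belongs only to a, and the same kind of pair (110, 116) is rejected.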
Since this padding exists between data members at time of compile, the support is at the level of the compiler. In an exemplary embodiment, the padding is at least one byte (or machine addressable unit). Preferably, padding is added between each and every local variable, since each is treated as a separate memory block which is independent of others. Note, however, that parameters (i.e., arguments passed to functions) cannot be padded, without changing the calling convention (e.g., pascal, cdecl, and the like). In a preferred embodiment, therefore, all local and global variables are padded, except for arguments passed to functions.

In order to validate an operation, the CodeGuard Library needs the following information:

(1) Kind of operation: that is, memory access, pointer arithmetic, and the like;
(2) The operands of the operation: memory address and access size (i.e., size of the particular data members); and
(3) Location and size of allocated memory blocks.

Each will be described in further detail.

B. Location and size of allocated memory blocks

Compiler support is not required for blocks allocated by malloc, the C runtime library routine for dynamic allocation. As previously described, CodeGuard already knows about the blocks that have been allocated by malloc, as it tracks them as they are allocated. On the other hand, the CodeGuard Library does not know about local and global variables for each module. Accordingly, compiler support is provided in the form of descriptors for indicating the local and global variables. As previously described, for each module, the compiler builds one descriptor for all global variables (i.e., static data) of the module. The address of the global descriptor is passed to the CodeGuard library during execution of startup code of the module. The compiler also builds one descriptor for all local variables of each function. The actual descriptor for the local variables is stored after the code of each function.
The address of the local descriptor is stored at a fixed offset in the stack frame of each guarded function. In a preferred embodiment, a special ID or "magic number" is stored at a fixed offset in the stack frame (EBP-4), for identifying a function as a "code guarded" function. Upon encountering this at runtime, the system may then read the accompanying address of the descriptor in the stack frame for accessing the descriptor. After referencing these descriptors, the CodeGuard library knows where all allocated memory blocks (including simple variables) reside, as well as blocks allocated by malloc.

C. Type of operation and operands of operation

Consider again the following statement:

a[i+3] = 0;

This operation will be performed by one instruction:

mov byte ptr [EBP+EAX-17], 0

This instruction can be decomposed into several offsets as follows. The a variable resides on the stack; the i variable resides in the EAX register (Intel). The addition of +3 corresponds to a stack offset of -17--that is, the offset of a plus 3 (-20+3). To check this at runtime, the following must be communicated to the CodeGuard Library: the starting address of the array, the offset within it, and how much of it is accessed. The operations which must be checked, therefore, include pointer arithmetic and access of data at the pointed-to address. The generated code is as follows:
______________________________________
push 1      ; element size
push -20    ; offset of a
push EAX    ; index of array
push 0      ; no index scaling (i.e., 2^0)
push 3      ; offset
call _CG_LDA_EOXSY
______________________________________
As shown, the system first pushes the size of the element--one byte of the array. This is followed by a push of the offset of a, at the second line. The index of the array (i) is then pushed, followed by the index scaling, here 0 (i.e., a scale factor of 1, which is 2 raised to the power of 0). Finally, before the call, the offset (3) is pushed. The procedure is called using a mnemonic which indicates the check to be performed. As indicated in FIG. 4, the mnemonic starts with an identifying sequence 410 of _CG_. This is followed by character 421, L, which indicates local access. Since the access is local, EBP is implied (and, therefore, need not be pushed onto the stack). Character 423, D, indicates that the checked operation includes a pointer dereferencing and, thus, includes a memory access operation. Character 425, A, indicates that the call includes pointer arithmetic. Finally, the mnemonic includes a parameters list 431. A whole list of such functions is possible, depending on which operations are being performed. Appended herewith as Appendix A are source listings demonstrating possible checks in an exemplary embodiment.

The compiler inserts the relevant call in front of instructions which access memory or perform pointer arithmetic. The CodeGuard Library, in turn, employs its database of valid memory blocks to check whether the pointer arithmetic and access are valid. Specifically, the CodeGuard Library can check whether the access is completely within a single block (i.e., no block overrun) and check that pointer arithmetic does not span two blocks. The calls are inserted by the code generator of the compiler before each operation to be checked. The additional code is inserted during the regular code generation phase of compilation and, thus, requires no additional compilation phase.

D. Compiler operation

The operation of compilation by a compiler comprises two main components: a front end and a back end.
The "front end" of the compiler parses the source program and builds a parse tree--a well known tree data structure representing parsed source code. The "back end" traverses the tree and generates code (if necessary) for each node of the tree, in a post-order fashion. For an introduction to the general construction and operation of compilers, see Fischer et al., Crafting a Compiler with C, Benjamin/Cummings Publishing Company, Inc., 1991, the disclosure of which is hereby incorporated by reference. In a preferred embodiment, CodeGuard compiler support is provided, without requiring an additional compilation phase. Instead, the additional CodeGuard calls added by the compiler are generated at the same time as the regular calls (i.e., the calls which result from the user's source code).

Consider, again, the statement:

a[i+3] = 0;

Above, the variable a is indexed by i+3. In the system, code generation is delayed when walking or traversing the parse tree, until the addressing modes of the processor can be fully exploited. A parse tree for the above statement is shown as tree 500, in FIG. 5. The code generator will not necessarily generate code for each node, however. For instance, the compiler would not emit code which would load i in a register, load 3 in a register, and then emit a register instruction for adding the two registers. Instead, the entire expression can be executed in a single instruction, using a different addressing mode of the processor. In the example above, therefore, only one instruction is emitted for the top node, coding the entire tree 500. In a preferred embodiment, the CodeGuard call is emitted just before the instruction(s) performing the operation which requires access to memory. Thus for the example above, this would be just before the assignment--at the top of the tree. Further, the compiler builds a CodeGuard call when operands are created, not when used. For the example above, for instance, there are many possible uses of the operands.
To compile a check for each possible use of an operand would be inefficient (and highly impractical). A check would have to be inserted, for the above example, in all possible uses of a[i+3]. Therefore, the call is generated in the system of the present invention when the operand is built, not when it is used. The actual nodes which can build an operand in memory are relatively few. In an exemplary embodiment, they comprise the following:

index node: []
field node: .
dereference node: *()
address node: &
pointer addition node: +
pointer subtraction node: -

As shown, nodes exist to index a variable (as shown in the above array example), access a field of a variable (e.g., using [struct].[field] syntax), dereference a pointer (e.g., *p=0), take the address of an operand (e.g., p=&a), and perform pointer arithmetic (both addition and subtraction). Although a simple variable can be used as an operand, it is not really of interest. Specifically, the variable is used without change (e.g., i=5). Note that here a simple variable does not require checking, since there is nothing to "go wrong." Instead, calls to CodeGuard are reserved for operations which can go wrong. This is illustrated diagrammatically in FIG. 6. As shown, the assignment of i=5 (shown at 601) corresponds to the parse tree of 611. Memory space is allocated by the compiler for i; the address is not manipulated by the programmer. Accordingly, modification is made in the compiler for processing of those particular nodes which involve memory access (i.e., operations which can "go wrong").

Care must be taken regarding the context of a particular operation. Consider the following statement:

p[i].x = 0;

A parse subtree 700 for the above statement is shown in FIG. 7A. A check should not be made at the index node (i.e., node 701), since the whole structure is not being accessed; instead, only the field x of the structure is being accessed. Accordingly, checking is deferred until node 703 is reached.
In other words, checking is deferred until the particular address of interest (i.e., field member within a structure) is resolved. FIG. 7B illustrates this diagrammatically. Consider an array p (i.e., memory block 750). The range associated with a particular element of array p--that is, p[i]--is shown by memory block 751. Note, however, that this memory block is not accessed. Instead, the small range associated with field x is accessed, as shown by memory block 753. Stated generally, therefore, checking is deferred until the system resolves a particular memory address which it knows will actually be accessed.

Consider, in contrast, the following statement:

p[i] = p[j];

A parse subtree for the above statement is shown as subtree 800, in FIG. 8. Here, the whole structure (i.e., p[i]) is actually accessed (not just a particular member). In this case, two checks are required. Specifically, a check is made at node 801, when the structure p[j] is read. Additionally, a check is made at node 803, when the structure p[i] is written. All told, the decision whether to check a particular node depends on what is "on top" of the node--that is, its particular context (relative to other nodes). As the back end of the compiler is processing nodes of the parse tree, whether it inserts a CodeGuard check in the executable binary image being generated depends on the above-described analysis of context.

In addition to the above-illustrated checking of access to a particular region of memory, the system of the present invention also, when appropriate, checks a particular address (without access to that address). Consider the following statement:

i = &p[i].x;

Here, instead of accessing p[i].x, the statement takes the "address" of p[i].x. The parse subtree for the statement is shown as parse subtree 900, in FIG. 9. As shown, the compiler does not insert a check for p[i].x itself; in other words, node 901 and node 903 are not checked.
The check is, instead, deferred until node 905, which corresponds to the particular node where a check of the address of p[i].x can be appropriately inserted. According to the present invention, therefore, sometimes a check is made for the "access" to an address, and other times a check is made for the address itself (i.e., without access). In this manner, the system can appropriately handle situations in which the "address" itself is valid but access to that address is invalid. In the instance of the statement shown in FIG. 9, presence of the & node inhibits the "access" check in the field node, thus leading the compiler to insert a "value" check.

The particular modification to the compiler for performing the above-described checks is as follows. In general operation, the compiler first traverses or walks down a parse tree and, then, generates program code when coming back up the tree. When walking down the tree, the compiler marks those nodes that do not need to be checked. Referring back to FIG. 9, for instance, when the compiler walks from the address node 905 (where a check is performed) to node 903 and node 901, it marks each of these children nodes (i.e., node 903 and node 901) as "no check." In a corresponding manner, a field node (e.g., node 703, shown in FIG. 7A) will indicate to its children nodes that they should not check, since the field node itself is performing the check. According to the present invention, therefore, each node of the parse tree includes a flag indicating whether a CodeGuard check should be performed. When the flag for a node is set to indicate that a check should be performed, "no check" is propagated to subsequent "children" nodes. When traversing back up the tree, after code has been generated for a particular node, the compiler inserts a CodeGuard check for the node if the corresponding flag has been set for checking (specifically, the check-inhibiting flag is absent).

Referring now to FIG. 10, the general processing of an index node, n, will be illustrated. FIG. 10 illustrates a parse subtree 1000, including an index node 1001. When proceeding down the tree, the compiler performs the following:
______________________________________
down:  mark n -> base for no access check
       process n -> base
       process n -> index
       generate code for n depending on the
       addressing modes returned by the processing
       of n -> base and n -> index.
______________________________________
When proceeding back up the tree, the compiler performs the following:

up: if n is not marked, generate a call to CodeGuard, checking the access to n.

A particular problem must be addressed, however. Consider the following case:

(*p)[i, j, k] = ...;

The parse subtree for the statement is shown as parse subtree 1100, in FIG. 11. This is an example of a pointer pointing to an array; the pointer is de-referenced to the array which is then indexed with particular indexes, here, i, j and k. According to the above-described approach, the check will be performed at the top index node (i.e., node 1101), which will, therefore, inhibit checking in lower nodes (i.e., node 1103 and node 1105). It suffices to perform only one check in the case that the base address, p, is known. For the example of FIG. 11, first the array is pointed to (i.e., the base address of the array), then indexes are added (i, j, k) to generate the end address (corresponding to node 1101). The system must, however, be sure that it did not overrun several blocks of memory, so it is necessary to know at which point it started. If a simple variable exists at the beginning, the system always knows the address (at a given instance). If, on the other hand, the address is volatile (e.g., adding indexes i, j, and k to the particular register containing the pointer), the starting base or address is destroyed--the starting base is not preserved. In the case of a volatile address, the system of the present invention propagates down information indicating that the starting address is required to perform the check. In response to this information, the starting address is preserved or saved. In other words, when the compiler walks down the tree, it communicates to lower nodes that the base address of an operand will be needed for checking. This information is propagated all the way down to the base.
There, if the base is a (simple) variable, its address cannot be modified; thus, it is not necessary to really save the address. If, on the other hand, the base is a de-referenced pointer (e.g., as illustrated in FIG. 11), the pointer value should be preserved, such as by saving it on the stack. Also at compile time, the system saves information indicating where on the stack this value can be found when the system arrives at the top index node at runtime. In any case, the location of the base address, saved or not, is pushed onto a compile-time stack, and the node is marked as "saved." The saved information is then propagated up the tree until it is used by a call to the CodeGuard Library. There, the location of the base address is popped from the compile-time stack. If the retrieved information indicates that the base address was saved on the run-time stack, code is generated to pop the base address from the stack before passing it to the CodeGuard Library. Therefore, either the base address is saved (and available from the stack), or, in the case of a simple variable, the offset is saved in a compile-time stack.

E. Compiler implementation

In an exemplary embodiment, the following flags are employed for walking the nodes:

1. CG_NOVALCK
2. CG_NOACCCK
3. CG_SAVEVAL
4. CG_SAVEADR
5. CG_SAVED
6. CG_SAVEBVAL
7. CG_SAVEBADR

The first flag indicates "no value check." The second flag indicates "no access check." The third flag indicates "save value" of the base. The fourth flag indicates "save address" of the base. The fifth flag indicates that these have been "saved." The sixth flag indicates "save base value." The seventh flag indicates "save base address." The sixth and seventh flags are required to address the problem that leaf nodes of the tree are preferably not marked. The leaf nodes of the tree are elements of the symbol table. These elements of the symbol table are, in turn, used by multiple trees.
If the above flags were stored in the symbol table, it is possible that processing of another tree might modify the flag inappropriately. In accordance with the present invention, therefore, when a symbol must be marked, the system actually marks the node on top of the symbol, indicating that it is for the base. Thus, the sixth and seventh flags are provided for indicating that the base value and base address should be saved, respectively. Two save flags are provided because either the "address" or the "value" can be saved. Consider a statement which includes de-referencing a pointer. In the corresponding parse subtree, the de-reference node transforms the "save address" flag into the "save value" flag, while propagating it downward. This is illustrated in FIG. 12A. Below the de-reference node 1201, it is the value of the pointer which is saved. The complementary case is illustrated in FIG. 12B. Here, the address node 1251 does the opposite: it transforms a "save value" flag into a "save address" flag when propagating it downward.

Actual modification to the compiler includes extending existing node processing functions by adding calls to CGDown and CGUp functions which perform the actual marking, base saving, and generation of calls into the CodeGuard library. In an exemplary embodiment, the CGDown function may be constructed as follows (using the C programming language).
__________________________________________________________________________
static
void CGDown(Node *n, Node *base, CGFlags f)
/*
   propagate CG_SAVExxx flags down
   add f to base->fld.cgflags if base is a node
   convert and add f to n->fld.cgflags if base is a symbol
*/
{
    CGFlags g;

    if (base->g.kind == NK_CSE)
        base = base->op.right;
    g = n->fld.cgflags & (CG_SAVEADR | CG_SAVEVAL);
    if (n->g.kind == NK_ADR)
    {
        if (g & CG_SAVEVAL)
            g = CG_SAVEADR;
    }
    else
    {
        g &= ~CG_SAVEVAL;   /* is propagated down through f if needed */
        if (n->g.kind == NK_INDIR)
        {
            if (g & CG_SAVEADR)
                g = CG_SAVEVAL;
        }
    }
    f |= g;
    if (base->g.kind < NK_NULL)
    {
        if (f & CG_SAVEADR)
            n->fld.cgflags |= CG_SAVEBADR;
        else if (f & CG_SAVEVAL)
        {
            if (base->g.flags & SF_REG)
            {
                if (base->g.kind == SY_ABSVAR)
                    /* should actually not occur: always a NK_FIELD
                       on top of pseudoreg (pointer cast),
                       so CG_SAVEVAL is masked out */
                    base->g.c.mr = base->sym.val.intVal;
                CGSaveVal(base);   /* save it before it's too late */
                n->fld.cgflags |= CG_SAVED;
            }
            else
                n->fld.cgflags |= CG_SAVEBVAL;
        }
    }
    else
    {
        base->fld.cgflags |= f;
        assert((base->fld.cgflags & (CG_SAVEADR | CG_SAVEVAL))
               != (CG_SAVEADR | CG_SAVEVAL), "C");
    }
}
__________________________________________________________________________
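The flag conversion at the top of CGDown can be isolated into a small stand-alone function to make the address/de-reference symmetry of FIGS. 12A and 12B easier to follow. The following is a minimal sketch, assuming made-up bit values for the two save flags and a reduced Kind enumeration standing in for NK_ADR and NK_INDIR; the function name propagate_save_flags is hypothetical and the compiler's Node type is deliberately left out.

```c
#include <assert.h>

/* Assumed bit values; the listing does not give the real ones. */
enum { CG_SAVEVAL = 1, CG_SAVEADR = 2 };

/* Reduced stand-ins for the node kinds NK_ADR and NK_INDIR. */
typedef enum { KIND_OTHER, KIND_ADR, KIND_INDIR } Kind;

/*
 * Mirrors the first half of CGDown: take the save flags set on node n,
 * convert them according to the kind of n, and merge them into the
 * flags f that are being passed down to the base operand.
 */
unsigned propagate_save_flags(Kind kind, unsigned node_flags, unsigned f)
{
    unsigned g = node_flags & (CG_SAVEADR | CG_SAVEVAL);

    assert((f & ~(unsigned)(CG_SAVEADR | CG_SAVEVAL)) == 0);
    if (kind == KIND_ADR) {
        if (g & CG_SAVEVAL)
            g = CG_SAVEADR;       /* address node: value -> address */
    } else {
        g &= ~(unsigned)CG_SAVEVAL;  /* value requests travel through f only */
        if (kind == KIND_INDIR) {
            if (g & CG_SAVEADR)
                g = CG_SAVEVAL;   /* de-reference node: address -> value */
        }
    }
    return f | g;
}
```

A "save address" request arriving at a de-reference node thus leaves it as "save value", and a "save value" request arriving at an address node leaves it as "save address"; any other node kind passes "save address" through unchanged and drops "save value" unless the caller supplies it in f.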
In an exemplary embodiment, the CGUp function may be constructed as follows:
__________________________________________________________________________
static
void CGUp(Node *n, Node *base, CGFlags f)
/*
   save base symbol if marked
   propagate CG_SAVED flag up if base is a node
   check if needed, and save n if needed
*/
{
    if (base->g.kind == NK_CSE)
        base = base->op.right;
    if (base->g.kind < NK_NULL)
    {
        if (n->fld.cgflags & CG_SAVEBADR)
        {
            CGSavAdr(base);
            n->fld.cgflags |= CG_SAVED;
        }
        else if (n->fld.cgflags & CG_SAVEBVAL)
        {
            CGSaveVal(base);
            n->fld.cgflags |= CG_SAVED;
        }
    }
    else
        n->fld.cgflags |= base->fld.cgflags & CG_SAVED;
    if (f & CG_CHECKACC)
    {
        if (n->fld.cgflags & CG_NOACCCK)
        {
            if (n->fld.cgflags & CG_NOUPCALL)
                CGCheckAdr(n);
        }
        else
            CGCheckAcc(n);
    }
    if ((f & CG_CHECKVAL) && !(n->fld.cgflags & CG_NOVALCK))
        CGCheckVal(n);
    if ((n->fld.cgflags & (CG_SAVEADR | CG_SAVEVAL))
        && !(n->fld.cgflags & CG_SAVED))
    {
        if (n->fld.cgflags & CG_SAVEADR)
            CGSavAdr(n);
        else
            CGSaveVal(n);
        n->fld.cgflags |= CG_SAVED;
    }
}
__________________________________________________________________________
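The down/up discipline of the two listings can be tried out on a toy parse tree. The sketch below is a simplified model under stated assumptions, not the compiler's code: it handles only index, field, and address nodes over a simple variable, uses a single NO_ACCESS_CHECK flag, and records the checks it would emit in a text log instead of generating calls. All names in it (Node, CheckLog, walk) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

typedef enum { N_VAR, N_INDEX, N_FIELD, N_ADDR } NodeKind;
enum { NO_ACCESS_CHECK = 1 };

typedef struct Node {
    NodeKind kind;
    struct Node *base;   /* operand below this node; NULL for a variable */
    unsigned flags;
    const char *label;   /* source text of the operand, for the log */
} Node;

typedef struct { char text[128]; } CheckLog;

/* Append "what:label;" to the log, truncating if the log is full. */
static void log_check(CheckLog *log, const char *what, const char *label)
{
    size_t used = strlen(log->text);
    snprintf(log->text + used, sizeof log->text - used, "%s:%s;", what, label);
}

void walk(Node *n, CheckLog *log)
{
    if (n->kind == N_VAR)
        return;                       /* simple variable: nothing to check */
    assert(n->base != NULL);

    /* Down: this node will do the checking, so inhibit the operand below.
       Leaf (symbol) nodes are never marked. */
    if (n->base->kind != N_VAR)
        n->base->flags |= NO_ACCESS_CHECK;
    walk(n->base, log);

    /* Up: emit a check unless a node on top has inhibited it. */
    if (n->kind == N_ADDR)
        log_check(log, "val", n->label);   /* address taken: value check */
    else if (!(n->flags & NO_ACCESS_CHECK))
        log_check(log, "acc", n->label);   /* memory touched: access check */
}
```

For p[i].x the model logs a single access check on the field, and for &p[i].x a single value check on the address node, matching the treatment of FIG. 7A and FIG. 9. The CGDown and CGUp listings above additionally handle the save flags, which this model omits.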
These functions are invoked as follows. For a field node, the CGDown function is invoked as follows:

CGDown(n, structure, CG_NOACCCK);

As shown in FIG. 13, the first parameter, n, corresponds to the top level node. The second parameter, structure, corresponds to the structure or record for that field. The third parameter is set to "no access," for indicating that an access check is not required. When coming back up, the field node is processed by CGUp, as follows:

CGUp(n, structure, CG_CHECKACC);

When coming back up the tree, access is checked. As the process is recursive, the node can be marked as no check.

For a de-reference node, the CGDown function is invoked as follows:

CGDown(n, pointer, 0);

As shown, when going down, the system does not request "no check," because when a pointer is de-referenced (as indicated in FIG. 14), if it is an expression or variable, it will be accessed to de-reference it. For the de-reference node, the call to CGUp is as follows:

CGUp(n, pointer, CG_CHECKACC);

As shown, an access check is required.

For an index node, the CGDown function is invoked as follows:

CGDown(n, array, CG_NOACCCK | (array->g.kind == NK_INDEX ? CG_SAVEADR : 0));

For the index node, n and the array are passed. Additionally, "no access check" is specified. As also shown, however, the address is saved for instances where multiple indexes exist for the array (e.g., as illustrated in FIG. 11). In such a case, the address is saved for use later, as previously described. When coming back up the index node, the CGUp function is invoked as follows:

CGUp(n, array, CG_CHECKACC);

As shown, access is checked on the return trip.

For an address node, the CGDown function is invoked as follows:

CGDown(n, expression, CG_NOACCCK);

As shown, when going down, "no access check" is specified.
When coming back up, the CGUp function is invoked as follows:

CGUp(n, expression, CG_CHECKVAL);

When coming back up, as shown, the "value" (not the "address") is checked.

For pointer addition or subtraction, the CGDown function is invoked as follows:

CGDown(n, pointer, CG_NOVALCK | CG_SAVEVAL);

As shown, when going down the flag is set specifying "no value check" (as something will be added to it); the value is, instead, saved. When traversing back up, the CGUp function is invoked as follows:

CGUp(n, pointer, CG_CHECKVAL);

As shown, when coming back, the value is checked. If, for example, a chain of additions occurs, the following will happen. For each node encountered on the downward traversal, "no value check" is specified but the "save value" will be propagated. On the return trip up, nothing will be checked until the result node. At that point, the system will compare the starting address (that was saved) with the end result, for making sure that both are within the same block.

Additional reference material illustrating a preferred user interface and general operation of the present invention is available from Borland CodeGuard™ User's Guide (Part No. LCG1110WWW21770, Borland International, Inc. of Scotts Valley, Calif.), which is appended herewith as Appendix C.

While the invention is described in some detail with specific reference to a single preferred embodiment and certain alternatives, there is no intent to limit the invention to that particular embodiment or those specific alternatives. Thus, the true scope of the present invention is not limited to any one of the foregoing exemplary embodiments but is instead defined by the appended claims.