Database server for handling a plurality of user defined routines (UDRs) expressed in a plurality of computer languages
6223179
Abstract
User Defined Routines (UDRs), capable of being expressed in one or more languages, are handled by determining a language native to the UDR, for example, by looking up a system catalog. If a language manager associated with the native language has not been loaded already, the language manager is loaded into a server memory. If the UDR has not already been instantiated, the UDR is instantiated and initialized. Then an execution context for the UDR is created and the UDR is executed. Loading of the language manager is handled by a general language interface capable of initializing the language manager, loading the language manager, creating a language manager context, and executing the language manager.
Claims
What is claimed is:
Description
BACKGROUND
STATUS udrlm_XLANG_init(
    udrlm_ldesc* ldesc);    // descriptor to fill in
Performs any language specific initialization. Is called once at the time of the first reference to this language. Will fill in the udrlm_ldesc function pointers and language specific field as needed. All memory should be allocated from a system wide pool. Returns FUNCSUCC on success.

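As a rough illustration of how an initialization function might fill in the descriptor's function pointers and language specific field, the following C sketch uses a reduced descriptor. The structure layout, the field names, the "demo" language, the numeric value of FUNCSUCC and the FUNCFAIL code are all assumptions made for illustration; the text above names the structure but does not give its layout.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed definitions: the text names STATUS and FUNCSUCC but not their values. */
typedef int STATUS;
#define FUNCSUCC 0
#define FUNCFAIL (-1)            /* hypothetical failure code */

/* Hypothetical, reduced language descriptor: a small table of function
 * pointers plus one language specific field. */
typedef struct udrlm_ldesc {
    const char *name;                           /* language name */
    STATUS (*shut)(struct udrlm_ldesc *ldesc);  /* language cleanup */
    STATUS (*execute)(void *args, void *rets);  /* run one routine */
    void   *lang_private;                       /* language specific field */
} udrlm_ldesc;

static int demo_state;           /* stands in for language specific data */

static STATUS demo_shut(udrlm_ldesc *ldesc)
{
    ldesc->lang_private = NULL;  /* release what init set up */
    return FUNCSUCC;
}

static STATUS demo_execute(void *args, void *rets)
{
    *(int *)rets = *(int *)args + 1;   /* toy routine: add one */
    return FUNCSUCC;
}

/* Plays the role of udrlm_XLANG_init() for the hypothetical "demo" language. */
STATUS udrlm_demo_init(udrlm_ldesc *ldesc)
{
    assert(ldesc != NULL);
    ldesc->name = "demo";
    ldesc->shut = demo_shut;
    ldesc->execute = demo_execute;
    ldesc->lang_private = &demo_state;
    return FUNCSUCC;
}
```

Once the descriptor is filled in, a caller can reach the language's functions only through the pointers, which is what lets one manager drive several languages.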
STATUS udrlm_XLANG_shut(
    udrlm_ldesc* ldesc);    // language descriptor
Performs any language specific cleanup. Is called once at the time of the unloading of the last routine that references this language. Will free resources allocated at initialization time. Returns FUNCSUCC on success.

STATUS udrlm_XLANG_parse(
    char* external_name,    // external name from user
    char* module,           // module name
    char* symbol);          // symbol name
Parses the given external_name string into a module name key value and a symbol name to be used by subsequent language functions. The module argument is set to the module name and any other information in the string is parsed into the symbol argument. If there is no module name, a unique string, e.g., "NULL", should be copied into module. If there is no extra information in the external_name string, the symbol argument is set to a null string. For safety, both arguments should be the size of the full external name attribute of the SYS_PROCEDURES table. This function is used when initializing the udrlm_mdesc structure.

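A minimal sketch of such a parse function follows. It assumes an external name of the form module(symbol), which the text above does not specify, and it adds an explicit buffer capacity parameter that is not part of the documented signature. The names demo_parse, copy_span and FUNCFAIL, and the value of FUNCSUCC, are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed definitions: the text names STATUS and FUNCSUCC but not their values. */
typedef int STATUS;
#define FUNCSUCC 0
#define FUNCFAIL (-1)            /* hypothetical failure code */

/* Copies len bytes of src into dst and terminates it; fails if it would not fit. */
static STATUS copy_span(char *dst, size_t cap, const char *src, size_t len)
{
    if (len >= cap)
        return FUNCFAIL;
    memcpy(dst, src, len);
    dst[len] = '\0';
    return FUNCSUCC;
}

/* Splits "module(symbol)" into module and symbol (assumed syntax).
 * A missing module name yields the placeholder "NULL"; a missing
 * symbol yields an empty string. */
STATUS demo_parse(const char *external_name, char *module, char *symbol, size_t cap)
{
    assert(external_name != NULL && module != NULL && symbol != NULL && cap > 0);

    const char *lparen = strchr(external_name, '(');
    size_t mod_len = lparen ? (size_t)(lparen - external_name)
                            : strlen(external_name);

    if (mod_len == 0) {
        if (copy_span(module, cap, "NULL", 4) != FUNCSUCC)
            return FUNCFAIL;
    } else if (copy_span(module, cap, external_name, mod_len) != FUNCSUCC) {
        return FUNCFAIL;
    }

    symbol[0] = '\0';
    if (lparen != NULL) {
        const char *rparen = strchr(lparen, ')');
        if (rparen == NULL)
            return FUNCFAIL;             /* unbalanced parenthesis */
        return copy_span(symbol, cap, lparen + 1, (size_t)(rparen - lparen - 1));
    }
    return FUNCSUCC;
}
```

Passing the capacity explicitly is a design choice of this sketch; the documented interface instead relies on both buffers being as large as the full external name attribute.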
STATUS udrlm_XLANG_load(
    proccache_t* rdesc,     // routine descriptor
    udrlm_mdesc* mdesc);    // module descriptor
Performs any language specific loading of the specified routine and module, e.g., linking shared objects and finding symbols. If the mdesc reference count field is zero, then the structure is not fully initialized and this is the first reference to the module specified in the rdesc. In this case this function will load the module and fill in the mdesc language specific field as needed. After the module is loaded this function will be called again with mdesc reference counts greater than zero. In these cases, the specific routine should be loaded and the rdesc language specific field filled in appropriately. (In the C language for example, the first call does a dlopen() and subsequent calls do dlsym()s to locate functions in the module.) All memory should be allocated from a system wide pool. Returns FUNCSUCC on success.
Note 1: When external modules are being loaded, this function should take pains to ensure the security and integrity of the system. For example, in the C language, various schemes for monitoring file ownership and permissions should be implemented to prevent un-trusted modules from being installed.

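The two-phase behavior keyed on the reference count can be sketched as below. To stay self-contained, the sketch replaces dlopen() and dlsym() with an in-process symbol table, so it shows only the control flow and not real dynamic linking. The types demo_mdesc and demo_rdesc, the helper demo_lm_load (standing in for the calling Language Manager, which is assumed here to own the reference count), and the FUNCFAIL code are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed definitions: the text names STATUS and FUNCSUCC but not their values. */
typedef int STATUS;
#define FUNCSUCC 0
#define FUNCFAIL (-1)            /* hypothetical failure code */

typedef int (*demo_fn)(int);
typedef struct { const char *symbol; demo_fn fn; } demo_symbol;

typedef struct {                 /* hypothetical module descriptor */
    const char *name;
    int refcount;                /* zero until the module is loaded */
    const demo_symbol *handle;   /* language specific field */
} demo_mdesc;

typedef struct {                 /* hypothetical routine descriptor */
    const char *symbol;
    demo_fn entry;               /* language specific field */
} demo_rdesc;

static int add_one(int x) { return x + 1; }
static int twice(int x)   { return 2 * x; }

/* Simulated module contents; a real C language manager would dlopen() a file. */
static const demo_symbol demo_module[] = {
    { "add_one", add_one }, { "twice", twice }, { NULL, NULL }
};

STATUS demo_load(demo_rdesc *rdesc, demo_mdesc *mdesc)
{
    assert(rdesc != NULL && mdesc != NULL);
    if (mdesc->refcount == 0) {          /* first reference: load the module */
        mdesc->handle = demo_module;     /* plays the role of dlopen() */
        return FUNCSUCC;
    }
    if (mdesc->handle == NULL)
        return FUNCFAIL;
    /* later references: locate the routine, the role of dlsym() */
    for (const demo_symbol *s = mdesc->handle; s->symbol != NULL; s++) {
        if (strcmp(s->symbol, rdesc->symbol) == 0) {
            rdesc->entry = s->fn;
            return FUNCSUCC;
        }
    }
    return FUNCFAIL;
}

/* Caller side: load the module if needed, count the reference, resolve the routine. */
STATUS demo_lm_load(demo_rdesc *rdesc, demo_mdesc *mdesc)
{
    if (mdesc->refcount == 0 && demo_load(rdesc, mdesc) != FUNCSUCC)
        return FUNCFAIL;
    mdesc->refcount++;
    STATUS st = demo_load(rdesc, mdesc);
    if (st != FUNCSUCC)
        mdesc->refcount--;               /* failed lookups hold no reference */
    return st;
}
```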
STATUS udrlm_XLANG_unload(
    udrlm_mdesc* mdesc);    // module descriptor
Performs any language specific unloading of the specified module, e.g., unlinking shared objects. Returns FUNCSUCC on success.

STATUS udrlm_XLANG_context_open(
    proccache_t* rdesc,     // routine descriptor
    udrlm_rinst** rinst);   // instance descriptor (ref)
This function will be called before the first execution of a User Routine in a statement (e.g., at the start of an SQL statement, or when a new late-bound routine is resolved). If *rinst is NULL this function will allocate a new structure and any other memory needed by the language context for the UDR. Otherwise this is a previously allocated context structure that is being recycled from cache and will be initialized for the first use of the referenced Routine. Memory should be allocated from a session pool. Returns FUNCSUCC on success.
Note: This function allocates both common and language specific memory in order to minimize the number of allocation calls and reduce memory fragmentation. For instance, logically, everything but language state initialization should happen on a per-execution basis, but by moving all this to a per-instance function, significant overhead may be removed from most execution loops.

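The allocate-or-recycle rule for the instance descriptor might look like the following sketch. Here malloc() stands in for the session pool, and the common and language specific memory live in one structure so that a single allocation covers both. The types demo_rdesc and demo_rinst, their fields, and the FUNCFAIL code are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Assumed definitions: the text names STATUS and FUNCSUCC but not their values. */
typedef int STATUS;
#define FUNCSUCC 0
#define FUNCFAIL (-1)            /* hypothetical failure code */

typedef struct { const char *name; } demo_rdesc;   /* hypothetical routine descriptor */

typedef struct {                 /* hypothetical instance descriptor */
    const demo_rdesc *rdesc;     /* routine this context belongs to */
    int calls;                   /* per-instance state */
    char lang_state[32];         /* language specific memory, same allocation */
} demo_rinst;

STATUS demo_context_open(const demo_rdesc *rdesc, demo_rinst **rinst)
{
    assert(rdesc != NULL && rinst != NULL);
    if (*rinst == NULL) {
        /* no cached structure: allocate one (stand-in for the session pool) */
        *rinst = malloc(sizeof **rinst);
        if (*rinst == NULL)
            return FUNCFAIL;
    }
    /* new or recycled, the context is initialized for first use */
    memset(*rinst, 0, sizeof **rinst);
    (*rinst)->rdesc = rdesc;
    return FUNCSUCC;
}

STATUS demo_context_close(demo_rinst *rinst)
{
    free(rinst);                 /* frees common and language memory together */
    return FUNCSUCC;
}
```

Recycling a cached structure skips the allocation entirely, which is the saving the note above is after.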
STATUS udrlm_XLANG_context_close(
    udrlm_rinst* rinst);    // instance descriptor
Performs any language specific cleanup after the final use of the context structure. It will free the rinst structure and associated language specific resources. Will be called when cached descriptors are removed. Returns FUNCSUCC on success.

STATUS udrlm_XLANG_execute(
    udrlm_rinst* rinst,     // instance descriptor
    void* args,             // arguments
    void* rets);            // return values
Sets up and executes a routine using the arguments given. The rinst will be used to maintain state information through multiple calls to this routine. The return value(s) will be placed in the rets and the return state will be placed in the rinst on successful execution. The STATUS return indicates the result of the execution attempt, NOT the result of the UDR itself.

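The distinction between the STATUS of the execution attempt and the result of the UDR itself can be sketched as follows. The instance layout, the integer-only argument convention, the sample routine safe_div100 and the FUNCFAIL code are assumptions for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed definitions: the text names STATUS and FUNCSUCC but not their values. */
typedef int STATUS;
#define FUNCSUCC 0
#define FUNCFAIL (-1)            /* hypothetical failure code */

typedef struct {                 /* hypothetical instance descriptor */
    int (*entry)(int arg, int *ret);  /* UDR entry point; returns the UDR's own result code */
    int udr_state;               /* return state of the most recent call */
    int calls;                   /* state kept across calls */
} demo_rinst;

/* Sample UDR: reports division by zero through its own result code. */
static int safe_div100(int arg, int *ret)
{
    if (arg == 0)
        return 1;                /* UDR-level error */
    *ret = 100 / arg;
    return 0;
}

STATUS demo_execute(demo_rinst *rinst, void *args, void *rets)
{
    assert(rinst != NULL);
    if (rinst->entry == NULL || args == NULL || rets == NULL)
        return FUNCFAIL;         /* the execution attempt itself failed */
    rinst->udr_state = rinst->entry(*(int *)args, (int *)rets);
    rinst->calls++;
    return FUNCSUCC;             /* the routine ran; its outcome is in udr_state */
}
```

A caller therefore checks two things: the STATUS to learn whether the routine ran at all, and the state stored in the instance to learn what the routine reported.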
STATUS reload_module(
    ModuleName,             // module to reinstall
    Language)               // language to use
This function may be called from SQL to reload a routine module. All executions of UDRs referencing this module while this function is operating will continue to use the old module until the end of the statement. If the module was not already installed it will be loaded for the first time.

STATUS replace_module(
    OldModuleName,          // module to reinstall
    NewModuleName,          // new module to use
    Language)               // language to use
This function may be called from SQL to replace a routine module. All executions of UDRs referencing this module while this function is operating will continue to use the old module until the end of the statement.

When the first routine using a specific language is invoked, the language interface itself is loaded and initialized following these steps:
1) A udrlm_ldesc structure is located or created for the language.
2) The row for the language is selected from a SYSROUTINELANGS catalog. If there is no row for this language an error is returned.
3) The "langinitfunc" attribute of that SYSROUTINELANGS row identifies the language initialization routine in the SYS_PROCEDURES catalog.
4) A standard C language load module operation is performed: a udrlm_mdesc structure and a temporary udrlm_rdesc structure are initialized from the SYS_PROCEDURES table.
5) The built-in C language udrlm_clang_load_module() function is called to load the language initialization interface module.
6) The udrlm_XLANG_init() function referenced in the langinitfunc attribute is called to initialize the udrlm_ldesc structure.
When the last reference to a UDR using a specific language is dropped, the language interface is shut down and removed. The steps that occur during the language drop are:
1) The language's udrlm_ldesc is removed from the language list to prevent new routines from using this language.
2) The udrlm_XLANG_shut() function is called to clean up resources.
3) The built-in C language udrlm_clang_unload_module() function is called to unload the language interface module.
If any routines that reference the language under consideration are invoked subsequently, and the language entries in the SYSROUTINELANGS and SYS_PROCEDURES catalogs have not been DELETED, the specified language module will be reloaded and initialized. If the catalog entries are DELETED, subsequent function invocations will attempt to load the language, but an error will occur when the language is not found in the catalog. Moreover, when all routines that reference a module have been dropped, the module will be removed as well.
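The language life cycle described above (initialize on first use, shut down when the last reference is dropped, reload on a later use, fail when the catalog has no row) can be sketched with a simple reference count. The catalog is simulated by a static array that also carries the run-time state, a simplification; the language names, the demo_ identifiers, the counters and the FUNCFAIL code are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed definitions: the text names STATUS and FUNCSUCC but not their values. */
typedef int STATUS;
#define FUNCSUCC 0
#define FUNCFAIL (-1)            /* hypothetical failure code */

typedef struct {
    const char *name;            /* language name, as in a catalog row */
    int loaded;                  /* is the language interface initialized? */
    int routines;                /* UDRs currently referencing the language */
    int init_calls;              /* times the init function was invoked */
    int shut_calls;              /* times the shut function was invoked */
} demo_lang;

/* Stand-in for rows of the SYSROUTINELANGS catalog. */
static demo_lang demo_catalog[] = {
    { "java", 0, 0, 0, 0 },
    { "perl", 0, 0, 0, 0 },
};

static demo_lang *demo_find(const char *name)
{
    size_t n = sizeof demo_catalog / sizeof demo_catalog[0];
    for (size_t i = 0; i < n; i++)
        if (strcmp(demo_catalog[i].name, name) == 0)
            return &demo_catalog[i];
    return NULL;
}

/* Called when a routine in the given language is first referenced. */
STATUS demo_lang_acquire(const char *name)
{
    assert(name != NULL);
    demo_lang *l = demo_find(name);
    if (l == NULL)
        return FUNCFAIL;         /* no catalog row: an error is returned */
    if (!l->loaded) {            /* first reference: load and initialize */
        l->init_calls++;
        l->loaded = 1;
    }
    l->routines++;
    return FUNCSUCC;
}

/* Called when a routine in the given language is dropped. */
STATUS demo_lang_release(const char *name)
{
    assert(name != NULL);
    demo_lang *l = demo_find(name);
    if (l == NULL || l->routines == 0)
        return FUNCFAIL;
    if (--l->routines == 0) {    /* last reference: shut down and unload */
        l->shut_calls++;
        l->loaded = 0;
    }
    return FUNCSUCC;
}
```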
The steps taken during a module DROP are:
1) The udrlm_unload() function is called for a routine.
2) If this is the last reference to the specific module it will be removed by calling the language's udrlm_XLANG_unload_module() function.
3) If this is the last reference to the language:
a. The udrlm_XLANG_shut() function is called.
b. The udrlm_ldesc is freed.
c. The built-in C language udrlm_clang_unload_module() function is called to unload the language interface module.
d. The language module's udrlm_mdesc is freed.
Turning now to FIG. 3, a representative system procedure (SYS_PROCEDURE) table is shown. The SYS_PROCEDURE table has a number of arguments column, an argument type column, a return type column, a language type column and a language specific declaration column. Using the SYS_PROCEDURE table, the UDR determiner 184 can look up specifics on the function being analyzed, including the language type, the number of arguments accepted, the types of the arguments and return values, and other language/routine specific information.
Referring now to FIG. 4, a query execution process 200 is illustrated. In the process 200, the client initially sends an SQL command to the server 110 (step 202). Next, the server 110 parses the command (step 204). Once parsed, the query is optimized in step 206 by an optimizer. Next, the instructions for performing the requested query are executed (step 208) before the process 200 exits (step 210).
Turning now to FIG. 5, the parse process 204 (FIG. 4) is shown in more detail. Initially, the process 204 identifies the function being invoked from an SQL command (step 222). Next, the process 204 resolves the function call to a particular instance using the SYS_PROCEDURE table of FIG. 3 (step 224). The parse process 204 then identifies the particular language being invoked (step 226). Next, the process 204 determines whether the desired language manager already has been loaded in memory (step 228). If not, the desired language manager is loaded (step 230).
Alternatively, if the language manager has been loaded, the process 204 proceeds to step 232. From step 228 or 230, the parse process 204 proceeds to step 232 where the language manager determines whether the invoked function already has been loaded into memory. If not, the process 204 instantiates the desired function (step 234) and performs specific function initialization as needed (step 236). From step 236 or step 232, in the event that the function being invoked already has been loaded, the routine 204 creates an execution context (step 238). Next, the parse function 204 exits (step 240).
FIG. 6 illustrates details of interactions between the routine manager 188 and language managers 190-192. The managers 188, 190 and 192 interact via a plurality of middle layers, including a routine post-determination layer 302, an instance of the post-determination layer 303, a late bound execution layer 320, an iteration execution layer 321 and a cleanup layer 330. The interaction among the layers and managers is accomplished using data structures, as discussed below.
The Routine Descriptor (RDESC) structure is used by LM components in all sessions to save static information about a particular routine. The RDESC structure is allocated in system-wide shared memory and cached for reuse.
The Routine Instance (RINST) structure describes the context of a routine sequence in an SQL statement. The RINST structure contains references to the routine's static, state, and dynamic data for each execution. Each routine invocation logically has a distinct RINST, but for efficiency, these structures may be cached and recycled between statements on a session wide basis.
The Language Descriptor (LDESC) structure is used to vector LM operations to the appropriate language functions. The LDESC structure is filled in by language specific code when the language is initialized and linked into the languages list by the Language Manager.
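How a languages list of LDESC structures can vector an operation to the right language functions is sketched below, using a singly linked list searched by language name. The list layout, the demo_ identifiers, the two toy languages and the FUNCFAIL code are assumptions; the text does not describe how the list is organized.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed definitions: the text names STATUS and FUNCSUCC but not their values. */
typedef int STATUS;
#define FUNCSUCC 0
#define FUNCFAIL (-1)            /* hypothetical failure code */

typedef struct demo_ldesc {      /* hypothetical, reduced LDESC */
    const char *language;
    STATUS (*execute)(int arg, int *ret);   /* language's execute function */
    struct demo_ldesc *next;     /* link in the languages list */
} demo_ldesc;

static demo_ldesc *demo_languages;          /* head of the languages list */

/* Links an initialized descriptor into the languages list. */
void demo_link_language(demo_ldesc *ldesc)
{
    assert(ldesc != NULL);
    ldesc->next = demo_languages;
    demo_languages = ldesc;
}

/* Vectors an execute request to the functions of the named language. */
STATUS demo_dispatch(const char *language, int arg, int *ret)
{
    assert(language != NULL && ret != NULL);
    for (demo_ldesc *l = demo_languages; l != NULL; l = l->next)
        if (strcmp(l->language, language) == 0)
            return l->execute(arg, ret);
    return FUNCFAIL;             /* language not in the list */
}

/* Two toy languages with different execute behavior. */
static STATUS lang_a_execute(int arg, int *ret) { *ret = arg + 1; return FUNCSUCC; }
static STATUS lang_b_execute(int arg, int *ret) { *ret = arg * 2; return FUNCSUCC; }
```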
The Module Descriptor (MDESC) structure is used to describe a single UDR module, each of which may contain many routines. The MDESC structure contains public information, including the name of the module, and private information used only by the module's language functions. The MDESC structure is filled in when the module is loaded and is linked into the module list by the Language Manager.
Turning now to the post-determination middle layers 302 and 303, for each routine, the post-determination layer 302 supplies a key to a language manager load routine 304, which in turn loads an RDESC structure. After the LM load routine 304 has been executed, the RDESC structure is passed into a language initialization module 306. After the RDESC data structure has been initialized, the RDESC structure is provided to a load module 308, after which the RDESC structure is fully initialized. The initialized RDESC structure then is provided to the post-determination middle layer 303 where, for each instance, a language manager context open module 310 and a context open module 312 process the routine descriptor RDESC structure. From the post-determination layer 303, the fully initialized RDESC structure is provided to the LM context open module 310, which in turn generates a RINST data structure. The RINST data structure is initialized at this stage before it is passed back to the post-determination layer 303.
Upon completion of processing from the post-determination layer 303, the routine manager layer 300 provides the RDESC structure to an execution middle layer which further can be divided into a late-bound execution layer 320 and an iteration execution layer 321. In the case of the late-bound execution layer 320, the routine instance data structure is provided to a language manager shared state module 322. The shared state module 322 copies state information into the RINST data structure. The updated RINST data structure is provided back to the late-bound execution layer 320.
Additionally, for each iteration, the routine instance RINST data structure as well as arguments are provided to an LM execution routine 324. The resulting RINST and outputs from the LM execution routine 324 are provided to an execution routine 326 which returns the output back to the iteration execution unit 321. From the iteration execution unit 321, the routine manager layer 300 communicates with the cleanup layer 330 which, for each instance, provides the RINST data structure to an LM context close module 332 for deallocating memory and data storage associated with various data structures. The RINST data structure then is passed to a context close module 334 which provides status information back to the cleanup layer 330.
Referring to FIG. 7, a UDR language manager entity relationship diagram is illustrated. In this diagram, four possible relationships 380-386 exist. The relationship 380, exemplified by two parallel lines, indicates a one-to-one relationship. The relationship 382, indicative of a one-to-many relationship, is illustrated by a line next to a left arrow. The relationship 384, indicative of a zero-to-many mapping, is shown as a circle next to a left arrow. Finally, a zero-to-one relationship of the relationship 386 is shown as a zero next to a vertical line. The UDR language manager entity relationship is discussed in the context of relationships 380-386.
In FIG. 7, a Storage Module 372 may contain one or more routines, each represented by a Routine Key 350. A Language Module 370 may reference one or more Storage Modules 372. Each Storage Module 372 has exactly one Module Descriptor 374, and each Language Module 370 has exactly one Language Descriptor 376. The Language Descriptor 376, in turn, has a Module Descriptor 374, which may be used by one or more languages. A Routine Descriptor 352 has one Module Descriptor 374. The Module Descriptor may be referenced by one or more Routine Descriptors 352.
Each Routine Descriptor 352 contains one Argument Description 354 and one Return Description 356. A Routine Descriptor 352 may be contained in zero to many Routine Instances 358. Each Routine Instance contains one Routine Argument list 366, one Routine Returns list 364, one Routine State set 362, and one Routine Context 360, none of which are referenced by any other Routine Instance 358.
An exemplary C language interface, an instance of the General Language Interface that supplies the function calls, is described below.
STATUS udrlm_clang_init(
    udrlm_ldesc*);          // language descriptor
Fills in the udrlm_ldesc. The other functions in this section are entered into the structure for use by the Language Manager. These functions are statically linked into the CPU module so no loading is necessary. Returns FUNCSUCC on success.

STATUS udrlm_clang_shut(
    udrlm_ldesc*);          // language descriptor
A NOP, because the C language udrlm_ldesc structure is not to be cleared.

STATUS udrlm_clang_parse(
    char*,                  // external_name from user
    char*,                  // module_name
    char*);                 // symbol_name
Parses the module name from the given external_name string. The module_name argument is set to the module name and the entry point portion of the string is copied into the symbol_name argument. If there is no module name, the string "CNULL" is copied into module_name. If there is no entry point string, the symbol_name argument is set to a null string.

STATUS udrlm_clang_load_module(
    udrlm_rdesc*,           // routine descriptor
    udrlm_mdesc*);          // module descriptor
Loads the routine descriptor and module descriptor.

Since the server language, in this case C, is the basis for the Language Manager itself, these functions are built into code on the server 110 rather than being loaded dynamically. Moreover, unlike other languages, the C language initialization function is called at system start-up, rather than at language loading time.
The techniques described here may be implemented in hardware or software, or a combination of the two. Preferably, the techniques are implemented in computer programs executing on programmable computers that each include a processor, a storage medium readable by the processor (including volatile and nonvolatile memory and/or storage elements), and suitable input and output devices. Program code is applied to data entered using an input device to perform the functions described and to generate output information. The output information is applied to one or more output devices.
Each program is preferably implemented in a high level procedural or object-oriented programming language to communicate with a computer system. However, the programs can be implemented in assembly or machine language, if desired. In any case, the language may be a compiled or interpreted language.
Each such computer program is preferably stored on a storage medium or device (e.g., CD-ROM, hard disk or magnetic diskette) that is readable by a general or special purpose programmable computer for configuring and operating the computer when the storage medium or device is read by the computer to perform the procedures described. The system also may be implemented as a computer-readable storage medium, configured with a computer program, where the storage medium so configured causes a computer to operate in a specific and predefined manner.
Other embodiments are within the scope of the following claims.
