Self-modifying data flow execution architecture6772395Abstract A self-modifying data flow architecture for computer-readable structures, such as markup language, is modeled as a network of interconnected processing elements, each having a data input and a transformation input. Each processing element generates output by applying the transformation input to the data input. The output of one processing element may be provided as either a data input or a transformation input to another processing element. The resulting architecture provides a network of interconnected processing elements which are modified dynamically depending on the data flowing through the overall process. Claims What is claimed is: Description TECHNICAL FIELD
<data>
<item>
<fiie>item1.htm</file>
<description>First item</description>
<author>Michael</author>
</item>
<item>
<file>item2.htm</file>
<description>Second item</description>
<author>Phani</author>
</item>
</data>
A powerful tool for formatting XML documents is Extensible Stylesheet Language (XSL). XSL essentially provides a series of templates that can be used to match some aspect of an XML document, usually by matching defined units, or nodes, in the document. These templates apply patterns to the input XML stream that transform it to an output stream. Importantly, XSL contains a number of functions for making conditional comparisons, sorting, and performing group operations, and thus provides, among other things, a way to transform the order in which elements appear in the input document. Also importantly, XSL is written in XML itself. Thus, transformations may be performed on XSL and the same parser that manipulates XML data can also, with appropriate XSL processing features, reference, retrieve, and manipulate the XSL. XSL is thus a transformational language. It can take an XML document (or a rigorously valid HTML document) and convert it to another XML document, an HTML document, a printable HTML document, a standard ASCII text file, a proprietary text format, or conceivably even a binary representation. Processing element 210 also includes a transformation input 214,which in this exemplary case is a reference, <xe:transform src="indexxform.xsl"/> to an XSL tree which has the following content:
<transform>
<xsl:template xmins:xsl="uri:xsl">
<xsl:for-each select="item">
<a><xsl:attribute
name="href"><xsl:value-of
select="file"/></xsl:attribute>
<xsl:value-of select="description"/></a>
</xsl:for-each>
</xsl:template>
</transform>
A set of HTML tags are included in the XML tree 200. The structure of the input XML tree 200 is important for reasons that will be later described with respect to an exemplary execution engine according to the invention. The execution engine will process the input XML tree 200 to create an output tree. As will be described in more detail below, during the execution process, tags which do not define processing elements 210 or 220 are copied directly to the output tree, preserving the order and structure of the input tree. FIG. 2B is a functional block diagram of the implementation of FIG. 2A. Each processing element 210 and 210 may be represented by a symbol which is provided with a data input 212, 222 and a transformation input 214, 224. Each processing element 210, 220 generates a respective output 216 and 226. FIG. 2C is an exemplary output generated from the input tree 200 of FIGS. 2A and 2B. As the execution engine parses the input tree structure, when a processing element 210 or 220 is encountered, the engine places the respective output "out1.xml" and "out2.xml" in the appropriate position in the output tree 230. Thus, the output 216 corresponds to the XML data in "index.xml" expressed through the transformation defined by "indexxform.xsl." Similarly, the output 226 corresponds to the XML data in "bookinfo.xml" expressed through the transformation defined by "bookxform.xsl." The exemplary input tree of FIG. 2A-2C is useful to introduce the general operation of the processing elements according to the invention. Notably, each of the processing elements 210 and 220 are top-level processing elements, in that they each typically would represent a region of an HTML page. FIGS. 3A and 3B illustrate the self-modifying aspects of an exemplary architecture according to the invention. Here, the output 316 of one processing element 310 is wired to as the transformation input of another processing element 330. The wiring structure represented in FIG. 3B is accomplished through the structure of the input tree or node 300 shown in FIG. 3A. The upper level; processing element 330 includes a child processing element or node 310 defined therein. Child processing element 310 is provided with a data input 312 which references an XML file such as "awardtemplate.xrnl," an exemplary listing of which is provided in the APPENDIX to this specification. Child processing element 310 is also provided with a transformation input 316 which references an XSL file such as "bind.xsl," an exemplary listing of which is also provided in the APPENDIX to this specification. For example, bind.xsl is a style sheet that converts an XML file into a form that can be used for the processing input in the tree. In this specific case, it converts a meta-language into XSL for further use throughout the processing. Those of ordinary skill in the art will recognize that the code listing for bind.xsl is just one example of code that could be used to convert a meta language into XSL. Child processing element 310 is bounded by transformation input tags 311, which indicate that the output of child processing element 310 provides the transformation tree for the upper level processing element 330. The data input 320 for the upper level processing element 330 is provided by reference to "awards.xml" which includes the following tree structure:
<data>
<award>
<name>BN Discover</name>
<search><<;search>; BND.xml<;/search>;
</search>
</award>
<award>
<name>American Book Awards</name>
<search><;search>;BND.xml<<;/search>;
<;xform>;abatemplate.xml<;/xform>;</search>
</award>
<award>
<name>Booker Prize</name>
<search><;inactive>;ABA.xml<;/inactive>;</search>
</award>
<award>
<name>National Book Awards</name>
<search><; inactive>;NBA.xml<;/inactive>;
</search>
</award>
<award>
<name>Nobel Prize</name>
<search><; inactive>; nobel.xml<;/inactive>;
</search>
</award>
</data>
The data flow reflected in the tree structure in FIG. 3A is illustrated in FIG. 3B. In processing the input tree 300, an exemplary execution engine according to the invention, as will be described below, first evaluates the inner most processing element 310. The results of this evaluation are placed in the output tree. If the engine recognizes that another processing element, here 330, contains the inner processing element 310, then the results of the evaluation of the inner processing element are processed by the engine as part of the input tree, using recursive techniques. This evaluation process is further illustrated with reference to FIGS. 4A-4D. FIG. 4A illustrates an input tree 400 containing nested processing element 410. Processing element 410 is nested within the data input of the top-level processing element 420 and includes a transformation input "getdistinct.xsl" which accomplishes the function of transforming an input tree of elements into an output tree of distinct elements, i.e., repetitions in the input data are eliminated. As will be recognized by those of ordinary skill, the transformation input "getdistinct.xsl" may involve conventional methods for determining distinct elements from an input set. For example, one approach for collecting distinct elements would involve sorting the input tree by element name, setting a variable representing the previous element to null, traversing the tree, emitting elements into the new tree when the current element is not the same as the previous element, setting the previous element to the current element, and setting the current element to the next element in the tree and repeating the process. Top-level processing element 420 includes a transformation input "process.xsl" which creates the HTML for an input data tree. The data flow can be represented schematically as in FIG. 4B. FIG. 4C illustrates the output tree from processing element 410 after the deepest processing element 410 is evaluated by the execution engine. The output tree 440 is an XML tree which contains a listing of the unique record elements of the data input for processing element 410, delimited by the <unique> and </unique> tags. During the first pass through the input tree, the execution engine copies the input nodes of the top-level processing element 420 into the output tree. Similarly, FIG. 4D illustrates the output tree after the top level processing element 420 is evaluated, where the transformation "process.xsl" provides an HTML layout for the output tree 440. It will be appreciated by those of ordinary skill that an architecture according to the invention will be characterized by self-modifying aspects related to the use of an output of one processing element as the transformation input of another processing element. As such, each processing element may be modified by an appropriate transform depending on the input data. Thus, the processing that occurs is modified based on the data being processed. The invention also contemplates processing elements which expose particular schema for permitting authoring tools and services to query the types of information and formats expected by the inputs and outputs of the processing elements. In an XML implementation, the schema for a particular processing element provides a description of the expected format of the input tree and the transform tree and describes any methods available on the processing element. Since the schema is represented in XML itself, it can be examined and processed by the execution engine and by authoring tools. For example, an authoring tool might indicate to a user that one processsing element outputs row names and another processing element expects row names as inputs. The authoring tool could then naturally associate these two processing elements together. The invention contemplates a mechanism for permitting a user to expose a property set for one or more processing elements. Such a mechanism provides a user interface for presenting the processing elements and associated property sets to a user to enable the user to functionally connect processing elements. According to the invention, such a mechanism first identifies a processing node or element by first traversing through the input tree. Then, the processing node or element is navigated to determine the node's child elements. The child elements are then displayed to the user in a tree format, for example, by constructing an HTML view. Alternatively, the processing node may be examined to determine if it contains an interface schema. If it does, the interface schema is parsed and the information therein is displayed to the user. As an example, the interface schema description could be provided using a formal standard such as the XML Schema proposal or a custom descriptive language, for example, as follows:
<schema>
<comment>Finds the first count items in the database with
a particular color</comment>
<inputs>
<item name="color" dt:dt="string" count="1"/>
<item name="database" dt:dt="string" count="1"/>
<item name="count" dt:dt="int" count="+"/>
</inputs>
<outputs>
<structure count="*">
<item name="name" dt:dt="string"/>
<item name="price" dt:dt="int"/>
</structure>
</outputs>
</schema>
The invention also contemplates automatic user interface generators. Such user interface generators may employ a technique that would involve the following general steps: First, foreach element in the inputs section, an HTML input field is created. For example, for the above schema, input fields for the "color," "database" and "count" items are created. The resulting created HTML for the user interface would appear as: <html><body ><form > <input name="color" type="text"> <input name="database" type="text"/> <input name="count" type="text"/> </form ></body ></html > Thus, the generated HTML may provide a user interface which will allow the user to enter values for the various inputs to the various widgets. Furthermore, the invention contemplates tools that may examine the inputs and outputs sections of each of the nodes and create suggested hookups. For example, suppose there were two of the above nodes in the tree. The tool could note that the input "name" in one node might logically receive a value from the output "name" of another node--because they have the same name and type, and that furthermore, the "price" and "count" could be connected because they have the same data type. The "price" and "count" connection would be considered less likely and could be diminished in importance using a variety of known standard UI techniques, because the match is only on type, not on type and name. The invention also contemplates an architecture which implements a message bus providing input to the processing elements. Referring to FIG. 5, an exemplary architecture according to the invention may include processing elements 510 and 520, each of which has respective data inputs 512, 522 and respective transformation inputs 514, 524. In addition, message bus inputs 516, 526 are provided on the processing elements for receiving messages sent on the message bus 550. In an exemplary implementation, the message bus inputs 516 and 526 are provided in the form of queries within the tree defining the processing elements 510, 520. FIG. 6A illustrates an XML tree with a message query 610. In this example, the first element in the "unique" tree will be replaced if there is a message on message bus 550 that matches the query "firstitem." That is, if any element at the root of the message bus is named "firstitem." The message query 610 in this example is provided using XSL pattern syntax. However, it will be recognized by those of ordinary skill that message query 610 could use any query syntax, such as SQL. It will be recognized that any element in a particular tree may have messages associated with it. FIG. 6B illustrates another exemplary XML tree in which three of the elements within the "unique" element in the data tree have message queries. In this case, two elements, the "apples" and "oranges" elements are associated with the same message query "firstitem," while the other element "bananas" is associated with a different message query "seconditem." Also, the "unique" element itself may be replaced if there is a message matching the query "unique" on the message bus 550. FIG. 6C illustrates another exemplary XML tree with more expressive message queries. In this example, the "unique" tree will be replaced if there is a message with a "unique" element having an identification attribute greater than 10 or if there is a "search" element with a "unique" child. Those of ordinary skill in the art will recognize that messages may be generated by a variety of different mechanisms. For example, messages may be generated in response to events occurring in a particular system. Messages may be posted to a page on a server or used to establish or preserve state. For example, consider a web page where the customer is placing an order. The server constructs a set of HTML enabling the user to place an order. The user interacts with that HTML and causes a response to be sent to the server. Because the web is by nature a stateless disconnected system, the response sent to the server needs to send a complete set of information enabling the server to reestablish state (the state of placing a specific order for a specific customer). In a similar fashion with the architecture discussed here, the placing of an order can be seen as a message entering the tree. The message can cause some type of execution. The message can also have enough information passed with it to indicate the name of the user, the order being placed, how it relates to previous order processing and so forth, so that the execution tree can reestablish a previous condition. The exemplary messaging described above permits dynamic modifications to the overall process defined by the processing elements in a particular tree. For example, the data source may be changed based on messages that are generated on the message bus 550 in response to particular user actions, i.e., mouse clicks, on a user interface. It will also be recognized that messaging by the invention represents a loose "handshake" in which the message specifies services and the processing elements may select which services to respond to based on the way the processing element is configured for messaging. This contrasts sharply with known "tight handshake" mechanisms such as RPC (Remote Procedure Calls) in which one program can request a service from a program located in another computer in a network without having to understand network details. The messaging employed by RPC's, however, requires that a globally unique identifier target be specified and thus, the message specifies a specific method on a specific object. There are many advantages to a loose handshake. In particular, each side of the handshake need not understand details of the other side. The web page, for example, could request that a search occur. It need not understand whether a single or several modules react to the search. The server side could change the underlying search approach, add new search modules, and so forth, without having to notify the client. Thus, there is more flexibility. Furthermore, the loose approach enables a network of components to react to a request. For example, several parts of the tree can react to a search request and alter data flow. With a tight handshake, such reconfigurations become very complex and code intensive. FIG. 7 illustrates an exemplary execution process for an execution engine for processing a data flow architecture according to the invention. At step 710, the input tree is loaded by the execution engine. The process then proceeds to step 711, where the execution engine determines the deepest node level and then, at step 712, finds all of the child nodes at that current level. At step 714, the execution engine examines the first node at the current level and makes a determination as to whether or not the node is a processing element node. If not, the process, at step 730 copies the node to the output tree and proceeds to step 732 where a determination is made as to whether or not more child nodes exist at the current depth. If so, the process proceeds to step 734 to evaluate the next node and then branches back to step 714. If no more child nodes exist at the current depth, the execution engine unwinds the recursion to the next level up and branches to step 712. If at step 714, the execution engine determines that the currently examined node is a processing node, the process branches to step 720 where the processing node's input trees, i.e., the data input tree and transformation input tree, are traversed to determine the presence of message hook-ups or message queries. If message queries are found at step 722, the process branches to step 724 where the data input node or transformation input node are replaced with the message tree corresponding to the matched message along with any subtrees. The process then branches to step 726, where the transformation input tree is executed against the data input tree to generate a result tree. If at step 722, no message hookups are found, the process bypasses step 724 and proceeds directly to step 726. After the result tree is generated at step 726, the processing node is replaced with the result tree at step 728 and the process continues to step 732. At step 732, a determination is made as to whether or not more child nodes exist at the current depth of recursion. If not, the process branches to step 736 to determine whether or not the current depth is at the top level of the input tree, if so, the process terminates; if not, the process at step 738 unwinds the recursion from the current depth up one level and then branches to step 712 where all of the child nodes at the next higher level are found. The invention also contemplates a caching system based on the above-described architecture. The declarative nature of the data tree, transformation tree, message tree and processing element output trees, provides for efficient dependency analysis on the architecture to determine which processing elements must be reevaluated upon a change in data, transformation input or messages. Specifically, after an architecture is evaluated for the first time, a cache may be provided for storing the evaluation of each individual processing element. Upon a change in data, messages, or transformation input, the reevaluation of particular processing elements can be controlled based on the dependency analysis, thus avoiding the unnecessary reevaluation of processing elements which have not changed. As an example, consider the tree shown in FIG. 2A. The end result of executing the tree is an HTML file containing a table with two rows, as shown in FIG. 2C. Note that the first row depends upon data coming from "index.xml" and that the second row depends upon data coming from "bookinfo.xml." When executing this tree, the results of the first row can cached and the results of the second row can be cached. If the tree is reexecuted and "bookinfo.xml" and "index.xml" have not changed, and the transformation inputs have not changed, both rows can be retrieved from the cache. If "bookinfo.xml" has changed, the "bookinfo" processing element 220 is reevaluated and thus the associated row is recomputed. Yet, if "index.xml" has not changed, the associated row can be retrieved directly from the cache without performing any computations. The caching features of the invention may be accomplished by various known techniques, for example those employed in spreadsheet dependency analysis. Because the dependencies between nodes are specified declaratively, a dependency structure may be built such that, given a particular node, the nodes dependent on that node may be indicated, That is, the nodes that would be affected by a change to a given node may be determined. For example, one could construct a tree where the tree contained one node for each processing entity node in the source tree, and where each of these nodes had a set of pointers to the other nodes in the tree affected by their changes and the current cached result of evaluating the node. Under the above described dependency analysis, when a particular node is changed, the result of the entire tree may be regenerated according to the following general steps, for example: first, the node that has changed is found. Then, all of the nodes linked to the changed node are traversed recursively. During this traversal, each time a linked node is encountered, that linked node needs to be reevaluated rather than being retrieved from the cache and all of its dependent nodes then need to be reevaluated. In this manner, a change will cause updates to all of the nodes affected by the change, but not to any nodes not affected by the change. After all of the results that need to be reevaluated have been updated, the complete tree evaluation can be found by simply aggregating the current cached results from the tree. The dependency of a particular node can be stored and caching controlled according to the stored dependencies. As will be recognized by those of ordinary skill, the invention contemplates more complex examples involving more levels of caching. Moreover, the invention contemplates caching based on changes that occur due to messages on the message bus. Such an approach is similar to the caching approach discussed above. A dependency tree is built that indicates what needs to be reevaluated every time there is a change. When an item changes on the message bus, the corresponding node is found in the dependency tree, and the items that need to be updated are updated. Although the invention has been described above, it should be appreciated that a variety of modifications will be readily available to persons utilizing the invention. The foregoing description is not intended to be limiting, but is merely illustrative of an exemplary adaptation of the invention. Other products, apparatus and methods which incorporate modifications or changes to that which has been described herein are equally included within this application.
APPENDIX
(Listing for "awardtemplate xml")
<data>
<table cellpadding="2" cellspacing="0" align="right" border="0">
<tr valign="top">
<td width="15"></td>
<td colspan="2" bgcolor="#006666">
<font size="-1" color="#FFFFFF" face="arial, helvetica, sans-
serif"><b>Awards</b></font></td>
</tr>
<tr valign="top">
<td width="15"></td>
<td colspan="2" bgcolor="#EEEECC"></td>
</tr>
<for-each src="award">
<tr valign="top">
<td width="15"></td>
<td width="10" bgcolor="#EEEECC"><img
src="round4_subj.gif" alt="."
width="4" height="4" hspace="3" vspace="6" border="0" /></td>
<td width="100%" bgcolor="#EEEECC"><font size="-1"
color="#000000"
face="arial, helvetica, sans-serif"><a
href="javascript:post(`$(search)`)"><data
src="$(name)"/></a></font></td>
</tr>
</for-each>
<tr valign="top">
<td width="15"></td>
<td colspan="2" bgcolor="#EEEECC"></td>
</tr>
<tr valign="top">
<td colspan="3"></td>
</tr>
</table>
</data>
Listing for "bind.xsl."
<transform>
<xst:template xmlns:xst="uri:xsl" xmlns:xsl="uri:foo">
<xst:script>
<![CDATA[
function generateToken(szTok, blnFunction) {
var szRes;
if ((szTok == null) .parallel. (szTok == " ")){
return" ";
}
//TODO: for demo purposes I'm finding the text value
//of whatever gets passed in. But, really we should let
//the object be passed in and change the item appropriately
//for the built in functions that require text.
if (blnFunction) {
szRes = "this.selectSingleNode(`" + szTok + "`).text";
}
else {
szRes = `<xsl:value-of select=`" + szTok + "`/>;
}
return szRes;
}
function findMatch(szStr, nStart) {
var nParen;
var nLen;
nParen = 1;
nLen = szStr.length;
for (i = nStart + 1; i < nLen; i++) {
if (szStr.charAt(i) == "(") {
nParen++;
}
else if (szStr.charAt(i) == ")") {
nParen--;
if (nParen == 0) {
return i;
}
}
}
return -1;
}
//Finds the next separating character, in this case a $
//We'll use $$ to escape the $ character so that it can be used within
strings
//Could have tracked when we were in a string, but doing it this way
allows us to
bind stuff
//into string content
function findNextSep(szStr, nStart) {
var nOff;
nOff = szStr.indexOf("$", nStart);
while (nOff != -1) {
//Is the next one a $ too?
if (szStr.charAt(nOff+1) == "$") {
//TODO: snarf one of the characters out of the string
nOff = szStr.indexOf("$", n Off+2);
}
else
return nOff;
}
return -1;
}
//Should have made this an LR parser; especially now that I need to
recognize
XSL keywords
function parseArgs(szStr, blnFunction) {
var szRes = " ";
var nDollar;
var nParenOpen;
var nParenClose;
var nSeparator;
var nStart = 0;
//Walk through each token
//Find the first $ in the string
nDollar = findNextSep(szStr, nStart);
//Copy all the stuff before the $
szRes += szStr.substr(nStart,nDollar);
while (nDollar !- -1) {
//Move ahead in the string
nStart = nDollar;
//Now find the (
nParenOpen = szStr.indexOf("(", nStart);
//Find the )
nParenClose = findMatch(szStr, nParenOpen);
//Make sure we have both
if((nParenOpen == -1) .parallel. (nParenClose == -1)) {
alert("malformed string" + szStr);
return null;
}
//Do we have a function?
if (nParenOpen > nStart + 1) {
//If we are not in a function
if (!blnFunction) {
//Add the first eval tag
szRes += "<xsl:eval>";
}
//If we are in a function already, just emit the
function
name
szRes += szStr.substr(nStart+1, nParenOpen - nStart);
szRes += parseArgs(szStr.substr(nParenOpen + 1,
nParenClose - nParenOpen - 1), true);
//Now add the ending )
szRes += ")"
if (!blnFunction) {
szRes += "</xsl:eval>"
}
}
else {
//Generate the token
szRes += generateToken(szStr.substr(nParenOpen +
1, nParenClose - nParenOpen - 1), blnFunction);
//Walk beyond the ending )
}
nStart = nParenClose + 1;
nDollar = findNextSep(szStr, nStart);
//Copy all the stuff in the middle
szRes += szStr.substr(nStart, nDollar - nStart);
}
//Copy the rest
szRes += szStr.substr(nStart);
return szRes;
}
function parse(str, flag) {
return parseArgs(str.nodeValue,flag);
}
function strip(str) {
return str.nodeValue.substr(1);
}
]]>
</xst:script>
<xsl:template><xst:attribute
name="xmlns:xsl">uri:xsl</xst:attribute>
<xsl:script>
<![CDATA[
function discount(ret, disc) {
nRet = ret;
nDisc = disc;
res = formatNumber(nRet - nDisc, "$#.00") + " " +
formatNumber((nRet-
nDisc)/nRet,"(##%)");
return res;
}
]]>
</xsl:script>
<xst:apply-templates select="*"/>
</xsl:template>
<xst:define-template-set>
<xst:template match="*">
<xst:copy>
<xst:apply-templates select="@*"/>
<xst:apply-templates/>
</xst:copy>
</xst:template>
<xsl:template match = "textnode( )">
<xsl:value-of/>
</xsl:template>
<xsl:template match = "cdata( )">
<xsl:value-of/>
</xsl:template>
<xst:template match="@*">
<xsl:attribute><xst:attribute
name="name"><xst:node-
name/></xst:attribute>
<xst:eval no-entities="t">parse(this,false)
</xst:eval></xsl:attribute>
</xst:template>
<xst:template match="data">
<xst:for-each select="@src">
<xst:eval no-entities="t">parse(this,false) </xst:
eval>
</xst:for-each>
<xst:apply-templates/>
</xst:template>
<xst:template match="for-each">
<xsl:for-each>
<xst:attribute name="select"><xst: value-of
select="@src"/></xst:attribute>
<xst:apply-templates/>
</xsl:for-each>
</xst:template>
</xst:define-template-set>
</xst:template>
</transform>
|
Same subclass Same class Consider this |
||||||||||
