Case-insensitive custom tag recognition and handling
6675354

Abstract

A method for processing custom tags in a document object model (DOM) representation irrespective of the case in which the tags are authored. In an illustrative embodiment, a document object model (DOM) tree is processed to identify custom tags. Upon encountering a custom tag, an appropriate tag handler (e.g., a Java object, an XSL stylesheet, or the like) is invoked. According to the invention, a tag recognition routine is used for recognizing and handling case-insensitive custom tags. As a servlet engine is examining a tag name, if the name does not match one of the registered tags, the routine converts the name to neutral case. If the tag recognition routine recognizes the name as one of the case-insensitive tags, it converts the attributes to the appropriate case, and hands the resulting element off to a correct tag handler for processing.

Claims

What is claimed is:

Description

This application contains subject matter protected by Copyright Law. All rights reserved.
<jsp:root xmlns:sample="/xsp/sample/sample-taglib.xml">
</jsp:root>
Here, a namespace "sample" is defined by the relative URL of
"/xsp/sample/sample-taglib.xml".
The content at the URL would look as follows:
<?xml version="1.0"?>
<taglib>
<tag name="currentTime"
class="xsp.sample.CurrentTimeTagBean"
dtd="currentTime.dtd"/>
<tag name="currentTimeXSL"
styleSheet="http://localhost/xsp/sample/currentTimeCustomTag.xsl"
dtd="currentTime.dtd"/>
</taglib>
This registers a Java TagBean to handle sample:currentTime tags and registers currentTimeCustomTag.xsl as an XSL handler for sample:currentTimeXSL. As a bonus, both register currentTime.dtd to verify the correctness of the sample:currentTime and sample:currentTimeXSL tags before the formatting of those tags occurs.

Returning now to FIG. 4, the routine then continues at step 410. In particular, any additional tag libraries are gathered and registered with a main tag handler. According to the present invention, a tag handler is a process that is used to hand off execution (of a custom tag in the DOM) to some other process, e.g., a Java object (through a custom tag interface), or via an XSL stylesheet. The routine then continues at step 412 by inserting a _jspService methodDefinition as the last child of the jsp:root element. The methodDefinition preferably has the content of most of the servlet code. As will be described in more detail below, the jsp:block element is appended to the methodDefinition to begin the servlet's definition as a Java code block by default, or with the value of the jsp page directive attribute (if it is specified).

The main processing of the routine begins at step 416. At this step, a test is made to determine whether the DOM tree has any custom tags that have not been processed. If so, the routine branches to step 418 to locate, preferably, the left-most leaf node of the tree that satisfies its requirements as a custom tag. The routine then continues at step 420 to invoke an appropriate tag handler (e.g., a Java object, the XSL stylesheet, or the like). Two variations of step 420 are illustrated in the flowcharts of FIGS. 5-6. As will be seen, the appropriate tag handler is given a current tag element and is expected to process it. Typically, a new DOM is returned from the tag handler and is then processed for new tag libraries. At step 422, the new tag libraries are registered in the tag library registry.
The routine then continues at step 424 to gather all new jsp:directive tags. The routine then returns to step 416. This loop continues until all custom tags have been processed in a like manner. When the outcome of the test at step 416 is negative, the routine branches to step 426 to collapse the resulting DOM tree into as few method blocks as possible. This operation is illustrated in FIG. 10. The routine then continues at step 428 to generate the servlet by interpreting scriptlet, expression, declaration and methodDefinitions. Thus, for example, at step 428, the routine interprets the DOM by replacing "jsp:scriptlets" with inlined code, by replacing "jsp:expressions" with prints, by replacing "jsp:declarations" with variables and method definitions, by replacing "xsp:methodCalls" with method calls, and by replacing "xsp:methodDefinitions" with method definitions. As described in FIG. 2, the resulting servlet is then compiled and class loaded to be ready for use by the runtime code.

Custom Tag Definition

Preferably, the DOM tree identifies custom tags as follows. As noted above, the JSP 1.0 specification included a tag library mechanism that defines how to plug in a tag. The specification, however, left the details of the taglib mechanism completely open, with the exception that a URL must be used to specify the location of the taglib. According to one aspect of this disclosure, this abstract concept has been unified with an XML namespace and mapped into an XML file. In addition, case-sensitive language support has been added as will be seen. In an illustrative embodiment, the Document Type Definition (DTD) for a taglib according to the present invention is:
<!ELEMENT taglib (tag)*>
<!ELEMENT tag EMPTY>
<!-- Note: either class or styleSheet must exist (but preferably not
both) -->
<!ATTLIST tag
name CDATA #REQUIRED
class CDATA #IMPLIED
styleSheet CDATA #IMPLIED
caseSensitive (true|false) "true"
dtd CDATA #IMPLIED>
This data structure defines that a tag library is composed of zero or more tags. Each tag is defined with a name and the optional attributes class, styleSheet, and dtd. A tag must have either a class or a styleSheet attribute (but preferably not both) to specify a correct tag handler. The value of name is used to identify what tags it handles. If a tag's name is "scriptlet" and its taglib prefix is "jsp", then this rule handles any tag named "jsp:scriptlet". The class and styleSheet attributes dictate which of the two mechanisms is used to handle a custom tag. If the rule specifies a class, then a Java object, e.g., an object satisfying a custom tag interface, is used to process the tag. If the rule specifies a styleSheet, then an XSLT stylesheet is used to process the tree. The optional dtd parameter is used to verify the contents of the custom tag, for example, whether the tag has the proper attributes.

Case-Sensitive Tag Handler

XML is case-sensitive. By contrast, HTML and other known tag mechanisms are case-insensitive. As a consequence, web page authors often write tags in whichever case suits the author. According to the present invention, the caseSensitive attribute is added to the taglib definition to facilitate handling of tag case sensitivities. In particular, when the servlet engine reads the taglib definition, preferably it registers two entries for a case-insensitive tag and one entry for a case-sensitive tag. Specifically, if the tag is not case-sensitive, the value of the name attribute is registered together with an altered (e.g., lower, neutral or upper) case version of the name prepended with a given symbol (e.g., "<") that is not a valid character in an XML attribute. A neutral case is any case that can be found or created deterministically by a given algorithm. If the tag is case-sensitive, however, only the name attribute is registered. The following is an example of this registration process for an illustrative taglib:
<!-- contents of case-taglib.xml for namespace "case:" -->
<taglib>
<tag name="FOO"
class="com.ibm.xsp.Foo"
caseSensitive="false"/>
<tag name="BAR"
class="com.ibm.xsp.Bar"
caseSensitive="true"/>
</taglib>
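As a minimal sketch of the registration rule just described, the following Java fragment registers each tag under its authored name and, for case-insensitive tags only, under a second key formed by prepending "<" to the lower-cased name. The class name CaseTagRegistry, its methods, and the use of a plain map of handler class names are assumptions made for illustration; they do not appear in the original disclosure.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical registry for the tags of a single taglib prefix (e.g. "case:").
class CaseTagRegistry {
    // Maps a lookup key to the name of the handler class.
    private final Map<String, String> handlers = new LinkedHashMap<>();

    // Registers a tag under its authored name; a case-insensitive tag
    // additionally gets a neutral-case key.
    void register(String name, String handlerClass, boolean caseSensitive) {
        handlers.put(name, handlerClass);
        if (!caseSensitive) {
            handlers.put(neutralKey(name), handlerClass);
        }
    }

    // Neutral case chosen here: "<" + lower-case(name). The "<" prefix keeps
    // the key from ever matching a name that an author could write as a tag.
    static String neutralKey(String name) {
        return "<" + name.toLowerCase(Locale.ROOT);
    }

    Map<String, String> entries() {
        return handlers;
    }

    public static void main(String[] args) {
        CaseTagRegistry registry = new CaseTagRegistry();
        registry.register("FOO", "com.ibm.xsp.Foo", false);
        registry.register("BAR", "com.ibm.xsp.Bar", true);
        // Print every registered key with its handler class.
        registry.entries().forEach((key, handler) ->
                System.out.println(key + " -> " + handler));
    }
}
```

Locale.ROOT is used so that lower-casing does not depend on the server's default locale, which is one way to keep the neutral case deterministic.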
In the engine, the following attribute values are then registered under the prefix "case:" for the illustrative example:

"FOO" → TagHandler type class "com.ibm.xsp.Foo"
"<foo" → TagHandler type class "com.ibm.xsp.Foo"
"BAR" → TagHandler type class "com.ibm.xsp.Bar"

As can be seen, there are two entries for the case-insensitive tag and one entry for the case-sensitive tag. A tag named "<foo" cannot itself be registered by a page author because "<" is an invalid character in an XML attribute, so the neutral-case entry does not collide with an authored name. Thus, in the preferred embodiment, a case-neutral tag is prepended with a character that preferably is invalid in XML, e.g., prepend "<"+lower-case(name). This algorithm allows the page author to have multiple definitions for similar names; e.g., "FOO" and "foo" can both be defined; however, only one of the definitions can be case-insensitive.

As described above, the algorithm for processing custom tags includes the steps of: walking the DOM tree, locating the custom tags, and then invoking the appropriate tagbean handler for each identified tag. These were steps 416, 418 and 420 in FIG. 4. The following logic is used to recognize the case-sensitive custom tags and to call the appropriate tagbean handler according to the present invention. For convenience, the inventive functionality for processing the case-sensitive tags is set off in italics:
public boolean
isCustomTag(String prefix,
            String tagName)
{
    Hashtable hashtable = getPrefixTable(prefix);
    if (hashtable == null) {
        // prefix is not registered
        return false;
    }
    // exact (case-sensitive) match on the registered name
    if (hashtable.containsKey(tagName)) {
        return true;
    }
    // case-insensitive fallback: try the neutral-case key
    String caseInsensitive = "<" + tagName.toLowerCase();
    if (hashtable.containsKey(caseInsensitive)) {
        return true;
    }
    return false;
}
public TagHandler
getCustomTagHandler(String prefix,
                    String tagName)
{
    Hashtable hashtable = getPrefixTable(prefix);
    if (hashtable == null) {
        // prefix is not registered
        return null;
    }
    TagHandler handler = (TagHandler) hashtable.get(tagName);
    if (handler == null) {
        // case-insensitive fallback: try the neutral-case key
        String caseInsensitive = "<" + tagName.toLowerCase();
        handler = (TagHandler) hashtable.get(caseInsensitive);
    }
    return handler;
}
In this code, the first italicized section (the fallback lookup in isCustomTag) is executed if the tag name is not in the hash table. Under this circumstance, the code creates a string caseInsensitive="<"+tagName.toLowerCase(), and then checks to see if the resulting string is in the hash table. If so, the routine returns true; otherwise, the routine returns false. As noted above, if the tag is not case-sensitive, there will be a lower-case name attribute prepended with the invalid XML character as a result of the registration process. Thereafter, the tag handler routine is executed. This is the second italicized section above (the fallback lookup in getCustomTagHandler). If there is no handler in the hash table registered for the name, then the code builds a string caseInsensitive="<"+tagName.toLowerCase() as before. The routine then checks again to see whether there is a handler registered for this string name. If so, the routine returns the handler, which is then executed in the usual manner. According to the inventive routine, attributes of the custom tag are also converted to an appropriate case and passed to the given tag handler. Given the above example, when the engine does not find "foo" (because only "FOO" is registered), the routine then looks for "<foo", finds it, and returns the proper tag handler. Thus, given the following representative XML page, walking the DOM tree to invoke the appropriate tag handlers generates the following "conversation" between the servlet engine and its tag recognition routine:
<jsp:root xmlns:case="/xsp/case/case-taglib.xml">
<case:foo>foo</case:foo>
<case:BAR>bar</case:BAR>
<case:bar>bar</case:bar>
<case:FOO>foo</case:FOO>
</jsp:root>
Q. Is "case:foo" a custom tag? A. No.
Q. Is "case:<foo" a custom tag? A. Yes. TagHandler="com.ibm.xsp.Foo"
Q. Is "case:BAR" a custom tag? A. Yes. TagHandler="com.ibm.xsp.Bar"
Q. Is "case:bar" a custom tag? A. No.
Q. Is "case:<bar" a custom tag? A. No.
Q. Is "case:FOO" a custom tag? A. Yes. TagHandler="com.ibm.xsp.Foo"

The italicized code is the tag recognition routine. As the servlet engine is examining a tag name, if the name does not match one of the registered tags, the routine converts the name to neutral case. If the routine recognizes the name as one of the case-insensitive tags, it converts the attributes to the appropriate case, and hands the resulting element off to the correct tag handler (i.e., the target) for processing. The target tag handler is written to the standard tag handler API and behaves like any other tag handler.

The present invention is quite advantageous. It enables the support of legacy tags (e.g., a server side include <SERVLET> tag) in a case-insensitive way even though XML is case-sensitive. Using the present invention, the engine can take any case-sensitive tag, write an intermediary bean, and make the tag case-insensitive without changing any code in the case-sensitive tag handler.

Other Tag Handling Routines

As noted above, FIGS. 5-6 illustrate the preferred tag handling routines. As described, there are two types of tag handling: tags handled by XSLT (FIG. 5) and tags handled by Java code (FIG. 6). Each of these handlers will now be described in more detail.

XSL Custom Tag Handler

The XSL tag handler routine begins at step 502 by receiving the DOM element as input. At step 504, the tag handler then finds the appropriate stylesheet for the element as supplied by the taglib rules. The routine then continues at step 506 with the tag handler obtaining the element's parent document. At step 508, the routine invokes a stylesheet processor on the document and stylesheet. Finally, at step 510, the tag handler returns the new document to complete the translation.

Java Object Custom Tag Handler

FIG. 6 illustrates a preferred operation of the custom DOM tag Java object handler. By way of brief background, as used herein, a "tagbean" is a Java object that implements a TagBean interface. Preferably, the interface according to the invention is as follows:
public interface TagBean
{
public void
process(Element element);
}
The TagBean interface defines a process method that takes an element in from the DOM tree and performs some function against that element. The context of the entire DOM tree is available to the process method for manipulation through the DOM APIs.

The routine begins at step 602 with the Java tag handler receiving the DOM element as input. At step 604, the handler then obtains the appropriate tagbean for the element as supplied by the taglib rules. A number of tagbean routines are illustrated in FIGS. 7-9 and will be described in more detail below. At step 606, the handler extracts an attributeList from the element. The routine then performs a test at step 608 to determine whether there are any unprocessed attributes. If so, the routine branches to step 610 to determine whether the attribute maps to a setter property on the tagbean. If the outcome of the test at step 610 is negative, control returns to step 608. If, however, the outcome of the test at step 610 is positive, the routine continues at step 612 to set the value of the attribute on the tagbean. Control then continues at step 614 to remove the attribute from the attributeList. Processing then continues back at step 608. Thus, for each attribute in the attributeList, the handler checks for a corresponding setter property on the tagbean. If a corresponding setter property exists, the value of the attribute is set on the tagbean and the attribute is removed from the attribute list.

When the outcome of the test at step 608 indicates that all attributes have been checked against the tagbean, the routine branches to step 616. At this step, the tagbean's process method is called given the DOM element so that it can manipulate the tree in whatever manner it deems fit. When tagbean.process() is complete, the new document is returned from the tag handler at step 618. This completes the processing. FIGS. 7-9 illustrate tagbeans that are useful in the present invention.

DOM In, Text Out Tagbean

FIG. 7 illustrates a simple DOM in, Text out macro that has the following class:
public abstract class SimpleTagBean implements TagBean
{
public abstract String
translateElement(Element element);
public final void
process(Element element);
}
SimpleTagBean is a class created to simplify the task of writing a tagbean. Using this class, the developer merely has to implement the translateElement method, which takes in a DOM element and returns the corresponding text macro expansion. In particular, the routine reads the DOM tree (e.g., using the DOM APIs), produces an XML block (typically a scriptlet), and uses the XML block to replace the current element in the tree. This is advantageous to the writer of the tagbean because, using the invention, he or she does not need to know how to create new nodes and to populate them with values. All the writer has to do is create an XML expanded form of the element passed in. While this approach requires increased execution time at translation, translation happens only once each time the page changes; thus, the technique has little impact on server performance.

The SimpleTagBean class works as demonstrated in the flowchart of FIG. 7. The routine begins at step 702 with the Java tag handler calling SimpleTagBean.process with the appropriate tag element. At step 704, the SimpleTagBean hands the element off to its subclass's overridden translateElement method. In the translateElement method, at step 706, the subclass looks at the element and its sub-elements and attributes to produce a text macro expansion of the node. The routine then continues at step 708 with the text expansion being returned to the SimpleTagBean.process method. At step 710, the XML is parsed back into DOM. At step 712, the top node of the new document object replaces the element that was passed in from the previous document. In particular, in step 712, the top node of the new DOM replaces the element that was passed into translateElement(). This completes the processing.

Text In, Text Out Tagbean

FIG. 8 illustrates a Text in, Text out tagbean that may be used to isolate the developer from the DOM API.
This is especially useful if the element contains only simple information or does simple query string replacement. A representative class is as follows:
public abstract class TextTagBean extends SimpleTagBean
{
public abstract String
translateText(String text);
public final String
translateElement(Element element);
public final void
process(Element element);
}
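The class listings above declare process without a body. As a hedged illustration of the SimpleTagBean flow of FIG. 7 (steps 702 to 712), the sketch below fills in process using the JDK's standard DOM parser. The outer class SimpleTagBeanSketch, the parse helper, and the GreetingTagBean subclass with its hello element and name attribute are hypothetical names introduced only for this example; the disclosure does not specify how the parsing or node replacement is implemented.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class SimpleTagBeanSketch {

    interface TagBean {
        void process(Element element);
    }

    abstract static class SimpleTagBean implements TagBean {
        // Implemented by the tag author: DOM element in, XML text out.
        public abstract String translateElement(Element element);

        @Override
        public final void process(Element element) {
            // Steps 704-708: obtain the text macro expansion from the subclass.
            String xml = translateElement(element);
            // Step 710: parse the expansion back into a DOM.
            Document expanded = parse(xml);
            // Step 712: the top node of the new DOM replaces the original element.
            // importNode is needed because the node comes from another document.
            Node replacement = element.getOwnerDocument()
                    .importNode(expanded.getDocumentElement(), true);
            element.getParentNode().replaceChild(replacement, element);
        }
    }

    // Hypothetical tag author code: expands <hello name="..."/> into <greeting>.
    static class GreetingTagBean extends SimpleTagBean {
        @Override
        public String translateElement(Element element) {
            return "<greeting>Hello, " + element.getAttribute("name") + "</greeting>";
        }
    }

    // Helper (an assumption of this sketch): parse an XML string into a DOM.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Document page = parse("<page><hello name=\"world\"/></page>");
        Element hello = (Element) page.getDocumentElement().getFirstChild();
        new GreetingTagBean().process(hello);
        // The hello element has now been replaced by the expanded element.
        Node expanded = page.getDocumentElement().getFirstChild();
        System.out.println(expanded.getNodeName() + ": " + expanded.getTextContent());
    }
}
```

The subclass only builds a string; all node creation stays inside process, which is the division of labour the SimpleTagBean description aims for.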
TextTagBean extends the SimpleTagBean functionality. In particular, the TextTagBean extends the SimpleTagBean class and implements the translateElement function to inherit the String-to-DOM output functionality. Instead of the developer writing translateElement, however, he or she now writes translateText.

Referring now to FIG. 8, the routine begins at step 802 with the Java custom DOM tag handler handing SimpleTagBean.process the appropriate element. At step 804, the routine hands the element off to the overridden translateElement method. At step 806, the translateElement method converts the DOM directly into its corresponding XML. In particular, TextTagBean.translateElement() takes the element and flattens it into XML without interpreting any of the XML. The routine then continues at step 808, with the XML then being passed to the translateText method of the subclass. At step 810, the translateText method reads the string and processes it to return a new XML string. In particular, translateText looks at the XML string and manipulates it to produce another text representation of it and returns this representation to TextTagBean.translateElement(). At step 812, the new XML string is returned to TextTagBean.translateElement, which returns the string to SimpleTagBean.process. SimpleTagBean.process finishes the processing at step 814, by turning the string into DOM and, at step 816, by replacing the previous element with the root of the new document. Thus, in step 816, the top node of the new DOM replaces the element that was passed into translateElement(). This completes the processing.

Multiple Scripting Language Blocks

Another tagbean is illustrated in FIG. 9. This routine, called jsp:block, enables page developers to use multiple scripting languages in the same page. As will be seen, this enables people with different skill sets to add value to the same page. It also enables the developer to choose another language that might be more suitable for a specific job.

The routine begins at step 902 with each jsp:block handed off to the JSPBlockTagBean. At step 904, the JSPBlockTagBean chooses the appropriate BlockTagBean according to the language attribute of the jsp:block element. At step 906, the language-specific BlockTagBean creates a methodDefinition element which, at step 908, is then filled with code to set up an appropriate runtime environment for the target language. At step 910, the methodDefinition element is inserted as a child of the root element in the document. The routine then continues at step 912 to create a methodCall element to replace the original jsp:block element.

The present invention provides numerous advantages over the prior art. The inventive mechanism enables multiple scripting languages in a Java server page, and it enables a developer to embed one scripting language within another, which is a function that has not been available in the known art.

DOM Tree Processing

FIG. 10 illustrates a preferred routine for collapsing the DOM tree into the fewest possible methodCalls. The routine begins at step 1002 to test whether there are any unprocessed methodCalls in the document. If not, the routine is done. If, however, the outcome of the test at step 1002 is positive, the routine continues at step 1004 by setting a variable mc equal to the right-most unprocessed leaf node that is a method call. At step 1006, the routine sets a variable collapse equal to an attribute mc.getAttribute(collapse). At step 1008, the collapse attribute is checked. If this attribute is not true, control returns to step 1002. If the outcome of the test at step 1008 is positive, then the contents of the corresponding methodDefinition are expanded in place, and the methodDefinition and methodCalls are removed from the tree.
In particular, the routine continues at step 1010 by setting a variable md equal to the methodDefinition for the methodCall. At step 1012, a test is run to determine whether any child nodes exist in the methodDefinition element. If not, the routine branches to step 1014 to remove mc from the document, and control returns to step 1002. If, however, the outcome of the test at step 1012 is positive, the routine continues at step 1016 to let c equal the last child node in the methodDefinition. At step 1018, c is removed from the methodDefinition. The routine then continues at step 1020 to insert c before mc in the document. Control then returns back to step 1012. This completes the processing. For optimization purposes, it is desired to verify context between multiple related XML tags in a DOM. One or more of these related XML tags are custom tags within the context of the inventive framework. By way of brief background, when processing a single custom tag element, that element may need access to all other related tags, processed and unprocessed, within the DOM. Unfortunately, however, there may be other unprocessed custom tags in the DOM that, when processed, would result in one or more related tags the current element is interested in. One solution to this problem is to pass some state information from the current element through the page handling engine. A preferred technique, however, is to use the DOM itself to indicate state. Clean-up Processing FIG. 11 is a flowchart illustrating this clean-up processing. The routine begins at step 1102 during the processing of the DOM tree with a current element being processed replacing itself with a placeholder element. The placeholder element includes attributes indicating its state. At step 1104, a test is performed to determine if a clean-up element already exists for the element being processed. If not, the current element then creates a clean-up element at step 1106. 
At step 1108, this clean-up element is added to the DOM in a position where it will be processed after all elements related to the current element have been processed. Thus, for example, the clean-up element is added to the DOM as a child node to the root position. If the outcome of the test at step 1104 indicates that such a clean-up element already exists, the current element need not create another clean-up element; rather, the current element need only move the existing clean-up element later in the DOM to ensure it is processed after any other related elements might be processed. This is step 1110. When the processing routine finally encounters the clean-up element, as indicated by a positive outcome of the test at step 1112, this element scans the entire DOM for all the related tags (now placeholders) of interest. This is step 1114. At step 1116, the clean-up element loads the state information from each and, at step 1118, processes them accordingly. When complete, at step 1120, the clean-up element removes itself from the DOM. In this way, the technique shifts processing from each individual element to a single, last-processed element. Thus, in the preferred embodiment, a two-pass solution is implemented. In the first pass, simple translation is performed on the tag, creating new tag place holders to be handled by a clean-up phase. For example, assume the DOM includes the following tags: system:macro1, system:macro2, and system:macro3. It is also assumed that each relies on specific information from other tags but not all the information is available until all of them have been touched once. On the first pass, system:macro1 expands to _system_macro1 and performs all the metadata expansion it can perform at this time to assist the clean-up node. At this time, it also inserts a system:cleanup in the tree as the last child of jsp:root (assuming it is not already there). The second pass is triggered when the clean-up node is hit. 
For proper processing, it should check to make sure the first pass has completed (no system:macro1 or macro2 or macro3 tags in the tree). If other clean-up nodes exist in the tree, it should remove itself from the tree and let the other nodes handle the clean-up later. Once the clean-up node has determined that the tree is in the correct state, it goes through all the artifacts left by the first process and expands them with all the context available. Tagbean Code Reduction Another optimization reduces the amount of code in the tagbeans. By way of background, if a developer expands everything necessary to perform a function of a tag, that process may produce large amounts of code. In particular, the writing of custom tagbeans may result in a large amount of Java code being generated into the resulting servlet. Because this code may be largely common across servlets generated from the same tagbean (variable names might change, but little else), according to the invention, the functionality is delegated to outside code as much as possible. Preferably, the code is factored into a separate Java bean, and the most convenient place to delegate is the very tagbean generating the code. Thus, the tagbean need only generate enough Java code for the servlet to call out to the separate bean. This dramatically reduces the code in the tag bean handler. As a result, this optimization improves maintainability and greatly simplifies debugging. In addition, because the code is not expanded, the function is hidden from anyone who has access to the generated servlet code. In addition, as a separate Java bean, developers are encouraged to put more error-handling code in the system that may not get put in otherwise. It also further stabilizes the system. 
Thus, in a preferred embodiment, instead of doing inline expansion of code, the developer should take runtime values of attributes and sub-elements and generate code to make them parameters of a method on the very same bean that can be called at runtime to do the real work.

The present invention provides numerous advantages over the prior art. In effect, the inventive page handling mechanism combines the manipulation and template mechanism of XSLT with the scripting capabilities of the JSP/ASP model. In addition, the invention provides a framework for enabling any programming language to be plugged into that model. Further, given that most languages are easily defined in Java byte code, the invention is economical to implement in a runtime using, for example, a Java Virtual Machine.

The present invention uses custom DOM tags together with a framework and runtime that provides a powerful macro language to XML/JSP. The custom DOM tags allow a web page author to define a simple markup language tag, e.g., <SHOPPING_CART>, that, at page translation time, is converted into script code by a generic Java object or an XSL stylesheet. This script code is then compiled into Java code and then into a Java servlet, yielding excellent performance servicing a client's request. Because the custom tag replaces the script code in the authored page, the page is kept clean and easy to maintain. The script code is kept separate and, thus, need only be debugged once. Normal ASP development, by contrast, would force this code to remain in the page, and it would have to be debugged after every modification.

The inventive framework is quite advantageous in that it is built on top of XML. Moreover, one of ordinary skill will appreciate that the framework is definable programmatically or with XSL. In addition, macros written according to the invention can affect the output of an entire page and not just the content between a given pair of tags.
As noted above, the inventive mechanism is preferably implemented in or as an adjunct to a web server. Thus, the invention does not require any modifications to conventional client hardware or software. Generalizing, the above-described functionality is implemented in software executable in a processor, namely, as a set of instructions (program code) in a code module resident in the random access memory of the computer. Until required by the computer, the set of instructions may be stored in another computer memory, for example, in a hard disk drive, or in a removable memory such as an optical disk (for eventual use in a CD ROM) or floppy disk (for eventual use in a floppy disk drive), or downloaded via the Internet or other computer network. In addition, although the various methods described are conveniently implemented in a general purpose computer selectively activated or reconfigured by software, one of ordinary skill in the art would also recognize that such methods may be carried out in hardware, in firmware, or in more specialized apparatus constructed to perform the required method steps. Further, as used herein, a Web "client" should be broadly construed to mean any computer or component thereof directly or indirectly connected or connectable in any known or later-developed manner to a computer network, such as the Internet. The term Web "server" should also be broadly construed to mean a computer, computer platform, an adjunct to a computer or platform, or any component thereof. Of course, a "client" should be broadly construed to mean one who requests or gets the file, and "server" is the entity which downloads the file. Having thus described our invention, what we claim as new and desire to secure by letters patent is set forth in the following claims.
