System and methodology providing automated selection adjustment for refactoring
6807548
Abstract
A system providing methods for adjusting a user's selection of source code of a program to correspond with a parsed version of the program is described. The method commences with user input of a selection of source code of a program for performing an operation such as a refactoring. The user's selection is compared with a parsed version of the program. If the user's selection does not correspond with the parsed version of the program, the user's selection is adjusted to achieve correspondence with the parsed version of the program.
Claims
What is claimed is:
Description
COPYRIGHT NOTICE
Code Symbol | Refactoring allowed | Description
Package | Rename | Rename refactoring a package renames the package and the entire sub-tree of packages. The package name cannot already exist in the project.
Class, inner class or interface | Rename, Move | Rename refactoring an outer public class renames the source file. If the source file name already exists in the current package, the refactoring is prevented. If the class is not the outer public class and there is another class of the desired new name, the class is not renamed. Move refactoring a class moves that class to a new package if the new package does not already contain a source file of the new name. The class must be the top level public class.
Method | Rename | Rename refactoring a method renames the method and all references to that method. The method can be renamed in all classes that this class inherits from or in all classes in the hierarchy for the class. A forwarding method can be created.
Field | Rename | Rename refactoring a field renames the field to a new name. The new name cannot already exist in the class that declared the original name. If there are scope conflicts between the new name and the old name, then the this keyword is added to the beginning of the new field name. A warning is displayed if the new name overrides or is overridden by an existing field in a superclass or subclass.
Local Variable | Rename | Rename refactoring a local variable renames the variable to the new name. The new name cannot already exist in the class that declared the original name.
Property | Rename | Rename refactoring a property renames the property as well as its getter and setter. The new name cannot already exist in the class that declared the original name.
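As an illustration of two behaviors summarized in the table above, the following sketch shows what the result of a method rename with a forwarding method, and of a field rename that requires qualification with the this keyword, might look like. The class and member names (Account, bal, balance, getBal, getBalance) are hypothetical and are not taken from the described system.

```java
// Hypothetical result of two rename refactorings; all names are illustrative.
class Account {
    // Field renamed from "bal" to "balance".
    private long balance;

    Account(long balance) {
        // The constructor parameter has the same name as the renamed field,
        // so the field reference is qualified with "this" to resolve the
        // scope conflict.
        this.balance = balance;
    }

    // Method renamed from "getBal" to "getBalance".
    long getBalance() {
        return balance;
    }

    // Forwarding method kept under the old name; it passes the call on to
    // the renamed method so that existing callers continue to compile.
    long getBal() {
        return getBalance();
    }

    public static void main(String[] args) {
        Account account = new Account(42);
        System.out.println(account.getBal() == account.getBalance());
    }
}
```

Keeping the forwarding method is what allows a public API to remain intact while new code adopts the new name.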
c) Move Refactoring
A "move refactoring" is available for moving classes. Move refactoring moves a specified class to a new package. Move refactoring is only allowed on a top-level public class. The package the class is being moved to cannot already contain a source file of the new name. The refactoring must update the declaration of the class, as well as all the usages of that class.
d) Change Parameters
A "change parameters" refactoring allows a user to add, rename, delete and re-order a method's parameters. A newly added parameter may be edited before the dialog creating the parameter is closed; however, an existing parameter cannot be edited.
e) Extract Method
An "extract method" refactoring turns a selected code fragment into a method. The system moves the extracted code outside of the current method, determines the needed parameter(s), generates local variables if necessary, and determines the return type. It inserts a call to the new method in the code where the code fragment resided.
f) Introduce Variable
An "introduce variable" refactoring allows the result of a complex expression, or part of the expression, to be replaced with a temporary variable name that explains the purpose of the expression or sub-expression.
g) Surround with Try/catch
A "surround with try/catch" refactoring adds a try/catch statement around a selected block of code. It detects all checked exceptions in a block and adds specific blocks for each checked exception.
2. Access to Refactoring Features
The refactoring features of the currently preferred embodiment are accessible from the development system's editor context menu, editor menu, search menu, and UML diagram context menu. Before a refactoring, a user can view, by category, all locations in the current project where the selected symbol is referenced, and navigate to the symbol's definition. If the refactoring cannot be completed, the user interface provides warning and error messages to help explain the problem.
In the currently preferred embodiment, warnings do not stop the refactoring. However, if an error is encountered, the refactoring is prevented. For example, a refactoring might be prevented if a file is read-only or if a new name that is selected already exists. Single file refactorings (for example, extract method and introduce variable) do not display output unless there are errors or warnings. Exemplary refactoring tools provide extensive information to the user about the refactoring process, including:
Limitations reporting | Checks for conditions where a refactoring might encounter problems. For example, determines if needed dependency information is missing or out-of-date, if a file is read-only, or if a class file does not exist.
References discovery | Finds all source files containing dependencies. The exact source position is located.
Validation | Determines if the new name is legal. For example, the name might already be in use or contain illegal syntax.
Source tree updating | Physically moves a directory or a file within the source tree for a class move refactoring or a package rename refactoring. The system also updates import statements as needed for any dependencies.
Reference renaming | Renames references with the new name.
3. Setting up for References Discovery and Refactoring There are several steps that a user may need to perform before performing a refactoring. The currently preferred embodiment provides an option for loading all library relationships, allowing the system to discover all references. To find all references to a symbol, the system should be compiled with this option for loading references from project libraries enabled. Loading all library references is not required; it may slow down both the compilation and the refactoring process. However, if the library references are not loaded, all references to a symbol may not be discovered using the "Find References" command described below. Additionally, before performing a refactoring, it is advisable to ensure that the class files are up-to-date by compiling the source files. 4. Learning About a Symbol Before Refactoring Before a user performs a refactoring, such as a rename refactoring or a move refactoring, the currently preferred embodiment provides several ways for learning about a given symbol. For example, a developer can use the system to find the definition of a symbol as well as all references to the symbol; that is, all source files that use the symbol. A user can also issue a "Find Definition" context menu command to determine where a given symbol is defined. To find a symbol's definition, a user is required to compile the application before using the "Find Definition" command to locate the file containing the definition. In addition, the class that includes the definition must be on the import path or in the same package as the symbol that is of interest. In response to the "Find Definition" command, the source file where the symbol is defined is opened in the editor of the visual development system. If the symbol is an instance of a class, the class is opened in the editor, with the cursor placed on the class declaration. 
If the symbol is a method, the class that defines the method is opened in the editor, with the cursor placed at the start of the method signature. If the symbol is a variable, and the variable is defined in the open class, the cursor moves to the variable definition. If the variable is public and defined in another class, the class is opened in the editor with the cursor placed on the definition. Before refactoring, a developer might also want to find all source files using a selected symbol. To locate all references to a symbol, the application must be compiled and the references must be loaded from project libraries as previously described. If a developer has compiled the source files and loaded the references, he or she may select the symbol in the editor and enter the "Find References" command. References located in response to the command are displayed on the "Search Results" tab of the message pane. Class and method references are sorted by category. Field and local variable references are sorted by file name. The "Find References" command cannot currently be used to locate references for a package or a property. The following table details, by code symbol, the reference categories that can be displayed in the Search Results tab:
Code Symbol | Reference Category
Class, inner class or interface | Ancestors: Classes from which this class directly inherits. Descendents: Classes that directly descend from this class. Type references: Classes that declare or instantiate the type of object for the class. Descendents type references: Classes that are descendents or use descendents of the type of object for the class. Member references: Members in this class. Descendents member references: Members in classes that descend from this class.
Method or constructor | Declarations: Locations where this method is declared. Direct usages: Locations in directly instantiated classes that call this method. Indirect usages: Locations in ancestor and descendent classes that are referenced.
Field and local variable | Writes: Locations where the field or local variable is written. Reads: Locations where the field or local variable is read.
After references for a class have been located, a reference category in the Search Results tab may be selected to obtain additional information about that reference, including a list of the source files referring to the class. A user may then select a source file and a reference to go directly to the reference in the editor. Additional information on a method may be obtained in a similar fashion. For field or local variable references, the writes and reads for the selected symbol may be displayed. 5. Viewing Changes Before Refactoring Before some types of refactoring are performed, the system provides the opportunity to view potential changes to be made before committing the refactoring. A user may wish to preview changes to ensure that the changes that will be made are appropriate before committing the changes. For instance, when the rename or move commands on the context menu are utilized, rename or move dialog boxes are displayed. These dialog boxes provide an option for viewing changes before committing them. If this preview option is selected, potential changes are displayed on the "Refactoring" tab of the message pane. The lines that will potentially change as a result of the refactoring are displayed by file name, sorted in the order of discovery. The following table details the type of information displayed for refactoring:
Code Symbol | Type of refactoring | Information displayed
Package | Rename | Source files that contain a class reference that will change.
Class, inner class or interface | Rename | Line locations in the current source file where the class is declared; includes constructors. Also lists source code locations where the class is used.
Class | Move | Source code locations where the class' current package is declared or imported. Indicates if a package in the list of imports is added or deleted. (An import statement is added for any dependencies the class has on the package it is being moved from).
Method | Rename | Source code locations where the method is declared and used. Indicates if a forwarding method is created.
Field and local variable | Rename | Source code locations where the symbol is declared and called.
Property | Rename | Source code locations where the property is declared and where accompanying getter and setter are declared and called.
A mapping is also provided which enables a user to go directly to a selected reference in a source file. After reviewing the changes, a user may issue a command (using the Refactoring tab) to commit the changes and complete the refactoring. A status bar displays a message providing information about the progress of the refactoring. If the files are edited before the refactoring is complete, the system may require the application to be recompiled, since the binary files (e.g., .class files) and the source files (i.e., .java files) would not be consistent. In this event, a status message is displayed indicating that the refactoring cannot be completed because the files have changed. When the refactoring is completed, a message is displayed indicating that the refactoring is complete. After refactoring, the contents of the Refactoring tab do not change. The original lines of source code are still displayed, so that the changes made by the refactoring can be observed. A user may select an original source code line to go to the line that was changed. After the changes have been reviewed, they may be saved to make them permanent.
6. Performing Different Types of Refactoring
a) Optimize Imports
An optimize imports refactoring may be used to rewrite and reorganize import statements according to custom settings in the project properties. A user can customize the order of imports on the import style page of the "Project Properties" dialog box of the currently preferred embodiment. To customize the order of import statements for new projects, a "Project | Default Project Properties" dialog box may be used to make modifications to the import style page. Different import style options may be set for the refactoring. An "Always Import Classes" option may be selected to avoid adding package import statements to an application. If this option is selected, individual classes will be directly imported. When this option is used, the "Package Import Threshold" is ignored.
The "Package Import Threshold" sets how many classes must be imported from a package before rewriting the class import statement into a package import statement. Classes up to the import threshold are imported using individual class import statements. When the threshold is exceeded, the entire package is imported. For example, when three is entered in the import threshold field, and four or more classes are used from a package, the entire package will be imported. b) Rename Refactoring a Package A package may also be rename refactored from the editor or a UML class or package diagram. Rename refactoring a package renames the package and the entire subtree of packages to the new root package name. It also moves the package and all class names to the new name and source directory. A user may elect to view references before committing the rename refactoring of a package. After the refactoring is completed, the existing source directory structure for that package is deleted. The package rename refactoring is prevented by the system if the new package name already exists or is invalid. c) Rename Refactoring a Class A class, inner class, or interface can be rename refactored from either the editor or the UML class diagram. Rename refactoring for an outer public class renames all declarations of the class and all usages of the class and the source file. If a constructor is selected, the rename refactoring renames the class. The changes to be made in the refactoring may be previewed as described above, if desired. The rename refactoring is prevented by the system if the class identifier is invalid. In addition, if the class is not the outer public class and there is another non-outer public class of the desired new name, the class is not renamed. d) Move Refactoring a Class A class can be moved to a new package (i.e., move refactored) using the editor or the UML class diagram. 
Move refactoring a class moves that class to a new package if the new package does not already contain a source file of the new name. The package and import statements in the class source file, as well as in all classes that reference the moved class, are updated. An import statement is added for any dependencies the class has on the package it is being moved from. The class being moved must be the top-level public class. The class is not moved if the class identifier is invalid or if the source file name already exists in the new package. If a class is moved to a package that does not exist, the new package is automatically created and added to the application or project. The system also creates the new source directory and moves the class to the new directory. Package names and import statements are also updated. Additionally, if the package no longer contains any classes, the package is removed from the project and its source directory is deleted. e) Rename Refactoring a Method A method may also be rename refactored. Rename refactoring of a method may be initiated from either the editor or a UML diagram of the development system. Rename refactoring a method renames the method, all declarations of that method, and all usages of that method. The method can be renamed in the entire hierarchy or from the selected class down in the hierarchy. A forwarding method, that passes on the method call to a new method, can also be created using a "Create Forwarding Method" option, allowing a public API to remain intact. Rename refactoring a method does not rename overloaded methods; that is, methods with the same name but with different method signatures. A "Refactor Ancestors" option, when enabled, renames methods in classes that the current class inherits from. This option may be deactivated to rename the method only in this class and in its descendents. Rename refactoring of a method is barred if the new method name already exists in the file where it is declared. 
If the name exists in other files in the direct inheritance, a warning is issued. If the refactoring is performed with the Refactor Ancestors option enabled, a warning is also displayed if the method exists, but is not in the editable source path. For example, if the method exists in a library, the method will not be refactored, as libraries are read-only.
f) Rename Refactoring a Local Variable
In the currently preferred embodiment, a local variable can be rename refactored only from the editor. A local variable rename refactoring changes the declaration and usages of that variable to the new name. A method parameter is also treated as a local variable for these purposes. The rename refactoring is prevented if the new name exists in the class that declared the original variable.
g) Rename Refactoring a Field
A field can be rename refactored from either the editor or a UML class diagram. A field rename refactoring changes the declarations and usages of that field to the new name. The refactoring may not be completed if the new name exists in the class that declared the field. If there are scope conflicts between the new name and the old name, the this keyword is added to the new field name. A warning is displayed if the new field overrides or is overridden by an existing field in a superclass or subclass.
h) Rename Refactoring a Property
A property can be renamed from a UML class diagram. A property rename refactoring changes all declarations of that property, as well as its getter and setter methods. A rename refactoring of a property cannot be completed if the new name exists in the class that declared the original property.
i) Changing Method Parameters
A user can also add, rename, delete, and reorder a method's parameters from the editor or from a UML diagram. A newly added parameter can be edited before the "Change Parameters" dialog box is closed; however, an existing parameter cannot be edited.
The "Refactor Ancestors" option (on by default) refactors methods in classes from which this class inherits. The Refactor Ancestors option may be deactivated to refactor the method only in this class and in its descendents. A user can then choose to add a forwarding method by clicking the "Create Forwarding Method" option. A change parameters refactoring is prevented if the new method signature already exists in the file where it is declared. If the signature exists in other files in the direct inheritance, a warning is issued. If the refactoring is performed with the Refactor Ancestors option enabled, a warning can also be displayed if the same method exists, but is not in the editable source path. For example, if the method exists in a library, it will not be refactored, as libraries are read-only. In addition, the refactoring may be prevented if the new parameter name or type is not a valid Java identifier.
j) Extracting a Method
An extract method refactoring turns a selected code fragment into a method. A user can access this refactoring from the editor. The extracted code is moved outside of the current method, the needed parameter(s) are determined, local variables are generated if necessary, and the return type is determined. A call to the new method is also inserted in the code where the code fragment resided. This refactoring may not be allowed if more than one variable is written to or if it is read after the block.
k) Introducing a Variable
An introduce variable refactoring may be used to replace the result of a complex expression, or part of the expression, with a temporary variable name. The temporary name is also known as an explaining variable which explains the purpose of the expression or sub-expression. A temporary variable with the selected variable name is generated and initialized in the correct place. The original expression is replaced with the newly generated variable.
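By way of illustration only, the effect of an introduce variable refactoring might resemble the following before-and-after sketch. The class, method, and variable names are hypothetical; the selected sub-expression is assumed to be width * height.

```java
// Hypothetical before/after illustration of an "introduce variable" refactoring.
class IntroduceVariableExample {
    // Before: the purpose of the sub-expression "width * height" is implicit.
    static boolean beforeRefactoring(int width, int height, int limit) {
        return width * height > limit;
    }

    // After: the selected sub-expression has been replaced with an explaining
    // variable that is declared and initialized immediately before its use.
    static boolean afterRefactoring(int width, int height, int limit) {
        int area = width * height;
        return area > limit;
    }

    public static void main(String[] args) {
        // Both forms compute the same result for the same inputs.
        System.out.println(beforeRefactoring(3, 4, 10) == afterRefactoring(3, 4, 10));
    }
}
```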
l) Surrounding a Block with Try/catch
A user can perform a surrounding a block with try/catch refactoring to place a try/catch statement around a selected block of code. The system detects all checked exceptions in a block and adds specific blocks for each checked exception. This refactoring is available from the editor. If the selected block is not a valid block of statements, an error will be displayed in the refactoring tab and the refactoring will be prevented.
7. Undoing a Refactoring
A completed refactoring operation can be easily reversed. In the currently preferred embodiment, an "Undo" button is provided on the refactoring toolbar to undo all changes made in a refactoring. The refactoring should be reversed before any changes are made to other files and before the Refactoring tab is closed. When a refactoring is performed that does not display output in the Refactoring tab, changes can be reversed with an "Edit | Undo" command.
8. Saving a Refactoring
After a refactoring has been successfully completed, the changes may be saved (i.e., the modified files in the software system may be saved) using a "File | Save All" command. If a version control system is utilized, the changes may be committed or checked into the version control system. If the software system is closed before the refactoring changes are saved, a "Save Modified Files" dialog box is displayed enabling the files that are to be saved to be selected. If the refactored files are not saved, the software system source files revert to their state before the refactoring(s). It should be noted that refactoring can be applied to files that may not be open in the editor at the time of the refactoring. The system automatically saves changes to those files so the source code is not in an inconsistent state.
D. Detailed Methods of Operation
FIGS. 6A-B comprise a single flowchart illustrating a compiler-assisted refactoring method 600 performed in accordance with the present invention.
To illustrate the operation of the present invention, FIGS. 6A-B and the following discussion use as an example a refactoring of a software application written in the Java language containing a number of component source files or programs. However, Java is only one of the possible programming languages with which the present invention may be advantageously utilized. Accordingly, the references to Java in the following discussion are for purposes of illustration and not limitation. In addition, refactoring of a software system may involve applying a number of different changes to the system (e.g., changing a number of different symbols or class names). Accordingly, the steps described below for making a given change may be repeated for applying a number of different changes to a software system, as desired. The method begins at step 601 with the receipt of one or more source file(s) that have been developed or created to perform particular tasks. The source files may, for instance, comprise .java files for a software application that has been developed in the Java programming language for installation on a particular environment (e.g., an e-commerce application to be installed on a Web server). At step 602, the compiler is initially invoked to compile these source file(s) into a set of binary files (e.g., .class files). The compilation process includes parsing the source files, applying type attribution, and generating binary code (e.g., Java bytecode) as previously described. It should be noted that in the case of Java source files, the name of the source files (e.g., class names) as well as other reference information is retained as part of these compiled .class files. The result of the compilation process is that the source files (e.g., .java files) have been translated into machine-readable binary code (e.g., .class files) which may then be executed. After the source file(s) have been compiled, a user may subsequently wish to make changes to the application. 
For instance, a user may want to perform a rename refactoring to change a particular class name of a component of the application. At step 603, a request is received from a user for refactoring of a program or system. In response to this request, at step 604 the system of the present invention reads and parses the binary modules of the application in order to place entries into a repository. In particular, the above binary files (e.g., .class files) of the application developed or created by the user are examined and the system of the present invention generates information about each binary file (e.g., .class file) and places this information in the repository. In the currently preferred embodiment, information that is placed into the repository includes the element name (e.g., the class name), the source file, and the forward references from this class to other classes. The above process examines the user-developed .class files and does not examine standard libraries and other standard files or components provided as part of the underlying Java programming language. After the user source files of the application have been examined and entries have been placed in the repository describing each of the components of the application, at step 605 the information in the repository is used to identify binary files which may contain dependencies (i.e., which may contain the elements or symbols of interest such as the class name in this example). This process involves using the forward references in the repository to isolate candidate binary files (e.g., .class files) which may contain dependencies. At step 606, the corresponding source files are retrieved for the candidate binary files identified at step 605. After the candidate source files have been retrieved, at step 607 these source files are read into the compiler. 
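Steps 604 through 606 may be pictured with the following minimal sketch, which assumes a simple in-memory repository. The ReferenceRepository class, its Entry layout, and the candidateSourceFiles method are hypothetical names introduced for illustration; the actual repository format is not specified here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical repository of per-class entries; the layout is an assumption.
class ReferenceRepository {
    // One entry per user-developed binary file: element name, source file,
    // and the forward references from this class to other classes.
    static final class Entry {
        final String className;
        final String sourceFile;
        final Set<String> forwardReferences;

        Entry(String className, String sourceFile, Set<String> forwardReferences) {
            this.className = className;
            this.sourceFile = sourceFile;
            this.forwardReferences = forwardReferences;
        }
    }

    private final List<Entry> entries = new ArrayList<>();

    void add(String className, String sourceFile, String... forwardReferences) {
        entries.add(new Entry(className, sourceFile,
                new HashSet<>(Arrays.asList(forwardReferences))));
    }

    // Returns the source files of the class itself and of every class whose
    // forward references mention it; these are the candidates to re-parse.
    List<String> candidateSourceFiles(String targetClass) {
        List<String> result = new ArrayList<>();
        for (Entry entry : entries) {
            if (entry.className.equals(targetClass)
                    || entry.forwardReferences.contains(targetClass)) {
                result.add(entry.sourceFile);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        ReferenceRepository repo = new ReferenceRepository();
        repo.add("shop.Order", "shop/Order.java", "shop.Item");
        repo.add("shop.Cart", "shop/Cart.java", "shop.Order", "shop.Item");
        repo.add("shop.Item", "shop/Item.java");
        System.out.println(repo.candidateSourceFiles("shop.Order"));
    }
}
```

Only the candidate files returned by such a lookup need to be parsed again, which is why files with no forward reference to the symbol of interest can be skipped.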
The compiler is used to parse the candidate source files (i.e., those identified at step 606) and create parse tree representations of these source files. The parse tree representations of the candidate source files that are generated contain short names (e.g., text names) and position information which references the locations of these short names (or symbols), but do not yet contain type information. At step 608, the compiler's type attribution is used to annotate nodes of these parse trees with type information. More particularly, each node of the parse tree includes a field for storing type information. The compiler adds type information to nodes of the parse tree by traversing (or walking) the parse tree and building look-up tables based upon the surrounding environment (e.g., local variables of a method in which the short name is located). This type information is necessary to enable the short name to be tested to determine whether the short name is, in fact, a local variable or another class or method from outside the local context (i.e., a dependency on another class). Steps 607 and 608 are similar to the previously described compilation process used to generate the binary files, except in this situation only the substeps of parsing the source files and type attribution are used. The substep of code generation is unnecessary. At the completion of the type attribution process, the parse tree representations of the candidate source files include short names (or symbols), type names, and position information. At step 609, the annotated parse trees are traversed to locate nodes that match the short name(s) and type(s) of interest. The position and line number of the matching nodes can then be obtained from the repository. The position information (i.e., the line number in the source file in which the matching node is located) is used to locate this code in the appropriate source file.
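The traversal of step 609 may be sketched as follows, assuming a simplified annotated parse tree in which each node carries a short name, an attributed type name, and a line number. The AnnotatedTreeSearch class, its Node type, and the matchingLines method are hypothetical and are not the compiler's actual data structures.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of a type-annotated parse tree.
class AnnotatedTreeSearch {
    static final class Node {
        final String shortName;  // text name as it appears in the source
        final String typeName;   // filled in by type attribution; may be null
        final int line;          // position information
        final List<Node> children = new ArrayList<>();

        Node(String shortName, String typeName, int line) {
            this.shortName = shortName;
            this.typeName = typeName;
            this.line = line;
        }

        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    // Walks the tree in pre-order and collects the line numbers of nodes whose
    // short name and attributed type both match the symbol of interest.
    static List<Integer> matchingLines(Node root, String shortName, String typeName) {
        List<Integer> lines = new ArrayList<>();
        collect(root, shortName, typeName, lines);
        return lines;
    }

    private static void collect(Node node, String shortName, String typeName,
                                List<Integer> lines) {
        if (shortName.equals(node.shortName) && typeName.equals(node.typeName)) {
            lines.add(node.line);
        }
        for (Node child : node.children) {
            collect(child, shortName, typeName, lines);
        }
    }

    public static void main(String[] args) {
        Node root = new Node("Cart.java", null, 1)
                .add(new Node("Order", "shop.Order", 3))
                .add(new Node("Order", "int", 7))   // same text, different type
                .add(new Node("Order", "shop.Order", 9));
        System.out.println(matchingLines(root, "Order", "shop.Order"));
    }
}
```

Matching on the attributed type as well as the short name is what distinguishes a genuine dependency from a local variable that merely shares the same text.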
At step 610, the modifications are applied to the underlying source files (i.e., modifications to the textual form of the source code). After the modifications are applied to the underlying source files, the refactoring is complete. The user can then save the changes made during the refactoring at step 611. For most types of refactoring, the currently preferred embodiment enables the changes made during the refactoring to be viewed by the user before the changes are saved. After the changes have been saved, the user may, if desired, recompile the source code to verify successful refactoring and update the binary files. The user may also proceed to make additional modifications to the application, which may include repeating the above steps for refactoring another element of the application. III. Selection Expansion for Refactoring A. Overview The system of the present invention adjusts (e.g., expands, extends, decreases, or otherwise modifies) the selection of code made by a user (e.g., application developer) using the compiled version of the program (parse tree representation) generated by the compiler and applying constraints based upon the type of refactoring that is being performed, so that the selection falls within the precise boundaries of the compiled version of the program that is being refactored. The system, in its presently preferred embodiment, is used in connection with refactorings which involve operations on expressions and statements. For example, if the user selects only part of an expression, the selection will automatically be widened to capture the entire expression. If the user selects a portion of a set of arbitrarily nested statements, a system constructed in accordance with the present invention will automatically adjust (e.g., widen or extend) the selection to the innermost completely enclosing list of statements. As previously described, when a program is compiled, the compiler generates a parse tree representation of the program. 
If a user wishes to refactor a selected portion of the program, the compiled version (i.e., the compiler-built parse tree representation) of the program is compared to the code selected by the user in the source code editor. When the user selects code in textual form in the editor, the system of the present invention locates the closest matching selection in the compiled version (parse tree representation) of the program which is then applied instead of the specific text marked by the user. Typically, the text (i.e., source code) selected by the user is expanded (i.e., widened or extended) as a result. In the currently preferred embodiment, the adjusted selection generated as a result of this methodology is then displayed to the user before the refactoring is performed to enable the user to evaluate the adjusted selection before it is applied. B. Example of Selection Adjustment The operations of the present invention may be illustrated by example. FIGS. 7A-D illustrate the operations of the system of the present invention in adjusting (i.e., expanding, extending, or otherwise modifying) two exemplary user selections for refactoring. In its currently preferred embodiment, the present invention is applied in conjunction with several different types of refactoring, including "introduce variable", "extract method", and "surrounding a block with try/catch". However, those skilled in the art will appreciate that the methodology of the present invention is not limited to use in this context but may also be advantageously applied in a number of other circumstances. Accordingly, the following examples are for purposes of illustration and not limitation. FIG. 7A illustrates a portion of a program (or application) 710 which is open in the source editor of an exemplary visual development system. A user has selected (i.e., marked) a segment 711 of the program as shown at FIG. 7A for performing an operation (e.g., an "introduce variable" refactoring). FIG. 
7B illustrates an exemplary dialog box 720 of the currently preferred embodiment which displays the adjusted expression 722 for this "introduce variable" refactoring. The adjusted expression 722 is generated by the system of the present invention based upon a portion of the parse tree representation of the program which most closely corresponds to the user's selection based upon the constraints of the applicable refactoring (in this example an "introduce variable" refactoring which involves replacing a given expression with a variable). FIG. 7C illustrates a second portion of a program 730 which is open in the source editor of an exemplary visual development system. As shown, a user has selected a code segment 731 for performing an operation (e.g., an "extract method" refactoring). FIG. 7D illustrates an exemplary dialog box 740 of the currently preferred embodiment which displays the statement 742 generated by the system for this "extract method" refactoring. As shown, the selection made by the user at FIG. 7C has been adjusted as shown at FIG. 7D to capture a syntactically correct selection. The operations of the present invention in automatically adjusting a selection for refactoring will now be described in greater detail. C. Selection Adjustment of Expression FIG. 8 is a flowchart 800 illustrating the methods of operation of the present invention for adjusting a user's selection for refactoring a given expression. The following discussion uses as an example an exemplary "introduce variable" refactoring to illustrate the operations of the present invention. However, an "introduce variable" refactoring is only one example of an operation in which the present invention may be utilized and is used for purposes of illustration and not limitation. The following description presents method steps that may be implemented using computer-executable instructions for directing operation of a device under processor control. 
The computer-executable instructions may be stored on a computer-readable medium, such as CD, DVD, flash memory, or the like. The computer-executable instructions may also be stored as a set of downloadable computer-executable instructions, for example, for downloading and installation from an Internet location (e.g., Web server). As shown at FIG. 8, the method begins at step 801 with the user (e.g., developer) making a selection of code for performing a particular type of refactoring (e.g., an "introduce variable" refactoring). Typically, the user selects code by marking the textual form of a program from a source file open in an editor (e.g., an editing window of an application development system such as Borland JBuilder.RTM.). At step 802, a compiled version (parse tree representation) of the selected source file is retrieved or generated. If the program has previously been compiled and has not been modified, a parse tree representation may have already been generated by the compiler. Alternatively, if the user has made recent changes to the file, the compiler generates a parse tree representation of the program. Next, at step 803 the parse tree is traversed to find all nodes enclosing (or containing) the user's selection. For example, if the user selects part of a method within a given class, both the method and the class in which the method is included will typically be identified as enclosing the selection, as both the class and the method enclose the user's selection. At step 804, the innermost nodes of the parse tree which enclose the user's selection are determined. With respect to a refactoring involving a particular expression (e.g., an "introduce variable" refactoring as described above), this process involves visiting the expression level nodes of the parse tree. 
The range selected by the user (i.e., the start and the end of the block marked by the user in the editor) is then compared to the compiled version (i.e., nodes of the parse tree representation) to identify the minimum number of nodes which just enclose the text (i.e., source code) selected by the user. At step 805, the user's selection is expanded (if necessary) to the minimum number of nodes of the parse tree representation necessary to correctly enclose the selection. For example, consider the simple code segment "a=2+5;". If the user selected the text "+5" in the source code editor, the selection would be adjusted (i.e., extended or expanded) to include the expression "2+5" as this is the innermost node that encloses the user's selection. In this example, the user's selection would not be adjusted to also enclose the assignment (i.e., the "a=" portion of this code segment), as the approach of the present invention is to locate the smallest expression that encloses the user's selection, based upon the compiler's parse tree representation of the program. On the other hand, if the user selected "=" from this same code segment, the selection would be adjusted to cover "a=2+5;" as that would be the smallest expression which completely encloses this selection. At step 806, the adjusted selection determined as a result of the above steps is displayed to the user. In the currently preferred embodiment, the revised selection is displayed to the user to enable the user to review the adjusted selection before it is applied in the refactoring. It should be noted that if the user makes a careful selection of the exact text (i.e., source code) to be refactored in a given case, then no adjustment of the text selected by the user may be required. After the selection is adjusted (or verified if not adjusted) as described above, the refactoring itself may be performed. 
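The expression case of steps 803-805 can be illustrated with a minimal Java sketch. It is not the code of the preferred embodiment: the ExprNode class, its pos/endpos offsets (end exclusive), and the hand-built tree for "a=2+5;" are assumptions made purely for illustration. The sketch collects every node enclosing the selection, outermost first, and takes the last one as the innermost enclosing expression.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified expression node; not the compiler's AST classes.
class ExprNode {
    final String text;   // source text covered by this node
    final int pos;       // start offset (inclusive)
    final int endpos;    // end offset (exclusive)
    final List<ExprNode> children;

    ExprNode(String text, int pos, int endpos, ExprNode... children) {
        this.text = text;
        this.pos = pos;
        this.endpos = endpos;
        this.children = Arrays.asList(children);
    }

    // True if this node completely covers the selection [start, end).
    boolean encloses(int start, int end) {
        return pos <= start && end <= endpos;
    }
}

class InnermostExpressionSketch {
    // Collect every node enclosing [start, end), outermost first.
    static void collectEnclosing(ExprNode node, int start, int end, List<ExprNode> out) {
        if (!node.encloses(start, end)) {
            return;
        }
        out.add(node);
        for (ExprNode child : node.children) {
            collectEnclosing(child, start, end, out);
        }
    }

    // The last collected node is the innermost; null if nothing encloses the range.
    static ExprNode innermost(ExprNode root, int start, int end) {
        List<ExprNode> enclosing = new ArrayList<ExprNode>();
        collectEnclosing(root, start, end, enclosing);
        return enclosing.isEmpty() ? null : enclosing.get(enclosing.size() - 1);
    }

    // Hand-built tree for "a=2+5;" with offsets a=0, '='=1, 2=2, '+'=3, 5=4, ';'=5.
    // The root spans the semicolon so that it matches the text of the example.
    static ExprNode sampleTree() {
        ExprNode a = new ExprNode("a", 0, 1);
        ExprNode two = new ExprNode("2", 2, 3);
        ExprNode five = new ExprNode("5", 4, 5);
        ExprNode sum = new ExprNode("2+5", 2, 5, two, five);
        return new ExprNode("a=2+5;", 0, 6, a, sum);
    }

    public static void main(String[] args) {
        ExprNode root = sampleTree();
        System.out.println(innermost(root, 3, 5).text); // selection "+5": prints 2+5
        System.out.println(innermost(root, 1, 2).text); // selection "=": prints a=2+5;
    }
}
```

Under these assumptions the sketch reports "2+5" for the selection "+5" and "a=2+5;" for the selection "=", matching the two cases discussed above.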
The use of the system and methodology of the present invention ensures that the refactoring works on a valid and complete expression, thereby reducing the possibility of errors as a result of the refactoring. Automatically aligning the user's selection with the closest matching syntactical construct from the compiler's parse tree representation also simplifies the refactoring process as the developer does not have to precisely mark the exact expression to be refactored. For instance, a developer does not have to carefully review a selection to make sure that he or she included the closing parenthesis at the end of the expression. D. Selection Adjustment of Statement Although the above example referenced use of the present invention with expressions, the system and methodology of the present invention may also be used with refactorings involving statements. Certain types of refactoring, such as an "extract method" refactoring for example, involve operations on a group or list of statements. FIG. 9 is a flowchart 900 illustrating the methods of operation of the present invention in an exemplary "extract method" refactoring involving extracting one or more statements into a new method. Although an "extract method" refactoring is used as an example for illustrating these operations, the methodology of the present invention may also be used in other circumstances. The initial steps are the same as previously described. As shown at FIG. 9, the method begins at step 901 with the user making a selection of code for performing a particular type of refactoring (e.g., an "extract method" refactoring). At step 902, a compiled version (parse tree representation) of the selected source file is generated or retrieved (e.g., if the program has been previously compiled and has not been modified). At step 903 the parse tree representation is traversed to find nodes enclosing the user's selection. 
At step 904, the statement(s) which enclose the user's selection are determined. After the statement(s) enclosing the user's selection are located, the sub-statements that are part of each of these statements are identified at step 905. After these sub-statements have been enumerated, at step 906 a determination is made as to which of these sub-statements intersect with the user's selection. For instance, if the same portion (i.e., "+5") of the above exemplary code segment "a=2+5;" was selected in conjunction with an "extract method" refactoring, the statement (i.e., "a=2+5;") would be selected for inclusion as this statement intersects with the user's selection. Another example illustrating the operations of the present invention is an if/else statement as illustrated in the following pseudocode:
if (flag) {
doTask(); }
else {
doNothing();}
If the user selects only the else {doNothing( );} portion of this code, the system of the present invention selects the entire if/else construct as the innermost enclosing statement because these two sub-expressions (i.e., the if and else clauses) are interdependent. At step 907, the selection is adjusted to include those statement(s) intersecting with the user's selection. Next, the adjusted (i.e., expanded, extended, or otherwise modified) selection is displayed to the user at step 908. As described above, the adjusted (expanded or extended) selection is displayed to the user as the user may wish to review the adjusted selection before it is applied. The following example illustrates how these selection adjustment operations are performed in the currently preferred embodiment. In this example, assume that a user is performing an "extract method" refactoring involving the following code segment:
1: if (flag) {
2: do_1 ();
3: do_2 ();
4: do_3 ();
5: }
6: else {
7: do_nothing();
8: }
In this example, assume that the user selects the text starting at the "2" of the "do_2( );" statement at line 3 above and ending with the ";" at the end of line 4. Initially, the innermost enclosing statement which encloses the selection is determined. In this case the innermost enclosing statement is the entire clause, including both the if and the else conditions and associated statements at lines 1-8. After the innermost block enclosing the user's selection has been determined, the sub-statements of this block are enumerated. Next, a determination is made as to which of these sub-statements intersect with the user's selection. As both the sub-statement at line 3 ("do_2( );") and the sub-statement at line 4 ("do_3( );") intersect with the user's selection, the selection is adjusted (i.e., expanded or extended) to include both of these sub-statements (i.e., all of line 3 and all of line 4). The other lines of this code segment, containing sub-statements which do not intersect with the user's selection, are not included as part of the result set. E. Detailed Operation 1. Filters for Examining Nodes of Parse Tree The system of the present invention includes infrastructure (i.e., filters) that examine each node of a parse tree representation of a compiled program. For each statement node or expression node, a filter determines whether or not the node is applicable (i.e., of interest) to the problem currently being examined. The specific filter that is used is based upon the type of refactoring that is being performed. For example, if a user is performing an "introduce variable" refactoring in which a variable is being introduced to replace an expression (e.g., because the same expression is repeated a number of times in a given program), an expression-oriented filter is applied to determine the innermost expression enclosing the code selected by the user in the source code editor. 
However, if the user was performing an "extract method" refactoring in which several statements are to be extracted into their own method, a statement-oriented filter is used. The filter that is used is context dependent and is automatically selected by the system of the present invention based upon the type of refactoring that the user is performing. 2. Innermost Enclosing Expression The following EnclosingExpressionFilter class illustrates an expression filter that is called in the currently preferred embodiment if the user is performing an "introduce variable" or similar refactoring for which the innermost enclosing expression is to be determined:
1: //---------------------------------------------------------------------
2: // EnclosingExpressionFilter.java
3: //
4: // Find all expressions that are enclosing the range [start..end]
5: //---------------------------------------------------------------------
6:
7: package com.borland.jbuilder.java.filter;
8:
9: import com.borland.compiler.frontend.*;
10: import com.borland.jbuilder.java.*;
11:
12: public class EnclosingExpressionFilter extends AbstractEnclosingFilter {
13:     public EnclosingExpressionFilter(int start, int end) {
14:         super(start, end);
15:     }
16:
17:     public boolean _case(AST that) {
18:         return AstUtil.isExpression(that) && checkEnclosement(that);
19:     }
20: }
As shown at line 13, the EnclosingExpressionFilter receives a starting point (start) and ending point (end) for a selected block as input parameters, and checks to determine expression nodes which enclose the starting and ending point. For each of the expression nodes of the parse tree, a check for the enclosement property is made as shown at line 18. If a given expression node encloses the range selected (i.e., encloses the start and end of the block), then it is added to a list of candidate enclosing expressions. For instance, using the prior example of "a=2+5;" and the user's selection of "+5", the assignment operation expression "a=2+5;" is first determined to be an enclosing expression and is added to the list. Next, the binary operation sub-expression "2+5" is also enclosing and is also listed as a candidate enclosing expression. However, the sub-expression "a=" is outside the range (i.e., does not enclose the selected code) and therefore is not added to the list of candidates. Thus, in this example, the EnclosingExpressionFilter lists the two expressions "a=2+5;" and "2+5" as enclosing expressions, with the sub-expression "2+5" listed second in the list. After the list of candidate enclosing expressions is determined, the following getEnclosingExpression method examines the list of candidates to determine the innermost enclosing expression:
1: /**
2:  * Get the topmost expression within the selected block
3:  * @param si an instance of SourceInfo
4:  * @param start the start position for the selected block
5:  * @param end the end position for the selected block
6:  * @return the innermost enclosing expression or null if there is
7:  * no enclosing expression
8:  */
9: public static AST getEnclosingExpression(SourceInfo si, int start, int end) {
10:     AstFilter filter = new EnclosingExpressionFilter(start, end);
11:     AST[] asts = si.applyFilter(filter);
12:     int length = asts.length;
13:     if (length == 0) {
14:         return null;
15:     }
16:     if (length > 1 && asts[length - 2].tag == Constants.APPLY &&
17:         ((Apply)asts[length - 2]).fn == asts[length - 1]) {
18:         return asts[length - 2];
19:     }
20:     else {
21:         return asts[length - 1];
22:     }
23: }
As shown at line 9 above, the input parameters include the start position (start) and end position (end) of the block of text (i.e., code) selected by the user. The above-described EnclosingExpressionFilter is called with the parameters start and end as illustrated at line 10 to determine the expressions enclosing the selected code (text). The enclosing expressions are then evaluated. As shown at lines 13-14, "null" is returned if there are no enclosing expressions (i.e., if the length of the array of enclosing expressions is equal to zero). If the array is not empty, the last member of the array, representing the innermost enclosing expression is typically returned as shown at lines 20-21. For instance, in the above example, the sub-expression "2+5" which is the second of the two items in the list of enclosing expressions is selected. However, in certain special situations, the next to last expression is returned as shown at lines 16-18. The result is that the innermost expression which encloses the text selected by the user is selected from the parse tree representation of the program source code. The methodology of the present invention which is applied in the case of a refactoring involving statements will next be described. 3. Enclosing Statements Falling within Boundaries of Range Refactorings involving statements, such as an "extract method" refactoring, also involve identifying the statements enclosing the user's selection in a manner similar to that previously described. However, after the statements enclosing the user's selection have been identified, these statements must be post-processed to identify those sub-statements with the block enclosing the user's selection that fall within the boundaries of the range selected by the user. In the currently preferred embodiment, a utility class includes several methods which are used to identify the appropriate statements and sub-statements within the innermost block enclosing the user's selection. 
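The post-processing described above, in which the sub-statements of the innermost enclosing block are enumerated and those intersecting the selection are kept, can be sketched as follows. The Stmt class, the inclusive character offsets, and the flattened source string are hypothetical stand-ins for the parse tree nodes. The sketch assumes that the sub-statements of the then-branch from the earlier do_1/do_2/do_3 example have already been enumerated.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a statement node of the parse tree.
class Stmt {
    final String text;
    final int pos;     // offset of the first character
    final int endpos;  // offset of the last character (inclusive)

    // Locates the statement text in the source to derive its offsets.
    Stmt(String source, String text) {
        this.text = text;
        this.pos = source.indexOf(text);
        this.endpos = this.pos + text.length() - 1;
    }
}

class StatementSelectionSketch {
    // Body of the then-branch from the example, flattened to one string.
    static final String SOURCE = "do_1();\ndo_2();\ndo_3();\n";

    // True unless the statement lies wholly before or wholly after the selection.
    static boolean intersects(Stmt s, int start, int end) {
        return !(end < s.pos || s.endpos < start);
    }

    // Keep only the sub-statements that intersect the selection [start..end].
    static List<Stmt> intersecting(List<Stmt> subStatements, int start, int end) {
        List<Stmt> result = new ArrayList<Stmt>();
        for (Stmt s : subStatements) {
            if (intersects(s, start, end)) {
                result.add(s);
            }
        }
        return result;
    }

    // Selection runs from the "2" of do_2 to the ";" that ends do_3.
    static List<String> demo() {
        List<Stmt> subStatements = Arrays.asList(
            new Stmt(SOURCE, "do_1();"),
            new Stmt(SOURCE, "do_2();"),
            new Stmt(SOURCE, "do_3();"));
        int start = SOURCE.indexOf("2();");
        int end = SOURCE.indexOf("do_3();") + "do_3();".length() - 1;
        List<String> kept = new ArrayList<String>();
        for (Stmt s : intersecting(subStatements, start, end)) {
            kept.add(s.text);
        }
        return kept;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [do_2();, do_3();]
    }
}
```

Although the selection begins partway through the second statement, that statement is kept whole, so the adjusted selection covers both statements in full, as in the worked example.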
The statements that intersect with the user's selection, and should be included in this type of refactoring, are then determined. Several of these methods are described below to provide further illustration of the operations of the currently preferred embodiment of the present invention. Of particular interest, the following getStatementList method identifies all statements within the range selected by the user and processes these statements to determine the portions (i.e., sub-statements) that intersect with the user's selection:
1: /**
2:  * Get all statements within the selected range [start..end]
3:  * @param si an instance of SourceInfo
4:  * @param start the start position for the selected block
5:  * @param end the end position for the selected block
6:  * @return an array of AST containing all top level statements of
7:  * the innermost block enclosing the range, that
8:  * fall within the boundaries of the range.
9:  */
10: public static AST[] getStatementList(SourceInfo si, int start, int end) {
11:     // get all enclosing blocks
12:     AstFilter filter = new EnclosingStatementFilter(start, end);
13:     AST[] asts = si.applyFilter(filter);
14:     if (asts == null || asts.length == 0) {
15:         return emptyAstArray;
16:     }
17:     List result = new ArrayList();
18:     // get the statements of the innermost of the enclosing statements
19:     AST[] stats = getSubStatements(asts[asts.length - 1], start, end);
20:     if (stats == emptyAstArray) {
21:         return getStatementArray(asts[asts.length - 1]);
22:     }
23:     for (int i = 0; i < stats.length; i++) {
24:         if (isIntersectingRange(stats[i], start, end)) {
25:             result.add(stats[i]);
26:         }
27:     }
28:     return (AST[])result.toArray(new AST[result.size()]);
29: }
As shown at line 10 above, the input parameters to the getStatementList method include the starting (start) and ending (end) position of the block selected by the user. At lines 12-13, the EnclosingStatementFilter (described below) is applied to identify and return a list of the statements (i.e., parse tree nodes) enclosing the user's selection. If no enclosing statement is found, an empty array is returned as shown at lines 14-15. Otherwise, the sub-statements of the innermost enclosing statement (i.e., the last statement in the array) are next enumerated. As shown at line 19, a call is made to the getSubStatements method to identify these sub-statements and store them in an array (AST[]). If no sub-statements are identified, the statement array for the innermost enclosing statement itself is returned as shown at lines 20-21. The getSubStatements method which enumerates the sub-statements is illustrated below. After the sub-statements have been identified, each of the sub-statements is examined as shown at lines 23-27. More particularly, an isIntersectingRange method (shown below) is called at line 24 to determine those sub-statements which intersect with the user's selection. The identified sub-statements are added to the result set as provided at line 25 above. The result array contains all sub-statements of the innermost block enclosing the range that fall within the boundaries of the range selected by the user. This result array is returned as shown at line 28. The methods called by the above getStatementList method are described below. The EnclosingStatementFilter class which is used by the above-described getStatementList method is as follows:
1: //-------------------------------------------------------------------
2: // EnclosingStatementFilter.java
3: //
4: // Find all statements that are enclosing the range [start..end]
5: //--------------------------------------------------------------------
6:
7: package com.borland.jbuilder.java.filter;
8:
9: import com.borland.compiler.frontend.*;
10:
11: public class EnclosingStatementFilter extends AbstractEnclosingFilter {
12:     public EnclosingStatementFilter(int start, int end) {
13:         super(start, end);
14:     }
15:
16:     public boolean _case(AST that) {
17:         return AstUtil.isStatement(that) && checkEnclosement(that);
18:     }
19: }
The above EnclosingStatementFilter is similar to the expression filter that has been previously described. As shown at line 12, the EnclosingStatementFilter receives the start and end of a block (of text or code) selected by a user as input. For each of the statement nodes of the parse tree, a check is made for the enclosement property (checkEnclosement(that)) as shown at line 17. After the statement(s) enclosing the user's selection have been determined, the following getSubStatements method is called:
1: private static AST[] getSubStatements(AST ast, int start, int end) {
2:     switch(ast.tag) {
3:         case Constants.BLOCK:
4:             return ((Block)ast).stats;
5:         case Constants.CONDSTAT: {
6:             Conditional cond = ((Conditional)ast);
7:             if (isIntersectingRange(cond.cond, start, end)) {
8:                 return emptyAstArray;
9:             }
10:             List list = new ArrayList();
11:             list.add(cond.thenpart);
12:             list.add(cond.elsepart);
13:             return getStatementArray(getSingleIntersectingStatement(list, start, end));
14:         }
15:         case Constants.WHILELOOP: {
16:             WhileLoop loop = (WhileLoop)ast;
17:             if (isIntersectingRange(loop.cond, start, end)) {
18:                 return emptyAstArray;
19:             }
20:             return getStatementArray(loop.body);
21:         }
22:         case Constants.FORLOOP: {
23:             ForLoop loop = (ForLoop)ast;
24:             // does the selection extend into the for loop header?
25:             if (isIntersectingRange(loop.e2, start, end)) {
26:                 return emptyAstArray;
27:             }
28:             for (int i = 0; i < loop.e1.length; i++) {
29:                 if (isIntersectingRange(loop.e1[i], start, end)) {
30:                     return emptyAstArray;
31:                 }
32:             }
33:             List list = new ArrayList();
34:             for (int i = 0; i < loop.e3.length; i++) {
35:                 list.add(loop.e3[i]);
36:             }
37:             list.add(loop.body);
38:             return getStatementArray(getSingleIntersectingStatement(list, start, end));
39:         }
40:         case Constants.DOLOOP: {
41:             DoLoop loop = (DoLoop)ast;
42:             if (isIntersectingRange(loop.cond, start, end)) {
43:                 return emptyAstArray;
44:             }
45:             return getStatementArray(((DoLoop)ast).body);
46:         }
47:         case Constants.CASE:
48:             return ((Case)ast).stats;
49:         case Constants.SWITCH: {
50:             Switch switchStat = ((Switch)ast);
51:             List list = new ArrayList();
52:             Case result = null;
53:             for (int i = 0; i < switchStat.cases.length; i++) {
54:                 if (isIntersectingRange(switchStat.cases[i], start, end)) {
55:                     if (result == null) {
56:                         result = switchStat.cases[i];
57:                     }
58:                     else {
59:                         // we found a case before
60:                         return emptyAstArray;
61:                     }
62:                 }
63:             }
64:             return getStatementArray(result);
65:         }
66:         case Constants.TRY: {
67:             Try tryStat = ((Try)ast);
68:             List list = new ArrayList();
69:             list.add(tryStat.body);
70:             for (int i = 0; i < tryStat.catchers.length; i++) {
71:                 list.add(tryStat.catchers[i].body);
72:             }
73:             list.add(tryStat.finalizer);
74:             return getStatementArray(getSingleIntersectingStatement(list, start, end));
75:         }
76:         default:
77:             return emptyAstArray;
78:     }
79: }
The above getSubStatements method is called to enumerate all sub-statements of the identified statement(s). As shown at line 2, the getSubStatements method includes a switch statement for handling a number of different types of statements. For example, lines 22-39 illustrate the handling of a for loop. The following methods are called by other methods of the program to determine whether or not a given block of code is inside, outside, intersects with, or encloses a given range:
1: /**
2:  * Is the ast outside the range [start...end]
3:  * @param that the AST to test
4:  * @param start the start position of the range
5:  * @param end the end position of the range
6:  * @return true if the range falls completely outside the ast
7:  */
8: private static boolean isOutsideRange(AST that, int start, int end) {
9:     return that.endpos <= start || end <= that.pos;
10: }
11:
12: /**
13:  * Is the ast inside the range [start...end]
14:  * @param that the AST to test
15:  * @param start the start position of the range
16:  * @param end the end position of the range
17:  * @return true if the ast falls completely within the range
18:  */
19: private static boolean isInsideRange(AST that, int start, int end) {
20:     return start <= that.pos && that.endpos <= end;
21: }
22:
23: /**
24:  * Is the ast intersecting the range [start...end]
25:  * @param ast the AST to test
26:  * @param start the start position of the range
27:  * @param end the end position of the range
28:  * @return true if the ast intersects with the range
29:  */
30: private static boolean isIntersectingRange(AST ast, int start, int end) {
31:     if (ast == null) {
32:         return false;
33:     }
34:     int astStart = getStartPos(ast);
35:     return !(end < astStart || ast.endpos < start);
36: }
37:
38: /**
39:  * Is the ast enclosing the range [start..end]
40:  * @param ast the AST to test
41:  * @param start the start position of the range
42:  * @param end the end position of the range
43:  * @return true if the range falls completely within the ast
44:  */
45: private static boolean isEnclosingRange(AST ast, int start, int end) {
46:     if (ast == null) {
47:         return false;
48:     }
49:     int astStart = getStartPos(ast);
50:     return astStart <= start && end <= ast.endpos;
51: }
52:
53: private static AST[] getStatementArray(AST ast) {
54:     if (ast == null) {
55:         return emptyAstArray;
56:     }
57:     else if (ast.tag == Constants.BLOCK) {
58:         return ((Block)ast).stats;
59:     }
60:     else if (ast.tag == Constants.CASE) {
61:         return ((Case)ast).stats;
62:     }
63:     else {
64:         return new AST[] {ast};
65:     }
66: }
As shown, the isOutsideRange method at lines 8-10 above checks to determine if a given statement or sub-statement is completely outside the target range (e.g., the range identified by start and end). Similarly, the isInsideRange method at lines 19-21 checks to see if a given statement or sub-statement is entirely within the range. Of particular interest, the isIntersectingRange method shown at lines 30-36 returns a value of "true" if any portion of a given statement or sub-statement intersects with the specified range (e.g., the range or block selected by the user). Also, the isEnclosingRange method illustrated at lines 45-51 determines if a given statement or sub-statement completely encloses the selected range. Each of the above methods is called by other methods for determining whether or not any portion of a given statement or sub-statement from the compiler parse-tree representation of a program intersects with the user's selection and, if so, the extent to which it intersects with the user's selection. While the invention is described in some detail with specific reference to a single preferred embodiment and certain alternatives, there is no intent to limit the invention to that particular embodiment or those specific alternatives. For instance, the foregoing discussion uses a refactoring of a software application written in the Java language to illustrate the operations of the present invention. However, Java is only one of the possible programming languages with which the present invention may be advantageously utilized. Accordingly, the references to refactoring a Java application in the foregoing discussion are for purposes of illustration and not limitation. Those skilled in the art will appreciate that modifications may be made to the preferred embodiment without departing from the teachings of the present invention.
