Mechanism for reformatting a simple source code statement into a compound source code statement

7043720

Abstract

A mechanism for reformatting a simple source code statement into a compound source code statement is provided. Tokens are identified in unformatted source code, which contains simple statements. A syntax tree is created from the identified tokens. The syntax tree is used to identify one or more simple statements. In processing a particular simple statement, potential statements are identified in the particular simple statement. A tree of blocks, which identifies block levels, is created from the potential statements. An intermediate textual representation is created where each of the potential statements is on a different line. Indentation levels, which correspond to the block levels in the tree of blocks, are associated with each of the potential statements. Formatted source code is created by inserting begin and end block indicators into the intermediate textual representation.

Claims

What is claimed is:

Description

FIELD OF THE INVENTION
In step 502, the syntax analyzer 130 analyzes the tokens in the file to create a syntax tree. The syntax tree may be generated in the same manner as syntax trees are generated by existing compilers. The syntax analyzer 130 is modified to handle the special character 205. The syntax analyzer 130 passes the syntax tree to the intermediate textual representation processor 140.

In step 504, the intermediate textual representation processor 140 uses the syntax tree to determine the potential statements for a compound statement and to identify, from the syntax tree, one or more potential statements for a particular simple statement. FIG. 2B is a block diagram of a simple statement, as depicted in FIG. 2A, where the blocks indicate the potential statements that are used to create a compound statement.

Since there are no brackets in the simple statement to indicate how to apply conditions to statements, the intermediate textual representation processor 140 uses the compiler rule that a condition only applies to the statement directly below it (hereinafter referred to as "the rule"). In order to determine what conditions apply to what statements, the intermediate textual representation processor 140 needs to find a standalone statement and process the simple statement backwards from the standalone statement. A standalone statement is a syntactically complete statement and therefore can be compiled into executable code without reference to other statements or conditions. For example, the semicolon after the constant "2" is used to determine that "X=2;" 208 is a standalone statement. Following the rule of a condition applying to one statement, the "IF (3)" only applies to the standalone statement, "X=2;" 208. The standalone statement, "X=2;" 208, is combined with the "IF (3)" to produce a first new statement, "IF(3) X=2;" 206, and "IF (2)" applies to this first new statement.
"IF(3) X=2;" 206 is combined with "IF(2)" to produce a second new statement, "IF(2) IF(3) X=2;" 204, and "IF(1)" applies to this second new statement. "IF(2) IF(3) X=2;" 204 is combined with "IF(1)" to produce a third new statement, "IF(1) IF(2) IF(3) X=2;" 202. In the case of a try catch combination, the catch belongs to the try because there is nothing to catch unless a try is performed. Therefore, the catch 212 is inside of try 210. In step 506, the intermediate textual representation processor 140 uses the identified potential statements, as shown in FIG. 2B, to create a tree of blocks. FIG. 3A is a block diagram of a tree of blocks for the potential statements in FIG. 2B. Each block contains a potential statement and indicates the indentation level associated with that potential statement. As can be seen, the standalone statement in "X=2;" block 308 is at the deepest block level. "The rule" is used to place the conditions in the appropriate block levels. Therefore, "X=2;" block 308 is in "IF (3)" block 306, "IF (3)" block 306 is in "IF (2)" block 304, and "IF (2)" block 304 is in "IF (1)" block 302. Once all of the conditions in the blocks 306, 304, and 302 have been applied to the process, it is apparent that the "try block 310" is at the same level as "IF (1)" block 302. The "catch" block 312 is inside of the "try" block 310 because there is nothing to catch unless a try is performed. At step 508, the intermediate textual representation processor 140, uses the tree of blocks to produce an intermediate textual representation. FIG. 3B depicts an intermediate textual representation that is created from the tree of blocks as depicted in FIG. 3A. Each potential statement in the intermediate textual representation is placed on a separate line and an indentation level is associated with each of the potential statements. 
For example, "IF(1)" block 302 and "try" block 310 are at the outermost level of the tree of blocks; therefore, potential statement 1, e.g., IF (1), and potential statement 5, e.g., try { } catch { }, are placed at the zero indentation level in the intermediate textual representation. "IF (2)" block 304 is at the next block level so potential statement 2 is at the first indentation level in the intermediate textual representation. "IF (3)" block 306 is at the next block level so potential statement 3 is at the second indentation level. "X=2;" block 308 is at the next block level so potential statement 4, e.g., "x=2;", is at the third indentation level.

In step 510, the intermediate textual representation processor 140 transmits the intermediate textual representation to the filters 150. The filters 150 (1) insert begin and end block indicators, such as brackets, into the intermediate textual representation, and (2) replace the special character 205 with the saved comment 201 to produce reformatted source code 160. FIG. 4 depicts reformatted source code produced from the intermediate textual representation as depicted in FIG. 3B. In conformance with "the rule", left brackets are inserted on source lines 1, 2, and 3, and right brackets are inserted on source lines 5, 6, and 7.

Determining the Percentage of Code that does not Comply with a Formatting Standard

FIG. 6A depicts unformatted source code and FIG. 6B depicts formatted source code that complies with a formatting standard. To determine the percentage of code that does not comply with the formatting standard, the total number of lines that are different (referred to hereinafter as "the total number of different lines") between the unformatted source code 602 and the formatted source code 604 is counted and divided by the total number of lines in the unformatted source code 602.
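The calculation just described can be sketched in a few lines of Python. This is an illustration only: the function name `noncompliance_percentage` is hypothetical, and the sketch assumes the two versions have the same number of lines so that line i of one is the analogous line of the other. The text does not say how a mismatch in line counts would be handled, so the sketch rejects it.

```python
from typing import Sequence


def noncompliance_percentage(unformatted: Sequence[str],
                             formatted: Sequence[str]) -> float:
    """Percentage of unformatted lines that differ from their analogous formatted line."""
    if not unformatted:
        raise ValueError("unformatted source has no lines")
    if len(unformatted) != len(formatted):
        # Assumption of this sketch: lines correspond one-to-one by position.
        raise ValueError("sketch assumes both versions have the same line count")
    # Count positions where the two versions are not identical.
    different = sum(1 for u, f in zip(unformatted, formatted) if u != f)
    return 100.0 * different / len(unformatted)


# Placeholder lines (not the contents of FIG. 6A or FIG. 6B): 20 lines, 8 of them changed.
changed = {4, 7, 9, 13, 14, 15, 16, 17}
unformatted = [f"line {i}" for i in range(1, 21)]
formatted = [f"line {i} {{" if i in changed else f"line {i}" for i in range(1, 21)]
print(noncompliance_percentage(unformatted, formatted))
```

Comparison is exact, so a difference in whitespace alone also counts as a different line. Whether that is desirable depends on the formatting standard being measured.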
To determine the total number of different lines, each line of unformatted source code 602 is compared with the analogous line in the formatted source code 604. For example, lines 1-3 are the same in the unformatted source code 602 and the formatted source code 604. However, line 4 of the unformatted source code 602 lacks a left bracket in comparison with the analogous line, "if (ch==125) {", in the formatted source code 604. Line 6 is the same between the formatted source code 604 and the unformatted source code 602. However, line 7 of the unformatted source code 602 lacks a left and right bracket in comparison with the analogous line, "} else if (inLabel) {", in the formatted source code 604. Line 8 is the same between the formatted source code 604 and the unformatted source code 602. However, line 9 of the unformatted source code 602 lacks a left bracket in comparison with the analogous line, "if (indent<0) {", in the formatted source code 604. Line 12, 'string2=""' is the same in both the unformatted source code 602 and the formatted source code 604. However, lines 13-17 in the unformatted source code 602 are different from the analogous lines in the formatted source code 604. For example, line 13 in the unformatted source code 602 is "else if (indent>=" whereas the analogous line in the formatted source code 604 is "} else if (indent", line 14 in the unformatted source code 602 is 'string2="> whereas the analogous line in the formatted source code 604 is "string2=", etc.

Therefore, in this example, the lines that are different are 4, 7, 9, and 13-17, for a total of 8 different lines. The total number of different lines is then divided by the total number of lines in the unformatted source code 602, which in this example is 20. So the percentage of code that does not comply with formatting standards is 8/20, which equals 40%.

Hardware Overview

FIG.
7 is a block diagram that illustrates a computer system 700 upon which an embodiment of the invention may be implemented. Computer system 700 includes a bus 702 or other communication mechanism for communicating information, and a processor 704 coupled with bus 702 for processing information. Computer system 700 also includes a main memory 706, such as a random access memory (RAM) or other dynamic storage device, coupled to bus 702 for storing information and instructions to be executed by processor 704. Main memory 706 also may be used for storing temporary variables or other intermediate information during execution of instructions to be executed by processor 704. Computer system 700 further includes a read only memory (ROM) 708 or other static storage device coupled to bus 702 for storing static information and instructions for processor 704. A storage device 710, such as a magnetic disk or optical disk, is provided and coupled to bus 702 for storing information and instructions. Computer system 700 may be coupled via bus 702 to a display 712, such as a cathode ray tube (CRT), for displaying information to a computer user. An input device 714, including alphanumeric and other keys, is coupled to bus 702 for communicating information and command selections to processor 704. Another type of user input device is cursor control 716, such as a mouse, a trackball, or cursor direction keys for communicating direction information and command selections to processor 704 and for controlling cursor movement on display 712. This input device typically has two degrees of freedom in two axes, a first axis (e.g., x) and a second axis (e.g., y), that allows the device to specify positions in a plane. The invention is related to the use of computer system 700 for implementing the techniques described herein. 
According to one embodiment of the invention, those techniques are performed by computer system 700 in response to processor 704 executing one or more sequences of one or more instructions contained in main memory 706. Such instructions may be read into main memory 706 from another computer-readable medium, such as storage device 710. Execution of the sequences of instructions contained in main memory 706 causes processor 704 to perform the process steps described herein. In alternative embodiments, hard-wired circuitry may be used in place of or in combination with software instructions to implement the invention. Thus, embodiments of the invention are not limited to any specific combination of hardware circuitry and software.

The term "computer-readable medium" as used herein refers to any medium that participates in providing instructions to processor 704 for execution. Such a medium may take many forms, including but not limited to, non-volatile media, volatile media, and transmission media. Non-volatile media includes, for example, optical or magnetic disks, such as storage device 710. Volatile media includes dynamic memory, such as main memory 706. Transmission media includes coaxial cables, copper wire and fiber optics, including the wires that comprise bus 702. Transmission media can also take the form of acoustic or light waves, such as those generated during radio-wave and infra-red data communications.

Common forms of computer-readable media include, for example, a floppy disk, a flexible disk, hard disk, magnetic tape, or any other magnetic medium, a CD-ROM, any other optical medium, punchcards, papertape, any other physical medium with patterns of holes, a RAM, a PROM, an EPROM, a FLASH-EPROM, any other memory chip or cartridge, a carrier wave as described hereinafter, or any other medium from which a computer can read.
Various forms of computer readable media may be involved in carrying one or more sequences of one or more instructions to processor 704 for execution. For example, the instructions may initially be carried on a magnetic disk of a remote computer. The remote computer can load the instructions into its dynamic memory and send the instructions over a telephone line using a modem. A modem local to computer system 700 can receive the data on the telephone line and use an infra-red transmitter to convert the data to an infra-red signal. An infra-red detector can receive the data carried in the infra-red signal and appropriate circuitry can place the data on bus 702. Bus 702 carries the data to main memory 706, from which processor 704 retrieves and executes the instructions. The instructions received by main memory 706 may optionally be stored on storage device 710 either before or after execution by processor 704. Computer system 700 also includes a communication interface 718 coupled to bus 702. Communication interface 718 provides a two-way data communication coupling to a network link 720 that is connected to a local network 722. For example, communication interface 718 may be an integrated services digital network (ISDN) card or a modem to provide a data communication connection to a corresponding type of telephone line. As another example, communication interface 718 may be a local area network (LAN) card to provide a data communication connection to a compatible LAN. Wireless links may also be implemented. In any such implementation, communication interface 718 sends and receives electrical, electromagnetic or optical signals that carry digital data streams representing various types of information. Network link 720 typically provides data communication through one or more networks to other data devices. 
For example, network link 720 may provide a connection through local network 722 to a host computer 724 or to data equipment operated by an Internet Service Provider (ISP) 726. ISP 726 in turn provides data communication services through the world wide packet data communication network now commonly referred to as the "Internet" 728. Local network 722 and Internet 728 both use electrical, electromagnetic or optical signals that carry digital data streams. The signals through the various networks and the signals on network link 720 and through communication interface 718, which carry the digital data to and from computer system 700, are exemplary forms of carrier waves transporting the information. Computer system 700 can send messages and receive data, including program code, through the network(s), network link 720 and communication interface 718. In the Internet example, a server 730 might transmit a requested code for an application program through Internet 728, ISP 726, local network 722 and communication interface 718. The received code may be executed by processor 704 as it is received, and/or stored in storage device 710, or other non-volatile storage for later execution. In this manner, computer system 700 may obtain application code in the form of a carrier wave. In the foregoing specification, the invention has been described with reference to specific embodiments thereof. It will, however, be evident that various modifications and changes may be made thereto without departing from the broader spirit and scope of the invention. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense.
