Method for verifying code transformers for an incorporated system, in particular in a chip card

7020872

Abstract

The invention relates to a method for verifying transformation (2) of a source code (1) into a transformed code (3) designed for an embedded system (7) such as in a smart card or other portable or mobile device including data processing resources. The method comprises at least the following steps: determining a single virtual machine that factors in the behavior of both of these codes (1, 3), determining for each source code (1) and transformed code (3) a plurality of auxiliary functions representing the residual differences between said source code (1) and transformed code (3), and a step for verifying a correspondence property between the auxiliary functions, the verification of the code transformation (2) being obtained from this last step.

Claims

The invention claimed is:

Description

FIELD OF THE INVENTION
Another subject of the invention is the application of such a method to a transformer or converter for generating a code designed to be stored in a chip (smart) card.

BRIEF DESCRIPTION OF THE DRAWINGS

The invention will now be described in greater detail in reference to the attached drawings, in which:
FIG. 1 schematically illustrates the process for transforming a source code into a final transformed code;
FIGS. 2A and 2B schematically illustrate one of the essential characteristics of the method according to the invention; and
FIG. 3 schematically illustrates the application of the method according to the invention to a chip card.

The following will describe in detail the method for verifying code transformers according to the invention.

DETAILED DESCRIPTION OF THE INVENTION

FIG. 1 schematically illustrates the method for transforming a code 1, which will be called a "source code" in the sense of an original or initial code, into a final code 3, called a "transformed code", by means of a code transformer or converter 2. The latter device can be a computing means or a specific piece of software. Ordinarily, the transformed code is designed to be resident in the embedded system 4 (solid line). The transformer or converter 2 can also be resident in or downloaded into the embedded system: reference 4′ (broken line). After being loaded into or stored in the embedded system 4-4′, the transformed code 3 makes it possible to execute one or more tasks as necessary, represented by the single reference 5. The embedded system 4 is assumed to have standard autonomous computing resources (not represented).

A priori, the code transformation is performed once and for all by a given transformer 2, or is repeated only on rare occasions, for example when a version of the original code or source code 1 is modified. It is therefore necessary to be able to establish a formal proof that the transformed code 3 is equivalent to the source code 1.
This process makes it possible to verify whether the transformer 2 is working correctly. However, as mentioned, if the two sets formed by the source and transformed codes are considered in their entirety, theory indicates that such a determination is generally not feasible in practice.

An essential characteristic of the method according to the invention consists of finding, for each of the two codes, two subsets that will be called the first and second subsets. According to an important characteristic of the method according to the invention, the first subsets form a virtual machine common to the two codes, source and transformed. For this reason, it is not necessary to verify the equivalence of the first subsets. On the other hand, the second subsets, constituted by the auxiliary functions, are different from one code to the other. The determination of the equivalence of the source and transformed codes is therefore reduced to determining the equivalence of all the pairs of auxiliary functions of the second subsets. The residual complexity of the auxiliary functions can be greatly reduced. It follows that determining the aforementioned equivalence becomes possible.

FIGS. 2A and 2B illustrate, in highly schematic fashion, the method according to the invention. As shown more particularly in FIG. 2A, the first subsets of the source code 1 and the transformed code 3 form a common virtual machine 13. The second subsets, 10 and 30, are each constituted by a series of so-called auxiliary functions, the equivalence of which must be verified. These auxiliary functions 10 and 30 parameterize the common virtual machine 13. The equivalence of the two codes, source 1 and transformed 3, is therefore reduced to verifying the equivalence of the auxiliary functions 10 and 30, two by two, as will be shown below in reference to FIG. 2B.

The steps of the method will now be described in greater detail.
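The factoring illustrated by FIG. 2A can be pictured with a minimal Python sketch. It is only an illustration under stated assumptions: the two-instruction toy language, the tables and the names `run`, `resolve_name`, `resolve_token` and `transform` are hypothetical and do not come from the invention. The interpreter `run` plays the role of the common virtual machine 13, and the single auxiliary function passed to it is the only part that differs between the source side and the transformed side.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical toy instruction set: ("push", operand) and ("add", None).
Instr = Tuple[str, object]

def run(code: List[Instr], resolve: Callable[[object], int]) -> List[int]:
    """Common virtual machine: the rules are shared, only `resolve` differs."""
    stack: List[int] = []
    for op, arg in code:
        if op == "push":
            stack.append(resolve(arg))      # call to the auxiliary function
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        else:
            raise ValueError(f"unknown instruction: {op}")
    return stack

# Source side (assumption): operands are names looked up in a table.
NAME_TABLE: Dict[str, int] = {"x": 2, "y": 40}

def resolve_name(name: object) -> int:
    return NAME_TABLE[name]

# Transformed side (assumption): operands are small integer tokens.
TOKEN_TABLE: List[int] = [2, 40]

def resolve_token(tok: object) -> int:
    return TOKEN_TABLE[tok]

# Toy transformer: replaces each name by its token.
TOKEN_OF: Dict[str, int] = {"x": 0, "y": 1}

def transform(code: List[Instr]) -> List[Instr]:
    return [(op, TOKEN_OF[arg] if op == "push" else arg) for op, arg in code]

source = [("push", "x"), ("push", "y"), ("add", None)]
transformed = transform(source)

# Both codes run on the same machine and leave the same observable stack.
assert run(source, resolve_name) == run(transformed, resolve_token)
```

In this sketch, showing that the two codes agree comes down to showing that `resolve_name` and `resolve_token` return the same value for every related name and token, which is a much smaller task than comparing the two interpreters as a whole.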
The source and transformed codes are associated with first and second virtual machines, respectively. The first step consists in defining a single virtual machine (or set of operational semantics) that makes it possible to factor in the behavior of the source code and the transformed code. The differences between the two codes therefore appear through auxiliary functions that will be interpreted or implemented differently in the two codes. A virtual machine may be represented by a set of rules with the following form:
The premises are either conditions for applying a rule, i.e. boolean expressions, or assignments of variables used to express a change of state. The premises use auxiliary functions to extract information on the state or to express conditions. Each rule indicates how the state of the machine changes when the premises are verified and the instruction "Instruction1" is encountered. One or more rules in this form are defined for each type of instruction in the code.

The second step consists in defining the data types or structures used in the two codes. It defines basic types, such as for example:
Basic ::= Nat | Bool | Name (2),
or constructed types, for example:
Environment ::= Name → Value
Instructions ::= Instruction1 | Instruction2 | ... (3).

The third step consists in interpreting the types, referenced θ, used in the virtual machines. For each type θ, it defines an interpretation for the source code [[θ]]S and an interpretation for the transformed code [[θ]]T, plus a relation R_θ between the two interpretations [[θ]]S and [[θ]]T. These relations, called logical relations, follow the structure of the types. For simple types, they must be explicitly defined; for structured types, they are deduced from the relations on the types of the components of the structure. For example, for pairs:
(a, b) R_{θ1×θ2} (a′, b′) ⇔ a R_{θ1} a′ ∧ b R_{θ2} b′ (4),
a relation wherein θ1 and θ2 are types and a, b, a′ and b′ are elements of those types. The same is true for functions:
ƒ R_{θ1→θ2} ƒ′ ⇔ ∀a, a′. a R_{θ1} a′ ⇒ ƒ a R_{θ2} ƒ′ a′ (5).
The logical relations must be the identity relation for the observable types, i.e. the types for which it is desirable to show that the two codes produce the same result. These are usually types that are printable and/or displayable on a computer screen. They can be basic types, but also structured types representing, for example, a stack or variables of a given program.

The fourth step consists in interpreting the auxiliary functions used in the virtual machines.
For each auxiliary function ƒ, its definition for the source code, written [[ƒ]]S, and its definition for the transformed code, written [[ƒ]]T, are given. Determining the equivalence consists of showing that the definitions of the auxiliary functions satisfy the logical relations. More precisely, for each auxiliary function ƒ : θ → θ′, we show:
[[ƒ]]S R_{θ→θ′} [[ƒ]]T (6).
It follows that the two virtual machines are related, i.e. that:
[[state]]S R_{type-state} [[state]]T (7).
Since the relations are the identity for the observable types, the source and transformed codes are observationally identical.

The last step consists of showing that there exists a transformer Γ (FIG. 1: 2) that satisfies the logical relations. This can be done by verifying that a given transformer Γ : S → T satisfies the logical relation associated with the type of its argument, S being the source code (FIG. 1: 1) and T being the transformed code (FIG. 1: 3). In order to do this, it is necessary for it to obey the following relation:
∀x ∈ [[θ]]S. x R_θ Γ(x) (8).
It has just been shown that the logical relations specify a set of constraints. It is therefore possible to extract a transformer 2 that is correct by construction, by applying refinement or extraction techniques, using one of the appropriate proof assistants.

The method according to the invention therefore offers an important advantage, since it allows for a substantial mechanization of the verification process, and above all makes it possible to perform it successfully, since this verification is performed on less complex subsets. Since the transformation of the source code 1 can be described as a succession of simpler transformations, this method can be applied so as to show each transformation independently. It follows that it offers a great advantage in terms of modularity.

The verification need only be performed on the subsets of auxiliary functions 10 and 30, as illustrated by FIG. 2B, by means of a hardware or software device 6.
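As an illustration of how a software device 6 might check relations (4) to (8), the following Python sketch lifts relations to pairs and to functions and tests them by exhaustive enumeration over small finite domains. It is a sketch under assumptions, not the method of the invention: the names `pair_rel`, `fun_rel` and `name_tok_rel` and the sample tables are hypothetical, and enumeration only stands in for the formal proof the description calls for.

```python
from itertools import product
from typing import Callable, Iterable

# A relation is modelled as a boolean predicate on (source value, transformed value).
Rel = Callable[[object, object], bool]

def identity_rel(a: object, b: object) -> bool:
    """Relation required for observable types: both sides must be equal."""
    return a == b

def pair_rel(r1: Rel, r2: Rel) -> Rel:
    """Relation (4): pairs are related when their components are related."""
    return lambda p, q: r1(p[0], q[0]) and r2(p[1], q[1])

def fun_rel(r_arg: Rel, r_res: Rel, dom_s: Iterable, dom_t: Iterable) -> Rel:
    """Relation (5), checked by enumeration: related arguments give related results."""
    dom_s, dom_t = list(dom_s), list(dom_t)

    def related(f: Callable, g: Callable) -> bool:
        return all(r_res(f(a), g(b))
                   for a, b in product(dom_s, dom_t) if r_arg(a, b))
    return related

# Hypothetical basic relation between names (source) and tokens (transformed).
NAME_TO_TOKEN = {"x": 0, "y": 1}

def name_tok_rel(name: object, tok: object) -> bool:
    return NAME_TO_TOKEN.get(name) == tok

# Two interpretations of one auxiliary function "lookup" (assumed data).
lookup_s = {"x": 2, "y": 40}.__getitem__   # source side, indexed by name
lookup_t = [2, 40].__getitem__             # transformed side, indexed by token

# Relation (6): the two interpretations are related at type Name -> Nat.
check = fun_rel(name_tok_rel, identity_rel, NAME_TO_TOKEN.keys(), range(2))
lookup_ok = check(lookup_s, lookup_t)

# A transformed table with swapped entries violates the relation.
lookup_bad = check(lookup_s, [40, 2].__getitem__)

# Relation (8): the toy transformer maps every name to a related token.
gamma = NAME_TO_TOKEN.__getitem__
transformer_ok = all(name_tok_rel(x, gamma(x)) for x in NAME_TO_TOKEN)
```

With the sample data, `lookup_ok` and `transformer_ok` hold while `lookup_bad` does not, which shows the kind of defect a pairwise check on auxiliary functions can expose.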
There are assumed to be n auxiliary functions, referenced 10a1, 10b1, ..., 10i, ..., 10n-1, 10n and 30a1, 30b1, ..., 30i, ..., 30n-1, 30n, respectively. If the device 6 is hardware, it comprises as many verification circuits 60a1, 60b1, ..., 60i, ..., 60n-1, 60n (arbitrarily represented in FIG. 2B by the symbol of a comparator) as there are pairs of auxiliary functions to be verified, for example the verification circuit 60i for the pair of functions 10i and 30i. The output or outputs of this device 6, with the single reference 61, indicate(s) that the logical relation between all the possible pairs of corresponding auxiliary functions of the source 1 and transformed 3 codes is satisfied. This series of operations is enough to provide formal proof of the equivalence of the two codes in their entirety.

It must be noted that the method according to the invention is just as usable a posteriori, i.e. in order to verify an existing transformer, as it is a priori, as an aid in developing a new transformer. It specifically makes it possible, in the latter case, to determine its characteristics so that it works correctly, in other words so that the transformed code that will be generated by this transformer from the source code satisfies the aforementioned equivalence requirement.

The method will now be described in the chip card context. FIG. 3 schematically illustrates the architecture of a chip card, referenced 7. In this figure, only those elements essential to a proper understanding of the method according to the invention are represented. The chip card 7 specifically comprises an input/output device 70 that allows communications with the outside world, a first fixed or programmable memory device 71 (of the ROM, PROM, EPROM or EEPROM type), and a read-write memory 72. Lastly, the chip card 7 comprises a microprocessor or microcontroller 73 that communicates with the other components of the chip card 7 through a bus.
The software architecture of such a chip card 7 complies with the ISO 7816-3 standard, which translates into protocol layers ranging from the lowest layers associated with the input/output devices 70 to the highest layers associated with the software applications stored in the memory of the chip card 7. This standard provides for the transmissions to take place in the serial mode.

The source code 1, once transformed by the code transformer or converter 2, is transmitted to the chip card 7 in order to be stored, generally in the fixed or "semi-fixed" memory device 71, via the input/output device 70. The software application or applications run by the chip card 7 can be stored permanently in the chip card 7, i.e. in the memory device 71, or temporarily in the read/write memory 72. In the latter case, the applications are downloaded via the input/output device 70.

In the example described, it is assumed that the chip card 7 is a multi-application or multi-user type card. It is therefore assumed that the chip card runs m software applications A1 through Am, written in the transformed language 3. One of the languages commonly used for chip cards, as mentioned above, is the "Java Card" language. It is a language dedicated to chip card programming, and it constitutes a restricted subset of the "Java" language. The card 7 can also store an additional converter that performs conversions on code segments in situ as they load.

The steps of the method according to the invention that have just been described in a general context will be illustrated more specifically within the context of the preferred application. As is known, an installation of the "Java Card" language involves a transformer or converter means that transforms so-called "class" files into "CAP" files. A class file is a unit of compilation and representation of the object code of a "Java" program. A CAP file groups all the classes of the same "Java Card package" and includes only one "constant pool."
A "Java Card package" is a "Java" construction for grouping classes and creating name spaces. A "constant pool" is a table associated with each class file for "Java" and with each "CAP" file for "Java Card." This table contains constants (character strings, integers, etc.). It is used in "Java" and "Java Card" virtual machines.

The transformation is nontrivial and global: it replaces all the names (of packages, classes, fields, methods) with entities called "tokens," i.e., 7- or 8-bit whole numbers. These "tokens" serve as indices for accessing tables. In addition, the transformation groups all the class files of the same package into a CAP file (with a merging of the "constant pools" and a reorganization of the method tables).

The "Java Card" language is specifically designed to be used in banking chip cards. It is therefore imperative to verify the accuracy of the transformation of a program (or "byte code") written for the "Java" virtual machine into a program written for the "Java Card" virtual machine, i.e. to prove the equivalence of these two programs. This formal proof is provided by executing the steps of the method according to the invention.

The first step consists in defining a set of operational semantics. One or more semantic rules are associated with each instruction of the "byte code." The "byte code" is a portable assembler code. It is the object code for "Java" or "Java Card" virtual machines. For example, the semantic rule associated with one of the instructions of this code, the "getfield" instruction, can be described as follows:
ƒ_ref := constant_pool(c)(i)
<c_ref, iv> := h(r)
v := iv(ƒ_ref)
----------------------------------------------------------
<getfield i; bc, r::ops, l, c, h> → <bc, v::ops, l, c, h> (9).
In the example, the state is composed of the code executed with the current instruction (getfield i; bc) leading, a stack of operands (r::ops), the local variables (l), a reference to the current class (c) and the heap (h).
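Rule (9) can be read as one step of an interpreter. The Python sketch below is a hedged illustration: the tuple encoding of the state, the dictionary used for the heap, the sample class "Point" and the function name `step_getfield` are assumptions made for this example, and the heap is assumed to be indexed by the reference r found on top of the operand stack. The function `constant_pool` is left as a parameter, since it is the auxiliary function that differs between the name-based and the "token"-based models.

```python
from typing import Callable, Dict, List, Tuple

# Assumed state encoding: (bytecode, operand stack, locals, current class, heap).
State = Tuple[List[tuple], List[object], Dict, object, Dict]

def step_getfield(state: State, constant_pool: Callable) -> State:
    """One application of rule (9); `constant_pool` is the auxiliary function."""
    bc, ops, l, c, h = state
    (opcode, i), rest = bc[0], bc[1:]
    if opcode != "getfield":
        raise ValueError("rule (9) only applies to a leading getfield")
    r, ops_rest = ops[0], ops[1:]        # reference on top of the operand stack
    f_ref = constant_pool(c)(i)          # f_ref := constant_pool(c)(i)
    c_ref, iv = h[r]                     # <c_ref, iv> := h(r)  (assumed reading)
    v = iv[f_ref]                        # v := iv(f_ref)
    return (rest, [v] + ops_rest, l, c, h)

# Hypothetical name-based data: one class "Point" with one field reference.
pool = {"Point": {0: "Point.x"}}

def constant_pool_name(c: object) -> Callable:
    return pool[c].__getitem__

heap = {"obj1": ("Point", {"Point.x": 7})}
state = ([("getfield", 0)], ["obj1"], {}, "Point", heap)

new_state = step_getfield(state, constant_pool_name)
# The reference is replaced by the field value and getfield is consumed.
assert new_state == ([], [7], {}, "Point", heap)
```

Because the rule itself never inspects how `constant_pool` finds the entry, the same `step_getfield` can be reused unchanged with a "token"-based constant pool.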
The rule specifies the operations performed during the execution of getfield i:
The second step consists in defining the types. In the case of the "Java Card" language, it defines the Word type for representing the unit of storage:
Word = Object_ref + Null + Boolean + Byte + Short (10).
As an example of a constructed type, the type of a constant pool is:
Constant_pool = CP_index → CP_info (11),
with:
CP_info = Class_ref + Method_ref + Field_ref (12).
In the example, a "constant pool" is seen as a function that takes an index (the type CP_index is considered to be basic) and returns an entry (in this case a reference to a class, a method or a field). The type of the "byte code" is:
Bytecode = Instruction + Bytecode; Bytecode
Instruction = getfield CP_index + invokevirtual CP_index + ... (13).
The "byte code" is an instruction sequence. The instruction type lists all of the instructions used in the "byte code" of "Java Card."

The third step consists in interpreting the types. In the case of "Java Card," the interpretation for the source code, in the form of class files (which use names), is written [[.]]name and the interpretation for the transformed code, in the form of CAP files (which use "tokens"), is written [[.]]tok. For example, for the source code the type [[CP_index]]name satisfies:
[[CP_index]]name = Class_name × Index (14).
In the name-based model, a "constant pool" index is constituted by a class name (to indicate the "constant pool" being referred to) and an index. For the transformed code, the type [[CP_index]]tok satisfies:
[[CP_index]]tok = Package_token × Index (15).
A "constant pool" index is constituted by a "package token" (in the example described, there is only one "constant pool" per "package" or CAP file) and an index. The relation R_CP_index is defined as a bijection such that:
(16) (c_name, i) R_CP
The name of the "package" of the class containing the "constant pool" being referred to in the name-based model should be in relation with the "token" of the "package" containing the "constant pool" being referred to in the "token"-based model.
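The two interpretations (14) and (15) and a relation between them can be sketched in Python as follows. This is an illustration under explicit assumptions: relation (16) is not fully legible here, so the sketch only encodes the stated requirement that the package name of the class be related to the package "token", and it models the correspondence between indices with a hypothetical table `INDEX_MAP`. All names and sample values (`PACK_NAME`, `PACKAGE_TOKEN`, the classes of "com.bank") are invented for the example.

```python
from typing import Dict, Tuple

# [[CP_index]]name = Class_name x Index ; [[CP_index]]tok = Package_token x Index
NameIndex = Tuple[str, int]
TokIndex = Tuple[int, int]

# Hypothetical data: two classes of one package, merged into one CAP constant pool.
PACK_NAME: Dict[str, str] = {"com.bank.Purse": "com.bank",
                             "com.bank.Pin": "com.bank"}
PACKAGE_TOKEN: Dict[str, int] = {"com.bank": 0}

# Assumed index correspondence after merging the per-class constant pools.
INDEX_MAP: Dict[NameIndex, int] = {("com.bank.Purse", 0): 0,
                                   ("com.bank.Purse", 1): 1,
                                   ("com.bank.Pin", 0): 2}

def r_package(p_name: str, p_tok: int) -> bool:
    """Package names are related to the token assigned to them."""
    return PACKAGE_TOKEN.get(p_name) == p_tok

def r_cp_index(name_idx: NameIndex, tok_idx: TokIndex) -> bool:
    """Sketch of a relation on constant pool indices (index part is an assumption)."""
    (c_name, i), (p_tok, i2) = name_idx, tok_idx
    return r_package(PACK_NAME[c_name], p_tok) and INDEX_MAP.get((c_name, i)) == i2

# Image of every name-based index under the assumed correspondence.
images = [(PACKAGE_TOKEN[PACK_NAME[c]], j) for (c, _), j in INDEX_MAP.items()]

# Bijection check on the sample: distinct name-based indices get distinct images.
is_injective = len(set(images)) == len(images)
```

On this sample, `r_cp_index(("com.bank.Pin", 0), (0, 2))` holds and `is_injective` is true; a converter that assigned the same "token"-based index to two different entries would fail the second check.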
The only constraint on the indices i and i′ is that R_CP

The fourth step consists in interpreting the auxiliary functions. For example, the version of the auxiliary function "constant_pool" for the name-based model is:
[[constant_pool]]name = cp_name (17),
with:
cp_name c = let (..., cp, ...) = env_name(pack_name(c))(c) (18).
The function pack_name takes a class name and returns a "package" name, and the function env_name takes a package name and a class name and finds in the class hierarchy the structure representing the designated class file. The constant pool is extracted from the class file. For the "token"-based model, the version of the auxiliary function [[constant_pool]]tok is:
[[constant_pool]]tok = cp_tok (19),
with:
cp_tok c = let (..., cp, ...) = env_tok(p) (20),
The "constant pool" is found in the environment (i.e., the CAP files) by means of the function env_tok and the package token.

The fifth step consists of proving that the auxiliary functions satisfy the logical relations. Referring again to the example of the function for accessing the "constant pool," it is necessary to determine that: [[constant_pool]]name R_cp The relation R_cp ∀(c_name, i) (p_tok, i′) such that (c_name, i) R_CP with:
(..., cp, ...) = env_name(pack_name(c_name))(c_name)
(..., cp′, ...) = env_tok(p_tok) (23).
The proof is based on the definition of R_CP (c_name, i) R_cp ⇔ pack_name(c_name) R_package

The sixth and last step of the method consists of determining that the transformation of the code and the data by the converter satisfies the given logical relations. For example, the references to "packages" are either names or "tokens" depending on the model. The associated logical relation R_package

By reading the above, it is easy to see that the invention achieves the objects set forth. It must be clear, however, that the invention is not limited to just the exemplary embodiments explicitly described, particularly in relation to FIGS. 2 and 3.

Finally, although the method has been described in detail in the case of the transformation of a program of the "Java" virtual machine into a program of the "Java Card" virtual machine, which is particularly advantageous for chip card or similar applications, the invention is not in any way limited to this particular application. The invention can be applied whenever the device involved has relatively limited computing resources, particularly in terms of memory size (read/write or fixed) and/or the computational power of the processor used.
For example, it applies to electronic books of the "e-book" type, designed to download and store data from Internet sites, to palmtop computers such as the so-called "organizers," to certain mobile telephones that can connect to the Internet, etc. In all of these cases, it is necessary to use an optimized language in order to use the integrated computing resources to best advantage.

While this invention has been described in conjunction with specific embodiments thereof, it is evident that many alternatives, modifications and variations will be apparent to those skilled in the art. Accordingly, the preferred embodiments of the invention as set forth herein are intended to be illustrative, not limiting. Various changes may be made without departing from the true spirit and full scope of the invention as set forth herein and defined in the claims.
