Method and apparatus for generating encryption stream ciphers6490357Abstract A method and an apparatus for generating encryption stream ciphers are based on a recurrence relation designed to operate over finite fields larger than GF(2). A non-linear output can be obtained by using one or a combination of non-linear processes to form an output function. The recurrence relation and the output function can be selected to have distinct pair distances such that, as the shift register is shifted, no identical pair of elements of the shift register are used twice in either the recurrence relation or the output function. Under these conditions, the recurrence relation and the output function also can be chosen to optimize cryptographic security or computational efficiency. Claims I claim: Description BACKGROUND OF THE INVENTION
TABLE 1
Exponential Table
i xx0 xx1 xx2 xx3 xx4 xx5 xx6 xx7 xx8 xx9
00x 1 2 4 8 16 32 64 128 77 154
01x 121 242 169 31 62 124 248 189 55 110
02x 220 245 167 3 6 12 24 48 96 192
03x 205 215 227 139 91 182 33 66 132 69
04x 138 89 178 41 82 164 5 10 20 40
05x 80 160 13 26 52 104 208 237 151 99
06x 198 193 207 211 235 155 123 246 161 15
07x 30 60 120 240 173 23 46 92 184 61
08x 122 244 165 7 14 28 56 112 224 141
09x 87 174 17 34 68 136 93 186 57 114
10x 228 133 71 142 81 162 9 18 36 72
11x 144 109 218 249 191 51 102 204 213 231
12x 131 75 150 97 194 201 223 243 171 27
13x 54 108 216 253 183 35 70 140 85 170
14x 25 50 100 200 221 247 163 11 22 44
15x 88 176 45 90 180 37 74 148 101 202
16x 217 255 179 43 86 172 21 42 84 168
17x 29 58 116 232 157 119 238 145 111 222
18x 241 175 19 38 76 152 125 250 185 63
19x 126 252 181 39 78 156 117 234 153 127
20x 254 177 47 94 188 53 106 212 229 135
21x 67 134 65 130 73 146 105 210 233 159
22x 115 230 129 79 158 113 226 137 95 190
23x 49 98 196 197 199 195 203 219 251 187
24x 59 118 236 149 103 206 209 239 147 107
25x 214 225 143 83 166
III Memory Implementation When implemented in hardware, shifting bits is a simple and efficient operation. Using a processor and a shift register larger than the registers of the processor makes shifting bits an iterative procedure, which is very inefficient. When the units to be shifted are bytes or words, shifting becomes simpler because there is no carry between bytes. However, the shifting process is still iterative and inefficient. In the exemplary embodiment, the linear feedback shift register is implemented with a circular buffer or a sliding window. The diagrams showing the contents of circular buffer 24a at time n at time n+1 are shown in FIGS. 3A and 3B, respectively. For circular buffer 24a, each element of the shift register is stored in a corresponding location in memory. A single index, or pointer 30, maintains the memory location of the most recent element stored in memory, which is S.sub.k-1 in FIG. 3A. At time n+1, the new element S.sub.k is computed and stored over the oldest element S.sub.0 in memory, as shown in FIG. 3B. Thus, instead of shifting all elements in memory, pointer 30 is moved to the memory location of the new element S.sub.k. When pointer 30 reaches the end of circular buffer 24a, it is reset to the beginning (as shown in FIGS. 3A and 3B). Thus, circular buffer 24a acts as if it is a circle and not a straight line. Circular buffer 24a can be shifted from left-to-right, or right-to-left as shown in FIGS. 3A and 3B. Correspondingly, pointer 30 can move left-to-right, or right-to-left as shown in FIGS. 3A and 3B. The choice in the direction of the shift is a matter of implementation style and does not affect the output result. To generate an output element in accordance with a recurrence relation, more than one element is typically required from memory. The memory location associated with each required element can be indicated by a separate pointer which is updated when the register is shifted. Alternatively, the memory location associated with each required element can be computed from pointer 30 as necessary. Since there is a one-to-one mapping of each element to a memory location, a particular element can be obtained by determining the offset of that element from the newest element (in accordance with the recurrence relation), adding that offset to pointer 30, and addressing the memory location indicated by the updated pointer. Because of the circular nature of the memory, the calculation of the updated pointer is determined by an addition modulo k of the offset to pointer 30. Addition modulo k is simple when k is a power of two but is otherwise an inefficient operation on a processor. In the preferred embodiment, the shift register is implemented with sliding window 24b as shown in FIG. 3C. Sliding window 24b is at least twice as long as circular buffer 24a and comprises two circular buffers 32a and 32b arranged adjacent to each other. Each of circular buffers 32a and 32b behaves like circular 24a described above. Circular buffer 32b is an exact replica of circular buffer 32a. In normal operation, buffer 32b contains meaningful values. Values stored in buffer 32a are then calculated from the values in buffer 32b. Thus, each element of the shift register is stored in two corresponding locations in memory, one each for circular buffers 32a and 32b. Pointer 34 maintains the memory location of the most recent element stored in circular buffer 32a, which is S.sub.k-1 in FIG. 3C. In the exemplary embodiment, pointer 34 starts at the middle of sliding window 24b, moves right-to-left, and resets to the middle again when it reaches the end on the left side. From FIG. 3C, it can be observed that no matter where in circular buffer 32a pointer 34 appears, the previous k-1 elements can be addressed to the right of pointer 34. Thus, to address an element in the shift register in accordance with the recurrence relation, an offset of k-1 or less is added to pointer 34. Addition modulo k is not required since the updated pointer is always to the right of pointer 34 and computational efficiency is obtained. For this implementation, sliding window 24b can be of any length at least twice as long as circular buffer 24a, with any excess bytes being ignored. Furthermore, the update time is constant and short. IV. Exemplary Stream Cipher Based on LFSR Over GF(2.sup.8) The present invention can be best illustrated by an exemplary generator for a stream cipher based on a linear feedback shift register over GF(2.sup.8). The stream cipher described below uses the byte operations described above over the Galois field of order 8 with the representation of .sym. and x for operations of addition and multiplication, respectively, over the Galois field. In the exemplary embodiment, table lookup is utilized for the required multiplication with constants C.sub.j. In the exemplary embodiment, a sliding window is used to allow fast updating of the shift register. A block diagram of the exemplary generator is shown in FIG. 4. In the exemplary embodiment, linear feedback shift register 52 is 17 octets (or 136 bits) long which allows shift register 52 to be in 2.sup.136 -1 (or approximately 8.7.times.10.sup.40) states. The state where the entire register is 0 is not a valid state and does not occur from any other state. The time to update register 52 with a particular number of non-zero elements in the recurrence relation is constant irrespective of the length of register 52. Thus, additional length for register 52 (for higher order recurrence relation) can be implemented at a nominal cost of extra bytes in memory. In the exemplary embodiment, linear feedback shift register 52 is updated in accordance with the following recurrence relation: ##EQU4## where the operations are defined over GF(2.sup.8), +.multidot. is the exclusive-OR operation on two bytes represented by Galois adders 58, and .times..multidot. is a polynomial modular multiplication represented by Galois multipliers 54 (see FIG. 4). In the exemplary embodiment, the modular multiplications on coefficients 56 are implemented using byte table lookups on pre-computed tables as described above. In the exemplary embodiment, the polynomial modular multiplication table is computed using the irreducible polynomial defined by equation (3). The recurrence relation in equation (4) was chosen to be maximal length and to have few non-zero coefficients, so that the shift register elements used were distinct from the ones used for the non-linear functions below. In the exemplary embodiment, to disguise the linearity of shift register 52, two of the techniques described above are used, namely stuttering and using a non-linear function. Additional non-linearity techniques are utilized and are described below. In the exemplary embodiment, non-linearity is introduced by performing a non-linear operation on multiple elements of shift register 52. In the exemplary embodiment, four of the elements of shift register 52 are combined using a function which is non-linear. An exemplary non-linear function is the following: ##EQU5## where V.sub.n is the non-linear output (or the generator output), + .multidot. is the addition truncated modulo 256 represented by arithmetic adders 60, and .times. .multidot. is the multiplication modulo 257 represented by modular multiplier 62 as described below. In the exemplary embodiment, the four bytes used are S.sub.n, S.sub.n+2, S.sub.n+5 and S.sub.n+12, where S.sub.n is the oldest calculated element in the sequence according to the recurrence relation in equation (4). These elements are selected such that, as the register shifts, no two elements are used in the computation of two of the generator outputs. The pairwise distances between these elements are distinct values. For example, S.sub.n+12 is not combined with S.sub.n+5, S.sub.n+2, nor S.sub.n again as it is shifted through register 52. This property is referred to as a "full positive difference set." Simple byte addition, with the result truncated modulo 256, is made non-linear in GF(2.sup.8) by the carry between bits. In the exemplary embodiment, two pairs of elements in the register {(S.sub.n and S.sub.n+5) and (S.sub.n+2 and S.sub.n+12)} are combined using addition modulo 256 to yield two intermediate results. However, addition modulo 256 is not ideal since the least significant bits have no carry input and are still combined linearly. Another non-linear function which can be computed conveniently on a processor is multiplication. However, truncation of a normal multiplication into a single byte may not yield good results because multiplication modulo 256 does not form a group since the results are not well distributed within the field. A multiplicative group of the field of integers modulo the prime number 257 can be used. This group consists of integers in the range of 1 to 256 with the group operation being integer multiplication reduced modulo 257. Note that the value 0 does not appear in the group but the value 256 does. In the exemplary embodiment, the value of 256 can be represented by a byte value of 0. Typically, processors can perform multiplication instructions efficiently but many have no capability to perform, or to perform efficiently, divide or modulus instructions. Thus, the modulo reduction by 257 can represent a performance bottleneck. However, reduction modulo 257 can be computed using computation modulo 2.sup.n, which in the case of n=8 is efficient on common processors. It can be shown that for a value X in the range of 1 to 2.sup.16 -1 (where X is the result of a multiplication of two 8th order operands), reduction modulo 257 can be computed as: ##EQU6## where X.sub.257 is the reduction modulo 257 of X and X.sub.256 is the reduction modulo 256 of X. Equation (6) indicates that reduction modulo 257 of a 16-bit number can be obtained by subtracting the 8 most significant bits (X/256) from the 8 least significant bits (X.sub.256). The result of the subtraction is in the range of -255 and 255 and may be negative. If the result is negative, it can be adjusted to the correct range by adding 257. In the alternative embodiment, reduction modulo 257 can be performed with a lookup table comprising 65,536 elements, each 8 bits wide. Multiplication of the two intermediate results is one of many non-linear functions which can be utilized. Other non-linear functions, such as bent functions or permuting byte values before combining them, can also be implemented using lookup tables. The present invention is directed at the use of these various non-linear functions for producing non-linear output. In the exemplary embodiment, stuttering is also utilized to inject additional non-linearity. The non-linear output derived from the state of the linear feedback shift register as described above may be used to reconstruct the state of the shift register. This reconstruction can be made more difficult by not representing some of the states at the output of the generator, and choosing which in an unpredictable manner. In the exemplary embodiment, the non-linear output is used to determine what subsequent bytes of non-linear output appear in the output stream. When the generator is started, the first output byte is used as the stutter control byte. In the exemplary embodiment, each stutter control byte is divided into four pairs of bits, with the least significant pair being used first. When all four pairs have been used, the next non-linear output byte from the generator is used as the next stutter control byte, and so on. Each pair of stutter control bits can take on one of four values. In the exemplary embodiment, the action performed for each pair value is tabulated in Table 3.
TABLE 3
Pair
Value Action of Generator
(0, 0) Register is cycled but no output is produced
(0, 1) Register is cycled and the non-linear output XOR with
the constant (0 1 1 0 1 0 0 1).sub.2 becomes the output of the
generator. Register is cycled again.
(1, 0) Register is cycled twice and the non-linear output
becomes the output of the generator.
(1, 1) Register is cycled and the non-linear output XOR with
the constant (1 1 0 0 0 1 0 1).sub.2 becomes the output of the
generator.
As shown in Table 3, in the exemplary embodiment, when the pair value is (0, 0), the register is cycled once but no output is produced. Cycling of the register denotes the calculation of the next sequence output in accordance with equation (4) and the shifting this new element into the register. The next stutter control pair is then used to determine the action to be taken next. In the exemplary embodiment, when the pair value is (0, 1) the register is cycled, and the non-linear output is generated in accordance with equation (5). The non-linear output is XORed with the constant (0 1 1 0 1 0 0 1).sub.2, and the result is provided as the generator output. The register is then cycled again. In FIG. 4, the XORed function is performed by XOR gate 66, and the constant is selected by multiplexer (MUX) 64 using the stutter control pair from buffer 70. The output from XOR gate 66 is provided to switch 68 which provides the generator output and the output byte for stutter control in accordance with the value of the stutter control pair. The output byte for stutter control is provided to buffer 70. In the exemplary embodiment, when the pair value is (1, 0) the register is cycled twice and the non-linear output generated in accordance with equation (5) is provided as the generator output. In the exemplary embodiment, when the pair value is (1, 1) the register is cycled and the non-linear output generated in accordance with equation (5). The non-linear output is then XORed with the constant (1 1 0 0 0 1 0 1).sub.2, and the result is provided as the generator output. In the exemplary embodiment, the constants which are used in the above steps are selected such that when a generator output is produced, half of the bits in the output are inverted with respect to the outputs produced by the other stutter control pairs. For stutter control pair (1, 0), the non-linear output can be viewed as being XORed with the constant (0 0 0 0 0 0 0 0).sub.2. Thus, the Hamming distance between any of the three constants is four. The bit inversion further masks the linearity of the generator and frustrates any attempt to reconstruct the state based on the generator output. The present invention supports a multi-tier keying structure. A stream cipher which supports multi-tier keying structure is especially useful for wireless communication systems, wherein data are transmitted in frames which may be received in error or out-of-sequence. An exemplary two-tier keying structure is described below. In the exemplary embodiment, one secret key is used to initialized the generator. The secret key is used to cause the generator to take an unpredictable leap in the sequence. In the exemplary embodiment, the secret key has a length of four to k-1 bytes (or 32 to 128 bits for the exemplary recurrence relation of order 17). Secret keys of less than 4 bytes are not preferred because the initial randomization may not be adequate. Secret keys of greater than k-1 bytes can also be utilized but are redundant, and care should be taken so that a value for the key does not cause the register state to be set to all 0, a state which cannot happen with the current limitation. A flow diagram of an exemplary secret key initialization process is shown in FIG. 5. The process starts at block 110. In the exemplary embodiment, at block 112, the state of the shift register is first initialized with the Fibonacci numbers modulo 256. Thus, elements S.sub.0, S.sub.1, S.sub.2, S.sub.3, S.sub.4, S.sub.5, and so on, are initialized with 1, 1, 2, 3, 5, 8, and so on, respectively. Although Fibonacci numbers are used, any set of non-zero numbers which are not linearly related in the Galois field can be used to initialize the register. These numbers should not have exploitable linear relationship which can be used to reconstruct the state of the register. Next, the loop index n is set to zero, at block 114. The secret key initialization process then enters a loop. In the first step within the loop, at block 116, the first unused byte of the key material is added to S.sub.n. Addition of the key material causes the generator to take an unpredictable leap in the sequence. The key is then shifted by one byte, at block 118, such that the byte used in block 116 is deleted. The register is then cycled, at block 120. The combination of blocks 116 and 120 effectively performs the following calculation; ##EQU7## where K is the first unused byte of the key material. The loop index n is incremented, at block 122. A determination is then made whether all key materials have been used, at block 124. If the answer is no, the process returns to block 116. Otherwise, the process continues to block 126. In the exemplary embodiment, the length of the key is added to S.sub.n, at block 126. Addition of the length of the key causes the generator to take an additional leap in the sequence. The process then enters a second loop. In the first step within the second loop, at block 128, the register is cycled. The loop index n is incremented, at block 130, and compared against the order k of the generator, at block 132. If n is not equal to k, the process returns to block 128. Otherwise, if n is equal to k, the process continues to block 134 where the state of the generator is saved. The process then terminates at block 136. In addition to the secret key, a secondary key can also be used in the present invention. The secondary key is not considered secret but is used in an exemplary wireless telephony system to generate a unique cipher for each frame of data. This ensures that erased or out-of-sequence frames do not disrupt the flow of information. In the exemplary embodiment, the stream cipher accepts a per-frame key, called a frame key, in the form of a 4-octet unsigned integer. The per-frame initialization is similar to the secret key initialization above but is performed for each frame of data. If the use of the stream cipher is such that it is unnecessary to utilize per-frame key information, for example for file transfer over a reliable link, the per-frame initialization process can be omitted. A flow diagram of an exemplary per-frame initialization process with the frame key is shown in FIG. 6A. The process starts at block 210. In the exemplary embodiment, at block 212, the state of the generator is initialized with the state saved from the secret key initialization process as described above. Next, the loop index n is set to zero, at block 214. The per-frame initialization process then enters a loop. In the first step within the loop, at block 216, the least significant byte of the frame key is added modulo 256 to S.sub.n. The frame key is then shifted by three bits, at block 218, such that the three least significant bits used in block 216 are deleted. The register is then cycled, at block 220. In the exemplary embodiment, the loop index n is incremented at block 222 and compared against value '11' at block 224. The value of '11', as used in block 224, corresponds to the 32 bits used as the frame key and the fact that the frame key is shifted three bits at a time. Different selections of the frame key and different numbers of bits shifted at a time can result in different comparison values used in block 224. If n is not equal to '11', the process returns to block 216. Otherwise, if n is equal to '11', the process continues to block 226 and the register is cycled again. The loop index n is incremented, at block 228, and compared against 2k, at block 230. If n is not equal to 2k, the process returns to block 226. Otherwise, if n is equal to 2k, the process terminates at block 232. The present invention has been described for the exemplary Galois finite field having 256 elements. Different finite fields can also be utilized such that the size of the elements matches the byte or word size of the processor used to manipulate the elements and/or the memory used to implement the shift register, or having other advantages. Thus, various finite fields having more than two elements can be utilized and are within the scope of the present invention. The example shown above utilizes a variety of non-linear processes to mask the linearity of the recurrence relation. Other generators can be designed utilizing different non-linear processes, or different combinations of the above described non-linear processes and other non-linear processes. Thus, the use of various non-linear processes to generate non-linear outputs can be contemplated and is within the scope of the present invention. The example shown above utilizes a recurrence relation having an order of 17 and defined by equation (4). Recurrence relation having other orders can also be generated and are within the scope of the present invention. Furthermore, for a given order, various recurrence relations can be generated and are within the scope of the present invention. In the present invention, a maximal length recurrence relation is preferred for optimal results. V. A Second Exemplary Stream Cipher Based on LFSR Over GF(2.sup.8) Both the recurrence relation and the non-linear function access elements of the shift register. Just which elements are accessed are chosen so that the distances between the elements form a "full positive difference set" ("On Security of Nonlinear Filter Generators", J. Dj. Golic, in Proceedings of Fast Software Encryption 1996 Cambridge Workshop, Springer-Variag 1996.) These elements are then portioned between the recurrence relation and the nonlinear function to maximize the spread for each. Under these constraints, the present invention can be further developed to enhance cryptographic security and computational efficiency. The second exemplary embodiment provides improved cryptographic security as compared with the first exemplary embodiment. The LFSR over GF(2.sup.8) is equivalent, mathematically, to eight parallel shift registers over GF(2) of length 136, each with the same recurrence relation. The exemplary embodiment of the present invention includes a recurrence relation over GF(2.sup.8), which is equivalent to a binary recurrence relation whose characteristic polynomial has 51 non-zero coefficients. The three tap positions in the recurrence are determined by the criterion outlined above (i.e., "full positive difference set"). Ideally, the degree 136 polynomial over GF(2), for best strength against cryptanalysis and maximum diffusion, should have approximately half of its coefficients as 1. There are many polynomials over GF(2.sup.8) which have three coefficients which approach this goal, but all three of the coefficients are greater than 1. This means that using such polynomials would require three lookup tables and references, which is less efficient than the current implementation of the present invention. Such polynomials would, however, be perfectly acceptable on the grounds of theoretical security. With a goal of getting the best possible equivalent binary polynomial while retaining the current structure with a coefficient of 1 (which avoids a multiplication table and lookup), analysis indicates that the use of 65 non-zero binary coefficients can provide a preferred embodiment that nearly achieves the goal of 68 non-zero coefficients. There are 16 polynomials over GF(2.sup.8) meeting these criteria. There are always groups of 8 polynomials over GF(2.sup.8) which have the same equivalent binary polynomial; these are just shifted bit positions in the byte. (Each equivalent binary polynomial can be found, for example, by the Berlekamp-Massey algorithm.) Thus, as shown in Table 4, there are two distinct types of polynomials meeting this criterion. For the second exemplary embodiment of the present invention, the first set of coefficient in Table 4 was used.
TABLE 4
Recurrence Coefficients
S.sub.n S.sub.n+1 S.sub.n+15 Type
99 1 206 1
106 1 201 1
142 1 126 1
148 1 214 1
203 1 146 1
210 1 19 1
213 1 195 1
222 1 136 1
40 1 109 2
45 1 38 2
46 1 159 2
57 1 129 2
110 1 209 2
117 1 63 2
32 1 219 2
140 1 97 2
A block diagram of the second exemplary generator is shown in FIG. 7. In this exemplary embodiment, linear feedback shift register 82 is 17 octets long although other lengths for register 82 (for different order recurrence relation) can be implemented and are within the scope of the present invention. A recurrence relation of order 17 is well suited for applications using up to 128-bit key material. In this exemplary embodiment, linear feedback shift register 82 is updated in accordance with the following recurrence relation: S.sub.n+17 =(206xS.sub.n+15).sym.S.sub.n+4.sym.(99xS.sub.n) (8) where the operations are defined over GF(2.sup.8), .sym. is the exclusive-OR operation on two bytes represented by Galois adders 88, and x is a polynomial modular multiplication represented by Galois multipliers 84 (see FIG. 7). In this exemplary embodiment, the modular multiplications on coefficients 86 are implemented using byte table lookups on pre-computed tables as described above. The recurrence relation in equation (8) was chosen to be maximal length. In this exemplary embodiment, to disguise the linearity of shift register 82, two of the techniques described above are used, namely stuttering and using a non-linear function. Additional non-linear techniques are described elsewhere in the present specification. In this exemplary embodiment, non-linearity is introduced by combining four of the elements of shift register 82 using a function (or output equation) which is non-linear with respect to the linear operation over GF(2.sup.8). In this exemplary embodiment, the four bytes used are S.sub.n, S.sub.n+2, S.sub.n+5 and S.sub.n+12, where S.sub.n is the oldest calculated element in the sequence according to the recurrence relation in equation (8). Much of the cryptographic security of the present invention comes from the use of the non-linear function to defeat attacks against the stuttering phase so that it is desirable to make this function as strong, that is, as non-linear, as possible. Numerous possible functions have been tried so as to compare the non-linear function to its nearest linear approximation in each bit position, and calculating the mean absolute deviation and root-mean-square deviation from 0.5, which is the theoretically perfect result. Studies have indicated that superior solutions result from rotating partial sums, a process which has carry effects in the high order bits, so that these bits are combined with the least significant bits of other elements. On a microprocessor, the addition function will generally accept only two operations at a time, so the best apparent strategy will be to rotate after one intermediate addition. Denoting the rotation operation as ROTL(x), meaning the result of rotating the bits of x to the left by 1 position, a far superior non-linear function is: V=ROTL(S.sub.n +S.sub.n+2)+S.sub.n+5 +S.sub.n+12 (9) Here V.sub.n is the non-linear output and + is addition truncated modulo 256 (with the overflow discarded) represented by arithmetic adders 90. ROTL denotes the rotation operator 91. An additional rotation after adding S.sub.n+5 does not appear to yield a better result. As discussed elsewhere in the present specification, using lookup tables which implement explicitly non-linear permutations provides another alternative, but would significantly degrade the computational efficiency of the present invention. In this exemplary embodiment, the bytes used for recurrence relation (8) comprise S.sub.n, S.sub.n+4, and S.sub.n+15 and the bytes used for output equation (9) comprise S.sub.n, S.sub.n+2, S.sub.n+5 and S.sub.n+12. In this exemplary embodiment, these bytes are selected to have distinct pair distances. For recurrence relation equation (8), the three bytes used have pair distances of 4 (the distance between S.sub.n and S.sub.n+4), 11 (the distance between S.sub.n+4 and S.sub.n+15), and 15 (the distance between S.sub.n and S.sub.n+15). Similarly, for output equation (9), the four bytes used have pair distances of 2 (the distance between S.sub.n and S.sub.n+2), 3 (the difference between S.sub.n+2 and S.sub.n+5), 5 (the distance between S.sub.n and S.sub.n+5), 7 (the distance between S.sub.n+5 and S.sub.+12), 10 (the distance between S.sub.n+2 and S.sub.n+12), and 12 (the distance between S.sub.n and S.sub.n+12). The pair distances in recurrence relation (8) (i.e., 4, 11, and 15) are unique (or distinct) within that first respective group and that the pair differences in output equation (9) (i.e., 2, 3, 5, 7, 10, and 12) are also distinct within that second respective group. Furthermore, the pair distances in recurrence relation (8) are distinct from the pair distances in output equation (9). Distinct pair distances ensure that, as shift register 82 shifts, no particular pair of elements of shift register 82 are used twice in either recurrence relation (8) or the non-linear output equation (9). This property removes linearity in the subsequent output equation (9). In this exemplary embodiment, multiplexer (MUX) 92, XOR gate 94, switch 96, and buffer 98 in FIG. 7 operate in the manner described above for MUX 64, XOR gate 66, switch 68, and buffer 70 in FIG. 4. A flow diagram of a second exemplary per frame initialization process is shown in FIG. 6B, which is a modification of the flow diagram of FIG. 6A. This embodiment uses the non-linear function during the secondary key-loading process so as to mix the key information in more quickly than before, thereby allowing a shorter mixing run before generating output. This feature prevents the register state from being a linear subspace of the total set of states of the register. The key bytes are added in to the 15.sup.th byte of the register, rather than the zeroth so as to speed diffusion, this being one of the recurrence relation elements. When the "frame" is being loaded, 8 bits are put in at a time. In addition to adding the octet from "frame", this approach also adds the output from "nltap". After "frame" has been loaded, this approach continues cycling the resister and adding the output for some number of cycles. Thus, in comparing FIG. 6B with FIG. 6A, block 218 is modified so that the frame is shifted by 8 bits to remove the 8 least significant bits. New block 219 adds the output from the non-linear function. And finally the value check in block 224 is changed from 11 to 4. VI. A Third Exemplary Stream Cipher Based on LFSR Over GF(2.sup.8) As discussed above, the present invention can be further developed to enhance crypographic security and computational efficiency while maintaining a "full positive difference set." The third exemplary embodiment provides improved computational efficiency as compared with the first exemplary embodiment. Simpler recurrence relations can be used, at the cost of having simpler binary equivalent polynomials, which may make cryptanalysis easier. Firstly, given the constraints of the full positive difference set, by allowing the coefficients of S.sub.n+4 and to both be 1, a multiplication table and corresponding table lookup can be avoided. There are 8 such recurrences, with the same equivalent binary polynomial with 35 non-zero coefficients. These have as the coefficients of S.sub.n : 40, 45, 46, 57, 110, 117, 132 and 140, respectively. Even simpler polynomials are possible, if some internal coefficients are permitted to be zero. In this case, not only the multiplication but the entire reference to the extra term can be removed. There are 32 such recurrences; 8 have an equivalent binary polynomial with 11 non-zero coefficients, while the other 24 have three equivalent binary polynomials with 13 non-zero coefficients. Of these, 8 have the coefficient of 1 associated with the S.sub.n+1 term, while the other 16 have it associated with the S.sub.n+4 term. The equivalent binary polynomial for the former 8 appears, visually, to have the non-zero coefficients more "spread out" than the others, so for a minimum time implementation of the present invention, those recurrences would be used. The coefficients of the Sf term can be any of 79, 83, 166, 187, 225, 239, 243 and 252. For the third exemplary embodiment of the present invention, the first coefficient was used. The recurrence relation then becomes: S.sub.n+17 =79S.sub.n +S.sub.n+15. (11) On a common 8-bit microprocessors, references to the elements of the shift register are relatively expensive. Removing one of these references entirely would seem possible, without affecting the security too much. The element S.sub.n+2 is chosen to be removed, to "spread" the values as much as possible. It is still advantageous to rotate the intermediate sum however, as the non-linearity of the less significant bits is still not as good as would be desired. In fact, the optimum rotation in this case is by four places. Many microprocessors implement a "nybble-swap" instruction which achieves this operation. Using the notation SWAP( ) to mean rotating the byte by four places, the non-linear function becomes: V.sub.n =SWAP(S.sub.n +S.sub.n+5)+S.sub.n+12 (12) A block diagram of the third exemplary generator is shown in FIG. 8. In this exemplary embodiment, linear feedback shift register 102 is 17 octets long although other lengths for register 102 (for different order recurrence relation) can be implemented and are within the scope of the present invention. A recurrence relation of order 17 is well suited for applications using up to 128-bit key material. In this exemplary embodiment, linear feedback shift register 102 is updated in accordance with the following recurrence relation (11), where the operations are defined over GF(2.sup.8), .sym. is the exclusive-OR operation on two bytes represented by Galois adders 108, and x is a polynomial modular multiplication represented by Galois multipliers 104. In this exemplary embodiment, the modular multiplications on coefficient 106 are implemented using byte table lookups on pre-computed tables as described above. The recurrence relation in equation (11) was chosen to be maximal length. Here V.sub.n is the non-linear output and + is addition truncated modulo 256 (with the overflow discarded) represented by arithmetic adders 110. SWAP denotes the swap operator 111. In this exemplary embodiment, switch 116 and buffer 118 in FIG. 8 operate in the manner described above for switch 68 and buffer 70 in FIG. 4. During the stuttering phase, the nonlinear outputs are, in two cases, XORed with constant terms. (See Table 3) In this embodiment, these calculations are omitted. The previous description of the preferred embodiments is provided to enable any person skilled in the art to make or use the present invention. The various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without the use of the inventive faculty. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
|
Same subclass Same class Consider this |
||||||||||
