Method and apparatus for data hiding in images5870499Abstract A method of hiding a pattern in a host image increases and decreases parameter values at randomly selected host image locations assigned to respective first and second groups. The alteration modifies the statistical behavior of a test statistic equivalent to a linear combination of a large number of instances of respective functions, associated with the pattern, of the parameter values at first and second group locations. The presence or absence of the pattern in a test image is determined by comparing the experimental value of the test statistic associated with the pattern with the expected value of the same sum for an unaltered host image. Claims What is claimed is: Description FIELD OF THE INVENTION
______________________________________
normalized shift in expectation
sample size
y certainty of encoding
n (equation 11) .PHI.(y)
______________________________________
0 0 50.00%
319 1 84.13%
1276 2 97.72%
2870 3 99.87%
______________________________________
The uniform PDFs assumed for a.sub.i .degree. and b.sub.i .degree. in the foregoing analysis represent a worst case for the detectability of the encoded signature pattern. In fact, detection is facilitated as the luminance values in the original host image approach a PDF sharply peaked around a single luminance value (i.e., as the tonality of the image approaches uniformity). Yet, even with uniform PDFs, the aggregate effect of the changes in a sample set of only a few thousand pairs, even for values of .delta. smaller than 5 parts out of 256, is easily detectable. Even when the data from entire regions of the image are completely eliminated so that only some of the n pairs can be observed, (e. g., due to post-encoding cropping), the reliability of the test only degrades as the natural logarithm of the test image size. For large values of n, the presence of the signature can cause an experimental value of S.sub.n removed several standard deviations from the expected value of S'.sub.n even if the test image data has been corrupted to the extent that the commercial value of the image has been compromised. Large values of .delta. also promote ready interpretation of the experimental value, but the range of practical values is limited by the considerations of visibility of the alterations to the host image. In this respect, the present invention is especially amenable to invisibly embedding a signature in a color image. The relative insensitivity of the eye to luminance value in color images allows the use of luminance changes of large .delta. without displaying the encoding. Data thus embedded by luminance alteration is not lost by removal of color from the image. In order to determine whether or not the pattern has been embedded, knowledge of the sample set, including which of the selected locations were designated A.sub.i and which were designated B.sub.i, is presumed. A convenient way of preserving this information is to first generate the pairings using a key for a known pseudo-random number generator, for example by designating alternate numbers in the pseudo-random number series A.sub.i and B.sub.i. Since calculation of the test statistic does not actually require pairing the locations, another possibility is to designate the first n numbers in the series as A.sub.i and the second n numbers as B.sub.i. Knowledge of the key then enables recreation of the pairings for calculation of the experimental value of S.sub.n. However, intelligence of .delta. is neither required nor helpful in decoding. Restricting knowledge of .delta. prevents automated tampering with an encoded image to reconstruct the original host image data without seriously degrading the image. A statistic may be embedded using grouping schemes other than the pairings illustrated above. For example, the randomly generated locations may be partitioned into three groups, including n locations in each of two of the groups and 2n locations in the third group, so that each sample comprises locations A.sub.i, B.sub.i, C.sub.2i-1 and C.sub.2i. If the encoding increases the parameter value at each A.sub.i and B.sub.i and decreases the parameter value at each C.sub.2i-1 and C.sub.2i, then an experimental value of the statistic ##EQU7## will indicate whether or not the test image has been encoded. For the unaltered host image, ##EQU8## has expected value E(S'.sub.n)=0, and the expected value is shifted positively by (.delta..sub.a +.delta..sub.b +2.delta..sub.c) for every sample in the set. It is not necessary that the probability density function of S'.sub.n be zerocentric, as long as its expected value is known. For example, for the three-way grouping just presented, if encoding instead increases the parameter value at each A.sub.i and decreases the parameter value at each B.sub.i, C.sub.2i-1 and C.sub.2i, then the statistic of interest is ##EQU9## The corresponding sum ##EQU10## has expected value ##EQU11## the numerical value of which depends on the probability density function underlying the parameter values in the host image. Each sample in the set adds (.delta..sub.a +.delta..sub.b +2.delta..sub.c) to the expected value. The nonzero E(S'.sub.n) does not at all hinder calculation of the normalized shift of equation 11 or of the cumulative distribution function to estimate the certainty of encoding. This three-way grouping described by equation 13 could also be represented in terms of a two-group framework. An alpha group includes all of the locations at which parameter values are to be increased, namely the A locations, with .alpha..sub.j =a.sub.i. A beta group includes all of the locations at which parameter values are to be decreased, namely the B and C locations, with .beta..sub.k =b.sub.i for k=1 to n; .beta..sub.k =c.sub.2l-1 for k=n+l to 2n; and .beta..sub.k =c.sub.2l for k=2n+1 to 3n. A test statistic equivalent to the one of equation 13 is ##EQU12## with f (a.sub.j, j)=a.sub.j and g(.beta..sub.k, k)=.beta..sub.k. In this case, J=n and K=3n. In terms of formulating a test statistic, the only meaningful distinction among points is the direction in which coding adjusts their parameter value. In this regard, the fact that the beta group includes points altered by two distinct values of .delta. is irrelevant. The generalization of the technique can be summarized within a two-group framework as follows. The randomly chosen locations are divided into alpha and beta groups having respective parameter values .alpha..sub.j and .beta..sub.k which are respectively increased and decreased by encoding. It is not necessary that the sample set be decomposed into n distinct samples, of which each includes representatives from each group. Therefore, the numbers of locations belonging to the alpha and beta groups, J and K, may be unequal. Either of these groups may optionally encompass an arbitrary number of subgroups, each having its parameter values altered by a different magnitude by the encoding; and any subgroup may contain a number of locations different from the number contained by any other subgroup. The indicative experimental value S is equivalent to a linear combination of several instances of two arbitrary functions f(a.sub.j, j) and g(.beta..sub.k, k), which are not necessarily linear functions. It must be emphasized that although the functions f(a.sub.j, j) and g(.beta..sub.k, k) for the examples already given have been the identity function, this is not at all necessary. For example, an equally useful, though nonequivalent, formulation of the test statistic of equation 14 could include g(.beta..sub.k k)=3.beta..sub.k. In a digitally represented image, the locations at which the parameter values are adjusted correspond to patches, each a region in the image including several pixels, rather than to ungrouped individual pixels. (In analog images, the distinction between points and patches is arbitrary.) This approach has several benefits. In general, it shifts the noise introduced by the encoding into lower spatial frequencies, so that it is less likely to be removed by low-pass operations such as lossy compression. More specifically, by design of the patch contour, (i.e., the variation of .delta. over the patch area), the patchwise embodiment allows targeting of a particular range of spatial frequencies to contain the embedded data. The lower-frequency nature of patchwise encoding accommodates higher values of .delta. without displaying the encoding. Spreading each parameter alteration over several pixels also makes the encoding less vulnerable to destruction by blurring, cropping, affine transformation, and gamma or tone correction. Typically, the patch size scales with the dimensions of the image. In general, the patch size allowing the image to accommodate n equal to at least about 5,000 gives best results. In one approach, the patches are selected from cells defined by a grid mapped onto the image so as to assign each pixel of the image to a cell. Then the groupings used to specify the encoded pattern designate cells, of which the parameter values of the member pixels are altered. In decoding, the parameter value at an arbitrarily chosen position, such as the centroid of the patch, can be used to represent the patch in the experimental value of S.sub.n, since the parameter values of all of the points in the patch have been altered in the same direction. A simple rectilinear lattice defining square cells 50, as FIG. 2A shows. As the parameter values are altered over the patches defined by such a grid, the resulting discontinuity in, e.g. luminance, is concentrated in the regions near the corresponding cell borders 52. If n is large, so that most of the cells define altered patches, this lattice symmetry promotes visibility of the encoding. The symmetry of the hexagonal grid shown in FIG. 2B makes the border regions 54 between cells 56 less obvious to the eye. In another approach, the patches are scattered randomly across the image. If patches are constructed around points 58 randomly selected from all the points in the image, the result resembles the arrangement of patches 59 indicated by FIG. 2C. Such an arrangement minimizes the perceptible distortion introduced by encoding. The contour of a patch largely determines which spatial frequencies will by modified by the encoding. FIGS. 3A-3C show .delta., which can be represented graphically as the patch depth, as a function of position for three possible one-dimensional patch contours, and for each contour the approximate form of the frequency spectrum of a line over which such patches have been distributed in a pseudo-random manner. The patch demonstrated in FIG. 3A has sharp edges and covers a small area in the image. This contour concentrates most of the patch's energy in the higher-frequency portion of the image spectrum and thus increases the susceptibility of the encoding to removal by lossy compression. The broad, diffuse shape shown in FIG. 3B represents the other extreme, concentrating most of the information in lower frequencies. Data embedded using this profile is susceptible to corruption by operations such as contrast enhancement. The wide, sharp patch of FIG. 3C disperses the energy across the entire frequency spectrum. The patch contour should be chosen based on the modifications the image is expected to undergo after encoding. Spreading the patch energy across the spectrum is recommended if the post-encoding treatment of the image will include techniques affecting several spectral regions or is not known. Patchwise parameter alteration allows the use of larger values of .delta., which promotes clean interpretation of the experimental value of S.sub.n, without drawing attention to the encoding. The ability to adjust parameter values over a multi-pixel area allows smoother variation in patch depth around the edges of the patch. This rounding takes advantage of the lower sensitivity (about 1 part in 40) of the eye to smoothly changing luminance values compared to its sensitivity (about 1 part in 240) to discontinuous changes in a region of otherwise uniform luminance. The use of patching imparts resistance to the encoded information against blurring, cropping, and affine transformations that is not attainable with pixel-wise parameter alteration. This principle is illustrated with respect to rotation by the two identical 4.times.4 rectangular grids shown in FIG. 4. The first grid 60 has been rotated about 10.degree. clockwise with respect to a second grid 70 about their mutual center 65. The shaded regions 75 indicate the overlap between a given cell in the first lattice 60 and its analog in the second lattice 70. The overlap is substantial, and the center of every cell in the first lattice 60 falls in the corresponding cell in the second lattice 70. If an image were rotated after encoding with patches conforming to a rectilinear grid as shown in FIG. 2A, the resulting relationship between the encoded host image and the rotated image would be as shown in FIG. 4. Testing the image for the embedded signature pattern first requires mapping a grid onto the test image. Without intelligence of the rotation undergone by the test image since encoding, the decoder would impose the rectilinear array used in the encoding on the image. Although the mapping thus imposed would not perfectly match that used for the encoding, the overlap between the patches of the two grids would allow calculation of a meaningful test statistic. If the points to be examined for calculation of the test value are misidentified due to offset by cropping, translation, rotation, or scaling between the times of encoding and decoding, the utility of an experimental value of S.sub.n is degraded. The value of S.sub.n calculated in decoding is particularly sensitive to affine transformations undergone by the image. However, if information about the transformations undergone by the test image is available at decoding, the accuracy of the method in cases of post-encoding image alteration is improved. If a description of the geometrical transformation is available, the grid defining the patches can be adapted so that it better corresponds to that used for encoding. Moreover, if the original host image is available, the cropping or affine transformation may be determined by comparison of the test image with the original image, such as by using feature recognition techniques. Information about both cropping and affine transformations may also be recovered using affine coding, which uses a high bit-rate coding technique to embed a predefined reference pattern in the host image. Estimation of the geometric transformation undergone by the test image is achieved by comparing the original shape, size, and orientation of the reference pattern with that in the test image. Based on this information, an inverse affine transform can be applied to recover the original image, apart from portions removed by cropping. The parameters governing the sample size and the size, shape, contour, and depth of the patches can all be determined algorithmically at encoding or decoding for a particular image, based on criteria incorporating the above-mentioned considerations. The invention may also be used to advantage for marking photographs by embedding a signature pattern. One way of doing this is to embed the pattern on the photographic paper prior to exposing the paper to the negative image, thereby subjecting the paper to a double exposure: a first exposure to embed the signature and a second to transfer the image as usual. The embedding exposure provides a relatively low-level overall intensity that is uniform except for a pattern of regions of greater and lesser intensity arranged according to groupings of random locations as disclosed in the foregoing description. Later, the paper is exposed and the image developed as usual to produce a photograph. Another possibility is the construction of a digital camera that places a signature pattern on every photograph it takes. The signature can be decoded from any image suspected of being a copy of the photograph by examination of the image as already described. Refer now to FIG. 5, which illustrates, in block-diagram form, a hardware system incorporating the invention. As indicated therein, the system includes a system bus 155, over which all system components communicate, a mass storage device (such as a hard disk or optical storage unit) 157 as well as a main system memory 160. The operation of the illustrated system is directed by a central-processing unit ("CPU") 170. To facilitate rapid execution of the image-processing operations hereinafter described, the system preferably contains a graphics or image-processing board 172; this is a standard component well-known to those skilled in the art. The user interacts with the system using a keyboard 180 and a position-sensing device (e.g., a mouse) 182. The output of either device can be used to designate information or select particular areas of a screen display 184 to direct functions to be performed by the system. The main memory 160 contains a group of modules that control the operation of CPU 170 and its interaction with the other hardware components. An operating system 190 directs the execution of low-level, basic system functions such as memory allocation, file management and operation of mass storage devices 157. At a higher level, an analysis module 192, implemented as a series of stored instructions, directs execution of the primary functions performed by the invention, as discussed below: instructions defining a user interface 194 allow straightforward interaction over screen display 184. User interface 194 generates words or graphical images on display 184 to prompt action by the user, and accepts user commands from keyboard 180 and/or position-sensing device. A random number generator 186 creates the ordered series of pseudo-random numbers used in encoding or decoding. The main memory 160 also includes one or more input image buffers 196 that contain image(s), such as a host or test image, used as input for processing according to the invention and output image buffers 197 that contain an output image generated by that processing. The contents of each input or output image buffer define a "raster," i.e., a regular two-dimensional pattern of discrete pixel positions that collectively represent an image and may be used to drive (e.g., by means of image-processing board 172 or an image server) screen display 184 to display that image. The values of pixel parameters, such as luminance, contained at each memory location in an image buffer 196 or 197 directly governs the appearance of a corresponding pixel on display 184. One or more databases 198 contain encoding and/or decoding information, e. g., the output of the random number generator, the key used by it to generate the pseudo-random number series, the rule governing assignment of patches to groups, the description of patches, the test statistic formulation and expected value or descriptions of geometric transformation. One or more of the databases 198 may be associated with each one of the image buffers 196 or 197 and contain information specific to the image contained in the associated buffer; or, one database 198 may contain information generic to all images encoded or decoded by the apparatus. The databases may be stored in the mass storage device 157 in file(s) linked to file(s) containing the associated image(s). It must be understood that although the modules of main memory 160 have been described separately, this is for clarity of presentation only; so long as the system performs all necessary functions, it is immaterial how they are distributed within the system and its programming architecture. Likewise, although conceptually organized as grids, pixelmaps need not actually be stored digitally in this fashion. Rather, for convenience of memory utilization and transmission, the raster pattern is usually encoded as an ordered array of pixels. The host or test image may be provided in electronic or hardcopy format, in which case the image is processed by a digitizer 198 before encoding or decoding. The digitized image is sent as bitstreams on the bus 155 to an image buffer 196 of the main memory 160. The source or test image may be stored in the mass storage device 157 as well as in image buffers 196. As noted above, execution of the key tasks associated with the present invention is directed by analysis module 192, which governs the operation of CPU 170 and controls its interaction with main memory 160 in performing the steps necessary to encode a signature pattern in a host image or to detect the presence or absence of a signature pattern in a test image. In particular, the procedure followed by the hardware system for encoding a pattern in a host image is shown in FIG. 6. In a first step 200, the host image is loaded into a first one of input image buffers 196, so that it is available to analysis module 192. Then the module 192 establishes the encoding parameters in step 210. These parameters include the number, size, shape, arrangement, contour, and depths of the patches, the key for generating the pseudo-random number series in step 215, and the rule for assigning the numbers to the alpha and beta groups and any subgroups. In response to a user command, the module 192 either retrieves these parameters, from the user interface 194 or the appropriate database 198, or determines the appropriate parameters for encoding the host image based on the considerations outlined previously herein. If the chosen patch form so requires, this step also includes generating a grid defining the patches and mapping it onto the host image. The values determined for the parameters, as well as the entire series of pseudo-random numbers generated in step 215, may be retained in one of the databases 198. In step 215, the random number generator 186 provides an ordered series of pseudo-random numbers, each interpreted to correspond to a location, a pixel or grid cell, in the host image. The analysis module 192 assigns numbers in the series to alpha and beta groups to effect the assignment of locations to groups and any subgroups as described above. In step 220, the module 192 generates an output image by altering the pixel parameter values of the locations designated by the pseudo-random numbers generated in step 215 so as to embed the desired pattern. Specifically, the analysis module 192 changes the parameter values of the pixels associated with the generated pseudo-random numbers by increasing the values for pixels associated with numbers assigned to the alpha group and decreasing the values for pixels associated with numbers assigned to the beta group. The encoded output image is then stored in second one of the output image buffers 197. As shown in FIG. 7, for decoding a particular signature in a test image, in the first step 240, the image is first loaded into one of the image buffers 162. In step 245, the module 192 performs the grid mapping or pixel mapping onto the image. If the test image has been significantly geometrically altered since encoding, meaningful assignment of grid cells to regions of the image requires information about the alteration. In response to a user command, the module 192 either retrieves this information, from the user interface 194 or from one of the databases 198, or assesses the changes based on the properties of the host image or on data encoded in the test image. The module 192 accounts for this information in the mapping. In step 250, either the analysis module 192 retrieves the pseudo-random number series and group assignment associated with the pattern in question, or the random number generator 186 recreates it based on a key provided by the user or a database 198. In step 255, the module 192 accesses the test image stored in one of the image buffers 162 and computes the experimental value of a test statistic for the pattern being decoded. In step 260, the module 192 generates an indication of whether or not the signature pattern is present. This indication may entail simply showing on display 184 the calculated experimental value or the likelihood that it belongs to the test statistic probability density function of the unaltered host image. It will therefore be seen that the foregoing represents a highly extensible and advantageous approach to low-bit-rate data embedding, especially for signature marking of digitally represented images. The terms and expressions employed herein are used as terms of description and not of limitation, and there is no intention, in the use of such terms and expressions, of excluding any equivalents of the features shown and described or portions thereof, but it is recognized that various modifications are possible within the scope of the invention claimed. For example, the various modules of the invention can be implemented on a general-purpose computer using appropriate software instructions, or as hardware circuits, or as mixed hardware-software combinations (wherein, for example, pixel manipulation and rendering is performed by dedicated hardware components).
|
Same subclass Same class Consider this |
||||||||||
