CHAPTER-2
IMAGE WATERMARKING LITERATURE SURVEY
Within the field of watermarking, image watermarking in particular has attracted a lot of attention in the research community. Most of the research work is dedicated to image watermarking as compared to audio and video. There may be three reasons for this. Firstly, test images are readily available; secondly, an image carries enough redundant information to provide an opportunity to embed watermarks easily; and lastly, it may be assumed that any successful image watermarking algorithm can be upgraded for video as well.
Images are represented/stored in spatial domain as well as in transform domain. The
transform domain image is represented in terms of its frequencies; whereas, in spatial
domain it is represented by pixels. In simple terms, transform domain means the image is
segmented into multiple frequency bands. To transfer an image to its frequency
representation, we can use several reversible transforms like Discrete Cosine Transform
(DCT), Discrete Wavelet Transform (DWT), or Discrete Fourier Transform (DFT). Each
of these transforms has its own characteristics and represents the image in different ways.
Watermarks can be embedded within images by modifying these values, i.e. the
transform domain coefficients. In case of spatial domain, simple watermarks could be
embedded in the images by modifying the pixel values or the Least Significant Bit (LSB)
values. However, more robust watermarks could be embedded in the transform domain of
images by modifying the transform domain coefficients. In 1997, Cox et al. presented the paper “Secure Spread Spectrum Watermarking for Multimedia” [19], one of the most cited papers in the field (cited 2985 times till April 2008, as per a Google Scholar search), and much of the subsequent research builds on this work. Even though spatial domain based techniques cannot sustain most of the common attacks, such as compression or high pass and low pass filtering, researchers still present spatial domain based schemes. Brief introductions to some classical, well-known spatial domain based schemes are given as follows [19]:
2.1 SPATIAL DOMAIN BASED WATERMARKING SCHEMES
2.1.1 LSB BASED SCHEMES
In their paper, Macq and Quisquater [60] briefly discussed the issue of watermarking
digital images as part of a general survey on cryptography and digital television. The
authors provided a description of a procedure to insert a watermark into the least
significant bits of pixels located in the vicinity of image contours. Since it relies on
modifications of the least significant bits, the watermark is easily destroyed. Further, their
method is restricted to images, in that it seeks to insert the watermark into image regions
that lie on the edge of contours.
Rhoads [79] described a method that adds or subtracts small random quantities from each
pixel. Addition or subtraction is determined by comparing a binary mask of bits with the
LSB of each pixel. If the LSB is equal to the corresponding mask bit, then the random
quantity is added; otherwise it is subtracted. The watermark is extracted by first computing the difference between the original and watermarked images and then examining the sign of the difference, pixel by pixel, to determine whether it corresponds to the original sequence of additions and subtractions. This method does not make use of
perceptual relevance, but it is proposed that the high frequency noise be prefiltered to
provide some robustness to lowpass filtering. This scheme does not consider the problem
of collusion attacks.
2.1.2 PATCHWORK BASED SCHEMES
Another well-known spatial domain based scheme is the patchwork-based technique given
by Bender et al. [7]. They described two watermarking schemes. The first is a statistical
method called patchwork. Patchwork randomly chooses pairs of image points, and
increases the brightness at one point by one unit while correspondingly decreasing the
brightness of another point. The second method is called “texture block coding” wherein
a region of random texture pattern found in the image is copied to an area of the image
with similar texture. Autocorrelation is then used to recover each texture region. The
most significant problem with this scheme is that it is only appropriate for images that
possess large areas of random texture. The scheme could not be used on images of text.
Other patchwork-based algorithms can be found in [110, 124].
2.1.3 CORRELATION BASED WATERMARKING SCHEMES
The most straightforward way to add a watermark to an image in the spatial domain is to
add a pseudorandom noise pattern to the luminance values of its pixels. Many methods
are based on this principle [6, 11, 27, 33-34, 53, 68, 70, 91, 95, 114-117].
2.1.3.1 CORRELATION BASED SCHEMES WITH ONE PN SEQUENCE: A well
known technique for watermark embedding is to exploit the correlation properties of
additive pseudo-random noise patterns as applied to an image [42, 52]. A Pseudo-random
Noise (PN) pattern W (x, y) is added to the cover image I (x, y), according to the
Equation 2.1 given below:
I_W(x, y) = I(x, y) + k * W(x, y) ……………………………………………………… (2.1)
In Equation 2.1, k denotes a gain factor and I_W the resulting watermarked image.
Increasing k increases the robustness of the watermark at the expense of the quality of the
watermarked image. To retrieve the watermark, the same pseudo-random noise generator
algorithm is seeded with the same key, and the correlation between the noise pattern and
possibly watermarked image is computed. If the correlation exceeds a certain threshold T,
the watermark is detected, and a single bit is set. This method can easily be extended to a
multiple-bit watermark by dividing the image into blocks and performing the above
procedure independently on each block.
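As an illustrative sketch of this procedure (not any cited author's implementation), the embedding of Equation 2.1 and the correlation detector can be written in Python on a flattened list of pixel values. The demeaning step in the detector is an added assumption that suppresses the cover-image term in the correlation; the key, gain k and threshold are illustrative.

```python
import random

def pn_pattern(key, n):
    """Seeded pseudo-random +/-1 noise pattern W."""
    rng = random.Random(key)
    return [rng.choice((-1, 1)) for _ in range(n)]

def embed(pixels, key, k=4):
    """Equation 2.1: I_W = I + k * W, here on a flattened pixel list."""
    pn = pn_pattern(key, len(pixels))
    return [p + k * w for p, w in zip(pixels, pn)]

def detect(pixels, key, threshold=1.0):
    """Regenerate W from the same key and correlate it with the image."""
    pn = pn_pattern(key, len(pixels))
    mean = sum(pixels) / len(pixels)   # demean to suppress the cover term
    corr = sum((p - mean) * w for p, w in zip(pixels, pn)) / len(pixels)
    return corr > threshold

cover = [120, 121, 119, 118, 122, 120, 121, 119] * 32  # toy 1-D "image"
marked = embed(cover, key=42)
```

With the same key, the correlation on the marked signal sits near k, while an unmarked signal stays below the threshold.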
2.1.3.2 CORRELATION-BASED IMAGE WATERMARKING SCHEMES WITH
TWO PN SEQUENCES: The basic algorithm given in the previous section can be improved
in a number of ways. First, the notion of a threshold being used for determining a logical
“1” or “0” can be eliminated by using two separate pseudo-random noise patterns. One
pattern is designated a logical “1” and the other a logical “0”. The above procedure is
then performed once for each pattern, and the pattern with the higher resulting correlation
is used. This increases the probability of correct detection, even after the image has been
subject to attack [42, 52].
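The threshold-free two-pattern decoding can be sketched as follows (again a simplified 1-D toy; the two seeds and the gain k are illustrative assumptions):

```python
import random

def pn_pattern(key, n):
    rng = random.Random(key)
    return [rng.choice((-1, 1)) for _ in range(n)]

def correlation(pixels, pattern):
    mean = sum(pixels) / len(pixels)
    return sum((p - mean) * w for p, w in zip(pixels, pattern)) / len(pixels)

def embed_bit(pixels, bit, key_one, key_zero, k=4):
    """Add the pattern designated for the bit value."""
    pattern = pn_pattern(key_one if bit else key_zero, len(pixels))
    return [p + k * w for p, w in zip(pixels, pattern)]

def decode_bit(pixels, key_one, key_zero):
    """No threshold: the pattern with the higher correlation wins."""
    c1 = correlation(pixels, pn_pattern(key_one, len(pixels)))
    c0 = correlation(pixels, pn_pattern(key_zero, len(pixels)))
    return 1 if c1 > c0 else 0

block = [100, 101, 99, 100] * 256  # toy 1-D "block"
```

Because the decision compares two correlations instead of testing one against a fixed threshold, the detector no longer depends on choosing T well.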
2.1.3.3 IMAGE WATERMARKING USING PRE-FILTERING: We can further
improve the basic algorithm by pre-filtering the image before applying the watermark. If
we can reduce the correlation between the cover image and the PN sequence, we can
increase the immunity of the watermark to additional noise. By applying the edge
enhancement filter shown below in Figure 2.1, the robustness of the watermark can be
improved with no loss of capacity and very little reduction of image quality [42, 52].
F_edge = (1/2) * [ -1  -1  -1
                   -1  10  -1
                   -1  -1  -1 ]
Figure 2.1: FIR Edge Enhancement Pre-Filter
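The filter of Figure 2.1 can be applied with a plain 3x3 convolution. Its coefficients sum to one, so flat regions pass through unchanged while edges are amplified, which is what reduces the cover-image term in the correlation. This is a sketch with simplified border handling (the one-pixel border is left at zero):

```python
# Kernel from Figure 2.1: (1/2) * [[-1,-1,-1],[-1,10,-1],[-1,-1,-1]]
F_EDGE = [[-0.5, -0.5, -0.5],
          [-0.5,  5.0, -0.5],
          [-0.5, -0.5, -0.5]]

def convolve3x3(img, kernel):
    """Plain 3x3 convolution over the image interior."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = sum(kernel[j][i] * img[y + j - 1][x + i - 1]
                            for j in range(3) for i in range(3))
    return out

flat = [[100.0] * 5 for _ in range(5)]
filtered = convolve3x3(flat, F_EDGE)   # interior of a flat image is unchanged
```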
2.1.4 CDMA BASED IMAGE WATERMARKING SCHEME
Rather than determining the values of the watermark from “blocks” in the spatial domain,
we can employ CDMA spread-spectrum schemes to scatter each of the bits randomly
throughout the cover image, thus increasing capacity and improving resistance to
cropping. The watermark is first formatted as a long string rather than a 2D image. For
each value of the watermark, a PN sequence is generated using an independent seed.
These seeds could either be stored or themselves generated through PN methods. The
summation of all of these PN sequences represents the watermark, which is then scaled
and added to the cover image [42, 52].
To detect the watermark, each seed is used to generate its PN sequence which is then
correlated with the entire image. If the correlation is high, that bit in the watermark is set
to “1”, otherwise a “0”. The process is then repeated for all the values of the watermark.
CDMA improves on the robustness of the watermark significantly but it requires more
computation.
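The CDMA embedding and recovery described above can be sketched as follows (a toy on a flattened pixel list, not a cited implementation; using the bit index itself as the per-bit seed, and the gain k, are illustrative assumptions):

```python
import random

def pn_pattern(seed, n):
    rng = random.Random(seed)
    return [rng.choice((-1, 1)) for _ in range(n)]

def embed(pixels, bits, k=2):
    """Each bit gets its own full-image PN sequence from an independent seed."""
    out = list(pixels)
    for seed, bit in enumerate(bits):
        sign = 1 if bit else -1          # map 1 -> +W, 0 -> -W
        for i, w in enumerate(pn_pattern(seed, len(pixels))):
            out[i] += k * sign * w
    return out

def recover(pixels, nbits):
    """Correlate each seed's sequence with the whole image; the sign gives the bit."""
    mean = sum(pixels) / len(pixels)
    bits = []
    for seed in range(nbits):
        pattern = pn_pattern(seed, len(pixels))
        c = sum((p - mean) * w for p, w in zip(pixels, pattern)) / len(pixels)
        bits.append(1 if c > 0 else 0)
    return bits

cover = [128] * 1024
message = [1, 0, 1, 1]
marked = embed(cover, message)
```

Since each bit is spread over every pixel, cropping removes only a fraction of each sequence rather than whole bits.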
2.1.5 OTHER SPATIAL DOMAIN BASED WATERMARKING SCHEMES
In [104], a method that embeds a binary watermark image in the spatial domain is
proposed. A spatial transform that maps each pixel of the watermark image to a pixel of
the host image, is used. Chaotic spread of watermark image pixels in the host image is
achieved by “toral automorphisms”. For watermark embedding, the intensity of the
selected pixels is modified by an appropriate function that takes into account
neighborhood information in order to achieve watermark robustness to modifications. For
detection, a suitable function is applied on each of the watermarked pixels to determine
the binary digit (0 or 1) that has been embedded. The inverse spatial transform is then
used to reconstruct the binary watermark image.
In the method proposed in [69], the image is split into two random subsets A and B and
the intensity of pixels in A is increased by a constant embedding factor k. Watermark
detection is performed by evaluating the difference of the mean values of the pixels in
subsets A and B. This difference is expected to be equal to k for a watermarked image
and equal to zero for an image that is not watermarked. Hypothesis testing can be used to
decide on the existence of the watermark. The above algorithm is vulnerable to lowpass operations. Extensions to the above algorithm are proposed in [64]. According to this paper, the robustness of the method can be increased by grouping pixels so as to form
blocks of certain dimensions to enhance the low pass characteristics of the watermark
signal. Alternatively, one can take advantage of the fact that a different embedding factor can be used for each pixel to shape the watermark signal appropriately. An optimization
procedure that calculates the appropriate embedding value for each pixel so that the
energy of the watermark signal is concentrated at low frequencies is proposed.
Constraints that ensure that the watermark signal is invisible can be incorporated in the
optimization procedure.
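The basic scheme of [69] can be sketched as follows (a toy sketch; the pseudo-random 50/50 split and the constant k are illustrative, and the detection threshold k/2 is an added assumption):

```python
import random

def split_indices(n, key):
    """Pseudo-randomly split pixel indices into two subsets A and B."""
    rng = random.Random(key)
    a, b = [], []
    for i in range(n):
        (a if rng.random() < 0.5 else b).append(i)
    return a, b

def embed(pixels, key, k=3):
    a, _ = split_indices(len(pixels), key)
    out = list(pixels)
    for i in a:                 # raise the intensity only in subset A
        out[i] += k
    return out

def detect(pixels, key, k=3):
    """mean(A) - mean(B) is about k if watermarked, about 0 otherwise."""
    a, b = split_indices(len(pixels), key)
    diff = (sum(pixels[i] for i in a) / len(a)
            - sum(pixels[i] for i in b) / len(b))
    return diff > k / 2

cover = [100] * 500
```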
In [45] the authors derived analytical expressions for the probabilities P-, P+ of false
negative and false positive watermark detection. Their model assumes an additive
watermark and a correlator-based detection stage. Both white watermarks and watermarks with low pass characteristics are considered. The host image is treated as
noise, assuming a first order separable autocorrelation function. The probabilities P-, P+
are expressed in terms of the watermark to image power ratio. The authors conclude that
detection error rates are higher for watermarks with low pass characteristics.
In the last 12 years, the number of publications in this area has been increasing very rapidly, and no survey can cover all the presented schemes; however, there are some very good survey papers, and the interested reader may explore [3, 13, 54, 76]. We limit the discussion of spatial domain based schemes here.
2.2 TRANSFORMED DOMAIN BASED SCHEMES
As presented in the literature, transformed domain based watermarking schemes are more
robust as compared to simple spatial domain watermarking schemes. Such algorithms are
robust against simple image processing operations like low pass filtering, brightness and
contrast adjustment, blurring etc. However, they are difficult to implement and are
computationally more expensive. We can use the Discrete Fourier Transform (DFT), the Discrete Cosine Transform (DCT), or the Discrete Wavelet Transform (DWT), but the DCT is the most exploited one. A general transformed domain based scheme, as presented by Cox, is shown in Figure 2.2. A very good discussion on DCT/DWT/DFT
based watermarking schemes is given in [76].
2.2.1 DFT BASED WATERMARKING SCHEMES
We start with the DFT. A few algorithms modify the DFT magnitude and phase coefficients to embed watermarks. Ruanaidh et al. proposed a DFT watermarking scheme in which the watermark is embedded by modifying the phase information within the
DFT. It has been shown that phase based watermarking is robust against image contrast operations [114]. Later, Ruanaidh and Pun showed how the Fourier-Mellin transform could be used for digital watermarking. The Fourier-Mellin transform amounts to applying the Fourier transform to an image in log-polar coordinates.
This scheme is robust against geometrical attacks [116]. De Rosa et al. proposed a
scheme to insert watermark by directly modifying the mid frequency bands of the DFT
magnitude component [115]. Ramkumar et al. also presented a data hiding scheme based
on DFT, where they modified the magnitude component of the DFT coefficients. Their
simulations suggest that magnitude DFT survives practical compression which can be
attributed to the fact that most practical compression schemes try to maximize the PSNR.
Hence using magnitude DFT is a way to exploit the hole in most practical compression
schemes.
Figure 2.2: A General Frequency domain based watermarking model as presented by Cox [19]
The proposed scheme is shown to be resistant to Joint Photographic Experts Group (JPEG) and Set Partitioning In Hierarchical Trees (SPIHT) compression [68]. Lin et al.
presented a RST resilient watermarking algorithm. In their algorithm, the watermark is
embedded in the magnitude coefficients of the Fourier transform re-sampled by log-polar
mapping. The scheme is, however, not robust against cropping and shows weak
robustness against JPEG compression (Q = 70) [53]. Solachidis and Pitas presented a
novel watermarking scheme. They embed a circularly symmetric watermark in the
magnitude of the DFT domain [8]. Since the watermark is circular in shape with its centre
at image center, it is robust against geometric rotation attacks. The watermark is centered
around the mid frequency region of the DFT magnitude. Neighborhood pixel variance
masking is employed to reduce any visible artifacts. Recovering from rotation is computationally inexpensive in this scheme. Robustness against cropping, scaling, JPEG
compression, filtering, noise addition and histogram equalization is demonstrated. A
semi-blind watermarking scheme has been proposed by Ganic and Eskicioglu [30]. They embed two circular watermarks, one in the lower frequencies and the other in the higher frequencies.
2.2.2 DCT BASED WATERMARKING SCHEMES
DCT domain watermarking can be classified into Global DCT watermarking and Block
based DCT watermarking. One of the first algorithms, presented by Cox et al. [19], used the global DCT approach to embed a robust watermark in the portion of the image that is perceptually significant to the Human Visual System (HVS). Embedding in the perceptually significant portion of the image has its own advantages, because most compression schemes remove the perceptually insignificant portion: in the spatial domain this corresponds to the LSBs, while in the frequency domain it corresponds to the high frequency components.
As described in [76], steps in DCT Block Based Watermarking Algorithm are:
1) Segment the image into non-overlapping blocks of 8x8;
2) Apply forward DCT to each of these blocks;
3) Apply some block selection criteria (e.g. HVS);
4) Apply coefficient selection criteria (e.g. highest);
5) Embed watermark by modifying the selected coefficients; and
6) Apply inverse DCT transform on each block.
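The six steps above can be sketched for a single 8x8 block as follows. This is a minimal sketch: block and coefficient selection (steps 3-4) are reduced to one fixed mid-band position, and the "modification" (step 5) simply forces the sign of that coefficient; the position (3, 4) and strength k are assumptions, not from [76].

```python
import math

N = 8
def _c(u):
    return math.sqrt(1.0 / N) if u == 0 else math.sqrt(2.0 / N)

def dct2(b):
    """Step 2: forward orthonormal 8x8 DCT-II of one block, indexed B[v][u]."""
    return [[_c(u) * _c(v) * sum(
        b[y][x] * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                * math.cos((2 * y + 1) * v * math.pi / (2 * N))
        for y in range(N) for x in range(N))
        for u in range(N)] for v in range(N)]

def idct2(B):
    """Step 6: inverse DCT, back to pixels img[y][x]."""
    return [[sum(_c(u) * _c(v) * B[v][u]
        * math.cos((2 * x + 1) * u * math.pi / (2 * N))
        * math.cos((2 * y + 1) * v * math.pi / (2 * N))
        for v in range(N) for u in range(N))
        for x in range(N)] for y in range(N)]

def embed_bit(block, bit, pos=(3, 4), k=10.0):
    """Steps 4-5: force the sign of one assumed mid-band coefficient."""
    B = dct2(block)
    v, u = pos
    B[v][u] = abs(B[v][u]) + k if bit else -(abs(B[v][u]) + k)
    return idct2(B)

def read_bit(block, pos=(3, 4)):
    v, u = pos
    return dct2(block)[v][u] > 0

block = [[100.0] * 8 for _ in range(8)]
rec = idct2(dct2(block))   # sanity check: the transform pair is invertible
```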
Most DCT based algorithms differ from each other in steps 3 and 4, i.e. they differ either in the block selection criteria or in the coefficient selection criteria. Initially, Koch, Rindfrey, and Zhao [7] proposed a method for watermarking images. In that method, they
break up an image into 8x8 blocks and compute discrete cosine transform (DCT) of each
of these blocks. A pseudorandom subset of the blocks is chosen and then in each such
block, a triplet of frequencies is selected from one of 18 predetermined triplets and
modified so that their relative strengths encode a ‘1’ or ‘0’ value. The 18 possible triplets
are composed by selection of three out of eight predetermined frequencies within the 8x8
DCT block. The choice of the eight frequencies to be altered within the DCT block is
based on a belief that the “middle frequencies have moderate variance,” i.e. they have
similar magnitude. This property is used to allow the relative strength of the frequency
triplets to be altered without requiring a modification that would be perceptually
noticeable.
Several DCT based schemes are presented in [8, 17-19, 21, 37, 71, 74, 81, 99, 118]. Using the DCT, an image can easily be split up into pseudo frequency bands, so that the watermark can conveniently be embedded in the most important middle band frequencies.
Furthermore, the sensitivity of the HVS to DCT basis images has been extensively studied, which resulted in the recommended JPEG quantization table [112]. These
results can be used for predicting and minimizing the visual impact of the distortion
caused by the watermark. Finally, the block-based DCT is widely used for image and
video compression. By embedding a watermark in the same domain as the compression
scheme used to process the image (in this case in the DCT domain), we can anticipate
lossy compression because we are able to anticipate which DCT coefficients will
be discarded by the compression scheme. Furthermore, we can exploit the DCT
decomposition to make real-time watermark applications.
Further improvements in the performance of DCT-domain correlation-based watermarking systems could be achieved by using watermark detectors based on a generalized Gaussian model instead of the widely used pure Gaussian assumption [35]. By
performing a theoretical analysis for DCT-domain watermarking methods for images, the
authors in [35] provided analytical expressions which could be used to measure
beforehand the performance expected for a certain image and to analyze the influence of
the image characteristics and system parameters (e.g. watermark length) on the final
performance. Furthermore, the result of this analysis may help in determining the proper
detection threshold T to obtain a certain false positive rate. The authors in [35] claimed
that by abandoning the pure Gaussian noise assumption, some substantial performance
improvements could be obtained.
In [4], the authors embedded a watermark signal in the DCT domain by modifying a number of predefined DCT coefficients. They used a weighting factor to weight the watermark signal in the spatial domain according to HVS characteristics. In [75], the authors embedded watermark data in the DCT domain in a perceptually meaningful way, using the Just Noticeable Difference (JND) as predicted by the model reported in [108].
2.2.2.1 THE MIDDLE-BAND COEFFICIENT EXCHANGE SCHEME [42, 52]:
The middle-band frequencies (FM) of an 8x8 DCT block are shown in Figure 2.3. In this
Figure, FL is used to denote the lower frequency components of the block and FH is used
to denote the higher frequency components. FM is chosen as embedding region to
provide additional resistance to lossy compression techniques, while avoiding significant
modification of the cover image. First, the 8x8 block DCT of the original image is taken. Then, for each 8x8 block, two locations DCT (u1, v1) and DCT (u2, v2) are chosen from the FM region for comparison. These locations are selected based on the recommended JPEG quantization table shown in Figure 2.4. If two locations are chosen such that they have
identical quantization values, then any scaling of one coefficient will scale the other by
the same factor to preserve their relative strength. It may be observed from Figure 2.4,
that coefficients at location (4, 1) and (3, 2) or (1, 2) and (3, 0) are more suitable
candidates for comparison because their quantization values are equal. The DCT block
will encode a “1” if DCT (u1, v1) > DCT (u2, v2); otherwise it will encode a “0”. The
coefficients are swapped if the relative size of coefficients does not agree with the bit that
is to be encoded [42, 52].
Thus, instead of embedding any additional data, this scheme hides the watermark by interpreting “0” or “1” from the relative values of two fixed locations in the middle frequency region.
Figure 2.3: Frequency regions FL, FM and FH in an 8 x 8 DCT block
Swapping of such coefficients will not alter the watermarked image significantly, as it is
generally believed that DCT coefficients of middle frequencies have similar magnitudes.
Further, the robustness of the watermark can be improved by introducing a watermark
“strength” constant k, such that DCT (u1, v1) – DCT (u2, v2) > k. If coefficients do not
meet these criteria, they are modified by the use of random noise to satisfy the relation.
Increasing k thus reduces the chance of detection errors at the expense of additional
image degradation. With a larger k, the dominant coefficient remains dominant even after heavy compression, which aids decoding, since the relative values of the pair decide the decoded watermark bit.
To extract the watermark, the 8x8 DCT of the image is taken again: a “1” is decoded if DCT (u1, v1) > DCT (u2, v2); otherwise a “0” is decoded.
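The exchange logic can be sketched directly on an already-transformed 8x8 coefficient block (the DCT itself is omitted here; and instead of adding random noise when the margin is insufficient, as the scheme describes, this sketch simply pushes the pair apart by k):

```python
def embed_bit(B, bit, k=5.0, p1=(4, 1), p2=(3, 2)):
    """Encode one bit in DCT block B via the order of two mid-band coefficients."""
    a, b = B[p1[0]][p1[1]], B[p2[0]][p2[1]]
    hi, lo = max(a, b), min(a, b)
    if hi - lo < k:              # enforce the strength margin k
        hi = lo + k
    if bit:                      # "1": coefficient at p1 must dominate
        B[p1[0]][p1[1]], B[p2[0]][p2[1]] = hi, lo
    else:                        # "0": coefficient at p2 must dominate
        B[p1[0]][p1[1]], B[p2[0]][p2[1]] = lo, hi
    return B

def read_bit(B, p1=(4, 1), p2=(3, 2)):
    return 1 if B[p1[0]][p1[1]] > B[p2[0]][p2[1]] else 0

# toy stand-in for a DCT coefficient block
block = [[float(r * 8 + c) for c in range(8)] for r in range(8)]
```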
Figure 2.4: JPEG Quantization matrix
Limitation of the middle-band coefficient exchange scheme: Experimental results show that middle-band coefficient exchange is quite robust against JPEG compression, cropping, noise addition and other common image manipulation operations. However, the scheme has one serious drawback: if only one pair of coefficients is used (say (4, 1) and (3, 2)) to hide the watermark data, it is vulnerable to a collusion attack. By analyzing four or five watermarked copies of an image, one can easily discover that these coefficients always follow a certain pattern, and an attacker can predict the watermark as well as destroy it.
2.2.2.2 DCT-CDMA BASED IMAGE WATERMARKING [42, 52]: In this
technique, the authors embedded a PN sequence W into the middle frequencies of the DCT
block. A DCT block can be modulated using the Equation 2.2.
I_W(u, v) = I(u, v) + k * W(u, v),   (u, v) ∈ FM
I_W(u, v) = I(u, v),                 (u, v) ∉ FM  ………………………………………….. (2.2)
For each 8 x 8 block of the image, the DCT for the block is first calculated. In that block,
the middle frequency components FM are added to the PN sequence W, multiplied by a
gain factor k. Each block is then inverse-transformed to give the final watermarked image
IW.
The watermarking procedure is made somewhat more adaptive by slightly altering the
embedding process to the method shown in Equation 2.3.
I_W(u, v) = I(u, v) * (1 + k * W(u, v)),   (u, v) ∈ FM
I_W(u, v) = I(u, v),                       (u, v) ∉ FM  ……………………………... (2.3)
This slight modification scales the strength of the watermarking based on the size of the
particular coefficients being used. Larger values of k can thus be used for coefficients of
higher magnitude; in effect strengthening the watermark in regions that can afford it;
weakening it in other regions.
For detection, the image is broken up into same 8x8 blocks and a DCT is taken. The same
PN sequence is then compared to the middle frequency values of the transformed block.
If the correlation between the sequences exceeds some threshold T, a “1” is detected for
that block; otherwise a “0” is detected. Again k denotes the strength of the watermarking,
where increasing k increases the robustness of the watermark at the expense of quality.
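Equation 2.2 and the correlation detector can be sketched on a single coefficient block as follows (the choice of FM as the anti-diagonal band 5 ≤ u+v ≤ 8, the key, and the gain k are illustrative assumptions; the forward/inverse DCT steps are omitted and the code operates on coefficients directly):

```python
import random

# a rough middle-frequency region F_M of an 8x8 DCT block (assumed)
F_M = [(u, v) for u in range(8) for v in range(8) if 5 <= u + v <= 8]

def pn_pattern(key, n):
    rng = random.Random(key)
    return [rng.choice((-1, 1)) for _ in range(n)]

def embed_block(B, key, k=5.0):
    """Equation 2.2: I_W(u,v) = I(u,v) + k*W(u,v) for (u,v) in F_M."""
    for (u, v), w in zip(F_M, pn_pattern(key, len(F_M))):
        B[v][u] += k * w
    return B

def detect_block(B, key, threshold=1.0):
    """Correlate the regenerated W with the block's middle-band coefficients."""
    vals = [B[v][u] for (u, v) in F_M]
    pattern = pn_pattern(key, len(F_M))
    corr = sum(x * w for x, w in zip(vals, pattern)) / len(F_M)
    return corr > threshold

block = [[0.0] * 8 for _ in range(8)]
block[0][0] = 800.0                    # DC coefficient of a flat block
marked = embed_block([row[:] for row in block], key=5)
```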
2.2.3 DWT BASED WATERMARKING SCHEMES
If watermarking techniques can exploit the characteristics of the Human Visual System
(HVS), it is possible to hide watermarks with more energy in an image, which makes
watermarks more robust. From this point of view, the DWT is a very attractive transform,
because it can be used as a computationally efficient version of the frequency models for
the HVS [5]. For instance, it appears that the human eye is less sensitive to noise in high
resolution DWT bands and in the DWT bands having an orientation of 45° (i.e., HH
bands). Furthermore, DWT image and video coding, such as embedded zero-tree
wavelet (EZW) coding, are included in the upcoming image and video compression
standards, such as JPEG2000 [112]. Thus DWT decomposition can be exploited to
make a real-time watermark application.
Many approaches apply the basic schemes described at the beginning of this section
to the high resolution DWT bands, LH, HH, and HL [35, 40]. A large number of
algorithms operating in the wavelet domain have been proposed till date.
Figure 2.5: 1-Scale and 2-Scale 2-Dimensional Discrete Wavelet Transform
2.2.3.1 CDMA-DWT BASED WATERMARKING SCHEME: This is the most straightforward scheme; its embedding is similar to that of the DCT-CDMA scheme. The embedding of a CDMA sequence in the frequency bands is shown in Equation 2.4.
W_i = W_i + α * x_i,   (u, v) ∈ HL, LH
W_i = W_i,             (u, v) ∈ LL, HH  ………………………………………………. (2.4)
where Wi denotes the coefficient of the transformed image, xi the bit of the watermark to
be embedded, and α a scaling factor. To detect the watermark, the same pseudo-random sequence used during embedding is regenerated, and its correlation with the two transformed detail bands is computed. If the correlation exceeds some threshold T, the watermark is detected.
This can easily be extended to multiple-bit messages by embedding multiple watermarks into the image. As in the spatial version, a separate seed is used for each PN sequence, and the resulting sequences are then added to the frequency coefficients. During detection, if the correlation
exceeds T for a particular sequence a “1” is recovered; otherwise a “0”. The recovery
process then iterates through the entire PN sequence until all the bits of the watermark
have been recovered.
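The scheme can be sketched end-to-end with a one-level Haar transform, written out directly rather than with a wavelet library (the flat test image, key, and gain α are illustrative assumptions):

```python
import random

def haar2d(img):
    """One-level 2-D Haar transform, computed on 2x2 pixel blocks."""
    h2, w2 = len(img) // 2, len(img[0]) // 2
    LL = [[0.0] * w2 for _ in range(h2)]; LH = [[0.0] * w2 for _ in range(h2)]
    HL = [[0.0] * w2 for _ in range(h2)]; HH = [[0.0] * w2 for _ in range(h2)]
    for y in range(h2):
        for x in range(w2):
            a, b = img[2 * y][2 * x], img[2 * y][2 * x + 1]
            c, d = img[2 * y + 1][2 * x], img[2 * y + 1][2 * x + 1]
            LL[y][x] = (a + b + c + d) / 2
            HL[y][x] = (a - b + c - d) / 2    # horizontal detail
            LH[y][x] = (a + b - c - d) / 2    # vertical detail
            HH[y][x] = (a - b - c + d) / 2    # diagonal detail
    return LL, LH, HL, HH

def ihaar2d(LL, LH, HL, HH):
    """Exact inverse of haar2d."""
    h2, w2 = len(LL), len(LL[0])
    img = [[0.0] * (2 * w2) for _ in range(2 * h2)]
    for y in range(h2):
        for x in range(w2):
            s, v, h, dd = LL[y][x], LH[y][x], HL[y][x], HH[y][x]
            img[2 * y][2 * x] = (s + h + v + dd) / 2
            img[2 * y][2 * x + 1] = (s - h + v - dd) / 2
            img[2 * y + 1][2 * x] = (s + h - v - dd) / 2
            img[2 * y + 1][2 * x + 1] = (s - h - v + dd) / 2
    return img

def embed(img, key, alpha=2.0):
    """Equation 2.4: add a +/-1 CDMA pattern to the LH and HL bands only."""
    LL, LH, HL, HH = haar2d(img)
    rng = random.Random(key)
    for band in (LH, HL):
        for row in band:
            for x in range(len(row)):
                row[x] += alpha * rng.choice((-1, 1))
    return ihaar2d(LL, LH, HL, HH)

def detect(img, key, threshold=1.0):
    """Regenerate the same pattern and correlate it with the detail bands."""
    _, LH, HL, _ = haar2d(img)
    rng = random.Random(key)
    corr = n = 0
    for band in (LH, HL):
        for row in band:
            for x in range(len(row)):
                corr += row[x] * rng.choice((-1, 1))
                n += 1
    return corr / n > threshold

flat = [[100.0] * 16 for _ in range(16)]
marked = embed(flat, key=3)
```

The LL band is left untouched, so the visible low-frequency content of the image is preserved exactly.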
DWT based watermarking schemes follow the same guidelines as DCT based schemes,
i.e. the underlying concept is the same; however, the process to transform the image into
its transform domain varies and hence the resulting coefficients are different. Wavelet
transforms use wavelet filters to transform the image. There are many available filters,
although the most commonly used filters for watermarking are Haar Wavelet Filter,
Daubechies Orthogonal Filters and Daubechies Bi-Orthogonal Filters. Each of these
filters decomposes the image into several frequencies. Single level decomposition gives
four frequency representations of the images. In their paper [76], authors presented a
survey of wavelet based watermarking algorithms. They classify algorithms based on
decoder requirements as Blind Detection or Non-blind Detection. As mentioned earlier
blind detection doesn't require the original image for detecting the watermarks; however,
non-blind detection requires the original image.
2.2.3.2 DWT BASED BLIND WATERMARK DETECTION: Lu et al. [58]
presented a novel watermarking technique called "Cocktail Watermarking". This technique embeds dual watermarks which complement each other. The scheme is resistant to several attacks: no matter what type of attack is applied, at least one of the watermarks can be detected. Furthermore, they enhanced this technique for image
authentication and protection by using the wavelet based Just Noticeable Distortion
(JND) values. Hence this technique achieves copyright protection as well as content
authentication simultaneously. Zhu et al. [126] presented a multi-resolution watermarking
scheme for watermarking video and images. The watermark is embedded in all the high
pass bands in a nested manner at multiple resolutions. This scheme doesn't consider the HVS aspect; however, Kaewkamnerd and Rao [43-44] improve this scheme by taking the HVS factor into account. Voyatzis and Pitas [104], who presented the "toral
automorphism" concept, provide a technique to embed binary logo as a watermark which
can be detected using visual models as well as by statistical means. So, in case the image
is degraded too much and the logo is not visible, it can be detected statistically using
correlation. Watermark embedding is based on a chaotic (mixing) system. Original image
is not required for watermark detection. However, the watermark is embedded in spatial
domain by modifying the pixel or luminance values.
A similar approach is presented for the wavelet domain [121], where the authors
proposed a watermarking algorithm based on chaotic encryption. Zhao et al.[125]
presented a dual domain watermarking technique for image authentication and image
compression. They used the DCT domain for watermark generation and DWT domain for
watermark insertion. A soft authentication watermark is used for tamper detection and
authentication while a chrominance watermark is added to enhance compression. They
use the orthogonality of DCT-DWT domain for watermarking [125].
2.2.3.3 DWT BASED NON-BLIND WATERMARK DETECTION: This technique
requires the original image for detecting the watermark. Most of the schemes found in
literature use a smaller image as a watermark and hence cannot use correlation based
detectors for detecting the watermark; as a result they rely on the original image for
informed detection. The watermark image (normally a logo) is typically smaller than the host image. Xia et al. presented a wavelet based non-blind
watermarking technique for still images where watermarks are added to all bands except
the approximation band. A multi-resolution based approach with binary watermarks is presented in [37], where both the watermark logo and the host image are decomposed into sub bands before embedding. The watermark is detected subjectively by visual inspection; however, an objective detection can also be performed using normalized
correlation. Lu et al. presented another robust watermarking technique based on image
fusion. They embedded a grayscale and binary watermark which is modulated using the
"toral automorphism" described in [106]. Watermark is embedded additively. The
novelty of this technique lies in the use of secret image instead of host image for
watermark extraction and use of image dependent and image independent permutations to
de-correlate the watermark logos [57]. Raval and Rege presented a multiple
watermarking scheme. The authors argued that if the watermark is embedded in the low
frequency components, it is robust against low pass filtering, lossy compression and
geometric distortions. On the other hand, if the watermark is embedded in high frequency
components, it is robust against contrast and brightness adjustment, gamma correction,
histogram equalization and cropping and vice-versa. Thus, to achieve overall robustness
against a large number of attacks, the authors proposed to embed multiple watermarks in
low frequency and high frequency bands of DWT [78].
Kundur and Hatzinakos [50] presented image fusion watermarking scheme. They used
salient features of the image to embed the watermark. They used a saliency measure to
identify the watermark strength and later embedded the watermark additively.
Normalized correlation is used to evaluate the robustness of the extracted watermark.
Later the authors proposed another scheme termed as FuseMark [51] which includes
minimum variance fusion for watermark extraction. Here, they propose to use a watermark image whose size is smaller than the host image by a factor of 2xy. Tao and Eskicioglu presented an optimal wavelet based watermarking scheme. They embedded a binary logo watermark in all four bands, but with a variable scaling factor in each band: the scaling factor is high for the LL sub band and lower for the other three bands. The quality of the extracted watermark is determined by Similarity
Ratio measurement for objective evaluation [100]. Ganic and Eskicioglu, inspired by Raval and Rege [78], proposed a multiple watermarking scheme based on DWT and Singular Value Decomposition (SVD). They argued that the watermark embedded by the
Raval and Rege [78] scheme was visible in some parts of the image especially in the low
frequency areas, which reduced the commercial value of the image. Hence they
generalized their scheme by using all the four sub bands and embedding the watermark in
SVD domain. The core technique is to decompose an image into four sub bands and then
applying SVD to each band. The watermark is actually embedded by modifying the
singular values from SVD [30].
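As a rough illustration of this DWT-plus-SVD idea, the sketch below embeds a watermark into the singular values of one sub-band of a one-level Haar DWT, implemented directly in NumPy. The scaling factor `alpha`, the small host size and the random watermark are illustrative assumptions, not the parameters used in [30]:

```python
import numpy as np

def haar_dwt2(img):
    """One-level 2-D Haar DWT: returns the LL, LH, HL, HH sub-bands."""
    a = (img[0::2, :] + img[1::2, :]) / 2.0   # row averages
    d = (img[0::2, :] - img[1::2, :]) / 2.0   # row differences
    LL = (a[:, 0::2] + a[:, 1::2]) / 2.0
    LH = (a[:, 0::2] - a[:, 1::2]) / 2.0
    HL = (d[:, 0::2] + d[:, 1::2]) / 2.0
    HH = (d[:, 0::2] - d[:, 1::2]) / 2.0
    return LL, LH, HL, HH

def embed_svd(band, watermark, alpha=0.05):
    """Embed by perturbing the singular values of one sub-band."""
    U, S, Vt = np.linalg.svd(band, full_matrices=False)
    S_marked = S + alpha * watermark          # modify the singular values
    return U @ np.diag(S_marked) @ Vt

rng = np.random.default_rng(0)
host = rng.uniform(0, 255, (8, 8))
LL, LH, HL, HH = haar_dwt2(host)
wm = rng.standard_normal(4)                   # one value per singular value
LL_marked = embed_svd(LL, wm)
```

Extraction in such schemes typically compares the singular values of the received band with those of the original; note that with `alpha = 0` the band is reconstructed exactly, since the SVD factorization is lossless.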
2.3 RECENT METHODOLOGIES
Nowadays, researchers are focusing on mixing spatial and transform domain concepts
(i.e. combinations of DFT, DWT and DCT) and also applying more and more
mathematical and statistical models, and other interdisciplinary approaches, in
watermarking: for example, chaotic theory, fractal image coding etc. In this section
we present a brief overview of a few recent watermarking algorithms.
In [103], the authors presented a reversible watermarking scheme for 2D-vector data
(point coordinates), which are used in geographical information related applications.
The scheme exploits the high correlation among points in the same polygon in a map and
achieves reversibility through an 8-point integer DCT, which ensures that the original
2D-vector data can be watermarked during the embedding process and then perfectly
restored during extraction. The authors used an efficient highest frequency coefficient
modification technique in the integer DCT domain to modulate the watermark bit “0” or
“1”, which can be determined during extraction without using any additional information.
To alleviate the visual distortion in the watermarked map caused by the coefficient
modification, they also proposed an improved reversible scheme based on the original
coefficient modification technique. With this improvement, the embedding capacity could
be greatly increased while the watermarking distortion is reduced as compared to the
original coefficient modification scheme presented in [103].
In [65], the authors presented zero-knowledge watermark detectors. Current detectors are
based on a linear correlation between the asset features and a given secret sequence. This
detection function is susceptible to sensitivity attacks, against which zero-knowledge
alone provides no protection. In this work, a new zero-knowledge watermark detector
robust to sensitivity attacks is presented, using the generalized Gaussian Maximum
Likelihood (ML) detector as its basis. The inherent robustness that this detector presents
against sensitivity attacks, together with the security provided by the zero-knowledge
protocol that conceals the keys which could be used to remove the watermark or to
produce forged assets, results in a robust and secure protocol. Additionally, two new
zero-knowledge proofs for modulus and square root calculation are presented. They serve
as building blocks for the zero-knowledge implementation of the generalized Gaussian
ML detector, and also open new possibilities in the design of high level protocols.
If digital watermarking is to adequately protect content in systems which provide
resolution and quality scalability, then the watermarking algorithms must provide both
resolution and quality scalability. Although there exists a trade off between resolution
and quality scalability, it has been demonstrated that it is possible to achieve both types
by taking advantage of human visual system characteristics to increase quality scalability
without compromising resolution scalability. Watermarking algorithms considering this
problem have been proposed; however, they tend to focus on a single type of scalability,
resolution [96, 120] or quality [12, 98]. Peng et al. [66] considered both types, but their
algorithm deals exclusively with authentication and is not a watermarking algorithm. In
[67], the authors focused on providing a spread spectrum watermarking algorithm with
both resolution and quality scalability, demonstrated through experimental testing using
the JPEG2000 compression algorithm. To alleviate this trade off, they began with a
non-adaptive resolution scalable algorithm and exploited the contrast sensitivity and
texture masking characteristics of the HVS to construct an HVS adaptive algorithm with
good quality scalability. Their algorithm is specifically designed to concentrate on
textured regions only, avoiding the visible distortions which may occur when strength
increases are applied to edges. Furthermore, this texture algorithm is applied in the
wavelet domain but uses only a single resolution for each coefficient to be watermarked.
In [126], the authors presented a new image adaptive watermarking scheme based on
perceptually shaping the watermark block wise. Instead of a global gain factor, a
localized one is used for each block. Watson’s DCT-based visual model [109] is adopted
to measure the distortion introduced by the watermark in each block, rather than over
the whole image. Under the given distortion constraint, the maximum output value of the
linear correlation detector is derived for one block, which represents the reachable
maximum robustness in a sense. An EXtended Perceptually Shaped Watermarking (EX-
PSW) is then obtained by making the detection value approach this upper limit. It is
proved mathematically that EX-PSW yields a higher detection value than Perceptually
Shaped Watermarking (PSW) under the same distortion constraint. The authors also
discussed strategies for adjusting the parameters of EX-PSW, which are helpful for
improving local image quality. Experimental results show that the scheme provides very
good results both in terms of image transparency and robustness.
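The block-wise linear correlation detection underlying such perceptually shaped schemes can be sketched as follows. The +/-1 reference pattern, the block size and the gain value are illustrative choices, and Watson's perceptual model (which would supply the per-block gain in [126]) is not reproduced here:

```python
import numpy as np

def linear_correlation(block, pattern):
    """Linear correlation detector: mean of the element-wise product."""
    return float(np.mean(block * pattern))

def embed_block(block, pattern, gain):
    """Additively embed a spread-spectrum pattern with a per-block gain."""
    return block + gain * pattern

rng = np.random.default_rng(1)
pattern = np.sign(rng.standard_normal((8, 8)))   # +/-1 reference pattern
host_block = rng.uniform(0, 255, (8, 8))
gain = 3.0                                       # illustrative local gain

marked = embed_block(host_block, pattern, gain)
# embedding raises the detector output by gain * mean(pattern**2) = gain
delta = linear_correlation(marked, pattern) - linear_correlation(host_block, pattern)
```

For a +/-1 pattern, the additive embedding raises the detector output by exactly the gain, which is the quantity the EX-PSW derivation pushes toward its distortion-constrained maximum.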
In [10], the authors presented an Independent Component Analysis (ICA) [40-41] based
watermarking method. This is a domain-independent, ICA-based approach that can be
applied to images, music or video to embed either a robust or a fragile watermark. In the
case of robust watermarking, the method shows a high information rate and robustness
against malicious and non-malicious attacks while inducing low distortion. Another
version of the scheme is a fragile watermarking scheme which shows high sensitivity to
tampering attempts while keeping the requirements of a high information rate and low
distortion. The improved performance is achieved by employing a set of statistically
independent sources (the independent components) as the feature space, together with
principled statistical decoding methods.
In [90], the authors presented a dual watermarking scheme. In general, the watermark
embedding process affects the fidelity of the underlying host signal; fidelity, robustness
and the amount of data which can be embedded without visible artifacts often conflict.
Most early watermarking schemes focused on embedding the watermark information
under a global power constraint such as the Peak-Signal-to-Noise-Ratio (PSNR) to
satisfy fidelity constraints. However, the PSNR value does not reflect the human visual
system, because local image properties such as edges or textures are not considered.
Watermarking systems have therefore been proposed that allow the embedded signal to
be locally varied in response to the local properties of the corresponding host signal
[38, 73, 77]. The authors of [90] set the PSNR value aside and instead exploited the fact
that all common lossy image compression schemes are PSNR optimized. They embedded
watermark information by geometrically shifting objects and object borders in a given
host image. If an observer has no original image for comparison, the embedding process
is imperceptible. As a consequence, this approach turns out to be extremely robust to
common image compression, which is optimized for maintaining the geometric image
structure. Hence, as they demonstrate, the embedded information is not affected by a
successive embedding approach in the compression domain.
In their paper [39], the authors presented an improved invariant wavelet and designed a
DCT based blind watermarking algorithm against Rotation, Scaling and Translation
(RST) attacks by exploiting the affine invariance of the invariant wavelet. Surviving
geometric attacks is considered to be of great importance in image watermarking; in the
face of geometrical attacks, the shortcomings of almost all digital watermarking
algorithms have been exposed. The improved invariant wavelet is better than bilinear
interpolation, and its performance is close to that of the bi-cubic one when the scaling
factor is very close to 1; the blind image watermarking algorithm is based on the DCT in
the RST Xiong’s Invariant Wavelet (RSTXIW) domain. The experiments show that this
watermarking algorithm is robust against filtering, noise and arbitrary RST geometrical
attacks, but sensitive to local crop attacks.
In their paper [107], the authors presented an image watermarking scheme based on the
3-D DCT. A gray-level image is decomposed into a 3-D sub-image sequence by sub-
sampling in zigzag scanning order, and this sequence is transformed using block-based
3-D DCT. The authors also proved, using distribution relative entropy theory, that the
distribution of the 3-D DCT AC coefficients follows the generalized Gaussian density
function. To balance robustness and imperceptibility, an improved 3-D HVS model is
used to adjust the embedding strength. For watermark detection, the optimum detector is
used to implement blind detection. Experiments show that the scheme is strongly robust
against various attacks.
The digital watermarking scheme proposed in [101] uses the properties of DCT and
DWT to achieve almost zero visible distortion in the watermarked images. The scheme
uses a unique method for spreading, embedding and extracting the watermark:
embedding based on a linear relation between the transform coefficients of the
watermark and a security matrix is proposed, with satisfactory results.
The algorithm in [59] is based on multistage Vector Quantization (VQ) and embeds both
a robust watermark, for copyright protection or ownership verification, and a fragile
watermark, for content authentication or integrity attestation. The method in [122]
combines DCT and VQ to simultaneously embed robust and fragile watermarks.
In their paper [31], the authors proposed two simple dither modulation schemes for a pair
of DCT coefficients. The original image is first sub-sampled, using the technique
described in [14], to obtain four sub-images, which are then transformed into the DCT
domain. By dividing the sub-images into two groups, two distinct dither modulation
processes are applied to the two coefficient pairs, carrying two robust watermarks.
Experimental results show that the proposed method is blind and robust, and that by
adopting dither modulation in the sub-images obtained by sub-sampling, two independent
robust watermarks can be embedded in the original image.
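Dither modulation of a single coefficient can be sketched as below. This is a generic quantization-index-modulation construction, not the exact scheme of [31]; the step size `delta` is an illustrative choice:

```python
import numpy as np

def dm_embed(coeff, bit, delta=8.0):
    """Dither modulation (QIM): move the coefficient onto one of two
    interleaved quantization lattices, selected by the watermark bit."""
    dither = 0.0 if bit == 0 else delta / 2.0
    return delta * np.round((coeff - dither) / delta) + dither

def dm_detect(coeff, delta=8.0):
    """Blind detection: decide which lattice the received value is closer to."""
    e0 = abs(coeff - dm_embed(coeff, 0, delta))
    e1 = abs(coeff - dm_embed(coeff, 1, delta))
    return 0 if e0 <= e1 else 1

marked = dm_embed(37.3, 1)           # snaps to 36.0, the nearest bit-1 lattice point
recovered = dm_detect(marked + 1.5)  # still decoded as 1 despite the noise
```

Detection is blind because it needs only the step size, not the original coefficient; any perturbation smaller than `delta / 4` leaves the decoded bit unchanged.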
In the field of color image watermarking, many methods operate by marking the image
luminance, or by processing each color channel separately. In [55], the authors therefore
proposed a new DCT domain watermarking scheme expressly devised for RGB color
images, based on the diversity technique used in communication systems. The same
watermark sequence is hidden in each channel by modifying a subset of the block DCT
coefficients of that color channel. Detection is based on a combination method which
takes into account the information conveyed by all three color channels: even if a
particular channel is severely faded, a reliable estimate of the transmitted watermark can
still be recovered through the other channels. Experimental results, as well as theoretical
analysis, demonstrate the validity of the new approach with respect to algorithms
operating on the image luminance only.
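The diversity idea can be sketched as follows. For clarity this uses an informed (original-available) correlation statistic and illustrative embedding parameters, whereas [55] describes a blind DCT-domain detector:

```python
import numpy as np

rng = np.random.default_rng(4)
wm = np.sign(rng.standard_normal(64))            # same +/-1 sequence for every channel
host = {c: rng.uniform(0, 255, 64) for c in "RGB"}
marked = {c: host[c] + 2.0 * wm for c in "RGB"}  # additive embedding per channel

# Simulate one severely faded channel: the watermark is wiped from Blue
marked["B"] = host["B"].copy()

# Diversity combining: sum the per-channel correlation statistics
per_channel = {c: float(np.mean((marked[c] - host[c]) * wm)) for c in "RGB"}
combined = sum(per_channel.values())             # detection still succeeds
```

Even though the Blue statistic is zero here, the combined statistic stays well above the detection threshold thanks to the two surviving channels, which is the essence of diversity combining.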
2.4 PROBLEM STATEMENT FORMULATION
Since the financial implications of some of the application areas like fingerprinting and
copyright protection are very high, and since till now no successful algorithm seems to be
available to prevent illegal copying of multimedia contents, the primary goal of this
thesis work is to develop watermarking schemes for images (stored in the spatial domain
as well as the transform domain) which can sustain the known attacks and various image
manipulation operations. Out of image, audio and video, image watermarking was chosen
because any successful image watermarking algorithm may be extended to video
watermarking also. Therefore, keeping this future extension in mind, the cover medium
chosen is an image.
Based on the literature survey presented in Sections 2.1, 2.2 and 2.3, the following issues
were also identified:
ISSUE 1: Till now there is no “generic” nature in the available watermarking algorithms.
More precisely, if a certain approach is applicable to a gray level image, the same
approach does not work for the other formats of an image.
ISSUE 2: Even when gray level image watermarking algorithms are extended to RGB
color images, most of the work has been done for the BLUE color channel only, because
the human eye is less sensitive to changes in the BLUE color channel. No attack impact
analysis, i.e., which color channel may be affected by a particular attack, has been
carried out.
In view of the above, our problem statements are as follows:
Problem statement 1: Choose image watermarking as the major problem.
Problem statement 2: Identify, for multi-color channel images (true color Windows
BMP, uncompressed JPEG), the suitability of a color channel with respect to an attack (if
any).
Problem statement 3: Explore ways in which attack impacts may be minimized before
the watermark embedding process.
ISSUE 3: In most research papers, once the watermarking scheme is finalized, it is
applied to all test images. But each image is different and has certain characteristics, so
after embedding the watermark data by a particular watermarking scheme, its
performance against a particular attack may not be similar to that for another image. No
study has been conducted to base the embedding scheme on image characteristics. Thus,
the next problem statement is:
Problem statement 4: Explore the relationship between the performance of a
watermarking scheme and the characteristics of the cover image itself.
ISSUE 4: Watermarking schemes are mostly developed by first extending an earlier
scheme and only then examining its performance against common image manipulations
and known attacks. Despite the huge financial implications of watermarking schemes
(say, for fingerprinting), no scheme has been developed which is, by design, resistant to
at least one serious attack, so that this attack simply cannot be conducted by an attacker.
This leads to the next problem statement:
Problem statement 5: Embed an inherent nature in the developed watermarking
schemes which guarantees that at least one serious attack, having the most financial
implications, cannot be conducted on the watermarked images.
2.4.1 JUSTIFICATIONS OF THE PROBLEM STATEMENT CHOSEN
While deciding how to start the development of our watermarking schemes, we first
resolved ISSUE 4, because it must be dealt with first among all the 4 issues listed above.
It is known that the application area having the highest financial implications is
“fingerprinting”.
If an attacker has access to more than one copy of a watermarked image, he/she can
predict or remove the watermark data by colluding them. This is known as a collusion
attack.
Researchers working on fingerprinting primarily focus on the collusion attack. The
website of the Network Technology Research Center, Nanyang Technological
University, Singapore, states that they pay at least equal attention to watermark
attacks/counter-attacks as to watermark designs [63]. To facilitate pirate tracing in video
distribution applications, different watermarks carrying distinguishing client information
are embedded at the source. If a few clients requesting the same source data get their
differently marked versions together, they may collude to remove or weaken the
watermark, leading to what is commonly called a “collusion attack”.
Collusion attacks are powerful because they can achieve their objective without causing
much degradation in the visual quality of the attacked data (sometimes, the visual quality
may even improve after the attack).
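The effect of an averaging collusion can be demonstrated with a minimal simulation. The additive +/-1 fingerprints, their strength and the number of colluders are illustrative assumptions; real fingerprinting codes are considerably more elaborate:

```python
import numpy as np

rng = np.random.default_rng(2)
host = rng.uniform(0, 255, (16, 16))

# K differently fingerprinted copies of the same host image
K, strength = 10, 2.0
marks = [np.sign(rng.standard_normal(host.shape)) for _ in range(K)]
copies = [host + strength * w for w in marks]

# Averaging collusion: the independent fingerprints largely cancel out
colluded = np.mean(copies, axis=0)

# Mean-squared fingerprint energy before and after the collusion
energy_single = np.mean((copies[0] - host) ** 2)   # strength**2 by construction
energy_colluded = np.mean((colluded - host) ** 2)  # roughly strength**2 / K
```

With K colluders, the residual fingerprint energy falls by roughly a factor of K while the averaged image moves closer to the host, which is exactly why the visual quality may even improve after the attack.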
In their paper “Multi-bits Fingerprinting for Image” [46], the authors focused on the
collusion attack in the fingerprinting application. It is stated there that in fingerprinting,
different copies are produced for each customer, and this is very helpful for attackers:
attackers compare several fingerprinted copies, find the locations of the embedded
information, and destroy it by altering the values at those places where a difference was
detected.
Another work specifically directed against the collusion attack is “Collusion-resistant
watermarking and fingerprinting” (US Patent issued on June 13, 2006) [15]. Interested
readers can find more literature on collusion attacks on watermarking systems in
[127-131].
The collusion attack is therefore the most severe problem for the watermarking
application area having the most financial impact. Hence, while designing our watermark
schemes, we decided that they must be inherently resistant to the collusion attack.
Accordingly, in this thesis we introduce a new term, “ICAR (Inherently Collusion Attack
Resistant)”, as a requirement for a watermarking system.
The other 3 issues were taken into account while developing the watermarking schemes.
After this, we had to decide the working domain and the approaches of our developments
based on the findings of the literature survey.
Since transform domain watermarking has been shown to perform better than spatial
domain watermarking, we decided to start with transform domain watermarking for gray
level images and then subsequently move on to color and JPEG image watermarking,
keeping the first issue in mind.
Apart from the ICAR nature and resistance to common image manipulations and known
attacks, we primarily focus on the JPEG compression attack. This lossy attack can reduce
the size of an image to as little as 1% of its original size without altering the visual
quality of the image much. Therefore, we picked the classical Middle Band Coefficient
Exchange (MBCE) scheme (refer Section 2.2.2.1) as the base for developing our
schemes, because this scheme takes the JPEG quantization table into consideration to
hide the watermark data and thus ensures robustness against the JPEG compression
attack.
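A minimal sketch of the MBCE idea follows: one bit is hidden in the relative magnitude of two mid-band coefficients of an 8x8 DCT block. The particular coefficient positions, and the omission of the quantization-table check and of any robustness margin between the two coefficients, are simplifications of the scheme described in Section 2.2.2.1:

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II basis, so that dct2(B) = C @ B @ C.T."""
    k = np.arange(n)
    C = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n)) * np.sqrt(2.0 / n)
    C[0, :] /= np.sqrt(2)
    return C

C = dct_matrix()
P1, P2 = (4, 1), (3, 2)   # one mid-band coefficient pair (positions are illustrative)

def mbce_embed(block, bit):
    """Encode one bit in the relative order of two mid-band DCT coefficients."""
    D = C @ block @ C.T
    a, b = D[P1], D[P2]
    if (bit == 1) != (a > b):      # want a > b for bit 1, a <= b for bit 0
        D[P1], D[P2] = b, a        # exchange the pair
    return C.T @ D @ C             # inverse DCT back to the pixel domain

def mbce_extract(block):
    D = C @ block @ C.T
    return 1 if D[P1] > D[P2] else 0

rng = np.random.default_rng(3)
block = rng.uniform(0, 255, (8, 8))
marked1 = mbce_embed(block, 1)
marked0 = mbce_embed(block, 0)
```

In the full scheme, the pair is chosen among mid-band positions with similar JPEG quantization divisors, so the exchanged relationship survives quantization; here only the exchange mechanics are shown.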
To move further, we again had to decide which categories of watermarking application
areas, based on Figure 1.2, we are targeting to develop in this thesis work. Thus,
Figure 2.6 is the same as Figure 1.2 but with the targeted types highlighted.
Figure 2.6: The targeted types of watermarking schemes to be developed
The first 2 red-highlighted types are already justified. The last one (destination based) is
also understood: the ICAR nature we are focusing on in our watermarking schemes is
highly correlated with fingerprinting, which comes under destination based
watermarking.
Between visible and invisible, we picked invisible watermarking because in most cases
the presence of the watermark or copyright data is to be hidden. The most crucial
decision before us was the choice between fragile and robust watermarking. Since, in
business, “tamper detection” has more serious financial implications than copy or
copyright control, we decided to go for fragile watermarking.
To conclude, it was decided to work on IMAGE WATERMARKING in the
TRANSFORM DOMAIN (more precisely, DCT based) to develop an ICAR
watermarking scheme hiding INVISIBLE watermark data which is FRAGILE in nature.
In addition, the schemes to be developed should be generic in nature, i.e. extensible to
other images stored in the spatial domain as well as the transform domain.