A flexible H.264/AVC compressed video watermarking scheme using particle swarm optimization based dither modulation

Int. J. Electron. Commun. (AEU) 65 (2011) 27–36


C.H. Wu a,*, Y. Zheng b, W.H. Ip a, C.Y. Chan a, K.L. Yung a, Z.M. Lu c

a The Department of ISE, The Hong Kong Polytechnic University, Hung Hom, Hong Kong

b Department of Information Science and Technology, Sun Yat-Sen University, Guangzhou, China

c The Institute of Astronautic Electronic Engineering, The School of Aeronautics and Astronautics, Zhejiang University, Zhejiang, China

Article info

Article history:

Received 22 May 2009

Accepted 9 December 2009

Keywords:

Digital watermarking

Dither modulation

H.264/AVC

Particle swarm optimization

1434-8411/$ - see front matter © 2010 Elsevier GmbH. All rights reserved.

doi:10.1016/j.aeue.2010.02.003

* Corresponding author. Tel.: +852 27666584; fax: +852 23625267.

E-mail address: [email protected] (C.H. Wu).

Abstract

Due to the explosion of data sharing on the Internet and the widespread use of digital media, especially digital video clips, digital video owners have a great interest in video security and copyright protection. In this paper, a flexible particle swarm optimization (PSO) based dither modulation (DM) watermarking scheme for H.264/AVC compressed video is proposed. Its robustness against the commonly employed watermarking attacks described in the literature is illustrated. Watermark imperceptibility and video quality also need to be taken into consideration; hence, PSO is employed to optimize these conflicting requirements. In addition to improving the robustness of the new watermarking scheme, the imperceptibility is enhanced by applying PSO, which keeps checking an objective function that includes both factors. The proposed scheme allows users to add or drop diverse watermarking attacks for optimization, and its effectiveness is verified through a series of experiments.

© 2010 Elsevier GmbH. All rights reserved.

1. Introduction

The focus of this paper is on applying digital watermarking techniques to protect the copyright of H.264/AVC video, where security, robustness and imperceptibility are the main concerns [1]. The H.264/AVC standard is designed to have high compression efficiency [2]. It performs well in low bit rate applications and has experienced widespread adoption within the last few years. It has been employed widely in applications ranging from digital video broadcast (DVB) to video for mobile devices. Because of its high compression efficiency, H.264/AVC video is easy to transmit over the Internet and to duplicate illegally, so piracy is prevalent. Thus, the copyright protection of H.264/AVC videos is an important issue. Designing a feasible watermarking scheme can be regarded as an optimization problem with two contradictory goals: better robustness and higher imperceptibility. To balance robustness and imperceptibility, there are two major methods: heuristic optimization based algorithms [3–6] and human vision system (HVS) based approaches [7–9].

In this paper, the particle swarm optimization (PSO) method is employed to enhance the imperceptibility and robustness of the proposed watermarking scheme for H.264/AVC video. In order to enhance the security, there are two keys in the proposed scheme. The first key is predefined by the copyright owner and the second key is used for storing the locations of the dither modulated coefficients. PSO shares numerous similarities with other evolutionary computation techniques, such as the genetic algorithm (GA). Compared to GAs, the advantages of PSO are simplicity, shorter CPU run time and fewer parameters to adjust [10]. Applying the algorithm to image watermarking has been proposed by Wang et al. [11], but it is still a research area that few researchers have explored, especially for H.264/AVC video. Given these advantages and this opportunity, we attempt to apply PSO to find the optimal frequency bands for watermarking in the DCT-based watermarking system, which can improve both the robustness and imperceptibility by striking a balance between the low frequency and high frequency bands. Experimental results are provided to show that this H.264/AVC watermarking scheme can improve both the imperceptibility and the robustness under attacks.

2. Previous work

In the literature, a variety of schemes have been proposed for embedding watermarks for use with well-known video compression standards, such as MPEG-1, MPEG-2, MPEG-4 and H.264/AVC. Hartung et al. [12] and Dittmann et al. [13] each proposed video watermarking techniques in the DCT domain. Qiu et al. [14] suggested a system that embeds the watermark bits in the least significant bits (LSBs) of the motion vectors. In 2006, Nguyen et al. [15] proposed a fast watermarking system that works on the H.264 motion vectors. Kim et al. [16] devised a watermarking scheme in the H.264 CAVLC codes. Up to now, the compressed-domain video watermarking schemes can generally be classified into three groups: the first depends on the DCT coefficients, the second relies on the motion vectors and the third works in the entropy coding domain. Watermarking DCT coefficients has the advantage of being more robust to attacks and distortions, while watermarking in the motion vectors or in the entropy coding domain is much simpler in terms of computational complexity.

The dither modulation (DM) watermarking scheme, which is derived from the principle of quantization index modulation (QIM), has become one of the popular watermarking schemes [17–22]. In 2005, an experiment conducted by Profrock et al. showed that the QIM approach can achieve a lower embedding overhead than the LSB approach [22]. In 2007, Noorkami et al. [9] proposed a theoretical framework for robust watermarking of H.264 video with an adapted human visual model. However, there has been relatively little research on characterizing the inherent trade-offs between the robustness and imperceptibility of DM watermarking schemes in H.264/AVC video.

3. Particle swarm optimization (PSO) – the basic

The particle swarm optimization algorithm was first described in 1995 by Kennedy et al. [23]. The swarm is typically modeled by particles in multidimensional space that have a position and a velocity. These particles fly through the search space and have two essential reasoning capabilities: the memory of their own best position and knowledge of the global or their neighborhood's best position. The particles start at random initial positions and search for the minimum or maximum of a given objective function by moving through the search space. The movement of a particle depends only on its velocity and the locations where good solutions have already been found by the particle itself or by neighboring particles in the swarm. The update of the swarm can be written as:

$$v_{id} = w \cdot v_{id} + c_1 \cdot r_1 \cdot (p_{id} - x_{id}) + c_2 \cdot r_2 \cdot (lbest_i - x_{id}) \tag{1}$$

$$x_{id} = x_{id} + v_{id} \tag{2}$$

In Eq. (1), $w$ is the inertia weight. The velocity $v$ of each particle is updated in each dimension $d$, whereas Eq. (2) updates each particle's position $x$ in the search space. $c_1$ and $c_2$ are cognitive coefficients and $r_1$ and $r_2$ are random numbers in [0, 1]; $p_{id}$ is the personal best position and $lbest_i$ is the neighborhood best of the i-th particle. As the swarm iterates, the fitness of the global best solution increases or decreases, depending on the optimization problem. Eventually, all particles may be drawn toward the global best, after which the fitness no longer improves no matter how many further iterations the PSO runs. The particles then move about in the search space in close proximity to the global best and do not explore the rest of the search space [24]. This phenomenon is called convergence. If the inertia coefficient of the velocity is small, all particles may slow down until they reach zero velocity at the global best. Thus, the selection of the coefficients in Eq. (1) affects the convergence and the ability of the swarm to search for the optimum. One way to overcome this situation is to re-initialize the particles at predefined intervals or when convergence is detected.
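To make Eqs. (1) and (2) concrete, the following is a minimal sketch of a PSO loop in Python. It is illustrative only: the neighborhood of each particle is assumed to be the whole swarm (so the neighborhood best equals the global best), the test objective `sphere`, the parameter values and the search bounds are hypothetical choices, and no re-initialization is performed.

```python
import random

def pso_minimize(objective, dim, n_particles=20, iters=100,
                 w=0.7, c1=1.5, c2=1.5, lo=-5.0, hi=5.0, seed=0):
    """Minimize `objective` with a basic PSO (neighborhood = whole swarm)."""
    rng = random.Random(seed)
    x = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    v = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in x]                      # personal best positions
    pbest_val = [objective(p) for p in x]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]   # best position found so far
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                v[i][d] = (w * v[i][d]
                           + c1 * r1 * (pbest[i][d] - x[i][d])
                           + c2 * r2 * (gbest[d] - x[i][d]))   # Eq. (1)
                x[i][d] += v[i][d]                               # Eq. (2)
            val = objective(x[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = x[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = x[i][:], val
    return gbest, gbest_val

def sphere(p):
    """Hypothetical test objective with its minimum at the origin."""
    return sum(c * c for c in p)

best, best_val = pso_minimize(sphere, dim=2)
```

Because the best value is only replaced by a strictly better one, `best_val` never gets worse over the iterations, which mirrors the convergence behaviour described above.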

4. Dither modulation – the basic

The quantization index modulation (QIM) watermark embedding algorithm was first proposed by Chen et al. [25]. The QIM system has considerable performance advantages over previously proposed spread-spectrum systems. The basic idea of QIM is to quantize the host data to different quantization intervals according to the watermark bit. Dither modulation is one of the most popular realizations of QIM.

The model of dither modulation is described as follows. It is assumed that the host signal is an N-dimensional vector $x \in \mathbb{R}^N$ and the watermark $w$ is chosen from the set $\{0, 1, \ldots, 2^{N R_m} - 1\}$, where $R_m$ is the embedding rate (bits per host signal sample). An embedding operation $Em(x, w)$ maps the host signal $x$ to a watermarked signal $x'$. After watermarking, the watermarked signal may be subjected to some kinds of attacks, leading to a corrupted output signal $x''$, and the watermark extractor outputs the estimated watermark $\hat{w}$ according to $x''$. In this model, the embedding operation $Em(x, w)$ can be viewed as a collection of functions in a quantizer set $Q$:

$$Q = \{q(\cdot, 0),\ q(\cdot, 1),\ \ldots,\ q(\cdot, 2^{N R_m} - 1)\} \tag{3}$$

Each quantizer $q(\cdot, i)$ corresponds to a watermark with value $i$, $i = 0, 1, \ldots, 2^{N R_m} - 1$. The watermarked signal can be formulated as:

$$x' = Em(x, w) = q(x, w) \tag{4}$$

As a simple example, set $N = 1$, $R_m = 1$ and use a scalar uniform quantizer with step size $\Delta$. The watermark $w$ to be embedded is 0 or 1. Assume that the x axis is divided into alternating intervals A and B of length $\Delta$. Interval A indicates watermark bit 1 and interval B indicates watermark bit 0. According to the watermark bit to be embedded, the one-dimensional host signal $x$ is quantized to the nearest point of $q(x, 0)$ or $q(x, 1)$, which lies in the middle of the corresponding interval. During watermark detection, if the corrupted signal $x''$ lies in an interval B, the extracted watermark $\hat{w}$ is 0; otherwise it is 1. Commonly, dither modulation embedding is implemented in a transform domain and its embedding location in the host signal is predefined and fixed [26,27]. The algorithm devised here adaptively selects the location to embed the watermark, to improve the fidelity and robustness simultaneously.
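The scalar example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the reconstruction points for bit 0 are taken to be the even multiples of Δ and those for bit 1 the odd multiples (the text does not fix which interval sits where), detection picks the nearer of the two quantizers, and the function names `qim_embed` and `qim_detect` are hypothetical.

```python
def qim_embed(x, bit, delta):
    """Quantize x to the nearest reconstruction point of quantizer q(., bit).

    Assumption: bit 0 uses the points 2n*delta, bit 1 the points (2n+1)*delta.
    """
    shift = bit * delta
    return 2 * delta * round((x - shift) / (2 * delta)) + shift

def qim_detect(x_attacked, delta):
    """Return the bit whose quantizer has the closest reconstruction point."""
    d0 = abs(x_attacked - qim_embed(x_attacked, 0, delta))
    d1 = abs(x_attacked - qim_embed(x_attacked, 1, delta))
    return 0 if d0 <= d1 else 1

marked0 = qim_embed(3.7, 0, 1.0)   # nearest even multiple of delta
marked1 = qim_embed(3.7, 1, 1.0)   # nearest odd multiple of delta
```

With this layout, a perturbation smaller than Δ/2 leaves the detected bit unchanged, which is the robustness margin that the step size controls.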

5. The PSO-based watermarking scheme and watermark extraction

H.264/AVC is the newest digital video compression standard jointly developed by ITU-T and ISO/IEC [28]. H.264/AVC has achieved a significant improvement in rate-distortion ratio compared to previous video coding standards. Ignoring the watermark embedding part, Fig. 1 shows the basic structure of the H.264/AVC encoder.

The input video frames are decomposed into macroblocks. Each macroblock is either intra or inter predicted based on the reconstructed samples. Then the prediction error is transformed and quantized. H.264/AVC adopts a low computation cost integer DCT, which can be carried out with integer arithmetic using only additions, subtractions and shifts, after which the quantized coefficients are zigzag reordered and entropy coded. The entropy-encoded coefficients, together with side information (prediction modes, quantization parameter, motion vector information, etc.), are passed to a network abstraction layer (NAL) for transmission or storage. A deblocking filter is also applied to the decoded macroblocks in order to improve the compression performance of the inter-coded blocks of future frames.

Fig. 1. The basic structure of H.264/AVC encoding and watermarking embedding.


The new scheme embeds the watermark by modifying the quantized integer DCT coefficients of the intensity components of I-frames, and each I-frame is repeatedly embedded with the same watermark. As Fig. 1 shows, the watermarking scheme is incorporated into the encoder itself. In this way, the error introduced by watermarking is compensated for in future predictions and is not propagated to P-frames and B-frames. More watermark bits can be embedded in the compressed video compared to watermarking techniques that embed the watermark into the compressed video bitstream directly [16].

5.1. Dither modulation embedding algorithm

Assume the original input digital video X is of size $M \times N$ and the watermark W is a binary image with $W(s,t) \in \{0, 1\}$; the H.264/AVC encoder then produces P I-frames for the video X. One quantized integer 4×4 DCT block in an I-frame is denoted as $Y_{(s,t)}$, and the resulting 16 coefficients can be represented as:

$$Y_{(s,t)} = \bigcup_{k=0}^{15} \{Y_{(s,t)}(k)\}, \quad 1 \le s \le M/4,\ 1 \le t \le N/4 \tag{5}$$

The elements in $Y_{(s,t)}$ are zigzag scanned so that a smaller k corresponds to low frequency components and a larger k corresponds to high frequency components; s and t represent the position of each 4×4 DCT block. The goal is to select one coefficient, other than the DC component $Y_{(s,t)}(0)$, in a block $Y_{(s,t)}$ to embed a single watermark bit $W(s,t)$ for $1 \le s \le M/4$, $1 \le t \le N/4$; therefore, the maximum size of the watermark is $\frac{M}{4} \times \frac{N}{4}$. If necessary, one can first resize the watermark to fit the host video. Before embedding, the binary watermark image W is pseudo-randomly permuted with a predefined key, key1, in order to increase the security and robustness.

$$W_p = \mathrm{permute}(W, key_1) \tag{6}$$
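As an illustration of Eq. (6), a keyed pseudo-random permutation and its inverse might look as follows in Python. The paper does not specify the permutation algorithm, so this sketch assumes a shuffle driven by a generator seeded with key1; the helper names are hypothetical, and Python's `random` module is not cryptographically secure, so a real system would need a stronger keyed generator.

```python
import random

def permute(bits, key):
    """Return the watermark bits reordered by a key-seeded shuffle."""
    order = list(range(len(bits)))
    random.Random(key).shuffle(order)      # same key -> same order
    return [bits[i] for i in order]

def inverse_permute(permuted, key):
    """Undo permute() by regenerating the same order from the key."""
    order = list(range(len(permuted)))
    random.Random(key).shuffle(order)
    original = [0] * len(permuted)
    for dst, src in enumerate(order):
        original[src] = permuted[dst]
    return original

w = [1, 0, 0, 1, 1, 1, 0, 0, 1, 0]        # flattened toy watermark
wp = permute(w, key=1234)
```

Only a holder of the same key can regenerate the order and restore the watermark layout with `inverse_permute`.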

In the proposed algorithm, the embedding location of the integer DCT coefficient in a block is selected by PSO, and the embedding is based on dither modulation of the selected AC coefficient. To simplify the explanation of dither modulation in the quantized integer DCT domain, suppose that the watermark bit to be embedded is $W(s,t)$ and that a coefficient $Y_{(s,t)}(k)$ has already been selected by PSO. The dither modulation embedding algorithm can be formulated as follows:

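1. Divide the $Y_{(s,t)}(k)$ axis into intervals A and B based on the quantization step $\Delta$. If the quantized watermarked coefficient $Y'_{(s,t)}(k)$ lies in interval A, it indicates that watermark bit 1 is embedded. If the quantized value $Y'_{(s,t)}(k)$ lies in interval B, it indicates that watermark bit 0 is embedded.

2. Calculate the integer quotient m and remainder r:

$$m = \mathrm{floor}\left(\frac{Y_{(s,t)}(k)}{\Delta} + 0.5\right) \tag{7}$$

where floor() is the operation of rounding towards negative infinity, and

$$r = Y_{(s,t)}(k) - m\Delta \tag{8}$$

3. Quantize according to the sign of $Y_{(s,t)}(k)$ and the watermark bit $W(s,t)$. There are four combinations:

a. If $Y_{(s,t)}(k) \ge 0$, $W(s,t) = 1$:

$$Y'_{(s,t)}(k) = \begin{cases} 2k\Delta & \text{if } m = 2k \\ 2k\Delta + 2\Delta & \text{if } m = 2k+1 \text{ and } r \ge 0 \\ 2k\Delta & \text{if } m = 2k+1 \text{ and } r < 0 \end{cases} \tag{9}$$

b. If $Y_{(s,t)}(k) \ge 0$, $W(s,t) = 0$:

$$Y'_{(s,t)}(k) = \begin{cases} 2k\Delta + \Delta & \text{if } m = 2k \text{ and } r \ge 0 \\ 2k\Delta - \Delta & \text{if } m = 2k \text{ and } r < 0 \\ (2k+1)\Delta & \text{if } m = 2k+1 \end{cases} \tag{10}$$

c. If $Y_{(s,t)}(k) < 0$, $W(s,t) = 1$:

$$Y'_{(s,t)}(k) = \begin{cases} -2k\Delta & \text{if } m = -2k \\ -2k\Delta & \text{if } m = -(2k+1) \text{ and } r \ge 0 \\ -2k\Delta - 2\Delta & \text{if } m = -(2k+1) \text{ and } r < 0 \end{cases} \tag{11}$$

d. If $Y_{(s,t)}(k) < 0$, $W(s,t) = 0$:

$$Y'_{(s,t)}(k) = \begin{cases} -2k\Delta + \Delta & \text{if } m = -2k \text{ and } r \ge 0 \\ -2k\Delta - \Delta & \text{if } m = -2k \text{ and } r < 0 \\ -(2k+1)\Delta & \text{if } m = -(2k+1) \end{cases} \tag{12}$$

The quotient $m'$ of the watermarked coefficient is:

$$m' = \mathrm{floor}\left(\frac{Y'_{(s,t)}(k)}{\Delta} + 0.5\right) \tag{13}$$

where floor() is the operation of rounding towards negative infinity.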

The quantization step $\Delta$ controls the robustness and perceptibility of the embedded watermark. It can be seen from formulas (7) and (9)–(12) that the maximum quantization error is $\Delta$. During watermark extraction, if $m'$ in formula (13) is odd, the extracted watermark bit is 0; otherwise, it is 1.
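The embedding and extraction rules can be sketched compactly in Python. The sketch rests on one reading of formulas (9)–(12): in all four sign cases, bit 1 moves the coefficient to an even multiple of Δ and bit 0 to an odd multiple, keeping the value when m already has the required parity and otherwise stepping one multiple in the direction given by the sign of r. The function names are hypothetical, and integer-valued coefficients and step sizes are assumed.

```python
import math

def dm_embed(y, bit, delta=1):
    """Embed one bit into coefficient y by dither modulation."""
    m = math.floor(y / delta + 0.5)        # Eq. (7)
    r = y - m * delta                      # Eq. (8)
    wants_even = (bit == 1)                # bit 1 -> even multiple, bit 0 -> odd
    if (m % 2 == 0) == wants_even:
        return m * delta                   # parity already matches
    # Otherwise move to the adjacent multiple on the side of the remainder.
    return (m + 1) * delta if r >= 0 else (m - 1) * delta

def dm_extract(y_marked, delta=1):
    """Recover the bit from a (possibly attacked) coefficient."""
    m = math.floor(y_marked / delta + 0.5)  # Eq. (13)
    return 0 if m % 2 else 1                # odd -> 0, even -> 1

example = dm_embed(-3, 1, delta=2)          # m = -1 (odd), r = -1 -> -4
```

For instance, with Δ = 2 the coefficient −3 carrying bit 1 becomes −4, and `dm_extract(-4, 2)` returns 1. The change never exceeds Δ, in line with the maximum quantization error stated above.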

5.2. PSO training

Each 16×16 macroblock $M_{(u,v)}$, with $1 \le u \le M/16$ and $1 \le v \le N/16$, is treated as a separate PSO training unit; that is, PSO training is performed on a macroblock basis. As an example, one zigzag scanned block within a macroblock can be seen in Fig. 2. One AC coefficient in each block, and thus 16 coefficients in total in a macroblock, are randomly chosen to initialize a particle at iteration zero. However, the randomly selected coefficients might not ensure imperceptibility and robustness. By applying PSO, the locations of the AC coefficients for all particles are updated after each iteration. The resulting PSO-selected locations represent the coefficients that give higher video quality and, at the same time, higher robustness under the attacks considered. After a certain number of iterations, 16 optimal locations are obtained to embed 16 watermark bits in a macroblock.

In order to formulate a fitness function in PSO, one should measure the watermarked video quality and robustness quantitatively in a macroblock. In 2008, Huynh-Thu and Ghanbari concluded that the peak signal-to-noise ratio (PSNR) can be a good indicator of the variation of the video quality as long as the quality measurement is conducted on a single video content [29]. Thus, the widely used objective video quality metric, PSNR, is chosen to represent video quality, and the normalized cross-correlation (NCC) value [3,30] between the original watermark and the extracted watermark indicates the robustness. The PSNR is defined as follows:

$$PSNR = 10 \log_{10}\left(255^2 / MSE\right) \tag{14}$$

$$MSE = \frac{1}{256} \sum_{i=1}^{16} \sum_{j=1}^{16} \left(M_{(u,v)}(i,j) - M'_{(u,v)}(i,j)\right)^2 \tag{15}$$

where MSE denotes the mean square error between the original samples $M_{(u,v)}(i,j)$ and the watermarked samples $M'_{(u,v)}(i,j)$ in the pixel domain. NCC can be defined as:

$$NCC = \frac{\sum_{i=1}^{16} \overline{w(i) \oplus w'(i)}}{16} \tag{16}$$

where $w(i)$ is the original watermark and $w'(i)$ is the extracted watermark; $\oplus$ denotes the XOR operation and the overline denotes the NOT operation. In this study, three common attacks, namely lowpass filtering, median filtering and gamma ($\gamma$) correction, are employed in the scheme. Each attack corresponds to an NCC value. Then, the fitness function can be defined as:

$$f = PSNR + \sum_{i=1}^{3} \lambda_i\, NCC_i \tag{17}$$

where $\lambda_i$ is a weighting factor which balances the imperceptibility and robustness of the embedded watermark. A larger $\lambda_i$ means a higher robustness and less imperceptibility, and vice versa.

Fig. 2. Illustration of blocks in a macroblock.
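Formulas (14)–(17) translate directly into code. The following Python sketch computes them for one macroblock; the function names are hypothetical, blocks are assumed to be lists of rows of 8-bit pixel values, and the identical-block case (MSE of zero) is mapped to infinity, which is a choice made here rather than something stated in the text.

```python
import math

def psnr(original, marked):
    """Eqs. (14)-(15): PSNR in dB between two equally sized pixel blocks."""
    n = sum(len(row) for row in original)
    mse = sum((a - b) ** 2
              for row_o, row_m in zip(original, marked)
              for a, b in zip(row_o, row_m)) / n
    return math.inf if mse == 0 else 10 * math.log10(255 ** 2 / mse)

def ncc(w, w_extracted):
    """Eq. (16): NOT(XOR) is 1 for matching bits, so this is the match rate."""
    return sum(1 for a, b in zip(w, w_extracted) if a == b) / len(w)

def fitness(psnr_db, ncc_values, lambdas):
    """Eq. (17): PSNR plus the weighted NCC of each attack."""
    return psnr_db + sum(l * n for l, n in zip(lambdas, ncc_values))

block = [[100] * 16 for _ in range(16)]
marked = [[101] * 16 for _ in range(16)]   # every pixel off by one
f = fitness(psnr(block, marked), [1.0, 0.5, 0.5], [30, 30, 30])
```

Since NCC lies in [0, 1] while PSNR is typically tens of dB, the weights scale the robustness terms up to a comparable range.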

Shieh et al. [3] devised a GA-based method for image watermarking. A similar structure is adopted in this research for watermarking H.264/AVC compressed video based on PSO, as shown in Fig. 3. For this new method, widely used attacks from Stirmark, namely lowpass filtering and median filtering [31], and $\gamma$ correction, which represents one of the beautification attacks, are included in the training, and each particle is a 16-dimensional vector representing the embedding locations in one macroblock. In each iteration, all the particles are updated according to formulae (18) and (19). Assuming there are r particles in total, the i-th particle in the k-th iteration is denoted as $P_i^k(j)$, $j = 1, 2, \ldots, 16$, and the corresponding update velocity is denoted as $V_i^k(j)$, $j = 1, 2, \ldots, 16$.

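$$V_i^{k+1}(j) = c_0 V_i^k(j) + c_1 r_1(j)\left[P_{local_i}(j) - P_i^k(j)\right] + c_2 r_2(j)\left(P_{global}(j) - P_i^k(j)\right) \quad \text{for } j = 1, 2, \ldots, 16 \tag{18}$$

$$P_i^{k+1}(j) = \mathrm{round}\left(P_i^k(j) + V_i^{k+1}(j)\right), \quad j = 1, 2, \ldots, 16; \quad \text{constraint: } 1 \le P_i^{k+1}(j) \le 15 \tag{19}$$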

where $P_{local_i}(j)$, $j = 1, 2, \ldots, 16$, are the local best locations for the i-th particle in the previous iterations, that is, the locations that gave the largest fitness value among all its previous iterations. $P_{global}(j)$, $j = 1, 2, \ldots, 16$, are the global best locations for all r particles in all previous iterations. $c_0$, $c_1$ and $c_2$ are constant values, and $r_1(j)$ and $r_2(j)$ are uniform random numbers between 0 and 1. The function round() rounds locations to the nearest integers, since the locations of coefficients must be integers. The constraint $1 \le P_i^{k+1}(j) \le 15$ reflects the fact that there are 15 AC coefficients in a block. Therefore, if $P_i^{k+1}(j)$ is larger than 15, it is assigned a random number between 8 and 15, and if $P_i^{k+1}(j)$ is smaller than 1, it is assigned a random number between 1 and 8.

Fig. 3. Flow chart of the PSO training.

After a certain number of iterations, the optimal locations $P_{global}(j)$, $j = 1, 2, \ldots, 16$, with higher PSNR and NCC are obtained. Then, these coefficients can be quantized to embed the watermark in the macroblock. After that, the process moves to the next macroblock and embeds the next 16 watermark bits through PSO and dither modulation, until all the watermark bits are embedded. When performing extraction, the locations of these dither modulated coefficients are also needed. They should be stored and treated as another key, key2.

5.3. Watermark extraction

When extracting the watermark, the original video X is not required by the algorithm. The watermarked video, however, may be subject to some intentional or unintentional attack, and the resulting video after attack is denoted by $X''$. The extractor is also inserted into the H.264/AVC decoder, as shown in Fig. 4.

After each macroblock is entropy decoded and reordered, it is passed into the watermark extractor, and each 16-bit watermark is extracted with the secret key, key2, using formula (13). After all macroblocks have been extracted and decoded, one obtains the whole permuted watermark information $W'_p$. Then $W'_p$ can be inverse permuted with key1 to get the extracted binary watermark image $W'$. Since there are multiple I-frames in a video clip and all I-frames are embedded with the same watermark, one can combine the watermarks extracted from all I-frames to get a final extracted watermark $W_f$. Each bit in $W_f$ is set to the value that occurs most often among the corresponding bits of the multiple extracted copies $W'$.
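The final combination step is a per-bit majority vote over the copies extracted from the I-frames. A minimal Python sketch follows; the function name is hypothetical, each copy is assumed to be a flat list of bits of equal length, and ties (possible with an even number of I-frames) are resolved to 0 here because the text does not say how they are handled.

```python
def majority_vote(copies):
    """Combine several extracted watermarks bit by bit.

    A bit is 1 only when strictly more than half of the copies say 1,
    so ties fall back to 0 (an assumption of this sketch).
    """
    n = len(copies)
    return [1 if 2 * sum(bits) > n else 0 for bits in zip(*copies)]

extracted = [
    [1, 0, 1, 1],   # copy from I-frame 1
    [1, 0, 0, 1],   # copy from I-frame 2 (one bit flipped by an attack)
    [1, 1, 1, 1],   # copy from I-frame 3 (one bit flipped by an attack)
]
w_final = majority_vote(extracted)
```

Isolated errors in individual I-frames are voted out as long as most copies agree on each bit.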

6. Experimental results and analysis

In the experiments, the well-known 4:2:0 YUV format video Foreman, with size 352×288 and 200 frames and compressed at 600 kbps, is used for testing. The binary watermark image Frog is shown in Fig. 5. Its size is 88×72, so that one watermark bit is embedded into each non-overlapping 4×4 block in an I-frame of the test video. Here, the free software codec x264 is used as the encoder, and the fixed I-frame mode is used, namely, setting one I-frame every 5 frames. Therefore, there are a total of 40 I-frames to be embedded with the same watermark, Frog. There are two different scenes in the Foreman video clip: "The close-up of foreman" and "The construction site", as shown in Fig. 6. The testing results of the chosen frames and the detailed testing results of all 40 I-frames are presented in this section.

The aforementioned watermarking attacks, lowpass filtering, median filtering and $\gamma$ correction, are applied to evaluate the scheme. Since the PSNR value is much larger than the NCC value, the weighting factors are set to $\lambda_1 = \lambda_2 = \lambda_3 = 30$, based on numerous experimental results and validated empirically. These weighting factors govern the trade-off between robustness and imperceptibility. If the weight is too large, the watermarked video may have a very low PSNR value, and vice versa. Also, the modulation step $\Delta$ is set to 1. For the PSO training algorithm, a statistical analysis has been done with different parameter settings. It has been found that $c_0 = 0.4$ and $c_1 = c_2 = 1.8$ are good parameter settings for this optimization process. In order to reveal the visual quality, two specific frames watermarked with our algorithm, extracted from the two different scenes of the video respectively, are depicted in Fig. 7 with their corresponding extracted watermarks after attacks. All the tests are executed 30 times on a computer with an Intel Core2Duo E8400 CPU and 2 GB of RAM. The results are tabulated in Tables 1 and 2 by comparing the average PSNR and NCC values with 50 and 100 particles respectively. Better performance is achieved by increasing the number of particles and iterations. The comparison without PSO training, where the embedding locations are randomly generated in all 30 runs, is shown in Table 3. Table 4 shows simulation results of the watermarking scheme where the GA selection adopted in Shieh's work [3] is used instead of the PSO training. According to the results, there are observable improvements in both NCC and PSNR values after applying PSO training. Using the PSO training is better than using the GA selection adopted in Shieh's work because both the resulting fidelity and the robustness of the PSO-based approach are higher, with fewer iterations. In general, the average run time of the proposed PSO-based approach is 20% less than that of the GA-based method.

Fig. 4. The basic structure of the watermark extracting system.

Fig. 5. The watermark.

The experiments were also conducted with compression only, in order to reveal the reduction in PSNR caused by watermarking. The resulting PSNR values of the selected frames A and B after compression without watermarking are 41.99 and 40.34 dB. Therefore, applying the PSO-based watermarking scheme caused average PSNR drops of only about 1.38 and 0.98 dB, respectively. Since the typical quality requirement is in the range of 30–40 dB, reasonable and improved results can already be achieved by applying the PSO-based watermarking scheme with 50 particles and 50 iterations. The two watermarked and compressed frames and the extracted watermarks are illustrated in Fig. 7, which were obtained by applying PSO with 100 particles and 150 iterations. The comparisons without using PSO are shown in Fig. 8. The plots in Figs. 9 and 10 present the mean values of PSNR, NCC1, NCC2 and NCC3 of each watermarked I-frame after 30 runs.

We have tested our proposed scheme with different attacks; however, all three attacks were also used in the training. Thus, an additional attack that was not present in the training should be selected to demonstrate the capabilities and the shortcomings of the scheme. According to Ling et al. [32], re-encoding at a lower bit rate is an appropriate way to test watermark robustness. The results of the extracted watermarks after re-encoding at different compression rates are shown in Fig. 11. Although the attack is not included in the training, the extracted watermarks from frame A can be recognized until the compression rate reaches 400 kbps.

In order to investigate the distributions of the locations of the selected AC coefficients, the two chosen frames are studied. The distributions of the two frames are different because of the different characteristics (scenes) of the two frames. The proposed PSO-based watermarking scheme can exploit the characteristics of different frames to embed robust and imperceptible watermarks. Interestingly, there is a common point: more coefficients are selected in the low and middle frequency bands, and frequency bands 1, 4 and 5 have the most occurrences in both frames, even though they show different scenes.

As mentioned previously, the value of $\lambda$ governs the trade-off between robustness and imperceptibility, and Table 5 shows the trade-off effect with different values of $\lambda$. The numbers of particles and iterations of the PSO are 100 and 150. From the table, it can be seen that when $\lambda$ is smaller than 30, the average NCC values decrease noticeably but the average PSNR increases. When $\lambda$ is larger than 30, there is a small drop in PSNR but the NCC values increase.

7. Conclusion and future work

A new and robust PSO-based watermarking scheme for H.264/AVC is presented in this paper. It should be noted that the proposed algorithm is also applicable to other video compression standards based on the DCT, such as MPEG-2 and H.263. It is robust because PSO is used to search for the optimal frequency set for embedding the watermark. The proposed scheme can cope with different filter attacks. Besides, the watermark imperceptibility is also improved with the aid of PSO; thus, higher video quality can be preserved. Experimental results reveal that by applying the proposed scheme, obvious improvement can be achieved in both robustness and imperceptibility. The NCC of the extracted watermark after certain attacks is greatly enhanced when compared with the original DM watermarking method. The power of the PSO-based approach in watermarking techniques is still untapped. With the suggested scheme, researchers and users can add or drop attacking modules, such as transcoding and brightness adjustment. This gives extra flexibility for watermarking algorithm development and allows the robustness level in video watermarking applications to be optimized with adjustable, complementary training for different anti-attacking purposes. In a real situation, each owner will have the keys beforehand. The keys must be kept in a secure location to prevent anyone else from obtaining them. The decoding process will be conducted by the owner only when he or she wants to prove the copyright ownership. In addition, it is not necessary for each I-frame to be evaluated by the proposed PSO-based approach; otherwise the method would be too time-consuming. It has been found, and validated empirically, that evaluating one I-frame out of every three is a good criterion for applying the proposed scheme.

Fig. 6. The chosen frame in the first scene in Foreman (left) and the one in the second scene (right).

Fig. 7. Simulation results with PSO for the frame A (PSNR: 40.60; NCC1: 0.9691; NCC2: 0.8116; NCC3: 0.9520) and for the frame B (PSNR: 39.35; NCC1: 0.9544; NCC2: 0.8527; NCC3: 0.9717).

Table 1
Testing results for the frame A.

Particles  Iteration  Average PSNR (dB)  Average NCC1 (LPF)  Average NCC2 (MF)  Average NCC3 (γ)
50         1          40.46              0.7932              0.6318             0.8695
50         50         40.54              0.9475              0.7760             0.9500
50         100        40.56              0.9604              0.7902             0.9532
50         150        40.58              0.9640              0.7992             0.9536
100        1          40.50              0.8111              0.6369             0.8773
100        50         40.58              0.9582              0.7907             0.9481
100        100        40.60              0.9662              0.8050             0.9495
100        150        40.61              0.9692              0.8116             0.9520

Table 2
Testing results for the frame B.

Particles  Iteration  Average PSNR (dB)  Average NCC1 (LPF)  Average NCC2 (MF)  Average NCC3 (γ)
50         1          39.10              0.7386              0.6256             0.8641
50         50         39.31              0.9252              0.8015             0.9604
50         100        39.32              0.9417              0.8305             0.9706
50         150        39.34              0.9512              0.8359             0.9792
100        1          39.29              0.7678              0.6372             0.8799
100        50         39.34              0.9458              0.8235             0.9652
100        100        39.35              0.9529              0.8470             0.9690
100        150        39.36              0.9545              0.8526             0.9715

Table 3
Testing results without PSO training.

Frame  Average PSNR (dB)  Average NCC1 (LPF)  Average NCC2 (MF)  Average NCC3 (γ)
A      39.02              0.6345              0.6318             0.7803
B      38.23              0.5866              0.574              0.7707

Table 4
Testing results with the GA selection adopted in Shieh's method with 200 iterations and λi = 30.

Frame  Average PSNR (dB)  Average NCC1 (LPF)  Average NCC2 (MF)  Average NCC3 (γ)
A      40.52              0.9500              0.7978             0.9452
B      39.30              0.9454              0.8420             0.9627

Fig. 8. Simulation results without PSO for the frame A (PSNR: 39.02; NCC1: 0.6345; NCC2: 0.6318; NCC3: 0.7803) and for the frame B (PSNR: 38.23; NCC1: 0.5866; NCC2: 0.574; NCC3: 0.7707).

Fig. 9. The mean PSNR values of each watermarked I-frame after 30 runs (y-axis: PSNR; x-axis: the index of I-frame).

Acknowledgments

The authors wish to thank the Research Committee and the Department of ISE, the Hong Kong Polytechnic University, for support in the research project (G-YG44). Our gratitude is also extended to the Department of Information Science and Technology, Sun Yat-Sen University, China.

Fig. 10. The mean NCC values of each watermarked I-frame after different attacks (y-axis: NCC; x-axis: the index of I-frame): (a) after 30 runs with passing through a lowpass filter, (b) after 30 runs with passing through a median filter, (c) after 30 runs with γ correction.

Fig. 11. The extracted watermarks after re-encoding at different compression rates (from left to right: 600 kbps, NCC: 0.9407; 550 kbps, NCC: 0.8996; 500 kbps, NCC: 0.8018; 400 kbps, NCC: 0.6229; original watermark).

Table 5
The averaged results of the 40 I frames in the Foreman with different λ.

λ     Average PSNR (dB)    Average NCC1 (LPF)    Average NCC2 (MF)    Average NCC3 (γ)
0     43.23                0.5286                0.5464               0.6957
10    40.57                0.9011                0.7998               0.9214
20    40.42                0.9369                0.8180               0.9492
30    40.21                0.9612                0.8311               0.9618
40    39.75                0.9697                0.8354               0.9679
50    39.14                0.9713                0.8385               0.9698
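The trade-off recorded in Table 5 can be read off with a short Python sketch. The values are copied from the table; the selection rule (largest worst-case NCC subject to a PSNR floor) and the 40 dB threshold are assumptions chosen for illustration and are not the objective function used in the paper.

```python
# Rows of Table 5: (lambda, PSNR in dB, NCC1 (LPF), NCC2 (MF), NCC3 (gamma))
TABLE5 = [
    (0, 43.23, 0.5286, 0.5464, 0.6957),
    (10, 40.57, 0.9011, 0.7998, 0.9214),
    (20, 40.42, 0.9369, 0.8180, 0.9492),
    (30, 40.21, 0.9612, 0.8311, 0.9618),
    (40, 39.75, 0.9697, 0.8354, 0.9679),
    (50, 39.14, 0.9713, 0.8385, 0.9698),
]


def best_lambda(rows, min_psnr_db):
    """Pick the lambda with the highest worst-case NCC among the rows
    whose average PSNR is at least `min_psnr_db`.

    ASSUMPTION: this rule is only an illustrative way of reading the table.
    Returns None when no row meets the PSNR floor.
    """
    feasible = [row for row in rows if row[1] >= min_psnr_db]
    if not feasible:
        return None
    # row[2:] holds the three NCC values; min() is the weakest attack result.
    return max(feasible, key=lambda row: min(row[2:]))[0]


if __name__ == "__main__":
    print(best_lambda(TABLE5, 40.0))
```

Under a 40 dB floor the rule selects λ = 30: larger values of λ raise the NCC values only slightly while pushing the average PSNR below 40 dB.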


References

[1] Dittmann J, Steinebach M. Joint watermarking of audio-visual data. In:Proceedings of IEEE fourth workshop on multimedia signal processing,Cannes, 2001. p. 601–6.

[2] Richardson IEG. H.264 and MPEG-4 video compression. Chichester: Wiley;2004.

[3] Shieh CS, Huang HC, Wang FH, Pan JS. Genetic watermarking based ontransform-domain techniques. Pattern Recogn 2004;37:555–65.

[4] Kumsawat R, Attakitmongcol K, Srikaew A. A new approach for optimizationin image watermarking by using genetic algorithms. IEEE Trans Signal Process2005;53:4707–19.

[5] Chen CC, Lin CS. A GA-based nearly optimal image authentication approach.Int J Innov Comput Inf Control 2007;3:631–40.

[6] Aslantas V. A singular-value decomposition-based image watermarking usinggenetic algorithm. AEU Int J Electron Commun 2008;62:386–94.

[7] Delaigle JF, Vleeschouwer CD, Macq B. Watermarking algorithm based on ahuman visual model. Signal Process 2007;66:319–35.

[8] Qi H, Zheng D, Zhao J. Human visual system based adaptive digital imagewatermarking. Signal Process 2008;88:174–88.

[9] Noorkami M, Mersereau M. A framework for robust watermarking of H.264-encoded video with controllable detection performance. IEEE Trans Inf ForenSec 2007;2:14–23.

[10] Ting TO, Rao MVC, Loo CK, Ngu SS. Solving unit commitment problem usinghybrid particle swarm optimization. J Heuristics 2003;9:507–20.

[11] Wang Z, Sun X, Zhang D. A novel watermarking scheme based on PSOalgorithm. In: Lecture notes in computer science, vol. 4688, 2007. p. 307–14.

[12] Hartung F, Girod B. Digital watermarking of MPEG-2 coded video in thebitstream domain. In: Proceedings of IEEE international conference onacoustics, speech, and signal processing, Munich, 1997. p. 2621–4.

[13] Dittmann J, Steinebach M, Steinmetz R. Robust mpeg video watermarkingtechnologies. In: Proceedings of the sixth ACM international conference onmultimedia, Bristol, 1998. p. 71–80.

[14] Qiu G, Marziliano P, Ho ATS, He D, Sun Q. A hybrid watermarking scheme forH.264/AVC video. In: Proceedings of the 17th international conference onpattern recognition, Cambridge, 2004. p. 865–8.


[15] Nguyen CV, Tay DBH, Deng G. A fast watermarking system for H.264/AVCVideo. In: Proceedings of IEEE Asia Pacific conference on circuits and systems,Singapore, 2006. p. 81–4.

[16] Kim SM, Kim SB, Hong Y, Won CS. Data hiding on H.264/AVC compressedvideo. In: Lecture notes in computer science, vol. 4633, 2007. p. 698–707.

[17] Chen B, Wornell GW. Provably robust digital watermarking. In: Proceedingsof the multimedia systems and applications, Bellingham, 1999. p. 43–54.

[18] Chen B, Wornell GW. Quantization index modulation: a class of provablygood methods for digital watermarking and information embedding. IEEETrans Inform Theory 2001;47:1423–43.

[19] Li Q, Cox IJ. Using perceptual models to improve fidelity and provideinvariance to volumetric scaling for quantization index modulation water-marking. In: Proceedings of IEEE international conference on acoustics,speech, and signal processing, Philadelphia, 2005. p. 1–4.

[20] Okman OE, Akar GB. Quantization index modulation-based image water-marking using digital holography. J Opt Soc Am A 2007;24:243–52.

[21] Noda H, Niimi M, Kawaguchi E. High-performance JPEG steganography usingquantization index modulation in DCT domain. Pattern Recogn Lett2005;27:455–61.

[22] Profrock D, Richter H, Schlauweg M, Muller E. H.264/AVC video authentica-tion using skipped macroblocks for an erasable watermark. In: Proceedings ofvisual communications and image processing, Beijing, 2005. p. 1480–9.

[23] Kennedy J, Eberhart RC. Swarm intelligence. San Francisco: MorganKaufmann; 2001.

[24] Engelbrecht AP. Fundamentals of computational swarm intelligence. UK:Wiley; 2005.

[25] Chen B, Wornell G. Quantization index modulation: a class of provably goodmethods for digital watermarking and information embedding. IEEE TransInform Theory 2001;47:1423–43.

[26] Miyazaki A, Okamoto A. Analysis of watermarking systems in the frequencydomain and its application to design of robust watermarking systems. In:Proceedings of the international conference on image processing, Thessalo-niki, 2001. p. 506–9.

[27] Kii H, Onishi J, Ozawa S. The digital watermarking method by using bothpatchwork and DCT. In: Proceedings of IEEE international conference onmultimedia computing and systems, Florence, 1999. p. 895–9.

[28] Joint Video Team of ISO/IEC MPEG and ITU-T VCEG. Draft ITU-T recommen-dation and final draft international standard of joint video specification ITU-TRec. (H.264/ISO/IEC 14 496-10 AVC). 2003.

[29] Hyunh-Thu Q, Ghanbari M. Scope of validity of PSNR in image/video qualityassessment. Electron Lett 2008;44:800–1.

[30] Hsu CT, Wu JL. Hidden digital watermarks in images. IEEE Trans ImageProcess 1999;8:58–68.

[31] Gonzalez RC, Woods RE. Digital image processing. Reading, MA: Addison-Wesley; 1992.

[32] Ling LF, Lu ZD, Zou FH. Turbo-based DNW algorithm for compressed video inVLC domain. Wuhan Univ J Nat Sci 2005;10:297–302.

C.H. Wu received a B.Eng. degree in Industrial and Systems Engineering from The Hong Kong Polytechnic University. He is currently pursuing a Ph.D. degree in the Department of Industrial and Systems Engineering at The Hong Kong Polytechnic University. He is a member of British Machine Vision Association & Society for Pattern Recognition and Hong Kong Society for Quality. His current research areas encompass evolutionary computing in machine vision system, image processing and quality control enhancement system.

Y. Zheng received the MS degree in Electronic Engineering from the Harbin Institute of Technology in 2008 and the BS degree in Measurement and Control from the Harbin Institute of Technology in 2006. He is a doctoral candidate in the Department of Information Science and Technology at the Sun Yat-Sen University, China. His research focuses on machine learning, computer vision and digital watermarking.

W.H. Ip received his Ph.D. from Loughborough University in the UK. He obtained his MBA from Brunel University, M.Sc. in Industrial Engineering from Cranfield University, and LLB (Hons) from University of Wolverhampton. He is a member of the Hong Kong Institution of Engineers, the Institution of Electrical Engineers, the Institution of Mechanical Engineers and a fellow member of the Hong Kong Quality Management Association. He is currently an associate professor in the Industrial and Systems Engineering Department of The Hong Kong Polytechnic University. He has more than 20 years of experience in industry, education and consulting. His research interests are AI, vision systems, ERP, and logistics and supply chain management.

C.Y. Chan received his B.Eng. (Hons) and Ph.D. in 1994 in Mechanical Engineering from Salford University, UK. He has been a Lecturer in the Department of Industrial and Systems Engineering at The Hong Kong Polytechnic University since 1995. He is engaged in both teaching and research. Currently, his research areas include signal processing, system dynamics, embedded passives, PCBA rework, control and automation.

K.L. Yung obtained his B.Sc. in Electronic Engineering (1975), M.Sc., DIC in Automatic Control Systems (1976), Ph.D. in Microprocessor Applications in Process Control (1985) in the UK and became a chartered engineer (C.Eng., MIEE) in 1981. After graduation he was working in the United Kingdom for companies such as BOC Advanced Welding Co. Ltd., the British Ever Ready Group and the Cranfield Unit for Precision Engineering. In 1986, Professor Yung returned to Hong Kong to join the Hong Kong Productivity Council as Consultant and subsequently switched to academia to join The Hong Kong Polytechnic University where he is now with the Department of Industrial and Systems Engineering. His research interests include precision motion control and system aspects of computer integrated manufacturing and management, computer vision, logistic planning and optimization.

Z.M. Lu was born in Zhejiang Province, China, in 1974. He received the B.S. and M.S. degrees in Electrical Engineering and the Ph.D. degree in Measurement Technology and Instrumentation from the Harbin Institute of Technology (HIT), Harbin, China, in 1995, 1997, and 2001, respectively. He is currently a Professor with the School of Aeronautics and Astronautics, Zhejiang University. His current research interests include speech coding, image processing, and information security. He has published more than 80 papers in the international journals and Chinese journals. He was awarded one of the 2003 One Hundred Most Excellent Doctors in China award for authoring more than 40 papers in the field of vector quantization.
