CHAPTER 1
INTRODUCTION
The growth of high-speed computer networks and the World Wide Web (WWW) has opened up
new business, scientific, entertainment and social opportunities in the form of
electronic publishing and advertising, messaging, real-time information delivery, data
sharing, collaboration among computers, product ordering, transaction processing, digital
repositories and libraries, web newspapers and magazines, network video and audio, personal
communication and much more. Improvements in technology have greatly enhanced the cost
effectiveness of selling software and digital content, such as images and video sequences,
by transmission over the WWW.
One of the biggest technological events of the last two decades has been the spread of
digital media into an entire range of aspects of everyday life. Digital data can be
stored efficiently and with very high quality, and it can be manipulated very easily using
computers. Furthermore, digital data can be transmitted quickly and inexpensively
through data communication networks without losing quality. Digital media offer several
distinct advantages over analog media. The quality of digital audio, image and video
signals is higher than that of their analog counterparts. Editing is easy because one can
access the exact discrete locations that need to be changed. Copying is simple, with no loss
of fidelity: a copy of a digital medium is identical to the original. With digital multimedia
distribution over the World Wide Web, authenticity and ownership are more threatened than
ever due to the possibility of unlimited copying. The easy transmission and manipulation of
digital data constitute a real threat to information creators, and copyright owners want to be
compensated every time their work is used. Furthermore, they want to be sure that their
work is not used in an improper way (e.g. modified without their permission). For digital
data, copyright enforcement and content verification are very difficult tasks. One solution
would be to restrict access to the data using encryption techniques. However,
encryption does not provide overall protection. Once the encrypted data are decrypted, they
can be freely distributed or manipulated.
Unauthorized use of data creates several problems. For example, if we visit
http://www.wallpaper.com, we observe that all the wallpaper images are created by the owners
and are therefore their intellectual property, protected by Intellectual Property Rights (IPR).
Any user can download the wallpapers. Now, consider that a user downloads the images and
posts them (either modified or unmodified) on his/her own website. Three issues may arise
in this situation:
1) How will the owner of wallpaper.com know that there is one more web server on
WWW posting their wallpapers?
2) If the owner learns of this, where should he go to make a complaint?
3) The last but very important issue: even if the first two problems are resolved, how
will the owner prove ownership of the wallpaper images posted on another
server?
The first issue is related to network technologies and involves techniques such as web
crawlers and pattern matching. The second issue is related to international copyright law
and is another very tricky matter. This thesis does not deal with these two issues. It covers
the third issue, authentication, i.e. how to prove ownership.
The above problem can be solved by hiding some ownership data in the multimedia data,
which can be extracted later to prove ownership. The same idea is used in bank
currency notes, which carry an embedded watermark that is used to check whether the
note is genuine. The same “watermarking” concept may be used in digital multimedia content
to check the authenticity of the original content.
To give a quick background of watermarking, we first present the history of data
hiding and the related terminology. We then move on to a discussion of
watermarking itself, the requirements that a watermarking system must meet, the types of
watermarking, its applications and the various attacks on a watermarking system.
1.1 DATA HIDING BACKGROUND
The solution to the problem discussed above seems to lie in a technique that dates back to
ancient Egypt and Greece: data hiding or steganography. Steganography deals with the
methods of embedding data within a medium (host or cover medium) in an imperceptible
way. All forms of digital data (still images, audio, video, text documents and multimedia
documents) can be used as a cover medium for information hiding.
The history of steganography goes all the way back to the 5th century BC. The earliest known
writings about steganography are by the Greek historian Herodotus. The historian relates
how a slave had a message tattooed on his head by Histiaeus, who was trying to get a
message to his son-in-law Aristagoras. Once the slave’s hair was long enough to cover the
message, he was sent to Aristagoras in the city of Miletus [92].
Steganography has been used in many different ways. The simplest was the use of invisible
inks, with which a person could send a message to another person without anyone else
knowing. Different forms of invisible ink were used to conceal messages. Some of the more
common ones were lemon juice, milk and urine. Someone who wanted to conceal a message
would simply write it, using one of these inks, on a sheet of paper that already had
something written on it. The person receiving the message would then hold the paper over
a flame and the invisible message would appear.
Image steganography was practised around the turn of the twentieth century. During the Boer
War in South Africa, the British used Lord Robert Baden-Powell as a scout. He scouted
the Boer artillery bases and mapped their positions. He then converted his maps into
pictures of butterflies with certain markings on the wings that were actually the enemy’s
positions [92].
During World War II, the Nazis introduced a new concept in espionage called the
microdot. This simple device could conceal a full typewritten page within a dot the size of a
common period. A microdot could hold valuable information such as charts, diagrams and
drawings.
Figure 1.1: Watermark on the bank currency note
Thus, steganography is an area which is, more or less, a game of hide-and-seek. Some important
data or information is hidden in another medium. The cover medium has no relationship
with the hidden data or information, and the hidden data or information is not encrypted
either. The key requirement of a steganography system is that no one should suspect that a
particular medium is carrying any hidden data or information.
We can extend the steganography concept to the authentication of digital multimedia data.
The digital multimedia data to be protected becomes the cover medium, and we
hide the copyright data in it. In this case, there are two major requirements:
1) Imperceptibility: After the copyright data has been hidden, the cover medium should not be
perceptibly affected, and
2) Robustness: Nobody should be able to remove the data without affecting the cover
medium.
The copyright data may be termed digital watermark data. This area of application of
steganography is known as Digital Watermarking. A digital watermark is therefore a
message/data/information which is embedded into digital content (audio, video, images or
text) and can be detected or extracted later. Such message/data/information mostly carries
the copyright or ownership information of the content. The process of embedding digital
watermark information into digital content is known as digital watermarking.
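As a minimal sketch of what "embedding" and "extracting later" can mean, the following Python fragment hides watermark bits in the least significant bit (LSB) of each pixel and reads them back. LSB substitution is used here only as the simplest possible illustration; it is not a technique proposed in this chapter, and the function names and sample values are hypothetical.

```python
def embed_lsb(pixels, bits):
    """Return a copy of `pixels` with `bits` written into the least significant bits."""
    if len(bits) > len(pixels):
        raise ValueError("watermark is longer than the cover medium")
    marked = list(pixels)
    for i, bit in enumerate(bits):
        marked[i] = (marked[i] & ~1) | bit  # clear the LSB, then set it to the bit
    return marked


def extract_lsb(pixels, n_bits):
    """Read back the first `n_bits` least significant bits."""
    return [p & 1 for p in pixels[:n_bits]]


cover = [52, 55, 61, 66, 70, 61, 64, 73]   # hypothetical 8-bit gray values
watermark = [1, 0, 1, 1, 0, 0, 1, 0]       # hypothetical ownership bits

marked = embed_lsb(cover, watermark)
recovered = extract_lsb(marked, len(watermark))
```

Each pixel changes by at most one gray level, which is why the embedded data is hard to see, and the extraction needs only the marked data.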
Before moving further in this discussion, we must first understand how digital
watermarking differs from related terms such as steganography, cryptography and digital
signatures.
1.1.1 STEGANOGRAPHY VS. WATERMARKING
Watermarking is a subset of steganography. In steganography, the hidden data has no
relationship with the cover medium, and the requirement of such a system is that no
suspicion should arise that a medium is carrying any hidden data. In watermarking, unlike
steganography, the hidden data is related to the cover medium data: it is
the ownership data of the cover medium, and it does not matter whether anyone suspects that a
particular medium is carrying some copyright data.
The purpose of steganography is covert communication between two parties, i.e. the
existence of the communication is unknown to a possible attacker, and a successful attack
consists of detecting the existence of this communication. Watermarking, as opposed
to steganography, requires a system to be robust against possible attacks. The other requirements
of watermarking are entirely different from those of steganography and are discussed in detail in
Section 1.3.
1.1.2 CRYPTOGRAPHY VS. WATERMARKING
Cryptography can be defined as the processing of information into an unintelligible form,
a process known as encryption, for the purpose of secure transmission. Through the use of a
“key”, the receiver can decode the encrypted message (a process known as decryption) to
retrieve the original message. So, cryptography is about protecting the contents of the
message. But as soon as the data is decrypted, all the in-built security is lost and the data
is ready to use. Cryptography “scrambles” a message so that it cannot be understood by an
unauthorized user. This does not happen in watermarking. Neither the cover medium nor the
copyright data changes its meaning. Rather, the copyright data is hidden to give the
ownership information of the medium in which it is hidden.
1.1.3 DIGITAL SIGNATURE VS. WATERMARKING
Digital signatures, like written signatures, are used to provide authentication of the associated
input, usually called a “message”. A digital signature is an electronic signature that can be used
to authenticate the identity of the sender of a message or the signer of a document, and
possibly to ensure that the original content of the message or document that has been sent is
unchanged. Digital signatures are easily transportable, cannot be imitated by someone else,
and can be automatically time-stamped. The ability to ensure that the original signed
message arrived means that the sender cannot easily repudiate it later. A digital signature can
be used with any kind of message, whether it is encrypted or not, simply so that the receiver
can be sure of the sender’s identity and that the message arrived intact. A digital signature is
kept separate from the protected message, whereas a digital watermark is inside the multimedia
message. Both digital signatures and watermarking protect the integrity and authenticity of a
document. A digital signature fails to verify after even slight distortion of the data, whereas
a watermarking system has to tolerate a limited level of distortion.
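The sensitivity of a detached signature to distortion can be sketched as follows. The example uses a keyed hash (HMAC-SHA-256) from the Python standard library as a stand-in for a real digital signature, which would use a private/public key pair; the key and the pixel values are hypothetical.

```python
import hashlib
import hmac

KEY = b"hypothetical-shared-secret"  # assumption: stand-in for a signing key


def tag(pixels):
    """Detached authentication tag computed over the raw pixel bytes."""
    return hmac.new(KEY, bytes(pixels), hashlib.sha256).hexdigest()


original = [52, 55, 61, 66, 70, 61, 64, 73]
detached_tag = tag(original)   # travels separately from the image

distorted = list(original)
distorted[3] += 1              # one gray level of distortion in a single pixel

intact_ok = hmac.compare_digest(tag(original), detached_tag)
distorted_ok = hmac.compare_digest(tag(distorted), detached_tag)
```

Here `intact_ok` is true and `distorted_ok` is false: the tag reports any change, however small, but it says nothing about ownership once it has been separated from the image.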
So, to conclude, watermarking adds “ownership” information to multimedia content to
prove its authenticity. This technology embeds an imperceptible digital code,
namely the watermark, carrying information about the copyright status of the work to be
protected. Continuous efforts are being made to devise efficient watermarking schemes, but
the techniques proposed so far do not seem to be robust to all possible attacks and multimedia
data processing operations. The sudden increase in interest in watermarking is most likely due
to the increase in concern over IPR. Today, digital data security covers such topics as access
control, authentication, and copyright protection for still images, audio, video, and
multimedia products. A pirate tries either to remove a watermark to violate a copyright or to
cast the same watermark, after altering the data, to forge the proof of authenticity. Generally,
the watermarking of still images, video, and audio shares certain common fundamental
concepts.
1.2 APPLICATION AREAS OF DIGITAL WATERMARKING
Watermarking techniques may be relevant in the following application areas [26]:
1.2.1 COPYRIGHT PROTECTION
The primary use of watermarking is where an organization wishes to assert its ownership of
copyright for digital objects. This application is of great interest to ‘big media’
organizations, and of some interest to other vendors of digital information, such as news and
photo agencies. These applications require a minimal amount of information to be
embedded, coupled with a high degree of resistance to signal modification (since they may
be subjected to deliberate attack). For example, nowadays the news channel “AAJ-TAK”
shows animal clips (which have already been shown on the “Discovery” channel) with the
Discovery channel’s logo hidden. By law, AAJ-TAK should show a
courtesy credit and pay the copyright fee to the Discovery channel. In such cases,
there is a strong need for watermarking because, once digital data is broadcast, anybody
can start selling it without paying the IPR value to its owner.
1.2.2 COPY PROTECTION
Watermarking can be used as a strong tool to prevent illegal copying. For example, if an
audio CD has a watermark embedded into it, then a compliant system (hardware such as a DVD
recorder, or software) will not make a copy of it, and even if a copy is made, the watermark
data will not be copied to the duplicate audio CD. The duplicate CD can then be easily
identified because it does not carry the watermark data. Some schemes have attempted to
satisfy more complex copy protection requirements. An early example is the Serial Copy
Management System (SCMS), introduced in the 1980s, which enabled a user to make a single
digital audio tape of a recording they had purchased but prevented the recording of further
copies (i.e. second generation) from that first copy. The scheme ultimately failed because not
all manufacturers of consumer equipment were prepared to implement it in their products.
1.2.3 TAMPER DETECTION
In this application area, it is necessary to demonstrate the origin of a data object and to
prove its integrity. One example of tamper detection is photographic
forensic information which may be presented as evidence in court. Given the ease with
which digital images can be manipulated, there is a need to provide proof that an image has
not been altered. Such a mechanism could be built into a digital camera [29]. For example,
if a police camera catches a speeding vehicle, then, when the driver’s guilt is being argued
before the judge, the accused may claim that the video presented in court has been tampered
with and that the car shown in the video does not belong to him. A watermarking system
embedded in digital cameras may help to resolve the issue. If somebody tries to tamper with
the data, the watermark will be destroyed, indicating that the data has been tampered with.
In our country, a well-known example is the “Tahalka-Scam”.
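The idea of a watermark that is destroyed by tampering can be sketched as follows. In this illustration, which is an assumption of ours and not the mechanism of [29], the camera derives check bits from the seven most significant bits of every pixel with a keyed hash and stores them in the least significant bits; any later change to the picture breaks the match. The key, function names and test image are hypothetical.

```python
import hashlib
import hmac

KEY = b"hypothetical-camera-key"  # assumption: secret held by the camera/verifier


def _check_bits(pixels, n):
    """Derive n check bits (n <= 256) from the 7 most significant bits of all pixels."""
    msb_bytes = bytes(p >> 1 for p in pixels)
    digest = hmac.new(KEY, msb_bytes, hashlib.sha256).digest()
    return [(digest[i // 8] >> (i % 8)) & 1 for i in range(n)]


def embed_fragile(pixels):
    """Store the check bits in the LSBs; the upper 7 bits are left untouched."""
    n = min(len(pixels), 256)
    bits = _check_bits(pixels, n)
    marked = list(pixels)
    for i in range(n):
        marked[i] = (marked[i] & ~1) | bits[i]
    return marked


def is_untampered(pixels):
    """Recompute the check bits and compare them with the stored LSBs."""
    n = min(len(pixels), 256)
    return [p & 1 for p in pixels[:n]] == _check_bits(pixels, n)


image = [(i * 7 + 30) % 256 for i in range(64)]  # hypothetical 64-pixel image
marked = embed_fragile(image)

edited = list(marked)
edited[10] += 8        # a visible edit to one pixel

flipped = list(marked)
flipped[0] ^= 1        # even a single LSB flip breaks the stored check bits
```

`is_untampered(marked)` holds, while both `edited` and `flipped` fail the check. This sketch only reports that tampering occurred; it does not locate the modified region.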
1.2.4 BROADCAST MONITORING
There are several types of organizations and individuals interested in monitoring
broadcasts of their content. For example, advertisers want to ensure that they receive the exact
airtime that they have purchased from broadcasting firms. Musicians and actors want to
ensure that they receive accurate royalty payments for broadcasts of their performances and
copyright owners want to ensure that their property is not illegally rebroadcast by pirate
stations. In 1997, a scandal broke out in Japan regarding television advertising. At least two
stations had been routinely overbooking air time. Advertisers were paying for thousands of
commercials that were never aired [16]. The practice had remained largely undetected for
over twenty years because there were no systems in place to monitor the actual broadcast of
advertisements.
This broadcast monitoring can be implemented by putting a unique watermark in each video
or sound clip prior to broadcast. Automated monitoring stations can then receive broadcasts
and look for these watermarks identifying when and where each clip appears.
1.2.5 FINGERPRINTING
If monitoring and owner identification applications place the same watermark in all copies of
the same content, a problem may arise. If one of the n legal buyers of the content
starts selling it illegally, it may be very difficult to find out who is redistributing
the content without permission. Customizing each distributed copy for its
legal recipient can solve this problem. This capability allows a unique watermark to be
embedded in each individual copy. Now, if the owner finds an illegal copy, he can find out
who is selling his content by identifying the watermark, which belongs to only a single legal
buyer. This particular application area is known as fingerprinting. It is potentially
valuable both as a deterrent to illegal use and as a technological aid to investigation.
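A minimal sketch of fingerprinting is given below. It assumes, purely for illustration, that each buyer's unique watermark is a 64-bit pattern derived from the buyer's identifier with a keyed hash and written into pixel LSBs; the owner key, buyer names and content values are hypothetical.

```python
import hashlib
import hmac

OWNER_KEY = b"hypothetical-owner-secret"  # assumption: known only to the owner
N_BITS = 64                               # length of each buyer's fingerprint


def fingerprint_bits(buyer_id):
    """Derive a buyer-specific bit pattern from the buyer's identifier."""
    digest = hmac.new(OWNER_KEY, buyer_id.encode("utf-8"), hashlib.sha256).digest()
    return [(digest[i // 8] >> (i % 8)) & 1 for i in range(N_BITS)]


def make_copy(pixels, buyer_id):
    """Produce the copy sold to `buyer_id`, carrying that buyer's fingerprint."""
    marked = list(pixels)
    for i, bit in enumerate(fingerprint_bits(buyer_id)):
        marked[i] = (marked[i] & ~1) | bit
    return marked


def trace(found_copy, buyer_ids):
    """Return the buyer whose fingerprint agrees best with the copy that was found."""
    found = [p & 1 for p in found_copy[:N_BITS]]

    def agreement(buyer_id):
        return sum(a == b for a, b in zip(found, fingerprint_bits(buyer_id)))

    return max(buyer_ids, key=agreement)


content = [(i * 5 + 17) % 256 for i in range(128)]  # hypothetical content
buyers = ["alice", "bob", "carol"]
copies = {b: make_copy(content, b) for b in buyers}

leaked = copies["bob"]            # the copy found on an unauthorized site
suspect = trace(leaked, buyers)
```

The owner only needs the list of buyers and the secret key to trace a leaked copy; `suspect` identifies the buyer whose copy was redistributed.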
1.2.6 ANNOTATION APPLICATIONS
In this application area, watermarks convey object-specific information (“feature tags” or
“captions”) to users of the object. For example, patient identification data can be embedded
into medical images. These applications require relatively large quantities of embedded data.
While there is no need to protect against deliberate tampering, normal use of the data object
may involve such transformations as image cropping or scaling, and therefore requires a
technique that is resistant to those types of modification.
For more details of the various watermarking applications, one may refer to [20].
1.3 CHARACTERISTICS OF WATERMARKING SCHEMES
An effective watermarking scheme should have the following characteristics:
1) Imperceptibility: In terms of watermarking, imperceptibility means that after inserting
the watermark data, the cover medium should not change noticeably. In other words, the
presence of the watermark data should not affect the cover medium being protected.
If a watermarking scheme does not meet this requirement, inserting watermark data
into a cover medium (say an image) may degrade its quality, and the owner of the
image will never accept a protection mechanism that modifies
his work.
2) Robustness: Robustness of the watermark data means that the watermark data should
not be destroyed when someone performs common manipulations or
malicious attacks. It is both a property and a requirement of watermarking,
and whether it is needed depends on the application area.
3) Fragility: Fragility means that the watermark data is altered or disturbed to a
certain extent when someone performs common manipulations or malicious
attacks. Some application areas, such as tamper detection, may require a fragile
watermark so that the owner knows that his work has been tampered with. Some
applications may require semi-fragility too. A semi-fragile watermark combines a fragile
watermark component and a robust watermark component, i.e. semi-fragile
watermarks are robust to some attacks but fragile to others.
4) Resilient to common signal processing: The watermark should be retrievable even if
common signal processing operations are applied to the watermarked cover medium
data. These operations include digital-to-analog and analog-to-digital conversion (e.g.
printing an image and then scanning it to create another digital copy of the
image), re-sampling, re-quantization (including dithering and recompression), and
common signal enhancements such as image contrast, brightness and color
adjustment, or audio bass and treble adjustment, high pass and low pass filtering,
histogram equalization of an image and format conversion (BMP image to JPEG
image, MPEG movie to WMV movie, mp3 song to mp4 etc.).
5) Resilient to common geometric distortions (image and video data): Watermarks in
image and video data should also be immune to geometric image operations such
as rotation, translation, cropping and scaling. This property is not required for audio
watermarking.
6) Robust to subterfuge attacks (collusion and forgery): In addition, the watermark
should be robust to collusion attacks. Multiple individuals who each possess a watermarked
copy of the data may combine their watermarked copies to destroy the watermark
and generate a duplicate of the original copy. Further, if a digital
watermark is to be used in litigation, it must be impossible for colluders to combine
their images to generate a different valid watermark.
7) Unambiguousness: Retrieval of the watermark should unambiguously identify the
owner. Furthermore, the accuracy of owner identification should not degrade much
in the case of an attack. Tools such as Unzign and Stirmark [97] have shown remarkable
success in removing data embedded by commercially available programs.
Watermarking of an already watermarked image (re-watermarking) is also a major threat [97].
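The tension between imperceptibility and robustness in the list above can be sketched with a small experiment. The fragment below embeds bits in pixel LSBs (an illustrative choice, not a scheme proposed here), measures the distortion with the peak signal-to-noise ratio (PSNR, one common image quality measure, used here as an assumption), and then applies a simple re-quantization of the kind named in item 4. All names and values are hypothetical.

```python
import math


def embed_lsb(pixels, bits):
    """Write `bits` into the least significant bits of a copy of `pixels`."""
    marked = list(pixels)
    for i, bit in enumerate(bits):
        marked[i] = (marked[i] & ~1) | bit
    return marked


def extract_lsb(pixels, n_bits):
    """Read back the first `n_bits` least significant bits."""
    return [p & 1 for p in pixels[:n_bits]]


def psnr(original, modified, peak=255):
    """Peak signal-to-noise ratio in dB; higher means less visible distortion."""
    mse = sum((a - b) ** 2 for a, b in zip(original, modified)) / len(original)
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)


def requantize(pixels, step=4):
    """Coarser quantization: snap each value down to a multiple of `step`."""
    return [(p // step) * step for p in pixels]


cover = [52, 55, 61, 66, 70, 61, 64, 73]
watermark = [1, 0, 1, 1, 0, 0, 1, 0]

marked = embed_lsb(cover, watermark)
quality_db = psnr(cover, marked)            # distortion caused by embedding

attacked = requantize(marked)               # a common processing operation
survived = extract_lsb(attacked, len(watermark)) == watermark
```

The embedding is imperceptible in the PSNR sense (above 48 dB, since no pixel moves by more than one level), yet `survived` is false: re-quantization zeroes every LSB, so this watermark behaves as a fragile one rather than a robust one.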
1.4 TYPES OF DIGITAL WATERMARKS
Prof. S. Mohanty presents a very good classification of watermarking areas in his paper [62].
We can classify the types of watermarking based on the cover medium, embedding domain,
perception and application domain. Figure 1.2 shows the various classifications of
watermarking.
Based on their embedding domain, watermarking schemes can be classified as follows:
1) Spatial Domain: The watermarking system directly alters the main data elements (like
pixels in an image) to hide the watermark data.
2) Transformed Domain: The watermarking system alters the frequency transforms of
data elements to hide the watermark data. This has proved to be more robust than the
spatial domain watermarking.
3) Feature Domain: The watermarking system takes into account the region, boundary
and object characteristics. It presents better detection and recovery from attacks.
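To make the transformed-domain idea in the list above concrete, the following sketch embeds one bit by shifting a single coefficient of a one-dimensional DCT of an 8-sample block and detects it by comparison with the original (non-blind detection). The coefficient index `K`, the strength `ALPHA` and the sample block are assumptions chosen for illustration; practical schemes work on 2-D transforms and many coefficients.

```python
import math

K = 3          # assumption: index of the mid-frequency coefficient that carries the bit
ALPHA = 8.0    # assumption: embedding strength


def dct(x):
    """Orthonormal 1-D DCT-II."""
    n = len(x)
    return [math.sqrt((1 if k == 0 else 2) / n)
            * sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for i in range(n))
            for k in range(n)]


def idct(c):
    """Inverse of `dct` (orthonormal DCT-III)."""
    n = len(c)
    return [sum(math.sqrt((1 if k == 0 else 2) / n) * c[k]
                * math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for k in range(n))
            for i in range(n)]


def embed_dct(block, bit):
    """Shift coefficient K up for a 1 bit, down for a 0 bit, then return to pixels."""
    coeffs = dct(block)
    coeffs[K] += ALPHA if bit else -ALPHA
    return [min(255, max(0, round(v))) for v in idct(coeffs)]


def detect_dct(suspect, original):
    """Non-blind detection: sign of the change in coefficient K."""
    return 1 if dct(suspect)[K] - dct(original)[K] > 0 else 0


block = [52, 55, 61, 66, 70, 61, 64, 73]       # hypothetical gray values
marked = embed_dct(block, 1)
brightened = [p + 10 for p in marked]          # a uniform brightness adjustment
detected_after_attack = detect_dct(brightened, block)
```

A uniform brightness change affects only the DC coefficient (index 0), so the bit stored in coefficient `K` is still detected after the adjustment, whereas the same adjustment shifts every spatial-domain pixel value.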
Figure 1.2: Various classifications of watermarking
Watermarking techniques can also be divided into four categories, according to the type of
document to be watermarked, as follows.
1) Image Watermarking: Figures 1.3 and 1.4 show the general scheme of an image
watermark embedding system and decoding system respectively (specifically key-based,
invisible and fragile). ‘E’ represents the watermark embedding algorithm and ‘D’
represents the watermark decoding algorithm.
2) The other types of watermarking, according to the type of document to be watermarked,
are Video Watermarking, Audio Watermarking and Text Watermarking.
Figure 1.3: Image watermark embedding scheme
Figure 1.4: Image watermark detection scheme
According to human perception, digital watermarks can be divided into four different
types: visible watermarks, invisible-robust watermarks, invisible-fragile watermarks, and dual
watermarks. A visible watermark is a secondary translucent image overlaid on the primary image.
The watermark is visible to a casual viewer on careful inspection. The invisible-robust
watermark is embedded in such a way that the alterations made to the pixel values are
perceptually not noticeable and it can be recovered only with an appropriate decoding mechanism.
The fragile watermark is embedded in such a way that any manipulation or modification of
the image would alter or destroy the watermark. A dual watermark is a combination of a
visible and an invisible watermark [8].
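A visible watermark of the kind described above can be sketched as a per-pixel blend of the host image and a logo. The `opacity` parameter and the sample values are assumptions made for illustration only.

```python
def overlay_visible(host, mark, opacity=0.25):
    """Blend a translucent watermark image into the host, pixel by pixel."""
    if len(host) != len(mark):
        raise ValueError("host and watermark must have the same size")
    return [round((1 - opacity) * h + opacity * m) for h, m in zip(host, mark)]


host = [100, 100, 100, 100]   # hypothetical flat gray region of the primary image
logo = [0, 255, 255, 0]       # hypothetical logo: dark border, bright centre
visible = overlay_visible(host, logo)
```

With an opacity of 0.25 the logo shifts the host values by roughly a quarter of the difference between logo and host, so the primary image stays recognizable while the mark remains visible.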
According to the application domain, watermarks can be source based or destination based.
Source-based watermarks are desirable for ownership
identification or authentication where a unique watermark identifies the owner. A
source-based watermark could be used for authentication and to determine whether a received
image or other electronic data has been tampered with. The watermark could also be destination
based, where each distributed copy gets a unique watermark identifying the particular buyer. The
destination-based watermark could be used to trace the buyer in the case of illegal reselling.
This is used in fingerprinting. A watermark is said to be private if only authorized readers can
detect it. In other words, in private watermarking, a mechanism is envisaged that makes it
impossible for unauthorized people to extract the watermark. A watermarking algorithm is
said to be blind if it does not need to compare the original non-marked document with the
marked document to recover the watermark. Conversely, a watermarking algorithm is said to be
non-blind if it needs the original data to extract the information contained in the watermark.
The definitions of invertible and quasi-invertible are more technical and can be given as
follows [2]:
If E is the embedding algorithm, D is the detection algorithm, Cδ is the comparator function,
I is the original cover image, Î is the watermarked image, J is the recovered attacked image,
S is the watermark signal and S’ is the extracted watermark data, then:
1) E (I, S) = Î
2) D (J, I) = S’ or D (J) = S’
3) Comparator Cδ: Cδ (S’, S) = 1 if S’ and S are similar to within the threshold δ, and 0 otherwise
A watermarking scheme (E, D, Cδ) is invertible if:
1) An inverse mapping E⁻¹ exists such that E⁻¹ (Î) = (Î’, S’) and E (Î’, S’) = Î;
2) E⁻¹ is computationally feasible;
3) S’ is an allowed watermark;
4) Î and Î’ are perceptually similar; and
5) Comparator output Cδ (D (Î, Î’), S’) = 1
Otherwise the watermarking scheme is non-invertible.
A watermarking scheme (E, D, Cδ) is quasi-invertible if:
1) The properties of invertible watermarking schemes apply;
2) The only difference is that E (Î’, S’) = Î’’ ≠ Î; and
3) Î’’ and Î are perceptually similar.
Otherwise the watermarking scheme is non-quasi-invertible. A non-invertible scheme can
still be quasi-invertible, whereas non-quasi-invertibility implies non-invertibility.
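Why invertibility matters can be sketched with a purely additive scheme, E (I, S) = I + S, which we assume here only because it is the simplest scheme that satisfies the invertibility conditions. The detector is non-blind, the comparator is reduced to exact equality instead of a thresholded similarity, and all names and values are hypothetical.

```python
def embed(image, signal):
    """E: additive embedding, watermarked = image + signal (no clipping, for clarity)."""
    return [p + s for p, s in zip(image, signal)]


def detect(suspect, reference):
    """D: non-blind detection, the extracted signal is the pixel-wise difference."""
    return [j - i for j, i in zip(suspect, reference)]


def compare(extracted, claimed):
    """Comparator reduced to exact equality (a real one would threshold a similarity)."""
    return 1 if extracted == claimed else 0


def invert(watermarked, fake_signal):
    """Attacker's inverse mapping: subtract a watermark of the attacker's own choosing."""
    fake_original = [p - s for p, s in zip(watermarked, fake_signal)]
    return fake_original, fake_signal


original = [52, 55, 61, 66, 70, 61, 64, 73]     # I, held by the true owner
owner_mark = [1, -1, 1, 1, -1, -1, 1, -1]       # S
published = embed(original, owner_mark)         # Î, the image everybody can see

fake_mark = [-1, 1, 1, -1, 1, -1, -1, 1]        # S', invented by the attacker
fake_original, _ = invert(published, fake_mark) # Î', presented as "the original"

owner_claim = compare(detect(published, original), owner_mark)
attacker_claim = compare(detect(published, fake_original), fake_mark)
```

Both `owner_claim` and `attacker_claim` evaluate to 1, and re-embedding the attacker's pair reproduces the published image exactly, so the detector alone cannot decide who the rightful owner is. This is the situation that the non-invertibility requirement is meant to rule out.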
1.5 STRUCTURE OF THE THESIS
This thesis comprises the following chapters:
Chapter 2 describes the image watermarking literature survey and problem statement.
Chapter 3 describes the preliminaries (such as the background of JPEG compression, 2D-DCT and
DWT, image quality parameters, some standard watermarking techniques which are used to
compare the performance of the proposed techniques, and the test image data). The
watermarking techniques for gray images are proposed in Chapter 4. Chapter 5
describes the proposed watermarking techniques and issues related to colored BMP images.
In Chapter 6, the proposed watermarking techniques for JPEG images are given.
Finally, the summary of results, conclusions and future work is given in Chapter 7, followed
by the references, the author’s publications and a synopsis at the end.