9
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS—I: REGULAR PAPERS, VOL. 55, NO. 4, MAY 2008 1055 Cryptanalyzing an Encryption Scheme Based on Blind Source Separation Shujun Li, Chengqing Li, Kwok-Tung Lo, Member, IEEE, and Guanrong Chen, Fellow, IEEE Abstract—Recently, there was a proposal of using the underde- termined blind source separation (BSS) principle to design image and speech encryption. In this paper, we report a cryptanalysis of this BSS-based encryption scheme and point out that it is not secure against known-/chosen-plaintext attack and chosen-cipher- text attack. In addition, we discuss some other security defects of the schemes: 1) it has a low sensitivity to part of the key and to the plaintext; 2) it is weak against a ciphertext-only differential attack; and 3) a divide-and-conquer (DAC) attack can be used to break part of the key. We finally analyze the role of BSS in this approach towards cryptographically secure ciphers. Index Terms—Blind source separation (BSS), chosen-cipher- text attack, chosen-plaintext attack, cryptanalysis, differential attack, divide-and-conquer (DAC) attack, image encryption, known-plaintext attack, speech encryption. I. INTRODUCTION W ITH the rapid development of multimedia and net- working technologies, the security of multimedia data becomes more and more important in many real applications. To fulfill such an increasing demand, during the past two decades, many encryption schemes have been proposed for protecting multimedia data, including speech signals, images, and videos [1]–[9]. According to the nature of the data, multimedia encryption schemes can be classified into two basic types: analog and digital. Most early schemes were designed to encrypt analog data in various ways, e.g., element permuting, signal masking, and frequency shuffling, all of which may be exerted in the time domain, the transform domain, or both. However, due to the simplicity of their encryption procedures, almost all analog encryption schemes are not sufficiently secure against cryptographical attacks, especially those modern attacks such as known/chosen-plaintext and chosen-ciphertext attacks [2], Manuscript received January 1, 2007; revised July 7, 2007. This research was supported in part by The Hong Kong Polytechnic University’s Postdoctoral Fellowships Program under Grant G-YX63 and by the Alexander von Hum- boldt Foundation of Germany. The work of K.-T. Lo was supported by the Research Grants Council of the Hong Kong SAR Government under Project 523206 (PolyU 5232/06E). This paper was recommended by Associate Editor C. W. Wu. S. Li is with the FernUniversität in Hagen, Lehrgebiet Informationstechnik, 58084 Hagen, Germany (www.hooklee.com) C. Li and G. Chen are with the Department of Electronic Engineering, City University of Hong Kong, Kowloon Toon, Hong Kong SAR. K.-T. Lo is with the Department of Electronic and Information Engineering, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong SAR. Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TCSI.2008.916540 [3], [10], [11]. As a comparison, in digital encryption schemes, one can employ many cryptographically strong ciphers, such as DES [12] and AES [13], to achieve a higher level of security. Besides, to achieve a higher efficiency of encryption and to meet some special demands of multimedia encryption (such as format-compliance [14] and perceptual encryption [15]), many specific multimedia encryption schemes have also been developed [4]–[6]. Recent cryptanalysis work [16]–[30] has shown that some multimedia encryption schemes are insecure against cryptographical attacks. Recently, Lin et al. suggested employing blind source sep- aration (BSS) for the purpose of image and speech encryption [31]–[37]. The basic idea is to mix multiple plaintexts (or mul- tiple segments of the same plaintext) with a number of secret key signals, in the hope that an attacker has to solve a hard mathe- matical problem—the underdetermined BSS problem. In [37, Sec. VII], Lin et al. claimed that this BSS-based cipher “is im- mune from the attacks such as the ciphertext-only attack, the known-plaintext, and the chosen-plaintext attack as long as the intractability of the underdetermined BSS problem is guar- anteed by the mixing matrix for encryption”. This paper reevaluates the security of the BSS-based encryp- tion scheme and points out that it is actually insecure against known-/chosen-plaintext and chosen-ciphertext attacks. In ad- dition, some other security defects are found under the sce- narios of ciphertext-only attack, including its low sensitivity to the mixing matrix (part of the secret key) and to the plaintext, and a differential attack can be recommended, which works well when the matrix size is small. Based on the cryptanalytic find- ings, we further discuss the role of BSS in this approach towards cryptographically secure ciphers. The remainder of this paper is organized as follows. In Section II, a brief introduction is given to the BSS-based encryption scheme. Sections III is the main body of this paper, in which a detailed cryptanalysis of the BSS-based encryption scheme is presented. Then, the role of BSS in cryptography is discussed in Section IV. Finally, the Section V concludes the paper. II. BSS-BASED ENCRYPTION BSS is a technique that tries to recover a set of unobserved sources or signals from some observed mixtures [38]. Given unobserved signals and a mixing matrix of size , the BSS problem is to recover from observed signals , where (1) 1549-8328/$25.00 © 2008 IEEE

Cryptanalysis of an encryption scheme based on blind source separation

  • Upload
    surrey

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS—I: REGULAR PAPERS, VOL. 55, NO. 4, MAY 2008 1055

Cryptanalyzing an Encryption SchemeBased on Blind Source Separation

Shujun Li, Chengqing Li, Kwok-Tung Lo, Member, IEEE, and Guanrong Chen, Fellow, IEEE

Abstract—Recently, there was a proposal of using the underde-termined blind source separation (BSS) principle to design imageand speech encryption. In this paper, we report a cryptanalysisof this BSS-based encryption scheme and point out that it is notsecure against known-/chosen-plaintext attack and chosen-cipher-text attack. In addition, we discuss some other security defects ofthe schemes: 1) it has a low sensitivity to part of the key and to theplaintext; 2) it is weak against a ciphertext-only differential attack;and 3) a divide-and-conquer (DAC) attack can be used to breakpart of the key. We finally analyze the role of BSS in this approachtowards cryptographically secure ciphers.

Index Terms—Blind source separation (BSS), chosen-cipher-text attack, chosen-plaintext attack, cryptanalysis, differentialattack, divide-and-conquer (DAC) attack, image encryption,known-plaintext attack, speech encryption.

I. INTRODUCTION

WITH the rapid development of multimedia and net-working technologies, the security of multimedia data

becomes more and more important in many real applications.To fulfill such an increasing demand, during the past twodecades, many encryption schemes have been proposed forprotecting multimedia data, including speech signals, images,and videos [1]–[9].

According to the nature of the data, multimedia encryptionschemes can be classified into two basic types: analog anddigital. Most early schemes were designed to encrypt analogdata in various ways, e.g., element permuting, signal masking,and frequency shuffling, all of which may be exerted in thetime domain, the transform domain, or both. However, dueto the simplicity of their encryption procedures, almost allanalog encryption schemes are not sufficiently secure againstcryptographical attacks, especially those modern attacks suchas known/chosen-plaintext and chosen-ciphertext attacks [2],

Manuscript received January 1, 2007; revised July 7, 2007. This researchwas supported in part by The Hong Kong Polytechnic University’s PostdoctoralFellowships Program under Grant G-YX63 and by the Alexander von Hum-boldt Foundation of Germany. The work of K.-T. Lo was supported by theResearch Grants Council of the Hong Kong SAR Government under Project523206 (PolyU 5232/06E). This paper was recommended by Associate EditorC. W. Wu.

S. Li is with the FernUniversität in Hagen, Lehrgebiet Informationstechnik,58084 Hagen, Germany (www.hooklee.com)

C. Li and G. Chen are with the Department of Electronic Engineering, CityUniversity of Hong Kong, Kowloon Toon, Hong Kong SAR.

K.-T. Lo is with the Department of Electronic and Information Engineering,The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong KongSAR.

Color versions of one or more of the figures in this paper are available onlineat http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TCSI.2008.916540

[3], [10], [11]. As a comparison, in digital encryption schemes,one can employ many cryptographically strong ciphers, such asDES [12] and AES [13], to achieve a higher level of security.Besides, to achieve a higher efficiency of encryption and tomeet some special demands of multimedia encryption (suchas format-compliance [14] and perceptual encryption [15]),many specific multimedia encryption schemes have also beendeveloped [4]–[6]. Recent cryptanalysis work [16]–[30] hasshown that some multimedia encryption schemes are insecureagainst cryptographical attacks.

Recently, Lin et al. suggested employing blind source sep-aration (BSS) for the purpose of image and speech encryption[31]–[37]. The basic idea is to mix multiple plaintexts (or mul-tiple segments of the same plaintext) with a number of secret keysignals, in the hope that an attacker has to solve a hard mathe-matical problem—the underdetermined BSS problem. In [37,Sec. VII], Lin et al. claimed that this BSS-based cipher “is im-mune from the attacks such as the ciphertext-only attack, theknown-plaintext, and the chosen-plaintext attack as long asthe intractability of the underdetermined BSS problem is guar-anteed by the mixing matrix for encryption”.

This paper reevaluates the security of the BSS-based encryp-tion scheme and points out that it is actually insecure againstknown-/chosen-plaintext and chosen-ciphertext attacks. In ad-dition, some other security defects are found under the sce-narios of ciphertext-only attack, including its low sensitivity tothe mixing matrix (part of the secret key) and to the plaintext,and a differential attack can be recommended, which works wellwhen the matrix size is small. Based on the cryptanalytic find-ings, we further discuss the role of BSS in this approach towardscryptographically secure ciphers.

The remainder of this paper is organized as follows. InSection II, a brief introduction is given to the BSS-basedencryption scheme. Sections III is the main body of this paper,in which a detailed cryptanalysis of the BSS-based encryptionscheme is presented. Then, the role of BSS in cryptography isdiscussed in Section IV. Finally, the Section V concludes thepaper.

II. BSS-BASED ENCRYPTION

BSS is a technique that tries to recover a set of unobservedsources or signals from some observed mixtures [38]. Givenunobserved signals and a mixing matrix of size

, the BSS problem is to recover fromobserved signals , where

(1)

1549-8328/$25.00 © 2008 IEEE

1056 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS—I: REGULAR PAPERS, VOL. 55, NO. 4, MAY 2008

When , BSS is possible if satisfies some conditions.However, when , this is generally impossible (what-ever is), thus leading to the difficult underdetermined BSSproblem.

In [31]–[37], Lin et al. introduced a number of secret key sig-nals to make the determination of the plaintext signals becomean underdetermined BSS problem in the case that the key sig-nals are unknown. Given input plain-signalsand key signals , the encryption procedure isdescribed as follows1:

(2)

where denotes cipher-signals,, and is a

mixing matrix whose elements are within [ ,1]. Write, where is a matrix and is a

matrix. Then, the encryption procedure can be represented inan equivalent form as

(3)

where and. Thus, as long as is an invert-

ible matrix, one can decrypt as follows2:

(4)

Different values of were used in Lin et al.’s papers:in [31] and in [32]–[37]. When , Lin et al.further set and , where for imageencryption and for speech encryption. In this case, theencryption procedure becomes

(5)

and the decryption procedure becomes

(6)

Observing (3), one can see that the encryption procedure con-tains two steps:

Step 1) .Step 2) .The first step corresponds to a simple matrix-based block ci-

pher, and the second step corresponds to a simple addition-basedstream cipher. From a different point of view, the two steps areexchanged as follows:

Step 1) .Step 2) .In any case, the BSS-based encryption scheme is always a

product cipher composed by a simple block cipher and a simple

1To provide a clearer description of the BSS-based encryption scheme, in thispaper we use some notations that are different from those in Lin et al.’s originalpapers. For example, in [37], the �th key signal is denoted by � ���, while inthis paper we use � ��� to emphasize the fact that it is a key signal.

2In Lin et al.’s papers, it is said that the decryption procedure was achievedvia BSS. However, from the cryptographical point of view, it is more convenientto denote the decryption procedure by (4).

stream cipher. In Section III, we will show that the two subci-phers can be separately broken by known-/chosen-plaintext at-tack and chosen-ciphertext attack.

In Lin et al.’s BSS-based encryption scheme, the key signalsare as long as the plain-signals and have to be

generated by a pseudorandom number generator (PRNG) witha secret seed , which serves as the secret key. According tothe principle of BSS, the decryption process actually does notrequire any knowledge about the mixing matrix to separate(i.e., decrypt) the encrypted plain-signals (i.e., the plaintexts).Thus, it seems that Lin et al. do not consider as part of thesecret key. However, we believe that should be kept secret aspart of the key, due to the following considerations.

• From a cryptographer’s point of view, except for the se-cret key, all details about a cryptosystem are known to at-tackers. So, if is not part of the secret key, it should beconsidered known to attackers. Unfortunately, in this case,the product cipher will be degraded to be a simple streamcipher. Considering as the equivalentcipher-signal, the encryption procedure becomes

(7)

From the above equation, one can see that the encryptionscheme is actually independent of the underdeterminedBSS problem.

• As the theoretical basis of the BSS technique, it is assumedthat the input signals are mutually independent of eachother. Though Lin et al. have shown that the BSS tech-niques can work well for some images and speeches, it re-mains doubtful if the BSS technique can still work well todecrypt closely related input signals (such as an image andits watermarked version or consecutive frames in a movie).Thus, the knowledge about is required to ensure that anyplaintexts can be exactly decrypted in any case. This is thereason why we represent the decryption procedure as (4)without employing any BSS algorithm.

• As will be shown later in Section III-A4, the key signalscan be totally circumvented in a ciphertext-only differen-tial attack, so the mixing matrix must be kept secret asa second defence to maintain the security.

Thus, in this paper, we assume that the secret key consists ofboth and . Note also that adding into the secret key willdefinitely lead to a stronger (at least equivalently strong) cryp-tosystem compared with the one with a single key . By crypt-analyzing the stronger cryptosystem, the original one is alsocryptanalyzed.

In [31]–[35], the BSS-based encryption scheme was mainlydesigned to encrypt images simultaneously, where isthe th pixel in the th image. In [36] and [37], the encryptionscheme was suggested to encrypt a single speech, each frame ofwhich is divided into segments, and is the th sample inthe th segment. This encryption scheme can also be applied to asingle image, by dividing it into blocks of the same size. To fa-cilitate the following discussion, we assume that the encryptionscheme is used to encrypt a single plaintext with segments ofequal sizes.

LI et al.: CRYPTANALYZING AN ENCRYPTION SCHEME BASED ON BLIND SOURCE SEPARATION 1057

In [37, Sec. VII], it was claimed that the BSS-based encryp-tion scheme is secure against most modern cryptographical at-tacks, including the ciphertext-only attack, known-plaintext at-tack, and chosen-plaintext attack. In Section III, we will showthat this claim is questionable.

III. CRYPTANALYSIS

Before introducing the cryptanalytic results, let us see howlarge the key space is. In [31]–[37], each element of is re-stricted within the interval [ 1,1]. Thus, assuming that each el-ement in has possible values,3 the number of all possiblemixing matrix is . Furthermore, assuming that thebit size of is , the size of the whole key space is .When and , the size of the whole keyspace is . Later, we will show that the real size of the keyspace is much smaller than this estimation, due to some essentialsecurity defects of the BSS-based encryption scheme. We willalso explain why this encryption scheme is not secure againstknown-/chosen-plaintext attack and chosen-ciphertext attack.

A. Ciphertext-Only Attack

1) Divide-and-Conquer (DAC) Attack: Rewrite (4) in the fol-lowing form:

(8)

where and

From the above equation, to recover , one only needs toknow and the th row of . In other words, when theBSS-based encryption scheme is used to encrypt indepen-dent plaintexts, the th plaintext can be exactly recovered withthe knowledge of and the th row of . A similar result can beobtained when segments of one single plaintext is encryptedwith the encryption scheme. This fact means that rows of

can be separately broken with a divide-and-conquer (DAC)attack. As a result, the size of the key space is reduced to be

. When and , it becomes.

2) Low Sensitivity to : From the cryptographical point ofview, given two distinct keys, even if their difference is the min-imal value under the current finite precision, the encryption anddecryption results of a good cryptosystem should still be com-pletely different. In other words, this cryptosystem should havea very high sensitivity to the secret key [12]. Unfortunately,the BSS-based encryption scheme does not satisfy this secu-rity principle, because the involved matrix computation is notsufficiently sensitive to matrix mismatch. Given two matrices

and of size , if the maximaldifference of all elements is , then one can easily deduce that

3The value of � is determined by the finite precision under which the cryp-tosystem is realized. For example, if the cryptosystem is implemented with�-bitfixed-point arithmetic, then � � � ; if it is implemented with IEEE floating-point arithmetic, then � � � (single-precision) or � � � (double-preci-sion) [39], where the sign bit of the floating-point number is always negative.

, which is the th element of , sat-isfies the following inequality:

where denotes the vector composed of absolutes values ofall elements in , i.e.,(the same expression will also be used for other vectors/ma-trices afterwards without further explanation). As a result, thematrix can be approximately guessed under a relatively largefinite precision , still maintaining an acceptable quality of therecovered plaintexts. Since under the finite precision one onlyneeds to guess values of each element in , the sizeof the key space is significantly reduced from to

, where .4 Whenand , the size of the space is reduced from

to .The above low sensitivity can be easily verified with experi-

ments described as follows.Step 1) For a randomly-generated key , calculate the

ciphertext corresponding to a plaintext .Step 2) With another mismatched key , decrypt

to get —which is an estimated version of, where and is a

random ( )-matrix.For each value of , the second step was repeated 100 times

to get a mean value of the recovery error [measured in meanabsolute error (MAE)].5 Then, one can observe the relationshipbetween the recovery error and the value of . Fig. 1 shows theexperimental results when the plaintexts are a digital image anda speech file, respectively.

The experimental results confirm that a mismatched key canapproximately recover the plaintext. Considering that humanshave a good capability of correcting errors in viewing imagesand listening to speech, even relatively large errors may not beable to prevent a human attacker from recognizing the plainimage or plain speech. Thus, the value of may be relativelylarge. When , , and , we give twoexamples of such recognizable plaintexts with relatively largeerrors in Figs. 2 and 3.

From the above experimental results, we can exhaustivelysearch for an approximate version of under the finite precision

. Such an approximate version of is then usedto roughly reveal the plaintext. Since the searching complexityis , such an exhaustive search is feasible when

is not very large.6 When and , we

4In this paper, ��� denotes the ceiling function of �, i.e., the minimal integerthat is not less than �.

5When the plaintext is a digital image with 256 gray scales, we first calibrateeach subimage into the range ��� � � � � ���� and then calculate the recovery errorof the whole image.

6In [31]–[37], small values are used in all examples: � � � or 4 and� � � .

1058 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS—I: REGULAR PAPERS, VOL. 55, NO. 4, MAY 2008

Fig. 1. Experimental relationship between the recovery error and the value of�. (a) Plaintext is a digital image “Lenna” [see Fig. 3(a)]. (b) Plaintext is a speechfile “one.wav” that corresponds to the pronunciation of the English word “one”(from the Merriam-Webster Online Dictionary, http://www.m-w.com).

carried out a large number of experiments using the followingsteps.

Step 1) For a randomly generated key , calculate theciphertext corresponding to a plaintext .

Step 2) Randomly generate a matrix (each element inthe interval [ 1,1]) and then decrypt with theguessed key to get .

Step 3) Repeat step 2) for rounds and output the recov-ered plaintext , every segment of which corre-sponds to the best recovery performance in all of the

rounds.Step 4) For the th segment of , find the corresponding

matrix and extract its th row of its inverseto form the th row of , which is the inverse ofan estimation of the original matrix .

Assuming that the target finite precision is , the interval[ 1,1] is divided into subintervals. Without loss of

Fig. 2. Example of human capability of recognition against large noises inspeech. (a) Original plain-speech “one.wav.” (b) Recovered speech (MAE �

��������). For the reader’s verification, the recovered speech is posted onlineat http://www.hooklee.com/Papers/Data/BSSE/one_MAE=0.164103.wav.

Fig. 3. Example of human capability of recognition against large noisesin images. (a) Original plain-image “Lenna.” (b) Recovered image(MAE � �������).

generality, assume that is an integer. Then, each subintervalis of equal size. Thus, if the element in the random matrix hasa uniform distribution over [ 1,1], the probability that

occurs at least one time in rounds of experiment is, where and are the th

elements of and , respectively. One can easily deduce thatis an increasing function with respect to and

which results in that when . In otherwords, with experiments, it is a high-probability eventthat at least one is “equal” to under the finite precision. To get an approximate estimation of the th row of , one can

see that rounds of experiment are needed.Apparently, the above steps actually simulate the process of

a real ciphertext-only attack that tries to reveal the plaintextand to exhaustively guess (under the assumption that isknown). Note that MAE cannot be calculated to evaluate the re-covery performance in a real attack in which one does not knowthe plaintext. Fortunately, by exploiting the large informationredundancy existing in natural images and speech signals, onecan resort to use some other measures to reflect the recoveryperformance of each segment of . In our experiments, we

LI et al.: CRYPTANALYZING AN ENCRYPTION SCHEME BASED ON BLIND SOURCE SEPARATION 1059

Fig. 4. Recovered speech in one 50 000-round experiment of exhaustivelyguessing � when � � � and � � ��� ���. (a) Original plain-speech“one.wav.” (b) Recovered speech (MANE of each segment: 0.0469, 0.0521).For the reader’s verification, the recovered speech is posted online at http://www.hooklee.com/Papers/Data/BSSE/one_MANE=0.0469–0.0521.wav.

Fig. 5. Two recovered plain images in experiments of exhaustively guessing� when � � � and � � ��� ���. (a) � � ����� (MANE of each seg-ment: 39.7491, 14.9373). (b) � � ����� (MANE of each segment: 16.3888,15.1722).

use a measure called mean absolute neighboring error (MANE),which is defined as follows for the th segment of :

(9)

where denotes the segment length. In Figs. 4 and 5, one recov-ered plain-speech signals and two recovered plain images areshown for demonstration. One can see that (or

) is sufficient to get a good estimation of the plaintext.Note that, for 2-D images, the above 1-D MANE may be gen-

eralized to include more neighboring pixels, thus achieving amore accurate description of the recovery performance. In ad-dition, multiple quality factors can be employed to further en-hance the evaluation of the recovery performance.

3) Low Sensitivity to : Due to the same reason as the lowsensitivity to , one can deduce that the BSS-based encryptionscheme is also insensitive to the key signal . Given two keysignals and , if the maximal difference of all ele-ments is , then each element of is notgreater than . Since itself is not partof the secret key, but generated from , this problem does nothave much negative influence on the security of the whole cryp-tosystem against ciphertext-only attacks.

Fig. 6. Differences of two plain-speech files. (a) First speech “one.wav.”(b) Second speech “two.wav.” (c) Difference speech signal “one”-“two.”(d) Difference speech signal “two”-“one.” For the reader’s verifica-tion, the two difference speech files are posted online at http://www.hooklee.com/Papers/Data/BSSE/one-two.wav and http://www.hooklee.com/Papers/Data/BSSE/two-one.wav.

4) Differential Attack: Given two plaintexts and, if they are encrypted with the same key , one

can get the following formula from (3):

(10)

where and .Note that disappears from the above equation. Thismeans that, from the differential viewpoint, only is the se-cret key, i.e., is insignificant in the key. Considering the lowsensitivity of the encryption scheme to , under finite precision, the key space becomes , and so one might exhaus-

tively search to recover the plaintext difference as follows:

(11)

From the obtained plaintext difference, one can get a mixed viewof the two interested plaintexts, in which both plaintexts may becompletely recognizable by humans. Figs. 6 and 7 show fourplaintext differences of two speech files and two images.

Denoting the guessed matrix by , one has

(12)

Apparently, if , the obtained plaintext differencewill have an inter-segment mixture, which may make the

recognition of the two plaintexts more difficult. Fortunately,when is relatively small, such an inter-segment mixture maynot be too severe to prevent the recognition of the two plaintextsby humans. More importantly, our experiments showed thathumans are able to recognize the two plaintexts even when the

1060 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS—I: REGULAR PAPERS, VOL. 55, NO. 4, MAY 2008

Fig. 7. Differences of two plain-images “Lenna” and “cameraman.”(a) Lenna–cameraman. (b) cameraman–Lenna.

Fig. 8. One obtained plain difference-images with a badly mismatched keywhen � � �.

Fig. 9. One obtained plain difference-speech when � and �� have arelatively large mismatch. For the reader’s verification, this difference-speechis posted online at http://www.hooklee.com/Papers/Data/BSSE/two-one-large-mismatch.wav.

mismatch between and is not very small. When ,then

(13)and a plain difference-image obtained in our experiments isshown in Fig. 8. One can see that both plain-images “Lenna”and “cameraman” can still be roughly recognized from such aheavily mixed difference-image. Another obtained plain differ-ence-speech for “one-way” and “two-way” is shown in Fig. 9,from which the two English words (“one” and “two”) are alsodistinguishable.

To further show the real performance of the differential attackwith a badly mismatched key, we have also carried out some ex-

Fig. 10. One obtained plain difference-image with a badly mismatched keywhen � � �.

Fig. 11. A visually optimal result obtained in 100 plain difference-images.(a) �Optimal difference-image. (b) Negative of the optimal difference-image.

periments on the plain-images “Lenna” and “cameraman” whenand

One result is shown in Fig. 10.In this differential attack, quality evaluation factors (such as

MANE used in Section III-A2) are not suitable to automaticallydetermine the best result in many plain difference-signals, be-cause each segment of an obtained difference-signal is also anatural signal with abundant information redundancy. Instead,one has to output all obtained difference-signals and check themwith naked eyes or ears to find a perceptually optimal result withthe least inter-segment mixture. Fig. 11 shows such a result in100 plain difference-images when and follows (13).By checking each segment separately and combining the op-timal segments together, one can further get a better result withless inter-segment mixture.

While this differential attack works well for as shownabove, it will become infeasible when is sufficiently large,due to the following reasons: 1) the inter-segment mixture is

LI et al.: CRYPTANALYZING AN ENCRYPTION SCHEME BASED ON BLIND SOURCE SEPARATION 1061

too severe and 2) the complexity of checking alldifference-signals is beyond human capability.

B. Low Sensitivity to Plaintext

Another cryptographical property required by a good cryp-tosystem is that the encryption should be very sensitive to plain-text, i.e., the ciphertexts of two plaintexts with a slight differenceshould be very different [12]. However, this property does nothold for the BSS-based encryption scheme. Given two key sig-nals and , if the maximal difference of all elementsis , then each element of is not greaterthan . When the same secret key is usedto encrypt two closely correlated plaintexts, such as a plaintextand its watermarked version, this security defect means that theexposure of one plaintext leads to the revealment of both.

C. Known-Plaintext Attack

In this kind of attack, one can access to a number of plaintextsthat are encrypted with the same key. From (10), with plain-text differences, one immediately knows that the mixing matrixcan be uniquely determined as follows:

(14)

where and are matrices, constructed rowby row from the plaintext differences and the correspondingciphertext differences, respectively. Then, can be fur-ther solved from any plaintext and its ciphertext as

(15)

Now, can be used to recover other plaintexts en-crypted by the same key . Note that has a finitelength determined by the maximal length of all known plain-texts, so can only recover plaintexts under thisfinite length.

When , the key signals can also be determinedas

(16)

If the PRNG used is not cryptographically strong (such as LFSR[12]), it may be possible to further derive the secret seed , thuscompletely breaking the BSS-based encryption scheme.

Note that distinct plaintexts can generateplaintext differences. Solving the inequality ,one can get the number of required plaintexts to yield at leastplaintext differences as

(17)

D. Chosen-Plaintext/Ciphertext Attack

In chosen-plaintext attack, one can freely choose a number ofplaintexts and observe the corresponding ciphertexts, while in

chosen-ciphertext attack one can freely choose a number of ci-phertexts and observe the corresponding plaintexts. So, in theseattacks, one can choose plaintext differences easily, whichmeans that the above differential known-plaintext attack stillworks fine in the same way.

IV. DISCUSSION

As we pointed out in the last section, the BSS-based encryp-tion scheme is always insecure against known-/chosen-plaintextattack. Thus, the secret key cannot be repeatedly used in anycase. This means that the encryption scheme has to work likea common stream cipher by changing the secret key for eachdistinct plaintext. However, in this case, (equivalently, thesecret seed ) is enough to provide a high level of security, since

satisfies the cryptographical properties in a perfectly secureone-time-a-pad cipher (see [37, Sec. V-B]). This means that theBSS process becomes excessive.

Even when one wants to add a second function againstpotential attacks by applying the BSS mixing, the low sensi-tivity of encryption/decryption to the mixing matrix (recallSection III-A2) makes this goal less useful. As a result, in thecurrent encryption design, the BSS model does not play a keyrole in the security of the scheme. The real core of the encryp-tion scheme actually is the embedded PRNG that generates thekey signals for masking the plaintexts.

If one wants to use the BSS-based encryption scheme witha repeatedly used key, some essential modifications have to bemade to enhance the security against various attacks. Followingthe cryptanalysis given in the last section, we suggest adoptingtwo coutermeasures simultaneously: 1) use a sufficiently large

and 2) like the design of most modern block ciphers [12], it-erate the BSS-based encryption for many rounds to improve itssensitivity to the secret key and to the plaintext. It is obviousthat both countermeasures will significantly influence the en-cryption/decryption speed of the encryption scheme. It thereforeseems doubtful if such an enhanced encryption scheme will haveany advantages comparing with other multiple-round block ci-phers, especially AES [13] that can be optimized to run at a veryhigh rate on PCs [40].

Finally, it is worth mentioning that the original BSS-basedencryption scheme can be used to realize lossy decryption, aninteresting feature that may be useful in some real applications.7

This feature means that an encryption scheme can still (perhapsroughly) recover the plaintext even when there are some errorsin the ciphertexts. A typical use of this feature is that the cipher-text can be compressed with some lossy algorithms to save therequired storage in local computers or the channel bandwidth fortransmission. For the BSS-based encryption scheme, the lossydecryption feature is ensured by low sensitivity of decryptionto ciphertext, which is due to the same reason of the low sensi-tivity of encryption to plaintext (recall Section III-B). However,one should keep in mind that the lossy decryption feature is in-duced by the low sensitivity to plaintext/ciphertext, so there is atradeoff between this feature and the security.

7Another possible scheme is a matrix-based image scrambling system pro-posed in [41], as pointed out in [30].

1062 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS—I: REGULAR PAPERS, VOL. 55, NO. 4, MAY 2008

V. CONCLUSION

This paper has analyzed the security of an image/speech en-cryption scheme based on a BSS mixing technique proposedin [31]–[37]. It has been shown that this BSS-based encryp-tion scheme suffers from several security defects, including itsvulnerability to a ciphertext-only differential attack, known-/chosen-plaintext attack, and chosen-ciphertext attack. It remainsto see how the BSS-based technique can be further improved forconstructing cryptographically strong ciphers.

REFERENCES

[1] H. J. Beker and F. C. Piper, Secure Speech Communications. London,U.K.: Academic, 1985.

[2] I. J. Kumar, “Cryptology of speech signal,” in Cryptology: System Iden-tification and Key-Clustering. Laguna Hills, CA: Aegean Park, 1997,ch. 6.

[3] R. K. Nichols and P. C. Lekkas, “Speech cryptology,” in Wireless Secu-rity: Models, Threats, and Solutions. New York: McGraw-Hill, 2002,ch. 6, pp. 253–327.

[4] B. Furht, D. Socek, and A. M. Eskicioglu, “Fundamentals of multi-media encryption techniques,” in Multimedia Security Handbook, B.Furht and D. Kirovski, Eds. Boca Raton, FL: CRC, 2004, ch. 3, pp.93–132.

[5] S. Li, G. Chen, and X. Zheng, “Chaos-based encryption for digital im-ages and videos,” in Multimedia Security Handbook, B. Furht and D.Kirovski, Eds. Boca Raton, FL: CRC, 2004, ch. 4, pp. 133–167.

[6] A. Uhl and A. Pommer, Image and Video Encryption: From DigitalRights Management to Secured Personal Communication. Boston,MA: Springer Science/Business Media, 2005.

[7] B. Furht, E. Muharemagic, and D. Socek, Multimedia Encryption andWatermarking. Berlin, Germany: Springer-Verlag, 2005.

[8] W. Zeng, H. Yu, and C.-Y. Lin, Eds., Multimedia Security Technologiesfor Digital Rights Management. New York: Academic, 2006.

[9] B. Javidi, Optical and Digital Techniques for Information Security.New York: Springer Science/Business Media, 2005.

[10] M. G. Kuhn, Analysis for the Nagravision Video Scrambling Method1998 [Online]. Available: http://www.cl.cam.ac.uk/~mgk25

[11] S. Li, C. Li, G. Chen, N. G. Bourbakis, and K.-T. Lo, “A generalcryptanalysis of permutation-only multimedia encryption algorithms,”Signal Process. [Online]. Available: http://eprint.iacr.org/2004/374, tobe published

[12] B. Schneier, Applied Cryptography – Protocols, Algorithms, and SouceCode in C, 2nd ed. New York: Wiley, 1996.

[13] Specification for the Advanced Encryption Standard (AES) Nat. Inst.Standards Technol., 2001, Federal Information Processing StandardsPublication 197 (FIPS PUB 197).

[14] J. Wen, M. Severa, W. Zeng, M. H. Luttrell, and W. Jin, “Aformat-compliant configurable encryption framework for access con-trol of video,” IEEE Trans. Circuits Syst. Video Technol., vol. 12, no.6, pp. 545–557, Jun. 2002.

[15] S. Li, G. Chen, A. Cheung, B. Bhargava, and K.-T. Lo, “On the designof perceptual MPEG-video encryption algorithms,” IEEE Trans. Cir-cuits Syst. Video Technol., vol. 17, no. 2, pp. 214–223, Feb. 2007.

[16] M. Bertilsson, E. F. Brickell, and I. Ingemarson, “Cryptanalysis ofvideo encryption based on space-filling curves,” in Proc. Advances inCryptol.–EUROCRYPT’89, Berlin, Germany, 1989, vol. 434, LectureNotes in Computer Science, pp. 403–411.

[17] J.-K. Jan and Y.-M. Tseng, “On the security of image encryptionmethod,” Inf. Process. Lett., vol. 60, no. 5, pp. 261–265, 1996.

[18] L. Qiao, K. Nahrstedt, and M.-C. Tam, “Is MPEG encryption by usingrandom list instead of ZigZag order secure?,” in Proc. IEEE Int. Symp.Consum. Electron., 1997, pp. 226–229.

[19] T. Uehara and R. Safavi-Naini, “Chosen DCT coefficients attack onMPEG encryption schemes,” in Proc. IEEE Pacific-Rim Conf. Multi-media (IEEE-PCM’2000), 2000, pp. 316–319.

[20] C.-C. Chang and T.-X. Yu, “Cryptanalysis of an encryption scheme forbinary images,” Pattern Recognit. Lett., vol. 23, no. 14, pp. 1847–1852,2002.

[21] A. M. Youssef and S. E. Tavares, “Comments on the security of fastencryption algorithm for multimedia (FEA-M),” IEEE Trans. Consum.Electron., vol. 49, no. 2, pp. 168–170, Feb. 2003.

[22] S. Li and K.-T. Lo, “Security problems with improper implementationsof improved FEA-M,” J. Syst. Software, vol. 80, no. 5, pp. 791–794,2007.

[23] S. Li and X. Zheng, “Cryptanalysis of a chaotic image encryptionmethod,” in Proc. IEEE Int. Symp. Circuits Syst., 2002, vol. II, pp.708–711.

[24] S. Li and X. Zheng, “On the security of an image encryption method,”in Proc. IEEE Int. Conf. Image Process., 2002, vol. 2, pp. 925–928.

[25] C. Li, S. Li, D. Zhang, and G. Chen, “Cryptanalysis of a chaotic neuralnetwork based multimedia encryption scheme,” in Proc. PCM, 2004,vol. 3333, Lecture Notes in Computer Science, pp. 418–425.

[26] C. Li, S. Li, G. Chen, G. Chen, and L. Hu, “Cryptanalysis of a newsignal security system for multimedia data transmission,” EURASIP J.Appl. Signal Process., vol. 2005, no. 8, pp. 1277–1288, 2005.

[27] C. Li, X. Li, . S. Li, and G. Chen, “Cryptanalysis of a multistage en-cryption system,” in Proc. IEEE Int. Symp. Circuits Syst., 2005, pp.880–883.

[28] C. Li, S. Li, D.-C. Lou, and D. Zhang, “On the security of the Yen-Guo’s domino signal encryption algorithm (DSEA),” J. Syst.. Software,vol. 79, no. 2, pp. 253–258, 2006.

[29] S. Li, C. Li, G. Chen, and K.-T. Lo, “Cryptanalysis of RCES/RSESimage encryption scheme,” J. Syst Software, to be published.

[30] S. Li, C. Li, K.-T. Lo, and G. Chen, “Cryptanalysis of an image scram-bling scheme without bandwidth expansion,” IEEE Trans. CircuitsSyst. Video Technol., vol. 18, no. 3, pp. 338–349, Mar. 2008.

[31] Q.-H. Lin and F.-L. Yin, “Blind source separation applied to imagecryptosystems with dual encryption,” Electron. Lett., vol. 38, no. 19,pp. 1092–1094, Sep. 2002.

[32] Q. Lin and F. Yin, “Image cryptosystems based on blind source sepa-ration,” in Proc. Int. Conf. Neural Netw. Signal Process., 2003, vol. 2,pp. 1366–1369.

[33] Q.-H. Lin, F.-L. Yin, and Y.-R. Zheng, “Secure image communica-tion using blind source separation,” in Proc. IEEE 6th Circuits Syst.Symp. Emerging Technologies: Frontiers Mobile Wireless Commun.,2004, vol. 1, pp. 261–264.

[34] Q. Lin, F. Yin, and H. Liang, “Blind source separation-based encryp-tion of images and speeches,” in Proc. Advances Neural Netw., Part II,2005, vol. 3497, Lecture Notes in Computer Science, pp. 544–549.

[35] Q.-H. Lin, F.-L. Yin, and H.-L. Liang, “A fast decryption algorithm forBSS-based image encryption,” in Proc. Advances Neural Netw., PartIII, 2006, vol. 3973, Lecture Notes in Computer Science, pp. 318–325.

[36] Q.-H. Lin, F.-L. Yin, T.-M. Mei, and H. Liang, “A speech encryp-tion algorithm based on blind source separation,” in Proc. Int. Conf.Commun., Circuits Syst., 2004, vol. 2, pp. 1013–1017.

[37] Q.-H. Lin, F.-L. Yin, T.-M. Mei, and H. Liang, “A blind source sepa-ration based method for speech encryption,” IEEE Trans. Circuits Syst.I, vol. 53, no. 6, pp. 1320–1328, Jun. 2006.

[38] J.-F. Cardoso, “Blind signal separation: Statistical principles,” Proc.IEEE, vol. 86, no. 10, pp. 2009–2025, Oct. 1998.

[39] IEEE Standard for Binary Floating-Point Arithmetic, ANSI/IEEE Std.754-1985, IEEE Comput. Soc., 1985.

[40] B. Gladman, AES and Combined Encryption/Authentication Modes2006 [Online]. Available: http://fp.gladman.plus.com/AES/index.htm

[41] D. V. D. Ville, W. Philips, R. V. de Walle, and I. Lemanhieu, “Imagescrambling without bandwidth expansion,” IEEE Trans. Circuits Syst.Video Technol., vol. 14, no. 6, pp. 892–897, Jun. 2004.

Shujun Li received the B.S. degree in informationscience and engineering and the Ph.D. degree ininformation and communication engineering fromXi’an Jiaotong University, Xi’an, China, in 1997 and2003, respectively.

After receiving the Ph.D. degree, he performedpostdoctoral research at the City University ofHong Kong, Kowloon Toon, Hong Kong SAR, fromSeptember 2003 to January 2005 and at The HongKong Polytechnic University, Hung Hom, Kowloon,Hong Kong SAR, from June 2005 to January 2007.

Currently, he is a Humboldt Research Fellow with the FernUniversität inHagen, Hagen, Germany. His current research interests include multimediasecurity (mainly image and video encryption), chaotic cryptography, and securehuman–computer identification.

LI et al.: CRYPTANALYZING AN ENCRYPTION SCHEME BASED ON BLIND SOURCE SEPARATION 1063

Chengqing Li was born in Xiangxiang, Hunan,China. He received the B.Sc. degree in pure math-ematics from Xiangtan University, Hunan, in 2002and the M.Sc. degree in applied mathematics fromZhejiang University, Hangzhou, China, in 2005.He is currently working toward the Ph.D. degree inelectronic engineering at City University of HongKong, Kowloon Toon, Hong Kong SAR.

From November 2005 to February 2006, he wasa Research Assistant with the Centre for Chaos Con-trol and Complex Networks, City University of Hong

Kong. He has published more than ten scientific papers (all in English and onthe subject of cryptanalysis of chaotic multimedia encryption schemes).

Kwok-Tung Lo (M’92) was born in Hong Kong.He received the M.Phil. and Ph.D. degrees in elec-tronic engineering from the Chinese University ofHong Kong, Hong Kong SAR, in 1989 and 1992,respectively.

Since 1992, he has been with the Hong KongPolytechnic University, Hung Hom, Kowloon, HongKong SAR, where he is now an Associate Professorwith the Department of Electronic and InformationEngineering. He is very active in research and haspublished over 130 papers in various international

journals and conference proceedings. He is one of the authors of the book Fun-damentals of Image Coding and Wavelet Compression: Principles, Algorithmsand Standards (Tsinghua University Press, 2004). He is currently a memberof the Editorial Board of Multimedia Tools and Applications and an AssociateEditor of the HKIE Transactions. His current research interests include multi-media signal processing, digital watermarking, multimedia communications,and Internet applications.

Guanrong Chen (M’89–SM’92–F’97) received theM.Sc. degree in computer science from ZhongshanUniversity, Guangzhou, China, and the Ph.D. degreein applied mathematics from Texas A&M University,College Station.

Currently, he is a Chair Professor and the foundingdirector of the Centre for Chaos and Complex Net-works, City University of Hong Kong, KowloonToon, Hong Kong SAR.

Prof. Chen has served and is serving as editorfor eight international journals, including the IEEE

Circuits and Systems Magazine, IEEE TRANSACTIONS ON CIRCUITS AND

SYSTEMS, the IEEE TRANSACTIONS ON AUTOMATIC CONTROL, and the Inter-national Journal of Bifurcation and Chaos and has received four Best JournalPaper Awards.