
Journal of Systems Engineering and Electronics

Vol. 18, No. 4, 2007, pp.795–800

High performance scalable image coding

Gan Tao, He Yanmin & Zhu Weile

School of Electronic Engineering, Univ. of Electronic Science and Technology of China, Chengdu 610054, P. R. China

(Received October 19, 2006)

Abstract: A high performance scalable image coding algorithm is proposed. The salient features of this algorithm

are the ways to form and locate the significant clusters. Thanks to the list structure, the new coding algorithm

achieves fine fractional bit-plane coding with negligible additional complexity. Experiments show that it performs

comparably or better than the state-of-the-art coders. Furthermore, the flexible codec supports both quality and

resolution scalability, which is very attractive in many network applications.

Keywords: scalable image coding, significant clusters, list structure.

1. Introduction

With the development of the Internet and networking

technology, there is a trend of growing heterogeneity

in digital image applications. Interesting examples in-

clude image database previewing and progressive im-

age transmission where the constraints on bit rate or

display resolution cannot be anticipated at the time

of compression. The challenge is how a single flex-

ible bitstream can be provided to accommodate the

users’ different needs, according to their bandwidth

and computing capabilities. Scalable image coding

thus emerges as a promising technology to answer this

challenge. There are generally two types of scalabil-

ity in image coding: quality scalability and resolution

scalability. Quality scalability is a feature of the en-

coded bitstream that allows decoders to decode the

image in the same spatial resolution, but with dif-

ferent fidelity. A fully quality scalable bitstream, also

known as an embedded bitstream, can be truncated at

any point to achieve the best possible reconstruction

for the number of bits received. Resolution scalabil-

ity, on the other hand, is a useful functionality that

allows decoders to decode the image with different res-

olutions.

Over the past decade, great efforts have been made

to develop image compression algorithms, which have

features of good compression performance, modest

complexity, as well as high scalability. Wavelet-

based compression schemes have demonstrated the

promise to achieve the goal. A good example is the

JPEG2000 image compression standard. Its core algorithm, Taubman's embedded block coding with optimized truncation (EBCOT)[1], achieves high performance through fractional bit-plane coding combined with post-compression rate-distortion (PCRD) optimization. However, since all the coefficients in each bit-plane must be coded through several passes, the computational complexity is rather high. On the

other hand, to avoid pixel-by-pixel coding, codecs

based on morphological representation have been

proposed, such as, Servetto et al’s morphological

representation of wavelet data (MRWD) algorithm[2],

Chai et al’s significant-linked connected component

analysis (SLCCA) algorithm[3], and Lazzaroni et al.’s

embedded morphological dilation coding (EMDC)

algorithm[4]. However, to the authors' knowledge, none of the

morphocodecs so far in the literature have solved the

problem of high complexity caused by data ordering

and redundant dilation operations. None of them

have achieved two types of scalability in a single

framework.

2. Data modelling and classification

2.1 Representation of wavelet data

There exist two distinct approaches to get an efficient


classification and representation of wavelet coefficients

in the literature. Whereas Shapiro's embedded zerotree wavelet compression algorithm (EZW)[5] and

Said and Pearlman’s set partitioning in hierarchical

tree (SPIHT) algorithm[6] employ a regular zerotree

structure to approximate insignificant groups of coeffi-

cients across subbands, MRWD, SLCCA, and EMDC

represent irregular clusters of significant coefficients

within subbands. The well-known SPIHT algorithm

enjoys good performance with low complexity. Yet, it

has its own drawbacks. First, it fails to efficiently rep-

resent regions rich in texture. As many nonzero coefficients in the high-frequency subbands are not well aligned with the tree-structured grid, the zerotree structure becomes quite inefficient. Second, as

the bitstreams from different scales are interwoven,

SPIHT does not support resolution scalability. Efforts

have been made by Danyali and Mertins to add the

resolution scalability feature to SPIHT[7], but their

algorithm suffers from a loss of performance. Based

on these considerations, the authors have adopted the

clustering representation instead of zerotree in their

scheme. Being free of the constraint of the tree structure, the subbands can be coded independently, which makes the task of supporting scalability much easier.

2.2 Classification and ordering

The embedded coding raises the problem of ordering

information according to its importance. It is prefer-

able to code and transmit the most valuable infor-

mation as early as possible. This is the problem of

pixel sorting, originally considered by Li and Lei[8].

For each pixel to be coded, the rate-distortion (R-D)

slope is estimated. The optimal rate-distortion perfor-

mance can be achieved by selecting the pixel with the

maximum R-D slope at each coding stage. The high

complexity involved in the optimal solution is gener-

ally unacceptable in practice, thus fractional bit-plane

coding becomes a good candidate.
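The optimal-ordering principle can be sketched in a few lines (a toy illustration only; the coding units and numbers are invented, not taken from Ref. [8]):

```python
# Toy sketch of rate-distortion (R-D) slope ordering: at each stage,
# transmit the coding unit with the largest estimated distortion
# reduction per bit spent.  All names and values are illustrative.

def rd_slope(distortion_drop, bits):
    """R-D slope = distortion reduction per bit spent."""
    return distortion_drop / bits

def greedy_order(candidates):
    """candidates: list of (name, distortion_drop, bits).
    Returns the names sorted by decreasing R-D slope."""
    return [name for name, d, r in
            sorted(candidates, key=lambda c: rd_slope(c[1], c[2]),
                   reverse=True)]

# slopes: a -> 16/2 = 8.0, b -> 9/1 = 9.0, c -> 4/4 = 1.0
units = [("a", 16.0, 2.0), ("b", 9.0, 1.0), ("c", 4.0, 4.0)]
print(greedy_order(units))  # -> ['b', 'a', 'c']
```

Computing and sorting such slopes for every pixel is exactly the per-pixel cost that fractional bit-plane coding avoids by grouping pixels into a few ordered classes.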

Bit-plane coding is the natural choice for embed-

ded image compression. For EBCOT in JPEG2000,

the coding of each bit-plane is further decomposed into three passes according to the pixels' significance state. The unknown pixels are classified into two categories: those that have at least one immediate significant neighbor are coded in the first, significance propagation pass, whereas the others are coded in the final clean-up pass. Obviously, this classification is rather coarse, as pixels whose significant neighbors differ in value and number are not distinguished. To

get a better data classification, Peng et al. proposed

a pixel classification and sorting method[9]. In their

algorithm, up to eight significant coding passes and

one magnitude refinement pass are employed in each

bit-plane coding. Consequently, the performance is

improved at the price of higher complexity. In the

following, the authors present their classification and

ordering scheme, which offers a better trade-off between performance and complexity.

The unknown pixels to be coded in each bit-plane are categorized into the following types: (1) the ones that have a significant neighbor with a big value; (2) the ones that have a significant neighbor with a small value. For the pixels with no significant neighbor, the authors make a further classification according to the significance state of their parents: (3) the ones that have a significant parent with a big value; (4) the ones that have a significant parent with a small value.
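As a minimal sketch of the four-way test above (the predicate names are illustrative, not from the paper; "big" means significant since an earlier bit-plane, "small" means newly significant in the current one):

```python
# Illustrative sketch of the four pixel types defined above.
# All parameter names are hypothetical; "big" = became significant in
# an earlier bit-plane, "small" = became significant in the current one.

def classify_pixel(big_neighbor, small_neighbor, big_parent, small_parent):
    if big_neighbor:
        return 1      # significant neighbor with a big value
    if small_neighbor:
        return 2      # significant neighbor with a small value
    if big_parent:
        return 3      # no significant neighbor; significant parent, big value
    if small_parent:
        return 4      # no significant neighbor; significant parent, small value
    return None       # left to the set-partitioning sub-pass

print(classify_pixel(False, True, True, False))  # neighbor test wins -> 2
```

Note that the neighbor tests take priority over the parent tests, matching the order in which the sub-passes of Section 3.1 visit the pixels.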

Note that the classification, which is based on clus-

tering nature of wavelet coefficients, is finer than that

of EBCOT. It has been shown that in significant cod-

ing, the symbol with higher probability of significance

has a larger R-D slope[8]. The authors claim that the unknown pixels, sorted by their types (1)–(4), are ordered in decreasing probability of significance, and thus better performance is expected to be achieved.

This claim is substantiated by the experiments.

Furthermore, as seen later on, the implementation

of this classification can be done efficiently. The addi-

tional complexity is negligible.

3. Proposed morphological coding algorithm

3.1 Fractional bit-plane coding

In this algorithm, wavelet data of subbands are repre-

sented and coded as clusters. A cluster is a mass of significant pixels surrounded by insignificant ones. The pixels and sets are organized in the following lists.

LSPk: the list of all significant pixels, which form


the core of the clusters in the k-th spatial resolution

level.

LIPk: the list of all insignificant pixels, which form

the boundary of the clusters in the k-th spatial reso-

lution level.

LISk: the list of insignificant sets in the k-th spatial

resolution level with varying sizes, which represent the

remaining insignificant parts.

The elements in both LSPk and LIPk are further

classified into the new and the old ones, depending

on whether they are added to the list in the current

bit-plane pass or not.

Let P^n denote bit-plane pass n, P^{n,s}_k the s-th sub-bitplane pass for resolution level k, and N the maximum number of spatial resolution levels. On the basis of the list structures defined earlier, the authors have developed a fractional bit-plane coding with the following six sub-passes {P^{n,s}_k} to achieve excellent R-D performance.

Sub-pass 1 (P^{n,1}_k, k ∈ [1, N]): intraband cluster growing for each entry of LIPk, that is, morphological dilation on the boundary of the old clusters.

Sub-pass 2 (P^{n,2}_k, k ∈ [1, N]): intraband cluster growing for each new entry of LSPk, that is, morphological dilation on the new clusters.

Sub-pass 3 (P^{n,3}_k, k ∈ [1, N]): interband cluster expansion for each old entry of LSPk-1, that is, morphological dilation on the pixels whose parents belong to the old clusters.

Sub-pass 4 (P^{n,4}_k, k ∈ [1, N]): interband cluster expansion for each new entry of LSPk-1, that is, morphological dilation on the pixels whose parents belong to the new clusters.

Sub-pass 5 (P^{n,5}_k, k ∈ [1, N]): refinement pass for each old entry of LSPk.

Sub-pass 6 (P^{n,6}_k, k ∈ [1, N]): isolated cluster location by applying the set partitioning method to each entry of LISk.
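The coding units {P^{n,s}_k} of one bit-plane pass can be enumerated as in the sketch below. The sub-pass-major scan order shown here is only one possible choice made for illustration; the actual interleaving is a codec decision, and all names are hypothetical.

```python
# Illustrative enumeration of the coding units P^{n,s}_k of one
# bit-plane pass n.  The sub-pass-major order is an assumption made
# for this sketch, not a requirement of the algorithm.

SUBPASS_NAMES = {
    1: "grow old clusters from LIPk entries",
    2: "grow new clusters from new LSPk entries",
    3: "expand children of old LSPk-1 entries",
    4: "expand children of new LSPk-1 entries",
    5: "refine old LSPk entries",
    6: "locate isolated clusters in LISk",
}

def coding_units(n, num_levels):
    """All units (n, s, k) of bit-plane pass n, s = 1..6, k = 1..N."""
    return [(n, s, k)
            for s in range(1, 7)
            for k in range(1, num_levels + 1)]

units = coding_units(n=3, num_levels=2)
print(len(units))   # 6 sub-passes x 2 levels = 12 units
print(units[0])     # (3, 1, 1)
```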

It is easy to see that the proposed fractional bit-plane coding is simply an implementation of the aforementioned data classification scheme. Following the fact that the significance probability is primarily determined by only one significant neighbor[9], the authors do not distinguish the cases where the pixel has

one significant neighbor or more. Additionally, in-

stead of introducing a special comparison procedure

to identify the significant neighbors with different val-

ues, the neighbors that have been detected as signif-

icant in the previous bit-plane are taken as the ones

with big values, and the neighbors that have newly

become significant in the current bit-plane are taken

as the ones with small values. Thus, the task is simpli-

fied by keeping two lists, which contain the old and the

new entries during each bit-plane coding and merging

them at the end of the pass. It can be seen that with

the help of the list structure, this classification scheme

is implemented very efficiently.
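The old/new list trick described above can be sketched minimally as follows (the class and method names are hypothetical, chosen only to illustrate the bookkeeping):

```python
# Minimal sketch of the old/new list bookkeeping: entries added during
# the current bit-plane pass go into a "new" list, and the two lists
# are merged when the pass ends, so old vs. new significance is known
# without any magnitude comparisons.  Names are hypothetical.

class SigList:
    def __init__(self):
        self.old, self.new = [], []

    def add(self, pixel):
        """Record a pixel that became significant in the current pass."""
        self.new.append(pixel)

    def end_of_pass(self):
        """Merge: this pass's entries become 'old' for the next pass."""
        self.old.extend(self.new)
        self.new = []

lsp = SigList()
lsp.add((3, 4)); lsp.add((3, 5))   # pass n: two pixels become significant
lsp.end_of_pass()
lsp.add((2, 4))                    # pass n+1: one more pixel
print(lsp.old, lsp.new)            # [(3, 4), (3, 5)] [(2, 4)]
```

With this structure, "big value" simply means membership in the old list and "small value" membership in the new one, which is why no comparison procedure is needed.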

It is worth noting that the new approach to cluster formation is the distinguishing feature of the proposed algorithm. As far as is known, all the reported morphocodecs create new clusters at the beginning of each

bit-plane coding, by taking the significant pixels as the

seeds for new morphological dilation. Because of the

lack of a proper structure, they fail to take advantage

of the fact that the clusters tend to grow both in the

spatial and frequency domains when crossing succes-

sive bit-planes. Here, the problem is easily overcome

by starting the growing region from the boundary of

clusters that already exist, that is, from the pixels in

LIPk, similar to what is done in sub-pass 1. As all

the inner significant pixels of the clusters have been

coded in the previous bit-plane, the reconsideration is

unnecessary and wasteful.

3.2 Patterned morphological dilation

The idea of morphological representation was first in-

troduced by Servetto in his MRWD algorithm. The

clusters of significant coefficients are formed and coded

through iterative operations of morphological dilation.

More details can be found in Refs. [2-3]. In this algorithm, a 3 × 3 structuring element is used. Owing to the clustering nature of the wavelet data, recursive dilations working on adjacent pixels may cause highly redundant operations. For example, in an extreme case, the eight neighbors of the previous dilation's seed pixel are all tested again unnecessarily in the later process. To overcome this, a

new dilation operator, called patterned morphological

dilation, is introduced.

Considering one general case depicted in Fig. 1(a),

in which the eight neighbors of pixel 0 have been tested


(a) one general case of morphological dilation; (b)–(i) eight morphological dilation patterns dedicated to dilations on eight different positions. Gray and white circles denote seeds and corresponding dilations, respectively

Fig. 1 Patterned morphological dilations

during the process of morphological dilation. Suppose pixel 1 becomes significant; the dilation will then be applied to it recursively. Obviously, only the five new pixels (pixels 9–13), instead of all eight neighbors of pixel 1, need to be considered. Similarly,

the morphological dilation on other pixels can also be

simplified. Based on this consideration, the authors

define eight morphological dilation patterns dedicated

to dilations on eight different positions as depicted in

Fig. 1(b)–(i). It can be seen that with these patterned dilations, on average, half of the pixel visits are saved.
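The patterns of Fig. 1 can be derived mechanically: when dilation moves from a seed to one of its eight neighbors, only the neighbors of the new seed that were not already covered by the previous seed's neighborhood need testing. A small sketch (representation is illustrative, not the authors' implementation):

```python
# Sketch of patterned dilation: for a move `step` from the previous
# seed to the new seed, keep only the new seed's neighbors that were
# NOT already tested (the previous seed and its 8-neighborhood).

NEIGHBORS = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)
             if (dy, dx) != (0, 0)]

def pattern(step):
    """Offsets (relative to the new seed) still worth testing after
    moving by `step` from the previous seed."""
    sy, sx = step
    # previous seed's neighbors, plus the previous seed itself,
    # expressed relative to the new seed
    covered = {(dy - sy, dx - sx) for dy, dx in NEIGHBORS} | {(-sy, -sx)}
    return [d for d in NEIGHBORS if d not in covered]

print(len(pattern((1, 1))))  # diagonal move: 5 new pixels, as in Fig. 1
print(len(pattern((0, 1))))  # edge move: 3 new pixels
```

A diagonal move leaves 5 untested neighbors and an edge move leaves 3, so the eight patterns average 4 visits instead of 8, which is the "half the visits saved" figure quoted above.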

3.3 Set partitioning method

The set partitioning method to locate the remaining

clusters is another feature of the proposed algorithm.

After the most significant clusters have been extracted

in the former sub-passes, a few scattered clusters remain in the rest of the subbands. They can

be effectively coded through the set-partitioning

approach. In traditional partitioning methods, such

as, quadtree partitioning[10], the insignificant coeffi-

cients denoted by a zero block must be in a square

block. However, in this situation the insignificant space dominates and its shape is unpredictable, so it is inefficient to represent it in a fixed shape. Here, a more flexible set partitioning method

is proposed, in which the zero-block can be in

a square or rectangular shape. Unlike quadtree

partitioning, which subdivides a block into four

equally sized subblocks, binary partitioning is used

here. That is, blocks are divided into halves rather than quarters. During partitioning, each set in LISk is divided into two approximately equal halves, alternating between horizontal and vertical splits. Once a significant pixel is located,

morphological dilation is applied to it immediately.

Additionally, to further improve the efficiency, the regions of known clusters are aggressively discarded from the insignificant set. This can be accomplished by shrinking the set boundary to a smaller box before further partitioning.
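The binary split with alternating cut direction can be sketched as follows (the box representation is an assumption made for illustration):

```python
# Sketch of the binary set-partitioning step: a set in LISk is split
# into two roughly equal halves, alternating between horizontal and
# vertical cuts.  The half-open box representation is illustrative.

def split(box, horizontal):
    """box = (y0, x0, y1, x1), half-open.  Returns the two halves."""
    y0, x0, y1, x1 = box
    if horizontal:                      # cut parallel to the x-axis
        ym = (y0 + y1) // 2
        return (y0, x0, ym, x1), (ym, x0, y1, x1)
    xm = (x0 + x1) // 2                 # cut parallel to the y-axis
    return (y0, x0, y1, xm), (y0, xm, y1, x1)

top, bottom = split((0, 0, 8, 8), horizontal=True)
print(top, bottom)      # (0, 0, 4, 8) (4, 0, 8, 8)
left, right = split(top, horizontal=False)
print(left, right)      # (0, 0, 4, 4) (0, 4, 4, 8)
```

Unlike a quadtree, which always produces four square children, each split here yields two rectangles, so an arbitrarily shaped insignificant region can be isolated in fewer symbols.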

4. Codestream structure and parsing

In the multimedia server-client environments, the re-

quested image quality and resolution level may vary

significantly from one application to another. To cater

to these needs, the bitstream can be generated by the

encoder in one particular format. The scalability fea-

ture enables the stream parser to extract the data of

interest and assemble the final codestream in the re-

quested format to the decoder.

Figure 2 shows an example of a codestream estab-

lished in the proposed algorithm for progression by

precision, with the finest granularity. In this case, the

bitstreams from different scales are interleaved in each

bit-plane pass P^n. Tags are inserted to provide the information required to identify each coding unit P^{n,s}_k when parsing.

Fig. 2 Example of the hierarchical layout of the codestream


5. Complexity analysis

In addition to an impressive performance, the pro-

posed algorithm enjoys low complexity, which is com-

parable to that of SPIHT and less than that of

EBCOT and other morphocodecs.

To achieve high coding efficiency, this scheme of data organization and operation is very similar to that of SPIHT. Three lists (LSP, LIP, LIS) are kept to group

the data of significant pixels, insignificant pixels, and

insignificant sets. The difference lies in the way of lo-

cating significant pixels (or significant clusters in the

authors' notation). In SPIHT this is accomplished by recursively checking and partitioning the spatial orientation trees, whereas the proposed algorithm uses morphological dilation and binary partitioning. As simple basic operations are involved, the computational complexities of the two algorithms are comparable. Besides, the authors' management of the lists is more efficient. Whereas in SPIHT the significant pixels are retained in temporary lists until the hierarchical trees are decomposed to a certain level, here

these pixels are detected directly through cluster dila-

tion and prediction. Thus the operations to allocate

and free the temporary lists are saved.

As mentioned earlier, both EBCOT and the pro-

posed algorithm employ fractional bit-plane coding

to achieve good R-D performance. In EBCOT, each

unknown pixel is arithmetically coded in multiple

passes, based on the context. Therefore, its complex-

ity is proportional to the number of unknown pixels

Nunknown. In the proposed algorithm, multiple scan-

ning is avoided, as the data has already been sorted

in the lists. Finer fractional pass coding is achieved

through efficient list operations. More importantly,

as only the newly significant pixels and their neigh-

bors need to be coded, the complexity is on the or-

der of the number of significant pixels Nsig, which is

usually much smaller than Nunknown. Furthermore,

unlike EBCOT, there is no post optimization process-

ing involved. Therefore, it can be concluded that the

complexity of the proposed algorithm is less than that

of EBCOT.

Compared with other morphocodecs, this algorithm

has the striking advantage of lower computational

complexity. Whereas most morphocodecs in the literature so far start morphological dilation from all significant pixels, here only the ones on cluster boundaries are considered, and considerable redundant operations are

spared. Moreover, patterned morphological dilation

is introduced, which further cuts nearly half of the pixel visits during recursive dilation.

6. Experimental results

The performance of the algorithm is evaluated on

five natural 512 × 512 grayscale images: Lena, Bar-

bara, Goldhill, Crowd, and Couple. The five-scale dis-

crete wavelet transform is used with the Daubechies

9/7 biorthogonal wavelet filter and symmetric exten-

sion mode. The bitstream is generated in a format

fully embedded by precision as illustrated in Fig. 2.

The performance is compared with two state-of-the-

art image coding algorithms, SPIHT and EBCOT.

The PSNR results are listed in Table 1.

From the table, it can be seen that the performance

of the proposed algorithm is better than SPIHT and

EBCOT, on an average. Compared with SPIHT, the

improvement is evident, especially for images that are

rich in texture and detail. For instance, the proposed algorithm outperforms SPIHT by an average of 0.81 dB for the image Barbara and 0.55 dB for the image Couple.

Compared with EBCOT, better performance is also

achieved, which fits the expectation about this finer

classification scheme.

It should be noted that as the testing codestream is

organized in a fully embedded fashion, the overhead of

tags inserted in the stream cannot be ignored. The experiment shows a quality loss ranging from 0.04 dB to 0.16 dB over various bitrates; the lower the bitrate, the greater the loss. This is confirmed by the results, which show that the superiority of the proposed algorithm grows with the bitrate.

For example, at a bit rate of 1.0 bpp, it achieves an average PSNR gain of 0.52 dB over SPIHT and 0.39 dB over EBCOT. Therefore, when a relatively

coarse quality partitioning is requested, better perfor-

mance is expected as the amount of overhead data is

substantially reduced.


Table 1 Performance comparison (PSNR [dB]) of SPIHT, EBCOT, and the proposed algorithm

Image bpp SPIHT EBCOT Proposed

Lena 0.125 31.09 31.05 31.16

0.25 34.11 34.16 34.30

0.50 37.21 37.29 37.46

1.00 40.41 40.48 40.59

Barbara 0.125 24.86 25.37 25.37

0.25 27.58 28.40 28.43

0.50 31.39 32.29 32.31

1.00 36.41 37.11 37.38

Goldhill 0.125 28.48 - 28.50

0.25 30.56 30.59 30.65

0.50 33.13 33.25 33.36

1.00 36.55 36.59 36.88

Crowd 0.125 27.07 26.85 27.07

0.25 30.15 29.97 30.16

0.50 33.89 33.68 34.17

1.00 38.86 38.63 39.34

Couple 0.125 26.76 26.92 26.94

0.25 29.05 29.35 29.65

0.50 32.45 32.67 33.09

1.00 36.58 36.65 37.20

7. Conclusions

The authors have presented a scalable image coding algorithm, based on efficient methods

of cluster extraction. It employs new efficient frac-

tional bit-plane coding, based on a list structure and

introduces patterned morphological dilation to signifi-

cantly reduce the redundant dilation operations. Con-

sequently, as a counterpart of SPIHT, it achieves bet-

ter performance with comparatively low complexity.

Moreover, the flexible algorithm supports both qual-

ity and resolution scalability, which is very attractive

in many multimedia applications, especially in image

transmission over heterogeneous networks.

References

[1] Taubman D. High performance scalable image compression

with EBCOT. IEEE Trans. on Image Processing, 2000,

9(7): 1158–1170.

[2] Servetto S D, Ramchandran K, Orchard M T. Image coding

based on a morphological representation of wavelet data.

IEEE Trans. on Image Processing, 1999, 8(9): 1161–1174.

[3] Chai B, Vass J, Zhuang X. Significance-linked connected

component analysis for wavelet image coding. IEEE Trans.

on Image Processing, 1999, 8(6): 774–784.

[4] Lazzaroni F, Leonardi R, Signoroni A. High-performance

embedded morphological wavelet coding. IEEE Signal

Processing Letters, 2003, 10(10): 293–295.

[5] Shapiro J M. Embedded image coding using zerotrees of

wavelet coefficients. IEEE Trans. on Signal Processing, 1993, 41(12): 3445–3462.

[6] Said A, Pearlman W A. A new fast and efficient image

codec based on set partitioning in hierarchical trees. IEEE

Trans. on Circuits and Systems for Video Technology,

1996, 6(3): 243–250.

[7] Danyali H, Mertins A. Flexible, highly scalable, object-

based wavelet image compression algorithm for network

applications. IEE Proceedings - Vision, Image and Signal Processing, 2004, 151(6): 498–509.

[8] Li J, Lei S. Rate-distortion optimized embedding. Proc.

Picture Coding Symp., Berlin, Germany, 1997: 201–206.

[9] Peng K, Kieffer J C. Embedded image compression based

on wavelet pixel classification and sorting. IEEE Trans.

on Image Processing, 2004, 13(8): 1011–1017.

[10] Pearlman W, Islam A, Nagaraj N, Said A. Efficient, low-

complexity image coding with a set-partitioning embedded

block coder. IEEE Trans. on Circuits and Systems for

Video Technology, 2004, 14 (11): 1219–1235.

Gan Tao was born in 1977. He is a Ph.D. candidate at Shanghai University. His research interests

include communication signal processing and multiple

antenna systems. E-mail: [email protected]

He Yanmin was born in 1977. She is currently a Ph.D. candidate in Automation Engineering at the University of Electronic Science and Technology of China.

Zhu Weile was born in 1940. He is currently a professor and doctoral supervisor in Electronic Engineering at

the University of Electronic Science and Technology

of China. His research interests include networking,

multimedia signal processing, digital video compres-

sion, and communication.