(12) United States Patent
     Ramasastry et al.

US007522774B2

(10) Patent No.: US 7,522,774 B2
(45) Date of Patent: Apr. 21, 2009

(54) METHODS AND APPARATUSES FOR COMPRESSING DIGITAL IMAGE DATA

(75) Inventors: Jayaram Ramasastry, Woodinville, CA (US); Partho Choudhury, Maharashtra (IN); Ramesh Prasad, Maharashtra (IN)

(73) Assignee: Sindhara Supermedia, Inc., Redmond, WA (US)

Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 607 days.

(21) Appl. No.: 11/077,106

(22) Filed: Mar. 9, 2005

(65) Prior Publication Data

     US 2005/0207664 A1    Sep. 22, 2005

Related U.S. Application Data

(60) Provisional application No. 60/552,153, filed on Mar. 10, 2004, provisional application No. 60/552,356, filed on Mar. 10, 2004, provisional application No. 60/552,270, filed on Mar. 10, 2004.

(51) Int. Cl. G06K 9/36 (2006.01)

(52) U.S. Cl. 382/232

(58) Field of Classification Search: 382/232, 382/239, 240, 248; 708/317, 400-401; 375/240, 375/240.01, 240.02, 240.11, 240.18, 240.19; 348/384.1, 398.1, 404.1

     See application file for complete search history.

[Representative drawing; legible labels: Server, Acquisition, Encoder, Decoder, Network (e.g., wired and/or wireless).]

(56) References Cited

U.S. PATENT DOCUMENTS

5,585,852 A *    12/1996    Agarwal            375/240.11
5,881,176 A *     3/1999    Keith et al.       382/248
6,967,600 B2 *   11/2005    Kadono et al.      341/67
7,295,608 B2 *   11/2007    Reynolds et al.    375/240.01
7,333,814 B2 *    2/2008    Roberts            455/452.2

* cited by examiner

Primary Examiner: Jose L. Couso

(74) Attorney, Agent, or Firm: Blakely, Sokoloff, Taylor & Zafman LLP

(57) ABSTRACT

Methods and apparatuses for compressing digital image data are described herein. In one embodiment, a wavelet transform is performed on each pixel of a frame to generate multiple wavelet coefficients representing each pixel in a frequency domain. The wavelet coefficients of a sub-band of the frame are iteratively encoded into a bit stream based on a target transmission rate, where the sub-band of the frame is obtained from a parent sub-band of a previous iteration. The encoded wavelet coefficients satisfy a predetermined threshold based on a predetermined algorithm while the wavelet coefficients that do not satisfy the predetermined threshold are ignored in the respective iteration. Other methods and apparatuses are also described.

30 Claims, 18 Drawing Sheets

[Representative drawing, continued; legible labels: Optional Decoder, Optional Encoder, Decoder, Client.]


U.S. Patent    Apr. 21, 2009    Sheet 3 of 18    US 7,522,774 B2

Fig. 3 (layered stack; layer numbers in parentheses):

(1) Physical Layer (W-CDMA, CDMA 1X, cdma2000, GSM, GPRS, UMTS, iDEN)
(2) Data Link Control (DLC)
(3) Third party ISO protocol stack (TCP/IP/UDP)
(4) Streaming protocol stack (RTP, RTSP, RTCP, DDP)
(5) Billing and other ancillary services
(6) Network Aware Layer (NAL)
(7) Application Layer APIs for QwikStream™, QwikVu™ and QwikText™
(8) Content Generation Engine
(9) Data Repository

U.S. Patent    Apr. 21, 2009    Sheet 4 of 18    US 7,522,774 B2

[Flow diagram; figure label and reference numerals not legible in source. Stages, in order: Raw YUV color frame data → Wavelet Transform filter bank → Source Encoder (ARIES) → Channel encoding (Tree partitioning, CRC, RCPC).]

U.S. Patent    Apr. 21, 2009    Sheet 5 of 18    US 7,522,774 B2

[Fig. 4B, flow diagram. Stages, in order: Compressed Image (.qvx file format) → Channel decoding (Tree merging, CRC, RCPC) → Source Decoder (I-ARIES) → Inverse Wavelet Transform → Raw YUV data.]

U.S. Patent    Apr. 21, 2009    Sheet 6 of 18    US 7,522,774 B2

[Flowchart; figure label and reference numerals not legible in source. Steps, in order:]

Perform a wavelet transformation on each image pixel to transform the pixel into one or more coefficients in one or more wavelet maps.

Encode each wavelet map by representing the significance, sign and bit plane information of the pixel using a single bit in a bit stream.

Encode the significant bits into a context variable dependent upon the information represented by the bit and its location of the coefficient being coded (e.g., the probability of occurrence of a predetermined set of bits immediately preceding the current bit).

Transmit the content of the context variable as a bit stream as an output representing the encoded pixels.
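The first step of the flowchart on Sheet 6 (a wavelet transformation that maps pixels to coefficients in one or more wavelet maps) can be illustrated with a minimal sketch. The patent text shown here does not name a filter, so the averaging Haar filter below is a stand-in chosen for brevity, and the function names `haar_1d` and `haar_2d` are hypothetical. The sub-band naming (first letter for the horizontal filter, second for the vertical) is also an assumption; conventions differ between texts.

```python
def haar_1d(values):
    """One level of an averaging Haar transform on an even-length sequence."""
    half = len(values) // 2
    low = [(values[2 * i] + values[2 * i + 1]) / 2 for i in range(half)]
    high = [(values[2 * i] - values[2 * i + 1]) / 2 for i in range(half)]
    return low, high


def haar_2d(frame):
    """Split a frame (list of equal-length rows, even dimensions) into four sub-bands.

    Assumption: in the returned keys the first letter names the horizontal
    filter and the second the vertical filter (L = low-pass, H = high-pass).
    """
    # Filter every row horizontally.
    row_low, row_high = [], []
    for row in frame:
        low, high = haar_1d(row)
        row_low.append(low)
        row_high.append(high)

    def filter_columns(half_frame):
        # Filter every column vertically and return row-major (low, high) halves.
        pairs = [haar_1d(list(column)) for column in zip(*half_frame)]
        low = [list(r) for r in zip(*(p[0] for p in pairs))]
        high = [list(r) for r in zip(*(p[1] for p in pairs))]
        return low, high

    ll, lh = filter_columns(row_low)    # horizontally low-passed half
    hl, hh = filter_columns(row_high)   # horizontally high-passed half
    return {"LL": ll, "HL": hl, "LH": lh, "HH": hh}
```

Applying `haar_2d` again to the returned `LL` band yields the next coarser level, which is how the parent and child sub-bands referred to elsewhere in this document arise in a dyadic decomposition.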

U.S. Patent    Apr. 21, 2009    Sheet 7 of 18    US 7,522,774 B2

[Fig. 6, tree diagram; legible labels: Sub-tree 1 (HL), Sub-tree 2 (LH), Sub-tree 3 (HH).]

U.S. Patent    Apr. 21, 2009    Sheet 8 of 18    US 7,522,774 B2

[Fig. 7; no text recoverable from the drawing.]

U.S. Patent    Apr. 21, 2009    Sheet 9 of 18    US 7,522,774 B2

[Flowchart; figure label and reference numerals not legible in source. Steps, in order:]

Determine a number of iterations (nl) based on a number of quantization levels, which may be determined on the largest wavelet coefficient, and set an initial quantization threshold T = 2^nl.

Populate all insignificant pixels in IPQ, all insignificant pixels having descendants in ISQ, and all significant pixels in SPQ.

For each type I entry of ISQ, if the entry is significant with respect to a current quantization threshold, remove the respective entry from ISQ and append it in the SPQ.

For each type I entry of ISQ, if the entry is insignificant with respect to a current quantization threshold, remove the respective entry from ISQ and append it in the IPQ.

If the respective type I entry includes descendants, remove the entry from the ISQ and append it at the end of ISQ as type II entry for next iteration; otherwise, the entry is purged.

For each type II entry of ISQ, if the entry is significant with respect to a current quantization threshold, all offspring of the current ISQ entry are appended to the end of ISQ as type I entries for next iteration.

Remove any entry in IPQ that is significant with respect to the current quantization threshold and append it in the [remainder of step illegible in source].
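The threshold-driven passes in the flowchart on Sheet 9 can be sketched in a much reduced form. The sketch below keeps only two of the three queues named there, IPQ (insignificant entries) and SPQ (significant entries), and omits ISQ, the type I and type II entries, and all descendant handling. It also assumes the garbled threshold expression reads T = 2^n with n the largest integer such that 2^n does not exceed the largest coefficient magnitude, and that "significant" means a magnitude of at least T; both are assumptions borrowed from common bit-plane coders, not statements of the patented method. The function names are hypothetical.

```python
def initial_threshold(coefficients):
    """Return (n, 2**n), where 2**n is the largest power of two <= max |coefficient|."""
    largest = max(abs(c) for c in coefficients)
    if largest < 1:
        raise ValueError("all coefficients are below 1; nothing to encode")
    n = int(largest).bit_length() - 1  # exact floor(log2(largest)) for largest >= 1
    return n, 2 ** n


def significance_passes(coefficients):
    """Halve the threshold each pass, moving newly significant indices from IPQ to SPQ."""
    _, threshold = initial_threshold(coefficients)
    ipq = list(range(len(coefficients)))  # indices not yet found significant
    spq = []                              # indices found significant, in discovery order
    history = []                          # (threshold, indices promoted in that pass)
    while threshold >= 1:
        promoted = [i for i in ipq if abs(coefficients[i]) >= threshold]
        ipq = [i for i in ipq if abs(coefficients[i]) < threshold]
        spq.extend(promoted)
        history.append((threshold, promoted))
        threshold //= 2
    return history, ipq, spq
```

For the coefficients `[34, -20, 3, 9, -1, 0]` the passes run at thresholds 32, 16, 8, 4, 2 and 1, and only the zero coefficient is left in `ipq` at the end. Because large coefficients are promoted first, stopping after any pass still leaves a usable coarse description, which is the property a rate-targeted encoder relies on.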

U.S. Patent    Apr. 21, 2009    Sheet 10 of 18    US 7,522,774 B2

[Fig. 9A, block diagram; legible labels: Wavelet Transform filter bank; Bypass ME/MC for I-frames; ME/MC; motion information; Channel encoding (Tree partitioning, CRC, RCPC); Compressed file; Streaming data. Remaining labels and reference numerals not legible.]

U.S. Patent    Apr. 21, 2009    Sheet 11 of 18    US 7,522,774 B2

[Fig. 9B, block diagram; legible labels: Raw YUV color frame data; ME/MC; I frames; information; Source Encoder (ARIES I/II); I-ARIES I/II; Channel encoding (Tree partitioning, CRC, RCPC); Compressed File; Optional; Streaming data. Remaining labels and reference numerals not legible.]

U.S. Patent    Apr. 21, 2009    Sheet 12 of 18    US 7,522,774 B2

[Fig. 10A, block diagram; legible labels: Streaming data (Optional); Compressed Video (.qsx file format); CABAC coded motion information; Channel decoding (Tree merging, CRC, RCPC); Source Decoder (I-ARIES I/II); Bypass MC for I-frames; MC; Frame Buffer; Inverse Wavelet Transform; Raw YUV data.]

U.S. Patent    Apr. 21, 2009    Sheet 13 of 18    US 7,522,774 B2

[Fig. 10B, block diagram; legible labels: Streaming data (Optional); Compressed Video (.qsx file format); CABAC coded motion information; Channel decoding (Tree merging, CRC, RCPC); Source Decoder (I-ARIES I/II); Bypass MC for I-frames; Frame Buffer; Inverse Wavelet Transform; Raw YUV data.]

U.S. Patent    Apr. 21, 2009    Sheet 14 of 18    US 7,522,774 B2

[Fig. 11, flowchart; reference numerals not legible in source. Steps, in order:]

Identify a reference frame (e.g., the first frame or an I-frame).

Perform a ME/MC on the coarsest subbands as parent subbands of a current frame other than the I-frame with respect to the identified reference frame to generate one or more motion vectors for the coarsest subbands.

Estimate the spatial shifting of pixels of child subbands using the motion vectors of the parent subbands to determine a search area of the child subbands.

Perform a ME/MC for the child subbands to determine the motion vectors of the child subbands.

More child subbands? (decision)

Perform compression on the predicted/compensated data into compressed data (e.g., see Figs. 5 and 8).
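The parent-to-child refinement in Fig. 11 can be sketched as a two-stage block search. The sketch makes several assumptions that the flowchart does not state: blocks are matched by sum of absolute differences (SAD), the child sub-band has twice the resolution of its parent so the parent vector is doubled to centre the child search, and the child search covers only a small radius around that prediction. All function and parameter names are hypothetical.

```python
def sad(ref, cur, bx, by, dx, dy, size):
    """Sum of absolute differences between a block of `cur` and a displaced block of `ref`."""
    return sum(
        abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
        for y in range(size)
        for x in range(size)
    )


def best_vector(ref, cur, bx, by, size, center, radius):
    """Full search for the lowest-SAD vector (dx, dy) within `radius` of `center`."""
    height, width = len(ref), len(ref[0])
    best_cost, best_mv = None, None
    for dy in range(center[1] - radius, center[1] + radius + 1):
        for dx in range(center[0] - radius, center[0] + radius + 1):
            # Skip candidates whose reference block falls outside the frame.
            if bx + dx < 0 or by + dy < 0:
                continue
            if bx + dx + size > width or by + dy + size > height:
                continue
            cost = sad(ref, cur, bx, by, dx, dy, size)
            if best_cost is None or cost < best_cost:
                best_cost, best_mv = cost, (dx, dy)
    if best_mv is None:
        raise ValueError("no candidate block lies inside the reference frame")
    return best_mv


def refine_from_parent(ref_parent, cur_parent, ref_child, cur_child,
                       bx, by, size, parent_radius=2, child_radius=1):
    """Estimate a parent-level vector, then refine it at the child level.

    (bx, by) and `size` describe the block in parent coordinates; the
    corresponding child block is assumed to sit at (2*bx, 2*by) with size 2*size.
    """
    parent_mv = best_vector(ref_parent, cur_parent, bx, by, size, (0, 0), parent_radius)
    predicted = (2 * parent_mv[0], 2 * parent_mv[1])  # assumed dyadic scaling
    child_mv = best_vector(ref_child, cur_child, 2 * bx, 2 * by, 2 * size,
                           predicted, child_radius)
    return parent_mv, child_mv
```

The point of the design is cost: the wide search runs only on the small parent sub-band, and each child block then tests at most nine candidates with `child_radius=1` instead of repeating a wide search at full resolution.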

U.S. Patent    Apr. 21, 2009    Sheet 15 of 18    US 7,522,774 B2

[Fig. 12; legible labels: Boundary of the Search Area for refinement MVs; Refinement Vector for level k, orientation o (symbol not legible); Block Neighborhood. Remaining labels not legible.]

U.S. Patent    Apr. 21, 2009    Sheet 16 of 18    US 7,522,774 B2

[Figure label not legible in source; legible labels: Integer Motion Prediction; Half-Pel Motion Prediction.]

U.S. Patent    Apr. 21, 2009    Sheet 17 of 18    US 7,522,774 B2

[Fig. 14; legible labels: Integer Motion Prediction; Half-Pel Motion Prediction.]

U.S. Patent    Apr. 21, 2009    Sheet 18 of 18    US 7,522,774 B2

[Fig. 15; legible labels: Block currently being tested; Matching block; OBMC when current block being tested is in 1MV mode; OBMC when current block being tested is in 4MV mode; Motion Vector (identical colors denote MVs of the same block); Displaced MV to translate [illegible] being tested.]


METHODS AND APPARATUSES FOR COMPRESSING DIGITAL IMAGE DATA

This application claims the benefit of U.S. Provisional Application No. 60/552,153, filed Mar. 10, 2004, U.S. Provisional Application No. 60/552,356, filed Mar. 10, 2004, and U.S. Provisional Application No. 60/552,270, filed Mar. 10, 2004. The above-identified applications are hereby incorporated by reference in their entirety.

FIELD OF THE INVENTION

The present invention relates generally to multimedia applications. More particularly, this invention relates to compressing digital image data.

BACKGROUND OF THE INVENTION

A variety of systems have been developed over the past decade for the encoding and decoding of audio/video data for transmission over wireline and/or wireless communication systems. Most systems in this category employ standard compression/transmission techniques, such as, for example, the ITU-T Rec. H.264 (also referred to as H.264) and ISO/IEC Rec. 14496-10 AVC (also referred to as MPEG-4) standards. However, due to their inherent generality, they lack the specific qualities needed for seamless implementation on low power, low complexity systems (such as hand held devices including, but not restricted to, personal digital assistants and smart phones) over noisy, low bit rate wireless channels.

Due to the likely business models rapidly emerging in the wireless market, in which cost incurred by the consumer is directly proportional to the actual volume of transmitted data, and also due to the limited bandwidth, processing capability, storage capacity and battery power, efficiency and speed in compression of audio/video data to be transmitted is a major factor in the eventual success of any such multimedia content delivery system. Most systems in use today are retrofitted versions of identical systems used on higher end desktop workstations. Unlike desktop systems, where error control is not a critical issue due to the inherent reliability of cable LAN/WAN data transmission, and bandwidth may be assumed to be almost unlimited, transmission over limited capacity wireless networks requires integration of such systems that may leverage suitable processing and error-control technologies to achieve the level of fidelity expected of a commercially viable multimedia compression and transmission system.

Conventional video compression engines, or codecs, can be broadly classified into two categories. One class of coding strategies, known as a download-and-play (D&P) profile, not only requires the entire file to be downloaded onto the local memory before playback, leading to a large latency time (depending on the available bandwidth and the actual file size), but also makes stringent demands on the amount of buffer memory to be made available for the downloaded payload. Even with the more sophisticated streaming profile, the physical limitations of current generation transmission equipment at the physical layer force service providers to incorporate a pseudo-streaming capability, which requires an initial period of latency (at the beginning of transmission), and continuous buffering henceforth, which imposes a strain on the limited processing capabilities of the hand-held processor. Most commercial compression solutions in the market today do not possess a progressive transmission capability, which means that transmission is possible only until the last integral frame, packet or bit before bandwidth drops below the minimum threshold. In case of video codecs, if the connection breaks before the transmission of the current frame, this frame is lost forever.

Another drawback in conventional video compression codecs is the introduction of blocking artifacts due to the block-based coding schemes used in most codecs. Apart from the degradation in subjective visual quality, such systems suffer from poor performance due to bottlenecks introduced by the additional de-blocking filters. Yet another drawback is that, due to the limitations in the word size of the computing platform, the coded coefficients are truncated to an approximate value. This is especially prominent along object boundaries, where Gibbs' phenomenon leads to the generation of a visual phenomenon known as mosquito noise. Due to this, the blurring along the object boundaries becomes more prominent, leading to degradation in overall frame quality.

Additionally, the local nature of motion prediction in some codecs introduces motion-induced artifacts, which cannot be easily smoothed by a simple filtering operation. Such problems arise especially in cases of fast motion clips and systems where the frame rate is below that of natural video (e.g., 25 or 30 fps non-interlaced video). In either case, the temporal redundancy between two consecutive frames is extremely low (since much of the motion is lost in between the frames itself), leading to poorer tracking of the motion across frames. This effect is cumulative in nature, especially for a longer group of frames (GoF).

Furthermore, mobile end-user devices are constrained by low processing power and storage capacity. Due to the limitations on the silicon footprint, most mobile and hand-held systems in the market have to time-share the resources of the central processing unit (microcontroller or RISC/CISC processor) to perform all their DSP, control and communication tasks, with little or no provision for a dedicated processor to take the video/audio processing load off the central processor. Moreover, most general-purpose central processors lack the unique architecture needed for optimal DSP performance. Therefore, a mobile video-codec design must have minimal client-end complexity while maintaining consistency on the efficiency and robustness front.

SUMMARY OF THE INVENTION

Methods and apparatuses for compressing digital image data are described herein. In one embodiment, a wavelet transform is performed on each pixel of a frame to generate multiple wavelet coefficients representing each pixel in a frequency domain. The wavelet coefficients of a sub-band of the frame are iteratively encoded into a bit stream based on a target transmission rate, where the sub-band of the frame is obtained from a parent sub-band of a previous iteration. The encoded wavelet coefficients satisfy a predetermined threshold based on a predetermined algorithm while the wavelet coefficients that do not satisfy the predetermined threshold are ignored in the respective iteration.

Other features of the present invention will be apparent from the accompanying drawings and from the detailed description which follows.

BRIEF DESCRIPTION OF THE DRAWINGS

The present invention is illustrated by way of example and not limitation in the figures of the accompanying drawings in which like references indicate similar elements.

FIG. 1 is a block diagram illustrating an exemplary multimedia streaming system according to one embodiment.
