Lecture6: ARITHMETIC CODES - GUC

SOURCE CODING PROF. A.M.ALLAM


Lecture6: ARITHMETIC CODES

Fortunately, we can compute the tag for a given sequence of symbols using only the probabilities of the individual symbols (i.e., the probability model), by means of a recursive formula for the lower and upper limits of the interval that contains the tag.

In general, we can show that for any sequence $\mathbf{x} = (x_1 x_2 \ldots x_n)$

$l^{(n)} = l^{(n-1)} + \left(u^{(n-1)} - l^{(n-1)}\right) F_X(x_n - 1)$   (2)

$u^{(n)} = l^{(n-1)} + \left(u^{(n-1)} - l^{(n-1)}\right) F_X(x_n)$   (3)

Using the midpoint of the interval for the tag, then

$T_X(\mathbf{x}) = \dfrac{u^{(n)} + l^{(n)}}{2}$   (4)

Notice that throughout this process we did not need to compute any joint probabilities.

Therefore, the tag for a sequence of any length can be computed in a sequential fashion. The only information required by the tag generation procedure is the CDF of the source, which can be obtained directly from the probability model.


LECTURES 11/20/2016 2


Ex: For the source alphabet A = {a1, a2, a3} with P(a1) = 0.8, P(a2) = 0.02, and P(a3) = 0.18, encode the sequence 1 3 2 1.

Define the random variable X(ai) = i.

From the probability model we get

$F_X(k) = 0$ for $k \le 0$, $F_X(1) = 0.8$, $F_X(2) = 0.82$, $F_X(3) = 1.0$, $F_X(k) = 1.0$ for $k > 3$

Initializing u(0) to 1 and l(0) to 0, and using equations (2) and (3), the first element of the sequence, 1, results in the following update

$l^{(1)} = l^{(0)} + \left(u^{(0)} - l^{(0)}\right) F_X(0) = 0 + (1 - 0) \times 0 = 0$

$u^{(1)} = l^{(0)} + \left(u^{(0)} - l^{(0)}\right) F_X(1) = 0 + (1 - 0) \times 0.8 = 0.8$

i.e., the interval containing the tag for the sequence 1 is [0, 0.8)

The second element of the sequence is 3; using the update equations we get

$l^{(2)} = 0 + (0.8 - 0) F_X(2) = 0.8 \times 0.82 = 0.656$

$u^{(2)} = 0 + (0.8 - 0) F_X(3) = 0.8 \times 1.0 = 0.8$

i.e., the interval containing the tag for the sequence 13 is [0.656, 0.8)




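The third element, 2, results in the following update equations

$l^{(3)} = 0.656 + (0.8 - 0.656) F_X(1) = 0.656 + 0.144 \times 0.8 = 0.7712$

$u^{(3)} = 0.656 + (0.8 - 0.656) F_X(2) = 0.656 + 0.144 \times 0.82 = 0.77408$

i.e., the interval containing the tag for the sequence 132 is [0.7712, 0.77408)

For the last element, 1, the lower and upper limits of the interval containing the tag are

$l^{(4)} = 0.7712 + (0.77408 - 0.7712) F_X(0) = 0.7712$

$u^{(4)} = 0.7712 + (0.77408 - 0.7712) F_X(1) = 0.7712 + 0.00288 \times 0.8 = 0.773504$

The tag for the sequence 1 3 2 1 can be generated using equation (4) as

$T_X(1321) = \dfrac{0.7712 + 0.773504}{2} = 0.772352$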
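The recursion of equations (2) to (4) can be sketched in a few lines of Python. This is only an illustrative sketch: the names `CDF`, `tag_interval`, and `midpoint_tag` are hypothetical, and ordinary floating-point numbers are assumed, so the results match the hand computation only up to rounding.

```python
# CDF of the lecture's three-symbol source: CDF[k] = F_X(k), with F_X(0) = 0.
CDF = [0.0, 0.8, 0.82, 1.0]

def tag_interval(sequence, cdf):
    """Apply equations (2) and (3) symbol by symbol, starting from [0, 1)."""
    low, high = 0.0, 1.0
    for x in sequence:
        width = high - low
        high = low + width * cdf[x]       # equation (3)
        low = low + width * cdf[x - 1]    # equation (2)
    return low, high

def midpoint_tag(sequence, cdf):
    """Equation (4): the tag is the midpoint of the final interval."""
    low, high = tag_interval(sequence, cdf)
    return (low + high) / 2

low, high = tag_interval([1, 3, 2, 1], CDF)
tag = midpoint_tag([1, 3, 2, 1], CDF)
print(low, high, tag)  # close to 0.7712, 0.773504, 0.772352
```

Only the CDF is consulted inside the loop, which reflects the point that no joint probabilities are needed.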



Decoding Graphically

$x_{\text{initial}} = x_{\text{low}}$ of the last letter or number

$x_{\text{recursive}} = \dfrac{x - x_{\text{low, subinterval}}}{\text{range}_{\text{subinterval}}}$
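Decoding can be sketched as the mirror image of encoding: find the subinterval of [0, 1) that contains the tag, output its symbol, then subtract the lower limit of that subinterval and divide by its range. The sketch below assumes the decoder knows the sequence length in advance and uses floating-point arithmetic; the names `CDF` and `decode` are hypothetical.

```python
# CDF of the lecture's three-symbol source: CDF[k] = F_X(k), with F_X(0) = 0.
CDF = [0.0, 0.8, 0.82, 1.0]

def decode(tag, length, cdf):
    """Recover `length` symbols from a tag value in [0, 1)."""
    symbols = []
    for _ in range(length):
        for k in range(1, len(cdf)):
            if cdf[k - 1] <= tag < cdf[k]:
                symbols.append(k)
                # Rescale the tag relative to the chosen subinterval.
                tag = (tag - cdf[k - 1]) / (cdf[k] - cdf[k - 1])
                break
    return symbols

print(decode(0.772352, 4, CDF))  # [1, 3, 2, 1]
```

Any value inside the final tag interval decodes to the same sequence, not only the midpoint.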




(B) Generating Binary Code of the Tag

-Since the tag forms a unique representation for the sequence, the binary representation of the tag forms a unique binary code for the sequence

-Let us assume an 8-bit binary sign-magnitude fixed-point representation comprising a sign bit, three integer bits, and four fractional bits

-The sign bit is used only to represent the sign of the value (0 = positive, 1 = negative) [only 0 is considered in arithmetic coding]

-Let us give an example; assume:

Three integer bits, which can represent an integer in the range 0 to 7 [not relevant to arithmetic coding]

Four fractional bits, which can represent values from 0.0 to 0.9375 in steps of 0.0625; the bits carry the weights 1/2, 1/4, 1/8, and 1/16
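As a small illustration of the fractional part of such a fixed-point word, the sketch below converts between a value in [0, 1) and its fractional bits. The helper names `fraction_to_bits` and `bits_to_fraction` are hypothetical, and truncation (not rounding) is assumed.

```python
def fraction_to_bits(value, n_bits):
    """Truncate a value in [0, 1) to n_bits fractional binary digits."""
    scaled = int(value * 2 ** n_bits)      # drops everything below 2**-n_bits
    return format(scaled, f"0{n_bits}b")

def bits_to_fraction(bits):
    """Interpret a bit string as the binary fraction .b1 b2 b3 ..."""
    return int(bits, 2) / 2 ** len(bits)

print(bits_to_fraction("1111"))      # 0.9375, the largest 4-bit fraction
print(fraction_to_bits(0.8125, 4))   # 1101
```

With four fractional bits the largest representable fraction is 15/16 = 0.9375, and any finer detail of the value is lost.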


[Figures: Signed Fixed-Point Arithmetic]


We have said that the tag forms a unique representation for the sequence. This means that the binary representation of the tag forms a unique binary code for the sequence.

However, we have placed no restrictions on what values in the unit interval the tag can take. The binary representation of some of these values would be infinitely long, in which case, although the code is unique, it may not be efficient.

To make the code efficient, the binary representation has to be truncated. Hence, a binary code for $T_X(\mathbf{x})$ can be obtained by taking the binary representation of this number and truncating it to

$l(\mathbf{x}) = \left\lceil \log_2 \dfrac{1}{P(\mathbf{x})} \right\rceil + 1$ bits

This code can be proven to be unique and uniquely decodable.

The code becomes more efficient as the length of the sequence m increases: the average code length per source symbol, $\bar{l}$, satisfies

$H(X) \le \bar{l} < H(X) + \dfrac{2}{m}$
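The truncation rule can be tried on the sequence 1 3 2 1 from the earlier example, whose final tag interval is [0.7712, 0.773504). The sketch below is illustrative only: `code_length` and `truncated_code` are hypothetical names, and the sequence probability is taken as the product of the symbol probabilities of the lecture's source.

```python
import math

def code_length(p):
    """Number of bits kept: ceil(log2(1/p)) + 1."""
    return math.ceil(math.log2(1 / p)) + 1

def truncated_code(tag, p):
    """Binary expansion of the tag, truncated to code_length(p) bits."""
    n = code_length(p)
    return format(int(tag * 2 ** n), f"0{n}b")

p_sequence = 0.8 * 0.18 * 0.02 * 0.8       # P(1 3 2 1) = 0.002304
tag = (0.7712 + 0.773504) / 2              # midpoint tag, equation (4)
code = truncated_code(tag, p_sequence)
print(code_length(p_sequence), code)       # 10 1100010110
```

The truncated value, .1100010110 = 0.771484375, still lies inside the tag interval, which is what keeps the code unique.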




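Ex: For the source alphabet A = {a1, a2, a3, a4} with P(a1) = 1/2, P(a2) = 1/4, and P(a3) = P(a4) = 1/8.

Define the random variable X(ai) = i

$F_X(k) = 0$ for $k \le 0$, $F_X(1) = 0.5$, $F_X(2) = 0.75$, $F_X(3) = 0.875$, $F_X(4) = 1.0$

Using equation (1) or graphically, you can get the midpoint tag for each symbol; the tag and its binary representation for each symbol are

Symbol   F_X     Tag      Binary
1        0.5     0.25     .010
2        0.75    0.625    .101
3        0.875   0.8125   .1101
4        1.0     0.9375   .1111

The truncated length and the binary code of each symbol are

Symbol   Length   Code
1        2        01
2        3        101
3        4        1101
4        4        1111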




Tag Generation with Scaling

Big Problem

Consider the values of l(n) and u(n) in tag generation: as n gets larger, these values come closer and closer together, i.e., in order to represent all the subintervals uniquely we need increasing precision as the length of the sequence increases. However, the binary representation of these values then becomes arbitrarily long, i.e., the code is unique but not efficient.

In a system with finite precision, the two values are bound to converge, and we will lose all information about the sequence from the point at which the two values converged.

To avoid this situation, we need to rescale the interval.

We would also like to perform the encoding incrementally, i.e., to transmit portions of the code as the sequence is being observed, rather than wait until the entire sequence has been observed before transmitting the first bit.
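The finite-precision problem is easy to provoke. The sketch below, an illustration and not part of the lecture, encodes the symbol 3 of the earlier three-symbol source (F_X(2) = 0.82, F_X(3) = 1.0) over and over without rescaling and counts how many symbols it takes until the two limits become equal in double-precision floating point. The function name `symbols_until_collapse` is hypothetical.

```python
def symbols_until_collapse(f_below, f_at, max_symbols=1000):
    """Encode one symbol repeatedly without rescaling.

    f_below and f_at are F_X(x - 1) and F_X(x) for that symbol.
    Returns the number of symbols after which low == high,
    or None if that does not happen within max_symbols.
    """
    low, high = 0.0, 1.0
    for n in range(1, max_symbols + 1):
        width = high - low
        high = low + width * f_at
        low = low + width * f_below
        if low == high:
            return n
    return None

n = symbols_until_collapse(0.82, 1.0)
print(n)  # a few dozen symbols at most in double precision
```

Once the limits coincide the interval has zero width, so every later symbol leaves it unchanged and can no longer be recovered by the decoder.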




Synchronized Rescaling and Incremental Coding

Consider the case where the interval is confined to either the lower half [0, 0.5), where the most significant bit of the tag is 0, or the upper half [0.5, 1), where the most significant bit is 1.

We can indicate to the decoder whether the tag is confined to the upper or lower half of the unit interval by sending the first bit of the tag: a "1" for the upper half and a "0" for the lower half.

Once the encoder and decoder know which half contains the tag, we can ignore the half of the unit interval not containing the tag and concentrate on the half containing the tag, mapping that half interval to the full [0, 1) interval as:

E1: [0 ,0.5) → [0, 1) E1(x )= 2x

E2 : [0.5, 1) → [0 ,1) E2(x )= 2(x−0.5)

We can now continue with this process, generating another bit of the tag every

time the tag interval is restricted to either half of the unit interval
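A single rescaling step might be sketched as follows. The function names `e1`, `e2`, and `rescale_once` are hypothetical; the interval is treated as half-open, [low, high), and floating-point arithmetic is assumed.

```python
def e1(x):
    """E1: map the lower half [0, 0.5) onto [0, 1)."""
    return 2 * x

def e2(x):
    """E2: map the upper half [0.5, 1) onto [0, 1)."""
    return 2 * (x - 0.5)

def rescale_once(low, high):
    """Return (bit, new_low, new_high); bit is None if no mapping applies."""
    if high <= 0.5:                 # interval lies in the lower half
        return "0", e1(low), e1(high)
    if low >= 0.5:                  # interval lies in the upper half
        return "1", e2(low), e2(high)
    return None, low, high          # interval straddles 0.5

print(rescale_once(0.656, 0.8))  # bit "1", interval close to [0.312, 0.6)
print(rescale_once(0.0, 0.8))    # (None, 0.0, 0.8)
```

Each successful step emits one bit and doubles the width of the interval, which is what keeps the limits from converging.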




Ex: For the source alphabet A = {a1, a2, a3} with P(a1) = 0.8, P(a2) = 0.02, and P(a3) = 0.18, encode the sequence 1 3 2 1.

Define the random variable X(ai) = i

$F_X(k) = 0$ for $k \le 0$, $F_X(1) = 0.8$, $F_X(2) = 0.82$, $F_X(3) = 1.0$, $F_X(k) = 1.0$ for $k > 3$

Initializing u(0) to 1 and l(0) to 0, and using equations (2) and (3), the first element of the sequence, 1, results in the following update

$l^{(1)} = l^{(0)} + \left(u^{(0)} - l^{(0)}\right) F_X(0) = 0 + (1 - 0) \times 0 = 0$

$u^{(1)} = l^{(0)} + \left(u^{(0)} - l^{(0)}\right) F_X(1) = 0 + (1 - 0) \times 0.8 = 0.8$

The interval [0, 0.8) is not confined to either the upper or lower half of the unit interval, so we proceed




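The second element of the sequence is 3, which results in the update

$l^{(2)} = 0 + (0.8 - 0) F_X(2) = 0.8 \times 0.82 = 0.656$

$u^{(2)} = 0 + (0.8 - 0) F_X(3) = 0.8 \times 1.0 = 0.8$

The interval [0.656, 0.8) is contained entirely in the upper half of the unit interval, so we send the binary code 1 and rescale using E2

$l^{(2)} = 2 \times (0.656 - 0.5) = 0.312$

$u^{(2)} = 2 \times (0.8 - 0.5) = 0.6$

The third element of the sequence is 2, which results in the update

$l^{(3)} = 0.312 + (0.6 - 0.312) F_X(1) = 0.312 + 0.288 \times 0.8 = 0.5424$

$u^{(3)} = 0.312 + (0.6 - 0.312) F_X(2) = 0.312 + 0.288 \times 0.82 = 0.54816$

The interval [0.5424, 0.54816) is contained entirely in the upper half of the unit interval, so we send the binary code 1 and rescale using E2

$l^{(3)} = 2 \times (0.5424 - 0.5) = 0.0848$

$u^{(3)} = 2 \times (0.54816 - 0.5) = 0.09632$

The interval [0.0848, 0.09632) is contained entirely in the lower half of the unit interval, so we send the binary code 0 and rescale using E1

$l^{(3)} = 2 \times 0.0848 = 0.1696$

$u^{(3)} = 2 \times 0.09632 = 0.19264$

The interval [0.1696, 0.19264) is contained entirely in the lower half of the unit interval, so we send the binary code 0 and rescale using E1

$l^{(3)} = 2 \times 0.1696 = 0.3392$

$u^{(3)} = 2 \times 0.19264 = 0.38528$

The interval [0.3392, 0.38528) is contained entirely in the lower half of the unit interval, so we send the binary code 0 and rescale using E1

$l^{(3)} = 2 \times 0.3392 = 0.6784$

$u^{(3)} = 2 \times 0.38528 = 0.77056$

The interval [0.6784, 0.77056) is contained entirely in the upper half of the unit interval, so we send the binary code 1 and rescale using E2

$l^{(3)} = 2 \times (0.6784 - 0.5) = 0.3568$

$u^{(3)} = 2 \times (0.77056 - 0.5) = 0.54112$

The interval [0.3568, 0.54112) is not confined to either half of the unit interval, so we proceed

Continuing with the last element, 1, which results in the update

$l^{(4)} = 0.3568 + (0.54112 - 0.3568) F_X(0) = 0.3568$

$u^{(4)} = 0.3568 + (0.54112 - 0.3568) F_X(1) = 0.3568 + 0.18432 \times 0.8 = 0.504256$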

At this point, encoding can be terminated by sending the binary representation of any value in the final tag interval. Generally, this value is taken to be l(n).

In this particular example, it is convenient to use the value 0.5. The binary representation of 0.5 is .10, thus we would transmit a 1 followed by as many 0s as required by the word length of the implementation being used.

Notice that the tag interval size at this stage (0.504256 - 0.3568) is 64 times the size it was when we were using the unmodified algorithm (0.773504 - 0.7712): each of the six rescalings doubled it. Rescaling therefore solves the finite precision problem.




The bits that we have been sending with each mapping constitute the tag itself, which satisfies our desire for incremental encoding. For this example the transmitted bits are

1100011

The binary number .1100011 corresponds to the decimal number 0.7734375.

Notice that this number lies within the final tag interval of the unmodified algorithm, [0.7712, 0.773504); therefore, we could use it to decode the sequence.
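The whole incremental procedure for the sequence 1 3 2 1 can be sketched as below. This is a simplified illustration under stated assumptions: it uses floating-point numbers, applies only the E1 and E2 mappings, and terminates by sending the single bit 1 (the value 0.5), which is valid here because an interval that cannot be rescaled always straddles 0.5. The names `CDF` and `encode_with_rescaling` are hypothetical.

```python
# CDF of the lecture's three-symbol source: CDF[k] = F_X(k), with F_X(0) = 0.
CDF = [0.0, 0.8, 0.82, 1.0]

def encode_with_rescaling(sequence, cdf):
    """Return the emitted bit string and the final (rescaled) interval."""
    low, high = 0.0, 1.0
    bits = []
    for x in sequence:
        width = high - low
        high = low + width * cdf[x]        # equation (3)
        low = low + width * cdf[x - 1]     # equation (2)
        # Emit a bit and rescale while the interval sits in one half.
        while high <= 0.5 or low >= 0.5:
            if high <= 0.5:                # lower half: send 0, apply E1
                bits.append("0")
                low, high = 2 * low, 2 * high
            else:                          # upper half: send 1, apply E2
                bits.append("1")
                low, high = 2 * (low - 0.5), 2 * (high - 0.5)
    bits.append("1")                       # terminate with the value 0.5
    return "".join(bits), (low, high)

bits, (low, high) = encode_with_rescaling([1, 3, 2, 1], CDF)
print(bits)                           # 1100011
print(int(bits, 2) / 2 ** len(bits))  # 0.7734375
```

Bits leave the encoder as soon as they are determined, so the interval limits never need more precision than a fixed word length provides in this example.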
