CArcMOOC 02.03 - Encodings of non-numerical sets

Carc 02.03

[email protected]

02. Information theory

02.03. Representation of non-numerical sets

• Texts

• Images

• Signals (Audio/Video)

• Redundancy and compression

Computer Architecture

Text

1. A text is a sequence of characters

2. Each character is taken from a finite alphabet

3. Using a constant-size encoding for the characters, a text is encoded as a concatenation of character codes

4. ASCII: 7-bit encoding

5. Extended ASCII: 8-bit encoding
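A minimal Python sketch of point 3, a text encoded as a concatenation of constant-size character codes. The helper names `encode_text` and `decode_text` are hypothetical, and `ord` is used as the character code, which coincides with ASCII for 7-bit codes; for 8-bit codes it assumes the Latin-1 table, which is only one of several "extended ASCII" variants.

```python
def encode_text(text: str, bits_per_char: int = 7) -> str:
    """Concatenate fixed-size binary codes, one per character."""
    limit = 1 << bits_per_char
    codes = []
    for ch in text:
        code = ord(ch)
        if code >= limit:
            raise ValueError(f"{ch!r} does not fit in {bits_per_char} bits")
        codes.append(format(code, f"0{bits_per_char}b"))
    return "".join(codes)


def decode_text(bits: str, bits_per_char: int = 7) -> str:
    """Split the bit string into fixed-size chunks and map each back to a character."""
    if len(bits) % bits_per_char:
        raise ValueError("bit string length is not a multiple of the code size")
    return "".join(
        chr(int(bits[i:i + bits_per_char], 2))
        for i in range(0, len(bits), bits_per_char)
    )


encoded = encode_text("Hi")
print(encoded)               # 10010001101001
print(decode_text(encoded))  # Hi
```

Because every code has the same size, the decoder needs no separators: it simply cuts the bit string every `bits_per_char` bits.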

Images

1. An image is a matrix of points with assigned colors

2. A real image contains infinitely many points, and each point may take any of infinitely many colors

3. Both space and color discretization are required

4. Discretized points are called pixels

5. Pixels are organized on a matrix

6. Using a constant-size encoding for each pixel, an image is encoded as a concatenation of pixel codes, to be read in a given order

Color (gray) levels

[Figure: gray scale split into 16 intervals, labeled top to bottom with the 4-bit codes 1111, 1110, ..., 0001, 0000]

The encoding associates a unique code with an interval of gray levels

All gray levels within the interval are associated with the same code, thus losing information

The original gray level cannot be exactly reconstructed from the code

Decoding associates each code with a unique gray level (the representative of its class)
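The interval-to-code mapping can be sketched in Python as follows. This is an illustration under stated assumptions, not the slide's own procedure: gray levels are taken as real numbers in [0.0, 1.0], the intervals are uniform, and the representative of each class is assumed to be the interval midpoint. `encode_gray` and `decode_gray` are hypothetical names.

```python
def encode_gray(level: float, bits: int = 4) -> str:
    """Map a gray level in [0.0, 1.0] to the code of the interval containing it."""
    if not 0.0 <= level <= 1.0:
        raise ValueError("gray level must lie in [0.0, 1.0]")
    n_codes = 1 << bits
    # Level 1.0 is folded into the last interval.
    index = min(int(level * n_codes), n_codes - 1)
    return format(index, f"0{bits}b")


def decode_gray(code: str) -> float:
    """Return the representative gray level of a code (assumed: interval midpoint)."""
    n_codes = 1 << len(code)
    return (int(code, 2) + 0.5) / n_codes


print(encode_gray(0.30), encode_gray(0.31))  # 0100 0100
print(decode_gray("0100"))                   # 0.28125
```

Levels 0.30 and 0.31 fall in the same interval and receive the same code, so neither can be recovered exactly: decoding returns 0.28125 for both.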

2D images

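[Figure: 2D image as a grid of nx by ny pixels along the x and y axes; each pixel holds one of nlev gray levels]

$size = n_x \cdot n_y \cdot \log_2 n_{lev}$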

Example

[Figure: example images at six settings: 100x100x8bit, 100x100x1bit, 50x50x8bit, 50x50x1bit, 10x10x8bit, 10x10x1bit]
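To get a feel for the numbers, the sizes of the six example settings can be computed from size = nx · ny · log2(nlev). The sketch below is only illustrative: `image_size_bits` is a hypothetical helper, and rounding the bits per pixel up to a whole bit is an assumption for level counts that are not powers of two.

```python
import math


def image_size_bits(n_x: int, n_y: int, n_lev: int) -> int:
    """size = n_x * n_y * log2(n_lev), with bits per pixel rounded up (assumption)."""
    bits_per_pixel = math.ceil(math.log2(n_lev))
    return n_x * n_y * bits_per_pixel


for side in (100, 50, 10):
    for bits in (8, 1):
        size = image_size_bits(side, side, 2 ** bits)
        print(f"{side}x{side}x{bits}bit: {size} bits")
# 100x100x8bit: 80000 bits
# 100x100x1bit: 10000 bits
# 50x50x8bit: 20000 bits
# 50x50x1bit: 2500 bits
# 10x10x8bit: 800 bits
# 10x10x1bit: 100 bits
```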

Analog and digital signals

• Signal: time-varying physical quantity

– Analog: continuous-time, continuous-value

– Digital: discrete-time, discrete-value

• The digital encoding of a continuous signal entails:

– Sampling (i.e., time discretization)

– Quantization (i.e., value discretization)

$size = s_{rate} \cdot T \cdot s_{size}$

where $s_{rate}$ is the sampling rate, $T$ is the duration, and $s_{size}$ is the sample size
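The two steps, sampling and quantization, can be sketched in Python. This is a minimal illustration under stated assumptions: the signal is a function of time with values in [-1, 1], quantization is uniform, and `sample_and_quantize` and `tone` are hypothetical names.

```python
import math


def sample_and_quantize(signal, duration, s_rate, bits):
    """Sample `signal` at `s_rate` samples per second and quantize each sample.

    Returns a list of integer codes in the range [0, 2**bits - 1].
    """
    n_levels = 1 << bits
    n_samples = int(duration * s_rate)
    codes = []
    for k in range(n_samples):
        t = k / s_rate                    # sampling: time discretization
        scaled = (signal(t) + 1.0) / 2.0  # map [-1, 1] onto [0, 1]
        # quantization: value discretization (1.0 is folded into the top level)
        codes.append(min(int(scaled * n_levels), n_levels - 1))
    return codes


def tone(t):
    """Hypothetical test signal: a 1 Hz sine wave."""
    return math.sin(2 * math.pi * t)


codes = sample_and_quantize(tone, duration=1.0, s_rate=8, bits=3)
print(codes)
print(len(codes) * 3, "bits")  # sampling rate * duration * sample size = 8 * 1 * 3
```

One second sampled 8 times with 3-bit samples occupies 24 bits, in agreement with the size formula.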

Audio: time series

[Figure: audio signal as a time series; axes: time, value]

$size = s_{rate} \cdot T \cdot s_{size} = s_{rate} \cdot T \cdot \log_2 n_{lev}$

Video

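$size = s_{rate} \cdot T \cdot s_{size} = s_{rate} \cdot T \cdot \log_2 n_{col} \cdot n_x \cdot n_y$

$s_{rate}$ = frame rate

$n_{col}$ = number of colors

$n_x \cdot n_y$ = frame size

[Figure: video as a sequence of frames along the time axis; each frame is a grid of nx by ny colored pixels]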
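For video, each sample is a whole frame, so the sample size is log2(ncol) · nx · ny bits. A short Python sketch of the resulting arithmetic follows; `video_size_bits` is a hypothetical helper, the clip parameters are made-up illustration values, and rounding the bits per pixel up to a whole bit is an assumption.

```python
import math


def video_size_bits(s_rate: float, duration: float, n_col: int, n_x: int, n_y: int) -> int:
    """size = s_rate * T * log2(n_col) * n_x * n_y for an uncompressed video."""
    frame_bits = math.ceil(math.log2(n_col)) * n_x * n_y  # sample size: one frame
    n_frames = round(s_rate * duration)
    return n_frames * frame_bits


# Hypothetical clip: 10 s at 25 frames/s, 640x480 pixels, 2**24 colors.
bits = video_size_bits(25, 10, 2 ** 24, 640, 480)
print(bits, "bits")           # 1843200000 bits
print(bits / 8 / 1e6, "MB")   # 230.4 MB
```

Even a short uncompressed clip runs to hundreds of megabytes, which is why video is almost always stored compressed.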

Redundancy

• Redundant encoding: encoding that makes use of more than the minimum number of digits required by an exact encoding

$N > \log_S M$

• Motivations for redundancy:

– Providing more expressive/natural encoding/decoding rules

– Reliability (error detection)

Ex: parity encoding

– Noise immunity / fault tolerance (error correction)

Ex: triplication

Redundancy: examples

• Parity encoding:

– A parity bit is used to guarantee that all codewords have an even number of 1’s

– Single errors are detected by means of a parity check

[Figure: the irredundant codeword 0010 is encoded as 00101, whose parity check yields 0; with a single error the codeword becomes 01101 and the parity check yields 1 (error)]

• Triple redundancy:

– Each character is repeated 3 times

– Single errors are corrected by means of majority voting

[Figure: 0010 is encoded as 000000111000; with a single error the codeword becomes 000000111010 and the voting result is still 0 0 1 0]
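Both schemes are short enough to try out in Python on the slide's own codeword 0010. The function names below are hypothetical, and bit strings are used for readability rather than efficiency.

```python
def add_parity(word: str) -> str:
    """Append a parity bit so that the codeword has an even number of 1s."""
    return word + str(word.count("1") % 2)


def parity_check(codeword: str) -> int:
    """Return 0 if the number of 1s is even, 1 if an error is detected."""
    return codeword.count("1") % 2


def triplicate(word: str) -> str:
    """Repeat each bit three times."""
    return "".join(bit * 3 for bit in word)


def majority_vote(codeword: str) -> str:
    """Decode by majority voting over each group of three bits."""
    groups = (codeword[i:i + 3] for i in range(0, len(codeword), 3))
    return "".join("1" if g.count("1") >= 2 else "0" for g in groups)


print(add_parity("0010"))             # 00101
print(parity_check("00101"))          # 0
print(parity_check("01101"))          # 1 (error detected)
print(triplicate("0010"))             # 000000111000
print(majority_vote("000000111010"))  # 0010
```

Parity detects the single error but cannot tell which bit flipped; triplication spends three times as many bits and in exchange corrects any single error within a group.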

Compression

• Lossy compression

– Compression achieved at the cost of reducing the accuracy of the representation

– The original representation cannot be restored

– Always effective

• Lossless compression

– Compression achieved by either removing redundancy or leveraging content-specific opportunities

– The original representation can be restored

– Not always effective
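The "not always effective" caveat for lossless compression can be observed with Python's standard `zlib` module. This is an illustrative experiment rather than part of the lecture: the two inputs are made up, one highly redundant and one pseudo-random.

```python
import random
import zlib

redundant = b"0010" * 250                               # 1000 bytes of a repeated pattern
rng = random.Random(0)                                  # fixed seed, for a repeatable run
noise = bytes(rng.getrandbits(8) for _ in range(1000))  # 1000 pseudo-random bytes

packed_redundant = zlib.compress(redundant)
packed_noise = zlib.compress(noise)

# Lossless: decompression restores the original representation exactly.
assert zlib.decompress(packed_redundant) == redundant
assert zlib.decompress(packed_noise) == noise

print(len(redundant), "->", len(packed_redundant))  # far fewer bytes
print(len(noise), "->", len(packed_noise))          # little or no gain
```

The repeated pattern shrinks to a small fraction of its size, while the pseudo-random bytes have no redundancy to remove and do not get meaningfully smaller.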