Physics of information
‘Communication in the presence of noise’
C.E. Shannon, Proc. Inst. Radio Eng. (1949)
‘Some informational aspects of visual perception’, F. Attneave, Psych. Rev. (1954)
Ori Katz ([email protected])
Talk overview
• Information capacity of a physical channel
• Redundancy, entropy and compression
• Connection to biological systems
Emphasis: concepts, intuitions, and examples
A little background
• An extension of “A mathematical theory of communication” (1948)
• The basis for the field of information theory (first use in print of ‘bit’)
• Shannon worked for Bell Labs at the time
• His Ph.D. thesis, “An algebra for theoretical genetics”, was never published
• Built the first juggling machine (‘W.C. Fields’) and a mechanical mouse with learning capabilities (‘Theseus’)
[Photos: ‘W.C. Fields’ and ‘Theseus’]
A general communication system
Shannon’s route for this abstract problem:
1) Encoder codes each message → a continuous waveform s(t)
2) Sampling theorem: s(t) is represented by a finite number of samples
3) Geometric representation: the samples → a point in Euclidean space
4) Analyze the addition of noise (physical channel)
→ a limit on the reliable transmission rate
[Block diagram: Information source → ‘message’ → Encoder/Transmitter → continuous function s(t) → Physical channel (bandwidth W, added noise +n(t)) → Receiver/Decoder → ‘message’ → Information destination; in our example s(t) = pressure amplitude]
The (Nyquist/Shannon) sampling theorem
[Figure: s(t) and its samples at times Δt, 2Δt, 3Δt, …]
Vn = [s(Δt), s(2Δt), …]
S(f) = ∫ s(t)·e^(−2πift) dt
• Transmitted waveform = a continuous function in time, s(t), with bandwidth (W) limited by the physical channel: S(f>W)=0
• Sample its values at discrete times Δt = 1/fs (fs = sampling frequency)
• s(t) can be represented exactly by the discrete samples Vn as long as:
fs ≥ 2W (the Nyquist sampling rate)
• Result: a waveform of duration T is represented by 2WT numbers
= a vector in a 2WT-dimensional space:
V = [s(1/2W), s(2/2W), …, s(2WT/2W)]
[Figure: Fourier (freq.) domain: S(f>W) = 0]
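A minimal numeric sketch of the theorem (Python/NumPy here; the deck’s own demos use Matlab, and all parameter values below are illustrative choices, not from the slides): a signal band-limited to W and sampled at fs = 2W is rebuilt from its samples alone by Whittaker–Shannon (sinc) interpolation.

    import numpy as np

    W = 20.0                    # channel bandwidth (Hz), illustrative
    fs = 2 * W                  # Nyquist sampling rate
    T = 1.0                     # duration (s)  ->  2WT = 40 samples
    n = np.arange(int(fs * T))  # sample indices

    def s(t):
        # band-limited test signal: tones at 5 Hz and 18 Hz, both below W
        return np.sin(2 * np.pi * 5 * t) + 0.5 * np.cos(2 * np.pi * 18 * t)

    V = s(n / fs)               # the vector Vn = [s(Δt), s(2Δt), ...]

    # sinc interpolation: s(t) = sum_n V[n] * sinc(fs*t - n)
    t = np.linspace(0.2, 0.8, 500)  # interior points (a finite window has edge effects)
    s_rec = (V[:, None] * np.sinc(fs * t[None, :] - n[:, None])).sum(axis=0)
    print(np.max(np.abs(s_rec - s(t))))  # small; shrinks further as T grows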
An example for the Nyquist rate – a music CD
Anecdotes:
• Exact rate was inherited from late 70’s magnetic-tape storage conversion devices.
• Long debate between Philips (44,056 samples/sec) and Sony (44,100 samples/sec)...
• Audible human-ear frequency range: 20Hz - 20KHz
• The Nyquist rate is therefore: 2 x 20KHz = 40KHz
• CD sampling rate = 44.1KHz, fulfilling Nyquist rate.
The geometric representation
• Each continuous signal s(t) of duration T and bandwidth W is mapped to
a point in a 2WT-dimensional space (coordinates = sampled amplitudes):
V = [x1,x2,…, x2WT] = [s(1/2W), …, s(2WT/2W)]
In our example:
A 1-hour CD recording → a single point in a space having:
44,100 × 60 sec × 60 min = 158.8×10^6 dimensions (!!)
• The norm (distance²) in this space measures the signal power / total energy → a Euclidean space metric:
d² = Σn xn² = 2W·∫ s²(t) dt = 2W·E = 2WT·P
(E = total energy, P = average power)
Addition of noise in the channel
• Example in a 3-dimensional space (first 3 samples of the CD):
V = [x1,x2,…, x2WT] = [s(Δt), s(2Δt), …, s(T)]
[Figure: “mapping” of the first three samples to a point (x1, x2, x3)]
• Addition of white Gaussian (thermal) noise with an average power N smears each point into a spherical “cloud” of radius ∝ √N
• For large T the noise power → N (a statistical average) → the received point is located on a thin sphere shell, at a distance ∝ √N from the sent point
→ the “clouded” sphere of uncertainty becomes rigid
[Figure: uncertainty spheres around the signal points (signal power P, noise power N)]
VS+N = [s(Δt)+n(Δt), s(2Δt)+n(2Δt), …, s(T)+n(T)]
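This “sphere hardening” can be checked numerically in a few lines (Python/NumPy; an illustrative addition, not from the slides, with dim standing in for 2WT):

    import numpy as np

    rng = np.random.default_rng(0)
    N = 1.0                           # noise power per coordinate (variance), illustrative
    for dim in (10, 1_000, 100_000):  # dim plays the role of 2WT
        noise = rng.normal(0.0, np.sqrt(N), size=(500, dim))
        radii = np.linalg.norm(noise, axis=1)
        # mean radius ~ sqrt(dim*N); the relative spread of the shell shrinks with dim
        print(dim, radii.mean() / np.sqrt(dim * N), radii.std() / radii.mean())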
The number of distinguishable messages
• Reliable transmission: the receiver must distinguish between any two different messages, under the given noise conditions
[Figure: message points with non-overlapping noise spheres in the (x1, x2, x3) space]
• Max number of distinguishable messages (M) → the ‘sphere-packing’ problem in 2WT dimensions:
M ≤ accessible volume / sphere volume
= Volume{sphere with a radius √(P+N)} / Volume{sphere with a radius √N}
= ((P+N)/N)^TW
• The longer the mapped message, the more ‘rigid’ the spheres → the probability to err is as small as one wants (reliable transmission)
The channel capacity
• Number of distinguishable messages (coded as signals of length T):
M ≤ ((P+N)/N)^TW
• Number of different distinguishable bits:
#bits = log₂M = TW·log₂((P+N)/N)
• The reliably transmittable bit-rate (bits per unit time):
C = #bits/T = W·log₂(1 + P/N)  (in bits/second)
(W = channel bandwidth, P/N = signal-to-noise ratio, SNR)
The celebrated ‘channel capacity theorem’ by Shannon.
- Also proved that C can be reached
[Figure: power spectrum |S(f)|² vs. frequency]
Gaussian white noise = Thermal noise?
• With no signal, the receiver measures a fluctuating noise
• In our example: pressure fluctuations of air molecules impinging on the microphone (thermal energy ~ KT/2 per degree of freedom)
• The statistics of thermal noise is Gaussian: P{s(t)=v} ∝ exp(−(m/2KT)·v²)
• The power spectral density is constant: power spectrum |S(f)|² = const → “white”
[Figure: noise amplitude (pressure) vs. time, with its Gaussian histogram P{s=v}; a flat “white” power spectrum vs. “pink/brown” spectra]
Some examples for physical channels
Channel capacity limit: C = W·log₂(1 + P/N)  (in bits/second)
1) Speech (e.g. this lecture):
W = 20KHz, P/N ≈ 1–100 → C ≈ 20,000 bps – 130,000 bps
Actual bit-rate ≈ (2 words/sec) × (5 letters/word) × (5 bits/letter) = 50 bps
2) Visual sensory channel: (images/sec) × (receptors/image) × (two eyes):
Bandwidth (W) ≈ ~25 × ~50×10^6 × 2 ≈ 2.5×10^9 Hz, P/N > 256
→ C ≈ 2.5×10^9 × log₂(256) ≈ 20×10^9 bps
A two-hour movie: 2 hours × 60 min × 60 sec × 20 Gbps ≈ 1.4×10^14 bits ≈ 18,000 Gbytes (DVD = 4.7 Gbyte)
• We’re not using the channel capacity → redundant information
• Simplify processing by compressing the signal
• Extract only the essential information (what is essential…?!)
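These back-of-the-envelope numbers follow directly from the capacity formula; a tiny sketch (Python; an illustrative addition, not part of the deck):

    import math

    def capacity_bps(W_hz, snr):
        """Shannon capacity C = W * log2(1 + P/N), in bits per second."""
        return W_hz * math.log2(1 + snr)

    print(capacity_bps(20e3, 1))     # speech channel, P/N = 1     -> 20,000 bps
    print(capacity_bps(20e3, 100))   # speech channel, P/N = 100   -> ~133,000 bps
    print(capacity_bps(2.5e9, 255))  # visual channel, 1+P/N = 256 -> ~2.0e10 bps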
Redundant information demonstration (using Matlab)
[Figure: waveform amplitude vs. sample number, at 16 bits/sample]
Original sample: 44.1 Ks/s × 16 bits/sample = 705.6 Kbps (CD quality)
[Figure: the same waveform at 4 bits/sample]
With only 4 bits per sample: 44.1 Ks/s × 4 bits/sample = 176.4 Kbps
[Figure: the same waveform at 3 bits/sample]
With only 3 bits per sample: 44.1 Ks/s × 3 bits/sample = 132.3 Kbps
[Figure: the same waveform at 2 bits/sample]
With only 2 bits per sample: 44.1 Ks/s × 2 bits/sample = 88.2 Kbps
[Figure: the same waveform at 1 bit/sample]
With only 1 bit per sample (!): 44.1 Ks/s × 1 bit/sample = 44.1 Kbps
Sounds not-too-good, but the essence is there…
Main reason: not all of ‘phase-space’ is accessible by the mouth/ear
Another example: a (smart) high-compression mp3 algorithm @ 16Kbps [embedded audio clip]
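The bit-depth demo above is easy to reproduce; a minimal sketch (Python/NumPy with SciPy for the .wav I/O; the deck used Matlab, and the file names here are hypothetical):

    import numpy as np
    from scipy.io import wavfile

    rate, audio = wavfile.read('speech.wav')  # hypothetical 16-bit mono recording
    x = audio.astype(np.float64) / 32768.0    # normalize to [-1, 1)

    def requantize(x, bits):
        # mid-riser quantizer: 2**bits uniform levels spanning [-1, 1)
        half = 2 ** (bits - 1)
        return (np.floor(x * half) + 0.5) / half

    for bits in (4, 3, 2, 1):
        y = requantize(x, bits)               # e.g. 4 bits -> 44.1 Ks/s x 4 = 176.4 Kbps
        wavfile.write(f'speech_{bits}bit.wav', rate, (y * 32767).astype(np.int16))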
Visual redundancy / compression
• Images: redundancies in Attneave’s paper → image compression formats
[Figure: Attneave’s “a bottle” on “a table” (1954), 80x50 pixels, next to a photo (2008), 400x600 pixels, saved as a 704Kbyte .bmp and as .jpg files of 30.6, 10.9, 8, 6.3, 5 and 4 Kbyte]
- edges
- short-range similarities
- patterns
- repetitions
- symmetries
- etc., etc….
• Movies: the same + consecutive images are similar…
• Text: a future ‘language’ lesson (Lilach & David)
What information is essential?? (evolution…?)
How many bits are needed to code a message?
• Intuitively: #bits = log₂M (M = the number of possible messages)
• Regularities/lawfulness → smaller M
• Some messages are more probable → can do better than log₂M
• Intuition: shorter bit-strings can be used for the more probable messages
How much can we compress?
• A message can be coded, without loss of information, with:
⟨bits/message⟩ = −Σi p(Mi)·log₂ p(Mi)  (the ‘Source Entropy’)
Lossless-compression example (entropy code)
Example: M = 4 possible messages (e.g. tones):
‘A’ (94%), ‘B’ (2%), ‘C’ (2%), ‘D’ (2%)
1) Without compression: 2 bits/message:
‘A’→00, ‘B’→01, ‘C’→10, ‘D’→11
2) A better code:
‘A’→0, ‘B’→10, ‘C’→110, ‘D’→111
⟨bits/message⟩ = 0.94×1 + 0.02×2 + 2×(0.02×3) = 1.1 bits/msg
Source entropy = −Σi p(Mi)·log₂ p(Mi) = −0.94·log₂(0.94) − 3×0.02·log₂(0.02) ≈ 0.42 bits
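A few lines verify both numbers (Python; an illustrative addition):

    import math

    p = {'A': 0.94, 'B': 0.02, 'C': 0.02, 'D': 0.02}
    code = {'A': '0', 'B': '10', 'C': '110', 'D': '111'}  # the prefix code above

    avg_len = sum(p[m] * len(code[m]) for m in p)          # 1.10 bits/message
    entropy = -sum(q * math.log2(q) for q in p.values())   # ~0.42 bits, the lower bound
    print(avg_len, entropy)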
Why entropy?
⟨bits/message⟩ = −Σi p(Mi)·log₂ p(Mi) ≡ H
• The only measure that fulfills 4 ‘physical’ requirements:
1. H = 0 if P(Mi) = 1
2. A message with P(Mi) = 0 does not contribute
3. Maximum entropy for equally-distributed messages
4. Addition of two independent message-spaces: Hx+y = Hx + Hy
Any regularity → probable patterns → lower entropy (redundant information)
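Requirement 4 is easy to check numerically (Python; the two distributions are illustrative, not from the slides):

    import math

    def H(p):
        # Shannon entropy in bits; terms with probability 0 contribute nothing
        return -sum(q * math.log2(q) for q in p if q > 0)

    px = [0.5, 0.25, 0.25]
    py = [0.9, 0.1]
    pxy = [a * b for a in px for b in py]  # joint distribution of independent sources
    print(H(px) + H(py), H(pxy))           # equal: Hx+y = Hx + Hy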
The speech Vocoder (VOice-CODer)
• Model the vocal tract with a small number of parameters
• Lawfulness of speech → only a subspace is used → fails only for musical input
• Used by Skype / Google-talk / GSM (~8-15 Kbps)
• The ancestor of modern speech CODECs (COder-DECoders)
[Figure: ‘The Human organ’]
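The core idea can be sketched with linear prediction, which fits a handful of all-pole “vocal-tract” parameters per frame (Python/NumPy; a generic LPC illustration under that assumption, not the deck’s actual codec):

    import numpy as np

    def lpc(frame, order=10):
        # fit s[n] ~ sum_k a[k]*s[n-k] via the Yule-Walker (autocorrelation) equations
        r = np.correlate(frame, frame, mode='full')[len(frame) - 1:]
        R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
        return np.linalg.solve(R, r[1:order + 1])

    # a 20 ms frame at 8 kHz (160 samples) is summarized by just 10 numbers
    frame = np.random.default_rng(0).normal(size=160)  # stand-in for a real speech frame
    print(lpc(frame))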
Link to biological systems
• Information is conveyed via a physical channel: cell to cell, DNA to cell, cell to its descendant, neurons/the nervous system
• The physical channel: concentrations of molecules (mRNA, ions…) as a function of space and time
• Bandwidth limit: parameters cannot change at an infinite rate (diffusion, chemical-reaction timescales…)
• Signal to noise: thermal fluctuations, environment
• Major difference: transmission is not 100% reliable → model: an overlap of non-rigid uncertainty clouds
• Use the channel-capacity theorem at your own risk...
Summary
• Physical channel → capacity theorem
• SNR, bandwidth
• Geometrical representation
• Entropy as a measure of redundancy
• Link to biological systems