John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

Page 1: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

Theory of Complex Networks

John Doyle
Control and Dynamical Systems, Caltech

Page 2: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

Our lives are run by/with networks

[Diagram: Finance, Transportation, Energy, Information, Consumer, Utilities, Manufacturing, Commerce, Health, Emergency]

Page 3: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

[Diagram: Convergent Networks at the center, linking Transportation, Energy, Information, Consumer, Manufacturing, Commerce, Health, Environment, Emergency, Finance, and Utilities]

Page 4: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

Convergent networking: the promise

Ubiquitous computing, communications, and control
• that is embedded and intertwined
• via sensors and actuators
• in complex networks of networks, with layers of protocols and feedback.

Resulting in:

• Seamless integration and automation of everything

• Efficient and economic operation

• Robust and reliable services

[Diagram: Convergent Networks linking Transportation, Energy, Information, Consumer, Manufacturing, Commerce, Health, Environment, Emergency, Finance, and Utilities]

Page 5: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

Convergent networking: the reality

• Right now, back in Los Angeles, we can experience (in addition to smog, earthquakes, fires, floods, riots, lawyers, …)
– Widespread and prolonged power outages from lightning strikes in Washington (or just “nonequilibrium market fluctuations”).
– Widespread and prolonged flight delays from weather or ATC software glitches in Chicago or Atlanta.
– Internet meltdowns caused by hackers in Moscow.
– Financial meltdowns caused by brokers in Singapore.
• What can we expect?
– Widespread and prolonged meltdowns of integrated power, transportation, communication, and financial networks caused by lightning strikes in Singapore or a new release of MS Windows 2020?

Page 6: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

Elements of systems

• Sense the environment and internal state

• Extract what’s novel

• Communicate or store what’s novel

• Extract what’s useful

• Compute decisions based on what’s useful

• Take action

• Evaluate consequences

• Repeat

Data
is not novel information
is not useful information
is not knowledge
is not understanding
is not wisdom
is not action
is not results
(HARDER at each step)

We want results

Page 7: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

Two great abstractions of the 20th Century

1. Separate systems engineering into control, communications, and computing
– Theory
– Applications
2. Separate systems from physical substrate
• Facilitated massive, wildly successful, and explosive growth in both mathematical theory and technology…
• …but created a new Tower of Babel where even the experts do not read papers or understand systems outside their subspecialty.

Page 8: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

Tower of Babel

• Issues for theory
– Rigor
– Relevance
– Accessibility
• Spectacular success on the first two
• Little success on the last one, which is critical for a multidisciplinary approach to systems biology
• Perhaps achieving all three is impossible?
• (In contrast, there are whole research programs in “complex systems” devoted exclusively to accessibility. They have been relatively “popular,” but can be safely ignored in biology.)

Page 9: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

Biology and advanced technology

• Biology
– Integrates control, communications, computing
– Into distributed control systems
– Built at the molecular level
• Advanced technologies will do the same
• We need new theory and math, plus unprecedented connection between systems and devices
• Two challenges for greater integration:
– Unified theory of systems
– Multiscale: from devices to systems

Page 10: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

Communications and computing

[Diagram: Compute and Store blocks connected by Communicate links]

Page 11: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

[Diagram: Compute and Store blocks connected by Communicate links, now coupled to the Environment through Sense and Act]

Page 12: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

[Diagram: Computation, Communication, and Control connected through Devices to Dynamical Systems]

Page 13: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

From
• Software to/from human
• Human in the loop

To
• Software to software
• Full automation
• Integrated control, comms, computing
• Closer to physical substrate

[Diagrams: Compute / Communicate / Store, and Computation / Communication / Control connected through Devices to Dynamical Systems]

• New capabilities & robustness
• New fragilities & vulnerabilities

Page 14: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

Theoretical foundations

• Computational complexity: decidability, P-NP-coNP

• Information theory: source and channel coding

• Control theory: feedback, optimization, games

• Dynamical systems: dynamics, bifurcation, chaos

• Statistical physics: phase transitions, critical phenomena

• Unifying theme: uncertainty management

• Different abstractions and relaxations

• Integrating these theories involves new math, much of it not traditionally viewed as “applied,” e.g.:
– Perturbation theory of operator Banach algebras
– Semi-algebraic geometry

Page 15: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

Uncertainty management

• Each domain faces similar abstract issues and tradeoffs, but with differing details:

• Sources of uncertainty

• Limited resources

• Robust strategies

• Fundamental tradeoffs

• Ignored issues

Page 16: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

Control theory

• Sources of uncertainty: plant uncertainty and sensor noise

• Limited resources: sensing, actuation, and computation

• Robust strategies: feedback control and related methods

• Fundamental tradeoffs: Bode’s integral formula, RHP zeros, saturations, …

• Ignored issues: communications in distributed control, software reliability

Page 17: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

Information theory

• Sources of uncertainty: source and channel

• Limited resources: storage, bandwidth, and computation

• Robust strategies: coding

• Fundamental tradeoffs: capacity, rate-distortion

• Ignored issues: feedback and dynamics

Page 18: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

Computational complexity

• Sources of uncertainty: intractability, problem instance

• Limited resources: computer time and space

• Robust strategies: algorithms

• Fundamental tradeoffs: P/NP/Pspace/undecidable

• Ignored issues: real-time, uncertainty in physical systems

Page 19: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

Software correctness

• Sources of uncertainty: bugs, user inputs

• Limited resources: computer time and space

• Robust strategies: formal verification

• Fundamental tradeoffs: computational complexity

• Ignored issues: real-time, uncertainty in physical systems

Page 20: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

Multiscale physics

• Sources of uncertainty: initial conditions, unmodeled dynamics, quantum mechanics

• Limited resources: computer time and space, measurements

• Robust strategies: coarse graining, renormalization??

• Fundamental tradeoffs: energy/matter, entropy, quantum, etc…

• Ignored issues: robustness, rigor, computation, etc.
• (This looks mostly fixable.)

Page 21: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

Unified theory of uncertainty management

• Sources of uncertainty: plant, multiscale physics, sensors, channels, bugs, user inputs

• Limited resources: computer time and space, energy, materials, bandwidth, actuation

• Robust strategies: ??

• Fundamental tradeoffs: ??

• Ignored issues: human factors

Page 22: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

Progress

• Unified view of web and internet protocols
– Good place to start
– Add feedback and dynamics to communications
– Observations: fat tails (Willinger)
– Theory: Source coding and web layout (Doyle)
– Theory: Channel coding and congestion control (Low)
• Unified view of robustness and computation
– Anecdotes from engineering and biology
– New theory (especially Parrilo)
– Not enough time today…

Page 23: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

Bonus!

• “Unified systems” theory helps resolve fundamental unresolved problems at the foundations of physics

• Ubiquity of power laws (statistical mechanics)
• Shear flow turbulence (fluid dynamics)
• Macro dissipation and thermodynamics from micro reversible dynamics (statistical mechanics)
• Quantum-classical transition
• Quantum measurement
• Thus the new mathematics for a unified theory of systems is directly relevant to multiscale physics
• The two challenges (unify and multiscale) are connected.

Page 24: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

Network protocols

[Diagram: protocol stack HTTP / TCP / IP / Routers; files are carried as packets]

Page 25: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

Web/internet traffic

[Diagram: web traffic from web servers is streamed out on the net to a web client, creating internet traffic]

Page 26: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

[Diagram: web traffic from web servers is streamed out on the net to a web client, creating internet traffic]

Let’s look at some web traffic

Page 27: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

[Plot: cumulative frequency vs. size of events, log (base 10) axes, decimated data; size from −6 to 2, frequency from −1 to 6. Series: data compression (Huffman); WWW files, Mbytes (Crovella); forest fires, 1000 km² (Malamud), with the Los Alamos fire marked.]

Page 28: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

[Plot: cumulative frequency vs. size of events, log (base 10) axes. Series: codewords, web files, fires; reference slopes −1/2 and −1.]

Page 29: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

[Plot: cumulative frequency vs. size of events, log (base 10) axes, decimated data. Series: data compression (Huffman); WWW files, Mbytes (Crovella), >1e5 files; forest fires, 1000 km² (Malamud), >4e3 fires, with the Los Alamos fire marked.]

Page 30: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

20th Century’s 100 largest disasters worldwide

[Plot: log-log axes spanning 10^-2 to 10^0 and 10^0 to 10^2. Series: US power outages (10M of customers), Natural ($100B), Technological ($10B).]

Page 31: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

[Plot: the 100 largest disasters on log-log axes; Log(size) vs. Log(cumulative frequency) = Log(rank).]

Page 32: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

[Plot: rank (0 to 100) vs. size (0 to 14) on linear axes. Series: Natural ($100B), Technological ($10B).]

Page 33: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

[Plot: the same rank vs. size data replotted on log-log axes; Log(rank) from 10^0 to 10^2, Log(size) from 10^-2 to 10^0.]

Page 34: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

20th Century’s 100 largest disasters worldwide

[Plot: log-log axes. Series: US power outages (10M of customers), Natural ($100B), Technological ($10B). Reference line: slope = −1 ($\alpha = 1$).]

Page 35: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

[Plot: cumulative frequency vs. size of events, log (base 10) axes, decimated data. Series: data compression; WWW files, Mbytes; forest fires, 1000 km². Reference slopes −1/2 and −1.]

Page 36: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

[Plot: cumulative frequency vs. size of events, log (base 10) axes. Series: data compression (exponential); WWW files, Mbytes; forest fires, 1000 km². Reference slopes −1/2 and −1.]

Page 37: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

Plotting power laws and exponentials

[Plots: the same three curves (labeled .5, 1, and exp) drawn on loglog, semilogy, and linear axes.]
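The point of comparing loglog and semilogy axes can be checked numerically. The sketch below is an illustration only: the exponents 0.5 and 1 come from the curve labels on the slide, while the decay rate 0.05 and the sample points are assumptions chosen for the example. It shows that a power law has a constant slope on log-log axes, and an exponential has a constant slope on semilog axes but not on log-log axes.

```python
import math


def slopes(xs, ys):
    """Slopes of the straight segments joining successive (x, y) points."""
    return [(ys[k + 1] - ys[k]) / (xs[k + 1] - xs[k]) for k in range(len(xs) - 1)]


def power_law(x, alpha):
    """Power law x**(-alpha)."""
    return x ** (-alpha)


def exponential(x, rate):
    """Exponential exp(-rate * x); the rate is an assumed illustrative value."""
    return math.exp(-rate * x)


decades = [1.0, 10.0, 100.0, 1000.0]
linear_grid = [20.0, 40.0, 60.0, 80.0, 100.0]

# On log-log axes a power law is a straight line with slope -alpha.
loglog_power = slopes(
    [math.log10(x) for x in decades],
    [math.log10(power_law(x, 0.5)) for x in decades],
)

# On semilog axes (log of y, linear x) an exponential is a straight line.
semilog_exp = slopes(linear_grid, [math.log(exponential(x, 0.05)) for x in linear_grid])

# On log-log axes the same exponential bends downward (slope keeps decreasing).
loglog_exp = slopes(
    [math.log10(x) for x in decades[:3]],
    [math.log10(exponential(x, 0.05)) for x in decades[:3]],
)

print("power law, loglog slopes:", loglog_power)
print("exponential, semilog slopes:", semilog_exp)
print("exponential, loglog slopes:", loglog_exp)
```

Which axes make a data set look straight is therefore a quick first diagnostic, although it does not by itself establish the underlying mechanism.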

Page 38: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

[Plot: cumulative frequency vs. size of events, log (base 10) axes. Series: data compression (exponential); WWW files, Mbytes; forest fires, 1000 km².]

Exponential: all events are close in size.

Page 39: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

[Plot: cumulative frequency vs. size of events, log (base 10) axes. Series: data compression; WWW files, Mbytes; forest fires, 1000 km². Reference slopes −1/2 and −1.]

Most events are small
But the large events are huge

Page 40: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

[Plot: cumulative frequency vs. size of events, log (base 10) axes. Series: data compression; WWW files, Mbytes; forest fires, 1000 km². Reference slopes −1/2 and −1.]

Most events are small (Robust)
But the large events are huge (Yet Fragile)

Page 41: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

Robustness of HOT systems

[Diagram: robust and fragile regimes plotted against uncertainties]
• Robust (to known and designed-for uncertainties)
• Fragile (to unknown or rare perturbations)

Page 42: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

Large-scale phenomena are extremely non-Gaussian

• The microscopic world is largely exponential

• The laboratory world is largely Gaussian because of the central limit theorem

• Large-scale phenomena have heavy tails (fat tails) and power laws

Page 43: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

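Size of events x vs. frequency

[Plots against log(size):]
• log(probability): noncumulative density $p(x) = dP/dx \propto x^{-(\alpha+1)}$
• log(Prob > size) = log(rank): cumulative distribution $P \propto x^{-\alpha}$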

Page 44: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

[Plot: log10(P) (0 to −4) vs. log10(x) (−1 to 5) for 1e3 samples, x integer, from a known power-law distribution $P(X \ge x) \propto x^{-\alpha}$ with $\alpha = 1$.]

Page 45: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

Cumulative distributions: $P(X \ge x)$, slope = $-\alpha$ (curves for $\alpha = 0$ and $\alpha = 1$)

Noncumulative densities: $p(x) = dP/dx$, slope = $-(\alpha + 1)$

Page 46: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

[Plots of the same samples:]
• Cumulative distributions: $\alpha = 1$ (correct)
• Noncumulative densities: $\alpha = 0$ (wrong)
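The cumulative (rank) estimate favored above can be sketched in a few lines. This is an illustration under stated assumptions, not the procedure used to make the slides: the sampler, the seed, the sample count of 1000 (matching the "1e3 samples" on the earlier slide), and the plain least-squares fit are all choices made for this example. Samples are drawn from a Pareto law with $\alpha = 1$, the empirical $P(X \ge x)$ is taken as rank/n, and the log-log slope is fitted.

```python
import math
import random


def pareto_samples(n, alpha=1.0, x_min=1.0, seed=0):
    """Inverse-transform samples with P(X >= x) = (x / x_min) ** (-alpha)."""
    rng = random.Random(seed)
    # 1 - rng.random() lies in (0, 1], which avoids raising 0 to a negative power.
    return [x_min * (1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(n)]


def rank_plot_points(samples):
    """Points (log10 size, log10 of rank / n): the empirical cumulative plot."""
    n = len(samples)
    ordered = sorted(samples, reverse=True)  # the largest event has rank 1
    return [(math.log10(x), math.log10(rank / n)) for rank, x in enumerate(ordered, start=1)]


def least_squares_slope(points):
    """Ordinary least-squares slope of y against x."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    return sxy / sxx


estimated_slope = least_squares_slope(rank_plot_points(pareto_samples(1000)))
print("cumulative log-log slope (expect roughly -alpha):", estimated_slope)
```

No binning is involved in the rank plot, which is one reason the cumulative form is less sensitive to plotting choices than a histogram of densities.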

Page 47: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

The physics view

• Power laws are “suggestive of criticality”
• Self-organized criticality (SOC)
• Examples where this holds:
– Phase transitions in lab experiments
– Percolation models
– Rice pile experiments
• No convincing examples in technology, biology, ecology, geophysics, or socio-economic systems
• Special case of “new science of complexity”
• Complexity “emerges” at a phase transition or bifurcation “between order and disorder.”
• Doesn’t work outside the lab.

Page 48: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

[Plot: Data + Model/Theory, log (base 10) axes. Series: DC, WWW, forest fire; SOC prediction $\alpha = .15$.]

Page 49: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

[Figure: SOC, $\alpha = .15$]

Page 50: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

[Plots: $\alpha = .15$, cumulative distributions; $\alpha = .15$, noncumulative densities with logarithmic binning.]

Page 51: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

[Plot: cumulative frequency vs. size of events, log (base 10) axes. Series: codewords, web files, fires; reference slopes −1/2 and −1.]

Page 52: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

The HOT (Highly Optimized Tolerance) view of power laws (w/ Jean Carlson, UCSB)

• The central limit theorem gives power laws as well as Gaussians
• Many other mechanisms (e.g., multiplicative noise) yield power laws
• A model producing a power law is per se uninteresting
• A model should say much more, and lead to new experiments and improved designs, policies, therapies, treatments, etc.

Page 53: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

The HOT view of power laws

• Engineers design (and evolution selects) for systems with certain typical properties:

• Optimized for average (mean) behavior

• Optimizing the mean often (but not always) yields high variance and heavy tails

• Power laws arise from heavy tails when there is enough aggregate data

• One symptom of “robust, yet fragile”

Page 54: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

HOT and fat tails?

• Surprisingly good explanation of statistics (given the severity of the abstraction)

• But statistics are of secondary importance

• Not mere curve fitting, insights lead to new designs

• Understanding design

Page 55: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

Examples of HOT fat tails?

• Power outages
• Web/Internet file traffic
• Forest fires
• Commercial aviation delays/cancellations
• Disk files, CPU utilization, …
• Deaths or dollars lost due to man-made or natural disasters?
• Financial market volatility?
• Ecosystem and species extinction events?
• Other mechanisms, examples?

(Supported by detailed simulations.)

Page 56: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

Examples with additional mechanisms?

• Word rank (Zipf’s law)

• Income and wealth of individuals and companies

• Citations, papers

• Social and professional networks

• City sizes

• Many others….

• (Simon, Mandelbrot, …)

Page 57: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

[Plot: Data for WWW and DC (data compression), log (base 10) axes.]

Page 58: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

[Plot: Data + Model/Theory for WWW and DC (data compression), log (base 10) axes.]

Page 59: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

[Plot: cumulative frequency vs. size of events, log (base 10) axes, decimated data; WWW files, Mbytes (Crovella).]

• Most files are small (mice)
• Most packets are in large files (elephants)

Page 60: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

[Diagram: sources send mice and elephants into the network, where they share router queues.]

Page 61: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

[Diagram: sources send mice and elephants into the network, where they share router queues.]

• Mice: delay sensitive
• Elephants: bandwidth sensitive

Page 62: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

[Plot: Log(delay) vs. Log(bandwidth), from cheap to expensive. BW = bandwidth sensitive traffic; Delay = delay sensitive traffic.]

• We’ll focus to begin with on similar tradeoffs in internetworking between bandwidth and delay.
• We’ll assume TCP (via retransmission) eliminates loss, and will return to this issue later.

Page 63: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

[Plot: Log(delay) vs. Log(bandwidth). BW: bulk transfers (most packets). Delay: web navigation, voice (most files).]

• Mice: many small files of few packets which the user presumably wants ASAP
• Elephants: few large files of many packets for which average bandwidth will be more important than individual packet delay
• Most files are mice but most packets are in elephants…
• …which is the manifestation of fat tails in the web and internet.
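A rough numerical illustration of "most files are mice but most packets are in elephants" follows. It is a sketch under assumptions that are not in the slides: file sizes are placed at evenly spaced quantiles of a Pareto law with $\alpha = 1$ (so the result is deterministic), there are 10,000 files, and the "elephants" are taken to be the largest 10% of files.

```python
import statistics


def pareto_quantile_sizes(n, alpha=1.0, x_min=1.0):
    """Idealized file sizes at evenly spaced quantiles of P(X >= x) = (x / x_min) ** (-alpha)."""
    return [x_min * ((k - 0.5) / n) ** (-1.0 / alpha) for k in range(1, n + 1)]


def share_in_largest(sizes, fraction):
    """Fraction of the total volume carried by the largest `fraction` of files."""
    ordered = sorted(sizes, reverse=True)
    top = ordered[: max(1, int(len(ordered) * fraction))]
    return sum(top) / sum(ordered)


sizes = pareto_quantile_sizes(10_000)
elephant_share = share_in_largest(sizes, 0.10)
median_size = statistics.median(sizes)
mean_size = sum(sizes) / len(sizes)
small_file_fraction = sum(1 for s in sizes if s < mean_size) / len(sizes)

print("share of volume in the largest 10% of files:", elephant_share)
print("median size:", median_size, "mean size:", mean_size)
print("fraction of files smaller than the mean:", small_file_fraction)
```

With a heavy tail the mean sits far above the median, so counting files and counting packets give very different pictures of the same traffic.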

Page 64: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

[Plot: Log(delay) vs. Log(bandwidth). BW: bulk transfers (most packets). Delay: web navigation, voice (most files).]

Claim I: Current traffic dominated by these two types of flows

Claim II: Intrinsic feature of many future network applications

Page 65: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

[Plot: Data for WWW and DC (data compression), log (base 10) axes.]

Page 66: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

[Plot: Data + Model/Theory for WWW and DC (data compression), log (base 10) axes.]

Page 67: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

[Plot: cumulative frequency vs. size of events, log (base 10) axes, decimated data; WWW files, Mbytes (Crovella).]

• Most files are small (mice)
• Most packets are in large files (elephants)

Page 68: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

[Plot: cumulative frequency vs. size of events, log (base 10) axes. Series: WWW files, Mbytes; data compression (exponential).]

Exponential: all events are close in size.

Page 69: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

Source coding for data compression

Based on frequencies of source word occurrences, select code words to minimize message length.

Page 70: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

Source coding for data compression

Objectives:
• Optimally compress file
• Tractable compression
• Tractable decompression

Shannon:
• Optimally compress ensemble
• Tractable compression
• Tractable decompression
• Surprise: natural and practical
• Stochastic relaxation

Kolmogorov:
• Optimally compress file
• Undecidable compression
• Intractable decompression
• Philosophically important
• Turing, Gödel, Chaitin, …

Page 71: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

Shannon coding

• Ignore value of information, consider only “surprise”
• Compress average codeword length (over stochastic ensembles of source words rather than actual files)
• Constraint on codewords of unique decodability
• Equivalent to building barriers in a zero dimensional tree
• Optimal distribution (exponential) and optimal cost are:

Data compression: length $l_i = -\log(p_i)$, so $p_i = \exp(-c\,l_i)$

Avg. length $= \sum_i p_i l_i = -\sum_i p_i \log(p_i)$

Page 72: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

Shannon source coding

Minimize expected length: $J = \sum_i p_i l_i$ subject to $\sum_i r_i \le 1$

• Source words with probabilities $p_i$
• Length of codewords $l_i$
• Unique decodability (Kraft’s inequality): $\sum_i 2^{-l_i} \le 1$, with $r_i = 2^{-l_i}$, i.e. $l_i = -\log(r_i)$

Page 73: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

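Kraft’s inequality = prefix-free code: $\sum_i 2^{-l_i} \le 1$, with $l_i = -\log(r_i)$ and $\sum_i r_i \le 1$

[Diagram: binary tree with nodes 0, 1, 10, 11, 100, 101, 110, 111, 1110, 1111, 11100, 11101, 11110, 11111]

Codewords: 0, 100, 101, 110, 11100, 11101, 11110, 11111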

Page 74: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

Kraft’s inequality = prefix-free code = cut in a 0-dim tree: $\sum_i 2^{-l_i} \le 1$, with $l_i = -\log(r_i)$ and $\sum_i r_i \le 1$

[Diagram: 0 dimensional (discrete) binary tree with nodes 0, 1, 10, 11, 100, 101, 110, 111, 1110, 1111, 11100, 11101, 11110, 11111]

Codewords: 0, 100, 101, 110, 11100, 11101, 11110, 11111
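The codeword set on this slide can be checked directly against Kraft’s inequality. The sketch below is an illustration with helper names invented for this example (`kraft_sum`, `is_prefix_free`); it uses exact fractions so the sum is not blurred by rounding.

```python
from fractions import Fraction

# Codewords read off the tree on the slide.
CODEWORDS = ["0", "100", "101", "110", "11100", "11101", "11110", "11111"]


def kraft_sum(codewords):
    """Sum of 2 ** (-length) over all codewords, computed exactly."""
    return sum(Fraction(1, 2 ** len(word)) for word in codewords)


def is_prefix_free(codewords):
    """True when no codeword is a prefix of a different codeword in the list."""
    for i, first in enumerate(codewords):
        for j, second in enumerate(codewords):
            if i != j and second.startswith(first):
                return False
    return True


print("Kraft sum:", kraft_sum(CODEWORDS))
print("prefix free:", is_prefix_free(CODEWORDS))
```

Here the sum equals 1, meaning the code uses the whole budget: every branch of the tree ends in a codeword, so no shorter prefix-free code with the same tree shape exists.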

Page 75: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

Coding = building barriers

• Source coding: Kraft’s inequality = prefix-free code, $\sum_i 2^{-l_i} \le 1$
• Channel coding: barriers against channel noise

Page 76: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

Control = building barriers

Page 77: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

Minimize $J = \sum_i p_i l_i$ subject to $\sum_i r_i \le 1$, with $l(r) = -\log(r)$.

Leads to optimal solutions for codeword lengths: $l_i = -\log(p_i)$.

With optimal cost $J = -\sum_i p_i \log(p_i)$.

Equivalent to optimal barriers on a discrete tree (zero dimensional).
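As a small worked check of $l_i = -\log(p_i)$ and the optimal cost $-\sum_i p_i \log(p_i)$, the sketch below takes logarithms base 2 and rounds lengths up to integers. Both choices, and the two example distributions, are assumptions made for this illustration. For dyadic probabilities the average length equals the entropy exactly; otherwise integer lengths cost less than one extra bit on average.

```python
import math


def shannon_lengths(probs):
    """Integer codeword lengths ceil(-log2 p_i)."""
    return [math.ceil(-math.log2(p)) for p in probs]


def entropy_bits(probs):
    """Optimal cost -sum p_i log2 p_i, in bits."""
    return -sum(p * math.log2(p) for p in probs)


def average_length(probs, lengths):
    """Expected codeword length sum p_i l_i."""
    return sum(p * l for p, l in zip(probs, lengths))


def kraft_sum(lengths):
    """Left-hand side of Kraft's inequality for the given lengths."""
    return sum(2.0 ** (-l) for l in lengths)


dyadic = [0.5, 0.25, 0.125, 0.125]   # hypothetical source with power-of-two probabilities
general = [0.4, 0.3, 0.2, 0.1]       # hypothetical source without that property

for probs in (dyadic, general):
    lengths = shannon_lengths(probs)
    print(lengths, average_length(probs, lengths), entropy_bits(probs), kraft_sum(lengths))
```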

Page 78: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

Minimize $J = \sum_i p_i l_i$ subject to $\sum_i r_i \le 1$, with $l(r) = -\log(r)$; the optimum is $l_i = -\log(p_i)$ with cost $J = -\sum_i p_i \log(p_i)$.

• Compressed files look like white noise.
• Compression improves robustness to limitations in resources of bandwidth and memory.
• Compression makes everything else much more fragile:
– Loss or errors in compressed file
– Statistics of source file
• Information theory also addresses these issues at the expense of (much) greater complexity

Page 79: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

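[Plot: Data for DC (data compression), log (base 10) axes; size from 0 to 2, frequency from −1 to 6.]

Model: length $l_i = -\log(p_i)$, so $p_i = \exp(-c\,l_i)$

Avg. length $= \sum_i p_i l_i = -\sum_i p_i \log(p_i)$

How well does the model predict the data?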

Page 80: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

[Plot: Data + Model for DC (data compression), log (base 10) axes; size from 0 to 2, frequency from −1 to 6.]

Model: length $l_i = -\log(p_i)$, so $p_i = \exp(-c\,l_i)$

Avg. length $= \sum_i p_i l_i = -\sum_i p_i \log(p_i)$

How well does the model predict the data?

Not surprising, because the file was compressed using Shannon theory.

Small discrepancy due to integer lengths.

Page 81: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

Why is this a good model?

• Lots of models will reproduce an exponential distribution

• Shannon source coding lets us systematically produce optimal and easily decodable compressed files

• Fitting the data is necessary but far from sufficient for a good model

Page 82: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

Web layout as generalized “source coding”

• Keep parts of Shannon abstraction:
– Minimize downloaded file size
– Averaged over an ensemble of user access
• Equivalent to building 0-dimensional barriers in a 1-dimensional tree of content

Page 83: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

A toy website model (= 1-d grid HOT design)

[Diagram: a document split into N files to minimize download time]

Page 84: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

# links = # files

Optimize 0-dimensional cuts in a 1-dimensional document

Page 85: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

More complete website models

(Zhu, Yu, Effros)

• Necessary for web layout design
• Statistics consistent with simpler models
• Improved protocol design (TCP)
• Commercial implications still unclear

Page 86: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

Generalized “coding” problems

[Diagram: Web and Data compression as generalized coding problems]

• Optimizing d−1 dimensional cuts in d dimensional spaces…
• To minimize average size of files
• Models of greatly varying detail all give a consistent story.
• Power laws have $\alpha \approx 1/d$.
• Completely unlike criticality.

Page 87: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

PLR optimization

Minimize expected loss: $J = \sum_i p_i l_i$ subject to $\sum_i r_i \le R$

P: uncertain events with probabilities $p_i$
L: with loss $l_i$
R: limited resources $r_i$

      P             L           R
DC    source        codewords   decodability
WWW   user access   files       web layout

Page 88: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

[Diagram: a document split into N files to minimize download time]

$l \propto r^{-1}$, where r = density of links or files and l = size of files

Page 89: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

[Diagram: d-dimensional region partitioned by barriers]

• $p_i$ = probability of event
• $l_i$ = volume enclosed
• $r_i$ = barrier density

Resource/loss relationship: $l_i \propto r_i^{-\beta}$, with $\beta = d$ (d-dimensional)

Page 90: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

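PLR optimization: minimize $J = \sum_i p_i l_i$ subject to $\sum_i r_i \le R$, with

$l(r) = \frac{c}{\beta}\left(r^{-\beta} - 1\right)$

$\beta = d$ = “dimension”: $\beta = 0$ data compression, $\beta = 1$ web layout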

Page 91: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

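PLR optimization: minimize $J = \sum_i p_i l_i$ subject to $\sum_i r_i \le R$, with

$l_\beta(r) = \begin{cases} -\log(r), & \beta = 0 \\ \frac{c}{\beta}\left(r^{-\beta} - 1\right), & \beta > 0 \end{cases}$

$\beta = 0$ (data compression) is Shannon source coding.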

Page 92: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

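Minimize average cost $J = \sum_i p_i l_i$ subject to $\sum_i r_i \le R$ using standard Lagrange multipliers, with

$l_\beta(r) = \begin{cases} -\log(r), & \beta = 0 \\ \frac{c}{\beta}\left(r^{-\beta} - 1\right), & \beta > 0 \end{cases}$

Leads to optimal solutions for resource allocations and the relationship between the event probabilities and sizes:

$r_i = \frac{R\, p_i^{1/(1+\beta)}}{\sum_j p_j^{1/(1+\beta)}}, \qquad l_i = l_\beta(r_i)$

With optimal cost

$J_\beta = \begin{cases} -\sum_i p_i \log(p_i) - \log(R), & \beta = 0 \\ \frac{c}{\beta}\left[ R^{-\beta} \left( \sum_i p_i^{1/(1+\beta)} \right)^{1+\beta} - 1 \right], & \beta > 0 \end{cases}$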

Page 93: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

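Minimize average cost $J = \sum_i p_i l_i$ subject to $\sum_i r_i \le R$ using standard Lagrange multipliers, with

$l_\beta(r) = \begin{cases} -\log(r), & \beta = 0 \\ \frac{c}{\beta}\left(r^{-\beta} - 1\right), & \beta > 0 \end{cases}$

Leads to optimal solutions for resource allocations and the relationship between the event probabilities and sizes:

$r_i = \frac{R\, p_i^{1/(1+\beta)}}{\sum_j p_j^{1/(1+\beta)}}, \qquad l_i = l_\beta(r_i)$

With optimal cost

$J_\beta = \begin{cases} -\sum_i p_i \log(p_i) - \log(R), & \beta = 0 \\ \frac{c}{\beta}\left[ R^{-\beta} \left( \sum_i p_i^{1/(1+\beta)} \right)^{1+\beta} - 1 \right], & \beta > 0 \end{cases}$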
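The Lagrange-multiplier solution can be sanity-checked numerically. The sketch below assumes the loss function $l(r) = (c/\beta)(r^{-\beta} - 1)$, with $-\log(r)$ at $\beta = 0$, and a resource budget $\sum_i r_i = R$; the function names and the three-event distribution are invented for this illustration. It computes the allocation $r_i \propto p_i^{1/(1+\beta)}$, compares the resulting expected loss with the closed-form cost, and confirms that a small feasible perturbation of the allocation only increases the loss.

```python
import math


def loss(r, beta, c=1.0):
    """Loss for an event given resource r: -log(r) at beta = 0, else (c/beta)(r^-beta - 1)."""
    if beta == 0:
        return -math.log(r)
    return (c / beta) * (r ** (-beta) - 1.0)


def optimal_allocation(probs, budget, beta):
    """Allocation r_i proportional to p_i ** (1 / (1 + beta)), scaled to use the whole budget."""
    weights = [p ** (1.0 / (1.0 + beta)) for p in probs]
    total = sum(weights)
    return [budget * w / total for w in weights]


def expected_loss(probs, allocation, beta, c=1.0):
    """J = sum_i p_i l(r_i)."""
    return sum(p * loss(r, beta, c) for p, r in zip(probs, allocation))


def closed_form_cost(probs, budget, beta, c=1.0):
    """Optimal J obtained by substituting the optimal allocation into J."""
    if beta == 0:
        return -sum(p * math.log(p) for p in probs) - math.log(budget)
    s = sum(p ** (1.0 / (1.0 + beta)) for p in probs)
    return (c / beta) * (budget ** (-beta) * s ** (1.0 + beta) - 1.0)


probs = [0.5, 0.3, 0.2]  # hypothetical event probabilities
for beta in (0, 1, 2):
    r_opt = optimal_allocation(probs, 1.0, beta)
    print(beta, r_opt, expected_loss(probs, r_opt, beta), closed_form_cost(probs, 1.0, beta))
```

At $\beta = 0$ and $R = 1$ the allocation reduces to $r_i = p_i$ and the cost to the entropy, which is the Shannon source coding case.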

Page 94: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

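$l_i = \frac{c}{\beta}\left[ \left( \frac{R\, p_i^{1/(1+\beta)}}{\sum_j p_j^{1/(1+\beta)}} \right)^{-\beta} - 1 \right]$

[Diagram: forward engineering maps $p_i$ to $l_i$ and $r_i$; reverse engineering maps sizes back to probabilities, to compare with data.]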

Page 95: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

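Forward: $l_i = \frac{c}{\beta}\left[ \left( \frac{R\, p_i^{1/(1+\beta)}}{\sum_j p_j^{1/(1+\beta)}} \right)^{-\beta} - 1 \right]$

Reverse engineering ($l_i, r_i \to p_i$), to compare with data: $\hat{p}_i = \left(c_1 l_i + c_2\right)^{-(1 + 1/\beta)}$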

Page 96: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

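Forward: $l_i = \frac{c}{\beta}\left[ \left( \frac{R\, p_i^{1/(1+\beta)}}{\sum_j p_j^{1/(1+\beta)}} \right)^{-\beta} - 1 \right]$

Reverse: $\hat{p}_i = \left(c_1 l_i + c_2\right)^{-(1 + 1/\beta)}$

Cumulative: $\hat{P}_i = \sum_{k \ge i} \hat{p}_k$, with sizes ordered so that $l_i < l_{i+1}$

Plot $(l_i, \hat{P}_i)$: sizes $l_i$ from data, $\hat{P}_i$ computed using the model.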
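The reverse-engineering step can be sketched as follows. Everything specific here is an assumption for illustration: the example sizes, the constants `c1` and `c2`, and the choice to normalize the model probabilities so they sum to one. The code evaluates $\hat{p}_i \propto (c_1 l_i + c_2)^{-(1+1/\beta)}$ at the observed sizes and accumulates from the largest size downward to get the points $(l_i, \hat{P}_i)$ that would be overlaid on a cumulative plot.

```python
def model_probabilities(sizes, beta, c1=1.0, c2=0.0):
    """Model probabilities proportional to (c1 * l + c2) ** -(1 + 1/beta), normalized to sum to 1."""
    weights = [(c1 * size + c2) ** (-(1.0 + 1.0 / beta)) for size in sizes]
    total = sum(weights)
    return [w / total for w in weights]


def cumulative_from_largest(sizes, probs):
    """Pairs (l_i, P_i) with P_i the sum of p_k over all sizes l_k >= l_i, in increasing size order."""
    pairs = sorted(zip(sizes, probs))
    running = 0.0
    cumulative = []
    for size, prob in reversed(pairs):
        running += prob
        cumulative.append((size, running))
    cumulative.reverse()
    return cumulative


observed_sizes = [1.0, 2.0, 4.0, 8.0]  # hypothetical sizes standing in for data
p_hat = model_probabilities(observed_sizes, beta=1.0)
curve = cumulative_from_largest(observed_sizes, p_hat)
for size, prob in curve:
    print(size, prob)
```

In practice `c1` and `c2` would be chosen to match the units and small-size cutoff of the data set being compared.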

Page 97: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

[Plot: Data for WWW and DC (data compression), log (base 10) axes.]

Page 98: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

[Plot: Data + Model/Theory for WWW and DC (data compression), log (base 10) axes.]

Page 99: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

Typical web traffic

[Plot: log(freq > size) vs. log(file size), with $p \propto s^{-\alpha}$ and $\alpha > 1.0$]

Web servers send heavy tailed web traffic, which is streamed out on the net, creating fractal Gaussian internet traffic (Willinger, …) with

$H = \frac{3 - \alpha}{2}$

Page 100: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

Fat tail web traffic is streamed onto the Internet, creating long-range correlations with $H = \frac{3 - \alpha}{2}$

[Plot: traffic vs. time]

Page 101: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

[Plot: Data + Model/Theory for WWW and DC (data compression), log (base 10) axes.]

Are individual websites distributed like this?

Roughly, yes.

Page 102: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

[Plot: Data + Model/Theory for WWW and DC (data compression), log (base 10) axes.]

How has the data changed since 1995?

Steeper. Consistent with more use of cross hyperlinks.

Page 103: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

More complete website models

(Zhu, Yu, Effros)

• More complex hyperlinks lead to steeper distributions with $1 < \alpha < 2$
• Optimize file sizes within a fixed topology:
– Tree: $\alpha \approx 1$
– Random graph: $\alpha \approx 2$
• No analytic solutions

Page 104: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

The broader Shannon abstraction

• Information = surprise… and therefore ignoring
– Value or timeliness of information
– Topology of information
• Separate source and channel coding
– Data compression
– Error-correcting codes (expansion)
• Eliminate time and space
– Stochastic relaxation (ensembles)
– Asymptopia
• Brilliantly elegant and applicable, but brittle
• Better departure point than Kolmogorov, et al.

Page 105: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

What can we keep?

• Separation:
– Source and channel
– Congestion control and error correction
– Estimation and control
• Tractable relaxations
– Stochastic embeddings
– Convex relaxations

What must we change?

• Add to information:
– Value
– Time and dynamics
– Topology
– Feedback
• More subtle treatment of computational complexity
• Naïve formulations intractable

Page 106: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

[Figure: Distortion plotted against Log(bandwidth), with "achievable" and "not achievable" regions]

Rate distortion theory studies tradeoffs between bandwidth and distortion from lossy coding.
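One textbook instance of such a tradeoff curve is the Gaussian source under squared-error distortion, where the minimum achievable distortion at R bits per sample is D(R) = σ² · 2^(−2R). The sketch below is only an illustration of the "achievable / not achievable" boundary; the Gaussian source and the function names are assumptions, since the slides do not name a particular source model.

```python
def gaussian_distortion(rate_bits: float, variance: float = 1.0) -> float:
    """Minimum mean-squared distortion for a Gaussian source at a given rate.

    Uses the standard distortion-rate function D(R) = variance * 2**(-2R).
    """
    if rate_bits < 0:
        raise ValueError("rate must be non-negative")
    return variance * 2.0 ** (-2.0 * rate_bits)


def is_achievable(rate_bits: float, distortion: float, variance: float = 1.0) -> bool:
    """True if the (rate, distortion) pair lies on or above the boundary curve."""
    return distortion >= gaussian_distortion(rate_bits, variance)


# Each extra bit of rate cuts the best achievable distortion by a factor of 4.
boundary = [gaussian_distortion(r) for r in (0.0, 1.0, 2.0)]
```

Points below the curve cannot be reached by any lossy code at that rate, which is the region the slide marks as not achievable.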

Page 107: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

[Figure: axes Log(bandwidth) and Log(delay); regions marked "cheap" and "Expensive"; traffic classes marked Delay and BW]

• We’ll focus to begin with on similar tradeoffs in internetworking between bandwidth and delay.
• We’ll assume TCP (via retransmission) eliminates loss, and will return to this issue later.

BW = Bandwidth sensitive traffic
Delay = Delay sensitive traffic

Page 108: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

[Figure: axes Log(bandwidth) and Log(delay); BW: bulk transfers (most packets); Delay: web navigation, voice (most files)]

• Mice: many small files of few packets which the user presumably wants ASAP
• Elephants: few large files of many packets for which average bandwidth will be more important than individual packet delay
• Most files are mice but most packets are in elephants…
• …which is the manifestation of fat tails in the web and internet.
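The "most files are mice, most packets are in elephants" statement can be checked numerically for a fat-tailed size distribution. The sketch below is an illustration under assumed numbers: file sizes follow a Pareto law with tail exponent α = 1.2, and a file of at most 10 packets counts as a mouse. Neither value comes from the slides. Sizes are taken at evenly spaced quantiles rather than sampled, so the result is deterministic.

```python
def pareto_quantile_sizes(n_files: int, alpha: float, min_size: float = 1.0) -> list[float]:
    """File sizes at evenly spaced quantiles of a Pareto(alpha) distribution.

    Inverse CDF: size = min_size * (1 - u) ** (-1 / alpha) for u in (0, 1).
    """
    return [
        min_size * (1.0 - (i + 0.5) / n_files) ** (-1.0 / alpha)
        for i in range(n_files)
    ]


def mice_and_elephants(sizes: list[float], threshold: float) -> tuple[float, float]:
    """Return (share of files that are mice, share of packets carried by elephants)."""
    total_packets = sum(sizes)
    mice = [s for s in sizes if s <= threshold]
    mice_file_share = len(mice) / len(sizes)
    elephant_packet_share = (total_packets - sum(mice)) / total_packets
    return mice_file_share, elephant_packet_share


# Assumed parameters: alpha = 1.2, mouse threshold of 10 packets.
sizes = pareto_quantile_sizes(100_000, alpha=1.2)
mice_file_share, elephant_packet_share = mice_and_elephants(sizes, threshold=10.0)
print(f"files that are mice: {mice_file_share:.1%}")
print(f"packets in elephants: {elephant_packet_share:.1%}")
```

With these assumed values the large majority of files fall under the threshold while more than half of all packets belong to the few files above it.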

Page 109: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

[Figure: axes Log(bandwidth) and Log(delay); BW: bulk transfers (most packets); Delay: web navigation, voice (most files)]

Claim I: Current traffic dominated by these two types of flows

Claim II: Intrinsic feature of many future network applications

Page 110: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

[Diagram: Sources (mice and elephants) sending into the Network through router queues]

Page 111: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

[Diagram: Sources sending into the Network through router queues; mice are delay sensitive, elephants are bandwidth sensitive]

Page 112: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

[Figure: axes Log(bandwidth) and Log(delay); BW: bulk transfers (most packets); Delay: web navigation, voice (most files)]

Claim (channel): We can tweak TCP using ECN and REM to make these flows co-exist.

Currently: Delays are aggravated by queuing delay and packet drops from congestion caused by BW traffic?

Specifically:
• Keep queues empty (ECN/REM).
• BW slightly improved (packet loss)
• Delay greatly improved (queuing)
• Provision network for BW
• “Free” QOS for Delay
• Network level stays simple
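To make "keep queues empty (ECN/REM)" concrete, here is a minimal sketch of the link-side rule in REM (Random Exponential Marking). The link keeps a price that rises when backlog or input rate exceeds what the link can carry, and it ECN-marks packets with a probability that grows exponentially with the price. The step size `gamma`, backlog weight `alpha`, and base `phi` are illustrative assumptions, not values from the slides.

```python
def rem_price_update(price: float, backlog: float, input_rate: float,
                     capacity: float, gamma: float = 0.001,
                     alpha: float = 0.1) -> float:
    """One REM price step: p <- max(0, p + gamma * (alpha * backlog + rate - capacity)).

    The price rises while the queue is non-empty or the input rate exceeds
    capacity, and decays back toward zero otherwise.
    """
    return max(0.0, price + gamma * (alpha * backlog + input_rate - capacity))


def rem_mark_probability(price: float, phi: float = 1.001) -> float:
    """ECN marking probability 1 - phi**(-price), with phi > 1."""
    return 1.0 - phi ** (-price)


# A link overloaded by 20% with an empty queue: the price starts to climb,
# and so does the fraction of packets carrying a congestion mark.
price = 0.0
for _ in range(1000):
    price = rem_price_update(price, backlog=0.0, input_rate=12.0, capacity=10.0)
mark_probability = rem_mark_probability(price)
```

Because sources react to marks rather than to drops or queueing delay, the queue itself can stay near empty at equilibrium, which is what gives delay-sensitive traffic its "free" improvement in this picture.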

Page 113: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

[Figure: axes Log(bandwidth) and Log(delay); traffic classes marked Delay and BW; remaining region marked "Expensive"]

Claim (source): Many (future) applications are naturally and intrinsically coded into exactly this kind of fat-tailed traffic.

The rare traffic that can’t or won’t be coded this way will be expensive, and will essentially pay for the rest.

Page 114: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

Fat tailed traffic is “intrinsic”

• Two types of application traffic are important: communications and control
• Communication to and/or from humans (from web to virtual reality)
• Sensing and/or control of dynamical systems
• Claim: both can be naturally “coded” into fat-tailed BW + delay traffic
• This claim needs more research

[Figure: axes Log(bandwidth) and Log(delay); traffic classes marked Delay and BW; remaining region marked "Expensive"]

Page 115: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

Abstraction I

• Separate source and channel coding
• Source is coded into
– Delay sensitive mice
– Bandwidth sensitive elephants
• “Channel coding” = congestion control

[Figure: axes Log(bandwidth) and Log(delay); traffic classes marked Delay and BW; remaining region marked "Expensive"]

Page 116: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

Putting loss back into the picture

• Packet loss can be handled by coding (application) or retransmission (transport)
• Need coherent theory to perform tradeoffs
• Currently, congestion control and reliable transport are intertwined
• What benefits would derive from some decoupling, enabled by ECN or other explicit congestion control strategies?

[Figure: axes Log(BW) and Log(d), with a third axis marked "Loss?"]

Page 117: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

Optimization/control framework

• Application-specific cost functions J(app, delay, loss, BW) (assumed to be minimized)
• Network resources: lines, routers, queues (energy, spectrum, deployment, repair, stealth, security, etc.)
• Comm/control network is embedded in other networks (transportation, energy, military action, …)
• Robustness to uncertainties in users and resources
• Need to flesh out details for future scenarios

Page 118: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

Optimization/control framework

• Global optimal allocation sets lower bound on achievable performance
• Control problem is to find decentralized strategies (e.g. TCP/IP) with (provably) near optimal performance and robustness in dynamical setting
• Duality theory key to using network
• Coding and control interact in unfamiliar ways
• Naïve formulations intractable:
– Computation intractable
– Requires too much information not available to decentralized agents
• Key is to find tractable relaxations
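One standard way to write the global allocation problem and its dual is network utility maximization. The form below is a sketch under assumed notation that is not taken from the slides: $U_s$ is a concave utility for source $s$, $c_l$ is the capacity of link $l$, and $R_{ls} = 1$ when source $s$ uses link $l$. The slides phrase the objective as a cost $J$ to be minimized, which corresponds to $U_s = -J_s$.

```latex
\begin{aligned}
\text{Primal:}\quad
  & \max_{x \ge 0} \; \sum_s U_s(x_s)
    \quad \text{s.t.} \quad \sum_s R_{ls}\, x_s \le c_l \quad \forall l \\[4pt]
\text{Dual:}\quad
  & \min_{p \ge 0} \; \sum_s \max_{x_s \ge 0} \bigl( U_s(x_s) - x_s\, q_s \bigr)
    + \sum_l p_l\, c_l ,
    \qquad q_s = \sum_l R_{ls}\, p_l
\end{aligned}
```

The dual separates by source: each source needs only its own aggregate price $q_s$, and each link needs only its own flow to adjust $p_l$. That separation is what makes a decentralized strategy possible without the global information a naïve formulation would require.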

Page 119: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

Optimization/control framework

• Pioneered by Kelly et al. and extended by Low et al.
• Ambitious goal: foundation for (much?) more unified theory of computation, control, and communications
• Hoped for outcome:
– Rich theoretical framework
– Motivated by practical problems
– Yielding principled design of new protocols
– And methods for deploying and managing complex networks

Page 120: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

Scalable Congestion Control

[Block diagram: SOURCES and LINKS connected in a feedback loop through ROUTING + DELAY, with blocks labeled $R_f(s)$ and $R_b^T(s)$]

p : link prices

y : aggregate link flows

x : source rates

q : aggregate prices per source

(Paganini, Doyle & Low ’01)
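The loop formed by p, y, x and q can be sketched as a price-based iteration. This is a simplified illustration, not the control law from the cited paper: delays are ignored, every source is assumed to have utility log(x) so that its rate is x = 1/q, and each link adjusts its price by a gradient step on the difference between flow and capacity. The routing matrix, capacities and step size `gamma` are made-up example values.

```python
def dual_congestion_control(R, capacity, gamma=0.1, steps=2000):
    """Iterate source rates x and link prices p for routing matrix R (links x sources).

    Assumptions: log utilities (x_s = 1 / q_s), no delay, fixed step size gamma.
    """
    n_links, n_sources = len(R), len(R[0])
    p = [1.0] * n_links              # p: link prices
    x = [0.0] * n_sources            # x: source rates
    for _ in range(steps):
        # q: aggregate price seen by each source along its route
        q = [sum(R[l][s] * p[l] for l in range(n_links)) for s in range(n_sources)]
        # Each source picks the rate that maximizes log(x) - x * q
        x = [1.0 / max(q_s, 1e-9) for q_s in q]
        # y: aggregate flow on each link
        y = [sum(R[l][s] * x[s] for s in range(n_sources)) for l in range(n_links)]
        # Each link raises its price when overloaded, lowers it otherwise
        p = [max(0.0, p[l] + gamma * (y[l] - capacity[l])) for l in range(n_links)]
    return x, p


# Two unit-capacity links in series: source 0 crosses both,
# source 1 uses only link 0, source 2 uses only link 1.
R = [[1, 1, 0],
     [1, 0, 1]]
rates, prices = dual_congestion_control(R, capacity=[1.0, 1.0])
```

Each source uses only its own q and each link only its own y, so no agent needs global knowledge. With log utilities the rates settle at the proportionally fair allocation, where the long flow gets half the rate of each short flow.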

Page 121: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

Robustness, evolvability/scalability, verifiability

[Figure labels: Ideal performance; Robustness; Evolvability; Verifiability; Typical design; IP]

Page 122: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

Robustness of HOT systems

[Figure: robustness plotted against uncertainties]

Robust (to known and designed-for uncertainties)

Fragile (to unknown or rare perturbations)

Page 123: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

Feedback and robustness

• Negative feedback is both the most powerful and most dangerous mechanism for robustness.
• It is everywhere in engineering, but appears hidden as long as it works.
• Biology seems to use it even more aggressively, but also uses other familiar engineering strategies:
– Positive feedback to create switches (digital systems)
– Protocol stacks
– Feedforward control
– Randomized strategies
– Coding
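A toy example shows why negative feedback can be both powerful and dangerous. Here a state is corrected using a measurement that is one step old: a modest gain drives the error to zero, while a larger gain on the same loop makes it oscillate and blow up. The model x[t+1] = x[t] − k·x[t−1] and the two gain values are illustrative assumptions, not taken from the slides.

```python
def simulate_delayed_feedback(gain: float, steps: int = 200) -> float:
    """Simulate x[t+1] = x[t] - gain * x[t-1] from x = 1 and return the final magnitude.

    The correction is negative feedback on a one-step-old measurement.
    The larger of the last two magnitudes is returned so that a zero
    crossing of the oscillation cannot hide growth.
    """
    prev, curr = 1.0, 1.0
    for _ in range(steps):
        prev, curr = curr, curr - gain * prev
    return max(abs(prev), abs(curr))


# For this model the loop is stable only for gains below 1.
settled = simulate_delayed_feedback(gain=0.5)    # error decays toward zero
diverged = simulate_delayed_feedback(gain=1.5)   # same loop, higher gain: unstable
```

The same mechanism that suppresses a disturbance at low gain amplifies it at high gain once delay is present, which is one reading of "appears hidden as long as it works".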

Page 124: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

The Internet hourglass

[Hourglass diagram, top to bottom:]

Applications: Web, FTP, Mail, News, Video, Audio, ping, napster

Transport protocols: TCP, SCTP, UDP, ICMP

IP

Link technologies: Ethernet, 802.11, Satellite, Optical, Power lines, Bluetooth, ATM

From Hari Balakrishnan

Page 125: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

The Internet hourglass

[Same hourglass diagram as the previous slide, annotated:]

Everything on IP

IP on everything

Page 126: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

Commodities, Hardware

Consumers, Applications

Robust Mesoscale

Robust, yet fragile

Hardware

Applications

TCP/IP

Page 127: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

Uncertainty

Commodities, Hardware

Consumers, Applications

Robust Mesoscale

Robust

Page 128: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

Commodities, Hardware

Consumers, Applications

Robust Mesoscale

Yet fragile

Difficult to change

Page 129: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

Yet fragile

Protocols allow for the creation of large complex networks, with rare but catastrophic cascading failures.

Page 130: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

Software

Hardware

Early computing

Analog substrate

Various functionality

Digital

Page 131: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

Software

Hardware

Hardware

Applications

Operating System

Modern Computing

Page 132: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

Robust, yet fragile

Analog electronics

Various functionality

Digital

Uncertain substrate

Varied functionality

Robust mesoscale

Page 133: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

Commodities

Consumers

Money

Commodities

Consumers

Barter

Page 134: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

Commodities

Consumers

Money

Investments

Investors

Markets, Institutions

Page 135: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

The hourglass

[Hourglass diagram, top to bottom:]

Garments: Dress, Shirt, Slacks, Lingerie, Coat, Scarf, Tie

Sewing

Cloth

Material technologies: Wool, Cotton, Nylon, Rayon, Polyester

Page 136: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

Producers

Consumers

Energy

Energy

• 110 V, 60 Hz AC
• Gasoline
• ATP, glucose, etc.
• Proton motive force

Page 137: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

Hardware

Applications

TCP/IP

• Decentralized
• Asynchronous

Robust to:
• Network topology
• Application traffic
• Delays, link speeds

High performance

Necessity: Essentially only one design is possible

Page 138: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

[Same diagram and bullets as the previous slide, with the added note:]

The existing design is incredible, but…

It’s a product of evolution, and is not optimal.

Page 139: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

[Figure: fields placed by dimension (starting at "1 dimension") and by amount of design ("None" to "All"): Control Theory, Statistical Physics, Dynamical Systems, Information Theory, Computational Complexity]

Theory of Complex systems?

Page 140: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

[Same figure as the previous slide: fields placed by dimension and by amount of design]

Biology
• Non-equilibrium
• Highly tuned or optimized
• Finite but large dimension

Page 141: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

[Same figure as the previous slide: fields placed by dimension and by amount of design]

Theory needs
• Integrated horizontally and vertically
• Horizontal: control, communications, computing
• Vertical: multiscale physics
• Status: nascent but promising results
• Bonus: unexpected synergy

Page 142: John Doyle Control and Dynamical Systems Caltech Theory of Complex Networks

• Ubiquity of power laws
• High shear turbulence
• Dissipation
• Quantum/classical transition
• Quantum measurement

[Same figure as the previous slide: fields placed by dimension and by amount of design]