Engineering Serendipity: Successes in Solitary Foraging ... · The Biomimicry Institute David Brazier. Faulty connections Introduction Reﬂections Faulty connections Missed connections

Engineering Serendipity Successes and New Investigations

Engineering Serendipity: Successes in SolitaryForaging and New Investigations in Cooperative

Task-Processing Networks

Theodore (Ted) P. Pavlic – The Ohio State University

Department of Electrical & Computer Engineering

Thursday, May 6, 2010

Overview

Introduction

Reflections

Faulty connections

Missed connections

Motivations

Solitary foraging: fromecology to engineeringand back

Cooperative taskprocessing

Closing remarks


IntroductionReflections

Faulty connections

Missed connections

Motivations

Solitary foraging: from ecology to engineering and backSpeed choice (→)

Impulsiveness and operant conditioning (←)

Long patch residence times (←)

Cooperative task processingBackground

Task-processing network

Cooperation game

Asynchronous convergence to cooperation

Results

Closing remarks

Manufacturing Serendipity

Introduction

Reflections

Faulty connections

Missed connections

Motivations



Closing remarks


1

Craig Tovey: “manufacturing serendipity”

1Compliments to XKCD: http://xkcd.com/720/.

http://xkcd.com/720/

Engineering Manufacturing Serendipity

Introduction

Reflections

Faulty connections

Missed connections

Motivations



Closing remarks


1

Craig Tovey: success with bees and Internet server allocation

1Compliments to XKCD: http://xkcd.com/720/.

http://xkcd.com/720/

Faulty connections

Introduction

Reflections

Faulty connections

Missed connections

Motivations



Closing remarks


The Biomimicry Institute

David Brazier

Faulty connections

Introduction

Reflections

Faulty connections

Missed connections

Motivations



Closing remarks



David Brazier

Faulty connections

Introduction

Reflections

Faulty connections

Missed connections

Motivations



Closing remarks



David Brazier(R. Wehner)

Missed Abstract Connections

Introduction

Reflections

Faulty connections

Missed connections

Motivations



Closing remarks


(Important?) Missed Abstract Connections

Introduction

Reflections

Faulty connections

Missed connections

Motivations



Closing remarks


IFD (Fretwell 1972; Fretwell and Lucas 1969) ⇐⇒ Optimal power

dispatch (Bergen and Vittal 2000)


Introduction

Reflections

Faulty connections

Missed connections

Motivations



Closing remarks




Economic dispatch problem:

minimize

n∑

i=1

Ci(Pi) subject to

n∑

i=1

Pi = P


Introduction

Reflections

Faulty connections

Missed connections

Motivations



Closing remarks





minimize

n∑

i=1

Ci(Pi) subject to

n∑

i=1

Pi = P

Pareto minimization of costs subject to conservation simplex.


Introduction

Reflections

Faulty connections

Missed connections

Motivations



Closing remarks





minimize

n∑

i=1

Ci(Pi) subject to

n∑

i=1

Pi = P


Solution (from KKT) is an “upside-down” IFD:

dCi(Pi)

dPi= λ ∀i ∈ 1, 2, . . . , n (and truncate appropriately)

Equalization of marginal cost matches IFD equalization of suitability.


Introduction

Reflections

Faulty connections

Missed connections

Motivations



Closing remarks





minimize

n∑

i=1

Ci(Pi) subject to

n∑

i=1

Pi = P


Solution (from KKT) is an “upside-down” IFD:

dCi(Pi)

dPi= λ ∀i ∈ 1, 2, . . . , n (and truncate appropriately)

Equalization of marginal cost matches IFD equalization of suitability.

[ Conical cost combination with simplex constraint set has simple

solution in dual space (i.e., solve for λ). ]


Introduction

Reflections

Faulty connections

Missed connections

Motivations



Closing remarks




IFD as optimization problem (thought experiment):

maximize

n∑

i=1

∫ xi

0si(y) dy subject to

n∑

i=1

xi = N

Pareto maximization of (???) subject to conservation simplex.

Right-side-up IFD:

si(xi) = λ ∀i ∈ 1, 2, . . . , n (and truncate appropriately)

Distribute xi to equalize suitability.




Introduction

Reflections

Faulty connections

Missed connections

Motivations



Closing remarks





maximize

n∑

i=1

Gi(xi) subject to

n∑

i=1

xi = N

Pareto maximization of gain(?) subject to conservation simplex.

Right-side-up IFD:

dGi(xi)

dxi= λ ∀i ∈ 1, 2, . . . , n (and truncate appropriately)

Distribute xi to equalize marginal gain.




Introduction

Reflections

Faulty connections

Missed connections

Motivations



Closing remarks





maximize

n∑

i=1

Gi(ti) subject to

n∑

i=1

ti = T

Maximization of distributed gain subject to limited time inside patch.

Right-side-up IFD:

dGi(ti)

dti= λ ∀i ∈ 1, 2, . . . , n (and truncate appropriately)

Distribute ti to equalize marginal gain.




Introduction

Reflections

Faulty connections

Missed connections

Motivations



Closing remarks




Risk-sensitive foraging (Stephens and Charnov 1982; Stephens and

Krebs 1986) ⇐⇒ Sharpe ratio/MPT (Sharpe 1966, 1994)


Introduction

Reflections

Faulty connections

Missed connections

Motivations



Closing remarks






Sharpe (Nobel prize, Economics, 1990) ratio:

E(R)−Rf

σ


Introduction

Reflections

Faulty connections

Missed connections

Motivations



Closing remarks







E(R)−Rf

σ

Exactly the Z-score ranking method of risk-sensitive foraging theory.


Introduction

Reflections

Faulty connections

Missed connections

Motivations



Closing remarks







E(R)−Rf

σ

Exactly the Z-score ranking method of risk-sensitive foraging theory.

MPT (then)→ PMPT (now) (stochastic dominance, Bawa 1982)


Introduction

Reflections

Faulty connections

Missed connections

Motivations



Closing remarks






Iain Couzin’s (2000+) ⇐⇒ Bertsekas and Tsitsiklis (1997−) (e.g.,

mysterious torus shapes; symmetry; symmetry breaking)

The Tenth Muse

Introduction

Reflections

Faulty connections

Missed connections

Motivations



Closing remarks


Solitary foraging: from ecology toengineering and back

Introduction


Speed choice (→)

Impulsiveness andoperant conditioning(←)

Long patch residencetimes (←)


Closing remarks


Foraging theory for speed choice

Introduction


Speed choice (→)




Closing remarks


Nice homomorphism between solitary foragers and

autonomous vehicles (Andrews et al. 2004; Charnov

1973; Quijano et al. 2006; Stephens and Krebs 1986)

g − c

τ− 1

λ

J∗

−cscs

λ


Introduction


Speed choice (→)




Closing remarks





Fitness surrogate (e.g., calories, target value)

Diverse collection of targets

Opportunity cost: some should be ignored

Rate maximization for long runs

Target/task choice ⇐⇒ prey model

g − c

τ− 1

λ

J∗

−cscs

λ


Introduction


Speed choice (→)




Closing remarks





Vehicle speed choice is very similar to cryptic prey

problem described by Gendron and Staddon (1983)

Ceteris paribus, encounter rate increases with

search speed

Search cost increases with search speed

Detection mistakes may vary with speed

Non-trivial speed–prey choice coupling

Prey =⇒ speed =⇒ rate =⇒ prey

Bobwhite quail(Gendron andStaddon 1983)

Les Howard


Introduction


Speed choice (→)




Closing remarks





Vehicle speed choice is very similar to cryptic prey

problem described by Gendron and Staddon (1983)

Bobwhite quail(Gendron andStaddon 1983)

Les Howard

To match bobwhite quail observations, Gendron and Staddon

choose detection function P di (u) , (1− (u/umax)

Ki)1/Ki that

maps search speed u ∈ [0, umax] to detection probability P di for

tasks of type i with conspicuousness Ki ∈ [0,∞).

No analytical tractability

Chose n = 2 for simulation (1983)

P di is strange at bounds (1 and 0) u/umax

P di (u)

Ki

b

b

b

b

0.4

0.8

1.5

4

On-line prey–speed choice for n ∈ N

(Pavlic 2007; Pavlic and Passino 2009)

Introduction


Speed choice (→)




Closing remarks


Autonomous vehicle faces n-way merged Poisson process

λi: encounter rate for task of type i

(gi, ti): average (value, time) for processing task of type i

pi: probability that task of type i is processed (decision)

cs: cost per-unit-time of searching



Introduction


Speed choice (→)




Closing remarks







Vehicle goes through cycles of searching and processing

G: average per-encounter gain

T : average per-encounter search and processing time

G(t): Markov renewal–reward process for accumulated gain



Introduction


Speed choice (→)




Closing remarks







Long runtime =⇒ maximize rate of return

aslimt→∞

G(t)

t=

G

T=

−cs +n∑

i=1λipigi

1 +n∑

i=1λipiti

, R(p)

As expected, type-II functional response (Holling’s disk equation

without any sandpaper disks).



Introduction


Speed choice (→)




Closing remarks







In general, pi ∈ [0, 1], but

∂R(p)

∂pi=

λigi

(

1 +n∑

j=1λjpjtj

)

− λiti

(

−cs +n∑

j=1λjpjgj

)

(

1 +n∑

i=1λipiti

)2



Introduction


Speed choice (→)




Closing remarks







So KKT reveals optimization is purely O(2n) combinatorial

∂R(p)

∂pi=

λigi

(

1 +n∑

j=1j 6=i

λjpjtj

)

− λiti

(

−cs +n∑

j=1j 6=i

λjpjgj

)

(

1 +n∑

i=1λipiti

)2

So-called zero–one rule because p∗i ∈ 0, 1



Introduction


Speed choice (→)




Closing remarks







So KKT reveals optimization is purely O(2n) combinatorial

∂R(p)

∂pi=

λigi

(

1 +n∑

j=1j 6=i

λjpjtj

)

− λiti

(

−cs +n∑

j=1j 6=i

λjpjgj

)

(

1 +n∑

i=1λipiti

)2

So-called zero–one rule because p∗i ∈ 0, 1

[ Sufficient condition; not necessary ]



Introduction


Speed choice (→)




Closing remarks







Classical prey ranking refines search from O(2n) to O(n+ 1)

Processed types (p∗i = 1)︷︸︸︷g1t1

>g2t2

> . . . >gk∗

tk∗>

Optimal rate R(p∗)︷︸︸︷

−cs +k∗∑

i=1λigi

1 +k∗∑

i=1λiti

>

Ignored types (p∗i = 0)︷︸︸︷gk∗+1

tk∗+1> . . . >

gntn

where optimal p∗i = [i ≤ k∗] with k∗ ∈ 0, 1, . . . , n



Introduction


Speed choice (→)




Closing remarks







Classical prey ranking does not depend on λ (i.e., speed)

Processed types (p∗i = 1)︷︸︸︷g1t1

>g2t2

> . . . >gk∗

tk∗>

Optimal rate R(p∗)︷︸︸︷

−cs +k∗∑

i=1λigi

1 +k∗∑

i=1λiti

>

Ignored types (p∗i = 0)︷︸︸︷gk∗+1

tk∗+1> . . . >

gntn

where optimal p∗i = [i ≤ k∗] with k∗ ∈ 0, 1, . . . , n


Effects of speed

Introduction


Speed choice (→)




Closing remarks


Speed u ∈ [umin, umax] ⊂ [0,∞) influences each encounter rate

λi(u) = uDiPdi (u)

where Di is the linear density in the population

b

bb

b


Effects of speed

Introduction


Speed choice (→)




Closing remarks



λi(u) = uDiPdi (u)


Detection function is linear interpolation of probability bounds

0

1

u

P di (u)

b

bb

b

umin umax

high

low

P di (u) = P ℓ

i u+ P ai


Effects of speed

Introduction


Speed choice (→)




Closing remarks



λi(u) = uDiPdi (u)



0

1

u

P di (u)

b

bb

b

umin umax

high

low

P di (u) = P ℓ

i u+ P ai

Search cost is also assumed to be affine function

cs(u) = csℓu+ csa


Effects of speed

Introduction


Speed choice (→)




Closing remarks



λi(u) = uDiPdi (u)



0

1

u

P di (u)

b

bb

b

umin umax

high

low

P di (u) = P ℓ

i u+ P ai


ci(u) = cℓiu+ cai

[ Processing costs can be modeled in a similar way ]


Effects of speed

Introduction


Speed choice (→)




Closing remarks



λi(u) = uDiPdi (u)



0

1

u

P di (u)

b

bb

b

umin umax

high

low

P di (u) = P ℓ

i u+ P ai


cs(u) = csℓu+ csa

[ Processing costs can be modeled in a similar way ]

[ . . . but not here. ]


Value of speed

Introduction


Speed choice (→)




Closing remarks


After regrouping, new objective function

R(p, u) =G2(p)u

2 +G1(p)u+G0(q)

T2(p)u2 + T1(p)u+ 1

where coefficients

G2(p) ,n∑

i=1

DipigiPℓi

G1(p) ,

n∑

i=1

DipiPai gi − csℓ

G0(p) , −csa

T2(p) ,

n∑

i=1

pitiDiPℓi

T1(p) ,n∑

i=1

pitiDiPai

are constant with respect to u (i.e., biquadratic ratio)


Value of speed

Introduction


Speed choice (→)




Closing remarks


After regrouping, new objective function

R(p, u) =G2(p)u

2 +G1(p)u+G0(q)

T2(p)u2 + T1(p)u+ 1

where coefficients

G2(p) ,n∑

i=1

DipigiPℓi

G1(p) ,

n∑

i=1

DipiPai gi − csℓ

G0(p) , −csa

T2(p) ,

n∑

i=1

pitiDiPℓi

T1(p) ,n∑

i=1

pitiDiPai

are constant with respect to u (i.e., biquadratic ratio)

Find optimal u∗ for each p∗ candidate (n+ 1 total)


Finding optimal speed

Introduction


Speed choice (→)




Closing remarks


Because biquadratic objective, for each p∗ candidate,

∂R(u)

∂u=

(G2T1 −G1T2)u2 + 2(G2 −G0T2)u+ (G1 −G0T1)(

T2u2 + T1u+ 1

)2

By KKT, if quadratic numerator root u∗ ∈ [umin, umax], then u∗ is

optimal speed; otherwise, optimal speed u∗ ∈ umin, umax based

on sign of numerator



Introduction


Speed choice (→)




Closing remarks



∂R(u)

∂u=

(G2T1 −G1T2)u2 + 2(G2 −G0T2)u+ (G1 −G0T1)(

T2u2 + T1u+ 1

)2




Implement O(n+ 1) algorithm on-line if Di density estimates

available (Dubin’s car AAV simulations with speed filtering, Pavlic

and Passino 2009)



Introduction


Speed choice (→)




Closing remarks



∂R(u)

∂u=

(G2T1 −G1T2)u2 + 2(G2 −G0T2)u+ (G1 −G0T1)(

T2u2 + T1u+ 1

)2




Implement O(n+ 1) algorithm on-line if Di density estimates

available (Dubin’s car AAV simulations with speed filtering, Pavlic

and Passino 2009)

Non-trivial to guarantee convergence of density estimates on-line

Estimation process =⇒ type-II type-III functional response

Estimation, impulsiveness, and the operant laboratory(Pavlic and Passino 2010c)

Introduction


Speed choice (→)




Closing remarks


Laboratory impulsiveness (Ainslie 1974; Bateson and Kacelnik

1996; Bradshaw and Szabadi 1992; Green et al. 1981; McDiarmid

and Rilling 1965; Rachlin and Green 1972; Siegel and Rachlin 1995;

Snyderman 1983; Stephens and Anderson 2001)


Introduction


Speed choice (→)




Closing remarks


Laboratory impulsiveness

Using starvation, animals are trained to use a Skinner box

Repeat mutually exclusive binary-choice trials (at low weight)


Introduction


Speed choice (→)




Closing remarks





What can be inferred about Skinner box results?

Usually assume simultaneous encounters occur with

probability zero (Poisson assumption)

Mutually exclusive choices when prey is immobile?

Patch impulsiveness vanishes (Stephens et al. 2004)

Attention (Monterosso and Ainslie 1999; Siegel and Rachlin 1995)


Introduction


Speed choice (→)




Closing remarks











Worst-case scenario for a robot

Predisposes robots to underestimate


Introduction


Speed choice (→)




Closing remarks











Worst-case scenario for an animal?

Predisposes animals to underestimate?


Introduction


Speed choice (→)




Closing remarks


Estimation of per-type densities only necessary for speed regulation


Introduction


Speed choice (→)




Closing remarks



Graphical description of optimal prey choice:

Processing time

Processinggain

For type i: or @ (processing time ti, gain g(ti))

(t1, g1)

(t2, g2)

(t3, g3)

(t4, g4)

(t5, g5)

Process

Ignore

R∗


Introduction


Speed choice (→)




Closing remarks




Processing time

Processinggain


(t1, g1)

(t2, g2)

(t3, g3)

(t4, g4)

(t5, g5)

Process

Ignore

R∗

=⇒

t: Total search and processing time

G(t):

Accumulatednet

gain

Type-# encounter: # (Process) or # (Ignore)

g1

t1

g2

t2

g3

t3

g4

t4

g5

t5

R∗

1

2

2

1

3 3

4 5

Process encounter k when gi(k)/ti(k) > G(t(k))/t(k)


Introduction


Speed choice (→)




Closing remarks




Processing time

Processinggain


(t1, g1)

(t2, g2)

(t3, g3)

(t4, g4)

(t5, g5)

Process

Ignore

R∗

=⇒

Time

Accumulatednet

gain

g1

t1

g2

t2

R∗


Rule (even with mistakes) is optimal facing Poisson encounters (i.e.,

simultaneous w.p.0)


Introduction


Speed choice (→)




Closing remarks




Processing time

Processinggain


(t1, g1)

(t2, g2)

(t3, g3)

(t4, g4)

(t5, g5)

Process

Ignore

R∗

=⇒

Time

Accumulatednet

gain

g1

t1

g2

t2

R∗



simultaneous w.p.0)

Digestive rate constraints (bi: prey bulk) (Hirakawa 1995):

n∑

i=1λipibi

1 +n∑

i=1λipiti

≤ BKKT=⇒

p∗1 = 1

...

p∗k∗−1 = 1

p∗k∗ ∈ [0, 1]

Partial Preferences(rank by gi/bi)

Digression


Introduction


Speed choice (→)




Closing remarks




Processing time

Processinggain


(t1, g1)

(t2, g2)

(t3, g3)

(t4, g4)

(t5, g5)

Process

Ignore

R∗

=⇒

Time

Accumulatednet

gain

g1

t1

g2

t2

R∗



simultaneous w.p.0)

Ecological–physiological hybrid method (Whelan and Brown 2005):

Asymptotic gut constraint ⇐⇒ Rank bygi

ti + tbi

Digression


Introduction


Speed choice (→)




Closing remarks




Processing time

Processinggain


(t1, g1)

(t2, g2)

(t3, g3)

(t4, g4)

(t5, g5)

Process

Ignore

R∗

=⇒

Time

Accumulatednet

gain

g1

t1

g2

t2

R∗



simultaneous w.p.0)



ti + tbi

Process encounter k when gi(k)/(ti(k) + tbi(k)) > G(t(k))/t(k)

t: Search and non-ballast handling time

G(t):

Accumulatednet

gain


g1

t1 + tb1

g2

t2 + tb2

2

2 1

2 2

1

2 2

2

∼

Handling and search time (no ballast time)Accumulatednet

gain

i : Digestive profitability line for type i

g1

t1 + tb1

1

2

3

4

g5

t5 + tb5

5

Digression


Introduction


Speed choice (→)




Closing remarks




Processing time

Processinggain


(t1, g1)

(t2, g2)

(t3, g3)

(t4, g4)

(t5, g5)

Process

Ignore

R∗

=⇒

Time

Accumulatednet

gain

g1

t1

g2

t2

R∗



simultaneous w.p.0)



ti + tbi



G(t):

Accumulatednet

gain


g1

t1 + tb1

g2

t2 + tb2

2

2 1

2 2

1

2 2

2

∼

0

0.25

0.50

0.75

1.00

0 1 2 3 4 5

Types (ordered by digestive profitability)Proportion

ofencounters

accepted

PartialPreferences

Digression


Introduction


Speed choice (→)




Closing remarks




Processing time

Processinggain


(t1, g1)

(t2, g2)

(t3, g3)

(t4, g4)

(t5, g5)

Process

Ignore

R∗

=⇒

Time

Accumulatednet

gain

g1

t1

g2

t2

R∗



simultaneous w.p.0)



ti + tbi



G(t):

Accumulatednet

gain


g1

t1 + tb1

g2

t2 + tb2

2

2 1

2 2

1

2 2

2

∼

Handling and search time (no ballast time)Accumulatednet

gain

i : Digestive profitability line for type i

g1

t1 + tb1

1

2

3

4

g5

t5 + tb5

5

SlidingMode

Digression


Introduction


Speed choice (→)




Closing remarks





G(t):

Accumulatednet

gain


g1

t1

g2

t2

g3

t3

g4

t4

g5

t5

R∗

1

2

2

1

3 3

4 5

∼

Time

Accumulatednet

gain

g1

t1

g2

t2

R∗


Attention: simultaneous encounter (w.p.0) =⇒ low time first


Introduction


Speed choice (→)




Closing remarks





G(t):

Accumulatednet

gain


g1

t1

g2

t2

g3

t3

g4

t4

g5

t5

R∗

1

2

2

1

3 3

4 5

∼

Time

Accumulatednet

gain

g1

t1

g2

t2

R∗

R∗type-2-only




Introduction


Speed choice (→)




Closing remarks





G(t):

Accumulatednet

gain


g1

t1

g2

t2

g3

t3

g4

t4

g5

t5

R∗

1

2

2

1

3 3

4 5

∼

Time

Accumulatednet

gain

g1

t1

g2

t2

R∗

R∗depressed


Attention: simultaneous encounter (w.p.1) =⇒ either first

Bifurcation; lucky runs accumulate high initial estimate


Introduction


Speed choice (→)




Closing remarks





G(t):

Accumulatednet

gain


g1

t1

g2

t2

g3

t3

g4

t4

g5

t5

R∗

1

2

2

1

3 3

4 5

∼

Time

Accumulatednet

gain

gp

tp

g1

t1

g2

t2

R∗

R∗type-2-only



Rescue optimality with early ad libitum feeding

Sunk costs and long patch residence times(Pavlic and Passino 2010b)

Introduction


Speed choice (→)




Closing remarks


Nolet et al. (2001) are unable to explain spatial differences in tundra

swan foraging


Introduction


Speed choice (→)




Closing remarks



swan foraging

In shallow water, swans feeding on tubers can “head dip”

In deep water, they must “up end,” which requires more energy

Nolet et al. find it strange that swans spend longer at the more

energetic task

Gemma Longman

Derek Tsang


Introduction


Speed choice (→)




Closing remarks



swan foraging




energetic task

Other sunk cost/Concorde effects (Arkes and Blumer 1985;

Arkes and Ayton 1999; Dawkins and Carlisle 1976; Kanodia

et al. 1989; Staw 1981)


Introduction


Speed choice (→)




Closing remarks



swan foraging




energetic task

Other sunk cost/Concorde effects (Arkes and Blumer 1985;

Arkes and Ayton 1999; Dawkins and Carlisle 1976; Kanodia

et al. 1989; Staw 1981)

Observations consistent with rate maximization when patch entry

costs are modeled


Introduction


Speed choice (→)




Closing remarks



swan foraging


costs are modeled. For n = 1,

R(t1) =g1(t1)1λ1

+ t1where a < b < c , g1(0) < 0

g1(t1)

t1−a− 1

λ1

R∗a

t∗1a

Due to entry costs, searching is a less desirable task


Introduction


Speed choice (→)




Closing remarks



swan foraging



R(t1) =g1(t1)1λ1

+ t1where a < b < c , g1(0) < 0

g1(t1)

t1−a

−b

− 1λ1

R∗b

t∗1bt∗1a



Introduction


Speed choice (→)




Closing remarks



swan foraging



R(t1) =g1(t1)1λ1

+ t1where a < b < c , g1(0) < 0

g1(t1)

t1

−b

−c

− 1λ1

R∗c

t∗1ct∗1b



Introduction


Speed choice (→)




Closing remarks



swan foraging



R(t1) =g1(t1)1λ1

+ t1where a < b < c , g1(0) < 0

g1(t1)

t1

−b

−c

− 1λ1

R∗c

t∗1ct∗1b


May explain overstaying as well (Nonacs 2001)

Cooperative task processing

Introduction



Background

Task-processingnetwork

Cooperation game

Asynchronousconvergence tocooperation

Results

Closing remarks


Cooperation for distributed decentralized networks

Introduction



Background


Cooperation game


Results

Closing remarks


Cooperative control usually involves coordination of agents on

(possibly ad hoc) networks

e.g., Global utility functions to maximize

e.g., Projections onto non-separable spaces (i.e., not

Cartesian products)

Challenges to fast and cheap implementation


Introduction



Background


Cooperation game


Results

Closing remarks




Nash (i.e., competitive) equilibria are solutions to separable

variational inequality problems

Amenable to parallel solvers

Used by communication theorists on networks for congestion

control (Altman et al. 2005a,b; Buttyan and Hubaux 2003;

Shakkottai et al. 2006)

Strong connection to biological (and sociological) models of

emergent cooperation in nature


Introduction



Background


Cooperation game


Results

Closing remarks






Used in control to model unknown/unknowable

Typically used in control to model noise or enemy movements

(e.g., worst-case scenarios) or actions of humans in the system

Task conservation is a challenge to communication-like

application of Nash methods to task flow control


Introduction



Background


Cooperation game


Results

Closing remarks







Existing task-processing networks (TPN) (Cruz 1991; Perkins and

Kumar 1989) focus on robustness, not optimality:

Flexible manufacturing system, network components =⇒bounded queues/burstiness

Behaviors are static (i.e., no feedback)


Introduction



Background


Cooperation game


Results

Closing remarks







Existing task-processing networks (TPN) (Cruz 1991; Perkins and

Kumar 1989) focus on robustness, not optimality:

So here, elements merged from communication, TPN, and possible

analogous systems in nature (e.g., Cooperative breeding, Hamilton

and Taborsky 2005)

Try to design system so that Nash equilibrium has

characteristics that are globally favorable

Definition(Pavlic and Passino 2010a)

Introduction



Background


Cooperation game


Results

Closing remarks


A task-processing network is a directed graph:

A ⊂ N: Set of task-processing agents

P ⊆ (i, j) ∈ A2 : i 6= j: Directed arcs connecting distinct agents

Vi , j ∈ A : (j, i) ∈ P: Set of conveyors for each i ∈ A

Ci , j ∈ A : (i, j) ∈ P: Set of cooperators for each i ∈ A

V , j ∈ A : Cj 6= ∅: Set of all conveyors

C , i ∈ A : Vi 6= ∅: Set of all cooperators

Task flows at each agent:

Yi ⊂ N: Possibly empty set of task types that arrive at conveyor i ∈ A

λkj ∈ R>0 : Encounter rate of type-k tasks at agent j ∈ A (e.g., Poisson encounters)

πkj ∈ [0, 1]: Probability that conveyor j ∈ A advertises an incoming k-type task to its connected cooperators Cj

γi ∈ [0, 1]: Probability that cooperator i ∈ A volunteers for advertised task from one of its connected conveyors Vi(collected in γ)

TPN examples(Pavlic and Passino 2010a)


Input streams (k ∈ Yj ⊆ 1, 2, 3):

Conveyors (j ∈ V = 1, 2): V3 = 1 1 V4 = 1, 2 2 V5 = 2

Cooperators (i ∈ C = 3, 4, 5): 3 C1 = 3, 4 4 C2 = 4, 5 5

Y1 = 1, 2 Y2 = 1, 2, 31 2 1 2 3

π11

π21

π12

π22

π32γ3 γ4 γ5

Rate λkj λ1

1 λ21 λ1

2 λ22 λ3

2

Send request @ πkj

Accept request @ γi

Taskarrivals

Nodes andprocessingrequests

Flexible manufacturing system (FMS)



Input streams (k ∈ Yj ⊆ 1, 2, 3):

Conveyors (j ∈ V = 1, 2): V3 = 1 1 V4 = 1, 2 2 V5 = 2

Cooperators (i ∈ C = 3, 4, 5): 3 C1 = 3, 4 4 C2 = 4, 5 5

Y1 = 1, 2 Y2 = 1, 2, 31 2 1 2 3

π11

π21

π12

π22

π32γ3 γ4 γ5

Rate λkj λ1

1 λ21 λ1

2 λ22 λ3

2

Send request @ πkj

Accept request @ γi

Taskarrivals

Nodes andprocessingrequests

Flexible manufacturing system (FMS)

Cooperative breeders?



b

b

×

×

×

×

×

×

×

×

×

×

× ×

×

×

×

×

×

×

×

×

×

×

×

×

×

××

×××

×

×

×

×

×

×

×

××

×

×

×

×

××

×

××

×

×

λ11

λ22 λ3

3

1

2 3

AAV patrol scenario

⇐⇒

b

b

1

2 3

πjj

γi

C = V = 1, 2, 3

Ci = Vi = 1, 2, 3 − i

Yj = j

1

2 3

λ11

λ22 λ3

3

AAV TPN

Cooperation gameMetrics of volunteering

Introduction



Background


Cooperation game


Results

Closing remarks


Need to develop an agent-based metric of performance that

catalyzes cooperation


Introduction



Background


Cooperation game


Results

Closing remarks




Following foraging example, define utility function Ui(γ) based on

rate of gain


Introduction



Background


Cooperation game


Results

Closing remarks




Following foraging example, define utility function Ui(γ) based on

rate of gain

To simplify presentation of combinatorial volunteering analysis,

introduce SOBP and SOMS.


Introduction



Background


Cooperation game


Results

Closing remarks



introduce SOBP and SOMS.

I : finite index set

Ω , γii∈I : indexed family with γi ∈ [0, 1] for each i ∈ I

For g, h ∈ N and Γ ⊆ I ,

SOBPg(Γ) ,

|Γ|∑

ℓ=0

1

g + ℓ

∑

C⊆Γ|C|=ℓ

((∏

i∈C

γi

)(∏

k∈Γ−C

(1− γk)

))

SOMSh(Γ) ,

|Γ|∑

ℓ=0

(−1)ℓ1

h+ ℓ

∑

C⊆Γ|C|=ℓ

(∏

i∈C

γi

)

Several useful relationships between SOBP and SOMS.


Introduction



Background


Cooperation game


Results

Closing remarks



introduce SOBP and SOMS. For Γ ⊆ A,

SOBP1(i, k, ℓ − i)

= (1− γk)(1− γℓ) +1

2γk(1− γℓ) +

1

2γℓ(1− γk) +

1

3γkγℓ

(i.e., sum of binomial products)

For conveyor j ∈ V and cooperator i ∈ Cj = i, k, ℓ,SOBP1(i, k, ℓ − i) is probability that i is chosen to process

an advertised task from j ∈ Vi (given that it volunteered)


Introduction



Background


Cooperation game


Results

Closing remarks



introduce SOBP and SOMS. For Γ ⊆ A,

SOBP1(i, k, ℓ − i)

= (1− γk)(1− γℓ) +1

2γk(1− γℓ) +

1

2γℓ(1− γk) +

1

3γkγℓ

(i.e., sum of binomial products)

For conveyor j ∈ V and cooperator i ∈ Cj = i, k, ℓ,SOBP1(i, k, ℓ − i) is probability that i is chosen to process

an advertised task from j ∈ Vi (given that it volunteered)

SOMS gives curvature information about SOBP

Properties of SOMS and SOBP provide bounds for convergence

analysis (i.e., Lyapunov/non-deterministic set stability)

Cooperation gameAgent utility function – rate of gain


For i ∈ C, the rate of gain

Ui(γ) ,

Conveyor part — constant with respect to γi︷︸︸︷

bi +

(

1 −∏

j∈Ci

(1 − γj)

)

︸︷︷︸

Pr(Volunteer from Ci|Advertisement from i)

ri + γi

∑

j∈Vi

(−

Pr(i awarded task from j|i volunteers)︷︸︸︷

SOBP1(Cj − i)cij)

︸︷︷︸

Cooperator part




Ui(γ) ,


bi +

(

1 −∏

j∈Ci

(1 − γj)

)

︸︷︷︸


ri + γi

∑

j∈Vi

(−


SOBP1(Cj − i)cij)

︸︷︷︸

Cooperator part

where

bi ,∑

k∈Yi

λki

(

bki − c

ki

)

ri ,∑

k∈Yi

λki π

ki

(

rki −

(

bki − c

ki

))

are the costs and benefits of local process-

ing on i ∈ V




Ui(γ) ,


bi +

(

1 −∏

j∈Ci

(1 − γj)

)

︸︷︷︸


ri + γi

∑

j∈Vi

(−


SOBP1(Cj − i)cij)

︸︷︷︸

Cooperator part

and

cij ,∑

k∈Yj

λkj π

kj c

kij

are the costs and benefits to i ∈ C for vol-

unteering for tasks exported from j ∈ Vi




Ui(γ) ,


bi +

(

1 −∏

j∈Ci

(1 − γj)

)

︸︷︷︸


ri + γi

∑

j∈Vi

(−


SOBP1(Cj − i)cij)

︸︷︷︸

Cooperator part

where

bi ,∑

k∈Yi

λki

(

bki − c

ki

)

ri ,∑

k∈Yi

λki π

ki

(

rki −

(

bki − c

ki

))


ing on i ∈ V

and

cij ,∑

k∈Yj

λkj π

kj c

kij






Ui(γ) ,


bi +

(

1 −∏

j∈Ci

(1 − γj)

)

︸︷︷︸


ri −Qipi(Qi) + γi

∑

j∈Vi

(pij(Qj) −


SOBP1(Cj − i)cij)

︸︷︷︸

Cooperator part — γi and Qj vary with γi

where

bi ,∑

k∈Yi

λki

(

bki − c

ki

)

ri ,∑

k∈Yi

λki π

ki

(

rki −

(

bki − c

ki

))

pi(Qi) ,∑

k∈Yi

λki π

ki p

ki (Qi)


ing on i ∈ V

and

cij ,∑

k∈Yj

λkj π

kj c

kij

pij(Qj) ,∑

k∈Yj

λkj π

kj q

kijp

kj (Qj)



Fictitious payment functions added as stabilizing controls (Qi ,∑

j∈Ciγj )




Ui(γ) ,


bi +

(

1 −∏

j∈Ci

(1 − γj)

)

︸︷︷︸


ri −Qipi(Qi) + γi

∑

j∈Vi

(pij(Qj) −


SOBP1(Cj − i)cij)

︸︷︷︸

Cooperator part — γi and Qj vary with γi

where

bi ,∑

k∈Yi

λki

(

bki − c

ki

)

ri ,∑

k∈Yi

λki π

ki

(

rki −

(

bki − c

ki

))

pi(Qi) ,∑

k∈Yi

λki π

ki p

ki (Qi)


ing on i ∈ V

and

cij ,∑

k∈Yj

λkj π

kj c

kij

pij(Qj) ,∑

k∈Yj

λkj π

kj q

kijp

kj (Qj)



Fictitious payment functions added as stabilizing controls (Qi ,∑

j∈Ciγj )

Cournot oligopolies on a graph

Nash equilibriumExistence, uniqueness, and asynchronous convergence

Introduction



Background


Cooperation game


Results

Closing remarks


Natural choice for distributed variational inequality is local gradient

ascent


Introduction



Background


Cooperation game


Results

Closing remarks



ascent

Asynchronous system is governed by difference inclusion (not

difference equation)


Introduction



Background


Cooperation game


Results

Closing remarks



ascent



For set stability, sufficient to show synchronous system is a

contraction mapping

Also gives existence and uniqueness of Nash equilibrium


Introduction



Background


Cooperation game


Results

Closing remarks



ascent




contraction mapping


Because γ ∈ [0, 1]|C| comes from product topology of intervals,

must use block maximum norm (‖γ‖∞ , maxi∈C|γi|)


Introduction



Background


Cooperation game


Results

Closing remarks



ascent




contraction mapping


Because γ ∈ [0, 1]|C| comes from product topology of intervals,

must use block maximum norm (‖γ‖∞ , maxi∈C|γi|)

Procedure leads to constraints on payment functions and topology

Asynchronous convergence to Nash equilibriumPayment and topological constraints

Introduction



Background


Cooperation game


Results

Closing remarks


Assume that (Payment and topological constraints):


Introduction



Background


Cooperation game


Results

Closing remarks



1. For all i ∈ C and j ∈ Vi, pij is a stabilizing payment function

For k ∈ N, p′(Q) , dp(Q)/dQ < 0 for all Q ∈ [0, k]

For k ∈ N, p′′(Q) , d2p(Q)/dQ2 > 0 for all Q ∈ [0, k]

For k ∈ N, γp′′(Q) ≤ −p′(Q) for all Q ∈ [γ, k − (1− γ)]with γ ∈ [0, 1]


Introduction



Background


Cooperation game


Results

Closing remarks


Q

pℓ(Q)

m > 0

b

b

0

b

0 1 2 k

−m

Q

pe(Q)

τ > 1

b

b

0

A

0 1 2 k

−Aτ

Q

ph(Q)

κ > 0

ε > κ+ 1

b

b

0

A

0 1 2 k

−Aκεκ

Q

pp(Q)

p > 1

q0 > k + p− 1

b

b

0

A

0 1 2 k

−Apq0

Sample stabilizing payment (inverse demand) functions




Introduction



Background


Cooperation game


Results

Closing remarks


Q

pℓ(Q)

m > 0

b

b

0

b

0 1 2 k

−m

Q

pe(Q)

τ > 1

b

b

0

A

0 1 2 k

−Aτ

Q

ph(Q)

κ > 0

ε > κ+ 1

b

b

0

A

0 1 2 k

−Aκεκ

Q

pp(Q)

p > 1

q0 > k + p− 1

b

b

0

A

0 1 2 k

−Apq0




2. For all j ∈ V , |Cj| ≤ 3 (i.e., no conveyor can have more than 3

outgoing links to cooperators)


Introduction



Background


Cooperation game


Results

Closing remarks


Q

pℓ(Q)

m > 0

b

b

0

b

0 1 2 k

−m

Q

pe(Q)

τ > 1

b

b

0

A

0 1 2 k

−Aτ

Q

ph(Q)

κ > 0

ε > κ+ 1

b

b

0

A

0 1 2 k

−Aκεκ

Q

pp(Q)

p > 1

q0 > k + p− 1

b

b

0

A

0 1 2 k

−Apq0




2. For all j ∈ V , |Cj| ≤ 3 (i.e., no conveyor can have more than 3

outgoing links to cooperators)

3. For cooperator i ∈ C and j ∈ Vi, if j is a 3-conveyor (i.e.,

|Cj| = 3), then there must be some conveyor k ∈ Vi that is a

2-conveyor

Asynchronous convergence to Nash equilibriumOther example stable topologies

Introduction



Background


Cooperation game


Results

Closing remarks


5

6

4

7

3

0

8 9

2

10

1

11

4

7

2

10

5

6

1

11

3

8 9

λ44

λ77

λ22

λ1010

λ55

λ66

λ11

λ1111

λ33

λ88 λ9

9

Rich yet stable task-processing network.


Introduction



Background


Cooperation game


Results

Closing remarks


5

6

4

7

3

0

8 9

2

10

1

11

4

7

2

10

5

6

1

11

3

8 9

λ44

λ77

λ22

λ1010

λ55

λ66

λ11

λ1111

λ33

λ88 λ9

9


“Pills” stabilize problematic areas by focussing attention


Introduction



Background


Cooperation game


Results

Closing remarks


5

6

4

7

3

0

8 9

2

10

1

11

4

7

2

10

5

6

1

11

3

8 9

λ44

λ77

λ22

λ1010

λ55

λ66

λ11

λ1111

λ33

λ88 λ9

9


“Pills” stabilize problematic areas by focussing attention

Future research direction: Stable network motifs

Asynchronous convergence to Nash equilibriumTotally asynchronous algorithm

Introduction



Background


Cooperation game


Results

Closing remarks


Define T : [0, 1]n 7→ [0, 1]n by T (γ) , (T1(γ), T2(γ), . . . , Tn(γ)) where, foreach i ∈ C,

Ti(γ) , min1,max0, γi + σi∇iUi(γ)

(i.e., projected gradient ascent)


Introduction



Background


Cooperation game


Results

Closing remarks




(i.e., projected gradient ascent), where

1

σi

≥ 2|Vi|maxk∈Vi

|p′ik(0)|

for all γ ∈ [0, 1]n.


Introduction



Background


Cooperation game


Results

Closing remarks





1

σi

≥ 2|Vi|maxk∈Vi

|p′ik(0)|

for all γ ∈ [0, 1]n. If

minj∈Vi

|p′ij (|Cj |) | >

(

|Vi| −1

2

)

maxj∈Vi

|cij |, for all i ∈ C,

then the totally asynchronous distributed iteration (TADI) sequence γ(t)generated with mapping T and the outdated estimate sequence γi(t) for alli ∈ C each converge to the unique Nash equilibrium of the cooperation game.


Introduction



Background


Cooperation game


Results

Closing remarks





1

σi

≥ 2|Vi|maxk∈Vi

|p′ik(0)|

for all γ ∈ [0, 1]n. If (∝ Hamilton’s rule on networks)

Benefit︷︸︸︷

minj∈Vi

|p′ij (|Cj |) | >

Relatedness︷︸︸︷(

|Vi| −1

2

) Cost︷︸︸︷

maxj∈Vi

|cij |, for all i ∈ C,

then the totally asynchronous distributed iteration (TADI) sequence γ(t)generated with mapping T and the outdated estimate sequence γi(t) for alli ∈ C each converge to the unique Nash equilibrium of the cooperation game.

Asynchronous convergence to Nash equilibriumResults: cooperation by cyclic feedback

Introduction



Background


Cooperation game


Results

Closing remarks


λ2 (encounters per second)

Optimal

coop

erationwillingn

ess

γ∗ ifori∈1,2,3

A B C

0

1

0 1 2 3 4

λ1 = 0.6

λ1

λ3 = 1.7

λ3

ba

b

bb

c

γ∗1

γ∗3

γ ∗2

A

λ2 < λ1 < λ3

γ∗2 > γ∗

1 > γ∗3

B

λ1 < λ2 < λ3

γ∗1 > γ∗

2 > γ∗3

C

λ1 < λ3 < λ2

γ∗1 > γ∗

3 > γ∗2

Simulation of AAV patrol scenario

Converges to predicted Nash equilibrium


Introduction



Background


Cooperation game


Results

Closing remarks



Optimal

coop

erationwillingn

ess

γ∗ ifori∈1,2,3

A B C

0

1

0 1 2 3 4

λ1 = 0.6

λ1

λ3 = 1.7

λ3

ba

b

bb

c

γ∗1

γ∗3

γ ∗2

A

λ2 < λ1 < λ3

γ∗2 > γ∗

1 > γ∗3

B

λ1 < λ2 < λ3

γ∗1 > γ∗

2 > γ∗3

C

λ1 < λ3 < λ2

γ∗1 > γ∗

3 > γ∗2



Increases in one encounter rate (e.g., λ2) cause equilibrium shift so

neighbors (e.g., 1 and 3) help more and agent (e.g., 2) helps less


Introduction



Background


Cooperation game


Results

Closing remarks



Optimal

coop

erationwillingn

ess

γ∗ ifori∈1,2,3

A B C

0

1

0 1 2 3 4

λ1 = 0.6

λ1

λ3 = 1.7

λ3

ba

b

bb

c

γ∗1

γ∗3

γ ∗2

A

λ2 < λ1 < λ3

γ∗2 > γ∗

1 > γ∗3

B

λ1 < λ2 < λ3

γ∗1 > γ∗

2 > γ∗3

C

λ1 < λ3 < λ2

γ∗1 > γ∗

3 > γ∗2



Increases in one encounter rate (e.g., λ2) cause equilibrium shift so

neighbors (e.g., 1 and 3) help more and agent (e.g., 2) helps less

Emergent cooperation due to cyclic feedback effects

Closing remarks

Introduction



Closing remarks


Both biology and engineering are full of interesting complex systems

Closing remarks

Introduction



Closing remarks



Real-time implementations in one domain are intuitive and

cognitively simple behaviors in another

Closing remarks

Introduction



Closing remarks





Homomorphisms are not always obvious and should not be

forced

Closing remarks

Introduction



Closing remarks






forced

Unifying principles are more valuable than mimicry

Closing remarks

Introduction



Closing remarks






forced


Catalyze interdisciplinary collaboration

Closing remarks

Introduction



Closing remarks






forced



Inject new ideas

Closing remarks

Introduction



Closing remarks






forced



Inject new ideas

Provides new avenues for careers after graduate school!

Thanks!

Introduction



Closing remarks


(bringing engineers and animals together)

Thank you!

Helpful People: Kevin Passino, Tom Waite, Ian Hamilton

Funding Sources:

Questions?

Further reading

Introduction



Closing remarks

Further reading


Pavlic TP (2007) Optimal foraging theory revisited. Master’s thesis, The

Ohio State University, Columbus, OH. URL

http://www.ohiolink.edu/etd/view.cgi?acc_num=osu118193

Pavlic TP, Passino KM (2010a) Cooperative task processing. IEEE

Transactions on Automatic Control Submitted

Pavlic TP, Passino KM (2010b) The sunk-cost effect as an optimal

rate-maximizing behavior. Acta Biotheoretica Accepted pending

revisions

Pavlic TP, Passino KM (2010c) When rate maximization is impulsive.

Behavioral Ecology and Sociobiology doi:10.1007/s00265-010-0940-1,

in press

http://www.ohiolink.edu/etd/view.cgi?acc_num=osu1181936683

http://dx.doi.org/10.1007/s00265-010-0940-1

Documents

Engineering Serendipity: Successes in Solitary Foraging ... · The Biomimicry Institute David Brazier. Faulty connections Introduction Reﬂections Faulty connections Missed connections