Stochastic Robotics: Complexity, Compositionality, and Scalability

Dr. Theodore (Ted) P. Pavlic

Friday, February 22, 2013

http://www.tedpavlic.com/facjobsearch/

Overview

Introduction

Guarded Command Programming with Rates
  Application
  GCP with Rates
  Markov Analysis

Reaction Networks
  RoboBees
  Swarm Assembly

Cooperative Transport
  Modeling Ants
  SHS Generators

Cooperative Task Processing
  Application: Patrol
  Network Cooperation Game

Conclusions

Guarded Command Programming with Rates (Napp and Klavins 2011)

N. Napp and E. Klavins, “A compositional framework for programming stochastically interacting robots,” Int. J. Robot. Res. [Special Issue Stochasticity in Robot. Bio-Systems Part 2], vol. 30, no. 6, pp. 713–729, May 2011.

Application: Factory Floor testbed (Galloway et al. 2010)

Application: Factory Floor testbed routing (Galloway et al. 2010; Napp and Klavins 2011)

\[ \overbrace{1000}^{\text{Guard}} \to \overbrace{0100}^{\text{Action}} \]

1 0 0 0

0 1 0 0



Weakly related to GCP by Dijkstra (1975).

Non-deterministic reasoning about programs (predicate transformers).

x ≥ y → m := x ▯ y ≥ x → m := y

state = 0100 → state = 1000 ▯ state = 0100 → state = 0010
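The non-deterministic choice among enabled guards can be sketched in a few lines of Python. This is only an illustration of Dijkstra-style guarded commands, not code from the cited work; the names `run_guarded` and `MAX_PROGRAM` are hypothetical, and the uniform random pick is just one admissible way to resolve the non-determinism.

```python
import random

def run_guarded(commands, state, rng=random):
    """Apply the action of any one command whose guard holds in `state`.

    `commands` is a list of (guard, action) pairs of functions. Which enabled
    command fires is left open by the formalism; a uniform random pick is
    used here purely for illustration.
    """
    enabled = [action for guard, action in commands if guard(state)]
    if not enabled:
        raise RuntimeError("no guard is enabled")
    return rng.choice(enabled)(state)

# The maximum example from the slide: both guards hold when x == y,
# and either action then yields the same m.
MAX_PROGRAM = [
    (lambda s: s["x"] >= s["y"], lambda s: {**s, "m": s["x"]}),
    (lambda s: s["y"] >= s["x"], lambda s: {**s, "m": s["y"]}),
]

result = run_guarded(MAX_PROGRAM, {"x": 3, "y": 5})
```

When two guards overlap with different actions (as in the `state = 0100` example), repeated runs may produce different successor states, which is the behavior that rates later quantify.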



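\[ \overbrace{1000}^{\text{Guard}} \to \overbrace{0100}^{\text{Action}} \quad \text{(global state)} \]

\[ s_1 s_2 \to s_1 s_2 \quad \text{(site encoded)} \]

\[ s_i s_{i+1} \to s_i s_{i+1} \quad \text{(behavioral ND program)} \]

1 0 0 0

0 1 0 0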



For compositionality, augment GCP with exponential rates.

\[ s_i s_{i+1} \to s_i s_{i+1} \quad \text{(GCP)} \]

\[ s_i s_{i+1} \overset{k}{\rightharpoonup} s_i s_{i+1} \quad \text{(GCPR)} \]

So a GCPR command is now an edge of a Markov chain. Rates are either chosen programmatically or are used to model error rates.


• A GCPR $\Psi = \{(g_1, a_1, r_1), \dots, (g_n, a_n, r_n)\}$ is a set of commands that are each made up of a guard, action, and rate.


• A GCPR $\Psi$ can be scaled by $\sigma \in \mathbb{R}_{\ge 0}$ such that

\[ \sigma\Psi = \bigcup_{(g,a,r)\in\Psi} \{(g, a, \sigma r)\}. \]

• The composition $\Psi \cup \Phi$ of GCPR $\Psi$ and GCPR $\Phi$ is the union of the two programs.
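Scaling and composition are simple enough to sketch directly as set operations. The following is a minimal illustration, assuming guards and actions can be represented by opaque labels (strings here); `Command`, `scale`, and `compose` are hypothetical names, and the rates are made up.

```python
from typing import FrozenSet, NamedTuple

class Command(NamedTuple):
    """One GCPR command: a guard, an action, and an exponential rate."""
    guard: str
    action: str
    rate: float

Program = FrozenSet[Command]

def scale(program: Program, sigma: float) -> Program:
    """Return sigma * Psi: every rate multiplied by sigma >= 0."""
    if sigma < 0:
        raise ValueError("sigma must be non-negative")
    return frozenset(Command(c.guard, c.action, sigma * c.rate) for c in program)

def compose(psi: Program, phi: Program) -> Program:
    """Return the composition Psi ∪ Phi (plain set union)."""
    return psi | phi

# Illustrative programs: a nominal move and an undesired reverse move.
nominal = frozenset({Command("1000", "0100", 1.0)})
errors = frozenset({Command("0100", "1000", 0.1)})
mixed = compose(scale(nominal, 2.0), errors)
```

Because a program is just a set of triples, speeding up one behavior relative to another amounts to choosing the scalar before taking the union.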

Analysis of GCPR using Master Equation (Napp and Klavins 2011)

• Desired and undesired behaviors compose a single Markov process.


• Slow, robust behaviors can be mixed with fast, idealized behaviors.


• Rates and scalars are chosen to ensure adequate performance and error resilience.

• System described by linear master equation:

\[ \dot{\vec{p}} = \vec{p}\,Q \]

where $p_j$ is probability of state $j$ and $Q$ is graph Laplacian.

• Correctness: steady-state probability $\vec{p}^*$

• Performance: spectrum of $Q$ (i.e., $\lambda_2$)
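A small numerical sketch can make the master-equation analysis concrete. It assumes each command has already been resolved into a (source state, target state, rate) edge, builds the rate matrix with rows summing to zero, and integrates the master equation with a forward-Euler step. The two-state chain and its rates are invented for illustration, and `generator` and `integrate` are hypothetical helper names.

```python
def generator(n_states, edges):
    """Build the rate matrix Q from (source, target, rate) edges.

    Off-diagonal Q[i][j] is the rate from i to j; each diagonal entry makes
    its row sum to zero, so total probability is conserved.
    """
    Q = [[0.0] * n_states for _ in range(n_states)]
    for src, dst, rate in edges:
        Q[src][dst] += rate
        Q[src][src] -= rate
    return Q

def integrate(p, Q, t_end, dt=1e-3):
    """Forward-Euler integration of dp/dt = p Q, with p a row vector."""
    n = len(p)
    for _ in range(int(round(t_end / dt))):
        dp = [sum(p[i] * Q[i][j] for i in range(n)) for j in range(n)]
        p = [p[j] + dt * dp[j] for j in range(n)]
    return p

# Illustrative chain: state 0 -> 1 at rate 2.0 and state 1 -> 0 at rate 1.0.
Q = generator(2, [(0, 1, 2.0), (1, 0, 1.0)])
p_star = integrate([1.0, 0.0], Q, t_end=10.0)
```

For this chain the steady state solves $\vec{p}Q = 0$, giving $(1/3, 2/3)$. With this sign convention the only nonzero eigenvalue of $Q$ is $-3$, so transients decay like $e^{-3t}$. That eigenvalue is the kind of quantity the performance bullet refers to.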

Application: Artificial Pollination by RoboBees (Berman et al. 2011a,b)

S. Berman, V. Kumar, and R. Nagpal, “Design of control policies for spatially inhomogeneous robot swarms with application to commercial pollination,” in Proceedings of the 2011 IEEE International Conference on Robotics and Automation, Shanghai, China, May 9–13, 2011.

S. Berman, R. Nagpal, and A. Halasz, “Optimization of stochastic strategies for spatially inhomogeneous robot swarms: a case study in commercial pollination,” in Proceedings of the 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), San Francisco, CA, USA, September 25–30, 2011, pp. 3923–3930.

http://robobees.seas.harvard.edu/

http://youtu.be/GgR-mH6X5VU

RoboBees as Chemical Reaction Networks (Berman et al. 2011a,b)

http://mc.tt/adMGVzNp5u



• Reactions model behavior switches; choose rates programmatically.




• Motion governed by drift–diffusion process; choose field and diffusion coefficient programmatically.

\[ \vec{x}_i(t+\delta t) = \vec{x}_i(t) + \vec{v}(\vec{x}_i, t)\,\delta t + \sqrt{2D\,\delta t}\;\vec{Z}(t), \qquad Z_j(t) \sim \mathcal{N}(0, 1) \]




• Macroscopic design with advection–diffusion–reaction equations.

• Goal: Desired flower coverage (1500 robots, 3 minute bouts, wind).
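The microscopic drift–diffusion update for a single robot can be sketched as a short simulation loop. This is a hedged illustration in two dimensions using only the standard library. The velocity field, diffusion coefficient, and time step are arbitrary choices rather than values from the cited papers, and `step` and `simulate` are hypothetical names.

```python
import math
import random

def step(x, t, v, D, dt, rng):
    """One update x(t + dt) = x + v(x, t) dt + sqrt(2 D dt) Z, Z_j ~ N(0, 1)."""
    vx, vy = v(x, t)
    s = math.sqrt(2.0 * D * dt)
    return (x[0] + vx * dt + s * rng.gauss(0.0, 1.0),
            x[1] + vy * dt + s * rng.gauss(0.0, 1.0))

def simulate(x0, v, D, dt, n_steps, seed=0):
    """Iterate `step` from position x0 and return the final position."""
    rng = random.Random(seed)
    x, t = x0, 0.0
    for _ in range(n_steps):
        x = step(x, t, v, D, dt, rng)
        t += dt
    return x

def uniform_wind(x, t):
    """Assumed constant velocity field, for illustration only."""
    return (1.0, 0.5)

# With D = 0 the motion is pure advection; D > 0 adds a random spread.
x_drift_only = simulate((0.0, 0.0), uniform_wind, D=0.0, dt=0.1, n_steps=10)
x_noisy = simulate((0.0, 0.0), uniform_wind, D=0.2, dt=0.1, n_steps=10)
```

Running many such trajectories and histogramming positions gives the microscopic picture that an advection–diffusion–reaction model summarizes at the population level.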



Application: Swarm Robotic Assembly (Matthey et al. 2009)

L. Matthey, S. Berman, and V. Kumar, “Stochastic strategies for a swarm robotic assembly system,” in Proceedings of the 2009 IEEE International Conference on Robotics and Automation, Kobe, Japan, May 12–17, 2009, pp. 1953–1958.

Swarm Robotic Assembly as Chemical Reaction Network (Matthey et al. 2009)

Goal: Equal number of both parts assembled (i.e., $x_{F1} = x_{F2}$).

Results from CME Design: Micro-/Macro-scopic Fraction of Parts 1 and 2

Cooperative Transport in Ants (Kumar et al. 2013)

G. P. Kumar, A. Buffin, T. P. Pavlic, S. C. Pratt, and S. M. Berman, “A stochastic hybrid system model of collective transport in the desert ant Aphaenogaster cockerelli,” in Proceedings of the 16th ACM International Conference on Hybrid Systems: Computation and Control, Philadelphia, PA, April 8–11, 2013.

Aphaenogaster cockerelli ants

http://youtu.be/9RicT6EmbcE

http://youtu.be/8jcqzMvhEWo

Stochastic Hybrid System Model for Ants (Kumar et al. 2013)

Behavioral states: Front (F), Back (B), and Detached (D), with switching rates $k_{FB}$, $k_{BF}$, $k_{FD}$, $k_{DF}$, $k_{BD}$, and $k_{DB}$.

\[ \dot{x}_L = v_L \]

\[ m_L \dot{v}_L = \underbrace{N_F K (v_L^d - v_L)}_{\text{Individual behavior}} + \underbrace{\mu\,\mathrm{sgn}(v_L)\,\bigl(m_L g - (N_F + N_B) F_L\bigr)}_{\text{Friction}} \]

SHS Extended Generator (Kumar et al. 2013)

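Generalized Lie derivative of moments along trajectories of SHS systems:

• For function $\psi$,

\[ L\psi(\vec{x}) \triangleq \frac{\partial \psi}{\partial x_L}\,\dot{x}_L + \frac{\partial \psi}{\partial v_L}\,\dot{v}_L + \sum_{\substack{i,j \in \{F,B,D\} \\ i \neq j}} \bigl(\psi(\phi_{ij}(\vec{x})) - \psi(\vec{x})\bigr)\, k_{ij} N_i \]

and

\[ \frac{d}{dt} E(\psi(\vec{x})) = E(L\psi(\vec{x})). \]

So arbitrary moment dynamics can be derived from SHS model.

Behavioral switching rates and control parameters can be found by fitting moment dynamics to statistics.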
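As a worked instance of the generator, take $\psi(\vec{x}) = N_F$. This sketch assumes, as the three-state transition diagram suggests but the slides do not state explicitly, that the reset map $\phi_{ij}$ moves exactly one ant from state $i$ to state $j$. The continuous terms vanish because $N_F$ does not depend on $x_L$ or $v_L$, and only transitions out of or into F change $\psi$:

```latex
\begin{align*}
L N_F &= \underbrace{(-1)\,k_{FB} N_F + (-1)\,k_{FD} N_F}_{\text{ants leaving F}}
       + \underbrace{(+1)\,k_{BF} N_B + (+1)\,k_{DF} N_D}_{\text{ants entering F}}, \\
\frac{d}{dt} E(N_F) &= -(k_{FB} + k_{FD})\,E(N_F) + k_{BF}\,E(N_B) + k_{DF}\,E(N_D).
\end{align*}
```

Under that assumption the first-moment equations for $N_B$ and $N_D$ follow by the same bookkeeping, giving a closed linear system in the mean populations.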

Fitting Results (Kumar et al. 2013)

[Figure: five panels showing $N_F$, $N_B$, $N_D$, $x_L$ (cm), and $v_L$ (cm/s) versus time (s).]

[Figure: three sets of panels showing $N_F$, $N_B$, $x_L$ (cm), and $v_L$ (cm/s) versus time (s).]



Bonus – Ant Cognition: Drift–Diffusion Modeling of Quorum Sensing


T. P. Pavlic, S. C. Pratt, “Speed–accuracy tradeoffs in Temnothorax

rugatulus ants: sequential-sampling models of quorum detection while

house hunting” (IUSSI-NAS 2012, SMB 2013).

[Figure: evidence versus time, starting at $z$ with drift rate $v$ between thresholds $0$ and $a$; the two outcomes are Choice A and Choice B.]

“Drift” toward appropriate answer over time.

Process variability (noise) causes occasional errors.
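The sequential-sampling picture can be sketched as a single simulated trial: evidence starts at $z$, drifts at rate $v$, and stops at the first crossing of $0$ or $a$. This is a generic drift–diffusion illustration, not the fitted model from the cited talks. The noise scale `sigma`, the time step, and the name `ddm_trial` are assumptions, and the function reports only which threshold was hit rather than guessing which one the figure labels Choice A.

```python
import math
import random

def ddm_trial(v, a, z, sigma=1.0, dt=0.001, rng=None, max_steps=1_000_000):
    """Simulate one trial; return (hit_upper, decision_time).

    Evidence x starts at z and evolves as
        x += v * dt + sigma * sqrt(dt) * N(0, 1)
    until it reaches the upper threshold a or the lower threshold 0.
    """
    rng = rng or random.Random()
    x, t = z, 0.0
    for _ in range(max_steps):
        if x >= a:
            return True, t
        if x <= 0.0:
            return False, t
        x += v * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        t += dt
    raise RuntimeError("no threshold reached within max_steps")

# Noise-free trial: evidence rises steadily from z = 0.5 to a = 1.0.
hit_upper, decision_time = ddm_trial(v=1.0, a=1.0, z=0.5, sigma=0.0,
                                     dt=0.01, rng=random.Random(0))

# Noisy trials: some fraction ends at the lower threshold despite v > 0.
rng = random.Random(1)
outcomes = [ddm_trial(v=1.0, a=1.0, z=0.5, rng=rng)[0] for _ in range(200)]
upper_fraction = sum(outcomes) / len(outcomes)
```

Moving the thresholds farther apart lowers the error fraction at the cost of longer decision times, which is the speed–accuracy tradeoff named in the talk title.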

http://youtu.be/erpGaclTqqM

http://youtu.be/N4-H2UZ7lQU

Cooperative Task Processing (Pavlic and Passino 2011)

T. P. Pavlic and K. M. Passino, “Cooperative task-processing networks,” in Proceedings of

the Second International Workshop on Networks of Cooperating Objects, CONET 2011,

Chicago, IL, USA, April 11, 2011.

Application: Cooperative patrol (Pavlic and Passino 2011)

Example AAV application: three MQ-8 Fire Scouts on patrol


[Figure: patrol agents 1, 2, and 3, with task encounters (×) arriving at rates $\lambda_1$, $\lambda_2$, and $\lambda_3$.]

• Task-processing network (TPN): conveyor 1 and cooperators 2 and 3

• Each can be both conveyor and cooperator simultaneously

• Policy should be decentralized but still share load

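Cooperation game (Pavlic and Passino 2011)

For cooperator $i \in C$, its local rate of gain is

\[ U_i(\vec{\gamma}) \triangleq \overbrace{-c_i \underbrace{\Bigl(\prod_{j \in C_i} (1 - \gamma_j)\Bigr)}_{\Pr(\text{No } C_i \text{ volunteers} \mid i \text{ advertisement})}}^{\text{Conveyor costs}} + \underbrace{\gamma_i \sum_{j \in V_i} \Bigl(-\overbrace{P_s(j|i)}^{\Pr(i \text{ awarded task from } j \mid i \text{ volunteers})} c_{ij}\Bigr)}_{\text{Cooperator part}} \]

Costs of local processing on $i \in V$:

\[ c_i \triangleq \sum_{k \in Y_i} \lambda_i^k c_i^k \]

Costs and benefits to $i \in C$ for volunteering for tasks exported from $j \in V_i$:

\[ c_{ij} \triangleq \sum_{k \in Y_j} \lambda_j^k c_{ij}^k \]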




\[ U_i(\vec{\gamma}) \triangleq -c_i \Bigl(\prod_{j \in C_i} (1 - \gamma_j)\Bigr) + \underbrace{\gamma_i \sum_{j \in V_i} \bigl(p_{ij}(Q_j) - P_s(j|i)\, c_{ij}\bigr)}_{\text{Cooperator part: } \gamma_i,\ Q_j \text{ vary with } \gamma_i} \]

\[ p_{ij}(Q_j) \triangleq \sum_{k \in Y_j} \lambda_j^k q_{ij}^k\, p_j^k(Q_j) \]

Payment functions added as stabilizing controls (“quantity” $Q_i \triangleq \sum_{j \in C_i} \gamma_j$).

Cournot oligopolies on a graph (∝ jury volunteering)

Decreasing-cost externality

Totally asynchronous solver (Pavlic and Passino 2011)

• Totally asynchronous parallel computation of $\vec{\gamma}^*$ by local gradient ascent:


  ◦ Agents iterate asynchronously.

  ◦ Each agent operates on a possibly outdated copy of $\vec{\gamma}$.





• If synchronous transition mapping is a contraction with respect to maximum norm

\[ \|\vec{\gamma}\|_\infty \triangleq \max_{i \in C} \{|\gamma_i|\}, \]

then a unique Nash equilibrium exists and is asymptotically stable by totally asynchronous distributed gradient ascent iterations (Bertsekas and Tsitsiklis 1997).

• Constraints on topology and payment functions ensure contraction.

  ◦ Diagonal-dominance/convexity argument.

  ◦ Network structure ensures dominance.

Totally asynchronous projected gradient ascent


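Define $T : [0, 1]^n \to [0, 1]^n$ by $T(\vec{\gamma}) \triangleq (T_1(\vec{\gamma}), T_2(\vec{\gamma}), \dots, T_n(\vec{\gamma}))$ where, for each $i \in C$,

\[ T_i(\vec{\gamma}) \triangleq \min\{\gamma_i^{\max}, \max\{0, \gamma_i + \sigma_i \nabla_i U_i(\vec{\gamma})\}\} \]

Projected gradient ascent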




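For each $i \in C$, the step size $\sigma_i$ satisfies

\[ \frac{1}{\sigma_i} \geq 2 |V_i| \max_{k \in V_i} |p'_{ik}(0)| \]

for all $\vec{\gamma} \in [0, 1]^n$. If

\[ \overbrace{\min_{j \in V_i} |p'_{ij}(|C_j|)|}^{\text{Benefit}} > \overbrace{\Bigl(|V_i| - \tfrac{1}{2}\Bigr)}^{1/(\text{Relatedness})}\; \overbrace{\max_{j \in V_i} |c_{ij}|}^{\text{Cost}} \quad \text{for all } i \in C \]

(∼ Hamilton’s rule on networks), then the totally asynchronous distributed iteration (TADI) sequence $\{\vec{\gamma}(t)\}$ generated with mapping $T$ and the outdated estimate sequence $\{\vec{\gamma}^i(t)\}$ for all $i \in C$ each converge to the unique Nash equilibrium $\vec{\gamma}^*$ of the cooperation game.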
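The flavor of the iteration can be sketched on a toy three-agent game. Everything specific here is an assumption made only so the example is runnable: the payment function is taken to be linear, $p(Q) = a - bQ$, and the expected cost term $P_s(j|i)\,c_{ij}$ is replaced by a constant $\kappa$. Neither choice comes from the slides. With $C_i = V_i = \{1,2,3\} \setminus \{i\}$, the cooperator part then has gradient $\nabla_i U_i = 2(a - \kappa) - b\,(4\gamma_i + \sum_{j \neq i} \gamma_j)$. The names `grad`, `T_i`, and `tadi` are hypothetical.

```python
import random

A_MINUS_KAPPA = 0.3   # assumed net payment intercept a - kappa (illustrative)
B = 0.25              # assumed payment slope b (illustrative)

def grad(i, g):
    """Gradient of agent i's toy utility with respect to its own gamma_i."""
    others = sum(g) - g[i]
    return 2.0 * A_MINUS_KAPPA - B * (4.0 * g[i] + others)

def T_i(i, g, sigma=0.5, g_max=1.0):
    """Projected gradient-ascent update for agent i, clipped to [0, g_max]."""
    return min(g_max, max(0.0, g[i] + sigma * grad(i, g)))

def tadi(n_steps=3000, seed=0, stale_prob=0.5):
    """Asynchronous iteration in which agents act on possibly outdated copies.

    At each step one random agent wakes up, refreshes each remote entry of
    its local copy only with probability 1 - stale_prob, then updates its
    own component from that local copy.
    """
    rng = random.Random(seed)
    n = 3
    gamma = [rng.random() for _ in range(n)]
    local = [list(gamma) for _ in range(n)]
    for _ in range(n_steps):
        i = rng.randrange(n)
        for j in range(n):
            if j == i or rng.random() > stale_prob:
                local[i][j] = gamma[j]
        gamma[i] = T_i(i, local[i])
    return gamma

gamma_star = tadi()
```

For these numbers the synchronous map has row sums $|1 - 4b\sigma| + 2b\sigma = 0.75 < 1$, so it is a max-norm contraction, and the symmetric fixed point is $\gamma_i^* = 2(a - \kappa)/(6b) = 0.4$. Stale information slows the iteration but does not change that limit.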

Results: Cyclic feedback (Pavlic and Passino 2011)

Network: $C = V = \{1, 2, 3\}$, $C_i = V_i = \{1, 2, 3\} \setminus \{i\}$, $Y_j = \{j\}$, with encounter rates $\lambda_1^1$, $\lambda_2^2$, and $\lambda_3^3$.

[Figure: optimal cooperation willingness $\gamma_i^*$ for $i \in \{1, 2, 3\}$ versus $\lambda_2^2$ (encounters per second, 0 to 5), with $\lambda_1^1 = 0.6$ and $\lambda_3^3 = 1.7$; regions A, B, and C.]

A: $\lambda_2^2 < \lambda_1^1 < \lambda_3^3$, $\gamma_2^* > \gamma_1^* > \gamma_3^*$

B: $\lambda_1^1 < \lambda_2^2 < \lambda_3^3$, $\gamma_1^* > \gamma_2^* > \gamma_3^*$

C: $\lambda_1^1 < \lambda_3^3 < \lambda_2^2$, $\gamma_1^* > \gamma_3^* > \gamma_2^*$

Equilibrium in AAV patrol scenario (simulation results above match analytic predictions)


• Increases in one encounter rate (e.g., $\lambda_2$) cause equilibrium shift so neighbors (e.g., 1 and 3) help more and agent (e.g., 2) helps less


• Emergence due to market coupling from network cycles


[Figure: processing rate $R_i(\gamma_i^*)$ for $i \in \{1, 2, 3\}$ versus $\lambda_2^2$ (0 to 5), with $\lambda_1^1 = 0.6$ and $\lambda_3^3 = 1.7$; reference curves $R_{\max}$, $R_{\text{opt}}$, and $\lambda_1^1 + \lambda_2^2 + \lambda_3^3$; regions A, B, and C as in the previous figure.]

Stochastic Robotics: Conclusions

• Stochastic programming can model failures and blend programs for optimal performance (Napp and Klavins 2011)


• Chemical reaction networks provide reduced-order analysis frameworks for design and control (Berman et al. 2011a,b; Matthey et al. 2009)


• Stochastic hybrid system models combine stochastic behavioral switching with continuous-time dynamics (Kumar et al. 2013)


• Numerical solvers stabilize adaptive mixed-Nash equilibria on networks (Pavlic and Passino 2011)


• Acknowledgments:

• Questions? Comments?