Concurrent Revisions And Cloud Types

Sebastian Burckhardt

In collaboration with Daan Leijen, Manuel Fähndrich, Alexandro Baldassin, Benjamin Wood, Mooly Sagiv, Yuelu Duan, Alexey Gotsman, Hongseok Yang

Overview

Part I: Concurrent Revisions

Summary of prior work (What led me here)

Part II: Concurrent Revisions and Distributed Systems

Motivation:
a) Programming Apps for Mobile+Cloud
Proposed Solution:
b) Revision Consistency
c) Cloud Types

Part III: How to make it real: TouchDevelop

CONCURRENT REVISIONS
Part I

Parallel tasks: Pick any two

• Parallel Performance
• Frequent Conflicts
• Serializability

Our pick: Revisions

Concurrent Revisions 101

• When forking a task, state is copied as well.

• Task operates on its copy of the data in isolation.

• When task is joined, changes are merged.

• The merge is fully defined by the data type declarations.
  • some types may include custom merge functions
  • there is no failure, rollback, or retry
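The fork/join rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the `Revision` class, the two merge functions, and the deep copy (standing in for copy-on-write) are all hypothetical, and a write is detected by comparing values against the fork-time snapshot, which misses a write of an unchanged value.

```python
import copy

def merge_last_writer(original, main, joined):
    # Assumed default merge: the joined revision's value wins if it changed it.
    return joined if joined != original else main

def merge_counter(original, main, joined):
    # Assumed custom merge for additive integers: keep both deltas.
    return main + (joined - original)

class Revision:
    def __init__(self, state, merges):
        self.state = state                      # this revision's private copy
        self.merges = merges                    # per-field merge function
        self.snapshot = copy.deepcopy(state)    # state as of fork time

    def fork(self):
        # Forking copies the state; the child then works in isolation.
        return Revision(copy.deepcopy(self.state), self.merges)

    def join(self, other):
        # Joining merges field by field; it cannot fail, roll back, or retry.
        for key, merge in self.merges.items():
            self.state[key] = merge(other.snapshot[key],
                                    self.state[key],
                                    other.state[key])

main = Revision({"score": 0, "name": "a"},
                {"score": merge_counter, "name": merge_last_writer})
child = main.fork()
child.state["score"] += 2      # isolated from main
child.state["name"] = "b"
main.state["score"] += 1       # concurrent update in main
main.join(child)
print(main.state)
```

Both additions to `score` survive the join because its merge function combines deltas, while `name` takes the joined revision's value.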

[Figure: revision diagram with revisions A, B, C, D, connected by fork and join arrows]

Good for Parallel Programming

• On multicore: efficient thanks to copy-on-write

• Studied game application [OOPSLA 2010]: revisions provide more parallelization, better performance

• Programming model has good properties: Deterministic Parallel Programming [ESOP 2010]

• Can be extended to express both parallel and incremental computation: “Two for the price of one” [OOPSLA 2011], Distinguished Paper Award

Application Example: SpaceWars3D Game

Revisions helped with these challenges:

- Need stable snapshot for rendering task

- Need to parallelize tasks that may write to the same data (physics, collisions, network)

- Need to allow slow background tasks (e.g. autosave) to work on snapshot

Revision Diagram of Parallelized Game Loop

[Figure: revision diagram of the game loop. Tasks: Render, Physics, Network, Autosave (long running), Collision Detection (parts 1 to 4)]

Eliminated Read-Write Conflicts

All tasks see stable snapshot

Eliminated Write-Write Conflicts

Network after collision detection (CD) after physics

Understanding Concurrent Revisions: Operation-Based Interpretation

• Current state is determined by the update sequence along the path from the root.

• The tip of an arrow (arrow = end of a revision) counts as the aggregate of all operations along the revision.

A : integer = 0;
B : integer = 0;

[Figure: revision diagram with one fork and one join]

One revision: A.Get() -> 0 (sees only the initialization operation), then A.Set(1)
Concurrent revision: A.Set(2), B.Set(2)
After the join: A.Get() -> 1, B.Get() -> 2

Initialization counts as updates at the root: Set(A,0), Set(B,0). At the join, each arrow tip contributes its aggregate: A.Set(2) B.Set(2) for one revision and A.Set(1) for the other.

Puzzle 1

A : integer = 0

Two revisions update A concurrently: one performs A.Add(1), the other A.Add(2). After the join:

A.Get() -> ?

Answer:

Updates along path: A.Set(0), A.Add(1), A.Add(2)

Result: 3

Puzzle 2

A : integer = 0

One revision performs A.Add(1); a concurrent revision performs A.Set(1) followed by A.Add(1). After the join:

A.Get() -> ?

Answer:

Updates along path: A.Set(0), A.Add(1), A.Set(1), A.Add(1)

Result: 2

Puzzle 3

S : string = “”

The diagram contains the updates S.Append(“1”), S.Append(“2”), S.Append(“3”), S.Append(“4”) and two queries:

S.Get() -> ?
S.Get() -> ?

Answer 1

Updates along path: S.Set(“”), S.Append(“1”), S.Append(“3”)

Result: “13”

Answer 2

Updates along path: S.Set(“”), S.Append(“1”), S.Append(“2”), S.Append(“3”), S.Append(“4”)

Result: “1234”
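The puzzle answers can be checked mechanically by replaying the updates along each path. The sketch below assumes a tiny hypothetical encoding of updates as `(operation, argument)` pairs; the `replay` and `apply_update` helpers are illustrative names, not part of any library.

```python
def apply_update(state, op, arg):
    # Each update is total: it applies to any current state.
    if op == "set":
        return arg
    if op == "add":
        return state + arg
    if op == "append":
        return state + arg
    raise ValueError(f"unknown operation: {op}")

def replay(path):
    # The visible state is the result of the update sequence along the path.
    state = None
    for op, arg in path:
        state = apply_update(state, op, arg)
    return state

puzzle1 = replay([("set", 0), ("add", 1), ("add", 2)])
puzzle2 = replay([("set", 0), ("add", 1), ("set", 1), ("add", 1)])
puzzle3_first = replay([("set", ""), ("append", "1"), ("append", "3")])
puzzle3_second = replay([("set", ""), ("append", "1"), ("append", "2"),
                         ("append", "3"), ("append", "4")])
print(puzzle1, puzzle2, puzzle3_first, puzzle3_second)
```

The paths are copied from the answers above; the revision diagram decides which path a query sees, and the replay decides what value that path yields.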

Visibility & Arbitration in Revision Diagrams

• Visibility (who can see what updates?) = Reachability (is there a directed path?)

• Arbitration (whose update goes first?) = Cactus Walk

[Figure: revision diagram with vertices numbered 1 to 9 in cactus-walk order]

Not everything is a revision diagram: The join condition

Revision diagrams are subject to the join condition: A revision can only be joined into vertices that are reachable from the fork.

Invalid join, no path from fork.

Without join condition, causality may be violated.

[Figure: three revisions A, B, C]

• B sees updates of A
• C sees updates of B
• But C does not see updates of A

• Without join condition, visibility is not transitive.
• We prove in paper: enforcing join condition is sufficient to guarantee transitive visibility.
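Because visibility is reachability, the join condition can be checked with an ordinary graph search. The following is a minimal sketch assuming a diagram stored as an adjacency dictionary; the vertex names and the `join_allowed` helper are made up for illustration.

```python
def reachable(edges, start, goal):
    # Depth-first search for a directed path from start to goal.
    stack, seen = [start], set()
    while stack:
        vertex = stack.pop()
        if vertex == goal:
            return True
        if vertex in seen:
            continue
        seen.add(vertex)
        stack.extend(edges.get(vertex, []))
    return False

def join_allowed(edges, fork_vertex, target_vertex):
    # Join condition: the join target must be reachable from the fork.
    return reachable(edges, fork_vertex, target_vertex)

# Hypothetical diagram: main revision m0 -> m1 -> m2,
# revision r forked at m1, revision s forked at m0.
edges = {"m0": ["m1", "s0"], "m1": ["m2", "r0"]}

ok_join = join_allowed(edges, "m1", "m2")    # r joins back into main
bad_join = join_allowed(edges, "m1", "s0")   # r tries to join into s
print(ok_join, bad_join)
```

Joining r into s is rejected because s0 cannot be reached from r's fork vertex m1, which is the kind of join that would break transitive visibility.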

Conclusion of Part I

• Revision Diagrams
  • Make replication explicit
  • Provide a principled way to understand and define the effect of concurrent conflicting updates

• Concurrent Revisions
  • Can use revisions as a programming model to achieve better performance, or to express incremental + parallel algorithms

REVISIONS & DISTRIBUTED SYSTEMS
Part II

Revisions + Distributed Systems

• Can think of many applications.

• Revision pattern is commonly used for:
  • Source control systems (data structured as file systems, with per-file merge operations)
  • Classic web applications: load HTML form, edit locally, submit, server does merge
  • Modern web applications: read REST object, JavaScript modifies locally, write REST object

• We are currently focusing on this programming domain: Apps for Mobile + Cloud.

MOTIVATION
Part IIa

Why apps communicate

• Personal Publishing: Blog, Facebook Wall, Website
• Games: Real-time, Turn-based
• Data Collection: Surveys, High Scores
• Collaboration: OneNote, Shared Lists, Shared Calendar, Shared Spreadsheet
• Sync and Backup: Music, Video, SkyDrive
• Transactions: Store, Auction, Matchmaking
• Remote Control: Home Control, Robotics, Media Player

Requirements

• Persistence
  • Data is not deleted when we: quit the app, lose connection to server, take the battery out, crash due to bug, close the browser, replace the phone

• Reliability
  • Process is not lost (resume at last stable point)
  • Data integrity is protected

• Offline support
  • App continues to work without connection to cloud

• Security
  • Control who can do what

• Scalability
  • Support many users and/or large databases at low cost

Our focus.

Implies:
- Need replicas on client
- Must support eventual consistency

[Figure: a grocery list replicated on two devices, one showing milk, bread, eggs and the other cilantro, sardines, guava]

Implementation Architecture?

• Peer-to-Peer
  • Program runs on clients only, no server
  • Popular with researchers
  • Not all that common in practice

• Client-Server, or more recently Client-Service
  • Very common these days
  • Service typically hosted on virtualized infrastructure (cloud)
    -> makes “economy of scale” accessible to everybody

[Figure: three nodes, each with storage and compute]

Basic Using Cloud Infrastructure

[Figure: a distributed systems layer of compute and storage nodes in the cloud, with six clients attached]

How to program this machine?

Client
• Not physically secure
• Unreliable
• Cannot detect failures
• Potentially many

Cloud Compute
• Physically secure, not so many
• Not reliable: no persistent state
• Can detect failures somewhat
• Relatively expensive

Cloud Storage
• Secure
• Reliable
• Can be very cheap

Extensive Replication

[Figure: replicas at every level. Labels: App, Local State, GUI State, Binding, Storage Backend, save/restore, sync, Messages, Compute Layer, Local Storage]

Cache Coherence? Consistency Model?

App programmers should not have to think that much. All this stuff is about the implementation, not about the problem domain.

• Program runs on server, client is just a view
  • Example: Classic HTML approach
  • Client clicks link/button to submit, gets next page

• Program runs on client, uses server as a resource
  • Client issues web requests (e.g. REST)

• Program runs on client and on server
  • Example: websockets are full-duplex
  • Client and server can send messages, causing event handlers to launch at other end

• Peer-to-Peer
  • Program runs on clients only, no server
  • Rare for apps, as far as I know

Abstractions, please.

We propose:

- Revision Consistency
- Cloud Types

Papers:
- Eventually Consistent Transactions (ESOP 2011)
- Cloud Types (ECOOP 2012)

Closely Related Work

• CRDTs (Conflict-Free Replicated Data Types)
  • [Shapiro, Preguiça, Baquero, Zawirski]
  • Similar motivation and similar techniques

• Bayou
  • User-defined conflict resolution (merge functions)

REVISION CONSISTENCY
Part IIb

Revision Consistency

• Client code:
  • Declare data types
  • read/update data
  • yield (= polite sync)
  • flush (= forced sync)

• Under the hood: Revision diagram rules

[Figure: revision diagram spanning device 1, cloud, device 2]

Implicit Transactions

• At yield: Runtime has permission to send or receive updates. Call this frequently, e.g. automatically “on idle”.

• In between yields: Runtime is not allowed to send or receive updates.

• Implies: all client code executes in an (eventually consistent) transaction.

[Figure: device timelines divided into transactions by successive yield points]
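A toy simulation can make the yield rule concrete. Everything here is an assumption made for illustration: the `Cloud` and `Device` classes are hypothetical, the shared data is a single counter that only supports add, and every `yield_` call actually synchronizes, whereas the runtime described above merely has permission to.

```python
class Cloud:
    def __init__(self):
        self.log = []              # main revision: ordered add amounts

class Device:
    def __init__(self, cloud):
        self.cloud = cloud
        self.base = 0              # counter value as of the last sync
        self.pending = []          # local updates not yet sent

    def add(self, amount):
        self.pending.append(amount)    # purely local, never blocks

    def get(self):
        return self.base + sum(self.pending)

    def yield_(self):
        # Only here may updates be sent to or received from the cloud.
        self.cloud.log.extend(self.pending)
        self.pending = []
        self.base = sum(self.cloud.log)

cloud = Cloud()
d1, d2 = Device(cloud), Device(cloud)
d1.add(1)
d2.add(2)
before = (d1.get(), d2.get())      # each device sees only its own update
d1.yield_()
d2.yield_()
middle = (d1.get(), d2.get())      # d1 has not received d2's update yet
d1.yield_()
after = (d1.get(), d2.get())       # both replicas have converged
print(before, middle, after)
```

Between two yields a device's view changes only through its own updates, which is what makes the code in between behave like a transaction.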

On-Demand Stronger Consistency

• flush primitive blocks until local state has reached main revision and result has come back to device
  • Sufficient to implement strong consistency
  • Flush blocks, and times out if server connection is not available.

[Figure: device timeline showing flush (blocks), then (continue)]

Revision consistency

• Global state evolves as a revision diagram
• Main revision (center) in reliable cloud storage
• Seamless offline support
• Never blocks, except when client issues a flush (fence)

[Figure: revision diagram with revisions A to G; the main revision runs down the center between Client 1 and Client 2, which synchronize at yield and flush points]

Nice things about Revision Consistency

Strong guarantees:
• Guarantees causal eventual consistency
• Supports eventually consistent transactions
• Supports on-demand stronger consistency

Opportunities for efficient implementation
• Naturally supports storage hierarchies
• Consistent with full-duplex pull & push updates between service and client (e.g. websockets)
• Can be combined with “log reduction” techniques

Related to Log-Based Implementations

• Main Revision = Master Log

• Suggested Implementation:
  - Main Log in Cloud Storage
  - Scalable read & write

• It is possible to scale reading/writing of log

• Log Reduction

[Figure: revision diagram for Client 1 and Client 2 (revisions A to G) next to a log-based view of it, with transitions labeled YieldPull, YieldPush, FlushPush, FlushPull, SyncPush, SyncPull]

Can build layered service

[Figure: layers: Reliable Storage, Compute Layer, Compute Layer, Clients]

CLOUD TYPES
Part IIc

What is a cloud type?

• An abstract data type with
  • Initial value, e.g. { 0 }
  • Query operations, e.g. { get }
    • No side effects
  • Update operations, e.g. { set(x), add(x) }
    • Total (no preconditions)

• Good cloud types minimize programmer surprises.

Our goals for finding cloud types…

• Select only a few
  • But ensure many others can be derived

• Choose types with minimal anomalies
  • Updates should make sense even if state changes

Forces us to rethink basic data structuring.
• Objects & pointers fail the second criterion
• Entities & relations do better

Example App: Birdwatching

• An app for a birdwatching family.

• Start simple: let’s count the number of eagles seen.

var eagles : cloud integer;

Eventually consistent counting

[Figure: timeline across device 1, cloud, device 2. Operations shown: eagles.Set(1); eagles.Get() -> 1; eagles.Add(1) (three times); eagles.Get() -> 2; eagles.Get() -> 3]

Counting by bird

var birds: cloud array [name: string] {count : cloud integer}

[Figure: timeline across device 1, cloud, device 2]

Updates issued on the devices:

birds[“jay”].count.Add(1)
birds[“gull”].count.Add(2)
birds[“jay”].count.Add(5)

Once all updates are visible:

birds[“jay”].count.Get() -> 6

Important: all entries are already there, no need to insert key-value pairs.
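The "all entries are already there" idea can be mimicked on a single replica with a default-valued dictionary. This sketch is an assumption-laden illustration, not the cloud array implementation: the `CloudInteger` and `CloudArray` classes are hypothetical and model no synchronization at all.

```python
from collections import defaultdict

class CloudInteger:
    def __init__(self):
        self.value = 0             # initial value of every cell

    def add(self, amount):
        self.value += amount

    def set(self, amount):
        self.value = amount

    def get(self):
        return self.value

class CloudArray:
    def __init__(self):
        # Every key conceptually exists with the initial value, so there
        # is no insert operation and no "contains" check.
        self._cells = defaultdict(CloudInteger)

    def __getitem__(self, key):
        return self._cells[key]

    def entries(self):
        # Keys whose value differs from the initial value.
        return sorted(k for k, cell in self._cells.items() if cell.get() != 0)

birds = CloudArray()
birds["jay"].add(1)
birds["gull"].add(2)
birds["jay"].add(5)
jay_count = birds["jay"].get()
owl_count = birds["owl"].get()     # never written, still readable
seen = birds.entries()
print(jay_count, owl_count, seen)
```

Since updates such as `add` never depend on whether a key was inserted first, two devices adding to the same bird cannot race on creating the entry.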

Standard Map Semantics Would Not Work!

One device:

if birds.contains(“jay”)
  birds[“jay”].Add(5)
else
  birds.insert(“jay”, 5)

Another device, concurrently:

if birds.contains(“jay”)
  birds[“jay”].Add(3)
else
  birds.insert(“jay”, 3)

Result in the cloud: ?

Our Collection of Cloud Types

Primitive cloud types
• Cloud Integers: { get } { set(x), add(x) }
• Cloud Strings: { get } { set(s), set-if-empty(s) }

Structured cloud types
• Cloud Tables (cf. entities, tables with implicit primary key)
• Cloud Arrays (cf. key-value stores, relations)

Cloud Tables

• Declares
  • Fixed columns
  • Regular columns

• Initial value: empty

• Operations:
  • new E(f1,f2): add new row (at end)
  • all E: return all rows (top to bottom)
  • delete e: permanently delete row
  • e.f1: read fixed column
  • e.col_i.op: perform operation on cell

cloud table E
(
  f1: index_type1;
  f2: index_type2;
)
{
  col1: cloud_type1;
  col2: cloud_type2;
}

Cloud Arrays

• Initial value: for all keys, fields have initial value

• Operations:
  • A[i1,i2].val_i.op: perform operation on value
  • entries A: return entries for which at least one val_i is not the initial value

cloud array A
[
  idx1: index_type1;
  idx2: index_type2;
]
{
  val1: cloud_type1;
  val2: cloud_type2;
}

Arrays + Tables = Relational Data

• Tables
  • Define entities
  • Row identity = invisible primary key

• Arrays
  • Define arbitrary relations

• Code can access data using queries
  • For example, LINQ queries

Arrays + Tables = Relational Data

• Example: shopping cart

cloud table Customer
{
  name: cloud string;
}

cloud table Product
{
  description: cloud string;
}

cloud array ShoppingCart
[
  customer: Customer;
  product: Product;
]
{
  quantity: cloud integer;
}

function Add(c: Customer; p: Product; x: int)
{
  ShoppingCart[c,p].quantity.Add(x);
}

Arrays + Tables = Relational Data

• Example: binary relation

cloud table User
{
  name: cloud string;
}

cloud array friends
[
  user1 : User;
  user2 : User;
]
{
  value: cloud boolean;
}

Standard math: { relations AxBxC } = { functions AxBxC -> bool }

Arrays + Tables = Relational Data

• Example: linked tables

• Cascading delete: Order is deleted automatically when owning customer is deleted

cloud table Customer
{
  name: cloud string;
}

cloud table Order
(
  owner: Customer
)
{
  description: cloud string;
}

Linked tables solve the following problem:

One device:

delete customer;
foreach o in Orders
  if (o.owner = customer)
    delete o;

Another device, concurrently:

new Order(customer);

Result in the cloud: ?

Flush can be used to implement a lock

lock: cloud string;

function Lock()
{
  while (lock != my_id)
  {
    lock.setIfEmpty(my_id);
    flush;
  }
}

function Unlock()
{
  lock.set(“”);
}

We don’t recommend you actually do this in practice. (why?)
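To see why set-if-empty plus flush yields mutual exclusion, the loop body can be simulated against a single shared value. This is a sketch under strong assumptions: the `CloudString` class and `try_lock` helper are hypothetical, and flush is modeled as applying the update to the main revision immediately, with no network round trip, blocking, or timeout.

```python
class CloudString:
    """Stands in for the value of the cloud string in the main revision."""

    def __init__(self):
        self.value = ""

    def set(self, text):
        self.value = text

    def set_if_empty(self, text):
        # The update is total: it always applies, but only has an effect
        # if the string is still empty at that point in the main revision.
        if self.value == "":
            self.value = text

def try_lock(lock, my_id):
    # One iteration of the Lock() loop: setIfEmpty, then flush (assumed
    # instantaneous here), then check who holds the lock.
    lock.set_if_empty(my_id)
    return lock.value == my_id

lock = CloudString()
a_first = try_lock(lock, "A")      # A acquires the lock
b_first = try_lock(lock, "B")      # B loses and would keep looping
lock.set("")                       # A calls Unlock()
b_second = try_lock(lock, "B")     # B acquires on its next attempt
print(a_first, b_first, b_second)
```

In the real setting each failed attempt costs a blocking flush, so a waiting device repeatedly round-trips to the cloud.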

HOW TO MAKE IT REAL
Part III

The hypothesis

• Anyone with basic programming skills can write simple apps that share data in the cloud
  • No harder than writing a BASIC program or an Excel script
  • Just declare your cloud table or cloud indexes, and off you go.

• Success will mean: things work without pain, and users won’t even appreciate that there is a research problem behind it

What is TouchDevelop?

• A simple programming language

• An integrated development environment (IDE)

• Optimized for devices (small screens, touch input): no PC required, no keyboard required

• Runs on almost everything (iPad, iPhone, Android, PC, Windows Phone …)

• You can share your scripts online (public domain) or convert them to Windows 8 apps and sell them in the store

News about TouchDevelop

• Conversations by Nokia: "Create your own Lumia apps with TouchDevelop"
• Nikkei Computer: "スマホのアプリをスマホで開発" (in Japanese: "Developing smartphone apps on a smartphone")
• atmarkIT: "MSの開発環境「TouchDevelop" (in Japanese: "Microsoft's development environment TouchDevelop")
• PC-Welt: "Microsoft startet webbasierten App Creator für Windows 8" (in German: "Microsoft launches web-based app creator for Windows 8")
• neowin.net: "Microsoft launches web-based Windows 8 app creator"
• NYTimes: "Fostering Tech Talent in Schools"
• c't magazine: "Apps für Windows Phone" (in German: "Apps for Windows Phone")
• TechRepublic: "Fun with TouchDevelop, an IDE for Windows Phone 7"
• Social Times: "Microsoft Research TouchDevelop for Windows Phone: The First Social Cloud Programming Environment"
• c't magazine: "TouchDevelop" (in German)
• Social Times: "Microsoft Research TouchDevelop Makes Windows Phone a Game Changing Platform: Prepare to be Amazed"
• Geek Wire: "Microsoft ‘TouchDevelop’ uses phone to program phone"
