Concurrent Revisions And Cloud Types Sebastian Burckhardt In collaboration with Daan Leijen, Manuel Fähndrich, Alexandro Baldassin, Benjamin Wood, Mooly Sagiv, Yuelu Duan, Alexey Gotsman, Hongseok Yang


Page 1: Concurrent Revisions And Cloud Types

Concurrent Revisions And Cloud Types

Sebastian Burckhardt

In collaboration with Daan Leijen, Manuel Fähndrich, Alexandro Baldassin, Benjamin Wood, Mooly Sagiv, Yuelu Duan, Alexey Gotsman, Hongseok Yang

Page 2: Concurrent Revisions And Cloud Types

Overview

Part I: Concurrent Revisions

Summary of prior work (What led me here)

Part II: Concurrent Revisions and Distributed Systems

Motivation:
a) Programming Apps for Mobile + Cloud
Proposed Solution:
b) Revision Consistency
c) Cloud Types

Part III: How to make it real: TouchDevelop

Page 3: Concurrent Revisions And Cloud Types

Part I: CONCURRENT REVISIONS

Page 4: Concurrent Revisions And Cloud Types

Parallel tasks: Pick any two

• Parallel Performance
• Frequent Conflicts
• Serializability

Page 5: Concurrent Revisions And Cloud Types

Our pick

• Parallel Performance
• Frequent Conflicts
• Serializability

Revisions

Page 6: Concurrent Revisions And Cloud Types

Concurrent Revisions 101

• When forking a task, state is copied as well.

• Task operates on its copy of the data in isolation.

• When task is joined, changes are merged.

• The merge is fully defined by the data type declarations.
  • Some types may include custom merge functions.
  • There is no failure, rollback, or retry.

[Figure: revision diagram with revisions A, B, C, and D, showing three forks and two joins]
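The fork/join rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `Versioned`, `Revision`, `child_wins_if_changed`, and `additive` are hypothetical, a deep copy stands in for copy-on-write, and "changed" is approximated by comparing against the value at fork time.

```python
import copy


class Versioned:
    """A value paired with the merge rule its type declares (hypothetical helper)."""

    def __init__(self, value, merge):
        self.value = value
        self.merge = merge  # merge(at_fork, parent_now, child_now) -> merged value


def child_wins_if_changed(at_fork, parent_now, child_now):
    # The joined revision's value wins if it differs from the fork-time value
    # (a simplification of real write tracking).
    return child_now if child_now != at_fork else parent_now


def additive(at_fork, parent_now, child_now):
    # Custom merge: keep both sides' increments relative to the fork point.
    return parent_now + (child_now - at_fork)


class Revision:
    def __init__(self, state):
        self.state = state
        # Remember the values at fork time so join can tell what changed.
        self.at_fork = {name: cell.value for name, cell in state.items()}

    def fork(self):
        # A real implementation would use copy-on-write instead of a deep copy.
        return Revision(copy.deepcopy(self.state))

    def join(self, child):
        # Always succeeds: no failure, rollback, or retry.
        for name, cell in self.state.items():
            cell.value = cell.merge(
                child.at_fork[name], cell.value, child.state[name].value
            )


main = Revision({"x": Versioned(0, child_wins_if_changed), "n": Versioned(0, additive)})
child = main.fork()
child.state["x"].value = 1      # child writes to its own copy
child.state["n"].value += 2
main.state["n"].value += 3      # parent writes concurrently
seen_by_main_before_join = main.state["x"].value  # child's write is not visible yet
main.join(child)
```

After the join, `x` carries the child's write and `n` carries both increments; the merge outcome depends only on the declared merge functions, not on timing.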

Page 7: Concurrent Revisions And Cloud Types

Good for Parallel Programming

• On multicore: efficient thanks to copy-on-write

• Studied game application [OOPSLA 2010]: revisions provide more parallelization, better performance

• Programming model has good properties: Deterministic Parallel Programming [ESOP 2010]

• Can be extended to express both parallel and incremental computation: “Two for the price of one” [OOPSLA 2011], Distinguished Paper Award

Page 8: Concurrent Revisions And Cloud Types

Application Example: SpaceWars3D Game

Revisions helped with these challenges:

- Need stable snapshot for rendering task

- Need to parallelize tasks that may write to same data (physics, collisions, network)

- Need to allow slow background tasks (e.g. autosave) to work on snapshot

Page 9: Concurrent Revisions And Cloud Types

Application Example: SpaceWars3D Game

Revision Diagram of Parallelized Game Loop

[Figure: revision diagram of the game loop with revisions for Render, Physics, Network, Autosave (long running), and Collision Detection (parts 1 to 4)]

Page 10: Concurrent Revisions And Cloud Types

Application Example: SpaceWars3D Game

Eliminated Read-Write Conflicts

[Figure: the same revision diagram of the game loop: Render, Physics, Network, Autosave (long running), Collision Detection (parts 1 to 4)]

All tasks see stable snapshot

Page 11: Concurrent Revisions And Cloud Types

Application Example: SpaceWars3D Game

Eliminated Write-Write Conflicts

[Figure: the same revision diagram of the game loop: Render, Physics, Network, Autosave (long running), Collision Detection (parts 1 to 4)]

Network after CD (Collision Detection) after Physics

Page 12: Concurrent Revisions And Cloud Types

Understanding Concurrent Revisions: Operation-Based Interpretation

• Current state is determined by the update sequence along the path from the root.

• The tip of an arrow (arrow = end of a revision) counts as the aggregate of all operations along that revision.

A : integer = 0;
B : integer = 0;

[Figure: revision diagram with two concurrent revisions]
One revision: A.Get() -> 0; A.Set(1)
Other revision: A.Set(2); B.Set(2)
After the join: A.Get() -> 1; B.Get() -> 2

Page 13: Concurrent Revisions And Cloud Types

Sees only the initialization operation

[Figure: the same revision diagram; the path to A.Get() -> 0 contains only the initialization Set(A,0), Set(B,0)]

Page 14: Concurrent Revisions And Cloud Types

[Figure: the same revision diagram; the path to the final reads is A.Set(0), B.Set(0), then A.Set(2), B.Set(2), then A.Set(1), so A.Get() -> 1 and B.Get() -> 2]
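The operation-based reading can be checked by replaying the updates along a path. The sketch below is an illustration only; the `replay` helper and the tuple encoding of operations are assumptions, not part of the original system.

```python
def replay(path):
    """Apply (variable, operation, argument) updates in order; return the final state."""
    state = {}
    for var, op, arg in path:
        if op == "Set":
            state[var] = arg
        elif op == "Add":
            state[var] = state.get(var, 0) + arg
        else:
            raise ValueError(f"unknown operation {op}")
    return state


# Path seen by the early read: only the initialization.
early = replay([("A", "Set", 0), ("B", "Set", 0)])

# Path seen by the final reads: initialization, then one revision's updates,
# then the joined revision's updates as one aggregate.
final = replay([
    ("A", "Set", 0), ("B", "Set", 0),
    ("A", "Set", 2), ("B", "Set", 2),
    ("A", "Set", 1),
])
```

Here `early["A"]` is 0, while `final` gives A = 1 and B = 2, matching the reads on the slide.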

Page 15: Concurrent Revisions And Cloud Types

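Puzzle 1

A : integer = 0

[Figure: revision diagram with two concurrent revisions, one performing A.Add(1) and one performing A.Add(2); after the join, A.Get() -> ?]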

Page 16: Concurrent Revisions And Cloud Types

Puzzle 1

A : integer = 0

A.Get() -> 3

Answer:

Updates along path: A.Set(0), A.Add(1), A.Add(2)

Result: 3

Page 17: Concurrent Revisions And Cloud Types

Puzzle 2

A : integer = 0

[Figure: revision diagram with the operations A.Add(1), A.Set(1), and A.Add(1); after the join, A.Get() -> ?]

Page 18: Concurrent Revisions And Cloud Types

Puzzle 2

A.Get() -> 2

Answer:

Updates along path: A.Set(0), A.Add(1), A.Set(1), A.Add(1)

Result: 2
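Both integer puzzles can be verified by replaying the listed update paths. In this minimal sketch the `replay` helper is hypothetical, and the "swapped" orders are added only to show why arbitration matters once Set is involved.

```python
def replay(ops, initial=0):
    """Apply integer updates in order: "Set" overwrites, "Add" increments."""
    value = initial
    for op, x in ops:
        value = x if op == "Set" else value + x
    return value


# Puzzle 1: additions commute, so either arbitration order gives the same result.
puzzle1 = replay([("Set", 0), ("Add", 1), ("Add", 2)])
puzzle1_swapped = replay([("Set", 0), ("Add", 2), ("Add", 1)])

# Puzzle 2: Set does not commute with Add, so the order along the path matters.
puzzle2 = replay([("Set", 0), ("Add", 1), ("Set", 1), ("Add", 1)])
puzzle2_swapped = replay([("Set", 0), ("Add", 1), ("Add", 1), ("Set", 1)])
```

The path order from the slides yields 3 and 2; swapping the last two updates leaves Puzzle 1 unchanged but turns Puzzle 2 into 1.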

Page 19: Concurrent Revisions And Cloud Types

Puzzle 3

S : string = “”

[Figure: revision diagram with the operations S.Append(“1”), S.Append(“2”), S.Append(“3”), S.Append(“4”) and two reads, each shown as S.Get() -> ?]

Page 20: Concurrent Revisions And Cloud Types

Puzzle 3

S.Get() -> “13”

Answer 1

Updates along path: S.Set(“”), S.Append(“1”), S.Append(“3”)

Result: “13”

Page 21: Concurrent Revisions And Cloud Types

Puzzle 3

S.Get() -> “13”

S.Get() -> “1234”

Answer 2

Updates along path: S.Set(“”), S.Append(“1”), S.Append(“2”), S.Append(“3”), S.Append(“4”)

Result: “1234”
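One arrangement of forks and joins consistent with both answers can be simulated as follows. The `Rev` class is a hypothetical model in which a join appends the joined revision's updates, as one aggregate, after the updates already on the joining revision.

```python
class Rev:
    """Toy revision over a string: updates inherited at fork time plus its own updates."""

    def __init__(self, inherited=()):
        self.inherited = list(inherited)  # updates visible when this revision was forked
        self.own = []                     # updates made (or joined in) since then

    def append(self, s):
        self.own.append(s)

    def get(self):
        # Current state = all updates along the path from the root, in order.
        return "".join(self.inherited + self.own)

    def fork(self):
        return Rev(self.inherited + self.own)

    def join(self, child):
        # The child's updates count as one aggregate placed at the join point.
        self.own.extend(child.own)


main = Rev()
main.append("1")
child = main.fork()
main.append("2")          # not visible to the child
child.append("3")
first_read = child.get()  # path: "", "1", "3"
child.append("4")
main.join(child)
second_read = main.get()  # path: "", "1", "2", "3", "4"
```

The first read returns "13" because the child never sees "2"; the second returns "1234" because the child's appends are ordered after the main revision's own.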

Page 22: Concurrent Revisions And Cloud Types

Visibility & Arbitration in Revision Diagrams

• Visibility: who can see what updates?
  = Reachability: is there a (directed) path?

• Arbitration: whose update goes first?
  = Cactus Walk

[Figure: revision diagram with vertices numbered 1 to 9]

Page 23: Concurrent Revisions And Cloud Types

Not everything is a revision diagram: The join condition

Revision diagrams are subject to the join condition: A revision can only be joined into vertices that are reachable from the fork.

Invalid join, no path from fork.

Page 24: Concurrent Revisions And Cloud Types

Without join condition, causality may be violated.

[Figure: diagram with revisions A, B, and C]

• B sees updates of A
• C sees updates of B
• But C does not see updates of A

• Without join condition, visibility is not transitive.
• We prove in the paper: enforcing the join condition is sufficient to guarantee transitive visibility.
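Since visibility is reachability, the join condition amounts to a path check in the revision diagram. The sketch below assumes the diagram is given as an adjacency dictionary; the vertex names and the `join_allowed` helper are hypothetical.

```python
from collections import deque


def reachable(edges, src, dst):
    """True if a directed path leads from src to dst (or src == dst)."""
    seen, queue = {src}, deque([src])
    while queue:
        v = queue.popleft()
        if v == dst:
            return True
        for w in edges.get(v, ()):
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return False


def join_allowed(edges, fork_vertex, join_target):
    # Join condition: the target must be reachable from the vertex
    # where the joined revision was forked.
    return reachable(edges, fork_vertex, join_target)


# root forks "other"; later, vertex f forks child "c1" while the main line continues.
edges = {
    "root": ["f", "other"],
    "f": ["m1", "c1"],
    "m1": ["m2"],
}
valid = join_allowed(edges, "f", "m2")        # main line descends from the fork
invalid = join_allowed(edges, "f", "other")   # no path from the fork
```

A join of the child into `m2` is valid, while a join into `other` is rejected because `other` never saw the state at the fork.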

Page 25: Concurrent Revisions And Cloud Types

Conclusion of Part I

• Revision Diagrams
  • Make replication explicit
  • Provide a principled way to understand and define the effect of concurrent conflicting updates

• Concurrent Revisions
  • Can use revisions as a programming model to achieve better performance, or to express incremental + parallel algorithms

Page 26: Concurrent Revisions And Cloud Types

Part II: REVISIONS & DISTRIBUTED SYSTEMS

Page 27: Concurrent Revisions And Cloud Types

Revisions + Distributed Systems

• Can think of many applications.

• Revision pattern is commonly used for:
  • Source control systems (data structured as file systems, with per-file merge operations)
  • Classic web applications: load HTML form, edit locally, submit, server does merge
  • Modern web applications: read REST object, JavaScript modifies locally, write REST object

• We are currently focusing on this programming domain: Apps for Mobile + Cloud.


Page 29: Concurrent Revisions And Cloud Types

Part IIa: MOTIVATION

Page 30: Concurrent Revisions And Cloud Types

Why apps communicate

• Personal Publishing: Blog, Facebook Wall, Website
• Games: Real-time, Turn-based
• Data Collection: Surveys, High Scores
• Collaboration: OneNote, Shared Lists, Shared Calendar, Shared Spreadsheet
• Sync and Backup: Music, Video, SkyDrive
• Transactions: Store, Auction, Matchmaking
• Remote Control: Home Control, Robotics, Media Player

Page 31: Concurrent Revisions And Cloud Types

Requirements

• Persistence
  • Data is not deleted when we: quit the app, lose connection to server, take the battery out, crash due to bug, close the browser, replace the phone
• Reliability
  • Process is not lost (resume at last stable point)
  • Data integrity is protected
• Offline support
  • App continues to work without connection to cloud
• Security
  • Control who can do what
• Scalability
  • Support many users and/or large databases at low cost

Page 32: Concurrent Revisions And Cloud Types

Requirements

Our focus.

Implies:
- Need replicas on client
- Must support eventual consistency

Page 33: Concurrent Revisions And Cloud Types

grocery list

milk, bread, eggs

cilantro, sardines, guava

Page 34: Concurrent Revisions And Cloud Types

Implementation Architecture?

• Peer-to-Peer
  • Program runs on clients only, no server
  • Popular with researchers
  • Not all that common in practice

• Client-Server, or more recently Client-Service
  • Very common these days
  • Service typically hosted on virtualized infrastructure (cloud)
    -> makes “economy of scale” accessible to everybody

Page 35: Concurrent Revisions And Cloud Types

Distributed Systems: Basic / Using Cloud Infrastructure

[Figure: three nodes that each combine storage and compute, and a cloud setup with separate storage and compute layers serving six clients]

Page 36: Concurrent Revisions And Cloud Types

How to program this machine?

[Figure: storage layer, compute layer, and six clients]

• Client
  • Not physically secure
  • Unreliable
  • Cannot detect failures
  • Potentially many

• Cloud Compute
  • Physically secure, not so many
  • Not reliable: no persistent state
  • Can detect failures somewhat
  • Relatively expensive

• Cloud Storage
  • Secure
  • Reliable
  • Can be very cheap

Page 37: Concurrent Revisions And Cloud Types

Extensive Replication

[Figure: two apps, each with local state, GUI state, a binding between them, and save/restore to local storage; messages and sync connect them through several compute-layer replicas to a storage backend]

Cache Coherence? Consistency Model?

Page 38: Concurrent Revisions And Cloud Types

grocery list

Page 39: Concurrent Revisions And Cloud Types

App programmers should not have to think that much. All this stuff is about the implementation, not about the problem domain.

• Program runs on server, client is just a view
  • Example: Classic HTML approach
  • Client clicks link/button to submit, gets next page
• Program runs on client, uses server as a resource
  • Client issues web requests (e.g. REST)
• Program runs on client and on server
  • Example: websockets are full-duplex
  • Client and server can send messages, causing event handlers to launch at other end
• Peer-to-Peer
  • Program runs on clients only, no server
  • Rare for apps, as far as I know

[Figure: the replication diagram again: two apps with local state, GUI state, binding, and local storage, connected by messages and sync through compute-layer replicas to a storage backend]

Page 40: Concurrent Revisions And Cloud Types

Abstractions, please.

We propose:

- Revision Consistency
- Cloud Types

Papers:
- Eventually Consistent Transaction (ESOP 2011)
- Cloud Types (ECOOP 2012)

Page 41: Concurrent Revisions And Cloud Types

Closely Related Work

• CRDTs (Conflict-Free Replicated Data Types)
  • [Shapiro, Preguica, Baquero, Zawirski]
  • Similar motivation and similar techniques

• Bayou
  • User-defined conflict resolution (merge functions)

Page 42: Concurrent Revisions And Cloud Types

Part IIb: REVISION CONSISTENCY

Page 43: Concurrent Revisions And Cloud Types

Revision Consistency

[Figure: revision diagram across device 1, cloud, and device 2]

• Client code:
  • Declare data types
  • read/update data
  • yield (= polite sync)
  • flush (= forced sync)

• Under the hood: revision diagram rules

Page 44: Concurrent Revisions And Cloud Types

Implicit Transactions

• At yield: the runtime has permission to send or receive updates. Call this frequently, e.g. automatically “on idle”.

• In between yields: the runtime is not allowed to send or receive updates.

• Implies: all client code executes in an (eventually consistent) transaction.

[Figure: revision diagram across device 1, cloud, and device 2, with yield points on both devices]

Page 45: Concurrent Revisions And Cloud Types

On-Demand Stronger Consistency

• The flush primitive blocks until the local state has reached the main revision and the result has come back to the device.
  • Sufficient to implement strong consistency
  • Flush blocks, and times out if the server connection is not available.

[Figure: device timeline showing flush (blocks), then (continue)]

Page 46: Concurrent Revisions And Cloud Types

Revision consistency

• Global state evolves as a revision diagram
• Main revision (center) in reliable cloud storage
• Seamless offline support
• Never blocks, except when the client issues a fence (flush)

[Figure: revision diagram with vertices A to G; the main revision runs down the center between Client 1 and Client 2, with one flush and several yields]

Page 47: Concurrent Revisions And Cloud Types

Revision consistency

[Figure: the same revision diagram with vertices A to G for Client 1 and Client 2, here with one fence and several yields]

Page 48: Concurrent Revisions And Cloud Types

Nice things about Revision Consistency

Strong guarantees:
• Guarantees causal eventual consistency
• Supports eventually consistent transactions
• Supports on-demand stronger consistency

Opportunities for efficient implementation:
• Naturally supports storage hierarchies
• Consistent with full-duplex pull & push updates between service and client (e.g. websockets)
• Can be combined with “log reduction” techniques

Page 49: Concurrent Revisions And Cloud Types

Related to Log-Based Implementations

• Main Revision = Master Log

• Suggested Implementation:
  - Main Log in Cloud Storage
  - Scalable read & write

• It is possible to scale reading/writing of log

• Log Reduction

[Figure: revision diagram with vertices A to G for Client 1 and Client 2, next to the corresponding log of the main revision]

Page 50: Concurrent Revisions And Cloud Types

Can build layered service

[Figure: layered service with Reliable Storage, two Compute Layer nodes, and two Clients; transitions are labeled YieldPush, YieldPull, FlushPush, FlushPull, SyncPush, and SyncPull, and nodes carry version numbers from 0 to 4]

Page 51: Concurrent Revisions And Cloud Types

Part IIc: CLOUD TYPES

Page 52: Concurrent Revisions And Cloud Types

What is a cloud type?

• An abstract data type with
  • Initial value, e.g. { 0 }
  • Query operations, e.g. { get }
    • No side effects
  • Update operations, e.g. { set(x), add(x) }
    • Total (no preconditions)

• Good cloud types minimize programmer surprises.

Page 53: Concurrent Revisions And Cloud Types

Our goals for finding cloud types…

• Select only a few
  • But ensure many others can be derived

• Choose types with minimal anomalies
  • Updates should make sense even if state changes

Forces us to rethink basic data structuring.
• Objects & pointers fail the second criterion
• Entities & relations do better

Page 54: Concurrent Revisions And Cloud Types

Example App: Birdwatching

• An app for a birdwatching family.

• Start simple: let’s count the number of eagles seen.

var eagles : cloud integer;

Page 55: Concurrent Revisions And Cloud Types

Eventually consistent counting

var eagles : cloud integer;

[Figure: revision diagram across device 1, cloud, and device 2 with the operations eagles.Add(1), eagles.Set(1), eagles.Get() -> 1, eagles.Add(1), eagles.Get() -> 3, eagles.Add(1), eagles.Get() -> 2]
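To see why a cloud integer offers Add in addition to Set, consider the toy model below. It is a sketch under assumptions: `CloudInteger`, `Cloud`, and `sync` are hypothetical names, and `sync` stands in for whatever exchange the runtime performs at a yield.

```python
def apply_ops(value, ops):
    """Replay buffered updates in order: "set" overwrites, "add" increments."""
    for op, x in ops:
        value = x if op == "set" else value + x
    return value


class CloudInteger:
    """Device-side replica: last state received from the cloud plus pending updates."""

    def __init__(self):
        self.base = 0
        self.pending = []

    def add(self, x):
        self.pending.append(("add", x))

    def set(self, x):
        self.pending.append(("set", x))

    def get(self):
        return apply_ops(self.base, self.pending)


class Cloud:
    """Main revision: applies each device's pending updates when it syncs."""

    def __init__(self):
        self.value = 0

    def sync(self, replica):
        self.value = apply_ops(self.value, replica.pending)
        replica.pending = []
        replica.base = self.value


# Two devices each count one eagle while offline, using Add.
cloud, d1, d2 = Cloud(), CloudInteger(), CloudInteger()
d1.add(1)
d2.add(1)
cloud.sync(d1)
cloud.sync(d2)
total_with_add = cloud.value

# The same scenario written as read-modify-write with Set loses an update.
cloud2, e1, e2 = Cloud(), CloudInteger(), CloudInteger()
e1.set(e1.get() + 1)
e2.set(e2.get() + 1)
cloud2.sync(e1)
cloud2.sync(e2)
total_with_set = cloud2.value
```

With Add both sightings survive the merge; with Set the second device overwrites the first, which is the kind of anomaly a well-chosen cloud type is meant to avoid.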

Page 56: Concurrent Revisions And Cloud Types

Counting by bird

var birds: cloud array [name: string] {count : cloud integer}

[Figure: revision diagram across device 1, cloud, and device 2 with the operations below]

birds[“jay”].count.Add(1)
birds[“gull”].count.Add(2)

birds[“jay”].count.Add(5)

birds[“jay”].count.Get() -> 6

Important: all entries are already there, no need to insert key-value pairs.

Page 57: Concurrent Revisions And Cloud Types

Standard Map Semantics Would not Work!

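[Figure: revision diagram across device 1, cloud, and device 2; each device runs one of the snippets below, and the merged result is marked “?”]

if birds.contains(“jay”)
  birds[“jay”].Add(5)
else
  birds.insert(“jay”, 5)

if birds.contains(“jay”)
  birds[“jay”].Add(3)
else
  birds.insert(“jay”, 3)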
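The anomaly can be reproduced with a small simulation. The sketch assumes that an insert into a standard map behaves like an overwrite when update logs are merged, and that logs are replayed device by device; `merge`, `map_style`, and `cloud_array_style` are hypothetical helpers.

```python
from collections import defaultdict


def merge(logs):
    """Replay each device's update log in order (assumed arbitration: device order)."""
    counts = defaultdict(int)  # every key is implicitly present with initial value 0
    for log in logs:
        for op, key, x in log:
            if op == "insert":   # standard map: insert overwrites the entry
                counts[key] = x
            else:                # "add"
                counts[key] += x
    return dict(counts)


def map_style(local_view, key, x):
    # Contains-then-insert logic, decided against the device's local view.
    return ("add", key, x) if key in local_view else ("insert", key, x)


def cloud_array_style(key, x):
    # Cloud array: the entry always exists, so the update is always an add.
    return ("add", key, x)


# Both devices start with an empty local view and record jays concurrently.
with_map = merge([[map_style({}, "jay", 5)], [map_style({}, "jay", 3)]])
with_cloud_array = merge([[cloud_array_style("jay", 5)], [cloud_array_style("jay", 3)]])
```

With map semantics both devices decide to insert, and one count overwrites the other; with cloud-array semantics the two additions combine to 8.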

Page 58: Concurrent Revisions And Cloud Types

Our Collection of Cloud Types

Primitive cloud types
• Cloud Integers: { get } { set(x), add(x) }
• Cloud Strings: { get } { set(s), set-if-empty(s) }

Structured cloud types
• Cloud Tables (cf. entities, tables with implicit primary key)
• Cloud Arrays (cf. key-value stores, relations)

Page 59: Concurrent Revisions And Cloud Types

Cloud Tables

• Declares
  • Fixed columns
  • Regular columns

• Initial value: empty

• Operations:
  • new E(f1,f2): add new row (at end)
  • all E: return all rows (top to bottom)
  • delete e: permanently delete row
  • e.f1: read fixed column
  • e.col_i.op: perform operation on cell

cloud table E
(
  f1: index_type1;
  f2: index_type2;
)
{
  col1: cloud_type1;
  col2: cloud_type2;
}

Page 60: Concurrent Revisions And Cloud Types

Cloud Arrays

• Initial value: for all keys, fields have initial value
• Operations:
  • A[i1,i2].val_i.op: perform operation on value
  • entries A: return entries for which at least one val_i is not the initial value

cloud array A
[
  idx1: index_type1;
  idx2: index_type2;
]
{
  val1: cloud_type1;
  val2: cloud_type2;
}
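A local, non-replicated model of these semantics may help. In this sketch the `CloudArray` class and its method names are hypothetical, and each entry holds a single integer value rather than several fields.

```python
class CloudArray:
    """Toy model: every key exists implicitly and starts at the initial value."""

    def __init__(self, initial=0):
        self.initial = initial
        self.cells = {}  # only stores keys that have been updated

    def get(self, key):
        return self.cells.get(key, self.initial)

    def add(self, key, x):
        self.cells[key] = self.get(key) + x

    def entries(self):
        # Entries whose value differs from the initial value.
        return [key for key, value in self.cells.items() if value != self.initial]


cart = CloudArray()
cart.add(("alice", "milk"), 2)
cart.add(("bob", "eggs"), 1)
cart.add(("bob", "eggs"), -1)       # back to the initial value
untouched = cart.get(("carol", "tea"))
visible = cart.entries()
```

Reading a key that was never written returns the initial value, and `entries` omits the ("bob", "eggs") key once its value returns to 0, so there is no separate insert or delete operation to conflict on.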

Page 61: Concurrent Revisions And Cloud Types

Arrays + Tables = Relational Data

• Tables
  • Define entities
  • Row identity = invisible primary key

• Arrays
  • Define arbitrary relations

• Code can access data using queries
  • For example, LINQ queries

Page 62: Concurrent Revisions And Cloud Types

Arrays + Tables = Relational Data

• Example: shopping cart

cloud table Customer
{
  name: cloud string;
}

cloud table Product
{
  description: cloud string;
}

cloud array ShoppingCart
[
  customer: Customer;
  product: Product;
]
{
  quantity: cloud integer;
}

function Add(c: Customer; p: Product; x: int)
{
  ShoppingCart[c,p].quantity.Add(x);
}

Page 63: Concurrent Revisions And Cloud Types

Arrays + Tables = Relational Data

• Example: binary relation

cloud table User
{
  name: cloud string;
}

cloud array friends
[
  user1 : User;
  user2 : User;
]
{
  value: cloud boolean;
}

Standard math: { relations AxBxC } = { functions AxBxC -> bool }

Page 64: Concurrent Revisions And Cloud Types

Arrays + Tables = Relational Data

• Example: linked tables

• Cascading delete: Order is deleted automatically when owning customer is deleted

cloud table Customer
{
  name: cloud string;
}

cloud table Order
[
  owner: Customer
]
{
  description: cloud string;
}

Page 65: Concurrent Revisions And Cloud Types

Linked tables solve following problem:

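[Figure: revision diagram across device 1, cloud, and device 2; one device deletes a customer and its orders while the other concurrently creates a new order for that customer, and the merged result is marked “?”]

One device:

delete customer;
foreach o in Orders
  if (o.owner == customer)
    delete o;

Other device, concurrently:

new Order(customer);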

Page 66: Concurrent Revisions And Cloud Types

Flush can be used to implement a lock

We don’t recommend you actually do this in practice. (Why?)

lock: cloud string;

function Lock()
{
  while (lock != my_id)
  {
    lock.setIfEmpty(my_id);
    flush;
  }
}

function Unlock()
{
  lock.set(“”);
}

Page 67: Concurrent Revisions And Cloud Types

Part III: HOW TO MAKE IT REAL

Page 68: Concurrent Revisions And Cloud Types

The hypothesis

• Anyone with basic programming skills can write simple apps that share data in the cloud
  • No harder than writing a BASIC program or an Excel script
  • Just declare your cloud table or cloud indexes, and off you go.

• Success will mean: things work without pain, and users won’t even appreciate that there is a research problem behind it

Page 69: Concurrent Revisions And Cloud Types

What is TouchDevelop?

• A simple programming language

• An integrated development environment (IDE)

• Optimized for devices (small screens, touch input): no PC required, no keyboard required

• Runs on almost everything (iPad, iPhone, Android, PC, Windows Phone …)

• You can share your scripts online (public domain) or convert them to Windows 8 apps and sell them in the store

Page 70: Concurrent Revisions And Cloud Types

News about TouchDevelop

• Conversations by Nokia: "Create your own Lumia apps with TouchDevelop"
• Nikkei Computer: "Developing smartphone apps on a smartphone" (in Japanese)
• atmarkIT: "Microsoft's development environment 'TouchDevelop'" (in Japanese)
• PC-Welt: "Microsoft launches web-based app creator for Windows 8" (in German)
• neowin.net: "Microsoft launches web-based Windows 8 app creator"
• NYTimes: "Fostering Tech Talent in Schools"
• c't magazine: "Apps for Windows Phone" (in German)
• TechRepublic: "Fun with TouchDevelop, an IDE for Windows Phone 7"
• Social Times: "Microsoft Research TouchDevelop for Windows Phone: The First Social Cloud Programming Environment"
• c't magazine: "[TouchDevelop]" (in German)
• Social Times: "Microsoft Research [TouchDevelop] Makes Windows Phone a Game Changing Platform: Prepare to be Amazed"
• Geek Wire: "Microsoft ‘[TouchDevelop]’ uses phone to program phone"