Just-Right Consistency: Closing the CAP Gap

Christopher S. Meiklejohn (@cmeik), Peter Lash

LIGHT ONE

Outline: Closing the CAP Gap

• Just-Right Consistency: as available as possible, and consistent when necessary.

• AntidoteDB: the first database that provides transactions with strong semantics, targeted at the JRC approach.

• Moving forward: Antidote’s path forward from research to company and product.

Motivation: Cloud Databases

[Figure: blank world map with replicas A, B, and C. Photo: http://vignette3.wikia.nocookie.net/the-titans-rp-and-information/images/f/f5/Blank-World-map2.gif/revision/latest/scale-to-width-down/1280?cb=20141016203452]

• Centralized database (A only): clients read and write against the primary copy.

• Geo-replicated (A, B, and C) for both fault tolerance and high availability: clients read and write locally for low latency.

What happens if C can’t communicate with the other replicas?

Choice 1: Consistent-Under-Partition (CP)

• Synchronize each operation: maintains a “single system image”.

• Spanner/F1, serializability model: coordination is expensive; Spanner typically has to wait 100ms to commit an update transaction.

Over-conservative, but easy to program!

Choice 2: Available-Under-Partition (AP)

• Riak, Cassandra, Dynamo: operations are issued against the local copy, and across the cluster in parallel.

• Local operation only, asynchronous propagation: stale reads and write conflicts will occur without synchronization.

Available, but difficult to program!

CAP Theorem

• CP: synchronization, high cost, low availability.

• AP: anomalies, low cost, high availability.

False dichotomy!

• No “one-size-fits-all” consistency model: choosing either model will be either over-conservative or risk anomalies.

• Application-level invariants: instead, tailor consistency choices to the application-level invariants of each operation.

Just-Right Consistency

• Preserve sequential patterns: applications that are correct when written sequentially should remain correct under concurrency.

• AP-compatible invariants: strongest AP model; invariants that require only “one-way” communication.

• CAP-sensitive invariants: transactions that require coordination; “two-way” communication invariants.

• Tools for analysis and verification: identify and verify that the application has sufficient synchronization to ensure its invariants.

Example: Fælles Medicinkort

• FMK [production] / FMKe [synthetic workload]: Danish National Joint Medicine Card; operating 24x7 since 2013 for 6 million Danish citizens.

• Lifecycle management for prescriptions: involves patient, pharmacy, and doctor management around active prescriptions in Denmark.

• Assumed correct in isolation: “Correct-Individually”, the C in ACID; each operation ensures application-level invariants.

FMKe operations:

• create-prescription: create a prescription for a patient, doctor, and pharmacy.

• update-prescription-medication: add medication to a prescription, or increase it.

• process-prescription: a pharmacy delivers a medication.

• get-*-prescriptions: query functions that return information about prescriptions.

FMKe Invariants

• Relative order [referential integrity]: create a prescription, then reference it from a patient.

• Joint update [atomicity]: create a prescription, then update the doctor, patient, and pharmacy.

• Precondition check [if, then]: medication should not be over-delivered.

Invariants: AP-compatible

• No synchronization: updates occur locally without blocking, with no synchronization in the critical path.

• Asynchronous operation: updates are fast, available, and exploit concurrency.

• Compatible invariants: relative order and joint update invariants can be preserved.

AP-compatible Data Model

[Figure: replicas RA and RB both apply set(1) and hold 1. One replica then applies set(2) while the other concurrently applies set(3). After the updates are exchanged, one replica holds 2 and the other holds 3.]

Concurrent assignments don’t commute! Assignment requires CP.

Can we find a suitable data model for AP systems? Can we make non-commutative updates commutative?

How do we deterministically pick a value to keep?

• Do we use a timestamp (like Cassandra) and drop a value? Timestamps make concurrent operations commute but fail to capture intent.

• Can we be smarter about the merge function?

[Figure: after the concurrent set(2) and set(3), both replicas apply max(2,3) and converge to 3.]

Deterministic conflict resolution function. CRDTs generalize this framework.
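The difference between plain assignment and a deterministic merge function can be sketched in a few lines of Python. This is only an illustration of the idea on the slides: `MaxRegister` and `assign_in_arrival_order` are hypothetical names, not AntidoteDB types, and the two replicas are modeled as simple sequences of updates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MaxRegister:
    """Register whose merge keeps the larger value (illustrative only)."""
    value: int

    def merge(self, other: "MaxRegister") -> "MaxRegister":
        # max is commutative, associative, and idempotent,
        # so the merge result does not depend on delivery order.
        return MaxRegister(max(self.value, other.value))


def assign_in_arrival_order(updates):
    """Plain assignment: whichever update arrives last overwrites the rest."""
    state = None
    for value in updates:
        state = value
    return state


# One replica sees set(2) then set(3); the other sees set(3) then set(2).
plain_a = assign_in_arrival_order([1, 2, 3])
plain_b = assign_in_arrival_order([1, 3, 2])

merged_a = MaxRegister(2).merge(MaxRegister(3))
merged_b = MaxRegister(3).merge(MaxRegister(2))

print("plain assignment:", plain_a, plain_b)
print("max merge:", merged_a.value, merged_b.value)
```

With plain assignment the final value depends on arrival order, while the `max` merge yields the same value on both sides. Whether `max` captures the application's intent is a separate question, which is the motivation for the richer CRDT designs.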

Conflict-Free Replicated Data Types

• Replicated abstract data types: extension of a sequential data type that encapsulates a deterministic merge function.

• Many existing designs: sets, counters, registers, flags, maps.

AP-compatible Relative Order

[Figure: replica RA applies true(Q) and then true(P); both updates propagate to replica RB.]

Maintain the program-order implication invariant. For instance, P => Q.

• Make Q true, then make P true. Program order implies an ordering relationship.

• The invariant holds elsewhere when the ordering is respected at other replicas.

• Out-of-order propagation violates the invariant: P is true, Q is NOT true!

Let’s look at a concrete example.

• Q: change the default administrator password.

• P: enable administrator login.

• Applied in order, replica A is secure and replica B is secure.

• Reordering allows the default password to be used to log in!

Causal Consistency

• Respect causality: ensure updates are delivered in causal order [Lamport 78].

• Strongest available model: always able to return some compatible version for an object.

• Referential integrity: causal consistency is sufficient for providing referential integrity in an AP database.

Causal consistency means relative order invariants are preserved transparently!
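One way to picture causal delivery is a replica that buffers a remote update until everything it depends on has been applied. The sketch below is an assumption-laden toy, not the protocol AntidoteDB uses: `CausalReplica`, the explicit dependency sets, and the key names are all hypothetical.

```python
class CausalReplica:
    """Toy replica that applies an update only after its dependencies."""

    def __init__(self):
        self.applied = set()   # ids of updates already applied
        self.state = {}        # key -> value
        self.pending = []      # updates still waiting for dependencies

    def receive(self, update_id, deps, key, value):
        self.pending.append((update_id, frozenset(deps), key, value))
        self._drain()

    def _drain(self):
        # Keep applying buffered updates until no more become ready.
        progress = True
        while progress:
            progress = False
            for update in list(self.pending):
                update_id, deps, key, value = update
                if deps <= self.applied:
                    self.state[key] = value
                    self.applied.add(update_id)
                    self.pending.remove(update)
                    progress = True


replica_b = CausalReplica()

# P (enable login) arrives first, but it causally depends on Q (password change).
replica_b.receive("P", {"Q"}, "admin_login_enabled", True)
login_visible_early = "admin_login_enabled" in replica_b.state

# Once Q arrives, both updates become visible, in causal order.
replica_b.receive("Q", set(), "admin_password", "changed")
print(login_visible_early, replica_b.state)
```

Because P stays buffered until Q has been applied, replica B never exposes a state where login is enabled with the default password, even though the network delivered the updates out of order.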

AP-compatible Joint Update

[Figure: replicas RA and RB, with a client C1 performing reads while the updates below propagate.]

• Create prescription: create Rx.

• Add reference in doctor record: update Dr(Rx).

• Add reference in patient record: update Pt(Rx).

• Add reference in pharmacy record: update Ph(Rx).

Updates are causally consistent, but the client can still read inconsistent state: for example, the client is missing the update to the pharmacy.

Can we ensure updates are All-or-Nothing?

• Group updates into an atomic transaction T1: create Rx, update Dr(Rx), update Pt(Rx), update Ph(Rx).

• Updates reflect the “All-Or-Nothing” property through snapshots.

• Transactions (T1, T2) are delivered in causal order. Therefore, snapshots are causally consistent.

AP-compatible transactions provide the “A” in ACID.

Transactional Causal Consistency: strongest model that is available (AP).
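The all-or-nothing property through snapshots can be illustrated with a store that installs a transaction's writes as a single new snapshot. This is a minimal sketch under stated assumptions: `SnapshotStore` and its methods are hypothetical, and the record keys `Rx`, `Dr`, `Pt`, and `Ph` simply mirror the labels in the figure.

```python
import copy


class SnapshotStore:
    """Toy store: readers see either none or all of a transaction's writes."""

    def __init__(self):
        self._committed = {}   # the latest committed snapshot

    def read_snapshot(self):
        # Readers get a private copy of a committed snapshot.
        return copy.deepcopy(self._committed)

    def commit(self, writes):
        # Build the next snapshot off to the side ...
        new_snapshot = copy.deepcopy(self._committed)
        for key, reference in writes:
            new_snapshot.setdefault(key, []).append(reference)
        # ... then make every write visible in one step.
        self._committed = new_snapshot


store = SnapshotStore()
before = store.read_snapshot()

# T1: create the prescription and reference it from doctor, patient, pharmacy.
store.commit([("Rx", "rx1"), ("Dr", "rx1"), ("Pt", "rx1"), ("Ph", "rx1")])
after = store.read_snapshot()

print(before)
print(after)
```

A reader holding `before` sees no trace of the prescription, and a reader holding `after` sees all four records, so the state where the pharmacy reference is missing is never observable.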

Invariants: CAP-sensitive

What about preventing over-delivery of prescriptions?

[Figure: three replicas RA, RB, and RC, each starting with two available medications. pp(n) delivers n medications; add(n) adds n medications.]

• Single delivery: replica A checks the precondition and delivers one medication, pp(1). Correct outcome, where one medication remains.

Is this safe with concurrent operations?

• Replica A checks the precondition and delivers one medication, pp(1), while replica C adds three medications to the prescription, add(3). Correct outcome with four remaining medications: the precondition is stable under concurrent addition.

Is this safe with concurrent deliveries?

• Replica A checks the precondition and delivers one medication, pp(1), while replica C concurrently checks the precondition and delivers two medications, pp(2). Incorrect outcome (-1), violating the non-negative invariant: the precondition is NOT stable under concurrent fulfillment.

• Forbid concurrency: prevent operations from proceeding without synchronization, to enforce the invariant.

• Allow concurrency and remove the invariant: allow the operation to proceed, knowing that the invariant may be violated under concurrent operations.
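The two concurrent scenarios above can be replayed with a small simulation. It is a sketch only: `PrescriptionReplica`, `exchange`, and the delta-shipping scheme are hypothetical simplifications, with each replica checking the precondition against its local copy and propagating its changes afterward.

```python
class PrescriptionReplica:
    """Toy replica tracking how many medications remain deliverable."""

    def __init__(self, remaining):
        self.remaining = remaining
        self.outbox = []   # local changes not yet sent to other replicas

    def add(self, amount):
        self.remaining += amount
        self.outbox.append(amount)

    def process(self, amount):
        # Precondition check against the local copy only.
        if self.remaining < amount:
            return False
        self.remaining -= amount
        self.outbox.append(-amount)
        return True

    def apply_remote(self, deltas):
        self.remaining += sum(deltas)


def exchange(x, y):
    """Asynchronous propagation: each replica applies the other's changes."""
    sent_x, sent_y = x.outbox, y.outbox
    x.outbox, y.outbox = [], []
    x.apply_remote(sent_y)
    y.apply_remote(sent_x)


# Scenario 1: pp(1) at A concurrent with add(3) at C.
ra, rc = PrescriptionReplica(2), PrescriptionReplica(2)
ra.process(1)
rc.add(3)
exchange(ra, rc)
after_add = (ra.remaining, rc.remaining)

# Scenario 2: pp(1) at A concurrent with pp(2) at C.
ra, rc = PrescriptionReplica(2), PrescriptionReplica(2)
ok_a = ra.process(1)
ok_c = rc.process(2)
exchange(ra, rc)
after_deliveries = (ra.remaining, rc.remaining)

print(after_add, after_deliveries)
```

Both deliveries in the second scenario pass their local precondition check, yet the merged state drops below zero. Avoiding that requires either synchronizing the deliveries or giving up the non-negative invariant.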

How do we know when it’s safe?

CISE Analysis

[Figure: replicas RA, RB, and RC; operations U and V, with preconditions Upre and Vpre, run concurrently, and the invariant I is checked afterward.]

Analyze possible pairs of concurrent operations to identify operations where the invariant can be violated.

• Individually correct: individual operations never violate the invariant.

• Convergence: concurrent effects commute.

• Precondition stability: preconditions are stable under every pair of concurrent operations.

If all three are satisfied, the invariant is guaranteed under concurrency.
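The precondition-stability condition can be approximated by brute force over a small state space. The sketch below is not the CISE tool or its proof rule: it is a hypothetical checker that models each operation as a precondition plus an effect on a single integer stock, then tests every ordered pair.

```python
def deliver(n):
    """Deliver n medications; requires at least n in stock."""
    return {
        "name": f"deliver({n})",
        "pre": lambda stock: stock >= n,
        "effect": lambda stock: stock - n,
    }


def add(n):
    """Add n medications; always allowed."""
    return {
        "name": f"add({n})",
        "pre": lambda stock: True,
        "effect": lambda stock: stock + n,
    }


def invariant(stock):
    return stock >= 0


def is_stable(u, v, states):
    """u's precondition must still hold after v's concurrent effect."""
    return all(
        u["pre"](v["effect"](s))
        for s in states
        if invariant(s) and u["pre"](s) and v["pre"](s)
    )


operations = [deliver(1), deliver(2), add(3)]
states = range(0, 10)   # small, bounded state space for the illustration

unstable = {
    (u["name"], v["name"])
    for u in operations
    for v in operations
    if not is_stable(u, v, states)
}
print(sorted(unstable))
```

Only pairs of deliveries are reported as unstable, which matches the scenarios on the preceding slides: additions can run without coordination, while concurrent deliveries need synchronization to protect the non-negative invariant.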

Database: AntidoteDB

• Open-source Erlang database: developed in Erlang, on top of the Riak Core distributed systems framework.

• Transactional Causal Consistency: only industrial-grade database providing both causal consistency and all-or-nothing transactions.

• Alpha release available: currently under development, but an alpha release of the product is available on GitHub.

[Figure: data centers A and B, nodes N1 and N2, and components labeled TxnMgr, Materializer, Log, and InterDC-Repl.]

Each data center contains multiple nodes, each operating a transaction manager, materializers, and a log.

Strong consistency inside the data center, with a causal consistency protocol running in the wide area.

Data Model

• Register: Last-Writer Wins, Multi-Value

• Set: Grow-Only, Add-Wins, Remove-Wins

• Map

• Counter: Unlimited, Restricted ≥ 0

• Graph: Directed, Monotonic DAG, Edit graph

• Sequence
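To make one entry in this list concrete, here is a minimal sketch of an add-wins set, in which a concurrent add and remove of the same element resolves in favor of the add. It is a textbook-style illustration with hypothetical names (`AddWinsSet`, unique tags per add), and it does not claim to reflect how AntidoteDB implements its set types.

```python
class AddWinsSet:
    """Toy add-wins set: every add gets a unique tag; removes hide seen tags."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.counter = 0
        self.adds = set()      # (element, tag) pairs
        self.removed = set()   # tags hidden by a remove

    def add(self, element):
        self.counter += 1
        self.adds.add((element, (self.replica_id, self.counter)))

    def remove(self, element):
        # Only tags observed locally are removed; unseen concurrent adds survive.
        self.removed |= {tag for (e, tag) in self.adds if e == element}

    def merge(self, other):
        self.adds |= other.adds
        self.removed |= other.removed

    def value(self):
        return {e for (e, tag) in self.adds if tag not in self.removed}


a, b = AddWinsSet("A"), AddWinsSet("B")
a.add("x")
b.merge(a)        # both replicas now contain "x"

a.remove("x")     # concurrent with ...
b.add("x")        # ... a fresh add of the same element

a.merge(b)
b.merge(a)
print(a.value(), b.value())
```

The remove at replica A only covers the tag it had observed, so the concurrent add at replica B survives the merge and both replicas converge to a set containing `"x"`. A remove-wins set would make the opposite choice for the same history.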

Object API

    %% Identify the object by its identifier: {Key, CRDT type, Bucket}.
    User1 = {michel, antidote_crdt_mvreg, user_bucket},
    %% Assign a value to the register; the commit time is returned.
    {ok, Time2} = antidote:update_objects(ignore, [], [{User1, assign,
        {["Michel", "[email protected]"], ClientIdentifier}}]),
    %% Read with Time2 as the minimum snapshot time. The returned clock is
    %% bound to a fresh variable so the match cannot fail on a bound Time2.
    {ok, Result, _Time3} = antidote:read_objects(Time2, [], [User1]).

• Identify an object by object identifier.

• Use the update API to assign a value to this register.

• Read the object, providing a minimum snapshot time.

Simple, operation-based API (think Redis, Riak CRDTs). Causal dependencies are automatically captured by execution order.

Transaction API

    {ok, TxId} = antidote:start_transaction(Timestamp, []),
    {ok, _} = antidote:read_objects([Set], TxId),
    ok = antidote:update_objects([{Set, add, "Java"}], TxId),
    {ok, _} = antidote:commit_transaction(TxId).

• Start a transaction with the transaction API, with a given snapshot time; this returns a transaction identifier.

• Read objects using the interactive transaction API.

• Update objects using the interactive transaction API.

• Once finished updating, commit the transaction.

Transactions read causally consistent snapshots and updates are applied atomically.

Scalability

[Figure: throughput in Kops/s (axis 100 to 800) by read(update) ratio: 99(1), 90(10), 75(25), 50(50). Configurations (DCs × Servers): 1 × 5, 1 × 10, 1 × 25, 2 × 25, 3 × 25. LWW registers, 100k keys/partition, power law distribution.]

Cure vs. SOA

[Figure: throughput in Kops/s (axis 0 to 1100) of Eiger, GR, Cure, and EC by read(update) ratio: 99(1), 90(10), 75(25), 50(50). 3 DCs × 25 Servers, LWW registers.]

Cure vs. EC

[Figure: throughput in Kops/s (axis 100 to 1200) by read(update) ratio: 99(1), 90(10), 75(25), 50(50). Series: Cure, 1KB; EC, 1KB; Cure, 10KB; EC, 10KB. 3 DCs × 25 Servers, CRDT sets.]

Future Features

• Intra-DC replication: at the moment, Antidote provides no replication within the datacenter and assumes only geo-replication.

• ACID transactions: for Antidote to provide all of JRC, it needs ACID transaction support; no research needed, only implementation.

Moving Forward

• Research prototype: originally a research prototype to build a database requiring reduced synchronization (SyncFree FP7), with Basho, Rovio, and Trifork.

• Research ahead: LightKone (H2020) will investigate moving AntidoteDB close to the edge to provide DDN services.

• Industrialization: obtaining seed funding to start a company to industrialize AntidoteDB.

Resources

• AntidoteDB: https://github.com/SyncFree/antidote

• Documentation for AntidoteDB: http://syncfree.github.io/antidote/

• Website: www.antidotedb.com

• Try out Antidote: docker pull antidotedb/antidote

Thanks!

More questions? Come visit us at the Evolution bar!