Sharding: patterns and antipatterns Konstantin Osipov (Mail.Ru, Tarantool) Alexey Rybak (Badoo)

"Sharding - patterns & antipatterns". Talk by Alexey Rybak (Badoo) and Konstantin Osipov (Mail.ru)

Sharding: patterns and antipatterns

Konstantin Osipov (Mail.Ru, Tarantool)

Alexey Rybak (Badoo)

Big picture: scalable databases

● replication

● sharding and re-sharding

● distributed queries & jobs, Map/Reduce

● DDL

● will focus on sharding/re-sharding only

Contents

I. sharding function

II. routing

III. re-sharding

I. Sharding function

Selecting a good shard key

● the identified object should be small

● some data you won’t be able to shard (and have to duplicate in each shard)

● don’t store the key if you don’t have to

Good and bad shard keys

● good: user session, shopping order

● maybe: user (if user data isn’t too thick)

● bad: inventory item, order date

Garage sharding: numbers

● replication-based doubling (2, 4, 8, out of cash)

● the magic number 48 (divisible by 2, 3 and 4)

Garage sharding thru hashing

● good: remainders

o f(key) ≡ key % n_srv

o f(key) ≡ crc32(key) % n_srv

● bad: first login letter
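
The two remainder functions above can be sketched in a few lines. This is a minimal illustration, assuming integer ids for the first form and string logins for the second; the function names are hypothetical, and `zlib.crc32` stands in for whichever CRC32 implementation the application uses.

```python
import zlib


def shard_by_remainder(key: int, n_srv: int) -> int:
    """f(key) = key % n_srv, for integer keys that are already evenly spread."""
    return key % n_srv


def shard_by_crc32(key: str, n_srv: int) -> int:
    """f(key) = crc32(key) % n_srv, for arbitrary string keys such as logins."""
    return zlib.crc32(key.encode("utf-8")) % n_srv


if __name__ == "__main__":
    for login in ("alice", "bob", "carol"):
        print(login, "->", shard_by_crc32(login, 4))
```

Hashing the whole key spreads logins evenly, whereas the first letter of a login follows the very uneven letter frequencies of real names.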

Sharding for grown-ups

● table function

● consistent hashing

Table functions

● virtual buckets: key -> bucket -> shard

o “key -> bucket” function, “bucket -> shard” table

o “key -> bucket” table, “bucket -> shard” table
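
The first variant (a "key -> bucket" function plus a "bucket -> shard" table) might look like the sketch below. The bucket count of 1024, the shard names and the helper names are assumptions made for the example, not values from the talk.

```python
import zlib

N_BUCKETS = 1024  # assumed; chosen once and never changed afterwards


def key_to_bucket(key: str) -> int:
    """The "key -> bucket" function: stable because N_BUCKETS is fixed."""
    return zlib.crc32(key.encode("utf-8")) % N_BUCKETS


# The "bucket -> shard" table: the only thing that changes when servers are added.
bucket_to_shard = {b: f"shard-{b % 4}" for b in range(N_BUCKETS)}


def locate(key: str) -> str:
    """Resolve key -> bucket -> shard."""
    return bucket_to_shard[key_to_bucket(key)]


def move_bucket(bucket: int, new_shard: str) -> None:
    """Reassign one bucket; keys in all other buckets are unaffected."""
    bucket_to_shard[bucket] = new_shard
```

Moving load to a new server then means copying whole buckets and updating table rows, rather than rehashing every key.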

Consistent hashing

● Danny Lewin RIP

● Kinda ring and like... uhm... points, you know ...

● Libraries: Ketama
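
The "ring with points" idea can be sketched as follows. This is a simplified illustration, not Ketama's actual algorithm: the class name, the use of MD5 and the default of 100 points per server are all assumptions.

```python
import bisect
import hashlib


class HashRing:
    """Each server owns many points on a ring; a key belongs to the next point clockwise."""

    def __init__(self, servers, points_per_server=100):
        self._ring = []  # sorted list of (point, server)
        for server in servers:
            self.add(server, points_per_server)

    @staticmethod
    def _hash(value: str) -> int:
        digest = hashlib.md5(value.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def add(self, server: str, points: int = 100) -> None:
        for i in range(points):
            bisect.insort(self._ring, (self._hash(f"{server}#{i}"), server))

    def remove(self, server: str) -> None:
        self._ring = [(p, s) for p, s in self._ring if s != server]

    def lookup(self, key: str) -> str:
        i = bisect.bisect_left(self._ring, (self._hash(key), ""))
        if i == len(self._ring):  # wrap around past the last point
            i = 0
        return self._ring[i][1]
```

When a server is added, the only keys that change owner are the ones that move to the new server; the rest stay where they were.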

Guava/Sumbur

● f(key, n_servers) => server_id

● strictly uniform key-to-server mapping

● recurrence formula (15 lines of code)
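
A function of the shape f(key, n_servers) => server_id can be illustrated with the jump consistent hash published by Lamping and Veach. It is swapped in here only as a well-known example of a short recurrence of this kind; it is not the Guava or Sumbur source code, and its distribution is close to uniform rather than provably exact.

```python
def jump_hash(key: int, n_servers: int) -> int:
    """Jump consistent hash: map a 64-bit integer key to a server id in [0, n_servers)."""
    if n_servers < 1:
        raise ValueError("n_servers must be positive")
    b, j = -1, 0
    while j < n_servers:
        b = j
        # 64-bit linear congruential step, used as a cheap pseudo-random generator
        key = (key * 2862933555777941757 + 1) & 0xFFFFFFFFFFFFFFFF
        j = int((b + 1) * (float(1 << 31) / float((key >> 33) + 1)))
    return b
```

Growing the cluster from n to n + 1 servers either leaves a key where it was or sends it to the new server n, so no data shuffles between old servers.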

II. Routing

Routing types

● smart client

● coordinator

● proxy

● local proxy on every app server

● intra-database routing

Smart Client

● no extra hops

● all clients (PHP/Python/C...) should implement it

● resharding is hard
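
A smart client might look roughly like the sketch below, where plain dictionaries stand in for real shard connections. The class and method names are hypothetical, and the routing function is the simple crc32 remainder from earlier in the deck.

```python
import zlib


class SmartClient:
    """The client computes the shard itself and talks to it directly: no extra hop."""

    def __init__(self, shards):
        self._shards = shards  # list of dict-like stores, one per shard (stand-ins)

    def _shard_for(self, key: str):
        index = zlib.crc32(key.encode("utf-8")) % len(self._shards)
        return self._shards[index]

    def put(self, key: str, value) -> None:
        self._shard_for(key)[key] = value

    def get(self, key: str):
        return self._shard_for(key).get(key)
```

The routing logic lives inside the client library, so every language binding has to reimplement it identically, and changing the shard list means updating every deployed client.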

Proxy

● encapsulates routing logic

● extra hop, traffic

● +1 service

● SPOF

=> local proxy

Coordinator

● centralized knowledge

● SPOF

Intra-database routing

● too many nodes

● redundancy is high

● ad-hoc requests

III. Re-sharding

Re-sharding is a pain

● redistribution impacts:

o clients

o network performance

o consistency

=> maintenance time window

● forget about it on petabyte scale
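
To get a feel for how much data redistribution involves, the sketch below counts how many keys change shard when a `key % n_srv` scheme grows. The helper name and the key range are assumptions for the example.

```python
def moved_fraction(keys, old_n: int, new_n: int) -> float:
    """Fraction of integer keys whose shard changes when key % n goes from old_n to new_n."""
    keys = list(keys)
    moved = sum(1 for k in keys if k % old_n != k % new_n)
    return moved / len(keys)


if __name__ == "__main__":
    print(moved_fraction(range(20000), 4, 5))  # 0.8
    print(moved_fraction(range(20000), 4, 8))  # 0.5
```

Adding a single server to four moves 80% of the keys, while doubling moves half of them, which is one reason doubling is the usual growth step for remainder-based schemes.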

Best practice: no data redistribution

● update is a move

● data expiration (new data on new servers)

● new data on selected servers
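
One way to read "new data on selected servers" is a key -> shard directory that places new keys only on shards currently open for writes and never moves existing keys. The sketch below assumes that reading; the class name, the round-robin placement and the in-memory table are all hypothetical.

```python
import itertools


class Directory:
    """A "key -> shard" table: a key is placed once and then stays put."""

    def __init__(self, open_shards):
        self._placement = {}  # key -> shard, never rewritten
        self.set_open_shards(open_shards)

    def set_open_shards(self, shards) -> None:
        """Choose which shards accept new keys from now on."""
        self._next = itertools.cycle(list(shards))

    def locate(self, key: str) -> str:
        if key not in self._placement:
            self._placement[key] = next(self._next)  # new data goes to open shards only
        return self._placement[key]
```

When new servers arrive, the operator opens them and closes the full ones; existing data stays in place and the old shards simply stop growing.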

DDL

● upgrade your app

● upgrade your database

● update your app and remove any trace of old schema
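
One common reading of these three steps is that the application first learns to handle both schemas, then the data is converted, and only then is the compatibility code removed. The sketch below assumes that reading and a hypothetical change that splits a `name` column into `first_name` and `last_name`; none of these names come from the talk.

```python
def read_display_name(row: dict) -> str:
    """Step 1: the upgraded app accepts both the old and the new row shape."""
    if "first_name" in row and "last_name" in row:  # new schema
        return f"{row['first_name']} {row['last_name']}"
    return row["name"]  # old schema


def migrate_row(row: dict) -> dict:
    """Step 2: convert one row to the new schema (run shard by shard)."""
    first, _, last = row["name"].partition(" ")
    return {"id": row["id"], "first_name": first, "last_name": last}
```

Step 3 then deletes the old-schema branch of `read_display_name` once every shard has been migrated.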