Core Team Riccardo Capecchi - Marco Careddu - Piermarco Zerbini Mar 2017 Devops Day 2017 Cloud DB - strengths and weaknesses

Idi2017 - Cloud DB: strengths and weaknesses

Shopfully - Who We Are

Founded in 2010, ShopFully is the leading platform used by over 25 million users worldwide when getting ready to go shopping in their neighborhood.

The platform contains a variety of information including details on promotions, new products, shops, opening times and contacts of the main retailers and brands in each shopping category, geolocated in one place and easily accessible to users.


Shopfully - What We Do

ShopFully is the last-mile media: the first source of geolocated information on promotions, new products, shops, opening times and contacts of the main retailers and brands in all shopping categories.

The services offered by ShopFully can be accessed online at www.shopfully.com (or country-specific URLs) and through the free app developed for all major mobile platforms: iOS, Android, Windows 8, Amazon and BlackBerry.

Shopfully - We’re going to talk about

1. Some details on our old database infrastructure.
2. How we chose our cloud DB.
3. The design of our application, mainly focused on the database.
4. How we moved to a cloud DB.
5. The new challenges and benefits of a cloud DB.
6. Conclusions: should you consider moving to a cloud DB?

Moving to the Cloud

What about the infrastructure?

WHY we moved to DB as a Service

Before Cloud Database
● 9 dedicated servers
● Galera multi-master cluster managed by Severalnines’ ClusterControl framework
● Shared database infrastructure

Problems:
● High load on all nodes during traffic spikes
● Very high load on the surviving nodes while recovering a broken node

Causes:
● Cluster capacity was near its limit

Possible solutions:
● Horizontal scaling: unsafe because of the high number of cluster nodes
● Vertical scaling by replacing all the dedicated servers: risky, because losing two nodes at the same time was unsafe

Before Cloud Database - Vision

WHERE do we move?

Let’s go to DB as a Service, but… what do we want?

Goals:
● Zero downtime
● The latest possible MySQL engine version
● Less effort for database management

Preferred providers:
● Google Cloud
● AWS

Google Cloud SQL vs Amazon Web Services RDS

Both of them could import a live database, but…

Google Cloud SQL: First Generation vs Second Generation

Second Generation features:
● Up to 7x the throughput and 20x the storage capacity of First Generation instances
● Less expensive than First Generation for most use cases
● Option to add high-availability failover and read replication
● MySQL 5.7

COOL! But… Second Generation does not support an external replication master.

AWS RDS: MySQL vs Aurora

Let’s move to Amazon RDS. But which database service?

Amazon Aurora (Aurora) is a fully managed, MySQL-compatible, relational database engine that combines the speed and reliability of high-end commercial databases with the simplicity and cost-effectiveness of open-source databases. It delivers up to five times the performance of MySQL without requiring changes to most of your existing applications.

Amazon Aurora makes it simple and cost-effective to set up, operate, and scale your new and existing MySQL deployments, thus freeing you to focus on your business and applications. We love Aurora! But...


HOW MUCH does it cost?

OVH vs AWS

The database infrastructure cost was due to:
● visible:
  ○ 9 dedicated servers
● invisible:
  ○ 3 virtual machines (load balancers)
  ○ 1 virtual machine (cluster monitor)
  ○ backup storage
  ○ sysadmin working time

First look at AWS pricing:
● visible: 10x our infrastructure spending for databases


Could we split the databases and buy smaller instances?

Query Analysis

Data collected from our MySQL Galera Cluster highlighted:
● the “load” of each database
● that, for each database, around half of the total queries would be routed to the writer endpoint of RDS Aurora, and the rest to the reader endpoint

We mapped each database to an AWS DB tier; in this way the total cost was reduced from 10x to around 4x.

[Figure labels: DB1, DB2, DB3, DB4, DB5, DB6]
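The idea of mapping each database to its own tier can be illustrated with a small Python sketch. Everything in it is hypothetical: the tier names, capacities, relative prices and per-database loads are invented for illustration and are not the figures from our analysis. It only shows why right-sizing each database tends to cost less than giving every database the largest instance.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Tier:
    name: str
    capacity: float  # hypothetical load units the tier can absorb
    price: float     # hypothetical relative monthly price


# Hypothetical tiers, ordered from smallest to largest.
TIERS = [
    Tier("small", 10, 1.0),
    Tier("medium", 25, 2.0),
    Tier("large", 60, 4.0),
]


def pick_tier(load: float) -> Tier:
    """Return the cheapest tier whose capacity covers the given load."""
    for tier in TIERS:
        if load <= tier.capacity:
            return tier
    raise ValueError(f"load {load} exceeds the largest tier")


def total_cost(loads: Dict[str, float]) -> float:
    """Sum the price of the tier chosen for each database."""
    return sum(pick_tier(load).price for load in loads.values())


# Hypothetical per-database loads (not measured values).
loads = {"DB1": 8, "DB2": 20, "DB3": 5, "DB4": 50, "DB5": 9, "DB6": 12}

split_cost = total_cost(loads)               # one right-sized tier per database
uniform_cost = len(loads) * TIERS[-1].price  # largest tier for every database
```

The sketch ignores read replicas and storage; a real estimate would price the writer and each reader separately.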

OVH vs AWS - Round 2

Considering the following:
● Our Galera cluster was near its limits, and we would have had to pay more for maintenance and new hardware.
● The Amazon solution offers:
  ○ a fully managed solution
  ○ data replication across Availability Zones
  ○ an easy way to add or remove read replicas
  ○ CloudWatch
  ○ possible automation
  ○ cost management by tag

Moving to the Cloud

What about the code?

Prerequisites for the cloud: Application design + challenges

Is your application ready for the cloud?
● Follow some simple rules to simplify the configuration of your app
  ○ The Twelve-Factor App is your friend (the 3rd factor, Config, in particular)
● Our application had become almost ‘twelve-factored’ in previous iterations, in anticipation of a possible cloud migration
  ○ the highly centralized configuration helped a lot during the migration
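The third factor (store config in the environment) can be sketched in Python as below. This is a minimal illustration, not our application code: the variable names `DB_WRITER_HOST`, `DB_READER_HOST`, `DB_PORT` and `DB_NAME` are assumptions made up for the example.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class DbConfig:
    writer_host: str
    reader_host: str
    port: int
    name: str


def load_db_config(env=os.environ) -> DbConfig:
    """Build the DB configuration from environment variables only.

    The variable names are hypothetical; the point is that no host name
    is hard-coded, so switching to new endpoints needs no code change.
    """
    return DbConfig(
        writer_host=env["DB_WRITER_HOST"],
        # Fall back to the writer when no separate reader endpoint is set.
        reader_host=env.get("DB_READER_HOST", env["DB_WRITER_HOST"]),
        port=int(env.get("DB_PORT", "3306")),
        name=env["DB_NAME"],
    )
```

With the configuration centralized like this, pointing the application at a different database becomes a deployment change rather than a code change.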

Is your application ready for a cloud DB?
● Keep a simple design
  ○ No DB triggers or stored procedures
    ■ In our case we were able to replace the former with async application jobs and to avoid the latter altogether
  ○ Rare use of MySQL-specific features
● The day you want to change DB vendor or upgrade to a new major release, you will thank yourself too

What we had
● Multi-master (Galera)
  ○ DB read/write split at the application level using the CakePHP ORM
    ■ a simple ‘sticky’ master after a write, to mitigate the deadlocks inherent in the multi-master model

What we needed
● Master-slave (Aurora)
  ○ a fix for the buggy DB read/write split
    ■ moving to master-slave, we discovered that the split was imperfect and ‘leaked’ write queries to the slaves
    ■ the bug had been hidden by the previous multi-master architecture
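A read/write split with a ‘sticky’ writer after a write can be sketched as follows. This is a Python illustration under stated assumptions, not our CakePHP implementation: the `QueryRouter` class, the 2-second sticky window and the prefix-based query classification are all hypothetical. To avoid leaking writes to a reader, anything that is not a plain `SELECT` is sent to the writer.

```python
import time


class QueryRouter:
    """Route queries to a 'writer' or 'reader' endpoint.

    After any write, reads stick to the writer for `sticky_seconds`
    (a hypothetical value), so a client does not read stale data
    from a replica right after changing it.
    """

    def __init__(self, sticky_seconds=2.0, clock=time.monotonic):
        self.sticky_seconds = sticky_seconds
        self._clock = clock
        self._last_write = None

    @staticmethod
    def _is_read(sql: str) -> bool:
        statement = sql.lstrip().lower()
        # Locking reads must go to the writer as well.
        return statement.startswith("select") and "for update" not in statement

    def route(self, sql: str) -> str:
        if not self._is_read(sql):
            # Default to the writer for everything that is not a plain read.
            self._last_write = self._clock()
            return "writer"
        recently_wrote = (
            self._last_write is not None
            and self._clock() - self._last_write < self.sticky_seconds
        )
        return "writer" if recently_wrote else "reader"
```

Defaulting unknown statements to the writer is the conservative choice: a misclassified read only costs some writer capacity, while a misclassified write fails on a read-only replica.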

Scale for the cloud
● Using properly sized clusters pushed our application to its limits
  ○ Lessons learned
    ■ OLD (but gold): don’t forget to periodically check how your DB indexes are used (or whether they are missing)
    ■ Use any kind of shielding you can
      ● CDNs, application caches, etc.
    ■ Async, async everywhere

Moving to the Cloud

How to do it with zero downtime?

How do we move?

From Galera Cluster to Aurora

GOAL: migrate one DB at a time

Problem: the binlog-do-db option is not supported by OUR Galera Cluster.

Galera Cluster: replication master for all databases.

MySQL: one slave replica for all databases.

MySQL as “Washing Machine”:
● Activates the binlog for a single DB
● Works as the external replication master for Aurora
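The “washing machine” setup can be sketched as two small Python helpers: one that renders a `my.cnf` fragment for the intermediate MySQL replica, and one that prepares the call that points an RDS/Aurora MySQL instance at an external master. This is an illustration under assumptions, not our actual configuration: the function names, server id, host and credentials are hypothetical, and the option names are those of MySQL 5.x. `mysql.rds_set_external_master` is the stored procedure Amazon RDS provides for this purpose.

```python
def washing_machine_cnf(db_name: str, server_id: int) -> str:
    """Return a my.cnf fragment for the intermediate MySQL replica."""
    lines = [
        "[mysqld]",
        f"server-id = {server_id}",
        "log-bin = mysql-bin",        # enable the binary log
        "log-slave-updates = 1",      # also log changes received via replication
        f"binlog-do-db = {db_name}",  # only log changes for this one database
    ]
    return "\n".join(lines) + "\n"


def external_master_call(host, port, user, password, log_file, log_pos):
    """Return (sql, params) for a DB-API cursor.execute() on the Aurora side.

    The last argument is ssl_encryption; 0 means unencrypted replication.
    """
    sql = "CALL mysql.rds_set_external_master(%s, %s, %s, %s, %s, %s, %s)"
    return sql, (host, port, user, password, log_file, log_pos, 0)
```

After setting the external master, replication on the RDS side is started with `CALL mysql.rds_start_replication;`. Note that `binlog-do-db` filtering depends on the binary log format and on the default database of each statement, so it is worth checking the MySQL documentation before relying on it.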

[Diagram sequence: From Galera Cluster to Aurora. Labels: Webservers, Load Balancer, Writer endpoint, Reader endpoint]

New Challenges and Benefits

Great, our DB is now in the cloud, but how does this change our lives?

Performance, price and availability are now more interconnected than ever: we want responsive, quick services that make the best use of the DB instances to reduce AWS costs.

New Challenges and Benefits - Autoscale the DB

AWS services are famous for their auto-scaling capability… but not on RDS. Still, “something” that turns instances on and off could be really useful to us, because most of our traffic is predictable.

As a first try, we used the AWS CLI with some simple scheduled tasks on Jenkins to programmatically turn the DB instances for the different countries on and off at their wake-up and sleep times.

This was a good change, but sometimes the load was higher or lower than expected, so we wrote some small bash utilities that periodically (every 2 minutes) check the CPU usage of our replica instances; if the average is over or under a threshold, they scale that cluster up or down.

This is much better, as the number of instances changes dynamically based on the load, but…
● Adding an instance takes up to 10 minutes.
● Removing an instance causes failed connections for our users.
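The threshold check behind the scaling utilities can be sketched as a pure decision function. This is a Python illustration rather than our bash code, and the thresholds (30% and 70%) and replica limits are hypothetical values chosen for the example. Fetching the CPU metrics and actually adding or removing a replica (for example through the AWS CLI) are left out.

```python
from statistics import mean


def scaling_action(cpu_samples, replicas, low=30.0, high=70.0,
                   min_replicas=1, max_replicas=5):
    """Decide whether to add or remove a read replica.

    cpu_samples: recent CPU percentages of the replica instances.
    replicas:    current number of read replicas in the cluster.
    Returns "scale_up", "scale_down" or "none".
    """
    if not cpu_samples:
        return "none"  # no data, take no action
    average = mean(cpu_samples)
    if average > high and replicas < max_replicas:
        return "scale_up"
    if average < low and replicas > min_replicas:
        return "scale_down"
    return "none"
```

Given that adding an instance can take up to 10 minutes, a real scheduler would also need to avoid issuing a new action while a previous one is still in progress.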

Some benefits of using a DB on AWS that we have found:

- Easily create new instances and replicas of them; these tasks can now also be done by team members who are less skilled on the DB.
- Easily manage snapshots and restore an instance from them (this is also a good way to scrap your staging/dev environment and start fresh every day or week).
- No more worrying about DR: with Aurora the data is automatically replicated across Availability Zones, and optionally you can also have a replica in a different region.
- Easily change the class of your DB servers if you have overestimated or underestimated your load.
- CloudWatch adds value with a good range of metrics ready to use out of the box.
- Support Center is your friend: we experienced positive and proactive interaction with them when a development cluster crashed while we were experimenting with advanced new features (Aurora’s spatial indexes implementation).

So is it all wonderful when you are in the cloud? Not exactly.

These are a few drawbacks of using a DB on AWS for our use cases:

- AWS is “fully managed”, but you have to understand and set up VPCs, subnets and security groups; this costs time or requires a consultant.
- RDS is “fully managed”, but you have to understand how parameter groups work and the slight differences in the Aurora engine.
- RDS is great if you have a simple infrastructure, but it gets harder if you want to achieve a zero-downtime service.
- If you don’t plan wisely (e.g. with reserved instances), the costs can easily grow.

Moving the DB to the Cloud

Is this enough?

Conclusions

Moving the database first may not be your best option: latency can be an application killer, and in general it’s best to start with an application that is entirely in the cloud. Even better if it is new and you can plan it from scratch to work in the cloud with all the services you can find there.

That said, if you have to plan a big change or upgrade for your DB infrastructure, a cloud provider can be a great option, as it gives you a way to start quickly and change capacity over time without too much hassle.

We have now moved our API to AWS as well, and this has increased performance and lowered response times… but this is maybe good for another talk…

THANK YOU FOR YOUR ATTENTION

We are hiring: https://corporate.doveconviene.it/lavora-con-noi/

Contacts

● Riccardo Capecchi
  ○ https://about.me/riccardocapecchi
● Marco Careddu
  ○ [email protected]
  ○ https://www.linkedin.com/in/marco-careddu-33707632
  ○ https://twitter.com/sgradix
● Piermarco Zerbini
  ○ [email protected]
  ○ https://it.linkedin.com/in/piermarcozerbini
  ○ https://github.com/Snafrutz