Scaling Tokopedia Past, Present, Future

Page 1: Scaling tokopedia-past-present-future

Scaling Tokopedia: Past, Present, Future

Page 2: Scaling tokopedia-past-present-future

Once Upon a Time

In Jakarta, Jan 2009

Page 3: Scaling tokopedia-past-present-future
Page 4: Scaling tokopedia-past-present-future
Page 5: Scaling tokopedia-past-present-future

1 Product Guy and 1 Half Engineer as co-founders

Page 6: Scaling tokopedia-past-present-future

Never had experience managing a high-traffic website

No business background AT ALL

Page 7: Scaling tokopedia-past-present-future

Perl as back end

Built our own Perl framework

Apache mod_perl

Oracle Express Edition

Page 8: Scaling tokopedia-past-present-future
Page 9: Scaling tokopedia-past-present-future
Page 10: Scaling tokopedia-past-present-future
Page 11: Scaling tokopedia-past-present-future

Hm… looks like we need a better front end designer

Page 12: Scaling tokopedia-past-present-future
Page 13: Scaling tokopedia-past-present-future
Page 14: Scaling tokopedia-past-present-future
Page 15: Scaling tokopedia-past-present-future
Page 16: Scaling tokopedia-past-present-future

AWStats and a little bit of Google Analytics

Page 17: Scaling tokopedia-past-present-future

Network Topology

CBN -> apache server, oracle server

Apps Topology

Internet <-> apache server (http req / http resp) <-> oracle server (sql)

Page 18: Scaling tokopedia-past-present-future

2 co-founders

1 real engineer

1 customer care

Page 19: Scaling tokopedia-past-present-future
Page 20: Scaling tokopedia-past-present-future
Page 21: Scaling tokopedia-past-present-future

Hooray, WE LAUNCHED!!

Page 22: Scaling tokopedia-past-present-future

IDR 33 million of GMV in the first month

Page 23: Scaling tokopedia-past-present-future

WE ARE SLOW!!!

Page 24: Scaling tokopedia-past-present-future

WHY??

* We didn’t have dedicated storage
* Uploaded pictures were stored on the same machine
* Web pages & static content were served by a single Apache
* We didn’t use a CDN
* We didn’t even know what a CDN was

Page 25: Scaling tokopedia-past-present-future

Network Topology

CBN -> apache app server, apache static server, oracle server

Page 26: Scaling tokopedia-past-present-future

Apps Topology

* Access web page: Internet <-> apache app server (http req / http resp) <-> oracle server (sql)
* Upload pictures: Internet <-> apache upload / static server (http upload / http resp) <-> oracle server (sql)
* Read static pictures, css + js: Internet <-> apache upload / static server (http req / http resp)

Page 27: Scaling tokopedia-past-present-future

We are back in business

Page 28: Scaling tokopedia-past-present-future

BUT WE ARE SLOW AGAIN!!!

Page 29: Scaling tokopedia-past-present-future

WHY??

* Oracle Express Edition reached its limit
* No partitioning
* No replication
* Poor indexing
* Reads, writes, and queries all hit the same master DB

Page 30: Scaling tokopedia-past-present-future

SO WE MIGRATED TO PostgreSQL

Page 31: Scaling tokopedia-past-present-future

Network Topology

CBN -> apache app server, apache static server, PostgreSQL Master, PostgreSQL Slave

Page 32: Scaling tokopedia-past-present-future

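Apps Topology

* Internet <-> apache app server (http req / http resp)
* apache app server -> PostgreSQL Master (sql insert, sql update, sql delete)
* apache app server -> PostgreSQL Slave (sql query)
* PostgreSQL Master -> PostgreSQL Slave (WAL streaming replication)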
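The read/write split in this topology can be sketched in a few lines. This is only an illustration, not the code Tokopedia ran (which was Perl): the class name `SplitRouter` and the connection labels `pg-master` and `pg-slave` are hypothetical, and the rule simply looks at the first keyword of each statement.

```python
# Hypothetical sketch: send writes to the master and everything else to the slave.
# "pg-master" and "pg-slave" are placeholder labels, not real connection objects.
WRITE_VERBS = {"insert", "update", "delete"}


class SplitRouter:
    def __init__(self, master, replica):
        self.master = master
        self.replica = replica

    def pick(self, sql):
        """Return the target for one SQL statement based on its first keyword."""
        words = sql.lstrip().split(None, 1)
        verb = words[0].lower() if words else ""
        return self.master if verb in WRITE_VERBS else self.replica


router = SplitRouter(master="pg-master", replica="pg-slave")
print(router.pick("INSERT INTO product (name) VALUES ('shoes')"))
print(router.pick("SELECT * FROM product WHERE shop_id = 1"))
```

One thing to keep in mind with this kind of split: with asynchronous streaming replication the slave can lag slightly behind the master, so a read issued right after a write may not see it yet.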

Page 33: Scaling tokopedia-past-present-future

We did it again!!!!

Page 34: Scaling tokopedia-past-present-future

DAMN SEARCH IS SLOW!!!

Page 35: Scaling tokopedia-past-present-future

WHY??

* We get a lot of new products every second
* We have to show search results in real time
* But the sort order keeps changing every second
* The load on PostgreSQL is just too much!!!

Page 36: Scaling tokopedia-past-present-future

And Many More……..

Page 37: Scaling tokopedia-past-present-future

SEARCH IS EASY !!!!

Page 38: Scaling tokopedia-past-present-future

Come on, Man… SLOW AGAIN??

Page 39: Scaling tokopedia-past-present-future

WHY??

* We were using Apache + mod_perl
* Apache consumed a lot of resources
* Our code had a lot of memory leaks

Page 40: Scaling tokopedia-past-present-future

SOLUTION

* We found out that NginX is very light and fast
* We use nginx as the load balancer
* Replaced Apache mod_perl with nginx-perl
* We have 1 nginx load balancer in front of several nginx-perl servers
* For the load balancing method, we mix round robin and clustering
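The slides do not define what "clustering" means here. One possible reading, sketched below purely as an assumption, is that backends are grouped into clusters by kind of request, with round robin applied inside each cluster. The cluster names, the path-prefix rule, and the `nginx-perl-N` labels are all hypothetical.

```python
from itertools import cycle


class ClusteredRoundRobin:
    """Pick a backend: choose a cluster by first path segment, then rotate within it."""

    def __init__(self, clusters, default):
        # One independent round-robin iterator per cluster.
        self._cycles = {name: cycle(backends) for name, backends in clusters.items()}
        self._default = default

    def pick(self, path):
        first_segment = path.strip("/").split("/", 1)[0]
        name = first_segment if first_segment in self._cycles else self._default
        return next(self._cycles[name])


lb = ClusteredRoundRobin(
    clusters={
        "web": ["nginx-perl-1", "nginx-perl-2"],
        "search": ["nginx-perl-3", "nginx-perl-4"],
    },
    default="web",
)
print(lb.pick("/search/shoes"))
print(lb.pick("/ebenhaezer"))
```

Grouping backends this way keeps a burst of one request type from starving the servers that handle the others, while round robin keeps the load even inside each group.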

Page 41: Scaling tokopedia-past-present-future

siege -c100 -t5s -i -b -q 'http://www.tokopedia.com/ebenhaezer'
siege: invalid option -- 'q'
siege: invalid option -- 'q'
** SIEGE 2.72
** Preparing 100 concurrent users for battle.
The server is now under siege...
Lifting the server siege... done.

Transactions: 14788 hits
Availability: 100.00 %
Elapsed time: 4.59 secs
Data transferred: 63.50 MB
Response time: 0.03 secs
Transaction rate: 3221.79 trans/sec
Throughput: 13.83 MB/sec
Concurrency: 87.52
Successful transactions: 7481
Failed transactions: 0
Longest transaction: 0.43
Shortest transaction: 0.00
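The two rate figures in this report follow directly from the totals: transaction rate is hits divided by elapsed time, and throughput is data transferred divided by elapsed time. A small check using only the numbers printed above (the variable names are ours):

```python
# Figures copied from the siege report above.
transactions = 14788    # hits
elapsed_secs = 4.59     # secs
data_mb = 63.50         # MB

transaction_rate = transactions / elapsed_secs
throughput = data_mb / elapsed_secs

print(f"Transaction rate: {transaction_rate:.2f} trans/sec")  # 3221.79
print(f"Throughput: {throughput:.2f} MB/sec")                 # 13.83
```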

Page 42: Scaling tokopedia-past-present-future

Apps Topology

* Internet <-> NginX Load Balancer (http req / http resp)
* NginX Load Balancer -> nginx-perl #1, nginx-perl #2, nginx-perl #3, ... nginx-perl #n (proxy_pass)
* PostgreSQL Master (sql insert, sql update, sql delete)
* PostgreSQL Slave (sql query)
* PostgreSQL Master -> PostgreSQL Slave (WAL streaming replication)
* SOLR (import, SOLR query)

Page 43: Scaling tokopedia-past-present-future

Now what….Storage??

Page 44: Scaling tokopedia-past-present-future

WHY??

* Hardware limitations
* We used SATA HDDs, not SSDs
* Disk utilization at 100%
* No backup, no failover
* Capacity is critical
* Users keep uploading pictures

Page 45: Scaling tokopedia-past-present-future

User

Page 46: Scaling tokopedia-past-present-future

We also use CDN

Page 47: Scaling tokopedia-past-present-future

AFTER ALL, WE ARE STILL SLOW!!!

Page 48: Scaling tokopedia-past-present-future

SOLUTION

* Internet <-> NginX Load Balancer (http req / http resp)
* NginX Load Balancer -> nginx-perl #1, nginx-perl #2, nginx-perl #3, ... nginx-perl #n (proxy_pass)
* PostgreSQL Master -> PostgreSQL Slave (replication)
* MongoDB primary -> MongoDB secondary (replication)
* SOLR
* Redis (query & update)
* 3rd-party APIs such as Logistics, Banks, Payment Gateways, etc. (over the Internet)

Page 49: Scaling tokopedia-past-present-future

Lessons Learned??

We started to learn about:

* NginX
* NoSQL
* In-memory storage
* GlusterFS storage
* Scale out (not scale up)
* and many more…

Page 50: Scaling tokopedia-past-present-future

Thanks to our Awesome Engineers

and many more…

Page 51: Scaling tokopedia-past-present-future
Page 52: Scaling tokopedia-past-present-future

We are back in business

Page 53: Scaling tokopedia-past-present-future

BUT …………..

Page 54: Scaling tokopedia-past-present-future

For the first time in our life we were doomed!!!

Page 55: Scaling tokopedia-past-present-future

WHY??

* One of our GlusterFS servers broke. Image reads/writes were super slow.
* We were using a version of PostgreSQL that had some indexing bugs.

Page 56: Scaling tokopedia-past-present-future

More Awesome Engineers

Mixed with International Team

Page 57: Scaling tokopedia-past-present-future

Current State

Page 58: Scaling tokopedia-past-present-future

New VP of Engineering

Page 59: Scaling tokopedia-past-present-future
Page 60: Scaling tokopedia-past-present-future

FUTURE

Page 61: Scaling tokopedia-past-present-future
Page 62: Scaling tokopedia-past-present-future

* Mobile-first company
* Zero downtime
* Move fully to the cloud
* Re-architect to SOA
* Open API to the public
* Deploy new tech, such as replacing Perl with Go
* Advanced alerting & monitoring

Page 63: Scaling tokopedia-past-present-future

* Redundancy and failover
* Multiple 3rd parties
* Data warehouse such as Cubes, Pentaho, etc.
* Machine learning, business intelligence
* Build things that can be shared with others
* Really pay attention to security
* and many more…

Page 64: Scaling tokopedia-past-present-future

Unsolved Issues

What if the problems come from the ISP?

Page 65: Scaling tokopedia-past-present-future

FACTS

* Users cannot access Tokopedia
* Pictures do not show
* css and js are not loaded
* Sometimes it just shows a blank page
* Some ISPs do ad injection
* ALL WITHOUT REASON

Page 66: Scaling tokopedia-past-present-future

WHY?? WE DON’T KNOW

BUT SOMETHING HAPPENS ON THE ISP SIDE

Page 67: Scaling tokopedia-past-present-future

What we’ve done

Works well
* Using NginX Geo Module
* All HTTPS since Q4 2014
* Try CDN load balancing

Doesn’t work at all
* Talked to ISPs
* “Fight” in idEA
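The slides do not say how the NginX Geo Module was configured. As a rough illustration of what that module does (map a client IP to a value by address range, with the most specific range winning), here is a minimal sketch. The class `GeoMap`, the CIDR ranges (reserved documentation ranges), and the values `cdn_a`, `cdn_b`, `cdn_c` are all hypothetical.

```python
import ipaddress


class GeoMap:
    """Map a client IP to a value by CIDR range; the most specific range wins."""

    def __init__(self, default, rules):
        self.default = default
        # Sort by prefix length so longer (more specific) prefixes are checked first.
        self.rules = sorted(
            ((ipaddress.ip_network(cidr), value) for cidr, value in rules.items()),
            key=lambda rule: rule[0].prefixlen,
            reverse=True,
        )

    def lookup(self, ip):
        addr = ipaddress.ip_address(ip)
        for network, value in self.rules:
            if addr in network:
                return value
        return self.default


geo = GeoMap(
    default="cdn_a",
    rules={"203.0.113.0/24": "cdn_b", "203.0.113.128/25": "cdn_c"},
)
print(geo.lookup("203.0.113.10"))
print(geo.lookup("203.0.113.200"))
print(geo.lookup("198.51.100.7"))
```

A mapping like this could, for example, steer clients on a troublesome ISP's address ranges to a different CDN or origin, which fits the "CDN load balancing" item above, though that pairing is our guess rather than something the slides state.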

Page 68: Scaling tokopedia-past-present-future

Don’t think “someone else will join and take care of this” — Mike Krieger of Instagram

Page 69: Scaling tokopedia-past-present-future

Whether you think you can, or you think you can’t, you’re right — Henry Ford

Page 70: Scaling tokopedia-past-present-future

THANK YOU

ANY QUESTIONS?