Tapping the real-time stream with SQL Wix’s SQL-on-Storm Platform May, 2015 Gregory Bondar, [email protected] Igal Shilman, [email protected]


Page 1: Wix sql on-storm-platform

Tapping the real-time stream with SQL

Wix’s SQL-on-Storm Platform

May, 2015

Gregory Bondar, [email protected]
Igal Shilman, [email protected]

Page 2: Wix sql on-storm-platform

Wix Company

• Wix.com is the world’s leading cloud-based web development platform that enables users to create professional HTML5 websites using online "Drag & Drop" tools
• Wix was founded in 2006 and is headquartered in Tel Aviv
• Wix has around 65M registered users and growing…

Page 3: Wix sql on-storm-platform

Wix’s Data Services: building blocks

• Batch-Oriented Data Processing:
  - Hadoop ecosystem: Cloudera CDH4, HBase, Pig, Oozie, etc.

• SQL-on-Hadoop interfaces:
  - Facebook’s Presto with “home-made” Parquet, HBase and MS SQL connectors

• Real-Time Stream-Oriented Analytics:
  - Storm, Esper, etc.

• And more:
  - Microsoft SQL Server 2012
  - Google Cloud (AppEngine, Datastore, Pub/Sub, Dataflow, etc.)
  - Sharded Redis cluster

Page 4: Wix sql on-storm-platform

Major limitations pushed us into the Data Stream journey

• Latency, latency, laaaaaaaaatency…
  – Events ingestion latency (10-20 minutes on average)
  – Hadoop is optimized for batch-oriented processing of historical data
  – Latency of analytic job results (up to dozens of minutes)
  – Unpredictable consumption of Hadoop cluster resources by on-demand analytic jobs

Page 5: Wix sql on-storm-platform

Use Cases that require Real-Time Data Stream Analytics

• Product personalization
• Analysis of user behavior trends and anomalies
• Operational analytics (monitoring, security, etc.)
• Machine learning models against user activity to predict user behavior

Page 6: Wix sql on-storm-platform

Wix Data Stream Tube

Let’s assume that all of Wix’s events flow through one tube named “events”
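
To make the "tube" idea concrete, here is a minimal Python sketch of what a SQL-like query over the `events` stream could compute. The event fields (`user_id`, `type`), the query text in the comment, and the function name are illustrative assumptions; they are not Wix's actual schema or query syntax, and the batching is a simplified stand-in for a real streaming window.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator

# Hypothetical query, written in SQL-like form for illustration only:
#   SELECT user_id, count(*) FROM events
#   WHERE type = 'click' GROUP BY user_id   -- evaluated per batch of N events

def count_clicks_per_user(events: Iterable[dict], batch_size: int) -> Iterator[Dict[str, int]]:
    """Yield one {user_id: click_count} result per batch of `batch_size` events."""
    counts: Counter = Counter()
    seen = 0
    for event in events:
        seen += 1
        if event.get("type") == "click":      # WHERE type = 'click'
            counts[event["user_id"]] += 1     # GROUP BY user_id, count(*)
        if seen == batch_size:                # batch is full: emit and reset
            yield dict(counts)
            counts = Counter()
            seen = 0
    if seen:                                  # flush the final, partial batch
        yield dict(counts)

events = [
    {"user_id": "u1", "type": "click"},
    {"user_id": "u2", "type": "view"},
    {"user_id": "u1", "type": "click"},
    {"user_id": "u2", "type": "click"},
]
results = list(count_clicks_per_user(events, batch_size=3))
```

The query never sees the stream as a finished table: it emits a result each time a batch closes, which is the main difference from running the same SQL over historical data.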

Page 7: Wix sql on-storm-platform

SQL-like query language

Page 8: Wix sql on-storm-platform

SQL-like query language (Cont.)

Page 9: Wix sql on-storm-platform

Wix’s SQL-on-Storm: requirements

• Democratizing Data, self-service to access and utilize as much data as legally possible
• User-friendly interface for SQL patriots
• Flexibility to execute any kind of queries
• Ability to output the query results to external services
• On-demand and long-running queries support
• Knowledge sharing: “ready-to-use” query templates
• High throughput and maximum uptime

Page 10: Wix sql on-storm-platform

Integrated usage of Storm and Esper

Page 11: Wix sql on-storm-platform

Esper - http://www.espertech.com/esper/

• Esper – light-weight Java library for complex event processing (CEP) and event series analysis
• Why Esper?
  – Offers rich SQL-like event processing language (EPL) supporting very complex event streaming analytics
  – Easy to integrate and use
  – Very stable, with high performance metrics
  – Actively developed
  – Open source, well documented

Page 12: Wix sql on-storm-platform

Storm topology reuse by correct partition key

• Accepts events from log collectors
• Converts them to enriched objects
• Hash-partitions objects by key (e.g., user id, request id)
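
The hash-partitioning step can be sketched in plain Python as follows. This is not Storm code; in Storm, this kind of routing is what a fields grouping provides. The helper names `partition_for` and `route` and the sample events are assumptions made for illustration.

```python
import zlib
from collections import defaultdict
from typing import Dict, Iterable, List

def partition_for(key: str, num_tasks: int) -> int:
    """Map a key to a task index with a stable hash (same key -> same task)."""
    return zlib.crc32(key.encode("utf-8")) % num_tasks

def route(events: Iterable[dict], key_field: str, num_tasks: int) -> Dict[int, List[dict]]:
    """Group events by the task that would receive them."""
    tasks: Dict[int, List[dict]] = defaultdict(list)
    for event in events:
        tasks[partition_for(str(event[key_field]), num_tasks)].append(event)
    return tasks

sample = [
    {"user_id": "u1", "type": "click"},
    {"user_id": "u2", "type": "view"},
    {"user_id": "u1", "type": "purchase"},
]
by_task = route(sample, key_field="user_id", num_tasks=4)
```

Because every event for a given user lands on the same task, a per-user query can be evaluated locally on that task without cross-task coordination. That is why choosing the right partition key lets one topology serve many queries.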

Page 13: Wix sql on-storm-platform

Compute Bolt

• Manages Esper engine instances
• Deploys/undeploys queries on demand
• Routes query results to the action / aggregation layers
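
The three responsibilities above can be sketched with a toy Python class. It is a stand-in for the idea only: the class name `ComputeNode`, the callable-based "queries", and the sink signature are all assumptions, and no Esper or Storm API is used.

```python
from typing import Callable, Dict, List, Optional, Tuple

Event = dict
Query = Callable[[Event], Optional[dict]]   # returns a result row, or None if no match
Sink = Callable[[str, dict], None]          # receives (query_id, result row)

class ComputeNode:
    """Toy stand-in for a compute bolt: holds deployed queries and routes their results."""

    def __init__(self, sink: Sink) -> None:
        self._queries: Dict[str, Query] = {}
        self._sink = sink

    def deploy(self, query_id: str, query: Query) -> None:
        """Register a query so that it sees all subsequent events."""
        self._queries[query_id] = query

    def undeploy(self, query_id: str) -> None:
        """Remove a query; unknown ids are ignored."""
        self._queries.pop(query_id, None)

    def on_event(self, event: Event) -> None:
        """Evaluate every deployed query and forward matches to the sink."""
        for query_id, query in list(self._queries.items()):
            row = query(event)
            if row is not None:
                self._sink(query_id, row)

routed: List[Tuple[str, dict]] = []
node = ComputeNode(sink=lambda qid, row: routed.append((qid, row)))
node.deploy("errors", lambda e: {"user": e["user_id"]} if e.get("type") == "error" else None)
node.on_event({"user_id": "u1", "type": "error"})
node.on_event({"user_id": "u2", "type": "click"})
node.undeploy("errors")
node.on_event({"user_id": "u3", "type": "error"})   # no longer matched
```

Keeping queries in a registry keyed by id is what allows them to be added and removed while the stream keeps flowing, without redeploying the topology.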

Page 14: Wix sql on-storm-platform

Actions

• Personalization Services
• Graphite
• Database
• New Relic
• Email
• UDP and HTTP output
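
As an illustration of one such action, the sketch below formats a query result as a Graphite plaintext data point (`path value timestamp`) and shows how it could be sent over UDP. The metric name, host, and port are hypothetical, and the send call is left commented out so the example runs without a Graphite server.

```python
import socket

def graphite_line(metric: str, value: float, timestamp: int) -> bytes:
    """Format one data point in Graphite's plaintext format: 'path value timestamp'."""
    return f"{metric} {value} {timestamp}\n".encode("ascii")

def send_udp(payload: bytes, host: str, port: int) -> None:
    """Fire-and-forget UDP send; delivery is not guaranteed."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))

line = graphite_line("wix.demo.clicks_per_minute", 42, 1431000000)
# send_udp(line, "graphite.example.internal", 2003)  # hypothetical host and port
```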

Page 15: Wix sql on-storm-platform

Wix SQL-on-Storm Dashboard: Demo

Page 16: Wix sql on-storm-platform

Aggregation Bolt

• Special action type aggregating partial results of Compute Bolts
• In other words: Map-Reduce paradigm implementation for streaming
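
The map-reduce analogy can be sketched in Python: each compute task produces partial counts for its own partition, and an aggregation step merges them into a global result. The function names and sample data are assumptions for illustration, not the platform's actual implementation.

```python
from collections import Counter
from typing import Dict, Iterable

def partial_counts(events: Iterable[dict]) -> Counter:
    """'Map' side: one compute task counts events per type for its own partition."""
    return Counter(e["type"] for e in events)

def merge(partials: Iterable[Counter]) -> Dict[str, int]:
    """'Reduce' side: the aggregation step sums partial counts into a global result."""
    total: Counter = Counter()
    for partial in partials:
        total.update(partial)
    return dict(total)

partition_a = [{"type": "click"}, {"type": "view"}]
partition_b = [{"type": "click"}, {"type": "click"}]
combined = merge([partial_counts(partition_a), partial_counts(partition_b)])
```

This works directly for aggregates that can be merged from partial results, such as counts and sums. An average would need each task to emit a (sum, count) pair instead of its local average.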

Page 17: Wix sql on-storm-platform

Wix SQL-on-Storm – Aggregation Queries: Demo

Page 18: Wix sql on-storm-platform

Wix SQL-on-Storm: Architecture Summary

Page 19: Wix sql on-storm-platform

Any Questions?!