16
We put the SQL back in NoSQL* @ApachePhoenix h0p://phoenix.apache.org/ James Taylor (@JamesPlusPlus) * Anybody else noBce that all the NoSQL stores have SQL?

We#put#the#SQL#backin# NoSQL* - Apache Phoenixphoenix.apache.org/presentations/HPTS.pdfWhatis#Apache#Phoenix?# • A#relaonal#database#layer#for#Apache#HBase # – Queryengine •

  • Upload
    others

  • View
    5

  • Download
    0

Embed Size (px)

Citation preview

Page 1: We#put#the#SQL#backin# NoSQL* - Apache Phoenixphoenix.apache.org/presentations/HPTS.pdfWhatis#Apache#Phoenix?# • A#relaonal#database#layer#for#Apache#HBase # – Queryengine •

V5  

We  put  the  SQL  back  in  NoSQL*  @ApachePhoenix  

h0p://phoenix.apache.org/  James  Taylor  (@JamesPlusPlus)  

*  Anybody  else  noBce  that  all  the  NoSQL  stores  have  SQL?    

Page 2: We#put#the#SQL#backin# NoSQL* - Apache Phoenixphoenix.apache.org/presentations/HPTS.pdfWhatis#Apache#Phoenix?# • A#relaonal#database#layer#for#Apache#HBase # – Queryengine •

About  James  •  Architect  at  Salesforce.com  –  Part  of  the  Big  Data  group  –  Lead  of  the  Apache  Phoenix  project  

Page 3: We#put#the#SQL#backin# NoSQL* - Apache Phoenixphoenix.apache.org/presentations/HPTS.pdfWhatis#Apache#Phoenix?# • A#relaonal#database#layer#for#Apache#HBase # – Queryengine •

What  is  Apache  Phoenix?  •  A  relaBonal  database  layer  for  Apache  HBase  

Page 4: We#put#the#SQL#backin# NoSQL* - Apache Phoenixphoenix.apache.org/presentations/HPTS.pdfWhatis#Apache#Phoenix?# • A#relaonal#database#layer#for#Apache#HBase # – Queryengine •

What  is  •  High  performance  horizontally  scalable  byte  store  •  Suitable  as  store  of  record  for  mission  criBcal  data  

?  

Page 5: We#put#the#SQL#backin# NoSQL* - Apache Phoenixphoenix.apache.org/presentations/HPTS.pdfWhatis#Apache#Phoenix?# • A#relaonal#database#layer#for#Apache#HBase # – Queryengine •

What  is  Apache  Phoenix?  

Page 6: We#put#the#SQL#backin# NoSQL* - Apache Phoenixphoenix.apache.org/presentations/HPTS.pdfWhatis#Apache#Phoenix?# • A#relaonal#database#layer#for#Apache#HBase # – Queryengine •

What  is  Apache  Phoenix?  •  A  relaBonal  database  layer  for  Apache  HBase  

–  Query  engine  •  Transforms  SQL  into  naBve  HBase  API  calls  •  Pushes  work  to  cluster  for  parallel  execuBon  

–  Metadata  repository  •  Typed  access  to  data  stored  in  HBase  tables  •  Support  mulB-­‐tenancy  modeled  as  SQL  views  

–  A  JDBC  driver  •  A  top  level  Apache  SoWware  FoundaBon  project  

–  Originally  developed  at  Salesforce  –  Now  a  top-­‐level  project  at  the  ASF  –  A  growing  community  with  momentum  

Page 7: We#put#the#SQL#backin# NoSQL* - Apache Phoenixphoenix.apache.org/presentations/HPTS.pdfWhatis#Apache#Phoenix?# • A#relaonal#database#layer#for#Apache#HBase # – Queryengine •

Where  Does  Phoenix  Fit  In?  Sqoo

p    

RDB  Da

ta  Collector  

Flum

e    

Log  Da

ta  Collector  

Zookeepe

r  Co

ordinaBo

n  YARN  

Cluster  Resource  Manager  /  MapReduce  

HDFS  2.0  

Hadoop  Distributed  File  System  

GraphX  Graph  analysis  framework  

Phoenix  Query  execuBon  engine  

HBase  Distributed  Database  

The  Java  Virtual  Machine  Hadoop  

Common  JNI  

Spark  

IteraBve  In-­‐Memory  ComputaBon  

MLLib  Data  mining  

Pig  Data  ManipulaBon  

Hive  Structured  Query  

Phoenix  JDBC  client  

Page 8: We#put#the#SQL#backin# NoSQL* - Apache Phoenixphoenix.apache.org/presentations/HPTS.pdfWhatis#Apache#Phoenix?# • A#relaonal#database#layer#for#Apache#HBase # – Queryengine •

Why  is  Phoenix  fast?  •  Pushes  down  computaBon  to  region  servers  

–  Start/stop  key  range(s)  –  Time  range  min/max  –  Predicates  –  AggregaBon  –  Sort  –  Limit  –  TopN  

•  Parallelizes  query  from  client  –  Intra-­‐region  through  staBsBcs  collecBon  

•  Supports  secondary  indexes  –  Global  &  co-­‐located  

Page 9: We#put#the#SQL#backin# NoSQL* - Apache Phoenixphoenix.apache.org/presentations/HPTS.pdfWhatis#Apache#Phoenix?# • A#relaonal#database#layer#for#Apache#HBase # – Queryengine •

Client  

Server   Server   Server   Server   Server   Server  Par::oned  Table  R1   R2   …   …   Rn  

Intra-­‐region  guideposts  

Parallel  scans  

•  Filter  •  Aggregate  •  Sort  •  Limit    

Scan  ranges  

Why  is  Phoenix  fast?  

Page 10: We#put#the#SQL#backin# NoSQL* - Apache Phoenixphoenix.apache.org/presentations/HPTS.pdfWhatis#Apache#Phoenix?# • A#relaonal#database#layer#for#Apache#HBase # – Queryengine •

Skip  Scans  •  Push  key  range  sets  through  filter  •  Use  SEEK_NEXT_HINT  to  skip  data  

R1  

Include

Include

Include

Skip

Skip

Skip

Page 11: We#put#the#SQL#backin# NoSQL* - Apache Phoenixphoenix.apache.org/presentations/HPTS.pdfWhatis#Apache#Phoenix?# • A#relaonal#database#layer#for#Apache#HBase # – Queryengine •

Filters  •  Pushes  down  WHERE  clause  to  server  •  Filters  enBre  HFile  based  on  Bme-­‐range  

R1HFile  t1  -­‐  10  

HFile  t11  -­‐  20  

HFile  t21  -­‐  30  

HFile  t31  -­‐  40   Scan  HFile  t31  -­‐  40   Scan  

Page 12: We#put#the#SQL#backin# NoSQL* - Apache Phoenixphoenix.apache.org/presentations/HPTS.pdfWhatis#Apache#Phoenix?# • A#relaonal#database#layer#for#Apache#HBase # – Queryengine •

What’s  Next?  

You  are  here  

Page 13: We#put#the#SQL#backin# NoSQL* - Apache Phoenixphoenix.apache.org/presentations/HPTS.pdfWhatis#Apache#Phoenix?# • A#relaonal#database#layer#for#Apache#HBase # – Queryengine •

Introducing  Apache  Calcite  •  Query  parser,  compiler,  and  planner  framework  – SQL-­‐92  compliant  

•  Pluggable  cost-­‐based  opBmizer  framework  – Sane  way  to  model  push  down  through  rules  

•  Interop  with  other  Calcite  adaptors  – Already  used  by  Drill,  Hive,  Kylin,  Samza  – Supports  any  JDBC  source  (RDBMS  -­‐  remember  them  J)  

– One  cost-­‐model  to  rule  them  all  

Page 14: We#put#the#SQL#backin# NoSQL* - Apache Phoenixphoenix.apache.org/presentations/HPTS.pdfWhatis#Apache#Phoenix?# • A#relaonal#database#layer#for#Apache#HBase # – Queryengine •

How  does  Phoenix  plug  in?  

Calcite  Parser  &  Validator

Calcite  Query  Op:mizer

Phoenix  Query  Plan  Generator Phoenix  RunBme

JDBC  Client

SQL  +  Phoenix  specific  grammar

Built-­‐in  rules  +  Phoenix  specific  

rules

Page 15: We#put#the#SQL#backin# NoSQL* - Apache Phoenixphoenix.apache.org/presentations/HPTS.pdfWhatis#Apache#Phoenix?# • A#relaonal#database#layer#for#Apache#HBase # – Queryengine •

What’s  Next?  Query  

Drillbit  

Calcite  

Phoenix/HBase   HDFS  

Drillbit  

Calcite  

RDBMS  

Drillbit  

Calcite  

Samza/  Streams  

Drillbit  

Calcite  

?  

Page 16: We#put#the#SQL#backin# NoSQL* - Apache Phoenixphoenix.apache.org/presentations/HPTS.pdfWhatis#Apache#Phoenix?# • A#relaonal#database#layer#for#Apache#HBase # – Queryengine •

Thank  you!  QuesBons?  

*  who  uses  Phoenix