Overview of stinger interactive query for hive

Overview of S+nger: Interac+ve Query for Hive

@ddkaiser linkedin.com/in/dkaiser slideshare.net/ddkaiser [email protected] [email protected]

OC Big Data Meetup #1 May 21, 2014

David Kaiser

Who Am I?

David Kaiser

20+ years experience with Linux 3 years experience with Hadoop Career experiences: •  Data Warehousing •  Geospa+al Analy+cs •  Open-‐source Solu+ons and Architecture

Employed at Hortonworks as a Senior Solu+ons Engineer


Overview of Stinger: Interactive Query for Hive

• Abstract: – Hadoop is about so much more than batch processing. With the recent release of Hadoop 2, there have been many new approaches for increased applica+on performance.

– Hive is the most used SQL implementa+on on Hadoop. – Hive provides the most amount of SQL compa+bility on Hadoop.

– But… Hive is Slow. – Hive WAS Slow.

– This talk will discuss the S+nger ini+a+ve, which improved Hive performance over 100x.

S"nger Project (announced February 2013)

Batch AND Interactive SQL-in-Hadoop Stinger Initiative A broad, community-based effort to drive the next generation of HIVE

Hive 0.13, April 2014

•  Hive on Apache Tez •  Query Service •  Buffer Cache •  Cost Based Op+mizer (Op+q) •  Vectorized Processing

Hive 0.11, May 2013: •  Base Op+miza+ons •  SQL Analy+c Func+ons •  ORCFile, Modern File Format

Hive 0.12, October 2013:

•  VARCHAR, DATE Types •  ORCFile predicate pushdown •  Advanced Op+miza+ons •  Performance Boosts via YARN

Speed Improve Hive query performance by 100X to allow for interac+ve query +mes (seconds)

Scale The only SQL processing in Hadoop designed for queries that scale from TB to PB

SQL Support broadest range of SQL seman+cs for analy+c applica+ons running against Hadoop

Goals:

An Open Community at its finest: Apache Hive Contribution

1,672 Jira Tickets Closed

145 Developers

44 Companies

~400,000 Lines Of Code Added…

13 Months

Outcomes from the Stinger Project

Page 5

Feature Descrip"on Benefit

Tez Integra+on Tez is significantly beeer engine than MapReduce Latency

Vectorized Query Take advantage of modern hardware by processing thousand-‐row blocks rather than row-‐at-‐a-‐+me. Throughput

Query Planner

Using extensive sta+s+cs now available in Metastore to beeer plan and op+mize query, including predicate pushdown during compila+on to eliminate por+ons of input (beyond par++on pruning)

Latency

ORC File Columnar, type aware format with indices Latency

Cost Based Op+mizer (Op+q)

Join re-‐ordering and other op+miza+ons based on column sta+s+cs including histograms etc. Latency

Hive as a Service Leaves engine running between sessions Latency

Buffer Cache Leaves most used HDFS file blocks in memory Latency

Hadoop 2: Moving Past MapReduce

Page 6

HADOOP 1.0

HDFS (redundant, reliable storage)

MapReduce (cluster resource management

& data processing)

HDFS2 (redundant, highly-‐available & reliable storage)

YARN (cluster resource management)

MapReduce (data processing)

Others

HADOOP 2.0

Single Use System Batch Apps

Mul/ Purpose Pla5orm Batch, Interac/ve, Online, Streaming, …

Apache Tez as the new Primitive

HDFS2 (redundant, reliable storage)

Tez (execu+on engine)

YARN (cluster resource management)

HADOOP 2.0

MapReduce as Base Apache Tez as Base

HDFS (redundant, reliable storage)

MapReduce (cluster resource management

& data processing)

Pig (data flow)

Hive (sql)

Others (Cascading)

HADOOP 1.0 Data Flow

Pig SQL Hive

Others (Cascading)

Batch MapReduce

Slider (con+nuous execu+on)

Online Data

Processing HBase, Accumulo

Real Time Stream

Processing Storm

Complete Open Source Stack •  YARN is the logical extension of Apache Hadoop

–  Complements HDFS, the data reservoir

•  Resource Management for the Enterprise Data Lake –  Shared, secure, mul+-‐tenant Hadoop

Allows for all processing in Open-Source Hadoop

Page 8

HDFS2 (Redundant, Reliable Storage)

YARN (Cluster Resource Management)

BATCH (MapReduce)

INTERACTIVE (Tez)

STREAMING (Storm, S4,…)

GRAPH (Giraph)

IN-‐MEMORY (Spark)

HPC MPI (OpenMPI)

ONLINE (HBase)

OTHER (Search) (Weave…)

Feature Descrip"on Benefit

Tez Session Overcomes Map-‐Reduce job-‐launch latency by pre-‐launching Tez AppMaster Latency

Tez Container Pre-‐Launch

Overcomes Map-‐Reduce latency by pre-‐launching hot containers ready to serve queries. Latency

Tez Container Re-‐Use

Finished maps and reduces pick up more work rather than exi+ng. Reduces latency and eliminates difficult split-‐size tuning. Out of box performance!

Latency

Run+me re-‐configura+on of DAG

Run+me query tuning by picking aggrega+on parallelism using online query sta+s+cs Throughput

Tez In-‐Memory Cache Hot data kept in RAM for fast access. Latency

Complex DAGs Tez Broadcast Edge and Map-‐Reduce-‐Reduce paeern improve query scale and throughput. Throughput

Hive On Tez - Execution

ORC File Advantages Sustained Query Times Apache Hive 0.12 provides sustained acceptable query times even at petabyte scale

131 GB (78% Smaller)

File Size Comparison Across Encoding Methods Dataset: TPC-‐DS Scale 500 Dataset


Encoded with Text

Encoded with RCFile

Encoded with ORCFile

Encoded with Parquet


585 GB (Original Size) •  Larger Block Sizes

•  Columnar format arranges columns adjacent within the file for compression & fast access

Impala

Hive 12

Smaller Footprint Better encoding with ORC in Apache Hive 0.12 reduces resource requirements for your cluster.

ORCFile File Format

Page 11

Query-‐Op"mized: Split-‐able, columnar storage file Efficient Reads: Break into large “stripes” of data for efficient read Fast Filtering: Built in index, min/max, metadata for fast filtering blocks -‐ bloom filters if desired Efficient Compression: Decompose complex row types into primi+ves: massive compression and efficient comparisons for filtering Precomputa"on: Built in aggregates per block (min, max, count, sum, etc.)

A Journey to SQL Compliance Evolu"on of SQL Compliance in Hive SQL Datatypes SQL Seman"cs INT/TINYINT/SMALLINT/BIGINT SELECT, INSERT

FLOAT/DOUBLE GROUP BY, ORDER BY, HAVING

BOOLEAN JOIN on explicit join key

ARRAY, MAP, STRUCT, UNION Inner, outer, cross and semi joins

STRING Sub-‐queries in the FROM clause

BINARY ROLLUP and CUBE

TIMESTAMP UNION

DECIMAL Standard aggrega+ons (sum, avg, etc.)

DATE Custom Java UDFs

VARCHAR Windowing func+ons (OVER, RANK, etc.)

CHAR Advanced UDFs (ngram, XPath, URL)

Interval Types Sub-‐queries for IN/NOT IN, HAVING

JOINs in WHERE Clause

INSERT/UPDATE/DELETE

Legend Hive 10 or earlier

Roadmap

Hive 11

Hive 12

Hive 13

Tez – Execution Performance •  Performance gains over Map Reduce

–  Eliminate replicated write barrier between successive computa+ons. –  Eliminate job launch overhead of workflow jobs. –  Eliminate extra stage of map reads in every workflow job. –  Eliminate queue and resource conten+on suffered by workflow jobs that are started aper a predecessor job completes.

Page 13

Pig/Hive -‐ MR Pig/Hive -‐ Tez

Hive – MR Hive – Tez

Hive-on-MR vs. Hive-on-Tez SELECT a.state, COUNT(*), AVERAGE(c.price) FROM a JOIN b on (a.id = b.id) JOIN c on (a.itemId = c.itemId) GROUP by a.state

SELECT a.state

JOIN (a, c) SELECT c.price

SELECT b.id

JOIN(a, b) GROUP BY a.state

COUNT(*) AVERAGE(c.price)

M M M

R R

M M

R

M M

R

M M

R

HDFS

HDFS

HDFS

M M M

R R

R

M M

R

R

SELECT a.state, c.itemId

JOIN (a, c)

JOIN(a, b) GROUP BY a.state

COUNT(*) AVERAGE(c.price)

SELECT b.id

Tez avoids unneeded writes to HDFS

Vectorization • Rewrite all operations to operate on blocks of 1K+ records, rather than one record at a time

• Block is array of Java scalars, not Objects (eliminate Objects – compounding GC gains over time)

• Avoids many function calls, CPU pipeline stalls

•  Size to fit in L1 cache, avoid cache misses

Page 15

Stinger Phase 3: Unlocking Interactive Query

S"nger Phase 3: Features and Benefits

Container Pre-‐Launch Overcomes Java VM startup latency by pre-‐launching hot containers ready to serve queries

Container Re-‐Use Finished Maps and Reduces pick up more work rather than exi+ng. Reduces latency and eliminates difficult split size tuning

Tez Integra+on Tez Broadcast Edge and Map-‐Reduce-‐Reduce paeern improve query scale and throughput

In-‐Memory Cache Hot data kept in RAM for fast access

Quantifying Stinger

Page 17

Hive 10 Hive 0.13 (Phase 3) Hive 0.11 (Phase 1)

190x Improvement

1400s

39s

7.2s

TPC-‐DS Query 27

3200s

65s

14.9s

TPC-‐DS Query 82

200x Improvement

Query 27: Pricing Analy"cs using Star Schema Join Query 82: Inventory Analy"cs Joining 2 Large Fact Tables

All Results at Scale Factor 200 (Approximately 200GB Data)

41.1s

4.2s

39.8s

4.1s TPC-‐DS Query 52 TPC-‐DS Query 55

Query Time in Seconds

Speed: Delivering Interactive Query

Test Cluster: •  200 GB Data (ORCFile) •  20 Nodes, 24GB RAM each, 6x disk each

Hive 0.12

Hive 0.13 (Phase 3)

Query 52: Star Schema Join with group-‐by, order-‐by on different keys Query 55: Star Schema Join with group-‐by, order-‐by on different keys

22s

9.8s

31s

6.7s TPC-‐DS Query 28 TPC-‐DS Query 12

Query Time in Seconds

Speed: Delivering Interactive Query

Test Cluster: •  200 GB Data (ORCFile) •  20 Nodes, 24GB RAM each, 6x disk each

Hive 0.12

Hive 0.13 (Phase 3)

Query 28: Four sub-‐query join (Vectoriza"on) Query 12: Star Join over range of dates (M-‐R-‐R palern)

Hortonworks Confidential © 2014

Speed@Scale: Large Scale Implementation

Page 20

http://blogs.cisco.com/datacenter/hdp

Cisco Engineering Blog Post Independent assessment by Cisco UCS Team Benchmark @ 30TB

Hortonworks Confidential © 2014

Speed@Scale: Large Scale Implementation

Page 21

https://code.facebook.com/posts/229861827208629/scaling-the-facebook-data-warehouse-to-300-pb/

Facebook Engineering Blog Post Hortonworks engineering team worked on ORCFile Facebook provided improvements to ORCFile, working with Hortonworks Hive is used for efficient analytics on the largest Hadoop Data Warehouse site Ultimate Scale Data Analysis

Your Fastest On-ramp to Enterprise Hadoop™!

Page 22

hep://hortonworks.com/products/hortonworks-‐sandbox/

The Sandbox lets you experience Apache Hadoop from the convenience of your own laptop – no data center, no cloud and no internet connec+on needed! The Hortonworks Sandbox is: •  A free download: hep://hortonworks.com/products/hortonworks-‐sandbox/ •  A complete, self contained virtual machine with Apache Hadoop pre-‐configured •  A personal, portable and standalone Hadoop environment •  A set of hands-‐on, step-‐by-‐step tutorials that allow you to learn and explore Hadoop

Ques+ons?


David Kaiser

Data & Analytics

Overview of stinger interactive query for hive