cloudera-japan
1 Cloudera, Inc. All rights reserved.
Impala - Hadoop , Cloudera
Joined Cloudera in April 2011
email: [email protected] twitter: @shiumachi
Hadoop
BISQL
Hadoop
Hadoop
BI /
Sqoop, Flume
MapReduce, Hive, Pig, Spark
SAS, R, Spark, Mahout
NoSQL: HBase
Spark Streaming
Impala
Solr
HDFS, HBase
YARN, Cloudera Manager, Cloudera Navigator
Cloudera Impala
MPP SQL engine for Hadoop: http://impala.io/
Cloudera / MapR / Amazon / Oracle
HDFS, HBase, Hive
ODBC / JDBC
Kerberos / LDAP
Impala
[Figure: Impala architecture. A SQL app connects over ODBC / JDBC. Each of three nodes runs a Query Planner, a Query Coordinator, and a Query Exec Engine alongside an HDFS DN and HBase. Shared components: Hive Metastore, HDFS NN, State Store, Catalogd.]
Impala 1.x
Impala 1.0 (2013/04): SQL-92 on Hadoop; Parquet, Avro, SequenceFile; Kerberos; ODBC / JDBC
Impala 1.1: Apache Sentry, RBAC
Impala 1.2: UDF / UDAF, JOIN
Impala 1.3 / CDH 5.0
Impala 1.4 / CDH 5.1 (2014/07): SQL (DECIMAL, ORDER BY without LIMIT, etc.), HDFS
Impala 2.0 (2014/10)
SQL: SQL:2003 analytic / window functions, subqueries (WHERE, EXISTS, IN), CHAR / VARCHAR, GRANT / REVOKE (Sentry)
Hash tables can spill to disk (join and aggregate tables)
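The spill-to-disk idea can be illustrated with a small sketch. This is not Impala's implementation, which the slide does not describe; `spilling_sum` and `max_keys` are hypothetical names, and the overflow strategy (write rows that do not fit to a temporary file, then aggregate them in a later pass) is a simplification chosen for brevity.

```python
import pickle
import tempfile


def spilling_sum(rows, max_keys):
    """Sum values per key while the working hash table holds at most max_keys keys.

    Rows whose key does not fit in the working table are written to a
    temporary file and aggregated in a later pass. Assumption: only the
    working table is bounded; the final result is returned as one dict.
    """
    if max_keys < 1:
        raise ValueError("max_keys must be at least 1")

    totals = {}
    with tempfile.TemporaryFile() as spill:
        spilled = 0
        for key, value in rows:
            if key in totals:
                totals[key] += value          # key already resident
            elif len(totals) < max_keys:
                totals[key] = value           # room for a new key
            else:
                pickle.dump((key, value), spill)  # no room: spill the row
                spilled += 1

        results = dict(totals)
        if spilled:
            # Re-read the spilled rows lazily and aggregate them the same way.
            spill.seek(0)
            overflow = (pickle.load(spill) for _ in range(spilled))
            results.update(spilling_sum(overflow, max_keys))
    return results
```

Each pass resolves at least one new key, so the recursion ends once every distinct key has been aggregated.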
SQL-on-Hadoop (2014/09)
Impala 1.4.0, Presto 0.74, Stinger phase 3 (Hive 0.13.0), Spark SQL 1.1
TPC-DS based, using the Impala TPC-DS kit: https://github.com/cloudera/impala-tpcds-kit
SQL-92 JOIN Presto JVM
http://blog.cloudera.com/blog/2014/09/new-benchmarks-for-sql-on-hadoop-impala-1-4-widens-the-performance-gap/
Impala :
Impala :
Analytic / window functions (2.0)
RANK() / DENSE_RANK(), FIRST_VALUE() / LAST_VALUE(), LAG() / LEAD(), ROW_NUMBER()
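To show how three of these functions differ on tied values, here is a small sketch. It runs on SQLite through Python's `sqlite3` module rather than on Impala, so it only illustrates the standard SQL semantics; the `scores` table and its rows are made up for the example, and an SQLite build with window-function support (3.25 or later) is assumed.

```python
import sqlite3

rank_conn = sqlite3.connect(":memory:")
rank_conn.execute("CREATE TABLE scores (player TEXT, score INTEGER)")
rank_conn.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [("a", 90), ("b", 90), ("c", 80), ("d", 70)],  # hypothetical data
)

rank_rows = rank_conn.execute(
    """
    SELECT player,
           score,
           RANK()       OVER (ORDER BY score DESC)         AS rnk,
           DENSE_RANK() OVER (ORDER BY score DESC)         AS dense_rnk,
           ROW_NUMBER() OVER (ORDER BY score DESC, player) AS row_num
    FROM scores
    ORDER BY score DESC, player
    """
).fetchall()

for row in rank_rows:
    print(row)
```

With two players tied at 90, RANK() leaves a gap after the tie, DENSE_RANK() does not, and ROW_NUMBER() numbers every row distinctly.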
select stock_symbol, closing_date, closing_price,
  lag(closing_price,1) over (partition by stock_symbol order by closing_date) as "yesterday closing"
  from stock_ticker
  order by closing_date;
+--------------+---------------------+---------------+-------------------+
| stock_symbol | closing_date        | closing_price | yesterday closing |
+--------------+---------------------+---------------+-------------------+
| JDR          | 2014-09-13 00:00:00 | 12.86         | NULL              |
| JDR          | 2014-09-14 00:00:00 | 12.89         | 12.86             |
| JDR          | 2014-09-15 00:00:00 | 12.94         | 12.89             |
| JDR          | 2014-09-16 00:00:00 | 12.55         | 12.94             |
| JDR          | 2014-09-17 00:00:00 | 14.03         | 12.55             |
| JDR          | 2014-09-18 00:00:00 | 14.75         | 14.03             |
| JDR          | 2014-09-19 00:00:00 | 13.98         | 14.75             |
+--------------+---------------------+---------------+-------------------+
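The LAG() query above can be tried without a cluster. The sketch below loads the same seven rows into an in-memory SQLite database and runs an equivalent query through Python's `sqlite3` module. SQLite stands in for Impala here, so types and output formatting differ; the alias `yesterday_closing` and the TEXT / REAL column types are choices made for this sketch, and an SQLite build with window-function support (3.25 or later) is assumed.

```python
import sqlite3

lag_conn = sqlite3.connect(":memory:")
lag_conn.execute(
    "CREATE TABLE stock_ticker ("
    "stock_symbol TEXT, closing_date TEXT, closing_price REAL)"
)
lag_conn.executemany(
    "INSERT INTO stock_ticker VALUES (?, ?, ?)",
    [
        ("JDR", "2014-09-13 00:00:00", 12.86),
        ("JDR", "2014-09-14 00:00:00", 12.89),
        ("JDR", "2014-09-15 00:00:00", 12.94),
        ("JDR", "2014-09-16 00:00:00", 12.55),
        ("JDR", "2014-09-17 00:00:00", 14.03),
        ("JDR", "2014-09-18 00:00:00", 14.75),
        ("JDR", "2014-09-19 00:00:00", 13.98),
    ],
)

lag_rows = lag_conn.execute(
    """
    SELECT stock_symbol,
           closing_date,
           closing_price,
           LAG(closing_price, 1) OVER (
               PARTITION BY stock_symbol ORDER BY closing_date
           ) AS yesterday_closing
    FROM stock_ticker
    ORDER BY closing_date
    """
).fetchall()

for row in lag_rows:
    print(row)
```

The first row of each partition has no previous row, so LAG() returns NULL there, which `sqlite3` surfaces as `None`.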
HBase integration: Impala can run SELECT and INSERT against HBase tables
ImpalaHBase
HBase : WebPVSNS
()HBase :
1 INSERT VALUES
Impala HBase external systems
put SELECT * FROM hbase_tbl
INSERT / INSERT VALUES get, scan
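The slide pairs INSERT / INSERT VALUES and SELECT * FROM hbase_tbl with the HBase operations put, get, and scan. A toy model can make that relationship concrete. Everything below is hypothetical: `ToyHBaseTable`, `insert_values`, and `select_all` are in-memory stand-ins written for this sketch, not the HBase or Impala APIs.

```python
class ToyHBaseTable:
    """In-memory stand-in for an HBase table: row key -> {column: value}."""

    def __init__(self):
        self._rows = {}

    def put(self, row_key, columns):
        """Write or update the columns of one row."""
        self._rows.setdefault(row_key, {}).update(columns)

    def get(self, row_key):
        """Read one row by key, or None if it does not exist."""
        return self._rows.get(row_key)

    def scan(self, start=None, stop=None):
        """Yield (key, columns) in key order for start <= key < stop."""
        for key in sorted(self._rows):
            if start is not None and key < start:
                continue
            if stop is not None and key >= stop:
                break
            yield key, self._rows[key]


def insert_values(table, row_key, columns):
    """Toy counterpart of INSERT ... VALUES: one row becomes one put."""
    table.put(row_key, columns)


def select_all(table):
    """Toy counterpart of SELECT * FROM hbase_tbl: a full scan."""
    return [dict(columns, key=key) for key, columns in table.scan()]


hbase_tbl = ToyHBaseTable()
hbase_tbl.put("user2", {"pv": 7})              # written by an external system
insert_values(hbase_tbl, "user1", {"pv": 3})   # written from the SQL side

print(select_all(hbase_tbl))    # SQL side reads both rows
print(hbase_tbl.get("user1"))   # external system reads the SQL-side row
```

Both sides see the same rows, which is what lets an HBase table act as a meeting point between SQL queries and other applications.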
impalad
SPOF
2 Cloudera Manager fair-scheduler.xml llama-site.xml
100 10
10 1
1000 GB
100 GB
Group A
Group B
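The two groups above appear to have different resource limits, but the labels for the numbers did not survive, so the sketch below uses its own illustrative limits. It models one common form of per-pool admission control: a cap on running queries, a bounded queue, and rejection when the queue is full. `ResourcePool`, `submit`, and `finish` are hypothetical names, and this is not Impala's actual admission logic or configuration format.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ResourcePool:
    """Toy resource pool with a running-query cap and a bounded queue."""

    name: str
    max_running: int
    max_queued: int
    running: int = 0
    queue: deque = field(default_factory=deque)

    def submit(self, query_id):
        """Admit the query, queue it, or reject it, in that order of preference."""
        if self.running < self.max_running:
            self.running += 1
            return "admitted"
        if len(self.queue) < self.max_queued:
            self.queue.append(query_id)
            return "queued"
        return "rejected"

    def finish(self):
        """Mark one running query as done and admit the next queued query, if any."""
        if self.running == 0:
            raise RuntimeError("no running query to finish")
        self.running -= 1
        if self.queue:
            next_query = self.queue.popleft()
            self.running += 1
            return next_query
        return None


pool = ResourcePool(name="group_b", max_running=2, max_queued=1)  # illustrative limits
outcomes = [pool.submit(q) for q in ("q1", "q2", "q3", "q4")]
print(outcomes)
promoted = pool.finish()
print(promoted)
```

Giving each group its own pool keeps a burst of queries in one group from consuming the slots of the other.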
Hue Web UI (CDH)
JDBC / ODBC BI
MicroStrategy, QlikView, SAS, Tableau
: https://zoomdata.zendesk.com/hc/en-us/articles/203813488-Date-and-Time-Formats-Supported-By-Zoomdata
Impala ()
http://demo.gethue.com/
Quick Start VM (VM): http://www.cloudera.com/content/cloudera/en/downloads/quickstart_vms/cdh-5-3-x.html
Cloudera Live ((14) 4, Tableau, ZoomData): http://www.cloudera.com/content/cloudera/en/products-and-services/cloudera-live.html
Cloudera Director (AWS): http://www.cloudera.com/content/cloudera/en/downloads/cloudera-director/1-1-0.html
Amazon EMR: http://docs.aws.amazon.com/ja_jp/ElasticMapReduce/latest/DeveloperGuide/emr-impala.html
Thank you
Impala
http://www.cloudera.com/content/cloudera/en/documentation/core/latest/topics/impala_cluster_sizing.html
: CPU1264GB2TB HDD x 121015TB2020
Impala
:
10: http://www.slideshare.net/cloudera/the-impala-cookbook-42530186
Parquet read-once SequenceFile + Snappy
http://www.cidrdb.org/cidr2015/Papers/CIDR15_Paper28.pdf
http://www.vldb.org/pvldb/vol7/p1295-floratou.pdf