How to Eat an Elephant –
Qlik and Big Data Ecosystem
David Freriks
Technology Evangelist
Office of Strategy Management
Q1 2018
Challenge - Providing Big Data to everyone
• Most Big Data users are not Data Scientists
─ Business users want simple, guided access
• Helping the user find relevant and contextual information
─ Instead of having to search through everything
• Ensuring the solution can accommodate today and tomorrow
─ Big Data landscape continues to evolve rapidly
• Able to use different methods for different data volumes and complexities
─ “One method does not fit all”
“A car may produce an exabyte of data a year (a billion gigabytes), but most is
completely meaningless. Isolating the megabyte of data a month that’s really valuable,
and then figuring out what you can do with it, that’s the challenge of Big Data.”
Scott McCormick, president of the Connected Vehicle Trade Association and industry adviser to the U.S. Secretary of Transportation, September 2013
The Qlik platform – for all users
Most Big Data Users are not Data Scientists
Deep drilling
Mostly drilling, some exploration
Mostly exploration, some drilling
Data Experts
Data Scientists
Breadth of Coverage
Depth of Coverage
Data Explorers
Descriptive, diagnostic and predictive analytics (“What happened?”, “Why did it happen?” and “What is likely to happen?”)
Qlik Accelerates Big Data ROI
Many firms that are investing in Big Data still
struggle to get the most from it.
Qlik’s platform drives higher ROI by delivering big data in
context with other data to ensure that Big Data stays relevant.
Make Big Data
Accessible
Deliver Big Data
In Context
Keep Big Data
Relevant
Qlik within a Big Data Architecture
Analyze
Refinement
Initial Processing
Gather
HADOOP
DATA SOURCES
ACCELERATORS
QIX Associative Engine
Unstructured
data
Structured
data
Standards-based or application-specific connector
NON-HADOOP
Hadoop
EDW
RDBMS
Data Lake
A Data Lake is a storage repository that holds a vast
amount of raw data in its native format until it is needed. *
Technology Implementation
Source: http://searchaws.techtarget.com/definition/data-lake
The Data Lake
• The New IT: How Technology Leaders Are Enabling
Business Strategy in the Digital Age, Jill Dyché, 2015
“You’ve been loading data into a data
warehouse for as long as we can
remember.
But no one asked us if we needed
any of that data.”
Indexed, Flexible, and
Agile Data Model
Why Qlik for Big Data?
Qlik is a data lake accelerator!
Sync Explore
Syncs and indexes data, and makes it available for
(1) search, (2) explore, (3) report.
Simple Analogy: Analytics off of Big Data
Data Lake
Water Tower
Direct
Sync Drink
Users
Raw
If Data Is The New Oil...
Shouldn't We Treat It That Way?
• Nobody Invests In Drilling At Random
• You can’t use raw Oil for anything…
• Refining is key!
The final part of the story is
adding context and relevance
and delivering it to people
at the point of decision.
Big Data is Only Half the Story
Advanced Analytics Integration (AAI)
• Direct integration with 3rd party advanced analytics
engines through server-side extension APIs
• Allows data to be directly exchanged between the QIX
engine and external tools during analysis
– Leverages Qlik’s Associative Model to pass relevant data
based on user context
• Full integration with Qlik Sense expressions and libraries
• Connectors can be built for any external engines
• Open source connectors to be made available by Qlik for
R and Python
Leverage the power of advanced analytics
calculations in Qlik Sense
Etc.
How AAI works
1. User interacts with app, making a selection or a search
2. Hypercube recalculated by QIX Engine to the new context
3. In-context data and script sent to external engine
4. External engine runs and sends results to QIX engine
5. QIX engine combines hypercube with new data
6. Combined hypercube is visualized for the user in the app
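The round trip described in the steps above can be sketched as a small plain-Python simulation. This is not Qlik's server-side extension API: the names `ROWS`, `recalculate_hypercube`, `external_engine`, `combine` and `on_selection` are hypothetical, the "external engine" is an ordinary function rather than a separate R or Python process, and the moving-average script is only an illustrative calculation.

```python
from statistics import mean

# Hypothetical rows standing in for data already loaded into the engine.
ROWS = [
    {"region": "East", "month": 1, "sales": 100.0},
    {"region": "East", "month": 2, "sales": 120.0},
    {"region": "East", "month": 3, "sales": 90.0},
    {"region": "West", "month": 1, "sales": 80.0},
]


def recalculate_hypercube(rows, selection):
    """Step 2: reduce the data to the user's current selection context."""
    return [r for r in rows if all(r[f] == v for f, v in selection.items())]


def external_engine(script, data):
    """Steps 3-4: stand-in for an external engine that receives a script
    name plus in-context data and returns one result per row."""
    if script != "moving_average":
        raise ValueError(f"unknown script: {script}")
    sales = [r["sales"] for r in data]
    # Two-point trailing moving average over the rows that were sent.
    return [mean(sales[max(0, i - 1): i + 1]) for i in range(len(sales))]


def combine(hypercube, results, column):
    """Step 5: attach the returned values to the hypercube rows."""
    return [{**row, column: value} for row, value in zip(hypercube, results)]


def on_selection(selection):
    """Steps 1-6 end to end; the returned rows are what a chart would draw."""
    cube = recalculate_hypercube(ROWS, selection)
    results = external_engine("moving_average", cube)
    return combine(cube, results, "sales_ma")
```

Calling `on_selection({"region": "East"})` sends only the three East rows to the stand-in engine, which is the point of passing in-context data: the external calculation sees the user's selection, not the full data set.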
Qlik + Cloudera
12 Points of Integration
App on Demand w/ Impala (In-memory, user-generated data slices)
Direct Query w/ Impala (Data stored in Parquet or Kudu)
Complex Data Types w/ Impala (Maps, Arrays, and Structures)
Writeback with Kudu (Interactive Analytics)
IoT and Kafka integration (Event-Driven / Streaming Analytics)
Solr Integration (In-Memory Apps Built on Solr Data)
Qlik Solr-API App on Demand (Search + QAP + D3js)
Advanced Analytics (Integration with Spark/Python/R)
Cloudera Metrics Dashboard (REST API based management console for CM)
Security – New SSO Support (Kerberos Delegation / SSO Pass-through)
Fast & Flexible BI & Analytics
Go Beyond SQL
Enterprise Ready
Data Lake Browser (Beta) (Data Concierge for Cloudera)
SAP Offload w/ Attunity (SAP S&D Module into HDFS/Impala)
• Let’s Eat…
Analyze data
Different data volumes and complexities need different Qlik solutions
Methods
Products: Qlik Sense®, QlikView®
• In-Memory: Highly compresses data into memory.
• On Demand App Generation: User selection generates purpose-built app
• Segmentation & Chaining: Multiple related apps that are linked together
• Other methods: APIs related to On Demand App Generation; partner solutions
Variables
Data Volume
• Size (rows)
• Dimensions (columns)
• Cardinality (uniqueness)
App Complexity
• Computational complexity
• Object density
Methods can be combined to meet different use cases
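As a rough illustration of the three data-volume variables named on this slide (size, dimensions, cardinality), the sketch below profiles a small table held as a list of dictionaries. The function name `profile_volume` and the sample rows are hypothetical, and the slide does not state any thresholds for choosing a method, so none are assumed here.

```python
def profile_volume(rows):
    """Return size (rows), dimensions (columns) and per-column cardinality."""
    columns = sorted({name for row in rows for name in row})
    return {
        "size_rows": len(rows),
        "dimensions_columns": len(columns),
        # Cardinality = number of distinct values seen in each column.
        "cardinality": {c: len({row.get(c) for row in rows}) for c in columns},
    }


sample = [
    {"id": 1, "country": "SE"},
    {"id": 2, "country": "SE"},
    {"id": 3, "country": "US"},
]
profile = profile_volume(sample)
```

Here `id` has a cardinality of 3 and `country` a cardinality of 2; a high-cardinality column such as an identifier is the kind of field that typically drives memory use up even when the row count is modest.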
On-Demand App Generation
• A template app summarizes the
entire big data environment
• Users can select subsets of data
and dynamically generate new
apps for analysis
• Analysis apps offer fully
unrestricted search and
exploration
Make selections to segment
Big Data and generate
analysis apps on the fly
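One way to picture the selection-to-app step is as turning the user's selections in the template app into a filtered query against the big data source, guarded by a row limit. The sketch below is an assumption-laden illustration, not Qlik's On-Demand App Generation mechanism: the function names, the `?` placeholder style, the table name `flights` and the `max_rows` default are all hypothetical.

```python
import re

_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_slice_query(table, selections):
    """Turn {field: [values]} selections into a parameterized SELECT."""
    for name in [table, *selections]:
        # Table and field names are interpolated, so validate them strictly.
        if not _IDENTIFIER.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    clauses, params = [], []
    for field in sorted(selections):
        values = list(selections[field])
        if not values:
            continue  # no selection on this field means no filter
        marks = ", ".join("?" for _ in values)
        clauses.append(f"{field} IN ({marks})")
        params.extend(values)
    sql = f"SELECT * FROM {table}"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params


def can_generate(estimated_rows, max_rows=5_000_000):
    """Only generate an analysis app when the selected slice is small enough."""
    return 0 < estimated_rows <= max_rows


sql, params = build_slice_query(
    "flights", {"year": [2017], "carrier": ["AA", "DL"]}
)
```

The template app would supply `estimated_rows` from its summarized view of the data, so the user narrows the selection until `can_generate` allows the slice to be loaded into a new analysis app.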
Presentation
Application
Qlik Sense HTML 5
Web Client
Proxy
Scheduler
QIX Engine
Repository
Applications
Custom HTML 5
Interface / Client
Used to create /
save apps
2nd proxy used to
auto-login
anonymous users
into known users
Used to read data
from Cloudera
metadata app
Used to visualize
data profiles
On-Demand with Qlik Sense APIs:
Qlik Solr / Data Concierge
Reads users
from NTLM
Impala
Hive
Solr
CM/CN
Demos
• SAP offload to Cloudera
• ODAG
• QlikSolr
• Cloudera Data Explorer
/ Metadata Miner
Thank you