Big Data and Business Intelligence
Virgil Dodson
Actuate Corporation © 2012
Today’s Agenda and Goals
• Introduction to Big Data
• Eclipse Survey Results
• Independent Survey Results
• Introduction to BIRT
• Big Data Connections
• Live Demo
• Questions
Big Data Definition
Big data is a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications.
Examples: web logs, RFID, sensors, social networks, Internet text, search indexes, call detail records, astronomy, atmospheric info, genomics, biogeochemical, biological, military surveillance, medical records, photographs, video, large-scale e-commerce
- Wikipedia
IDC 2013 Big Data Predictions
• The “Digital Universe” will expand to over 4 zettabytes, more than 50% growth from 2012
• The Big Data focus will shift “up the stack”, toward analytics and discovery, and analytic applications
• Spending will reach $10 billion in 2013, over $20 billion by 2016
Source: IDC, IDC Predictions 2013 presentation
Eclipse BIRT Survey – Oct/Nov 2012
• Survey title: “Big Data or Little Data - How Do You Display Yours?” The Eclipse Foundation would like to better understand how developers are using Eclipse with big data and reporting projects.
• We ran this survey to get the pulse of which technologies related to Eclipse/BIRT were in demand.
• Eclipse promoted the survey.
• 60% of 518 respondents claimed to be big data users
Eclipse BIRT Survey - Technology Choices
What big data technologies are you using with Eclipse?
• Hadoop: 28.5%
• Cassandra: 7.3%
• MongoDB: 17.0%
• BIRT: 20.6%
• Hive: 10.9%
• Talend Open Studio: 7.3%
• Mahout: 7.9%
• R: 12.1%
• None: 40.0%
Note: Respondents could choose more than one option
Eclipse BIRT Survey - Other Mentions
Home grown, Jasper, Greenplum, jdt, Netezza, ZEND, StreamBase, hypertable, HBase, CouchDB, torque, Pentaho, OOZIE, Sqoop, IBM InfoSphere Streams, Karmasphere, Bigtop, BerkeleyDB-JE, Next-generation-sequencing (BAM)
Eclipse BIRT Survey - Data Visualization
How important is data visualization/reporting to your projects?
• Essential: 52%
• Sometimes important: 28%
• Occasionally useful: 13%
• Never needed: 7%
Report/Visualization Tools
How do you create and/or use data display tools or libraries in development?
• I use open source data reporting/visualization tools: 70.9%
• I use commercial data reporting/visualization tools: 20.0%
• I use home grown routines or open source libraries to display data: 39.4%
• My projects don't require reporting or data visualization: 7.9%
Note: Respondents could choose more than one option
Independent Big Data Survey – Sept/Oct 2012
Goals:
• How many large firms (>$1B) are conducting Big Data projects
• What such companies are doing with their Big Data projects
• What benefits are expected from those Big Data initiatives
• What the inhibitors are
Responses:
• King Research received 516 surveys: 316 completed and 200 partially completed
• Completed surveys were the primary source of analysis
• 32% of those who completed the survey (98 respondents) work at companies with revenue of $1B or more
Independent Big Data Survey – Key Findings
• 26% of large companies have Big Data projects. 40% have not evaluated Big Data or have evaluated and decided not to proceed. The balance (34%) are either evaluating or planning such initiatives.
• “Not enough staff with expertise” and “Expected cost of Big Data initiatives” are the major inhibitors
• Major benefits expected from Big Data initiatives are:
  • Make better decisions, faster
  • Gain competitive advantage
  • Improve efficiency
  • Improve customer targeting
• Major benefits realized from Big Data initiatives are:
  • Gain competitive advantage
  • Improve customer targeting
  • Make better decisions, faster
  • Improve efficiency
Independent Big Data Survey – Big Data Usage
Does your organization have a Big Data implementation today?
• More large companies have implemented Big Data projects (26%) than the universe of companies represented in this survey (19%)
• Conversely, far fewer respondents at large companies responded “No” to this question (40%, versus 49% for the universe of respondents)
[Bar chart comparing $1B+ Revenue with Universe of Respondents. Categories: No – Have not evaluated Big Data; No – Evaluated and decided not to proceed; Evaluating; Planning to use in the short term – less than 1 year; Planning to use in the long term – more than 1 year; Yes – We have a Big Data implementation today. Axis: 0% to 50%.]
Independent Big Data Survey – Big Data Technologies
What Big Data technologies do you plan to use? (eval/planning)
• We asked about planned use of 15 technologies; the top 5, in descending order of frequency of mention, are shown in the chart
• Other technologies planned for use at $1B+ organizations include: Apache Cassandra, 12%; Hortonworks Hadoop, 12%; Amazon DynamoDB, 9%; Apache CouchDB, 9%; VoltDB, 9%; HyperTable, 6%; 10gen MongoDB, 3%; Datastax Cassandra, 3%
[Bar chart comparing $1B+ Revenue with Universe of Respondents. Categories: Apache Hadoop, Cloudera Hadoop, Apache Hive, Apache HBase, EMC Greenplum HD. Axis: 0% to 50%.]
Independent Big Data Survey – Application Types
What are likely to be your Big Data applications? (responses from those who are evaluating or planning Big Data implementations)
• Our survey listed 23 frequently reported Big Data applications. Asked which of these they had evaluated or planned to use, respondents indicated an average of 4.5 apps each.
• The chart shows the 14 apps that were most frequently indicated
[Bar chart. Categories: Customer experience analysis; Customer insights; Fraud prevention and analysis; Marketing targeting / decision systems; Behavioral analysis; Customer lifecycle management; Operations improvement; Pricing analytics and choice modeling; Service quality improvements; Capacity forecasting; Inventory management; Network monitoring; Research / innovation; Risk modeling / management / mitigation. Axis: 0% to 50%.]
Independent Big Data Survey – Number of End Users
How many people in your organization will consume information from or use your Big Data applications? (evaluating/planning)
• Clearly companies with revenues of $1B or greater plan to share their Big Data information with large audiences across their companies
[Bar chart. Categories: 1–9 people; 10–49 people; 50–99 people; 100–499 people; 500 or more people. Axis: 0% to 50%.]
Actuate Launches the BIRT Project
August 2004: Actuate joins Eclipse Foundation as Strategic Developer and Board Member
• Actuate proposed and started BIRT (Business Intelligence and Reporting Tools Project), a top-level Eclipse project
• Adds BI and Reporting as Open Source Project
• Professional open source: primary development resources funded by Actuate
• Contributions from many sources: IBM, Innovent Solutions and community
A New Generation of Data Visualization Technology
BIRT: Business Intelligence and Reporting Tools
Simplicity that makes simple layouts easy; power to create very complex layouts
• Makes all data-driven content development easy
• Modern, web-page design metaphor
• Open and standards-based
• Flexible with rich programmatic control
• Full support for libraries and reuse
• Foundation for a range of solutions
BIRT Release History
September 2004 BIRT Project proposal accepted, and project launched
June 2005 1.0 Eclipse Report Designer, Report Engine, Chart Engine
December 2005 2.0 Support for a wide variety of common layouts
June 2006 2.1 Advanced parameters, ability to join data sets, …
June 2007 2.2 Dynamic crosstab support, web services data source, …
June 2008 2.3 JavaScript Debugger, BiDi Support, Charts in Crosstabs, …
June 2009 2.5 Page aggregates, Multiple drill-downs in Charts, …
June 2010 2.6 New charts, more chart control, developer productivity, …
June 2011 3.7 POJO Runtime, Hive/Hadoop, Open Office emitters…
June 2012 4.2 Maven Support, Excel Data Source, Relative Time Periods…
• Ground-up initiative: Innovative approach to layout and design
• Developed in the open with community feedback at all stages
BIRT Example Key Capabilities
Very Simple to Very Complex Layouts
• Listings, cross-tab, dashboard, pixel-perfect, charts …
• Grouping, advanced aggregations, sub-totals, calculations
• Multi-section and sub-reports
• Conditional sections and logic
• Full programmatic control/scripting
• Embedded images…
Comprehensive Data Access
• SQL databases, Web Services, Flat Files, XML, scripted data sources …
• Multiple data sources in one design…
Output Formats
• HTML, PDF, Excel, Word, PowerPoint…
• Internationalization of labels and text
• Bi-Directional language display
Re-use and Developer Productivity
• Library support for publishing and sharing components
• Leverages common standards (SQL, HTML, JavaScript, Java, XML)
• Cascading Style Sheets
• Built-in debugger…
Interactivity and Linking
• Data driven hyperlinks
• Drill-through charts and graphics…
Multiple Usage and Productivity Aids
• Graphical layout and design
• Query & metadata editors
• Formatting Builder
• Grouping Builder
• Customizable cheat sheets and templates…
Getting to Know BIRT
DEMO
BIRT Design Gallery
Charts and Tables
Listing with Groups and Sub-Totals
BIRT Design Gallery
Crosstabs
Crosstab and Charts
BIRT Design Gallery
Forms
Calendar / Schedule
BIRT Design Gallery
Dashboards
Multi-Language and Bi-Directional
BIRT Chart Gallery
High-Level BIRT Architecture
[Architecture diagram. Components shown: BIRT Designer, Eclipse Designer, Chart Designer, Eclipse DTP, WTP, …; Design Engine; BIRT Engine, Generation Services, Data Services, Charting Engine, Presentation Services. Artifacts shown: Data, XML Design, Document. Outputs: HTML, PDF, Excel, Word, PowerPoint, PostScript…]
High Level BIRT Architecture
Core BIRT Open Source Products: Report Designer, Chart Builder, Example Viewer
• Design Engine (DE API): produces XML report, template, and library designs
• Report Engine (RE API): runs reports and produces output (PDF, HTML, Doc, XLS, PS, PPT, etc.)
• Chart Engine (CE API): consumes the Chart EMF model and produces chart output. Supports 14 main types and many subtypes. Outputs to PNG, JPG, BMP, SVG, PDF, SWT, and Swing
• All engines can be run with or without OSGi
• Can be run outside of BIRT
BIRT AJAX Based Viewer
BIRT Data Access
• BIRT offers many ways to get data
• Standard Data Sources
  • Flat File (CSV, TSV, SSV, PSV)
  • Hive Data Source
  • Cassandra Scripted Data Source
  • JDBC Textual or Graphical
  • Web Service - XPath syntax
  • XML - XPath syntax
  • XLS/XLSX
• Scripted Data Source written in Java or JavaScript
• Open Data Access (ODA) DTP Project
• Extensible JDBC Driver Framework
Community Contributions: GoogleDocs, XML/A, Cassandra, REST, MongoDB, Multi-Flat File, GitHub, Twitter JSON Search, Dropbox usage, YQL, Google Analytics, LinkedIn, Facebook FQL
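A scripted data source hands rows to the report through an open / fetch / close cycle: open prepares the source, fetch fills one row per call and returns false when the data is exhausted, and close releases resources. The sketch below emulates that cycle in plain Java so it runs without BIRT on the classpath. It is not the BIRT event-handler API; the class name ScriptedSourceSketch, the readAll driver, and the column names "name" and "count" are hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java emulation of the open/fetch/close cycle (hypothetical, not the BIRT API). */
class ScriptedSourceSketch {
    private final List<String[]> backing;   // stand-in for any custom source
    private Iterator<String[]> cursor;

    ScriptedSourceSketch(List<String[]> backing) {
        this.backing = backing;
    }

    /** open: acquire the source and position before the first record. */
    void open() {
        cursor = backing.iterator();
    }

    /** fetch: fill one row; return false when there is no more data. */
    boolean fetch(Map<String, Object> row) {
        if (cursor == null || !cursor.hasNext()) {
            return false;
        }
        String[] rec = cursor.next();
        row.put("name", rec[0]);                     // hypothetical column
        row.put("count", Integer.valueOf(rec[1]));   // hypothetical column
        return true;
    }

    /** close: release whatever open acquired. */
    void close() {
        cursor = null;
    }

    /** Drives the cycle the way a report engine would, collecting every row. */
    static List<Map<String, Object>> readAll(ScriptedSourceSketch src) {
        List<Map<String, Object>> rows = new ArrayList<>();
        src.open();
        try {
            while (true) {
                Map<String, Object> row = new LinkedHashMap<>();
                if (!src.fetch(row)) {
                    break;
                }
                rows.add(row);
            }
        } finally {
            src.close();
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String[]> sample = new ArrayList<>();
        sample.add(new String[] {"alpha", "1"});
        sample.add(new String[] {"beta", "2"});
        System.out.println(readAll(new ScriptedSourceSketch(sample)));
    }
}
```

In a real report the same three steps would live in the data set's open, fetch, and close handlers, with the backing list replaced by whatever API or file the report needs to read.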
Live Demo – New MongoDB ODA
DEMO
Connecting to Hadoop
Hive JDBC – HQL Sub Query Example
Hive JDBC – get_json_object UDF
Hive JDBC – RegExp Example
Hive JDBC – HQL Hints Example
Hive JDBC – Transform Example
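The five Hive JDBC slides name HiveQL techniques; the statements themselves were shown as screenshots. As a rough illustration of what such queries can look like, here is a minimal Java sketch with one statement per technique. The table names (web_logs, events, orders, regions), the column names, the parse_url.py script, and the JDBC URL supplied on the command line are all hypothetical. Only the HiveQL constructs (a FROM-clause subquery with an alias, get_json_object, regexp_extract, the MAPJOIN hint, and TRANSFORM … USING) are standard Hive features. The run method uses plain java.sql and assumes a Hive JDBC driver is on the classpath when a URL is given.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative HiveQL statements; every table, column, and script name is hypothetical. */
class HiveQueryExamples {

    /** Returns one sample query per technique named on the slides. */
    static Map<String, String> queries() {
        Map<String, String> q = new LinkedHashMap<>();
        // Sub query: Hive requires an alias (here "t") on a subquery in FROM.
        q.put("subquery",
            "SELECT t.status, COUNT(*) AS hits "
            + "FROM (SELECT status FROM web_logs WHERE status >= 400) t "
            + "GROUP BY t.status");
        // get_json_object: extract a value from a JSON string column by path.
        q.put("json",
            "SELECT get_json_object(payload, '$.user.name') AS user_name FROM events");
        // Regular expressions: regexp_extract returns the requested capture group.
        q.put("regexp",
            "SELECT regexp_extract(request, '^GET ([^ ]+)', 1) AS path FROM web_logs");
        // Hints: MAPJOIN asks Hive to load the small table (alias r) into memory.
        q.put("hint",
            "SELECT /*+ MAPJOIN(r) */ o.order_id, r.region_name "
            + "FROM orders o JOIN regions r ON (o.region_id = r.region_id)");
        // Transform: stream rows through an external script (registered with ADD FILE).
        q.put("transform",
            "SELECT TRANSFORM (user_id, url) USING 'python parse_url.py' "
            + "AS (user_id, domain) FROM web_logs");
        return q;
    }

    /** Runs one query over JDBC and prints each row as tab-separated text. */
    static void run(String jdbcUrl, String hql) throws SQLException {
        try (Connection conn = DriverManager.getConnection(jdbcUrl);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(hql)) {
            int columns = rs.getMetaData().getColumnCount();
            while (rs.next()) {
                StringBuilder line = new StringBuilder();
                for (int i = 1; i <= columns; i++) {
                    if (i > 1) {
                        line.append('\t');
                    }
                    line.append(rs.getString(i));
                }
                System.out.println(line);
            }
        }
    }

    public static void main(String[] args) throws SQLException {
        if (args.length == 0) {
            // No URL given: just list the sample statements.
            queries().forEach((name, hql) -> System.out.println(name + ": " + hql));
        } else {
            // args[0] is the JDBC URL for your Hive server (an assumption of this sketch).
            run(args[0], queries().get("subquery"));
        }
    }
}
```

The same statements can be pasted into the query text of a BIRT JDBC data set that points at Hive; the Java wrapper is only there to make the sketch runnable on its own.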
BIRT Exchange Community Site
Centralized hub for BIRT developers
• Access demos, tutorials, tips and techniques, documentation…
• Enables developers to be more productive and build applications faster
• Marketplace for applications
Explore
• Search/sort
• Rate, comment
• Forums
Download
• Documentation
• Software
• Examples
Contribute
• BIRT designs, code
• Technical tips
• Contests
Plug in to BIRT Spring 2013 Contest
• New iPad for Top 3 Plug-Ins!
• Contest runs from March 28, 2013 to April 30, 2013
• Plug-In Categories: Open Data Access (ODA) Drivers, Output Emitters, Report Item Extensions, Chart Extensions
• Visit BIRT Exchange for full contest details
Big Data and Business Intelligence
Virgil Dodson
Questions?