DAMA NY CHAPTER PRESENTATION
“Big Data” & “The Cloud”
Extreme Performance Data Warehousing Inside Of The Cloud
Robert J. Abate, CBIP, CDMP
Solutions Principal, EIM & Analytics Practice
EMC Consulting
January 19th, 2012
© Copyright 2012 EMC Corporation. All rights reserved.
Big Data & The Cloud
AGENDA
• Background & Definitions
• The Challenge
• Architectural Solutions To Big Data
• It’s A Brave New World
• Example Case Studies
• Open Discussion…
Background & Definitions
“Big data will represent a hugely disruptive force during the next five years – enabling levels of insight – that are currently unachievable through any other means”
Gartner, May 2011
We Are Awash In Data
• In the information age, every organization is in the “data” business
• Data is growing exponentially, so are the challenges
• Complexity is causing insight to be lost
Source: IDC Digital Universe White Paper, Sponsored by EMC, May 2009
Pictorial Representation Of Information
Big Data: More Than Just About Volume
• Consider: Master Data, Fidelity, Complexity, Validity, Perishability, Linking Data
• Structured Transactional Data: POS transactions, call detail records, credit card transactions, shipping updates, purchase orders, payments, shipments, account transactions
• Unstructured Data: Web logs, newsfeeds, social media, geo-location, mobile, consumer comments, claims, doctor’s notes, clinical studies, images, video, audio
• Device-generated Data: RFID sensors, smart meters, smart grids, GPS spatial, micro-payments
[Figure: dimensions Velocity, Volume, Variety, Complexity; data types shown: Video, Transactional Data, Industry-specific, Web traffic, Text, Social, Sensor/Smart Grid, Images, Audio, Documents, Location-based]
The Typical BI/DW Environment Today…
Big Data’s Potential For Actionable Insight
Today’s Situation → Big Data Ramifications
• “Rear-view” mirror reporting, dashboards and analysis → Forward-looking or “Windshield-view” predictions with recommendations
• Less than 10% of the enterprise’s data → Vast majority of available sources and external data
• Weeks, months, or even quarters old → Real-time / near real-time
• Incomplete, inaccurate, and disjointed data → Correlated, high confidence, governed data
• Architectures and methods that take 6 to 18 months to exploit → Vastly accelerated time to market
Time Really is Money!
“THE TIME VALUE CURVE” © 2007 - Dr. Richard Hackathorn, Bolder Technology, Inc., All Rights Reserved. Used with Permission.
[Figure: “The Time Value Curve”. Axes: Value (Value Lost) and Time (Action Time). Labels along the curve: Business Event, Capture Latency, Data Ready For Analysis, Analysis Latency, Information Delivered, Decision Latency, Action Taken, Data Lifecycle]
Data Is Coming At Us Faster
In a recent TDWI survey of 450 CIOs:
– 17% have a real-time data warehouse
– 90% plan on having a real-time warehouse
– 75% will replace to get to a real-time solution
“REAL TIME IS RAPIDLY BECOMING A NECESSARY FOUNDATION TO A DATA SOLUTION AND WITHOUT ARCHITECTURE THERE IS CHAOS!”
Data Is Coming From All Directions
Data is now commonly entering into the enterprise from external sources
– Government (Census, Revenues, …)
– Nielsen, NPD Group (Sales)
– Bloomberg, NYSE (Financial Position)
– Experian, TransUnion, Equifax (Credit Reporting)
– Google Maps, MapInfo (Geospatial, …)
– Radian 6, Biz360, … (Client Trend Data)
– Etc.
Need For Data Trust
• Compliance with laws
– Revenue Canada, Sarbanes-Oxley [SOX], Basel II, HIPAA, etc.
• Lack of confidence in the data
– Reports utilizing the same data do not report the same totals or computations
• Data not defined and readily available
– Multiple sources of data have to be rationalized at each project start-up, thereby wasting valuable time & $ on every project
• Data timeliness
– Manual process to collect, analyze and provide results
• Data integrity
– Unknown filters, varying calculations/computations, fields used for data not indicative of field names, data passed along from one person to another to another…
Summation Of Challenges We Are Observing
• Business mandate to obtain more value out of the data (get answers)
• Variety of sources, amounts, types and granularity of data that customers want to integrate is growing exponentially
• Need to shrink the latency between the business event and the data availability for analysis and decision-making
• Advancing agility of information is key
• Need for data trust and compliance with regulations
The Challenge Of Big Data
“Old” Journey To Information Maturity [EIM]
Stages: Data Chaos → Defined Data → Master Data → Integrated Information → Data Analytics → Business Optimization
• Data Chaos: Same type of data means different things in different systems. Ex: AT&T is the same as AT&T Inc
• Defined Data: Define common meanings. Ex: Determine the sources, types, and properties of grouped (i.e.: customer) records
• Master Data: Publish and subscribe to master data. Ex: Single view of customer across all information systems
• Integrated Information: Bring metadata together with information for reporting (BI) and warehousing (drilling and hierarchies)
• Data Analytics: Analyzing the data. Looking for trends and correlations
• Predictive Information: Using the analyzed data to optimize operations
PROCESSES: Data Discovery, Data Governance, Data Integration, Data Mining
TOOLS: Data Discover, Metadata, ETL Suite, BI / DW / OLAP
• Wiki Type Sharing Of Self-Provisioned Environments
• Atomic Data Analytics
The Information Issue Is…
Too many organizations are not using information to its full advantage:
– 1 in 3 business leaders frequently make critical decisions without the information they need
– 1 in 2 business leaders do not have access to the information across their organization needed to do their jobs
– 3 in 4 business leaders say more predictive information would drive better decisions
Source: IBM Institute for Business Value, March 2009
Information Trust & Business Alignment
Harris Interactive recently polled 23,000 U.S. employees and found:
– Only 37% said they have a clear understanding of what their organization is trying to achieve and why
– Only one in five was enthusiastic about their team and the organization’s / corporation’s goals
– Only one in five said they have a clear “line of sight” between their tasks and their team and organization’s goals
– Only 15% felt that their organization fully enables them to execute key goals
– Only 20% fully trusted the organization they work for
Viewed Using A Seasonal Analogy…
If a football team had these players on the field:
– Only 4 of the 11 players on the field would know which goal is theirs
– Only 6 of the 11 would care
– Only 3 of the 11 would know what position they play and what they are supposed to do
– 9 players out of 11 would, in some way, be competing against their own team rather than the opponent
Perceived Complicated Landscape
• BI/DW is perceived as not “enabling” the business
– Inhibitor to corporate progress: IT systems cannot be changed fast enough to meet market demands, seize opportunity or comply with a new requirement.
– Weak alignment between IT and business strategy: marked by an intractable language barrier.
– Business not always sure what Information or Dimensions they want or need: how can IT provide without requirements?
– BI/DW is not known as the source of innovations
• The complexity of systems has caused BI/DW to be reactive rather than proactive
– Silo’d solutions, db’s and applications with trapped business rules
– Multiple sources of information and no single “truth”
– No “Architectural Blueprints” to the enterprise…
The Business Intelligence Maturity Model
Advancing The Maturity Of Information…
The big data impacts to both business and IT are significant; early adopters will fundamentally change their industries
Business Expectations
• More agile, more real-time, more accurate decision-making
• Enhanced user experience that delivers insights to any device
• Predict and spot changes in dynamic and volatile markets
• Deeper understanding of customer preferences and behavior
• Greater fidelity in risk assessment and compliance enforcement
IT Ramifications
• Operationalization of data scientists and analytic insights
• Tools and processes for data quality, governance, and security
• Cloud for self-service, collaboration, agility, and cost reduction
“Big data poses a major opportunity for CIOs to drive added value for the business, by deriving insights and identifying patterns from the huge amounts of data available”
“Through 2015, organizations integrating high value, diverse new information sources and types into a coherent information management infrastructure will outperform industry peers financially by more than 20%”
Source: Gartner; “The New Value Integrator: Insights from the Global Chief Financial Officer Study”, July 2011
Architectural Solutions For Big Data
Big Data Requires Change…
Consider: 100 GB would store the entire US Census DB “basic” information set for every living human being on the planet:
– Age, Sex, Income, Ethnicity, Language, Religion, Housing Status, Location into a 128-bit set
– That equates to about 6.75 billion rows of about 10 columns
Consider the Large Hadron Collider at CERN
– Expected to produce 150,000 times as much raw data each year
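The storage estimate on this slide can be checked with quick arithmetic. The sketch below assumes a world population of about 6.75 billion (one row per person) and one 128-bit record per row; both constants are illustrative assumptions, not figures taken from a census database.

```python
# Back-of-the-envelope check: one 128-bit record per living person.
WORLD_POPULATION = 6_750_000_000   # assumed row count (about 6.75 billion people)
BITS_PER_RECORD = 128              # record width stated on the slide

bytes_per_record = BITS_PER_RECORD // 8           # 16 bytes per row
total_bytes = WORLD_POPULATION * bytes_per_record
total_gb = total_bytes / 1e9                      # decimal gigabytes

print(f"{bytes_per_record} bytes/row, about {total_gb:.0f} GB in total")
```

Under these assumptions the table lands at roughly 100 GB, which is small enough to fit on a single commodity disk.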
The Big Change In Technologies
Consider that relational technologies were invented to get data in and organized; they were not designed or organized to get it out
– RDBMSs were designed for efficient transaction processing on large data sets
▪ Adding, Updating
▪ Searching for & retrieving small amounts of data
[2] Source: ACM Website “The Pathologies of Big Data”, Adam Jacobs, 7/6/09
Data Warehouses Were An Answer
DW was classically designed as a “copy of transaction data specifically structured for query and analysis”
– General approach is bulk ETL into a DB designed for queries
Big data changes the answer
– “Traditional RDBMS-based dimensional modeling and cube-based OLAP turns out to be too slow or too limited to support asking the really interesting questions of warehoused data”[2]
“To achieve acceptable performance for highly order-dependent queries on truly large data, one must be willing to consider abandoning the purely relational database model”[2]
[2] Source: ACM Website “The Pathologies of Big Data”, Adam Jacobs, 7/6/09
Voluminous Data Sets…
What makes data sets large is repeated observations over time/space
– Web log has millions of visits over a handful of pages
– Retailer has 10K products, millions of customers, but billions of transactions
– Hi-res scientific data like fMRI: 1K GB per view
– Large datasets have spatial or temporal dimensions
Cardinality (distinct observations) is usually small with regard to the total # of observations
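To make the cardinality point concrete, here is a minimal sketch using a made-up web log. The page names and the visit count are hypothetical; the only point is that the number of distinct values stays tiny while the number of observations grows.

```python
from collections import Counter

# Hypothetical web log: many visits (observations) over a handful of pages.
pages = ["/home", "/search", "/cart", "/checkout"]
visits = [pages[i % len(pages)] for i in range(1_000_000)]

total_observations = len(visits)          # grows with time
visits_per_page = Counter(visits)         # one counter per distinct page
cardinality = len(visits_per_page)        # stays small
ratio = total_observations / cardinality  # observations per distinct value

print(total_observations, cardinality, ratio)
```

A low ratio of distinct values to observations is what makes techniques such as dictionary compression and pre-aggregation pay off on large data sets.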
Technology Solutions Appeared…
Let’s Talk Technical Solutions…
Sequential and/or Distributed File-Based Solutions
– Oracle Exadata, Hadoop, etc.
Columnar (compression) / Multi-Level Tables
– Solves challenge of retrieving entire row
– ParAccel, Vertica, Sybase, etc.
Distributed MPP
– Teradata, Greenplum, etc.
Polymorphic
– Combination of Columnar & MPP
Finding Answers Sequentially With OLTP
Random access is slower than sequential
The advantage gained by doing all data access in sequential order is often 4x – 10x
– Many orders of magnitude!
[2] Source: ACM Website “The Pathologies of Big Data”, Adam Jacobs, 7/6/09
Distributed File: Partitioning With OLTP
Partitioning can solve challenges of data growth, but true distributed processing utilizing MPP is best (author’s opinion)
Distributed File: Partitioning Viewed
Q: What was the total transactions (sales) amount for May 20 and May 21 2009?

    -- Half-open date range: includes all of 5/20 and 5/21, excludes 5/22
    Select sum(sales_amount)
    From SALES
    Where sales_date >= to_date('05/20/2009','MM/DD/YYYY')
    And sales_date < to_date('05/22/2009','MM/DD/YYYY');

Sales Table partitions: 5/17, 5/18, 5/19, 5/20, 5/21, 5/22
Only the 2 relevant partitions (5/20 and 5/21) are read
Source: Extreme Performance With Oracle Data Warehousing
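The idea of partition pruning can be sketched in a few lines of plain Python. This is an illustration only, not how Oracle implements it: the `partitions` dictionary, the `sum_sales` helper and the sample amounts are all hypothetical.

```python
from datetime import date

# Hypothetical SALES table, range-partitioned by day:
# partition key (sales_date) -> list of sales_amount values.
partitions = {
    date(2009, 5, 17): [10.0, 20.0],
    date(2009, 5, 18): [5.0],
    date(2009, 5, 19): [40.0, 60.0],
    date(2009, 5, 20): [100.0, 50.0],
    date(2009, 5, 21): [25.0, 75.0],
    date(2009, 5, 22): [30.0],
}

def sum_sales(partitions, start, end):
    """Sum sales for start <= sales_date < end, touching only matching partitions."""
    # Pruning step: decide which partitions to read from the keys alone.
    to_read = [day for day in sorted(partitions) if start <= day < end]
    total = sum(sum(partitions[day]) for day in to_read)
    return total, to_read

total, partitions_read = sum_sales(partitions, date(2009, 5, 20), date(2009, 5, 22))
print(total, partitions_read)
```

The four partitions outside the date range are never scanned, which is where the saving comes from as the table grows.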
Distributed File: Open Source (Hadoop)
Apache Hadoop is a software framework that supports data-intensive distributed applications under a free license.
– It enables applications to work with thousands of nodes and petabytes of data
– Hadoop was inspired by Google's MapReduce and Google File System (GFS) papers.
Hadoop is a top-level Apache project being built and used by a global community of contributors using the Java programming language.
– Yahoo! has been the largest contributor to the project, and uses Hadoop extensively across its businesses.
Source: Wikipedia “Hadoop”
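The MapReduce model that inspired Hadoop can be illustrated without Hadoop itself. The sketch below runs the map, shuffle and reduce phases in a single Python process over a made-up web log; the log format and the function names are assumptions for illustration and are not part of the Hadoop API.

```python
from collections import defaultdict

def map_phase(record):
    """Map: emit one (page, 1) pair per log line (hypothetical 'user page' format)."""
    _user, page = record.split()
    yield page, 1

def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)

log = ["u1 /home", "u2 /home", "u1 /cart", "u3 /home"]
pairs = [kv for record in log for kv in map_phase(record)]
counts = dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())
print(counts)
```

In a real cluster the map and reduce calls run on many nodes in parallel and the shuffle moves data across the network; the per-record logic keeps this same shape.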
Distributed File: Hash-Based Distribution
In a hash-based data distribution, the data is distributed across multiple platforms for parallelism of queries…
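A minimal sketch of hash-based distribution, assuming four nodes and a customer ID as the distribution key (both are illustrative choices, not taken from any vendor's product). Each row is routed by a stable hash of its key, every node aggregates its own segment, and the partial results are combined at the end.

```python
import zlib

NUM_NODES = 4  # assumed cluster size

def node_for(key, num_nodes=NUM_NODES):
    """Route a row to a node using a stable hash of its distribution key."""
    return zlib.crc32(key.encode("utf-8")) % num_nodes

# Hypothetical rows: (customer_id, sales_amount)
rows = [(f"cust-{i}", float(i)) for i in range(1, 101)]

# Distribute rows across the nodes.
segments = [[] for _ in range(NUM_NODES)]
for customer_id, amount in rows:
    segments[node_for(customer_id)].append((customer_id, amount))

# Each node sums only its own segment; a coordinator adds the partial sums.
partial_sums = [sum(amount for _, amount in segment) for segment in segments]
total = sum(partial_sums)
print([len(s) for s in segments], total)
```

`zlib.crc32` is used instead of Python's built-in `hash` so that the same key always lands on the same node across runs.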
Columnar: Storage
In a table with, say, 256 columns, a lookup will retrieve all the data in the row (disk bound)
Columnar storage reduces this I/O bandwidth by storing column data using compression
– State (50 combinations stored)
– Master (compressed) table has pointers to State
Source: Vertica Website
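The State example describes what is commonly called dictionary encoding. The sketch below is a simplified illustration with made-up rows, not Vertica's actual storage format: each distinct state is stored once, and the column itself holds only small integer pointers into that dictionary.

```python
# Hypothetical State column from a wide table (one value per row).
states = ["NY", "NJ", "NY", "CA", "NJ", "NY"]

# Dictionary: each distinct value stored once (at most 50 entries for US states).
dictionary = sorted(set(states))
code_of = {state: code for code, state in enumerate(dictionary)}

# Encoded column: small integer pointers into the dictionary.
encoded = [code_of[state] for state in states]

# A query on State scans only this one compact column, not all 256 columns.
ny_rows = encoded.count(code_of["NY"])

# Decoding restores the original values.
decoded = [dictionary[code] for code in encoded]
print(dictionary, encoded, ny_rows)
```

Because the encoded column is much narrower than the full row, a scan or count on State reads far fewer bytes from disk.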
Columnar: Multi-Level Table Partitioning
In multi-level table partitioning, data distribution occurs across multiple platforms in segmented tables for distribution of columnar queries
This reduces the amount of work performed by each platform
MPP Shared Nothing Architectures
Extreme scalability
Elastic Expansion & Self-Healing Fault-Tolerance
Unified Analytics
Source: “Greenplum Database 4.0: Critical Mass Innovation”, White Paper, August 2010
The “Ideal” – MPP Shared Nothing
Polymorphic Storage: Tabular, Columnar, NoSQL, etc.
It’s A Brave New World
From the Old Stack to a New Ecosystem: Drivers for Change
• Many new data sources (organic growth, data services, M&A)
– Impractical to add new data sources because of tightly coupled pipeline
• More unstructured data, including social media
– Lack of access to unstructured data; need analytics and classifiers that operate on it
• Less up-front data integration
– Can’t assume data is pre-integrated; have to be able to locate and to query federated sources of data and content
• More need to track and leverage metadata
– Metadata is fragmented, jailed and inconsistent; need agile, community approach
• Need for flexible, agile data structures
– Current structures are too rigid, and too close to the sources or the business reports
• More emphasis on dynamic views for purpose
– Need dynamic planning, creation and structuring of views that support analytics
• Information governance and management in a federated, regulated world
– Need flexible policy expression and enforcement, not just at point of access
An Information Platform with New DNA
To Promote Agility, Business Value and Community
1. Coordinated ingestion of diverse information, changes, events
2. Metadata driven processing and management
3. Nuanced optimization – on demand, multi-source, matching information needs
4. Broader reach of query – contextual search, federation, materialization
5. Freedom from imposed information structure – roll your own structure!
6. Navigation through information – contextual, faceted, multi-dimensional
7. Visualization of information – heat, clouds, clusters, flows
8. New data paths engendered by patterned consumption of entities
9. Reasoning about data set location, derivation, freshness, and obligations
10. User empowerment – collaboration and talent development
Businesses Want Integrated, Timely Information for Purpose
Area Revolution
Latency “Microbatch is the new Batch”
Enrichment “Tagging is the new Transformation”
Query “Query is the new ETL”
Federation “Query Director is the new Query Optimizer”
Source “Purposeful View is the new Master”
Some Of The Newer Trends In Big Data
Powerful Analytics
– What if, What will happen next, …
– Self-service analytics?
▪ Build your own sandbox of data…
Data Cloud Surrounded Warehouse
– Data Virtualization
▪ Abstracting the data from the systems, it complements existing data warehouses
– Many times the size of structured warehouse
– Provides for rapid analytic iterations
When You Link Structured & Unstructured Information You Get…
Powerful Analytical Engines
What is the best price to sell my product?
How Do I Do This?
How Do I Do This #2?
How Do I Do This #3?
Visualize The Information…
Analytics: A Picture Is Worth A 1,000 Words
Data Virtualization Example
Data Virtualization In Practice
Enterprise Big Data Cloud
The Future Of Data Warehousing?
The “Ideal” Abate Enterprise Data Cloud
• Truly Virtualized Data Environment
• Extreme Scale, Elastic Expansion
• Automated Metadata Discovery, Classification & Tagging
• Linearly Scalable
– Add 1x and get 2x performance
• Self-Service Provisioning
• Single Point Of Management
– Resource utilization optimization
• Secure, Unified Data Access
– Single Point of Entry
– Portal-based sharing of data sandboxes (wiki-type)
• Reduce TCO By Eliminating Excessive Licensing Fees
– Use of open source community to improve solution
Example Case Studies
Telecomm Provider Learns A Lesson…
BIG DATA ANALYTICS USE CASE
Before investing millions of dollars on infrastructure, a provider learned where to invest their monies that would pay off…
Challenge
– 100TB Traditional EDW, Single Source Of Truth
– Operational Reporting & Financial Consolidation
– Heavy Governance And Control
– Unable To Support Critical Business Initiatives
– Customer Loyalty And Churn The #1 Business Initiative From The CEO
Enterprise Data Cloud Architecture-Based Solution
– Extracted Data From EDW & Other Sources
– Generated Social Graph From Call Detail And Subscriber Data
– Within 2 Weeks Found “Connected” Subscribers 7X More Likely To Churn Than Average Users
– Now Deploying 1PB Production
Drive Multi-channel Campaign Optimization
BIG DATA ANALYTICS USE CASE
Retailer increases in-flight multi-channel effectiveness with customer and product insights
[Figure: Likelihood Of Conversion (LOW to HIGH) across Legacy System, Advanced Analytics, Big Data Analytics]
• Monitor cross-channel product sales effectiveness
• Integrate customer behavioral data with social media sentiment data to yield new market, product and campaign insights
Innovate With Big Data Analytics
BIG DATA ANALYTICS USE CASE
Big Data Analytics Accelerate Health Care 2.0 for Evidence-based Care Provider
[Figure: Quality of Care (LOW to HIGH) across Legacy System, BI Reporting, Advanced Analytics, Big Data Analytics]
• Delivering 10 Years Of Data In Seconds
• Associative Rule Mining and User Clustering Improves Pathways
• External Data Sources Enable Personalized Medicine
TRADITIONAL DATA LEVERAGED: Treatment Pathways on Summary Data
BIG DATA LEVERAGED: Treatment Pathways on All the Data
Open Exchange Of Ideas…
Speaker Contact Information:
Robert J. Abate, CBIP, CDMP
robert.abate@emc.com
(201) 745-7680
Credits To Quoted Authors
Adam Jacobs is senior software engineer at 1010data Inc., where, among other roles, he leads the continuing development of Tenbase, the company’s ultra-high-performance analytical database engine. He has more than 10 years of experience with distributed processing of big datasets, starting in his earlier career as a computational neuroscientist at Weill Medical College of Cornell University (where he holds the position of Visiting Fellow) and at UCLA. He holds a Ph.D. in neuroscience from UC Berkeley and a B.A. in linguistics from Columbia University. (QUOTED FROM: “The Pathologies of Big Data”, 7/6/09)
Bill Schmarzo has over two decades of experience in data warehousing, BI and analytic applications (Metaphor Computers, 1984). Bill authored the Business Benefits Analysis methodology that links an organization’s strategic business initiatives with their supporting data and analytic requirements, and co-authored with Ralph Kimball a series of articles on analytic applications. Bill has served on The Data Warehouse Institute faculty as the head of the analytic applications curriculum. Bill was VP of Analytics at Yahoo where he was responsible for the development of Yahoo’s Advertiser and Web Site analytics products, including the delivery of “actionable insights” through a holistic user experience. For Business Objects, Bill oversaw the Analytic Applications business unit including the development, marketing and sales of Business Objects’ industry-leading analytic applications.
Donald Sutton has over 20 years of experience in Data Architecture, Analysis, Modeling, ETL, Implementation and Integration in the areas of Data Entry (OLTP) or ERP and 3rd Party COTS Applications, Operational Data Store (ODS), Master Data Store (MDS), Data Warehouse (DW) and Data Marts (DM) while providing Business Intelligence (BI) from multiple sources above. Passionate and motivated about sound design of data structures in all different data layers and the representation and transformation of data with the accounting and governance of data throughout all data layers while providing Business Intelligence (BI) and analytics with Key Performance Indicators (KPI) along with business modeling in translating business requirements to data requirements. (QUOTED FROM: Current Warehousing Environment & Analytics Visualizations)