Bringing Value of Big Data to Business: SAP's Integrated Strategy [1]
Group 6 - Ziqi Fan, Sheng Chen
SAP’s Integrated Big Data Strategy
• SAP is attempting to create an integrated approach that allows companies to perform all of the following operations in one environment:
– Analytics
– Making big data operational
– Supporting applications for high-resolution management
Architecture Vision of SAP’s Integrated Big Data
SAP HANA [2]
• SAP HANA, an in-memory database, is the key to SAP’s integrated strategy.
• HANA DB takes advantage of the low cost of main memory (RAM), the data processing abilities of multi-core processors, and the fast data access of solid-state drives relative to traditional hard drives to deliver better performance for analytical and transactional applications.
SAP HANA [2]
• It offers a multi-engine query processing environment, which allows it to support relational data as well as graph and text processing for semi-structured and unstructured data management within the same system.
• HANA DB is 100% ACID compliant.
Main-Memory DB Query Optimization [3]
• Logical Optimization
– Almost the same as in a conventional database
• Physical Optimization
– Goal: minimize execution costs with respect to a given cost model
– Quite different from that in a conventional database due to the lack of I/O as the dominant cost factor
• A “simple” cost model
T = T_Mem + T_CPU
Main-Memory DB Query Optimization
• CPU Cost
T_CPU = c_0 + c_1 · n + c_2 · m
c_0 - fixed startup costs
c_1 - per-tuple costs for processing input tuples
c_2 - per-tuple costs for producing output tuples
n - # input tuples
m - # output tuples
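The CPU cost formula above can be sketched in a few lines of Python. This is only an illustration: the class and function names are hypothetical, and the constants are made-up calibration values in nanoseconds, not measurements from [3] or from HANA.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CpuCostParams:
    """Hypothetical calibration constants, assumed to be in nanoseconds."""
    c0: float  # fixed startup cost
    c1: float  # cost per input tuple processed
    c2: float  # cost per output tuple produced


def cpu_cost(p: CpuCostParams, n: int, m: int) -> float:
    """T_CPU = c0 + c1 * n + c2 * m for n input and m output tuples."""
    return p.c0 + p.c1 * n + p.c2 * m


def total_cost(t_mem: float, t_cpu: float) -> float:
    """T = T_Mem + T_CPU, the 'simple' cost model from the slides."""
    return t_mem + t_cpu


# Illustrative numbers only: an operator that reads 1,000,000 tuples
# and emits 10,000 of them.
params = CpuCostParams(c0=500.0, c1=2.0, c2=5.0)
t_cpu = cpu_cost(params, n=1_000_000, m=10_000)
print(t_cpu)  # 2050500.0
```

In practice the constants would have to be calibrated per operator and per machine; the sketch only shows how the terms combine.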
Main-Memory DB Query Optimization
• Memory Access Cost
M_i^s - # cache misses of level i for sequential access
M_i^r - # cache misses of level i for random access
l_i^s - cache latency of level i for sequential access
l_i^r - cache latency of level i for random access
Estimating M_i^s and M_i^r is very difficult!
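The formula that combines these four quantities is not present in the text of this slide. A form consistent with the symbols defined above, and with the cost model described in [3], would weight each miss count by its latency and sum over all cache levels. The symbol N for the number of cache levels is an assumption introduced here.

```latex
T_{Mem} = \sum_{i=1}^{N} \left( M_i^{s} \cdot l_i^{s} + M_i^{r} \cdot l_i^{r} \right)
```

The latencies are hardware properties that can be measured once per machine, which is why the difficulty lies in estimating the miss counts.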
Main-Memory DB Query Optimization
• Basic Access Pattern
– single sequential traversal
– repetitive sequential traversal
– single random traversal
– random access
– etc.
• Compound Access Pattern
– Nested-loop join
– Hash join
– etc.
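To make the idea of access patterns concrete, the sketch below estimates the misses of one cache level for two basic patterns and charges each miss the latency of that level. It is a deliberately simplified approximation, not the estimation formulas from [3]: the random case assumes the data region is far larger than the cache, so that no cache line is reused, and all names and hardware numbers are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class CacheLevel:
    """One cache level; all values are illustrative assumptions."""
    line_bytes: int         # cache line size
    seq_latency_ns: float   # latency per miss under sequential access
    rand_latency_ns: float  # latency per miss under random access


def sequential_traversal_misses(n_items: int, item_bytes: int, level: CacheLevel) -> int:
    """Single sequential traversal: each cache line of the region is loaded once."""
    return math.ceil(n_items * item_bytes / level.line_bytes)


def random_traversal_misses(n_items: int, item_bytes: int, level: CacheLevel) -> int:
    """Single random traversal, pessimistic case: no line is reused between
    items, so every item costs ceil(item_bytes / line_bytes) misses."""
    return n_items * math.ceil(item_bytes / level.line_bytes)


def memory_cost_ns(seq_misses: int, rand_misses: int, level: CacheLevel) -> float:
    """Cost contribution of one cache level: misses weighted by latency."""
    return seq_misses * level.seq_latency_ns + rand_misses * level.rand_latency_ns


# Illustrative setup: 64-byte lines, 1,000,000 items of 8 bytes each.
level = CacheLevel(line_bytes=64, seq_latency_ns=5.0, rand_latency_ns=50.0)
seq = sequential_traversal_misses(1_000_000, 8, level)
rnd = random_traversal_misses(1_000_000, 8, level)
print(seq, memory_cost_ns(seq, 0, level))
print(rnd, memory_cost_ns(0, rnd, level))
```

Under these assumed numbers the random traversal causes eight times as many misses as the sequential one, and each miss is ten times more expensive. A compound pattern such as a hash-join probe could then be approximated by adding the sequential cost of scanning the probe input to the random cost of the hash-table lookups.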
References
• [1] Dan Woods, “Bringing Value of Big Data to Business: SAP's Integrated Strategy”, Forbes, 01/05/2012, http://www.forbes.com/sites/danwoods/2012/01/05/bringing-value-of-big-data-to-business-saps-integrated-strategy/
• [2] Wikipedia, “SAP HANA”, http://en.wikipedia.org/wiki/SAP_HANA
• [3] Manegold S.: Understanding, Modeling, and Improving Main-Memory Database Performance, SIKS Dissertation Series No. 2002-17, ISBN 90 6196 5179, pp. 71-104