Page 1: Hamid  Djam Principal Architect  Business Intelligence & Analytics

© Copyright 2011 EMC Corporation. All rights reserved.

EMC* makes no representation and undertakes no obligations with regard to product planning information, anticipated product characteristics, performance specifications, or anticipated release dates (collectively, “Roadmap Information”). Roadmap Information is provided by EMC as an accommodation to the recipient solely for purposes of discussion and without intending to be bound thereby.

Business Intelligence & Big Data Analytics

Hamid Djam, Principal Architect, Business Intelligence & Analytics

Page 2: Hamid  Djam Principal Architect  Business Intelligence & Analytics

Disclaimer

• EMC makes no representation and undertakes no obligations with regard to product planning information, anticipated product characteristics, performance specifications, or anticipated release dates (collectively, “Roadmap Information”).

• Roadmap Information is provided by EMC as an accommodation to the recipient solely for purposes of discussion and without intending to be bound thereby.

• Roadmap Information is EMC Restricted Confidential and is provided under the terms, conditions and restrictions defined in the EMC Non-Disclosure Agreement in place with your organization.

Page 3: Hamid  Djam Principal Architect  Business Intelligence & Analytics

Why A Complete Big Data Analytics Stack Matters

• Big Data is the new source for economic value

• The clearest path to competitive advantage

• The ultimate manifestation of fact-based decision making

• The net new catalyst for business innovation and workplace evolution

• The driving force of a new computing paradigm: data computing

Page 4: Hamid  Djam Principal Architect  Business Intelligence & Analytics

New Realities: Your Data Rules the World

Page 5: Hamid  Djam Principal Architect  Business Intelligence & Analytics

Challenges in Today’s DW Environments…

Traditional solutions cannot meet new challenges

• Critical business insight sits outside the enterprise data warehouse because traditional DW solutions cannot absorb data fast enough

– 100s of data marts

– ‘Shadow’ databases

• Data is everywhere and growing

– 44x data growth by 2020

Enterprise Data Warehouse: holds only 10% of data

Data marts and ‘personal databases’ (e.g. Access, Excel): make up 90% of corporate data

Source: IDC Digital Universe Study, sponsored by EMC, May 2010

Page 6: Hamid  Djam Principal Architect  Business Intelligence & Analytics

DW Challenges Resolved with BI as a Service

BUSINESS: Speed, Agility, Flexibility, Change, Short term

IT: Stability, Security, Control, Standards, Long term

• Long project duration

• Gap in understanding business requirements

• Business creating their own data marts

• Inconsistent data between IT systems and business systems

Reference: Nine Secrets to Building an Agile, Adaptable BI Environment, TDWI

Page 7: Hamid  Djam Principal Architect  Business Intelligence & Analytics

EMC IT: Offering IT-as-a-Service

Apps: Enterprise Applications / Software-as-a-service

• MDM, ERP, CRM, Governance, risk, compliance, Business intelligence

Platform: Platform-as-a-service

• Application Platforms: Application server, Info. lifecycle mgmt, Ent. content mgmt, Integration, Web server, App. frameworks, Security, Runtime environments, Development tools

• Database Platform: Greenplum, SQL Server, Oracle, …

Infrastructure: Infrastructure-as-a-service

• Network, Storage & backup, Compute, vBlock

Desktop-as-a-service

• Virtual Desktops, Client Devices

Page 8: Hamid  Djam Principal Architect  Business Intelligence & Analytics

Information Management Core Disciplines

Data Integration

• Guarantees data availability where and when it is required

• Movement and transformation of enterprise information

• Interconnectivity of IT portfolio

• Standardized formats and service interfaces – SOA

Master Data Management

• Identification and deduplication of shared master data

• Cross-referencing and disambiguation

• Hierarchy management

• Data governance framework and stewardship processes

Content Management

• Unstructured data storage and management

• Workflow-based publishing & versioning services

• Tie-in to enterprise portal and user identity / security strategies

Data Governance

• Framework and organization to ensure management of data as a strategic corporate asset

• Data stewardship

• Policies and procedures; monitoring and measuring

Business Intelligence

• Data warehouse methodology – envisioning to deployment

• Business use-case- or function-specific data marts / reporting solutions

• Moving with agility from reactive to predictive capability

Information Quality

• Assurance that trustworthy data is accessible at time of demand

• Standardization & cleansing

• Business data rule enforcement

• Stale data refresh

• Augmentation from external sources

Page 9: Hamid  Djam Principal Architect  Business Intelligence & Analytics

Page 10: Hamid  Djam Principal Architect  Business Intelligence & Analytics

Building The Industry’s Only Complete Big Data Analytics “Stack”

• Greenplum Chorus: Enterprise Collaboration Platform for Data

• Analytic Toolsets (Business Analytics, BI, Statistics, etc.)

• Greenplum Database (Enterprise & Community Editions): World’s Most Scalable MPP Database Platform

• Greenplum HD (Hadoop Enterprise & Community Editions): Enterprise Analytics Platform for Unstructured Data

• Greenplum Data Computing Appliances: Purpose-built for Big Data Analytics

Page 11: Hamid  Djam Principal Architect  Business Intelligence & Analytics

GREENPLUM DATABASE

Industry-Leading Massively Parallel Processing (MPP) Performance

Page 12: Hamid  Djam Principal Architect  Business Intelligence & Analytics

Building The Industry’s Only Complete Big Data Analytics “Stack”

• Greenplum Chorus: Enterprise Collaboration Platform for Data

• Analytic Toolsets (Business Analytics, BI, Statistics, etc.)

• Greenplum Database (Enterprise & Community Editions): World’s Most Scalable MPP Database Platform

• Greenplum HD (Hadoop Enterprise & Community Editions): Enterprise Analytics Platform for Unstructured Data

• Greenplum Data Computing Appliances: Purpose-built for Big Data Analytics

Page 13: Hamid  Djam Principal Architect  Business Intelligence & Analytics

EMC Greenplum Database Is Purpose-built for Big Data

• EMC Greenplum is a shared-nothing, massively parallel processing (MPP) data warehouse system

• Core principle of data computing is to move the processing dramatically closer to the data and to the people

Fast Data Loading

Extreme Performance & Elastic Scalability

Unified Data Access

Page 14: Hamid  Djam Principal Architect  Business Intelligence & Analytics

Greenplum 4.0: Database Architecture

Massively Parallel Processing and Linear Performance Scalability

• Interfaces: SQL, MapReduce

• Master Servers: query planning & dispatch

• Network Interconnect

• Segment Servers: query processing & data storage

• External Sources: loading, streaming, etc.
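The division of labor on this slide (a master that plans and dispatches, segments that store and process their own share of the data) can be illustrated with a toy model. This is a minimal sketch, not Greenplum code: the segment count, the hash function, and the names `segment_for`, `load`, `segment_sum`, and `master_query` are all hypothetical, and the segments are visited in a plain loop where a real MPP system would work in parallel.

```python
from collections import defaultdict
from zlib import crc32

NUM_SEGMENTS = 4  # hypothetical segment count, chosen only for illustration


def segment_for(key, num_segments=NUM_SEGMENTS):
    """Pick a segment by hashing the distribution key (stable across runs)."""
    return crc32(key.encode("utf-8")) % num_segments


def load(rows, num_segments=NUM_SEGMENTS):
    """Distribute (key, amount) rows so each segment stores only its own share."""
    segments = [[] for _ in range(num_segments)]
    for key, amount in rows:
        segments[segment_for(key, num_segments)].append((key, amount))
    return segments


def segment_sum(local_rows):
    """Work done on one segment: aggregate only the rows stored locally."""
    partial = defaultdict(int)
    for key, amount in local_rows:
        partial[key] += amount
    return dict(partial)


def master_query(segments):
    """Master role: send the same plan to every segment, then merge the partial results."""
    merged = defaultdict(int)
    for local_rows in segments:  # sequential here; parallel in a real system
        for key, subtotal in segment_sum(local_rows).items():
            merged[key] += subtotal
    return dict(merged)


rows = [("east", 10), ("west", 5), ("east", 7), ("north", 3), ("west", 1)]
segments = load(rows)
totals = master_query(segments)
```

Because rows with the same key always hash to the same segment, each segment can aggregate without talking to the others, which is the property the shared-nothing design relies on.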

Page 15: Hamid  Djam Principal Architect  Business Intelligence & Analytics

Platform Independence Delivers Choice and Flexibility

Software-Only

• On your x86 hardware

• Flexibility for any workload

• Ideal for Q/A or DR

Virtualized Infrastructure

• Pool resources

• Elastic scalability

• Ideal for Test & Development

Data Computing Appliance

• Optimized Price/Performance

• Minimum time-to-value

• Ideal for Production Environments

Page 16: Hamid  Djam Principal Architect  Business Intelligence & Analytics

Mature Enterprise Platform

CLIENT ACCESS & TOOLS

• Client Access: ODBC, JDBC, OLEDB, etc.

• 3rd Party Tools: BI Tools, ETL Tools, Data Mining, etc.

• Admin Tools: GP Performance Monitor, pgAdmin3 for GPDB

PRODUCT FEATURES

• Loading & Ext. Access: Petabyte-Scale Loading, Trickle Micro-Batching, Anywhere Data Access

• Storage & Data Access: Hybrid Storage & Execution (Row- & Column-Oriented), In-Database Compression, Multi-Level Partitioning, Indexes – Btree, Bitmap, etc.

• Language Support: Comprehensive SQL, Native MapReduce, SQL 2003 OLAP Extensions, Programmable Analytics

GPDB ADAPTIVE SERVICES

• Multi-Level Fault Tolerance, Online System Expansion, Workload Management

CORE MPP ARCHITECTURE

• Shared-Nothing MPP, Parallel Query Optimizer, Polymorphic Data Storage™, Parallel Dataflow Engine, gNet™ Software Interconnect, MPP Scatter/Gather Streaming™

Page 17: Hamid  Djam Principal Architect  Business Intelligence & Analytics

EMC GREENPLUM HD

Delivering Enterprise-Ready Apache Hadoop

Page 18: Hamid  Djam Principal Architect  Business Intelligence & Analytics

Building The Industry’s Only Complete Big Data Analytics “Stack”

• Greenplum Chorus: Enterprise Collaboration Platform for Data

• Analytic Toolsets (Business Analytics, BI, Statistics, etc.)

• Greenplum Database (Enterprise & Community Editions): World’s Most Scalable MPP Database Platform

• Greenplum HD (Hadoop Enterprise & Community Editions): Enterprise Analytics Platform for Unstructured Data

• Greenplum Data Computing Appliances: Purpose-built for Big Data Analytics

Page 19: Hamid  Djam Principal Architect  Business Intelligence & Analytics

Greenplum HD – Enterprise Ready Hadoop Platform for Unstructured Data

• Greenplum Hadoop is faster, more dependable, and easier to use

– Faster to address the growth of unstructured data

– More dependable for the enterprise, with EMC reliability

– Easier to use with existing systems and tools

Page 20: Hamid  Djam Principal Architect  Business Intelligence & Analytics

Why Hadoop?

• With the massive growth of unstructured data, the open-source Apache Hadoop software has quickly become an important new data platform and technology

– We've seen this first-hand with customers deploying Hadoop alongside Greenplum databases

Page 21: Hamid  Djam Principal Architect  Business Intelligence & Analytics

Why EMC Greenplum HD?

• EMC has the technical depth, expertise and critical mass in building the scalable and reliable distributed data processing systems necessary to drive technical innovation into Hadoop

• Hadoop needs to become “mission critical” and “easier to use and manage”

– HDFS optimizations, workload management, job scheduling, systems management, etc.

– Fault tolerance: eliminate single points of failure (SPOF) for the NameNode, JobTracker and other key components underlying Hadoop

Page 22: Hamid  Djam Principal Architect  Business Intelligence & Analytics

Greenplum HD: Hadoop Software Distributions

• Introducing Greenplum HD, enterprise-ready Apache Hadoop software distributions

– Community Edition software: 100% open source

– Enterprise Edition software: advanced features, 100% API compatible

Page 23: Hamid  Djam Principal Architect  Business Intelligence & Analytics

Greenplum HD Data Computing Appliance

• Introducing the world’s first high-performance, purpose-built, data co-processing Hadoop appliance

• Combining Hadoop and Greenplum Database in one appliance

Page 24: Hamid  Djam Principal Architect  Business Intelligence & Analytics

THE ANSWER MACHINE: DATA IN. DECISIONS OUT.

Introducing the Greenplum Data Computing Appliance

Page 25: Hamid  Djam Principal Architect  Business Intelligence & Analytics

Building The Industry’s Only Complete Big Data Analytics “Stack”

• Greenplum Chorus: Enterprise Collaboration Platform for Data

• Analytic Toolsets (Business Analytics, BI, Statistics, etc.)

• Greenplum Database (Enterprise & Community Editions): World’s Most Scalable MPP Database Platform

• Greenplum HD (Hadoop Enterprise & Community Editions): Enterprise Analytics Platform for Unstructured Data

• Greenplum Data Computing Appliances: Purpose-built for Big Data Analytics

Page 26: Hamid  Djam Principal Architect  Business Intelligence & Analytics

Key Architectural Principles

• Keep it simple

• Build on standard hardware components

– Performance comes from our software architecture

– Best of breed x86 and Ethernet networking technologies

– Benefit from broad ecosystem innovation

• Make it modular for easy scaling

• SAN connectivity designed in

• Focus on Data Computing, not Data Warehousing

– Greenplum Database

– SAS Analytics

– Hadoop

Page 27: Hamid  Djam Principal Architect  Business Intelligence & Analytics

DCA Functional Components

2 GPDB Master Servers

4 GPDB Segment Servers

2 10GE Switches

Administrative Switch

8 Segment Servers

Free Functional Block (shown four times in the diagram)

Page 28: Hamid  Djam Principal Architect  Business Intelligence & Analytics

Scale to Multiple Racks In Granular Quarter Rack Increments

1st Rack (add ¼-rack increments) + Expansion Rack (add ¼-rack increments) + . . .

Page 29: Hamid  Djam Principal Architect  Business Intelligence & Analytics

High Availability Built-In

Diagram: two Master servers and a row of Segment servers

Master Server Data Protection

• HW RAID protection for drive failures

• Replicated transaction logs for server failure

On server failure

• Standby server activated

• Administrator alerted

Segment Server Data Protection

• HW RAID protection for drive failures

• Mirrored segments for server failures

On server failure

• Mirrored segments take over with no loss of service

• Fast online differential recovery

Page 30: Hamid  Djam Principal Architect  Business Intelligence & Analytics

GPDB HA Groups And Segment Mirrors

GPDB HA Group (shown four times in the diagram)

Set of Active Segment Instances

                   Primaries        Mirrors
Segment Server 1   P1   P2   P3     M6   M8   M10
Segment Server 2   P4   P5   P6     M1   M9   M11
Segment Server 3   P7   P8   P9     M2   M4   M12
Segment Server 4   P10  P11  P12    M3   M5   M7

The numbers of primary and mirror instances shown above are for illustration purposes only. Each Segment Server in a DCA actually supports a total of 12 instances (6 primaries and 6 mirrors).

Page 31: Hamid  Djam Principal Architect  Business Intelligence & Analytics

DCA Can Sustain Up to Four Server Failures Per Rack, One Per HA Group

GPDB HA Group (shown four times in the diagram)

Set of Active Segment Instances

                   Primaries        Mirrors
Segment Server 1   P1   P2   P3     M6   M8   M10
Segment Server 2   P4   P5   P6     M1   M9   M11
Segment Server 3   P7   P8   P9     M2   M4   M12
Segment Server 4   P10  P11  P12    M3   M5   M7

The numbers of primary and mirror instances shown above are for illustration purposes only. Each Segment Server in a DCA actually supports a total of 12 instances (6 primaries and 6 mirrors).
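The "one failure per HA group" claim can be checked against the illustrative layout on this slide. The sketch below is a toy model, not GPDB tooling: the layout is transcribed from the diagram, the pairing of each row with Segment Server 1 to 4 is an assumption about how the diagram reads, and the helper names (`hosts`, `surviving_segments`, `failover_load`) are hypothetical.

```python
# Layout transcribed from the slide; pairing row i with Segment Server i is an assumption.
LAYOUT = {
    1: {"primaries": [1, 2, 3], "mirrors": [6, 8, 10]},
    2: {"primaries": [4, 5, 6], "mirrors": [1, 9, 11]},
    3: {"primaries": [7, 8, 9], "mirrors": [2, 4, 12]},
    4: {"primaries": [10, 11, 12], "mirrors": [3, 5, 7]},
}


def hosts(layout, role):
    """Map each segment number to the server holding it in the given role."""
    return {seg: server for server, slots in layout.items() for seg in slots[role]}


def surviving_segments(layout, failed_servers):
    """Segments that still have a primary or a mirror on a healthy server."""
    alive = set()
    for server, slots in layout.items():
        if server not in failed_servers:
            alive.update(slots["primaries"], slots["mirrors"])
    return alive


def failover_load(layout, failed_server):
    """Active instances per healthy server once mirrors take over for one failed server."""
    mirror_host = hosts(layout, "mirrors")
    load = {s: len(slots["primaries"]) for s, slots in layout.items() if s != failed_server}
    for seg in layout[failed_server]["primaries"]:
        load[mirror_host[seg]] += 1
    return load


primary_host = hosts(LAYOUT, "primaries")
mirror_host = hosts(LAYOUT, "mirrors")
all_segments = set(primary_host)

for failed in LAYOUT:
    status = "ok" if surviving_segments(LAYOUT, {failed}) == all_segments else "data unavailable"
    print(f"server {failed} down: {status}, load {failover_load(LAYOUT, failed)}")
```

In this layout no mirror shares a server with its primary, and each server's three primaries are mirrored on three different servers, so a single failure spreads the extra work evenly. A second failure in the same group is a different story: with servers 1 and 2 both down, segments 1 and 6 would have no live copy.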

Page 32: Hamid  Djam Principal Architect  Business Intelligence & Analytics

EMC Dial-Home and Remote Support Built-In

• EMC Premium Support

• ESRS secure IP connection enabled for DCA racks

– Automatic dial home for DCA HW and SW failures

– 24x7 remote technical support and troubleshooting

– Online support triggers FRU parts shipment

• Four-hour on-site support objective

Diagram: DCA connects to EMC Support via FTPS or ESRS

Page 33: Hamid  Djam Principal Architect  Business Intelligence & Analytics

Customer Support Services: EMC Greenplum Warranty and Premium Maintenance

Premium Maintenance

• Remote Technical Support

– 24x7 technical support and remote troubleshooting

– Customer-managed case severity level

– Installation of platform operating system updates

• Onsite Support

– Installation of replacement parts

– Four-hour response objective

• Proactive Service

– Secure remote monitoring for hardware

– Notification of engineering technical advisories

– Built-in tools maximize stability and performance

• Secure Self-Help

– 24x7 access to eService support tools including knowledgebase, forums, and appropriately licensed software updates

One-Year Limited HW Warranty

• Secure Self-Help

– 24x7 access to eService support tools including knowledgebase, forums

• Remote Technical Support

– Technical support and remote troubleshooting during normal business hours

• Replacement parts shipped for next business day arrival

Page 34: Hamid  Djam Principal Architect  Business Intelligence & Analytics

EMC Effect: Rapidly Expanding Portfolio

                                 One Rack DCA    High Capacity DCA
Total CPU Cores                  192             192
Total Memory                     768 GB          768 GB
Segment HDD                      192             192
HDD Type                         600 GB SAS      2 TB SATA
Usable Capacity (uncompressed)   36 TB           124 TB
Usable Capacity (compressed)     144 TB          496 TB
Scan Rate                        24 GB/sec       14 GB/sec
Data Load Rate                   10 TB/hour      10 TB/hour
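Two figures can be derived from this comparison: the compression ratio implied by the two capacity rows, and how long a scan of the full uncompressed capacity would take at the quoted scan rate. The following is a rough back-of-the-envelope sketch; the decimal conversion (1 TB = 1000 GB) is an assumption the slide does not state, and the function names are hypothetical.

```python
# Figures transcribed from the slide's comparison table.
DCA_MODELS = {
    "One Rack DCA": {"usable_tb": 36, "compressed_tb": 144, "scan_gb_per_s": 24},
    "High Capacity DCA": {"usable_tb": 124, "compressed_tb": 496, "scan_gb_per_s": 14},
}

GB_PER_TB = 1000  # assumption: decimal units; the slide does not specify


def compression_ratio(model):
    """Ratio of compressed to uncompressed usable capacity."""
    return model["compressed_tb"] / model["usable_tb"]


def full_scan_minutes(model):
    """Minutes to scan the whole uncompressed capacity at the quoted scan rate."""
    seconds = model["usable_tb"] * GB_PER_TB / model["scan_gb_per_s"]
    return seconds / 60


for name, model in DCA_MODELS.items():
    print(f"{name}: {compression_ratio(model):.1f}x compression, "
          f"{full_scan_minutes(model):.1f} min full scan")
```

Both models imply the same 4x compression ratio. The High Capacity model holds more data but scans it more slowly, which is the trade-off the table describes.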

Page 35: Hamid  Djam Principal Architect  Business Intelligence & Analytics

Data Computing Appliance (DCA)

                                 GP10         GP100        GP1000
                                 (1/4 Rack)   (1/2 Rack)   (Full Rack)
Master Servers                   2            2            2
Segment Servers                  4            8            16
Total CPU Cores                  48           96           192
Total Memory                     192 GB       384 GB       768 GB
Segment HDDs (SAS)               48           96           192
Usable Capacity (uncompressed)   9 TB         18 TB        36 TB
Usable Capacity (compressed)     36 TB        72 TB        144 TB
Scan Rate                        6 GB/sec     12 GB/sec    24 GB/sec
Data Load Rate                   2.5 TB/hr    5 TB/hr      10 TB/hr

• Purpose-built, highly scalable next generation data warehousing appliance

• Architecturally integrates database, compute, storage, and network into an enterprise-class, easy-to-implement system.

• Balanced for best price/performance ratio

• Available in quarter-, half-, three-quarter-, full-, and multi-rack configurations
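The figures in the DCA table scale linearly with the number of segment servers, and the compressed capacities imply a fixed 4:1 compression ratio. The short Python sketch below checks that arithmetic. The numbers are copied from the table; the dictionary layout and the helper names `per_server` and `compression_ratio` are illustrative assumptions, not part of any EMC tool.

```python
# Figures copied from the DCA table; the structure and names are illustrative.
DCA_MODELS = {
    "GP10": {"segment_servers": 4, "usable_tb": 9, "compressed_tb": 36,
             "scan_gb_s": 6, "load_tb_hr": 2.5},
    "GP100": {"segment_servers": 8, "usable_tb": 18, "compressed_tb": 72,
              "scan_gb_s": 12, "load_tb_hr": 5},
    "GP1000": {"segment_servers": 16, "usable_tb": 36, "compressed_tb": 144,
               "scan_gb_s": 24, "load_tb_hr": 10},
}


def per_server(model):
    """Divide the rack-level figures by the number of segment servers."""
    n = model["segment_servers"]
    return {
        "usable_tb": model["usable_tb"] / n,
        "scan_gb_s": model["scan_gb_s"] / n,
        "load_tb_hr": model["load_tb_hr"] / n,
    }


def compression_ratio(model):
    """Ratio of quoted compressed capacity to uncompressed capacity."""
    return model["compressed_tb"] / model["usable_tb"]


for name, model in DCA_MODELS.items():
    print(name, per_server(model), compression_ratio(model))
```

Every configuration works out to the same per-server figures, which is what the quoted numbers imply about quarter-, half-, and full-rack scaling. Actual compression ratios depend on the data being stored.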


High Capacity DCA

• Suitable for large-database customers with PB scalability in mind

• Increases the data capacity of a rack by roughly three times

• Reduced rack space, power, and cooling needs per unit data

• Lowest price-per-unit data warehouse appliance

• Available in quarter-, half-, three-quarter-, full-, and multi-rack configurations

                                 GP10C        GP100C       GP1000C
                                 (1/4 Rack)   (1/2 Rack)   (Full Rack)
Master Servers                   2            2            2
Segment Servers                  4            8            16
Total CPU Cores                  48           96           192
Total Memory                     192 GB       384 GB       768 GB
Segment HDDs (SATA)              48           96           192
Usable Capacity (uncompressed)   31 TB        62 TB        124 TB
Usable Capacity (compressed)     124 TB       248 TB       496 TB
Scan Rate                        3.5 GB/sec   7 GB/sec     14 GB/sec
Data Load Rate                   2.5 TB/hr    5 TB/hr      10 TB/hr
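The capacity and scan-rate trade-off between the standard and High Capacity full racks can be made concrete with a little arithmetic. The sketch below is a rough estimate only: it assumes decimal units (1 TB = 1000 GB), that the quoted scan rate is sustained across the whole data set, and that the data is uncompressed. The function name `full_scan_minutes` is hypothetical.

```python
# Full-rack figures quoted on the slides (uncompressed capacity, scan rate).
GP1000 = {"usable_tb": 36, "scan_gb_s": 24}     # standard DCA, SAS drives
GP1000C = {"usable_tb": 124, "scan_gb_s": 14}   # High Capacity DCA, SATA drives


def full_scan_minutes(usable_tb, scan_gb_s, gb_per_tb=1000):
    """Minutes to scan the full uncompressed capacity at the quoted rate.

    Assumes the scan rate is sustained and 1 TB = 1000 GB.
    """
    return usable_tb * gb_per_tb / scan_gb_s / 60


capacity_ratio = GP1000C["usable_tb"] / GP1000["usable_tb"]
print(f"Capacity ratio: {capacity_ratio:.2f}x")
print(f"GP1000 full scan:  {full_scan_minutes(**GP1000):.1f} min")
print(f"GP1000C full scan: {full_scan_minutes(**GP1000C):.1f} min")
```

Under these assumptions the High Capacity rack holds about 3.4 times the data but takes several times longer to scan end to end, which fits its positioning for capacity rather than scan speed.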


Application Specific Configurations

Database Hadoop


Seamless Infrastructure Integration

Big Data Loading & Staging

Disaster Recovery

Storage Expansion

Data Protection


Seamless Infrastructure Integration

• EMC Data Domain: Efficient Backup & Restore

• EMC VMAX SAN Mirror: For Advanced Storage Management

• Isilon Scale Out Storage: For Big Data Staging

• EMC VMAX SRDF, EMC Data Domain Replication: For Disaster Recovery


Efficient Backup/Restore with EMC Data Domain

• Data Domain deduplication is a great fit for Greenplum datasets

• Drastic reduction in backup storage requirement

• Backup all segment servers in parallel directly to Data Domain

• Greenplum's deduplication-friendly compressed data streams achieve effective backup rates of up to 6 TB/hr
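Because the 6 TB/hr figure is quoted as an upper bound ("up to"), it gives a best-case backup window rather than a guaranteed one. The sketch below turns that into a simple estimate. The function name `min_backup_hours` is hypothetical, and the data sizes are the full-rack uncompressed capacities quoted elsewhere in this deck; the slide does not say how much data is actually streamed after compression.

```python
def min_backup_hours(data_tb, rate_tb_per_hr=6.0):
    """Best-case hours to back up data_tb at the quoted peak effective rate.

    The rate is an upper bound, so real backups may take longer.
    """
    if rate_tb_per_hr <= 0:
        raise ValueError("rate_tb_per_hr must be positive")
    return data_tb / rate_tb_per_hr


# Full-rack uncompressed capacities from the DCA tables in this deck.
for label, size_tb in [("GP1000", 36), ("GP1000C", 124)]:
    print(f"{label}: at least {min_backup_hours(size_tb):.1f} hours")
```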


DCA SAN Mirror

H1 2011

• Default DCA configuration has Segment Primaries and Segment Mirrors on internal storage
• SAN Mirror offloads Segment Mirrors to SAN storage
– Doubles effective capacity of a DCA
– Foundation of SAN leverage
– Seamless off-host backups
– Data replication
• No performance impact
– Primaries on internal storage
– SAN sized for load and failed segment server

[Figure: primary segments P1 … P96 and mirror segments M1 … M96]


GREENPLUM CHORUS

The World’s First Enterprise Data Cloud Platform


Building The Industry’s Only Complete Big Data Analytics “Stack”

• Greenplum Chorus: Enterprise Collaboration Platform for Data

• Greenplum Database (Enterprise & Community Editions): World’s Most Scalable MPP Database Platform

• Analytic Toolsets (Business Analytics, BI, Statistics, etc.)

• Greenplum HD (Hadoop Enterprise & Community Editions): Enterprise Analytics Platform for Unstructured Data

• Greenplum Data Computing Appliances: Purpose-built for Big Data Analytics


Greenplum Chorus

• Greenplum’s Enterprise Data Cloud Platform (EDC), enabling:

– Self-service provisioning
– Data services
– Collaborative analytics

• Customers deploy Chorus along with VMware and the Greenplum Database to create an agile and self-service analytic infrastructure

• Chorus can significantly reduce the time and effort companies need to extract value and insight from their data


Spin up new projects rapidly with self-service provisioning.
o Provision instances, both single-node and multi-node.
o Provision sandboxes as new databases or schemas.
o Import data easily from anywhere in the cloud.


Data is now discoverable, self-documenting, and shared.
o Browse schemas and explore data with powerful search and visualization tools.
o Attach documents, ask questions, add comments, and build a living data dictionary.
o Define data sets, share them with the team, and schedule imports.


Create a collaborative environment for deep analytics on big data.
o Create project workspaces with shared files, data, documentation and workflows.
o Execute workflows directly in the sandbox, and then track changes to work and results over time.
o Control permissions to protect private data.
o Publish functions and documentation, to promote common standards and techniques.
o Import functions from libraries of in-database analytics functions.
o Collaborate within projects, share information across teams.


THANK YOU