IBM North America © 2016 IBM Corporation Linux on Power Open Source Databases Kevin Lawrence IBM - NA Power Systems - Server Solutions Ecosystem Open Source Databases

Page 1: 2016-10 Open Source Databases - Lawrence - v2.1.pdf

IBM North America

© 2016 IBM Corporation

Linux on Power Open Source Databases

Kevin Lawrence, IBM - NA Power Systems - Server Solutions Ecosystem

Open Source Databases

Page 2: 2016-10 Open Source Databases - Lawrence - v2.1.pdf

Linux on Power - Open Source Databases

• “By 2018, more than 70% of new in-house applications will be developed on an OSDBMS, and 50% of existing commercial RDBMS instances will have been converted or will be in process”*

*Gartner - The State of Open Source RDBMSs, 2015, by Donald Feinberg and Merv Adrian, published April 21, 2015.

Page 3: 2016-10 Open Source Databases - Lawrence - v2.1.pdf

Database Ecosystem

• Many database choices, spanning commercial to open source products and relational to non-relational models; no single 'winner takes all'.

• Relational DB strengths: transactional integrity and a large ecosystem around SQL.

• NoSQL DBs are much lower cost and give clients a simple data model with dynamic control over storing and retrieving primarily unstructured data types.

• The four primary flavors of NoSQL DB are all available on POWER8:

  • Key/Value Store (example: Redis)

  • Document Store (example: MongoDB)

  • Columnar Store (example: Cassandra)

  • Graph Store (example: Neo4j)
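To make "transactional integrity" concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for a relational engine such as Postgres or MariaDB. The accounts table, the names and the amounts are hypothetical and are not taken from the deck. Both updates in a transfer either commit together or are rolled back together.

```python
import sqlite3

# In-memory database as a stand-in for any relational engine (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 20)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst atomically; return False if it was rolled back."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            # Credit first, so a failed debit has partial work to undo.
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected a negative balance; the credit was undone too.
        return False


def balances(conn):
    """Return {name: balance} for all accounts."""
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))


ok = transfer(conn, "alice", "bob", 30)       # within alice's balance
failed = transfer(conn, "alice", "bob", 500)  # would overdraw alice
print(ok, failed, balances(conn))
```

The second transfer leaves both balances exactly as they were after the first, even though its credit statement had already run inside the transaction.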

Page 4: 2016-10 Open Source Databases - Lawrence - v2.1.pdf

Types of Databases

• Relational database management systems (RDBMS) support the relational (table-oriented) data model. The schema of a table (relation schema) is defined by the table name and a fixed number of attributes with fixed data types. A record (entity) corresponds to a row in the table and consists of the values of each attribute. (An open source example is Postgres/EnterpriseDB.)

• Document Databases (e.g. MongoDB) store data in documents, and documents contain one or more fields. Data can be queried based on any combination of fields in a document. The appeal of these systems is that they are very general purpose, have large application ecosystems and map very nicely onto many of today's object-oriented programming styles.

• Key Value Store Databases (e.g. Redis) are the most basic type of non-relational DB. They store a key and its associated values.

• Wide Column Stores (e.g. Cassandra) vary in the number of columns that are stored per row. The appeal of these systems is their very high performance and scalability.

• Graph Databases (e.g. Neo4j) focus on storing relationships and can be queried to discover both simple and more complex relationships between the data.
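As an illustration of how the five models differ, the sketch below expresses one hypothetical customer order in each shape using plain Python data structures. The identifiers, field names and values are invented for the example, and none of this is the API of any of the products named above.

```python
# Relational: a fixed set of attributes with fixed types, one row per record.
# Columns: (order_id, customer_id, order_date, total)
relational_row = ("o-1001", "c-42", "2016-10-01", 59.90)

# Document: a self-describing record with nested fields; any field can be queried.
document = {
    "_id": "o-1001",
    "customer": {"id": "c-42", "name": "Ada"},
    "items": [{"sku": "A1", "qty": 2}, {"sku": "B7", "qty": 1}],
    "total": 59.90,
}

# Key-value: an opaque value that is looked up by its key only.
key_value = {"order:o-1001": '{"customer": "c-42", "total": 59.90}'}

# Wide column: each row key maps to its own, possibly different, set of columns.
wide_column = {
    "c-42": {"2016-10-01:o-1001": 59.90, "2016-10-03:o-1017": 12.50},
    "c-77": {"2016-10-02:o-1009": 8.00},
}

# Graph: nodes plus typed relationships between them.
nodes = {
    "c-42": {"label": "Customer"},
    "o-1001": {"label": "Order"},
    "A1": {"label": "Product"},
}
edges = [("c-42", "PLACED", "o-1001"), ("o-1001", "CONTAINS", "A1")]


def neighbours(node_id):
    """Follow the outgoing relationships of one node."""
    return [(rel, dst) for src, rel, dst in edges if src == node_id]


print(neighbours("c-42"))
```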

Page 5: 2016-10 Open Source Databases - Lawrence - v2.1.pdf

Types of Databases with Open Source Examples

• Wide column store - Example: Cassandra

• Graph - Example: Neo4j

• Document store - Example: MongoDB

• Key-value store - Example: Redis

• Relational - Example: EnterpriseDB

Page 6: 2016-10 Open Source Databases - Lawrence - v2.1.pdf

Common Linux on Power OSDBs

• MongoDB
    Classification: NoSQL - Document Store
    Optimized for: Document model, document stores, semi-structured or unstructured data
    Common use cases: Single view of customer records, enterprise content management, catalogs, personalization

• Redis
    Classification: NoSQL - In-memory Key Value Store
    Optimized for: Data queues, strings, lists, counts, caching, statistics, text, session IDs, pictures, videos
    Common use cases: Live in-memory cache, data queues, user session data, shopping cart data

• Cassandra
    Classification: NoSQL - Wide Column Store
    Optimized for: NoSQL environments that need very high performance and scalability, very high data volumes
    Common use cases: Messaging, fraud detection, Internet of Things data (sensor data, log data), telco call detail records

• Neo4j
    Classification: NoSQL - Graph Store
    Optimized for: Data stored as edges, nodes, or attributes (graphs)
    Common use cases: Fraud detection, social network analysis, location-aware apps, master data management, machine learning

• Postgres (EnterpriseDB)
    Classification: Open source object-relational database
    Optimized for: Wide variety of transactional work at lower TCO, from relational/structured queries to object store and retrieval
    Common use cases: Oracle RDBMS migrations and take-outs

• MariaDB
    Classification: Open source relational database
    Optimized for: Lower-cost transactional SQL-based queries and updates
    Common use cases: Migrations from Oracle MySQL, Turbo LAMP stack

Page 7: 2016-10 Open Source Databases - Lawrence - v2.1.pdf

Redis

• Main points: Simple values or data structures stored by key. Blazing fast.

• Exploits POWER8: Redis Labs on Power utilizes IBM POWER8 servers, the IBM FlashSystem, the IBM CAPI-Flash card and the Redis Labs Enterprise Cluster (RLEC) for Flash software.

• Other features: Master-slave replication, automatic failover

• Best used: For rapidly changing data with a foreseeable database size (should fit mostly in memory).

• For example: Storing real-time stock prices, real-time analytics, leaderboards, real-time communication, and wherever you used memcached before.
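The "wherever you used memcached before" case is usually a cache-aside lookup with expiring keys. The sketch below shows that pattern with a toy in-memory store. TinyKV, cached_price and the fetch callback are hypothetical names written for this illustration; they are not the Redis API, and a real deployment would talk to a Redis server instead.

```python
import time


class TinyKV:
    """Toy in-memory key-value store with per-key expiry (a stand-in, not Redis)."""

    def __init__(self, clock=time.monotonic):
        self._data = {}       # key -> (value, expires_at or None)
        self._clock = clock   # injectable so expiry can be tested without sleeping

    def set(self, key, value, ttl=None):
        expires_at = None if ttl is None else self._clock() + ttl
        self._data[key] = (value, expires_at)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and self._clock() >= expires_at:
            del self._data[key]  # lazily evict expired entries on read
            return None
        return value


def cached_price(store, symbol, fetch, ttl=1.0):
    """Cache-aside: serve from the store, fall back to `fetch` on a miss."""
    key = "price:" + symbol
    price = store.get(key)
    if price is None:
        price = fetch(symbol)       # slow path, e.g. a market data feed
        store.set(key, price, ttl)  # keep it only briefly; prices change fast
    return price
```

A short time-to-live suits rapidly changing data such as stock prices: stale values drop out on their own, and the working set stays small enough to fit in memory.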

Page 8: 2016-10 Open Source Databases - Lawrence - v2.1.pdf

MongoDB

• Main point: Retains some friendly properties of SQL (queries, indexes).

• Exploits POWER8 features: Performance. Testing of MongoDB with CAPI Flash on POWER8 is just starting.

• Other features: Master/slave replication (auto failover with replica sets), sharding, integrated text search, geospatial indexing

• Data center aware

• Best used: If you need dynamic queries. If you prefer to define indexes, not map/reduce functions. If you need good performance on a big DB. If you wanted CouchDB, but your data changes too much, filling up disks.

• For example: Most popular NoSQL Document DB.
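"Dynamic queries" here means filtering on any combination of fields without defining a view or a map/reduce function first. The sketch below imitates that idea over a list of Python dicts. It supports equality matching only, the catalog data is invented, and matches/find are hypothetical helpers rather than the MongoDB driver API.

```python
def matches(doc, query):
    """True if every field named in `query` has the given value in `doc`.

    Dotted names such as "stock.warehouse" descend into nested documents.
    """
    for path, expected in query.items():
        value = doc
        for part in path.split("."):
            if not isinstance(value, dict) or part not in value:
                return False  # the document simply lacks this field
            value = value[part]
        if value != expected:
            return False
    return True


def find(collection, query):
    """Return all documents in `collection` that satisfy `query`."""
    return [doc for doc in collection if matches(doc, query)]


# Documents in one collection do not need to share the same fields.
catalog = [
    {"sku": "A1", "type": "book", "price": 12, "stock": {"warehouse": "east", "qty": 4}},
    {"sku": "B7", "type": "book", "price": 30, "stock": {"warehouse": "west", "qty": 0}},
    {"sku": "C3", "type": "toy", "price": 12, "tags": ["sale"]},
]

west_books = find(catalog, {"type": "book", "stock.warehouse": "west"})
priced_12 = find(catalog, {"price": 12})
print([d["sku"] for d in west_books], [d["sku"] for d in priced_12])
```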

Page 9: 2016-10 Open Source Databases - Lawrence - v2.1.pdf

Cassandra

• Main point: Stores huge datasets and retrieves them in "almost" SQL (CQL3)

• Exploits POWER8 features: Apache

• Other features: CQL3 is the official interface and is very similar to SQL, but with some limitations that come from the scalability (most notably: no JOINs, no aggregate functions).

• Querying by key, or key range (secondary indices are also available).

• Highly scalable and highly available with no single point of failure

• NoSQL column family implementation

• Very high write throughput and good read throughput. Writes can be much faster than reads (when reads are disk-bound)

• SQL-like query language (since 0.8) and support for search through secondary indexes

• Tunable consistency and support for replication

• Flexible schema

• Map/reduce possible with Apache Hadoop

• Very good and reliable cross-datacenter replication

• Best used: When you need to store data so huge that it doesn't fit on a single server, but still want a friendly, familiar interface to it.

• For example: Web analytics, to count hits by hour, by browser, by IP, etc. Transaction logging. Data collection from huge sensor arrays.
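"Tunable consistency" can be reasoned about with simple arithmetic: with a replication factor of N, a read is guaranteed to see the latest acknowledged write when the number of replicas read plus the number of replicas written exceeds N. The sketch below encodes that rule. The function names are hypothetical, and the replica counts map onto consistency levels such as ONE, QUORUM and ALL.

```python
def quorum(replication_factor):
    """Smallest majority of replicas, e.g. 2 of 3 or 3 of 5."""
    return replication_factor // 2 + 1


def reads_see_latest_write(replication_factor, write_replicas, read_replicas):
    """True when every read set must share at least one replica with every write set."""
    return read_replicas + write_replicas > replication_factor


rf = 3
one, q, everything = 1, quorum(rf), rf

# (write level, read level) -> guaranteed to read the latest write?
outcomes = {
    ("ONE", "ONE"): reads_see_latest_write(rf, one, one),
    ("QUORUM", "QUORUM"): reads_see_latest_write(rf, q, q),
    ("ALL", "ONE"): reads_see_latest_write(rf, everything, one),
    ("ONE", "QUORUM"): reads_see_latest_write(rf, one, q),
}
for levels, strong in outcomes.items():
    print(levels, strong)
```

Lower levels trade that guarantee for latency and availability, which is why very high write throughput is often paired with writes at ONE and reads that tolerate slightly stale data.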

Page 10: 2016-10 Open Source Databases - Lawrence - v2.1.pdf

Neo4j

• Main point: NoSQL Graph database optimized for connected data

• Exploits POWER8 features: Neo4j on POWER8 offers 56 TB of extended memory, drastically increasing the size at which real-time graph queries are possible. Real-time graph processing with Neo4j on POWER8 supports both standard operational requirements and analytic insights that normally require offline processing. IBM POWER8 hardware allows Neo4j to scale both up and out for graphs of greater size than ever before.

• Other features: HTTP/REST (or embedding in Java)

• Full ACID (Atomicity, Consistency, Isolation, Durability) conformity (including durable data)

• Integrated pattern-matching-based query language ("Cypher")

• Indexing of keys, nodes and relationships

• Advanced path-finding with multiple algorithms

• Optimized for reads

• Has transactions (in the Java API)

• Clustering, replication, caching, online backup, advanced monitoring and High Availability are commercially licensed

• Best used: For graph-style, rich or complex, interconnected data.

• For example: For searching routes in social relations, public transport links, road maps, or network topologies.
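Route searching over public transport links is a shortest-path problem on a graph. The sketch below uses a breadth-first search over an adjacency dict to find a route with the fewest hops. The station names are invented, and this plain-Python function only illustrates the kind of question a graph database answers natively; it is not Neo4j's path-finding implementation or Cypher.

```python
from collections import deque


def shortest_route(graph, start, goal):
    """Breadth-first search: return a path with the fewest hops, or None if unreachable."""
    if start == goal:
        return [start]
    previous = {start: None}  # node -> node it was first reached from
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour in previous:
                continue  # already reached by a path that is no longer
            previous[neighbour] = node
            if neighbour == goal:
                # Walk the predecessor chain back to the start.
                path = [goal]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(neighbour)
    return None


# Hypothetical transport network: station -> directly linked stations.
links = {
    "Airport": ["Central", "Harbour"],
    "Central": ["Airport", "Museum", "Stadium"],
    "Harbour": ["Airport", "Stadium"],
    "Museum": ["Central", "University"],
    "Stadium": ["Central", "Harbour", "University"],
    "University": ["Museum", "Stadium"],
    "Depot": [],
}

route = shortest_route(links, "Airport", "University")
print(route)
```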

Page 11: 2016-10 Open Source Databases - Lawrence - v2.1.pdf

EnterpriseDB (Postgres)

• Main point: Enterprise-class, open source, relational database

• Easily integrates with or supplants Oracle Database: many applications written for Oracle run on Postgres Advanced Server without modification, and Oracle-skilled developers can use it with minimal retraining.

• Performance – EDB running on POWER8 brings a cost-effective, enterprise-class solution to CIOs and IT managers running Red Hat Enterprise Linux 7.x on little-endian POWER8. EDB Postgres Advanced Server on POWER8 offers 2x higher performance than Intel-based systems for OLTP applications, high-performance multi-threading, more cache and greater data bandwidth.

• Scalability – Reliably handles multi-terabyte data sets supporting millions of users with guaranteed transactional integrity and continuous availability

• TCO – Reduces operating costs by requiring fewer systems at a lower acquisition cost

• DBMS Convergence – Supports traditional structured, semi-structured, and unstructured data types, reducing the need to deploy costly, one-off NoSQL data silos and encouraging adoption of Postgres and migration of workloads from proprietary databases.

• Services – Brings together two industry leaders committed to open source offerings. The EDB Postgres Management, Integration, and Migration Suites support replication, HA, database monitoring/management and data integration for mission-critical enterprise applications.

Page 12: 2016-10 Open Source Databases - Lawrence - v2.1.pdf

Modernize your Database with POWER8 and EnterpriseDB

• 79% 3-year TCO reduction

• 30% fewer servers

• 84% reduction in SW licensing cost with fewer cores and EnterpriseDB

• 29% reduction in HW costs and maintenance

• 68% reduction in core count

• Assumptions: 7 Power S822LC servers (65% utilization) have performance equivalent to 10 x86 servers (40% utilization)

[Chart: Solution TCO for 3 years, broken down into Environmentals, HW and SW. Bars: S822LC/20c/2.926 with EnterpriseDB; HP DL380p/Brwell (2s) with Oracle EE. Vertical axis from 0 to 6,000,000.]
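The server and core reductions quoted for the EnterpriseDB comparison can be reproduced from the stated assumptions. In the sketch below the 20 cores per Power server come from the S822LC/20c chart label. The 44 cores per x86 server are an assumption borrowed from the "44c" label on the MongoDB comparison, since this slide does not state them.

```python
def percent_reduction(before, after):
    """Percentage by which `after` is smaller than `before`."""
    return 100.0 * (before - after) / before


x86_servers, power_servers = 10, 7
x86_cores_per_server = 44    # assumption: taken from the "44c" label on the MongoDB comparison
power_cores_per_server = 20  # from the "S822LC/20c" chart label

server_cut = percent_reduction(x86_servers, power_servers)
core_cut = percent_reduction(
    x86_servers * x86_cores_per_server,      # 440 x86 cores
    power_servers * power_cores_per_server,  # 140 POWER8 cores
)
print(round(server_cut), round(core_cut))
```

Under those assumptions the result is 30% fewer servers and roughly 68% fewer cores, in line with the figures quoted for this comparison. The fewer-cores figure matters because per-core software licensing scales with it.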

Page 13: 2016-10 Open Source Databases - Lawrence - v2.1.pdf

Modernize your Database with POWER8/PowerKVM and MongoDB vs x86/VMware and Oracle EE

• 85% 3-year TCO reduction

• 90% reduction in SW licensing cost with fewer cores and MongoDB

• 23% reduction in HW costs and maintenance

• 45% reduction in core count

• Assumptions:

  • 7 x Power S822LC/20c servers with PowerKVM (40% utilization) have performance equivalent to 10 x HP DL380/E5-2699 v4/44c servers with VMware (40% utilization)

  • Performance is based on SPECint_rate

[Chart: Solution TCO for 3 years, broken down into Environmentals, HW and SW. Bars: S822LC/20c/2.926 with MongoDB; HP DL380/BWL/44c/2.2 with Oracle EE. Vertical axis from 0 to 6,000,000.]

Page 14: 2016-10 Open Source Databases - Lawrence - v2.1.pdf

Hortonworks Announcement

• Announced at IBM Edge: Hortonworks HDP is coming to Power!

• What is Hortonworks' HDP? It is an enterprise-ready open source Apache™ Hadoop® distribution based on a centralized architecture (YARN).

• HDP addresses the complete needs of data-at-rest, powers real-time customer applications and delivers robust analytics that accelerate decision making and innovation.

Page 15: 2016-10 Open Source Databases - Lawrence - v2.1.pdf

“By 2018, more than 70% of new in-house applications will be developed on an OSDBMS, and 50% of existing commercial RDBMS instances will have been converted or will be in process”*

*Gartner - The State of Open Source RDBMSs, 2015, by Donald Feinberg and Merv Adrian, published April 21, 2015.

Page 16: 2016-10 Open Source Databases - Lawrence - v2.1.pdf

Trademarks and notes

IBM Corporation 2016

• IBM, the IBM logo and ibm.com are registered trademarks, and other company, product, or service names may be trademarks or service marks of International Business Machines Corporation in the United States, other countries, or both. A current list of IBM trademarks is available on the web at “Copyright and trademark information” at www.ibm.com/legal/copytrade.shtml.

• Other company, product, and service names may be trademarks or service marks of others.

• References in this publication to IBM products or services do not imply that IBM intends to make them available in all countries in which IBM operates.

• IBM and IBM Credit LLC do not, nor intend to, offer or provide accounting, tax or legal advice to clients. Clients should consult with their own financial, tax and legal advisors. Any tax or accounting treatment decisions made by or on behalf of the client are the sole responsibility of the customer.

• IBM Global Financing offerings are provided through IBM Credit LLC in the United States, IBM Canada Ltd. in Canada, and other IBM subsidiaries and divisions worldwide to qualified commercial and government clients. Rates and availability are based on a client’s credit rating, financing terms, offering type, equipment type and options, and may vary by country. Some offerings are not available in certain countries. Other restrictions may apply. Rates and offerings are subject to change, extension or withdrawal without notice.

Page 17: 2016-10 Open Source Databases - Lawrence - v2.1.pdf

Welcome to the Waitless World.