Datenbanksystemimplementierung
Prof. Dr. Viktor Leis
Professur für Datenbanken und Informationssysteme
DBMS Architecture
• so far, we mostly looked at the traditional database architecture
• developed between 1980 and 2010
• in the past decade many new systems have emerged that are very different from traditional ones
• many concepts are similar though
• change is mostly driven by hardware trends (e.g., multi-core CPUs, CPU caches, SIMD, NVMe SSDs)
• in this lecture, we will look at some historical and current developments
1
Information Retrieval P. BAXENDALE, Editor
A Relational Model of Data for Large Shared Data Banks
E. F. CODD IBM Research Laboratory, San Jose, California
Future users of large data banks must be protected from having to know how the data is organized in the machine (the internal representation). A prompting service which supplies such information is not a satisfactory solution. Activities of users at terminals and most application programs should remain unaffected when the internal representation of data is changed and even when some aspects of the external representation are changed. Changes in data representation will often be needed as a result of changes in query, update, and report traffic and natural growth in the types of stored information. Existing noninferential, formatted data systems provide users with tree-structured files or slightly more general network models of the data. In Section 1, inadequacies of these models are discussed. A model based on n-ary relations, a normal form for data base relations, and the concept of a universal data sublanguage are introduced. In Section 2, certain operations on relations (other than logical inference) are discussed and applied to the problems of redundancy and consistency in the user’s model.
KEY WORDS AND PHRASES: data bank, data base, data structure, data
organization, hierarchies of data, networks of data, relations, derivability,
redundancy, consistency, composition, join, retrieval language, predicate
calculus, security, data integrity
CR CATEGORIES: 3.70, 3.73, 3.75, 4.20, 4.22, 4.29
1. Relational Model and Normal Form
1.1. INTRODUCTION

This paper is concerned with the application of elementary relation theory to systems which provide shared access to large banks of formatted data. Except for a paper by Childs [1], the principal application of relations to data systems has been to deductive question-answering systems. Levien and Maron [2] provide numerous references to work in this area.
In contrast, the problems treated here are those of data independence-the independence of application programs and terminal activities from growth in data types and changes in data representation-and certain kinds of data inconsistency which are expected to become troublesome even in nondeductive systems.
Volume 13 / Number 6 / June, 1970
The relational view (or model) of data described in Section 1 appears to be superior in several respects to the graph or network model [3,4] presently in vogue for noninferential systems. It provides a means of describing data with its natural structure only-that is, without superimposing any additional structure for machine representation purposes. Accordingly, it provides a basis for a high level data language which will yield maximal independence between programs on the one hand and machine representation and organization of data on the other.
A further advantage of the relational view is that it forms a sound basis for treating derivability, redundancy, and consistency of relations-these are discussed in Section 2. The network model, on the other hand, has spawned a number of confusions, not the least of which is mistaking the derivation of connections for the derivation of relations (see remarks in Section 2 on the “connection trap”).
Finally, the relational view permits a clearer evaluation of the scope and logical limitations of present formatted data systems, and also the relative merits (from a logical standpoint) of competing representations of data within a single system. Examples of this clearer perspective are cited in various parts of this paper. Implementations of systems to support the relational model are not discussed.
1.2. DATA DEPENDENCIES IN PRESENT SYSTEMS

The provision of data description tables in recently developed information systems represents a major advance toward the goal of data independence [5,6,7]. Such tables facilitate changing certain characteristics of the data representation stored in a data bank. However, the variety of data representation characteristics which can be changed without logically impairing some application programs is still quite limited. Further, the model of data with which users interact is still cluttered with representational properties, particularly in regard to the representation of collections of data (as opposed to individual items). Three of the principal kinds of data dependencies which still need to be removed are: ordering dependence, indexing dependence, and access path dependence. In some systems these dependencies are not clearly separable from one another.
1.2.1. Ordering Dependence. Elements of data in a data bank may be stored in a variety of ways, some involving no concern for ordering, some permitting each element to participate in one ordering only, others permitting each element to participate in several orderings. Let us consider those existing systems which either require or permit data elements to be stored in at least one total ordering which is closely associated with the hardware-determined ordering of addresses. For example, the records of a file concerning parts might be stored in ascending order by part serial number. Such systems normally permit application programs to assume that the order of presentation of records from such a file is identical to (or is a subordering of) the
Communications of the ACM 377
2
The “Dinosaurs”
• the earliest commercially available relational systems were Oracle, IBM System R, Ingres (end of 1970s)
• in 1990s and 2000s Oracle, MS SQL Server, IBM DB2 were the dominant players (in terms of mind- and market share)
• general-purpose systems: OLTP and OLAP
  • behavior similar (all speak SQL), but different enough to make switching hard
  • similar internal architecture
  • stable, great functionality (ACID transactions, SQL)
  • sometimes hard to use (database administrator needed), many tuning knobs to get better performance
• database systems research was kind of boring
3
[Poster: “Genealogy of Relational Database Management Systems”, a timeline (1970s to 2010s) tracing versions, acquisitions, and discontinued branches of relational DBMSs, from System R, Berkeley Ingres, and Oracle through DB2, Sybase/SQL Server, Postgres/PostgreSQL, MySQL/MariaDB, Teradata, MonetDB/VectorWise, and in-memory systems such as SAP HANA and HyPer. Crossing lines have no special semantics.
Felix Naumann, Jana Bauckmann, Claudia Exeler, Jan-Peer Rudolph, Fabian Tschirschnitz. Contact: Hasso Plattner Institut, University of Potsdam, [email protected]. Design: Alexander Sandt Grafik-Design, Hamburg. Version 6.0, October 2018. https://hpi.de/naumann/projects/rdbms-genealogy.html]
4
OLAP vs. OLTP
• Online Analytical Processing (OLAP)
  • mostly reads
  • long-running queries (one transaction per query)
  • many full table scans
  • batch inserts
  • example: compute revenue per month for last year
  • benchmark: TPC-H (also TPC-DS)
• Online Transactional Processing (OLTP)
  • many point writes and reads
  • short-running transactions (multiple statements per transaction)
  • heavy reliance on indexes
  • example: order processing (online shop)
  • benchmark: TPC-C (also TPC-E)
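The OLAP example above (revenue per month) boils down to a full table scan feeding an aggregation, with no index involved. A minimal sketch in Python, over a made-up toy order table:

```python
from collections import defaultdict

# Toy order table: (month, revenue) rows for one year. Data is invented.
orders = [
    (1, 120.0), (1, 80.0), (2, 200.0), (3, 50.0), (3, 75.0), (12, 300.0),
]

def revenue_per_month(rows):
    """Full-scan aggregation: the typical OLAP access pattern."""
    totals = defaultdict(float)
    for month, revenue in rows:   # touches every row, no index used
        totals[month] += revenue
    return dict(totals)

print(revenue_per_month(orders))  # {1: 200.0, 2: 200.0, 3: 125.0, 12: 300.0}
```

An OLTP workload would instead do indexed point lookups and small writes inside short transactions, rather than scanning all rows.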
5
The Anatomy of a Dinosaur
feature                 technique
transaction isolation   2 Phase Locking
synchronization         lock coupling
large data sets         buffer management
durability              ARIES-style logging
indexing                B+tree
storage                 slotted pages (row-wise)
SQL                     iterator model (interpreter)
parallelization         Exchange operators
query optimization      cost-based DP
• assumption: the only thing that matters for performance is minimizing disk I/O operations
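The iterator model listed in the table above can be sketched in a few lines. This is a minimal illustration (operator names and the tuple format are invented, not any real engine's interface): each operator pulls one tuple at a time from its child, which is exactly the per-tuple interpretation overhead that later work attacks.

```python
# Minimal sketch of the iterator (Volcano) model: each operator exposes
# iteration and pulls tuples one at a time from its child.

class Scan:
    """Leaf operator: produces base-table tuples."""
    def __init__(self, rows):
        self.rows = rows
    def __iter__(self):
        yield from self.rows

class Select:
    """Filters tuples by a predicate, one tuple per call."""
    def __init__(self, child, predicate):
        self.child, self.predicate = child, predicate
    def __iter__(self):
        for t in self.child:
            if self.predicate(t):
                yield t

class Project:
    """Applies a per-tuple transformation."""
    def __init__(self, child, fn):
        self.child, self.fn = child, fn
    def __iter__(self):
        for t in self.child:
            yield self.fn(t)

rows = [(1, "a"), (2, "b"), (3, "c")]
plan = Project(Select(Scan(rows), lambda t: t[0] > 1), lambda t: t[1])
print(list(plan))  # ['b', 'c']
```

The cost of this design is one (virtual) call chain per tuple; when data fit on disk that was negligible next to I/O, which is why the assumption above held for so long.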
6
7
OLTP Through the Looking Glass, and What We Found There

Stavros Harizopoulos (HP Labs, Palo Alto, CA)
Daniel J. Abadi (Yale University, New Haven, CT)
Samuel Madden, Michael Stonebraker (Massachusetts Institute of Technology, Cambridge, MA)
{madden, stonebraker}@csail.mit.edu
ABSTRACT

Online Transaction Processing (OLTP) databases include a suite of features — disk-resident B-trees and heap files, locking-based concurrency control, support for multi-threading — that were optimized for computer technology of the late 1970’s. Advances in modern processors, memories, and networks mean that today’s computers are vastly different from those of 30 years ago, such that many OLTP databases will now fit in main memory, and most OLTP transactions can be processed in milliseconds or less. Yet database architecture has changed little.

Based on this observation, we look at some interesting variants of conventional database systems that one might build that exploit recent hardware trends, and speculate on their performance through a detailed instruction-level breakdown of the major components involved in a transaction processing database system (Shore) running a subset of TPC-C. Rather than simply profiling Shore, we progressively modified it so that after every feature removal or optimization, we had a (faster) working system that fully ran our workload. Overall, we identify overheads and optimizations that explain a total difference of about a factor of 20x in raw performance. We also show that there is no single “high pole in the tent” in modern (memory resident) database systems, but that substantial time is spent in logging, latching, locking, B-tree, and buffer management operations.
Categories and Subject Descriptors
H.2.4 [Database Management]: Systems — transaction processing; concurrency.

General Terms
Measurement, Performance, Experimentation.

Keywords
Online Transaction Processing, OLTP, main memory transaction processing, DBMS architecture.
1. INTRODUCTION

Modern general purpose online transaction processing (OLTP) database systems include a standard suite of features: a collection of on-disk data structures for table storage, including heap files and B-trees, support for multiple concurrent queries via locking-based concurrency control, log-based recovery, and an efficient buffer manager. These features were developed to support transaction processing in the 1970’s and 1980’s, when an OLTP database was many times larger than the main memory, and when the computers that ran these databases cost hundreds of thousands to millions of dollars.

Today, the situation is quite different. First, modern processors are very fast, such that the computation time for many OLTP-style transactions is measured in microseconds. For a few thousand dollars, a system with gigabytes of main memory can be purchased. Furthermore, it is not uncommon for institutions to own networked clusters of many such workstations, with aggregate memory measured in hundreds of gigabytes — sufficient to keep many OLTP databases in RAM.

Second, the rise of the Internet, as well as the variety of data intensive applications in use in a number of domains, has led to a rising interest in database-like applications without the full suite of standard database features. Operating systems and networking conferences are now full of proposals for “database-like” storage systems with varying forms of consistency, reliability, concurrency, replication, and queryability [DG04, CDG+06, GBH+00, SMK+01].

This rising demand for database-like services, coupled with dramatic performance improvements and cost reduction in hardware, suggests a number of interesting alternative systems that one might build with a different set of features than those provided by standard OLTP engines.

1.1 Alternative DBMS Architectures

Obviously, optimizing OLTP systems for main memory is a good idea when a database fits in RAM. But a number of other database variants are possible; for example:

• Logless databases. A log-free database system might either not need recovery, or might perform recovery from other sites in a cluster (as was proposed in systems like Harp [LGG+91], Harbor [LM06], and C-Store [SAB+05]).

• Single threaded databases. Since multi-threading in OLTP databases was traditionally important for latency hiding in the
SIGMOD’08, June 9–12, 2008, Vancouver, BC, Canada. Copyright 2008 ACM 978-1-60558-102-6/08/06.
8
OLTP Through the Looking Glass [SIGMOD 2008]
• even a decade ago, the working set of many applications fit into main memory
• research question: Where does time go in OLTP?
• approach: disable/rip out components step by step (+ additional micro-optimizations)
• use Shore system
  • open source storage engine
  • developed at University of Wisconsin in early 1990s
  • architecturally similar to Dinosaurs
  • the assumption is that the results should be similar too
9
General Setup
• single-core Pentium 4, 3.2 GHz
• Linux
• measure instructions, cycles
• use TPC-C (standard OLTP benchmark)
10
TPC-C Schema
Figure 3. TPC-C Schema. [Diagram: Warehouse (size W); District (size W × 10), 10 districts per warehouse; Customer (size W × 30k), 3k customers per district; History (size > W × 30k), ≥ 1 history record per customer; Stock (size W × 100k), 100k stocks per warehouse; Item (size 100k), W stocks per item; Order (size > W × 30k), ≥ 1 order per customer; Order-Line (size > W × 300k), 5-15 order-line entries per order; New-Order (size > W × 9k), 0 or 1 new orders per order]
11
TPC-C Transactions
New Order:
  begin
  for loop (10)
    Btree lookup (I), pin
  Btree lookup (D), pin
  Btree lookup (W), pin
  Btree lookup (C), pin
  update rec (D)
  for loop (10)
    Btree lookup (S), pin
    update rec (S)
    create rec (O-L)
    insert Btree (O-L)
  create rec (O)
  insert Btree (O)
  create rec (N-O)
  insert Btree (N-O)
  insert Btree 2ndary (N-O)
  commit

Payment:
  begin
  Btree lookup (D), pin
  Btree lookup (W), pin
  Btree lookup (C), pin
  update rec (C)
  update rec (D)
  update rec (W)
  create rec (H)
  commit

Figure 4. Calls to Shore’s methods for New Order and Payment transactions.
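The New Order control flow from Figure 4 can be approximated in a few lines. This is a hedged sketch using plain dicts as stand-in B-tree indexes; the table names, key layouts, and fields are invented for illustration and are not Shore's actual API.

```python
# Sketch of TPC-C New Order: lookups on Item/District/Warehouse/Customer,
# an order-id taken from the district, then per-line Stock updates and
# Order-Line inserts, followed by Order and New-Order inserts.

def new_order(db, w_id, d_id, c_id, items):
    for i_id, _qty in items:
        _ = db["item"][i_id]                         # for loop: Btree lookup (I)
    district = db["district"][(w_id, d_id)]          # Btree lookup (D)
    _ = db["warehouse"][w_id]                        # Btree lookup (W)
    _ = db["customer"][(w_id, d_id, c_id)]           # Btree lookup (C)
    o_id = district["next_o_id"]
    district["next_o_id"] += 1                       # update rec (D)
    for line_no, (i_id, qty) in enumerate(items):    # for loop over order lines
        stock = db["stock"][(w_id, i_id)]            # Btree lookup (S)
        stock["quantity"] -= qty                     # update rec (S)
        db["order_line"][(w_id, d_id, o_id, line_no)] = (i_id, qty)  # create + insert (O-L)
    db["order"][(w_id, d_id, o_id)] = {"c_id": c_id}   # create rec (O) + insert
    db["new_order"][(w_id, d_id, o_id)] = True         # create rec (N-O) + insert
    return o_id

db = {
    "warehouse": {1: {}},
    "district": {(1, 1): {"next_o_id": 3000}},
    "customer": {(1, 1, 42): {}},
    "item": {7: {}, 8: {}},
    "stock": {(1, 7): {"quantity": 100}, (1, 8): {"quantity": 50}},
    "order_line": {}, "order": {}, "new_order": {},
}
o_id = new_order(db, 1, 1, 42, [(7, 5), (8, 2)])
print(o_id, db["stock"][(1, 7)]["quantity"])  # 3000 95
```

In a real engine every dict access above is a B-tree traversal with pinning, latching, locking, and logging around it, which is exactly the overhead the paper measures.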
12
Instruction Breakdown
13
Transactions Per Second
• out-of-the-box Shore: 640
• disable log flushing: 1,700
• disable components: 12,700
• standalone C implementation: 46,500
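Simple arithmetic on these throughput numbers gives the cumulative speedups relative to out-of-the-box Shore:

```python
# Throughput (transactions per second) at each stage, from the slide above.
tps = [
    ("out-of-the-box Shore", 640),
    ("log flushing disabled", 1700),
    ("components disabled", 12700),
    ("standalone C implementation", 46500),
]
base = tps[0][1]
speedups = {name: round(v / base, 1) for name, v in tps}
print(speedups)
# the end-to-end gain is 46500 / 640, i.e. roughly 73x
```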
14
OLTP Through the Looking Glass: Conclusions
• traditional database implementation and architecture can be extremely inefficient
• 10× to 100× performance gains are achievable
• all components are slow (“no high pole in the tent”)
15
The End of an Architectural Era (It’s Time for a Complete Rewrite)
Michael Stonebraker
Samuel Madden Daniel J. Abadi
Stavros Harizopoulos MIT CSAIL
{stonebraker, madden, dna, stavros}@csail.mit.edu
Nabil Hachem AvantGarde Consulting, LLC
Pat Helland Microsoft Corporation
ABSTRACT

In previous papers [SC05, SBC+07], some of us predicted the end of “one size fits all” as a commercial relational DBMS paradigm. These papers presented reasons and experimental evidence that showed that the major RDBMS vendors can be outperformed by 1-2 orders of magnitude by specialized engines in the data warehouse, stream processing, text, and scientific database markets. Assuming that specialized engines dominate these markets over time, the current relational DBMS code lines will be left with the business data processing (OLTP) market and hybrid markets where more than one kind of capability is required. In this paper we show that current RDBMSs can be beaten by nearly two orders of magnitude in the OLTP market as well. The experimental evidence comes from comparing a new OLTP prototype, H-Store, which we have built at M.I.T., to a popular RDBMS on the standard transactional benchmark, TPC-C.
We conclude that the current RDBMS code lines, while attempting to be a “one size fits all” solution, in fact, excel at nothing. Hence, they are 25 year old legacy code lines that should be retired in favor of a collection of “from scratch” specialized engines. The DBMS vendors (and the research community) should start with a clean sheet of paper and design systems for tomorrow’s requirements, not continue to push code lines and architectures designed for yesterday’s needs.
1. INTRODUCTION

The popular relational DBMSs all trace their roots to System R from the 1970s. For example, DB2 is a direct descendent of System R, having used the RDS portion of System R intact in their first release. Similarly, SQL Server is a direct descendent of Sybase System 5, which borrowed heavily from System R. Lastly, the first release of Oracle implemented the user interface from System R.
All three systems were architected more than 25 years ago, when hardware characteristics were much different than today. Processors are thousands of times faster and memories are thousands of times larger. Disk volumes have increased enormously, making it possible to keep essentially everything, if one chooses to. However, the bandwidth between disk and main memory has increased much more slowly. One would expect this relentless pace of technology to have changed the architecture of database systems dramatically over the last quarter of a century, but surprisingly the architecture of most DBMSs is essentially identical to that of System R. Moreover, at the time relational DBMSs were conceived, there was only a single DBMS market, business data processing. In the last 25 years, a number of other markets have evolved, including data warehouses, text management, and stream processing. These markets have very different requirements than business data processing.
Lastly, the main user interface device at the time RDBMSs were architected was the dumb terminal, and vendors imagined operators inputting queries through an interactive terminal prompt. Now it is a powerful personal computer connected to the World Wide Web. Web sites that use OLTP DBMSs rarely run interactive transactions or present users with direct SQL interfaces. In summary, the current RDBMSs were architected for the business data processing market in a time of different user interfaces and different hardware characteristics. Hence, they all include the following System R architectural features:
• Disk oriented storage and indexing structures
• Multithreading to hide latency
• Locking-based concurrency control mechanisms
• Log-based recovery
Of course, there have been some extensions over the years, including support for compression, shared-disk architectures, bitmap indexes, support for user-defined data types and operators, etc. However, no system has had a complete redesign since its inception. This paper argues that the time has come for a complete rewrite. A previous paper [SBC+07] presented benchmarking evidence that the major RDBMSs could be beaten by specialized architectures by an order of magnitude or more in several application areas, including:
VLDB ’07, September 23-28, 2007, Vienna, Austria. Copyright 2007 VLDB Endowment, ACM 978-1-59593-649-3/07/09.
16
The End of an Architectural Era (It’s Time for a Complete Rewrite) [VLDB 2007]
• Stonebraker’s lesson:
  • existing code bases are hopeless, rewrite needed
  • specialized, simplified systems for OLTP, OLAP, text, etc.
  • let’s start lots of startups building specialized systems (OLTP: H-Store, OLAP: C-Store/Vertica, Data Integration/Cleaning: Tamr)
• German lesson:
  • general-purpose relational systems are kind of nice
  • DRAM is cheap (1 TB RAM for less than 50K EUR), let’s go in-memory only
  • SAP HANA, TUM HyPer
17
Modern Systems
• Column Stores for OLAP
  • Actian Vector (“Vectorwise”)
  • Microsoft Apollo (part of SQL Server)
  • IBM BLU
• OLTP (in-memory)
  • Microsoft Hekaton (part of SQL Server)
  • VoltDB
• OLTP and OLAP (in-memory)
  • SAP HANA
  • TUM HyPer
18
DBMS Evolution
feature                 old                            new
transaction isolation   2 Phase Locking                MVCC
synchronization         lock coupling                  optimistic lock coupling
large data sets         buffer management              pointer swizzling
durability              ARIES-style logging            scalable logging
indexing                B+tree                         B+tree/trie
storage                 slotted pages (row-wise)       column stores
SQL                     iterator model (interpreter)   compilation or vectorization
parallelization         Exchange operators             morsel-driven parallelism
query optimization      cost-based DP                  cost-based DP
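Vectorization, one of the "new" techniques in the table above, can be sketched as follows. This is a minimal illustration (function names and the batch size are invented, not any specific engine's API): operators consume and produce batches of column values, so interpretation overhead is paid once per vector instead of once per tuple.

```python
# Sketch of vectorized execution over a single column: each operator call
# processes a batch (vector) of values rather than one tuple.

VECTOR_SIZE = 1024

def scan(column, vector_size=VECTOR_SIZE):
    """Produce the column in fixed-size batches."""
    for i in range(0, len(column), vector_size):
        yield column[i:i + vector_size]

def select_gt(batches, threshold):
    """Filter each batch in one call; the per-call overhead is amortized."""
    for batch in batches:
        yield [v for v in batch if v > threshold]

def sum_all(batches):
    """Aggregate across all batches."""
    return sum(sum(batch) for batch in batches)

prices = list(range(10_000))                 # one column with 10k values
total = sum_all(select_gt(scan(prices), 9_000))
print(total)  # 9490500 (sum of 9001..9999)
```

Compared with the tuple-at-a-time iterator model from earlier slides, the inner loops here run over plain arrays, which is also what makes them amenable to SIMD in a real engine.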
19
Conclusions
• fast changes in hardware drive evolution of database systems
• many new techniques
• concepts stay similar, but need to be rethought
• big trend: cloud
• database systems can never be fast enough
20