
Data Management Systems

• Storage Management

• Memory hierarchy

• Segments and file storage

• Database buffer cache

• Storage techniques in context

• Basic principles

• Blocks instead of pages

• Tablespaces, segments, extents

• Updates, free space

Gustavo Alonso

Institute of Computing Platforms

Department of Computer Science

ETH Zürich Storage - Segments and File storage 1


Storage in databases

• Remember the two key guarantees provided by a database in regard to storage:

  • Data is persistent

  • Data is recoverable

  • Even when failures occur!

• Add as well the following property:

  • Physical data independence

• Taken together, these three aspects play a big role in how databases store data: the properties must be maintained while still achieving the necessary performance.


Storage Management

[Figure: layered view of storage management. Layers: Application; Logical view (logical data); Record Interface; Record Access; Page access; File Access; Physical storage. Data handled between layers: Queries, Transactions (SQL); Relations, views; Logical records (tuples); Physical records; Pages in memory; Blocks, files, segments. Further labels: Logical data (tables, schemas); Access Paths; Physical data in memory; Page structure; Storage allocation.]

Disclaimers

• As usual:

  • Standard database architectures are based on slow hard disk drives (HDD) with a high latency for seek operations (random access)

  • Assuming most of the database is not in memory

• Today, some things are different:

  • More main memory available

  • Different storage media (SSD, NVM, network-attached storage)

• The principles remain the same and illustrate the underlying problem rather than a particular implementation


Explanations using a real system

• Many of the explanations that follow are based on Oracle’s database

  • Cover all the basics

  • Provide a good example of how a real system works

  • Demonstrate the many tuning knobs available in a database

  • Prove why some people consider databases too complicated (more about this later in the course)

• Ideas and concepts are nevertheless generic and they are somewhat similar across systems

• Logical storage:

  • https://docs.oracle.com/en/database/oracle/oracle-database/19/cncpt/logical-storage-structures.html#GUID-4AF2D61A-8675-4D48-97A4-B20F401ADA16

  • https://docs.oracle.com/cd/B19306_01/server.102/b14220/logical.htm

• Disk storage:

  • https://docs.oracle.com/cd/B19306_01/server.102/b14220/physical.htm

  • https://docs.oracle.com/en/database/oracle/oracle-database/19/cncpt/physical-storage-structures.html#GUID-008A1F08-9C75-4E9F-A70B-41FB942C60B4


Problem statement

• A database is doing many things at the same time (in the same way an Operating System is managing many different user processes at the same time)

• Each “thing” (a query, a system process, a database component) active at any point in time needs its own logical view of the data (and correspondingly, of memory, and disk). This is the same as the OS giving a process the impression it is alone in the machine.

• A database engine creates such virtual, logical views of the system using its own mechanisms (which are different from those of the OS)


Logical and Physical storage in Oracle 19

• Entity-relationship diagram (crow’s foot implies one-to-many)

• Logical:

  • Tablespaces

  • Segments

  • Extent

  • Block

• Physical:

  • Data File

  • OS block


https://docs.oracle.com/en/database/oracle/oracle-database/19/cncpt/logical-storage-structures.html#GUID-13CE5EDA-8C66-4CA0-87B5-4069215A368D


Tablespaces

• A tablespace is a logical data unit in the database:

  • Schema related:

    • A table

    • An index

    • Several tables (clustered tables)

  • Engine related:

    • Data structures for the database engine (result buffers, undo buffers, etc.)

• A tablespace provides a logical representation of the principle of spatial locality (keep together what belongs together)

  • Does not necessarily mean all data is contiguous

  • It means all the information and all the data of a tablespace is under the same umbrella

• Space (memory/disk) is allocated to tablespaces


Segments, Extents, and Blocks (example)


https://docs.oracle.com/en/database/oracle/oracle-database/19/cncpt/logical-storage-structures.html#GUID-13CE5EDA-8C66-4CA0-87B5-4069215A368D


Generalizing

• A given logical object in a database (a table, an index) is “stored” in a tablespace

• A tablespace is organized into segments (each schema object has a segment)

• Segments have space allocated to them in the form of extents. A segment can have several extents

• Extents are sets of contiguously allocated data blocks. Each extent is mapped to exactly one data file (and a data file typically corresponds to one file of the operating system)

• Data blocks are the smallest allocation unit of space (not necessarily a page, databases typically use blocks larger than an OS page). The size of the blocks is a tunable parameter.
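The hierarchy above can be sketched as a small data model. This is only an illustration in Python, not how Oracle or any other engine represents these structures internally; the class names, the 8 KB block size, and the file names are assumptions made for the example.

```python
from dataclasses import dataclass, field

BLOCK_SIZE = 8192  # bytes; assumed value, the block size is a tunable parameter


@dataclass
class Extent:
    data_file: str    # an extent lives in exactly one data file
    first_block: int  # position of its first block inside that file
    num_blocks: int   # blocks first_block .. first_block + num_blocks - 1 are contiguous


@dataclass
class Segment:
    schema_object: str                           # e.g. a table or an index
    extents: list = field(default_factory=list)  # a segment can have several extents

    def size_in_bytes(self) -> int:
        return sum(e.num_blocks for e in self.extents) * BLOCK_SIZE


@dataclass
class Tablespace:
    name: str
    segments: dict = field(default_factory=dict)  # schema object name -> Segment

    def add_segment(self, schema_object: str) -> Segment:
        segment = Segment(schema_object)
        self.segments[schema_object] = segment
        return segment


# Hypothetical example: one table whose segment has two extents in two data files
ts = Tablespace("users")
orders = ts.add_segment("ORDERS")
orders.extents.append(Extent("users01.dbf", first_block=0, num_blocks=8))
orders.extents.append(Extent("users02.dbf", first_block=128, num_blocks=16))
```

Higher layers only hold a reference to the segment; extents can be added or moved without the table's name or tablespace changing.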


Translating

• Tablespace = keep together what needs to be kept together

• Segment = an object in the schema (or part of an object if it is partitioned)

• Extent = groups of contiguously allocated pages

• Blocks = glorified pages

• Why so complicated?

  • Tablespaces = logical locality

  • Segments = allocate (virtual) space to objects

  • Extents = allocate contiguous physical space

  • Blocks = the unit of space allocation


More explanations and analogies

• Tablespaces provide a logical unit to refer to the storage of well defined entities (a table, an index, a buffer, etc.)

• Virtual space is allocated to a tablespace in a segment. A segment acts as a form of virtual memory where everything belonging to the same entity appears contiguous and can be treated as a single unit

• Actual space is allocated to a segment through extents. Extents are sets of blocks that are physically contiguous on storage and can be allocated as a whole

• The space in a segment is divided into blocks which typically correspond to several OS pages.


Segments I

• A segment allocates the equivalent of virtual memory to a tablespace

• Common setup:

  • Tablespace -> table

  • Tablespace has one segment

  • Segment has one or more extents

• These structures, like all examples that follow, are created to provide physical data independence. Higher layers only need to know in which tablespace a table is. This allows the engine to change everything below by simply changing the pointers to the segment, to the extents, etc.


Segments II

• The mapping of a segment to a tablespace simplifies the manipulation of database entities:

  • CREATE TABLE T

  • DROP TABLE T

  • CREATE INDEX I ON table_name (attributes)

  • DROP INDEX I

• If we need more space for a table or an index, the table is still referred to by its segment, but the segment indexes as many extents as needed.

• If we partition a table, each partition has its own segment but the table is still referred to by its tablespace


Extents

• Extents provide optimal access by ensuring logically contiguous allocation of blocks (blocks can end up non-contiguous on disk because of RAID, for instance)

• An extent acts as a form of virtual address space:

  • Allocate and release blocks to the extent

  • Easy sequential search

  • Can be dropped as a unit

[Figure: extents inside a data file. The big square is a data file; the rest of the square is space available in the data file. Source: https://docs.oracle.com/en/database/oracle/oracle-database/19/cncpt/logical-storage-structures.html#GUID-13CE5EDA-8C66-4CA0-87B5-4069215A368D]

Dynamic extents

• When a segment is created, it is allocated an extent (how big? => tunable parameter)

• If more space is needed, another extent is created

• The new extent is not necessarily contiguous with the previous one and can be on a different data file


The big square is a data file!


Why extents?

• Better than the alternatives:

• Static file mapping: segment = (starting address, size)

  • Easy to maintain

  • Highly efficient (performance)

  • Poor utilization (space)

  • No flexibility

• Dynamic extents (in between the two alternatives)

• Dynamic block mapping

  • Difficult to maintain

  • Non-contiguous

  • Poor performance

  • Maximum flexibility

Why extents?

• Databases must optimize along many dimensions.

• A static file mapping is very easy to manage but induces fragmentation and provides no flexibility

• Dynamic block mapping is extremely flexible but data is not contiguous and is expensive to maintain

• Extents are a compromise:

  • An extent provides a static mapping to a set of blocks (like a static file mapping)

  • When more space is needed, extents are dynamically allocated (like with dynamic block mapping)
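To make the compromise concrete, here is a minimal Python sketch of how a block number relative to a segment could be translated through an extent directory. The directory layout, the file names, and the function `locate` are assumptions for illustration, not the mechanism of a specific engine.

```python
from bisect import bisect_right

# Hypothetical extent directory of one segment: (data_file, first_block, num_blocks)
extent_directory = [
    ("file_a", 100, 8),
    ("file_a", 300, 8),
    ("file_b", 0, 16),
]

# Logical block number at which each extent starts (prefix sums of the extent sizes)
starts = []
total_blocks = 0
for _, _, num_blocks in extent_directory:
    starts.append(total_blocks)
    total_blocks += num_blocks


def locate(logical_block):
    """Translate a segment-relative block number into (data_file, physical_block)."""
    if not 0 <= logical_block < total_blocks:
        raise IndexError("block outside the segment")
    i = bisect_right(starts, logical_block) - 1     # dynamic part: find the extent
    data_file, first_block, _ = extent_directory[i]
    return data_file, first_block + (logical_block - starts[i])  # static part: offset
```

Inside an extent the translation is a simple offset, as with a static file mapping; only the choice of extent requires a lookup, and the directory has one entry per extent rather than one per block.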


Managing space in extents

• Extents are collections of blocks.

  • How large should an extent be?

  • What do we do when we need more extents?

• Modern databases offer many tunable parameters to control extents

  • Uniform allocation: all extents are of the same size (tunable) and when more space is needed, another extent is added

  • Automatic: when more space is needed, a new extent is added and its size is optimized according to some criterion.

• Typical database design trade-off: we want a very large extent to have the data contiguous but that induces fragmentation and it is not efficient. Instead, allocate groups of contiguous blocks as needed.


Allocating and sizing new extents

• A good way to allocate new extents is to increase their size exponentially (growth factor of about 1.25)

• Better to allocate a bit more than to be constantly allocating new extents (if a table grows, it is likely to keep growing)

• Exponential growth bounds the size of the extent directory (even a large segment needs only a few extents)
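The effect on the extent directory can be checked with a small calculation. The Python sketch below counts how many extents are needed to reach a target size under uniform and under exponential sizing; the first extent of 8 blocks and the target of 100,000 blocks are arbitrary values chosen for the example.

```python
import math


def extents_needed(target_blocks, first_extent, growth):
    """Count the extents allocated until the segment holds at least target_blocks."""
    size, total, count = first_extent, 0, 0
    while total < target_blocks:
        total += size
        count += 1
        size = math.ceil(size * growth)  # the next extent grows by the given factor
    return count


# Hypothetical segment that grows to 100,000 blocks, starting with an 8-block extent
uniform = extents_needed(100_000, first_extent=8, growth=1.0)
exponential = extents_needed(100_000, first_extent=8, growth=1.25)
print(uniform, exponential)
```

With uniform extents the directory needs one entry per 8 blocks (12,500 entries here), while with a growth factor of 1.25 the number of entries grows only logarithmically with the segment size and stays in the tens.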


Blocks vs OS pages

• Operating systems deal with pages to implement virtual memory

• Databases typically store a lot of data:

  • A table, an index might span many OS pages

  • Allocating space page by page at OS granularity is too much overhead

  • If one OS page belongs to a table, it is likely we will also be interested in the other pages allocated to that table (locality)

• Hence, databases use blocks rather than OS pages

• This is why, although most databases today use the OS I/O system for writing to disk, they manage their own files and memory representation

• Later on we will see the structure of database pages in memory


Block structure: slotted pages

• A block is structured as follows:

• Header: address and type of segment (index, table, etc.)

• Table directory: schema of the table stored in the block

• Row directory: pointers to the actual tuples stored in the block

• Free space

• Row data (tuples) stored in the block

• The directory grows downwards, the space for tuples is used upwards


Slotted pages

• The row directory makes it possible to insert tuples anywhere in the block (and to change their position) while maintaining a simple addressing scheme:

(block, slot)

• How the space within a block is managed depends on many optimizations

• We will look into the block structure in more detail later on
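Until then, a toy version of a slotted block may help to see why the (block, slot) address survives a reorganization. The Python class below is only a sketch under simplifying assumptions: no header or table directory, a fixed cost of 4 bytes per directory entry, and names (`SlottedBlock`, `compact`) invented for the example.

```python
class SlottedBlock:
    """Toy slotted block: row directory at the front, tuple data filled from the back."""

    SLOT_BYTES = 4  # assumed size of one row directory entry

    def __init__(self, size=256):
        self.data = bytearray(size)
        self.slots = []        # row directory: slot -> (offset, length), None if deleted
        self.free_end = size   # tuple data occupies data[free_end:]

    def free_space(self):
        return self.free_end - len(self.slots) * self.SLOT_BYTES

    def insert(self, payload):
        if len(payload) + self.SLOT_BYTES > self.free_space():
            raise MemoryError("not enough free space in the block")
        self.free_end -= len(payload)
        self.data[self.free_end:self.free_end + len(payload)] = payload
        self.slots.append((self.free_end, len(payload)))
        return len(self.slots) - 1  # the slot number is the tuple's address in the block

    def read(self, slot):
        offset, length = self.slots[slot]
        return bytes(self.data[offset:offset + length])

    def delete(self, slot):
        self.slots[slot] = None  # the directory entry stays, the space becomes a hole

    def compact(self):
        """Move live tuples together; only the offsets in the directory change."""
        live = [(s, self.read(s)) for s, entry in enumerate(self.slots) if entry is not None]
        self.free_end = len(self.data)
        for s, payload in live:
            self.free_end -= len(payload)
            self.data[self.free_end:self.free_end + len(payload)] = payload
            self.slots[s] = (self.free_end, len(payload))


block = SlottedBlock()
a = block.insert(b"alice")
b = block.insert(b"bob")
c = block.insert(b"carol")
block.delete(b)
block.compact()  # "carol" moves inside the block but is still slot c
```

After compaction the tuple stored in slot `c` sits at a different offset, yet `block.read(c)` still returns it, which is the point of addressing tuples through the row directory.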


Optimizing the use of blocks I

• Percentage free (PCTFREE)

  • Determines how much space in each block is reserved for updating tuples instead of using it for storing new tuples

  • This is needed because an update can result in a bigger tuple than the original one

UPDATE T
SET Address = 'AStreetWithAVeryLongName'
WHERE LegiNr = 12345678


Optimizing the use of Blocks II

• Percentage used (PCTUSED)

  • Determines how much space needs to be free in a block before the free space can be used to insert new tuples

• Blocks are unavailable to inserting new tuples until they have the given amount of free space

• It is needed because if updates can make tuples smaller (freeing up space), they can also make them bigger (needing space). The combination of both parameters avoids thrashing on the page
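The interplay of the two thresholds can be sketched as follows. This Python model is a simplification written for illustration: the block size of 1000 bytes, the values 20 and 40, and the class `Block` are assumptions, and a real engine tracks more state than a single counter.

```python
class Block:
    """Toy model of a block governed by a percentage-free and a percentage-used threshold."""

    def __init__(self, size=1000, pctfree=20, pctused=40):
        self.size, self.used = size, 0
        self.pctfree, self.pctused = pctfree, pctused
        self.accepts_inserts = True

    def insert(self, nbytes):
        limit = self.size * (100 - self.pctfree) // 100  # the rest is reserved for updates
        if not self.accepts_inserts or self.used + nbytes > limit:
            self.accepts_inserts = False  # block stops taking new tuples
            return False
        self.used += nbytes
        return True

    def delete(self, nbytes):
        self.used -= nbytes
        if self.used < self.size * self.pctused // 100:
            self.accepts_inserts = True   # enough space was freed to take inserts again


blk = Block()
first = blk.insert(700)    # fits below the 80% limit
second = blk.insert(200)   # would eat into the reserved 20%: rejected, block closes
blk.delete(100)            # 600 bytes used, still above 40%: block stays closed
third = blk.insert(50)
blk.delete(300)            # 300 bytes used, below 40%: block reopens
fourth = blk.insert(50)
```

The gap between the two thresholds is what prevents thrashing: a block that has just filled up does not reopen after a single small deletion.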


Fragmentation within blocks

• Like any unit of storage, blocks can suffer from fragmentation. However, compaction is a very expensive procedure

• Compaction is often done only when the block has enough space for an INSERT or UPDATE but the space is not contiguous. Otherwise, the available space is used without reorganizing the tuples in the block.

• An UPDATE might require moving a tuple from one block to another if the new size does not fit in the block where the tuple is. In such a case, the tuple is inserted into a new block and the original space is used to store a pointer to the new row. The ID of the tuple does not change (which means an indirection is needed to find the tuple).
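The indirection created by moving a tuple can be illustrated with a short Python sketch. Blocks are modeled as dictionaries, capacity is counted in characters, and the target block is fixed; all of these are assumptions made to keep the example small.

```python
CAPACITY = 20  # assumed number of characters a block can hold

# Each block maps slot -> ("row", payload) or ("moved", (block, slot))
blocks = {
    1: {0: ("row", "Short St")},
    2: {},
}


def used(block_no):
    return sum(len(value) for kind, value in blocks[block_no].values() if kind == "row")


def update(rid, new_payload):
    block_no, slot = rid
    kind, value = blocks[block_no][slot]
    if kind == "moved":                  # follow the pointer left by an earlier move
        return update(value, new_payload)
    if used(block_no) - len(value) + len(new_payload) <= CAPACITY:
        blocks[block_no][slot] = ("row", new_payload)   # fits: update in place
        return
    target = 2                           # hypothetical choice of a block with free space
    new_slot = len(blocks[target])
    blocks[target][new_slot] = ("row", new_payload)
    blocks[block_no][slot] = ("moved", (target, new_slot))  # original space keeps a pointer


def read(rid):
    block_no, slot = rid
    kind, value = blocks[block_no][slot]
    return read(value) if kind == "moved" else value


rid = (1, 0)
update(rid, "AStreetWithAVeryLongName")  # 24 characters do not fit in block 1
```

The tuple is still addressed as `(1, 0)`, but reading it now costs one extra hop to block 2.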


Where to find space?

• A segment contains one or more free lists

  • Not done at the extent level because the search for space would be more complex

• A free list contains pointers to blocks that have usable free space

• Using several free lists helps avoid contention when performing parallel inserts or updates

• The free list is updated as transactions execute INSERT, DELETE, or UPDATE statements using the rules established with PCTFREE and PCTUSED


Tables have no order

• Recall that tables have no order on the tuples?

  • A property of the relational model

  • A consequence of how data is organized

• Example with extents

[Figure: Table T divided into Part 1 to Part 4, and the same parts as they are placed in an extent]

SELECT * FROM T
ORDER BY …

The art of writing to disk

• Databases provide concurrency control and recovery.

  • Concurrency control: the database is correct even if several transactions modify the data at the same time

  • Recovery: the state of the database can be recovered in the event of failures

• This makes relational database engines unique and imposes a lot of design and performance constraints

• It is, nevertheless, one of the major advantages of database engines

• We will study concurrency control and recovery in more detail later on, here we just point out some of the implications in terms of dealing with storage.


Shadow paging

[Figure: a segment with a data list and a free list over an extent of pages 1 to 18. Initially the data list holds pages 1 to 7 and the free list holds pages 8 to 18.]

UPDATE T
SET attribute_1 = 1000
WHERE ID = 17

• Modifies a tuple in page 5

• Creates a copy of page 5 (5’) and puts it into page 8

• Does the update on the new page

• When the transaction commits, makes page 8 the correct page and marks the old one as free (the data list becomes 1 2 3 4 8 6 7 and page 5 goes to the free list)

Shadow Paging, pros and cons

• Shadow paging is good for:

  • Recovery: it is enough to make sure that only clean pages (committed) are written to disk and then the disk contains only clean data

  • Recovery: undo of an uncommitted transaction is easy, throw the page away

  • Concurrency Control: A transaction can use the copies it makes without interfering with queries which are reading the original copies (snapshot isolation)

• Shadow paging is bad because:

  • Page access and management is complicated (clean vs dirty pages, copy a page, refresh lists on the segment when transaction commits, etc.)

  • Every update ends up moving a page somewhere else, destroying locality
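The mechanism can be sketched in a few lines of Python. The class below is a toy model, not an engine implementation: pages are dictionaries, a transaction is represented by a plain dictionary from logical page to shadow page, and the page numbers follow the example with pages 1 to 7 in use and 8 to 18 free.

```python
class ShadowPagedSegment:
    """Toy shadow paging: logical pages point to physical pages, updates go to a copy."""

    def __init__(self, data_pages, free_pages):
        self.storage = {p: {} for p in list(data_pages) + list(free_pages)}
        self.data = {p: p for p in data_pages}  # logical page -> physical page
        self.free_list = list(free_pages)

    def read(self, logical, key):
        return self.storage[self.data[logical]].get(key)

    def write(self, txn, logical, key, value):
        if logical not in txn:                   # first write of this transaction
            shadow = self.free_list.pop(0)
            self.storage[shadow] = dict(self.storage[self.data[logical]])  # copy the page
            txn[logical] = shadow
        self.storage[txn[logical]][key] = value  # update only the copy

    def commit(self, txn):
        for logical, shadow in txn.items():
            self.free_list.append(self.data[logical])  # the old page is marked as free
            self.data[logical] = shadow                # the copy becomes the correct page

    def abort(self, txn):
        self.free_list.extend(txn.values())      # undo: throw the copies away


seg = ShadowPagedSegment(data_pages=range(1, 8), free_pages=range(8, 19))
txn = {}                                         # logical page -> shadow page
seg.write(txn, 5, "attribute_1", 1000)
seen_before_commit = seg.read(5, "attribute_1")  # readers still see the original page
seg.commit(txn)
```

The sketch also shows the locality problem: after the commit, logical page 5 lives in physical page 8, away from its former neighbours.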


When to use shadow paging

• Shadow paging makes sense when the overhead of creating a copy of the page and managing the copies is small enough

  • Main memory (e.g., use the Copy on Write feature of the OS)

  • Possibly with Flash and NVM


Delta Files

• Using delta files, before modifying a page, one makes a copy of the page

• The copy is stored in a delta file and can be used to undo the changes

• The update is performed directly on the original page

• The copy in the delta file can be discarded when no longer needed
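A minimal Python sketch of this scheme, with the old copy kept in the delta file, could look as follows. The page contents, the in-memory dictionaries, and the function names are assumptions for illustration only.

```python
import copy

pages = {5: {"ID": 17, "attribute_1": 1}}  # hypothetical contents of page 5
delta_file = {}                             # page number -> copy of the old page


def update(page_no, key, value):
    if page_no not in delta_file:           # copy the page before its first modification
        delta_file[page_no] = copy.deepcopy(pages[page_no])
    pages[page_no][key] = value             # update in place: the page keeps its location


def undo():
    for page_no, old_page in delta_file.items():
        pages[page_no] = old_page           # restore the old copy
    delta_file.clear()


def commit():
    delta_file.clear()                      # the old copies are no longer needed


update(5, "attribute_1", 1000)
after_update = pages[5]["attribute_1"]
undo()
after_undo = pages[5]["attribute_1"]
```

In contrast to shadow paging, the page never changes its position in the extent, so locality is preserved; the price is the extra copy written before each first modification.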

[Figure: an extent with pages 1 to 18 before and after the update. Page 5 is updated in place (5’) and the old copy of page 5 is stored in the delta file.]

Implementing delta files

• Delta files are used for many different purposes and in many different ways

  • Keep the old data in the delta file

    • Favors commits, simplifies undo, allows looking at older data

  • Keep the new data in the delta file

    • Favors aborts, allows delaying the propagation of updates

• Oracle

  • Used rollback segments (segments specifically used to store the old copy of the data)

  • Now it uses undo tablespaces


Summary Segments and file storage

• I/O is a big component of the performance of a database engine

• Many different architectures and possibilities

• Many opportunities for optimization and trade-offs

• Plays a big role in concurrency control and recovery (see later)

• Plays a big role in how data is accessed and manipulated (see later)

• Many details changing as storage evolves from disk to Flash to NVM to network attached storage …
