Columnstore Indexes Deep introduction into columnar storage and indexes in SQL Server 2012 Denis Reznik

SqlSaturday199 - Columnstore Indexes


DESCRIPTION

In-memory features are among the most promising trends in high-performance data processing. Columnstore indexes are one such feature, and even with their restrictions they can speed up your queries several times over. How can you get more out of this feature? In which situations should you use it? Which internal mechanisms make this possible? This session answers these questions.


Page 1: SqlSaturday199 - Columnstore Indexes

Columnstore Indexes

Deep introduction into columnar storage and indexes in SQL Server 2012

Denis Reznik

Page 2: SqlSaturday199 - Columnstore Indexes

Sponsors

Page 3: SqlSaturday199 - Columnstore Indexes

About me

• Denis Reznik
• Kiev, Ukraine
• Database Architect at The Frayman Group
• Microsoft MVP
• Community enthusiast

Page 4: SqlSaturday199 - Columnstore Indexes

Agenda

• Columnar storage
• Creation of Columnstore index
• Usage scenarios and limitations
• Performance accelerators
• Columnstore Storage internals
• Columnstore Execution mode internals
• Columnstore index maintenance
• Columnstore Future (actually Present :)

Page 5: SqlSaturday199 - Columnstore Indexes

Row Store and Column Store

• In a row store, data is stored tuple by tuple.
• In a column store, data is stored column by column.

Page 6: SqlSaturday199 - Columnstore Indexes

Row Store and Column Store

Most queries do not process all the attributes of a particular relation.

SELECT c.Name, c.Address
FROM Customers c
WHERE c.City = 'Sofia'

[Figure: columns of the Customers table: id, name, city, state, age, address]

Page 7: SqlSaturday199 - Columnstore Indexes

Creating a columnstore index

T-SQL

SSMS
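As a rough sketch of the T-SQL route, the statement below creates a nonclustered columnstore index, the only columnstore type available in SQL Server 2012. The dbo.Purchase table and its Date column appear later in this deck; the index name and the CustomerId and Amount columns are assumptions made purely for illustration.

```sql
-- Hypothetical index name and column list; adjust to the real table.
-- In SQL Server 2012 the table becomes read-only once this index exists.
CREATE NONCLUSTERED COLUMNSTORE INDEX IX_Purchase_ColumnStore
ON dbo.Purchase ([Date], CustomerId, Amount);
```

A query reads only the segments of the columns it references, so it is usually reasonable to include every column that analytic queries might touch.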

Page 8: SqlSaturday199 - Columnstore Indexes

Usage scenarios and limitations

Primary focus of Columnstore Indexes is DW databases

In SQL Server 2012 Columnstore Indexes are read-only

Supported operators and data types are limited

Page 9: SqlSaturday199 - Columnstore Indexes

DEMO

Incredible Performance of Columnstore Indexes

Page 10: SqlSaturday199 - Columnstore Indexes

How Are These Performance Gains Achieved?

Two complementary technologies:

• Storage
  • Data is stored in a compressed columnar data format (stored by column) instead of row store format (stored by row).
• New “batch mode” execution
  • Vector-based query execution capability
  • Data can then be processed in batches rather than row by row
• Depending on filtering and other factors, a query may also benefit from “segment elimination”: bypassing million-row chunks (segments) of data, further reducing I/O

Page 11: SqlSaturday199 - Columnstore Indexes

Compression

• Patented VertiPaq algorithms, so there is no public information about exactly how the data is compressed
• But some information is available:
  • Dictionary encoding
  • Run-length encoding
  • Bit-vector encoding
  • …

Page 12: SqlSaturday199 - Columnstore Indexes

DEMO

Columnstore Indexes Internals

Page 13: SqlSaturday199 - Columnstore Indexes

Columnar storage structure

[Figure: pages in a row store versus pages in a column store, for columns C1 to C6]

Page 14: SqlSaturday199 - Columnstore Indexes

Column Segments and Dictionaries

[Figure: columns C1 to C6 divided into column segments (segment 1 … segment N), each covering a set of about 1M rows, plus dictionaries]

Page 15: SqlSaturday199 - Columnstore Indexes

DEMO

Columnstore Indexes – Segments and Dictionaries
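The demo code is not part of this transcript. A minimal sketch of how segments and dictionaries can be inspected uses the sys.column_store_segments and sys.column_store_dictionaries catalog views; dbo.Purchase is the table used later in the deck and is assumed here to already have a columnstore index.

```sql
-- Segments per column: how many there are, how many rows they hold,
-- and how much space they take on disk.
SELECT s.column_id,
       COUNT(*)            AS segment_count,
       SUM(s.row_count)    AS total_rows,
       SUM(s.on_disk_size) AS bytes_on_disk
FROM sys.column_store_segments AS s
JOIN sys.partitions AS p
    ON p.hobt_id = s.hobt_id
WHERE p.object_id = OBJECT_ID(N'dbo.Purchase')
GROUP BY s.column_id
ORDER BY s.column_id;

-- Dictionaries used by the dictionary-encoded columns.
SELECT d.column_id,
       d.dictionary_id,
       d.entry_count,
       d.on_disk_size
FROM sys.column_store_dictionaries AS d
JOIN sys.partitions AS p
    ON p.hobt_id = d.hobt_id
WHERE p.object_id = OBJECT_ID(N'dbo.Purchase')
ORDER BY d.column_id, d.dictionary_id;
```

Comparing bytes_on_disk across columns gives a quick impression of which columns compress well.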

Page 16: SqlSaturday199 - Columnstore Indexes

Memory management

SELECT C2, SUM(C4)
FROM T
GROUP BY C2;

[Figure: column segments of T.C1, T.C2, T.C3 and T.C4; the query above references only T.C2 and T.C4]

• Memory management is automatic

• Columnstore is persisted on disk

• Needed columns are fetched into memory

• A columnstore segment is the unit of data transfer between disk and memory

Page 17: SqlSaturday199 - Columnstore Indexes

Batch mode processing

Process ~1000 rows at a time

Vector operators implemented

Greatly reduced CPU time (7 to 40X)

[Figure: a batch object consisting of column vectors and a bitmap of qualifying rows]

Page 18: SqlSaturday199 - Columnstore Indexes

Segment Elimination

column_id  segment_id  min_data_id  max_data_id
1          1           20120101     20120131
1          2           20120115     20120215
1          3           20120201     20120228

• Segment (rowgroup) = 1 million row chunk
• Min, Max kept for each column in a segment
• Scans can skip segments based on this info

select Date, count(*)
from dbo.Purchase
where Date >= '20120201'
group by Date

Segment 1 is skipped: its max_data_id (20120131) is below the filter value 20120201.

Page 19: SqlSaturday199 - Columnstore Indexes

DEMO

Segment Elimination
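The demo itself is not included in the transcript. One way to reason about which segments a filter could skip is to compare the predicate with the min/max metadata stored for each segment. The sketch below assumes, as the values on the previous slide suggest, that column_id 1 is the Date column and that it holds integer keys in yyyymmdd form; for other data types min_data_id and max_data_id are encoded values and cannot be compared this directly.

```sql
-- Filter value taken from the query on the previous slide.
DECLARE @filter_value bigint = 20120201;

SELECT s.segment_id,
       s.row_count,
       s.min_data_id,
       s.max_data_id,
       CASE
           WHEN s.max_data_id < @filter_value THEN 'can be skipped'
           ELSE 'must be scanned'
       END AS elimination
FROM sys.column_store_segments AS s
JOIN sys.partitions AS p
    ON p.hobt_id = s.hobt_id
WHERE p.object_id = OBJECT_ID(N'dbo.Purchase')
  AND s.column_id = 1   -- assumption: the Date column
ORDER BY s.segment_id;
```

Segment ranges overlap less when the data is loaded in Date order, which gives the engine more segments to skip.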

Page 20: SqlSaturday199 - Columnstore Indexes

Maintaining Data in a Columnstore Index

Once built, the table becomes “read-only” and INSERT/UPDATE/DELETE/MERGE is no longer allowed

ALTER INDEX REBUILD / REORGANIZE not allowed

How can I modify index data?

• Drop columnstore index / make modifications / add columnstore index
• UNION ALL (but be sure to validate performance)
• Partition switches (IN and OUT)
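The three options above might look roughly like this in T-SQL. All object names other than dbo.Purchase and its Date column (the index, the CustomerId and Amount columns, the Purchase_Delta and Purchase_Staging tables, the partition number) are hypothetical and only serve the illustration.

```sql
-- Option 1: drop the columnstore index, modify the data, recreate the index.
DROP INDEX IX_Purchase_ColumnStore ON dbo.Purchase;

INSERT INTO dbo.Purchase ([Date], CustomerId, Amount)
VALUES (20120301, 42, 19.99);

CREATE NONCLUSTERED COLUMNSTORE INDEX IX_Purchase_ColumnStore
ON dbo.Purchase ([Date], CustomerId, Amount);

-- Option 2: keep new rows in a small updatable rowstore table and
-- combine both tables at query time.
SELECT p.[Date], COUNT(*) AS purchases
FROM (
    SELECT [Date] FROM dbo.Purchase        -- read-only, columnstore-indexed
    UNION ALL
    SELECT [Date] FROM dbo.Purchase_Delta  -- recent rows, no columnstore index
) AS p
GROUP BY p.[Date];

-- Option 3: load a staging table, build a matching columnstore index on it,
-- then switch it in. Assumes dbo.Purchase is partitioned, partition 4 is
-- empty, and the staging table matches its structure, indexes and filegroup
-- and has a CHECK constraint matching the partition boundaries.
ALTER TABLE dbo.Purchase_Staging
SWITCH TO dbo.Purchase PARTITION 4;
```

Partition switching is a metadata operation, so it tends to be the least disruptive of the three for large fact tables.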

Page 21: SqlSaturday199 - Columnstore Indexes

Columnstore Index Future

Actually, it has already arrived:

• Columnstore indexes can be clustered (in SQL Server 2014)
• Clustered columnstore indexes are updatable (in SQL Server 2014)
• Updated data (deltas) is stored in a rowstore until a segment can be created
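A minimal sketch of the SQL Server 2014 behaviour, assuming dbo.Purchase is a heap with no other indexes or constraints (in that release a clustered columnstore index has to be the only index on the table). The index name and the CustomerId and Amount columns are hypothetical.

```sql
-- Requires SQL Server 2014 or later.
CREATE CLUSTERED COLUMNSTORE INDEX CCI_Purchase
ON dbo.Purchase;

-- The table stays updatable; a small insert like this lands in a
-- rowstore delta store first.
INSERT INTO dbo.Purchase ([Date], CustomerId, Amount)
VALUES (20140301, 42, 19.99);

-- Row groups in state OPEN are delta stores still accepting rows;
-- COMPRESSED row groups are already stored as column segments.
SELECT row_group_id,
       state_description,
       total_rows,
       deleted_rows
FROM sys.column_store_row_groups
WHERE object_id = OBJECT_ID(N'dbo.Purchase')
ORDER BY row_group_id;
```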

Page 22: SqlSaturday199 - Columnstore Indexes

Summary

• Columnar storage
• Columnstore Performance Demo
• Creation of Columnstore index
• Usage scenarios and limitations
• Performance accelerators
• Columnstore Storage internals
• Columnstore Execution mode internals
• Columnstore index maintenance
• Columnstore Future (actually Present :)

Page 23: SqlSaturday199 - Columnstore Indexes

Sponsors

Page 24: SqlSaturday199 - Columnstore Indexes

Thank you!

Denis Reznik
Twitter: @denisreznik
Email: [email protected]
Blog (in Russian): http://reznik.uneta.com.ua
Facebook: https://www.facebook.com/denis.reznik.5
LinkedIn: http://ua.linkedin.com/pub/denis-reznik/3/502/234