50
Multi-thread Multi-thread sweep, sweep, backup and backup and restore restore

Multi-thread sweep, backup and restore · 3 Berlin 2019 Firebird 4 Introduction Big demand from users to speed up most time consuming regular maintenance operations: Backup Restore

  • Upload
    others

  • View
    5

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Multi-thread sweep, backup and restore · 3 Berlin 2019 Firebird 4 Introduction Big demand from users to speed up most time consuming regular maintenance operations: Backup Restore

Multi-thread Multi-thread sweep, sweep,

backup and backup and restorerestore

Page 2: Multi-thread sweep, backup and restore · 3 Berlin 2019 Firebird 4 Introduction Big demand from users to speed up most time consuming regular maintenance operations: Backup Restore

Firebird Conference 2019Berlin, 17-19 October

Page 3: Multi-thread sweep, backup and restore · 3 Berlin 2019 Firebird 4 Introduction Big demand from users to speed up most time consuming regular maintenance operations: Backup Restore

Berlin 2019 Berlin 2019 Firebird 4 Firebird 43

Introduction

● Big demand from users to speed up most time consuming regular maintenance operations:● Backup● Restore● Sweep

● Initial implementation based on Firebird 2.5 Classic● Firebird 2.5 Super Server is not suitable

● Front ported to the v3 codebase● Including Super Server, of course

● Available in HQbird 2020 (for Firebird 2.5 and 3.0)● Will be included into Firebird 4+

Page 4: Multi-thread sweep, backup and restore · 3 Berlin 2019 Firebird 4 Introduction Big demand from users to speed up most time consuming regular maintenance operations: Backup Restore

Berlin 2019 Berlin 2019 Firebird 4 Firebird 44

Introduction

● The good parallel implementation should, at least● Evenly distribute workload between workers● Avoid or minimize possible contentions for shared

resources (disk, memory, internal locking)● Minimize necessary coordination between workers and

task manager

Page 5: Multi-thread sweep, backup and restore · 3 Berlin 2019 Firebird 4 Introduction Big demand from users to speed up most time consuming regular maintenance operations: Backup Restore

Berlin 2019 Berlin 2019 Firebird 4 Firebird 45

Sweep

● How sweep works● Read each table in database● Cleanup unneeded record versions ● Move OIT marker on success

Page 6: Multi-thread sweep, backup and restore · 3 Berlin 2019 Firebird 4 Introduction Big demand from users to speed up most time consuming regular maintenance operations: Backup Restore

Berlin 2019 Berlin 2019 Firebird 4 Firebird 46

Sweep

● What can be run in parallel ?● Each parallel worker could handle (read and cleanup)

separate table

Table 1

Table 2Worker 2

Worker 1

Table 3

Table 4

Worker 3

Worker 4

Table 6

Table 7

Table 5

Table 8

Table 11

Table 10

Table 9 Table 12

time

Page 7: Multi-thread sweep, backup and restore · 3 Berlin 2019 Firebird 4 Introduction Big demand from users to speed up most time consuming regular maintenance operations: Backup Restore

Berlin 2019 Berlin 2019 Firebird 4 Firebird 47

Sweep

● What if there is few big tables and many small tables ?

Table 1

Table 2Worker 2

Worker 1

Table 3

Table 4

Worker 3

Worker 4

Table 6

Table 7

Table 5

Table 8

Table 11

Table 10

Table 9 Table 12

time

Page 8: Multi-thread sweep, backup and restore · 3 Berlin 2019 Firebird 4 Introduction Big demand from users to speed up most time consuming regular maintenance operations: Backup Restore

Berlin 2019 Berlin 2019 Firebird 4 Firebird 48

Sweep

● What if there is few big tables and many small tables ?● Big table could be handled by few parallel workers

Table 1

Table 2Worker 2

Worker 1

Table 3

Table 4

Worker 3

Worker 4

Table 6

Table 7

Table 5

Part of Table 8

Table 11

Table 10

Table 9 Table 12

Part T8

Part T8

Part T8

time

Page 9: Multi-thread sweep, backup and restore · 3 Berlin 2019 Firebird 4 Introduction Big demand from users to speed up most time consuming regular maintenance operations: Backup Restore

Berlin 2019 Berlin 2019 Firebird 4 Firebird 49

Sweep

● How to divide big table between few workers to minimize contention and coordination ?● Every worker could handle one data page and then ask

for a next (not handled) one– Almost fair distribution of workload– No contention for the same data pages– Some contention for the same pointer page– Coordinate with manager very often

Page 10: Multi-thread sweep, backup and restore · 3 Berlin 2019 Firebird 4 Introduction Big demand from users to speed up most time consuming regular maintenance operations: Backup Restore

Berlin 2019 Berlin 2019 Firebird 4 Firebird 410

Sweep

● How to divide big table between few workers to minimize contention and coordination ?● Every worker could handle few data pages and then ask

for a next (not handled) few pages– How much ?

Page 11: Multi-thread sweep, backup and restore · 3 Berlin 2019 Firebird 4 Introduction Big demand from users to speed up most time consuming regular maintenance operations: Backup Restore

Berlin 2019 Berlin 2019 Firebird 4 Firebird 411

Sweep

● How to divide big table between few workers to minimize contention and coordination ?● Every worker handle data pages from the same pointer

page and then ask for a next (not handled) pointer page– Workload distribution still fair enough– No contention for the same data pages– No contention for the same pointer page– Coordinate with manager not too often

Page 12: Multi-thread sweep, backup and restore · 3 Berlin 2019 Firebird 4 Introduction Big demand from users to speed up most time consuming regular maintenance operations: Backup Restore

Berlin 2019 Berlin 2019 Firebird 4 Firebird 412

Sweep

● Implementation details● Single attachment can’t be handled by concurrent threads

simultaneously● Every worker have its own private attachment and

transaction● Internal pool of worker attachments

– Per database and per server process– Limited by value of new configuration setting

MaxParallelWorkers– Created automatically when required– Works in the same server process– Closed automatically when last connection to the database

is gone

Page 13: Multi-thread sweep, backup and restore · 3 Berlin 2019 Firebird 4 Introduction Big demand from users to speed up most time consuming regular maintenance operations: Backup Restore

Berlin 2019 Berlin 2019 Firebird 4 Firebird 413

Sweep

● Usage● gfix -sweep -parallel 4 <database>

– Run sweep using 4 parallel attachments● 1 user attachment and 3 additional worker attachments

● New DPB tag isc_dpb_parallel_workers

Page 14: Multi-thread sweep, backup and restore · 3 Berlin 2019 Firebird 4 Introduction Big demand from users to speed up most time consuming regular maintenance operations: Backup Restore

Berlin 2019 Berlin 2019 Firebird 4 Firebird 414

Sweep

● Usage● Auto-sweep also could run in parallel mode

– New configuration setting ParallelWorkers

Page 15: Multi-thread sweep, backup and restore · 3 Berlin 2019 Firebird 4 Introduction Big demand from users to speed up most time consuming regular maintenance operations: Backup Restore

Berlin 2019 Berlin 2019 Firebird 4 Firebird 415

Sweep

● Test results● Big database

Test environment 1

Firebird version 2.5.9 HQBird

OS CentOS 6.7

Server ProLiant DL380 Gen9

CPU 2 x Intel(R) Xeon(R) CPU E5-2667 v3 @ 3.20GHz

Cores per socket 8

Logical CPU’s 32

RAM 96 GB

HDD 4xHDD SAS 15k RAID 10

Database 510 GB

Page 16: Multi-thread sweep, backup and restore · 3 Berlin 2019 Firebird 4 Introduction Big demand from users to speed up most time consuming regular maintenance operations: Backup Restore

Berlin 2019 Berlin 2019 Firebird 4 Firebird 416

Sweep

● Test results● Big database

1 2 4 8 16 24 32 48 640

20

40

60

80

100

120

140

160 152

101

65

51

3328 26 25 25

Sweep time, Firebird 2.5 SC

Workers

Tim

e, m

in

Page 17: Multi-thread sweep, backup and restore · 3 Berlin 2019 Firebird 4 Introduction Big demand from users to speed up most time consuming regular maintenance operations: Backup Restore

Berlin 2019 Berlin 2019 Firebird 4 Firebird 417

Backup

● How backup works● Read system tables and store user metadata in backup

file● Read user tables and store records in backup file

Page 18: Multi-thread sweep, backup and restore · 3 Berlin 2019 Firebird 4 Introduction Big demand from users to speed up most time consuming regular maintenance operations: Backup Restore

Berlin 2019 Berlin 2019 Firebird 4 Firebird 418

Backup

● What can be run in parallel ?● Parallel workers could read database independently, but

backup file should be written in correct order– Serialize workers when backup file is written

Read

ReadWorker 2

Worker 1

Read

Read

Worker 3

Worker 4

Read

Read

Read

Read

Write

Sync

time

Page 19: Multi-thread sweep, backup and restore · 3 Berlin 2019 Firebird 4 Introduction Big demand from users to speed up most time consuming regular maintenance operations: Backup Restore

Berlin 2019 Berlin 2019 Firebird 4 Firebird 419

Backup

● What can be run in parallel ?● Parallel workers could read database independently, but

backup file should be written in correct order– Move all write activity into another dedicated thread

Read

ReadReader 2

Reader 1

Read

Read

Reader 3

Reader 4

Read

Read

Read

Read

Writer Sync Sync

Write

Sync

time

Page 20: Multi-thread sweep, backup and restore · 3 Berlin 2019 Firebird 4 Introduction Big demand from users to speed up most time consuming regular maintenance operations: Backup Restore

Berlin 2019 Berlin 2019 Firebird 4 Firebird 420

Backup

● What can be run in parallel ?● Read and store metadata

– Could be done but● It will significantly complicate code● Amount of metadata usually much less than size of user data

Page 21: Multi-thread sweep, backup and restore · 3 Berlin 2019 Firebird 4 Introduction Big demand from users to speed up most time consuming regular maintenance operations: Backup Restore

Berlin 2019 Berlin 2019 Firebird 4 Firebird 421

Backup

● What can be run in parallel ?● Read and store user data

– Handle different tables by parallel workers● Backup file will contain mix of records from different tables● Requires change in backup file structure to allow restore to

handle such file● “Big table” problem as in sweep case

Page 22: Multi-thread sweep, backup and restore · 3 Berlin 2019 Firebird 4 Introduction Big demand from users to speed up most time consuming regular maintenance operations: Backup Restore

Berlin 2019 Berlin 2019 Firebird 4 Firebird 422

Backup

● What can be run in parallel ?● Read and store user data

– Parallel workers should handle different parts of the same table

– Requires a way to split table by parts● Ideally parts of the equal size

Page 23: Multi-thread sweep, backup and restore · 3 Berlin 2019 Firebird 4 Introduction Big demand from users to speed up most time consuming regular maintenance operations: Backup Restore

Berlin 2019 Berlin 2019 Firebird 4 Firebird 423

Backup

● How to split table for few parallel workers ?● Use ranges of primary\unique key values

– Not every table could have primary\unique key– Unknown in advance whole range of key values– Uneven distribution of key values– How to make ranges for character keys ?– How to make ranges for composite (multi-segment) keys ?

Page 24: Multi-thread sweep, backup and restore · 3 Berlin 2019 Firebird 4 Introduction Big demand from users to speed up most time consuming regular maintenance operations: Backup Restore

Berlin 2019 Berlin 2019 Firebird 4 Firebird 424

Backup

● How to split table by few parallel workers ?● Use ranges of data pages

– gbak works “outside” of the engine, it can’t address data pages directly

● Use ranges of RDB$DB_KEY values– Engine supports equality comparison only for

RDB$DB_KEY– Application (gbak) have no idea what data page is

addressed by given RDB$DB_KEY value– Need some support from the engine side

Page 25: Multi-thread sweep, backup and restore · 3 Berlin 2019 Firebird 4 Introduction Big demand from users to speed up most time consuming regular maintenance operations: Backup Restore

Berlin 2019 Berlin 2019 Firebird 4 Firebird 425

Backup

● Use ranges of RDB$DB_KEY values● New built-in function MAKE_DBKEY

– MAKE_DBKEY(relation_id, recnum)● Returns dbkey for record recnum

– MAKE_DBKEY(relation_id, recnum, dpnum)● Returns dbkey for recnum at data page dpnum

– MAKE_DBKEY(relation_id, recnum, dpnum, ppnum)● Returns dbkey for recnum at data page dpnum at pointer

page ppnum● Engine now supports all kind of comparisons with

RDB$DB_KEY (<, <=, >, >=, =, !=)

Page 26: Multi-thread sweep, backup and restore · 3 Berlin 2019 Firebird 4 Introduction Big demand from users to speed up most time consuming regular maintenance operations: Backup Restore

Berlin 2019 Berlin 2019 Firebird 4 Firebird 426

Backup

● How to split table for few parallel workers ?● Every worker handle records from the data pages from

the same pointer page and then ask for a next (not handled) pointer page

SELECT * FROM TABLE WHERE RDB$DB_KEY >= MAKE_DBKEY(:relId, 0, 0, :ppNum) AND RDB$DB_KEY < MAKE_DBKEY(:relId, 0, 0, :ppNum + 1)

Page 27: Multi-thread sweep, backup and restore · 3 Berlin 2019 Firebird 4 Introduction Big demand from users to speed up most time consuming regular maintenance operations: Backup Restore

Berlin 2019 Berlin 2019 Firebird 4 Firebird 427

Backup

● Backup consistency● gbak uses snapshot transaction to read user data in

consistent way● Every worker uses own attachment and transaction● All worker attachments should read the same data

despite of other activity in database● Need shared database snapshot

Page 28: Multi-thread sweep, backup and restore · 3 Berlin 2019 Firebird 4 Introduction Big demand from users to speed up most time consuming regular maintenance operations: Backup Restore

Berlin 2019 Berlin 2019 Firebird 4 Firebird 428

Backup

● Shared database snapshot● First introduced in Firebird 4 beta

– Based on new database snapshots architecture using commits order

● Re-implemented for Firebird 2.5 and Firebird 3 specially to support parallel backup

● Follows the same interface as of Firebird 4

Page 29: Multi-thread sweep, backup and restore · 3 Berlin 2019 Firebird 4 Introduction Big demand from users to speed up most time consuming regular maintenance operations: Backup Restore

Berlin 2019 Berlin 2019 Firebird 4 Firebird 429

Backup

● Usage● gbak -b -parallel 4 <database> <backup>

Page 30: Multi-thread sweep, backup and restore · 3 Berlin 2019 Firebird 4 Introduction Big demand from users to speed up most time consuming regular maintenance operations: Backup Restore

Berlin 2019 Berlin 2019 Firebird 4 Firebird 430

Backup

● Test results● Big database

1 2 4 8 16 24 32 48 640

50

100

150

200

250

300

350

400364

215

142

9983 76 70 59 56

Backup time, Firebird 2.5 SC

Workers

Tim

e, m

in

Page 31: Multi-thread sweep, backup and restore · 3 Berlin 2019 Firebird 4 Introduction Big demand from users to speed up most time consuming regular maintenance operations: Backup Restore

Berlin 2019 Berlin 2019 Firebird 4 Firebird 431

Backup

● Test results● Medium database

Test environment 2

Firebird version 2.5.9 HQBird, 3.0.5 HQBird

OS CentOS 6.7

Server ProLiant DL380 Gen9

CPU 2 x Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz

Cores per socket 6

Logical CPU’s 24

RAM 32 GB

HDD 4xHDD SAS 10k RAID 10

Database 42 GB

Page 32: Multi-thread sweep, backup and restore · 3 Berlin 2019 Firebird 4 Introduction Big demand from users to speed up most time consuming regular maintenance operations: Backup Restore

Berlin 2019 Berlin 2019 Firebird 4 Firebird 432

Backup

● Test results● Medium database

1 2 4 8 120

200

400

600

800

1000

1200

961

601

394330 308

Backup time, Firebird 2.5 SC

Workers

Tim

e, s

ec

Page 33: Multi-thread sweep, backup and restore · 3 Berlin 2019 Firebird 4 Introduction Big demand from users to speed up most time consuming regular maintenance operations: Backup Restore

Berlin 2019 Berlin 2019 Firebird 4 Firebird 433

Restore

● How restore works● Create new database● Read metadata and populate system tables● Read data and populate user tables● Activate (build) indices

Page 34: Multi-thread sweep, backup and restore · 3 Berlin 2019 Firebird 4 Introduction Big demand from users to speed up most time consuming regular maintenance operations: Backup Restore

Berlin 2019 Berlin 2019 Firebird 4 Firebird 434

Restore

● What can be run in parallel ?● Create new database

– no ● Read metadata and populate system tables

– not practical● Read data and populate user tables

– yes– probably, requires changes in backup format– not now, sorry

● Activate (build) indices– yes, exactly

Page 35: Multi-thread sweep, backup and restore · 3 Berlin 2019 Firebird 4 Introduction Big demand from users to speed up most time consuming regular maintenance operations: Backup Restore

Berlin 2019 Berlin 2019 Firebird 4 Firebird 435

Restore

● How indices are build at restore● Index metadata is created with table metadata

– Indices are created with DEFERRED_ACTIVE flag● Indices are activated (build) after all user data is

committed● Index is actually build at transaction commit● Every index is activated in separate transaction

Page 36: Multi-thread sweep, backup and restore · 3 Berlin 2019 Firebird 4 Introduction Big demand from users to speed up most time consuming regular maintenance operations: Backup Restore

Berlin 2019 Berlin 2019 Firebird 4 Firebird 436

Index build

● Index build steps● Read table data

– Remove unneeded record versions (garbage collect)– Put index keys into the sorter

● Build index b-tree using already sorted data

Page 37: Multi-thread sweep, backup and restore · 3 Berlin 2019 Firebird 4 Introduction Big demand from users to speed up most time consuming regular maintenance operations: Backup Restore

Berlin 2019 Berlin 2019 Firebird 4 Firebird 437

Index build

● What can be run in parallel ?● Read table and sort index keys

– Yes● Build index B-tree

– Non-trivial task: prefix compression of index keys– Not now, maybe later

Page 38: Multi-thread sweep, backup and restore · 3 Berlin 2019 Firebird 4 Introduction Big demand from users to speed up most time consuming regular maintenance operations: Backup Restore

Berlin 2019 Berlin 2019 Firebird 4 Firebird 438

Index build

● What can be run in parallel ?● Read table and sort data● Every worker handle records of data pages from the

same pointer page and then ask for a next (not handled) pointer page

● Every worker have its own attachment, transaction and sorter

● On the “B-tree build” step data from all sorters are merged into common sorted stream● By single thread

Page 39: Multi-thread sweep, backup and restore · 3 Berlin 2019 Firebird 4 Introduction Big demand from users to speed up most time consuming regular maintenance operations: Backup Restore

Berlin 2019 Berlin 2019 Firebird 4 Firebird 439

Index build

● What can be run in parallel ?

Read and sort

Read and sortWorker 2

Worker 1

Read and sort

Read and sort

Worker 3

Worker 4

Read and sort

Read and sort

Read and sort

Read and sort

time

Build B-tree

Page 40: Multi-thread sweep, backup and restore · 3 Berlin 2019 Firebird 4 Introduction Big demand from users to speed up most time consuming regular maintenance operations: Backup Restore

Berlin 2019 Berlin 2019 Firebird 4 Firebird 440

Restore

● Restore with parallel index build

Worker 2

Worker 1

Worker 3

Worker 4

time

gbak: opened file … .fbkgbak: created database ...gbak: restoring ...gbak: committing metadata

gbak: restoring index …gbak: restoring data for table …gbak: committing metadata

gbak: activating and creating deferred index …gbak: activating and creating deferred index …

gbak: finishing, closing, and going home

Page 41: Multi-thread sweep, backup and restore · 3 Berlin 2019 Firebird 4 Introduction Big demand from users to speed up most time consuming regular maintenance operations: Backup Restore

Berlin 2019 Berlin 2019 Firebird 4 Firebird 441

Restore

● What can be improved next ?● Parallel load of user data into database

– Backup file format could be changed● Create few indices simultaneously at one table scan

– Temporary space usage could be significantly increased

Page 42: Multi-thread sweep, backup and restore · 3 Berlin 2019 Firebird 4 Introduction Big demand from users to speed up most time consuming regular maintenance operations: Backup Restore

Berlin 2019 Berlin 2019 Firebird 4 Firebird 442

Restore

● Usage● gbak -c -parallel 4 <backup> <database>● Any application

– DPB tag isc_dpb_parallel_workers ● instruct engine how many parallel workers could be

used for some tasks● currently index creation and auto-sweep supports such

parallel handling

Page 43: Multi-thread sweep, backup and restore · 3 Berlin 2019 Firebird 4 Introduction Big demand from users to speed up most time consuming regular maintenance operations: Backup Restore

Berlin 2019 Berlin 2019 Firebird 4 Firebird 443

Index build

● Usage● Regular CREATE INDEX and ALTER INDEX ACTIVE

statements also could build index with parallel workers– Configuration setting ParallelWorkers– DPB tag isc_dpb_parallel_workers

Page 44: Multi-thread sweep, backup and restore · 3 Berlin 2019 Firebird 4 Introduction Big demand from users to speed up most time consuming regular maintenance operations: Backup Restore

Berlin 2019 Berlin 2019 Firebird 4 Firebird 444

Restore

● Test results● Big database

1 2 4 8 16 24 32 48 640

100

200

300

400

500

600

700

579

492

393356 343

381 366 372 372

Restore time, Firebird 2.5 SC

Workers

Tim

e, m

in

Page 45: Multi-thread sweep, backup and restore · 3 Berlin 2019 Firebird 4 Introduction Big demand from users to speed up most time consuming regular maintenance operations: Backup Restore

Berlin 2019 Berlin 2019 Firebird 4 Firebird 445

Restore

● Test results● Medium database

1 2 4 8 120

500

1000

1500

2000

2500

3000

3500

1559 1556 1569 1570 1561

1373930

689 631 640

2932

24862258 2201 2201

Restore time, Firebird 2.5 SC

index data all Workers

Tim

e, s

ec

Page 46: Multi-thread sweep, backup and restore · 3 Berlin 2019 Firebird 4 Introduction Big demand from users to speed up most time consuming regular maintenance operations: Backup Restore

Berlin 2019 Berlin 2019 Firebird 4 Firebird 446

Restore

● Test results● Medium database

1 2 4 8 120

500

1000

1500

2000

2500

3000

3500

1698 1710 1701 1720 1694

14231011 827 721 726

3121

27212528 2441 2420

Restore time, Firebird 3 SS

index data all Workers

Tim

e, s

ec

Page 47: Multi-thread sweep, backup and restore · 3 Berlin 2019 Firebird 4 Introduction Big demand from users to speed up most time consuming regular maintenance operations: Backup Restore

Berlin 2019 Berlin 2019 Firebird 4 Firebird 447

Restore

● Test results● Medium database

1 2 4 8 120

500

1000

1500

2000

2500

3000

3500

1619 1611 1625 1610 1610

1401

860611 527 500

3020

24712236 2137 2110

Restore time, Firebird 3 SC

index data all Workers

Tim

e, s

ec

Page 48: Multi-thread sweep, backup and restore · 3 Berlin 2019 Firebird 4 Introduction Big demand from users to speed up most time consuming regular maintenance operations: Backup Restore

Berlin 2019 Berlin 2019 Firebird 4 Firebird 448

All together

● Firebird now could run tasks using multiply workers/threads

● Some tasks used parallelism built into engine● Sweep● Index build, gbak -restore

● Some tasks used parallelism “outside” of the engine● gbak -backup

● This list will be enhanced● Validation, Statistics● Query execution

Page 49: Multi-thread sweep, backup and restore · 3 Berlin 2019 Firebird 4 Introduction Big demand from users to speed up most time consuming regular maintenance operations: Backup Restore

Berlin 2019 Berlin 2019 Firebird 4 Firebird 449

All together

● firebird.conf, per database settings● MaxParallelWorkers

– Set maximum number of parallel workers per Firebird process

● ParallelWorkers

– Set default number of parallel workers used to run some task

● DPB tag● isc_dpb_parallel_workers

– Set number of parallel workers used to run some task by current attachment (overrides ParallelWorkers setting)

Page 50: Multi-thread sweep, backup and restore · 3 Berlin 2019 Firebird 4 Introduction Big demand from users to speed up most time consuming regular maintenance operations: Backup Restore

Questions ?Questions ?

Firebird official web site

Firebird tracker

THANK YOU FOR ATTENTIONTHANK YOU FOR ATTENTION

[email protected]