Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |
InnoDB: What’s new in 8.0
Sunny Bains & Jimmy Yang, InnoDB team
#MySQL
Safe Harbor Statement
The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.
Agenda
• New DD & implications
• DDL/Instant Add Column
• Changes related to performance
• Configuration variables deprecated
• Q&A
Legacy Multiple Data Dictionaries
• Up to 5.7, two separate data dictionaries (.frm & InnoDB DD)
• Changes were not atomic
• Mismatch could happen between .frm files and InnoDB’s meta-data
Problems
• .frm file updates were not atomic
• Not crash-proof
• Updates to the Data Dictionary were non-transactional
New Data Dictionary
[Diagram. Labels: SQL; I_S views; Data Dictionary (core DD system tables, other system tables); InnoDB; DD tablespace]
Single New Data Dictionary
Benefits
• One source of truth: system tables in DD tablespaces
• Atomic DDL
• No more .frm and InnoDB data dictionary mismatch issues
• Control meta-data access using a single locking mechanism (MDL)
• Server supports the concept of tablespaces
• .frm files were per table, which made tablespace support messy
• No entries in persistent system tables for temporary tables; their meta-data is in memory only
Upgrade
• Upgrade from 5.7 only
• Make sure the previous shutdown was clean: no crash, and innodb_fast_shutdown was not set to 2
• Upgrade runs automatically
• Creates new DD tables in the DD tablespace
• Updates all tables to the new DD tables
• Upgrades to the new undo tablespaces seamlessly
• Creates SDI
• Finally, legacy InnoDB system tables are dropped
• Downgrade is not allowed for now
• Incompatibility and crash can be handled
Versioning support
• Purpose is to make upgrades seamless and deprecation easy
• Server version, such as 8.0.11
  o Stored in a DD table and in page 0 of a tablespace, at byte offset 8
  o Identifies the release with new features
  o Visible in information_schema.innodb_tablespaces
• Tablespace version, starting from 1
  o Stored in a DD table and in page 0 of the tablespace, at byte offset 12
  o Identifies any tablespace format changes (layout/page/row)
  o Also visible in I_S.innodb_tablespaces
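The two offsets above can be illustrated with a small Python sketch that reads both version fields from a page-0 image. The offsets (8 and 12) come from the slide; the 4-byte big-endian field width, the 16 KiB page size, and the encoding of 8.0.11 as 80011 are assumptions made for illustration, and the page is synthetic rather than read from a real .ibd file.

```python
import struct

# Offsets are from the slide; the 4-byte big-endian width is an assumption.
SERVER_VERSION_OFFSET = 8
SPACE_VERSION_OFFSET = 12


def read_versions(page0):
    """Return (server_version, tablespace_version) from a page-0 image."""
    # Two consecutive unsigned 32-bit integers starting at byte offset 8.
    return struct.unpack_from(">II", page0, SERVER_VERSION_OFFSET)


# Build a synthetic 16 KiB page instead of opening a real tablespace file.
page = bytearray(16384)
struct.pack_into(">I", page, SERVER_VERSION_OFFSET, 80011)  # assumed encoding of 8.0.11
struct.pack_into(">I", page, SPACE_VERSION_OFFSET, 1)

versions = read_versions(bytes(page))
print(versions)
```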
Serialized Dictionary Information (SDI)
• Purpose: redundancy. Metadata is stored in the tablespace in addition to the DD
• Makes the tablespace self-describing
• Stored in the form of a B-tree
• Compressed JSON format
Serialized Dictionary Information (SDI)
{
  "sdi_version": 1,
  "dd_version": 1,
  "dd_object_type": "Table",
  "dd_object": {
    "name": "tbl1",
    "mysql_version_id": 80000,
    "created": 20160922042352,
    "last_altered": 20160922042352,
    …
    "columns": [
      {
        "name": "id",
        "type": 4,
        "is_nullable": false,
        …
    ],
    …
    "indexes": [ … ],
    "foreign_keys": [],
    "partitions": [],
    "collation_id": 8
    …
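Because SDI is JSON, ordinary tooling can read it once it has been extracted. The sketch below parses a hand-written document shaped like the excerpt above and lists its columns. The document is a trimmed, completed stand-in written for this example (the elided parts of the slide are not reconstructed), and `summarize` is a hypothetical helper, not part of any MySQL tool.

```python
import json

# Hand-written stand-in shaped like the slide's excerpt; not real ibd2sdi output.
SDI_TEXT = """
{
  "sdi_version": 1,
  "dd_version": 1,
  "dd_object_type": "Table",
  "dd_object": {
    "name": "tbl1",
    "mysql_version_id": 80000,
    "columns": [
      {"name": "id", "type": 4, "is_nullable": false}
    ],
    "indexes": [],
    "foreign_keys": [],
    "partitions": [],
    "collation_id": 8
  }
}
"""


def summarize(sdi_text):
    """Pull the object type, table name and column list out of an SDI document."""
    doc = json.loads(sdi_text)
    obj = doc["dd_object"]
    return {
        "object_type": doc["dd_object_type"],
        "name": obj["name"],
        "columns": [(col["name"], col["is_nullable"]) for col in obj["columns"]],
    }


summary = summarize(SDI_TEXT)
print(summary)
```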
SDI tools
• A tool for extracting Serialized Dictionary Information (SDI): ibd2sdi
• Works offline and online
• Extracts the SDI record id, type, and data in JSON format
• Useful during disaster recovery
  o e.g., a table is corrupted in a tablespace with multiple tables
  o Extract the meta-data from the .ibd file into a separate .SDI file
  o Remove the corrupt table's meta-data by editing the .SDI file
  o Use the edited .SDI file to import the tablespace and ignore the corrupted table
DDL_log table – Internal table
• Purpose is to support atomic DDL: records physical file actions (delete/create etc.) during DDL
• Resides in the DD tablespace
• No row locking
• One DDL will generate several entries
• Changes are persisted immediately, exempt from the innodb_flush_log_at_trx_commit setting
• Table size won’t grow indefinitely
innodb_ddl_log
CREATE TABLE `innodb_ddl_log` (
  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `thread_id` bigint(20) unsigned NOT NULL,
  `type` int(10) unsigned NOT NULL,
  `space_id` int(10) unsigned DEFAULT NULL,
  `page_no` int(10) unsigned DEFAULT NULL,
  `index_id` bigint(20) unsigned DEFAULT NULL,
  `table_id` bigint(20) unsigned DEFAULT NULL,
  `old_file_path` varchar(512) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL,
  `new_file_path` varchar(512) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `thread_id` (`thread_id`)
) /*!50100 TABLESPACE `mysql` */ ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8mb4 STATS_PERSISTENT=0
DDL_LOG continued
• DDL_LOG records physical operations for InnoDB DDLs
  o DDL_LOG is a table in the mysql tablespace; no DDL and no user DML are allowed on it
  o It was created to track tablespace file creation/drop, index tree creation/drop, file rename, etc.
  o This covers the physical operations in a DDL that cannot be rolled back by a transaction
  o The DDL transaction together with this table makes atomic (crash-safe) DDL possible
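The idea of logging a physical action first and performing it only once the DDL transaction commits can be sketched with a toy model. Everything below (`ToyDDLLog`, its methods, the set standing in for files on disk) is hypothetical and heavily simplified; it illustrates the deferral idea described above, not InnoDB's actual implementation.

```python
class ToyDDLLog:
    """Toy model: physical file deletions are logged, then applied at commit."""

    def __init__(self, files):
        self.files = set(files)  # stand-in for files on disk
        self.entries = []        # logged rows: (thread_id, action, path)

    def log_delete(self, thread_id, path):
        # Record the intent only; the file itself is untouched for now.
        self.entries.append((thread_id, "delete", path))

    def post_commit(self, thread_id):
        # The DDL transaction committed: apply and clear this thread's rows.
        for tid, action, path in list(self.entries):
            if tid == thread_id and action == "delete":
                self.files.discard(path)
        self.entries = [e for e in self.entries if e[0] != thread_id]

    def rollback(self, thread_id):
        # The DDL transaction rolled back: drop the rows, keep the files.
        self.entries = [e for e in self.entries if e[0] != thread_id]


log = ToyDDLLog({"db1/t1.ibd", "db1/t2.ibd"})

log.log_delete(7, "db1/t1.ibd")
log.rollback(7)                 # nothing was physically removed
files_after_rollback = set(log.files)

log.log_delete(8, "db1/t2.ibd")
log.post_commit(8)              # the deletion happens only now
print(sorted(files_after_rollback), sorted(log.files), log.entries)
```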
Example - DROP SCHEMA at a high level
MySQL 5.7
• Delete tables
  o InnoDB starts its own transactions to delete table/index metadata from InnoDB system tables, and commits
  o Server deletes TRN/TRG/FRM files without transaction support
• Delete stored programs
  o Metadata rows in MyISAM (non-transactional)
• Delete schema
  o Metadata in DB.OPT file
• Problems
  o Mix of filesystem, non-transactional, and transactional storage
  o Multiple commits
  o Non-atomic; could result in an inconsistent state at various stages
MySQL 8.0
• Delete tables
  o Server starts a transaction
  o Metadata in DD system tables is marked as deleted
  o InnoDB does not drop physical artefacts at this stage; it only logs a record in the DDL_LOG to ensure that the physical deletion happens when the trx commits (recovery implications too)
• Delete stored programs
  o Metadata rows in InnoDB (within the same transaction)
• Delete schema
  o Metadata rows in InnoDB (within the same transaction)
• Benefits
  o Updates to transactional storage, one commit
  o InnoDB physically deletes all indexes/tablespaces etc.
ALTER TABLE - INSTANT
ALTER TABLE tbl_name … ALGORITHM = INSTANT;
• New (default) algorithm
• Does not work with the LOCK clause
  o Exception: ALGORITHM=INSTANT, LOCK=DEFAULT
• Internals
  o Results in a metadata change only
  o No table lock required
ALTER TABLE - INSTANT
Operations which can be INSTANT
• RENAME TABLE (ALTER)
• SET DEFAULT
• DROP DEFAULT
• MODIFY COLUMN
• CHANGE COLUMN (virtual column generation expression)
• Change index option
• ADD virtual column, DROP virtual column
• ADD COLUMN (non-generated)
ALTER TABLE – ADD COLUMN
• Biggest pain point for users
• New columns have to be added (to a big table) from time to time
  o Copy table: time, disk, resource schedule
  o Table lock
  o Replication
  o …
• But why?
  o InnoDB doesn’t keep enough metadata in the physical record/page
ALTER TABLE – INSTANT ADD COLUMN
• Contribution from Tencent Gaming
• Only a metadata change
• No more copying of data or table rebuild
• No double (or even more) disk space use (for table rebuild)
• Smaller final data size (default value not stored)
• Forward compatibility with old data files
• ALTER TABLE … ADD COLUMN c, ALGORITHM = INSTANT
• Can be INSTANT along with other instant operations
• Supported row formats: DYNAMIC/COMPACT/REDUNDANT
INSTANT ADD COLUMN – Physical record
• Record created before the first INSTANT ADD COLUMN
  o In old format
  o Number of fields = Instant columns (the field count remembered at the first INSTANT ADD COLUMN)
• Record created after the last INSTANT ADD COLUMN
  o In new format
  o Number of fields = Latest number of fields
• Record created between the above two
  o In new format
  o Instant columns < Number of fields < Latest number of fields
INSTANT ADD COLUMN – Metadata
• First INSTANT ADD COLUMN
  o Remember the current number of fields
  o Remember the default value of the new columns, either a value or NULL
• Follow-up INSTANT ADD COLUMN
  o Only remember the default value of the new columns
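A minimal Python sketch of how a reader could use this metadata: a record that physically holds fewer fields than the table currently has is padded with the remembered defaults. The function name, the dict of defaults keyed by column position, and the sample rows are all hypothetical; the real record format is more involved than a tuple of values.

```python
def materialize(stored_fields, instant_defaults, latest_n_fields):
    """Expand a stored record to the table's latest column count.

    stored_fields    -- values physically present in the record
    instant_defaults -- {column position: default} for instantly added columns
    latest_n_fields  -- current number of columns in the table
    """
    row = list(stored_fields)
    # Any column the record does not physically contain takes its stored default.
    for pos in range(len(row), latest_n_fields):
        row.append(instant_defaults[pos])
    return row


# Table started with 2 columns, then gained column 2 (default 0) and
# column 3 (default NULL) through two INSTANT ADD COLUMN operations.
defaults = {2: 0, 3: None}
old_record = (1, "a")           # written before the first INSTANT ADD COLUMN
middle_record = (2, "b", 5)     # written between the two operations
new_record = (3, "c", 7, "x")   # written after the last one

for record in (old_record, middle_record, new_record):
    print(materialize(record, defaults, 4))
```

This is also one way to read the "smaller final data size" point: old rows never store the default, it is supplied from metadata at read time.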
INSTANT ADD COLUMN – Metadata
Why store the DEFAULT?
INSTANT ADD COLUMN – Metadata
• Table rebuild, create, truncate, etc.
  o Will discard the relevant metadata and keep the table/partitions as before
  o Column default values are discarded once they are no longer needed
• Partitioned table
  o Some partition operations only re-create/truncate some of the partitions
  o Some partitions will need the default values, some will not
INSTANT ADD COLUMN – Observability
INSTANT ADD COLUMN - Limitations
• Only supports adding columns as the last column in the row
• No support for COMPRESSED tables, which are seldom used
• No support for tables with a full-text index
• No support for tables residing in the DD tablespace
• No support for temporary tables (they go with COPY)
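The limitations above can be read as a checklist. The sketch below encodes them as a Python predicate that returns the reasons an instant add would not apply. `TableInfo` and `instant_add_blockers` are hypothetical names used only for this illustration; the server makes this decision internally and may consider more conditions than the slide lists.

```python
from dataclasses import dataclass


@dataclass
class TableInfo:
    """Hypothetical descriptor for illustration, not a server structure."""
    row_format: str = "DYNAMIC"
    has_fulltext_index: bool = False
    in_dd_tablespace: bool = False
    is_temporary: bool = False


def instant_add_blockers(table, add_as_last_column=True):
    """Return the listed limitations that rule out INSTANT ADD COLUMN."""
    reasons = []
    if not add_as_last_column:
        reasons.append("column is not added as the last column")
    if table.row_format.upper() == "COMPRESSED":
        reasons.append("COMPRESSED row format")
    if table.has_fulltext_index:
        reasons.append("full-text index")
    if table.in_dd_tablespace:
        reasons.append("table resides in the DD tablespace")
    if table.is_temporary:
        reasons.append("temporary table")
    return reasons


print(instant_add_blockers(TableInfo()))
print(instant_add_blockers(TableInfo(row_format="COMPRESSED", is_temporary=True)))
```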
Performance
• Cost Based Optimiser statistics
  o Number of pages in RAM per index
• Remove the buffer pool mutex (Percona contribution)
• CATS (Contention Aware Transaction Scheduling) (was called VATS earlier)
  o Contributed by University of Michigan DB researchers
  o No configuration required
  o Switches between FIFO and CATS automatically
  o Threshold is >= 32 waiting threads
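The automatic switch can be pictured with a toy scheduler in Python. The threshold of 32 waiting threads is from the slide; the tuple layout and the rule "prefer the waiter that is itself blocking the most transactions" are a simplified reading of contention-aware scheduling used for illustration, not the server's actual weighting.

```python
CATS_THRESHOLD = 32  # waiting threads, as stated on the slide


def pick_policy(n_waiting_threads):
    """FIFO under light contention, CATS once the threshold is reached."""
    return "CATS" if n_waiting_threads >= CATS_THRESHOLD else "FIFO"


def next_to_grant(waiters, n_waiting_threads):
    """Choose which waiter gets the lock next.

    waiters -- list of (arrival_order, trx_name, n_transactions_it_blocks)
    """
    if pick_policy(n_waiting_threads) == "FIFO":
        # Oldest waiter first.
        return min(waiters, key=lambda w: w[0])[1]
    # Simplified CATS: most-blocking waiter first, ties broken by arrival order.
    return max(waiters, key=lambda w: (w[2], -w[0]))[1]


waiters = [(1, "t1", 0), (2, "t2", 5), (3, "t3", 2)]
print(next_to_grant(waiters, n_waiting_threads=3))
print(next_to_grant(waiters, n_waiting_threads=32))
```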
Information Schema
• A new INFORMATION_SCHEMA table, INNODB_CACHED_INDEXES
  o Reports the pages cached in the InnoDB buffer pool for each index
Redesigned BLOB infrastructure (for JSON)
• Partial fetch and update
• Better handling for large and small updates
• Performance improvement for large LOBs
  o Up to 14x in our internal tests
• Partial updates to small BLOBs similar to other column types
  o Up to 4x in our internal tests
• InnoDB changes are for all BLOBs
  o Currently only used for JSON documents by the Optimiser
Performance (cont.)
• Group records by table id when purging
  o Reduces contention on dict_index_t::lock when multiple purge threads are enabled
• --innodb_stats_include_delete_marked := bool
  o Include/exclude rows that are delete-marked (in 8.0.1)
• --innodb_deadlock_detect := bool (dynamic)
  o Under highly concurrent loads, rely on --innodb_lock_wait_timeout and rollback instead
IO Performance
• Major IO bottleneck fixed
  o Sharded the fil_sys_t::mutex
  o 64 shards
  o Dedicated shard for redo log files and undo tablespaces
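Lock sharding in general can be sketched as follows in Python: one global mutex is replaced by an array of locks, and each tablespace maps to one of them. The shard count of 64 is from the slide; placing the redo and undo shards at the last two indexes, mapping general tablespaces by `space_id` modulo the remaining shards, and the function names are all assumptions for illustration.

```python
import threading

N_SHARDS = 64                      # from the slide
REDO_SHARD = N_SHARDS - 2          # assumed placement of the redo shard
UNDO_SHARD = N_SHARDS - 1          # assumed placement of the undo shard
GENERAL_SHARDS = N_SHARDS - 2      # shards left for ordinary tablespaces

shard_locks = [threading.Lock() for _ in range(N_SHARDS)]


def shard_for(space_id, kind="general"):
    """Map a tablespace to a shard index (hypothetical mapping)."""
    if kind == "redo":
        return REDO_SHARD
    if kind == "undo":
        return UNDO_SHARD
    return space_id % GENERAL_SHARDS


def with_space(space_id, kind, action):
    """Run `action` while holding only the lock of the tablespace's shard."""
    with shard_locks[shard_for(space_id, kind)]:
        return action()


print(shard_for(5), shard_for(62), shard_for(0, "redo"), shard_for(1, "undo"))
print(with_space(7, "general", lambda: "ok"))
```

Threads touching tablespaces in different shards no longer serialize on a single mutex, which is the effect the slide describes.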
Performance Results
Redo log
[Diagram: new design (lock free) vs. old design]
Redo log contd.
Dedicated threads
- log_writer - writes from the log buffer to file
- log_flusher - executes fsync()
- log_write_notifier - notifies user threads about finished writes
- log_flush_notifier - notifies user threads about finished fsync
- log_checkpointer - writes checkpoints
- log_closer - maintains the limit for disorder in flush lists
Redo log contd.
• Removed log_flush_order mutex
• Removed log_sys mutex
• Decreased latency between: fsync()→trx committed
--innodb-log-wait-for-flush-spin-hwm - advanced
--innodb-log-spin-cpu-abs-lwm - advanced
--innodb-log-spin-cpu-pct-hwm - advanced
--innodb-log-buffer-size - advanced
Defaults should work well out of the box
Performance Results
New features
• Persistent auto increment
  o Doesn’t reset to SELECT MAX(AUTOINC_COL) FROM T; on restart
  o Probably the most requested feature since v3.x
  o Bug 199 - Created on 27 March 2003
• Memcached improvements
  o Support for multiple get and range search
Undo tablespace improvements
• Change the undo roll ptr format
• Default will be two undo tablespaces
• Undo truncate is on by default
• SQL syntax to create/drop/offline/online the undo tablespaces (WIP)
Miscellaneous
• Avoid intermediate commits that would occur every 10000 rows
  o e.g. ALTER TABLE … ALGORITHM=COPY
• Remove .isl files (InnoDB Symbolic Link files)
• --innodb-read-only semantics change
  o If ON, it affects the entire MySQL instance
  o Because DD tables are stored in InnoDB
• Parallel initialization of the buffer pool
Deprecations / Removals
• Deprecated parameters that have been removed
  o innodb_file_format
  o innodb_file_format_check
  o innodb_file_format_max
  o innodb_large_prefix
  o innodb_stats_sample_pages
  o innodb_locks_unsafe_for_binlog
  o innodb_checksums
  o innodb_support_xa (always ON)
New In-Memory Storage Engine (temptable)
• Currently for internal use only (Optimizer joins etc.)
• Not shared across connections
• Lifetime limited to the query lifetime
• Limited size, bounded by memory allocated
  o --temptable-max-ram
• Supports UTF8
• BLOBs (WIP)
Encryption and Generalised Tablespace Improvements
• Encryption of redo and undo log
  o --innodb-redo-log-encrypt := bool
  o --innodb-undo-log-encrypt := bool
• Generalised/Shared tablespaces
  o Support Encryption
  o Support Compression
  o Support Import/Export (WIP)
NOWAIT/SKIP LOCKED
• If NOWAIT is set for a query
  o Return immediately with an error instead of waiting for the row lock to be released
  o SELECT * FROM T WHERE C1 = n AND C2 = m FOR UPDATE NOWAIT;
• If SKIP LOCKED is set for a query
  o Skip locked rows, without waiting for the row lock to be released
  o SELECT * FROM T WHERE C1 = n AND C2 = m LIMIT 1 FOR UPDATE SKIP LOCKED;
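The difference between the two modes (spelled NOWAIT and SKIP LOCKED in MySQL 8.0 syntax) can be illustrated with a toy model in Python. The row list, the set of ids locked by other transactions, and the `LockWaitError` exception are stand-ins invented for this sketch; a real server raises its own error and evaluates locks row by row inside the storage engine.

```python
class LockWaitError(Exception):
    """Stand-in for the error a NOWAIT query receives on a locked row."""


def select_for_update(rows, locked_ids, mode):
    """Toy locking read.

    rows       -- list of dicts, each with an "id" key
    locked_ids -- ids currently locked by other transactions
    mode       -- "NOWAIT" or "SKIP LOCKED"
    """
    if mode not in ("NOWAIT", "SKIP LOCKED"):
        raise ValueError("unsupported mode: " + mode)
    result = []
    for row in rows:
        if row["id"] in locked_ids:
            if mode == "NOWAIT":
                # Fail at once instead of waiting for the lock.
                raise LockWaitError("row %d is locked" % row["id"])
            continue  # SKIP LOCKED: silently leave the row out
        result.append(row)
    return result


rows = [{"id": 1}, {"id": 2}, {"id": 3}]
print(select_for_update(rows, {2}, "SKIP LOCKED"))
try:
    select_for_update(rows, {2}, "NOWAIT")
except LockWaitError as exc:
    print("NOWAIT failed:", exc)
```

SKIP LOCKED returns a result that may omit rows, so it suits queue-style workloads where any unlocked row will do.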
Descending Indexes
• Change buffering is not supported
  o If a secondary index contains a descending index key column
  o If the primary key includes a descending index column
• Supported for all data types for which ascending indexes are available
• Supported for ordinary and generated columns (both VIRTUAL and STORED)
• Not supported for full-text indexes and R-tree indexes
• A little slower than ascending indexes, due to page layout issues
Dedicated Server
• --innodb-dedicated-server := boolean (default OFF)
• Sets default values based on the physical memory available
• Applies only if the variables below are not explicitly set to non-default values
  o --innodb-log-file-size based on physical memory size
  o --innodb-buffer-pool-size based on physical memory size
  o --innodb-flush-method=O_DIRECT_NO_FSYNC
Dedicated Server (contd.)
• --innodb-buffer-pool-size
  If phy_mem_size < 1 GB
    Use InnoDB default value
  Else If phy_mem_size <= 4 GB
    Use 50% of phy_mem_size
  Else
    Use 75% of phy_mem_size
  End
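The rule above translates directly into a small Python function. The function name is hypothetical, sizes are in bytes, and `None` is used here to mean "keep the InnoDB default" so that no default value has to be hard-coded.

```python
GB = 1024 ** 3


def dedicated_buffer_pool_size(phy_mem_size):
    """Buffer pool size in bytes per the slide's rule, or None for the default."""
    if phy_mem_size < 1 * GB:
        return None                    # use the InnoDB default value
    if phy_mem_size <= 4 * GB:
        return phy_mem_size // 2       # 50% of physical memory
    return phy_mem_size * 3 // 4       # 75% of physical memory


for mem_gb in (0.5, 4, 8, 64):
    print(mem_gb, dedicated_buffer_pool_size(int(mem_gb * GB)))
```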
Dedicated Server (contd.)
• --innodb-log-file-size
  If phy_mem_size < 1 GB
    Use InnoDB default value
  Else If phy_mem_size <= 4 GB
    Set to 128 MB
  Else If phy_mem_size <= 8 GB
    Set to 512 MB
  Else If phy_mem_size <= 16 GB
    Set to 1 GB
  Else
    Set to 2 GB
  End
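The same tiers expressed as a Python sketch. As before, the function name is hypothetical, sizes are in bytes, and `None` stands for "keep the InnoDB default".

```python
MB = 1024 ** 2
GB = 1024 ** 3


def dedicated_log_file_size(phy_mem_size):
    """Redo log file size in bytes per the slide's tiers, or None for the default."""
    if phy_mem_size < 1 * GB:
        return None          # use the InnoDB default value
    if phy_mem_size <= 4 * GB:
        return 128 * MB
    if phy_mem_size <= 8 * GB:
        return 512 * MB
    if phy_mem_size <= 16 * GB:
        return 1 * GB
    return 2 * GB


for mem_gb in (0.5, 2, 8, 16, 32):
    print(mem_gb, dedicated_log_file_size(int(mem_gb * GB)))
```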
Labs release Source and Binaries
http://mysqlserverteam.com
http://labs.mysql.com
8.0.12 Preview (Instant add column)