oleksii-kliukin
Zalando SE
• One of the biggest fashion retailers in Europe
• 15 European countries
• Millions of transactions every day
• Relies on hundreds of PostgreSQL databases
Database Changes
• Model (Metadata, DDL): ALTER TABLE … ADD COLUMN…
• Data (DML): UPDATE … SET…
• Code (Database functions): CREATE OR REPLACE FUNCTION …
Horror stories
Developer: ALTER TABLE public.bar DROP COLUMN beer;
Application: SELECT beer FROM public.bar INTO glass;
Result...
Horror stories
• Heavy-weight AccessExclusiveLock (ALTER TABLE…)
• Conflicts with lightweight AccessShareLock (SELECT)
• All subsequent lock requests are queued
Postgres is insanely great!
• Fully ACID-compliant
• Transactional DDL
• Sprocs in different languages
• Schemas and search_path support
Sqitch
• Utility for writing incremental database changes
• Works on top of your VCS
• Deploy/Revert/Verify
• Explicit order of changes with sqitch.plan
• Test on staging, bundle and deploy to production
First steps
$ git init .
Initialized empty Git repository in /Users/alexk/devel/conference/2015/pgconf.us/.git/
$ sqitch --engine pg init pgconf.us --uri http://pgconf.us
Created sqitch.conf
Created sqitch.plan
Created deploy/
Created revert/
Created verify/
Deploy/Revert/Verify
$ sqitch add approle -n 'add pgconf role'
Created deploy/approle.sql
Created revert/approle.sql
Created verify/approle.sql
Added "pgconf" to sqitch.plan
Verify
$ cat verify/approle.sql
-- Verify that app role is there
BEGIN;
SELECT pg_catalog.pg_has_role('pgus', 'usage');
ROLLBACK;
Apply and test
$ createdb conference_test
$ sqitch deploy --verify db:pg:conference_test
Deploying changes to db:pg:conference_test
+ approle .. ok
Dependencies
$ sqitch add appschema -r approle -n 'define application schema'
Created deploy/appschema.sql
Created revert/appschema.sql
Created verify/appschema.sql
Added "appschema [approle]" to sqitch.plan
Dependencies
$ cat deploy/appschema.sql
-- Deploy appschema
-- requires: approle
BEGIN;
CREATE SCHEMA pgconf AUTHORIZATION pgus;
COMMIT;
Dependencies
$ cat verify/appschema.sql
-- Verify appschema
BEGIN;
SELECT 1/count(1)
FROM information_schema.schemata
WHERE schema_name = 'pgconf';
ROLLBACK;
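The verify script above relies on a division by zero: when no schema matches, count(1) is 0, the SELECT raises an error, and the verification fails. A minimal sketch of the same pattern for an arbitrary schema name follows. The helper name schema_verify_sql is hypothetical and not part of sqitch; it only prints the SQL.

```shell
#!/bin/sh
# Hypothetical helper (not part of sqitch): print a verify script
# that fails with a division by zero when the schema is missing.
schema_verify_sql() {
  schema=$1
  cat <<EOF
-- Verify that schema ${schema} exists
BEGIN;
-- count(1) is 0 when the schema is missing, so 1/0 raises an error
SELECT 1/count(1)
  FROM information_schema.schemata
 WHERE schema_name = '${schema}';
ROLLBACK;
EOF
}

# Example: print the script for the pgconf schema
schema_verify_sql pgconf
```

The output could be redirected to verify/appschema.sql. The schema name is interpolated as-is, so the sketch assumes a trusted, simple identifier.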
Dependencies
$ sqitch revert db:pg:conference_test
Revert all changes from db:pg:conference_test? [Yes] y
- approle .. ok
$ sqitch deploy
Deploying changes to db:pg:conference_test
+ approle .... ok
+ appschema .. ok
Plan your trip
$ cat sqitch.plan
%syntax-version=1.0.0
%project=pgconf.us
%uri=http://pgconf.us/2015/
approle 2015-03-25T04:05:04Z Oleksii Kliukin <[email protected]> # add a role definition for the app
appschema [approle] 2015-03-25T12:06:12Z Oleksii Kliukin <[email protected]> # define application schema
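Each plan line starts with the change name, optionally followed by its dependencies in square brackets. As a rough illustration, the sketch below lists every change with what it requires. The function name plan_requires is hypothetical, and the parsing only covers the simple layout shown above (pragmas starting with %, one change per line).

```shell
#!/bin/sh
# Hypothetical helper: list each change in a sqitch.plan file together
# with the dependencies given in [brackets] right after its name.
plan_requires() {
  while read -r name rest; do
    # skip blank lines, %pragmas and comment lines
    case $name in
      ''|%*|'#'*) continue ;;
    esac
    case $rest in
      '['*']'*)
        deps=${rest#\[}      # drop the opening bracket
        deps=${deps%%\]*}    # drop everything from the closing bracket on
        ;;
      *)
        deps='(none)'
        ;;
    esac
    printf '%s requires: %s\n' "$name" "$deps"
  done < "$1"
}

# Example: run against the plan file in the current directory, if any
if [ -f sqitch.plan ]; then
  plan_requires sqitch.plan
fi
```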
Branches
$ git checkout master
Already on 'master'
$ git merge speaker
Auto-merging sqitch.plan
CONFLICT (content): Merge conflict in sqitch.plan
Automatic merge failed; fix conflicts and then commit the result.
Plan file conflicts
$ cat sqitch.plan
%syntax-version=1.0.0
%project=pgconf.us
%uri=http://pgconf.us/
approle 2015-03-26T11:32:43Z Oleksii Kliukin <[email protected]> # add pgconf role
appschema [approle] 2015-03-26T11:32:56Z Oleksii Kliukin <[email protected]> # add pgconf schema
<<<<<<< HEAD
talk 2015-03-26T11:34:24Z Oleksii Kliukin <[email protected]> # add talk table
=======
speaker 2015-03-26T11:33:43Z Oleksii Kliukin <[email protected]> # add speaker table
>>>>>>> speaker
Plan file conflicts
• git rebase
• use union merge for the plan file:
  echo sqitch.plan merge=union > .gitattributes
• sqitch rebase to revert/deploy all changes
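A union merge keeps the lines from both sides instead of leaving conflict markers. The sketch below imitates that outcome on an already conflicted file, only to show what the merged plan would contain. It is not how git implements the driver, and union_resolve is a hypothetical name. The plan lines in the example are abbreviated.

```shell
#!/bin/sh
# Hypothetical helper: drop the conflict markers and keep both sides,
# which is the result a union merge would have produced.
union_resolve() {
  grep -v -E '^(<<<<<<< |=======$|>>>>>>> )' "$1"
}

# Example with a conflicted plan like the one above
plan=$(mktemp)
cat > "$plan" <<'EOF'
appschema [approle] 2015-03-26T11:32:56Z # add pgconf schema
<<<<<<< HEAD
talk 2015-03-26T11:34:24Z # add talk table
=======
speaker 2015-03-26T11:33:43Z # add speaker table
>>>>>>> speaker
EOF
union_resolve "$plan"
rm -f "$plan"
```

Because the order of changes in sqitch.plan is significant, the merged file is still worth reviewing before deploying.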
Everything is logged
$ sqitch log
Deploy de2ef0221c80d8a789e8d2ce0e71db2557a12c3c
Name: appschema
Committer: Oleksii Kliukin <[email protected]>
Date: 2015-03-25 08:29:29 -0400
define application schema
…
Everything is logged
Revert de2ef0221c80d8a789e8d2ce0e71db2557a12c3c
Name: appschema
Committer: Oleksii Kliukin <[email protected]>
Date: 2015-03-25 08:30:22 -0400
define application schema
Metadata
conference_test=# \dt sqitch.
sqitch.changes sqitch.dependencies sqitch.events sqitch.projects sqitch.releases sqitch.tags
Staging -> production
• sqitch tag @release1.0
• sqitch bundle --to @release1.0
• cd bundle && sqitch deploy db:pg:production
Reworking
• Edit your changes in-place
• Do not go through a revert/deploy cycle (e.g. on production)
• The old version of the change is duplicated and renamed to change@tag
• The deploy script of the old version becomes the revert script of the new one [by default]
• Requires a tag between the old and reworked changes
Reworking
$ sqitch rework set_talk -n 'process description field'
Added "set_talk [[email protected]]" to sqitch.plan.
Modify these files as appropriate:
* deploy/set_talk.sql
* revert/set_talk.sql
* verify/set_talk.sql
new version of changes
Reworking
$ git status
On branch master
Changes not staged for commit:
modified: revert/set_talk.sql
modified: sqitch.plan
Untracked files:
deploy/[email protected]
revert/[email protected]
verify/[email protected] changes
original deployment script
Versioning
A package written by Hubert ‘depesz’ Lubaczewski:
https://github.com/depesz/Versioning
Instead of making changes on development server, then finding differences between production and development, deciding which ones should be installed on production, and finding a way to install them - you start with writing diffs themselves!
Versioning
• Small, atomic changes
• Track already applied patches
• Allow rollbacks
• Maintain dependencies
Patches
BEGIN;
SELECT _v.register_patch('beer_not_null.sql');
ALTER TABLE public.bar ALTER COLUMN beer SET NOT NULL;
COMMIT;
A patch is enclosed in a transaction to guarantee that the changes are atomic.
Track changes
postgres@test:~$ psql -f beer_not_null.sql -d testdb
BEGIN
 register_patch
----------------
(0 rows)

ALTER TABLE
COMMIT
postgres@test:~$ psql -f beer_not_null.sql -d testdb
BEGIN
psql:beer_not_null.sql:2: ERROR: Patch beer_not_null.sql is already applied!
CONTEXT: SQL function "register_patch" statement 1
psql:beer_not_null.sql:3: ERROR: current transaction is aborted, commands ignored until end of transaction block
ROLLBACK
Rollbacks
Rollback patches are recommended for ‘dangerous’ changes.
BEGIN;
SELECT _v.unregister_patch('beer_not_null.sql');
ALTER TABLE public.bar ALTER COLUMN beer DROP NOT NULL;
COMMIT;
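Since a patch and its rollback share the same frame (a transaction plus a register or unregister call), both can be produced from one template. The sketch below is only an illustration: patch_sql and rollback_sql are hypothetical helper names, and they rely solely on the _v.register_patch and _v.unregister_patch calls shown above.

```shell
#!/bin/sh
# Hypothetical helpers: wrap a SQL statement into a Versioning patch
# or into the matching rollback patch. They only print SQL.
patch_sql() {      # usage: patch_sql <patch name> <SQL statement>
  printf 'BEGIN;\n'
  printf "SELECT _v.register_patch('%s');\n" "$1"
  printf '%s\n' "$2"
  printf 'COMMIT;\n'
}

rollback_sql() {   # usage: rollback_sql <patch name> <SQL statement>
  printf 'BEGIN;\n'
  printf "SELECT _v.unregister_patch('%s');\n" "$1"
  printf '%s\n' "$2"
  printf 'COMMIT;\n'
}

# Example: regenerate the two patches from the slides
patch_sql beer_not_null.sql \
  'ALTER TABLE public.bar ALTER COLUMN beer SET NOT NULL;'
rollback_sql beer_not_null.sql \
  'ALTER TABLE public.bar ALTER COLUMN beer DROP NOT NULL;'
```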
Metadata
   Column    |           Type           |       Modifiers
-------------+--------------------------+------------------------
 patch_name  | text                     | primary key
 applied_tsz | timestamp with time zone | not null default now()
 applied_by  | text                     | not null
 requires    | text[]                   |
 conflicts   | text[]                   |
Honorable mentions
• PgTap - a database unit testing framework: http://pgtap.org
• SEM - schema evolution manager: https://github.com/gilt/schema-evolution-manager
• FDIFF - consistent sproc modifications: https://github.com/trustly/fdiff
Data access
Applications access data in PostgreSQL databases by calling stored procedures.
Deployment procedures
• Weekly deployment cycle
• Use release branches in git
• Changes to schema, sprocs API and Java code
• No application or database downtime
Release branches in git
[Figure: repository directory tree with the entries database, 10_data, 20_api, 03_types, 04_tables, 05_stored_procedures, db_diffs, R14_00_42]
Use numbers to indicate the order compatible with sort -V
Deployment stages
• DB Diffs
• Sprocs API
• Java code
• Post-deployment actions (e.g. data migrations)
Database diffs rollout
• 'Versioning' package by depesz
• Changes developed by feature teams
• Reviewed by database engineers
• Schema changes should not break a running app
• Expensive locks should be avoided
Avoiding exclusive locks
BEGIN;
SELECT _v.register_patch('TDO-3575.camp');
ALTER TABLE zcm_data.media_placement ADD COLUMN mp_last_modified timestamptz;
ALTER TABLE zcm_data.media_placement ALTER COLUMN mp_last_modified SET DEFAULT now();
COMMIT;

-- make the column not null
UPDATE zcm_data.media_placement SET mp_last_modified = mp_created
WHERE mp_last_modified IS NULL;

ALTER TABLE zcm_data.media_placement
ALTER COLUMN mp_last_modified SET NOT NULL;

Steps to add a not-null column with a default value:
• add a column
• set default value
• commit a transaction
• update the new column with the default value
• set the new column not null
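The steps above can be sketched as a small generator that prints the statements for any table and column. The point of the split is that the slow part, filling the existing rows, runs as a plain UPDATE outside the DDL transaction. The function name add_not_null_column_sql and its parameters are hypothetical. It only emits SQL and leaves out the register_patch call.

```shell
#!/bin/sh
# Hypothetical helper: print the staged statements for adding a
# NOT NULL column with a default value.
# usage: add_not_null_column_sql <table> <column> <type> <default> <backfill>
add_not_null_column_sql() {
  table=$1 column=$2 type=$3 default=$4 backfill=$5
  cat <<EOF
-- steps 1-3: add the column, set its default, commit
BEGIN;
ALTER TABLE ${table} ADD COLUMN ${column} ${type};
ALTER TABLE ${table} ALTER COLUMN ${column} SET DEFAULT ${default};
COMMIT;
-- step 4: fill the existing rows
UPDATE ${table} SET ${column} = ${backfill} WHERE ${column} IS NULL;
-- step 5: enforce the constraint
ALTER TABLE ${table} ALTER COLUMN ${column} SET NOT NULL;
EOF
}

# Example: the media_placement change from the slide
add_not_null_column_sql zcm_data.media_placement mp_last_modified \
  timestamptz 'now()' mp_created
```

On large tables the single UPDATE in step 4 may itself be split into chunks. The sketch keeps it as one statement, as in the slide.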
Single definition for a database object
BEGIN;
SELECT _v.register_patch('FOO-1235.add_bar.sql_diff');
-- add table bar
\i database/baz/10_data/01_bar.sql
COMMIT;

database/baz/10_data/bar/01_bar.sql
-- 01_bar.sql
CREATE TABLE bar(b_id PRIMARY KEY, b_name TEXT);
Lifecycle of a database diff
• Produced by the feature team developer, tested locally
• Applied on integration DB by the author
• Reviewed by 2 Database Engineers
• Applied to Release Staging
• Tested by the QA team as a part of the feature
• Applied to patch staging and LIVE databases
Naming convention
SHOP-12345.shop.postdeploy.rollback.sql_diff
• Link diff to the deployment set for a given team
• Locate the database to apply the diff to
• Check the status of the Jira ticket related to the diff
• Locate a rollback diff if necessary
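A file name built this way can be taken apart mechanically. The sketch below assumes, based only on the example name and the bullets above, that the first dot-separated field is the Jira ticket, the second is the target database, and the optional postdeploy and rollback markers follow. The function name parse_diff_name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: split a diff file name into its parts.
# The field meanings are inferred from the example name, not from a spec.
parse_diff_name() {
  name=${1%.sql_diff}       # drop the extension
  ticket=${name%%.*}        # first field: Jira ticket
  rest=${name#*.}           # everything after the ticket
  database=${rest%%.*}      # second field: target database
  case ".$rest." in
    *.postdeploy.*) postdeploy=yes ;;
    *)              postdeploy=no ;;
  esac
  case ".$rest." in
    *.rollback.*) rollback=yes ;;
    *)            rollback=no ;;
  esac
  printf 'ticket=%s database=%s postdeploy=%s rollback=%s\n' \
    "$ticket" "$database" "$postdeploy" "$rollback"
}

# Example: the name from the slide
parse_diff_name SHOP-12345.shop.postdeploy.rollback.sql_diff
```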
search_path based deployments
• API is deployed after the schema changes
• Multiple versions of API sprocs co-exist in a database
• Active version is chosen with search_path
• Fast rollbacks
• Almost no in-place sprocs modification
postgres=# SET search_path = pgconf_eu_2014;
SET
postgres=# select hello();
     hello
---------------
 Hello Madrid!
(1 row)

postgres=# SET search_path = pgconf_us_2015;
SET
postgres=# select hello();
      hello
-----------------
 Hello New York!
(1 row)
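For two versions of hello() to co-exist, each one lives in its own schema, and search_path decides which one an unqualified call resolves to. A minimal sketch of how such versioned schemas might be created follows. The function bodies are assumptions chosen to match the session output above, and api_version_sql is a hypothetical helper that only prints SQL.

```shell
#!/bin/sh
# Hypothetical helper: print the SQL for one API version, i.e. a schema
# that holds its own copy of the hello() stored procedure.
# usage: api_version_sql <schema> <greeting>
api_version_sql() {
  cat <<EOF
CREATE SCHEMA $1;
CREATE FUNCTION $1.hello() RETURNS text
    LANGUAGE sql AS \$\$ SELECT '$2'::text \$\$;
EOF
}

# Example: two versions side by side; the application picks one with
#   SET search_path = pgconf_us_2015;
api_version_sql pgconf_eu_2014 'Hello Madrid!'
api_version_sql pgconf_us_2015 'Hello New York!'
```

Rolling back then amounts to pointing search_path at the previous schema again, without touching the functions themselves.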
New API deployment
[Diagram shown at each step: APP INSTANCE 1, APP INSTANCE 2]
• Initially, all instances are running the same version of Java code and PostgreSQL API
• New sprocs API is deployed but inactive, old one is still in use by the application
• One instance of the application is switched to the new API via the search_path
• Finally, new API is in use by all application instances
Bootstrapping databases
• Database objects are defined in SQL files in git
• Every change produces both a db diff and an edit of a SQL file
• Order of bootstrapping is controlled by file names, e.g. 00_create_schema.sql, 04_tables/05_newsletter.sql
• One line to bootstrap the fresh testing database:
  find . -name '*.sql' | sort -V | xargs cat | psql -d testdb -1f -
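The one-liner can be checked without a database by looking at the order that sort -V produces and at the stream psql would receive. The sketch below wraps both parts into functions. The names bootstrap_order and bootstrap_stream are hypothetical, and, like the original pipeline, they assume file names without spaces.

```shell
#!/bin/sh
# Hypothetical helpers around the bootstrap one-liner.

# List the SQL files under a directory in version-sorted order,
# so that e.g. 2_x.sql comes before 10_y.sql.
bootstrap_order() {
  find "$1" -name '*.sql' | sort -V
}

# Concatenate the files in that order; this is what psql would read.
bootstrap_stream() {
  bootstrap_order "$1" | xargs cat
}

# Dry run: inspect the order first, then feed the stream to psql, e.g.
#   bootstrap_order .
#   bootstrap_stream . | psql -d testdb -1f -
```

With -1 psql wraps the whole stream in a single transaction, so a failing file leaves the test database empty rather than half-built.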
Tooling for data migrations
• Split updates or deletes in chunks
• Run vacuum after a given number of changes
• Controls load on a database server
• Multi-threaded to run on multiple shards at once
• Limits for max load and number of rows per minute
environment: integration
database: testdb
getid: select id from public.foo
update: update public.bar SET foo_count = 1 WHERE foo_id = %(id)s
commitrows: 20
maxload: 3.0
vacuum_cycles: 10
vacuum_delay: 100
vacuum_table: bar
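To illustrate the chunking idea only, the sketch below reads ids from standard input, substitutes each one into the update statement from the config above, and commits after every commitrows statements. It is not the actual tool: load throttling, vacuum cycles and multi-threading are left out, and chunked_updates is a hypothetical name. The ids are assumed to be plain numbers.

```shell
#!/bin/sh
# Hypothetical helper: turn a list of ids (one per line on stdin) into
# UPDATE statements grouped into transactions of <commitrows> rows.
# usage: chunked_updates <commitrows>
chunked_updates() {
  commitrows=$1
  count=0
  while read -r id; do
    if [ "$count" -eq 0 ]; then
      echo 'BEGIN;'
    fi
    echo "UPDATE public.bar SET foo_count = 1 WHERE foo_id = ${id};"
    count=$((count + 1))
    if [ "$count" -eq "$commitrows" ]; then
      echo 'COMMIT;'
      count=0
    fi
  done
  # close the last, partially filled chunk
  if [ "$count" -gt 0 ]; then
    echo 'COMMIT;'
  fi
  return 0
}

# Example: five ids, two rows per commit
printf '%s\n' 1 2 3 4 5 | chunked_updates 2
```

Short transactions keep row locks brief and give vacuum a chance to reclaim dead rows between chunks, which is the motivation the bullets above give for the real tool.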
A sneak peek into the future
• Automated testing of schema changes
• Schema comparison tool to quickly spot schema anomalies
• Handle dependencies between migrations and schema changes
• Stay tuned!
Links
• Versioning: https://github.com/depesz/Versioning
• Sqitch: http://sqitch.org
• pgTap: http://pgtap.org
• fdiff: https://github.com/trustly/fdiff
• Java SprocWrapper: https://github.com/zalando/java-sproc-wrapper
• PgObserver: https://zalando.github.io/PGObserver/
• pg_view: https://github.com/zalando/pg_view
• PG Wiki: https://wiki.postgresql.org/wiki/Change_management_tools_and_techniques