PgConf US 2015 - ALTER DATABASE ADD more SANITY

ALTER DATABASE ADD more SANITY

PGConf US, New York, March 27, 2015

Zalando SE

• One of the biggest fashion retailers in Europe

• 15 European countries

• Millions of transactions every day

• Relies on hundreds of PostgreSQL databases

Agenda

A. Changes

1. Schema

2. Sprocs

3. Data Migrations

B. Tools

C. Zalando

Database changes

Model (Metadata, DDL): ALTER TABLE … ADD COLUMN …

Data (DML): UPDATE … SET …

Code (Database functions): CREATE OR REPLACE FUNCTION …

…does not always work in production

Horror stories

Developer: ALTER TABLE public.bar DROP COLUMN beer;

Application: SELECT beer FROM public.bar INTO glass;

Result...

Horror stories

• Heavy-weight AccessExclusiveLock (ALTER TABLE…)

• Conflicts with lightweight AccessShareLock (SELECT)

• All subsequent lock requests are queued
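
The queueing effect described above can be illustrated with a toy model. This is only a sketch, not PostgreSQL code: the class TableLockQueue, the session names, and the two-mode conflict table are assumptions made for illustration, and the real lock manager knows many more lock modes.

```python
from collections import deque

# Simplified conflict matrix: only the two lock modes named on the slide.
CONFLICTS = {
    ("AccessShareLock", "AccessExclusiveLock"),
    ("AccessExclusiveLock", "AccessShareLock"),
    ("AccessExclusiveLock", "AccessExclusiveLock"),
}


class TableLockQueue:
    """Toy model of the lock queue for a single table (hypothetical helper)."""

    def __init__(self):
        self.granted = []       # (session, mode) pairs currently holding the lock
        self.waiting = deque()  # (session, mode) pairs queued in arrival order

    def _conflicts_with(self, mode, entries):
        return any((mode, other) in CONFLICTS for _, other in entries)

    def request(self, session, mode):
        # A request waits if it conflicts with a holder or with an earlier waiter.
        if self._conflicts_with(mode, self.granted) or self._conflicts_with(mode, self.waiting):
            self.waiting.append((session, mode))
            return "waiting"
        self.granted.append((session, mode))
        return "granted"

    def release(self, session):
        self.granted = [(s, m) for s, m in self.granted if s != session]
        # Grant queued requests in order until one still conflicts.
        while self.waiting and not self._conflicts_with(self.waiting[0][1], self.granted):
            self.granted.append(self.waiting.popleft())


queue = TableLockQueue()
states = [
    queue.request("report", "AccessShareLock"),      # long-running SELECT
    queue.request("deploy", "AccessExclusiveLock"),  # ALTER TABLE
    queue.request("app", "AccessShareLock"),         # ordinary SELECT
]
print(states)
```

In this model the ordinary SELECT does not conflict with the long-running one, yet it still waits, because it is queued behind the ALTER TABLE request.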

Horror stories

[Chart: self time and total time around a deployment]

From source code to databases

From source code…

• Version-control

• Unit-testing

• Staging environments

• Deployment procedures

Postgres is insanely great!

• Fully ACID-compliant

• Transactional DDL

• Sprocs in different languages

• Schemas and search_path support

Sqitch

• Utility for writing incremental database changes

• Works on top of your VCS

• Deploy/Revert/Verify

• Explicit order of changes with sqitch.plan

• Test on staging, bundle and deploy to production

First steps

$ git init .
Initialized empty Git repository in /Users/alexk/devel/conference/2015/pgconf.us/.git/
$ sqitch --engine pg init pgconf.us --uri http://pgconf.us
Created sqitch.conf
Created sqitch.plan
Created deploy/
Created revert/
Created verify/

Deploy/Revert/Verify

$ sqitch add approle -n 'add pgconf role'

Created deploy/approle.sql

Created revert/approle.sql

Created verify/approle.sql

Added "approle" to sqitch.plan

Deploy

$ cat deploy/approle.sql

-- Deploy approle

BEGIN;

CREATE ROLE pgus LOGIN;

COMMIT;

Revert

$ cat revert/approle.sql

-- Revert approle

BEGIN;

DROP ROLE pgus;

COMMIT;

Verify

$ cat verify/approle.sql

-- Verify that app role is there

BEGIN;

SELECT pg_catalog.pg_has_role('pgus', 'usage');

ROLLBACK;

Apply and test

$ createdb conference_test
$ sqitch deploy --verify db:pg:conference_test
Deploying changes to db:pg:conference_test
  + approle .. ok

Dependencies

$ sqitch add appschema -r approle -n 'define application schema'

Created deploy/appschema.sql

Created revert/appschema.sql

Created verify/appschema.sql

Added "appschema [approle]" to sqitch.plan

Dependencies

$ cat deploy/appschema.sql

-- Deploy appschema

-- requires: approle

BEGIN;

CREATE SCHEMA pgconf AUTHORIZATION pgus;

COMMIT;

Dependencies

$ cat revert/appschema.sql

-- Revert appschema

BEGIN;

DROP SCHEMA pgconf;

COMMIT;

Dependencies

$ cat verify/appschema.sql

-- Verify appschema

BEGIN;

SELECT 1/count(1)

FROM information_schema.schemata

WHERE schema_name = 'pgconf';

ROLLBACK;

Dependencies

$ sqitch revert db:pg:conference_test

Revert all changes from db:pg:conference_test? [Yes] y

- approle .. ok

$ sqitch deploy

Deploying changes to db:pg:conference_test

+ approle .... ok

+ appschema .. ok

Plan your trip

$ cat sqitch.plan

%syntax-version=1.0.0

%project=pgconf.us

%uri=http://pgconf.us/2015/

approle 2015-03-25T04:05:04Z Oleksii Kliukin <[email protected]> # add a role definition for the app

appschema [approle] 2015-03-25T12:06:12Z Oleksii Kliukin <[email protected]> # define application schema
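
Because the plan file fixes the order of changes, a simple check is that every required change is planned before the change that needs it. The following is a minimal sketch of such a check, not part of Sqitch itself: parse_plan and missing_dependencies are hypothetical helpers, the planner "Jane Doe" is a placeholder, and the parser only handles plain change names (no conflicts, cross-project or tag-qualified dependencies).

```python
import re

# Matches "name [dep1 dep2]" at the start of a plan line; everything after the
# optional bracketed list (timestamp, planner, note) is ignored in this sketch.
CHANGE_RE = re.compile(r"^(?P<name>[^\s\[\]%#@]+)(?:\s+\[(?P<requires>[^\]]*)\])?")


def parse_plan(text):
    """Return a list of (change_name, [required_changes]) in plan order."""
    changes = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("%", "#", "@")):
            continue  # skip blank lines, pragmas, comments and tags
        match = CHANGE_RE.match(line)
        if match:
            requires = (match.group("requires") or "").split()
            changes.append((match.group("name"), requires))
    return changes


def missing_dependencies(changes):
    """List (change, dependency) pairs where the dependency is not planned earlier."""
    seen, problems = set(), []
    for name, requires in changes:
        problems.extend((name, dep) for dep in requires if dep not in seen)
        seen.add(name)
    return problems


# Sample plan modelled on the slide; the planner name and email are placeholders.
PLAN = """\
%syntax-version=1.0.0
%project=pgconf.us

approle 2015-03-25T04:05:04Z Jane Doe <jane@example.com> # add a role definition for the app
appschema [approle] 2015-03-25T12:06:12Z Jane Doe <jane@example.com> # define application schema
"""

changes = parse_plan(PLAN)
print(changes)
print(missing_dependencies(changes))
```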

Branches

$ git checkout master

Already on 'master'

$ git merge speaker

Auto-merging sqitch.plan

CONFLICT (content): Merge conflict in sqitch.plan

Automatic merge failed; fix conflicts and then commit the result.

Plan file conflicts

$ cat sqitch.plan

%syntax-version=1.0.0

%project=pgconf.us

%uri=http://pgconf.us/

approle 2015-03-26T11:32:43Z Oleksii Kliukin <[email protected]> # add pgconf role

appschema [approle] 2015-03-26T11:32:56Z Oleksii Kliukin <[email protected]> # add pgconf schema

<<<<<<< HEAD

talk 2015-03-26T11:34:24Z Oleksii Kliukin <[email protected]> # add talk table

=======

speaker 2015-03-26T11:33:43Z Oleksii Kliukin <[email protected]> # add speaker table

>>>>>>> speaker

Plan file conflicts

• git rebase

• use union merge for the plan file:
  echo sqitch.plan merge=union > .gitattributes

• sqitch rebase to revert/deploy all changes
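
To see why a union merge suits the plan file, here is a rough model of the append-only case, where both branches only added lines at the end of the plan. It is a simplification for illustration, not git's actual merge driver; union_merge is a hypothetical name and the plan lines are shortened to change names.

```python
def union_merge(base, ours, theirs):
    """Merge two append-only edits of the same file by keeping lines from both.

    Assumes each side only appended lines to `base`; this is a simplification
    of what a union merge does, not a general three-way merge.
    """
    if ours[:len(base)] != base or theirs[:len(base)] != base:
        raise ValueError("sketch only handles branches that append to the base")
    merged = list(ours)
    for line in theirs[len(base):]:
        if line not in merged:
            merged.append(line)
    return merged


base = ["approle", "appschema [approle]"]
ours = base + ["talk"]        # master added the talk table
theirs = base + ["speaker"]   # the speaker branch added the speaker table
print(union_merge(base, ours, theirs))
```

The merged file keeps both new changes without conflict markers. The relative order of the added lines is not guaranteed to respect dependencies, so the plan is still worth reviewing after the merge.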

Everything is logged

$ sqitch log

Deploy de2ef0221c80d8a789e8d2ce0e71db2557a12c3c

Name: appschema

Committer: Oleksii Kliukin <[email protected]>

Date: 2015-03-25 08:29:29 -0400

define application schema

Everything is logged

Revert de2ef0221c80d8a789e8d2ce0e71db2557a12c3c

Name: appschema

Committer: Oleksii Kliukin <[email protected]>

Date: 2015-03-25 08:30:22 -0400

define application schema

Metadata

conference_test=# \dt sqitch.

sqitch.changes sqitch.dependencies sqitch.events sqitch.projects sqitch.releases sqitch.tags

Staging -> production

• sqitch tag @release1.0

• sqitch bundle --to @release1.0

• cd bundle && sqitch deploy db:pg:production

Reworking

• Edit your changes in-place

• Do not go through revert/deploy cycle (e.g. on production)

• Old version of the change is duplicated and renamed to change@tag

• Deploy script of the old version becomes the revert script of the new one [by default]

• Requires a tag between the old and reworked changes

Reworking

$ sqitch rework set_talk -n 'process description field'

Added "set_talk [[email protected]]" to sqitch.plan.

Modify these files as appropriate:

* deploy/set_talk.sql

* revert/set_talk.sql

* verify/set_talk.sql

new version of changes

Reworking

$ git status

On branch master

Changes not staged for commit:

modified: revert/set_talk.sql

modified: sqitch.plan

Untracked files:

deploy/[email protected]

revert/[email protected]

verify/[email protected] changes

original deployment script

Versioning

A package written by Hubert ‘depesz’ Lubaczewski:

https://github.com/depesz/Versioning

Instead of making changes on development server, then finding differences between production and development, deciding which ones should be installed on production, and finding a way to install them - you start with writing diffs themselves!

Versioning

• Small, atomic changes

• Track already applied patches

• Allow rollbacks

• Maintain dependencies

Patches

BEGIN;
SELECT _v.register_patch('beer_not_null.sql');
ALTER TABLE public.bar ALTER COLUMN beer SET NOT NULL;
COMMIT;

A patch is enclosed in a transaction to guarantee that the changes are atomic.

Track changes

postgres@test:~$ psql -f beer_not_null.sql -d testdb
BEGIN
 register_patch
----------------
(0 rows)

ALTER TABLE
COMMIT
postgres@test:~$ psql -f beer_not_null.sql -d testdb
BEGIN
psql:beer_not_null.sql:2: ERROR: Patch beer_not_null.sql is already applied!
CONTEXT: SQL function "register_patch" statement 1
psql:beer_not_null.sql:3: ERROR: current transaction is aborted, commands ignored until end of transaction block
ROLLBACK

Rollbacks

Rollback patches are recommended for ‘dangerous’ changes.

BEGIN;
SELECT _v.unregister_patch('beer_not_null.sql');
ALTER TABLE public.bar ALTER COLUMN beer DROP NOT NULL;
COMMIT;

Dependencies

[Diagram: patch C depends on patches A and B]

SELECT _v.register_patch('C.sql', ARRAY['A.sql', 'B.sql']);

Metadata

Column      | Type                     | Modifiers
------------+--------------------------+------------------------
patch_name  | text                     | primary key
applied_tsz | timestamp with time zone | not null default now()
applied_by  | text                     | not null
requires    | text[]                   |
conflicts   | text[]                   |
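
The bookkeeping behind register_patch and unregister_patch can be sketched with an in-memory model of the metadata table above. This is a stand-in written for illustration, not the Versioning package itself: the PatchRegistry class is hypothetical, and apart from the "already applied" message shown earlier the error texts are invented.

```python
import datetime


class PatchRegistry:
    """In-memory stand-in for the patches table (hypothetical model)."""

    def __init__(self):
        self.patches = {}  # patch_name -> metadata row

    def register_patch(self, name, requires=(), conflicts=(), applied_by="postgres"):
        if name in self.patches:
            raise RuntimeError(f"Patch {name} is already applied!")
        missing = [p for p in requires if p not in self.patches]
        if missing:
            raise RuntimeError(f"Missing prerequisite(s) for patch {name}: {missing}")
        clashing = [p for p in conflicts if p in self.patches]
        if clashing:
            raise RuntimeError(f"Conflicting patch(es) installed for {name}: {clashing}")
        self.patches[name] = {
            "applied_tsz": datetime.datetime.now(datetime.timezone.utc),
            "applied_by": applied_by,
            "requires": list(requires),
            "conflicts": list(conflicts),
        }

    def unregister_patch(self, name):
        # Refuse to remove a patch that another applied patch still requires.
        dependants = [p for p, row in self.patches.items() if name in row["requires"]]
        if dependants:
            raise RuntimeError(f"Cannot uninstall {name}, it is required by: {dependants}")
        if self.patches.pop(name, None) is None:
            raise RuntimeError(f"Patch {name} is not installed")


registry = PatchRegistry()
registry.register_patch("A.sql")
registry.register_patch("B.sql")
registry.register_patch("C.sql", requires=["A.sql", "B.sql"])
try:
    registry.register_patch("C.sql", requires=["A.sql", "B.sql"])
except RuntimeError as error:
    print(error)
```

In the real package the registration runs inside the same transaction as the patch, so a failed check aborts the whole patch.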

Honorable mentions

• pgTAP - a database unit testing framework: http://pgtap.org

• SEM - schema evolution manager: https://github.com/gilt/schema-evolution-manager

• FDIFF - consistent sproc modifications: https://github.com/trustly/fdiff

Zalando approach

• Database diffs

• Schema-based API versioning

• Tooling for data migrations

Data access

[Diagram: sprocs API layers in front of the databases]

Applications access data in PostgreSQL databases by calling stored procedures.

Deployment procedures

• Weekly deployment cycle

• Use release branches in git

• Changes to schema, sprocs API and Java code

• No application or database downtime

Release branches in git

[Diagram: repository tree; directory names shown: database, 10_data, 20_api, 03_types, 04_tables, 05_stored_procedures, db_diffs, R14_00_42]

Use numbers to indicate the order compatible with sort -V

Deployment stages

• DB Diffs

• Sprocs API

• Java code

• Post-deployment actions (e.g. data migrations)

Database diffs rollout

• 'Versioning' package by depesz

• Changes developed by feature teams

• Reviewed by database engineers

• Schema changes should not break a running app

• Expensive locks should be avoided

Avoiding exclusive locks

BEGIN;
SELECT _v.register_patch('TDO-3575.camp');

ALTER TABLE zcm_data.media_placement
  ADD COLUMN mp_last_modified timestamptz;
ALTER TABLE zcm_data.media_placement
  ALTER COLUMN mp_last_modified SET DEFAULT now();
COMMIT;

-- make the column not null
UPDATE zcm_data.media_placement
   SET mp_last_modified = mp_created
 WHERE mp_last_modified IS NULL;

ALTER TABLE zcm_data.media_placement
  ALTER COLUMN mp_last_modified SET NOT NULL;

Steps to add a not-null column with a default value:

1. add a column
2. set default value
3. commit a transaction
4. update the new column with the default value
5. set the new column not null

Single definition for a database object

BEGIN;
SELECT _v.register_patch('FOO-1235.add_bar.sql_diff');
-- add table bar
\i database/baz/10_data/01_bar.sql
COMMIT;

database/baz/10_data/bar/01_bar.sql

-- 01_bar.sql
-- the type of b_id is missing on the slide; integer is an assumption
CREATE TABLE bar(b_id integer PRIMARY KEY, b_name TEXT);

Lifecycle of a database diff

• Produced by the feature team developer, tested locally

• Applied on integration DB by the author

• Reviewed by 2 Database Engineers

• Applied to Release Staging

• Tested by the QA team as a part of the feature

• Applied to patch staging and LIVE databases

Naming convention

SHOP-12345.shop.postdeploy.rollback.sql_diff

• Link diff to the deployment set for a given team

• Locate the database to apply the diff to

• Check the status of the Jira ticket related to the diff

• Locate a rollback diff if necessary
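
Tooling can split such a file name into its parts. The sketch below assumes a layout of ticket, database name, optional postdeploy and rollback markers, and the sql_diff suffix; this layout is inferred from the single example above, and parse_diff_name and DiffName are hypothetical names, not part of any Zalando tool.

```python
from dataclasses import dataclass


@dataclass
class DiffName:
    ticket: str       # Jira ticket, e.g. "SHOP-12345"
    project: str      # ticket prefix, e.g. "SHOP" (assumed to identify the team)
    database: str     # target database, e.g. "shop"
    postdeploy: bool  # True if the diff runs after the code deployment
    rollback: bool    # True if this is a rollback diff


def parse_diff_name(filename):
    """Split a diff file name into its components (assumed layout)."""
    parts = filename.split(".")
    if len(parts) < 3 or parts[-1] != "sql_diff":
        raise ValueError(f"not a database diff name: {filename}")
    ticket, database, flags = parts[0], parts[1], set(parts[2:-1])
    unknown = flags - {"postdeploy", "rollback"}
    if unknown:
        raise ValueError(f"unexpected components: {sorted(unknown)}")
    project = ticket.split("-", 1)[0]
    return DiffName(ticket, project, database, "postdeploy" in flags, "rollback" in flags)


parsed = parse_diff_name("SHOP-12345.shop.postdeploy.rollback.sql_diff")
print(parsed)
```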

search_path based deployments

• API is deployed after the schema changes

• Multiple versions of API sprocs co-exist in a database

• Active version is chosen with search_path

• Fast rollbacks

• Almost no in-place sprocs modification

postgres=# SET search_path = pgconf_eu_2014;
SET
postgres=# select hello();
     hello
---------------
 Hello Madrid!
(1 row)

postgres=# SET search_path = pgconf_us_2015;
SET
postgres=# select hello();
      hello
-----------------
 Hello New York!
(1 row)
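
The way search_path selects the active API version can be mimicked with a small lookup model. It is a sketch of the idea only: resolve and the schemas dictionary are made up for illustration, and real PostgreSQL name resolution also takes argument types and implicit schemas into account.

```python
def resolve(function_name, search_path, schemas):
    """Return (schema, function) for the first schema on the path that has it."""
    for schema in search_path:
        functions = schemas.get(schema, {})
        if function_name in functions:
            return schema, functions[function_name]
    raise LookupError(f"function {function_name}() does not exist")


# Two API versions deployed side by side, one schema per version.
schemas = {
    "pgconf_eu_2014": {"hello": lambda: "Hello Madrid!"},
    "pgconf_us_2015": {"hello": lambda: "Hello New York!"},
}

for path in (["pgconf_eu_2014"], ["pgconf_us_2015"]):
    schema, hello = resolve("hello", path, schemas)
    print(schema, hello())
```

Rolling back in this model is the same operation as rolling forward: the caller switches its path back to the previous schema and nothing is redeployed.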

New API deployment

[Diagram sequence: APP INSTANCE 1 and APP INSTANCE 2 and the sprocs API version each one uses]

1. Initially, all instances are running the same version of Java code and PostgreSQL API.

2. The new sprocs API is deployed but inactive; the old one is still in use by the application.

3. One instance of the application is switched to the new API via the search_path.

4. Finally, the new API is in use by all application instances.

Bootstrapping databases

• Database objects are defined in SQL files in git

• Every change produces both a db diff and an edit of a SQL file

• Order of bootstrapping is controlled by file names:
  00_create_schema.sql
  04_tables/05_newsletter.sql

• One line to bootstrap a fresh testing database:
  find . -name '*.sql' | sort -V | xargs cat | psql -d testdb -1f -
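
The ordering step of that one-liner can be approximated in a few lines. This is a rough sketch: version_key compares digit runs numerically, which covers the numeric prefixes used here but does not reproduce every rule of sort -V, and both function names and the sample file names other than the two from the slide are made up.

```python
import re
from pathlib import Path


def version_key(path):
    """Sort key that compares digit runs numerically, roughly like sort -V."""
    parts = re.split(r"(\d+)", str(path))
    # re.split with a capture group alternates text and digit runs,
    # so odd positions always hold the numbers.
    return [int(part) if index % 2 else part for index, part in enumerate(parts)]


def bootstrap_script(root):
    """Concatenate every *.sql file under root in version-sorted order."""
    files = sorted(Path(root).rglob("*.sql"), key=version_key)
    return "\n".join(path.read_text() for path in files)


names = [
    "10_data/01_bar.sql",
    "2_types/01_foo.sql",
    "04_tables/05_newsletter.sql",
    "00_create_schema.sql",
]
print(sorted(names, key=version_key))
```

A plain string sort would place 10_data before 2_types, which is why the numeric comparison matters once prefixes have different widths.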

Tooling for data migrations

• Split updates or deletes in chunks

• Run vacuum after a given number of changes

• Controls load on a database server

• Multi-threaded to run on multiple shards at once

• Limits for max load and number of rows per minute

environment: integration
database: testdb
getid: select id from public.foo
update: update public.bar SET foo_count = 1 WHERE foo_id = %(id)s
commitrows: 20
maxload: 3.0
vacuum_cycles: 10
vacuum_delay: 100
vacuum_table: bar
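
The chunking behaviour behind a configuration like this might look as follows. It is a minimal sketch under assumptions, not the actual Zalando tool: run_migration and chunked are hypothetical, vacuum_cycles is interpreted here as the number of committed chunks between vacuum runs, and the maxload and vacuum_delay settings are left out.

```python
def chunked(ids, size):
    """Yield consecutive slices of at most `size` ids."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def run_migration(ids, update_batch, vacuum, commitrows=20, vacuum_cycles=10):
    """Apply update_batch to each chunk; call vacuum after every vacuum_cycles chunks.

    update_batch(chunk) is expected to run the UPDATE for the given ids and
    commit; vacuum() is expected to vacuum the target table. Both are
    hypothetical callbacks supplied by the caller.
    """
    batches = 0
    for chunk in chunked(ids, commitrows):
        update_batch(chunk)
        batches += 1
        if batches % vacuum_cycles == 0:
            vacuum()
    return batches


# Dry run that records what would happen instead of touching a database.
log = []
total = run_migration(
    ids=list(range(1, 101)),
    update_batch=lambda chunk: log.append(("update", len(chunk))),
    vacuum=lambda: log.append(("vacuum",)),
    commitrows=20,
    vacuum_cycles=2,
)
print(total, log)
```

Committing after each small chunk keeps row locks short-lived, and vacuuming at intervals limits the bloat left behind by the updated rows.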

A sneak peek into the future

• Automated testing of schema changes

• Schema comparison tool to quickly spot schema anomalies

• Handle dependencies between migrations and schema changes

• Stay tuned!

Oleksii Kliukin

Database Engineer

Zalando SE

email: [email protected]

twitter: @alexeyklyukin

Thank you!

Links

• Versioning: https://github.com/depesz/Versioning

• Sqitch: http://sqitch.org

• pgTAP: http://pgtap.org

• fdiff: https://github.com/trustly/fdiff

• Java SprocWrapper: https://github.com/zalando/java-sproc-wrapper

• PgObserver: https://zalando.github.io/PGObserver/

• pg_view: https://github.com/zalando/pg_view

• PG Wiki: https://wiki.postgresql.org/wiki/Change_management_tools_and_techniques