Schemaless meets Relational: JSONB

ConFoo 2015
Montreal, Canada

Magnus Hagander
magnus@hagander.net

PRODUCTS • CONSULTING • APPLICATION MANAGEMENT • IT OPERATIONS • SUPPORT • TRAINING

Magnus Hagander
• PostgreSQL
  • Core Team member
  • Committer
  • PostgreSQL Europe
• Redpill Linpro
  • Infrastructure services
  • Principal database consultant

PostgreSQL
• Relational database (RDBMS)

Relational Database
A relational database management system (RDBMS) is a database management system (DBMS) that is based on the relational model as invented by E. F. Codd.

Relational Database
• Tables
• Columns
• Rows
• (etc etc)

Relational Database
• Strict schema
• Well defined data model

Relational Database
• Migration between defined schemas
• ALTER TABLE etc etc

Schema migrations
• Often considered a problem
  • (one main reason for NoSQL?)
• Not as bad as you think
  • Not all RDBMSes suck at this
  • ... but some definitely do

DDL in PostgreSQL
• Fully transactional
  • ROLLBACK any operations!
• Often no exclusive lock
• Often no rewrite
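A minimal sketch of what transactional DDL means in practice. The table and column names are illustrative assumptions, not from the talk.

```sql
-- ddl_demo and its columns are hypothetical names used only for this sketch.
CREATE TABLE ddl_demo (id int PRIMARY KEY, name text);

BEGIN;
ALTER TABLE ddl_demo ADD COLUMN country text;
ALTER TABLE ddl_demo DROP COLUMN name;
ROLLBACK;

-- Both ALTERs were undone: the table still has only its original
-- columns, id and name.
SELECT column_name
  FROM information_schema.columns
 WHERE table_name = 'ddl_demo'
 ORDER BY ordinal_position;
```

Because the schema change and any accompanying data fix-ups share one transaction, a failed migration leaves nothing half-applied.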

Rewrite-less DDL
• ALTER TABLE ADD COLUMN
  • with no default
• ALTER TABLE ALTER COLUMN SET TYPE
  • depending on type of course!
• Work done on further reductions!
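The cases above might look like this. It is a sketch with a hypothetical table; which type changes avoid a rewrite depends on the PostgreSQL version, so treat the comments as describing the releases current around this talk.

```sql
-- rewrite_demo is a hypothetical table used only for this sketch.
CREATE TABLE rewrite_demo (id int PRIMARY KEY, label varchar(50));

-- No default: only the catalog changes, existing rows are not touched.
ALTER TABLE rewrite_demo ADD COLUMN note text;

-- Widening a varchar, or moving varchar to text, keeps the same on-disk
-- format, so the table does not need to be rewritten.
ALTER TABLE rewrite_demo ALTER COLUMN label TYPE varchar(100);
ALTER TABLE rewrite_demo ALTER COLUMN label TYPE text;

-- By contrast, int -> bigint changes the stored width and rewrites every row.
ALTER TABLE rewrite_demo ALTER COLUMN id TYPE bigint;
```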

Still not good enough?
• There's always EAV
  • Entity-Attribute-Value

CREATE TABLE eav (
    object int,
    attr text,
    value text,
    PRIMARY KEY (object, attr)  -- one row per (object, attribute) pair
);

EAV
• Yeah, that's evil
• Just don't go there!
  • One of the few "anti-patterns" in SQL
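One way to see why EAV hurts: rebuilding a single logical row takes a self-join per attribute, and every value is untyped text. The table name and sample rows below are assumptions made for this sketch.

```sql
-- eav_demo is a hypothetical table used only for this sketch.
CREATE TABLE eav_demo (
    object int,
    attr   text,
    value  text,
    PRIMARY KEY (object, attr)
);

INSERT INTO eav_demo VALUES
    (1, 'name',    'Magnus'),
    (1, 'country', 'Sweden');

-- Two attributes already need two scans of the same table; each further
-- attribute adds another join.
SELECT n.value AS name, c.value AS country
  FROM eav_demo n
  JOIN eav_demo c ON c.object = n.object AND c.attr = 'country'
 WHERE n.attr = 'name';
```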

So what about NoSQL
• There's a lot to NoSQL
  • ISAM?
  • dBASE?
  • FoxPro?
• Let's talk document store

NoSQL document store
• Schemaless
  • Usually JSON
• "CAP" vs "ACID"
• Sharding
• etc

NoSQL document store
• Schemaless
• "CAP" vs "ACID"
• Sharding
• etc

Schemaless
• Data still has schema
• It's not strict
• There are no built-in schema migrations

Schemaless
• Schema migration happens in application
• Or all versions of schema are live

Reality-check
• Parts of the schema are known
  • And fixed
• Parts of the schema are dynamic

JSONB
• Enter JSONB

Schemaless recap
• Schemaless is not new in PostgreSQL
• Not even JSON...

Schemaless evolution
• 8.2: hstore
• 8.3: hstore+gin
• 9.0: hstore fully functional kvs
• 9.2: json
• 9.4: jsonb

hstore
• Basic key/value store
• No nesting support
• No datatype support
• Generic index support
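For comparison with what follows, a small hstore sketch. The table and index names are illustrative assumptions; the extension, operators, and GIN support are standard hstore features.

```sql
CREATE EXTENSION IF NOT EXISTS hstore;

-- hstore_demo is a hypothetical table used only for this sketch.
CREATE TABLE hstore_demo (id serial PRIMARY KEY, h hstore NOT NULL);
INSERT INTO hstore_demo (h) VALUES ('name => Magnus, country => Sweden');

-- -> looks up one key; keys and values are always flat text.
SELECT h -> 'country' FROM hstore_demo;

-- One generic GIN index serves containment and key-existence searches.
CREATE INDEX hstore_demo_idx ON hstore_demo USING gin (h);
SELECT id FROM hstore_demo WHERE h @> 'country => Sweden';
```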

JSON
• Simple text storage
  • Just validates JSON format
• Accessor functions
• Requires indexes on specific keys

JSONB
• Deconstructed JSON format
  • Not BSON!
• API format is still pure JSON
• Data type aware
  • string, int, float, bool, null
  • arrays, hashes
• Nested structures
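A quick way to inspect the type awareness is `jsonb_typeof`. This is a sketch; note that it reports integers and floats alike as "number", and calls a hash an "object".

```sql
-- Each literal is parsed into a typed jsonb value; jsonb_typeof reports
-- string, number, number, boolean, null, array and object respectively.
SELECT v AS value, jsonb_typeof(v) AS type
  FROM (VALUES ('"text"'::jsonb),
               ('42'::jsonb),
               ('3.14'::jsonb),
               ('true'::jsonb),
               ('null'::jsonb),
               ('[1, 2]'::jsonb),
               ('{"a": 1}'::jsonb)) AS t(v);
```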

JSONB
• Regular PostgreSQL datatype
• Full ACID
• Up to 1GB
• Transparent compression etc

Simple usage

CREATE TABLE jsontable (
    id SERIAL PRIMARY KEY,
    j JSONB NOT NULL
);

Loading data

INSERT INTO jsontable VALUES (1, '{
    "name": "Magnus",
    "country": "Sweden",
    "skills": {
        "database": ["postgresql", "sql"],
        "web": ["varnish", "django"]
    }
}');

Retrieving data

postgres=# SELECT j FROM jsontable WHERE id=1;
                                                          j
----------------------------------------------------------------------------------------------------------------------
 {"name": "Magnus", "skills": {"web": ["varnish", "django"], "database": ["postgresql", "sql"]}, "country": "Sweden"}
(1 row)

• Key-order not preserved
• Formatting not preserved
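The difference from the plain json type can be seen side by side. A minimal sketch using a throwaway literal rather than the talk's table:

```sql
SELECT '{"b": 1,   "a": 2, "a": 3}'::json  AS as_json,
       '{"b": 1,   "a": 2, "a": 3}'::jsonb AS as_jsonb;
-- as_json  returns the input text unchanged, extra spaces and duplicate
--          key included.
-- as_jsonb returns {"a": 3, "b": 1}: keys reordered, whitespace normalized,
--          and only the last value of the duplicate key kept.
```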

Retrieving data
• Element retrieval operator

postgres=# SELECT j->>'country' FROM jsontable WHERE id=1;
 ?column?
----------
 Sweden
(1 row)

• Or use -> to get json object back

postgres=# SELECT j->'skills'->'database' FROM jsontable WHERE id=1;
       ?column?
-----------------------
 ["postgresql", "sql"]
(1 row)

Retrieving data
• Once retrieved, just a SQL value
• All regular SQL queries/analytics work

postgres=# SELECT j->>'country' AS country, count(*)
postgres-# FROM jsontable GROUP BY country;
 country | count
---------+-------
 Sweden  |     1
....

Retrieving data
• Path operators

postgres=# SELECT j#>'{skills,database}' FROM jsontable WHERE id=1;
       ?column?
-----------------------
 ["postgresql", "sql"]
(1 row)

• or array indexing

postgres=# SELECT j#>'{skills,database,0}' FROM jsontable WHERE id=1;
   ?column?
--------------
 "postgresql"
(1 row)

Searching data
• Searching by extracted fields

postgres=# SELECT id FROM jsontable WHERE j->>'country' = 'Sweden';
 id
----
  1
(1 row)

Searching data
• Searching by extracted fields

postgres=# SELECT id FROM jsontable WHERE j->>'country' = 'Sweden';
 id
----
  1
(1 row)

• Requires index on each field
• Not very schemaless
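"Index on each field" here means an expression index per extracted key, roughly as sketched below against the talk's `jsontable`. The index name is an illustrative assumption.

```sql
-- One B-tree expression index per key that is searched this way.
CREATE INDEX jsontable_country_idx ON jsontable ((j->>'country'));

-- The predicate must use the same expression for the index to qualify.
-- On a tiny table the planner may still prefer a sequential scan.
EXPLAIN SELECT id FROM jsontable WHERE j->>'country' = 'Sweden';
```

Every new key that needs fast lookup means another CREATE INDEX, which is the "not very schemaless" part.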

Searching data
• Path operators

postgres=# SELECT id FROM jsontable WHERE j @> '{"country": "Sweden"}';
 id
----
  1
(1 row)

• Nested structure support

postgres=# SELECT id FROM jsontable WHERE j @> '{"skills": {"database": ["postgresql"]}}';
 id
----
  1
(1 row)

Indexing
• Where JSONB really shines
• Single generic index!

CREATE INDEX json_idx ON jsontable USING gin(j);

Indexed containment search

postgres=# EXPLAIN SELECT id FROM jsontable
postgres-# WHERE j @> '{"skills": {"database": ["postgresql"]}}';
                                  QUERY PLAN
------------------------------------------------------------------------------
 Bitmap Heap Scan on jsontable  (cost=12.00..16.01 rows=1 width=4)
   Recheck Cond: (j @> '{"skills": {"database": ["postgresql"]}}'::jsonb)
   ->  Bitmap Index Scan on json_idx  (cost=0.00..12.00 rows=1 width=0)
         Index Cond: (j @> '{"skills": {"database": ["postgresql"]}}'::jsonb)
(4 rows)

Indexed containment search
• Full path search in one operator
• Deep filtering in the index

Other indexed operators
• Key existence

postgres=# EXPLAIN SELECT id FROM jsontable
postgres-# WHERE j ? 'country';
                              QUERY PLAN
-----------------------------------------------------------------------
 Bitmap Heap Scan on jsontable  (cost=8.00..12.01 rows=1 width=4)
   Recheck Cond: (j ? 'country'::text)
   ->  Bitmap Index Scan on json_idx  (cost=0.00..8.00 rows=1 width=0)
         Index Cond: (j ? 'country'::text)
(4 rows)

Other indexed operators
• Key existence
  • On sub-objects
  • No longer indexed
  • (can use expression index)

postgres=# EXPLAIN SELECT id FROM jsontable
postgres-# WHERE j->'skills' ? 'database';
                       QUERY PLAN
---------------------------------------------------------
 Seq Scan on jsontable  (cost=0.00..1.03 rows=1 width=4)
   Filter: ((j -> 'skills'::text) ? 'database'::text)
(2 rows)
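The expression-index workaround mentioned above could look like the following sketch. The index name is an illustrative assumption.

```sql
-- A GIN index on the sub-object itself rather than on the whole document.
CREATE INDEX jsontable_skills_idx ON jsontable USING gin ((j->'skills'));

-- The same query as above now has an index whose expression matches
-- j->'skills'. Whether the planner picks it depends on table size and
-- statistics, so a tiny demo table may still show a Seq Scan.
EXPLAIN SELECT id FROM jsontable WHERE j->'skills' ? 'database';
```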

Other indexed operators
• All keys exist

postgres=# EXPLAIN SELECT id FROM jsontable
postgres-# WHERE j ?& ARRAY['name', 'firstname'];
                               QUERY PLAN
------------------------------------------------------------------------
 Bitmap Heap Scan on jsontable  (cost=12.00..16.01 rows=1 width=4)
   Recheck Cond: (j ?& '{name,firstname}'::text[])
   ->  Bitmap Index Scan on json_idx  (cost=0.00..12.00 rows=1 width=0)
         Index Cond: (j ?& '{name,firstname}'::text[])
(4 rows)

Other indexed operators
• Any key exists

postgres=# EXPLAIN SELECT id FROM jsontable
postgres-# WHERE j ?| ARRAY['name', 'firstname'];
                               QUERY PLAN
------------------------------------------------------------------------
 Bitmap Heap Scan on jsontable  (cost=12.00..16.01 rows=1 width=4)
   Recheck Cond: (j ?| '{name,firstname}'::text[])
   ->  Bitmap Index Scan on json_idx  (cost=0.00..12.00 rows=1 width=0)
         Index Cond: (j ?| '{name,firstname}'::text[])
(4 rows)

Index operator classes
• Default operator class jsonb_ops
  • Handles all operators shown
• Path only operator class jsonb_path_ops
  • Handles only @>
  • Smaller index footprint!
  • Hashed paths

Index operator classes
• Consider using jsonb_path_ops instead
  • Don't use both

CREATE INDEX json_idx ON jsontable USING gin(j jsonb_path_ops);

JSON conversion
• Many functions to deal with conversion
• To/from tables
• To/from resultsets
• To/from arrays

JSON conversion

postgres=# SELECT row_to_json(b.*, 't') FROM pg_stat_bgwriter b;
                    row_to_json
----------------------------------------------------
 {"checkpoints_timed":1302,                        +
  "checkpoints_req":1,                             +
  "checkpoint_write_time":17259,                   +
  "checkpoint_sync_time":44,                       +
  "buffers_checkpoint":185,                        +
....
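A few more of the conversion functions, sketched against the talk's `jsontable`. The column aliases are illustrative; the functions themselves are available in 9.4.

```sql
-- Result set -> a single JSON array (json_agg returns json, not jsonb).
SELECT json_agg(t) FROM jsontable t;

-- JSON array -> one row per element, as text.
SELECT jsonb_array_elements_text(j #> '{skills,database}') AS skill
  FROM jsontable
 WHERE id = 1;

-- JSON object -> typed columns; keys not listed in the column
-- definition (here "skills") are simply ignored.
SELECT r.name, r.country
  FROM jsontable,
       jsonb_to_record(j) AS r(name text, country text)
 WHERE id = 1;
```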

How to use this?

How to use this?
• tl;dr: don't do what I just showed

How to use this
• Identify fixed schema parts
  • E.g. name and country
  • Actually, maybe also skills..
• Use relational for this data
  • It will be faster
  • Especially when updating!

How to use this
• Add additional jsonb column(s)
  • When needed
  • Some tables
• For purely unstructured data
• Or data that changes structure over time
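Put together, the recommended hybrid layout might look like this sketch. The table, column, and index names are assumptions chosen for illustration; only the name/country/skills split comes from the talk.

```sql
-- Hypothetical hybrid table: fixed parts as columns, the rest as jsonb.
CREATE TABLE people (
    id      serial PRIMARY KEY,
    name    text  NOT NULL,
    country text  NOT NULL,
    extra   jsonb NOT NULL DEFAULT '{}'
);

-- One GIN index covers containment searches on whatever ends up in extra.
CREATE INDEX people_extra_idx ON people USING gin (extra jsonb_path_ops);

INSERT INTO people (name, country, extra) VALUES
    ('Magnus', 'Sweden', '{"skills": {"database": ["postgresql", "sql"]}}');

-- Relational predicate and jsonb containment in the same query.
SELECT name
  FROM people
 WHERE country = 'Sweden'
   AND extra @> '{"skills": {"database": ["postgresql"]}}';
```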

Drawbacks of jsonb
• Lack of key compression
• Document level locking
• Document level updating
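Document-level updating means that changing one key replaces the whole value. A sketch against the talk's `jsontable`, with "Canada" as an arbitrary new value chosen for illustration:

```sql
-- Only "country" changes, but the complete document is supplied and stored
-- again, and the row is locked for the duration of the update.
-- (PostgreSQL 9.5 and later add jsonb_set() to edit a single path in SQL;
-- the row is still written out as a whole new version.)
UPDATE jsontable
   SET j = '{
       "name": "Magnus",
       "country": "Canada",
       "skills": {
           "database": ["postgresql", "sql"],
           "web": ["varnish", "django"]
       }
   }'
 WHERE id = 1;
```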

Advantages of jsonb
• Dynamic schema
• Dynamic index
  • Only need one!

Client support
• Text input/output
  • Supported by all drivers
  • Client level compose/decompose
• Driver integration
  • Driver maps directly to native object
  • e.g. psycopg2 >= 2.5.4

Other technologies

JSQuery
• JSONB specific query language
• Like tsquery for fts
• Available as plugin for 9.4

pg_shard
• Automatic sharding for PostgreSQL
• Shards and replicates on simple expressions
  • E.g. update/delete requires sharding key
• Non-transactional across shards

ToroDB
• MongoDB frontend
  • (Speaks MongoDB protocol)
• PostgreSQL backend
  • Fully relational, not jsonb

VODKA
• Even more indexing
• Combined access methods
  • (in one index)

Summary
• PostgreSQL is a platform for data management
  • Both structured and unstructured
• Reality is usually complex
  • Requires both!

Thank you!

Magnus Hagander
magnus@hagander.net
@magnushagander

http://www.hagander.net/talks/

This material is licensed CC BY-NC 4.0.