PostgreSQL 9.4 and Beyond @ FOSSASIA 2015 Singapore


PostgreSQL 9.4 and Beyond: JSON, Analytics, and More

Uptime Technologies

Satoshi Nagayasu (@snaga)

FOSSASIA 2015

Satoshi Nagayasu
• Satoshi Nagayasu
  – Database enthusiast. DBA and Data Steward.
  – Traveling Asia: Hong Kong, Shenzhen, Beijing, Singapore
• Uptime Technologies
  – Co-founder
  – Providing consulting services around Database and Platform Technologies.
• PostgreSQL
  – pgstatindex, pageinspect, xlogdump
  – PostgresForest, Postgres-XC (clusters)
  – Organizing Japanese Users Group.

What I'm doing on PostgreSQL
• Postgres Toolkit
  – Brand new PostgreSQL DBA tool
  – Stay informed at uptime.jp/go/pt
• Postgres Add-on for Hinemos
  – One of the most popular system management tools in Japan.
  – Monitoring, Alerting, Job Management, etc.

PostgreSQL and Hinemos
[Screenshots: Number of sessions, Database Size, Cache Hit Ratio, Number of Written Blocks]

Hacking Hardware
• Raspberry Pi 2
• DE0 (FPGA)
• ZigBee (wireless)

Thanks to...
• Magnus Hagander
• Michael Paquier
• Toshi Harada
• Noriyoshi Shinoda
• ... and many pg guys!

Agenda
• 9.4 Overview
• NoSQL (JSON and GIN Index)
• Analytics (Aggregation & Mat.View)
• Replication and Beyond (Logical Decoding)
• Administration (ALTER SYSTEM)
• Infrastructure (For Parallelization)
• Beyond 9.4

9.4 Overview

9.4 Overview - Status
• The first official release
  – 9.4 released on 18th December 2014.
• The latest stable release
  – 9.4.1 released on 5th February 2015.

9.4 Overview - Statistics
• 9.4.0 compared to 9.3.5
  – 3,750 files changed
  – 62,960 insertions (+)
  – 15,935 deletions (-)

9.4 Overview - Changes
• Server
  – Indexes
  – General Performance
  – Monitoring
  – SSL
  – Server Settings
• Replication and Recovery
  – Logical Decoding
• Queries
• Utility Commands
  – EXPLAIN
  – Views
• Object Manipulation
• Data Types
  – JSON
• Functions
  – System Information Functions
  – Aggregates
• Server-Side Languages
  – PL/pgSQL Server-Side Language
• libpq
• Client Applications
  – psql (Backslash Commands)
  – pg_dump
  – pg_basebackup
• Source Code
• Additional Modules
  – pgbench
  – pg_stat_statements

9.4 Overview - Changes
Categories of Enhancements
• NoSQL (JSON and GIN Index)
• Analytics (Aggregation & Mat.View)
• Replication+ (Logical Decoding)
• Administration (ALTER SYSTEM)
• Basic Infrastructure (Parallelization)

NoSQL (JSON and GIN Index)

NoSQL - JSONB
• JSON vs. JSONB

NoSQL - JSONB
• “Binary JSON”
  – Different from JSON, which is a text representation
  – Faster for searching
• With JSONB...
  – No duplicate keys allowed. Last wins.
  – Key order not preserved.
  – Can take advantage of GIN indexes.
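The JSONB rules above can be checked directly in psql. This is a minimal sketch; the literal values are made up for illustration.

```sql
-- json keeps the input text verbatim, including duplicate keys and key order.
SELECT '{"b": 1, "a": 2, "a": 3}'::json;

-- jsonb parses the input into a binary form: the last duplicate key wins
-- and the original key order is not preserved.
SELECT '{"b": 1, "a": 2, "a": 3}'::jsonb;
-- → {"a": 3, "b": 1}
```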

NoSQL - GIN Index
• JSON+btree vs. JSONB+GIN
  – Btree indexes vs. GIN index
[Chart: Table Index Size Comparison]
Source: http://www.slideshare.net/toshiharada/jpug-studyjsonbdatatype20141011-40103981
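A JSONB column with a GIN index might be set up as follows. This is a sketch only: the table, column, and index names are hypothetical and are not taken from the benchmark cited above.

```sql
-- Hypothetical table holding one JSONB document per row.
CREATE TABLE docs (
    id   serial PRIMARY KEY,
    body jsonb NOT NULL
);

-- Default GIN operator class (jsonb_ops): supports @>, ?, ?| and ?&.
CREATE INDEX docs_body_gin ON docs USING gin (body);

-- Alternative: jsonb_path_ops supports only @>, but is usually smaller.
-- CREATE INDEX docs_body_path_gin ON docs USING gin (body jsonb_path_ops);

-- Containment query that can use the GIN index.
SELECT id FROM docs WHERE body @> '{"tags": ["postgres"]}';
```

With plain JSON, a comparable lookup would typically need one btree expression index per extracted key, whereas a single GIN index covers the whole document.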

Analytics (Aggregation & Materialized View)

Analytics - Aggregation
• FILTER replaces CASE WHEN.
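To make the comparison concrete, here is the same conditional aggregation written both ways. The orders table and its columns are assumptions for illustration.

```sql
-- Hypothetical table: orders(status text, amount numeric).

-- Before 9.4: conditional aggregation with CASE WHEN.
SELECT count(*)                                         AS total,
       sum(CASE WHEN status = 'paid' THEN amount END)   AS paid_amount,
       count(CASE WHEN status = 'cancelled' THEN 1 END) AS cancelled
  FROM orders;

-- 9.4: the same result with FILTER.
SELECT count(*)                                         AS total,
       sum(amount) FILTER (WHERE status = 'paid')       AS paid_amount,
       count(*)    FILTER (WHERE status = 'cancelled')  AS cancelled
  FROM orders;
```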

Analytics - Aggregation
• New Aggregate Functions
  – percentile_cont()
  – percentile_disc()
  – mode()
  – rank()
  – dense_rank()
  – percent_rank()
  – cume_dist()

Analytics - Aggregation
• Ordered-set aggregates
  – mode(), most common value in a subset

Analytics - Aggregation
• Ordered-set aggregates
  – rank(), rank of a hypothetical value within a subset
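The new aggregates use the WITHIN GROUP (ORDER BY ...) syntax. The following sketch assumes a hypothetical scores table; the column aliases are illustrative only.

```sql
-- Hypothetical table: scores(class text, score int).
SELECT class,
       -- Median, interpolated between neighbouring values if needed.
       percentile_cont(0.5) WITHIN GROUP (ORDER BY score) AS median_interpolated,
       -- Median, always one of the actual input values.
       percentile_disc(0.5) WITHIN GROUP (ORDER BY score) AS median_actual_value,
       -- Most common score in each class.
       mode()               WITHIN GROUP (ORDER BY score) AS most_common_score,
       -- Hypothetical-set aggregate: the rank that a score of 80 would have
       -- if it were added to each class.
       rank(80)             WITHIN GROUP (ORDER BY score) AS rank_of_80
  FROM scores
 GROUP BY class;
```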

Analytics - Materialized Views
• REFRESH MATERIALIZED VIEW CONCURRENTLY myview
• Refreshes a materialized view without locking out concurrent SELECTs on it.
• Usability and availability improved.
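A concurrent refresh has one prerequisite that is easy to miss: the materialized view needs a unique index. A minimal sketch, assuming a hypothetical sales table as the source:

```sql
-- Hypothetical source table: sales(region text, amount numeric).
CREATE MATERIALIZED VIEW myview AS
    SELECT region, sum(amount) AS total
      FROM sales
     GROUP BY region;

-- CONCURRENTLY requires at least one unique index on the materialized view
-- that uses only column names and covers all rows.
CREATE UNIQUE INDEX myview_region_idx ON myview (region);

-- Readers can keep querying myview while the refresh runs.
REFRESH MATERIALIZED VIEW CONCURRENTLY myview;
```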

Replication and Beyond (Logical Decoding)

Replication and Beyond - Logical Decoding
• “Logical” representation from replication stream
  – INSERT/UPDATE/DELETE operations
  – Can be replayed on different version/platform
• pg_recvlogical command
  – Shows how it works
• Replication can be more flexible
  – BDR (Bi-Directional Rep.), Slony, and more ...
  – Continuous Backup as well

pg_recvlogical (contrib)
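Besides pg_recvlogical, logical decoding can also be tried from plain SQL. This sketch uses the SQL-level slot functions instead of the command-line tool shown in the talk. It assumes wal_level = logical, max_replication_slots > 0, and that the test_decoding contrib plugin is installed. The slot and table names are hypothetical.

```sql
-- Hypothetical table whose changes will be decoded.
CREATE TABLE t (id int PRIMARY KEY, name text);

-- Create a logical replication slot that uses the test_decoding output plugin.
SELECT * FROM pg_create_logical_replication_slot('demo_slot', 'test_decoding');

-- Make a change after the slot exists.
INSERT INTO t VALUES (1, 'hello');

-- Consume the decoded changes: BEGIN, the INSERT, and COMMIT appear as text rows.
SELECT * FROM pg_logical_slot_get_changes('demo_slot', NULL, NULL);

-- Drop the slot when done; an unused slot makes the server retain WAL.
SELECT pg_drop_replication_slot('demo_slot');
```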

Administration (ALTER SYSTEM)

Administration - ALTER SYSTEM
• ALTER SYSTEM SET
  – Puts the new value in postgresql.auto.conf.
  – pg_reload_conf() reloads the settings.
  – postgresql.auto.conf takes priority over postgresql.conf.
• ALTER SYSTEM RESET
  – Removes values from postgresql.auto.conf.
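A typical round trip looks like this. The parameter and value are chosen only for illustration; work_mem is used because it can be changed with a reload, whereas some parameters need a server restart.

```sql
-- Write the setting to postgresql.auto.conf (superuser only;
-- ALTER SYSTEM cannot run inside a transaction block).
ALTER SYSTEM SET work_mem = '64MB';

-- Ask the server to re-read its configuration files.
SELECT pg_reload_conf();

-- Once the reload has been processed, SHOW reflects the new value.
SHOW work_mem;

-- Remove the entry again, falling back to postgresql.conf or the default.
ALTER SYSTEM RESET work_mem;
SELECT pg_reload_conf();
```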

Infrastructure (For Parallelization)

Dynamic Background Workers
• In 9.3, background workers had to be started at postmaster startup.
• As of 9.4, they can be launched on demand.
• From a parallelization point of view...
  – This allows launching multiple background processes to execute child queries in parallel.

Dynamic Shared Memory
• Shared memory can be allocated on demand
  – e.g., by background workers
• Main segment (e.g., shared_buffers) is still fixed at startup
• Also supports a lightweight message queue
• From a parallelization point of view...
  – This allows sharing data and communicating among several bgworker processes.

My Tiny Favorite (pl/pgsql stacktrace)

pl/pgsql stacktrace

http://h50146.www5.hp.com/services/ci/opensource/pdfs/PostgreSQL_9_4%20_Ver_1_0.pdf
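Assuming the slide refers to the PG_CONTEXT item that 9.4 added to GET DIAGNOSTICS, a call stack can be captured from inside a PL/pgSQL function like this. The function names are hypothetical.

```sql
-- Hypothetical function that reports its own call stack.
CREATE OR REPLACE FUNCTION inner_func() RETURNS void AS $$
DECLARE
    stack text;
BEGIN
    -- New in 9.4: PG_CONTEXT returns the current call stack as text.
    GET DIAGNOSTICS stack = PG_CONTEXT;
    RAISE NOTICE E'--- call stack ---\n%', stack;
END;
$$ LANGUAGE plpgsql;

-- Hypothetical caller, so that the stack has more than one frame.
CREATE OR REPLACE FUNCTION outer_func() RETURNS void AS $$
BEGIN
    PERFORM inner_func();
END;
$$ LANGUAGE plpgsql;

SELECT outer_func();
```

The notice lists one line per PL/pgSQL frame, innermost first, which helps when tracing nested function calls without raising an error.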

There are many other enhancements, so please try 9.4 as soon as possible.

Beyond 9.4

BRIN Index
• Block Range INdex
  – Holds "summary" data, instead of raw data.
  – Reduces index size tremendously.
  – Also reduces creation/maintenance cost.
  – Needs an extra tuple fetch to get the exact record.

[Chart: Index Creation. Elapsed time (ms), Btree vs. BRIN]
[Chart: Index Size. Number of Blocks, Btree vs. BRIN]
[Chart: Select 1 record. Elapsed time (ms), Btree vs. BRIN]

https://gist.github.com/snaga/82173bd49749ccf0fa6c
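For reference, creating a BRIN index looks like this in the syntax that shipped with PostgreSQL 9.5. The table and index names are hypothetical and are not taken from the benchmark above.

```sql
-- Hypothetical append-only table whose rows arrive in timestamp order.
CREATE TABLE events (
    created_at timestamptz NOT NULL,
    payload    text
);

-- BRIN stores one summary (min/max here) per range of table blocks.
CREATE INDEX events_created_brin ON events USING brin (created_at);

-- Optional: smaller ranges make the index larger but more selective.
-- CREATE INDEX events_created_brin_32 ON events
--     USING brin (created_at) WITH (pages_per_range = 32);

-- Range predicates on the indexed column can use the BRIN index.
SELECT count(*) FROM events
 WHERE created_at >= '2015-03-01' AND created_at < '2015-03-02';
```

BRIN works best when the physical row order correlates with the indexed column, as in this append-only case. Otherwise each block range summary covers a wide span of values and few ranges can be skipped.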

Commitfest 2015-2
CommitFest is a process to review, fix and commit the submitted patches.
• Parallel Seq Scan
• INSERT ... ON CONFLICT {UPDATE | IGNORE}
• File level incremental backup
• and others...

Still work in progress...

commitfest.postgresql.org

Wrap-up
• One of the most developer-friendly RDBMSes in the world.
• Analytics features and performance are improving.
• Things are going parallel.

Resources
• www.postgresql.org
• www.planetpostgresql.org
• www.pgcon.org

Any Questions?

Thank you!
• E-mail: snaga@uptime.jp
• Twitter, Github: @snaga
• WeChat: satoshinagayasu
