Introduction to PostgreSQL


This is an introduction to PostgreSQL that provides a brief overview of PostgreSQL's architecture, features, and ecosystem. It was delivered at NYLUG on Nov 24, 2014. http://www.meetup.com/nylug-meetings/events/180533472/


November, 2014

Creative Commons Attribution License

Who Are We?

● Jim Mlodgenski
  – jimm@openscg.com
  – @jim_mlodgenski
● Co-organizer of
  – NYCPUG - www.nycpug.org
● Director, PgUS
  – www.postgresql.us
● CTO, OpenSCG
  – www.openscg.com

● Jonathan S. Katz
  – jonathan@venuebook.com
  – @jkatz05
● Co-organizer of
  – NYCPUG - www.nycpug.org
● Director, PgUS
  – www.postgresql.us
● CTO, VenueBook
  – www.venuebook.com

History

● The world’s most advanced open source database
● Designed for extensibility and customization
● ANSI/ISO compliant SQL support
● Actively developed for almost 30 years
  – University POSTGRES (1986-1993)
  – Postgres95 (1994-1995)
  – PostgreSQL (1996-2014)

Timeline

“Over the past few years, PostgreSQL has become the preferred open source relational database for many enterprise developers and start-ups, powering leading geospatial and mobile applications.” – Jeff Barr, Chief Evangelist, Amazon Web Services

Why PostgreSQL

Affordability

Technology

Security

Flexibility

Stability

Extensibility

Reliability

Predictability

Community

Auditability

Technology

● Full Featured Database
  – Mature Server Side Programming Functionality
  – Hot Standby High Availability
  – Online Backups
  – Point In Time Recovery
  – Table Partitioning
  – Spatial Functionality
  – Full Text Search

Security

● Object Level Privileges assigned to “Roles & Users”
● Row Level Security
● Many Authentication mechanisms
  – Kerberos
  – LDAP
  – PAM
  – GSSAPI
● Native SSL Support
● Data Level Encryption (AES, 3DES, etc)
● Ability to utilize 3rd party Key Stores in a full PKI infrastructure

Flexibility

● No Vendor Lock-in
  – Compliant with the ANSI SQL standard
  – Runs on all major platforms using all major languages and middleware
● “BSD-like” license – PostgreSQL License
  – Allows businesses to retain the option of commercializing the final product with minimal legal issues
  – No fear of “Open Source Viral Infection”

Predictability

● Predictable release cycles
  – The average span between major releases over the last 10 years is 13 months
● Quick turnaround on patches
  – The average span between minor releases over the last 5 years is 3 months

Version Release Date

7.3 Nov-02

7.4 Nov-03

8.0 Jan-05

8.1 Nov-05

8.2 Dec-06

8.3 Feb-08

8.4 Jul-09

9.0 Aug-10

9.1 Sep-11

9.2 Sep-12

9.3 Sep-13
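The 13-month figure can be checked against the release table itself. The following is a minimal Python sketch, assuming every two-digit year in the table means 20YY; the helper names (`month_index`, `average_gap_months`) are illustrative and not part of the talk.

```python
# Release dates copied from the table above as (version, "Mon-YY") pairs.
RELEASES = [
    ("7.3", "Nov-02"), ("7.4", "Nov-03"), ("8.0", "Jan-05"), ("8.1", "Nov-05"),
    ("8.2", "Dec-06"), ("8.3", "Feb-08"), ("8.4", "Jul-09"), ("9.0", "Aug-10"),
    ("9.1", "Sep-11"), ("9.2", "Sep-12"), ("9.3", "Sep-13"),
]

MONTHS = {name: number for number, name in enumerate(
    ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
     "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"], start=1)}


def month_index(label):
    """Convert a label such as 'Nov-02' to a running month count (assumes 20YY)."""
    month_name, two_digit_year = label.split("-")
    return (2000 + int(two_digit_year)) * 12 + MONTHS[month_name]


def average_gap_months(releases):
    """Average number of months between consecutive releases in the list."""
    indexes = [month_index(label) for _, label in releases]
    gaps = [later - earlier for earlier, later in zip(indexes, indexes[1:])]
    return sum(gaps) / len(gaps)


average_gap = average_gap_months(RELEASES)
print(average_gap)  # 13.0
```

The ten gaps between 7.3 and 9.3 add up to 130 months, which gives the 13-month average quoted on the slide.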

Community

● Strong Open Source Community
● Independent & Thriving Development Community
  – 10+ committers and ~200 reviewers
  – 1,500 contributors and 10,000+ members
● Millions of downloads per year
● PostgreSQL is a meritocracy
  – Influence comes through the merits (usually technical) of the contributor

Who's Using PostgreSQL

PostgreSQL Success Stories

“…With PostgreSQL we have been successful in growing the databases as the company has grown, both in number of users and in the complexity of services we offer…”

Hannu Krosing – Database Architect, Skype Technologies.

“We manage multiple terabytes of data in more than 50 unique production PostgreSQL databases.”

Cisco uses PostgreSQL as the embedded database in all its “Carrier Sensitive Routing” (CSR) products to store carrier details, rules, contacts, and routes, and to perform call routing.

“…Fujitsu is proud of its sponsorship of contributions to PostgreSQL and of its work with the PostgreSQL community. We are committed to helping make PostgreSQL the leading Database Management System…”

Takayuki Nakazawa – Director Database in Software Group.

Database 101

● A database stores data
● Clients ( people or applications ) input data into tables ( relations ) in the database and retrieve data from it
● Relational Database Management Systems are responsible for managing the safe-storage of data
● RDBMSs are designed to store data in an A.C.I.D compliant way ( all or nothing )
  – This is done via transactions

Database 101 - (ACID)

● Atomic – Store data in an 'all-or-nothing' approach

● Consistent – Give me a consistent picture of the data

● Isolated – Prevent concurrent data access from causing me woe

● Durable – When I 'COMMIT;' the data, make sure it is safe until I explicitly destroy it

Database 101 - (Transactions)

● All or nothing
● A transaction has
  – A Beginning ( BEGIN; )
  – Work ( multiple lines of SQL, e.g. INSERT / UPDATE / DELETE )
  – An Ending ( END; ). You would expect one of two cases
    ● COMMIT; ( save everything )
    ● ROLLBACK; ( undo all changes, save nothing )
  – Once the transaction has ended, it will either make ALL of the changes between BEGIN; and COMMIT; or NONE of them ( if there is an error for example )
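The all-or-nothing behaviour can be tried out with a small Python sketch. It uses the standard library's `sqlite3` module purely as a stand-in, so that it runs without a PostgreSQL server; the `accounts` table, its CHECK constraint and the `transfer` helper are hypothetical and not from the talk. The same BEGIN / COMMIT / ROLLBACK flow is what a client would issue against PostgreSQL.

```python
import sqlite3

# isolation_level=None turns off the module's implicit transactions,
# so the BEGIN / COMMIT / ROLLBACK statements below are the only ones issued.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE accounts ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 0)")


def transfer(conn, source, target, amount):
    """Move money between two accounts; either both updates apply or neither."""
    conn.execute("BEGIN")
    try:
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?",
            (amount, target),
        )
        # Fails the CHECK constraint if the source would go negative.
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?",
            (amount, source),
        )
        conn.execute("COMMIT")  # save everything
        return True
    except sqlite3.IntegrityError:
        conn.execute("ROLLBACK")  # undo all changes, including the first UPDATE
        return False


def balances(conn):
    """Return the current balances as a dict keyed by account name."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))


failed = transfer(conn, "alice", "bob", 500)     # rolled back, nothing changes
after_failure = balances(conn)
succeeded = transfer(conn, "alice", "bob", 30)   # committed, both rows change
after_success = balances(conn)
```

In the failed transfer the credit to `bob` has already been applied when the debit is rejected, yet after ROLLBACK neither change is visible.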

PostgreSQL 101

● PostgreSQL meets all of the requirements to be a fully ACID-compliant, transactional database.

● PostgreSQL RDBMS serves a cluster aka an instance.
  – An instance serves one ( and only one ) TCP/IP port
  – Contains at least one database
  – Has an associated data-directory

Major Features

● Full network client-server architecture
● ACID compliant
● Transactional ( uses WAL / REDO )
● Partitioning
● Tiered storage via tablespaces
● Multiversion Concurrency Control ( readers don't block writers )
● On-line maintenance operations
● Hot ( readonly ) and Warm ( quick-promote ) standby
● Log-based and trigger based replication
● SSL
● Full-text search
● Procedural languages
  – PL/pgSQL plus other, custom languages

General Limitations

Limit Value

Maximum Database Size Unlimited

Maximum Table Size 32 TB

Maximum Row Size 1.6 TB

Maximum Field Size 1 GB

Maximum Rows / Table Unlimited

Maximum Columns / Table 250-1600

Maximum Indexes / Table Unlimited
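Two of these limits follow from simple arithmetic. The Python sketch below assumes the default 8 kB page size and 32-bit block numbering for the table limit, and treats the row limit as the maximum column count times the maximum field size; those assumptions are not stated on the slide, and the constant names are illustrative.

```python
# Assumptions (not from the slide): default 8 kB pages, 32-bit block numbers.
BLOCK_SIZE_BYTES = 8 * 1024
MAX_BLOCKS = 2 ** 32
TIB = 1024 ** 4

# Largest table = number of addressable pages times the page size.
max_table_tib = BLOCK_SIZE_BYTES * MAX_BLOCKS / TIB

# Largest row = most columns a table can have times the largest single field.
MAX_COLUMNS = 1600
MAX_FIELD_GB = 1
max_row_gb = MAX_COLUMNS * MAX_FIELD_GB

print(max_table_tib)  # 32.0
print(max_row_gb)     # 1600, quoted on the slide as 1.6 TB
```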

Client Architecture

Server Overview

● PostgreSQL utilizes a multi-process architecture
● Similar to Oracle's 'Dedicated Server' mode
● Types of processes
  – Primary ( postmaster )
  – Per-connection backend process
  – Utility ( maintenance processes )

Server Architecture

Process Components

Memory Components

On-Disk Components

Data Types

● Building blocks of a schema
● Optimized on-disk format for a specific type of data
● PostgreSQL provides:
  – Wide array (no pun intended) of basic to complex data types
  – Functional interfaces for ease of manipulation
  – Ability to extend and create custom data types

Number Types

Name              Storage Size  Range
smallint          2 bytes       -32768 to +32767
integer           4 bytes       -2147483648 to +2147483647
bigint            8 bytes       -9223372036854775808 to +9223372036854775807
decimal           variable      up to 131072 digits before the decimal point; up to 16383 digits after the decimal point
numeric           variable      up to 131072 digits before the decimal point; up to 16383 digits after the decimal point
real              4 bytes       6 decimal digits precision
double precision  8 bytes       15 decimal digits precision
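The integer ranges in this table follow directly from the storage sizes: an n-byte signed two's-complement integer spans -2^(8n-1) to 2^(8n-1)-1. A short Python sketch (the `signed_range` helper is illustrative, not part of the talk):

```python
def signed_range(storage_bytes):
    """Return (lowest, highest) for a signed two's-complement integer of this size."""
    bits = 8 * storage_bytes
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


# Storage sizes taken from the table above.
INTEGER_TYPES = {"smallint": 2, "integer": 4, "bigint": 8}

for type_name, size in INTEGER_TYPES.items():
    lowest, highest = signed_range(size)
    print(f"{type_name}: {lowest} to +{highest}")
```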

Character Types

Name Description

varchar(n) variable-length with limit

char(n) fixed-length, blank padded

text variable unlimited length

Date/Time Types

Name                         Size      Range                                Resolution
timestamp without time zone  8 bytes   4713 BC to 294276 AD                 1 microsecond / 14 digits
timestamp with time zone     8 bytes   4713 BC to 294276 AD                 1 microsecond / 14 digits
date                         4 bytes   4713 BC to 5874897 AD                1 day
time without time zone       8 bytes   00:00:00 to 24:00:00                 1 microsecond / 14 digits
time with time zone          12 bytes  00:00:00+1459 to 24:00:00-1459       1 microsecond / 14 digits
interval                     12 bytes  -178000000 years to 178000000 years  1 microsecond / 14 digits

Specialized Types

Name         Storage Size                             Range
boolean      1 byte                                   false to true
smallserial  2 bytes                                  1 to 32767
serial       4 bytes                                  1 to 2147483647
bigserial    8 bytes                                  1 to 9223372036854775807
bytea        1 or 4 bytes plus size of binary string  variable-length binary string
cidr         7 or 19 bytes                            IPv4 or IPv6 networks
inet         7 or 19 bytes                            IPv4 or IPv6 hosts or networks
macaddr      6 bytes                                  MAC addresses
uuid         16 bytes                                 Universally Unique Identifiers

“Schema-less” Types

Name Description

xml stores XML data and checks the input values for well-formedness

hstore stores sets of key/value pairs

json stores an exact copy of the input JSON document

jsonb stores a decomposed binary format of the input JSON document

Range Types

● Represents a range of an element type
  – Integers
  – Numerics
  – Times
  – Dates
  – And more...

Range Types

CREATE TABLE travel_log (
    id serial PRIMARY KEY,
    name varchar(255),
    trip_range daterange,
    -- reject any row whose trip_range overlaps (&&) an existing row's range
    EXCLUDE USING gist (trip_range WITH &&)
);

INSERT INTO travel_log (name, trip_range) VALUES ('Chicago', daterange('2012-03-12', '2012-03-17'));

-- overlaps the Chicago trip on 2012-03-16, so the constraint rejects it
INSERT INTO travel_log (name, trip_range) VALUES ('Austin', daterange('2012-03-16', '2012-03-18'));

ERROR: conflicting key value violates exclusion constraint "travel_log_trip_range_excl"
DETAIL: Key (trip_range)=([2012-03-16,2012-03-18)) conflicts with existing key (trip_range)=([2012-03-12,2012-03-17)).
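The reason the second INSERT is rejected can be illustrated outside the database. The ranges in the error message are half-open, written `[lower,upper)`, and two such ranges overlap when each one starts before the other ends. The Python sketch below models only that overlap test; the `DateRange` type and `overlaps` function are illustrative and are not how PostgreSQL implements the `&&` operator or the GiST index.

```python
from datetime import date
from typing import NamedTuple


class DateRange(NamedTuple):
    """A half-open range [lower, upper), as shown in the DETAIL line above."""
    lower: date  # inclusive
    upper: date  # exclusive


def overlaps(a, b):
    """True when the two half-open ranges share at least one day."""
    return a.lower < b.upper and b.lower < a.upper


chicago = DateRange(date(2012, 3, 12), date(2012, 3, 17))
austin = DateRange(date(2012, 3, 16), date(2012, 3, 18))

# A hypothetical trip that starts on the day the Chicago range ends.
next_trip = DateRange(date(2012, 3, 17), date(2012, 3, 18))

print(overlaps(chicago, austin))     # True: both ranges contain 2012-03-16
print(overlaps(chicago, next_trip))  # False: the upper bound is exclusive
```

Because the upper bound is exclusive, a trip starting on 2012-03-17 would not conflict with the Chicago row.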

Indexes

● Enhances database performance
● Enforces some types of constraints
  – Uniqueness
  – Exclusion

Index Types

● B-Tree
● Generalized Inverted Index (GIN)
● Generalized Search Tree (GiST)
● Space-Partitioned Generalized Search Tree (SP-GiST)

Coming Soon...

● Block Range Index (BRIN)
● “VODKA”

Procedural Languages

● Allows for user-defined functionality to be run within the database
  – Used as functions or triggers
● Frequent use cases
  – Enhance performance
  – Increase security
  – Centralize business logic

Procedural Language Types

● PL/pgSQL
● PL/Perl
● PL/Tcl
● PL/Python
● More available through extensions...

Extensions

● Additional modules that can be plugged into PostgreSQL
● Can be used to add a ton of useful features
  – Procedural Languages
  – Data Types
  – Administration Tools
  – Foreign Data Wrappers
● Many found in contrib
● Also www.pgxn.org

Procedural Language Extensions

● pl/Java
● pl/v8
● pl/R
● pl/Ruby
● pl/schema
● pl/lolcode
● pl/sh
● pl/Proxy
● pl/psm
● pl/lua
● pl/php

Data Type Extensions

● Hstore
● Case Insensitive Text (citext)
● International Product Numbering Standards (ISN)
● PostGIS (geometry)
● BioPostgres
● SSN
● Email

PostGIS

● PostGIS adds OpenGIS Consortium (OGC) compliant geometry data types and functions to PostgreSQL

● With PostgreSQL, becomes a best of breed spatial and raster database

Administration Tool Extensions

● auto_explain
● pageinspect
● pg_buffercache
● pg_stat_statements
● Slony
● OmniPITR
● pg_monitoring
● pgaudit
● pg_partman

What are Foreign Data Wrappers?

● Used with SQL/MED
  – New ANSI SQL 2003 Extension
  – Management of External Data
  – Standard way of handling remote objects in SQL databases
● Wrappers used by SQL/MED to access remote data sources
● Makes external data sources look like a PostgreSQL table

FDW Extensions

● PostgreSQL
● Oracle
● MySQL
● Informix
● Firebird
● SQLite
● JDBC
● ODBC
● TDS (Sybase/SQL Server)
● S3
● WWW
● PG-Strom
● Column Store
● Delimited files
● Fixed length files
● JSON files
● Hadoop
● MongoDB
● CouchDB
● MonetDB
● Redis
● Neo4j
● Tycoon
● LDAP

MongoDB FDW

CREATE SERVER mongo_server FOREIGN DATA WRAPPER mongo_fdw
    OPTIONS (address '192.168.122.47', port '27017');

CREATE FOREIGN TABLE databases (
    _id NAME,
    name TEXT
)
SERVER mongo_server
OPTIONS (database 'mydb', collection 'pgData');

test=# select * from databases ;
           _id            |    name
--------------------------+------------
 52fd49bfba3ae4ea54afc459 | mongo
 52fd49bfba3ae4ea54afc45a | postgresql
 52fd49bfba3ae4ea54afc45b | oracle
 52fd49bfba3ae4ea54afc45c | mysql
 52fd49bfba3ae4ea54afc45d | redis
 52fd49bfba3ae4ea54afc45e | db2
(6 rows)

WWW FDW

test=# SELECT * FROM www_fdw_geocoder_google
test-# WHERE address = '731 Lexington Ave, New York, NY';
-[ RECORD 1 ]-----+----------------------------------------------
address           |
type              | street_address
formatted_address | 731 Lexington Avenue, New York, NY 10022, USA
lat               | 40.7619363
lng               | -73.9681017
location_type     | ROOFTOP

PL/Proxy

● Developed by Skype

● Allows for scalability and parallelization

● Uses procedural languages and FDWs

PostgreSQL Replication

● Replicate to read-only databases using native streaming replication

● All writes go to a master server

● Load balance across the pool of servers
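On the application side, this topology usually means sending writes to the master and spreading reads over the read-only servers. The Python sketch below shows one way such routing could look; the `ConnectionRouter` class and the host names are hypothetical, and classifying a statement by its first keyword is a deliberate simplification (a SELECT that calls a writing function would be misrouted).

```python
from itertools import cycle


class ConnectionRouter:
    """Pick a server for each statement: writes to the master, reads round-robin."""

    def __init__(self, master, replicas):
        self.master = master
        # cycle() repeats the replica list forever, giving round-robin order.
        self._replicas = cycle(replicas) if replicas else None

    def route(self, sql):
        """Return the host that should run this statement."""
        words = sql.lstrip().split(None, 1)
        verb = words[0].upper() if words else ""
        if verb == "SELECT" and self._replicas is not None:
            return next(self._replicas)
        return self.master


# Hypothetical host names, for illustration only.
router = ConnectionRouter("db-master", ["db-replica-1", "db-replica-2"])
print(router.route("SELECT * FROM travel_log"))
print(router.route("INSERT INTO travel_log (name) VALUES ('Boston')"))
```

With no replicas configured, every statement falls back to the master.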

PostgreSQL Scalability

● PostgreSQL scales linearly up to 64 cores

● May scale further but hardware is not available to the community

http://rhaas.blogspot.com/2012/04/did-i-say-32-cores-how-about-64.html

Getting Help

● Community Mailing Lists
  – http://www.postgresql.org/list/
● IRC
  – irc://irc.freenode.net/postgresql
● NYC PostgreSQL User Group
  – http://www.nycpug.org

Questions?
