What are the benefits?
improved query performance for big tables *
* works best when results are coming from single partition
When to consider partitioning?
Big table that cannot be queried efficiently, as the index is enormous as well.
Rule of thumb: table does not fit into memory
Table inheritance: schemaCREATE TABLE A (value INT);CREATE TABLE B () INHERITS (A);
Table "public.a" Column | Type | Modifiers--------+---------+----------- value | integer |
Table "public.b" Column | Type | Modifiers--------+---------+----------- value | integer |Inherits: a
Table inheritance: schema
CREATE TABLE C (extra_field TEXT) INHERITS (A);
Table "public.c" Column | Type | Modifiers-------------+---------+----------- value | integer | extra_field | text |Inherits: a
NB: may inherit from multiple tables
Table inheritance: querying
INSERT INTO A VALUES (0);INSERT INTO B VALUES (10);INSERT INTO C VALUES (20, 'zonk');
SELECT * FROM A WHERE value < 20; value ------- 0 10
SELECT * FROM ONLY A WHERE value < 20; value ------- 0
What happened?
EXPLAIN ANALYZE SELECT * FROM A WHERE value < 20; QUERY PLAN ---------------------------------------------------------------------------------------------------- Append (cost=0.00..69.38 rows=1290 width=4) (actual time=0.020..0.033 rows=2 loops=1) -> Seq Scan on a (cost=0.00..4.00 rows=80 width=4) (actual time=0.019..0.021 rows=1 loops=1) Filter: (value < 20) -> Seq Scan on b (cost=0.00..40.00 rows=800 width=4) (actual time=0.005..0.006 rows=1 loops=1) Filter: (value < 20) -> Seq Scan on c (cost=0.00..25.38 rows=410 width=4) (actual time=0.005..0.005 rows=0 loops=1) Filter: (value < 20) Rows Removed by Filter: 1
Table CHECK constraint
CREATE TABLE A (value INT);CREATE TABLE B (CHECK (value < 20)) INHERITS (A);CREATE TABLE C (CHECK (value >= 20)) INHERITS (A);
INSERT INTO B VALUES (10);INSERT INTO C VALUES (20);
INSERT INTO C VALUES(5);
ERROR: new row for relation "c" violates check constraint "c_value_check"DETAIL: Failing row contains (5).
Ta da!
EXPLAIN ANALYZE SELECT * FROM A WHERE value < 20; QUERY PLAN ---------------------------------------------------------------------------------------------------- Append (cost=0.00..40.00 rows=801 width=4) (actual time=0.015..0.017 rows=1 loops=1) -> Seq Scan on a (cost=0.00..0.00 rows=1 width=4) (actual time=0.002..0.002 rows=0 loops=1) Filter: (value < 20) -> Seq Scan on b (cost=0.00..40.00 rows=800 width=4) (actual time=0.012..0.013 rows=1 loops=1) Filter: (value < 20)
Gotcha #2
All check constraints and not-null constraints on a parent table are automatically inherited by its children.
Other types of constraints (unique, primary key, and foreign key constraints) are not inherited.
Solution #2: create constraints or indexes in all partitionsCREATE TABLE A (id serial NOT NULL, value INT NOT NULL);
CREATE TABLE B ( CONSTRAINT B_pkey PRIMARY KEY (id), CHECK (value < 20)) INHERITS (A);
CREATE TABLE C ( CONSTRAINT C_pkey PRIMARY KEY (id), CHECK (value >= 20)) INHERITS (A);
Gotcha #3There is no practical way to enforce uniqueness of a SERIAL id across partitions.
INSERT INTO B (value) VALUES (10);INSERT INTO C (value) VALUES (20);INSERT INTO C VALUES (1, 30);
SELECT * FROM A; id | value ----+------- 1 | 10 2 | 20 1 | 30
Gotcha #4
Specifying that another table's column REFERENCES a(value) would allow the other table to contain “a values”, but not “b or c values”. There is no good workaround for this case.
How to insert / update rows?
➔ PostgreSQL docs recommend using triggers➔ also possible to do it using rules (overhead, bulk)
Trigger exampleCREATE OR REPLACE FUNCTION a_insert_trigger()RETURNS TRIGGER AS $$BEGIN IF NEW.value < 20 THEN INSERT INTO b VALUES (NEW.*); ELSE INSERT INTO c VALUES (NEW.*); END IF; RETURN NULL;END;$$LANGUAGE plpgsql;
CREATE TRIGGER insert_a_trigger BEFORE INSERT ON a FOR EACH ROW EXECUTE PROCEDURE a_insert_trigger();
Partitioned gemhttps://github.com/fiksu/partitioned
class Company < ActiveRecord::Base; end
class ByCompanyId < Partitioned::ByForeignKey self.abstract_class = true belongs_to :company def self.partition_foreign_key return :company_id end partitioned do |partition| partition.index :id, :unique => true endend
class Employee < ByCompanyId; end
employee = Employee.from_partition(1).find(1)
UUID in Rails
Rails 4:enable_extension 'uuid-ossp'create_table :documents, id: :uuid do |t| t.string :title t.string :author t.timestampsend
Rails 3:gem postgres_ext https://github.com/dockyard/postgres_ext
Actual performance improvements?
➔ table with ~13 mln rows➔ partitioned by date into 8 partitions➔ complex, long-running query➔ baseline: 0.07 tps➔ when results in single partition: 0.15 tps➔ when results in 2 partitions: like baseline
References➔ http://www.postgresql.org/docs/9.3/static/ddl-
partitioning.html➔ PostgreSQL 9.0 High Performance Gregory Smith