  • 8/14/2019 pgpool feat and devel



    Agenda

    Developers

    History

    Existing pgpool project

    Ongoing pgpool-II project

    Demonstration


    2006/07/09 Toronto Copyright(c) 2006 pgpool DG

    Who are we?

    pgpool Global Development Group

    Tatsuo Ishii (SRA OSS, Inc. Japan)

    Devrim Gunduz (Command Prompt, Inc.)

    Yoshiyuki Asaba (SRA OSS, Inc. Japan)

    Taiki Yamaguchi (SRA OSS, Inc. Japan)

    pgpool-II development team

    In addition to Tatsuo, Yoshiyuki and Taiki:

    Tomoaki Sato, Yoshiharu Mori, Kaori Inaba (all from SRA OSS, Inc. Japan)


    Developers!

    Yoshiyuki: Project leader, pgpool developer

    Yoshiharu: Query rewriting, parallel execution engine

    Kaori: Project manager, System DB

    Tatsuo: Enhance pgpool-I, pgpool developer

    Taiki: PCP, Query Cache, pgpool developer

    Tomoaki: Communication manager

    Our Boss: Green Turtle


    pgpool-I: the history

    V0.1: Started as a personal project (2003/6/27)

    V1.0: Synchronous replication (2004/3)

    V2.0: Load balance (2004/6)

    V3.0: pgpool Global Development Group (2006/2)


    Why pgpool?

    No general purpose connection pooling software was available

    Java has its own, but PHP does not...

    No small to mid scale, lightweight synchronous replication tool was available


    pgpool process architecture

    [Diagram labels: pgpool.conf, pgpool parent process, pre-fork, pgpool child process, PostgreSQL backend process]


    pgpool functionality: connection pooling

    Reduce connection overhead

    Limit maximum number of connections to the PostgreSQL backend

    New incoming connections are queued in the kernel if all pgpool processes are busy
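The idea of reusing a backend connection can be sketched as follows. This is only an illustration, not pgpool's code: `FakeBackendConnection`, `ConnectionPool` and `max_pool` are hypothetical names, keying the cache by the (user, database) pair is inferred from the diagrams, and the least-recently-used eviction is an assumption.

```python
from collections import OrderedDict


class FakeBackendConnection:
    """Stand-in for a real PostgreSQL connection (hypothetical)."""

    opened = 0  # counts how many backend connections were created

    def __init__(self, user, database):
        FakeBackendConnection.opened += 1
        self.user = user
        self.database = database


class ConnectionPool:
    """Caches backend connections keyed by (user, database)."""

    def __init__(self, max_pool):
        self.max_pool = max_pool
        self._cache = OrderedDict()

    def acquire(self, user, database):
        key = (user, database)
        if key in self._cache:
            # Same user/database pair: reuse, no new backend connection.
            self._cache.move_to_end(key)
            return self._cache[key]
        if len(self._cache) >= self.max_pool:
            # Pool is full: drop the least recently used entry (assumption).
            self._cache.popitem(last=False)
        conn = FakeBackendConnection(user, database)
        self._cache[key] = conn
        return conn


pool = ConnectionPool(max_pool=2)
first = pool.acquire("user1", "db1")
second = pool.acquire("user1", "db1")  # same pair: pooled connection reused
third = pool.acquire("user2", "db2")   # different pair: new connection
```

Only two backend connections are opened for the three requests, which is where the reduced connection overhead comes from.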


    How does connection pooling work? (1)

    [Diagram: pgpool between clients and PostgreSQL; every connection shown is labeled user1/db1]


    How does connection pooling work? (2)

    [Diagram: pgpool between clients and PostgreSQL; connections are labeled user1/db1, user2/db2 and user3/db3]


    pgpool understands frontend/backend protocol

    pgpool is transparent to both applications and PostgreSQL

    Virtually no modifications to applications are needed

    No special APIs are needed

    Can be used with existing language APIs

    Can be more efficient than libpq because of smaller controlling granularity


    Limitations of current implementation

    resetting pooled connection may not be perfect (need modification to applications)

    temp tables

    need help from PostgreSQL

    SSL is not supported

    fall back to non SSL mode

    no pg_hba.conf like IP based authentication


    pgpool functionality: replication

    Queries are duplicated and sent to PostgreSQL servers

    [Diagram: pgpool forwards the same query to two PostgreSQL servers]


    Deadlock problem

    [Diagram: timeline (t) of session 1 and session 2 on the master and on the secondary, showing lock and wait events]


    Strict mode in replication to avoid deadlock problem

    [Diagram, four steps: a query packet arrives at pgpool; pgpool sends the query to one PostgreSQL server and waits until something returns; pgpool sends the query to the other PostgreSQL server and waits until something returns; pgpool replies back the result]
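The strict-mode ordering can be sketched as below. This is a simplified model, not pgpool's implementation: `FakeBackend` and `replicate_strict` are hypothetical names, and returning the master's result to the client is an assumption.

```python
class FakeBackend:
    """Stand-in for a PostgreSQL server; records the queries it receives."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def execute(self, query):
        # A real backend would block here until the query, and any lock
        # it needs, has completed.
        self.log.append((self.name, query))
        return f"result of {query!r}"


def replicate_strict(query, master, secondary):
    """Send the query to the master, wait for it, then to the secondary."""
    master_result = master.execute(query)  # returns only when the master is done
    secondary.execute(query)               # issued only after the master replied
    return master_result                   # assumption: reply with the master's result


log = []
master = FakeBackend("master", log)
secondary = FakeBackend("secondary", log)
result = replicate_strict("LOCK TABLE t", master, secondary)
```

Because every session reaches the master first, two sessions cannot take the same locks in opposite order on the two servers, at the cost of running the query on the servers one after the other.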


    pgpool functionality: load balance

    SELECT queries are sent to a randomly chosen backend

    The ratio for load balancing can be changed

    [Diagram: pgpool in front of two PostgreSQL servers]
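A weighted random choice is one way to picture a configurable load balancing ratio. The sketch below is an assumption about the mechanism, not pgpool's code; `pick_backend` and the weights are made up for illustration.

```python
import random


def pick_backend(weights, rng):
    """Pick a backend index at random, in proportion to its weight."""
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]


rng = random.Random(0)  # seeded so the run is repeatable
counts = [0, 0]
for _ in range(10_000):
    # Backend 0 should receive roughly three SELECTs for every one
    # that goes to backend 1.
    counts[pick_backend([3, 1], rng)] += 1
```

Setting a weight to zero removes that backend from SELECT traffic without changing anything else.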


    Limitations of current implementation in replication

    CURRENT_TIMESTAMP and functions returning server-dependent values cannot be replicated

    MD5 and crypt authentication are not supported

    Sequences and SERIAL need table locking if there is more than one connection

    Functions having side effects cannot be load balanced


    pgpool-II project

    Information-Technology Promotion Agency, Japan (IPA: http://www.ipa.go.jp) granted project

    Started in February 2006, expected to release the first version in September 2006 under BSD license

    Features including parallel query and enhancement to pgpool

    Successor to pgpool


    Goal of pgpool-II project

    Implement parallel query processing

    Enhance pgpool

    Allow more than 2 DB nodes

    More precise control using shared memory

    Easy to manage

    Control port/protocol

    Detailed statistics on node status

    GUI management tool


    pgpool-II architecture overview

    [Diagram components: Client, Communication Manager, SQL Parser, Query Rewriting, Parallel execution engine, Replication/load balance engine, Query Cache, shared memory manager, pgpool Catalog, pgpool-II System DB, PCP lib, PCP command, pgpoolAdmin, PostgreSQL DB nodes]


    pgpool-I and pgpool-II mode

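    pgpool-I mode: replication, load balance, fail over; virtually compatible with pgpool

    [Diagram: each of two nodes holds the values 1000, 2000, 3000 and 4000]

    pgpool-II mode: parallel query

    [Diagram: one node holds 1000 and 2000, the other holds 3000 and 4000]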


    Table partitioning control

    CREATE TABLE dist_def (
        dbname TEXT,         -- database name
        schema_name TEXT,    -- schema name
        table_name TEXT,     -- table name
        col_name TEXT,       -- key col name
        col_list TEXT[],     -- col names
        type_list TEXT[],    -- col types
        dist_def_func TEXT,  -- function name
        PRIMARY KEY (dbname, schema_name, table_name)
    );

    INSERT INTO dist_def VALUES (
        'y-mori', 'public', 'accounts', 'aid',
        ARRAY['aid','bid','abalance','filler'],
        ARRAY['integer','integer','integer','character(84)'],
        'dist_def_accounts'
    );

    CREATE OR REPLACE FUNCTION dist_def_accounts (val ANYELEMENT)
    RETURNS INTEGER AS '
        SELECT CASE WHEN $1 >= 0 and $1 < 100001 THEN 0
                    WHEN $1 >= 100001 and $1 < 200001 THEN 1
                    WHEN $1 >= 200001 and $1 < 300000 THEN 2
        END' LANGUAGE SQL;
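To make the mapping easier to follow, here is the same range logic as `dist_def_accounts` transcribed into Python. It is a reading aid only; the Python function is not part of pgpool-II, and returning `None` stands in for the NULL that a SQL CASE without ELSE produces.

```python
def dist_def_accounts(aid):
    """Return the DB node number for a key, or None if no range matches."""
    if 0 <= aid < 100001:
        return 0
    if 100001 <= aid < 200001:
        return 1
    if 200001 <= aid < 300000:
        return 2
    return None  # mirrors SQL: CASE with no matching WHEN and no ELSE is NULL


node_for_1000 = dist_def_accounts(1000)
node_for_150000 = dist_def_accounts(150000)
node_for_250000 = dist_def_accounts(250000)
```

With the ranges exactly as written on the slide, a key of 300000 matches no branch, so the function yields no node for it.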


    Simple parallel query

    SELECT * FROM accounts WHERE aid = 1000;

    [Diagram: pgpool sends the same query, SELECT * FROM accounts WHERE aid = 1000;, to each PostgreSQL server]


    Complex query example

    SELECT * FROM accounts WHERE aid = 1000 ORDER BY aid;

    [Diagram: pgpool rewrites the query and sends it to the System DB (PostgreSQL):]

    SELECT * FROM dblink('con',
        'SELECT pool_parallel(''SELECT * FROM accounts WHERE aid = 1000'')')
        AS foo(...) ORDER BY aid;

    [The System DB sends the inner call back to pgpool:]

    SELECT pool_parallel('SELECT * FROM accounts WHERE aid = 1000')

    [pgpool then sends the per-node query to PostgreSQL:]

    SELECT * FROM accounts WHERE aid = 1000;
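The nesting of quoted strings in the rewritten query is easy to get wrong, so the sketch below builds it step by step. It is an illustration of the string construction only, not pgpool-II's rewriter: `sql_literal` and `rewrite_for_system_db` are hypothetical helpers, and the column definitions passed in are an assumption borrowed from the `dist_def` row for `accounts`, since the slide elides them as `foo(...)`.

```python
def sql_literal(text):
    """Quote text as a SQL string literal, doubling embedded quotes."""
    return "'" + text.replace("'", "''") + "'"


def rewrite_for_system_db(node_query, column_defs, order_by):
    """Wrap a per-node query in pool_parallel() and dblink()."""
    # Inner level: the query that each DB node should run.
    parallel_call = "SELECT pool_parallel(" + sql_literal(node_query) + ")"
    # Outer level: the System DB runs the call through dblink and
    # applies the ORDER BY to the combined rows.
    return (
        "SELECT * FROM dblink('con', " + sql_literal(parallel_call) + ") "
        "AS foo(" + column_defs + ") ORDER BY " + order_by + ";"
    )


rewritten = rewrite_for_system_db(
    "SELECT * FROM accounts WHERE aid = 1000",
    # Hypothetical column list; the slide shows only foo(...).
    "aid integer, bid integer, abalance integer, filler character(84)",
    "aid",
)
```

Each level of nesting doubles the single quotes of the level inside it, which is why the inner query appears between pairs of doubled quotes in the dblink argument.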


    DML

    INSERT recognizes the partition key value in a query and INSERTs into the appropriate DB node

    UPDATE/DELETE simply issues the same query to all DB nodes


    INSERT

    INSERT INTO accounts(aid, bid) VALUES(500,100);

    [Diagram: pgpool calls SELECT dist_def_accounts(500); on the System DB, which returns DB node = 1; the INSERT is then sent to DB node 1]

    DB node 0: aid = 0-499

    DB node 1: aid = 500-999

    DB node 2: aid = 1000-1499
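The routing rule for DML can be sketched as follows, using the key ranges shown for the three DB nodes. `NODE_RANGES`, `node_for_key` and `target_nodes` are hypothetical names, and passing the key value in as an argument stands in for the query parsing that pgpool-II would do.

```python
# Key ranges per DB node as shown on the INSERT slide: node -> (low, high)
NODE_RANGES = {0: (0, 499), 1: (500, 999), 2: (1000, 1499)}


def node_for_key(aid):
    """Return the DB node whose range contains the partition key."""
    for node, (low, high) in NODE_RANGES.items():
        if low <= aid <= high:
            return node
    raise ValueError(f"no DB node holds aid = {aid}")


def target_nodes(statement, aid=None):
    """Return the list of DB nodes a DML statement is sent to."""
    kind = statement.lstrip().split(None, 1)[0].upper()
    if kind == "INSERT":
        if aid is None:
            raise ValueError("INSERT needs an explicit partition key value")
        return [node_for_key(aid)]   # exactly one node receives the row
    if kind in ("UPDATE", "DELETE"):
        return sorted(NODE_RANGES)   # same query goes to every node
    raise ValueError(f"not a DML statement: {kind}")


insert_targets = target_nodes(
    "INSERT INTO accounts(aid, bid) VALUES(500,100);", aid=500
)
update_targets = target_nodes("UPDATE accounts SET bid = 1")
```

An INSERT without a key value cannot be routed in this model, so the sketch raises an error for it.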


    Query Cache

    CREATE TABLE pgpool_catalog.query_cache (
        hash TEXT,                             -- query string (MD5 hashed)
        query TEXT,                            -- query string
        value bytea,                           -- query result (RowDescription and DataRow packets)
        dbname TEXT,                           -- database name
        create_time TIMESTAMP WITH TIME ZONE,  -- cache creation time
        PRIMARY KEY(hash)
    );

    Caches query result

    Caches can be validated by pgpoolAdmin
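A small in-memory model may help show how a table keyed by an MD5 hash of the query string behaves as a cache. This is a sketch under assumptions: `QueryCache` is a hypothetical class, the hash is taken over the exact query text only, and the stored bytes are a placeholder for the RowDescription and DataRow packets.

```python
import hashlib
from datetime import datetime, timezone


class QueryCache:
    """In-memory model of the query_cache table, keyed by MD5 of the query."""

    def __init__(self):
        self._rows = {}  # hash -> row, mirroring PRIMARY KEY(hash)

    @staticmethod
    def hash_query(query):
        return hashlib.md5(query.encode("utf-8")).hexdigest()

    def store(self, query, value, dbname):
        self._rows[self.hash_query(query)] = {
            "query": query,
            "value": value,
            "dbname": dbname,
            "create_time": datetime.now(timezone.utc),
        }

    def lookup(self, query):
        """Return the cached result bytes, or None on a cache miss."""
        row = self._rows.get(self.hash_query(query))
        return None if row is None else row["value"]

    def invalidate(self, query):
        """Drop the cached entry for a query, if there is one."""
        self._rows.pop(self.hash_query(query), None)


cache = QueryCache()
sql = "SELECT * FROM accounts WHERE aid = 1000;"
miss = cache.lookup(sql)
cache.store(sql, b"placeholder result packets", "y-mori")
hit = cache.lookup(sql)
```

Because the key is a hash of the literal text, two queries that differ only in whitespace or letter case would be cached separately in this model.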


    Using parallel query and replication together

    [Diagram: a parallel query layer (one pgpool-II with parallel query) above a replication and load balance layer (two pgpool-II instances with replication), above four PostgreSQL servers]


    pgpoolAdmin

    Web based pgpool management tool

    Apache/PHP/Smarty

    functions

    Stopping pgpool

    Switch over

    Monitoring connection pool status

    Monitoring process status

    Editing configuration file

    Query Cache management


    Benchmarks

    Hitachi Blade Symphony BS1000

    10 blades

    Xeon 3.0GHz x 1

    1GB Mem

    UL320 SCSI Disks

    CentOS 4.3

    PostgreSQL 8.1.4/pgbench

    Please note that these results were measured on a pre-alpha version of pgpool-II and may be slightly different from the future official release version


    Sequential Scan case (mid size)

    [Chart: TPS (y-axis, 0 to 1.3) versus concurrent users (x-axis: 1, 2, 4, 8, 16, 32) for PostgreSQL and pgpool-II]

    scale factor = 90 (90M rows)

    SELECT * FROM accounts WHERE aid = :aid

    pgpool-II is 7 times faster at the best (32 concurrent users)


    Index Scan case (mid size)

    [Chart: TPS (y-axis, 0 to 3750) versus concurrent users (x-axis: 1, 2, 4, 8, 16, 32) for PostgreSQL and pgpool-II]

    scale factor = 90 (90M rows)

    SELECT * FROM accounts WHERE aid = :aid

    pgpool-II is 9 times faster at the best (16 concurrent users)


    Index Scan case (large size)

    [Chart: TPS (y-axis, 0 to 1300) versus concurrent users (x-axis: 1, 2, 4, 8, 16, 32) for PostgreSQL and pgpool-II]

    scale factor = 900 (900M rows)

    SELECT * FROM accounts WHERE aid = :aid

    pgpool-II is 18 times faster at the best (32 concurrent users)


    Complex query case

    [Chart: TPS (y-axis, 0 to 45) versus concurrent users (x-axis: 1, 2, 4, 8) for PostgreSQL and pgpool-II]

    scale factor = 900 (900M rows)

    SELECT a1.abalance FROM accounts as a1, accounts as a2 WHERE a1.aid = :aid1 and a2.aid = :aid2

    pgpool-II is faster only at 8 concurrent users


    pgpool-II restrictions

    SQL restrictions

    COPY is not supported

    SELECT INTO and INSERT INTO ... SELECT are not supported

    Extended protocol is not supported

    INSERT requires explicit values if target column is the key for data partitioning

    UPDATE with WHERE clause including function call is not supported

    CREATE TABLE/ALTER TABLE requires pgpool restarting

    Transactions

    If a DB node fails during INSERT/UPDATE/DELETE, data consistency will be broken

    SQL commands via dblink will be executed as separate transactions


    Future plans(TODO)

    More intelligent query rewriting

    Remove SQL restrictions

    Employ two phase commit to keep consistency among DB nodes. However this technique can be used only for queries outside transaction block since PREPARE TRANSACTION closes current transaction block

    More intelligent cache validation
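The two phase commit idea from the plans above can be pictured with PostgreSQL's `PREPARE TRANSACTION`, `COMMIT PREPARED` and `ROLLBACK PREPARED` commands. The coordinator below is a sketch under assumptions, not a pgpool-II design: `FakeNode`, `two_phase_execute` and the transaction identifier are hypothetical, and the nodes only record the SQL they would receive.

```python
class FakeNode:
    """Stand-in for a DB node; records SQL and can be told to fail."""

    def __init__(self, name, fail_on_prepare=False):
        self.name = name
        self.fail_on_prepare = fail_on_prepare
        self.sent = []

    def execute(self, sql):
        if self.fail_on_prepare and sql.startswith("PREPARE TRANSACTION"):
            raise RuntimeError(f"{self.name} failed to prepare")
        self.sent.append(sql)


def two_phase_execute(nodes, query, gid):
    """Run one query on every node; commit only if all nodes prepared."""
    prepared = []
    for node in nodes:
        try:
            # Phase 1: run the query in its own transaction and prepare it.
            node.execute("BEGIN")
            node.execute(query)
            node.execute(f"PREPARE TRANSACTION '{gid}'")
        except RuntimeError:
            node.execute("ROLLBACK")  # abandon work on the failing node
            for done in prepared:     # undo the nodes that did prepare
                done.execute(f"ROLLBACK PREPARED '{gid}'")
            return False
        prepared.append(node)
    # Phase 2: every node prepared, so commit everywhere.
    for node in nodes:
        node.execute(f"COMMIT PREPARED '{gid}'")
    return True


nodes = [FakeNode("node0"), FakeNode("node1")]
committed = two_phase_execute(nodes, "UPDATE accounts SET bid = 1", "tx1")
```

The coordinator issues its own `BEGIN` and `PREPARE TRANSACTION` around the query, which illustrates why the approach only fits queries that are not already inside a client transaction block.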


    References

    pgpool page: http://pgpool.projects.postgresql.org/

    pgpool-II page

    English page: http://pgpool.projects.postgresql.org/pgpool-II/en/

    Japanese page: http://pgpool.projects.postgresql.org/pgpool-II/ja/

    Thank you!

