pgpool feat and devel

    Agenda

    Developers

    History

    Existing pgpool project

    Ongoing pgpool-II project

    Demonstration

    2006/07/09 Toronto Copyright (c) 2006 pgpool DG

    Who are we?

    pgpool Global Development Group

    Tatsuo Ishii (SRA OSS, Inc. Japan)

    Devrim Gunduz (Command Prompt, Inc.)

    Yoshiyuki Asaba (SRA OSS, Inc. Japan)

    Taiki Yamaguchi (SRA OSS, Inc. Japan)

    pgpool-II development team

    In addition to Tatsuo, Yoshiyuki and Taiki:

    Tomoaki Sato, Yoshiharu Mori, Kaori Inaba (all from SRA OSS, Inc. Japan)

    Developers!

    Yoshiyuki: Project leader, pgpool developer

    Yoshiharu: Query rewriting, Parallel execution engine

    Kaori: Project manager, System DB

    Tatsuo: Enhance pgpool-I, pgpool developer

    Taiki: PCP, Query Cache, pgpool developer

    Tomoaki: Communication manager

    Our Boss: Green Turtle

    pgpool-I: the history

    V0.1: Started as a personal project (2003/6/27)

    V1.0: Synchronous replication (2004/3)

    V2.0: Load balance (2004/6)

    V3.0: pgpool Global Development Group (2006/2)

    Why pgpool?

    No general purpose connection pooling software was available

    Java has its own, but PHP does not...

    No small to mid scale/lightweight synchronous replication tool was available

    pgpool process architecture

    [Diagram labels: pgpool.conf, pgpool parent process, pre-fork, pgpool child process, PostgreSQL backend process]

    pgpool functionality: connection pooling

    Reduce connection overhead

    Limit maximum number of connections to the PostgreSQL backend

    New incoming connections are queued in the kernel if all pgpool processes are busy

    How does connection pooling work? (1)

    [Diagram labels: pgpool, PostgreSQL, connections for user1/db1]

    How does connection pooling work? (2)

    [Diagram labels: pgpool, PostgreSQL, connections for user1/db1, user2/db2 and user3/db3]
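The pooling idea on the two slides above can be sketched roughly as follows. This is a minimal illustration, not pgpool's implementation: the `ConnectionPool` class, its `max_pool` limit and the `open_backend` callback are hypothetical names. It assumes a pooled backend connection is reused only when both the user and the database match, and that the least recently used entry is dropped when the pool is full.

```python
from collections import OrderedDict


class ConnectionPool:
    """Toy pool of backend connections keyed by (user, database)."""

    def __init__(self, open_backend, max_pool=2):
        self._open_backend = open_backend  # callable(user, database) -> connection
        self._max_pool = max_pool          # hypothetical limit on cached connections
        self._pool = OrderedDict()         # (user, database) -> connection
        self.opened = 0                    # number of real backend connections made

    def acquire(self, user, database):
        key = (user, database)
        if key in self._pool:
            # Same user/database pair: reuse the cached backend connection.
            self._pool.move_to_end(key)
            return self._pool[key]
        if len(self._pool) >= self._max_pool:
            # Pool is full: discard the least recently used connection.
            self._pool.popitem(last=False)
        conn = self._open_backend(user, database)
        self.opened += 1
        self._pool[key] = conn
        return conn


def fake_backend(user, database):
    # Stand-in for opening a real PostgreSQL connection.
    return f"conn:{user}/{database}"


pool = ConnectionPool(fake_backend, max_pool=2)
first = pool.acquire("user1", "db1")
second = pool.acquire("user1", "db1")   # reused, no new backend connection
pool.acquire("user2", "db2")
pool.acquire("user3", "db3")            # pool is full, user1/db1 is dropped
```

Keying on the user/database pair is what makes the saving possible: a second client asking for the same pair skips the backend connection setup entirely.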

    pgpool understands frontend/backend protocol

    pgpool is transparent to both applications and PostgreSQL

    Virtually no modifications to applications are needed

    No special APIs are needed

    Can be used with existing language APIs

    Can be more efficient than libpq because of smaller controlling granularity
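To make "understands frontend/backend protocol" concrete, here is a small sketch that builds and parses a PostgreSQL protocol 3.0 StartupMessage, the first packet a client sends. The message layout (a length field, the version number 196608, then null-terminated name/value pairs) follows the PostgreSQL protocol documentation; the function names and the `pool_key` variable are illustrative and are not taken from pgpool's source.

```python
import struct

PROTOCOL_V3 = 196608  # (3 << 16): major version 3, minor version 0


def build_startup_message(params):
    """Encode a protocol 3.0 StartupMessage from a dict of parameters."""
    body = struct.pack("!I", PROTOCOL_V3)
    for name, value in params.items():
        body += name.encode() + b"\x00" + value.encode() + b"\x00"
    body += b"\x00"  # terminator after the last name/value pair
    # The length field counts itself (4 bytes) plus the body.
    return struct.pack("!I", len(body) + 4) + body


def parse_startup_message(packet):
    """Return the parameters carried by a StartupMessage as a dict."""
    length, version = struct.unpack("!II", packet[:8])
    if length != len(packet):
        raise ValueError("length field does not match packet size")
    if version != PROTOCOL_V3:
        raise ValueError("not a protocol 3.0 startup message")
    raw = packet[8:-1]  # drop the final terminator byte
    fields = raw.split(b"\x00")[:-1] if raw else []
    names, values = fields[0::2], fields[1::2]
    return {n.decode(): v.decode() for n, v in zip(names, values)}


packet = build_startup_message({"user": "user1", "database": "db1"})
params = parse_startup_message(packet)
pool_key = (params["user"], params["database"])
```

A proxy that can read this packet learns the user and database without any help from the application, which is why no special client API is required.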

    Limitations of current implementation

    Resetting a pooled connection may not be perfect (need modification to applications)

    temp tables

    need help from PostgreSQL

    SSL is not supported

    fall back to non SSL mode

    No pg_hba.conf like IP based authentication

    pgpool functionality: replication

    Queries are duplicated and sent to PostgreSQL servers

    [Diagram: pgpool sends the same query to two PostgreSQL servers]

    Deadlock problem

    [Diagram: timeline (t) of session 1 and session 2 on the master and on the secondary; each session issues lock requests on both servers and both sessions end up in wait]

    Strict mode in replication to avoid deadlock problem

    [Diagram steps: pgpool sends the query packet to one PostgreSQL server; waits until something returns; sends the query to the other PostgreSQL server; waits until something returns; replies back the result]
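The ordering that strict mode enforces can be sketched as below. This is only an illustration of the sequence on the slide: `FakeBackend` and `replicate_strict` are hypothetical names, and returning the master's result to the client is an assumption. The point is that the secondary never sees a query before the master has answered it, so both servers process competing queries in the same order.

```python
class FakeBackend:
    """Stand-in for a PostgreSQL server that records what it receives."""

    def __init__(self, name, log):
        self.name = name
        self._log = log

    def execute(self, query):
        # A real call would block here until the server replies.
        self._log.append((self.name, "query", query))
        self._log.append((self.name, "reply"))
        return f"OK from {self.name}"


def replicate_strict(master, secondary, query):
    """Send the query to the master, wait for its reply, then to the secondary."""
    master_result = master.execute(query)   # wait until something returns
    secondary.execute(query)                # only now forward to the secondary
    return master_result                    # assumption: the master's result is returned


log = []
master = FakeBackend("master", log)
secondary = FakeBackend("secondary", log)
result = replicate_strict(master, secondary, "LOCK TABLE accounts")
```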

    pgpool functionality: load balance

    SELECT queries are sent to a randomly chosen backend

    The ratio for load balancing can be changed

    [Diagram labels: pgpool, two PostgreSQL servers]
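A weighted random choice is one way to picture "randomly chosen backend" with an adjustable ratio. The sketch below is an assumption about the idea, not pgpool's code: `choose_backend` and `is_select` are hypothetical names, and the prefix check is deliberately naive (it would, for example, load balance a SELECT that calls a function with side effects).

```python
import random


def is_select(query):
    """Naive check: treat anything starting with SELECT as read-only."""
    return query.lstrip().upper().startswith("SELECT")


def choose_backend(query, weights, rng=random):
    """Return the index of the backend for a SELECT, or None for other queries.

    `weights` holds one non-negative number per backend; a backend with twice
    the weight receives roughly twice as many SELECT queries.
    """
    if not is_select(query):
        return None  # caller should replicate the query to every backend
    nodes = range(len(weights))
    return rng.choices(nodes, weights=weights, k=1)[0]


rng = random.Random(0)
picks = [choose_backend("SELECT * FROM accounts", [3, 1], rng) for _ in range(1000)]
share_of_node0 = picks.count(0) / len(picks)  # close to 0.75 for weights 3:1
```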

    Limitations of current implementation in replication

    CURRENT_TIMESTAMP and other functions returning server-dependent values cannot be replicated

    MD5 and crypt authentication are not supported

    Sequences and SERIAL need table locking if there is more than one connection

    Functions having side effects cannot be load balanced

    pgpool-II project

    Project granted by the Information-Technology Promotion Agency, Japan (IPA: http://www.ipa.go.jp)

    Started in February 2006, expected to release the first version in September 2006 under BSD license

    Features including parallel query and enhancement to pgpool

    Successor to pgpool

    Goal of pgpool-II project

    Implement parallel query processing

    Enhance pgpool

    Allow more than 2 DB nodes

    More precise control using shared memory

    Easy to manage

    Control port/protocol

    Detailed statistics on node status

    GUI management tool

    pgpool-II architecture overview

    [Diagram components: Client, Communication Manager, SQL Parser, Query Rewriting, Parallel execution engine, Replication/load balance engine, Query Cache, shared memory manager, pgpool Catalog, pgpool-II System DB, PCP lib, PCP command, pgpoolAdmin, PostgreSQL DB nodes]

    pgpool-I and pgpool-II mode

    pgpool-I mode: every DB node holds the same rows (1000, 2000, 3000, 4000)

    replication, load balance, fail over; virtually compatible with pgpool

    pgpool-II mode: rows are split across DB nodes (1000, 2000 on one node; 3000, 4000 on the other)

    parallel query

    Table partitioning control

    CREATE TABLE dist_def (
        dbname TEXT,         -- database name
        schema_name TEXT,    -- schema name
        table_name TEXT,     -- table name
        col_name TEXT,       -- key col name
        col_list TEXT[],     -- col names
        type_list TEXT[],    -- col types
        dist_def_func TEXT,  -- function name
        PRIMARY KEY (dbname, schema_name, table_name)
    );

    INSERT INTO dist_def VALUES (
        'y-mori', 'public', 'accounts', 'aid',
        ARRAY['aid', 'bid', 'abalance', 'filler'],
        ARRAY['integer', 'integer', 'integer', 'character(84)'],
        'dist_def_accounts'
    );

    CREATE OR REPLACE FUNCTION dist_def_accounts (val ANYELEMENT)
    RETURNS INTEGER AS '
        SELECT CASE WHEN $1 >= 0 and $1 < 100001 THEN 0
                    WHEN $1 >= 100001 and $1 < 200001 THEN 1
                    WHEN $1 >= 200001 and $1 < 300000 THEN 2
        END' LANGUAGE SQL;
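For readers who prefer to trace the partitioning rule outside the database, the same mapping can be mirrored in a few lines of Python. This is only a restatement of the SQL function above for illustration; the `RANGES` table is a hypothetical helper, and its boundaries are copied verbatim from the slide, including the upper bound of 300000 on the last range.

```python
# (lower bound inclusive, upper bound exclusive, DB node), as in the SQL CASE.
RANGES = [
    (0, 100001, 0),
    (100001, 200001, 1),
    (200001, 300000, 2),
]


def dist_def_accounts(aid):
    """Return the DB node for a key value, or None if no range matches."""
    for low, high, node in RANGES:
        if low <= aid < high:
            return node
    # A SQL CASE without ELSE yields NULL when no branch matches.
    return None


nodes = [dist_def_accounts(aid) for aid in (0, 100000, 100001, 200001, 299999)]
unmatched = dist_def_accounts(300000)  # outside every range as written
```

Note that with the boundaries as written, a key of exactly 300000 falls outside every range and the function returns no node.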

    Simple parallel query

    SELECT * FROM accounts WHERE aid = 1000;

    [Diagram: pgpool receives the query and sends the same query, SELECT * FROM accounts WHERE aid = 1000;, to each PostgreSQL DB node]
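The simple case amounts to a scatter-gather: run the same query on every DB node and concatenate the partial results. The sketch below illustrates that pattern with in-memory lists standing in for the DB nodes; `NODES`, `run_on_node` and `parallel_select` are hypothetical names, and the use of a thread pool is an assumption made for the example rather than a description of pgpool-II's parallel execution engine.

```python
from concurrent.futures import ThreadPoolExecutor

# Each inner list plays the role of the accounts rows held by one DB node.
NODES = [
    [{"aid": 1, "abalance": 10}, {"aid": 999, "abalance": 20}],
    [{"aid": 1000, "abalance": 30}, {"aid": 1500, "abalance": 40}],
]


def run_on_node(rows, predicate):
    """Stand-in for executing the same SELECT on one DB node."""
    return [row for row in rows if predicate(row)]


def parallel_select(nodes, predicate):
    """Send the same query to every node at once and merge the results."""
    with ThreadPoolExecutor(max_workers=len(nodes)) as executor:
        partials = executor.map(lambda rows: run_on_node(rows, predicate), nodes)
        # map() yields results in node order; flatten them into one list.
        return [row for partial in partials for row in partial]


result = parallel_select(NODES, lambda row: row["aid"] == 1000)
```

Because each row lives on exactly one node, the merged result contains the matching row once even though every node was asked.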

    Complex query example

    SELECT * FROM accounts
    WHERE aid = 1000
    ORDER BY aid;

    [Diagram labels: pgpool, System DB (PostgreSQL), PostgreSQL DB nodes]

    Query sent to the System DB:

    SELECT * FROM dblink('con',
        'SELECT pool_parallel(''SELECT * FROM accounts WHERE aid = 1000'')')
        AS foo(...)
    ORDER BY aid;

    Query passed back to pgpool:

    SELECT pool_parallel('SELECT * FROM accounts WHERE aid = 1000')

    Query sent to the PostgreSQL DB nodes:

    SELECT * FROM accounts WHERE aid = 1000;

    DML

    INSERT recognizes the partition key value in a query and INSERTs into the appropriate DB node

    UPDATE/DELETE simply issue the same query to all DB nodes

    INSERT

    INSERT INTO accounts(aid, bid) VALUES(500, 100);

    call to System DB: SELECT dist_def_accounts(500);

    return: DB node = 1

    DB node 0: aid = 0-499

    DB node 1: aid = 500-999

    DB node 2: aid = 1000-1499

    [Diagram: the INSERT is then sent to DB node 1]
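The DML routing rule from the last two slides can be summarised in a short sketch: an INSERT goes to the single node chosen by the partition key, while UPDATE and DELETE go to every node. The names `partition_node` and `target_nodes` are hypothetical, the ranges are the ones shown on this slide (500 keys per node, three nodes), and the key is passed in by the caller because extracting it from the SQL text is outside the scope of the example.

```python
NODE_COUNT = 3
ROWS_PER_NODE = 500  # slide ranges: 0-499, 500-999, 1000-1499


def partition_node(aid):
    """Map a partition key to its DB node using the ranges on the slide."""
    if not 0 <= aid < NODE_COUNT * ROWS_PER_NODE:
        raise ValueError(f"aid {aid} is outside every partition range")
    return aid // ROWS_PER_NODE


def target_nodes(statement, key=None):
    """Return the list of DB nodes that should receive the statement."""
    kind = statement.lstrip().split(None, 1)[0].upper()
    if kind == "INSERT":
        if key is None:
            raise ValueError("INSERT needs a partition key value")
        return [partition_node(key)]       # exactly one node
    if kind in ("UPDATE", "DELETE"):
        return list(range(NODE_COUNT))     # same query to all DB nodes
    raise ValueError(f"unsupported statement type: {kind}")


insert_targets = target_nodes(
    "INSERT INTO accounts(aid, bid) VALUES(500, 100);", key=500
)
update_targets = target_nodes("UPDATE accounts SET bid = 1 WHERE aid = 500;")
```

Broadcasting UPDATE and DELETE is simple but means every node does work even when only one of them holds the affected rows.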
