Sharing Code and Experiences @fabriziomello

GSoC2014 - PGCon2015 Presentation June, 2015

About me

I was born in Dom Pedrito in the South of Brazil

But I have lived in Bagé since I was 1 year old

Bagé is 236 miles away from Porto Alegre

Porto Alegre is the FISL city

In my hometown, Bagé, it is common for people to...

● … be born there

● … grow up there

● … build a family there

● … spend their entire life there

● … and die there.

Until 2005 this lifestyle fit my “old” needs

Then I had an opportunity to work 100% of my time with FOSS

Background

● Bachelor in Information Systems in 2002

● Entrepreneur at http://timbira.com

● Agile Methodologies Specialization student 2014/2015

● PostgreSQL collaborator since 2008 (Brazilian community and now the international one too)

FOSS and me

● My first contact was using Linux in 1997

● I have been in love with this culture ever since

● In 1999 I met PostgreSQL, and I knew right away it would be part of my life

But FOSS bit me for a while :-)

● Because of this decision I had a lot of trouble, including financial trouble…

● But here I am :-)

Google Summer of Code (GSoC) is a global program that offers students stipends to write code for open source projects.

We have worked with the open source community to identify and fund exciting projects for the upcoming summer.

Connect students to open source communities

GSoC and PostgreSQL

● Since 2006

● Cool projects

○ Fast GiST index build

○ New phpPgAdmin Plugin Architecture (Brazilian)

○ pgAdmin database designer

○ Better indexing for ranges

○ Document collection Foreign-data Wrapper

And now my project ...

PostgreSQL 9.1 introduced a new kind of table: Unlogged Tables

What does “Unlogged” mean?

First we need to know what “WAL” means

PostgreSQL is fully ACID-compliant and, to guarantee data integrity, uses a standard method called WAL (Write-Ahead Logging)

WAL (Write-Ahead Logging)

“In computer science, write-ahead logging (WAL) is a family of techniques for providing atomicity and durability (two of the ACID properties) in database systems.

In a system using WAL, all modifications are written to a log before they are applied. Usually both redo and undo information is stored in the log.”

http://en.wikipedia.org/wiki/Write-ahead_logging
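
To make the quoted definition concrete, here is a minimal sketch in Python of the write-ahead idea: each change is appended to a log before it is applied, so the state can be rebuilt by replaying the log. This is a toy, redo-only model; the names `ToyStore`, `put`, `crash` and `recover` are made up for illustration and are not PostgreSQL's real WAL format or API.

```python
class ToyStore:
    """Toy key-value store that logs every change before applying it."""

    def __init__(self):
        self.log = []    # stands in for the durable log on disk
        self.data = {}   # stands in for in-memory pages that a crash can lose

    def put(self, key, value):
        self.log.append((key, value))  # 1. write the log record first
        self.data[key] = value         # 2. only then apply the change

    def crash(self):
        self.data = {}  # volatile state is gone, the log survives

    def recover(self):
        for key, value in self.log:  # redo: replay the log in order
            self.data[key] = value


store = ToyStore()
store.put("a", 1)
store.put("b", 2)
store.put("a", 3)
store.crash()
store.recover()
print(store.data)
```

After `recover()` the store holds the last value written for each key again, even though the in-memory state was wiped by the simulated crash.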

OK, and what does “Unlogged” mean?

● Unlogged means that the data written in these tables is not written to WAL.

● So writing to them is really, really fast compared to writing to regular tables.

So I’ll use it for all of my tables...

● However, you won’t want to do that, because

○ They are not crash-safe (an unlogged table is automatically truncated after a crash or unclean shutdown)

○ And they are not replicated using streaming replication (SR)
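
The two caveats above follow directly from skipping WAL, and a small Python sketch can show why. This is a toy model under stated assumptions: `ToyTable` and its methods are hypothetical names, and the standby is assumed to see only what travels through WAL. It is not how PostgreSQL is implemented.

```python
class ToyTable:
    """Toy table: rows live in 'data files'; logged tables also write WAL."""

    def __init__(self, logged):
        self.logged = logged
        self.rows = []  # contents of the data files
        self.wal = []   # WAL records produced by this table

    def insert(self, row):
        if self.logged:
            self.wal.append(row)  # unlogged tables skip this step entirely
        self.rows.append(row)

    def rows_after_crash(self):
        # Logged: rebuilt from WAL. Unlogged: truncated, nothing to replay.
        return list(self.wal) if self.logged else []

    def rows_on_standby(self):
        # Assumption of this toy model: a standby only sees WAL contents.
        return list(self.wal)


regular = ToyTable(logged=True)
cache = ToyTable(logged=False)
for table in (regular, cache):
    table.insert("row 1")
    table.insert("row 2")

print(regular.rows_after_crash(), regular.rows_on_standby())
print(cache.rows_after_crash(), cache.rows_on_standby())
```

The unlogged table does less work per insert, which is where the speed comes from, but it has nothing to recover from and nothing to ship to a standby.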

But there are some cool use cases

● Speed up ETL jobs

● Cache

● Session State

● Queues?!

● ...

And now we have the power to ...

● change from UNLOGGED to LOGGED

○ ALTER TABLE name SET LOGGED;

● change from LOGGED to UNLOGGED

○ ALTER TABLE name SET UNLOGGED;

Already committed

commit: f41872d0c1239d36ab03393c39ec0b70e9ee2a3c
author: Alvaro Herrera <[email protected]>
date: Fri, 22 Aug 2014 14:27:00 -0400

Implement ALTER TABLE .. SET LOGGED / UNLOGGED

This enables changing permanent (logged) tables to unlogged and vice-versa.

(Docs for ALTER TABLE / SET TABLESPACE got shuffled in an order that hopefully makes more sense than the original.)

Author: Fabrízio de Royes Mello
Reviewed by: Christoph Berg, Andres Freund, Thom Brown
Some tweaking by Álvaro Herrera

Current implementation

1. Acquire AccessExclusiveLock

2. Check dependencies

a. Cannot change temp tables

b. Check Foreign Keys

3. Create new heap/toast with new relpersistence

4. Rewrite heap/toast

5. Rewrite indexes

Current Caveats

● AccessExclusiveLock

● Rewrite all datafiles

PostgreSQL GSoC2015: Improve the performance of the ALTER TABLE .. SET LOGGED / UNLOGGED statement

Improvements

● Don’t rewrite datafiles when wal_level = minimal

● Create WAL records for datafiles when wal_level != minimal

Main Problems

● To change an unlogged table to logged (or vice-versa) we need to

○ drop/create the InitFork

○ and that is not a transactional operation

GSoC2015 : Main Problems

● ALTER TABLE ... SET LOGGED

○ if we drop the InitFork

○ and a crash occurs, we're in an inconsistent state

○ in the catalog the relation is marked as unlogged and we don't have the InitFork anymore

GSoC2015 : Main Problems

● ALTER TABLE ... SET UNLOGGED

○ if we create the InitFork and the transaction rolls back or a crash occurs, we're in an inconsistent state

○ in the catalog the relation is still marked as logged, but crash recovery truncates the relation if the InitFork exists
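
The two failure modes on these slides can be sketched as a tiny Python model of crash recovery. It only encodes the rule stated above (recovery truncates a relation when its InitFork exists); the function name `recover`, its arguments and the problem messages are hypothetical, and real PostgreSQL recovery is far more involved.

```python
def recover(catalog_persistence, init_fork_exists, rows):
    """Toy crash recovery: return (rows_after_recovery, problem_or_None)."""
    if init_fork_exists:
        # Rule from the slides: a relation with an InitFork is truncated.
        if catalog_persistence == "logged":
            return [], "logged table was truncated by recovery"
        return [], None  # normal behaviour for an unlogged table
    if catalog_persistence == "unlogged":
        # Assumption: with no InitFork, recovery leaves the contents alone.
        return rows, "unlogged table has no InitFork, contents not reset"
    return rows, None  # normal behaviour for a logged table


# SET LOGGED: InitFork already dropped, crash before the catalog change.
print(recover("unlogged", False, ["r1", "r2"]))

# SET UNLOGGED: InitFork already created, then rollback and a later crash.
print(recover("logged", True, ["r1", "r2"]))
```

Both calls report a problem because the catalog and the InitFork disagree, which is exactly the window opened by the non-transactional drop/create step.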

Idea if “wal_level = minimal”

● Acquire AccessExclusiveLock

● Check dependencies

○ Cannot change temp tables

○ Check Foreign Keys

● Create a “TransientInitFork” for crash recovery detection

● FlushRelationBuffers and fsync relation

● Drop/Create the “InitFork”

● Change Catalog

● Drop the “TransientInitFork”

Idea if “wal_level != minimal”

● Same as for “wal_level = minimal”

● Plus

○ xlog all pages of datafiles (heap, toast, index)

Questions?

Special thanks to

● GSoC2014

○ Stephen Frost (mentor)

○ Josh Berkus and Thom Brown (organizers)

○ Christoph Berg (patch review)

○ Álvaro Herrera (patch review and commit)

● GSoC2015

○ Josh Berkus and Thom Brown (organizers)

○ Ashutosh Baspat (mentor)

PGBR2015 call for papers is open!!

● When?

○ November 18, 19 and 20

● Where?

○ Porto Alegre, Brazil

● http://pgbr.postgresql.org.br