DBIx-DataModel v2.0 in detail

Preview:

DESCRIPTION

DBIx-DataModel is an object-relational mapping framework for Perl5. Schema declarations are inspired from UML modelling. The API provides efficient interaction with the DBI layer, detailed control on statement execution steps, flexible and powerful treatment of database joins. More on http://search.cpan.org/dist/DBIx-DataModel. Talk presented at YAPC::EU::2011 Riga (updated from a previous version presented at FPW2010).

Citation preview

11.04.23 - Page 1

DépartementOffice

DBIx::DataModel2.0

in detail

YAPC::EU::2011, Riga

laurent.dami@justice.ge.chDépartement

Office

Agenda

• Introduction : Object-Relational Mappings & UML• DBIx::DataModel 2.0 Architecture• Modelling the schema• Selecting data• Row objects and statement objects• Joins• Inserts and updates• Customization• Strengths and limitations of DBIx::DataModel

today on CPAN :v 1.99_05

11.04.23 - Page 1

DépartementOffice

Object-Relational Mappers

Perl and databases

Database

DBD driver

DBI

Object-Relational Mapper

Perl program

ORM principle

r1r2...

c1 c2 c3

...

c3 c4

+c1: String+c2: String+c3: class2

r1 : class1

RDBMS

r2 : class1

RAM

table1

table2

Impedance mismatch

• SELECT c1, c2 FROM table1 missing c3, so cannot navigate to class2 is it a valid instance of class1 ?

• SELECT * FROM table1 LEFT JOIN table2 ON … is it a valid instance of class1 ? what to do with the c4 column ?

• SELECT c1, c2, length(c2) AS l_c2 FROM table1 no predeclared method in class1 for accessing l_c2

c1 c2 c3 c3 c4+c1: String+c2: String+c3: class2

r1 : class1 RDBMSRAMtable1 table2

11.04.23 - Page 1

DépartementOffice

The Unified Modeling Language (UML)

Example : CPAN model

Author

Distribution Module

1

*

1 *

*

multiplicity

role

class

association

dependent_distribs

*prereq_modules

contains ►

assoc. name

composition

depends_on ►

11.04.23 - Page 1

DépartementOffice

Modelling a schema

Architecture

Schema

Source

Table Join Statement

My::DB My::DB::Table_n

My::DB::AutoJoin::

row statementrow

DBIx::DataModelclasses

applicationclasses

objects

schema

quite similar to DBI architecture (dbh, sth)

All definitions in one single file

use DBIx::DataModel;

DBIx::DataModel->Schema("My::DB")

->Table(qw/Author author author_id /)->Table(qw/Distribution distribution distrib_id/)->Table(qw/Module module module_id /)

->Association([qw/Author author 1 /], [qw/Distribution distribs 0..* /])->Composition([qw/Distribution distrib 1 /], [qw/Module modules 1..* /]);

creates package My::DB

creates package My::DB::Author

adds methods intoboth packages

Multiplicities

"$min..$max"

"*" means "0..POSIX::INT_MAX""1" means "1..1"

$min == 0 ? joins are "LEFT OUTER JOIN" : joins are "INNER JOIN"

$max > 1 ? default result is a list : default result is a single object

Meta-Architecture

Schema Table Join

My::DB

My::DB::Table_n

My::DB::Auto_join

Meta::Source

Meta::Table

Meta::Join

meta::table

meta::join

meta::schema

Meta::Schema Meta::PathMeta::Association Meta::Type

meta::assoc meta::path

meta::type

11.04.23 - Page 1

DépartementOffice

Data retrieval

Fetching one single record

# fetch from primary key# by default, retrieves all columns ('*') my $author = My::DB->table('Author')->fetch('DAMI');

# reach columns through the hashref APIwhile (my ($k, $v) = each %$author) {

print "$k : $v\n";}

Multi-schema mode

# create a schemamy $dbh = DBI->connect(… ):my $schema = My::DB->new(dbh => $dbh);

# fetch datamy $author = $schema->table('Author')->fetch('DAMI');

Fetching a list of records

# select multiple records my $recent_distribs = My::DB->table('Distribution') ->select( -columns => [qw/distrib_name d_release/], -where => {d_release => {'>' => $some_date}}, -order_by => [qw/-d_release +distrib_name/],

);

foreach my $distrib (@$recent_distribs) { ...}

Select API : overview

my $result = $source->select( -columns => \@columns,, -where => \%where, -group_by => \@groupings, -having => \%criteria, -order_by => \@order, -for => 'read only', -post_SQL => sub { … }, -pre_exec => sub { … }, -post_exec => sub { … }, -post_bless => sub { … }, -page_size => …, -page_index => …, -limit => …, -offset => …, -column_types => \%types, -result_as => 'rows' || 'sth' || 'sql' || 'statement' || 'hashref');

11.04.23 - Page 1

DépartementOffice

Arguments to select()

SQL::Abstract::More : named parameters

my $result = $source->select( -columns => \@columns,, -where => \%where, -order_by => \@order,);

SQL::Abstract->new ->select($table, \@columns, \%where, \@order)

SQL::Abstract::More : extensions

• -columns => [qw/col1|alias1 max(col2)|alias2/]– SELECT col1 AS alias1, max(col2) AS alias2

• -columns => [-DISTINCT => qw/col1 col2 col3/]– SELECT DISTINCT col1, col2, col3

• -order_by => [qw/col1 +col2 –col3/]– SELECT … ORDER BY col1, col2 ASC, col3 DESC

• -for => "update" || "read only"– SELECT … FOR UPDATE

Grouping

• -group_by => [qw/col1 col2 …/]• -having => { col1 => {"<" => val1} , col2 => ... }

– SELECT … GROUP BY col1, col2 HAVING col1 < ? AND col2 …

separate call to SQL::Abstract and re-injection into the SQL

Paging

• -page_size => $num_rows, -page_index => $page_index

# or

• -limit => $num_rows, -offset => $row_index

either new call to $sth->execute(), or use scrollable cursors (DBIx.:DataModel::Statement::JDBC)

starts at 1

starts at 0

Callbacks

-post_SQL => sub { … }, -pre_exec => sub { … }, -post_exec => sub { … }, -post_bless => sub { … },

• hooks to various states within the statement lifecycle (see later)

• sometimes useful for DB-specific features

Polymorphic result

-result_as =>– 'rows' (default) : arrayref of row objects– 'firstrow' : a single row object (or undef)– 'hashref' : hashref keyed by primary keys– [hashref => @cols] : cascaded hashref– 'flat_arrayref' : flattened values from each row– 'statement' : a statement object (iterator)– 'fast_statement' : statement reusing same memory– 'sth' : DBI statement handle– 'sql' : ($sql, @bind_values)– 'subquery' : \["($sql)", @bind]

don't need method variants : select_hashref(), select_arrayref(), etc.

11.04.23 - Page 1

DépartementOffice

Row objects

A row object …

• is just a hashref– keys are column names– values are column values– nothing else

• actually, when in multi-schema mode, there is an additional __schema field

• is blessed into the table class– has a metadm method (accessor to the metaclass)– has a schema method (accessor to the schema)– has methods for navigating to related tables

• can be dumped as is– to Dumper / YAML / JSON / XML– to Perl debugger

Columns …

• basically, are plain scalars, not objects• but can be "inflated/deflated" through a Type()• programmer chooses the column list, at each

select()-columns => \@columns # arrayref -columns => "col1, col2" # string-columns => "*" # default

• objects have variable size !– if missing keys : runtime error

• when following joins• when updating and deleting

Navigation to associated tables

• Method names come from association declarations• Exactly like a select()

– automatically chooses –result_as => 'rows' || 'firstrow'from multiplicity information

# ->Association([qw/Author author 1 /],# [qw/Distribution distribs 0..* /])

my $author = $distrib->author();my $other_distribs = $author->distribs( -columns => [qw/. . ./], -where => { . . . }, -order_by => [qw/. . ./],);

11.04.23 - Page 1

DépartementOffice

Statement objects

Statement: an encapsulated query

statement

meta::source My::Source

1

1

**schemadbh

0..1 *

rownext() / all()

My::Schema meta::schema

in single-schema mode

in multi-schema mode

singleton()

Statement lifecycle

new

sqlized

prepared

executed

schema + source

data row(s)

new()

sqlize()

prepare()

execute()

bind()refine()

bind()

bind()

bind()execute()

next() / all()

blessedcolumn types applied

-post_bless

-pre_exec

-post_exec

-post_SQL

When to explicitly use a statement

• as iteratormy $statement = $source->select(..., -result_as => 'statement');while (my $row = $statement->next) { . . .}

• for paging$statement->goto_page(123);

• for loop efficiencymy $statement = My::Table->join(qw/role1 role2/); $statement->prepare(-columns => ..., -where => ...); my $list = My::Table->select(...); foreach my $obj (@$list) { my $related_rows = $statement->execute($obj)->all; ... }

Fast statement

• like a regular statement– but reuses the same memory location for each row– see DBI::bind_col()

my $statement = $source->select( . . . , -result_as => 'fast_statement');

while (my $row = $statement->next) { . . . # DO THIS : print $row->{col1}, $row->{col2} # BUT DON'T DO THIS : push @results, $row;}

11.04.23 - Page 1

DépartementOffice

Database joins

Basic join

$rows = My::DB->join(qw/Author distribs modules/) ->select(-where => ...);

Author Distrib Module

My::DB::AutoJoin::…

DBIDM::Source::Join

new class created on the fly

Left / inner joins

->Association([qw/Author author 1 /], [qw/Distribution distribs 0..* /])

# default : LEFT OUTER JOIN

->Composition([qw/Distribution distrib 1 /], [qw/Module modules 1..* /]);

# default : INNER JOIN

# but defaults can be overriddenMy::DB->join([qw/Author <=> distribs/)-> . . . My::DB->join([qw/Distribution => modules /)-> . . .

Join from an instance

$rows = $author->join(qw/distribs modules/)->select( -columns => [qw/distrib_name module_name/], -where => {d_release => {'<' => $date} },);

SELECT distrib_name, module_nameFROM distribution INNER JOIN module ON distribution.distrib_id = module.distrib_idWHERE distrib.author_id = $author->{author_id} AND d_release < $date

11.04.23 - Page 1

DépartementOffice

Insert / Update

Insert

@ids = MyDB::Author->insert({ firstname => 'Larry', lastname => 'Wall' },

{ firstname => 'Damian', lastname => 'Conway' },);

INSERT INTO author(firstname, lastname)VALUES (?, ?)

Bulk insert

@ids = MyDB::Author->insert([qw/firstname lastname/],[qw/Larry Wall /],

[qw/Damian Conway /],);

Insert into / cascaded insert

@id_trees = $author->insert_into_distribs( {distrib_name => 'DBIx-DataModel', modules => [ {module_name => 'DBIx::DataModel', ..}, {module_name => 'DBIx::DataModel::Statement', ..}, ]}, {distrib_name => 'Pod-POM-Web', … }, -returning => {},);

Update

$obj->{col1} = $new_val_1;$obj->{col2} = $new_val_2;. . .$opj->update;# orMyDB::Author->update({author_id => $id, col =>

$new_val})# orMyDB::Author->update($id, {col => $new_val})

# or (bulk update)MyDB::Author->update(-set => {col => $new_val},

-where => \%condition)

Transaction

MyDB->do_transaction(sub { my $author = MyDB::Author->fetch($author_id, {-for => "read only"} );

my $distribs = $author->distribs(-for => 'update');foreach my $distrib (@$distribs) {my $id = $distrib->{distrib_id};MyDB::Distrib->update($id, {col => $val});

}});

• can be nested• can involve several dbh• no savepoints (yet)

11.04.23 - Page 1

DépartementOffice

Other features

Named placeholders / bind()

# introduce named placeholders$statement->prepare(-where => { col1 => '?:foo', col2 => {"<" => '?:bar'}, col3 => {">" => '?:bar'}, col3 => 1234, });

# fill placeholders with values$statement->bind(foo => 99, bar => 88, other => 77);$statement->bind($hashref);

$sql @bind

SELECT * FROM .. WHEREcol1 = ? AND col2 < ? AND col3 = ?

-- -- 1234

?:foo ?:bar

Types (inflate/deflate)

# declare a TypeMy::DB->Type(Multivalue => from_DB => sub {$_[0] = [split /;/, $_[0]] }, to_DB => sub {$_[0] = join ";", @$_[0] },);

# apply it to some columns in a tableMy::DB::Author->metadm->define_column_type( Multivalue => qw/hobbies languages/,);

Auto_expand

# declare auto-expansionsMyDB::Author->define_auto_expand(qw/distributions/);MyDB::Distribution->define_auto_expand(qw/modules/);

# apply to an object (automatically fetches all modules of all distributions of that author)

$author->auto_expand();

# use the data treeuse YAML; print Dump($author);

Schema localization

{# a kind of "local MyDB";

my $guard = MyDB->localize_state();

# temporary change class dataMyDB->dbh($new_dbh, %new_options);do_some_work_with_new_dbh();

} # automatically restore previous state

Schema generator

perl -MDBIx::DataModel::Schema::Generator \ -e "fromDBI('dbi:connection:string')" -- \ -schema My::New::Schema > My/New/Schema.pm

perl -MDBIx::DataModel::Schema::Generator \ -e "fromDBIxClass('Some::DBIC::Schema')" -- \ -schema My::New::Schema > My/New/Schema.pm

Auto_insert / Auto_update / No_update

$table->auto_insert_columns(created_by => sub {$ENV{REMOTE_USER} . ", " . localtime });

$table->auto_update_columns( modified_by => sub {…} );

$table->no_update_columns(qw/row_id/);

can also be declared for the whole schema

Extending / customizing DBIx::DataModel

• Schema hooks for– SQL dialects (join syntax, alias syntax, limit / offset, etc.)– last_insert_id

• Ad hoc subclasses for– SQL::Abstract– Table– Join– Statements

• Statement callbacks• Extending table classes

– additional methods– redefining _singleInsert method

11.04.23 - Page 1

DépartementOffice

Conclusion

Strengths

• centralized definitions of tables & associations• efficiency• improved API for SQL::Abstract • clear conceptual distinction between

– data sources (tables and joinss),– database statements (stateful objects representing SQL

queries)– data rows (lightweight blessed hashrefs)

• concise and flexible syntax for joins• used in production for mission-critical app

– (running Geneva courts)

Limitations

• tiny community• no schema versioning• no object caching nor 'dirty columns' • no 'cascaded update' nor 'insert or create'

Lots of documentation

• SYNOPSIS AND DESCRIPTION• DESIGN• QUICKSTART• REFERENCE• COOKBOOK• MISC• INTERNALS• GLOSSARY

11.04.23 - Page 1

DépartementOffice

THANK YOU FOR YOUR ATTENTION

11.04.23 - Page 1

DépartementOffice

Bonus slides

ORM: What for ?

[catalyst list] On Thu, 2006-06-08, Steve wrote:

Not intending to start any sort of rancorous discussion, but I was wondering whether someone could illuminate me a little?

I'm comfortable with SQL, and with DBI. I write basic SQL that runs just fine on all databases, or more complex SQL when I want to target a single database (ususally postgresql).

What value does an ORM add for a user like me?

ORM useful for …

• dynamic SQL– navigation between tables– generate complex SQL queries from Perl datastructures– better than phrasebook or string concatenation

• automatic data conversions (inflation / deflation)• expansion of tree data structures coded in the relational

model• transaction encapsulation • data validation• computed fields• caching• …

See Also : http://lists.scsys.co.uk/pipermail/catalyst/2006-June

Many-to-many implementation

author_idauthor_namee_mail

1

*

1 *

* *

Author

distrib_idmodule_id

Dependency

distrib_iddistrib_named_releaseauthor_id

Distribution

module_idmodule_namedistrib_id

Module

1 1

link table forn-to-n association

Writing SQL

SQL is too low-level, I don't ever want to see it

SQL is the most important part of my application, I won't let

anybody write it for me

Why hashref instead of OO accessors ?

• Perl builtin rich API for hashes (keys, values, slices, string interpolation)

• good for import / export in YAML/XML/JSON• easier to follow steps in Perl debugger• faster than OO accessor methods• visually clear distinction between lvalue / rvalue

– my $val = $hashref->{column};– $hashref->{column} = $val;

• visually clear distinction between – $row->{column} / $row->remote_table()

Callback example

WITH RECURSIVE nodetree(level, id, pid, sort) AS ( SELECT 1, id, parent, '{1}'::int[] FROM nodes WHERE parent IS NULL     UNION SELECT level+1,p.id, parent, sort||p.id FROM nodetree pr JOIN nodes p ON p.parent = pr.id

) SELECT * FROM nodetree ORDER BY sort;

my $with_clause = "WITH RECURSIVE …";

DBIx::DataModel->Schema('Tst') ->Table(qw/Nodetree nodetree id/);

my $result = Tst::Nodetree->select ( -post_SQL => sub {my $sql = shift; $sql =~ s/^/$with_clause/; return $sql, @_ }, -orderBy => 'sort',);

Verbose form for definitions

DBIx::DataModel->define_schema( class => "My::DB");

My::DB->metadm->define_table( class => "Author", db_name => "author", primary_key => "author_id"):

New features in 2.0

• metaclasses– client can query about tables, associations, types, etc.– method namespace for regular objects is not polluted by meta-methods

• single-schema / multi-schema mode• misc additions

– bulk update & bulk delete– support for table inheritance– arbitrary clauses in joins

• API changes– perlish_method_names()– SQL generation moved to SQL::Abstract::More– Params::Validate everywhere– deprecated Autoload()– compatibility layer : use DBIx::DataModel –compatibility => 1.0

Migration to v2.0

• deploy DBIx::DataModel 2.0– -compatibility => 1.0

• test• change client code according to new API• test• suppress compatibility layer

DBIx::DataModel history

• 2005 : 1st CPAN publication (v0.10, 16.09.05)• 2006 : YAPC::EU::06 Birmingham presentation• 2008 : heavy refactoring (v1.03, 23.09.08)

– statement object– implicit schema name

• 2011: heavy refactoring– metaobject layer– multi-schema mode– API renaming

Recommended