DBIx-DataModel v2.0 in detail

11.04.23 - Page 1

Département Office

DBIx::DataModel 2.0 in detail

YAPC::EU::2011, Riga

[email protected]

Agenda

• Introduction : Object-Relational Mappings & UML
• DBIx::DataModel 2.0 Architecture
• Modelling the schema
• Selecting data
• Row objects and statement objects
• Joins
• Inserts and updates
• Customization
• Strengths and limitations of DBIx::DataModel

today on CPAN : v1.99_05


Object-Relational Mappers

Perl and databases

Database

DBD driver

DBI

Object-Relational Mapper

Perl program

ORM principle

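[Figure: in the RDBMS, table1 holds rows r1, r2, ... with columns c1, c2, c3, and table2 holds columns c3, c4; in RAM, each row becomes an object (r1 : class1, r2 : class1) of a class with attributes +c1: String, +c2: String, +c3: class2]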

Impedance mismatch

• SELECT c1, c2 FROM table1
  – missing c3, so cannot navigate to class2
  – is it a valid instance of class1 ?

• SELECT * FROM table1 LEFT JOIN table2 ON …
  – is it a valid instance of class1 ?
  – what to do with the c4 column ?

• SELECT c1, c2, length(c2) AS l_c2 FROM table1
  – no predeclared method in class1 for accessing l_c2


The Unified Modeling Language (UML)

Example : CPAN model

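[Figure: UML class diagram with three classes. Author 1 to * Distribution (association). Distribution 1 to * Module (composition, association name "contains ►"). Distribution * to * Module (association name "depends_on ►", roles dependent_distribs and prereq_modules). Callouts label the notation: class, association, assoc. name, role, multiplicity, composition.]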


Modelling a schema

Architecture

[Figure: three layers. DBIx::DataModel classes: Schema, Source, Table, Join, Statement. Application classes: My::DB, My::DB::Table_n, My::DB::AutoJoin:: . Objects: schema, row, statement.]

quite similar to DBI architecture (dbh, sth)

All definitions in one single file

use DBIx::DataModel;

DBIx::DataModel->Schema("My::DB")                    # creates package My::DB
  ->Table(qw/Author       author       author_id /)  # creates package My::DB::Author
  ->Table(qw/Distribution distribution distrib_id/)
  ->Table(qw/Module       module       module_id /)
  ->Association([qw/Author       author   1   /],    # adds methods into both packages
                [qw/Distribution distribs 0..*/])
  ->Composition([qw/Distribution distrib  1   /],
                [qw/Module       modules  1..*/]);

Multiplicities

"$min..$max"

"*" means "0..POSIX::INT_MAX"
"1" means "1..1"

$min == 0 ? joins are "LEFT OUTER JOIN" : joins are "INNER JOIN"

$max > 1 ? default result is a list : default result is a single object
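The two rules above can be sketched in plain Perl. This is only an illustration of the slide's logic, not the library's code: parse_multiplicity(), join_kind() and result_kind() are hypothetical helper names.

```perl
use strict;
use warnings;
use POSIX qw(INT_MAX);

# Hypothetical helper (not part of DBIx::DataModel): parse a multiplicity
# string such as "1", "*", "0..*" or "1..5" into ($min, $max).
sub parse_multiplicity {
    my ($spec) = @_;
    my ($min, $max) = $spec =~ /^(?:(\d+)\.\.)?(\d+|\*)$/
        or die "invalid multiplicity: '$spec'";
    $max = INT_MAX if $max eq '*';
    # shorthand forms: "*" means 0..INT_MAX, a lone number n means n..n
    $min = ($spec eq '*' ? 0 : $max) unless defined $min;
    return ($min, $max);
}

# $min == 0 : the related row may be absent, so an outer join is needed
sub join_kind {
    my ($min) = parse_multiplicity(shift);
    return $min == 0 ? 'LEFT OUTER JOIN' : 'INNER JOIN';
}

# $max > 1 : navigation yields a list, otherwise a single object
sub result_kind {
    my (undef, $max) = parse_multiplicity(shift);
    return $max > 1 ? 'list' : 'single object';
}

for my $spec (qw(1 * 0..* 1..*)) {
    printf "%-5s => %-15s / %s\n", $spec, join_kind($spec), result_kind($spec);
}
```

With these rules, "0..*" gives a LEFT OUTER JOIN returning a list, while "1" gives an INNER JOIN returning a single object.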

Meta-Architecture

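[Figure: the classes Schema, Table, Join and the application classes My::DB, My::DB::Table_n, My::DB::Auto_join are each paired with a metaobject (meta::schema, meta::table, meta::join). The metaclasses are Meta::Schema, Meta::Source, Meta::Table, Meta::Join, Meta::Association, Meta::Path and Meta::Type; further metaobjects are meta::assoc, meta::path and meta::type.]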


Data retrieval

Fetching one single record

# fetch from primary key
# by default, retrieves all columns ('*')
my $author = My::DB->table('Author')->fetch('DAMI');

# reach columns through the hashref API
while (my ($k, $v) = each %$author) {
  print "$k : $v\n";
}

Multi-schema mode

# create a schema
my $dbh    = DBI->connect(… );
my $schema = My::DB->new(dbh => $dbh);

# fetch data
my $author = $schema->table('Author')->fetch('DAMI');

Fetching a list of records

# select multiple records
my $recent_distribs = My::DB->table('Distribution')
  ->select(
    -columns  => [qw/distrib_name d_release/],
    -where    => {d_release => {'>' => $some_date}},
    -order_by => [qw/-d_release +distrib_name/],
  );

foreach my $distrib (@$recent_distribs) {
  ...
}

Select API : overview

my $result = $source->select(
  -columns      => \@columns,
  -where        => \%where,
  -group_by     => \@groupings,
  -having       => \%criteria,
  -order_by     => \@order,
  -for          => 'read only',
  -post_SQL     => sub { … },
  -pre_exec     => sub { … },
  -post_exec    => sub { … },
  -post_bless   => sub { … },
  -page_size    => …,
  -page_index   => …,
  -limit        => …,
  -offset       => …,
  -column_types => \%types,
  -result_as    => 'rows' || 'sth' || 'sql' || 'statement' || 'hashref',
);


Arguments to select()

SQL::Abstract::More : named parameters

my $result = $source->select(
  -columns  => \@columns,
  -where    => \%where,
  -order_by => \@order,
);

SQL::Abstract->new
  ->select($table, \@columns, \%where, \@order)

SQL::Abstract::More : extensions

• -columns => [qw/col1|alias1 max(col2)|alias2/]
  – SELECT col1 AS alias1, max(col2) AS alias2

• -columns => [-DISTINCT => qw/col1 col2 col3/]
  – SELECT DISTINCT col1, col2, col3

• -order_by => [qw/col1 +col2 -col3/]
  – SELECT … ORDER BY col1, col2 ASC, col3 DESC

• -for => "update" || "read only"
  – SELECT … FOR UPDATE
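To make the column alias and ordering shorthands concrete, here is a minimal sketch in plain Perl. It only mimics the mappings listed above and is not the SQL::Abstract::More implementation; column_sql(), order_sql() and the table name t are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helpers, written only to illustrate the notation.

# "col|alias" becomes "col AS alias"; a plain column is left untouched
sub column_sql {
    my ($spec) = @_;
    my ($expr, $alias) = split /\|/, $spec, 2;
    return defined $alias ? "$expr AS $alias" : $expr;
}

# "+col" becomes "col ASC", "-col" becomes "col DESC", "col" stays as is
sub order_sql {
    my ($spec) = @_;
    return $spec =~ /^([-+])(.+)$/
        ? "$2 " . ($1 eq '-' ? 'DESC' : 'ASC')
        : $spec;
}

my $columns  = join ", ", map { column_sql($_) } qw/col1|alias1 max(col2)|alias2/;
my $order_by = join ", ", map { order_sql($_) }  qw/col1 +col2 -col3/;

# "t" is a placeholder table name
print "SELECT $columns FROM t ORDER BY $order_by\n";
```

The sign prefix must be a plain ASCII "-" or "+"; a typographic dash would be treated as part of the column name.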

Grouping

• -group_by => [qw/col1 col2 …/]
• -having => { col1 => {"<" => val1} , col2 => ... }
  – SELECT … GROUP BY col1, col2 HAVING col1 < ? AND col2 …

separate call to SQL::Abstract and re-injection into the SQL

Paging

• -page_size => $num_rows, -page_index => $page_index   (page index starts at 1)

# or

• -limit => $num_rows, -offset => $row_index   (offset starts at 0)

either new call to $sth->execute(), or use scrollable cursors (DBIx::DataModel::Statement::JDBC)
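Because page indexes start at 1 and offsets start at 0, the two notations are related by a simple formula. The sketch below assumes the usual convention offset = (page_index - 1) * page_size; page_to_limit_offset() is a hypothetical helper, not a DBIx::DataModel function.

```perl
use strict;
use warnings;

# Hypothetical helper: convert 1-based page coordinates
# into a row count and a 0-based row offset.
sub page_to_limit_offset {
    my ($page_size, $page_index) = @_;
    die "page_index starts at 1" if $page_index < 1;
    return (
        '-limit'  => $page_size,
        '-offset' => ($page_index - 1) * $page_size,
    );
}

# third page of 50 rows: skip the first 100 rows
my %args = page_to_limit_offset(50, 3);
printf "-limit => %d, -offset => %d\n", $args{'-limit'}, $args{'-offset'};
```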

Callbacks

-post_SQL => sub { … }, -pre_exec => sub { … }, -post_exec => sub { … }, -post_bless => sub { … },

• hooks to various states within the statement lifecycle (see later)

• sometimes useful for DB-specific features

Polymorphic result

-result_as =>
  – 'rows' (default) : arrayref of row objects
  – 'firstrow' : a single row object (or undef)
  – 'hashref' : hashref keyed by primary keys
  – [hashref => @cols] : cascaded hashref
  – 'flat_arrayref' : flattened values from each row
  – 'statement' : a statement object (iterator)
  – 'fast_statement' : statement reusing same memory
  – 'sth' : DBI statement handle
  – 'sql' : ($sql, @bind_values)
  – 'subquery' : \["($sql)", @bind]

don't need method variants : select_hashref(), select_arrayref(), etc.


Row objects

A row object …

• is just a hashref
  – keys are column names
  – values are column values
  – nothing else

• actually, when in multi-schema mode, there is an additional __schema field

• is blessed into the table class
  – has a metadm method (accessor to the metaclass)
  – has a schema method (accessor to the schema)
  – has methods for navigating to related tables

• can be dumped as is
  – to Dumper / YAML / JSON / XML
  – to Perl debugger

Columns …

• basically, are plain scalars, not objects
• but can be "inflated/deflated" through a Type()
• programmer chooses the column list, at each select()

  -columns => \@columns       # arrayref
  -columns => "col1, col2"    # string
  -columns => "*"             # default

• objects have variable size !
  – if missing keys : runtime error
    • when following joins
    • when updating and deleting

Navigation to associated tables

• Method names come from association declarations
• Exactly like a select()
  – automatically chooses -result_as => 'rows' || 'firstrow' from multiplicity information

# ->Association([qw/Author       author   1   /],
#               [qw/Distribution distribs 0..*/])

my $author         = $distrib->author();
my $other_distribs = $author->distribs(
  -columns  => [qw/. . ./],
  -where    => { . . . },
  -order_by => [qw/. . ./],
);


Statement objects

Statement: an encapsulated query

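[Figure: class diagram around the statement object. A statement is linked to one meta::source (for My::Source) and to one schema holding the dbh, and produces rows through next() / all(). Rows keep a 0..1 link to the schema (My::Schema, described by meta::schema); annotations read "in multi-schema mode" and "in single-schema mode", the latter with singleton().]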

Statement lifecycle

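[Figure: statement lifecycle]

• new() : created from schema + source, state "new"
  – bind() and refine() are allowed
• sqlize() : state "sqlized", -post_SQL callback
  – bind() is allowed
• prepare() : state "prepared"
  – bind() is allowed
• execute() : state "executed", -pre_exec and -post_exec callbacks
  – bind() and execute() are allowed
• next() / all() : data row(s), blessed, column types applied, -post_bless callback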

When to explicitly use a statement

• as iterator

  my $statement = $source->select(..., -result_as => 'statement');
  while (my $row = $statement->next) {
    . . .
  }

• for paging

  $statement->goto_page(123);

• for loop efficiency

  my $statement = My::Table->join(qw/role1 role2/);
  $statement->prepare(-columns => ..., -where => ...);
  my $list = My::Table->select(...);
  foreach my $obj (@$list) {
    my $related_rows = $statement->execute($obj)->all;
    ...
  }

Fast statement

• like a regular statement
  – but reuses the same memory location for each row
  – see DBI::bind_col()

my $statement = $source->select(
  . . . ,
  -result_as => 'fast_statement',
);

while (my $row = $statement->next) {
  . . .
  # DO THIS :
  print $row->{col1}, $row->{col2};
  # BUT DON'T DO THIS :
  push @results, $row;
}
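Why storing rows from a fast statement goes wrong can be shown without any database. The sketch below uses a hypothetical stand-in iterator (make_fast_iterator(), with made-up sample data) that overwrites one shared hash on each call, which is the behaviour described above; it is not the library's implementation.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a "fast" iterator: it overwrites ONE hash
# on every call instead of allocating a new hash per row.
sub make_fast_iterator {
    my @data = @_;
    my %buffer;
    return sub {
        my $next = shift @data or return;   # no more rows: stop iterating
        %buffer = %$next;                   # overwrite the shared buffer
        return \%buffer;                    # always the same reference
    };
}

# made-up sample rows
my @source = ({col1 => 'a'}, {col1 => 'b'}, {col1 => 'c'});

my $iter = make_fast_iterator(@source);
my (@kept_refs, @kept_copies);
while (my $row = $iter->()) {
    push @kept_refs,   $row;         # DON'T: every element is the same hashref
    push @kept_copies, { %$row };    # OK: shallow copy taken before the next fetch
}

print join(",", map { $_->{col1} } @kept_refs),   "\n";   # prints "c,c,c"
print join(",", map { $_->{col1} } @kept_copies), "\n";   # prints "a,b,c"
```

If rows must be kept, either copy them as shown or use a regular statement and accept the extra allocations.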


Database joins

Basic join

$rows = My::DB->join(qw/Author distribs modules/)
              ->select(-where => ...);

[Figure: joining Author, Distrib and Module yields a new class My::DB::AutoJoin::…, created on the fly; the diagram relates it to DBIDM::Source::Join]

Left / inner joins

->Association([qw/Author       author   1   /],
              [qw/Distribution distribs 0..*/])
# default : LEFT OUTER JOIN

->Composition([qw/Distribution distrib  1   /],
              [qw/Module       modules  1..*/]);
# default : INNER JOIN

# but defaults can be overridden
My::DB->join(qw/Author       <=> distribs/)-> . . .
My::DB->join(qw/Distribution =>  modules /)-> . . .

Join from an instance

$rows = $author->join(qw/distribs modules/)->select(
  -columns => [qw/distrib_name module_name/],
  -where   => {d_release => {'<' => $date} },
);

SELECT distrib_name, module_name
FROM distribution INNER JOIN module
     ON distribution.distrib_id = module.distrib_id
WHERE distribution.author_id = $author->{author_id}
  AND d_release < $date


Insert / Update

Insert

@ids = MyDB::Author->insert(
  { firstname => 'Larry',  lastname => 'Wall'   },
  { firstname => 'Damian', lastname => 'Conway' },
);

INSERT INTO author(firstname, lastname)
VALUES (?, ?)

Bulk insert

@ids = MyDB::Author->insert(
  [qw/firstname lastname/],
  [qw/Larry     Wall    /],
  [qw/Damian    Conway  /],
);

Insert into / cascaded insert

@id_trees = $author->insert_into_distribs(
  {distrib_name => 'DBIx-DataModel',
   modules      => [
     {module_name => 'DBIx::DataModel', ..},
     {module_name => 'DBIx::DataModel::Statement', ..},
   ]},
  {distrib_name => 'Pod-POM-Web', … },
  -returning => {},
);

Update

$obj->{col1} = $new_val_1;
$obj->{col2} = $new_val_2;
. . .
$obj->update;

# or
MyDB::Author->update({author_id => $id, col => $new_val})

# or
MyDB::Author->update($id, {col => $new_val})

# or (bulk update)
MyDB::Author->update(-set   => {col => $new_val},
                     -where => \%condition)

Transaction

MyDB->do_transaction(sub {
  my $author = MyDB::Author->fetch($author_id, {-for => "read only"});

  my $distribs = $author->distribs(-for => 'update');
  foreach my $distrib (@$distribs) {
    my $id = $distrib->{distrib_id};
    MyDB::Distrib->update($id, {col => $val});
  }
});

• can be nested
• can involve several dbh
• no savepoints (yet)


Other features

Named placeholders / bind()

# introduce named placeholders
$statement->prepare(-where => {
  col1 => '?:foo',
  col2 => {"<" => '?:bar'},
  col3 => {">" => '?:bar'},
  col3 => 1234,
});

# fill placeholders with values
$statement->bind(foo => 99, bar => 88, other => 77);
$statement->bind($hashref);

$sql  : SELECT * FROM .. WHERE col1 = ? AND col2 < ? AND col3 = ?
@bind : ?:foo, ?:bar, 1234   (the first two slots stay unfilled until bind() is called)
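The idea behind named placeholders can be sketched as a substitution over the bind list. This is an illustration under stated assumptions, not the library's code: fill_named_placeholders() is a hypothetical helper, and leaving unknown names untouched is a choice made for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: replace "?:name" markers in a bind list
# by the values supplied for those names.
sub fill_named_placeholders {
    my ($bind, %values) = @_;
    return map {
        my $v = $_;
        if (defined $v && !ref $v && $v =~ /^\?:(\w+)$/) {
            # names without a supplied value stay unfilled (assumption)
            $v = $values{$1} if exists $values{$1};
        }
        $v;
    } @$bind;
}

my @bind   = ('?:foo', '?:bar', 1234);
my @filled = fill_named_placeholders(\@bind, foo => 99, bar => 88, other => 77);
print "@filled\n";   # prints "99 88 1234"
```

Values supplied for names that do not occur in the statement (here "other") are simply ignored in this sketch.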

Types (inflate/deflate)

# declare a Type
My::DB->Type(Multivalue =>
  from_DB => sub {$_[0] = [split /;/, $_[0]] },
  to_DB   => sub {$_[0] = join ";", @{$_[0]} },
);

# apply it to some columns in a table
My::DB::Author->metadm->define_column_type(
  Multivalue => qw/hobbies languages/,
);
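The handlers work by modifying $_[0] in place, which in Perl is an alias to the caller's value. The standalone sketch below applies two such handlers to a plain hashref with made-up data, without DBIx::DataModel; note that the array dereference has to be written @{$_[0]}, since @$_[0] would be parsed as an element of @$_.

```perl
use strict;
use warnings;

# Handlers in the style of the Type() declaration above:
# each one rewrites $_[0], an alias to the column value passed in.
my %multivalue = (
    from_DB => sub { $_[0] = [split /;/, $_[0]] },     # string  => arrayref
    to_DB   => sub { $_[0] = join ";", @{$_[0]} },     # arrayref => string
);

# made-up row, standing in for a record fetched from the database
my $row = { name => 'Larry', hobbies => 'perl;linguistics' };

# inflate: the hash slot itself is replaced by an arrayref
$multivalue{from_DB}->($row->{hobbies});
print scalar(@{ $row->{hobbies} }), " hobbies\n";   # prints "2 hobbies"

# work on the inflated value, then deflate before writing back
push @{ $row->{hobbies} }, 'chess';
$multivalue{to_DB}->($row->{hobbies});
print "$row->{hobbies}\n";                          # prints "perl;linguistics;chess"
```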

Auto_expand

# declare auto-expansions
MyDB::Author->define_auto_expand(qw/distributions/);
MyDB::Distribution->define_auto_expand(qw/modules/);

# apply to an object (automatically fetches all modules
# of all distributions of that author)
$author->auto_expand();

# use the data tree
use YAML; print Dump($author);

Schema localization

{ # a kind of "local MyDB";
  my $guard = MyDB->localize_state();

  # temporary change class data
  MyDB->dbh($new_dbh, %new_options);
  do_some_work_with_new_dbh();

} # automatically restore previous state
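The guard idiom relies on Perl's deterministic destruction: when $guard goes out of scope, its DESTROY method runs and restores the saved state. The sketch below shows the general technique with a hypothetical My::Guard class and a plain %state hash; it is not how localize_state() is actually implemented.

```perl
use strict;
use warnings;

# Hypothetical guard class: runs a restore callback when destroyed
package My::Guard {
    sub new     { my ($class, $restore) = @_; bless { restore => $restore }, $class }
    sub DESTROY { $_[0]{restore}->() }
}

# stand-in for class data such as the current database handle
my %state = (dbh => 'main_dbh');

# save a copy of the state and return a guard that will restore it
sub localize_state {
    my %saved = %state;
    return My::Guard->new(sub { %state = %saved });
}

{
    my $guard = localize_state();
    $state{dbh} = 'temporary_dbh';        # temporary change
    print "inside : $state{dbh}\n";       # prints "inside : temporary_dbh"
}   # $guard is destroyed here, which restores %state

print "outside: $state{dbh}\n";           # prints "outside: main_dbh"
```

The guard must be kept in a lexical variable: calling localize_state() in void context would destroy the guard, and restore the state, immediately.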

Schema generator

perl -MDBIx::DataModel::Schema::Generator \
     -e "fromDBI('dbi:connection:string')" -- \
     -schema My::New::Schema > My/New/Schema.pm

perl -MDBIx::DataModel::Schema::Generator \
     -e "fromDBIxClass('Some::DBIC::Schema')" -- \
     -schema My::New::Schema > My/New/Schema.pm

Auto_insert / Auto_update / No_update

$table->auto_insert_columns(created_by => sub {$ENV{REMOTE_USER} . ", " . localtime });

$table->auto_update_columns( modified_by => sub {…} );

$table->no_update_columns(qw/row_id/);

can also be declared for the whole schema

Extending / customizing DBIx::DataModel

• Schema hooks for
  – SQL dialects (join syntax, alias syntax, limit / offset, etc.)
  – last_insert_id
• Ad hoc subclasses for
  – SQL::Abstract
  – Table
  – Join
  – Statements
• Statement callbacks
• Extending table classes
  – additional methods
  – redefining _singleInsert method


Conclusion

Strengths

• centralized definitions of tables & associations
• efficiency
• improved API for SQL::Abstract
• clear conceptual distinction between
  – data sources (tables and joins),
  – database statements (stateful objects representing SQL queries)
  – data rows (lightweight blessed hashrefs)
• concise and flexible syntax for joins
• used in production for mission-critical app
  – (running Geneva courts)

Limitations

• tiny community
• no schema versioning
• no object caching nor 'dirty columns'
• no 'cascaded update' nor 'insert or create'

Lots of documentation

• SYNOPSIS AND DESCRIPTION
• DESIGN
• QUICKSTART
• REFERENCE
• COOKBOOK
• MISC
• INTERNALS
• GLOSSARY


THANK YOU FOR YOUR ATTENTION


Bonus slides

ORM: What for ?

[catalyst list] On Thu, 2006-06-08, Steve wrote:

Not intending to start any sort of rancorous discussion, but I was wondering whether someone could illuminate me a little?

I'm comfortable with SQL, and with DBI. I write basic SQL that runs just fine on all databases, or more complex SQL when I want to target a single database (usually postgresql).

What value does an ORM add for a user like me?

ORM useful for …

• dynamic SQL
  – navigation between tables
  – generate complex SQL queries from Perl datastructures
  – better than phrasebook or string concatenation
• automatic data conversions (inflation / deflation)
• expansion of tree data structures coded in the relational model
• transaction encapsulation
• data validation
• computed fields
• caching
• …

See Also : http://lists.scsys.co.uk/pipermail/catalyst/2006-June

Many-to-many implementation

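[Figure: many-to-many implementation]

• Author (author_id, author_name, e_mail)
• Distribution (distrib_id, distrib_name, d_release, author_id)
• Module (module_id, module_name, distrib_id)
• Dependency (distrib_id, module_id) : link table for n-to-n association

Author 1 to * Distribution ; Distribution 1 to * Module ; Distribution 1 to * Dependency * to 1 Module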

Writing SQL

SQL is too low-level, I don't ever want to see it

SQL is the most important part of my application, I won't let anybody write it for me

Why hashref instead of OO accessors ?

• Perl builtin rich API for hashes (keys, values, slices, string interpolation)
• good for import / export in YAML/XML/JSON
• easier to follow steps in Perl debugger
• faster than OO accessor methods
• visually clear distinction between lvalue / rvalue
  – my $val = $hashref->{column};
  – $hashref->{column} = $val;
• visually clear distinction between
  – $row->{column} / $row->remote_table()

Callback example

WITH RECURSIVE nodetree(level, id, pid, sort) AS (
  SELECT 1, id, parent, '{1}'::int[]
    FROM nodes WHERE parent IS NULL
  UNION
  SELECT level+1, p.id, parent, sort||p.id
    FROM nodetree pr JOIN nodes p ON p.parent = pr.id
)
SELECT * FROM nodetree ORDER BY sort;

my $with_clause = "WITH RECURSIVE …";

DBIx::DataModel->Schema('Tst')
  ->Table(qw/Nodetree nodetree id/);

my $result = Tst::Nodetree->select(
  -post_SQL => sub {my $sql = shift; $sql =~ s/^/$with_clause/; return $sql, @_ },
  -order_by => 'sort',
);

Verbose form for definitions

DBIx::DataModel->define_schema(
  class => "My::DB",
);

My::DB->metadm->define_table(
  class       => "Author",
  db_name     => "author",
  primary_key => "author_id",
);

New features in 2.0

• metaclasses
  – client can query about tables, associations, types, etc.
  – method namespace for regular objects is not polluted by meta-methods
• single-schema / multi-schema mode
• misc additions
  – bulk update & bulk delete
  – support for table inheritance
  – arbitrary clauses in joins
• API changes
  – perlish_method_names()
  – SQL generation moved to SQL::Abstract::More
  – Params::Validate everywhere
  – deprecated Autoload()
  – compatibility layer : use DBIx::DataModel -compatibility => 1.0

Migration to v2.0

• deploy DBIx::DataModel 2.0
  – -compatibility => 1.0
• test
• change client code according to new API
• test
• suppress compatibility layer

DBIx::DataModel history

• 2005 : 1st CPAN publication (v0.10, 16.09.05)
• 2006 : YAPC::EU::06 Birmingham presentation
• 2008 : heavy refactoring (v1.03, 23.09.08)
  – statement object
  – implicit schema name
• 2011 : heavy refactoring
  – metaobject layer
  – multi-schema mode
  – API renaming
