68
16.06.22 - Page 1 Département Office DBIx::DataModel 2.0 in detail YAPC::EU::2011, Riga [email protected] Département Office

DBIx-DataModel v2.0 in detail

  • Upload
    ldami

  • View
    5.806

  • Download
    0

Embed Size (px)

DESCRIPTION

DBIx-DataModel is an object-relational mapping framework for Perl5. Schema declarations are inspired from UML modelling. The API provides efficient interaction with the DBI layer, detailed control on statement execution steps, flexible and powerful treatment of database joins. More on http://search.cpan.org/dist/DBIx-DataModel. Talk presented at YAPC::EU::2011 Riga (updated from a previous version presented at FPW2010).

Citation preview

Page 1: DBIx-DataModel v2.0 in detail

11.04.23 - Page 1

DépartementOffice

DBIx::DataModel2.0

in detail

YAPC::EU::2011, Riga

[email protected]épartement

Office

Page 2: DBIx-DataModel v2.0 in detail

Agenda

• Introduction : Object-Relational Mappings & UML• DBIx::DataModel 2.0 Architecture• Modelling the schema• Selecting data• Row objects and statement objects• Joins• Inserts and updates• Customization• Strengths and limitations of DBIx::DataModel

today on CPAN :v 1.99_05

Page 3: DBIx-DataModel v2.0 in detail

11.04.23 - Page 1

DépartementOffice

Object-Relational Mappers

Page 4: DBIx-DataModel v2.0 in detail

Perl and databases

Database

DBD driver

DBI

Object-Relational Mapper

Perl program

Page 5: DBIx-DataModel v2.0 in detail

ORM principle

r1r2...

c1 c2 c3

...

c3 c4

+c1: String+c2: String+c3: class2

r1 : class1

RDBMS

r2 : class1

RAM

table1

table2

Page 6: DBIx-DataModel v2.0 in detail

Impedance mismatch

• SELECT c1, c2 FROM table1 missing c3, so cannot navigate to class2 is it a valid instance of class1 ?

• SELECT * FROM table1 LEFT JOIN table2 ON … is it a valid instance of class1 ? what to do with the c4 column ?

• SELECT c1, c2, length(c2) AS l_c2 FROM table1 no predeclared method in class1 for accessing l_c2

c1 c2 c3 c3 c4+c1: String+c2: String+c3: class2

r1 : class1 RDBMSRAMtable1 table2

Page 7: DBIx-DataModel v2.0 in detail

11.04.23 - Page 1

DépartementOffice

The Unified Modeling Language (UML)

Page 8: DBIx-DataModel v2.0 in detail

Example : CPAN model

Author

Distribution Module

1

*

1 *

*

multiplicity

role

class

association

dependent_distribs

*prereq_modules

contains ►

assoc. name

composition

depends_on ►

Page 9: DBIx-DataModel v2.0 in detail

11.04.23 - Page 1

DépartementOffice

Modelling a schema

Page 10: DBIx-DataModel v2.0 in detail

Architecture

Schema

Source

Table Join Statement

My::DB My::DB::Table_n

My::DB::AutoJoin::

row statementrow

DBIx::DataModelclasses

applicationclasses

objects

schema

quite similar to DBI architecture (dbh, sth)

Page 11: DBIx-DataModel v2.0 in detail

All definitions in one single file

use DBIx::DataModel;

DBIx::DataModel->Schema("My::DB")

->Table(qw/Author author author_id /)->Table(qw/Distribution distribution distrib_id/)->Table(qw/Module module module_id /)

->Association([qw/Author author 1 /], [qw/Distribution distribs 0..* /])->Composition([qw/Distribution distrib 1 /], [qw/Module modules 1..* /]);

creates package My::DB

creates package My::DB::Author

adds methods intoboth packages

Page 12: DBIx-DataModel v2.0 in detail

Multiplicities

"$min..$max"

"*" means "0..POSIX::INT_MAX""1" means "1..1"

$min == 0 ? joins are "LEFT OUTER JOIN" : joins are "INNER JOIN"

$max > 1 ? default result is a list : default result is a single object

Page 13: DBIx-DataModel v2.0 in detail

Meta-Architecture

Schema Table Join

My::DB

My::DB::Table_n

My::DB::Auto_join

Meta::Source

Meta::Table

Meta::Join

meta::table

meta::join

meta::schema

Meta::Schema Meta::PathMeta::Association Meta::Type

meta::assoc meta::path

meta::type

Page 14: DBIx-DataModel v2.0 in detail

11.04.23 - Page 1

DépartementOffice

Data retrieval

Page 15: DBIx-DataModel v2.0 in detail

Fetching one single record

# fetch from primary key# by default, retrieves all columns ('*') my $author = My::DB->table('Author')->fetch('DAMI');

# reach columns through the hashref APIwhile (my ($k, $v) = each %$author) {

print "$k : $v\n";}

Page 16: DBIx-DataModel v2.0 in detail

Multi-schema mode

# create a schemamy $dbh = DBI->connect(… ):my $schema = My::DB->new(dbh => $dbh);

# fetch datamy $author = $schema->table('Author')->fetch('DAMI');

Page 17: DBIx-DataModel v2.0 in detail

Fetching a list of records

# select multiple records my $recent_distribs = My::DB->table('Distribution') ->select( -columns => [qw/distrib_name d_release/], -where => {d_release => {'>' => $some_date}}, -order_by => [qw/-d_release +distrib_name/],

);

foreach my $distrib (@$recent_distribs) { ...}

Page 18: DBIx-DataModel v2.0 in detail

Select API : overview

my $result = $source->select( -columns => \@columns,, -where => \%where, -group_by => \@groupings, -having => \%criteria, -order_by => \@order, -for => 'read only', -post_SQL => sub { … }, -pre_exec => sub { … }, -post_exec => sub { … }, -post_bless => sub { … }, -page_size => …, -page_index => …, -limit => …, -offset => …, -column_types => \%types, -result_as => 'rows' || 'sth' || 'sql' || 'statement' || 'hashref');

Page 19: DBIx-DataModel v2.0 in detail

11.04.23 - Page 1

DépartementOffice

Arguments to select()

Page 20: DBIx-DataModel v2.0 in detail

SQL::Abstract::More : named parameters

my $result = $source->select( -columns => \@columns,, -where => \%where, -order_by => \@order,);

SQL::Abstract->new ->select($table, \@columns, \%where, \@order)

Page 21: DBIx-DataModel v2.0 in detail

SQL::Abstract::More : extensions

• -columns => [qw/col1|alias1 max(col2)|alias2/]– SELECT col1 AS alias1, max(col2) AS alias2

• -columns => [-DISTINCT => qw/col1 col2 col3/]– SELECT DISTINCT col1, col2, col3

• -order_by => [qw/col1 +col2 –col3/]– SELECT … ORDER BY col1, col2 ASC, col3 DESC

• -for => "update" || "read only"– SELECT … FOR UPDATE

Page 22: DBIx-DataModel v2.0 in detail

Grouping

• -group_by => [qw/col1 col2 …/]• -having => { col1 => {"<" => val1} , col2 => ... }

– SELECT … GROUP BY col1, col2 HAVING col1 < ? AND col2 …

separate call to SQL::Abstract and re-injection into the SQL

Page 23: DBIx-DataModel v2.0 in detail

Paging

• -page_size => $num_rows, -page_index => $page_index

# or

• -limit => $num_rows, -offset => $row_index

either new call to $sth->execute(), or use scrollable cursors (DBIx.:DataModel::Statement::JDBC)

starts at 1

starts at 0

Page 24: DBIx-DataModel v2.0 in detail

Callbacks

-post_SQL => sub { … }, -pre_exec => sub { … }, -post_exec => sub { … }, -post_bless => sub { … },

• hooks to various states within the statement lifecycle (see later)

• sometimes useful for DB-specific features

Page 25: DBIx-DataModel v2.0 in detail

Polymorphic result

-result_as =>– 'rows' (default) : arrayref of row objects– 'firstrow' : a single row object (or undef)– 'hashref' : hashref keyed by primary keys– [hashref => @cols] : cascaded hashref– 'flat_arrayref' : flattened values from each row– 'statement' : a statement object (iterator)– 'fast_statement' : statement reusing same memory– 'sth' : DBI statement handle– 'sql' : ($sql, @bind_values)– 'subquery' : \["($sql)", @bind]

don't need method variants : select_hashref(), select_arrayref(), etc.

Page 26: DBIx-DataModel v2.0 in detail

11.04.23 - Page 1

DépartementOffice

Row objects

Page 27: DBIx-DataModel v2.0 in detail

A row object …

• is just a hashref– keys are column names– values are column values– nothing else

• actually, when in multi-schema mode, there is an additional __schema field

• is blessed into the table class– has a metadm method (accessor to the metaclass)– has a schema method (accessor to the schema)– has methods for navigating to related tables

• can be dumped as is– to Dumper / YAML / JSON / XML– to Perl debugger

Page 28: DBIx-DataModel v2.0 in detail

Columns …

• basically, are plain scalars, not objects• but can be "inflated/deflated" through a Type()• programmer chooses the column list, at each

select()-columns => \@columns # arrayref -columns => "col1, col2" # string-columns => "*" # default

• objects have variable size !– if missing keys : runtime error

• when following joins• when updating and deleting

Page 29: DBIx-DataModel v2.0 in detail

Navigation to associated tables

• Method names come from association declarations• Exactly like a select()

– automatically chooses –result_as => 'rows' || 'firstrow'from multiplicity information

# ->Association([qw/Author author 1 /],# [qw/Distribution distribs 0..* /])

my $author = $distrib->author();my $other_distribs = $author->distribs( -columns => [qw/. . ./], -where => { . . . }, -order_by => [qw/. . ./],);

Page 30: DBIx-DataModel v2.0 in detail

11.04.23 - Page 1

DépartementOffice

Statement objects

Page 31: DBIx-DataModel v2.0 in detail

Statement: an encapsulated query

statement

meta::source My::Source

1

1

**schemadbh

0..1 *

rownext() / all()

My::Schema meta::schema

in single-schema mode

in multi-schema mode

singleton()

Page 32: DBIx-DataModel v2.0 in detail

Statement lifecycle

new

sqlized

prepared

executed

schema + source

data row(s)

new()

sqlize()

prepare()

execute()

bind()refine()

bind()

bind()

bind()execute()

next() / all()

blessedcolumn types applied

-post_bless

-pre_exec

-post_exec

-post_SQL

Page 33: DBIx-DataModel v2.0 in detail

When to explicitly use a statement

• as iteratormy $statement = $source->select(..., -result_as => 'statement');while (my $row = $statement->next) { . . .}

• for paging$statement->goto_page(123);

• for loop efficiencymy $statement = My::Table->join(qw/role1 role2/); $statement->prepare(-columns => ..., -where => ...); my $list = My::Table->select(...); foreach my $obj (@$list) { my $related_rows = $statement->execute($obj)->all; ... }

Page 34: DBIx-DataModel v2.0 in detail

Fast statement

• like a regular statement– but reuses the same memory location for each row– see DBI::bind_col()

my $statement = $source->select( . . . , -result_as => 'fast_statement');

while (my $row = $statement->next) { . . . # DO THIS : print $row->{col1}, $row->{col2} # BUT DON'T DO THIS : push @results, $row;}

Page 35: DBIx-DataModel v2.0 in detail

11.04.23 - Page 1

DépartementOffice

Database joins

Page 36: DBIx-DataModel v2.0 in detail

Basic join

$rows = My::DB->join(qw/Author distribs modules/) ->select(-where => ...);

Author Distrib Module

My::DB::AutoJoin::…

DBIDM::Source::Join

new class created on the fly

Page 37: DBIx-DataModel v2.0 in detail

Left / inner joins

->Association([qw/Author author 1 /], [qw/Distribution distribs 0..* /])

# default : LEFT OUTER JOIN

->Composition([qw/Distribution distrib 1 /], [qw/Module modules 1..* /]);

# default : INNER JOIN

# but defaults can be overriddenMy::DB->join([qw/Author <=> distribs/)-> . . . My::DB->join([qw/Distribution => modules /)-> . . .

Page 38: DBIx-DataModel v2.0 in detail

Join from an instance

$rows = $author->join(qw/distribs modules/)->select( -columns => [qw/distrib_name module_name/], -where => {d_release => {'<' => $date} },);

SELECT distrib_name, module_nameFROM distribution INNER JOIN module ON distribution.distrib_id = module.distrib_idWHERE distrib.author_id = $author->{author_id} AND d_release < $date

Page 39: DBIx-DataModel v2.0 in detail

11.04.23 - Page 1

DépartementOffice

Insert / Update

Page 40: DBIx-DataModel v2.0 in detail

Insert

@ids = MyDB::Author->insert({ firstname => 'Larry', lastname => 'Wall' },

{ firstname => 'Damian', lastname => 'Conway' },);

INSERT INTO author(firstname, lastname)VALUES (?, ?)

Page 41: DBIx-DataModel v2.0 in detail

Bulk insert

@ids = MyDB::Author->insert([qw/firstname lastname/],[qw/Larry Wall /],

[qw/Damian Conway /],);

Page 42: DBIx-DataModel v2.0 in detail

Insert into / cascaded insert

@id_trees = $author->insert_into_distribs( {distrib_name => 'DBIx-DataModel', modules => [ {module_name => 'DBIx::DataModel', ..}, {module_name => 'DBIx::DataModel::Statement', ..}, ]}, {distrib_name => 'Pod-POM-Web', … }, -returning => {},);

Page 43: DBIx-DataModel v2.0 in detail

Update

$obj->{col1} = $new_val_1;$obj->{col2} = $new_val_2;. . .$opj->update;# orMyDB::Author->update({author_id => $id, col =>

$new_val})# orMyDB::Author->update($id, {col => $new_val})

# or (bulk update)MyDB::Author->update(-set => {col => $new_val},

-where => \%condition)

Page 44: DBIx-DataModel v2.0 in detail

Transaction

MyDB->do_transaction(sub { my $author = MyDB::Author->fetch($author_id, {-for => "read only"} );

my $distribs = $author->distribs(-for => 'update');foreach my $distrib (@$distribs) {my $id = $distrib->{distrib_id};MyDB::Distrib->update($id, {col => $val});

}});

• can be nested• can involve several dbh• no savepoints (yet)

Page 45: DBIx-DataModel v2.0 in detail

11.04.23 - Page 1

DépartementOffice

Other features

Page 46: DBIx-DataModel v2.0 in detail

Named placeholders / bind()

# introduce named placeholders$statement->prepare(-where => { col1 => '?:foo', col2 => {"<" => '?:bar'}, col3 => {">" => '?:bar'}, col3 => 1234, });

# fill placeholders with values$statement->bind(foo => 99, bar => 88, other => 77);$statement->bind($hashref);

$sql @bind

SELECT * FROM .. WHEREcol1 = ? AND col2 < ? AND col3 = ?

-- -- 1234

?:foo ?:bar

Page 47: DBIx-DataModel v2.0 in detail

Types (inflate/deflate)

# declare a TypeMy::DB->Type(Multivalue => from_DB => sub {$_[0] = [split /;/, $_[0]] }, to_DB => sub {$_[0] = join ";", @$_[0] },);

# apply it to some columns in a tableMy::DB::Author->metadm->define_column_type( Multivalue => qw/hobbies languages/,);

Page 48: DBIx-DataModel v2.0 in detail

Auto_expand

# declare auto-expansionsMyDB::Author->define_auto_expand(qw/distributions/);MyDB::Distribution->define_auto_expand(qw/modules/);

# apply to an object (automatically fetches all modules of all distributions of that author)

$author->auto_expand();

# use the data treeuse YAML; print Dump($author);

Page 49: DBIx-DataModel v2.0 in detail

Schema localization

{# a kind of "local MyDB";

my $guard = MyDB->localize_state();

# temporary change class dataMyDB->dbh($new_dbh, %new_options);do_some_work_with_new_dbh();

} # automatically restore previous state

Page 50: DBIx-DataModel v2.0 in detail

Schema generator

perl -MDBIx::DataModel::Schema::Generator \ -e "fromDBI('dbi:connection:string')" -- \ -schema My::New::Schema > My/New/Schema.pm

perl -MDBIx::DataModel::Schema::Generator \ -e "fromDBIxClass('Some::DBIC::Schema')" -- \ -schema My::New::Schema > My/New/Schema.pm

Page 51: DBIx-DataModel v2.0 in detail

Auto_insert / Auto_update / No_update

$table->auto_insert_columns(created_by => sub {$ENV{REMOTE_USER} . ", " . localtime });

$table->auto_update_columns( modified_by => sub {…} );

$table->no_update_columns(qw/row_id/);

can also be declared for the whole schema

Page 52: DBIx-DataModel v2.0 in detail

Extending / customizing DBIx::DataModel

• Schema hooks for– SQL dialects (join syntax, alias syntax, limit / offset, etc.)– last_insert_id

• Ad hoc subclasses for– SQL::Abstract– Table– Join– Statements

• Statement callbacks• Extending table classes

– additional methods– redefining _singleInsert method

Page 53: DBIx-DataModel v2.0 in detail

11.04.23 - Page 1

DépartementOffice

Conclusion

Page 54: DBIx-DataModel v2.0 in detail

Strengths

• centralized definitions of tables & associations• efficiency• improved API for SQL::Abstract • clear conceptual distinction between

– data sources (tables and joinss),– database statements (stateful objects representing SQL

queries)– data rows (lightweight blessed hashrefs)

• concise and flexible syntax for joins• used in production for mission-critical app

– (running Geneva courts)

Page 55: DBIx-DataModel v2.0 in detail

Limitations

• tiny community• no schema versioning• no object caching nor 'dirty columns' • no 'cascaded update' nor 'insert or create'

Page 56: DBIx-DataModel v2.0 in detail

Lots of documentation

• SYNOPSIS AND DESCRIPTION• DESIGN• QUICKSTART• REFERENCE• COOKBOOK• MISC• INTERNALS• GLOSSARY

Page 57: DBIx-DataModel v2.0 in detail

11.04.23 - Page 1

DépartementOffice

THANK YOU FOR YOUR ATTENTION

Page 58: DBIx-DataModel v2.0 in detail

11.04.23 - Page 1

DépartementOffice

Bonus slides

Page 59: DBIx-DataModel v2.0 in detail

ORM: What for ?

[catalyst list] On Thu, 2006-06-08, Steve wrote:

Not intending to start any sort of rancorous discussion, but I was wondering whether someone could illuminate me a little?

I'm comfortable with SQL, and with DBI. I write basic SQL that runs just fine on all databases, or more complex SQL when I want to target a single database (ususally postgresql).

What value does an ORM add for a user like me?

Page 60: DBIx-DataModel v2.0 in detail

ORM useful for …

• dynamic SQL– navigation between tables– generate complex SQL queries from Perl datastructures– better than phrasebook or string concatenation

• automatic data conversions (inflation / deflation)• expansion of tree data structures coded in the relational

model• transaction encapsulation • data validation• computed fields• caching• …

See Also : http://lists.scsys.co.uk/pipermail/catalyst/2006-June

Page 61: DBIx-DataModel v2.0 in detail

Many-to-many implementation

author_idauthor_namee_mail

1

*

1 *

* *

Author

distrib_idmodule_id

Dependency

distrib_iddistrib_named_releaseauthor_id

Distribution

module_idmodule_namedistrib_id

Module

1 1

link table forn-to-n association

Page 62: DBIx-DataModel v2.0 in detail

Writing SQL

SQL is too low-level, I don't ever want to see it

SQL is the most important part of my application, I won't let

anybody write it for me

Page 63: DBIx-DataModel v2.0 in detail

Why hashref instead of OO accessors ?

• Perl builtin rich API for hashes (keys, values, slices, string interpolation)

• good for import / export in YAML/XML/JSON• easier to follow steps in Perl debugger• faster than OO accessor methods• visually clear distinction between lvalue / rvalue

– my $val = $hashref->{column};– $hashref->{column} = $val;

• visually clear distinction between – $row->{column} / $row->remote_table()

Page 64: DBIx-DataModel v2.0 in detail

Callback example

WITH RECURSIVE nodetree(level, id, pid, sort) AS ( SELECT 1, id, parent, '{1}'::int[] FROM nodes WHERE parent IS NULL     UNION SELECT level+1,p.id, parent, sort||p.id FROM nodetree pr JOIN nodes p ON p.parent = pr.id

) SELECT * FROM nodetree ORDER BY sort;

my $with_clause = "WITH RECURSIVE …";

DBIx::DataModel->Schema('Tst') ->Table(qw/Nodetree nodetree id/);

my $result = Tst::Nodetree->select ( -post_SQL => sub {my $sql = shift; $sql =~ s/^/$with_clause/; return $sql, @_ }, -orderBy => 'sort',);

Page 65: DBIx-DataModel v2.0 in detail

Verbose form for definitions

DBIx::DataModel->define_schema( class => "My::DB");

My::DB->metadm->define_table( class => "Author", db_name => "author", primary_key => "author_id"):

Page 66: DBIx-DataModel v2.0 in detail

New features in 2.0

• metaclasses– client can query about tables, associations, types, etc.– method namespace for regular objects is not polluted by meta-methods

• single-schema / multi-schema mode• misc additions

– bulk update & bulk delete– support for table inheritance– arbitrary clauses in joins

• API changes– perlish_method_names()– SQL generation moved to SQL::Abstract::More– Params::Validate everywhere– deprecated Autoload()– compatibility layer : use DBIx::DataModel –compatibility => 1.0

Page 67: DBIx-DataModel v2.0 in detail

Migration to v2.0

• deploy DBIx::DataModel 2.0– -compatibility => 1.0

• test• change client code according to new API• test• suppress compatibility layer

Page 68: DBIx-DataModel v2.0 in detail

DBIx::DataModel history

• 2005 : 1st CPAN publication (v0.10, 16.09.05)• 2006 : YAPC::EU::06 Birmingham presentation• 2008 : heavy refactoring (v1.03, 23.09.08)

– statement object– implicit schema name

• 2011: heavy refactoring– metaobject layer– multi-schema mode– API renaming