Upload
ldami
View
5.806
Download
0
Tags:
Embed Size (px)
DESCRIPTION
DBIx-DataModel is an object-relational mapping framework for Perl5. Schema declarations are inspired from UML modelling. The API provides efficient interaction with the DBI layer, detailed control on statement execution steps, flexible and powerful treatment of database joins. More on http://search.cpan.org/dist/DBIx-DataModel. Talk presented at YAPC::EU::2011 Riga (updated from a previous version presented at FPW2010).
Citation preview
11.04.23 - Page 1
DépartementOffice
DBIx::DataModel2.0
in detail
YAPC::EU::2011, Riga
[email protected]épartement
Office
Agenda
• Introduction : Object-Relational Mappings & UML• DBIx::DataModel 2.0 Architecture• Modelling the schema• Selecting data• Row objects and statement objects• Joins• Inserts and updates• Customization• Strengths and limitations of DBIx::DataModel
today on CPAN :v 1.99_05
11.04.23 - Page 1
DépartementOffice
Object-Relational Mappers
Perl and databases
Database
DBD driver
DBI
Object-Relational Mapper
Perl program
ORM principle
r1r2...
c1 c2 c3
...
c3 c4
+c1: String+c2: String+c3: class2
r1 : class1
RDBMS
r2 : class1
RAM
table1
table2
Impedance mismatch
• SELECT c1, c2 FROM table1 missing c3, so cannot navigate to class2 is it a valid instance of class1 ?
• SELECT * FROM table1 LEFT JOIN table2 ON … is it a valid instance of class1 ? what to do with the c4 column ?
• SELECT c1, c2, length(c2) AS l_c2 FROM table1 no predeclared method in class1 for accessing l_c2
c1 c2 c3 c3 c4+c1: String+c2: String+c3: class2
r1 : class1 RDBMSRAMtable1 table2
11.04.23 - Page 1
DépartementOffice
The Unified Modeling Language (UML)
Example : CPAN model
Author
Distribution Module
1
*
1 *
*
multiplicity
role
class
association
dependent_distribs
*prereq_modules
contains ►
assoc. name
composition
depends_on ►
11.04.23 - Page 1
DépartementOffice
Modelling a schema
Architecture
Schema
Source
Table Join Statement
My::DB My::DB::Table_n
My::DB::AutoJoin::
row statementrow
DBIx::DataModelclasses
applicationclasses
objects
schema
quite similar to DBI architecture (dbh, sth)
All definitions in one single file
use DBIx::DataModel;
DBIx::DataModel->Schema("My::DB")
->Table(qw/Author author author_id /)->Table(qw/Distribution distribution distrib_id/)->Table(qw/Module module module_id /)
->Association([qw/Author author 1 /], [qw/Distribution distribs 0..* /])->Composition([qw/Distribution distrib 1 /], [qw/Module modules 1..* /]);
creates package My::DB
creates package My::DB::Author
adds methods intoboth packages
Multiplicities
"$min..$max"
"*" means "0..POSIX::INT_MAX""1" means "1..1"
$min == 0 ? joins are "LEFT OUTER JOIN" : joins are "INNER JOIN"
$max > 1 ? default result is a list : default result is a single object
Meta-Architecture
Schema Table Join
My::DB
My::DB::Table_n
My::DB::Auto_join
Meta::Source
Meta::Table
Meta::Join
meta::table
meta::join
meta::schema
Meta::Schema Meta::PathMeta::Association Meta::Type
meta::assoc meta::path
meta::type
11.04.23 - Page 1
DépartementOffice
Data retrieval
Fetching one single record
# fetch from primary key# by default, retrieves all columns ('*') my $author = My::DB->table('Author')->fetch('DAMI');
# reach columns through the hashref APIwhile (my ($k, $v) = each %$author) {
print "$k : $v\n";}
Multi-schema mode
# create a schemamy $dbh = DBI->connect(… ):my $schema = My::DB->new(dbh => $dbh);
# fetch datamy $author = $schema->table('Author')->fetch('DAMI');
Fetching a list of records
# select multiple records my $recent_distribs = My::DB->table('Distribution') ->select( -columns => [qw/distrib_name d_release/], -where => {d_release => {'>' => $some_date}}, -order_by => [qw/-d_release +distrib_name/],
);
foreach my $distrib (@$recent_distribs) { ...}
Select API : overview
my $result = $source->select( -columns => \@columns,, -where => \%where, -group_by => \@groupings, -having => \%criteria, -order_by => \@order, -for => 'read only', -post_SQL => sub { … }, -pre_exec => sub { … }, -post_exec => sub { … }, -post_bless => sub { … }, -page_size => …, -page_index => …, -limit => …, -offset => …, -column_types => \%types, -result_as => 'rows' || 'sth' || 'sql' || 'statement' || 'hashref');
11.04.23 - Page 1
DépartementOffice
Arguments to select()
SQL::Abstract::More : named parameters
my $result = $source->select( -columns => \@columns,, -where => \%where, -order_by => \@order,);
SQL::Abstract->new ->select($table, \@columns, \%where, \@order)
SQL::Abstract::More : extensions
• -columns => [qw/col1|alias1 max(col2)|alias2/]– SELECT col1 AS alias1, max(col2) AS alias2
• -columns => [-DISTINCT => qw/col1 col2 col3/]– SELECT DISTINCT col1, col2, col3
• -order_by => [qw/col1 +col2 –col3/]– SELECT … ORDER BY col1, col2 ASC, col3 DESC
• -for => "update" || "read only"– SELECT … FOR UPDATE
Grouping
• -group_by => [qw/col1 col2 …/]• -having => { col1 => {"<" => val1} , col2 => ... }
– SELECT … GROUP BY col1, col2 HAVING col1 < ? AND col2 …
separate call to SQL::Abstract and re-injection into the SQL
Paging
• -page_size => $num_rows, -page_index => $page_index
# or
• -limit => $num_rows, -offset => $row_index
either new call to $sth->execute(), or use scrollable cursors (DBIx.:DataModel::Statement::JDBC)
starts at 1
starts at 0
Callbacks
-post_SQL => sub { … }, -pre_exec => sub { … }, -post_exec => sub { … }, -post_bless => sub { … },
• hooks to various states within the statement lifecycle (see later)
• sometimes useful for DB-specific features
Polymorphic result
-result_as =>– 'rows' (default) : arrayref of row objects– 'firstrow' : a single row object (or undef)– 'hashref' : hashref keyed by primary keys– [hashref => @cols] : cascaded hashref– 'flat_arrayref' : flattened values from each row– 'statement' : a statement object (iterator)– 'fast_statement' : statement reusing same memory– 'sth' : DBI statement handle– 'sql' : ($sql, @bind_values)– 'subquery' : \["($sql)", @bind]
don't need method variants : select_hashref(), select_arrayref(), etc.
11.04.23 - Page 1
DépartementOffice
Row objects
A row object …
• is just a hashref– keys are column names– values are column values– nothing else
• actually, when in multi-schema mode, there is an additional __schema field
• is blessed into the table class– has a metadm method (accessor to the metaclass)– has a schema method (accessor to the schema)– has methods for navigating to related tables
• can be dumped as is– to Dumper / YAML / JSON / XML– to Perl debugger
Columns …
• basically, are plain scalars, not objects• but can be "inflated/deflated" through a Type()• programmer chooses the column list, at each
select()-columns => \@columns # arrayref -columns => "col1, col2" # string-columns => "*" # default
• objects have variable size !– if missing keys : runtime error
• when following joins• when updating and deleting
Navigation to associated tables
• Method names come from association declarations• Exactly like a select()
– automatically chooses –result_as => 'rows' || 'firstrow'from multiplicity information
# ->Association([qw/Author author 1 /],# [qw/Distribution distribs 0..* /])
my $author = $distrib->author();my $other_distribs = $author->distribs( -columns => [qw/. . ./], -where => { . . . }, -order_by => [qw/. . ./],);
11.04.23 - Page 1
DépartementOffice
Statement objects
Statement: an encapsulated query
statement
meta::source My::Source
1
1
**schemadbh
0..1 *
rownext() / all()
My::Schema meta::schema
in single-schema mode
in multi-schema mode
singleton()
Statement lifecycle
new
sqlized
prepared
executed
schema + source
data row(s)
new()
sqlize()
prepare()
execute()
bind()refine()
bind()
bind()
bind()execute()
next() / all()
blessedcolumn types applied
-post_bless
-pre_exec
-post_exec
-post_SQL
When to explicitly use a statement
• as iteratormy $statement = $source->select(..., -result_as => 'statement');while (my $row = $statement->next) { . . .}
• for paging$statement->goto_page(123);
• for loop efficiencymy $statement = My::Table->join(qw/role1 role2/); $statement->prepare(-columns => ..., -where => ...); my $list = My::Table->select(...); foreach my $obj (@$list) { my $related_rows = $statement->execute($obj)->all; ... }
Fast statement
• like a regular statement– but reuses the same memory location for each row– see DBI::bind_col()
my $statement = $source->select( . . . , -result_as => 'fast_statement');
while (my $row = $statement->next) { . . . # DO THIS : print $row->{col1}, $row->{col2} # BUT DON'T DO THIS : push @results, $row;}
11.04.23 - Page 1
DépartementOffice
Database joins
Basic join
$rows = My::DB->join(qw/Author distribs modules/) ->select(-where => ...);
Author Distrib Module
My::DB::AutoJoin::…
DBIDM::Source::Join
new class created on the fly
Left / inner joins
->Association([qw/Author author 1 /], [qw/Distribution distribs 0..* /])
# default : LEFT OUTER JOIN
->Composition([qw/Distribution distrib 1 /], [qw/Module modules 1..* /]);
# default : INNER JOIN
# but defaults can be overriddenMy::DB->join([qw/Author <=> distribs/)-> . . . My::DB->join([qw/Distribution => modules /)-> . . .
Join from an instance
$rows = $author->join(qw/distribs modules/)->select( -columns => [qw/distrib_name module_name/], -where => {d_release => {'<' => $date} },);
SELECT distrib_name, module_nameFROM distribution INNER JOIN module ON distribution.distrib_id = module.distrib_idWHERE distrib.author_id = $author->{author_id} AND d_release < $date
11.04.23 - Page 1
DépartementOffice
Insert / Update
Insert
@ids = MyDB::Author->insert({ firstname => 'Larry', lastname => 'Wall' },
{ firstname => 'Damian', lastname => 'Conway' },);
INSERT INTO author(firstname, lastname)VALUES (?, ?)
Bulk insert
@ids = MyDB::Author->insert([qw/firstname lastname/],[qw/Larry Wall /],
[qw/Damian Conway /],);
Insert into / cascaded insert
@id_trees = $author->insert_into_distribs( {distrib_name => 'DBIx-DataModel', modules => [ {module_name => 'DBIx::DataModel', ..}, {module_name => 'DBIx::DataModel::Statement', ..}, ]}, {distrib_name => 'Pod-POM-Web', … }, -returning => {},);
Update
$obj->{col1} = $new_val_1;$obj->{col2} = $new_val_2;. . .$opj->update;# orMyDB::Author->update({author_id => $id, col =>
$new_val})# orMyDB::Author->update($id, {col => $new_val})
# or (bulk update)MyDB::Author->update(-set => {col => $new_val},
-where => \%condition)
Transaction
MyDB->do_transaction(sub { my $author = MyDB::Author->fetch($author_id, {-for => "read only"} );
my $distribs = $author->distribs(-for => 'update');foreach my $distrib (@$distribs) {my $id = $distrib->{distrib_id};MyDB::Distrib->update($id, {col => $val});
}});
• can be nested• can involve several dbh• no savepoints (yet)
11.04.23 - Page 1
DépartementOffice
Other features
Named placeholders / bind()
# introduce named placeholders$statement->prepare(-where => { col1 => '?:foo', col2 => {"<" => '?:bar'}, col3 => {">" => '?:bar'}, col3 => 1234, });
# fill placeholders with values$statement->bind(foo => 99, bar => 88, other => 77);$statement->bind($hashref);
$sql @bind
SELECT * FROM .. WHEREcol1 = ? AND col2 < ? AND col3 = ?
-- -- 1234
?:foo ?:bar
Types (inflate/deflate)
# declare a TypeMy::DB->Type(Multivalue => from_DB => sub {$_[0] = [split /;/, $_[0]] }, to_DB => sub {$_[0] = join ";", @$_[0] },);
# apply it to some columns in a tableMy::DB::Author->metadm->define_column_type( Multivalue => qw/hobbies languages/,);
Auto_expand
# declare auto-expansionsMyDB::Author->define_auto_expand(qw/distributions/);MyDB::Distribution->define_auto_expand(qw/modules/);
# apply to an object (automatically fetches all modules of all distributions of that author)
$author->auto_expand();
# use the data treeuse YAML; print Dump($author);
Schema localization
{# a kind of "local MyDB";
my $guard = MyDB->localize_state();
# temporary change class dataMyDB->dbh($new_dbh, %new_options);do_some_work_with_new_dbh();
} # automatically restore previous state
Schema generator
perl -MDBIx::DataModel::Schema::Generator \ -e "fromDBI('dbi:connection:string')" -- \ -schema My::New::Schema > My/New/Schema.pm
perl -MDBIx::DataModel::Schema::Generator \ -e "fromDBIxClass('Some::DBIC::Schema')" -- \ -schema My::New::Schema > My/New/Schema.pm
Auto_insert / Auto_update / No_update
$table->auto_insert_columns(created_by => sub {$ENV{REMOTE_USER} . ", " . localtime });
$table->auto_update_columns( modified_by => sub {…} );
$table->no_update_columns(qw/row_id/);
can also be declared for the whole schema
Extending / customizing DBIx::DataModel
• Schema hooks for– SQL dialects (join syntax, alias syntax, limit / offset, etc.)– last_insert_id
• Ad hoc subclasses for– SQL::Abstract– Table– Join– Statements
• Statement callbacks• Extending table classes
– additional methods– redefining _singleInsert method
11.04.23 - Page 1
DépartementOffice
Conclusion
Strengths
• centralized definitions of tables & associations• efficiency• improved API for SQL::Abstract • clear conceptual distinction between
– data sources (tables and joinss),– database statements (stateful objects representing SQL
queries)– data rows (lightweight blessed hashrefs)
• concise and flexible syntax for joins• used in production for mission-critical app
– (running Geneva courts)
Limitations
• tiny community• no schema versioning• no object caching nor 'dirty columns' • no 'cascaded update' nor 'insert or create'
Lots of documentation
• SYNOPSIS AND DESCRIPTION• DESIGN• QUICKSTART• REFERENCE• COOKBOOK• MISC• INTERNALS• GLOSSARY
11.04.23 - Page 1
DépartementOffice
THANK YOU FOR YOUR ATTENTION
11.04.23 - Page 1
DépartementOffice
Bonus slides
ORM: What for ?
[catalyst list] On Thu, 2006-06-08, Steve wrote:
Not intending to start any sort of rancorous discussion, but I was wondering whether someone could illuminate me a little?
I'm comfortable with SQL, and with DBI. I write basic SQL that runs just fine on all databases, or more complex SQL when I want to target a single database (ususally postgresql).
What value does an ORM add for a user like me?
ORM useful for …
• dynamic SQL– navigation between tables– generate complex SQL queries from Perl datastructures– better than phrasebook or string concatenation
• automatic data conversions (inflation / deflation)• expansion of tree data structures coded in the relational
model• transaction encapsulation • data validation• computed fields• caching• …
See Also : http://lists.scsys.co.uk/pipermail/catalyst/2006-June
Many-to-many implementation
author_idauthor_namee_mail
1
*
1 *
* *
Author
distrib_idmodule_id
Dependency
distrib_iddistrib_named_releaseauthor_id
Distribution
module_idmodule_namedistrib_id
Module
1 1
link table forn-to-n association
Writing SQL
SQL is too low-level, I don't ever want to see it
SQL is the most important part of my application, I won't let
anybody write it for me
Why hashref instead of OO accessors ?
• Perl builtin rich API for hashes (keys, values, slices, string interpolation)
• good for import / export in YAML/XML/JSON• easier to follow steps in Perl debugger• faster than OO accessor methods• visually clear distinction between lvalue / rvalue
– my $val = $hashref->{column};– $hashref->{column} = $val;
• visually clear distinction between – $row->{column} / $row->remote_table()
Callback example
WITH RECURSIVE nodetree(level, id, pid, sort) AS ( SELECT 1, id, parent, '{1}'::int[] FROM nodes WHERE parent IS NULL UNION SELECT level+1,p.id, parent, sort||p.id FROM nodetree pr JOIN nodes p ON p.parent = pr.id
) SELECT * FROM nodetree ORDER BY sort;
my $with_clause = "WITH RECURSIVE …";
DBIx::DataModel->Schema('Tst') ->Table(qw/Nodetree nodetree id/);
my $result = Tst::Nodetree->select ( -post_SQL => sub {my $sql = shift; $sql =~ s/^/$with_clause/; return $sql, @_ }, -orderBy => 'sort',);
Verbose form for definitions
DBIx::DataModel->define_schema( class => "My::DB");
My::DB->metadm->define_table( class => "Author", db_name => "author", primary_key => "author_id"):
New features in 2.0
• metaclasses– client can query about tables, associations, types, etc.– method namespace for regular objects is not polluted by meta-methods
• single-schema / multi-schema mode• misc additions
– bulk update & bulk delete– support for table inheritance– arbitrary clauses in joins
• API changes– perlish_method_names()– SQL generation moved to SQL::Abstract::More– Params::Validate everywhere– deprecated Autoload()– compatibility layer : use DBIx::DataModel –compatibility => 1.0
Migration to v2.0
• deploy DBIx::DataModel 2.0– -compatibility => 1.0
• test• change client code according to new API• test• suppress compatibility layer
DBIx::DataModel history
• 2005 : 1st CPAN publication (v0.10, 16.09.05)• 2006 : YAPC::EU::06 Birmingham presentation• 2008 : heavy refactoring (v1.03, 23.09.08)
– statement object– implicit schema name
• 2011: heavy refactoring– metaobject layer– multi-schema mode– API renaming