Relational Database Access with Python
Why you should be using SQLAlchemy
Mark Rees
CTO
Century Software (M) Sdn. Bhd.
Is This Your Current Relational Database Access Style?
# Django ORM
>>> from ip2country.models import Ip2Country
>>> Ip2Country.objects.all()
[<Ip2Country: Ip2Country object>, <Ip2Country: Ip2Country object>, '...(remaining elements truncated)...']
>>> myp = Ip2Country.objects.filter(assigned__year=2015)\
...     .filter(countrycode2='MY')
>>> myp[0].ipfrom
736425984.0
Is This Your Current Relational Database Access Style?
# SQLAlchemy ORM
>>> from sqlalchemy import create_engine, extract
>>> from sqlalchemy.orm import sessionmaker
>>> from models import Ip2Country
>>> engine = create_engine('postgresql://ip2country_rw:secret@localhost/ip2country')
>>> Session = sessionmaker(bind=engine)
>>> session = Session()
>>> all_data = session.query(Ip2Country).all()
>>> myp = session.query(Ip2Country).\
...     filter(extract('year', Ip2Country.assigned) == 2015).\
...     filter(Ip2Country.countrycode2 == 'MY')
>>> print(myp[0].ipfrom)
736425984.0
SQL Relational Database Access
SELECT * FROM ip2country;

"id";"ipfrom";"ipto";"registry";"assigned";"countrycode2";"countrycode3";"countryname"
1;1729522688;1729523711;"apnic";"2011-08-05";"CN";"CHN";"China"
2;1729523712;1729524735;"apnic";"2011-08-05";"CN";"CHN";"China"
. . .

SELECT * FROM ip2country
WHERE date_part('year', assigned) = 2015
AND countrycode2 = 'MY';

"id";"ipfrom";"ipto";"registry";"assigned";"countrycode2";"countrycode3";"countryname"
5217;736425984;736427007;"apnic";"2015-01-13 00:00:00";"MY";"MYS";"Malaysia"
5218;736427008;736428031;"apnic";"2015-01-13 00:00:00";"MY";"MYS";"Malaysia"
. . .

SELECT ipfrom FROM ip2country
WHERE date_part('year', assigned) = 2015
AND countrycode2 = 'MY';

"ipfrom"
736425984
736427008
. . .
Python + SQL == Python DB-API 2.0
• The Python standard for a consistent interface to relational databases is the Python DB-API (PEP 249)
• The majority of Python database interfaces adhere to this standard
Python DB-API UML Diagram
Python DB-API Connection Object
Access the database via the connection object
• Use connect constructor to create a connection with database
  conn = psycopg2.connect(parameters…)
• Create cursor via the connection
  cur = conn.cursor()
• Transaction management (implicit begin)
  conn.commit()
  conn.rollback()
• Close connection (will rollback current transaction)
  conn.close()
• Check module capabilities by globals
  psycopg2.apilevel
  psycopg2.threadsafety
  psycopg2.paramstyle
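The connection calls listed above can be tried without a database server. The following is a minimal sketch that swaps psycopg2 for the standard-library sqlite3 module (an assumption made only so the example is self-contained); the in-memory database and the one-column registry table are illustrative.

```python
import sqlite3

# Assumption: sqlite3 stands in for psycopg2; both follow DB-API 2.0.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE registry (name TEXT)")
conn.commit()

# The INSERT implicitly begins a transaction; rollback() discards it.
cur.execute("INSERT INTO registry (name) VALUES ('apnic')")
conn.rollback()
cur.execute("SELECT count(*) FROM registry")
after_rollback = cur.fetchone()[0]

# The same INSERT followed by commit() is kept.
cur.execute("INSERT INTO registry (name) VALUES ('apnic')")
conn.commit()
cur.execute("SELECT count(*) FROM registry")
after_commit = cur.fetchone()[0]

# Module-level globals describing what the adaptor supports.
capabilities = (sqlite3.apilevel, sqlite3.threadsafety, sqlite3.paramstyle)

print(after_rollback, after_commit, capabilities)
conn.close()
```

With psycopg2 the same sequence should apply, with psycopg2.connect(...) taking a PostgreSQL connection string instead of ":memory:".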
Python DB-API Cursor Object
A cursor object is used to represent a database cursor, which is used to manage the context of fetch operations.
• Cursors created from the same connection are not isolated
  cur = conn.cursor()
  cur2 = conn.cursor()
• Cursor methods
  cur.execute(operation, parameters)
  cur.executemany(op, seq_of_parameters)
  cur.fetchone()
  cur.fetchmany([size=cursor.arraysize])
  cur.fetchall()
  cur.close()
Python DB-API Cursor Object
• Optional cursor methods
  cur.scroll(value[, mode='relative'])
  cur.next()
  cur.callproc(procname[, parameters])
  cur.__iter__()
• Results of an operation
  cur.description
  cur.rowcount
  cur.lastrowid
• DB adaptor specific “proprietary” cursor methods
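As a rough illustration of the cursor methods and result attributes above, here is a sketch using the standard-library sqlite3 module in place of psycopg2 (an assumption for portability). The registry table and its three names are sample values chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur2 = conn.cursor()  # second cursor on the same connection

cur.execute("CREATE TABLE registry (id INTEGER PRIMARY KEY, name TEXT)")

# execute() runs one statement; lastrowid reports the generated key.
cur.execute("INSERT INTO registry (name) VALUES (?)", ("apnic",))
first_id = cur.lastrowid

# executemany() repeats the statement for each parameter tuple.
cur.executemany("INSERT INTO registry (name) VALUES (?)",
                [("ripencc",), ("lacnic",)])
inserted = cur.rowcount

# Cursors on one connection are not isolated: cur2 sees the rows
# inserted through cur even though nothing has been committed yet.
cur2.execute("SELECT count(*) FROM registry")
seen_by_cur2 = cur2.fetchone()[0]

cur.execute("SELECT id, name FROM registry ORDER BY id")
column_names = [col[0] for col in cur.description]
first = cur.fetchone()    # next single row
some = cur.fetchmany(1)   # list with up to 1 row
rest = cur.fetchall()     # list with all remaining rows

print(column_names, first, some, rest)
cur.close()
cur2.close()
conn.close()
```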
Python DB-API Parameter Styles
Allows you to keep SQL separate from parameters
Improves performance & security
Warning: Never, never, NEVER use Python string concatenation (+) or string parameters interpolation (%) to pass variables to a SQL query string. Not even at gunpoint.
From http://initd.org/psycopg/docs/usage.html#query-parameters
Python DB-API Parameter Styles
Global paramstyle gives supported style for the adaptor

qmark     Question mark style          WHERE countrycode2 = ?
numeric   Numeric positional style     WHERE countrycode2 = :1
named     Named style                  WHERE countrycode2 = :code
format    ANSI C printf format style   WHERE countrycode2 = %s
pyformat  Python format style          WHERE countrycode2 = %(name)s
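To make the warning concrete, the sketch below binds parameters with the qmark and named styles, then contrasts that with string interpolation on a hostile value. It assumes the standard-library sqlite3 module (psycopg2 would use %s or %(name)s instead) and a cut-down two-column ip2country table invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ip2country (ipfrom INTEGER, countrycode2 TEXT)")
conn.executemany("INSERT INTO ip2country VALUES (?, ?)",
                 [(736425984, "MY"), (1729522688, "CN")])

style = sqlite3.paramstyle  # the style this adaptor advertises

# qmark style: positional placeholders, parameters passed as a sequence.
qmark_rows = conn.execute(
    "SELECT ipfrom FROM ip2country WHERE countrycode2 = ?",
    ("MY",)).fetchall()

# named style: placeholders looked up in a mapping.
named_rows = conn.execute(
    "SELECT ipfrom FROM ip2country WHERE countrycode2 = :code",
    {"code": "MY"}).fetchall()

# A hostile value is treated as plain data when bound as a parameter...
hostile = "MY' OR '1'='1"
safe_rows = conn.execute(
    "SELECT ipfrom FROM ip2country WHERE countrycode2 = ?",
    (hostile,)).fetchall()

# ...but rewrites the query when interpolated into the SQL string.
# Shown only to illustrate the problem; do not do this.
unsafe_sql = ("SELECT ipfrom FROM ip2country WHERE countrycode2 = '%s'"
              % hostile)
unsafe_rows = conn.execute(unsafe_sql).fetchall()

print(style, qmark_rows, named_rows, safe_rows, len(unsafe_rows))
conn.close()
```

The bound query matches nothing, while the interpolated one returns every row in the table.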
Python + SQL: INSERT
import csv, datetime, psycopg2

conn = psycopg2.connect("dbname=ip2countrydb user=ip2country_rw password=secret")
cur = conn.cursor()
with open("IpToCountry.csv", "rt") as f:
    reader = csv.reader(f)
    try:
        for row in reader:
            # skip blank lines and the "#" comment lines in the csv file
            if row and not row[0].startswith("#"):
                # assigned is stored as a Unix timestamp
                row[3] = datetime.datetime.utcfromtimestamp(float(row[3]))
                cur.execute("""INSERT INTO ip2country(
                    ipfrom, ipto, registry, assigned,
                    countrycode2, countrycode3, countryname)
                    VALUES (%s, %s, %s, %s, %s, %s, %s)""", row)
    except (Exception) as error:
        # any failure discards the whole load
        print(error)
        conn.rollback()
    else:
        conn.commit()
    finally:
        cur.close()
        conn.close()
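A variation on the load script above, sketched with the standard-library sqlite3 module so it runs without PostgreSQL. Several things here are assumptions for illustration: the in-memory CSV (two rows taken from the query output shown earlier, with the dates re-encoded as Unix timestamps), the parse_rows helper, and the use of executemany with the connection as a context manager, which sqlite3 supports as a commit-or-rollback block.

```python
import csv
import io
import sqlite3
from datetime import datetime, timezone


def ts(year, month, day):
    """Unix timestamp for midnight UTC on the given date."""
    return int(datetime(year, month, day, tzinfo=timezone.utc).timestamp())


# Stand-in for IpToCountry.csv: a comment line, a blank line, two data rows.
sample_csv = io.StringIO(
    "# comment line\n"
    "\n"
    '"1729522688","1729523711","apnic","%d","CN","CHN","China"\n'
    '"736425984","736427007","apnic","%d","MY","MYS","Malaysia"\n'
    % (ts(2011, 8, 5), ts(2015, 1, 13))
)


def parse_rows(reader):
    """Yield one parameter tuple per data row (hypothetical helper)."""
    for row in reader:
        if not row or row[0].startswith("#"):
            continue
        assigned = datetime.fromtimestamp(float(row[3]), tz=timezone.utc)
        yield (int(row[0]), int(row[1]), row[2],
               assigned.date().isoformat(), row[4], row[5], row[6])


conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE ip2country (
    id INTEGER PRIMARY KEY, ipfrom INTEGER, ipto INTEGER, registry TEXT,
    assigned TEXT, countrycode2 TEXT, countrycode3 TEXT, countryname TEXT)""")

try:
    with conn:  # commits on success, rolls back if an exception escapes
        conn.executemany(
            """INSERT INTO ip2country (ipfrom, ipto, registry, assigned,
                   countrycode2, countrycode3, countryname)
               VALUES (?, ?, ?, ?, ?, ?, ?)""",
            parse_rows(csv.reader(sample_csv)))
except sqlite3.Error as error:
    print(error)

loaded = conn.execute(
    "SELECT ipfrom, assigned, countryname FROM ip2country ORDER BY ipfrom"
).fetchall()
print(loaded)
conn.close()
```

conn.execute and conn.executemany are sqlite3 shortcuts rather than part of DB-API 2.0; with other adaptors the same calls would go through a cursor.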
Python + SQL: SELECT
# Find ipv4 address ranges assigned to Malaysia
import psycopg2, socket, struct

def num_to_dotted_quad(n):
    """convert long int to dotted quad string
    http://code.activestate.com/recipes/66517/"""
    return socket.inet_ntoa(struct.pack('!L', n))

conn = psycopg2.connect("dbname=ip2countrydb user=ip2country_rw password=secret")
cur = conn.cursor()
# Select the two range columns explicitly so that row[0] and row[1]
# do not depend on column order (SELECT * returns id first)
cur.execute("""SELECT ipfrom, ipto FROM ip2country
               WHERE countrycode2 = 'MY'
               ORDER BY ipfrom""")
for row in cur:
    print("%s - %s" % (num_to_dotted_quad(int(row[0])),
                       num_to_dotted_quad(int(row[1]))))
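The integer-to-address conversion used above can be checked in isolation. This sketch reuses num_to_dotted_quad, adds an inverse named dotted_quad_to_num (a name introduced here for illustration), and compares against the standard-library ipaddress module, using the first Malaysian range from the earlier query output.

```python
import ipaddress
import socket
import struct


def num_to_dotted_quad(n):
    """Pack n as a 32-bit big-endian integer and format it as a.b.c.d."""
    return socket.inet_ntoa(struct.pack("!L", n))


def dotted_quad_to_num(ip):
    """Inverse conversion: a.b.c.d string back to an integer."""
    return struct.unpack("!L", socket.inet_aton(ip))[0]


start = num_to_dotted_quad(736425984)
end = num_to_dotted_quad(736427007)
print("%s - %s" % (start, end))  # 43.228.248.0 - 43.228.251.255

# The ipaddress module gives the same answer without manual packing.
via_ipaddress = str(ipaddress.IPv4Address(736425984))
```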
SQLite
• sqlite3
  • CPython 2.5 & 3
  • DB-API 2.0
  • Part of CPython distribution since 2.5
PostgreSQL
• psycopg
  • CPython 2 & 3
  • DB-API 2.0, level 2 thread safe
  • Appears to be most popular
  • http://initd.org/psycopg/
• py-postgresql
  • CPython 3
  • DB-API 2.0
  • Written in Python with optional C optimizations
  • pg_python - console
  • http://python.projects.postgresql.org/
PostgreSQL
• PyGreSQL
  • CPython 2.5+
  • Classic & DB-API 2.0 interfaces
  • http://www.pygresql.org/
• pyPgSQL
  • CPython 2
  • Classic & DB-API 2.0 interfaces
  • http://pypgsql.sourceforge.net/
  • Last release 2006
PostgreSQL
• pypq
  • CPython 2.7 & pypy 1.7+
  • Uses ctypes
  • DB-API 2.0 interface
  • psycopg2-like extension API
  • https://bitbucket.org/descent/pypq
• psycopg2cffi
  • CPython 2.6+ & pypy 2.0+
  • Uses cffi
  • DB-API 2.0 interface
  • psycopg2 compat layer
  • https://github.com/chtd/psycopg2cffi
MySQL
• MySQL-python
  • CPython 2.3+
  • DB-API 2.0 interface
  • http://sourceforge.net/projects/mysql-python/
• PyMySQL
  • CPython 2.4+ & 3
  • Pure Python DB-API 2.0 interface
  • http://www.pymysql.org/
• MySQL-Connector
  • CPython 2.4+ & 3
  • Pure Python DB-API 2.0 interface
  • https://launchpad.net/myconnpy
Other “Enterprise” Databases
• cx_Oracle
  • CPython 2 & 3
  • DB-API 2.0 interface
  • http://cx-oracle.sourceforge.net/
• informixdb
  • CPython 2
  • DB-API 2.0 interface
  • http://informixdb.sourceforge.net/
  • Last release 2007
• ibm-db
  • CPython 2
  • DB-API 2.0 for DB2 & Informix
  • http://code.google.com/p/ibm-db/
ODBC
• mxODBC
  • CPython 2.3+
  • DB-API 2.0 interfaces
  • http://www.egenix.com/products/python/mxODBC/doc
  • Commercial product
• PyODBC
  • CPython 2 & 3
  • DB-API 2.0 interfaces with extensions
  • https://github.com/mkleehammer/pyodbc
• ODBC interfaces not limited to Windows thanks to iODBC and unixODBC
Jython + SQL
• zxJDBC
  • DB-API 2.0 Written in Java using JDBC API so can utilize JDBC drivers
  • Support for connection pools and JNDI lookup
  • Included with standard Jython installation http://www.jython.org/
• jyjdbc
  • DB-API 2.0 compliant
  • Written in Python/Jython so can utilize JDBC drivers
  • Decimal data type support
  • https://bitbucket.org/clach04/jyjdbc/
IronPython + SQL
• adodbapi
  • IronPython 2+
  • Also works with CPython 2.3+ with pywin32
  • http://adodbapi.sourceforge.net/
Gerald, the half a schema
import gerald
s1 = gerald.PostgresSchema('public', 'postgres://ip2country_rw:secret@localhost/ip2country')
s2 = gerald.PostgresSchema('public', 'postgres://ip2country_rw:secret@localhost/ip2countryv4')
print(s1.schema['ip2country'].compare(s2.schema['ip2country']))
DIFF: Definition of assigned is different
DIFF: Column countryname not in ip2country
DIFF: Definition of registry is different
DIFF: Column countrycode3 not in ip2country
DIFF: Definition of countrycode2 is different
• Database schema toolkit
• via DB-API currently supports
  • PostgreSQL
  • MySQL
  • Oracle
• http://halfcooked.com/code/gerald/
SQLPython
$ sqlpython --postgresql ip2country ip2country_rw
Password:
0:ip2country_rw@ip2country> select * from ip2country where countrycode2='SG';
...
1728830464.0 1728830719.0 apnic 2011-11-02 SG SGP Singapore
551 rows selected.
0:ip2country_rw@ip2country> select * from ip2country where countrycode2='SG'\j
[...
{"ipfrom": 1728830464.0, "ipto": 1728830719.0, "registry": "apnic", "assigned": "2011-11-02", "countrycode2": "SG", "countrycode3": "SGP", "countryname": "Singapore"}]
• A command-line interface to relational databases
• via DB-API currently supports
  • PostgreSQL
  • MySQL
  • Oracle
• http://packages.python.org/sqlpython/
SQLPython, batteries included
0:ip2country_rw@ip2country> select * from ip2country where countrycode2 ='MY';
...
1728830464.0 1728830719.0 apnic 2011-11-02 MY MYS Malaysia
551 rows selected.
0:ip2country_rw@ip2country> py
Python 2.6.6 (r266:84292, May 20 2011, 16:42:25)
[GCC 4.4.5 20110214 (Red Hat 4.4.5-6)] on linux2

    py <command>: Executes a Python command.
    py: Enters interactive Python mode.
    End with `Ctrl-D` (Unix) / `Ctrl-Z` (Windows), `quit()`, 'exit()`.
    Past SELECT results are exposed as list `r`;
        most recent resultset is `r[-1]`.
    SQL bind, substitution variables are exposed as `binds`, `substs`.
    Run python code from external files with ``run("filename.py")``
>>> r[-1][-1]
(1728830464.0, 1728830719.0, 'apnic', datetime.date(2011, 11, 2), 'MY', 'MYS', 'Malaysia')
>>> import socket, struct
>>> def num_to_dotted_quad(n):
...     return socket.inet_ntoa(struct.pack('!L',n))
...
>>> num_to_dotted_quad(int(r[-1][-1].ipfrom))
'103.11.220.0'
SpringPython – Database Templates
# Find ipv4 address ranges assigned to Malaysia
# using SpringPython DatabaseTemplate & DictionaryRowMapper
from springpython.database.core import *
from springpython.database.factory import *

conn_factory = PgdbConnectionFactory(
    user="ip2country_rw",
    password="secret",
    host="localhost",
    database="ip2countrydb")
dt = DatabaseTemplate(conn_factory)

results = dt.query(
    "SELECT * FROM ip2country WHERE countrycode2=%s",
    ("MY",),
    DictionaryRowMapper())

for row in results:
    print("%s - %s" % (num_to_dotted_quad(int(row['ipfrom'])),
                       num_to_dotted_quad(int(row['ipto']))))
SQLAlchemy
http://www.sqlalchemy.org/
First release in 2005
Now at version 1.0.8
What is it
• Provides helpers, tools & components to assist with database access
• Provides a consistent and full featured façade over the Python DBAPI
• Provides an optional object relational mapper (ORM)
• Foundation for many Python third party libraries & tools
• It doesn’t hide the database, you need to understand SQL
SQLAlchemy Overview
SQLAlchemy Core – The Engine
from sqlalchemy import create_engine

engine = create_engine('postgresql://ip2country_rw:secret@localhost/ip2countrydb')

engine.execute("""
    create table registry (
        id serial primary key,
        name text
    )
""")

engine.execute(""" insert into registry(name) values('apnic') """)
engine.execute(""" insert into registry(name) values('aprn') """)
engine.execute(""" insert into registry(name) values('lacnic') """)
SQLAlchemy Core – SQL Expression Language
from sqlalchemy import create_engine, Table, Column, Integer, String, MetaData

engine = create_engine('postgresql://ip2country_rw:secret@localhost/ip2countrydb', echo=True)

metadata = MetaData()
registry = Table('registry', metadata,
    Column('id', Integer, autoincrement=True, primary_key=True),
    Column('name', String(10)))
metadata.create_all(engine)  # create table if it doesn't exist

# auto construct insert statement with binding parameters
ins = registry.insert().values(name='dummy')

conn = engine.connect()  # get database connection
# insert multiple rows with explicit commit
conn.execute(ins, [{'name': 'apnic'}, {'name': 'aprn'}, {'name': 'lacnic'}])
SQLAlchemy Core – SQL Expression Language
from sqlalchemy import create_engine, Table, Column, Integer, String, MetaData
from sqlalchemy.sql import select

engine = create_engine('postgresql://ip2country_rw:secret@localhost/ip2countrydb', echo=True)

metadata = MetaData()
registry = Table('registry', metadata,
    Column('id', Integer, autoincrement=True, primary_key=True),
    Column('name', String(10)))

# auto create select statement
s = select([registry])

conn = engine.connect()
result = conn.execute(s)
for row in result:
    print(row)
SQLAlchemy ORM
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import create_engine, Table, Column, Integer, String

Base = declarative_base()

class Registry(Base):
    __tablename__ = 'registry'
    id = Column(Integer, autoincrement=True, primary_key=True)
    name = Column(String(10))

    def __repr__(self):
        return "<Registry(%r, %r)>" % (self.id, self.name)

engine = create_engine('postgresql://ip2country_rw:secret@localhost/ip2countrydb', echo=True)
Base.metadata.create_all(engine)

from sqlalchemy.orm import Session
session = Session(bind=engine)

apnic = session.query(Registry).filter_by(name='apnic').first()
print(apnic)
SQLAlchemy ORM
. . .
Base = declarative_base()

class Registry(Base):
    __tablename__ = 'registry'
    id = Column(Integer, autoincrement=True, primary_key=True)
    name = Column(String(10))

    def __repr__(self):
        return "<Registry(%r, %r)>" % (self.id, self.name)

engine = create_engine('postgresql://ip2country_rw:secret@localhost/ip2countrydb', echo=True)
Base.metadata.create_all(engine)

from sqlalchemy.orm import Session
session = Session(bind=engine)
mynic = Registry(name='mynic')
session.add(mynic)
session.commit()
Attributions
DB-API 2.0 PEP http://www.python.org/dev/peps/pep-0249/
Travis Spencer’s DB-API UML Diagram http://travisspencer.com/
Andrew Kuchling's introduction to the DB-API http://www.amk.ca/python/writing/DB-API.html
Andy Todd’s OSDC paper http://halfcooked.com/presentations/osdc2006/python_databases.html
Source of csv data used in examples from WebNet77 licensed under GPLv3 http://software77.net/geo-ip/
Contact Details
Mark Rees
mark at censof dot com
+Mark Rees
@hexdump42
hex-dump.blogspot.com