23
First Indico Workshop Database Technology Pedro Ferreira 27-29 May 2013 CERN

First Indico Workshop Database Technology Pedro Ferreira 27-29 May 2013 CERN

Embed Size (px)

Citation preview

First Indico Workshop

Database TechnologyPedro

Ferreira

27-29 May 2013 CERN

The FutureLet’s start from the end

A more «social» IndicoUser-oriented approachPersonal home page, user-tailored informationMore data in less time

Need for an adequate DB

Avoiding disaster

ZODBWhy not?

It’s object-orientedIt’s been around for a long timeWe know how to use itWe are already using it for some of these things

ZODBAn object-oriented database

Index Transactions

O1

O2

O1 O4

O2

O3

O4

O3

O1time

ZODBThe good parts

It’s simpleNo need for ORM or mapping layersTightly integrated with PythonIt is ACID – things work as expected

ZODBThe bad parts

No server-side queriesNo built-in indexing – application levelData recovery is slow (latency, unpickling, setting object state)No way to fetch more than an object at once (pre-load)

ZODBYes, there’s more…

Has to be packed regularlyNo caching on the server side (besides OS cache)FileStorage Replication is not Open Source - moneyRelStorage could work, but requires migrationIt’s a niche product

Don’t Panic!We’re working on it!

NG DB ProjectThe Next Generation DB for Indicohttp://indico-software.org/wiki/Dev/FutureDB

Aims to find DB infrastructure that can support growth6 month initial phase (tech preview/boilerplate – end 2013)Tech Survey, Evaluation, Prototyping…

NG DB ProjectThe criteria

Availability (OSS)Scalability/ReplicationEase of use/developmentTransactions/ConsistencyCommunity/MomentumCosts

The ContestantsThe NoSQL crowd

Key-value storesRiak, Redis, Voldemort

Document-orientedMongoDB, CouchDB

Column-orientedCassandra, Hbase

Neo4J

Key-ValueJust a mapping structure

user1 Pedro Ferreira

user2 Alberto Resco

user3 {name: {first:“Jose Benito”, last: “Gonzalez”}

Column-oriented

Document-orientedCloser to the OO philosophy

user1 Pedro Ferreira

user2

name

Genevacity

Alberto Resconame

Runninghobbies

Cycling

The ContestantsThe usual (relational) suspects

MySQLMariaDBDrizzlePercona

PostgreSQL

Relational vs. NoSQLA very coarse comparison

Relational NoSQL

ACID Yes Not usually

Philosophy General-purposeTable-oriented

Problem-specificNormally closer to OO

Maturity Decades Pretty recent

Consistency Normally strong

Normally eventual

Relational vs. NoSQLThe problems

Relational NoSQL

Requires ORMDifferent philosophy

Lack of transactionsEventual consistencyToo simplistic

Typical query«All events in a user’s favourite categories»

RelationalSimple join between two tables

MongoDBEither replicate data or use DB refs (slow!)

Hybrid ApproachThe best of two worlds?

ZODB - excellent storage for business objectsSELECT style queries…ZODB as primary storage?Already kind of doing that (Redis)Need for transition period

Hybrid ApproachNo such thing as a free lunch…

Keeping data consistentMultiple DB calls per requestYet another thing to install

Conclusion

Research projectStill a lot of ground to coverHard to evaluate the hypeHybrid could be a good option

Pedro Ferreirahttp://github.com/pferreir

@pferreir