A Brief History

A Brief History - [ Triangle Linux Users Group ]porter/meetings/2010-02-11_RichardHi...Situational awareness KV versus SQL PUT(key, value) GET(key) → value KV versus SQL SELECT *


In The Beginning... (circa Y2K)

Key/Value Databases
● Client-side
● Application Scale
● Low Administration

SQL Databases
● Server-side
● Enterprise Scale
● High Administration

In The Beginning... (circa Y2K)

GDBM

KV versus SQL

Key/Value Databases
● Easy to understand
● Simple API
● Low-level operations
● Compact implementation
● Less overhead
● Faster primitives

SQL Databases
● Steep learning curve
● Rich query language
● Promotes code with
  – Fewer errors
  – More features
● Better algorithms
● Situational awareness

KV versus SQL

PUT(key, value)

GET(key) → value


KV versus SQL

SELECT * FROM table1 WHERE pk=123
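The two slides above put the KV primitives next to their SQL counterpart. As a minimal sketch (the `value` column and the `put`/`get` helper names are assumptions for illustration, not part of the talk), the KV interface can be written as a thin layer over a primary-key lookup:

```python
import sqlite3

# Hypothetical schema: one table keyed by a primary key stands in for a KV store.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (pk INTEGER PRIMARY KEY, value TEXT)")

def put(key, value):
    """PUT(key, value): store the pair, overwriting any existing value."""
    conn.execute(
        "INSERT OR REPLACE INTO table1 (pk, value) VALUES (?, ?)", (key, value)
    )

def get(key):
    """GET(key) -> value, or None when the key is absent."""
    row = conn.execute("SELECT value FROM table1 WHERE pk = ?", (key,)).fetchone()
    return row[0] if row else None

put(123, "hello")
put(123, "world")   # a second PUT on the same key replaces the first value
print(get(123))     # world
print(get(999))     # None
```

The point of the sketch is only that a point lookup by primary key covers what GET does; everything else SQL offers comes on top of that.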


KV versus SQL

SELECT eqptid, enclosureid
  FROM eqpt
 WHERE typeid IN (
    SELECT typeid FROM typespec
     WHERE attrid=(
             SELECT attrid FROM attribute
              WHERE name='detect_autoactuate'
           )
       AND value=1
    INTERSECT
    SELECT typeid FROM typespec
     WHERE attrid=(
             SELECT attrid FROM attribute
              WHERE name='algorithm'
           )
       AND value IN ('sensor','wetbulb')
 )
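To see the query above run, here is a small sketch. The three table definitions and all sample rows are guesses made from the column names in the query; the real schema behind the slide is not shown in the talk.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema, inferred only from the names used in the query.
    CREATE TABLE attribute (attrid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE typespec  (typeid INTEGER, attrid INTEGER, value);
    CREATE TABLE eqpt      (eqptid INTEGER PRIMARY KEY, enclosureid INTEGER,
                            typeid INTEGER);
""")
conn.executemany("INSERT INTO attribute VALUES (?, ?)",
                 [(1, "detect_autoactuate"), (2, "algorithm")])
conn.executemany("INSERT INTO typespec VALUES (?, ?, ?)", [
    (10, 1, 1), (10, 2, "sensor"),    # type 10 satisfies both conditions
    (20, 1, 1), (20, 2, "manual"),    # type 20 has the wrong algorithm
    (30, 1, 0), (30, 2, "wetbulb"),   # type 30 does not auto-actuate
])
conn.executemany("INSERT INTO eqpt VALUES (?, ?, ?)",
                 [(100, 7, 10), (101, 7, 20), (102, 8, 30), (103, 9, 10)])

QUERY = """
SELECT eqptid, enclosureid FROM eqpt WHERE typeid IN (
    SELECT typeid FROM typespec
     WHERE attrid=(SELECT attrid FROM attribute WHERE name='detect_autoactuate')
       AND value=1
    INTERSECT
    SELECT typeid FROM typespec
     WHERE attrid=(SELECT attrid FROM attribute WHERE name='algorithm')
       AND value IN ('sensor','wetbulb')
)
"""
rows = sorted(conn.execute(QUERY).fetchall())
print(rows)  # [(100, 7), (103, 9)]
```

Only equipment of type 10 passes both halves of the INTERSECT, so the two rows with that type are returned.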

KV versus SQL

Key/Value Databases
● How to compute it
● A few 1000 lines
● Low-level
● Bare metal
● Many bugs
● Heads down

SQL Databases
● What to compute
● A few lines of code
● High-level
● Abstract
● Fewer bugs
● Heads up

SQL promotes situational awareness


KV versus SQL

SQL Parser

Query Planner

Virtual Machine

KV Database Engine
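The layering above can be observed from outside, at least in current SQLite builds: `EXPLAIN` returns the program that the parser and planner hand to the virtual machine. The sketch below assumes a hypothetical `table1`; the exact instructions differ between SQLite versions, so it prints whatever the local build produces instead of promising a particular listing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (pk INTEGER PRIMARY KEY, value TEXT)")

# EXPLAIN yields one row per virtual-machine instruction.
# Column 0 is the instruction address and column 1 is the opcode name.
program = conn.execute("EXPLAIN SELECT * FROM table1 WHERE pk = 123").fetchall()
opcodes = [row[1] for row in program]

for addr, opcode, *operands in program:
    print(addr, opcode, operands[:3])
```

Opcodes that open a cursor and seek to a row are the virtual machine calling down into the storage layer at the bottom of the stack.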


[Diagram: four Clients, a Database Engine, and Database Files on Disk]

[Diagram: four Clients and Database Files on Disk, with no Database Engine]

SQLite 1.0

SQL Parser

Query Planner

Virtual Machine

KV Database Engine

GDBM


Issues with GDBM

● Separate file for each table and index
● Hash based – cannot do range queries
● No transactions
  – Updates are not atomic
  – Cannot rollback
● File corruption on crash or power loss
● GPL

Change The Underlying KV Database?

SQL Parser

Query Planner

Virtual Machine

KV Database Engine

?


SQLite 2.0

SQL Parser

Query Planner

Virtual Machine

KV Database Engine


SQLite 2.X

● Single-file database
● Useful subset of SQL
● Everything is a string
● Transactions with atomic commit & rollback
● Public domain
● About 250KiB size
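Atomic commit and rollback are easiest to appreciate with a failure in the middle of a multi-step update. The following is a minimal sketch run against whatever SQLite 3 library ships with Python, not against SQLite 2.X; the `account` table and the `transfer` helper are made up for the illustration.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so the
# BEGIN / COMMIT / ROLLBACK statements below are fully explicit.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 0)])

def transfer(amount, fail=False):
    """Move `amount` from alice to bob as one atomic unit."""
    conn.execute("BEGIN")
    try:
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE name = 'alice'", (amount,)
        )
        if fail:
            raise RuntimeError("simulated failure between the two updates")
        conn.execute(
            "UPDATE account SET balance = balance + ? WHERE name = 'bob'", (amount,)
        )
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")   # undo the half-finished transfer
        raise

transfer(30)
try:
    transfer(50, fail=True)
except RuntimeError:
    pass

balances = dict(conn.execute("SELECT name, balance FROM account"))
print(balances)  # {'alice': 70, 'bob': 30}
```

The failed transfer leaves no trace: the debit from the first UPDATE is rolled back along with everything else in the transaction.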

2.1.0  2001-11-12
2.2.0  2001-12-22
2.3.0  2002-01-30
2.4.0  2002-03-10
2.5.0  2002-06-17
2.6.0  2002-07-17
2.7.0  2002-08-25
2.8.0  2003-02-16

(early 2004:) AOL needed....

● UTF-16 support
● BLOB support
● Collating sequences
● Two-phase commit of ATTACH-ed databases

SQLite 3.0.0

3.0.0  2004-06-18
3.1.0  2005-01-21
3.2.0  2005-03-21
3.3.0  2006-01-10
3.4.0  2007-06-18
3.5.0  2007-09-04
3.6.0  2008-07-16

Currently at 3.6.22

● Almost All SQL
  – Missing RIGHT and FULL OUTER JOIN
  – Incomplete ALTER TABLE
● Faster than ever
  – Improved query planner
  – Less I/O
● 100% branch testing
● FOREIGN KEY
● SAVEPOINT
● Full-text search
● R-Tree indexes
● 282 KiB binary
● Terabyte Databases
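Two items from the list above, FOREIGN KEY and SAVEPOINT, can be tried together in a few lines. This is a sketch under assumptions: the `parent` and `child` tables and the savepoint name `risky` are invented, and it runs on the SQLite bundled with Python, which must have foreign-key support compiled in (the default).

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # explicit transactions
conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default
conn.execute("CREATE TABLE parent (id INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE child (id INTEGER PRIMARY KEY, "
    "parent_id INTEGER REFERENCES parent(id))"
)
conn.execute("INSERT INTO parent VALUES (1)")

conn.execute("BEGIN")
conn.execute("INSERT INTO child VALUES (1, 1)")       # kept
conn.execute("SAVEPOINT risky")
conn.execute("INSERT INTO child VALUES (2, 1)")       # valid, but undone below
try:
    conn.execute("INSERT INTO child VALUES (3, 99)")  # there is no parent 99
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True
conn.execute("ROLLBACK TO risky")  # discard everything since the savepoint
conn.execute("RELEASE risky")
conn.execute("COMMIT")

child_ids = [r[0] for r in conn.execute("SELECT id FROM child ORDER BY id")]
print(fk_rejected, child_ids)  # True [1]
```

The savepoint acts as a nested transaction: rolling back to it removes child 2 while the outer transaction, and child 1, still commit.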


Other Features Of SQLite

● App-defined functions
● App-defined collating sequences
● UTF-8 or UTF-16
● Robust against power loss
● Robust against malloc() failures
● Live backup
● Virtual Tables
● ATTACH DATABASE
● Gigabyte size BLOBs and strings
● Robust against I/O errors
● Zero-malloc option
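The first two features in the list, application-defined functions and collating sequences, are exposed by most language bindings. A minimal sketch using Python's `sqlite3` module follows; the names `reverse_text` and `by_length` and the `word` table are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def reverse_text(s):
    """Application-defined SQL function: reverse a string."""
    return s[::-1]

def by_length(a, b):
    """Application-defined collation: negative, zero, or positive, like strcmp()."""
    return (len(a) > len(b)) - (len(a) < len(b))

conn.create_function("reverse_text", 1, reverse_text)
conn.create_collation("by_length", by_length)

conn.execute("CREATE TABLE word (w TEXT)")
conn.executemany("INSERT INTO word VALUES (?)", [("pear",), ("fig",), ("banana",)])

reversed_words = [
    r[0]
    for r in conn.execute(
        "SELECT reverse_text(w) FROM word ORDER BY w COLLATE by_length"
    )
]
print(reversed_words)  # ['gif', 'raep', 'ananab']
```

Both callbacks run inside the application's own process, which is possible because the engine is linked into the application instead of sitting behind a server.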

SQLite bridges the gap...

● Like BerkeleyDB and GDBM:
  – Direct to disk; serverless; client-side
  – Linked into application
  – Zero administration
● Like PostgreSQL, Oracle, etc:
  – High-level query language
  – Transactional & Relational
  – Heads-up programming; situational awareness

In The Beginning... (circa Y2K)

GDBM

... Squeezing out BDB and GDBM

Other client-side SQL engines

One new notable KV database...

Do not confuse SQLite with server database engines....

[Figure: a picture analogy of the form "... is to ... as ... is to ..."]

● SQLite does not compete with Oracle
● SQLite competes with fopen()

Application Code

High-level SQL statements

Low-level disk reads & writes

Portable File Format

● A database is a single ordinary disk file
● No special naming conventions or required file suffixes
● Cross-platform: big/little-endian and 32/64-bit
● Backwards compatible through 3.0.0
● Promise to keep it compatible moving forward
● Not tied to any particular programming language.
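Because a database is one ordinary file with no required suffix, it can be inspected like any other file. The sketch below writes a small database under an arbitrary, made-up file name and reads back the first 16 bytes, which in the SQLite 3 file format hold the fixed string "SQLite format 3" followed by a NUL byte.

```python
import os
import sqlite3
import tempfile

# Hypothetical file name: the suffix is arbitrary, nothing in SQLite depends on it.
path = os.path.join(tempfile.mkdtemp(), "notes.anything")

conn = sqlite3.connect(path)
conn.execute("CREATE TABLE note (body TEXT)")
conn.execute("INSERT INTO note VALUES ('hello')")
conn.commit()
conn.close()

with open(path, "rb") as f:
    header = f.read(16)
print(header)  # b'SQLite format 3\x00'

# Any other SQLite 3 library, on any platform, can open the same file.
reopened = sqlite3.connect(path)
bodies = [r[0] for r in reopened.execute("SELECT body FROM note")]
reopened.close()
print(bodies)  # ['hello']
```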


c,s,v

10011010010111

homegrown

<xml/>


SQLite as file format freebies

● No parsing and generating code to write
● Atomic updates
● Fast, built-in searching
● Access via third-party tools
● Simplified upgrade migration
● Cross-platform file format
● High-level query language
● sqlite3_trace()
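A short sketch of several of these freebies at once, using an application settings store as the example. The `setting` table is hypothetical, and `set_trace_callback` is the Python binding's statement-tracing hook, used here as a stand-in for the C-level `sqlite3_trace()` named on the slide.

```python
import sqlite3

statements = []                                # every SQL statement executed lands here
conn = sqlite3.connect(":memory:")             # a real application would pass a file path
conn.set_trace_callback(statements.append)     # statement tracing for debugging

conn.execute("CREATE TABLE setting (name TEXT PRIMARY KEY, value TEXT)")

# Atomic update: both rows are saved, or neither is.
with conn:  # commits on success, rolls back if the block raises
    conn.executemany(
        "INSERT INTO setting VALUES (?, ?)", [("theme", "dark"), ("font", "mono")]
    )

# Built-in searching: no hand-written parser or lookup code.
theme = conn.execute("SELECT value FROM setting WHERE name = 'theme'").fetchone()[0]
print(theme)  # dark

for sql in statements:
    print("TRACE:", sql)
```

The equivalent with a homegrown file format would need a serializer, a parser, a write-then-rename dance for atomicity, and custom logging.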

Small Footprint

gcc -Os -DSQLITE_THREADSAFE=0                             282 KiB
gcc -O3 -DSQLITE_ENABLE_FTS3=1 -DSQLITE_ENABLE_RTREE=1    868 KiB

As of 2010-01-25

Single Source Code File

● The “amalgamation” source code file: sqlite3.c
● About 66,000 lines of ANSI C code
● 3.9 MB
● No library dependencies other than standard library routines:
  – memcpy(), memset(), malloc(), free(), etc
● Very simple to add to a larger C program

Many companies and organizations use SQLite...

Adobe Photoshop Lightroom
Adobe Reader
Mozilla Firefox
Android
iPhone
iPod & iTunes
iStuff
Blackberry
Palm webOS
Skype
Chrome
Dropbox
Sony Playstation
... and so forth

Various Programming Languages


Books About SQLite


Estimating Adoption Of SQLite

● 450 million Skype users
● 300 million Firefox users
● 100 million Nokia/Symbian smartphones
● 50 million iPhones
● ??? Other active users.

500M to 1B active deployments

SQLite.org
The Company

The SQLite Development Team

Business Plan

● Annual maintenance subscriptions
● Technical support agreements
● Proprietary extensions
  – SQLite Encryption Extension
  – Compressed and Encrypted Read-Only Database
  – Test suite for embedded devices
● Training & custom development work
● Certification artifacts
● SQLite Consortium

The Consortium

● Guarantees of project continuity
● Enterprise-level technical support
● Highest priority bug fixes

● http://www.sqlite.org/
● Per day: 16K unique IPs, 250K hits, 10GB xfer
● Hosted by Linode.com
  – Linode 720
  – Since 2004
● Custom web server on xinetd
● CPU utilization: 2.87%

? Is SQLite still relevant?

? Aren't people moving away from SQL?

? What about all these new cloud-computing databases?

The CAP Theorem

Consistent
Available
Partition

Choose Any Two

SQL
NoSQL

Scaling Up versus Scaling Out

Scaling Up: Increase the size and speed of the database server (SQL)

Scaling Out: Increase the number of database servers in your cloud (NoSQL)

ACID versus BASE

SQL (ACID): Atomic, Consistent, Isolated, Durable

NoSQL (BASE): Basically Available, Soft-state, Eventually consistent

Can We Not Keep Consistency?

● A single PostgreSQL server can handle:
  – Every seat reservation for the largest airline
  – Every book sale in the USA (online or otherwise)
● Sharding is a fallback
  – Separate database server for each airplane
  – Separate server for ranges of ISBN numbers
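The ISBN-range fallback mentioned above amounts to a small routing table placed in front of several ordinary, fully consistent servers. The following is only an illustrative sketch: the range boundaries, the host names, and the `server_for_isbn` helper are all invented.

```python
from bisect import bisect_right

# Hypothetical shard map: SHARD_STARTS[i] is the lowest ISBN prefix that
# SHARD_SERVERS[i] is responsible for. The list must be sorted.
SHARD_STARTS = ["0", "3", "6", "9"]
SHARD_SERVERS = ["db0.example", "db1.example", "db2.example", "db3.example"]

def server_for_isbn(isbn):
    """Return the server that owns the range containing this ISBN."""
    digits = isbn.replace("-", "")
    index = bisect_right(SHARD_STARTS, digits) - 1
    if index < 0:
        raise ValueError(f"ISBN {isbn!r} falls below every shard range")
    return SHARD_SERVERS[index]

print(server_for_isbn("0-306-40615-2"))      # db0.example
print(server_for_isbn("3-16-148410-0"))      # db1.example
print(server_for_isbn("978-0-306-40615-7"))  # db3.example
```

Each shard stays a normal SQL database, so transactions and constraints keep working for every operation that touches a single range.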

SQL without Consistency?

● Definitely have to give up
  – UNIQUE
  – FOREIGN KEY
● All current implementations disallow:
  – BEGIN, COMMIT, SAVEPOINT, ROLLBACK
● Most current implementations:
  – Key/Value only
  – No query language
  – No indexing, sorting, aggregating

What To Do?

● Avoid abandoning Consistency if you do not really have to.

● Avoid regressing from high-level SQL to low-level KV databases.

● Watch the cloud-database world carefully but avoid getting sucked into the hype.


One Process

Many Processes on One Server

Many Servers in One Datacenter

Multiple Datacenters


An Example Cloud Database

● Distributed Version Control System (DVCS)


Fundamental Concepts

● A repository is a bag of “artifacts”

● Every user has their own repository

[Diagram: a repository holding artifacts 001, 002, 003, 004, 005]

Fundamental Concepts

● Sync by sharing artifacts

[Diagram: one repository holds artifacts 001, 002, 003, 004, 005 and another holds 003, 004, 005, 007, 012; a sync arrow between them carries 001, 002 and 007, 012]

Fundamental Concepts

● After sync, repositories have the same set of artifacts

[Diagram: after the sync, both repositories hold artifacts 001, 002, 003, 004, 005, 007, 012]
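The sync rule described on these slides can be modelled with plain sets. This is a toy sketch of the idea only, using the artifact numbers from the diagrams; it says nothing about how any real DVCS implements its wire protocol, and the `sync` helper is hypothetical.

```python
# Each repository is a bag of artifacts, modelled here as a set of IDs.
repo_a = {"001", "002", "003", "004", "005"}
repo_b = {"003", "004", "005", "007", "012"}

def sync(local, remote):
    """Exchange missing artifacts in both directions.

    Returns (artifacts received by local, artifacts received by remote).
    """
    to_local = remote - local     # what the other side has that we lack
    to_remote = local - remote    # what we have that the other side lacks
    local |= to_local
    remote |= to_remote
    return to_local, to_remote

received_a, received_b = sync(repo_a, repo_b)
print(sorted(received_a))   # ['007', '012']
print(sorted(received_b))   # ['001', '002']
print(repo_a == repo_b)     # True
```

Because artifacts are only ever added, repeating the sync or running it in any order between any pair of repositories converges on the same set, which is what makes the scheme eventually consistent.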

[Diagram sequence: four repositories, each holding artifacts 001, 002, 003, 004, 005, 007, 012, joined by sync links]

http://www.fossil-scm.org/

● Distributed VCS
  – and wiki,
  – and bug reports
● Self-contained
● Built-in Web Interface
● HTTP & CGI
● “github in a box”

(Live Demo)


Fossil Server Setup

The actual 2-line CGI script that runs the canonical self-hosting fossil repository:

#!/usr/bin/fossil
repository: /fossil/fossil.fossil

18-Dec-2009 Catastrophic Events at Our Internet Host

Thanks to the progressive death of the NAS server from which all our Internet facilities are served, we lost everything - Tracker, website, Wiki and news mirrors - piece-by-piece over the past days.

By gargantuan effort, Sean Leyne and his colleagues at Broadview have been reconstructing our infrastructure from scratch. The main website is up (obviously!). Others are more challenging, requiring radical reconstruction using new versions of underlying software, and will take time.

For now, if you find broken links (other than those to our other servers) then please feel free to inform us via a message to the firebird-general list (joining instructions HERE).

http://www.firebirdsql.org/

[Diagram sequence: four repositories, each holding artifacts 001, 002, 003, 004, 005, 007, 012; then only three repositories; then four again, joined by sync links]

Fossil Is A Post-Modern Database

● Repository is not relational
● Supports Available and Partition; not Consistent
● Each node stores content in an ACID SQL database (SQLite)
● Local storage includes indices for fast report generation.

But....

Every DVCS Is A BASE Database

● For Git and Hg, local storage is a pile-of-files
  – Hand-coded read and write
  – Ad hoc format
  – What happens on a power loss?
● For Monotone and Fossil, local storage is SQLite
  – Leverage existing database engine
  – Proof against crashes
  – Simple reports

SQLite versus Pile-of-Files

● Single-file repository
● Transactions
  – Rollback after crash
  – Rollback if self-check fails
● Viewable with 3rd party tools
● High-level query language
● Structured storage
● sqlite3_trace() for debugging

INSERT OR IGNORE INTO timeline
SELECT blob.rid, uuid,
       datetime(event.mtime,'localtime') AS timestamp,
       coalesce(ecomment, comment),
       coalesce(euser, user),
       (SELECT count(*) FROM plink WHERE pid=blob.rid AND isprim=1),
       (SELECT count(*) FROM plink WHERE cid=blob.rid),
       NOT EXISTS(SELECT 1 FROM plink
                   WHERE pid=blob.rid
                     AND coalesce((SELECT value FROM tagxref
                                    WHERE tagid=8 AND rid=plink.pid), 'trunk')
                       = coalesce((SELECT value FROM tagxref
                                    WHERE tagid=8 AND rid=plink.cid), 'trunk')),
       bgcolor, event.type,
       (SELECT group_concat(substr(tagname,5), ', ')
          FROM tag, tagxref
         WHERE tagname GLOB 'sym-*'
           AND tag.tagid=tagxref.tagid
           AND tagxref.rid=blob.rid
           AND tagxref.tagtype>0),
       tagid, brief
  FROM event JOIN blob
 WHERE blob.rid=event.objid
   AND event.mtime>=(SELECT julianday('2000-08-17 10:25:00', 'utc'))
 ORDER BY event.mtime ASC LIMIT 10;

SELECT * FROM timeline ORDER BY timestamp DESC;

Summary

● SQLite (and similar) for local data storage
  – Client-side
  – Consistent
  – Transactional
  – Robust
● Cling to Consistency
● Cherish high-level query languages

Embedded SQL

Client/Server SQL

Post-Modern

Questions & Comments