CouchDB (Couch is an acronym for cluster of unreliable commodity hardware).

Created in April 2005 by Damien Katz, former Lotus Notes developer at IBM.

First released as an open source project under the GNU General Public License.

In February 2008, it became an Apache Incubator project and the license was changed to the Apache License.

Written in Erlang

Introduction

Independent Facebook developers

Mobile developers (iPhone, Android)

Bloggers

NoSQL

Describes a broad range of database technologies which are non-relational.

Focused towards solving specific problems which are not suited for relational storage.

A database designed to run on the internet of today for today’s desktop-like applications and the connected devices through which we access the internet.

If your data is truly relational, stick with RDBMS

NoSQL

Relational Database

SQL

Tabular storage of data

Replacement for relational databases

JSON document-oriented DB (NoSQL)

A database is made up of collections

You can think of collections as tables from relational databases

Collections are made up of zero or more documents.

You can think of a document as a row from a relational database

Schema free: No need to design your tables, you can simply start storing new values

No wasting storage on empty, or null fields.

Web Server / Application Server: Write a client side application that talks directly to the Couch without the need for a server side middle layer.

Having the database stored locally, your client side application can run with almost no latency.

Data replication model: devices (like phones) can go offline, and data sync is handled for you when the device is back online.

Add attachments to documents

Scalable and fault tolerant

Use RESTful Interface to store JSON documents:

Data creation/replication/insertion, every management and data task can be done via HTTP.

REST=Representational State Transfer

Use map/reduce query written in JavaScript

Can be faster than SQL for some queries, because related data is reached through pointers instead of joins

CouchDB provides ACID semantics. It does this by implementing a form of Multi Version Concurrency Control (MVCC), meaning that CouchDB can handle a high volume of concurrent readers and writers without them blocking each other.
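The revision check behind MVCC can be illustrated with a toy in-memory store. This is a minimal sketch, not CouchDB's implementation: the ToyStore class is hypothetical, and its "_rev" is a plain counter standing in for CouchDB's revision strings. The one behaviour it mirrors is that a write based on a stale revision is rejected (CouchDB answers such a request with HTTP 409) instead of silently overwriting newer data.

```javascript
// Toy in-memory store; "_rev" is a simple counter, not a real CouchDB revision.
class ToyStore {
  constructor() {
    this.docs = new Map();
  }

  // Reads return a copy of the latest committed version.
  get(id) {
    const doc = this.docs.get(id);
    return doc ? { ...doc } : undefined;
  }

  // A write must name the revision it was based on (0 for a new document).
  put(id, fields, baseRev) {
    const current = this.docs.get(id);
    const currentRev = current ? current._rev : 0;
    if (baseRev !== currentRev) {
      return { ok: false, error: "conflict" };
    }
    const next = { ...fields, _id: id, _rev: currentRev + 1 };
    this.docs.set(id, next);
    return { ok: true, rev: next._rev };
  }
}

const store = new ToyStore();
const created = store.put("album-1", { title: "Example Album" }, 0);

// Two clients both read revision 1, then both try to write.
const firstWrite = store.put("album-1", { title: "Example Album", year: 2012 }, 1);
const secondWrite = store.put("album-1", { title: "Another Title" }, 1);
```

The first write succeeds and moves the document to revision 2; the second is refused because it was based on revision 1, so that client has to re-read the document and try again.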

Distributed Architecture with Replication: CouchDB was designed with bi-directional replication (or synchronization) and off-line operation in mind. That means multiple replicas can have their own copies of the same data, modify it, and then sync those changes at a later time.
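As a rough illustration of bi-directional sync, the sketch below treats it as two one-way copies, one in each direction. Everything here is an assumption made for illustration: the replicate and sync helpers are hypothetical, each replica is a plain Map, and "rev" is a simple counter. Real CouchDB replication tracks revision histories and detects conflicting edits, which this toy model leaves out.

```javascript
// Toy model: each replica maps a document id to { rev, data }.
// "rev" is a plain counter, a stand-in for CouchDB's revision strings.
function replicate(source, target) {
  let copied = 0;
  for (const [id, doc] of source) {
    const existing = target.get(id);
    // Copy only documents the target lacks or holds in an older revision.
    if (!existing || existing.rev < doc.rev) {
      target.set(id, { rev: doc.rev, data: { ...doc.data } });
      copied += 1;
    }
  }
  return copied;
}

// Bi-directional sync is modelled as two one-way replications.
function sync(a, b) {
  return replicate(a, b) + replicate(b, a);
}

// A phone edited note-1 while offline; the server gained note-2 meanwhile.
const phone = new Map([
  ["note-1", { rev: 2, data: { text: "edited offline" } }],
]);
const server = new Map([
  ["note-1", { rev: 1, data: { text: "original" } }],
  ["note-2", { rev: 1, data: { text: "created on server" } }],
]);

const copiedCount = sync(phone, server);
```

After the call both replicas hold the same two documents: the server receives the phone's newer note-1, and the phone receives note-2.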

ACID Semantics

Documents: JSON (or derivatives), XML

Schema free
Documents are independent
Non relational
Run on large number of machines
Data is partitioned and replicated among these machines

A document can contain any number of fields, and fields of any length can be added to a document.

Fields can also contain multiple pieces of data.

1. FirstName="Abo", Address="Insinoorinkatu 60", Hobby="swimming"

2. FirstName="another Abo", Address="Orivedenkatu 8", Children=("unbornChild1", -5, "unbornChild2", -10, "unbornChild3", -15).
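The two records above can be written as JavaScript objects to show what schema free means in practice. This is only a sketch: the describe helper is hypothetical, and the numeric values attached to the children are left out for brevity.

```javascript
// Two documents in the same database with different sets of fields.
const people = [
  { FirstName: "Abo", Address: "Insinoorinkatu 60", Hobby: "swimming" },
  {
    FirstName: "another Abo",
    Address: "Orivedenkatu 8",
    // A single field can hold several values.
    Children: ["unbornChild1", "unbornChild2", "unbornChild3"],
  },
];

// No schema is declared, so code that reads a field should tolerate its absence.
function describe(person) {
  const hobby = person.Hobby ?? "no hobby recorded";
  const children = (person.Children ?? []).length;
  return `${person.FirstName}: ${hobby}, ${children} children`;
}

const summaries = people.map(describe);
```

Neither document stores an empty placeholder for the field it lacks; the reader supplies a default instead.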

Example

Some examples…

Large Data Sets

Web Related Data

Customizable Dynamic Entities

Persisted View Models

Each document has a unique field named "_id"

Each document has a revision field named "_rev" (used for change tracking)

JavaScript Object Notation

Lightweight data storage format based on a subset of JavaScript syntax

e.g.:
{
  "Subject": "ASF turns 10",
  "Author": "ajith",
  "PostedDate": "2012-10-20",
  "Tags": [
    "Apache Software Foundation",
    "Open source"
  ],
  "Body": "Recently Apache Software Foundation became 12 years old."
}
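Because JSON is a subset of JavaScript syntax, the example document can be read and written with the standard JSON.parse and JSON.stringify functions. A minimal sketch, using the document text shown above:

```javascript
const text = `{
  "Subject": "ASF turns 10",
  "Author": "ajith",
  "PostedDate": "2012-10-20",
  "Tags": ["Apache Software Foundation", "Open source"],
  "Body": "Recently Apache Software Foundation became 12 years old."
}`;

// JSON.parse turns the text into a plain JavaScript object.
const post = JSON.parse(text);
const tagCount = post.Tags.length;

// JSON.stringify goes the other way, producing text that can be sent over HTTP.
const roundTripped = JSON.parse(JSON.stringify(post));
```

Note that "PostedDate" stays a string after parsing: JSON has no date type, so dates are stored as text by convention.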

REST
• REST stands for REpresentational State Transfer
• Uses existing HTTP verbs (GET, POST, PUT, DELETE)
• URL contains identifiers.

1. REST API
– curl (Unix-like OS)
– cURL (Windows)
– GET/PUT/POST/DELETE

cURL is an open source, command line utility for transferring data to and from a server.
cURL supports all common Internet protocols, including SMTP, POP3, FTP, IMAP, GOPHER, HTTP and HTTPS.
Examples:

curl -X GET "http://www.bing.com/search?q=couchdb"

Check server version
curl http://localhost:5984

Create database
curl -X PUT http://localhost:5984/albums

Delete database
curl -X DELETE http://localhost:5984/cds

Get a UUID
curl http://localhost:5984/_uuids

Create document
curl -X POST http://localhost:5984/albums -d "{ \"artist\" : \"The Decembrists\" }" -H "Content-Type: application/json"

Get document by ID
curl http://localhost:5984/artists/a10a5006d96c9e174d28944994042946
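The same requests can be issued from JavaScript. The sketch below only builds request descriptions, so it runs without a server. The couchRequest and send helpers are hypothetical names, the base URL is assumed to be the local server used in the curl examples, and send assumes a global fetch (browsers, Node 18 or later).

```javascript
// Assumed local server address, matching the curl examples.
const BASE_URL = "http://localhost:5984";

// Hypothetical helper: describes a request without sending it.
function couchRequest(method, path, body) {
  const request = { method, url: `${BASE_URL}/${path}` };
  if (body !== undefined) {
    request.headers = { "Content-Type": "application/json" };
    request.body = JSON.stringify(body);
  }
  return request;
}

const createDatabase = couchRequest("PUT", "albums");
const createDocument = couchRequest("POST", "albums", { artist: "The Decembrists" });
const deleteDatabase = couchRequest("DELETE", "cds");

// Sending a request needs a running CouchDB, so this function is defined
// but not called here.
async function send(request) {
  const response = await fetch(request.url, {
    method: request.method,
    headers: request.headers,
    body: request.body,
  });
  return response.json();
}
```

With a server running, something like send(createDatabase) followed by send(createDocument) would mirror the "Create database" and "Create document" commands.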

What is a view?

View in CouchDB context

A "show" that directly renders a document using JavaScript

MapReduce

Two types

Permanent view
Indexed
JSON for the view is stored as a design document

Temporary view
Sent via an HTTP POST
Computed on the fly
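A permanent view lives inside a design document, whose id starts with "_design/" and whose "views" field maps each view name to its map (and optionally reduce) function stored as source text. The sketch below builds such a document. The database name "albums", the design document name and the view name "by_artist" are all hypothetical.

```javascript
// The map function is stored as source text. "emit" is supplied by CouchDB
// when the view runs; it is not defined in this file.
const mapSource = `function (doc) {
  if (doc.artist) {
    emit(doc.artist, 1);
  }
}`;

// Hypothetical design document and view names.
const designDoc = {
  _id: "_design/albums",
  views: {
    by_artist: {
      map: mapSource,
      reduce: "_count", // one of CouchDB's built-in reduce functions
    },
  },
};

// This JSON text is what would be stored in the database.
const body = JSON.stringify(designDoc);

// Path under the server root at which the view would then be queried with GET.
const viewPath = `albums/${designDoc._id}/_view/by_artist`;
```

Storing the document is an ordinary PUT of this JSON to the design document's URL; CouchDB builds the index the first time the view is queried.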

Creating a view using Futon

The map function takes data in and transforms it into something else.

Output is a key/value pair

Keys can be complex types

Like a .Select() in LINQ

The reduce function takes in a set of intermediate values and combines them into a single value.

Reduce needs to be able to accept results from the map function AND the reduce function itself.

Like an .Aggregate() in LINQ
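The requirement that reduce accept its own output is why CouchDB calls reduce functions with a third argument, rereduce. The sketch below is a hand-written count function that handles both cases; the chunks passed to it are made-up values, and since the function ignores keys, null is passed for that argument throughout.

```javascript
// CouchDB calls reduce functions as reduce(keys, values, rereduce).
// When rereduce is true, "values" holds results of earlier reduce calls.
function countReduce(keys, values, rereduce) {
  if (rereduce) {
    // Combine partial counts.
    return values.reduce((total, partial) => total + partial, 0);
  }
  // First pass: one entry per emitted row.
  return values.length;
}

// Simulate reducing two chunks of map output separately...
const partialA = countReduce(null, ["a", "b", "c"], false);
const partialB = countReduce(null, ["d", "e"], false);

// ...and then combining the partial results.
const total = countReduce(null, [partialA, partialB], true);
```

A count that simply returned values.length in both cases would report 2 for the final step instead of 5, which is the kind of bug the rereduce flag exists to prevent.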

using System.Linq;

var test = new[] { "1", "2", "3" };
var mapped = test.Select(int.Parse);                  // map: string -> int
var reduced = mapped.Aggregate((sum, i) => sum + i);  // reduce: add up the values

"map": "function(doc) { emit(doc._id, parseInt(doc.value)); }",
"reduce": "function(keys, values) { return sum(values); }"
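To see how a map and reduce pair like this behaves, the view engine can be imitated in a few lines. This is a sketch under stated assumptions: runView is a hypothetical stand-in for CouchDB's view server, the documents are made-up data, emit is passed in as a parameter (in CouchDB it is a global), and sum is defined locally (CouchDB provides it to reduce functions).

```javascript
// Sample documents (hypothetical data).
const docs = [
  { _id: "b", value: "2" },
  { _id: "a", value: "1" },
  { _id: "c", value: "3" },
];

// Stand-in for the view engine: collects whatever the map function emits.
function runView(documents, mapFn, reduceFn) {
  const rows = [];
  const emit = (key, value) => rows.push({ key, value });
  for (const doc of documents) {
    mapFn(doc, emit);
  }
  // Rows are ordered by key (plain string comparison in this sketch).
  rows.sort((x, y) => (x.key < y.key ? -1 : x.key > y.key ? 1 : 0));
  const reduced = reduceFn(rows.map((r) => r.key), rows.map((r) => r.value));
  return { rows, reduced };
}

// Local replacement for the sum() helper available inside CouchDB.
const sum = (values) => values.reduce((total, v) => total + v, 0);

const map = (doc, emit) => emit(doc._id, parseInt(doc.value, 10));
const reduce = (keys, values) => sum(values);

const result = runView(docs, map, reduce);
```

The rows come back ordered by key ("a", "b", "c") regardless of document order, and the reduced value is the sum of the parsed numbers.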

Map function (extracting data) is executed on every document in the database.
Emits key/value pairs (can emit 0, 1, or more key/value pairs for each document in the database)
Key/value pairs are then ordered and indexed by key

Query types:
Exact: key = x
Range: key is between x and y
Multiple: key is in list (x,y,z)

Reduce functions (data aggregation)
e.g. count, sum, group
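The three query types can be imitated over an in-memory list of view rows. In CouchDB's HTTP API they correspond to the key, startkey/endkey and keys query parameters. The rows and helper names below are hypothetical, and the linear filter is for illustration only, since CouchDB answers these queries from its sorted index.

```javascript
// Rows as a view would hold them: already sorted by key (hypothetical data).
const rows = [
  { key: "apple", value: 1 },
  { key: "banana", value: 2 },
  { key: "cherry", value: 3 },
  { key: "date", value: 4 },
];

// Exact: key = x
const byKey = (key) => rows.filter((row) => row.key === key);

// Range: key is between x and y (both ends included in this sketch)
const byRange = (startkey, endkey) =>
  rows.filter((row) => row.key >= startkey && row.key <= endkey);

// Multiple: key is in list (x, y, z)
const byKeys = (keys) => rows.filter((row) => keys.includes(row.key));

const exact = byKey("banana").map((row) => row.value);
const range = byRange("banana", "cherry").map((row) => row.value);
const multiple = byKeys(["apple", "date"]).map((row) => row.value);
```

Because rows are kept sorted by key, a range query in the real index reads one contiguous slice, which is why the choice of emitted key matters so much when designing a view.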

[Diagram: All Documents → Map function → Key, Value rows → Query by Key → matching Key, Value rows]

Futon is a simple web admin for managing CouchDB instances and is accessible at http://127.0.0.1:5984/_utils/
Used for setting server configuration
Allows for database administration (create/delete, compact/cleanup, security)
Allows for CRUD operations on documents
Creating and testing views
Creating design documents

• Clients available for many languages
– C, C#, Erlang, Java, JavaScript, Perl, PHP, Python, Ruby & many more

Replication and synchronization capabilities of CouchDB make it ideal for use on mobile devices, where a network connection is not guaranteed but the application must keep working offline.

CouchDB is well suited for applications with accumulating, occasionally changing data, on which pre-defined queries are to be run and where versioning is important (CRM and CMS systems, for example).

Master-master replication is an especially interesting feature, allowing easy multi-site deployments.

Use cases & production deployments

Thank You!
