© Datameer, Inc 2010

Horizontal Virtualization On Commodity Hardware

Stefan Groschupf sg@datameer.com

Laws Of Physics

Values/sec. by storage medium and access pattern (log-scale chart; source: Adam Jacobs, "The Pathologies of Big Data")

          Random        Sequential
Disk      316           53,200,000
SSD       1,924         42,200,000
Memory    36,700,000    358,200,000
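As a rough reading of the chart, the gap between sequential and random access can be put in numbers. The sketch below assumes each of the six values pairs with a medium and access pattern as in Jacobs' figure; the dictionary layout and the function name `sequential_speedup` are illustrative only.

```python
# Values/sec. as read from the chart (Adam Jacobs, "The Pathologies of Big Data").
# Assumption: the pairing of each value with a medium and an access pattern.
THROUGHPUT = {
    "disk":   {"random": 316,        "sequential": 53_200_000},
    "ssd":    {"random": 1_924,      "sequential": 42_200_000},
    "memory": {"random": 36_700_000, "sequential": 358_200_000},
}

def sequential_speedup(medium: str) -> float:
    """Return how many times faster sequential access is than random access."""
    rates = THROUGHPUT[medium]
    return rates["sequential"] / rates["random"]

for medium in THROUGHPUT:
    print(f"{medium}: sequential is about {sequential_speedup(medium):,.0f}x faster")
```

Under these numbers, sequential reads from disk (53,200,000 values/sec.) outrun random reads from memory (36,700,000 values/sec.), which is the point the slide title hints at.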

Laws Of Physics

Hadoop (merge sort) vs. DB (B-tree/index)
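The contrast on this slide can be sketched in a few lines. This is an illustration under assumptions, not Hadoop's or any database's actual code: the hypothetical `merge_runs` stands in for the merge phase of a merge sort, reading each sorted run front to back, and the hypothetical `index_lookup` stands in for an index lookup, using binary search over a sorted key list as a simplified substitute for a real B-tree.

```python
import heapq
from bisect import bisect_left

def merge_runs(runs):
    """Merge already-sorted runs; each run is consumed strictly front to back."""
    return list(heapq.merge(*runs))

def index_lookup(sorted_keys, key):
    """Point lookup by binary search; each probe jumps to a non-adjacent position.

    Returns the position of key, or -1 if it is absent.
    """
    i = bisect_left(sorted_keys, key)
    return i if i < len(sorted_keys) and sorted_keys[i] == key else -1

runs = [[1, 4, 9], [2, 3, 10], [5, 6, 7]]
merged = merge_runs(runs)
print(merged)
print(index_lookup(merged, 6))
```

The merge touches every value once in order, which suits sequential I/O; the lookup touches few values but at scattered positions, which is where random access costs show up.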

Thesis, Antithesis, Synthesis

Thesis: SQL/ACID solves all your data problems.

Antithesis: Nobody needs SQL/ACID, let's throw it all out.

Synthesis: By carefully considering data integrity constraints we can find a more optimal data management solution for a particular problem.

voidpointer: http://news.ycombinator.com/item?id=1163516

Considerations

Keep it simple!
=> Simple to distribute.
What is really the problem to solve?
• Serve Data or Analyze Data?
Sequential Write/Read performs best.
• Data aggregation challenges.
Index only if really necessary.
• For data serving.
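The serve-versus-analyze distinction above can be sketched as follows. This is a minimal illustration, assuming made-up records of the form (user, bytes); the names `analyze` and `build_index` are hypothetical and not part of any product mentioned in the talk.

```python
from collections import defaultdict

# Hypothetical records: (user, bytes_transferred). Illustrative data only.
records = [("ann", 120), ("bob", 300), ("ann", 80), ("cid", 50), ("bob", 25)]

def analyze(stream):
    """Analysis: aggregate in one sequential pass; no index is needed."""
    totals = defaultdict(int)
    for user, size in stream:
        totals[user] += size
    return dict(totals)

def build_index(rows):
    """Serving: build an index so the rows for a single key can be fetched directly."""
    index = defaultdict(list)
    for position, (user, _) in enumerate(rows):
        index[user].append(position)
    return index

totals = analyze(records)
index = build_index(records)
print(totals)
print([records[i] for i in index["ann"]])
```

The aggregation never needs the index, so building one only pays off when individual keys have to be served.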


Distribute on low cost HW

Distributed Storage, Computation.
Distributed Index.
Horizontal Virtual Operating System.
Open Source Platform.

[Diagram: Storage and Computation layers spanning three Linux machines]

Meet the ...


www.datameer.com

Free for .edu!

Street Cred

Long time open source contributor

http://github.com/sgroschupf/zkclient

http://github.com/sgroschupf/aws-tasks
