Horizontal Virtualization On Commodity Hardware
Stefan Groschupf, sg@datameer
© Datameer, Inc 2010

Laws Of Physics

Values/sec. by storage medium and access pattern (bar chart, log scale from 1 to 1,000,000,000):

  • Disk: 316 random, 53,200,000 sequential
  • SSD: 1,924 random, 42,200,000 sequential
  • Memory: 36,700,000 random, 358,200,000 sequential

Source: Adam Jacobs, The Pathologies of Big Data
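The gap between the two access patterns is easier to judge as a ratio. The following is a minimal sketch, assuming the six chart values pair up with media and access patterns as in the figure from Jacobs' article; the names `VALUES_PER_SEC` and `sequential_speedup` are illustrative and not part of the slides.

```python
# Values/sec. quoted on the slide (source: Adam Jacobs, "The Pathologies of Big Data").
# Assumption: the pairing of each number with a medium and access pattern.
VALUES_PER_SEC = {
    "Disk":   {"random": 316,        "sequential": 53_200_000},
    "SSD":    {"random": 1_924,      "sequential": 42_200_000},
    "Memory": {"random": 36_700_000, "sequential": 358_200_000},
}


def sequential_speedup(medium: str) -> float:
    """Return how many times faster sequential access is than random access."""
    rates = VALUES_PER_SEC[medium]
    return rates["sequential"] / rates["random"]


if __name__ == "__main__":
    for medium in VALUES_PER_SEC:
        print(f"{medium}: sequential is {sequential_speedup(medium):,.0f}x faster")
```

Under that pairing, sequential disk access (53,200,000 values/sec.) is faster than random memory access (36,700,000 values/sec.).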


Laws Of Physics

  • Hadoop: merge sort
  • DB: B-tree/index
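The two access patterns named on this slide can be contrasted in a few lines. This is an illustration only, not Hadoop or database code: `merge_sorted_runs` combines pre-sorted runs in a single front-to-back pass, the pattern merge sort relies on, while `index_lookup` uses binary search over a sorted key list as a simplified stand-in for a B-tree index that jumps to one record. All names are hypothetical.

```python
import bisect
import heapq
from typing import Iterable, Iterator, Optional, Sequence, Tuple

Record = Tuple[int, str]  # (key, value)


def merge_sorted_runs(runs: Iterable[Iterable[Record]]) -> Iterator[Record]:
    """Merge already-sorted runs; every run is consumed sequentially, once."""
    return heapq.merge(*runs)


def index_lookup(
    keys: Sequence[int], records: Sequence[Record], key: int
) -> Optional[Record]:
    """Find one record by key with binary search (random access into records)."""
    pos = bisect.bisect_left(keys, key)
    if pos < len(keys) and keys[pos] == key:
        return records[pos]
    return None


if __name__ == "__main__":
    runs = [[(1, "a"), (4, "d")], [(2, "b"), (3, "c")]]
    merged = list(merge_sorted_runs(runs))
    keys = [k for k, _ in merged]
    print(merged)
    print(index_lookup(keys, merged, 3))
```

The merge touches every record but only in order; the lookup touches few records but at unpredictable positions.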


Thesis, Antithesis, Synthesis

Thesis: SQL/ACID solves all your data problems.

Antithesis: Nobody needs SQL/ACID, let's throw it all out.

Synthesis: By carefully considering data integrity constraints, we can find a more optimal data management solution for a particular problem.

voidpointer: http://news.ycombinator.com/item?id=1163516


Considerations

Keep it simple! => Simple to distribute.
What is really the problem to solve?
  • Serve Data or Analyze Data?
Sequential Write/Read performs best.
  • Data aggregation challenges.
Index only if really necessary.
  • For data serving.
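The serve-versus-analyze distinction can be sketched with an append-only file. In this hedged example, analysis is one sequential pass with no index, and an offset index is built only for the serving case, where a single record must be fetched. The tab-separated file layout and every function name are assumptions made for illustration.

```python
import os
import tempfile
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def append_events(path: str, events: Iterable[Tuple[str, int]]) -> None:
    """Write events only at the end of the file (sequential writes)."""
    with open(path, "a", encoding="utf-8", newline="\n") as f:
        for user, amount in events:
            f.write(f"{user}\t{amount}\n")


def aggregate(path: str) -> Dict[str, int]:
    """Analyze: sum amounts per user in one sequential pass, without an index."""
    totals: Dict[str, int] = defaultdict(int)
    with open(path, encoding="utf-8") as f:
        for line in f:
            user, amount = line.strip().split("\t")
            totals[user] += int(amount)
    return dict(totals)


def build_index(path: str) -> Dict[str, int]:
    """Serve: map each user to the byte offset of that user's latest event."""
    index: Dict[str, int] = {}
    with open(path, "rb") as f:
        while True:
            offset = f.tell()
            line = f.readline()
            if not line:
                break
            index[line.split(b"\t", 1)[0].decode("utf-8")] = offset
    return index


def latest_event(path: str, index: Dict[str, int], user: str) -> Tuple[str, int]:
    """Serve: seek directly to one record instead of scanning the file."""
    with open(path, "rb") as f:
        f.seek(index[user])
        name, amount = f.readline().decode("utf-8").strip().split("\t")
    return name, int(amount)


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "events.tsv")
    append_events(path, [("alice", 3), ("bob", 5), ("alice", 4)])
    print(aggregate(path))
    print(latest_event(path, build_index(path), "alice"))
```

The index costs an extra pass to build and has to be kept current on every append, which is the overhead the slide suggests paying only when data is being served.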


Distribute on low cost HW

Distributed Storage, Computation.
Distributed Index.
Horizontal Virtual Operating System.
Open Source Platform.

Diagram labels: Storage, Computation, Linux (three nodes)
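The idea of spreading storage and computation over several machines can be simulated in a single process. This is a rough sketch under stated assumptions, not the API of Hadoop or any other platform: each "node" is just a Python list, records are placed by a stable hash, every node computes over its own records, and the partial results are combined at the end. All function names are hypothetical.

```python
import zlib
from collections import Counter
from typing import Dict, Iterable, List


def partition(records: Iterable[str], n_nodes: int) -> List[List[str]]:
    """Distributed storage: place each record on a node chosen by a stable hash."""
    nodes: List[List[str]] = [[] for _ in range(n_nodes)]
    for record in records:
        nodes[zlib.crc32(record.encode("utf-8")) % n_nodes].append(record)
    return nodes


def local_count(node_records: Iterable[str]) -> Counter:
    """Distributed computation: a node counts only the records it stores."""
    return Counter(node_records)


def combine(partials: Iterable[Counter]) -> Dict[str, int]:
    """Add up the per-node partial counts into one result."""
    total: Counter = Counter()
    for partial in partials:
        total.update(partial)
    return dict(total)


if __name__ == "__main__":
    words = ["a", "b", "a", "c", "b", "a"]
    nodes = partition(words, n_nodes=3)
    print(combine(local_count(node) for node in nodes))
```

Because equal records always hash to the same node, adding nodes changes where the work happens but not the combined result.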


Meet the ...


www.datameer.com

Free for .edu!


Street Cred

Long time open source contributor

http://github.com/sgroschupf/zkclient

http://github.com/sgroschupf/aws-tasks