
NoSQL

Graduate student: 鍾聖彥

Advisor: Prof. 李永山

Page 2: No sql

NoSQL means "Not Only SQL"

Page 3: No sql

Introduction

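• The term NoSQL first appeared in 1998 as the name of a lightweight, open-source relational database developed by Carlo Strozzi that did not provide an SQL interface.

• In 2009, Eric Evans of Rackspace reintroduced the NoSQL concept. By then NoSQL mainly referred to non-relational, distributed database designs that do not provide ACID guarantees. (The name was chosen for a meetup: NoSQL.)

• NoSQL databases were generally designed around the needs of early-21st-century websites.

• They focus on very large data sets running on clusters.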

Page 4: No sql

NoSQL is a term that was coined by accident

Page 5: No sql

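Common characteristics of NoSQL databases

• Do not use the relational model

• Run well on clusters

• Open source

• Built for 21st-century websites

• Schemaless

• Do not use SQL as the query language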

Page 6: No sql

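Why use NoSQL? (1/2)

• Managing large-scale data: NoSQL databases readily handle heavy read/write loads, large numbers of users, and data measured in petabytes. (1 petabyte = 2^50 bytes = 1024 terabytes, or about a million gigabytes.)

• No database schema required: when it comes to designing a schema, they leave wide latitude, and stored data maps easily onto objects.

• Developer friendliness: NoSQL databases provide simple APIs for the major programming languages, so a complex ORM framework is often unnecessary. When no API is available for a particular language, data can still be accessed over HTTP through a simple RESTful API using XML or JSON.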
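The schemaless and object-mapping points can be sketched in a few lines. This is a minimal illustration, not any particular product's API: the `store` dictionary is a hypothetical stand-in for a NoSQL database, and the `User` class and its field names are assumptions made for the example. Objects are serialized to JSON, the kind of payload a RESTful API would carry over HTTP, and two records with different fields coexist without any schema change.

```python
import json
from dataclasses import dataclass, asdict, field

@dataclass
class User:
    # Hypothetical application object; field names are illustrative only.
    name: str
    email: str
    tags: list = field(default_factory=list)

store = {}  # hypothetical in-memory stand-in for a NoSQL database

def put(key, obj):
    """Serialize a plain dict to JSON and store it under the key."""
    store[key] = json.dumps(obj)

def get(key):
    """Load the JSON text stored under the key back into a dict."""
    return json.loads(store[key])

# An object maps directly to a stored record, with no ORM layer in between.
put("user:1", asdict(User("John", "john@example.com", ["sales"])))

# A second record with different fields needs no schema migration.
put("user:2", {"name": "Mary", "nickname": "M"})

user = User(**get("user:1"))
```

The design choice to note is that the structure lives in the application (here the `User` class) rather than in the database, which is what "schemaless" amounts to in practice.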

Page 7: No sql

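Why use NoSQL? (2/2)

• Availability: most distributed NoSQL databases provide easy data replication, so the failure of a single node is less likely to affect data availability.

• Scalability: NoSQL databases do not require dedicated high-performance servers; they run readily on clusters of commodity hardware.

• Low latency

Trade-off: relational features are not fully supported. There are no join operations (except within partitions) and no referential integrity constraints across partitions.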

Page 8: No sql

How did we get here?

• Explosion of large web and social media sites (Google, Facebook, Twitter) with large data needs

• Rise of cloud-based solutions such as Amazon S3 (Simple Storage Service)

• Open-source community

Page 9: No sql

Who influenced NoSQL?

Page 10: No sql

Dynamo and BigTable

• Three major papers were the seeds of the NoSQL movement:

• BigTable (Google)

• Dynamo (Amazon)

• CAP Theorem

Page 11: No sql

CAP Theorem

• Also known as Brewer's theorem

• Brewer's CAP "Theorem": for any system sharing data, it is impossible to guarantee all three of these properties simultaneously:

• Consistency: all nodes see the same data at the same time

• Availability: a guarantee that every request receives a response about whether it succeeded or failed

• Partition tolerance: the system continues to operate despite arbitrary message loss or failure of part of the system

Page 12: No sql

CAP Theorem

• Very large systems will partition at some point, so it is necessary to decide between C and A.

• Traditional DBMSs prefer C over A and P.

• Most Web applications choose A (except in specific applications such as order processing).
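The choice between C and A during a partition can be sketched with a toy two-replica cluster. This is a simplified model under stated assumptions, not a real replication protocol: the `Cluster` and `Replica` classes, the `mode` flag, and the rule that every write arrives at replica `a` are all hypothetical. In "CP" mode a write is refused while the replicas cannot communicate; in "AP" mode it is accepted locally and the replicas diverge.

```python
class Replica:
    def __init__(self):
        self.data = {}

class Cluster:
    """Toy two-replica system; every client write arrives at replica a."""

    def __init__(self, mode):
        self.mode = mode          # "CP" or "AP" (hypothetical setting)
        self.a = Replica()
        self.b = Replica()
        self.partitioned = False  # True when a and b cannot communicate

    def write(self, key, value):
        if self.partitioned:
            if self.mode == "CP":
                # Prefer consistency: refuse the write rather than diverge.
                return False
            # Prefer availability: accept locally, replica b becomes stale.
            self.a.data[key] = value
            return True
        # No partition: both replicas are updated together.
        self.a.data[key] = value
        self.b.data[key] = value
        return True

    def consistent(self):
        return self.a.data == self.b.data

cp = Cluster("CP")
cp.write("x", 1)
cp.partitioned = True
ok_cp = cp.write("x", 2)   # refused, replicas still agree

ap = Cluster("AP")
ap.write("x", 1)
ap.partitioned = True
ok_ap = ap.write("x", 2)   # accepted, replicas now disagree
```

An order-processing system would typically want the "CP" behavior, while a Web application that must always respond would accept the temporary divergence of "AP" and reconcile later.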

Page 13: No sql
Page 14: No sql

NoSQL database

Page 15: No sql

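NoSQL data models (1/2)

• Key-value stores: generally contain only a set of global key-value pairs, with each value identified by a unique key.

• Document stores: well-known implementations of this model include MongoDB, CouchDB, and RavenDB.

• Column-oriented: this model follows the research paper Google published on BigTable, the distributed storage system it built for internal use. Other well-known column-oriented implementations include Hadoop HBase, Apache Cassandra, and Hypertable.

• Graph: suited to recording any data with complex relationships, such as social networks, product preferences, or rules. Example: FlockDB, which Twitter uses to implement its user follow graph.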

Page 16: No sql

NoSQL data models (2/2)

• Hierarchical: these databases store data in hierarchical form, that is, as a tree or parent-child relationship. Well-known hierarchical databases include Microsoft's Windows Registry and IBM's IMS database.

• Triple stores: triple stores save data as subject-predicate-object triples, with the predicate linking the subject to the object. They support the Semantic Web and RDF storage.
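The subject-predicate-object model can be sketched as a set of tuples with pattern matching. This is a minimal illustration and not RDF or SPARQL: the sample triples and the `match` helper are hypothetical, and `None` plays the role of a wildcard.

```python
# Each fact is a (subject, predicate, object) tuple; the data is made up.
triples = {
    ("Alice", "knows", "Bob"),
    ("Bob", "knows", "Carol"),
    ("Alice", "worksAt", "Acme"),
}

def match(s=None, p=None, o=None):
    """Return the triples matching the pattern; None matches anything."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )

about_alice = match(s="Alice")      # everything stated about Alice
who_knows_whom = match(p="knows")   # every "knows" link
```

Because every fact has the same three-part shape, a query is just a partially filled-in triple, which is why the predicate acts as the link between subject and object.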

Page 17: No sql

Complexity

Page 18: No sql

Key-value Stores

Extremely simple interface

• Data model: (key, value) pairs

• Operations: Insert(key, value), Fetch(key), Update(key, value), Delete(key)

Implementation: efficiency, scalability, fault-tolerance

• Records distributed to nodes based on key

• Replication

• Single-record transactions, "eventual consistency"

Example systems

• Google BigTable, Amazon Dynamo, Cassandra, Voldemort, HBase, …

• Riak: based on Amazon's Dynamo.
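The interface and the two implementation ideas (distributing records by key, and replication) can be sketched as follows. This is a toy in-memory model under stated assumptions: the `KeyValueStore` class, the node count, and the "hash the key, then copy to the next node" placement rule are illustrative choices, not the scheme of any system named above. Replication here is synchronous for simplicity, whereas real systems often replicate asynchronously, which is where eventual consistency comes from.

```python
import hashlib

class KeyValueStore:
    """Toy key-value store: each key lives on `replicas` of `num_nodes` nodes."""

    def __init__(self, num_nodes=3, replicas=2):
        self.nodes = [{} for _ in range(num_nodes)]  # one dict per node
        self.replicas = replicas

    def owners(self, key):
        # Hash the key to pick a first node, then use the following nodes
        # as replicas (an assumed placement rule for this sketch).
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        first = int(digest, 16) % len(self.nodes)
        return [(first + i) % len(self.nodes) for i in range(self.replicas)]

    def insert(self, key, value):
        for n in self.owners(key):
            self.nodes[n][key] = value

    def update(self, key, value):
        self.insert(key, value)

    def fetch(self, key):
        # Try each replica in turn, so one lost node does not lose the value.
        for n in self.owners(key):
            if key in self.nodes[n]:
                return self.nodes[n][key]
        return None

    def delete(self, key):
        for n in self.owners(key):
            self.nodes[n].pop(key, None)

kv = KeyValueStore()
kv.insert("cart:42", ["book"])
kv.update("cart:42", ["book", "pen"])
copies = sum("cart:42" in node for node in kv.nodes)

# Simulate losing the first owner node; the replica still answers.
kv.nodes[kv.owners("cart:42")[0]].clear()
survivor = kv.fetch("cart:42")
```

Every operation touches exactly one key, which is what keeps the interface simple and is also why queries by value or across several keys fit this model poorly.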

Page 19: No sql

Key-value Stores

• Riak: based on Amazon's Dynamo.

Page 20: No sql

Suitable or Not suitable

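Suitable use cases:

• Storing web session information

• User preferences

• Shopping cart data

Unsuitable use cases:

• Retrieving relationships among different data

• Operations on multiple keys

• Querying by data (by value rather than by key)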

Page 21: No sql

Document stores

Like key-value stores, except that the value is a document.

• Data model: (key, document) pairs

• Document: JSON, XML, other semistructured formats

• Basic operations: Insert(key, document), Fetch(key), Update(key, document), Delete(key)

• Also Fetch based on document contents

Example systems: CouchDB, MongoDB, SimpleDB, …
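The difference from a plain key-value store, fetching by document contents, can be sketched like this. The `DocumentStore` class and its `find` method are hypothetical names for this illustration and do not reproduce the query API of CouchDB, MongoDB, or SimpleDB; the sample documents are made up.

```python
class DocumentStore:
    """Toy document store: values are dicts that can be queried by field."""

    def __init__(self):
        self.docs = {}

    def insert(self, key, document):
        self.docs[key] = document

    def fetch(self, key):
        return self.docs.get(key)

    def delete(self, key):
        self.docs.pop(key, None)

    def find(self, **criteria):
        # Fetch based on document contents: return the keys of documents
        # whose fields equal every given criterion.
        return [
            key for key, doc in self.docs.items()
            if all(doc.get(f) == v for f, v in criteria.items())
        ]

db = DocumentStore()
db.insert("p1", {"type": "post", "author": "Mary", "title": "Hello"})
db.insert("p2", {"type": "post", "author": "John"})
db.insert("c1", {"type": "comment", "author": "Mary", "post": "p1"})

by_mary = db.find(author="Mary")
posts_by_mary = db.find(type="post", author="Mary")
```

This sketch scans every document on each query; real document stores build indexes over document fields to avoid that scan.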

Page 22: No sql

Document stores

Page 23: No sql

Suitable or Not suitable

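Suitable use cases:

• Event logging

• Content management systems, blogs

• Web analytics, real-time analytics

Unsuitable use cases:

• Complex transactions spanning multiple operations

• Queries against varying aggregate structures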

Page 24: No sql

Column-oriented

For comparison, the document model is based on the JSON format: a data model that supports lists, maps, dates, and Booleans, with nesting. Really: indexed semistructured documents.

Example (Mongo):

{ Name: "Jaroslav",
  Address: "Malostranske nám. 25, 118 00 Praha 1",
  Grandchildren: { Claire: "7", Barbara: "6", Magda: "3", Kirsten: "1", Otis: "3", Richard: "1" } }

Page 25: No sql

Column-oriented

Page 26: No sql

Column-oriented

Row oriented

Id  Username  Email              Department
1   John      [email protected]  Sales
2   Mary      [email protected]  Marketing
3   Yoda      [email protected]  IT

Column oriented

Id:          1, 2, 3
Username:    John, Mary, Yoda
Email:       [email protected], [email protected], [email protected]
Department:  Sales, Marketing, IT
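The two layouts can be sketched with plain Python structures. This is only an in-memory illustration of the idea, not how any column store lays out data on disk; the email column is left out because its values are not legible in the source.

```python
# Row-oriented layout: one complete record per entry.
rows = [
    {"id": 1, "username": "John", "department": "Sales"},
    {"id": 2, "username": "Mary", "department": "Marketing"},
    {"id": 3, "username": "Yoda", "department": "IT"},
]

# Column-oriented layout: one list per attribute, aligned by position.
columns = {name: [row[name] for row in rows] for name in rows[0]}

# Reading a whole record is a single lookup in the row layout ...
record = rows[1]

# ... while scanning one attribute touches a single list in the column layout.
departments = columns["department"]

# A record can still be rebuilt from the columns by position.
rebuilt = {name: values[1] for name, values in columns.items()}
```

The row layout favors reading or writing whole records, while the column layout favors scans and aggregates over a few attributes across many records.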

Page 27: No sql

Suitable or Not suitable

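Suitable use cases:

• Event logging

• Content management systems, blogging platforms

• Expiring usage (for example, ad banners)

Unsuitable use cases:

• Systems that require ACID transactions for reads and writes

• Early development stages, when query patterns are still changing (changing them is costly)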

Page 28: No sql

Graph stores

• Data model: nodes and edges

• Nodes may have properties (including ID)

• Edges may have labels or roles

• Interfaces and query languages vary

• Example systems: Neo4j, FlockDB, Pregel, …

• RDF "triple stores" can map to graph databases
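The node-and-edge model can be sketched with a dictionary of node properties and a list of labeled edges per node. This is a minimal illustration with made-up data: the `add_edge`, `neighbors`, and `follows_of_follows` helpers are hypothetical and do not correspond to the query language of Neo4j, FlockDB, or Pregel.

```python
# Nodes: id -> properties. Edges: source id -> list of (label, target id).
nodes = {
    1: {"name": "Alice"},
    2: {"name": "Bob"},
    3: {"name": "Carol"},
}
edges = {}

def add_edge(src, label, dst):
    """Add a directed, labeled edge from src to dst."""
    edges.setdefault(src, []).append((label, dst))

def neighbors(src, label):
    """Targets reachable from src over edges carrying the given label."""
    return [dst for lbl, dst in edges.get(src, []) if lbl == label]

def follows_of_follows(src):
    """Two-hop traversal: accounts followed by the accounts src follows."""
    found = set()
    for middle in neighbors(src, "follows"):
        for dst in neighbors(middle, "follows"):
            if dst != src:
                found.add(dst)
    return found

add_edge(1, "follows", 2)   # Alice follows Bob
add_edge(2, "follows", 3)   # Bob follows Carol
add_edge(2, "follows", 1)   # Bob follows Alice

suggestions = {nodes[n]["name"] for n in follows_of_follows(1)}
```

A traversal like this follows edges directly from node to node, which is the access pattern that recommendation and social-network queries rely on.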

Page 29: No sql

Graph stores

Page 30: No sql

Graph stores

Page 31: No sql

Suitable or Not suitable

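Suitable use cases:

• Social networks

• Routing, dispatch, and location-based services

• Recommendation engines

Unsuitable use cases:

• Updating all entities or a subset of entities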

Page 32: No sql

Conclusion and Discussion

NoSQL databases cover only a part of data-intensive cloud applications (mainly Web applications).

Problems with cloud computing:

• SaaS applications require enterprise-level functionality, including ACID transactions, security, and other features associated with commercial RDBMS technology, i.e. NoSQL should not be the only option in the cloud.

Page 33: No sql

Conclusion and Discussion

Hybrid solutions:

• Voldemort with MySQL as one of its storage backends

• Deal with NoSQL data as semistructured data, integrating RDBMS and NoSQL via SQL/XML

Page 34: No sql

Reference

http://www.ithome.com.tw/itadm/article.php?c=63360&s=5

http://www.openfoundry.org/index.php?option=com_content&task=view&id=9040&Itemid=4;isletter=1

http://www.julianbrowne.com/article/viewer/brewers-cap-theorem

http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.20.1495 

Book: 搞懂NoSQL的15堂課 (Chinese edition of NoSQL Distilled), Pramod J. Sadalage and Martin Fowler

Page 35: No sql

End