NoSQL Data Modeling
IdoFriedman.yml
Name: Ido Friedman,
Past: "SQL Server consultant, Instructor, Team Leader"
Present: "Data engineer and Architect, Elasticsearch, CouchBase, MongoDB, Python", …]
WorkPlace: Perion
WhenNotWorking: @Sea
Let’s talk
• What is the role of data modeling
• What does data modeling affect
Data models
• Document
• Columnar
• Graph
• Relational
• New SQL*
• And more...
Document
Data Domains
• Online
• Batch
• Real time analytics
• Micro batch
• Streaming
Schema and structure
• Schema free
• Structured
• Unstructured
• Semi Structured
• Who needs schemas?
• Schema description
Normalization/De-Normalization
• Born to conserve storage and keep data integrity (referential integrity, RI)
• Resolve data joining issues
• Performance aspects
Is it still relevant???
Normalization example
{
  "_index": "user_profiles",
  "_type": "properties",
  "_id": "25467834901804247006200168495554902214",
  "_version": 4,
  "_score": 1,
  "_source": {
    "app_package_id": 4495665825523018,
    "device_id": "b94b29c3-f03f-4e43-a646-53708e025779",
    "group_id": 876,
    "customer_user_id": "",
    "customer_device_id": "b2f5fbfb-5d05-01e9-32e9-a8b332e9a8b3",
    "advertise_id": "",
    "device_os": "android",
    "device_os_comparable_version": "00000004.00000002.00000002.00000017",
    "device_os_full_name": "android 4.2.2.17",
    "android_id": "c93334e4e246b83c",
    "manufacturer": "OPPO",
    "model": "R1001",
    "screen_width": 480,
    "screen_height": 800,
    "device_language": "vi",
    "cpu": "",
    "is_rooted": 1,
    "is_jailbroken": 0,
    "vendor": "perion",
    "app_installation_time": 1435173851000,
    "push_allowed": 1,
    "operator": "Beeline VN",
    "mcc": "452",
    "mnc": "07",
    "mac_address": "",
    "created_at": 1435174685000,
    "registered_in_desktop": "",
    "short_country_code": "VN",
    "updated_at": 1435259712000,
    "app_package_name": "com.gingersoftware.android.keyboard",
    "app_version_type": ""
  }
}
"manufacturer": "OPPO",
{
  "_index": "events-2015-06-04",
  "_type": "events",
  "_id": "AU4-zg5nOG4dkiMGKrCx",
  "_version": 1,
  "_score": 1,
  "_source": {
    "numeric_value_unit": "key",
    "event_type_name": "custom",
    "text_value": "",
    "event_date": 1433440149000,
    "numeric_value": 13,
    "quantity": 0,
    "event_name": "Saved Tap",
    "numeric_value_name": "taps saved",
    "app_package_id": 1433440149000,
    "api_key": "asasa1w121",
    "device_id": "c5eb1fe0-8a77-41ef-9f79-ed7ed69d32e6"
  }
}
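Free-text fields such as event_name are where denormalized data tends to drift: one logical event can arrive as "Saved Tap", "SVD T", "Saved TP" or "Saved". One reading of this example is that a lookup key would keep the name consistent. A minimal sketch of that idea, assuming a hypothetical EVENT_TYPES table and event_type_id field that do not appear in the original documents:

```python
from collections import Counter

# Events as clients might write them when the name is free text
# (the four spellings come from the slides; the list itself is illustrative).
raw_events = [
    {"event_name": "Saved Tap"},
    {"event_name": "SVD T"},
    {"event_name": "Saved TP"},
    {"event_name": "Saved"},
]

# Four spellings of one logical event split an aggregation into four buckets.
name_counts = Counter(e["event_name"] for e in raw_events)

# Normalized alternative (assumption): a lookup table keyed by an id.
EVENT_TYPES = {1: "Saved Tap"}
normalized_events = [{"event_type_id": 1} for _ in raw_events]


def event_name(event):
    """Resolve the display name through the lookup table (an application-side join)."""
    return EVENT_TYPES[event["event_type_id"]]


resolved_counts = Counter(event_name(e) for e in normalized_events)
```

The trade-off is the usual one: the normalized form needs an extra lookup on every read, while the free-text form needs cleanup on every aggregation.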
"event_name": "Saved Tap","event_name": "SVD T",
"event_name": "Saved TP","event_name": "Saved",
Constraints
• No BIG Brother
• Data can't be verified once it leaves the application
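Without constraints enforced by the data store, checks have to run in the application before a document is written. A minimal sketch, assuming a hypothetical validate_event helper and a required-field list chosen only for illustration (the field names are borrowed from the event document shown earlier):

```python
def validate_event(doc):
    """Return a list of problems; an empty list means the document passes."""
    problems = []
    # Assumed rules: these three fields must exist with these types.
    required = {"event_name": str, "event_date": int, "device_id": str}
    for field, expected_type in required.items():
        if field not in doc:
            problems.append(f"missing field: {field}")
        elif not isinstance(doc[field], expected_type):
            problems.append(f"wrong type for {field}: {type(doc[field]).__name__}")
    # Only check content once the structure is known to be sound.
    if not problems and not doc["event_name"].strip():
        problems.append("event_name is empty")
    return problems


good = {
    "event_name": "Saved Tap",
    "event_date": 1433440149000,
    "device_id": "c5eb1fe0-8a77-41ef-9f79-ed7ed69d32e6",
}
bad = {"event_name": "", "event_date": "yesterday"}
```

Calling validate_event(good) returns an empty list, while validate_event(bad) reports the wrong type and the missing field, so the write can be rejected before the data leaves the application.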
Transactions
• Atomicity
• Locking
• Rollback
IT IS YOUR RESPONSIBILITY
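When the application owns that responsibility, a multi-document change needs its own undo path. A rough sketch, assuming an in-memory dict as a stand-in for a document store and a hypothetical link_employee_resume function; it shows rollback only and does nothing about locking or concurrent writers:

```python
import copy

# In-memory stand-in for a document store (assumption, not a real client).
store = {
    "employee:101": {"_id": 101, "Resume_ID": None},
    "resume:1004": {"_id": 1004, "Employee_id": None},
}


def link_employee_resume(store, employee_key, resume_key, fail_midway=False):
    """Update two documents as a unit; restore both if the second write fails."""
    snapshot = {k: copy.deepcopy(store[k]) for k in (employee_key, resume_key)}
    try:
        store[employee_key]["Resume_ID"] = store[resume_key]["_id"]
        if fail_midway:
            raise RuntimeError("simulated failure between the two writes")
        store[resume_key]["Employee_id"] = store[employee_key]["_id"]
    except Exception:
        # Application-managed rollback: put the saved copies back.
        store.update(snapshot)
        return False
    return True
```

Passing fail_midway=True simulates a crash after the first write; the snapshot is restored and both documents end up unchanged.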
Relations type
• One to One
• One to Many
• Many to Many
Relations example
1 to 1
Employee : Resume
{_id: 101, Name: "Jason Voorhees", Age: 99, Resume_ID: 1004}
{_id: 1004, Jobs: ["Cook"], Education: ["Knifery"], Hobbies: ["Murder"], Employee_id: 101}
1 to Many
City : Person
{ "name" : "Dam Square, Amsterdam", "location" : { "type" : "polygon", "coordinates" : [[ [ 4.89218, 52.37356 ], [ 4.89205, 52.37276 ], ….… ]]}}
{_id: 101, Name: "Jason Voorhees", Age: 99, Resume_ID: 1004}
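With references such as Resume_ID and Employee_id, the join happens in the application. A small sketch of following the 1 to 1 reference, assuming two in-memory dicts as stand-ins for the collections and a hypothetical employee_with_resume helper:

```python
# Stand-ins for two collections, keyed by _id (illustrative only).
employees = {
    101: {"_id": 101, "Name": "Jason Voorhees", "Age": 99, "Resume_ID": 1004},
}
resumes = {
    1004: {
        "_id": 1004,
        "Jobs": ["Cook"],
        "Education": ["Knifery"],
        "Hobbies": ["Murder"],
        "Employee_id": 101,
    },
}


def employee_with_resume(employee_id):
    """Two lookups instead of a join: fetch the employee, then follow Resume_ID."""
    employee = dict(employees[employee_id])  # shallow copy, stored doc stays untouched
    employee["Resume"] = resumes.get(employee["Resume_ID"])
    return employee


profile = employee_with_resume(101)
```

Each reference followed this way costs one more round trip to the store, which is the price paid for keeping the two documents separate.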
Many to Many
The new 1 to Many …..
One to Few
Student : Teacher
{_id: 101, Name: "Jason Voorhees", Age: 99, Resume_ID: 1004, Courses: ["Chainsaw 101", "Axing"], Teachers: [101]}
{_id: 101, Name: "Freddy Krueger", Age: 60, Resume_ID: 1004, Skills: [{Skill:}]}
Embedding
• Embed
  • Known doc size
  • Data is highly related
  • No joins
• Don’t Embed
  • Very large data sets
  • Data is updated rapidly
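The two cases can be put side by side. A minimal sketch, assuming plain Python dicts as documents and two hypothetical read helpers; the shortened device id and the choice of which data to embed are illustrative, not taken from the talk:

```python
# Embedded: a short, bounded list lives inside the parent document.
student_embedded = {
    "_id": 101,
    "Name": "Jason Voorhees",
    "Courses": ["Chainsaw 101", "Axing"],
}

# Referenced: a large, frequently written set is kept as separate documents
# that point back to the parent by id (assumed layout).
device = {"_id": "c5eb1fe0", "manufacturer": "OPPO"}
events = [
    {"device_id": "c5eb1fe0", "event_name": "Saved Tap"},
    {"device_id": "c5eb1fe0", "event_name": "Saved Tap"},
]


def courses_for(student):
    """Embedded read: one document, no join."""
    return student["Courses"]


def events_for(device_id, all_events):
    """Referenced read: a second query filtered by the parent id."""
    return [e for e in all_events if e["device_id"] == device_id]
```

Appending an event touches only the new event document, whereas appending to an embedded list rewrites the parent, which is why rapidly updated or unbounded data is usually kept out of it.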
Doesn’t fit ….
• Use the data model that best fits your needs
• Don’t be afraid of Polyglot Persistence
Polyglot Persistence
• Data usage patterns
  • Readers vs. Writers
  • Online vs. Batch
  • Concurrency
• Issues
  • Data freshness
  • Data consistency
  • System Coupling
Questions?