©2016 Couchbase Inc.
Introduction
• Data is like us: it is LIVING
  • i.e., it is changing: data is created, modified, and deleted
  • It serves a purpose and is useful to somebody, somewhere, sometime: data is retrieved
• An advanced and civilized society
  • Understands, nurtures, and manages it
  • Develops all the primitives and sophistication to sustain, simplify, and prosper
• And so is N1QL, for JSON data
  • An expressive, powerful, and complete language
  • For querying, transforming, and manipulating
  • For developers, admins, and enterprises
  • It is SQL for JSON (in fact, SQL++)
Agenda
1. Introduction
2. Ride with N1QL: Beyond SELECT
3. Real-world use cases
Beyond SELECT
• N1QL is much more than just SELECT
• DML support
  • INSERT: Insert new documents
  • UPDATE: Update existing documents
  • UPSERT: INSERT + UPDATE
  • DELETE: Delete existing documents
  • MERGE: Merge two related documents with INSERT/UPDATE/DELETE
  • PREPARE/EXECUTE: Run frequent queries/statements faster
  • INFER: Discover the schema of documents
  • EXPLAIN: Understand how a N1QL query/statement is executed
• DDL support
  • CREATE [PRIMARY] INDEX: Create a primary or secondary index
  • DROP [PRIMARY] INDEX: Delete a primary or secondary index
  • BUILD INDEX: Build indexes whose creation was deferred
• Index support (primary, secondary, partial, and covering indexes)
  • Indexes can be used for all DML (with a WHERE clause)
CUSTOMER document (alongside the LoyaltyInfo, Orders, and Result documents):
{ "Name" : "Jane Smith", "DOB" : "1990-01-30",
  "Billing" : [ { "type" : "visa", "cardnum" : "5827-2842-2847-3909", "expiry" : "2019-03" }, { "type" : "master", "cardnum" : "6274-2842-2847-3909", "expiry" : "2019-03" } ],
  "Connections" : [ { "CustId" : "XYZ987", "Name" : "Joe Smith" }, { "CustId" : "PQR823", "Name" : "Dylan Smith" } ],
  "Purchases" : [ { "id" : 12, "item" : "mac", "amt" : 2823.52 }, { "id" : 19, "item" : "ipad2", "amt" : 623.52 } ] }
NoSQL API: the data logic lives in the app
• Built manually
• Expensive (dev, test, maintenance)
• Not agile friendly
The same CUSTOMER document (alongside the LoyaltyInfo, Orders, and Result documents), with N1QL:
• Built-in
• Optimized, no data shipping
• Agile & ad hoc
INSERT
• INSERT lets you enter new documents into buckets.
• There are two kinds:
  • Insert values directly.
  • Insert with a SELECT statement.
• Keyspace-ref is the bucket name.
• The RETURNING clause returns the values of the specified attributes after the insert.
INSERT
• Inserting values directly – Values-clause
• Inserting values using select statement – Select-clause
INSERT – Values - Example
• Inserting values directly: VALUES clause
• Inserting a single value
INSERT INTO scientists (KEY, VALUE)
VALUES ("ddc", {"name":"Donald D Chamberlin", "type":"SQL"})
RETURNING *;
• Inserting multiple values
INSERT INTO scientists (KEY, VALUE)
VALUES ("efc", {"name": "Edgar F Codd", "type":"DB"}),
VALUES ("ll", {"name": "Leslie Lamport", "type":"Distributed"}),
VALUES ("ewd", {"name": "Edsger W. Dijkstra", "type":"Algorithms"}),
VALUES ("dek", {"name": "Donald E. Knuth", "type":"Algorithms"});
INSERT-VALUES processing (Client, Query Service, Data Service, Index Service)
INSERT INTO scientists (KEY, VALUE) VALUES ("dc", {"name":"Don Chamberlin", "type":"SQL"}) RETURNING scientists;
Document: { "name":"Don Chamberlin", "type":"SQL" }
1. Submit the query over the REST API
2. Parse, analyze, create plan
3. INSERT the documents; the index is updated with the new data over the DCP stream
4. Evaluate: documents to results (RETURNING clause)
5. Query result
INSERT – SELECT - Example
• SELECT clause: inserting values using a SELECT statement
• INSERT-SELECT lets you create new documents from complex queries
INSERT INTO DB_papers (KEY k, VALUE v)
SELECT UPPER(meta(s).id) AS k, {"name" : s.name, "papers" : s.papers} AS v
FROM scientists s
WHERE type = "DB"
RETURNING *;
• The key is created by applying the UPPER() function to the original document key.
  • In general, any function can be used to dynamically generate the key `k`.
• The value is a dynamically created object with the name and the array of papers published by each scientist where type = "DB".
  • Note how easy it is to dynamically create and manipulate custom JSON objects in the query.
INSERT-SELECT processing (Client, Query Service, Index Service, Data Service)
INSERT INTO DB_papers (KEY k, VALUE v) SELECT meta().id k, {name, papers} v FROM scientists s WHERE type = "DB" RETURNING *;
Documents: { "name":"Jim Gray", "papers": ["transaction processing",…]}, { "name":"Sumit Ganguly", "papers": ["Data Stream Processing",…]} …
1. Submit the query over the REST API
2. Parse, analyze, create plan
3. Scan request; index filters
4. Get qualified doc-keys
5. Fetch request, doc-keys
6. INSERT the documents; the index is updated with the new data over the DCP stream
7. Evaluate: documents to results (RETURNING clause)
8. Query result
Scan consistency: NOT_BOUNDED, REQUEST_PLUS, AT_PLUS
DELETE
• DELETE lets you remove documents from buckets.
• USE clause
  • USE KEYS expr
  • USE INDEX (index-ref)
• WHERE clause: constrains which documents to delete.
  • USE INDEX can direct the query to a specific index.
• RETURNING clause
• LIMIT clause: constrains how many documents to delete.
  • In N1QL DML statements, the LIMIT clause serves as a hint.
  • The query engine can stop processing records any time after the LIMIT is reached.
  • The LIMIT is not applied exactly, unlike in SELECT statements.
DELETE - Example
• You can use arbitrarily complex conditions in the WHERE clause to select documents for deletion, and USE INDEX to direct the query to an IndexScan on a specific index.
• Using a WHERE clause
DELETE FROM scientists WHERE ARRAY_LENGTH(papers) <= 1;
• Using USE KEYS
DELETE FROM scientists USE KEYS ["Foo Bar"];
• Using an index (deletes 1 document)
CREATE INDEX deleteindex ON scientists (name);
DELETE FROM scientists USE INDEX (deleteindex) WHERE name = "Foo Bar";
• Emptying the whole bucket
DELETE FROM scientists;
DELETE-WHERE processing (Client, Query Service, Index Service, Data Service)
DELETE FROM scientists USE INDEX (deleteindex) WHERE name = "Foo Bar" RETURNING *;
Document: { "name":"Foo Bar", ...}
1. Submit the query over the REST API
2. Parse, analyze, create plan
3. Scan request; index filters
4. Get qualified doc-keys
5. Fetch request, doc-keys
6. DELETE the documents with doc-keys; the index is updated with the new data over the DCP stream
7. Evaluate: documents to results (RETURNING clause)
8. Query result
Scan consistency: NOT_BOUNDED, REQUEST_PLUS, AT_PLUS
CAS protection: delete ONLY the selected docs
UPDATE
• UPDATE replaces a document that already exists with updated values.
• USE clause: USE KEYS expr
• WHERE, LIMIT, and RETURNING clauses: same as for DELETE.
UPDATE
•SET-clause: Change the value of a particular attribute.
• UNSET-clause : Remove an attribute from a document
UPDATE
• Update-for
  • The update-for clause uses the FOR statement to iterate over a nested array
  • SET or UNSET the given attribute for every matching element in the array
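The update-for behavior described above can be pictured with a toy in-memory model. This is a minimal Python sketch, not Couchbase code: `update_for` is a hypothetical helper and the sample document is invented for illustration. It mirrors the shape `SET elem.attr = value FOR elem IN array WHEN cond END`.

```python
from typing import Any, Callable


def update_for(array: list, attr: str, value: Any,
               when: Callable[[dict], bool] = lambda elem: True) -> int:
    """Set `attr` to `value` on every element of `array` that satisfies `when`.

    Models: SET elem.attr = value FOR elem IN array WHEN when(elem) END
    Returns the number of elements changed.
    """
    changed = 0
    for elem in array:
        if when(elem):
            elem[attr] = value
            changed += 1
    return changed


# Hypothetical document with a nested array (not from the slides).
doc = {
    "name": "Jane Smith",
    "purchases": [
        {"id": 12, "item": "mac", "status": "open"},
        {"id": 19, "item": "ipad2", "status": "open"},
    ],
}

# SET p.status = "shipped" FOR p IN purchases WHEN p.id = 12 END
count = update_for(doc["purchases"], "status", "shipped",
                   when=lambda p: p["id"] == 12)
```

Only the element that satisfies the WHEN condition is touched; the other array elements are left as they were.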
UPDATE - Example
• The following statement adds a paper to an existing array:
UPDATE scientists USE KEYS "Sumit Ganguly"
SET papers = ARRAY_APPEND(papers, "Parametric Query Optimization");
• In addition to selecting documents by key, you can also select them by field values, with a WHERE clause.
• The following statement removes the email attribute:
UPDATE scientists
UNSET email
WHERE name = "Alan Turing"
RETURNING *;
UPDATE-WHERE processing (Client, Query Service, Index Service, Data Service)
UPDATE scientists SET email = "[email protected]" WHERE name = "C Mohan" RETURNING *;
Document: { "name":"C Mohan", ...}
1. Submit the query over the REST API
2. Parse, analyze, create plan
3. Scan request; index filters
4. Get qualified doc-keys
5. Fetch request, doc-keys; UPDATE the documents with doc-keys; the index is updated with the new data over the DCP stream
6. Evaluate: documents to results (RETURNING clause)
7. Query result
Scan consistency: NOT_BOUNDED, REQUEST_PLUS, AT_PLUS
CAS protection: update ONLY the selected docs
MERGE
• MERGE lets you update, insert, or delete documents (actions) in one bucket based on a match with the data in another.
• Multiple actions can be specified in one statement, both for when a match is found and for when it is not, rather than in separate independent statements.
• It is particularly suited to merge/purge operations and batch updates.
• The MERGE statement contains a source bucket and a target bucket. It needs a join condition based on a common attribute.
• Key clause: KEY expression
• LIMIT and RETURNING clauses are the same as before.
MERGE - Actions
• Actions on the destination keyspace/document: insert, update, and delete
  • Note the limited syntax of the actions; UPSERT would be redundant here.
• Merge-insert
• Merge-update
• Merge-delete
MERGE – Example
• We have two sets of data about cars.
INSERT INTO cars VALUES
  ("1", { "make" : "Toyota", "plate": "AAA-123", "mileage": 1000}),
  ("2", { "make" : "Chevrolet", "plate": "BBB-456", "mileage": 1000}),
  ("3", { "make" : "BMW", "plate": "CCC-456", "mileage": 1000});
INSERT INTO car_changes VALUES
  ("101", { "car_id" : "1", "mileage": 1030}),
  ("201", { "car_id" : "2", "mileage": 1040}),
  ("401", { "car_id" : "4", "mileage": 1050});
• Update/merge documents into cars to update the mileage.
MERGE INTO cars dst
USING car_changes src
ON KEY src.car_id
WHEN MATCHED THEN UPDATE SET dst.mileage = src.mileage
WHEN NOT MATCHED THEN INSERT WHERE src.mileage > 100;
MERGE processing (Data/Index Services, Query Service)
MERGE INTO cars dst
USING car_changes src
ON KEY src.car_id
WHEN MATCHED THEN UPDATE SET dst.mileage = src.mileage
WHEN NOT MATCHED THEN INSERT WHERE src.mileage > 100;
• JOIN Car Changes (src) with Cars (dst) on src.car_id = meta(dst).id
• MATCHED: UPDATE or DELETE
• NOT MATCHED: INSERT
• Query result
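To make the MATCHED / NOT MATCHED branching concrete, here is a toy Python model of the cars example. It is only a sketch of the semantics, not how the Query Service is implemented. `merge_mileage` is a hypothetical helper, and because the slide's statement does not show which value is inserted for unmatched keys, the sketch assumes `{"mileage": src.mileage}`.

```python
def merge_mileage(dst: dict, src: dict) -> dict:
    """Toy model of: MERGE INTO dst USING src ON KEY src.car_id ...

    MATCHED:      UPDATE SET dst.mileage = src.mileage
    NOT MATCHED:  INSERT ... WHERE src.mileage > 100
    ASSUMPTION: the inserted document is {"mileage": src.mileage}.
    """
    actions = {"updated": [], "inserted": []}
    for change in src.values():
        key = change["car_id"]                 # ON KEY src.car_id
        if key in dst:                         # WHEN MATCHED
            dst[key]["mileage"] = change["mileage"]
            actions["updated"].append(key)
        elif change["mileage"] > 100:          # WHEN NOT MATCHED ... WHERE
            dst[key] = {"mileage": change["mileage"]}
            actions["inserted"].append(key)
    return actions


cars = {
    "1": {"make": "Toyota", "plate": "AAA-123", "mileage": 1000},
    "2": {"make": "Chevrolet", "plate": "BBB-456", "mileage": 1000},
    "3": {"make": "BMW", "plate": "CCC-456", "mileage": 1000},
}
car_changes = {
    "101": {"car_id": "1", "mileage": 1030},
    "201": {"car_id": "2", "mileage": 1040},
    "401": {"car_id": "4", "mileage": 1050},
}
result = merge_mileage(cars, car_changes)
```

Cars 1 and 2 match and get the new mileage, car 3 has no change record and is untouched, and car 4 is unmatched and passes the `mileage > 100` filter, so it is inserted.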
PREPARE - EXECUTE
• To avoid preparing the same statement repeatedly, prepare it once and then execute it by name
PREPARE [ name ( FROM | AS ) ] statement
EXECUTE name | plan
• Using cbq
cbq> PREPARE findcars FROM SELECT plate, make FROM default WHERE mileage > $miles; {
"encoded_plan":"H4sIAAAJbogA5xSwW7qMBD8FctcQIqegt4t7QuSL0hcWwrsnKWxODY6dppoYh+e9cmlAK9lFs8Ozs7O9m9RKtchdWyM2BlIWUmLbTIXwrI88t1SBAcyWIvR+eHXOBLz73IlAVaFMRcv/jJel/HxpH+j2yEukulTnpFmi3UOwwk9pWuGV01B1hhja48x2o6LrCFfQmDGvcor3Xtmak9loesss5Mwyq+Y3ctcAcCIxB87XNXevOtAlIMTFnKx20i9lPM3zXDz1ef5XiXE5OCjlK02CDWWk8mNnQfL3WDm5Nag6EnpuWrN76NBa3XSR+1+M7CKyWxH4ot7BJ1eeb8LQ9j+JyZEj+Rasr0YfkxhUilN8xfRwrbIIhNAmeel1bSH0hLE3zuX62rt4A8nk6cnUgFuW43HYAaGIpytW5Frh0AfkfiZiCJHfLAg3hpk/pDhMV4UMep/8vAZAAD//4OFhwocAwAA",
"name": "findcars",
PREPARE - EXECUTE - Example
• You can PREPARE any DML statement
• Prepared statements are stored transiently, in memory only.
  • They need to be re-created after the cluster restarts
PREPARE cc_insert FROM
INSERT INTO car_changes (KEY, VALUE) VALUES ($key, $val)
RETURNING car_changes;
• EXECUTE: using cbq
cbq> \SET -$key "501";
cbq> \SET -$val { "car_id" : "4", "mileage": 1050};
cbq> EXECUTE cc_insert;
• EXECUTE: using curl
curl http://localhost:8093/query/service -d 'statement=execute cc_insert&$key="501"&$val={ "car_id" : "5", "mileage": 1050}'
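The curl call above can also be assembled programmatically. The following Python sketch only builds the form-encoded request body and does not contact a server; `build_execute_body` is a hypothetical helper, and it assumes, following the curl example, that each named parameter is passed as `$name=<JSON value>`.

```python
import json
from urllib.parse import parse_qs, urlencode


def build_execute_body(name: str, params: dict) -> str:
    """Form-encode an EXECUTE request in the same shape as the curl example.

    Each named parameter is sent as `$name=<JSON-encoded value>`.
    """
    fields = {"statement": f"EXECUTE {name}"}
    for key, value in params.items():
        fields[f"${key}"] = json.dumps(value)
    return urlencode(fields)


body = build_execute_body(
    "cc_insert",
    {"key": "501", "val": {"car_id": "5", "mileage": 1050}},
)

# Decode again to check what a server would receive.
decoded = parse_qs(body)
```

Posting `body` to the `/query/service` endpoint shown in the curl example is left to the caller; JSON-encoding the values keeps the string `"501"` quoted, as in the curl command.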
INFER
• INFER lets you discover schema characteristics of documents in a bucket
• Based on sampling
INFER <keyspace_ref> [ WITH { <options> } ]
Options ::= [ "sample_size" : <value> ] [, "num_sample_values" : <value> ] [, "similarity_metric" : <value> ] [, "dictionary_threshold" : <value> ]
• sample_size
  • Specifies the number of documents to randomly sample in the keyspace or bucket.
  • The default value is 1000
• num_sample_values
  • Specifies the number of sample values (example data) to return for each attribute.
  • The default value is 5
INFER
• similarity_metric
  • The degree of similarity required for two document schemas to be considered the same flavor
  • It is a real number between 0 and 1 indicating the fraction of top-level attributes that match.
  • The default value is 0.6
• dictionary_threshold
  • The maximum number of fields following a dictionary pattern; beyond it, they are collapsed into a single schema field and marked as a dictionary
  • i.e., fields with different names but with the same sub-document schema, e.g., user_1, user_2
  • This happens where a field (e.g., ratings) has sub-fields that are key-value pairs instead of name-value pairs.
  • The document then appears to have a large number of "fields", since a data value (e.g., user_1) is used as the field name.
"ratings": { "user_1": { "created": 1439939260000, "rating": 4 }, "user_2": { "created": 1440066307000, "rating": 3 }, "user_3": { "created": 1440044407000, "rating": 2 }, … }
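The dictionary pattern can be illustrated with a small Python sketch. This is not INFER's actual algorithm, which is not spelled out here; `looks_like_dictionary` is a hypothetical helper, and comparing only the key sets of the sub-documents is a simplifying assumption about what "same sub-document schema" means.

```python
def looks_like_dictionary(obj: dict, dictionary_threshold: int) -> bool:
    """Return True if `obj` has more than `dictionary_threshold` sub-fields
    and all of them are objects with the same set of keys.

    ASSUMPTION: "same sub-document schema" is approximated by equal key sets.
    """
    if len(obj) <= dictionary_threshold:
        return False
    shapes = set()
    for value in obj.values():
        if not isinstance(value, dict):
            return False
        shapes.add(frozenset(value.keys()))
    return len(shapes) == 1


ratings = {
    "user_1": {"created": 1439939260000, "rating": 4},
    "user_2": {"created": 1440066307000, "rating": 3},
    "user_3": {"created": 1440044407000, "rating": 2},
}

collapsed_at_2 = looks_like_dictionary(ratings, dictionary_threshold=2)
collapsed_at_10 = looks_like_dictionary(ratings, dictionary_threshold=10)
```

With a threshold of 2, the three `user_N` entries exceed it and would be treated as one dictionary field; with a threshold of 10 they would still be reported as three separate fields.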
INFER
• Example: discover schema characteristics of the documents in the `travel-sample` bucket
INFER `travel-sample` WITH { "sample_size" : 4000, "num_sample_values" : 2, "similarity_metric" : 0.75, "dictionary_threshold" : 10 }
• Use cases
  • Query Workbench in the Couchbase Web Console
  • BI/visualization/analytics tools
    • Cloud9Charts leverages INFER
  • Schema mapping for ORM tools
    • ODBC/JDBC drivers use internal schema discovery
  • Developers/admins
EXPLAIN
• EXPLAIN shows the query plan, i.e., the exact steps N1QL plans to take to execute the query
EXPLAIN statement
cbq> EXPLAIN INSERT INTO default VALUES ("1", { "make" : "Toyota"});
"plan": {
  "#operator": "Sequence",
  "~children": [
    { "#operator": "ValueScan", "values": "[[\"1\", {\"\\\"make\\\"\": \"Toyota\"}]]" },
    { "#operator": "Parallel", "maxParallelism": 1,
      "~child": { "#operator": "Sequence", "~children": [ { "#operator": "SendInsert", …
CREATE/BUILD INDEX
• Index-with clause
• Index-using clause
• Build index
Use Case – DELETE
• Use case: user-profile management
  • Deleting stale user history based on some criteria
  • WHERE clause / LIMIT clause
DELETE FROM user_profiles
WHERE (id % 10) != 0
  AND changed_by IS NOT MISSING
  AND object_length(changes) > 1
  AND meta().id LIKE "HistoryLog::%"
LIMIT 1000;
• Be careful with mistakes
  • A predicate on "field_1" (with quotes) tests a string literal, so it always evaluates to TRUE and deletes all documents
  • Field names are case sensitive
Wrong (X): DELETE FROM default WHERE "field_1" IS NOT MISSING;
Right: DELETE FROM default WHERE field_1 IS NOT MISSING;
INSERT with Powerful Transformations
• Use cases
  • Loading data with structure and data transformations
    • https://catalog.data.gov/dataset?res_format=JSON
    • data.nasa.gov/data.json
    • https://dzone.com/articles/json-files-whats-in-a-new-york-name-unlocking-data
INSERT INTO nynames (KEY uuid(), VALUE val)
SELECT val
FROM (SELECT meta.`view`.columns[*].fieldName fields, data FROM datagov) d
UNNEST data d_tmp
LET val = OBJECT i : d_tmp[array_position(d.fields, i)] FOR i IN d.fields END;
INSERT with Powerful Transformations – Sample data
• https://dzone.com/articles/json-files-whats-in-a-new-york-name-unlocking-data
"meta" : { "view" : { "id" : "25th-nujf", "name" : "Most Popular Baby Names by Sex and Mother's Ethnic Group, New York City", "attribution" : "Department of Health and Mental Hygiene (DOHMH)", "columns" : [ { "id" : -1, "name" : "sid", "dataTypeName" : "meta_data", "fieldName" : ":sid", "position" : 0, "renderTypeName" : "meta_data", "format" : { } }, … ]"data" : [ [ 1, "EB6FAA1B-EE35-4D55-B07B-8E663565CCDF", 1, 1386853125, "399231", 1386853125, "399231", "{\n}", "2011", "FEMALE", "HISPANIC", "GERALDINE", "13", "75" ], [ 2, "2DBBA431-D26F-40A1-9375-AF7C16FF2987", 2, 1386853125, "399231", 1386853125, "399231", "{\n}", "2011", "FEMALE", "HISPANIC", "GIA", "21", "67" ], [ 3, "54318692-0577-4B21-80C8-9CAEFCEDA8BA", 3, 1386853125, "399231", 1386853125, "399231", "{\n}", "2011", "FEMALE", "HISPANIC", "GIANNA", "49", "42" ]…] }
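The `OBJECT ... FOR ... END` expression in the INSERT statement pairs each column name with the value at the same position in a data row. A minimal Python sketch of that reshaping follows; `rows_to_documents` is a hypothetical helper, and the field names other than `:sid` are invented and shortened for illustration, since the dataset's full column list is elided in the sample.

```python
def rows_to_documents(fields: list, rows: list) -> list:
    """Pair each field name with the value at the same position in a row.

    Models: OBJECT i : d_tmp[ARRAY_POSITION(d.fields, i)] FOR i IN d.fields END
    """
    return [
        {name: row[fields.index(name)] for name in fields}
        for row in rows
    ]


# Hypothetical, shortened field list; the real dataset defines more columns.
fields = [":sid", "brth_yr", "gndr", "nm"]
rows = [
    [1, "2011", "FEMALE", "GERALDINE"],
    [2, "2011", "FEMALE", "GIA"],
]
docs = rows_to_documents(fields, rows)
```

Each positional row becomes a self-describing JSON-style object, which is the structure transformation the INSERT-SELECT performs in a single statement.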
UPDATE to sanitize data
• Use case: fixing the date format and time zone
• No application code changes!
UPDATE travel_schedules
SET GMT_DEP_TIME = substr(GMT_DEP_TIME, 6, 4) || substr(GMT_DEP_TIME, 2, 4) || substr(GMT_DEP_TIME, 0, 2) || substr(GMT_DEP_TIME, 10),
    EST_ARR_TIME = substr(EST_ARR_TIME, 6, 4) || substr(EST_ARR_TIME, 2, 4) || substr(EST_ARR_TIME, 0, 2) || substr(EST_ARR_TIME, 10),
    SCHED_GMT_DEP = substr(SCHED_GMT_DEP, 6, 4) || substr(SCHED_GMT_DEP, 2, 4) || substr(SCHED_GMT_DEP, 0, 2) || substr(SCHED_GMT_DEP, 10);
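The four `substr` pieces in the UPDATE statement reorder the leading ten characters of each value. The Python sketch below replays that arithmetic with 0-based positions, as in the statement. `n1ql_substr` and `fix_date` are hypothetical helpers, and the input format `DD-MM-YYYY` followed by a time suffix is an assumption, since the slides do not show the stored values.

```python
from typing import Optional


def n1ql_substr(s: str, pos: int, length: Optional[int] = None) -> str:
    """Model of substr(s, pos[, length]) with a 0-based position."""
    return s[pos:] if length is None else s[pos:pos + length]


def fix_date(value: str) -> str:
    """Rearrange DD-MM-YYYY<rest> into YYYY-MM-DD<rest>."""
    return (
        n1ql_substr(value, 6, 4)     # "YYYY"
        + n1ql_substr(value, 2, 4)   # "-MM-"
        + n1ql_substr(value, 0, 2)   # "DD"
        + n1ql_substr(value, 10)     # everything after the date
    )


# Assumed sample value; the real GMT_DEP_TIME format is not shown on the slide.
fixed = fix_date("30-01-2016T10:15:00Z")
```

Under that assumption, the day and year swap places while the separators and the time suffix are carried over unchanged.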
UPDATE nested structures
“Customer: I am wondering how I could efficiently update a fourth-level nested document. For example, my document would look like this:”
{ "level1_Id" : "L1_ID_0", "level1_Arr": [ { "level2_Id": "L2_ID_0", "level2_Arr": [ { "level3_Id": "L3_ID_0", "level3_Arr": [ { "level4_Id": "L4_ID_0", "level4_Attr": "L4_ATTR" }, { "level4_Id": "L4_ID_1", "level4_Attr": "L4_ATTR" } ] } ] } ]}
UPDATE default d USE KEYS "TEST::1"
SET k.level4_Attr = "test"
    FOR k IN ARRAY_FLATTEN(
      ARRAY j.level3_Arr FOR j IN ARRAY_FLATTEN(
        ARRAY i.level2_Arr FOR i IN d.level1_Arr END, 1)
      END, 1)
    WHEN k.level4_Id = "L4_ID_0" END
RETURNING *;
UPSERT – Maintain latest summaries
• “Customer: I want to INSERT a document based on unique key or UPDATE a counter if the document already exists”
UPSERT INTO users (KEY, VALUE) VALUES ("EAN1234567", { "productId": "EAN1234567", "counter": 1 }) RETURNING *;
• Maintaining latest summaries
UPSERT INTO default (KEY id, VALUE doc)
SELECT 'app_summary_key' AS id, {"type" : 'summary', "count" : array_agg(tmp)} AS doc
FROM (SELECT device, count(*) AS `count` FROM `app-devices` WHERE appl IS NOT NULL AND appl_company = "Microsoft" GROUP BY app) AS tmp;
Real World Examples: MERGE
“Customer: I want to INSERT a document based on unique key or UPDATE a counter if the document already exists”
• Note that UPSERT may not help here, as it overwrites the whole document instead of updating a specific field such as p.counter
MERGE INTO product p USING (SELECT NULL) s ON KEY "EAN1234567"
WHEN MATCHED THEN UPDATE SET p.counter = p.counter + 1
WHEN NOT MATCHED THEN INSERT { "productId": "EAN1234567", "counter": 1 }
RETURNING *;
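The difference between the two statements is easy to see in a toy in-memory model. This Python sketch is illustrative only: `upsert` and `merge_counter` are hypothetical helpers and a plain dict stands in for the bucket.

```python
def upsert(bucket: dict, key: str, doc: dict) -> None:
    """UPSERT: insert the document, or overwrite the whole existing document."""
    bucket[key] = doc


def merge_counter(bucket: dict, key: str, initial: dict) -> None:
    """MERGE: increment `counter` when the key matches, insert `initial` otherwise."""
    if key in bucket:                     # WHEN MATCHED THEN UPDATE
        bucket[key]["counter"] += 1
    else:                                 # WHEN NOT MATCHED THEN INSERT
        bucket[key] = dict(initial)


initial = {"productId": "EAN1234567", "counter": 1}

# Run each statement three times against its own empty "bucket".
merged = {}
for _ in range(3):
    merge_counter(merged, "EAN1234567", initial)

upserted = {}
for _ in range(3):
    upsert(upserted, "EAN1234567", dict(initial))
```

After three runs the merged document's counter has advanced to 3, while the upserted document has been overwritten each time and still holds a counter of 1.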
INSERT – SELECT for Stock Trades
• Use cases: TaxLots
  • You want to join documents and merge elements from them to create a completely new document, possibly with aggregated data
  • You want to change the key of the documents, so you do an INSERT-SELECT with the new key
  • Replicate "temp table" functionality
• https://www.linkedin.com/pulse/modelling-taxlot-process-couchbase-using-n1ql-sandhya-krishnamurthy
• Imagine developing this with API-based code...
INSERT – SELECT for Stock Trades
Just feel the power of N1QL
Share your opinion on Couchbase
1. Go here: http://gtnr.it/2eRxYWn
2. Create a profile
3. Provide feedback (~15 minutes)