45
A deep look at the CQL WHERE clause

DataStax: A deep look at the CQL WHERE clause

Embed Size (px)

Citation preview

Page 1: DataStax: A deep look at the CQL WHERE clause

A deep look at the CQL WHERE clause

Page 2: DataStax: A deep look at the CQL WHERE clause

CQL WHERE clause

2© 2015. All Rights Reserved.

Driver

The WHERE clause restrictions are dependent on:

• The type of statement: SELECT, UPDATE or DELETE

• The type of column: partition key, clustering or regular column

• If a secondary index is used or not

Page 3: DataStax: A deep look at the CQL WHERE clause

3© 2015. All Rights Reserved.

Driver

SELECT statements

Page 4: DataStax: A deep look at the CQL WHERE clause

Partition key restrictions

4© 2015. All Rights Reserved.

Driver

Cluster Date Time Count

‘cluster 1’ ‘2015-09-21’ ‘12:00’ 251

‘cluster 1’ ‘2015-09-22’ ‘12:00’ 342

‘cluster 2’ ‘2015-09-21’ ‘12:00’ 403

‘cluster 2’ ‘2015-09-22’ ‘12:00’ 451

CREATE TABLE numberOfRequests (

cluster text,

date text,

time text,

count int,

PRIMARY KEY ((cluster, date))

)

Partition Key

Page 5: DataStax: A deep look at the CQL WHERE clause

Partition key restrictions

5© 2015. All Rights Reserved.

Driver

Cluster Date Murmur3 hash

‘cluster 1’ ‘2015-09-21’ -4782752162231423249

‘cluster 1’ ‘2015-09-22’ 4936127188075462704

‘cluster 2’ ‘2015-09-21’ 5822105674898716412

‘cluster 2’ ‘2015-09-22’ 2698159220916609751

A

C

D B

4611686018427387904to

9223372036854775807

-9223372036854775808to

-4611686018427387903

-1to

4611686018427387903

-4611686018427387904 to -1

Page 6: DataStax: A deep look at the CQL WHERE clause

Partition key restrictions

6© 2015. All Rights Reserved.

Driver

Cluster Date Node

‘cluster 1’ ‘2015-09-21’ A

‘cluster 1’ ‘2015-09-22’ D

‘cluster 2’ ‘2015-09-21’ D

‘cluster 2’ ‘2015-09-22’ C

A

C

D B

Page 7: DataStax: A deep look at the CQL WHERE clause

Partition key restrictions

7© 2015. All Rights Reserved.

Driver

A

C

D B

SELECT * FROM numberOfRequests;

Driver

Page 8: DataStax: A deep look at the CQL WHERE clause

Partition key restrictions

8© 2015. All Rights Reserved.

Driver

SELECT * FROM numberOfRequests WHERE cluster= ‘cluster 1’;

InvalidRequest: code=2200 [Invalid query]

message="Partition key parts: date must be restricted as other parts are"

Page 9: DataStax: A deep look at the CQL WHERE clause

Partition key restrictions

9© 2015. All Rights Reserved.

Driver

A

C

D B

SELECT * FROM numberOfRequests WHERE cluster= ‘cluster 1’

AND date = ‘2015-09-21’;

Driver

Page 10: DataStax: A deep look at the CQL WHERE clause

Partition key restrictions

10© 2015. All Rights Reserved.

Driver

A

C

D B

SELECT * FROM numberOfRequests WHERE cluster= ‘cluster 1’

AND date = ‘2015-09-21’;

Driver

…with TokenAwarePolicy

Page 11: DataStax: A deep look at the CQL WHERE clause

Partition key restrictions

11© 2015. All Rights Reserved.

Driver

A

C

D B

SELECT * FROM numberOfRequests WHERE cluster= ‘cluster 2’

AND date IN (‘2015-09-21’, ‘2015-09-22’);

Driver

Page 12: DataStax: A deep look at the CQL WHERE clause

Partition key restrictions

12© 2015. All Rights Reserved.

Driver

A

C

D B

SELECT * FROM numberOfRequests WHERE cluster = ‘cluster 2’

AND date = ‘2015-09-21’;

Driver

…with TokenAwarePolicy

and asynchronous calls

SELECT * FROM numberOfRequests WHERE cluster = ‘cluster 2’

AND date = ‘2015-09-22’;

Page 13: DataStax: A deep look at the CQL WHERE clause

Partition key restrictions

13© 2015. All Rights Reserved.

Driver

SELECT * FROM numberOfRequests WHERE cluster= ‘cluster 1’

AND date >= ‘2015-09-21’;

InvalidRequest: code=2200 [Invalid query]

message="Only EQ and IN relation are supported on the partition key (unless

you use the token() function)"

Page 14: DataStax: A deep look at the CQL WHERE clause

Partition key restrictions

14© 2015. All Rights Reserved.

Driver

Cluster Date Node

‘cluster 1’ ‘2015-09-21’ A

‘cluster 1’ ‘2015-09-22’ D

‘cluster 2’ ‘2015-09-21’ D

‘cluster 2’ ‘2015-09-22’ C

A

C

D B

SELECT * FROM numberOfRequests WHERE cluster= ‘cluster 1’

AND date >= ‘2015-09-21’;

Page 15: DataStax: A deep look at the CQL WHERE clause

Partition key restrictions

15© 2015. All Rights Reserved.

Driver

• Murmur3Partitioner (default): uniformly distributes data across

the cluster based on MurmurHash hash values.

• RandomPartitioner: uniformly distributes data across the

cluster based on MD5 hash values.

• ByteOrderedPartitioner: keeps an ordered distribution of data

lexically by key bytes

Page 16: DataStax: A deep look at the CQL WHERE clause

Partition key restrictions

16© 2015. All Rights Reserved.

Driver

SELECT * FROM numberOfRequests

WHERE token(cluster, date) > token(‘cluster 1’, ‘2015-09-21’)

AND token(cluster, date) < token(‘cluster 1’, ‘2015-09-23’);

Page 17: DataStax: A deep look at the CQL WHERE clause

Partition key restrictions (SELECT)

17© 2015. All Rights Reserved.

• Without secondary index, either all partition key components must be

restricted or none of them

• = restrictions are allowed on any partition key component

• IN restrictions are allowed on any partition key component since 2.2

• Prior to 2.2, IN restrictions were only allowed on the last partition key

component

• =, >, >=, <= and < restrictions are allowed with the token function

Page 18: DataStax: A deep look at the CQL WHERE clause

Clustering column restrictions

18© 2015. All Rights Reserved.

CREATE TABLE numberOfRequests (

cluster text,

date text,

datacenter text,

server inet,

time text,

count int,

PRIMARY KEY((cluster, date), datacenter, server, time))

Page 19: DataStax: A deep look at the CQL WHERE clause

Clustering column restrictions

19© 2015. All Rights Reserved.

Datacenter Server Time Count

Iowa 196.8.7.134 00:00 130

Iowa 196.8.7.134 00:01 125

Iowa 196.8.7.134 00:02 97

Iowa 196.8.7.135 00:00 178

Iowa 196.8.7.135 00:01 201

[Iowa, 196.8.7.134, 00:02, count] :

97

In the Memtables:

[Iowa, 196.8.7.134, 00:00, count] :

130

[Iowa, 196.8.7.134, 00:01, count] :

125[Iowa, 196.8.7.135, 00:00, count] :

178

[Iowa, 196.8.7.135, 00:01, count] :

201

[Iowa, 196.8.7.134, 00:00, count] :

130

Cell nameCell

Column name

Page 20: DataStax: A deep look at the CQL WHERE clause

Clustering column restrictions

20© 2015. All Rights Reserved.

Datacenter Server Time Count

Iowa 196.8.7.134 00:00 130

Iowa 196.8.7.134 00:01 125

Iowa 196.8.7.134 00:02 97

Iowa 196.8.7.135 00:00 178

Iowa 196.8.7.135 00:01 201

[Iowa, 196.8.7.134, 00:02, count] :

97

In the SSTables:

[Iowa, 196.8.7.134, 00:00, count] :

130

[Iowa, 196.8.7.134, 00:01, count] :

125

[Iowa, 196.8.7.135, 00:00, count] :

178

[Iowa, 196.8.7.135, 00:01, count] :

201

[Iowa, 196.8.7.134, 00:00, count] :

130

Cell nameCell

Column name

Page 21: DataStax: A deep look at the CQL WHERE clause

Clustering column restrictions

21© 2015. All Rights Reserved.

[Iowa, 196.8.7.134, 00:02, count] :

97

In the Memtables:

[Iowa, 196.8.7.134, 00:00, count] :

130

[Iowa, 196.8.7.134, 00:01, count] :

125[Iowa, 196.8.7.135, 00:00, count] :

178

[Iowa, 196.8.7.135, 00:01, count] :

201

SELECT * FROM numberOfRequests

WHERE cluster = ‘cluster1’ AND date =‘2015-09-21’

AND datacenter = ‘Iowa’ AND server = ‘196.8.7.135’ AND time = ‘00:00’;

[Iowa,196.8.7.135,00:00]

Page 22: DataStax: A deep look at the CQL WHERE clause

Clustering column restrictions

22© 2015. All Rights Reserved.

SELECT * FROM numberOfRequests

WHERE cluster = ‘cluster1’ AND date =‘2015-09-21’

AND datacenter = ‘Iowa’ AND server = ‘196.8.7.135’ AND time = ‘00:00’;

[Iowa,196.8.7.135,00:00]

…[Iowa, 196.8.7.134, 00:02, count] :

97

In the SSTables:

[Iowa, 196.8.7.134, 00:00, count] :

130

[Iowa, 196.8.7.134, 00:01, count] :

125

[Iowa, 196.8.7.135, 00:00, count] :

178

[Iowa, 196.8.7.135, 00:01, count] :

201

Page 23: DataStax: A deep look at the CQL WHERE clause

Clustering column restrictions

23© 2015. All Rights Reserved.

[Iowa, 196.8.7.134, 00:02, count] :

97

In the Memtables:

[Iowa, 196.8.7.134, 00:00, count] :

130

[Iowa, 196.8.7.134, 00:01, count] :

125[Iowa, 196.8.7.135, 00:00, count] :

178

[Iowa, 196.8.7.135, 00:01, count] :

201

SELECT * FROM numberOfRequests

WHERE cluster = ‘cluster1’ AND date =‘2015-09-21’

AND datacenter = ‘Iowa’ AND server = ‘196.8.7.135’;

[Iowa,196.8.7.135]

Page 24: DataStax: A deep look at the CQL WHERE clause

Clustering column restrictions

24© 2015. All Rights Reserved.

SELECT * FROM numberOfRequests

WHERE cluster = ‘cluster1’ AND date =‘2015-09-21’

AND datacenter = ‘Iowa’ AND server = ‘196.8.7.135’;

[Iowa,196.8.7.135]

…[Iowa, 196.8.7.134, 00:02, count] :

97

In the SSTables:

[Iowa, 196.8.7.134, 00:00, count] :

130

[Iowa, 196.8.7.134, 00:01, count] :

125

[Iowa, 196.8.7.135, 00:00, count] :

178

[Iowa, 196.8.7.135, 00:01, count] :

201

Page 25: DataStax: A deep look at the CQL WHERE clause

Clustering column restrictions

SELECT * FROM numberOfRequests

WHERE cluster = ‘cluster1’ AND date =‘2015-09-21’

AND time = ‘00:00’;

[?,?,00:00]

InvalidRequest: code=2200 [Invalid query]

message="PRIMARY KEY column "time" cannot be restricted as preceding

column "datacenter" is not restricted"

Page 26: DataStax: A deep look at the CQL WHERE clause

Clustering column restrictions

26© 2015. All Rights Reserved.

AND datacenter = ‘Iowa’

AND server IN (‘196.8.7.134’, ‘196.8.7.135’)

AND time = ‘00:00’;

In 2.2:

[Iowa,196.8.7.134,00:00]

[Iowa,196.8.7.135,00:00]

…[Iowa, 196.8.7.134, 00:02, count] :

97

In the SSTables:

[Iowa, 196.8.7.134, 00:00, count] :

130

[Iowa, 196.8.7.134, 00:01, count] :

125

[Iowa, 196.8.7.135, 00:00, count] :

178

[Iowa, 196.8.7.135, 00:01, count] :

201

Page 27: DataStax: A deep look at the CQL WHERE clause

Clustering column restrictions

27© 2015. All Rights Reserved.

AND datacenter = ‘Iowa’

AND server IN (‘196.8.7.134’, ‘196.8.7.135’)

AND time = ‘00:00’;

In 2.1:

InvalidRequest: code=2200 [Invalid query]

message="Clustering column "server" cannot be restricted by an IN relation"

Page 28: DataStax: A deep look at the CQL WHERE clause

Clustering column restrictions

28© 2015. All Rights Reserved.

= multi-column restriction:

(clustering1, clustering2, clustering3) = (?, ?, ?)

IN multi-column restriction:

(clustering1, clustering2, clustering3) IN ((?, ?, ?), (?, ?, ?))

Slice multi-column restriction:

(clustering1, clustering2, clustering3) > (?, ?, ?)

(clustering1, clustering2, clustering3) >= (?, ?, ?)

(clustering1, clustering2, clustering3) <= (?, ?, ?)

(clustering1, clustering2, clustering3) < (?, ?, ?)

Page 29: DataStax: A deep look at the CQL WHERE clause

Clustering column restrictions

29© 2015. All Rights Reserved.

AND datacenter = ‘Iowa’

AND (server, time) IN ((‘196.8.7.134’, ‘00:00’),

(‘196.8.7.135’, ‘00:00’));

In 2.1:

[Iowa,196.8.7.134,00:00]

[Iowa,196.8.7.135,00:00]

…[Iowa, 196.8.7.134, 00:02, count] :

97

In the SSTables:

[Iowa, 196.8.7.134, 00:00, count] :

130

[Iowa, 196.8.7.134, 00:01, count] :

125

[Iowa, 196.8.7.135, 00:00, count] :

178

[Iowa, 196.8.7.135, 00:01, count] :

201

Page 30: DataStax: A deep look at the CQL WHERE clause

Clustering column restrictions

30© 2015. All Rights Reserved.

AND datacenter = ‘Iowa’

AND server = ‘196.8.7.134’

AND time > ’00:00’;

from after [Iowa,196.8.7.134,00:00]

to end of [Iowa,196.8.7.134]

…[Iowa, 196.8.7.134, 00:02, count] :

97

In the SSTables:

[Iowa, 196.8.7.134, 00:00, count] :

130

[Iowa, 196.8.7.134, 00:01, count] :

125

[Iowa, 196.8.7.135, 00:00, count] :

178[Iowa, 196.8.7.135, 00:01, count] :

201

Page 31: DataStax: A deep look at the CQL WHERE clause

Clustering column restrictions (SELECT)

31© 2015. All Rights Reserved.

• Without secondary index, a clustering column cannot be restricted if

one of the previous ones was not

• = restrictions (single and multi) are allowed on any clustering column

• IN restrictions (single and multi) are allowed on any clustering column

since 2.2

• Prior to 2.2, IN restrictions (single and multi) were only allowed on the

last clustering column or set of clustering columns

• >, >=, <=, < restrictions (single and multi) are only allowed on the last

restricted clustering column or set of clustering columns

• CONTAINS and CONTAINS KEY restrictions are only allowed on

indexed collections

Page 32: DataStax: A deep look at the CQL WHERE clause

Secondary index queries

32© 2015. All Rights Reserved.

CREATE TABLE numberOfRequests (

cluster text,

date text,

datacenter text,

server inet,

time text,

count int,

PRIMARY KEY((cluster, date), datacenter, server, time));

CREATE INDEX ON numberOfRequests (time);…

Page 33: DataStax: A deep look at the CQL WHERE clause

Secondary index queries

33© 2015. All Rights Reserved.

CREATE INDEX ON numberOfRequests (time);

CREATE LOCAL TABLE numberOfRequests_time_idx (

time text,

cluster text,

date text,

datacenter text,

server inet,

PRIMARY KEY(time, cluster, date, datacenter, server);…

Table Partition Key

Table remaining clustering columns

Page 34: DataStax: A deep look at the CQL WHERE clause

IDX-BIDX-D

IDX-C

IDX-A

Secondary index queries

34© 2015. All Rights Reserved.

Driver

A

C

D B

SELECT * FROM numberOfRequests WHERE time = ‘12:00’;

Driver

Page 35: DataStax: A deep look at the CQL WHERE clause

Secondary index queries

35© 2015. All Rights Reserved.

Driver

SELECT * FROM numberOfRequests WHERE time = ‘12:00’;

idx

SELECT * FROM numberOfRequests_time_idx

WHERE time = ‘12:00’;

Results (Primary Keys)

tableSELECT with full PK;

[For each]

Add to rows

Page 36: DataStax: A deep look at the CQL WHERE clause

Secondary index queries

36© 2015. All Rights Reserved.

Driver

SELECT * FROM numberOfRequests WHERE time >= ‘12:00’;

InvalidRequest: code=2200 [Invalid query]

message="PRIMARY KEY column "time" cannot be restricted as preceding

column "datacenter" is not restricted"

Direct queries on secondary index support only =, CONTAINS or CONTAINS

KEY restrictions.

Page 37: DataStax: A deep look at the CQL WHERE clause

Secondary index queries

37© 2015. All Rights Reserved.

Driver

SELECT * FROM numberOfRequests WHERE time = ‘12:00’

AND count >= 500 ALLOW FILTERING;

idx

SELECT * FROM numberOfRequests_time_idx

WHERE time = ‘12:00’;

Results (Primary Keys)

tableSELECT with full PK;

[For each]

Add to rows

[if count >= 500]

Page 38: DataStax: A deep look at the CQL WHERE clause

IDX-BIDX-D

IDX-C

IDX-A

Secondary index queries

38© 2015. All Rights Reserved.

Driver

A

C

D B

SELECT * FROM numberOfRequests

WHERE cluster = ‘cluster 1’ AND date = ‘2015-09-21’AND time = ‘12:00’;

Driver

Page 39: DataStax: A deep look at the CQL WHERE clause

Secondary index queries

39© 2015. All Rights Reserved.

Driver

SELECT * FROM numberOfRequests

WHERE cluster = ‘cluster 1’ AND date = ‘2015-09-21’ AND time = ‘12:00’;

idx

SELECT * FROM numberOfRequests_time_idx

WHERE time = ‘12:00’ AND cluster = ‘1’ AND

date = ‘2015-09-21’;

Results (Primary Keys)

tableSELECT with full PK

[For each]

Add to rows

Page 40: DataStax: A deep look at the CQL WHERE clause

40© 2015. All Rights Reserved.

Driver

UPDATE/DELETE statements

Page 41: DataStax: A deep look at the CQL WHERE clause

UPDATE statements

41© 2015. All Rights Reserved.

Driver

In the UPDATE statements all the primary key columns must be restricted and

the only allowed restrictions are:

• Prior to 3.0:

• Single column = restriction on any partition key or clustering column

• Single column IN restriction on the last partition key column

• In 3.0:

• = and IN single column restrictions on any partition key column

• = and IN single or multi column restrictions on any clustering column

Page 42: DataStax: A deep look at the CQL WHERE clause

DELETE statements

42© 2015. All Rights Reserved.

Driver

Before 3.0, in the DELETE statements all the primary key columns must be

restricted and the only allowed restrictions were:

• Single column = restriction on any partition key or clustering column

• Single column IN restriction on the last partition key column

Page 43: DataStax: A deep look at the CQL WHERE clause

DELETE statements

43© 2015. All Rights Reserved.

Driver

Since 3.0:

• The partition key columns must be restricted by = or IN restrictions

• A clustering column might not be restricted if none of the following is

• Clustering columns can be restricted by:

• Single or multi column = restriction

• Single or multi column IN restriction

• Single or multi column >, >=, <=, < restriction

Page 44: DataStax: A deep look at the CQL WHERE clause

© 2015. All Rights Reserved. 44

Design your tables for the queries

you want to perform.

Page 45: DataStax: A deep look at the CQL WHERE clause

Thank you