http 'localhost:9200/github/repository/_search?q=language:Javascript%20+forks_count:%3E20000&sort=forks_count:desc&size=3'
Lite search
Expects all parameters to be passed in the query string and properly encoded, e.g.:
http localhost:9200/github/repository/_search?q=name:angular.js
Based on the _search API:
http localhost:9200/_search
http localhost:9200/user,repository
http localhost:9200/{index}/{type}/_search?q=field:value...
Lite search
Supports pagination:
http 'localhost:9200/github/repository/_search?size=2&from=50'
Supports required (+) and prohibited (-) conditions:
http 'localhost:9200/github/repository/_search?q=+language:(php%20css)'
Supports sorting:
http 'localhost:9200/github/repository/_search?q=language:Java&sort=watchers_count:desc'
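The encoding requirement above can be sketched in Python. This is a minimal illustration using only the standard library; the helper name lite_search_url is hypothetical and not part of any Elasticsearch client. Note that quote encodes '+' as %2B, so the operator cannot be mistaken for an encoded space.

```python
from urllib.parse import quote, urlencode

def lite_search_url(base, q, params=None):
    """Build a lite-search URL (hypothetical helper) with every value percent-encoded."""
    query = {"q": q}
    query.update(params or {})
    # quote_via=quote encodes spaces as %20 and '+' as %2B
    return base + "?" + urlencode(query, quote_via=quote)

url = lite_search_url(
    "http://localhost:9200/github/repository/_search",
    "+language:(php css)",
    {"size": 2, "from": 50},
)
print(url)
```

The resulting string can be passed to any HTTP client or pasted into a browser.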
Lite search
PROS
● Powerful
● Convenient for development and ad-hoc queries
● End users can run queries directly from their web browser
CONS
● Queries must be carefully encoded
● An open API can allow slow queries that degrade or even kill your cluster
● Not efficient for complex queries
FULL-BODY SEARCH
● Uses the same _search API
● Transfers parameters in the request body, e.g.:
curl localhost:9200/github/repository/_search -d '{"size": 2, "from": 10}'
● RFC 7231 defines no semantics for a body sent with a GET request, so handling depends on the server implementation. For that reason both GET and POST are allowed.
● Instead of encoded URLs, queries are written in a convenient search query domain-specific language (DSL)
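A minimal Python sketch of the same full-body request, using POST to sidestep the GET-with-body ambiguity. The helper name build_search_request is hypothetical, and the Content-Type header is an assumption that newer Elasticsearch versions expect; the request is only built here, not sent.

```python
import json
from urllib.request import Request

def build_search_request(url, body):
    """Prepare a POST search request with a JSON body (hypothetical helper)."""
    payload = json.dumps(body).encode("utf-8")
    return Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},  # assumed to be required by the server
        method="POST",
    )

req = build_search_request(
    "http://localhost:9200/github/repository/_search",
    {"size": 2, "from": 10},
)
print(req.get_method(), req.full_url)
```

Passing req to urllib.request.urlopen would send it to a running cluster.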
SEARCH QUERY CLAUSES
● Leaf clauses - compare field to a query string
(match, term, range)
● Compound clauses - combine other query clauses
(bool, dis_max)
SEARCH QUERY DSL EXAMPLE
curl localhost:9200/github/repository/_search?pretty -d '{
  "query": {
    "match": {
      "language": "Javascript"
    }
  }
}'
curl localhost:9200/github/repository/_search?pretty -d '{
  "query": {
    "bool": {
      "must": {"match": {"language": "Javascript"}},
      "should": {"match": {"description": "library"}}
    }
  }
}'
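Because leaf and compound clauses are plain JSON objects, they compose naturally in code. The sketch below rebuilds the bool example above; the helper names match and bool_query are hypothetical, not functions from an Elasticsearch client.

```python
import json

def match(field, text):
    """Leaf clause: full-text match of one field."""
    return {"match": {field: text}}

def bool_query(must=None, should=None, must_not=None):
    """Compound clause: combine other clauses, omitting the unused sections."""
    clauses = {"must": must, "should": should, "must_not": must_not}
    return {"bool": {name: clause for name, clause in clauses.items() if clause is not None}}

body = {
    "query": bool_query(
        must=match("language", "Javascript"),
        should=match("description", "library"),
    )
}
print(json.dumps(body, indent=2))
```

The printed JSON can be used as the -d argument of the curl commands shown here.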
SEARCH QUERY MATCHERS
FULL TEXT QUERIES
● match
● multi_match
● common_terms
● query_string
● simple_query_string
MATCH & MULTI_MATCH
curl localhost:9200/github/repository/_search?pretty -d '{
  "query": {
    "match": {
      "language": "Javascript"
    }
  }
}'
curl localhost:9200/github/repository/_search?pretty -d '{
  "query": {
    "multi_match": {
      "query": "javascript",
      "fields": ["language", "description"]
    }
  }
}'
QUERY STRING QUERY
curl localhost:9200/github/repository/_search?pretty -d '{
  "query": {
    "query_string": {
      "query": "language:(C OR PHP) AND watchers_count:[15000 TO *]"
    }
  }
}'
Supports compact Lucene query string syntax
SIMPLE QUERY STRING QUERY
curl localhost:9200/github/repository/_search?pretty -d '{
  "query": {
    "simple_query_string": {
      "fields": ["description"],
      "query": "(framework^2 realtime) + -(web port client)"
    }
  }
}'
Has a simplified query syntax
COMMON TERMS QUERY
curl localhost:9200/github/repository/_search?pretty -d '{
  "query": {
    "common": {
      "description": {
        "query": "for is and web",
        "cutoff_frequency": 0.001
      }
    }
  }
}'
Divides query terms into two groups, split by cutoff_frequency:
● More important - low frequency (searched first)
● Less important - high frequency (applied afterwards, to score the documents already matched)
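The grouping step can be sketched as below. This is a simplified model, not the Lucene implementation: the function name, the document frequencies, and the index size are all made-up values chosen for illustration.

```python
def split_by_cutoff(terms, doc_freq, total_docs, cutoff_frequency):
    """Split terms into (low_frequency, high_frequency) groups by relative document frequency."""
    low, high = [], []
    for term in terms:
        ratio = doc_freq.get(term, 0) / total_docs
        # Terms above the cutoff are treated as common (less important)
        (high if ratio > cutoff_frequency else low).append(term)
    return low, high

# Hypothetical document frequencies in an index of 10,000 documents
doc_freq = {"for": 4000, "is": 3500, "and": 5000, "web": 8}
low, high = split_by_cutoff("for is and web".split(), doc_freq, 10_000, 0.001)
print("low frequency:", low)
print("high frequency:", high)
```

With these assumed numbers only "web" falls below the 0.001 cutoff, so it alone selects the candidate documents.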
SEARCH QUERY FILTERS
● term
● terms
● range
● exists
● missing
● bool
● prefix
● wildcard
● regexp
● fuzzy
TERM AND RANGE FILTERS
curl localhost:9200/github/repository/_search?pretty -d '{
  "query": {
    "term": {
      "language": "C++"
    }
  }
}'
curl localhost:9200/github/repository/_search?pretty -d '{
  "query": {
    "range": {
      "watchers_count": {
        "gte": 5000,
        "lte": 15000
      }
    }
  }
}'
EXISTS AND MISSING FILTERS
curl localhost:9200/github/repository/_search?pretty -d '{
  "query": {
    "filtered": {
      "query": {
        "match_all": {}
      },
      "filter": {
        "bool": {
          "must_not": {
            "exists": {
              "field": "language"
            }
          }
        }
      }
    }
  }
}'
BOOL FILTER
● must
○ All clauses must match, like AND
● must_not
○ No clause may match, like NOT
● should
○ At least one of the clauses must match, like OR
curl localhost:9200/github/repository/_search?pretty -d '{
  "query": {
    "filtered": {
      "query": {
        "match_all": {}
      },
      "filter": {
        "bool": {
          "must": {
            "term": {
              "language": "JavaScript"
            }
          },
          "should": {
            "range": {
              "forks_count": {
                "gt": 10000
              }
            }
          }
        }
      }
    }
  }
}'
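The must / must_not / should semantics can be modelled with plain predicates. This is an in-memory sketch of the logic described on this slide, not how Elasticsearch evaluates filters; the function name and the sample documents are hypothetical.

```python
def bool_filter(doc, must=(), must_not=(), should=()):
    """Return True if doc passes the clauses: must ~ AND, must_not ~ NOT, should ~ OR."""
    if not all(clause(doc) for clause in must):
        return False
    if any(clause(doc) for clause in must_not):
        return False
    # When should clauses are present, at least one of them has to match
    if should and not any(clause(doc) for clause in should):
        return False
    return True

def is_javascript(doc):
    return doc.get("language") == "JavaScript"

def has_many_forks(doc):
    return doc.get("forks_count", 0) > 10000

# Hypothetical documents, for illustration only
popular = {"language": "JavaScript", "forks_count": 20000}
small = {"language": "JavaScript", "forks_count": 5}
print(bool_filter(popular, must=[is_javascript], should=[has_many_forks]))
print(bool_filter(small, must=[is_javascript], should=[has_many_forks]))
```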
COMBINING FILTERS AND MATCHERS
curl localhost:9200/github/repository/_search?pretty -d '{
  "query": {
    "filtered": {
      "query": {
        "match": {
          "has_issues": true
        }
      },
      "filter": {
        "term": {
          "language": "Objective-C"
        }
      }
    }
  }
}'
SORTING
curl localhost:9200/github/repository/_search?pretty -d '{
  "query": {
    "filtered": {
      "query": {
        "match": {
          "has_issues": true
        }
      },
      "filter": {
        "term": {
          "language": "Objective-C"
        }
      }
    }
  },
  "sort": {
    "forks_count": {
      "order": "desc"
    }
  }
}'
RELEVANCE
● How well a retrieved document or set of documents meets the information need (criteria) of the user
● A positive floating-point number stored in the _score property
● Calculated by the term frequency/inverse document frequency (TF/IDF) algorithm:
○ Term frequency (tf): the more often a term appears in the field, the more relevant
○ Inverse document frequency (idf): the more often a term appears in the index, the less relevant
○ Field-length norm (fieldNorm): the shorter the field, the more relevant
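The three factors can be written out using the formulas of Lucene's classic TF/IDF similarity. This is a simplified sketch: it ignores query normalization, boosts, and the lossy one-byte encoding of field norms, so real _score values will differ. The function names are chosen for illustration.

```python
import math

def tf(freq):
    """Term frequency: grows with the number of occurrences in the field."""
    return math.sqrt(freq)

def idf(num_docs, doc_freq):
    """Inverse document frequency: shrinks as more documents contain the term."""
    return 1 + math.log(num_docs / (doc_freq + 1))

def field_norm(num_terms):
    """Field-length norm: shorter fields weigh more."""
    return 1 / math.sqrt(num_terms)

def field_weight(freq, num_docs, doc_freq, num_terms):
    """Weight of one term in one field: product of the three factors."""
    return tf(freq) * idf(num_docs, doc_freq) * field_norm(num_terms)

# A rarer term (lower doc_freq) yields a higher weight, all else being equal
print(field_weight(freq=1, num_docs=1000, doc_freq=10, num_terms=4))
print(field_weight(freq=1, num_docs=1000, doc_freq=500, num_terms=4))
```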
RELEVANCE EXPLANATION
curl localhost:9200/github/repository/_search?pretty -d '{
  "query": {
    "term": {
      "language": "C++"
    }
  },
  "size": 1,
  "explain": true
}'
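With "explain": true, each hit carries an _explanation tree whose nodes have value, description, and details fields. A small Python sketch for printing such a tree follows; the helper name is hypothetical, and the sample node uses placeholder descriptions and values rather than real cluster output.

```python
def format_explanation(node, depth=0):
    """Flatten an explanation tree into indented 'value description' lines."""
    lines = ["  " * depth + f"{node['value']} {node['description']}"]
    for child in node.get("details", []):
        lines.extend(format_explanation(child, depth + 1))
    return lines

# Placeholder tree for illustration only, not output from a real cluster
sample = {
    "value": 1.0,
    "description": "parent",
    "details": [{"value": 0.5, "description": "child", "details": []}],
}
print("\n".join(format_explanation(sample)))
```

In practice the node would be taken from hits.hits[0]._explanation in the parsed JSON response.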