mersalin-josh
Elasticsearch Training Document
You know, for search
What is Elasticsearch?
Elasticsearch is a tool for querying written words.
Elasticsearch is a near-real-time search platform.
Terminology
MySQL                      Elasticsearch
Database                   Index
Table                      Type
Row                        Document
Column                     Field
Schema                     Mapping
Index                      Everything is indexed
SQL                        Query DSL
SELECT * FROM table …      GET http://…
UPDATE table SET …         PUT http://…
Index Example
Term            Count   Docs
4               1       <3>
Apache          1       <3>
Cookbook        1       <3>
ElasticSearch   2       <1><2>
Mastering       1       <2>
Server          1       <1>
Solr            1       <3>

Documents:
1. ElasticSearch Server (document 1)
2. Mastering ElasticSearch (document 2)
3. Apache Solr 4 Cookbook (document 3)
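The term table for these three titles can be rebuilt with a few lines of Python. This is only a minimal sketch of an inverted index: it assumes a naive whitespace tokenizer with no lowercasing or stemming, and the function name build_inverted_index is made up for illustration. It is not how Lucene stores its index internally.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to the sorted list of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        # Naive tokenizer (assumption): split on whitespace, keep case as-is.
        for term in text.split():
            index[term].add(doc_id)
    # Sort terms and ids so the result reads like the slide's table.
    return {term: sorted(ids) for term, ids in sorted(index.items())}


titles = {
    1: "ElasticSearch Server",
    2: "Mastering ElasticSearch",
    3: "Apache Solr 4 Cookbook",
}

inverted_index = build_inverted_index(titles)
for term, doc_ids in inverted_index.items():
    # Columns: term, count, documents
    print(term, len(doc_ids), doc_ids)

# Searching is now a dictionary lookup instead of a scan over every title.
print(inverted_index.get("ElasticSearch", []))
```

Looking a term up in the dictionary costs the same no matter how many titles were indexed, which is the point of searching the index rather than the text.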
Indexing
Elasticsearch achieves fast search responses because it searches an index rather than scanning the text directly.
Cluster
A cluster is a collection of one or more nodes (servers) that together hold your entire data and provide federated indexing and search capabilities across all nodes.
Node
A node is a running instance of Elasticsearch.
Document
Most entities or objects in most applications can be serialized into a JSON object, with keys and values.
{
  "name": "John Smith",
  "age": 42,
  "confirmed": true,
  "join_date": "2014-06-01",
  "home": {
    "lat": 51.5,
    "lon": 0.1
  },
  "accounts": [
    { "type": "facebook", "id": "johnsmith" },
    { "type": "twitter", "id": "johnsmith" }
  ]
}
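As a rough illustration of that serialization step, the sketch below turns an application object into a JSON document with Python's standard library. The Person and Account classes are hypothetical and exist only for this example; the field names follow the sample document (the "home" coordinates are left out to keep it short).

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Account:  # hypothetical application class
    type: str
    id: str


@dataclass
class Person:  # hypothetical application class
    name: str
    age: int
    confirmed: bool
    join_date: str
    accounts: list = field(default_factory=list)


person = Person(
    name="John Smith",
    age=42,
    confirmed=True,
    join_date="2014-06-01",
    accounts=[Account("facebook", "johnsmith"), Account("twitter", "johnsmith")],
)

# asdict() walks nested dataclasses, so the accounts become plain dicts.
document = json.dumps(asdict(person), indent=2)
print(document)
```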
Type
Within an index, you can define one or more types. A type is a logical category/partition of your index whose semantics is completely up to you. In general, a type is defined for documents that have a set of common fields.
Index
An index is a collection of documents that have somewhat similar characteristics. For example, you can have an index for customer data, another index for a product catalog, and yet another index for order data.
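Index, type, and document id together address a single document. The sketch below assumes the /index/type/id path layout of the Elasticsearch 1.x REST API; the index name "customers" and type name "customer" are invented for the example.

```python
from urllib.parse import quote


def document_path(index, doc_type, doc_id):
    """Build the REST path for one document, assuming /index/type/id addressing."""
    parts = (index, doc_type, str(doc_id))
    # Percent-encode each segment so unusual ids cannot break the path.
    return "/" + "/".join(quote(part, safe="") for part in parts)


path = document_path("customers", "customer", 1)
print(path)  # /customers/customer/1
```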
Shard
• A single Lucene index
• Automatically managed by Elasticsearch
• Distributed amongst all nodes in the cluster
Replica
• A copy of the primary shard
• Each primary shard can have zero or more replicas
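Because every primary shard carries the same number of replicas, the number of shard copies the cluster has to place is simple arithmetic. A small sketch, with a hypothetical helper name and example numbers that are not taken from this deck:

```python
def total_shard_copies(primary_shards, replicas_per_primary):
    """Primaries plus all of their replica copies."""
    if primary_shards < 1 or replicas_per_primary < 0:
        raise ValueError("need at least one primary and zero or more replicas")
    return primary_shards * (1 + replicas_per_primary)


# Example values (assumed): 5 primaries, 1 replica each.
print(total_shard_copies(5, 1))  # 10
# Zero replicas leaves only the primaries.
print(total_shard_copies(3, 0))  # 3
```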
Workshop
Installing Elasticsearch
• curl -L -O http://download.elasticsearch.org/PATH/TO/LATEST/$VERSION.zip
• unzip elasticsearch-$VERSION.zip
• ./bin/plugin -i elasticsearch/marvel/latest
Running Elasticsearch
• Start: ./bin/elasticsearch
• Test it out by opening another terminal window and running:
  curl 'http://localhost:9200/?pretty'
{
  "status": 200,
  "name": "Shrunken Bones",
  "version": {
    "number": "1.1.0",
    "lucene_version": "4.7"
  },
  "tagline": "You Know, for Search"
}
Talking to Elasticsearch
• Java API
• RESTful API with JSON over HTTP
Request
• A request to Elasticsearch consists of the same parts as any HTTP request.
For instance, to count the number of documents in the cluster, we could use:
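One way to express such a count request is sketched below with Python's standard library. It assumes a node on localhost:9200, the _count endpoint, and a match_all query body; the Content-Type header is something newer Elasticsearch versions expect and is included here as a precaution. The request is only constructed, not sent, so the sketch runs without a server.

```python
import json
import urllib.request


def build_count_request(host="localhost", port=9200):
    """Assemble the parts of the HTTP request: verb, URL, headers, body."""
    url = f"http://{host}:{port}/_count?pretty"
    # match_all selects every document, so the count covers the whole cluster.
    body = json.dumps({"query": {"match_all": {}}}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="GET",
        headers={"Content-Type": "application/json"},
    )


request = build_count_request()
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

With a node running, passing the request to urllib.request.urlopen(request) would send it.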
Response
Elasticsearch returns an HTTP status code like 200 OK and (except for HEAD requests) a JSON-encoded response body. The above curl request would respond with a JSON body like the following:
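Handling such a response comes down to checking the status code and decoding the body when there is one. In the sketch below the sample body is illustrative only (a count plus shard statistics, with made-up numbers), and decode_response is a hypothetical helper, not part of any Elasticsearch client.

```python
import json


def decode_response(method, status, raw_body):
    """Return (status, parsed JSON), or (status, None) when no body is expected."""
    # HEAD responses carry a status code but no body.
    if method.upper() == "HEAD" or not raw_body:
        return status, None
    return status, json.loads(raw_body)


# Illustrative body only; real numbers depend on the cluster.
sample_body = b'{"count": 0, "_shards": {"total": 5, "successful": 5, "failed": 0}}'

status, payload = decode_response("GET", 200, sample_body)
print(status, payload["count"])

head_status, head_payload = decode_response("HEAD", 200, b"")
print(head_status, head_payload)
```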
Indexing employee documents
Retrieving Document
Search Lite