With search playing such a big part in modern applications, provisioning robust search solutions that provide the proper level of security and low maintenance costs in multitenant applications becomes an entirely new challenge. In this session, we define the requirements for multitenant search and review the patterns and solutions available to tackle this challenge.
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Multitenant Search
JDD 2014, Krakow - PL
Pablo Barros, Applications Architect
October 14, 2014
• The opinions and views expressed in this talk are my own, and do not necessarily reflect the opinions or views of my employer.
About me
Giveaway
• Elasticsearch Server – Second Edition – By Rafal Kuc
Agenda
1. Key Concepts and Pitfalls of Multitenancy
2. Designing the Search Index
3. Defining the Cluster Topology
4. Integrating with your Application
5. Q&A
Overview: Key Concepts and Pitfalls of Multitenant Search
Defining Multitenancy
“Single software instance serving multiple customers.”
Benefits
• Sharing of Resources
• Lower Costs
• Easier Horizontal Scaling
• Quicker onboarding of new Customers
• Data Aggregation
• Simpler Release Processes
• “Green”
Pitfalls & Risks
• Resource Sharing Limits
• Requires more Customization capabilities
• Higher Complexity
• Data Security
Search Engine Topology
[Diagram labels: Your Application; Search Engine; Search Cluster 1..N; Hub/Tribe Node; connections labeled Read/Write and Read]
Designing the Search Index
Index Logical Granularity
[Diagram: shared index vs. dedicated indices]
Shared Indices
• Schema-less Index
• Pros
  – True Global search
  – Intermixed Results across Customers/Entities
• Cons
  – Cross Tenant Data Security
  – Weaker data separation
  – Index corruption can affect entire Search
  – Diminished ability to index data in parallel
Dedicated Indices
• Pros
  – Better data separation
  – More modular/portable
  – Better parallel indexing capabilities
• Cons
  – More storage
  – Global search is more limited
    • However, some search engines allow searching across indexes and even across clusters (Elasticsearch Tribe Node)
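From the application's side, the difference between the two granularities can be sketched roughly as follows. With a shared index, every query has to carry a tenant filter; with dedicated indices, the index name itself provides the separation. The field name `tenant_id`, the index names, and both helper functions are assumptions made for illustration. The request bodies follow the general shape of the Elasticsearch query DSL in recent versions and should be checked against the version actually in use.

```python
from typing import Any, Dict, Tuple

SHARED_INDEX = "search-shared"  # hypothetical name of the single shared index


def shared_index_request(tenant_id: str, text: str) -> Tuple[str, Dict[str, Any]]:
    """Shared index: the tenant filter is the only thing separating customers."""
    body = {
        "query": {
            "bool": {
                "must": [{"match": {"text": text}}],
                # Forgetting this filter would leak results across tenants.
                "filter": [{"term": {"tenant_id": tenant_id}}],
            }
        }
    }
    return SHARED_INDEX, body


def dedicated_index_request(tenant_id: str, text: str) -> Tuple[str, Dict[str, Any]]:
    """Dedicated index: the (hypothetical) per-tenant index name isolates the data."""
    return f"search-{tenant_id}", {"query": {"match": {"text": text}}}
```

The shared variant makes the "Cross Tenant Data Security" concern concrete: isolation depends on every code path adding the filter, whereas the dedicated variant only has to pick the right index.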
Indexing Process & Storage
Token Pointer
Droid 1, 2, 3
Look 2, 3
Rain 1
Doc 1: …
Doc 2: …
Doc 3: …
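The token-to-pointer table above is an inverted index. A minimal sketch of how such a structure can be built is shown below; the three document texts are made up for the example (the slide elides the real ones), and the tokenizer is deliberately naive: lowercase words only, with no stemming or stop words.

```python
import re
from collections import defaultdict
from typing import Dict, List


def build_inverted_index(docs: Dict[int, str]) -> Dict[str, List[int]]:
    """Map each token to the sorted list of document ids that contain it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for token in re.findall(r"[a-z]+", text.lower()):
            postings[token].add(doc_id)
    return {token: sorted(ids) for token, ids in postings.items()}


# Hypothetical documents chosen to reproduce the table's pointers.
docs = {1: "Droid Rain", 2: "Droid Look", 3: "Look Droid"}
index = build_inverted_index(docs)
```

A query for a token is then a dictionary lookup that returns document ids, which is why reads tend to be cheap compared with writes.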
Indexing with Storage Enabled
"total": 2,
"hits": [
  {
    "id": 1,
    "text": "These are not the <b>droids</b> you are looking for.",
    "date": "2014/10/8"
  },
  {
    "id": 2,
    "text": "However, those are the <b>droids</b> you are looking for.",
    "date": "2014/10/8"
  }
]
Indexing with Storage Disabled
"total": 2,
"hits": [
  {
    "id": 1
  },
  {
    "id": 2
  }
]
Storing and Retrieving Original Indexed Data
Full Document
• Pros
  – Avoid hitting the database on your application
  – Snippet highlighting
• Cons
  – Extra storage on Search Engine file system
  – Access control needs to be built in the index
IDs Only
• Pros
  – Storage on Search Engine file system is light
  – Small response payload
• Cons
  – Reliance on database for reading data to show users
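With the "IDs Only" option, the application takes the ids from the search response and reads the actual rows from its own database, which is also a natural place to re-check access. The sketch below uses a plain dictionary as a stand-in for that database; the `tenant` column, the sample rows, and the `hydrate` helper are hypothetical.

```python
from typing import Any, Dict, List

# Hypothetical stand-in for the application's database, keyed by document id.
DATABASE: Dict[int, Dict[str, Any]] = {
    1: {"tenant": "acme", "text": "first acme document"},
    2: {"tenant": "globex", "text": "a globex document"},
    3: {"tenant": "acme", "text": "second acme document"},
}


def hydrate(hit_ids: List[int], tenant_id: str,
            database: Dict[int, Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Load full rows for search hits, dropping ids the tenant may not see."""
    rows = []
    for doc_id in hit_ids:
        row = database.get(doc_id)  # the hit may have been deleted since indexing
        if row is not None and row["tenant"] == tenant_id:
            rows.append({"id": doc_id, **row})
    return rows
```

The extra database round trip is the cost named in the "Cons" column; in exchange, the index stays small and access rules live in one place.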
Parent-child Relationships
• Defines 1-to-many relationship between entries in different indices
• Convenient when pushing relational data into Index
• Parent can be updated without re-indexing children
[Diagram: Customer (1) to Order (0..*)]
Multiple Languages
• Leverage language auto-detection
• Leverage stop words
• Limit amount of stemming
• Option:
  – Single entry in multiple languages
    • Merge value in different languages into single field
    • Pro: Simple implementation. Search can be performed in any language
    • Con: Match might include homonym in other languages
Defining the Search Cluster Topology
Shards
[Diagram: shards 1 2 3 and 4 5 spread across Node 1 and Node 2]
Replicas
[Diagram: shards 1 2 3 4 5 and their replicas 1R 2R 3R 4R 5R spread across Node 1 and Node 2]
Cluster
• Approach depends on what your framework has to offer
• Elasticsearch provides a lot of support out of the box
• Considerations:
  – Cluster Segmentation (Few Smaller vs Single Large?)
  – Geographical Distribution
  – Searching across Clusters
  – Write/Read Ratio
Hub
• Aware of All Clusters
• Maintains map of Tenant -> Cluster
• Serves as discovery mechanism for the Client Application
• Able to create/pause/move/delete Tenants
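The responsibilities listed for the Hub can be sketched as a small in-memory registry. This is only an illustration of the tenant-to-cluster map and its lifecycle operations; the class, its method names, and the cluster addresses are hypothetical, and a real hub would persist this state and coordinate the data move itself.

```python
from typing import Dict, Set


class Hub:
    """Keeps the Tenant -> Cluster map and answers discovery requests."""

    def __init__(self) -> None:
        self._clusters: Dict[str, str] = {}
        self._paused: Set[str] = set()

    def _require(self, tenant: str) -> None:
        if tenant not in self._clusters:
            raise KeyError(tenant)

    def create(self, tenant: str, cluster: str) -> None:
        if tenant in self._clusters:
            raise ValueError(f"tenant {tenant!r} already exists")
        self._clusters[tenant] = cluster

    def pause(self, tenant: str) -> None:
        self._require(tenant)
        self._paused.add(tenant)

    def move(self, tenant: str, cluster: str) -> None:
        self._require(tenant)
        self._clusters[tenant] = cluster  # only the mapping changes in this sketch

    def delete(self, tenant: str) -> None:
        self._require(tenant)
        del self._clusters[tenant]
        self._paused.discard(tenant)

    def discover(self, tenant: str) -> str:
        """Return the cluster address a client should use for this tenant."""
        self._require(tenant)
        if tenant in self._paused:
            raise RuntimeError(f"tenant {tenant!r} is paused")
        return self._clusters[tenant]
```

A client application would call `discover` before each search or indexing request (or cache the answer briefly), so that moving a tenant to another cluster needs no client-side configuration change.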
Hub Tenant Discovery Service
Integrating with your Application
Indexing vs Querying
• Expected load on writes/reads
• Depends on Problem Domain of Client Application
• Writes are expensive!
  – Especially if not done in bulk
• Reads are fairly cheap
Initial Data Load/Full Re-index
• Perform Actions in Bulk
  – Minimize overall number of Lucene Commits
• Consider enabling External “Versioning”
  – Safely parallelize indexing requests
• Keep track of documents that failed to index
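The three points above can be illustrated together with an in-memory stand-in for a search index. It is not the Elasticsearch client API: `FakeIndex`, `reindex`, and the tuple layout are assumptions for the example. The stand-in accepts a write only when its externally supplied version is higher than the stored one, counts one commit per bulk call, and returns whatever it did not apply so the caller can keep track of it.

```python
from typing import Dict, Iterator, List, Tuple

Doc = Tuple[int, int, str]  # (doc_id, external_version, text)


def batches(items: List[Doc], size: int) -> Iterator[List[Doc]]:
    """Yield consecutive slices of at most `size` documents."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


class FakeIndex:
    """In-memory stand-in for a search index with external versioning."""

    def __init__(self) -> None:
        self.docs: Dict[int, Tuple[int, str]] = {}
        self.commits = 0

    def bulk(self, batch: List[Doc]) -> List[Doc]:
        not_applied = []
        for doc_id, version, text in batch:
            current = self.docs.get(doc_id)
            if current is not None and current[0] >= version:
                not_applied.append((doc_id, version, text))  # stale write
            else:
                self.docs[doc_id] = (version, text)
        self.commits += 1  # one commit per bulk request, not per document
        return not_applied


def reindex(index: FakeIndex, docs: List[Doc], batch_size: int) -> List[Doc]:
    """Index in bulk and collect every document that was not applied."""
    not_applied: List[Doc] = []
    for batch in batches(docs, batch_size):
        not_applied.extend(index.bulk(batch))
    return not_applied
```

Because the newest version wins regardless of arrival order, batches can be sent by several workers without an older copy of a document overwriting a newer one. Genuine indexing errors would be collected in the same list in a real implementation.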
Incremental Indexing
• Monitor Indexing requests delay
• Message customers accordingly
  – e.g. Search Results might not include recently updated entries.
Disaster Recovery
• Take advantage of what your Framework offers you
  – e.g. Replication in Elasticsearch
• Nightly Backups + Replay of changes since Backup creation
• Avoid starting from scratch!
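The "backup plus replay" idea can be sketched as follows: restore the last snapshot, then re-apply only the changes recorded after it was taken. The snapshot, the change-log layout `(timestamp, doc_id, text)`, and the use of `None` to mark a deletion are all assumptions made for this example.

```python
from typing import Dict, List, Optional, Tuple

Change = Tuple[int, int, Optional[str]]  # (timestamp, doc_id, text or None for delete)


def recover(snapshot: Dict[int, str], snapshot_time: int,
            change_log: List[Change]) -> Dict[int, str]:
    """Rebuild the index state from a backup and the changes made since."""
    index = dict(snapshot)  # start from the backup, not from scratch
    for timestamp, doc_id, text in sorted(change_log, key=lambda c: c[0]):
        if timestamp <= snapshot_time:
            continue  # already contained in the backup
        if text is None:
            index.pop(doc_id, None)
        else:
            index[doc_id] = text
    return index
```

Only the changes newer than the backup are replayed, so recovery time depends on how old the backup is rather than on the total size of the data.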
Final Thoughts
• Recent Open-source Tooling (*cough* Elasticsearch) makes it easy
• Consider Carefully:
  – Design and granularity of your tenant in the Search engine
  – Entities and their Relationships
  – Sharding and Replication Schemes
  – Clustering Distribution
    • e.g. per application Installation, geographically, etc.
  – High Availability Mechanisms
Q&A Thank you!