Building a scalable Search architecture in SharePoint 2013
Thuan Nguyen, SharePoint MVP
@nnthuan
Vietnam SharePoint User Group


About Me
- SharePoint Practice Lead, Solution Architect, FPT Software
- Microsoft SharePoint MVP (2011, 2012, 2013, 2014)
- Used to love start-ups, with two SharePoint-based products.
- Now focused on building a SharePoint core standard and framework for the Singapore Government.

Agenda
- Common Misunderstandings
- Architecture & Topology
- Practical Guide
- Question & Answer

Common Misunderstandings
For those who are looking into running multiple Search servers that handle millions of documents.
- For High Availability, create two Search Service Applications.
- There is only one machine playing the Search role in your farm.
- Scale out the Search architecture by adding more servers.
- Starting the Search service is all it takes to make search functionality work.

Architecture & Topology
Logical Architecture
- Crawl
- Content Processing
- Analytics Processing
- Index
- Administration
- Query Processing

Understanding each component will help you design a scalable and maintainable Search for your organization.

Crawl Component
- Responsible for crawling content from different sources:
  - SharePoint sites
  - Exchange
  - Lotus Notes
  - Documentum
  - HTTP websites
- Delivers crawled items to the content processing component.
- The Crawl database stores information about crawled items and crawl history.

dbo.MSSCrawlHistoryLocal

Content Processing

Processes crawled items and passes these items to the index component

Performs linguistic processing at index time (e.g. language detection and entity extraction)

Writes information about links and URLs to the Link database


dbo.MSSQLogResultDocs

Analytics Processing


Analyzes crawled items and how users interact with search results.

When a user performs an action (e.g. views a page), the event is collected in usage files on the WFEs and regularly pushed to the event store, where it is kept until processed.

Results are then returned to the Content Processing Component to be included in the search index

dbo.SearchReportData

Index Component


Receives the processed items from the content processing component and writes them to the search index.

Handles incoming queries, retrieves information from the search index, and sends back the result set to the query processing component.

Index Architecture


An index partition is a logical portion of the entire search index.

Each partition is served by one or more index components (or “replicas”)

Each partition has exactly one primary (or “active”) replica, which is the only one that writes data to the partition.

Other secondary (or “passive”) replicas are there for fault tolerance and increased query throughput.

The index can scale both horizontally (partitions) and vertically (replicas).

Partitions can be added but NOT removed

[Diagram: three index partitions (Partition #1, Partition #2, Partition #3), each served by one primary replica and three secondary replicas (Secondary Replica 1, 2 and 3), hosted on Index Servers 1, 2 & 3; 4, 5 & 6; 7, 8 & 9; and 10, 11 & 12.]
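The partition and replica arithmetic above can be sketched in plain PowerShell. This is only a planning aid under stated assumptions: `Get-IndexLayout` and the `IndexServerN` names are hypothetical, and it assumes one index component per server, with each group of servers holding one replica of every partition. It does not touch SharePoint; in a real farm each row would correspond to one index component in the topology, and the system decides which replica is primary.

```powershell
# Hypothetical planning helper (not a SharePoint cmdlet).
# Lists every index component a design needs: one per partition per replica.
function Get-IndexLayout {
    param(
        [int]$PartitionCount,
        [int]$ReplicasPerPartition
    )
    $server = 1
    for ($replica = 1; $replica -le $ReplicasPerPartition; $replica++) {
        # Partition numbers are zero-based here.
        for ($partition = 0; $partition -lt $PartitionCount; $partition++) {
            [pscustomobject]@{
                Server    = "IndexServer$server"   # placeholder server name
                Partition = $partition
                Replica   = $replica
            }
            $server++
        }
    }
}

# Three partitions with four replicas each, as in the diagram.
$layout = @(Get-IndexLayout -PartitionCount 3 -ReplicasPerPartition 4)
$layout | Format-Table -AutoSize
"Index components needed: $($layout.Count)"
```

Adding a partition adds a whole column of components (one per replica), which is one reason to plan the partition count up front, given that partitions cannot be removed later.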

Query Processing

Analyses and processes search queries and results. The processed query is then submitted to the index component, which returns a set of search results for the query.

Search Administration

Search Admin Component
- Runs a number of system processes required for search
- Is responsible for search provisioning and topology changes
- Coordinates the search components: Content Processing, Query Processing, Analytics, and Indexing

Search Admin DB
- Stores search configuration data:
  - Topology
  - Crawl rules
  - Query rules
  - Managed property mappings
  - Content sources
  - Crawl schedules
- Stores Analytics settings

dbo.MSSConfiguration

Practical Guide

- Assessment
- Design
- Implementation
- Verification

Practical Guide - Assessment
- Don’t hastily touch your SharePoint. Leave it alone!
- Think about your content:
  - What are your content sources (SharePoint document library, Exchange, File Server, etc.)?
  - How much content do you want to search (e.g. 100,000 documents)?
- Assess the number of concurrent users.
- Search database sizing

Practical Guide - Assessment
Sizing factors:
- Total Database Size
- Total Index Size
- Query Component Index Size
- Disk Storage
- Link Database
- Search Admin Database
- Total Crawl Database Size
- Total Crawl Database Log Size
- Analytics Database Size

=> Total database size for Search

Microsoft has already published sizing formulas for the factors above.

Practical Guide - Assessment

What exactly is High Availability for Search?

Business language: end users are never prevented from searching.

Technical language: all logical Search components and Search databases must remain functional at all times.

Two or more Search service applications

Two or more Search servers

Practical Guide - Design
- Don’t hastily touch your SharePoint. Leave it alone!
- Start with one machine hosting all components.

Practical Guide - Design

- Don’t hastily touch your SharePoint. Leave it alone!
- Think about two machines for Search, each hosting a different set of components.
- Keep a redundant set of (Query + Crawl). If one machine goes down, the Query component on the other machine keeps functioning.
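One way to sanity-check a two-machine design on paper is to list which components live on only one server. The sketch below is plain PowerShell and makes several assumptions: the server names, the component split, and the helper `Get-SingleHostedComponent` are all hypothetical and only illustrate the "redundant Query + Crawl" idea, not a recommended placement.

```powershell
# Illustrative placement only: server names and the component split are assumptions.
$placement = @{
    'SearchServer1' = @('Admin', 'Crawl', 'ContentProcessing', 'AnalyticsProcessing', 'QueryProcessing')
    'SearchServer2' = @('Crawl', 'QueryProcessing', 'Index')
}

# Hypothetical helper: returns the component types hosted on exactly one server.
function Get-SingleHostedComponent {
    param([hashtable]$Placement)
    $Placement.GetEnumerator() |
        ForEach-Object { $_.Value } |      # flatten to one component name per item
        Group-Object |                     # count the servers hosting each type
        Where-Object { $_.Count -eq 1 } |
        ForEach-Object { $_.Name } |
        Sort-Object
}

# Lists Admin, AnalyticsProcessing, ContentProcessing and Index for this placement.
Get-SingleHostedComponent -Placement $placement
```

In this placement Crawl and Query Processing survive the loss of either machine, while everything the helper lists remains a single point of failure.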

Practical Guide - Design

- Don’t hastily touch your SharePoint. Leave it alone!
- Do you need three machines for Search?
  - To speed up the Query component?
  - To reduce crawling time?
  - To balance CPU utilization across machines?
- With three or more machines, start by assessing each component's use of hardware resources.

Practical Guide - Design

Component CPU Network Disk RAM

Crawl Component MEDIUM HIGH MEDIUM MEDIUM

Content processing (CPC) HIGH MEDIUM HIGH

Analytics processing (APC) MEDIUM HIGH MEDIUM MEDIUM

Index Component HIGH MEDIUM HIGH HIGH

Query processing (QPC) MEDIUM MEDIUM MEDIUM

Search Admin Component LOW LOW LOW


Microsoft Ignite – BK3176

If logical architecture requires scale-out, consider utilization

Practical Guide - Design

Volume of content     Sample Search architecture
< 1 mil items         Single-server Search farm
1 mil - 5 mil         Two-server Search farm
5 mil - 10 mil        Small Search farm (3-4 servers)
10 mil - 40 mil       Medium Search farm (5-6 servers)
> 40 mil              Large Search farm
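The volume tiers above can be expressed as a small lookup, which is handy when scripting an assessment. This is a minimal sketch: `Get-SearchFarmTier` is a hypothetical helper, and because the tiers share their boundary values (1, 5 and 10 million), the choice to put a boundary value in the larger tier is an assumption.

```powershell
# Hypothetical helper: maps an item count to the sample architecture tier.
# Boundary handling (which tier owns exactly 1, 5 or 10 million) is an assumption.
function Get-SearchFarmTier {
    param([long]$ItemCount)
    if     ($ItemCount -lt 1000000)  { 'Single-server Search farm' }
    elseif ($ItemCount -lt 5000000)  { 'Two-server Search farm' }
    elseif ($ItemCount -lt 10000000) { 'Small Search farm (3-4 servers)' }
    elseif ($ItemCount -le 40000000) { 'Medium Search farm (5-6 servers)' }
    else                             { 'Large Search farm' }
}

Get-SearchFarmTier -ItemCount 800000      # Single-server Search farm
Get-SearchFarmTier -ItemCount 20000000    # Medium Search farm (5-6 servers)
```

Item count is only the starting point; concurrent users and crawl windows can push a design into a larger tier.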

Sample Search Architecture

- Handles a number of different content sources (with 20 custom applications)
- Nearly 1 million items currently
- Full crawl takes 2 hours
- Serves nearly 20,000 users, with 500 concurrent users

Sample Search Architecture

- Optimizes search queries to serve hundreds of concurrent users.
- Handles millions of documents (approx. 5 TB)

Sample Search Architecture

- Serves 20 million documents and items (approx. 10-15 TB).

Practical Guide - Implementation

Central Administration doesn’t help much. PowerShell is your friend.

1. Create the Search Service Application
2. Clone the existing topology
3. Modify Search components based on your designated architecture
4. Assign the Index component and its location
5. Activate the new Search topology

Build Search farm with PowerShell http://bit.ly/search_multi_server_PS

# Load the SharePoint cmdlets when running outside the SharePoint Management Shell
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

$app1 = "APP-Server-01"
$app2 = "APP-Server-02"
$SearchAppPoolName = "SharePoint_SearchApp"
$SearchAppPoolAccountName = "TestDomain\SPSearchPool"
$SearchServiceName = "SharePoint_Search_Service"
$SearchServiceProxyName = "SharePoint_Search_Proxy"
$DatabaseName = "SharePoint_Search_AdminDB"

# Create a Search Service Application Pool
$spAppPool = New-SPServiceApplicationPool -Name $SearchAppPoolName -Account $SearchAppPoolAccountName -Verbose

# Start Search Service Instance on all Application Servers
Start-SPEnterpriseSearchServiceInstance $app1 -ErrorAction SilentlyContinue
Start-SPEnterpriseSearchServiceInstance $app2 -ErrorAction SilentlyContinue
Start-SPEnterpriseSearchQueryAndSiteSettingsServiceInstance $app1 -ErrorAction SilentlyContinue
Start-SPEnterpriseSearchQueryAndSiteSettingsServiceInstance $app2 -ErrorAction SilentlyContinue

# Create Search Service Application
$ServiceApplication = New-SPEnterpriseSearchServiceApplication -Partitioned -Name $SearchServiceName -ApplicationPool $spAppPool.Name -DatabaseName $DatabaseName

# Create Search Service Proxy
New-SPEnterpriseSearchServiceApplicationProxy -Partitioned -Name $SearchServiceProxyName -SearchApplication $ServiceApplication

# Clone the active topology and get the Search service instance of each server
$clone = $ServiceApplication.ActiveTopology.Clone()
$App1SSI = Get-SPEnterpriseSearchServiceInstance -Identity $app1
$App2SSI = Get-SPEnterpriseSearchServiceInstance -Identity $app2

# We need only one admin component
New-SPEnterpriseSearchAdminComponent -SearchTopology $clone -SearchServiceInstance $App1SSI

# We need two content processing components for HA
New-SPEnterpriseSearchContentProcessingComponent -SearchTopology $clone -SearchServiceInstance $App1SSI
New-SPEnterpriseSearchContentProcessingComponent -SearchTopology $clone -SearchServiceInstance $App2SSI

# We need two analytics processing components for HA
New-SPEnterpriseSearchAnalyticsProcessingComponent -SearchTopology $clone -SearchServiceInstance $App1SSI
New-SPEnterpriseSearchAnalyticsProcessingComponent -SearchTopology $clone -SearchServiceInstance $App2SSI

# We need two crawl components for HA
New-SPEnterpriseSearchCrawlComponent -SearchTopology $clone -SearchServiceInstance $App1SSI
New-SPEnterpriseSearchCrawlComponent -SearchTopology $clone -SearchServiceInstance $App2SSI

# We need two query processing components for HA
New-SPEnterpriseSearchQueryProcessingComponent -SearchTopology $clone -SearchServiceInstance $App1SSI
New-SPEnterpriseSearchQueryProcessingComponent -SearchTopology $clone -SearchServiceInstance $App2SSI

# Assign the index component and its location
$IndexLocation = "\\APP2\Index_Search"
New-SPEnterpriseSearchIndexComponent -SearchTopology $clone -SearchServiceInstance $App2SSI -RootDirectory $IndexLocation -IndexPartition 0

# Activate the new Search topology
$clone.Activate()

Practical Guide - Verification
- Central Administration can help
- PowerShell:
  - Get-SPEnterpriseSearchStatus
  - Get-SPEnterpriseSearchTopology
- Search PowerShell: http://bit.ly/PowerShell_SP2013_Search
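A verification pass with these cmdlets might look like the sketch below. The filter `Get-UnhealthySearchComponent` is a hypothetical helper, and it assumes the status objects expose `Name` and `State` properties with `Active` as the healthy state. The SharePoint calls are left as comments because they only run on a farm server; the sample objects are stand-in data so the filter can be tried anywhere.

```powershell
# Hypothetical helper: passes through any status entry whose State is not 'Active'.
# Assumes each input object has Name and State properties.
function Get-UnhealthySearchComponent {
    param(
        [Parameter(ValueFromPipeline = $true)]
        $Component
    )
    process {
        if ($Component.State -ne 'Active') { $Component }
    }
}

# On a SharePoint 2013 server (not runnable elsewhere):
# $ssa = Get-SPEnterpriseSearchServiceApplication
# Get-SPEnterpriseSearchTopology -SearchApplication $ssa
# Get-SPEnterpriseSearchStatus -SearchApplication $ssa | Get-UnhealthySearchComponent

# Stand-in data, only to show the filter's behaviour.
$sample = @(
    [pscustomobject]@{ Name = 'IndexComponent1'; State = 'Active' }
    [pscustomobject]@{ Name = 'CrawlComponent0'; State = 'Degraded' }
)
$unhealthy = @($sample | Get-UnhealthySearchComponent)
$unhealthy | ForEach-Object { "$($_.Name) is $($_.State)" }
```

An empty result after activating a new topology is a quick sign that every component came up; anything listed is where to look first.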

Helpful References
- SharePoint 2013: SharePoint and Enterprise Search Survival Guide: http://bit.ly/search_survival_guide
- Plan enterprise search architecture in SharePoint Server 2013: http://bit.ly/plan_for_ent_search
- Search Architecture for SharePoint 2013: http://zoom.it/Tsuy

Thank you