Similar content over at www.benjaminathawes.com Twitter - @benjaminathawes
Nottingham 2012
Planning SP2013 Search
for IT PROs
The boring stuff (disclaimer)
Most of the content of this presentation was put
together using material and tests based on SharePoint
2013 Release Preview. Although a lot of it is still relevant
to the RTM version, it is provided “as-is” and there is no
guarantee as to its accuracy.
Additionally, any opinions stated are that of the author
and do not represent the views of Content and Code or
Microsoft.
About me…
Are all Search engines created equal?
Enterprise Search is a different animal, right?
The Enterprise Search market (Gartner)
[Figures: Gartner's view of the Enterprise Search market in 2006 and 2009; 2013?]
Introducing SharePoint Server 2013 Search
FAST architecture integrated along with
improvements from Bing; continuous crawl
UI improvements: AJAX, preview panes, results blocks
Content Search Web part reduces need for custom code
What’s new under the hood for IT PROs?
Search capability: 2007 | 2010 | 2013 Preview
Architecture: SSP | Service App | Service App
Configurable Components: Query, Index | Query, Crawl, Admin, Index Partition | Query Processor, Crawl, Admin, Index Partition, Content Processor, Analytics Processor (replaces SP2010 Web Analytics SA)
Databases: Search | Crawl, Admin, Property | Crawl, Admin, Property, Analytics, Link
Resiliency: No index HA (although we had query HA) | No admin component HA | Admin component redundancy
Management: Central Admin, STSADM | Central Admin, STSADM, PowerShell | Central Admin (except topology changes), STSADM, PowerShell
Scheduling: Full/Incremental | Full/Incremental | Full/Incremental, Continuous
Quick Demo
Let’s find stuff in SharePoint 2013!
Architecture
SharePoint Server 2013 Search Architecture
1. Crawl Component
• Invokes connectors to retrieve items and metadata
from Content Sources
• Crawl DB stores crawled item history
• Discovers content and metadata (e.g. Author, Title,
and Creation Date) collectively known as crawled
properties
• Delivers crawled properties to the Content
Processing Component
2. Content Processing Component
• Parses crawled items using format handlers and 3rd party iFilters
• Reports crawled properties to the Search Admin Database
• Writes URL information to Link DB for usage by Analytics Component
• Nugget from Neil Hodgkinson (Microsoft): for now we are stuck with the default PDF format handler.
3. Analytics Processing Component
• Replaces SP2010 Web Analytics
• Analyses crawled items and user
interactions with Search results
(e.g. clicks, recommendations)
• Results fed back to the Content
Processing Component to
improve relevance
• Scales well – additional APCs or
databases can be added for
additional throughput/capacity
4. Index Component
• Central part of Search capability – used in both
feeding and Query processes:
• Feeding – writes items received from Content
Processor to index file
• Query – provides results set to the Query Processor
(similar to the “query” component in 2010)
• Physically moves index files in response to Search
topology changes.
• Stores ACLs in disk index
Scaled out SP2013 Search Index
[Figures: marketecture diagram; Central Admin view showing 1 partition, 2 replicas]
Use Get-SPEnterpriseSearchStatus to find the Primary Replica.
5. Query Processing Component
• New component in SP2013. Complements the index
component.
• Presents results to users!
• Performs linguistics processing at query time, e.g.
spellchecking, thesaurus
• Analyses and processes query to determine which index
partition to send query to and which rule(s) to apply
6. Admin Component
• Responsible for search provisioning and topology changes
• Search Admin DB is basically a “Config DB” for search – it contains
the topology, crawl/query rules, crawled/managed properties.
• Does NOT store ACLs in 2013 – these are stored within the disk index
alongside content (used for security trimming results)
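The idea of security trimming against ACLs held alongside the indexed content can be illustrated with a toy model. This is a minimal Python sketch, not SharePoint's implementation: the `IndexedItem` type, the `security_trim` function and the principal names are all hypothetical, and the only assumption taken from the slide is that each indexed item carries its own access list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexedItem:
    """Toy stand-in for an index entry that stores its ACL next to the content."""
    url: str
    allowed: frozenset  # principals (users or groups) with read access


def security_trim(results, user_principals):
    """Keep only the results that at least one of the user's principals may read."""
    principals = set(user_principals)
    return [item for item in results if item.allowed & principals]


# Hypothetical index content and principals, for illustration only.
index = [
    IndexedItem("http://intranet/hr/salaries.xlsx", frozenset({"hr-team"})),
    IndexedItem("http://intranet/news/launch.docx", frozenset({"all-staff"})),
]
visible = security_trim(index, ["alice", "all-staff"])
print([item.url for item in visible])
```

Because the access list travels with the item, trimming needs no extra lookup in a separate database at query time, which is the point the slide makes about 2013.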
Demo
Create a new Search Service Application using PowerShell
What did we create?
• One of each Search component
• Up to 10m items (on paper)
• No component redundancy
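The "10m items on paper" figure can be turned into a rough sizing rule. The sketch below is only arithmetic under a stated assumption: it treats 10 million items as the ceiling for a single index partition, and the function names and the replica count are illustrative rather than anything prescribed by the product.

```python
ITEMS_PER_PARTITION = 10_000_000  # assumption: the "10m items" figure, applied per partition


def partitions_needed(item_count: int) -> int:
    """Smallest number of index partitions that keeps each one at or under the ceiling."""
    if item_count < 0:
        raise ValueError("item_count must not be negative")
    # Integer ceiling division avoids floating-point rounding surprises.
    return max(1, -(-item_count // ITEMS_PER_PARTITION))


def index_components_needed(item_count: int, replicas_per_partition: int = 1) -> int:
    """Each partition is served by one index component per replica."""
    if replicas_per_partition < 1:
        raise ValueError("replicas_per_partition must be at least 1")
    return partitions_needed(item_count) * replicas_per_partition


print(partitions_needed(25_000_000))
print(index_components_needed(25_000_000, replicas_per_partition=2))
```

A single-component topology like the one created in the demo corresponds to one partition with one replica, which is why it has no redundancy.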
Topology
Minimum “Enterprise” Search hardware requirements
http://technet.microsoft.com/en-us/library/jj219628(v=office.15).aspx
These requirements are cumulative (56GB in total!)
Example “medium”
topology
• “Medium” topology taken from Microsoft’s
“Topologies for SP2013” document. “Finger in
the air” capacity:
• Up to 10 million items
• 10-20,000 users
• 1-2 TB content
• 8 VMs on 4 physical hosts + SQL!
• OWA for Search Preview Pane
• No Search components on WFE servers
• Query processing and index components hosted
together
• Traditional “app” servers for everything else.
• No mention of a distributed cache (AppFabric)
cluster – this could be a mistake.
Nuts and bolts
Default Search topology footprint - 2013
• 2 Service Applications and 1 Proxy in SPCA
• 2 Service App Endpoints in IIS
• 3 Services on Server
• 5 noderunner processes in Task Manager
• 1 mssearch executable
• 2 Windows Services
• 5 noderunner processes in Process Explorer
• 4 Databases
So is it really a continuous crawl?
• Short answer: “it depends on how much content you have”.
• Overlapped/parallel crawls every 15 minutes by default. Items shown in index “within seconds”.
• Fresher content, but NOT a “silver bullet”: continuous crawl is generally run alongside a periodic full crawl.
• E.g. Full crawl needed for new managed properties, clean up of inaccessible/deleted items.
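Why the answer "depends on how much content you have" can be seen with some simple arithmetic. The sketch below assumes only what the slide states (a new crawl starts every 15 minutes by default, overlapping any crawl still running) and estimates how many crawls run side by side at peak. The function name is hypothetical and the model ignores everything else the crawler does.

```python
import math


def concurrent_crawls(crawl_minutes: float, interval_minutes: float = 15) -> int:
    """Peak number of overlapping crawls if a new one starts every interval regardless."""
    if crawl_minutes <= 0 or interval_minutes <= 0:
        raise ValueError("durations must be positive")
    return math.ceil(crawl_minutes / interval_minutes)


print(concurrent_crawls(10))  # a crawl that finishes inside the interval never overlaps
print(concurrent_crawls(40))  # a longer crawl is still running when later ones start
```

The longer a crawl pass takes relative to the interval, the more passes overlap, so large content sources raise the load on the crawl servers even though freshness improves.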
PowerShell and Search: what’s new?
• New-SPEnterpriseSearchAnalyticsProcessingComponent
• BUT there is no matching “Get” cmdlet, which is a pain when trying to work with
the component.
• New Get and Set cmdlets for
SPEnterpriseSearchQueryProcessingComponent
• You must use PowerShell if you want to scale out a search
topology or avoid GUIDs (for example, in database names)
• No interface within SPCA to modify the topology.
Demo
Modifying the Search topology using PowerShell
What did we change?
Upgrade
Considerations
Migrating SP2010 Search to 2013
• Remember that in-place upgrades are not supported
• Only the SP2010 Admin DB can be migrated to 2013.
• SP2010 Search Admin DB contains:
• content sources
• crawl rules
• start addresses
• server name mapping
• federated locations.
• Properties are gathered during the first crawl
• SP2010 Web Analytics does not migrate to SP2013.
• Logical topology settings, such as the servers and components in the farm, need to be manually recreated using PowerShell.
• SP2013 can crawl SharePoint 2003/2007/2010 farms to facilitate a “Search first” upgrade
SP2013 Search Boundary key changes
Limit: 2010 | 2013
Crawl Databases: 10 per Search SA | 5 per Search SA
Crawl Components: 16 per Search SA | 2 per Search SA
Index Partitions: 20 per Search SA, 128 total | 20 per Search SA
Link DB: N/A | 2 per Search SA
Query Processing Component: N/A | 1 per server
Content Processing Component: N/A | 1 per server
Analytics Processing Component: N/A | 6 per Search SA
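The per-Search-SA limits listed above lend themselves to a quick sanity check of a planned topology. The Python sketch below copies the 2013 column as given on this slide (based on the Release Preview, so the RTM figures may differ); the dictionary keys and the `check_topology` function are hypothetical names chosen for illustration.

```python
# 2013 per-Search-SA limits as quoted on the slide (Release Preview figures).
SP2013_LIMITS = {
    "crawl_databases": 5,
    "crawl_components": 2,
    "index_partitions": 20,
    "link_databases": 2,
    "analytics_processing_components": 6,
}


def check_topology(planned: dict) -> list:
    """Return one message per planned count that exceeds its limit."""
    violations = []
    for name, limit in SP2013_LIMITS.items():
        count = planned.get(name, 0)  # components missing from the plan count as zero
        if count > limit:
            violations.append(f"{name}: planned {count}, limit {limit} per Search SA")
    return violations


plan = {"crawl_components": 3, "index_partitions": 4}
for problem in check_topology(plan):
    print(problem)
```

The "1 per server" limits for the Query Processing and Content Processing components are left out because they depend on the server count rather than on the Search service application.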
Gotchas / considerations
• It is suggested that the Search and distributed cache services are split for large implementations
• Impacts the “starting” topology for larger customers
• High resource requirements as discussed
• Some Search features deprecated / removed
(see http://technet.microsoft.com/en-us/library/ff607742(v=office.15).aspx#section1):
• No migration path for SP2010 Foundation Search settings
• No means of modifying Search topology via UI
• No Search SOAP Web service: http://server/site/_vti_bin/search.asmx is no more. Use CSOM/REST!
• No Search RSS due to lack of claims support
• No Search SQL Syntax
• No support for docpush.exe to “push” items into the index (possible in FAST)
What about Foundation?
• SharePoint Foundation 2013 Search capabilities are now based on the same search implementation as
SharePoint Server 2013.
• If using the Farm Configuration Wizard (AKA “white wizard”) in SP2013 RP, a Search Service app is created.
• However, the PowerShell cmdlets required to scale out require a Server license.
• RTM may be different. Any input welcome.
• My thoughts: Appropriate only for small implementations due to single server limitation in release preview.
https://www.nothingbutsharepoint.com/sites/itpro/Pages/Search-in-SharePoint-2013-Foundation-Versus-Full-Blown-Server.aspx
The FCW solving all of our problems!??*
*This is a joke. The Farm Config Wizard rarely solves problems.
Summary
• SP2013 brings a bunch of cool new native Search functionality that is an
evolution of 2010 functionality.
• Most FAST features are now integrated
• 2013 Search is resource hungry – we must plan for this!
• Continuous crawl can replace incremental but still requires full crawls
• PowerShell required for topology changes – brush up those skills!
Questions?
Thanks for listening!