Microsoft Enterprise Search using SharePoint

Page 1: Microsoft Enterprise Search using SharePoint

Microsoft Office SharePoint Server 2007

Search Workshop

游家德 (Jade Yu), 敦群數位科技股份有限公司

Page 2: Microsoft Enterprise Search using SharePoint

Microsoft Office SharePoint Server 2007 Enterprise Search

Enterprise Search Advanced Training – Building and Implementing Enterprise Search Solutions

Page 3: Microsoft Enterprise Search using SharePoint

Workshop Agenda

Day 1 – Search Overview
• Microsoft Search Landscape
• MOSS 2007 Walkthrough
• Architecture and Deployment Scenarios
• Crawl and Query Processes
• Search Object Model

Day 2 – Customization and Management
• Search Object Model
• Business Data Catalog (BDC)
• Search Extensibility and Integration
• Administration
• Capacity Planning

Page 4: Microsoft Enterprise Search using SharePoint

Assumptions

• Some knowledge and experience with Search functionality
• Knowledge of the Business Data Catalog in general (new in Office 2007 System)
• Office 2007 System Content Creation/Contribution experience
• Knowledge of Web site creation and management in general
• Knowledge of MS platform (Windows 2003 Server, ADS, IIS, SQL 2005 & Office Clients)
• Knowledge of ASP.NET 2.0 and XSLT

Page 5: Microsoft Enterprise Search using SharePoint

Workshop Objectives

• Explain how to use the Office 2007 Search functionality
• Interpret the Office 2007 System Search Terminology
• Describe the rich feature set of Office 2007 System Search - Servers and Clients
• Describe how to use the platform well enough to use its APIs to extend the products
• Explain how Office 2007 System Search will solve enterprise business requirements

Page 6: Microsoft Enterprise Search using SharePoint

Module 1

Enterprise Search Overview

Page 7: Microsoft Enterprise Search using SharePoint

Module Agenda

• Microsoft Enterprise Search
• Client-side Search Platform
• Client-side Comparison
• Server-side Search Platform
• Key Differences between WSS and MOSS
• MOSS 2007 for Search Key Features
• MOSS 2007 for Search and MOSS 2007 Comparison

Page 8: Microsoft Enterprise Search using SharePoint

Microsoft Enterprise Search

Server-Side Search Platform

Line-of-business systems and structured data sources

Unstructured information

People, expertise

External Web sites

E-mail messages, appointments, and instant messaging

Client-Side Search Platform

Documents, programs, and media

Page 9: Microsoft Enterprise Search using SharePoint

Client-Side Search Platform

• Windows Desktop Search (WDS) for XP and Windows Server
  • You must install an additional program for Search
• Vista – Integrated Desktop Search
  • Integration in the Operating System
  • Ability to search nearly anywhere
  • Virtual Folders

Page 10: Microsoft Enterprise Search using SharePoint

Client-Side Comparison

Feature                                                 Microsoft® Windows® Desktop Search   Microsoft® Windows® Vista
Rich, actionable interface                              X                                    X
Integration with Microsoft Outlook                      X                                    X
Polite indexing (pauses when computer is in use)        X                                    X
Live icons & document previews                          X                                    X
Advanced Search integrated into the Operating System                                         X
Save searches to search folders                                                              X
Instant Search                                          X (on taskbar)                       X (from start menu)

Page 11: Microsoft Enterprise Search using SharePoint

Server-Side Search Platforms

• Windows SharePoint Services v3
  • “Basic” index / search capabilities to support WSS collaboration and document management
• Microsoft Office SharePoint Server (MOSS) 2007
  • Enterprise search and indexing features “unlocked”
  • Several SKUs to support different scenarios and customer needs

Page 12: Microsoft Enterprise Search using SharePoint

Key Differences Between WSS and MOSS

Capability                                        WSS v3                     Microsoft Office SharePoint Server (MOSS)
Can index                                         Local SharePoint content   SharePoint sites / collections, Exchange Public Folders, File Shares, Web Content, Lotus Notes, LOB Apps, and others . . .
Rich, relevant results                            X                          X
Alerts, RSS, Did you mean, Duplicate collapsing                              X
Scopes, Managed Properties                                                   X
Best Bets, Result Removal, Query Reports                                     X
Search Center Tabs                                                           X
BDC Search                                                                   X
APIs provided                                     Query                      Query + Admin

Page 13: Microsoft Enterprise Search using SharePoint

MOSS 2007 for Search

• A Search-only solution for intranets and public-facing Web (Internet) sites
• Two versions
  • Standard Edition limited to 500,000 docs
  • Enterprise Edition with unlimited docs
• Includes
  • Out of the box search for file shares, Web sites, SharePoint sites, Exchange Public Folders, Lotus Notes databases
  • Extensibility to 3rd party document repositories and file types

Page 14: Microsoft Enterprise Search using SharePoint

MOSS 2007 and MOSS FS Usage Scenarios

MOSS 2007
  Description: An information management solution that includes enterprise search integrated with portal, collaboration, web content management, ECM, forms, and BI functionalities
  Scenario:
  • Customers who desire search as an integrated part of a broader information management solution

MOSS FS
  Description: A core search-only solution for intranet and public-facing web sites
  Scenario:
  • Customers who require a core search-only product that can be integrated into their existing infrastructure
  • Customers who require search functionality for their public-facing web (Internet) sites

Page 15: Microsoft Enterprise Search using SharePoint

MOSS 2007 for Search and MOSS 2007 Features Comparison

Columns:
(A) MOSS 2007 for Search (Standard Edition)
(B) MOSS 2007 for Search (Enterprise Edition)
(C) MOSS 2007 (Standard CAL)
(D) MOSS 2007 (Standard plus Enterprise CAL)

Features                                   (A)       (B)            (C)            (D)
File shares                                X         X              X              X
Web sites                                  X         X              X              X
SharePoint sites                           X         X              X              X
Microsoft Exchange Server public folders   X         X              X              X
Lotus Notes databases                      X         X              X              X
Third party document repositories [1]      X         X              X              X
Secure content access control              X         X              X              X
Enhanced Search Center user interface                               X              X
Search for people and expertise                                     X              X
Business Data Catalog (BDC)                                                        X
Search structured data sources                                                     X
Document limit                             500,000   No limit [2]   No limit [2]   No limit [2]

Page 16: Microsoft Enterprise Search using SharePoint

Questions?

Page 17: Microsoft Enterprise Search using SharePoint

Module 2

Microsoft Office SharePoint Search 2007 – Walkthrough

Page 18: Microsoft Enterprise Search using SharePoint

Module Agenda

• End-User Improvements
  • Relevance
  • People and Expertise
  • Business Data Search
• Administration Improvements
  • Design Goals
  • Indexing Management
  • Security
  • Customization
  • Query Reporting
• Performance Improvements
• Demo MOSS 2007

Page 19: Microsoft Enterprise Search using SharePoint

End-User Improvements: Relevance

• Dramatically improved relevance is the top goal of this release
• New ingredients added including:
  • Anchor text
  • Click distance
  • URL depth
  • Missing metadata creation
• Result is noticeably more relevant search
  • 100% better on all queries
  • 500% better on common queries
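
Two of the ranking ingredients named on this slide can be illustrated concretely. The following is a minimal sketch, not SharePoint's ranking code: it assumes that click distance means the fewest link hops from a designated authoritative page and that URL depth means the number of path segments in a URL. The link graph, the URLs, and the function names are hypothetical.

```python
from collections import deque
from urllib.parse import urlparse


def click_distance(links, start):
    """Breadth-first search: fewest link hops from `start` to each reachable page."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in distance:
                distance[target] = distance[page] + 1
                queue.append(target)
    return distance


def url_depth(url):
    """Number of non-empty path segments in the URL."""
    return len([seg for seg in urlparse(url).path.split("/") if seg])


# Hypothetical intranet link graph: page -> pages it links to.
links = {
    "http://itech/": ["http://itech/hr/", "http://itech/sales/"],
    "http://itech/hr/": ["http://itech/hr/policies/leave.aspx"],
    "http://itech/sales/": [],
}

distances = click_distance(links, "http://itech/")
```

A ranker could treat a small click distance or a shallow URL as a hint that a page is more authoritative; how those signals are actually weighted is not stated in the slides.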

Page 20: Microsoft Enterprise Search using SharePoint

End-User Improvements: People and Expertise

• Bring people into the Search experience
  • Getting your job done means working with the right people
  • Find subject-matter experts based on their knowledge and contacts
• Numerous improvements over SPS 2003
  • Index any LDAP V3 directory
  • Dedicated tab for finding people
  • Results grouped by “social distance” to you

Page 21: Microsoft Enterprise Search using SharePoint

End-User Improvements: Business Data Search

• Information in Line of Business (LOB) systems is often hard to access
• MOSS 2007 can bring that data to your users
• Data is accessed through the Business Data Catalog
  • Exposed to many features in SharePoint
• Search can easily index the data
  • No need to write code
  • Highly customizable results
  • Integrated with scopes and Search center

Page 22: Microsoft Enterprise Search using SharePoint

Administration Improvements: Design Goals

• Address SPS 2003 administration user interface pain points
• Unify WSS and MOSS search
• Enable full programmability via the object model
• Even better scalability and performance

Page 23: Microsoft Enterprise Search using SharePoint

Administration Improvements: Indexing Management

• Streamlined experience and more control
• One index per shared service; no need to worry about managing discrete indexes
• Multiple start addresses per content source
• MOSS indexes can drive the WSS search experience
• Allow upgrade from WSS to MOSS

Page 24: Microsoft Enterprise Search using SharePoint

Administration Improvements: Security

• Query-time security trimming in SPS 2003
  • File shares, WSS/SPS 2003, Exchange, Lotus Notes (via mapping)
• Now supports pluggable authentication for content in WSS/MOSS sites
  • Based on ASP.NET 2.0 model
• Minimum required crawler permission is now just Full Read, not Administrator
  • Still provides the same security trimming functionality
• Ability to remove single items
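
The idea behind query-time security trimming can be sketched in a few lines. This is a generic illustration under stated assumptions, not the MOSS implementation: it assumes the crawler stores, with each indexed item, the set of principals allowed to read it, and that results are filtered against the querying user's principals before display. The `IndexedItem` type, the URLs, and the principal names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class IndexedItem:
    url: str
    # Principals (users or groups) allowed to read the item, as captured at crawl time.
    readers: set = field(default_factory=set)


def security_trim(results, user_principals):
    """Keep only the results that at least one of the user's principals may read."""
    return [item for item in results if item.readers & user_principals]


results = [
    IndexedItem("http://hr/salaries.xlsx", {"HR"}),
    IndexedItem("http://team/plan.docx", {"Everyone"}),
]
visible = security_trim(results, {"alice", "Everyone"})
```

Because the crawler only needs to read content and its permissions for this to work, the sketch is consistent with the slide's point that Full Read is enough for the crawl account.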

Page 25: Microsoft Enterprise Search using SharePoint

Administration Improvements: Customization

• Search in every company is different
• Different metadata might matter:
  • Documents: Title, Author, File location, Size
  • Records: Patient, Doctor, Healthcare provider, SSN…
• How users meaningfully scope searches differs:
  • “All finance documents”
  • “All patient records”
  • “All published documents”
• Customize results to “pop” metadata that matters
• Customization offered at many levels
  • Web Parts, XSLT/CSS, full object model…
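
One way to picture a scope such as “All finance documents” is as a rule over document metadata. The sketch below is only an illustration of that idea; it does not use the SharePoint object model, and the property names (`department`, `status`) and sample documents are hypothetical.

```python
def make_scope(**required):
    """Build a scope: a predicate matching documents whose metadata has every required value."""
    def in_scope(doc):
        return all(doc.get(name) == value for name, value in required.items())
    return in_scope


all_finance_documents = make_scope(department="Finance")
all_published_documents = make_scope(status="Published")

docs = [
    {"title": "Q3 budget", "department": "Finance", "status": "Draft"},
    {"title": "Travel policy", "department": "HR", "status": "Published"},
    {"title": "Annual report", "department": "Finance", "status": "Published"},
]
finance_titles = [d["title"] for d in docs if all_finance_documents(d)]
```

Which metadata is worth scoping on differs per organization, which is the point the slide makes.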

Page 26: Microsoft Enterprise Search using SharePoint

Administration Improvements: Query Reporting

• Best way to improve Search is to understand current usage
• New out-of-box usage reporting:
  • Query volume trends, top queries, click-through rates, queries with zero results, etc.
  • At both site and service provider levels
• Export data for extended reporting in Excel
• Respond to feedback with configuration changes or editorial results
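
The report types listed on this slide are simple aggregations over a query log. The following sketch shows the arithmetic on a hypothetical log; the row layout (query text, result count, clicked flag) and the function name are assumptions for illustration and are not the format MOSS stores or exports.

```python
from collections import Counter

# Hypothetical query log rows: (query text, number of results, whether a result was clicked).
log = [
    ("vacation policy", 12, True),
    ("vacation policy", 12, False),
    ("expense form", 3, True),
    ("sharepiont", 0, False),
]


def usage_report(rows, top_n=2):
    """Summarize top queries, zero-result queries, and overall click-through rate."""
    counts = Counter(query for query, _, _ in rows)
    zero_results = sorted({query for query, hits, _ in rows if hits == 0})
    clicks = sum(1 for _, _, clicked in rows if clicked)
    return {
        "top_queries": counts.most_common(top_n),
        "zero_result_queries": zero_results,
        "click_through_rate": clicks / len(rows) if rows else 0.0,
    }


report = usage_report(log)
```

A zero-result query like the misspelling above is the kind of feedback that could be answered with a configuration change or an editorial result.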

Page 27: Microsoft Enterprise Search using SharePoint

Performance Improvements

• Key new features make the crawls faster so the content is fresher
  • More efficient SharePoint crawling (Change Log Crawl)
  • Continuous propagation
  • Unified WSS and MOSS search
  • Security Change Only Crawl
• Maximum scale is 10s of millions of documents per indexer

Page 28: Microsoft Enterprise Search using SharePoint

Demo – MOSS 2007

Goal of demo is a high level overview with focus on:
• Search boxes and advanced search
• Search results experience
• Search Center
• Admin experience

Page 29: Microsoft Enterprise Search using SharePoint

Questions?

Page 30: Microsoft Enterprise Search using SharePoint

Module 3

Architecture and Deployment Scenarios

Page 31: Microsoft Enterprise Search using SharePoint

Agenda

• Key concepts
  • MS Search Architecture
  • Deployment Building Blocks
  • WSS v3 Search Topologies
  • MOSS 2007 Search Topologies
• Search Topology scenarios
  • Small
  • Medium
  • Large
  • Geographically distributed
• Solution scenarios
  • Collaboration sites
  • Enterprise portal
  • Internet facing portal

Page 32: Microsoft Enterprise Search using SharePoint

Microsoft Search Architecture

[Diagram: search architecture. Labels shown:]
• OOB Search UI/Custom Search Apps
• Query OM and Web Service (Query, Results)
• Query Engine, Index Engine, Content Index
• Protocol Handlers, iFilters
• Stemmers, Word Breakers
• Information: Exchange Folders, Network Shares, External Web Sites, SharePoint Sites, Business Data, Notes, …
• Search Configuration Data: Content Sources, Crawl Log, Scopes, Schema, Best Bets, Keywords, Ranking

Page 33: Microsoft Enterprise Search using SharePoint

SharePoint Search Topologies: Deployment Building Blocks

• Physical building blocks:
  • Web Front-End Servers
  • Application servers (Query, Index, Excel Services, etc.)
  • SQL Databases
• Search functionality segmented into two roles:
  • Indexer
  • Query
• MOSS 2007 specific
  • Shared Service Provider (SSP)
    • Indexer
    • Web Application(s)
      • Site Collection(s)
      • Content Database(s)
    • Virtual Server(s) (IIS)

Page 34: Microsoft Enterprise Search using SharePoint

WSS v3 Search Topology Basics

• WSS uses both server roles on the same machine (“Search Server”)
  • Indexing
  • Query
• Ability to index local content only
  • Site Collection (content database(s))
• Content is automatically indexed
  • Minimal search administration
• Ability to query at a site and below it
• stsadm command exposes some admin operations
• Can crawl multiple content databases

Page 35: Microsoft Enterprise Search using SharePoint

Sample WSS v3 Topology

[Diagram labels: User Requests; Load Balancer; Web Front Ends; Content Databases; Search Server – Indexing and Query; Crawling]

Page 36: Microsoft Enterprise Search using SharePoint

WSS v3 - Topology Considerations

• Scale out just like WSS
• Add content databases for content
• Add search servers for search
• Each search server can serve up to 100 content databases
  • Could be lower depending on the data in the content database
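
The sizing rule on this slide reduces to a ceiling division. The sketch below is a back-of-the-envelope helper, assuming the quoted figure of 100 content databases per search server is the only constraint; the function name is hypothetical, and the `per_server` parameter exists because the slide notes the real limit can be lower depending on the data.

```python
import math

MAX_CONTENT_DBS_PER_SEARCH_SERVER = 100  # upper bound quoted on the slide


def search_servers_needed(content_databases, per_server=MAX_CONTENT_DBS_PER_SEARCH_SERVER):
    """Minimum number of WSS search servers for the given number of content databases."""
    if content_databases < 0 or per_server <= 0:
        raise ValueError("content_databases must be >= 0 and per_server must be > 0")
    return math.ceil(content_databases / per_server)
```

For example, 250 content databases need at least three search servers at the quoted limit, or five if testing shows a server copes with only 60.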

Page 37: Microsoft Enterprise Search using SharePoint

MOSS 2007 Search Topology Basics

• Adds new functionality over base WSS Search
• Application server roles can be separated:
  • Indexer
  • Query server
• Propagation from indexer to query servers
• Crawl local + external content
• Enhanced administration experience
• Ability to search across site collections

Page 38: Microsoft Enterprise Search using SharePoint

MOSS 2007 Search Topology Basics (cont)

• Query role can be assigned to one or more servers
• Indexing role can only be assigned to a single server
• Multiple query servers not allowed IF server is providing both indexing and query services
• Only one index per SSP . . . although you can have multiple SSPs
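
The role-assignment rules on this slide can be written down as a small checker. This is a sketch of the stated rules only, not a SharePoint API: the function name and server names are hypothetical, and it additionally assumes that a working topology needs exactly one indexer and at least one query server per SSP.

```python
def validate_ssp_topology(index_servers, query_servers):
    """Return a list of rule violations for one SSP's search topology (empty if valid)."""
    problems = []
    if len(index_servers) != 1:
        problems.append("the indexing role must be assigned to exactly one server")
    if not query_servers:
        problems.append("the query role must be assigned to at least one server")
    # A server holding both roles cannot be combined with additional query servers.
    shared = set(index_servers) & set(query_servers)
    if shared and len(set(query_servers)) > 1:
        problems.append("a server with both roles must be the only query server")
    return problems
```

Calling `validate_ssp_topology(["idx1"], ["q1", "q2"])` returns an empty list, while `validate_ssp_topology(["app1"], ["app1", "q2"])` reports the dual-role violation.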

Page 39: Microsoft Enterprise Search using SharePoint

Sample MOSS 2007 Topology

[Diagram labels: User Requests; Load Balancer; Web front ends; Query servers; Indexer; Crawling; Propagation of indexes; Content databases; External content]
• Query servers separated from indexer
• Indexer crawling local + external content

Page 40: Microsoft Enterprise Search using SharePoint

MOSS 2007 – Search Topology Considerations

• Indexing operations are CPU intensive
• Dedicated query servers *might* be better in a query heavy environment
• MOSS / WSS crawls do involve making HTTP requests against the WFE(s)
• Dual role, WFE / Query servers more efficient with security trimming
• All servers should be on same network segment

Page 41: Microsoft Enterprise Search using SharePoint

MOSS 2007 – Search Topology Considerations (cont)

• Each farm can index up to 50 million items
  • Beyond this, add more farms
• Hardware is important

Page 42: Microsoft Enterprise Search using SharePoint

Shared Search Service

• Shared Service Provider (SSP) – grouped high-value, resource intensive services
• Shared services are consumed by web applications (and sites within them)
• “Always on” shared services – all sites in a web application use the same index
• Resource intensive operations controlled centrally
• Some admin experience is manageable at site level

Page 43: Microsoft Enterprise Search using SharePoint

Search Shared Service

[Diagram labels: Shared Service Provider (SSP) with Search service and People service; Virtual Servers http://sales, http://finance, http://hr, each containing spsite and spweb objects; Content Databases; External content]

Page 44: Microsoft Enterprise Search using SharePoint

Search Shared Service

[Diagram labels: User Requests; Load Balancer; Web front ends; Query servers; Indexer; Crawling; Propagation of indexes; Content databases; External content; Content Indexed; Shared Service Provider with Search service and People service; Virtual Servers http://sales, http://finance, http://hr, each containing spsite and spweb objects]

Page 45: Microsoft Enterprise Search using SharePoint

Common Search Topologies

• Deployment scenarios
  • Small
  • Medium
  • Large
  • Geographically Distributed (MOSS only)

Page 46: Microsoft Enterprise Search using SharePoint

Small Search Deployment

• WSS
  • Single Search Server with both roles
    • Index
      • Single Site Collection only!
      • Single Set of Content Databases
    • Query
• MOSS
  • Single Server
    • Dual Role
      • Index
        • SSP Based – Multiple Site Collections
        • Multiple Sets of Content Databases
      • Query
• MOSS for Search
  • Single Server / Dual Role (Index and Query)

Page 47: Microsoft Enterprise Search using SharePoint

Medium Search Deployment

• WSS
  • Multiple Search Servers with the following limitations
    • Single Index Server
      • Single Site Collection
      • Single Set of Content Databases
    • Multiple Query Servers
• MOSS
  • Three Servers
    • One Index Server
    • Two Query Servers running on two Web Front-End servers
• MOSS for Search
  • Three Servers
    • One Index Server
    • Two Query Servers

Page 48: Microsoft Enterprise Search using SharePoint

Large Search Deployment

• WSS
  • Multiple Search Servers with the following limitations
    • Multiple Index Servers (64-bit)
      • Each indexing a single Site Collection with its own set of Content Databases
      • Index Servers are not redundant from one another
    • Multiple Query Servers, each associated with its own single Index Server running on the same machine (64-bit)
      • Query servers are not redundant from one another
• MOSS
  • One Index Server (64-bit)
  • Many Separate Query servers (64-bit)
• MOSS for Search
  • One Index Server (64-bit)
  • Many Separate Query servers (64-bit)

Page 49: Microsoft Enterprise Search using SharePoint

Geographically Distributed Sites: MOSS Search Deployment

[Diagram: three Shared Service Providers, each with Search service, People service, Virtual Servers containing spsite and spweb objects, and External content]
• Shared Service Provider (SSP) – Index Corp, EMEA, APAC and other locations
  • Virtual Servers: http://sales, http://finance, http://hr
• Shared Service Provider (SSP) – Index APAC only
  • Virtual Servers: http://apacsales, http://apacfinance, http://apachr
• Shared Service Provider (SSP) – Index EMEA only
  • Virtual Servers: http://emeasales, http://emeafinance, http://emeahr
• Other labels: Other Locations; Corp. Sites

Page 50: Microsoft Enterprise Search using SharePoint

Deployment Scenarios

• Collaboration Environment (WSS v3)
• Enterprise Portal (MOSS 2007)
• Internet Facing Portal (MOSS 2007)

Page 51: Microsoft Enterprise Search using SharePoint

Collaboration Environment Scenario WSS v3

• iTech – startup software consulting firm
• Large number of disjoint teams working on projects of varying durations
• Team sites used for collaboration and communication
• No organizational needs across sites

Page 52: Microsoft Enterprise Search using SharePoint

Collaboration Environment Scenario WSS v3 (cont)

• WSS farm with single IIS virtual server http://team
• Scales to large number of team sites
• Content indexed automatically
• WSS v3 standalone topology
  • 1 Search box (both roles)

[Diagram labels: User Requests; Load Balancer; Web Front Ends; Content Databases; Search Server – Indexing and Query; Crawling]

Page 53: Microsoft Enterprise Search using SharePoint

Collaboration Environment Scenario WSS v3 (cont)

[Diagram labels: Virtual Server http://team; SPSites team1, team2, team3, each containing spweb objects; Content Databases]

• Search – core feature of WSS
• Contextual scopes – site and list
• No search across sites

Page 54: Microsoft Enterprise Search using SharePoint

Enterprise Portal Scenario MOSS 2007

• iTech – growing company with growing needs
• iTech – needs a single point for information access for employees
• They now need to search over other repositories:
  • Personnel records – People search
  • Siebel sources – BDC search
  • File Shares / Web sites – other external data

Page 55: Microsoft Enterprise Search using SharePoint

Enterprise Portal Scenario MOSS 2007 (cont)

• Upgrade from WSS to MOSS
• Search is a shared service through the SSP
• Central enterprise portal – http://itech
• Existing virtual server http://team associated with SSP – search box switches to use MOSS
• Base WSS search is not running – but search available to sites through shared search service
• Indexes – local and external content

Page 56: Microsoft Enterprise Search using SharePoint

Enterprise Portal Scenario MOSS 2007 (cont)

[Diagram: one Farm with a Shared Service Provider (Search service, People service, …) and External content]
• Virtual Server http://team: SPSites team1, team2, team3, each containing spweb objects; Content Databases
• Virtual Server http://itech: SPSites HR, Sales, Finance, each containing spweb objects; Content Databases

Page 57: Microsoft Enterprise Search using SharePoint

Enterprise Portal Scenario MOSS 2007 (cont)

• Topology with indexer and query servers
• Load balanced query servers
• Scale out and scale up – new SSP dimension

[Diagram labels: User Requests; Load Balancer; Web front ends; Query servers; Indexer; Crawling; Propagation of indexes; Content databases]
• Query Servers added for throughput
• Single indexer crawls logical SSP = local + external content

Page 58: Microsoft Enterprise Search using SharePoint

Internet Facing Portal Scenario - MOSS 2007

• Internet facing site for customers – www.itech.com
• High traffic focused on content presentation
• Public access
• More publishing and less collaboration
• Controlled and tightly managed content

Page 59: Microsoft Enterprise Search using SharePoint

Internet Facing Portal Scenario - MOSS 2007 (cont)

• Two separate farms: Production and test farms
• MOSS installation
• Controlled publishing of content to production farm from test farm
• Single shared service provider per farm
• Shared search service in each farm crawls content in each farm independently

Page 60: Microsoft Enterprise Search using SharePoint

Internet Facing Portal Scenario - MOSS 2007 (cont)

[Diagram: two farms, each with its own SSP (Search service, People service), SPSites, and Content Databases]
• Production farm: Virtual Server www.itech.com with Services, Customers, and About itech, each containing spweb objects
• Test Farm: Virtual Server http://itechtest with Services, Customers, and About itech, each containing spweb objects

Page 61: Microsoft Enterprise Search using SharePoint

Questions?

Page 62: Microsoft Enterprise Search using SharePoint

Module 4

Crawl and Query Processes

Page 63: Microsoft Enterprise Seach using SharePoint

Agenda
• The Crawl Process
    • Crawl Walkthrough
    • Index Propagation
• The Query Process

Page 64: Microsoft Enterprise Seach using SharePoint

Crawl Walkthrough
When a crawl is requested . . .
1. Indexer grabs the start address of the content source
2. Start address is prefixed with the protocol associated with accessing the content
3. Appropriate protocol handler is invoked to traverse the content source
4. During traversal, the handler will identify content nodes it needs to index

Page 65: Microsoft Enterprise Seach using SharePoint

Crawl Walkthrough (cont)
5. Protocol handler invokes the IFilter associated with the content node type
6. IFilter identifies and extracts properties from the content node
7. Protocol handler supplements IFilter data with additional property information
8. Data associated with the content node is added to the index
9. Index "delta" propagates to search servers
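The nine steps above can be sketched as a small pipeline. This is a toy model in Python, not the MOSS crawler: the names `ContentNode`, `FAKE_SOURCE`, `file_protocol_handler`, `text_filter`, and `crawl` are hypothetical, the content source is an in-memory dictionary, and nodes without a registered filter are simply skipped (an assumption made to keep the sketch short).

```python
from dataclasses import dataclass


@dataclass
class ContentNode:
    url: str        # address of the item inside the content source
    extension: str  # node type, used to pick a filter
    raw: str        # raw content as the handler fetched it


# Illustrative in-memory content source, keyed by start address.
FAKE_SOURCE = {
    "file://server/share": [
        ContentNode("file://server/share/a.txt", "txt", "Title: Alpha\nhello world"),
        ContentNode("file://server/share/b.bin", "bin", "\x00\x01"),
    ]
}


def file_protocol_handler(start_address):
    """Steps 3-4: traverse the source and yield the nodes to index."""
    yield from FAKE_SOURCE.get(start_address, [])


def text_filter(node):
    """Step 6: extract properties from a content node."""
    header, _, body = node.raw.partition("\n")
    title = header[len("Title: "):] if header.startswith("Title: ") else ""
    return {"title": title, "body": body}


PROTOCOL_HANDLERS = {"file": file_protocol_handler}  # protocol prefix -> handler
FILTERS = {"txt": text_filter}                       # node type -> filter


def crawl(start_address, index):
    """Crawl one start address, update `index`, and return the delta."""
    scheme = start_address.split("://", 1)[0]         # steps 1-2: address and protocol
    handler = PROTOCOL_HANDLERS[scheme]               # step 3: pick the handler
    delta = []
    for node in handler(start_address):               # step 4: traverse
        content_filter = FILTERS.get(node.extension)  # step 5: pick the filter
        if content_filter is None:
            continue                                  # no filter registered: skipped here
        properties = content_filter(node)             # step 6: extract properties
        properties["url"] = node.url                  # step 7: handler adds its own data
        index[node.url] = properties                  # step 8: add to the index
        delta.append(node.url)
    return delta                                      # step 9: what would be propagated
```

Calling `crawl("file://server/share", {})` returns the list of URLs added during that crawl, which stands in for the index delta sent to the search servers.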

Page 66: Microsoft Enterprise Seach using SharePoint

Crawl Overview Diagram
[Diagram. Components shown: Search Process (Gatherer, Filtering Thread Pool, Word Breakers, Metadata Extraction, Indexer); Filter Daemon (Protocol Handler / IProtocolHandler, Filter / IFilter); Shared Memory carrying URLs, Documents, and Chunks between the two; SSP Catalog and Property Store; SQL Server holding the URL History, Crawl Queue, and Property Store.]

Page 67: Microsoft Enterprise Seach using SharePoint

Index Propagation
Farm Sample
[Diagram. Components shown: Indexer (crawling), index propagation to Query Servers, and user requests arriving through a Load Balancer at the Web front ends.]

Page 68: Microsoft Enterprise Seach using SharePoint

Index Propagation
• Propagation will occur only when the index and search components are on separate servers
• Continuous propagation
    • Changes are sent incrementally to all query servers associated with the index server
    • Merging of the index occurs on the query servers after propagation
    • Query servers continue serving queries while propagation is in progress

Page 69: Microsoft Enterprise Seach using SharePoint

Index Propagation
Index File Location
• Set in Office SharePoint Server Search Service settings
• Default location: C:\Program Files\Microsoft Office Servers\12.0\Data\Office Server\Applications
• Can be programmatically set using the stsadm command
    • Index Server: "stsadm.exe -o editssp -indexlocation <index file path>"
    • Query Server: "stsadm.exe -o osearch -propagationlocation <index file path>"

Page 70: Microsoft Enterprise Seach using SharePoint

The Query Process
• Query Initiation and Results Presentation
• Query Execution
• Query Walkthrough

Page 71: Microsoft Enterprise Seach using SharePoint

Query Initiation and Results Presentation
• Typically provided by the WSS / MOSS WFE role, through OOB Web Parts
• Could be an Office client or other custom application
• Responsible for constructing the "full" query and communicating with the query execution services

Page 72: Microsoft Enterprise Seach using SharePoint

Query Execution
• Always provided by a server tagged with the Query role
• Consumes a query request
• Executes the request using the query index on the file system as well as the SSP search database (if MOSS)
• Handles OOB security trimming
• Returns requested properties of the result set to the caller

Page 73: Microsoft Enterprise Seach using SharePoint

Query Walkthrough (cont)
When a query is requested . . .
1. Query terms collected
2. Terms supplemented with contextual information
3. Query formulated and issued through the Query OM or the Web Service
4. Query is executed against the index and property store
5. Query results returned
    • Results are ordered according to their relevance to the query words
    • Trimmed based on the user's permissions
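The last two steps (execute, then rank and trim) can be illustrated with a short sketch. It is a toy model under stated assumptions: `Doc` and `run_query` are hypothetical names, relevance is approximated by raw term frequency (not the MOSS ranking algorithm), and permissions are a plain set of user names per document.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Doc:
    url: str
    text: str
    allowed: Set[str]  # users permitted to read this document


def run_query(terms: List[str], user: str, docs: List[Doc]) -> List[str]:
    """Return URLs matching all terms, most relevant first, trimmed to `user`."""
    wanted = [t.lower() for t in terms]
    scored = []
    for doc in docs:
        words = doc.text.lower().split()
        if not all(t in words for t in wanted):      # implicit AND between terms
            continue
        score = sum(words.count(t) for t in wanted)  # stand-in relevance score
        scored.append((score, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)         # order by relevance
    return [doc.url for _, doc in scored if user in doc.allowed]  # security trimming
```

Two users issuing the same query can therefore receive different result lists, because trimming is applied per caller after ranking.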

Page 74: Microsoft Enterprise Seach using SharePoint

Questions?

Page 75: Microsoft Enterprise Seach using SharePoint

Module 5
The Search End-User Experience

Page 76: Microsoft Enterprise Seach using SharePoint

Module Agenda
• Introducing the Search End-User Experience
• Customizing Search
• People Search

Page 77: Microsoft Enterprise Seach using SharePoint

Introducing the Search End-User Experience
• Complete Search experience
• Search is everywhere
• Tab-based user interface for easy navigation
• Easy to extend and customize

Page 78: Microsoft Enterprise Seach using SharePoint

Introducing the End-User Search Experience
• Search Boxes
• Search Center
• Search Web Parts

Page 79: Microsoft Enterprise Seach using SharePoint

[Diagram. Components shown: OOB Search UI / custom search apps (Search Box, Advanced Search, Web Parts), HTTP GET and HTTP POST requests, the Hidden Object, XML results rendered through XSL Transformation, and queries and results exchanged with the Query OM and Web Service.]

Page 80: Microsoft Enterprise Seach using SharePoint

Search Web Parts
Nine standard Search Web Parts:
• Search Box
• Core Results
• High Confidence
• Statistics
• Pagination
• Action Links
• Matching Keywords and Best Bets
• Search Summary (Did you mean?)
• Advanced Search

Page 81: Microsoft Enterprise Seach using SharePoint

Result page infrastructure
• Data shared through hidden object
    • All Search Web Parts within the same page share the same hidden object
    • Connection between Search Web Parts is made automatically
    • Need only to drag and drop (or select) a Search Web Part on the page
    • Allows for rapid page design
    • Hidden Object is internal and cannot be used by custom Web Parts
• All Search Web Parts derive from the Data Form Web Part

Page 82: Microsoft Enterprise Seach using SharePoint

Advanced Search
• Allows power searchers to exercise greater control over how they query
• A link from the search box
• Control what is displayed in the page by modifying the XML stored in the Web Part property "Properties"
    • e.g., can be used for displaying a new language check box
• Not provided by WSS Search UI
• Implemented using the SQL syntax

Page 83: Microsoft Enterprise Seach using SharePoint

Customizing the End User Experience
• Search in every company is different
    • Different metadata might matter
        • Documents: Title, Author, File location, size
        • Records: Patient, Doctor, Healthcare provider, SSN…
    • Multi- or single-language
    • How users meaningfully scope searches differs
        • "All finance documents"
        • "All patient records"
        • "All published documents"
• Customize results to "pop" metadata that matters
• Customization offered at many levels
    • Web Parts, XSLT/CSS, full Object Model…

Page 84: Microsoft Enterprise Seach using SharePoint

Customization Choices
• Search Center
    • Simple site with few pages
        • Default Page
        • Result Page
        • Advanced Search Page
        • People Search Page
• Results Pages
    • All Sites Results Page
    • People Results Page
• Advanced Search Page and Web Part
    • Show Scope Picker
    • Scopes
    • Property Picker
    • Languages
• Search Web Parts

Page 85: Microsoft Enterprise Seach using SharePoint

Customizing Search
• Adding Search Center Tabs
• Customizing Search Web Parts
• Customizing Search Results

Page 86: Microsoft Enterprise Seach using SharePoint

People Search
Discovering Experts: People are as important as data!
• Bring people into the search experience
    • Getting your job done means working with the right people
    • Find subject matter experts based on their knowledge and contacts
    • People list can come from AD, SQL, others

Page 87: Microsoft Enterprise Seach using SharePoint

People Search
• People Results
• Customizing Results

Page 88: Microsoft Enterprise Seach using SharePoint

Refine Your People Search
• Refine by Job Title
    • Searches for the selected Job Title
• Refine by Department
    • Searches for the selected Department
• "Show more options" link (6+)
    • Listed in order of frequency

Page 89: Microsoft Enterprise Seach using SharePoint

People Search Web Parts
• Two OOB People Search Web Parts
    • People Search Box
    • People Search Core Results
        • Inherit from the Search Core Results Web Part
• Can be mixed on the same page with other Search Web Parts

Page 90: Microsoft Enterprise Seach using SharePoint

People Results Search Web Parts
Web Part properties (similar to the Core Search Web Part) such as:
• Formatting (e.g., width of the search box)
• Number of results per page
• Display "Alert Me", "RSS" links
• Turn stemming on/off (default "off")
• Remove Duplicate Results on/off (default "on")
• Fixed keyword query
• Select columns
• Results formatting with XSL
• Social Distance (view)

Page 91: Microsoft Enterprise Seach using SharePoint

Social Distance Colleagues
Suggested Colleague list members are mined from:
• Microsoft Windows Messenger (IM)
• Microsoft Office Outlook e-mail (Outlook Add-In)

Page 92: Microsoft Enterprise Seach using SharePoint

Questions?

Page 93: Microsoft Enterprise Seach using SharePoint

Module 6
Search Object Model

Page 94: Microsoft Enterprise Seach using SharePoint

Workshop Agenda
• Scenarios for Extending Search
• Query Syntax
• Query Object Model
• Query Web Service

Page 95: Microsoft Enterprise Seach using SharePoint

Topic: Scenarios for Extending Search
In this first section we will examine two scenarios for extending Search:
• Integrate with Search Center
• Integrate Search into 3rd party sites and applications

Page 96: Microsoft Enterprise Seach using SharePoint

Extending Search
Integrate with MOSS Search Center
Use cases:
• Use Search URL request parameters to add predefined saved searches
• Build custom search box Web Parts for custom look and feel
• Build custom search core result Web Parts for own look and feel and customized querying

Page 97: Microsoft Enterprise Seach using SharePoint

Extending Search
Integrate MOSS Search into 3rd Party Sites and Applications
• Build 3rd party user interface which leverages MOSS Search through Web Services
• Use cases
    • Add MOSS Search features into existing Web sites
    • Add MOSS Search into existing line-of-business or custom applications

Page 98: Microsoft Enterprise Seach using SharePoint

Topic: Query Syntax
In this section we will examine the three types of search syntax for building search queries supported by MOSS:
• Keyword
• URL
• SQL

Page 99: Microsoft Enterprise Seach using SharePoint

Keyword Syntax
Overview
• Used in standard Search Box
• New keyword syntax
• Simple and easy to use
• Consistent property:value syntax across Office, Windows and Live search
Example: gallery hinges -brass site:http://supportdesk scope:Products

Page 100: Microsoft Enterprise Seach using SharePoint

Keyword Syntax
Include/Exclude
• Built-in support for include and exclude terms
• Look for the term bike, but not related to fitness: bike -fitness
• Look for the phrase "SharePoint Services" but not the term v2: +"SharePoint Services" -v2
• Include is implied when there is no (+/-) prefix
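A minimal sketch of how the include/exclude rules could be parsed is given below. It is an illustration only, not the parser MOSS uses: `parse_keywords` and the regular expression are assumptions, and property restrictions such as `site:` or `scope:` are treated as ordinary terms here.

```python
import re

# Optional +/- sign followed by either a quoted phrase or a run of non-space characters.
TOKEN = re.compile(r'([+-]?)(?:"([^"]*)"|(\S+))')


def parse_keywords(query):
    """Split a keyword query into (include_terms, exclude_terms)."""
    include, exclude = [], []
    for sign, phrase, word in TOKEN.findall(query):
        term = phrase or word
        if not term:
            continue  # ignore empty quoted phrases
        # A missing sign means "include", matching the rule stated on the slide.
        (exclude if sign == "-" else include).append(term)
    return include, exclude
```

For the two slide examples, `parse_keywords('bike -fitness')` yields one included and one excluded term, and `parse_keywords('+"SharePoint Services" -v2')` keeps the quoted phrase together as a single included term.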

Page 101: Microsoft Enterprise Seach using SharePoint

Keyword Syntax
Boolean Search
• Narrows results by default: searches use "AND" between query terms
• Does not recognize logical operators like "OR" or "NEAR" as keywords; it treats them all as search terms
• Does not support complex queries like (A AND B) OR (C AND D)
• Complex Boolean searches are supported by the engine and the SQL syntax

Page 102: Microsoft Enterprise Seach using SharePoint

Keyword Syntax
Property restrictions
• Supports property:value as part of the keyword string
• Can use any managed property
• Supports the use of phrases
    • Can be used for exact matches when the property value includes spaces
    • Without quotes, prefix matching is done
• Supports word stemming

Page 103: Microsoft Enterprise Seach using SharePoint

Keyword Syntax
No wildcard support
• No wildcard support in Keyword Syntax
• Search box does not do wildcard searching. The following is not recognized as a wildcard search: ShareP*
• Use Advanced Search property restrictions to look for parts of a word
• Requires new search results Web Parts
• Wildcards are supported by the engine and the SQL query syntax

Page 104: Microsoft Enterprise Seach using SharePoint

URL Syntax
Use Case
• Launching a URL in custom application
• Save Searches
• Custom search boxes
Request Parameters
• Content: results.aspx?k=fish
• Scopes: results.aspx?k=fish&s=BBC
• Sort: results.aspx?v=date or results.aspx?v=relevance
• Page: results.aspx?start=21
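The request parameters listed above (`k`, `s`, `v`, `start`) can be combined into a saved-search link. The sketch below uses only the Python standard library; the helper name `results_url` is hypothetical, and the page name is taken from the slide.

```python
from urllib.parse import urlencode


def results_url(base, keywords, scope=None, sort=None, start=None):
    """Build a search results URL from the k, s, v and start parameters."""
    params = {"k": keywords}              # k: query text
    if scope is not None:
        params["s"] = scope               # s: search scope
    if sort is not None:
        if sort not in ("date", "relevance"):
            raise ValueError("sort must be 'date' or 'relevance'")
        params["v"] = sort                # v: sort order
    if start is not None:
        params["start"] = start           # start: index of the first result shown
    return base + "?" + urlencode(params)
```

`urlencode` takes care of escaping spaces and reserved characters in the keyword string, which matters once the query contains phrases or property restrictions.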

Page 105: Microsoft Enterprise Seach using SharePoint

SQL Syntax Overview
SQL Syntax offers:
• Consistent SQL across enterprise and desktop
• Complex queries and Boolean searches
    • Comparison operators
    • Arbitrary groupings for AND, OR, NOT
    • FREETEXT()
    • CONTAINS()
    • LIKE
    • ORDER BY ASC | DESC
• Custom SQL query statements
• Wildcard support

Page 106: Microsoft Enterprise Seach using SharePoint

SQL Syntax
Complex Boolean Searches
• Write complex Boolean searches using AND, OR, NOT

Page 107: Microsoft Enterprise Seach using SharePoint

SQL Syntax
FREETEXT predicate
• Returns documents for which the following is true:
    • Document contains all the search terms in at least one of the columns specified
    • One of the search terms must also be found in the Contents column
• Use only one FREETEXT predicate for optimal ranking
• The FREETEXT predicate also supports (+/-)

Page 108: Microsoft Enterprise Seach using SharePoint

SQL Syntax
Wildcard Support
• Get wildcard support using the CONTAINS predicate
• Wildcard: words or phrases with an asterisk (*) added to the end
WHERE CONTAINS('"compu*" NEAR "soft*"')
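As an illustration, the CONTAINS predicate from the slide can be assembled from a list of prefixes. The helper `contains_query` is hypothetical, and the `SELECT ... FROM Scope()` wrapper around the predicate is an assumption about the surrounding statement rather than something shown on the slide.

```python
def contains_query(columns, prefixes, operator="NEAR"):
    """Build a SQL-syntax query whose CONTAINS predicate uses prefix wildcards."""
    if operator not in ("NEAR", "AND", "OR"):
        raise ValueError("unsupported operator: " + operator)
    if not columns or not prefixes:
        raise ValueError("at least one column and one prefix are required")
    for prefix in prefixes:
        # Keep the sketch safe: only plain alphanumeric prefixes are accepted.
        if not prefix.isalnum():
            raise ValueError("prefix must be alphanumeric: " + repr(prefix))
    # Each prefix becomes a quoted term with a trailing asterisk, e.g. "compu*".
    terms = ['"{}*"'.format(prefix) for prefix in prefixes]
    predicate = " {} ".format(operator).join(terms)
    return "SELECT {} FROM Scope() WHERE CONTAINS('{}')".format(
        ", ".join(columns), predicate
    )
```

Rejecting anything but alphanumeric prefixes avoids having to escape quotes inside the predicate, at the cost of not supporting phrases.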

Page 109: Microsoft Enterprise Seach using SharePoint

SQL Syntax
Removed from SQL syntax
Removed in MOSS 2007:
• Query property weights
• UNION ALL
• MATCHES
• SELECT *
• COALESCE
• TABLE

Page 110: Microsoft Enterprise Seach using SharePoint

Topic: Query Object Model
In this section we will examine:
• The Query Object Model
• The Query Object Path
• The Query Web Service

Page 111: Microsoft Enterprise Seach using SharePoint

Query Object Model
• New object model
• Use the query object model to:
    • Build custom search user interface, like Web Parts or ASPX applications
    • Gain direct access to query and results properties
    • Invoke custom queries
• Two types of query syntax:
    • Keyword
    • SQL

Page 112: Microsoft Enterprise Seach using SharePoint

Query Object Model
Features
• Managed code API
• Single request, multiple results
Result Types
• Relevant results
• High confidence results
• Special terms
• Definitions
Optional parameters
• Number of sentences in summary
• Implicit AND/OR
• Number of results
• Ignore noise words
• Enable stemming
• Language

Page 113: Microsoft Enterprise Seach using SharePoint

Query Object Path
[Diagram. Input: a Keyword Query or SQL Query plus optional parameters, issued from the Site UI (local) or a Custom Client (remote). Query OM: Execute() passes the request to the Query Engine. Output: a ResultTableCollection of ResultTable objects (IDataReader) for Relevant results, High confidence, Special terms, and Definitions.]

Page 114: Microsoft Enterprise Seach using SharePoint

Query Web Service
Use and Methods
Use Case
• Leverage Search in remote sites or applications
• Office Research Pane
Methods
• Query
• QueryEx
• GetSearchMetaData
• Registration
• Status
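The Query and QueryEx methods take the request as an XML string. The sketch below builds such a request with the Python standard library. The element names, the `urn:Microsoft.Search.Query` namespace, and the `STRING` / `MSSQLFT` type values are recalled from the Microsoft.Search.Query request schema and should be checked against the schema documentation before use; `build_query_packet` itself is a hypothetical helper, and no call to the service is made.

```python
import xml.etree.ElementTree as ET

NS = "urn:Microsoft.Search.Query"  # assumed request namespace


def build_query_packet(text, syntax="keyword", start_at=1, count=10, language="en-US"):
    """Return a query packet XML string for a keyword or SQL-syntax query."""
    query_types = {"keyword": "STRING", "sql": "MSSQLFT"}
    if syntax not in query_types:
        raise ValueError("syntax must be 'keyword' or 'sql'")
    ET.register_namespace("", NS)  # serialize NS as the default namespace

    def tag(name):
        return "{%s}%s" % (NS, name)

    packet = ET.Element(tag("QueryPacket"))
    query = ET.SubElement(packet, tag("Query"))
    context = ET.SubElement(query, tag("Context"))
    query_text = ET.SubElement(
        context, tag("QueryText"),
        {"type": query_types[syntax], "language": language},
    )
    query_text.text = text
    # Paging: first result to return and number of results per request.
    result_range = ET.SubElement(query, tag("Range"))
    ET.SubElement(result_range, tag("StartAt")).text = str(start_at)
    ET.SubElement(result_range, tag("Count")).text = str(count)
    return ET.tostring(packet, encoding="unicode")
```

A remote client would pass the returned string to the Query or QueryEx method through its SOAP proxy of choice.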

Page 115: Microsoft Enterprise Seach using SharePoint

Query Web Service
Search Center Features
• Standard Search Center features not built into the Web service
    • Hit highlighting
    • Search usage reporting
    • Search logging
    • Search statistics
    • Result type icons
• Using Query vs. QueryEx
• Implementing hit highlighting

Page 116: Microsoft Enterprise Seach using SharePoint

Questions?

Page 117: Microsoft Enterprise Seach using SharePoint

Module 7
Administration

Page 118: Microsoft Enterprise Seach using SharePoint

Module Agenda
• Administrative Architecture
    • Farm Administration
    • SSP Administration
    • Site Collection Administration
    • Site Administration
• Search Usage Reporting
• Administrative Tools
• Lab: Adding Content Sources
• Lab: Search Schema

Page 119: Microsoft Enterprise Seach using SharePoint

Administrative Architecture
Three Tier Administration
• Web-based
• Role- and task-delineated
• Controlled delegation
• Secure isolation
Central Administration
• IT administrators
• Farm-level status and resource management
• One per farm
• E.g. create new site
Shared Services
• Business unit IT
• Service-level configuration
• E.g. create search content source, search scopes
Site Settings
• Business site owner
• Site-specific configuration and tasks
• E.g. create new list

Page 120: Microsoft Enterprise Seach using SharePoint

Farm Management
(IT Administrators)

Page 121: Microsoft Enterprise Seach using SharePoint

SharePoint 3.0 Central Administration
Common Tasks
• Manage Topology and Services
    • Servers in Farm
    • Services on Server
• Security Configuration
    • Update Farm Administrator's Group
• Backup and Restore
    • Index
    • Search Database
• Global Configuration
    • Timer Job Definitions
    • Timer Job Status
• Manage Search Service

Page 122: Microsoft Enterprise Seach using SharePoint

Using Central Admin

Page 123: Microsoft Enterprise Seach using SharePoint

Operations - Topology and Services
Servers in Farm / Services on Server
• Query Server(s)
    • Office SharePoint Server Search Service: Stop / Start
    • Office SharePoint Services Help Search Service: Stop / Start
• Index Server(s)
    • Office SharePoint Server Search Service: Stop / Start

Page 124: Microsoft Enterprise Seach using SharePoint

Operations - Backup and Restore
• Perform a backup
• Restore from backup

Page 125: Microsoft Enterprise Seach using SharePoint

Operations - Global Configuration
• Timer Job Definitions
    • SharePoint Services Search Refresh: Disable / Enable (change and update WSS search configuration)
    • Indexing Schedule Manager on MOSS: Disable / Enable
• Timer Job Status
    • Succeeded / Failed

Page 126: Microsoft Enterprise Seach using SharePoint

Search Application Management
Manage Search Service
• Farm-level Search settings
    • Proxy Server settings
• Query and Index Servers
    • Server listing and their Search service
• Shared Service Providers with Search enabled
    • SSP name listing
• Crawler Impact Rules

Page 127: Microsoft Enterprise Seach using SharePoint

Crawler Impact Rules
• Configured through Central Administration
• Allows "throttling" of the indexer to reduce the impact of a crawl on a particular server
• Supports wildcards
• Used in conjunction with crawl schedules

Page 128: Microsoft Enterprise Seach using SharePoint

Crawler Impact Rules (cont)

Use . . .                                                            To . . .
* as the site name                                                   Apply the rule to all sites
*.* as the site name                                                 Apply the rule to sites with a dot in their name
*.site_name.com as the site name                                     Apply the rule to all sites in the site_name.com domain
*.top-level_domain_name (such as *.com or *.net) as the site name    Apply the rule to all sites that end with a specific top-level domain name
?                                                                    Replace any single character in a rule
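The wildcard behavior in the table can be approximated with shell-style pattern matching. The sketch below uses Python's `fnmatch` module as a stand-in: `rule_applies` is a hypothetical helper, and it only illustrates the `*` and `?` semantics described above (note that `fnmatch` also gives `[` a special meaning, which crawler impact rules are not stated to do).

```python
from fnmatch import fnmatchcase


def rule_applies(pattern, site_name):
    """Return True if a crawler impact rule pattern covers the given site name."""
    # Host names are case-insensitive, so compare lower-cased values.
    # '*' matches any run of characters; '?' matches exactly one character.
    return fnmatchcase(site_name.lower(), pattern.lower())
```

With this helper, `*.*` covers `www.itech.com` but not a single-label name such as `intranet`, which matches the "sites with a dot in their name" row of the table.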

Page 129: Microsoft Enterprise Seach using SharePoint

Shared Services Provider (SSP) Management
(SSP Administrators)
(Content Oriented Administration)

Page 130: Microsoft Enterprise Seach using SharePoint

Common Tasks
• Configure Search Settings
    • Content Sources
    • Crawl Settings
    • Authoritative Pages Settings
    • Scopes

Page 131: Microsoft Enterprise Seach using SharePoint

Content Sources
• Represent an arbitrary container of information
• Require at least one start address, although multiple start addresses can be provided
• A start address cannot be reused
• Require a registered protocol handler
• Five out-of-box content source types are available, mapping to the five out-of-box protocol handlers
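The three constraints above (at least one start address, no reuse of a start address, a registered protocol handler) can be expressed as a small validation sketch. `ContentSourceRegistry` is a hypothetical class written for illustration, and the assumption that every start address carries a `scheme://` prefix is a simplification.

```python
class ContentSourceRegistry:
    """Toy registry that enforces the content source rules listed above."""

    def __init__(self, registered_protocols):
        self.protocols = {p.lower() for p in registered_protocols}
        self.sources = {}   # content source name -> list of start addresses
        self._used = set()  # normalized start addresses already claimed

    def add(self, name, start_addresses):
        if not start_addresses:
            raise ValueError("a content source needs at least one start address")
        claimed = []
        for address in start_addresses:
            scheme, separator, _ = address.partition("://")
            if not separator or scheme.lower() not in self.protocols:
                raise ValueError("no registered protocol handler for " + address)
            key = address.rstrip("/").lower()
            if key in self._used or key in claimed:
                raise ValueError("start address already in use: " + address)
            claimed.append(key)
        # Register only after every address has passed validation.
        self._used.update(claimed)
        self.sources[name] = list(start_addresses)
```

Validating the whole list before recording anything keeps the registry unchanged when one address in a batch is rejected.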

Page 132: Microsoft Enterprise Seach using SharePoint

SharePoint Content Source
• Includes SPS 2003, MOSS 2007, WSS v2, and WSS v3 sites
• Can limit crawl to only sites specified in the start address or all sites found below one or more provided hostnames
• Crawler will use the target site's APIs to include security information around content in the index
• For SPS 2003 content sources, the crawler account requires "change" rights, which necessitates the crawler having administrator rights
• Examples: sps3://moss-01/ or http://moss-01/sitecollection/
• Content sources decoupled from scopes

Page 133: Microsoft Enterprise Seach using SharePoint

Web Site Content Source
• Any content source available over HTTP or HTTPS
• If a SharePoint URL is provided, the crawler will detect this and index it as though it were a SharePoint content source (this can be overridden with crawl rules)
• Page depth and server hops can be controlled

Page 134: Microsoft Enterprise Seach using SharePoint

Web Site Content Source (cont)
• Security information around content is not included in the index
• Dynamic personalization will result in the index being populated with what the crawler is presented with
• Example: http://website or http://www.somesite.com

Page 135: Microsoft Enterprise Seach using SharePoint

File Shares Content Source
• Any content visible over a Windows server shared folder
• Some non-Windows shares *may* be crawled, if that share can be presented as a Windows share (for instance, Samba with Linux, Services for Unix)
• Start address can be the share root or subfolders beneath it
• Security information is picked up by the gatherer

Page 136: Microsoft Enterprise Seach using SharePoint

Exchange Public Folders Content Source
• Allows the indexer to crawl a public folder that exists on Exchange
• Requires Outlook Web Access, as the crawl is done over HTTP
• Includes messages, conversations, and other collaborative content
• URL presented in the search results will point to a deep link within OWA
• Example: http://owa/public/folder

Page 137: Microsoft Enterprise Seach using SharePoint

Business Data Content Source
• Allows the indexer to crawl metadata exposed through the Business Data Catalog
• Can elect to include all Business Data Applications or a selected number of them

Page 138: Microsoft Enterprise Seach using SharePoint

Lotus Notes Content Source

Page 139: Microsoft Enterprise Seach using SharePoint

Crawling Schedules

Allow administrator to indicate the frequency at which a content source will be re-crawled (daily, weekly, monthly)
Can indicate what time the content source should be crawled
Schedule should be driven by:
    Anticipated change at the content source (is this static content or content that is constantly changing?)
    Business expectations around when content changes should be reflected in the index
Schedule can always be modified

Page 140: Microsoft Enterprise Seach using SharePoint

Maximum File Size

Default file size limit is 16MB
To change the limit, you must add a new DWORD registry entry, MaxDownloadSize, at HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server\12.0\Search\Global\Gathering Manager
Make sure to increase the timeout value to avoid timeout exceptions
    Change the value using the Manage Search Service page of Central Administration
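The registry change above can be sketched as a small Python helper that writes the text of a .reg file. This is only an illustration: the function name is made up, and it assumes the MaxDownloadSize value is expressed in megabytes, which the slide implies but does not state. Test any such change on a non-production index server first.

```python
# Key path taken from the slide above.
REG_KEY = (
    r"HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server"
    r"\12.0\Search\Global\Gathering Manager"
)


def max_download_size_reg(limit_mb: int) -> str:
    """Return .reg file text that sets MaxDownloadSize (hypothetical helper).

    Assumption: the DWORD value is interpreted as a size in megabytes.
    """
    if not 0 < limit_mb <= 0xFFFFFFFF:
        raise ValueError("limit_mb must fit in a positive 32-bit DWORD")
    # .reg files encode DWORD values as eight hexadecimal digits.
    return (
        "Windows Registry Editor Version 5.00\n\n"
        f"[{REG_KEY}]\n"
        f'"MaxDownloadSize"=dword:{limit_mb:08x}\n'
    )


if __name__ == "__main__":
    # Example: raise the limit from the default 16MB to 32MB.
    print(max_download_size_reg(32))
```

Generating a .reg file rather than editing the registry directly keeps the change reviewable before it is imported on the index server.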

Page 141: Microsoft Enterprise Seach using SharePoint

Crawl Rules

Define exceptions to the “typical” crawl process
    Addresses can be pattern matched for special treatment
    Support exclusion
    Support altering the authentication mechanism
Examples of Crawl Rules
Testing of Crawl Rules
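To make the idea of pattern-matched addresses concrete, here is a minimal Python sketch. It is not the MOSS matching engine: the CrawlRule type, the first-match-wins ordering, and the use of shell-style wildcards from the standard fnmatch module are all assumptions made for illustration.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Optional, Sequence


@dataclass(frozen=True)
class CrawlRule:
    """Hypothetical model of a crawl rule."""
    pattern: str                   # wildcard pattern, e.g. "http://intranet/private/*"
    include: bool                  # False means matching addresses are excluded
    account: Optional[str] = None  # alternate crawl account, if the rule sets one


def first_match(url: str, rules: Sequence[CrawlRule]) -> Optional[CrawlRule]:
    """Return the first rule whose pattern matches the URL, or None.

    Assumption: rules are evaluated in order and the first match wins.
    Note that fnmatch also treats '?' and '[...]' as wildcards.
    """
    for rule in rules:
        if fnmatchcase(url.lower(), rule.pattern.lower()):
            return rule
    return None


if __name__ == "__main__":
    rules = [
        CrawlRule("http://intranet/private/*", include=False),
        CrawlRule("http://intranet/*", include=True, account="DOMAIN\\crawler"),
    ]
    for url in ("http://intranet/private/a.aspx", "http://intranet/docs/b.doc"):
        print(url, "->", first_match(url, rules))
```

A URL that matches no rule falls through to the default crawl behavior, which here is represented by returning None.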

Page 142: Microsoft Enterprise Seach using SharePoint

Search Result Removal (From Live Index)

Typically used when someone discovers something in the index that shouldn’t be there
Permits administrator to immediately remove that content from the index
    Crawl rule automatically created to prevent that content from being indexed in the future
    Restoring that content requires dropping the crawl rule and re-indexing

Page 143: Microsoft Enterprise Seach using SharePoint

Default Content Access Account

Account used for crawling, by default
Can be overridden in the Crawl Rules
Set the default account to use when crawling content
    Minimum crawler permission is “Full Read” (still provides the same security trimming functionality)
    Automatically configured for new sites
    Do not use an administrator account, so that unpublished versions of documents are not crawled.

Page 144: Microsoft Enterprise Seach using SharePoint

Metadata Property Mappings

Page 145: Microsoft Enterprise Seach using SharePoint

Server Name Mapping

Override how MOSS displays search results
Hide file path
Sample: “file://moss/HOL” to “http://moss.litwareinc.com”
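The effect of a server name mapping can be pictured as a prefix substitution on the address shown in search results. The sketch below uses the sample mapping from the slide; the function name and the case-insensitive prefix comparison are assumptions, not a description of how MOSS implements the feature.

```python
from typing import Sequence, Tuple


def map_display_address(url: str, mappings: Sequence[Tuple[str, str]]) -> str:
    """Rewrite the crawled address prefix to the address shown to users.

    Hypothetical helper: the first mapping whose crawled prefix matches wins.
    """
    for crawled_prefix, display_prefix in mappings:
        if url.lower().startswith(crawled_prefix.lower()):
            return display_prefix + url[len(crawled_prefix):]
    return url  # no mapping applies, show the crawled address unchanged


if __name__ == "__main__":
    mappings = [("file://moss/HOL", "http://moss.litwareinc.com")]
    print(map_display_address("file://moss/HOL/specs/plan.doc", mappings))
```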

Page 146: Microsoft Enterprise Seach using SharePoint

Search-based Alerts

Can be Activated / Deactivated
    Deactivated after a reset of crawled content
Users can subscribe to an alert on a search query
Alert is triggered if there are new or changed items that satisfy the search query
    An item is considered changed if its content or metadata has changed
Timer service is used to issue all alert notifications (see User Alerts in Site Settings)
Frequency can be set to Daily / Weekly
“Alert Me” and RSS links can be added/removed using their Web Part property

Page 147: Microsoft Enterprise Seach using SharePoint

Reset Crawled Content

Powerful action!
Will delete the content index!
Search Results will no longer be available on the farm until the index has been rebuilt!
Search alerts are deactivated unless the administrator unchecks the check box.
Alerts should be activated after a full crawl has been performed.

Page 148: Microsoft Enterprise Seach using SharePoint

Specify Authoritative Pages

Helps prioritize search results: a way to influence relevance. Results that are linked to the authoritative pages benefit from a boost in rank.
    Most authoritative
    Second-level authoritative
    Third-level authoritative
    Sites to demote

Page 149: Microsoft Enterprise Seach using SharePoint

Scopes

Scopes are filters applied to search results to narrow the results of a search query
Types of Scopes
Scope Rules and Behaviors
Single-rule Scopes
Multi-rule Scopes

Page 150: Microsoft Enterprise Seach using SharePoint

Site Collection Management

(Site Collection Administrators)
(Application Administrators)

Page 151: Microsoft Enterprise Seach using SharePoint

Site Collection Administration Options

Common Tasks
    Search Settings
    Search Scopes
    Search Keywords

Page 152: Microsoft Enterprise Seach using SharePoint

Search Settings

Two Options
    Use the Search Center and custom scopes in the dropdown
        The way to change standard Search Center URL for search boxes
    Do not use the Search Center – no custom scopes

Page 153: Microsoft Enterprise Seach using SharePoint

Site Level Scopes

Site Level Scopes display all scopes associated with a Site Collection
Display Scopes are a site-level feature that is purely UI
    Administrator – Combine multiple scopes into one selectable item
    Visitors – UI Search dropdown box (or check boxes for the Advanced Search page) populated with the scopes included in the display group

Page 154: Microsoft Enterprise Seach using SharePoint

Keywords and Best Bets

Prominently present editorially selected search results
Keywords: Glossary of important terms within your organization
Best Bets are associated with particular search keywords
Not available across site collections

Page 155: Microsoft Enterprise Seach using SharePoint

Search Settings for Fields - NoCrawl

Set a NoCrawl attribute on one or more columns within the site collection
Column content will not be indexed!
Associated with Site Columns (Content Types)

Page 156: Microsoft Enterprise Seach using SharePoint

Search Visibility

Site Level
    Allow or deny the site to appear in search results.
    If denied, the site will not be indexed.
    Control ASPX pages within the site for visibility. Will take into consideration item’s specific permissions.
List Level
    Allow or deny the list to appear in search results.
    If denied, the list will not be indexed.
Document Libraries and Folder Level
    Allow or deny the document library or folder to appear in search results.
    If denied, the Document Library (or folder) will not be indexed.

Page 157: Microsoft Enterprise Seach using SharePoint

Search Usage Reports

Page 158: Microsoft Enterprise Seach using SharePoint

Benefits of Search Queries and Results Reporting

Allows Site and SSP Administrators to:
    Have a visual look at end-user queries through charts and graphs
    Quickly quantify the success or failure of the optimizations they can make to crawlers and indexes
    Export data to Microsoft Excel to further analyze and mine

Page 159: Microsoft Enterprise Seach using SharePoint

To Improve the Overall Search Experience One Must…

Best way to improve search is to understand visitors’ current search usage!
Understand what visitors are searching for
    Products, features, services, general information about the company, etc.
Understand if their search was successful
    Have they clicked on one of the results?
    Were there any results – does content exist?
    Were they offered suggestions specifically associated with their query?
    Have they misspelled the words within their query?

Page 160: Microsoft Enterprise Seach using SharePoint

Reporting Tools

Two sets of reports
    Search Query Reports
    Search Results Reports
Two different levels of reports
    Shared Service Provider (SSP)
    Site Collection
Enabled by default
    Enabled within the SSP
    Queries from the Search Web Service and from custom Web Parts are not logged
Note: Data is stored in the SSP database

Page 161: Microsoft Enterprise Seach using SharePoint

Reporting Tools

At the SSP level
For enterprise content oriented administrators

Page 162: Microsoft Enterprise Seach using SharePoint

Reporting Tools

At the Site Collection level
For Site Collection administrators

Page 163: Microsoft Enterprise Seach using SharePoint

Search Query Reporting – SSP

Tracks queries that users issued for all sites managed by this SSP
Five Different Reports
    Queries Over Previous 30 Days
    Queries Over Previous 12 Months
    Top Query Origin Site Collection Over Previous 30 Days*
    Query for Scopes Over Previous 30 Days
    Top Queries Over Previous 30 Days
Also has Tabular View for most reports

* Specific to SSP

Page 164: Microsoft Enterprise Seach using SharePoint

Search Query Reporting – Site Collection

Tracks queries issued within this Site Collection
Four Different Reports
    Queries Over Previous 30 Days
    Queries Over Previous 12 Months
    Top Queries Over Previous 30 Days
    Query for Scopes Over Previous 30 Days
Also has Tabular View for most reports

Page 165: Microsoft Enterprise Seach using SharePoint

Search Results Reporting – SSP

Tracks result click selections by users within the sites managed by this SSP
Five Different Reports
    Search Results Top Destination Pages
    Queries With Zero Results
    Most Clicked Best Bets
    Queries With Zero Best Bets
    Queries With Low Click-through

Page 166: Microsoft Enterprise Seach using SharePoint

Search Results Reporting – Site Collection

Tracks result click selections by users for this Site Collection
Five Different Reports
    Search Results Top Destination Pages
    Queries With Zero Results
    Most Clicked Best Bets (Editorial Results)
    Queries With Zero Best Bets
    Queries With Low Click-through

Same list of reports as for the SSP, but for the Site Collection

Page 167: Microsoft Enterprise Seach using SharePoint

Exporting Results

Export data for extended reporting in Excel and/or Excel Services

Page 168: Microsoft Enterprise Seach using SharePoint

Questions?

Page 169: Microsoft Enterprise Seach using SharePoint

Module 8

Performance, Scalability, and Capacity Planning

Page 170: Microsoft Enterprise Seach using SharePoint

Module Agenda

Introduction
Search Capacity Planning in SPS 2003
MOSS 2007 Search Capacity Planning
    Topology
    Querying
    Indexing
    Test Environment
Real World Experiences
    Microsoft Intranet
    Microsoft Technology Center Proof of Concept (PoC)

Page 171: Microsoft Enterprise Seach using SharePoint

MOSS 2007 Search Capacity Planning

Improvement highlights
    Topology restrictions removed
    Indexing limitations improved
    Continuous propagation

Page 172: Microsoft Enterprise Seach using SharePoint

Topology

Deployment options
    Collapse index and query services on the same server
    Enable index service on one server and query service on one or more different servers
    For both options you can have only one index server
Scale up versus scaling out

Page 173: Microsoft Enterprise Seach using SharePoint

Topology (cont)

Topology restrictions from v2 removed
    Can mix indexer/search roles
    Service can be managed after initial setup or later on
Use mixed x86 and x64 hardware architectures
    IFilter, Protocol Handler limitations
Index server is very CPU intensive
Plan for availability requirements

Page 174: Microsoft Enterprise Seach using SharePoint

Topology (cont)

Topology Scaling Recommendations (for Search):
    Query servers: 8 per farm
    Front end servers: 8 per farm
    Index servers: 4 per farm

Page 175: Microsoft Enterprise Seach using SharePoint

MOSS 2007 Search Topology

[Diagram labels: User Requests, Load Balancer, Web front ends, Query servers (separated from indexer), Indexer, Propagation of indexes, Content databases, External content]

Page 176: Microsoft Enterprise Seach using SharePoint

Querying

Performance parameters
Scaling factors

Page 177: Microsoft Enterprise Seach using SharePoint

Querying – Performance Parameters

The network always affects query performance as experienced by end users:
    When querying the index catalog, a front end always hits the SQL database to get information on search results and for security trimming.
    When querying the property store, the query server is not involved, since the property store now resides in the SQL Search database.

Page 178: Microsoft Enterprise Seach using SharePoint

Querying – Performance Parameters

Page 179: Microsoft Enterprise Seach using SharePoint

Querying – Performance Parameters

Query server memory:
    The more memory is available, the less the Search service will have to access the hard disk to satisfy a given query.
    Ideally, enough memory should be installed on the query servers to accommodate the entire index.
Query server disk speed:
    RAID 10 is recommended.

Page 180: Microsoft Enterprise Seach using SharePoint

Querying – Scaling Factors

Processor architecture
    Use 64-bit servers
Planning for performance: separate query from front-end
    Dedicated processor time
    More available RAM for caching
Planning for availability: add more than one query server in your farm
    This will require a dedicated machine for index, as described before
    Tested maximum of eight query servers

Page 181: Microsoft Enterprise Seach using SharePoint

Indexing

Planning
Performance optimization
Storage
Limitations
Scaling

Page 182: Microsoft Enterprise Seach using SharePoint

Indexing Planning

Customer environment
    Number of users
    Network and connectivity
    Dispersed locations
    Expected workloads
Pilot
Rollout plan
Estimate indexing window

Page 183: Microsoft Enterprise Seach using SharePoint

Indexing Planning (cont)

Corpus definition:
    A corpus is defined as the sum of all content that is being indexed.
    This includes all valid content sources, like Web pages, items, documents, BDC, and any metadata and security information associated with this content.

Page 184: Microsoft Enterprise Seach using SharePoint

Indexing Planning (cont)

For each content source estimate:
    Number of items
    Storage used
    Types of items
    Security
    Latency requirements
    Connectivity
    Estimate indexing window
    Expected yearly growth

Page 185: Microsoft Enterprise Seach using SharePoint

Indexing - Performance Optimization

Use dedicated front-end for best indexing performance
    No other services allowed on that server
Adjust the indexing performance level
    Use Maximum for best performance
Use Crawler Impact Rules
    Carefully test impact
Continuous propagation
    Average time is 3 to 27 seconds
WSS Change log for incremental crawls

Page 186: Microsoft Enterprise Seach using SharePoint

Indexing - Performance Optimization

Index server CPU:
    The more processors are available, the more crawl speed increases
Index server memory:
    The greater the memory capacity, the more documents the crawler can process in parallel
    More available memory improves crawl speed
Index server disk speed:
    RAID 10 with 2 ms access time and greater than 150 MB/sec write throughput

Page 187: Microsoft Enterprise Seach using SharePoint

Index Storage

Planning index storage as ratio of corpus
Sizing depends on content in corpus
    Type of content source
    Document formats
    Level of metadata and security information
Plan for expected growth rates

Page 188: Microsoft Enterprise Seach using SharePoint

Index Storage (cont)

Index / Query Server disk space requirements:
    Index catalog size is normally in a range of 5% through 12% of corpus size
    Recommended initial disk space is a minimum of 2.5 times the index catalog size
    That means: recommended initial disk space is at least 30% of indexed corpus size
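The 30% figure follows from taking the top of the catalog range and applying the headroom factor: 12% x 2.5 = 30%. A minimal Python sketch of that arithmetic, with a hypothetical function name and the slide's ratios as default parameters:

```python
from typing import Tuple


def index_disk_estimate_gb(
    corpus_gb: float,
    catalog_ratio: float = 0.12,  # slide: catalog is normally 5% through 12% of corpus
    headroom: float = 2.5,        # slide: at least 2.5 times the catalog size
) -> Tuple[float, float]:
    """Return (estimated catalog size, recommended initial disk space) in GB."""
    if not 0 < catalog_ratio <= 1:
        raise ValueError("catalog_ratio must be in (0, 1]")
    catalog_gb = corpus_gb * catalog_ratio
    return catalog_gb, catalog_gb * headroom


if __name__ == "__main__":
    # A 1000 GB corpus at the 12% upper bound.
    catalog, disk = index_disk_estimate_gb(1000)
    print(f"catalog ~{catalog:.0f} GB, initial disk ~{disk:.0f} GB")
```

Using the upper bound of the range by default gives the conservative estimate; passing 0.05 shows the low end of the same range.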

Page 189: Microsoft Enterprise Seach using SharePoint

Index Storage (cont)

Search database
    Contains metadata, ACLs, hit highlighting, crawl history, and usage reports
    Estimated 2K per crawled document
    Sizing depends on corpus content
    Requires more space than the index catalog
    Recommended initial disk space is a minimum of 4 times the index catalog size
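The two search database rules of thumb can be written out as follows. This is a sketch only: the function names are hypothetical, and "2K" is read here as 2 x 1024 bytes, which is an assumption.

```python
KB = 1024


def search_db_metadata_bytes(document_count: int, bytes_per_document: int = 2 * KB) -> int:
    """Estimate search database content from the 2K-per-crawled-document figure."""
    return document_count * bytes_per_document


def search_db_initial_disk_gb(index_catalog_gb: float, factor: float = 4.0) -> float:
    """Recommended initial disk space: at least 4 times the index catalog size."""
    return index_catalog_gb * factor


if __name__ == "__main__":
    docs = 1_000_000
    print(f"{docs} documents -> {search_db_metadata_bytes(docs) / KB**3:.2f} GiB of metadata")
    print(f"120 GB catalog -> {search_db_initial_disk_gb(120):.0f} GB initial disk")
```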

Page 190: Microsoft Enterprise Seach using SharePoint

Index Capacity Limitations

Supported limit for a single index server is 50 million documents
    In this scenario we recommend only one index server per farm
One index server per SSP
    More SSPs can use the same indexer
All MOSS 2007 for Search Editions are limited to one SSP per farm
MOSS 2007 is limited to 20 SSPs per farm
MOSS 2007 for Search Standard Edition is limited to 500,000 documents per farm

Page 191: Microsoft Enterprise Seach using SharePoint

Index Scaling

First scale up (recommended)
    Optimal ranking and user experience
    Best manageability
Scale up system resources
    Use x64 architecture
    Add more CPUs to increase performance
    Plan for minimum 4GB of memory
    RAID 10 is recommended for optimal disk speeds

Page 192: Microsoft Enterprise Seach using SharePoint

Index Scaling

Scale out
    Add multiple SSPs each crawling unique parts of the corpus
    Complete isolation between SSPs
    Querying across multiple SSPs to get a single relevant results set is not possible
    Tested maximum of four index servers per farm
Recommended limit per farm across all indexes is 50 million items
    For scenarios higher than 50 million items, add more farms

Page 193: Microsoft Enterprise Seach using SharePoint

Test Environment

Establish a starting point topology
Use monitoring to establish actual performance and capacity data
    Use Performance Monitor to collect processor, memory, and disk information for each server
Look for resource bottlenecks
    Scale up available resources
    Scale out server roles

Page 194: Microsoft Enterprise Seach using SharePoint

Real World Experiences

Microsoft Intranet
Microsoft Technology Center PoC

Page 195: Microsoft Enterprise Seach using SharePoint

Microsoft Intranet

Environment
Estimate of indexed content
    Around 12 TB in SharePoint Content Databases (mix of 2003 / 2007), unknown size outside of this environment
Total size of the index
    SSP search database ~282GB
    SSP profiles database ~51GB
    Index size on disk ~156GB
Total number of objects
    23 million objects
    30 content sources, 6 with daily crawls
Typical 'real world' query response time from this implementation
    ~2 seconds, although the product group is looking into ways we can optimize this for our environment

Page 196: Microsoft Enterprise Seach using SharePoint

Microsoft Technology Center PoC

Objectives
    Indexing large numbers of secure files on file shares
    Verify MOSS 2007 search architecture
    Test and recommend capacity planning and scale

Page 197: Microsoft Enterprise Seach using SharePoint

Topology

[Diagram labels: Indexed corpus, Search db, Index catalog, Propagated catalog. Sizes shown: 1TB, 23GB, 25GB]

Page 198: Microsoft Enterprise Seach using SharePoint

Results

For the biggest test run, which included indexing 2.4 million secure files, here are the key metrics:
  Full first-time indexing of the entire corpus took 23.1 hours.
  Incremental crawls, where 4.7% of the corpus was updated, took 3.7 hours.
  Index size was 2.4% of the corpus size; the search database was 2.1%.
  Average indexing rate over the full corpus crawl was 1642 files/minute.
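To show how these PoC metrics might be applied to a first-pass sizing estimate, here is a minimal sketch that scales them linearly to another corpus. Linear scaling is an assumption, as are the function name, the `Estimate` type, and the example corpus. The rounded slide figures do not reproduce each other exactly (2.4 million files at 1642 files/minute works out to about 24.4 hours rather than 23.1), so treat the output as a rough guide rather than a prediction.

```python
from dataclasses import dataclass

# Ratios and rate taken from the PoC results above.
INDEX_RATIO = 0.024               # index size / corpus size
SEARCH_DB_RATIO = 0.021           # search database size / corpus size
FULL_CRAWL_FILES_PER_MIN = 1642   # average rate during the full crawl


@dataclass
class Estimate:
    index_gb: float
    search_db_gb: float
    full_crawl_hours: float


def estimate(corpus_gb: float, file_count: int) -> Estimate:
    """Scale the PoC ratios linearly (an assumption) to a new corpus."""
    return Estimate(
        index_gb=corpus_gb * INDEX_RATIO,
        search_db_gb=corpus_gb * SEARCH_DB_RATIO,
        full_crawl_hours=file_count / FULL_CRAWL_FILES_PER_MIN / 60,
    )


if __name__ == "__main__":
    # Hypothetical corpus: 2 TB of file shares holding 5 million files.
    e = estimate(corpus_gb=2048, file_count=5_000_000)
    print(f"index ~{e.index_gb:.0f} GB, search db ~{e.search_db_gb:.0f} GB, "
          f"full crawl ~{e.full_crawl_hours:.1f} h")
```

Hardware, file types, and security trimming all affect crawl speed, so a real plan would re-measure these ratios in the target environment.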

Page 199: Microsoft Enterprise Seach using SharePoint

Results (cont)

Page 200: Microsoft Enterprise Seach using SharePoint

Summary of Known Limits and Restrictions

Tested recommendation of 50 million items per farm

Hard limits:
  1 indexer per SSP
  20 indexes per MOSS 2007 farm
  1 index per MOSS 2007 for Search farm
  500 content sources per SSP
  500 start addresses per content source
  500,000 documents limit for MOSS 2007 for Search Standard Edition
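The limits above can be turned into a simple sanity check for a planned deployment. The sketch below is illustrative only: the plan structure, `check_plan`, and the constant names are hypothetical, and it assumes that one indexer per SSP means one index per SSP, so the SSP count stands in for the index count. It does not call any SharePoint API.

```python
from typing import Dict, List

# Limits as listed on the slide above.
MAX_INDEXES_PER_FARM = 20
MAX_CONTENT_SOURCES_PER_SSP = 500
MAX_START_ADDRESSES_PER_SOURCE = 500
TESTED_ITEMS_PER_FARM = 50_000_000


def check_plan(ssps: Dict[str, Dict[str, int]], total_items: int) -> List[str]:
    """Return a list of problems found in a planned farm.

    ssps maps SSP name -> {content source name -> number of start addresses}.
    """
    problems: List[str] = []
    # Assumption: one index per SSP, so len(ssps) is the index count.
    if len(ssps) > MAX_INDEXES_PER_FARM:
        problems.append(f"{len(ssps)} indexes exceeds {MAX_INDEXES_PER_FARM} per farm")
    for ssp_name, sources in ssps.items():
        if len(sources) > MAX_CONTENT_SOURCES_PER_SSP:
            problems.append(f"{ssp_name}: {len(sources)} content sources "
                            f"exceeds {MAX_CONTENT_SOURCES_PER_SSP}")
        for source_name, start_addresses in sources.items():
            if start_addresses > MAX_START_ADDRESSES_PER_SOURCE:
                problems.append(f"{ssp_name}/{source_name}: {start_addresses} start "
                                f"addresses exceeds {MAX_START_ADDRESSES_PER_SOURCE}")
    if total_items > TESTED_ITEMS_PER_FARM:
        problems.append(f"{total_items} items exceeds the tested recommendation "
                        f"of {TESTED_ITEMS_PER_FARM}")
    return problems


if __name__ == "__main__":
    # Hypothetical plan: one SSP, one content source with too many start addresses.
    plan = {"ssp1": {"intranet": 3, "file shares": 501}}
    for problem in check_plan(plan, total_items=60_000_000):
        print(problem)
```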

Page 201: Microsoft Enterprise Seach using SharePoint

Capacity Planning References

Planning for performance and capacity:
  http://technet2.microsoft.com/Office/en-us/library/eb2493e8-e498-462a-ab5d-1b779529dc471033.mspx

Plan for software boundaries:
  http://technet2.microsoft.com/Office/en-us/library/6a13cd9f-4b44-40d6-85aa-c70a8e5c34fe1033.mspx

Estimate performance and capacity requirements for search environments:
  http://technet2.microsoft.com/Office/en-us/library/5465aa2b-aec3-4b87-bce0-8601ff20615e1033.mspx

Page 202: Microsoft Enterprise Seach using SharePoint

Questions?

Page 203: Microsoft Enterprise Seach using SharePoint