FAST Search for SharePoint 2010 March 2015 Kyle Bodenstab MCITP SharePoint 2010 Database Administrator Jack Henry & Associates

Smarter share point kc user group fast presentation march 2015

SharePoint 2010 Search Products

• SharePoint Foundation

• SharePoint Server 2010

• FAST Search Server 2010 for SharePoint

SharePoint 2010 Search Products

• SharePoint Foundation

− SharePoint site search within a single farm

SharePoint 2010 Search Products

• SharePoint Server 2010

− All the features of Foundation, plus

− Shallow refinement

− Taxonomy tags

− Crawl external farms, Windows File Shares, Exchange Public Folders, LOB apps, structured content in DBs, etc.

− 100 million item index capability

SharePoint 2010 Search Products

• FAST Search Server 2010 for SharePoint

− All the features of Foundation and Server, plus

− Contextual search

− Deep refinement

− Thumbnails, Previews, and Visual Search

− Advanced linguistics

− Social tags and people search

− Document promotion/demotion

− Search driven applications

− 500 million to 1 billion item index capable

What does this mean to users?

• FAST can:

− Deliver results that are relevant

− Search in the language of the business

− Tune results to improve accuracy

− Provide a single platform for indexing and presenting all content in the enterprise, not just SharePoint content

FAST Terms

• Metadata

• (Content) Processing

• (Content) Extraction

FAST Terms

• Metadata

− Is essential to the success of SharePoint search whether it be FAST or Server

− Manual metadata is unreliable and costly

− Poor metadata leads to poor findability

FAST Terms

• FAST Content Processing

− Is designed as a pipeline that performs:

− Format conversion

− Language encoding and detection

− Tokenization

− Lemmatization

− Property extraction

− Vectorization

− Date/Time Normalization

− Custom processing

− Property mapping

FAST Terms

• FAST Content Extraction

− Recognizes and delivers entities from unstructured content, such as:

− People, companies, locations (shallow refiners)

− Modified date, result type, language (deep refiners)

− Dictionaries (custom deep refiners):

− Business and industry specific concepts

− Customer names, competitor names

− Employee titles and expertise

− Product names

− Project names
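The dictionary-based extraction described above can be sketched in a few lines of Python. This is a minimal illustration, not FAST's actual extractor: the dictionary contents, the `DICTIONARIES` name, and the `extract_entities` function are all hypothetical, and matching is a simple case-insensitive whole-phrase lookup.

```python
import re

# Hypothetical custom dictionaries; a real deployment would load
# business-specific terms (products, projects, customers, and so on).
DICTIONARIES = {
    "product": ["Contoso Ledger", "Contoso Vault"],
    "project": ["Project Falcon"],
}

def extract_entities(text, dictionaries=DICTIONARIES):
    """Return {entity_type: sorted list of dictionary terms found in text}."""
    found = {}
    for entity_type, terms in dictionaries.items():
        hits = set()
        for term in terms:
            # Whole-phrase, case-insensitive match against the document text.
            pattern = r"\b" + re.escape(term) + r"\b"
            if re.search(pattern, text, flags=re.IGNORECASE):
                hits.add(term)
        if hits:
            found[entity_type] = sorted(hits)
    return found

doc = "Status report: project falcon will migrate to Contoso Vault in Q3."
print(extract_entities(doc))
```

Each entity type found this way could then be surfaced as a refiner on the results page, which is the role the custom dictionaries play in the list above.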

FAST Terms

• FAST Content Processing term definitions

− Language encoding and detection – identifies the encoding and language of the content so that the appropriate dictionaries can be applied downstream

− Tokenization – breaks text into tokens using rules for punctuation, diacritics, accents, compound words, phrases, etc.

− Lemmatization – applies linguistic normalization to content so that users' queries match documents containing words and phrases in either canonical or inflected forms (singular/plural, masculine/feminine); e.g., a query for "mice" would also find "mouse".

− Property extraction – recognizes entities such as companies, people, locations, etc., within content

− Vectorization – creates document vectors by weighting phrases/terms according to their frequency of occurrence; this supports "find documents similar to this one" on a result

− Date/Time Normalization – converts dates/times to a standard representation, e.g., 24-Mar-11 is treated the same as March 24, 2011

− Custom processing – extends content processing with custom dictionaries

− Property mapping – maps the metadata discovered in the pipeline to the managed properties in the index
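Several of these stages are easy to picture with a toy example. The sketch below is an assumption-laden illustration in Python, not the FAST pipeline itself: the `LEMMAS` table, the two supported date layouts, and the raw term-frequency weighting are all stand-ins chosen for brevity.

```python
import math
import re
from collections import Counter
from datetime import datetime

# Toy lemma table (an assumption); real lemmatizers use full language dictionaries.
LEMMAS = {"mice": "mouse", "documents": "document", "queries": "query"}

def tokenize(text):
    """Lowercase the text and split it into word tokens, dropping punctuation."""
    return re.findall(r"[a-z0-9]+", text.lower())

def lemmatize(tokens):
    """Map inflected forms to their canonical form where the table knows them."""
    return [LEMMAS.get(token, token) for token in tokens]

def normalize_date(raw):
    """Convert a few known date layouts to ISO 8601 (YYYY-MM-DD), else None.

    Month names are parsed with the current locale; English is assumed here.
    """
    for fmt in ("%d-%b-%y", "%B %d, %Y"):
        try:
            return datetime.strptime(raw, fmt).date().isoformat()
        except ValueError:
            continue
    return None

def vectorize(tokens):
    """Build a document vector from raw term frequencies."""
    return Counter(tokens)

def cosine(a, b):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

query = vectorize(lemmatize(tokenize("Mice")))
doc = vectorize(lemmatize(tokenize("The mouse ran past the documents.")))
print(cosine(query, doc) > 0)       # the lemmatized query overlaps the document
print(normalize_date("24-Mar-11"), normalize_date("March 24, 2011"))
```

Because both the query and the document pass through the same tokenize and lemmatize steps, "mice" and "mouse" end up as the same term, and both date strings normalize to the same value.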

SharePoint Search

• Default Ranking

− URL Depth – Higher ranking based on shorter URL.

− Doc Rank – Higher ranking based on the number and relative importance of links pointing to an item.

− Site Rank – Higher ranking based on the number and relative importance of links pointing to the items on a site.

− HW Boost – Placeholder used for generic usage of static rank points
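To make the URL Depth idea concrete, here is a small Python sketch. The point scale and the `url_depth_score` function are hypothetical; SharePoint's actual static rank formula is not given in these slides.

```python
from urllib.parse import urlparse

def url_depth(url):
    """Count non-empty path segments: http://host/a/b/c.docx has depth 3."""
    path = urlparse(url).path
    return len([segment for segment in path.split("/") if segment])

def url_depth_score(url, max_points=10):
    """Hypothetical scoring: lose one point per path segment, never below 0."""
    return max(max_points - url_depth(url), 0)

for url in ("http://intranet/policy.docx",
            "http://intranet/sites/hr/archive/2012/policy.docx"):
    print(url_depth(url), url_depth_score(url), url)
```

The shallower URL receives the higher score, which matches the slide's description of shorter URLs ranking higher.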

Search Results

• Dynamic Ranking

− Freshness – Higher ranking based on age of content. Content just added is given more points than content that is older.

− Context – Higher ranking based on the search word hits in the content.

− Proximity – Higher ranking based on a short distance between query terms in the content.

− Managed Property – Higher ranking based on content of a specific item type defined by a managed property.

− Authority – Higher ranking when the query terms are included in the link text.

− Query Authority (Click-through) – Higher ranking when query terms are associated with previous query results and clicked search results.
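Two of these dynamic signals, Proximity and Freshness, can be sketched as simple scoring functions in Python. The formulas below (inverse distance and a 30-day half-life) are illustrative assumptions, not the ranking model SharePoint or FAST actually uses.

```python
from datetime import date

def min_term_distance(tokens, term_a, term_b):
    """Smallest gap in token positions between the two terms, or None if either is absent."""
    pos_a = [i for i, token in enumerate(tokens) if token == term_a]
    pos_b = [i for i, token in enumerate(tokens) if token == term_b]
    if not pos_a or not pos_b:
        return None
    return min(abs(i - j) for i in pos_a for j in pos_b)

def proximity_score(tokens, term_a, term_b):
    """Hypothetical score: 1.0 for adjacent terms, shrinking as they move apart."""
    distance = min_term_distance(tokens, term_a, term_b)
    if distance is None:
        return 0.0
    return 1.0 / max(distance, 1)

def freshness_score(modified, today, half_life_days=30):
    """Hypothetical score: 1.0 for content changed today, halving every half_life_days."""
    age_days = max((today - modified).days, 0)
    return 0.5 ** (age_days / half_life_days)

tokens = "fast search for sharepoint".split()
print(proximity_score(tokens, "fast", "search"))      # adjacent terms
print(proximity_score(tokens, "fast", "sharepoint"))  # terms further apart
print(freshness_score(date(2015, 1, 30), date(2015, 3, 1)))
```

In a real ranking model these signals would be weighted and combined with the others listed above; the sketch only shows that each one rewards the property the slide describes.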

How Do Users Find Content?

• Site Structure

• Library Structure

• SharePoint Search

Demo

Site Structure

• Plan your site collection and sub-sites

• Consider splitting off projects to their own sites

• Keep things clean!

Library Structure

• Plan your libraries

• Consider using multiple shallow libraries vs a single deep library

• Plan and use metadata tagging

• Keep things clean!

SharePoint 2013 Enterprise Search

• New search capabilities in SP2013:

− Single search result center

− Search user interface improvements

− Hover preview of document results

− Results grouped by type – document, people, sites, etc.

− Results block of similar content

− Accurate query suggestions

− Relevance improvements

− New ranking models

− Query rules

− Changes in crawling

− Continuous crawls

− Results removal from crawl logs

SharePoint 2013 Enterprise Search

• New search capabilities in SP2013 (continued):

− Discovering structure and entities in unstructured content

− Configure the crawler to look for entities such as product names within the body or title of content.

− Create custom dictionaries as an entity

− Removal of redundant information – menu, headers, boilerplate content

− More flexible search schema

− Refinable and sortable managed properties

− Multiple search schemas

− Search health reports

SharePoint 2013 Enterprise Search

• New search capabilities in SP2013 (continued):

− New search architecture

Questions

Contact information:

Kyle Bodenstab, MCITP [email protected]

LinkedIn - www.linkedin.com/in/KyleBodenstab

Twitter - @jackson_curve

For the lighter side of life – jacksoncurve.blogspot.com