FAST Search for SharePoint 2010
March 2015
Kyle Bodenstab MCITP SharePoint 2010 Database Administrator Jack Henry & Associates
SharePoint 2010 Search Products
• SharePoint Foundation
• SharePoint Server 2010
• FAST Search Server 2010 for SharePoint
SharePoint 2010 Search Products
• SharePoint Foundation
− SharePoint site search within a single farm
SharePoint 2010 Search Products
• SharePoint Server 2010
− All the features of Foundation, plus
− Shallow refinement
− Taxonomy tags
− Crawl external farms, Windows File Shares, Exchange Public Folders, LOB apps, structured content in DBs, etc.
− 100 million item index capability
SharePoint 2010 Search Products
• FAST Search Server 2010 for SharePoint
− All the features of Foundation and Server, plus
− Contextual search
− Deep refinement
− Thumbnails, Previews, and Visual Search
− Advanced linguistics
− Social tags and people search
− Document promotion/demotion
− Search driven applications
− 500 million to 1 billion item index capable
What does this mean to users?
• FAST can:
− Deliver results that are relevant
− Search in the language of the business
− Tune results to improve accuracy
− Provide a single platform for indexing and presenting all content in the enterprise, not just SharePoint content
FAST Terms
• Metadata
− Is essential to the success of SharePoint search whether it be FAST or Server
− Manual metadata is unreliable and costly
− Poor metadata leads to poor findability
FAST Terms
• FAST Content Processing
− Is designed as a pipeline that performs:
− Format conversion
− Language encoding and detection
− Tokenization
− Lemmatization
− Property extraction
− Vectorization
− Date/Time Normalization
− Custom processing
− Property mapping
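The pipeline idea above can be sketched as a chain of small stages, each taking a document and passing it on. This is a toy illustration only, not FAST's implementation: the stage functions, the `LEMMAS` dictionary, and the `managed` property names are all hypothetical.

```python
import re

# Toy lemma dictionary (assumption); a real pipeline uses full linguistic resources.
LEMMAS = {"mice": "mouse", "geese": "goose", "documents": "document"}

def tokenize(doc):
    """Split the body text into lowercase word tokens."""
    doc["tokens"] = re.findall(r"[a-z0-9]+", doc["body"].lower())
    return doc

def lemmatize(doc):
    """Map each token to its canonical form when the toy dictionary knows it."""
    doc["lemmas"] = [LEMMAS.get(token, token) for token in doc["tokens"]]
    return doc

def map_properties(doc):
    """Copy selected pipeline output into hypothetical managed properties."""
    doc["managed"] = {
        "Title": doc.get("title", ""),
        "Terms": sorted(set(doc["lemmas"])),
    }
    return doc

# Stages run in order; each one enriches the same document dictionary.
PIPELINE = [tokenize, lemmatize, map_properties]

def process(doc):
    for stage in PIPELINE:
        doc = stage(doc)
    return doc

result = process({"title": "Lab notes", "body": "The mice ate the cheese."})
print(result["managed"]["Terms"])
```

Because the indexed terms include "mouse", a query for either "mouse" or "mice" could match this document once the query is lemmatized the same way.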
FAST Terms
• FAST Content Extraction
− Recognize and deliver entities from unstructured content such as:
− People, companies, locations (shallow refiners)
− Modified date, result type, language (deep refiners)
− Dictionaries (custom deep refiners):
− Business and industry specific concepts
− Customer names, competitor names
− Employee titles and expertise
− Product names
− Project names
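As a rough illustration of dictionary-based extraction, the sketch below scans text for names from a custom dictionary and groups the hits by entity type. It does not use any FAST API; the dictionary contents and the `extract_entities` helper are made up for the example.

```python
import re

# Hypothetical custom dictionary: entity type -> known names.
DICTIONARY = {
    "product": ["Contoso Ledger", "Fabrikam Pay"],
    "project": ["Project Falcon"],
}

def extract_entities(text, dictionary=DICTIONARY):
    """Return the dictionary names found in text, grouped by entity type."""
    found = {}
    for entity_type, names in dictionary.items():
        for name in names:
            # Whole-word, case-insensitive match on the dictionary entry.
            pattern = r"\b" + re.escape(name) + r"\b"
            if re.search(pattern, text, flags=re.IGNORECASE):
                found.setdefault(entity_type, []).append(name)
    return found

text = "Status update: project falcon will integrate with Contoso Ledger next quarter."
entities = extract_entities(text)
print(entities)
```

The extracted values could then be offered as refiners, so a user can narrow results to a single product or project.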
FAST Terms
• FAST Content Processing term definitions
− Language encoding and detection – looks at the language of the content so appropriate dictionaries can be applied downstream
− Tokenization – breaks text into tokens according to rules for punctuation, diacritics, accents, compound words, phrases, etc.
− Lemmatization – applies linguistic normalization to content so users' queries match documents that contain words and phrases in either canonical or inflected forms (singular/plural, masculine/feminine), e.g., mice would also find mouse.
− Property extraction – recognizes entities such as companies, people, locations, etc., within content
− Vectorization – creates document vectors by weighting phrases/terms by frequency of occurrence; enables "find documents similar to this one" on a result
− Date/Time Normalization – converts dates/times to a standard representation, e.g., 24-Mar-11 is the same as March 24, 2011
− Custom processing – extends content processing with custom dictionaries
− Property mapping – maps the metadata discovered in the pipeline to managed properties in the index
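Two of these definitions are easy to make concrete. The sketch below is a minimal illustration, not FAST's algorithm: it normalizes the two date formats from the example to ISO form, and it compares documents with plain term-frequency vectors and cosine similarity. The format list and sample texts are assumptions.

```python
import math
from collections import Counter
from datetime import datetime

# Assumed input formats; month names are parsed in the default (English) locale.
DATE_FORMATS = ["%d-%b-%y", "%B %d, %Y"]

def normalize_date(text):
    """Return the date as YYYY-MM-DD, or None if no known format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    return None

def vectorize(text):
    """Build a term-frequency vector from whitespace-separated words."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

print(normalize_date("24-Mar-11"), normalize_date("March 24, 2011"))

d1 = vectorize("search index crawl search")
d2 = vectorize("search crawl schedule")
d3 = vectorize("holiday party menu")
print(cosine(d1, d2) > cosine(d1, d3))
```

Documents that share many weighted terms score close to 1, which is the basis for a "similar documents" feature.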
SharePoint Search
• Default Ranking
− URL Depth – Higher ranking based on shorter URL.
− Doc Rank – Higher ranking based on the number and relative importance of links pointing to an item.
− Site Rank – Higher ranking based on the number and relative importance of links pointing to the items on a site.
− HW Boost – Placeholder used for generic usage of static rank points
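To show how a static feature such as URL depth might feed a score, here is a small sketch. The actual SharePoint ranking formula is not given in this deck, so the `1 / (1 + depth)` weighting is purely an assumption chosen so that shorter URLs score higher.

```python
from urllib.parse import urlparse

def url_depth(url):
    """Count the non-empty path segments in a URL."""
    path = urlparse(url).path
    return len([segment for segment in path.split("/") if segment])

def url_depth_score(url):
    """Toy static score (assumption): shallower URLs get a higher value."""
    return 1.0 / (1 + url_depth(url))

shallow = "http://intranet/sites/hr"
deep = "http://intranet/sites/hr/docs/2015/policy.docx"
print(url_depth(shallow), url_depth(deep))
print(url_depth_score(shallow) > url_depth_score(deep))
```

This is one reason flat site and library structures tend to help findability: important content sits closer to the root.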
Search Results
• Dynamic Ranking
− Freshness – Higher ranking for newer content. Content just added is given more points than older content.
− Context – Higher ranking based on the search word hits in the content.
− Proximity – Higher ranking based on a short distance between query terms in the content.
− Managed Property – Higher ranking based on content of a specific item type defined by a managed property.
− Authority – Higher ranking when the query terms are included in the link text.
− Query Authority (Click-through) – Higher ranking when query terms are associated with previous query results and clicked search results.
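Two of the dynamic features can be sketched in a few lines. These are toy formulas for illustration, not the ones SharePoint uses: proximity is scored as the inverse of the smallest gap between two query terms, and freshness decays by half every `half_life_days` (an assumed parameter).

```python
from datetime import date

def proximity_score(tokens, term_a, term_b):
    """Toy proximity: inverse of the smallest distance between the two terms."""
    pos_a = [i for i, token in enumerate(tokens) if token == term_a]
    pos_b = [i for i, token in enumerate(tokens) if token == term_b]
    if not pos_a or not pos_b:
        return 0.0
    min_gap = min(abs(i - j) for i in pos_a for j in pos_b)
    return 1.0 / max(min_gap, 1)  # guard against a zero gap

def freshness_score(modified, today, half_life_days=30):
    """Toy freshness: 1.0 for content changed today, halving every half-life."""
    age_days = max((today - modified).days, 0)
    return 0.5 ** (age_days / half_life_days)

tokens = "fast search server for sharepoint search".split()
print(proximity_score(tokens, "fast", "search"))
print(proximity_score(tokens, "fast", "sharepoint"))
print(freshness_score(date(2015, 1, 30), today=date(2015, 3, 1)))
```

A real ranking model combines many such features with tuned weights; the point here is only that each feature rewards the property its name describes.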
Site Structure
• Plan your site collection and sub-sites
• Consider splitting off projects to their own sites
• Keep things clean!
Library Structure
• Plan your libraries
• Consider using multiple shallow libraries vs a single deep library
• Plan and use metadata tagging
• Keep things clean!
SharePoint 2013 Enterprise Search
• New search capabilities in SP2013:
− Single search result center
− Search user interface improvements
− Hover preview of document results
− Results grouped by type – document, people, sites, etc.
− Results block of similar content
− Accurate query suggestions
− Relevance improvements
− New ranking models
− Query rules
− Changes in crawling
− Continuous crawls
− Results removal from crawl logs
SharePoint 2013 Enterprise Search
• New search capabilities in SP2013 (continued):
− Discovering structure and entities in unstructured content
− Configure the crawler to look for entities such as product names within the body or title of content.
− Create custom dictionaries as an entity
− Removal of redundant information – menu, headers, boilerplate content
− More flexible search schema
− Refinable and sortable managed properties
− Multiple search schemas
− Search health reports
SharePoint 2013 Enterprise Search
• New search capabilities in SP2013 (continued):
− New search architecture
Questions
Contact information:
Kyle Bodenstab, MCITP [email protected]
LinkedIn - www.linkedin.com/in/KyleBodenstab
Twitter - @jackson_curve
For the lighter side of life – jacksoncurve.blogspot.com