http://www.dkd.de
Friday, June 15, 2012
dkd development kommunikation design
Welcome
TYPO3 Conference, Quebec, Canada
Olivier Dobberkau, Founder and CIO, dkd
Member of the Expert Advisory Board, TYPO3 Association
Twitter @[email protected]
Everything You Always Wanted to Know About Search in TYPO3. But Were Afraid to Ask
Woody Allen
Inspiration for this Talk:
Woody Allen movie: “Everything You Always Wanted to Know About Sex * But Were Afraid to Ask”
Internet Movie Database: http://www.imdb.com/title/tt0068555/
Agenda
A short history of Search
Slang
The need to Search
Who is searching and what is (s)he searching for?
Search in TYPO3 with Apache Solr
Questions & Answers
History
A short trip through the history of search solutions in the IT era.
Really short, with lots of missing facts, and not scientific at all.
Scratch your own itch, IBM.
In the beginning was the mainframe
In 1969, IBM develops STAIRS (Storage and Information Retrieval System)
Full-text search for terminal applications
Performance: “far below anyone's expectations”
First use in the DOJ case against IBM
Source: A History of Online Information Services, 1963-1976, by Charles P. Bourne and Trudi Bellardo Hahn
Internet years are dog years
The Internet changes what is needed from full-text search.
With Lycos, Alltheweb, Infoseek, Excite and AltaVista, search sites compete to answer the question “How do I find something on the Internet?”
In 1995, it's a race for the love of searching Internet users.
Yahoo tries to be the directory of websites
And then came GOOGLE
Who does not know about Google's secret?
The Anatomy of a Large-Scale Hypertextual Web Search Engine
http://infolab.stanford.edu/~backrub/google.html
Visionary Paper
The technologies and principles it names are industry standard and are still changing our IT industry. (MapReduce, big data & PageRank)
A must read!
Slang
It's all about words!
Information Retrieval (IR)
Term versus Query
Index
Recall & Precision
Relevancy
Index, Inverted Index & Posting List
Recency & Authority
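To make the terms above concrete, here is a minimal Python sketch, not how Lucene or Solr actually implement them. It builds an inverted index (term to posting list), answers a query from it, and computes precision and recall against a hand-picked set of relevant documents. The sample documents and all function names are made up for illustration.

```python
from collections import defaultdict

def tokenize(text):
    """Split text into lowercase terms (naive: whitespace only)."""
    return text.lower().split()

def build_inverted_index(docs):
    """Map each term to its posting list: the sorted IDs of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}

def search(index, query):
    """Return the IDs of documents that contain every term of the query."""
    postings = [set(index.get(term, [])) for term in tokenize(query)]
    if not postings:
        return []
    return sorted(set.intersection(*postings))

def precision_recall(retrieved, relevant):
    """Precision: share of retrieved docs that are relevant.
    Recall: share of relevant docs that were retrieved."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical sample documents.
docs = {
    1: "TYPO3 search with Apache Solr",
    2: "Indexed search in TYPO3",
    3: "Apache Solr query syntax",
}
index = build_inverted_index(docs)
results = search(index, "typo3 search")           # documents 1 and 2
print(index["solr"])                              # posting list for the term "solr"
print(precision_recall(results, relevant=[2, 3]))
```

In this toy setup the query retrieves two documents, of which one is relevant, and one of the two relevant documents is found, so precision and recall are both 0.5.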
The need to Search
What leads us when we search? How do we search? How does what we find change us?
People are like Bears (only less fur)
How do we search?
Marcia Bates, 1989
THE DESIGN OF BROWSING AND BERRYPICKING TECHNIQUES FOR THE ONLINE SEARCH INTERFACE
http://pages.gseis.ucla.edu/faculty/bates/berrypicking.html
Every search can be described with this
Marcia J. Bates: Berrypicking techniques for the online search interface (1989)
Carrots & Sticks
Search Behavior Patterns, John Ferrara
http://www.boxesandarrows.com/view/search-behavior
Domain Expertise
Search Expertise
Cognitive Style
Goal Type
Mode of seeking
Situational idiosyncrasies
Neo: The Matrix
Matrix of Scope/Style of information needs
Scope & Type: Tyler Tate; Sohn et al.; Church & Smyth
http://twigkit.com/blog/2011/12/06/mobile-information-needs.html
Search = Success for your Website
Benefits for your Visitors & Users
They will find it on your Website
Serendipity
Better and faster knowledge transfer
Business benefits
ROI
Agility
Awareness and Enablement
TYPO3 & Search
Shameless Plug: Apache Solr for TYPO3
I still have some “I love Indexed Search” buttons to give away.
Solr-Components
Indexing
Query
Analysis
Results
Additional Components
Indexing
What can be indexed?
TYPO3 Content
TYPO3 Databases (TCA Tables)
External Websites
RSS-Feeds
Files
...
Indexing Features
Synonyms
Stopwords
Protected words
External Content
RSS
Microsites
Application Data
...
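How stopwords, protected words and synonyms interact at index time can be sketched in a few lines of Python. This is an assumption-laden toy, not Solr's analyzer chain: in Solr the order and behavior of these filters is configured in the schema. The word lists, the naive plural stemmer and the names `analyze` and `naive_stem` are all hypothetical.

```python
# Hypothetical word lists; a real setup keeps these in configuration files.
STOPWORDS = {"the", "a", "of", "in"}
SYNONYMS = {"notebook": "laptop"}   # map a term to its preferred form
PROTECTED_WORDS = {"news"}          # terms that must not be stemmed

def naive_stem(term):
    """Strip a plural 's' (a crude stand-in for a real stemmer)."""
    if len(term) > 3 and term.endswith("s") and not term.endswith("ss"):
        return term[:-1]
    return term

def analyze(text):
    """Turn text into index terms: drop stopwords, stem unless protected, map synonyms."""
    terms = []
    for token in text.lower().split():
        if token in STOPWORDS:
            continue
        if token not in PROTECTED_WORDS:
            token = naive_stem(token)
        terms.append(SYNONYMS.get(token, token))
    return terms

print(analyze("The news of notebooks"))  # ['news', 'laptop']
print(naive_stem("news"))                # 'new': what protection prevents
```

Running the same `analyze` function on the query side means that a search for "Laptops" and a document mentioning "notebooks" end up with the same term.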
Query
Query Options
Operators
“+” and “-” to add or exclude terms
Soon: “and” and “or” to combine terms
Quotes to tie words together, e.g. “This is a Search with many Terms”
Diacritical Characters
cuvée = cuvee
Søren = Sören = Soeren = Sœren = Soren
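The diacritics equivalences above can be approximated with Python's standard `unicodedata` module. This is only a sketch of the idea, not the folding filter Solr uses. The mapping table and the names `fold` and `fold_key` are assumptions. Note that "ø" and "œ" have no Unicode decomposition, so they need explicit entries.

```python
import unicodedata

# Characters without a Unicode decomposition need an explicit mapping (assumed table).
EXTRA_FOLDS = {"ø": "o", "Ø": "O", "œ": "oe", "Œ": "OE", "ß": "ss"}

def fold(text):
    """Lowercase the text and strip diacritics: 'cuvée' -> 'cuvee', 'Sören' -> 'soren'."""
    text = "".join(EXTRA_FOLDS.get(ch, ch) for ch in text)
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.lower()

def fold_key(text):
    """Additionally collapse the transliteration 'oe' to 'o'.
    Aggressive: it also turns 'poet' into 'pot', so it trades precision for recall."""
    return fold(text).replace("oe", "o")

names = ["Søren", "Sören", "Soeren", "Sœren", "Soren"]
print({fold_key(name) for name in names})  # {'soren'}
print(fold("cuvée") == fold("cuvee"))      # True
```

Applying the same folding to indexed terms and to query terms is what makes all five spellings find each other.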
Query
Takes care of Access Control Rights
Autocomplete
Did you mean?
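Autocomplete and "Did you mean?" can both be pictured as lookups against the list of indexed terms. The sketch below uses only the Python standard library (`bisect` for prefix lookup, `difflib` for fuzzy matching). Solr's own suggest and spellcheck components work differently, and the term list and function names here are made up.

```python
import bisect
import difflib

# Hypothetical list of indexed terms, kept sorted for prefix lookup.
TERMS = sorted(["search", "server", "solr", "sorting", "typo3"])

def autocomplete(prefix, terms=TERMS, limit=5):
    """Return up to `limit` indexed terms that start with `prefix`."""
    prefix = prefix.lower()
    start = bisect.bisect_left(terms, prefix)
    matches = []
    for term in terms[start:]:
        if not term.startswith(prefix):
            break
        matches.append(term)
        if len(matches) == limit:
            break
    return matches

def did_you_mean(word, terms=TERMS):
    """Suggest the closest indexed term, or None if the word is known or nothing is close."""
    word = word.lower()
    if word in terms:
        return None
    close = difflib.get_close_matches(word, terms, n=1, cutoff=0.6)
    return close[0] if close else None

print(autocomplete("so"))     # ['solr', 'sorting']
print(did_you_mean("serch"))  # 'search'
```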
Results
Results
Search results that link to the matching document
Page Browser
Sorting
Relevancy (Score)
Author
Date (cr_date of TYPO3 Page)
Your own criteria
Results
View-Helper to display additional Information like Custom Prices & Preview images.
Preset Filters so that Facets are activated with a Query
Results
Field boosting (terms in certain fields score higher; the boosts can be set freely)
Boost functions (functions on values of documents, e.g. newer documents are ranked higher)
Query manipulation (queries can be changed before they hit Solr)
Elevation (Paid content)
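The three ranking ideas on this slide (field boosts, a boost function on a document value, and elevation) can be combined in a toy scorer. The Python sketch below is an illustration under assumptions: the boost values, the recency formula, the elevation table and the sample documents are invented, and Solr's real scoring is considerably more involved.

```python
from datetime import date

FIELD_BOOSTS = {"title": 3.0, "content": 1.0}  # assumed: title matches count three times
ELEVATED = {"solr": [42]}                      # assumed: query -> document IDs pinned to the top

def field_score(doc, term):
    """Sum term occurrences per field, weighted by the field's boost."""
    return sum(boost * doc[field].lower().split().count(term)
               for field, boost in FIELD_BOOSTS.items())

def recency_boost(doc, today):
    """Boost function on a document value: newer documents get a factor closer to 1."""
    age_years = max((today - doc["date"]).days, 0) / 365.25
    return 1.0 / (1.0 + age_years)

def rank(docs, query, today):
    """Return document IDs: elevated documents first, then by boosted score."""
    term = query.lower()
    scored = []
    for doc in docs:
        base = field_score(doc, term)
        if base > 0:
            scored.append((base * recency_boost(doc, today), doc["id"]))
    scored.sort(reverse=True)
    ordered = [doc_id for _, doc_id in scored]
    known_ids = {doc["id"] for doc in docs}
    pinned = [doc_id for doc_id in ELEVATED.get(term, []) if doc_id in known_ids]
    return pinned + [doc_id for doc_id in ordered if doc_id not in pinned]

# Hypothetical sample documents.
DOCS = [
    {"id": 1, "title": "Solr basics", "content": "solr solr indexing", "date": date(2010, 6, 15)},
    {"id": 2, "title": "Search news", "content": "news about solr", "date": date(2012, 6, 15)},
    {"id": 42, "title": "Sponsored", "content": "solr hosting", "date": date(2009, 6, 15)},
]
TODAY = date(2012, 6, 15)
print(rank(DOCS, "solr", TODAY))  # [42, 1, 2]
```

Without the elevation entry, document 42 would rank last: it matches only once in the content field and is the oldest of the three.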
Results
Template engine: flexible templates to customize your results listing quickly and easily
Search word highlighting
Spell-Checking: "Did you mean?"
Common Searches
Recent Searches
Facets
Type-Facets
Author
Type of Document (Pages, News, Files & many more)
Range facets (work in progress), e.g. 1-10 $ or a slider
Hierarchical facets (great if you have lots of categorized data, as in news or file repositories)
Facets can be combined with each other, e.g. “Show me all red & blue shoes”
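Facet counting and combining facet filters can be sketched in Python as follows. This is a simplified model, not the extension's implementation: the sample documents and the names `facet_counts` and `apply_filters` are hypothetical. Values selected within one facet are ORed, and different facets are ANDed, which is one common convention.

```python
from collections import Counter

# Hypothetical sample documents.
DOCS = [
    {"id": 1, "type": "shoes", "color": "red"},
    {"id": 2, "type": "shoes", "color": "blue"},
    {"id": 3, "type": "shoes", "color": "green"},
    {"id": 4, "type": "news"},
    {"id": 5, "type": "bags", "color": "red"},
]

def facet_counts(docs, field):
    """Count how many documents carry each value of a field."""
    return Counter(doc[field] for doc in docs if field in doc)

def apply_filters(docs, filters):
    """Keep documents matching all facets; within a facet any selected value matches."""
    return [doc for doc in docs
            if all(doc.get(field) in allowed for field, allowed in filters.items())]

# "Show me all red & blue shoes"
selection = apply_filters(DOCS, {"type": {"shoes"}, "color": {"red", "blue"}})
print([doc["id"] for doc in selection])  # [1, 2]
print(facet_counts(DOCS, "type"))        # shoes: 3, news: 1, bags: 1
```

Recomputing `facet_counts` on the filtered selection gives the counts a user would see next to the remaining facet options.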
Facets
Geo search (work in progress), e.g. to search for and display the locations of data of a certain type: stores, service points, bus stops
Geo-IP based on the IP of your visitor, e.g. “Where is the nearest sales point for your products?”
Facets are TYPO3 content objects (they can be manipulated with TypoScript, e.g. Gifbuilder)
Filters can be preset (you can preset certain facets)
...
Analysis
Analysis
Query Logging
Stats on Queries (Work in Progress)
Userbased Ranking (Work in Progress)
Integration with analytics tools possible
Roll your own
There might be a Solr Server feature coming up ...
Additional Components
Additional Components
More Like This component: can show related documents on the detail page
It's possible to access indexed data
Nutch crawler to index non-TYPO3 websites
Data Import Handler
Thank You! Merci.
Sources
Lucene Scoring for dummies: http://www.supermind.org/blog/378/lucene-scoring-for-dummies
Photos: Søren Schaffstein