Everything You Always Wanted to Know About Search in TYPO3

http://www.dkd.de

Friday, 15 June 2012

dkd: development, kommunikation, design

Welcome

TYPO3 Conference, Quebec, Canada

Olivier Dobberkau, Founder and CIO, dkd

Member of the Expert Advisory Board, TYPO3 Assoc.

Twitter: @T3RevNeverend

olivier.dobberkau@dkd.de

Everything You Always Wanted to Know About Search in TYPO3. But Were Afraid to Ask

Woody Allen

Inspiration for this Talk:

Woody Allen Movie: „Everything You Always Wanted to Know About Sex * But Were Afraid to Ask“

Internet Movie Database: http://www.imdb.com/title/tt0068555/

Agenda

A short history of Search

Slang

The need to Search

Who is searching and what is (s)he searching for?

Search in TYPO3 with Apache Solr

Questions & Answers

History

A short trip through the history of search solutions in the age of IT.

Really short, with lots of facts missing, and not scientific at all.

Scratch your own itch, IBM.

In the beginning was the mainframe

In 1969 IBM develops STAIRS (Storage and Information Retrieval System)

Full-text search for terminal applications

Performance: „far below anyone‘s expectations“

First used in the DOJ case against IBM

Source: A History of Online Information Services, 1963-1976, by Charles P. Bourne and Trudi Bellardo Hahn

Internet years are dog years

The Internet changes what is needed from full-text search.

Lycos, Alltheweb, Infoseek, Excite and AltaVista: search pages compete to solve „How do I find something on the Internet?“

In 1995 it's a race for the love of searching Internet users.

Yahoo tries to be the Directory of Websites

And then came GOOGLE

Who does not know Google's secret?

The Anatomy of a Large-Scale Hypertextual Web Search Engine

http://infolab.stanford.edu/~backrub/google.html

Visionary Paper

The technologies and principles named here have become industry standards and are still changing the IT industry (MapReduce, Big Data & PageRank).

A must read!

Slang

It's all about words!

Information Retrieval (IR)

Term versus Query

Index

Recall & Precision

Relevancy

Index, Inverted Index & Posting List

Recency & Authority
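A few of these terms are easier to grasp in code. The following is a minimal sketch, not how Solr or Lucene implement them: it builds an inverted index whose values are posting lists, answers a query by intersecting them, and computes precision and recall against a hand-picked set of relevant documents. The sample documents, the whitespace tokenizer, and all function names are assumptions made for illustration.

```python
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase terms (a deliberately naive analyzer)."""
    return text.lower().split()


def build_inverted_index(docs):
    """Map each term to its posting list: the sorted ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


def search(index, query):
    """Return the ids of documents that contain every term of the query."""
    postings = [set(index.get(term, [])) for term in tokenize(query)]
    return set.intersection(*postings) if postings else set()


def precision_recall(retrieved, relevant):
    """Precision: share of retrieved docs that are relevant.
    Recall: share of relevant docs that were retrieved."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical sample documents.
docs = {
    1: "TYPO3 search with Apache Solr",
    2: "Indexed Search in TYPO3",
    3: "Apache Solr query syntax",
}
index = build_inverted_index(docs)
hits = search(index, "typo3 search")

# Assume only document 1 is what the user actually wanted.
precision, recall = precision_recall(hits, relevant={1})
```

Here the query finds everything relevant (full recall) but also returns a document the user did not want, which lowers precision.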

The need to Search

What leads us when we search? How do we search? How does what we find change us?

People are like Bears (only less fur)

How do we search?

Marcia Bates, 1989

THE DESIGN OF BROWSING AND BERRYPICKING TECHNIQUES FOR THE ONLINE SEARCH INTERFACE

http://pages.gseis.ucla.edu/faculty/bates/berrypicking.html

Every search can be described with this model

Marcia J. Bates: Berrypicking techniques for the online search interface (1989)

Carrots & Sticks

Search Behavior Patterns, John Ferrara

http://www.boxesandarrows.com/view/search-behavior

Domain Expertise

Search Expertise

Cognitive Style

Goal Type

Mode of seeking

Situational idiosyncrasies

Neo: The Matrix

Matrix of Scope/Style of information needs

Scope & Type: Tyler Tate; Sohn et al.; Church & Smyth

http://twigkit.com/blog/2011/12/06/mobile-information-needs.html

Search = Success for your Website

Benefits for your Visitors & Users

They will find it on your Website

Serendipity

Better and faster knowledge transfer

Business benefits

ROI

Agility

Awareness and Enablement

TYPO3 & Search

Shameless Plug: Apache Solr for TYPO3

I still have some „I love Indexed Search“ buttons to give away.

Solr-Components

Indexing

Query

Analysis

Results

Additional Components

Indexing

What can be indexed?

TYPO3 Content

TYPO3 Databases (TCA Tables)

External Websites

RSS-Feeds

Files

...

Indexing Features

Synonyms

Stopwords

Protected words

External Content

RSS

Microsites

Application Data

...
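To make the first three features concrete, here is a toy analysis chain. It is only a sketch: in Solr these lists usually live in text files such as synonyms.txt, stopwords.txt and protwords.txt and are applied by configurable filters, whereas the word lists, the crude "strip a trailing s" stemmer, and the function names below are assumptions for illustration.

```python
# Hypothetical word lists; a real installation would maintain these per language.
STOPWORDS = {"the", "a", "of", "in"}
SYNONYMS = {"notebook": ["notebook", "laptop"]}
PROTECTED = {"news"}  # terms the stemmer must leave untouched


def naive_stem(term):
    """Strip a trailing 's' (a stand-in for a real stemmer)."""
    return term[:-1] if term.endswith("s") and len(term) > 3 else term


def analyze(text):
    """Turn raw text into index terms: drop stopwords, stem unless protected, expand synonyms."""
    terms = []
    for token in text.lower().split():
        if token in STOPWORDS:
            continue
        if token not in PROTECTED:
            token = naive_stem(token)
        terms.extend(SYNONYMS.get(token, [token]))
    return terms


terms = analyze("The news of notebooks")
```

Without the protected-words list, "news" would be stemmed to "new" and start matching unrelated documents; with the synonym list, a search for "laptop" also finds documents that only mention notebooks.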

Query

Query Options

Operators

“+” and “-” to add or exclude terms

Soon: “and” and “or” to combine terms

Quotes to tie words together, e.g. “This is a Search with many Terms”

Diacritical Characters

cuvée = cuvee

Søren = Sören = Soeren = Sœren = Soren
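The equivalences above come from folding terms to a common form at both index time and query time. The sketch below shows one way to get exactly these matches in plain Python; it is not the filter chain Solr uses. The special-character table and the crude digraph rule ("oe" becomes "o", and so on) are assumptions chosen so that the examples from the slide collapse to one term, and the rule is lossy for other words.

```python
import unicodedata

# Characters that Unicode decomposition does not split into base letter + accent.
SPECIAL = {"ø": "o", "œ": "oe", "æ": "ae", "ß": "ss"}
# Crude German-style transliteration rule (assumption, lossy by design).
DIGRAPHS = {"ae": "a", "oe": "o", "ue": "u"}


def fold(term):
    """Reduce a term to a lowercase, accent-free form for matching."""
    term = term.lower()
    term = "".join(SPECIAL.get(ch, ch) for ch in term)
    # NFKD splits e.g. 'ö' into 'o' plus a combining diaeresis, which is then dropped.
    decomposed = unicodedata.normalize("NFKD", term)
    term = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    for digraph, plain in DIGRAPHS.items():
        term = term.replace(digraph, plain)
    return term


names = ["Søren", "Sören", "Soeren", "Sœren", "Soren"]
folded = {fold(name) for name in names}
```

Because the same function is applied to the indexed text and to the query, it does not matter which spelling the visitor types.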

Query

Takes care of Access Control Rights

Autocomplete

Did you mean?

Results

Results

Search results linking to the result

Page Browser

Sorting

Relevancy (Score)

Author

Date (cr_date of TYPO3 Page)

Your own criteria

Results

View-Helper to display additional Information like Custom Prices & Preview images.

Preset Filters so that Facets are activated with a Query

Results

Field boosting (terms in certain fields score higher; the boosts can be set freely)

Boost functions (functions on document values, e.g. newer documents are ranked higher)

Query manipulation (queries can be changed before they hit Solr)

Elevation (Paid content)
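Field boosting and boost functions can be pictured with a toy scoring function. This is a simplified sketch and not Lucene's actual scoring formula: the boost values, field names, and documents are assumptions, and the recency curve merely borrows the a/(m*x+b) shape of Solr's recip function.

```python
# Hypothetical boosts: a hit in the title counts five times as much as one in the content.
FIELD_BOOSTS = {"title": 5.0, "content": 1.0}


def recency_boost(age_days, m=1 / 365, a=1.0, b=1.0):
    """Boost of the shape a / (m * x + b): 1.0 for a new document, about 0.5 after a year."""
    return a / (m * age_days + b)


def score(doc, term):
    """Sum boosted term counts per field, then scale by the recency boost."""
    text_score = sum(
        boost * doc.get(field, "").lower().split().count(term)
        for field, boost in FIELD_BOOSTS.items()
    )
    return text_score * recency_boost(doc["age_days"])


in_title = {"title": "Solr search", "content": "An introduction", "age_days": 0}
in_content = {"title": "Introduction", "content": "All about solr", "age_days": 0}
old_title = {"title": "Solr search", "content": "An introduction", "age_days": 365}

ranked = sorted(
    [in_content, old_title, in_title],
    key=lambda doc: score(doc, "solr"),
    reverse=True,
)
```

The fresh document with the term in its title ranks first, the year-old copy second, and the document that only mentions the term in its content last.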

Results

Template engine: flexible templates to customize your results listing quickly and easily

Search word highlighting

Spell-Checking: "Did you mean?"

Common Searches

Recent Searches

Facets

Type-Facets

Author

Type of Document (Pages, News, Files & many more)

Range facets (work in progress), e.g. 1-10 $ or a slider

Hierarchical facets (great if you have lots of categorized data, as in news or file repositories)

Facets can be combined with each other (e.g. show me all red & blue shoes)
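Conceptually, a facet is a count of result documents per field value, and selecting facet values narrows the result set. The sketch below illustrates that idea on an in-memory list; Solr computes facets on its index instead. The documents, field names, and the rule "OR within a field, AND across fields" are assumptions for illustration.

```python
from collections import Counter

# Hypothetical result documents.
docs = [
    {"title": "Sneaker", "type": "product", "color": "red"},
    {"title": "Boot", "type": "product", "color": "blue"},
    {"title": "Sandal", "type": "product", "color": "green"},
    {"title": "Shoe care tips", "type": "news", "color": None},
]


def facet_counts(docs, field):
    """Count how many documents carry each value of the given field."""
    return Counter(doc[field] for doc in docs if doc.get(field) is not None)


def apply_filters(docs, filters):
    """Keep documents matching every filter; each filter accepts any of its values."""
    return [
        doc
        for doc in docs
        if all(doc.get(field) in values for field, values in filters.items())
    ]


type_facet = facet_counts(docs, "type")
red_and_blue = apply_filters(docs, {"type": {"product"}, "color": {"red", "blue"}})
titles = [doc["title"] for doc in red_and_blue]
```

The type facet would be rendered next to the results as "product (3), news (1)", and combining the type and color filters leaves the sneaker and the boot.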

Facets

Geo search (work in progress), e.g. if you want to search for and display the locations of data of a certain type: stores, service points, bus stops

Geo-IP based on the IP address of your visitor (e.g. where is the nearest sales point for your products?)

Facets are TYPO3 content objects (they can be manipulated with TypoScript, e.g. GIFBUILDER)

Filters can be preset (you can preset certain facets)

...

Analysis

Analysis

Query Logging

Stats on Queries (Work in Progress)

Userbased Ranking (Work in Progress)

Integration with analytics tools possible

Roll your own

There might be a Solr Server feature coming up ...

Additional Components

Additional Components

The More Like This component can show related documents on the details page

It's possible to access indexed data

Nutch crawler to index non-TYPO3 websites

Data Import Handler

Thank You! Merci.

Sources

Lucene Scoring for Dummies: http://www.supermind.org/blog/378/lucene-scoring-for-dummies

Photos: Søren Schaffstein
