Metasearching: The Promise and Peril Roy Tennant


Outline

Why Metasearching?
– The Problem
– The Promise
– Principles

Metasearching in Libraries Today

Issues

Present Challenges and Possible Futures

The Problem

Most users do not care where the information they need comes from, or who provides it…nor should they have to

But our systems presently require them to know:
– How to select one or more databases
– How to get to them
– How to use the unique search options for each

How can we create systems that minimize what the user needs to know to get what they want?

The Promise of Metasearching

The “Holy Grail” of resource discovery: simple to use one-stop shopping

The simplification of a formerly complex activity (put the complexity in the back end, not the front)

Allows the user to focus on evaluating results, not figuring out where to search

Principles

Only librarians like to search, everyone else likes to find

All things being equal, one place to search is better than two or more

“Good enough” is often just that

The size of the result set isn’t as important as how the results are displayed (e.g., relevance)

Principles

Our ability to create effective one-stop searching is dependent on our ability to appropriately target user needs

Services should be placed as close to the user as possible

http://searchlight.cdlib.org/cgi-bin/searchlight

Source: ARL Statistics

Lessons from SearchLight

Metasearching is not for everyone or every purpose…

…but metasearching is still worth doing (it serves particular needs and audiences)

For a large research library, metasearching is best focused on particular needs (e.g., “a few good things”) or subject areas (e.g., Biology)

CDL Metasearch Infrastructure Project

Web site

No. | Author | Title | Year | Source | Actions
1. | Watson JD; Crick, FH | Molecular structure of nucleic acids. A structure for deoxyribose nucleic acid. | 1953 | Nature [via Expanded Academic ASAP] | View full text [details] [basket]
2. | Miller GA | The magical number seven plus or minus two: some limits on our capacity for processing information. | 1956 | Psychol Rev [via Expanded Academic ASAP] | View full text [details] [basket]
3. | Bush, Vannevar | As we may think | 1945 | The Atlantic [via Google] | View full text [details] [basket]

Home | Library Info | Research Services

Giant squid Search

More search options | Search tips

Best bets for finding articles related to giant squid:
* BIOSIS Previews
* Expanded Academic ASAP
* Lexis-Nexis


Ask a Librarian for help with research or using FindIt.

FindIt is a service of the UC Libraries, powered by the CDL.

UCSC home

FindIt Basic Search | Advanced Search | Help

Search less, find more...

Current Search Results | Marked Items

Sign In | Quit

For background information about giant squid, try Encyclopaedia Britannica.

Sort results by: Relevance | Title | Source | Year

Search for giant squid found 3,345,452 results. The system retrieved 60 results and is displaying 1-50. If you want to wait longer you may wish to try to get more results.

To save time, search in only one place:
Google 1,234,132
Britannica Online 1,203
Expanded Academic ASAP 345

Search Results

Previous | Next -->

Diagram: search flow among the User Interface software, the Metasearch Software, and the Database Advisor Tool. Step labels, in the order they appear:
– Initiates search
– Sends search to (on two arrows)
– Performs search, identifies top 2-3 DBs, writes out file referenced by results page
– Launches multiple searches
– Receives results
– Merges, dedupes results
– Builds display
– Sends merged display to

Database Advisor Service
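
The "launch multiple searches, receive results, merge, dedupe" steps in the flow above can be sketched roughly as follows. This is only an illustration under stated assumptions: the three target functions are hypothetical stubs (no real database is contacted), and the duplicate test is a deliberately crude one (case-folded title plus year), not what any particular metasearch product does.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

# Hypothetical stand-ins for real targets; each returns a list of records.
def search_target_a(query):
    return [{"title": "Giant squid biology", "year": 1998, "source": "Target A"}]

def search_target_b(query):
    return [
        {"title": "GIANT SQUID BIOLOGY", "year": 1998, "source": "Target B"},
        {"title": "Deep-sea cephalopods", "year": 2001, "source": "Target B"},
    ]

def search_target_down(query):
    # Simulates an unreliable target.
    raise TimeoutError("target did not respond")

TARGETS = {"a": search_target_a, "b": search_target_b, "down": search_target_down}

def metasearch(query, targets):
    """Search all targets in parallel, then merge and dedupe what comes back."""
    merged, seen, failed = [], set(), []
    with ThreadPoolExecutor(max_workers=len(targets)) as pool:
        # Launch multiple searches.
        futures = {pool.submit(fn, query): name for name, fn in targets.items()}
        # Receive results as each target answers (order is not guaranteed).
        for future in as_completed(futures):
            name = futures[future]
            try:
                records = future.result()
            except Exception:
                failed.append(name)  # a failed target should not sink the search
                continue
            # Merge, skipping records already seen from another target.
            for record in records:
                key = (record["title"].casefold(), record["year"])
                if key not in seen:
                    seen.add(key)
                    merged.append(record)
    return merged, failed

merged, failed = metasearch("giant squid", TARGETS)
```

A real system would also need per-target timeouts and a step that builds the display from `merged`; both are left out here to keep the sketch short.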

Technical Underpinnings

Structured query/response methods:
– Z39.50
– SRU/SRW, the “next generation” (XML Web Services) version of Z39.50
– XML Gateways (proprietary XML APIs)

Unstructured query/response:
– URL packing and HTML screen scraping

Record merging and de-duping

Ranking (mostly a dream)

OpenURL support (e.g., SFX)
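
To make the "structured query" idea concrete, here is a minimal sketch of how an SRU searchRetrieve request can be expressed as a URL. The parameter names (`operation`, `version`, `query`, `startRecord`, `maximumRecords`) come from the SRU specification; the base URL, the helper name `build_sru_url`, and the choice of the `dc.title` index are assumptions for illustration only, and a given server may support different indexes.

```python
from urllib.parse import urlencode, urlparse, parse_qs

def build_sru_url(base_url, cql_query, start=1, max_records=10, version="1.1"):
    """Build an SRU searchRetrieve URL for a CQL query (hypothetical helper)."""
    params = {
        "operation": "searchRetrieve",
        "version": version,
        "query": cql_query,          # the search itself, written in CQL
        "startRecord": start,        # 1-based position of the first record
        "maximumRecords": max_records,
    }
    # urlencode handles the escaping of spaces, quotes, and "=".
    return base_url + "?" + urlencode(params)

# "sru.example.org" is a placeholder, not a real SRU server.
url = build_sru_url("https://sru.example.org/catalog", 'dc.title = "giant squid"')

# Decoding the URL again shows the structured parts of the request.
decoded = parse_qs(urlparse(url).query)
```

Because the request is structured, the same function can address any SRU target by changing only `base_url`, which is the contrast with URL packing and screen scraping, where each target needs its own hand-built logic.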

Software Provider Issues

Access management

Search mapping

Unreliability of targets

Systems that don’t support an API (that must be screen-scraped)

Inadequate result data for good:
– Deduping
– Ranking
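
Why inadequate result data hurts deduping can be shown with a small sketch. It assumes a simple matching rule that is not taken from any real product: two records count as the same item only when normalized title, year, and first author surname all agree, and author strings start with the surname (as in "Bush, Vannevar" or "Miller GA"). A record missing any of those fields cannot be compared safely, so it is always kept.

```python
import re

def match_key(record):
    """Return a comparison key, or None when the record is too sparse to compare."""
    title = record.get("title")
    year = record.get("year")
    author = record.get("author")
    if not (title and year and author):
        return None
    # Lowercase the title and collapse punctuation/whitespace to single spaces.
    norm_title = re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()
    # Assumes the surname comes first in the author string.
    surname = re.split(r"[,\s]+", author.strip())[0].lower()
    return (norm_title, str(year), surname)

def dedupe(records):
    """Keep the first record for each key; never drop records without a key."""
    kept, seen = [], set()
    for record in records:
        key = match_key(record)
        if key is not None and key in seen:
            continue
        if key is not None:
            seen.add(key)
        kept.append(record)
    return kept

records = [
    {"title": "As we may think", "year": 1945, "author": "Bush, Vannevar"},
    {"title": "As We May Think.", "year": "1945", "author": "Bush V"},
    {"title": "As we may think"},  # e.g., a screen-scraped hit with no year or author
]
unique = dedupe(records)
```

Here the second record is recognized as a duplicate of the first, but the third survives even though a person would spot it as the same article: with only a title there is not enough data to merge it confidently.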

Database Provider Issues

Access control (robust authentication and authorization)

Load

Inappropriate searches (searching databases that don’t apply)

Branding and “unfair” deduping

Library Issues

Selecting the right system

Cost (both upfront and ongoing)

System design and implementation

System maintenance
– Ability to add new resources/targets
– Ease of interface changes
– Ease of upgrades

User Issues

What must I go through before hitting the search button?

How difficult is it to review results?

Are results ranked by relevance? (that will be my assumption)

Will I get buried? (too many sources, too many results?)

Do I have methods to easily focus in on what I want?

Once I find what I want, can I get to the full-text with a click?

Can I copy a citation and put it in my paper?

Present Challenges & Possible Futures

Software still needs improvement (duh)

Some databases are still not searchable

If you create a “family” of portals, how does one find the right portal to search? A meta-metasearch?

We can learn from other systems (e.g., redlightgreen)

Standards are on the way (e.g., NISO)
