Pete Boguszewski Stephen Meyer


R & D for Libraries

Pete Boguszewski

Stephen Meyer

Library Technology Group

UW-Madison Libraries

What is this about?

• Who? libraries

• What? delivering first rate services

• When? the sooner the better

• Where? at home (I mean, work)

• Why? we can always do better

• How? w/ agility and creativity

Who?

Yes, you. You control the Information Age.

Welcome to your world.

Who?

Yes, you. You control your library’s data.

Welcome to your world.

Librarian

What?

Researching and developing better systems and services in libraries.

When?

Now

Where?

Right where you are.

Why?

(Google is just not afraid to admit it.)

Because no one quite has it figured out yet.

What? (cont’d)

A paradigm change:

Embrace the beta!

Main Entry: 1be·ta

...

4 : a nearly complete prototype of a product (as software) <released in beta> <the beta version>

(source: http://www.m-w.com/dictionary/beta)

Why release an unfinished product?

We don’t know yet what we don’t know.

(But neither do our users.)

How?

Research & Development

A Web 2.0 darling: Netflix

• Site update schedule: 2 weeks

• They know the benefits of failing fast

Failing Fast

Ironically, teams that fail fast improve as fast, if not faster, than those who try to get it right the first time. The reason is simple: Teams trying to get it right the first time fail as often as everyone else does.

source: http://www.uie.com/articles/fast_iterations/

Agility

Warning

• R&D can be a dangerous enterprise

• an organization must have clear goals

• venturing into development requires focus

– solve only known problems

– solve problems that are important

• What is the problem I am trying to solve?

• Do the tools exist to take on this project?

• Do the staff exist to take on this project?

What if I fail?

• What can I learn from the experience upon failure?

• What can I learn from the experience if the product or service does not materialize?

Library Tech Group

Overview

*Infrastructure*

The Library Tech Group’s Infrastructure

Virtualization

Security

= Ability to move fast

Virtualization on VMware

It is truly magic

Server Setup is Time Consuming

Virtual servers can be cloned quickly

We set up servers with specific software sets, patch them, and test the results for consistency

Why do we care at all?

It is really cool

Allows us to look at multiple products and/or applications at once

– We can easily create servers to host products that have different needs simultaneously

– Easily compare functionality, look and feel

Cloning (of servers) is good

Last bit on virtualization

Virtualized servers allow us to take a snapshot of the environment before doing development

Can quickly revert to a moment in time if development goes bad

Virtual Environment

Now we have our server environment

Security

Integral part of development

Never replaces good programming practices or proper development techniques

Web Environment

Security helps development

Blocks out malicious users

This locked down environment allows us to put applications up quickly

Library Tech Group Helpdesk

Project Background

Ticket System Research

Web Search

Ask other institutions

Read current user opinions

Ticket System Preparation

Commercial Products

• Read documentation

Open Source

• Read documentation

• Research which platform is best suited for application

• Research back-end requirements

Pick your favorite flavor

Pick your database

Ticket System Setup

Open Source

• Install according to documentation

– Modify based on your specific environment

• Clearly document all changes, snags, surprises

Commercial

• Install according to documentation

Ticket System

Compare all products

Compare Products

Commercial Products

Advantages

• Easy to install

• Tech support

Disadvantages

• Less flexible because we do not have the source code

• Cost $$$

• Less flexible by design

Open Source

Advantages

• Constantly changing, fixing bugs

• Ability to modify the source code

• Community enhancements and plug-ins

• Simple, easily changeable interface

Disadvantages

• Constantly changing, fixing bugs

• No direct customer support

• Development is not free…

What I learned about Open Source

“Free software” is a matter of liberty, not price. To understand the concept, you should think of

“free” as in “free speech”, not as in “free beer”. - Richard Stallman

It can be great in the right situations

Open source is the big winner

Ticket System

• Reinstall to get a clean, unmodified starting point

• Implemented but in perpetual beta

– Only used inside our office

Open Source Benefits

– Constantly adding features

• Email generated tickets

• Web forms

• Inventory information

– Home-grown scripts
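To illustrate the kind of home-grown script behind a feature like email generated tickets, here is a minimal sketch. It does not reproduce any particular ticket system: the function name `ticket_from_email`, the ticket fields, and the sample message are all hypothetical, and only Python's standard `email` module is used.

```python
from email import message_from_string
from email.utils import parseaddr


def ticket_from_email(raw_message, next_id):
    """Turn a raw RFC 822 style message into a ticket dict (hypothetical schema)."""
    msg = message_from_string(raw_message)
    # parseaddr splits "Pat Patron <pat@example.edu>" into (name, address).
    _name, address = parseaddr(msg.get("From", ""))
    # Only plain, single-part messages are handled in this sketch.
    body = msg.get_payload() if not msg.is_multipart() else ""
    return {
        "id": next_id,
        "requester": address,
        "subject": msg.get("Subject", "(no subject)"),
        "description": body.strip(),
        "status": "open",
    }


# Hypothetical sample message.
RAW = (
    "From: Pat Patron <pat@example.edu>\n"
    "Subject: Printer jam\n"
    "\n"
    "The second floor printer is jammed.\n"
)

ticket = ticket_from_email(RAW, next_id=1)
```

A real deployment would also need to handle multipart mail, attachments, and replies to existing tickets, which this sketch leaves out.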

A learning experience

• Time is money - open source is not free but can still be well worth the effort

• Economies of scale

– Later projects on Linux benefit from this experience

– Now have expertise in-house

Make the catalog data work harder

Inspiration

• The OPAC Sucks

• Libraries don't just collect things, we build collections

– the value of a library lies in its bibliographers, not just its bibliographic info

• A faculty member claimed there is no stack browse

Why does the OPAC suck?

(OPAC)

Who's ever written a great work about the immense effort required in order not to create?

Dostoyevsky Wannabe from the movie Slacker

(OPAC)

Leveraging our greatest strengths

• Patrons come to the library because we have the goods

• Without an infinite budget, we collect smartly, rather than indiscriminately

• Bibliographers and collection managers build collections

There is no online equivalent to browsing the stacks.

source: paraphrase of a faculty comment during question and answer session of a library lecture series

Actually...there is.

Gawd, like even my llama knows that.

Real point of need

vs.

Awkward access to our data

SaneCat (a mini R&D project)

Is it possible to build an OPAC-like toy that addresses these issues over the winter intersession?

Primary challenge

How do I realize my goals within the construct of a web database application?

What are the problems I am trying to solve?

• To create an OPAC-like prototype that doesn’t suck

• To showcase library collections, not just provide the call number for an individual title

• To approximate the experience of browsing the stacks in 2-D
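The third goal can be sketched roughly as follows: keep records in shelf order and, for any call number, show the items on either side of it. This is only an illustration, not SaneCat's code (which was built in Java). The sample titles, the `browse_shelf` function, and the pre-normalized sort keys are hypothetical, and real call number normalization is considerably more involved than zero-padding.

```python
from bisect import bisect_left

# Hypothetical sample shelf: (sort_key, title) pairs. The sort keys are
# assumed to be call numbers already normalized (here, zero-padded) so that
# plain string comparison matches shelf order.
SHELF = sorted([
    ("Z 0678.9 A1", "Library Automation Basics"),
    ("Z 0678.9 B5", "Integrated Library Systems"),
    ("Z 0699 C3", "Online Catalogs"),
    ("Z 0699.35 M28", "MARC for Beginners"),
    ("Z 0711 R4", "Reference Services"),
])


def browse_shelf(shelf, sort_key, width=2):
    """Return the slice of the shelf around sort_key.

    Includes up to `width` items before the position where sort_key sits
    (or would sit) and up to `width` items after it.
    """
    keys = [key for key, _title in shelf]
    pos = bisect_left(keys, sort_key)
    start = max(0, pos - width)
    end = min(len(shelf), pos + width + 1)
    return shelf[start:end]


neighbors = [title for _key, title in browse_shelf(SHELF, "Z 0699 C3", width=1)]
```

Because the lookup uses the insertion point rather than an exact match, a patron can also "walk up to" a call number the library does not hold and still see what is shelved nearby.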

Focusing the task at hand

not sucking = vague, fuzzy, dangerous

Focusing the task at hand (cont’d)

showcasing library collections

how does one bibliographic record stand in relation to others in the collection?

Focusing the task at hand (cont’d)

browsing the stacks online

When does a patron browse the stacks?

Which problems are important?

More importantly, which problems are not important?

How was it built?

A random selection of 72,000 catalog records

• almost 1% of our catalog

• 59,686 after dups and errors were thrown out

• 87,761 unique subjects

• 213,719 subfields within subjects

How was it built? (cont’d)

With a whole lot of help and guidance.

Geeky Details (prototyping tools)

• MySQL database

• marc4j libraries (for parsing raw MARC data)

• Java/Tomcat webapp

• Spring application framework

• Hibernate Object Relational Mapping

• jsp with jstl tag libraries

techie design goals

• model relationships among bib records

(bibliographers build collections)

• provide access to data at point of need

(faculty member did not find the access that exists when he needed it)
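One way to picture the first design goal, modeling relationships among bib records, is to link records that share subject headings and rank them by overlap. The sketch below is an assumption about how such a model could look, not a description of the SaneCat schema. The record ids, subject strings, and function names are hypothetical, and Python is used for brevity even though the prototype used Java.

```python
from collections import Counter, defaultdict

# Hypothetical sample data: record id -> set of subject headings.
RECORDS = {
    "b1": {"Libraries -- Automation", "Online catalogs"},
    "b2": {"Online catalogs", "Information retrieval"},
    "b3": {"Libraries -- Automation", "Online catalogs", "MARC formats"},
    "b4": {"Llamas"},
}


def build_subject_index(records):
    """Invert the data: subject heading -> set of record ids that carry it."""
    index = defaultdict(set)
    for rec_id, subjects in records.items():
        for subject in subjects:
            index[subject].add(rec_id)
    return index


def more_like_this(rec_id, records, index, limit=5):
    """Rank other records by the number of subject headings they share."""
    shared = Counter()
    for subject in records[rec_id]:
        for other in index[subject]:
            if other != rec_id:
                shared[other] += 1
    # Most shared subjects first; ties broken by record id for stable output.
    ranked = sorted(shared.items(), key=lambda pair: (-pair[1], pair[0]))
    return ranked[:limit]


INDEX = build_subject_index(RECORDS)
related = more_like_this("b1", RECORDS, INDEX)
```

Counting shared headings is the simplest possible measure. Weighting rare subjects more heavily than common ones would be a natural refinement.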

What did I learn?

• there are doors to be opened

• there are performance issues to be resolved

• there are data hooks that would need to be addressed

– there is no reason to write acq, cat, circ modules

– we need live circ data

What if ... we never create a SaneCat?

• we have a mockup that can stand as leverage with vendors

• we have proof that our data can do what we want

• we know that Amazon does not have a monopoly on 'more like this'

• we can lend our tech to vendors so our systems are better

Where could we take this?

• work out the performance problems

• graph theory and a research map

• begin collecting intentional data

• develop the next gen MARC records: an object-oriented bibl. record

Why should we do this?

This is a fantastic tool for simulating something like browsing through the stacks. I have enjoyed playing with it for a few minutes. ...Again, this is a great tool. I look forward to using it extensively in the future. Please let me know if I can help in any other way.

source: faculty member who would like to browse stacks online