R & D for Libraries Pete Boguszewski Stephen Meyer Library Technology Group UW-Madison Libraries


Page 1: Pete Boguszewski Stephen Meyer

R & D for Libraries

Pete Boguszewski

Stephen Meyer

Library Technology Group

UW-Madison Libraries

Page 2: Pete Boguszewski Stephen Meyer

What is this about?

• Who? libraries

• What? delivering first rate services

• When? the sooner the better

• Where? at home (I mean, work)

• Why? we can always do better

• How? w/ agility and creativity

Page 3: Pete Boguszewski Stephen Meyer

Who?

Yes, you. You control the Information Age.

Welcome to your world.

Page 4: Pete Boguszewski Stephen Meyer

Who?

Yes, you. You control your library’s data.

Welcome to your world.

Librarian

Page 5: Pete Boguszewski Stephen Meyer

What?

Researching and developing better systems and services in libraries.

Page 6: Pete Boguszewski Stephen Meyer

When?

Now

Page 7: Pete Boguszewski Stephen Meyer

Where?

Right where you are.

Page 8: Pete Boguszewski Stephen Meyer

Why?

(Google is just not afraid to admit it.)

Because no one quite has it figured out yet.

Page 9: Pete Boguszewski Stephen Meyer

What? (cont’d)

A paradigm change:

Embrace the beta!

Main Entry: 1be·ta

...

4 : a nearly complete prototype of a product (as software) <released in beta> <the beta version>

(source: http://www.m-w.com/dictionary/beta)

Page 10: Pete Boguszewski Stephen Meyer

Why release an unfinished product?

We don’t know yet what we don’t know.

(But neither do our users.)

Page 11: Pete Boguszewski Stephen Meyer

How?

Research & Development

Page 12: Pete Boguszewski Stephen Meyer

A Web 2.0 darling: Netflix

• Site update schedule: 2 weeks

• They know the benefits of failing fast

Page 13: Pete Boguszewski Stephen Meyer

Failing Fast

Ironically, teams that fail fast improve as fast, if not faster, than those who try to get it right the first time. The reason is simple: Teams trying to get it right the first time fail as often as everyone else does.

source: http://www.uie.com/articles/fast_iterations/

Page 14: Pete Boguszewski Stephen Meyer

Agility

Page 15: Pete Boguszewski Stephen Meyer

Warning

• R&D can be a dangerous enterprise

• an organization must have clear goals

• venturing into development requires focus

– solve only known problems

– solve problems that are important

Page 16: Pete Boguszewski Stephen Meyer

• What is the problem I am trying to solve?

• Do the tools exist to take on this project?

• Do the staff exist to take on this project?

Page 17: Pete Boguszewski Stephen Meyer

What if I fail?

• What can I learn from the experience upon failure?

• What can I learn from the experience if the product or service does not materialize?

Page 18: Pete Boguszewski Stephen Meyer

Library Tech Group

Overview

*Infrastructure*

Page 19: Pete Boguszewski Stephen Meyer

The Library Tech Group’s Infrastructure

Virtualization

Security

= Ability to move fast

Page 20: Pete Boguszewski Stephen Meyer

Virtualization on VMware

It is truly magic

Page 21: Pete Boguszewski Stephen Meyer

Server Setup is Time Consuming

Virtual servers can be cloned quickly

We set up servers with specific software sets, patch them, and test the results for consistency

Page 22: Pete Boguszewski Stephen Meyer

Why do we care at all?

It is really cool

Allows us to look at multiple products and/or applications at once

– We can easily create servers to host products that have different needs simultaneously

– Easily compare functionality, look and feel

Page 23: Pete Boguszewski Stephen Meyer

Cloning (of servers) is good

Page 24: Pete Boguszewski Stephen Meyer

Last bit on virtualization

Virtualized servers allow us to take a snapshot of the environment before doing development

Can quickly revert to a moment in time if development goes bad

Page 25: Pete Boguszewski Stephen Meyer
Page 26: Pete Boguszewski Stephen Meyer

Virtual Environment

Now we have our server environment

Page 27: Pete Boguszewski Stephen Meyer

Security

Integral part of development

Never replaces good programming practices or proper development techniques

Page 28: Pete Boguszewski Stephen Meyer

Web Environment

Page 29: Pete Boguszewski Stephen Meyer

Security helps development

Blocks out malicious users

This locked down environment allows us to put applications up quickly

Page 30: Pete Boguszewski Stephen Meyer

Library Tech Group Helpdesk

Page 31: Pete Boguszewski Stephen Meyer

Project Background

Page 32: Pete Boguszewski Stephen Meyer

Ticket System Research

Web Search

Ask other institutions

Read current user opinions

Page 33: Pete Boguszewski Stephen Meyer

Ticket System Preparation

Commercial Products

• Read documentation

Open Source

• Read documentation

• Research which platform is best suited for application

• Research back-end requirements

Page 34: Pete Boguszewski Stephen Meyer

Pick your favorite flavor

Page 35: Pete Boguszewski Stephen Meyer

Pick your database

Page 36: Pete Boguszewski Stephen Meyer

Ticket System Setup

Open Source

• Install according to documentation

– Modify based on your specific environment

• Clearly document all changes, snags, surprises

Commercial

• Install according to documentation

Page 37: Pete Boguszewski Stephen Meyer

Ticket System

Compare all products

Compare Products

Page 38: Pete Boguszewski Stephen Meyer

Commercial Products

Advantages

• Easy to install

• Tech support

Disadvantages

• Less flexible because we do not have the source code

• Cost $$$

• Less flexible by design

Page 39: Pete Boguszewski Stephen Meyer

Open Source

Advantages

• Constantly changing, fixing bugs

• Ability to modify the source code

• Community enhancements and plug-ins

• Simple, easily changeable interface

Disadvantages

• Constantly changing, fixing bugs

• No direct customer support

• Development is not free…

Page 40: Pete Boguszewski Stephen Meyer

What I learned about Open Source

“Free software” is a matter of liberty, not price. To understand the concept, you should think of “free” as in “free speech”, not as in “free beer”. - Richard Stallman

It can be great in the right situations

Page 41: Pete Boguszewski Stephen Meyer

Open source is the big winner

Page 42: Pete Boguszewski Stephen Meyer

Ticket System

• Reinstall to get a clean, unmodified starting point

• Implemented but in perpetual beta

– Only used inside our office

Page 43: Pete Boguszewski Stephen Meyer

Open Source Benefits

– Constantly adding features

• Email generated tickets

• Web forms

• Inventory information

- Home-grown scripts

Page 44: Pete Boguszewski Stephen Meyer

A learning experience

• Time is money - open source is not free but can still be well worth the effort

• Economies of scale

– Later projects on Linux benefit from this experience

– Now have expertise in-house

Page 45: Pete Boguszewski Stephen Meyer

Make the catalog data work harder

Page 46: Pete Boguszewski Stephen Meyer

Inspiration

• The OPAC Sucks

• Libraries don't just collect things, we build collections

– the value of a library lies in its bibliographers, not just its bibliographic info

• A faculty member claimed there is no stack browse

Page 47: Pete Boguszewski Stephen Meyer

Why does the OPAC suck?

Page 48: Pete Boguszewski Stephen Meyer
Page 49: Pete Boguszewski Stephen Meyer
Page 50: Pete Boguszewski Stephen Meyer
Page 51: Pete Boguszewski Stephen Meyer

(OPAC)

Page 52: Pete Boguszewski Stephen Meyer

Who's ever written a great work about the immense effort required in order not to create?

Dostoyevsky Wannabe from the movie Slacker

Page 53: Pete Boguszewski Stephen Meyer

(OPAC)

Page 54: Pete Boguszewski Stephen Meyer

Leveraging our greatest strengths

• Patrons come to the library because we have the goods

• Without an infinite budget, we collect smartly, rather than indiscriminately

• Bibliographers and collection managers build collections

Page 55: Pete Boguszewski Stephen Meyer

There is no online equivalent to browsing the stacks.

source: paraphrase of a faculty comment during question and answer session of a library lecture series

Page 56: Pete Boguszewski Stephen Meyer

Actually...there is.

Page 57: Pete Boguszewski Stephen Meyer
Page 58: Pete Boguszewski Stephen Meyer

Gawd, like even my llama knows that.

Page 59: Pete Boguszewski Stephen Meyer

Real point of need

vs.

Awkward access to our data

Page 60: Pete Boguszewski Stephen Meyer

SaneCat (a mini R&D project)

Page 61: Pete Boguszewski Stephen Meyer

Is it possible to build an OPAC-like toy that addresses these issues over the winter intersession?

Page 62: Pete Boguszewski Stephen Meyer

Primary challenge

How do I realize my goals within the construct of a web database application?

Page 63: Pete Boguszewski Stephen Meyer

What are the problems I am trying to solve?

• To create an OPAC-like prototype that doesn’t suck

• To showcase library collections, not just provide the call number for an individual title

• To approximate the experience of browsing the stacks in 2-D
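One way to picture the stack-browse goal: keep records in shelf order and return the neighbors on either side of a given item. The sketch below is illustrative only and is not SaneCat's actual code. The class name `ShelfBrowse`, the sample titles, and the sort keys are all hypothetical, and it assumes call numbers have already been normalized into strings that sort in shelf order (real call-number normalization is a separate problem not handled here).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.TreeMap;

class ShelfBrowse {
    // Assumption: key is a pre-normalized call-number sort key; value is a title.
    private final TreeMap<String, String> shelf = new TreeMap<>();

    void add(String sortKey, String title) {
        shelf.put(sortKey, title);
    }

    /** Up to span titles on each side of sortKey, in shelf order, plus the item itself if present. */
    List<String> window(String sortKey, int span) {
        // Walk backwards from the item to collect its left-hand neighbors.
        List<String> left = new ArrayList<>();
        for (String title : shelf.headMap(sortKey, false).descendingMap().values()) {
            if (left.size() == span) break;
            left.add(title);
        }
        Collections.reverse(left); // restore shelf order

        List<String> out = new ArrayList<>(left);
        String self = shelf.get(sortKey);
        if (self != null) out.add(self);

        // Walk forwards to collect the right-hand neighbors.
        int taken = 0;
        for (String title : shelf.tailMap(sortKey, false).values()) {
            if (taken == span) break;
            out.add(title);
            taken++;
        }
        return out;
    }

    public static void main(String[] args) {
        ShelfBrowse browse = new ShelfBrowse();
        browse.add("QL0737.A1", "Alpacas");
        browse.add("QL0737.B2", "Bactrian Camels");
        browse.add("QL0737.C3", "Camelids of the Andes");
        browse.add("QL0737.D4", "Domesticated Llamas");
        browse.add("QL0737.E5", "Ecology of the Vicuna");
        System.out.println(browse.window("QL0737.C3", 1));
    }
}
```

A sorted map keeps the lookup cheap, and the same window can be requested for a call number that is not in the collection, which approximates walking to where a book would be shelved.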

Page 64: Pete Boguszewski Stephen Meyer

Focusing the task at hand

not sucking = vague, fuzzy, dangerous

Page 65: Pete Boguszewski Stephen Meyer

Focusing the task at hand (cont’d)

showcasing library collections

how does one bibliographic record stand in relation to others in the collection?

Page 66: Pete Boguszewski Stephen Meyer

Focusing the task at hand (cont’d)

browsing the stacks online

When does a patron browse the stacks?

Page 67: Pete Boguszewski Stephen Meyer

Which problems are important?

More importantly, which problems are not important?

Page 68: Pete Boguszewski Stephen Meyer

How was it built?

A random selection of 72,000 catalog records

• almost 1% of our catalog

• 59,686 after dups and errors were thrown out

• 87,761 unique subjects

• 213,719 subfields within subjects
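The clean-up and counting behind numbers like these can be sketched roughly as follows. This is a hypothetical illustration, not the code used for the prototype: the `Bib` class, the sample records, the rule that a record with no title counts as an error, and the use of " -- " as the separator between subject subfields are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class CatalogSample {
    /** Hypothetical, simplified bib record: an id, a title, and subject heading strings. */
    static final class Bib {
        final String id;
        final String title;
        final List<String> subjects;

        Bib(String id, String title, String... subjects) {
            this.id = id;
            this.title = title;
            this.subjects = Arrays.asList(subjects);
        }
    }

    /** Keeps the first record seen for each id; drops records with no id or no title. */
    static List<Bib> clean(List<Bib> raw) {
        Map<String, Bib> byId = new LinkedHashMap<>();
        for (Bib b : raw) {
            boolean broken = b.id == null || b.title == null || b.title.trim().isEmpty();
            if (!broken) {
                byId.putIfAbsent(b.id, b);
            }
        }
        return new ArrayList<>(byId.values());
    }

    /** Distinct subject heading strings across all records. */
    static Set<String> uniqueSubjects(List<Bib> bibs) {
        Set<String> out = new LinkedHashSet<>();
        for (Bib b : bibs) {
            out.addAll(b.subjects);
        }
        return out;
    }

    /** Total subfields across the given headings, assuming " -- " separates subfields. */
    static int subfieldCount(Set<String> subjects) {
        int n = 0;
        for (String s : subjects) {
            n += s.split(" -- ").length;
        }
        return n;
    }

    public static void main(String[] args) {
        List<Bib> raw = Arrays.asList(
            new Bib("1", "Llamas of Peru", "Llamas -- Peru", "Andes -- Description and travel"),
            new Bib("1", "Llamas of Peru", "Llamas -- Peru"),   // duplicate id
            new Bib("2", "", "Llamas"),                         // error: no title
            new Bib("3", "Andean Herding", "Llamas -- Peru", "Herders -- Andes -- History"));
        List<Bib> kept = clean(raw);
        Set<String> subjects = uniqueSubjects(kept);
        System.out.println(kept.size() + " records, " + subjects.size()
            + " unique subjects, " + subfieldCount(subjects) + " subfields");
    }
}
```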

Page 69: Pete Boguszewski Stephen Meyer

How was it built? (cont’d)

With a whole lot of help and guidance.

Page 70: Pete Boguszewski Stephen Meyer
Page 71: Pete Boguszewski Stephen Meyer
Page 72: Pete Boguszewski Stephen Meyer
Page 73: Pete Boguszewski Stephen Meyer
Page 74: Pete Boguszewski Stephen Meyer

Geeky Details (prototyping tools)

• MySQL database

• marc4j libraries (for parsing raw MARC data)

• Java/Tomcat webapp

• Spring application framework

• Hibernate Object Relational Mapping

• JSP with JSTL tag libraries

Page 75: Pete Boguszewski Stephen Meyer

techie design goals

• model relationships among bib records

(bibliographers build collections)

• provide access to data at point of need

(faculty member did not find the access that exists when he needed it)
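A minimal sketch of the first design goal, modeling relationships among bib records, is to index records by subject heading and rank other records by how many headings they share. This is an assumption about one possible approach, not a description of how SaneCat was written; the class name `MoreLikeThis`, the record ids, and the sample headings are hypothetical, and a real version would sit on top of the database rather than in-memory maps.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class MoreLikeThis {
    // subject heading -> ids of records carrying that heading
    private final Map<String, Set<String>> recordsBySubject = new HashMap<>();
    // record id -> its subject headings
    private final Map<String, Set<String>> subjectsByRecord = new HashMap<>();

    void add(String recordId, String... subjects) {
        for (String s : subjects) {
            recordsBySubject.computeIfAbsent(s, k -> new TreeSet<>()).add(recordId);
            subjectsByRecord.computeIfAbsent(recordId, k -> new TreeSet<>()).add(s);
        }
    }

    /** Other record ids ordered by shared-subject count (descending), ties broken by id. */
    List<String> related(String recordId) {
        Map<String, Integer> shared = new TreeMap<>();
        Set<String> mine = subjectsByRecord.getOrDefault(recordId, Collections.<String>emptySet());
        for (String s : mine) {
            for (String other : recordsBySubject.get(s)) {
                if (!other.equals(recordId)) {
                    shared.merge(other, 1, Integer::sum);
                }
            }
        }
        // TreeMap keys are already in id order; the stable sort preserves that order on ties.
        List<String> ids = new ArrayList<>(shared.keySet());
        ids.sort((a, b) -> shared.get(b) - shared.get(a));
        return ids;
    }

    public static void main(String[] args) {
        MoreLikeThis index = new MoreLikeThis();
        index.add("a", "Llamas", "Peru");
        index.add("b", "Llamas", "Peru", "Wool");
        index.add("c", "Llamas");
        index.add("d", "Wool");
        System.out.println(index.related("a"));
    }
}
```

Ranking by shared headings leans on the work bibliographers and catalogers have already done, which is the point of the design goal: the relationships are in the data and only need to be surfaced.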

Page 77: Pete Boguszewski Stephen Meyer

What did I learn?

• there are doors to be opened

• there are performance issues to be resolved

• there are data hooks that would need to be addressed

– there is no reason to write acq, cat, circ modules

– we need live circ data

Page 78: Pete Boguszewski Stephen Meyer

What if ... we never create a SaneCat?

Page 79: Pete Boguszewski Stephen Meyer

• we have a mockup that can stand as leverage with vendors

• we have proof that our data can do what we want

• we know that Amazon does not have a monopoly on 'more like this'

• we can lend our tech to vendors so our systems are better

Page 80: Pete Boguszewski Stephen Meyer

Where could we take this?

• work out the performance problems

• graph theory and a research map

• begin collecting intentional data

• develop the next gen MARC records: an object-oriented bibl. record

Page 81: Pete Boguszewski Stephen Meyer

Why should we do this?

Page 82: Pete Boguszewski Stephen Meyer

This is a fantastic tool for simulating something like browsing through the stacks. I have enjoyed playing with it for a few minutes. ...Again, this is a great tool. I look forward to using it extensively in the future. Please let me know if I can help in any other way.

source: faculty member who would like to browse stacks online