Beyond the Page - users.ox.ac.uk/~lou/Talks/beyondthepage.pdf

Beyond the Page

Lou Burnard

Oxford University Computing Services

The message

● Today's digital library applications still focus on serving up virtual pages for the reader: the metaphor of the book is so pervasive that we can barely see it.

● But going digital is not only about producing cheaper and more accessible simulations of printed or painted pages.

● Digital applications should enable us to do more with a text than simply read it from beginning to end, or attach annotations to it for others to read.

The Knowledge Economy and the Information Society

"If infrastructure is required for an industrial economy, then we could say that cyberinfrastructure is required for a knowledge economy." (Report of the National Science Foundation Blue Ribbon Panel on Cyberinfrastructure)

What is the digital content chain?

For publishers, the most dangerous aspect of digital content distribution is not piracy, but rather the lack of viable alternatives to piracy

By 2005, mankind will play with more than 100 billion gigabytes of content in the form of images, text, audio, video, graphics or a combination of all these. The task of managing this content and making sure it reaches the right customers at the right time will be a great challenge... If content assets are treated as SKUs (Stock Keeping Units) in supply chain parlance, and technology applications like Digital Assets Management are applied from an Inventory Optimization perspective, Media content managers will be able to handle the digital onslaught better.

Unifying the digital content commerce value chain: The market for digital content services is determined by a relatively simple value chain. It shares the same simple characteristics as any physical product value chain in that product or service flows one way - adding value as it goes - and revenue flows the other.

Three simple truths

1. There is no going back: the knowledge infrastructure is now irrevocably digital

2. The business models of the knowledge infrastructure have changed irrevocably

3. The quantitative changes facilitated by digital technologies approximate qualitative change

1. Irrevocable digitality

● The sciences don't see this as an issue...
● The humanities claim to be different
  – hence the concept of “Humanities Computing”
● But the objects of Humanities scholarship are now digital, even if its methods are not

2. Business models for the knowledge infrastructure

● The war of the journal article
● The book and the ebook
● Other forms of communication

As a key resource of the 21st century, information goods might displace industrial goods as key drivers of markets. The foundation of the economic prosperity of developed countries is not only based on the efficient conversion of information to knowledge, but also in imparting this knowledge in the educational system. In this context, scientific libraries play a decisive role as a provider of scientific and technical information (STI). After introducing the 2-3-6-concept, an analysis concept based on a special value chain, the paper examines the roles of the different players  - author, scientific library, publisher, bookstore and scientific association - involved in the production of STI. A structural model for the value chain of the STI market is developed to analyse in detail the opportunities for scientific libraries offered by technological progress within the current economic, legal and regulatory framework. The analysis reveals that none of the players can be expected to stay within their historical core competencies. Due to technical developments and associated changes in the structure of transaction costs, each player can cover more fields of value-adding activities. The roles of the different players are merging more and more. Further, analysis of current direct and indirect monetary flows reveals considerable potential for conflict.

The academic article

● a very special kind of document
● a long history of exploitation, now ending
● e.g. HC STC Report
  – institutional repositories vs self-publishing
● From Sept 05, all RCUK-funded output should be offered to an Institutional Repository

Books and ebooks

● Goodbye to the monograph
● Hello to the best-seller
● The market for e-books remains unclear and its potential remains untapped

What's new about the “e-book”?

● Continuing trends
  – more, and more varied, readers
  – more, and more varied, resources
  – broader cultural sensitivities
  – decanonization

● Convergence of media

Other forms of communication

● In private life
  – chatrooms and SMS
  – the blog
  – the ipod

● In public life
  – digital cultural initiatives: public good
  – influence of the web on other media e.g. radio

Changing roles for publishers

● From producer to aggregator
● The RSS concept

3. Qualitative change: the next challenge

● How do we identify and enrich the content of our resources?

● How do we communicate the results?

What do these documents have in common?

Content enrichment: some ways

● Top down
  – semantic web, topic maps...
    ● add keywords and relationships derived from pre-existing ontologies (aka conceptual reference models)
  – the basic business of humanities scholarship
● Bottom up
  – automatic keyphrase identification on linguistic evidence
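
As a rough illustration of the bottom-up route, the sketch below ranks words that are relatively more frequent in one document than in a reference text. This is only one simple way to use "linguistic evidence", and not necessarily the method the slide has in mind. The function names, the crude tokenizer, and the add-one smoothing are all assumptions made for the example.

```python
import re
from collections import Counter


def tokens(text):
    """Lower-case word tokens; a deliberately crude tokenizer."""
    return re.findall(r"[a-z]+", text.lower())


def key_terms(document, reference, top=3):
    """Rank words that are relatively more frequent in `document`
    than in `reference` (add-one smoothing on the reference side)."""
    doc_counts = Counter(tokens(document))
    ref_counts = Counter(tokens(reference))
    doc_total = sum(doc_counts.values())
    ref_total = sum(ref_counts.values())
    scores = {}
    for word, count in doc_counts.items():
        doc_rate = count / doc_total
        # Smoothing keeps unseen reference words from dividing by zero.
        ref_rate = (ref_counts[word] + 1) / (ref_total + 1)
        scores[word] = doc_rate / ref_rate
    # Highest ratio first; ties are broken alphabetically.
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return ranked[:top]
```

A real system would work on phrases rather than single words and would need a much larger reference corpus; the point here is only that candidate key terms can be proposed from the text itself, without a pre-existing ontology.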

The humanities tradition

● A focus on textual objects
  – how is this discourse represented?
● A focus on hermeneutics
  – what does this discourse mean?
  – what does it say aside from its denotational content?
● Uncertainty, doubt, skepticism
● How useful are those skills?

towards the uncritical edition

● The insights of critical editing/edition philology need to be re-discovered and re-applied in a new context

● A fruitful synergy
  – semiotics
  – textuality
  – hermeneutics

qualitative differences in our interactions with digital resources

● decentred, non-linear, fragmented, and associative modes of cognition are favoured
● differences of scale
● statistical apprehension of decontextualized language use
● plasticity of format and presentation

cultural objects (are those which) require an explication

● Resources are invested with meaning by our use of them

● Explication confers value

● “We need to interpret interpretations more than to interpret things” (Derrida, citing Montaigne)

[Diagram relating resources, digital resources, encoding, analysis, and an abstract model]

digitization reifies an explication

● To encode a resource, it must first be decoded
● Decoding implies selection of features
● And their re-encoding in unambiguous terms

whose explication?

● the observer effect
  – a novelty in the sciences
  – central to the humanities
● hermeneutics 'r us
  – computers are for symbolic manipulation, not just for calculation

transmitting the hermeneutic

● Scholarship depends on continuity
● It is not enough to preserve an encoding
● There must also be a continuity of comprehension

digital resources can only be preserved by migration

● This separation of medium and message implies
  – selection
  – potential information loss or transformation at each step
● Hence the need for media-independent encodings

Frequently Answered Questions

● resource description or characterization
● re-use of common text for multiple purposes
  – scholarly edition, school edition, speaking edition
● alignment of differing “versions”
  – e.g. transcription, sound, image
  – resource descriptors from different domains
● multiple annotations of a common text
  – may be additive or alternative
● authoring!

What do content providers need?

● We have
  – a good notation for textual structure and semantics (XML)
  – a pretty complete character encoding (Unicode)
  – well-defined processing systems for doing cool stuff with XML fragments
● What more do we need?

Interoperability

● We also need
  – to interchange and integrate metadata, texts, and tools
    ● between persons and machines
    ● between machines and machines
    ● across time and space
  – to express formal constraints on our markup
  – (probably) to document the semantics of our markup
● This is the domain of the schema or DTD

What did using a schema ever do for us?

<list> <label> <item>

list ((label, item)+ | item+)

if list is of type GLOSS, content must include labels

figure.attributes.url = xsd:anyURI

persons referenced by key must exist in the persons database

don't use the table element to represent glossaries!
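
Several of these constraints can be checked mechanically. The following is a minimal sketch in Python using only the standard library, not a substitute for a real schema processor. It assumes un-namespaced elements, a `type="gloss"` attribute value, a `persName` element carrying the `key` attribute, and a small in-memory set standing in for the persons database; all of those details are illustrative assumptions rather than anything stated on the slide.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical stand-in for "the persons database" mentioned above.
KNOWN_PERSONS = {"LB1", "MSM"}


def check_document(xml_text):
    """Return a list of human-readable problems found in the document."""
    root = ET.fromstring(xml_text)
    problems = []
    for list_el in root.iter("list"):
        # Serialise the child element names so the content model
        #   list ((label, item)+ | item+)
        # can be tested with a regular expression.
        tags = "".join(child.tag + "," for child in list_el)
        if not re.fullmatch(r"(?:label,item,)+|(?:item,)+", tags):
            problems.append("list content does not match the content model")
        elif list_el.get("type") == "gloss" and list_el.find("label") is None:
            # A co-occurrence constraint: gloss lists must include labels.
            problems.append("gloss list has no labels")
    for name_el in root.iter("persName"):
        # A referential constraint: every key must be a known person.
        key = name_el.get("key")
        if key is not None and key not in KNOWN_PERSONS:
            problems.append(f"unknown person key: {key}")
    return problems
```

The first check is the kind a grammar expresses directly. The second and third go beyond a plain content model, and the last rule on the slide (about not using tables for glossaries) is a matter of documented practice that no such check can fully capture.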

The scope of “intelligent” markup

– orthographic transcription
– links to digital recordings, images…
– proper nouns, dates, times, etc.
– linguistic analyses (morphological, syntactic, discoursal...)
– named entity recognition
– cross references to other material on the topic
– meta-textual status (correction etc)
– editorial commentary and annotation
– traditional bibliographic description
– etc., etc., etc.

How can all these things co-exist?

Towards a new babel?

● If we have
  ● historical records using “Historical Markup Language”
  ● linguistic data using “Linguistic Markup Language”
  ● illustrations using a “Visual Markup Language”
  ● metadata using (a) “Metadata Markup Language”

● how will we integrate resources or ask interesting questions?

One answer: the TEI

● TEI P5 takes a modular approach
● It provides an integrated XML framework for
  – definition of simple or complex text markup schemes
  – documentation of their use
  – generation of formal schemata to validate them
  – mapping of their concepts to other ontologies
● See http://www.tei-c.org and http://tei.sf.net

Semantic interoperability

● It is not hard to achieve interoperability between different markup schemes

● The real challenge is to relate their underlying semantics
  – where does “meaning” come from?
  – how does “translation” work?

● These are not new questions in the linguistic research community!

For example: terminology

● Termbanks work by defining
  – relationships between concepts
  – relationships between terms in different languages and those concepts
● Translators use termbanks to help them decide what texts should mean

Translators also use corpora

● How do you (quickly) find out about the technical language of a domain for which no termbank exists?

● Apply the “walks-like-a-duck” procedure to build your own corpus

● This is often a reliable way of identifying new terminology (as document classification research shows)

Mapping of markup languages

● We can map mark-up semantics using standard conceptual reference models (aka ontologies)
  – ISO DIS 12620: Data Category Registry for linguistic resources
  – CIDOC CRM (now also in ISO)
● But in markup, as elsewhere, praxis means more than syntax
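
The mapping idea can be sketched in a few lines of Python. Two invented markup vocabularies each point their element names at identifiers in a shared concept list, and translation goes through the shared concept. Every element name and concept identifier below is hypothetical; none is taken from ISO 12620, CIDOC CRM, or any real scheme.

```python
# Hypothetical mappings from element names in two markup schemes to
# identifiers in a shared conceptual reference model.
HISTORICAL_ML = {"personage": "concept:person", "locality": "concept:place"}
LINGUISTIC_ML = {"speaker": "concept:person", "pos": "concept:part-of-speech"}


def translate(element, source, target):
    """Map an element name from one scheme to another via shared concepts.

    Returns None when the source element has no concept, or when no
    element in the target scheme points at the same concept.
    """
    concept = source.get(element)
    if concept is None:
        return None
    for name, target_concept in target.items():
        if target_concept == concept:
            return name
    return None
```

With these tables, `personage` translates to `speaker` because both point at the same concept, yet a historical personage and a speaker in a transcript are used quite differently in practice. That gap between a formal mapping and actual usage is the point of the final bullet above.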

What web semantics?

● denotation: what (we think) it says
● connotation: what (we think) it appears to suggest
● annotation: what we want to say about it

The next challenge

Digital applications can restore the fugitive multi-layered complexity of a textual tradition otherwise instantiated in a fragmented way by the individual physical copies of the traditional library

They can reconstruct the witnesses as evidence in an analysis of the changing semiotic systems underlying that complexity, for example in linguistic or stylistic terms

They can deliver components of the tradition for re-integration and synthesis into new forms

There is a world of difference between an "electronic library" and a "digital repository"