
[IEEE 2006 Second IEEE International Conference on e-Science and Grid Computing (e-Science'06) - Amsterdam, The Netherlands (2006.12.4-2006.12.4)] 2006 Second IEEE International Conference


The arts and humanities e-Science initiative in the UK

Tobias Blanke
Arts and Humanities e-Science Support Centre
King’s College London, Drury Lane

[email protected]

Stuart Dunn
Arts and Humanities e-Science Support Centre
King’s College London, Drury Lane

[email protected]

Abstract

This paper presents the approaches within the Arts and Humanities e-Science Initiative in the UK. It describes some of its early activities, and sketches out how virtual organizations can transform the way in which researchers in these disciplines can collaborate in the use of digital material. The paper is an attempt to position the specific research needs of arts and humanities within the e-Science framework and to show how the early arts and humanities e-Science programme is approaching a mapping of e-Science methods and tools on to arts and humanities.

1. Introduction

E-Science has been serving the physical and biological science communities for over five years now. Its agenda has been driven by scientists who needed new technologies and concepts to cope with the ever-increasing amount of data, both from experiments and simulations. Faced with this ‘data deluge’ [5], a new data- and service-driven science was conceptualized, with the scientist and research methods at the center of new data technologies. The idea of e-Science and the e-Scientist was accompanied by the development of new high-speed computing networks that promised solutions to a variety of problems in coping with the vast amount of information. Grid technologies were the result of a global effort by computer scientists working together with practitioners to advance existing network technologies in order to create a global space for sharing resources and services. The arts and humanities have not, up until now, been served by these developments.

Only recently has arts and humanities research in the UK brought forward its own specific e-Science agenda, which differs in significant respects from the one in other disciplines. The ‘data deluge’ mentioned above can also be found in the arts and humanities, since digitization of resources has been a major focus of research in recent years [6]. Digital resources in these disciplines have mushroomed over the past decade: the UK’s Arts and Humanities Research Council commits roughly half its annual budget to projects which produce some form of digital content, as did its predecessor, the Arts and Humanities Research Board.

The UK’s e-Science Core Programme funds and manages the development of core generic tools, and coordinates complementary programmes within each of the UK’s Research Councils, which concentrate on applied research in their own disciplinary areas. In 2005 the Arts and Humanities Research Board, the NGO which funded the majority of arts and humanities research in UK HEIs, became the Arts and Humanities Research Council (AHRC), and announced its own e-Science programme in collaboration with the JISC. The primary aim of this initiative is to coordinate and implement the transfer of e-Science to the arts and humanities from the hard and life science communities in which its theory and methods were developed.

This paper outlines the research context of the AHRC-JISC-EPSRC e-Science programme, describes some of its early activities, and sketches out how the core concept of the virtual organization can transform the way in which researchers in these disciplines can collaborate in the use of digital material. The paper is an attempt to position the specific research needs of arts and humanities within the e-Science framework and to show how the early arts and humanities e-Science programme is approaching a mapping of e-Science methods on to arts and humanities.

In the first section we will discuss the idea of an e-Science agenda for the arts and humanities, focusing on their specific research needs. The second section presents current activities in the arts and humanities e-Science programme. It starts with the Arts and Humanities e-Science Support Centre as a hub for the different activities. Though the arts and humanities include many different disciplines and subdisciplines, one can identify, beyond departmental boundaries, recurring themes and methodologies that are common to many of them and could form the basis of first steps towards e-based arts and humanities. We aim to show that such a common agenda exists by looking at the e-Science workshops and demonstrators funded in the early arts and humanities e-Science programme. This paper will therefore not only introduce the AHRC-JISC-EPSRC programme, but also claim that the term e-Humanities is not just an abstraction but describes the reality of a common research agenda.

Proceedings of the Second IEEE International Conference on e-Science and Grid Computing (e-Science'06), 0-7695-2734-5/06 $20.00 © 2006

2. E-Science in the arts and humanities

Arts and humanities e-Science shares some characteristics with its ‘parent’ in the physical sciences. Grid technologies and methodologies address how globally distributed data resources can be integrated into the research process, and how computational power can be shared. Early work has shown that such technologies and methods have much potential in the arts and humanities, where the ratio of digital to non-digital content creation is now very high (http://www.ahrcict.rdg.ac.uk/). These data are disparate, dispersed, often fuzzy, incomplete and not interoperable [9]. It is also plain that the value of such data would be greatly increased if they could be interconnected, and their different elements accessed seamlessly and simultaneously.

In the sciences, the grand challenges that advanced network technologies address were complementary to new advances in computing and measuring technologies. In the arts and humanities, digital resources mostly do not result from automated simulations on large data sets, but from an intense human effort to better understand highly heterogeneous research subjects like artworks, literary texts or archaeological artifacts [6]. A commonly quoted example of the specificity of digitized humanities data is the work by Willard McCarty on Ovid [7]. Applying 55,000 tags to 12,000 lines of text is a large human effort that reflects the complexity and undecidedness of the object [6]. In the arts and humanities, data will in general be highly qualified rather than highly quantified. The question the arts and humanities e-Science programme is trying to answer is whether e-Science can deliver on the specific needs of the arts and humanities in a digital age and can add value to their research.

There is a clear qualitative case for using e-Science to support arts and humanities research [11]. Quantitatively, however, there is far less need for computational power: the arts and humanities do not, and are never likely to, produce the terabyte-scale volumes of data from sensors which are familiar to (for example) particle physicists and biologists. However, a key to the success of e-Science is the provision of shared access to research facilities and services, which helps scholars operate within the increasingly globalized research framework. It allows researchers from around the world to work together and use each other’s resources as if they were collocated, and digital knowledge objects are created and (re-)used in virtual collaborative spaces.

The e-Science agenda is driven not only by high-performance computing, CPU power or computer networking. It is about pro-active relationships between server and server, programme and programme, and research practitioner and research practitioner. E-Science is about building bridges [1]. Such global collaboration in a virtual space will be of key significance to what arts and humanities researchers are going to be doing over the next ten years.

The arts and humanities initiative in the UK started off by organizing expert seminars that scoped out e-Science requirements for disciplines as unlike as archaeology and library and information science (http://www.ahds.ac.uk/e-science/e-science-scoping-study.htm). The grand challenges for the arts and humanities e-Science programme, as identified in the seminars, were understood to be locating, accessing and integrating the content of highly distributed resources that are likely to be unstandardized, to have been encoded and described using different standards, and to be of variable quality. The grand challenges in the humanities can therefore be better described as ‘making available for humanist research’ rather than as large-scale automated analysis.

As new ways of generating knowledge from data are explored, the arts and humanities will have to find their place in a new data-driven research environment with shared resources and services. Although the community may by now find itself in the position of having access to digital texts, images, moving images or audio materials, a strategy to get the community better involved with this material has yet to be taken forward. The AHRC-JISC-EPSRC e-Science Initiative, a 2m national programme to promote and develop e-Science in the arts and humanities (http://www.ahrcict.rdg.ac.uk/e-science), will form such a strategy, and provide support for researchers in their use of advanced network technologies like the Grid. We will now present the first activities and initial results of this initiative.

3. Activities in arts and humanities e-Science

The activities in the arts and humanities e-Science initiative combine requirements gathering with the planning and implementation of early demonstrators, mostly based on existing e-Science technologies. Knowledge from e-Science projects is transferred to humanities and arts research. First we describe the centre that is intended to be the organizational hub of all the AHRC-JISC-EPSRC e-Science activities. Afterwards, we investigate the suggestions on how to map the methods and systems of e-Science on to the arts and humanities, as they are discussed in workshops and demonstrator projects. We aim to show that there are not only commonalities among some projects, but that one can actually speak of a common research agenda across the disciplines.

3.1. AHeSSC: Arts and Humanities e-Science Support Centre

Funded by JISC, the Arts and Humanities e-Science Support Centre (AHeSSC) is a critical part of the AHRC-JISC-EPSRC initiative (http://www.ahessc.ac.uk). AHeSSC is currently the only permanent institution in the humanities solely dedicated to e-Science research. It is unique, as it creates a space for an emerging e-Science user community to coordinate its activities and create the cultural and institutional prerequisites of an arts and humanities e-Science programme. The centre is co-located with two other unique institutions: the Arts and Humanities Data Service (AHDS) (http://www.ahds.ac.uk) and the Methods Network (MN) (http://www.methodsnetwork.ac.uk). AHDS is a place to bring together digital content produced in the humanities in the UK. It is well known that research results tend to get lost after a project has ended. A data service like the AHDS can help prevent this. The MN supports the general use of computing in the arts and humanities, and AHeSSC fills the gap by researching the impact of new network technologies that allow the virtual collaboration and coordination of digital resources and methods.

AHeSSC exists to support, co-ordinate and promote e-Science in all arts and humanities disciplines, and to liaise across the e-Science and e-Social Science communities, computing and information sciences. AHeSSC services include:

• Practical assistance and liaison to bring together arts and humanities researchers who wish to use advanced networking technologies within the e-Science infrastructure.

• Advisory and training activities in support of e-Science in the arts and humanities.

• Outreach activities to promote e-Science within the arts and humanities academic community.

• Facilitation of interdisciplinary work and the exchange of expertise.

• Supporting projects funded under the AHRC/JISC arts and humanities e-Science Initiative.

AHeSSC will engage with the broader national and international landscape of e-Science and Grid activity, and in particular identify lessons that can be learned for supporting the arts and humanities academic communities.

AHeSSC also works with so-called early adopters: arts and humanities scholars who have recognized the benefits of e-Science and begun to apply its methods and technologies in their work. The relationship between AHeSSC and this community will now be outlined with a series of illustrative examples. We have tried not simply to enumerate examples but to demonstrate that, already at this early stage, a specific research agenda for e-Science in the arts and humanities is emerging. This research agenda focuses on the immediate needs in the community with which e-Science methods and technologies can help.

3.2. Collaboration in arts

The arts already have a tradition of activities directly related to Grid technologies. Technologies like the Access Grid are used in musical composition, the performing arts and language instruction. In the US, the Internet2 arts and humanities initiative [2] challenges Grid technologies with novel usages that exploit the specific character of artistic, event-based data, which places high demands on the transmission and accuracy of multimedia data over networks.

Associated Motion Capture User Categories (AMUC) is a UK project to create a virtual collaboration demonstrator for the arts. AMUC is an excellent example of how arts and humanities projects in e-Science can help a wider community. It is based in Newcastle and jointly run by the Culture Lab at Newcastle University and the North East Regional e-Science Centre. AMUC targets the tracking and capturing of motions that go beyond the everyday use of human bodies. Grid technologies provide the infrastructure to adjust motion capture data to specific user needs and to distribute it across multiple research sites. Particularly in a field like media studies, which is still quite young and highly diversified, e-Science technologies provide the means to accommodate specific research needs. AMUC will take the first step in enabling access to costly data for a wider range of users in the arts and humanities.

Complex, coordinated movements produced by performing artists should theoretically benefit areas that require exact measurements of human body movement, like medical engineering. In motion capture, it remains difficult to analyze the retrieved data, as its quality depends heavily on the quality of the motion sensors and other capture conditions. That is the main reason why a compromise is still sought between the costs of motion capture and the benefits of motion reproduction. Performing artists can assist, as they can already produce body motion in such a way that capture strategies can be explored which will in future enable highly accurate capture of everyday body motions. An interdisciplinary approach to Grid technologies might be able to improve the balance of costs and benefits in complex sensor data gathering and analysis. Here, the complex and fuzzy data typical of the humanities can assist the wider research.


Use of motion capture technology is already quite common among dancers and multimedia artists. AMUC would like to make the research data produced in live digital events more available to media and culture research and at the same time close a gap in e-Science research for the arts and humanities. Here, most projects still rely on data generated by science projects. Archaeologists, for example, use astronomical calculations in order to better understand ancient cultures and their mystical world. This has proven to be a very productive approach. In other disciplines, however, data must be more specific to the particular research needs. In the performing arts we deal with data that only exists as an event. That makes this research area unique in its data requirements.

Like most of the other projects in the arts and humanities e-Science initiative, AMUC will make use of existing e-Science technology like the Newcastle upon Tyne campus Condor Grid. Three web services will configure the Grid infrastructure to the needs of arts and humanities motion capture research. A live capture and storage service will work together with a bioengineering analysis service. These technologies will be tested for their ability to satisfy the specific data needs of performing arts research.

AMUC’s attempt to measure the usefulness of Grid technologies in the performing arts is accompanied by two series of workshops that will bring together the arts community with e-Science technology. The Building the Wireframe: E-Science for the Art Infrastructure workshops scope out the potential future use of Grid technologies and work on community awareness. The Midlands E-Science Centre and the Visualization Research Unit at the Birmingham Institute of Art and Design will work in partnership to familiarize the arts community with the new possibilities of Grid technologies and the new way of thinking that e-Science brings about. The project involves hands-on training in e-Science technology for interested groups to enable future independent research in the area.

The second series of workshops, Performativity, Place, Space (PPS), based in Bristol, creates a space for arts researchers to explore the use of several Grid technologies in collaboration among performative artists and researchers. Performative arts have been at the forefront in creating unprecedented usages of Grid technologies like the Access Grid. As early as 2002, dance artist Kelli Dipple used the Access Grid to integrate three continents into one dance performance (http://mrccs.man.ac.uk/global supercomputing/SC-Global/artists.html).

The arts are able to demonstrate the potential social use of Grids and other advanced network technologies. Blogs and wikis are only the best known social technologies that make the internet a creative public space. Performative arts practitioners push Grid technologies to new boundaries and make them the basis for new public spaces. Their applications are close to a future social use of Grids, which might see everyday live interaction and collective work spaces.

Access Grid, Semantic Web technologies and Storage Resource Broker are used in PPS to facilitate data discovery through Grid technology. The project builds upon a Semantic Web project called ‘Practice as Research in Performance’ (PARIP) [10], which uses the Friend-of-a-Friend (FOAF) paradigm of semantic integration in user communities to develop links between arts researchers in a database of performance. A web-based interface showcases how research into performance can be queried.
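The Friend-of-a-Friend idea of linking researchers can be illustrated with a minimal sketch. It is not PARIP's implementation: the triples, the `ex:` identifiers and the function names `neighbours` and `reachable` are assumptions made for illustration, and only the property names `foaf:name`, `foaf:knows` and `foaf:made` come from the FOAF vocabulary. The sketch shows how links between people, once expressed as simple statements, can be followed to discover researchers who are connected only indirectly.

```python
from collections import deque

# Hypothetical FOAF-style triples: (subject, predicate, object).
# The people and the "ex:" identifiers are invented for illustration only.
TRIPLES = [
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:bob", "foaf:name", "Bob"),
    ("ex:carol", "foaf:name", "Carol"),
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:bob", "foaf:knows", "ex:carol"),
    ("ex:carol", "foaf:made", "ex:performance-1"),
]


def neighbours(person, triples):
    """Return everyone linked to `person` by foaf:knows, in either direction."""
    linked = set()
    for subject, predicate, obj in triples:
        if predicate != "foaf:knows":
            continue
        if subject == person:
            linked.add(obj)
        elif obj == person:
            linked.add(subject)
    return linked


def reachable(start, triples, max_hops):
    """Breadth-first search over foaf:knows links, up to `max_hops` steps.

    Returns a dict mapping each reachable person to their distance in hops.
    """
    distance = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if distance[current] == max_hops:
            continue  # do not expand beyond the hop limit
        for other in neighbours(current, triples):
            if other not in distance:
                distance[other] = distance[current] + 1
                queue.append(other)
    del distance[start]
    return distance
```

Calling `reachable("ex:alice", TRIPLES, 2)` follows the links from one researcher to a friend of a friend. A production system would more likely store such statements in an RDF store and query them there; plain tuples are used here only to keep the sketch self-contained.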

PPS integrates practice-led research in a novel way by establishing links across disciplines and departments. In this way, PPS is able to make effective use of the Grid’s virtual collaboration platforms. The project will consist of three workshops that will introduce a wider community of researchers to the technologies available and will start planning the integration of PARIP with SRB-based video annotation tools like Transana. Transana helps professional researchers working with digital video and audio data in two ways. Firstly, it helps annotate and manage data; secondly, the results can be shared within a research group. The workshops will result in a project outline to build an integrated platform for these technologies.

The focus of the projects described here is to discover Grid technologies as a social collaboration space. They explore the specific requirements of data that only exists as a live event, and will therefore be close to a future wider use of Access Grid technologies in community webs, much as blogs and wikis currently are for the World Wide Web. At the same time they help overcome limitations of research in arts disciplines by bringing together physically dispersed people from highly diversified disciplines in a virtual space. Master classes in dance and music can now reach students in all parts of the country.

3.3. VOs, VREs and user needs

Following extensive consultations with the academic community, the AHRC-JISC-EPSRC e-Science Initiative is focusing its first-year programme of activities on so-called ‘early adopters’: scholars from ‘traditional’ arts and humanities backgrounds who have recognized the potential of e-Science and begun to use its tools and methods in their own research. However, even in this small community, there remains a certain lack of awareness about such tools and methods. There is a need both for education, to help researchers better understand how they might (and might not) benefit from engaging with Grids, and for training, to foster relevant and specific skills among those who wish to do so. Specific initiatives have been funded to address this. As noted above, AHeSSC itself has a role in promoting e-Science in these communities, and the AHRC e-Science Scoping Survey has a specific mandate to raise awareness via a ‘knowledge base’ of methods and tools in e-Science that it is constructing.

Against this background, the Centre for the Study of Ancient Documents (CSAD) at Oxford University is convening an AHRC workshop to identify formal methods by which researchers, both experienced e-Science users and those new to the field, can determine which research needs within their proposals may benefit from e-Science methods and technologies, and what services and resources are available to them. This is rooted in existing requirements gathering work at Oxford, namely the Building a Virtual Research Environment for the Humanities project, which has conducted a detailed survey of user needs within the university and will be reporting shortly.

Like the AHRC Scoping Study, the Oxford workshop project will focus on identifying methods and tools in the existing e-Science community that may be of interest to arts and humanities scholars. However, the focus here will be on planning and implementing the formal methods that these researchers will need in order to find such tools in the first place. This has long been recognized as a problem with humanities computing research applications, defined here as those involving advanced computation methods, but not e-Science: scholars are often not aware of easily importable tools and methods in adjacent fields, and are thus not in a position to maximize their wider benefits [3]. The Oxford workshop forms part of a wider national effort to ensure that, in the early stages of the e-Science initiative, the arts and humanities early adopter community does not duplicate work already done on e-Science user requirements analysis in the physical sciences.

The CSAD is also hosting an EPSRC demonstrator project entitled A Virtual Workspace for the Study on Ancient Documents. The primary research material for this demonstrator will be degraded or damaged documents written on papyrus, wood and stone. It will provide a digital infrastructure of texts, corpora, dictionaries and other resources, and the compute power needed to allow researchers to work with them in a fully integrated environment. The benefits of such an approach with material of this kind are clear: the ancient documents in question are very fragile and geographically dispersed. Many of the documents are also disarticulated, with different parts in different locations.

The virtual workspace will provide tools for reassembling documents into a larger logical document, wherever possible. Crucially, the workspace will make available a range of existing infrastructure: the Thesaurus Linguae Graecae, the Lexicon of Greek Personal Names, the well-known digital library of Classical texts Perseus, the Packard Humanities Institute Greek Epigraphy Project, etc. It will thereby add value to these resources, while at the same time enabling collaborative access to the digitized primary sources. This type of research problem, where primary documentary material is dispersed, fragile or inaccessible, is unique to the humanities. The material has to be reconstructed and made available virtually alongside tools for its analysis. The virtual workspace project demonstrates a method for overcoming this problem to which e-Science is essential.
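The reassembly of dispersed fragments into one logical document can be pictured with a small data-structure sketch. This is an illustration under stated assumptions, not the workspace's actual design: the `Fragment` fields, the identifiers and the functions `reassemble` and `logical_text` are all hypothetical. It shows only the basic idea that fragments held in different collections can be grouped by the document they belong to and put back in order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    """One physical piece of a document (all fields are hypothetical)."""
    document_id: str  # identifier of the logical document it belongs to
    position: int     # order of the fragment within that document
    location: str     # collection that holds the physical fragment
    text: str         # transcription of the fragment


def reassemble(fragments):
    """Group fragments by logical document and order them by position."""
    documents = {}
    for fragment in fragments:
        documents.setdefault(fragment.document_id, []).append(fragment)
    for parts in documents.values():
        parts.sort(key=lambda part: part.position)
    return documents


def logical_text(parts):
    """Join the transcriptions of ordered fragments into one text."""
    return " ".join(part.text for part in parts)


# Invented example data: two pieces of one tablet held in different places.
FRAGMENTS = [
    Fragment("tablet-1", 2, "Collection B", "second half"),
    Fragment("tablet-1", 1, "Collection A", "first half"),
    Fragment("tablet-2", 1, "Collection A", "complete text"),
]
```

With this data, `logical_text(reassemble(FRAGMENTS)["tablet-1"])` joins the two halves in reading order regardless of where each piece is physically stored. Real fragments rarely carry a known position, so a working system would also need scholarly judgement and image analysis to propose joins.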

The same funding stream has supported the Virtual Vellum project at the University of Sheffield. The aim of this project is to facilitate access to high-end digitizations of folios of the Froissart Chronicles. Like the Oxford demonstrator, this enterprise builds on existing VRE development work, in this case the work done on Sheffield’s Froissart Manuscripts Project. This demonstrator also deals with very rare, fragile and physically inaccessible material. The Froissart Chronicles themselves, an essential record of the Hundred Years War between England and France, can only be in one place at a time. Whenever they are moved from their local collection they must be accompanied by an expert curator and security staff, and be stored in a tightly controlled environment costing many thousands of pounds to purchase and run. Even when they are in their home collections, access for scholarly researchers, never mind the viewing public, is almost impossible. The Virtual Vellum project provides a medium, based on the White Rose Grid consortium, for online, real-time viewing of the manuscripts that overcomes these problems. The project also brings a new, humanities-specific dimension to the VRE concept: a Grid-based exhibition of the manuscript folios alongside exhibits of the Royal Armouries Museum (Leeds), which will be streamed on the White Rose Grid to the Tower of London and the Royal Armouries’ premises at Louisville, Kentucky.

Both the Oxford workspace and the Virtual Vellum project highlight an interesting new aspect which the meeting of the arts and humanities and e-Science brings: the blurring of the line between a Virtual Organization and a Virtual Research Environment. Previously, the former has been understood as a collection of people working together on a Grid: the CERN LHC system, for example, will be using Grids to deliver terabyte-volume datasets to a worldwide constituency of scientists who form what is classically understood as a Virtual Organization. On the other hand, a VRE has been defined in the UK as a Grid-enabled network delivering tools, facilities, server power and resources to a research group or groups by linking tools, resources, humans, datasets, etc. [4]. The Integrative Biology VRE project at Oxford, which seeks to ‘understand biological systems through the construction of large-scale software systems that simulate biological behaviour at a variety of spatial and temporal scales’ (http://www.vre.ox.ac.uk/ibvre), is a classic example of this. However, the intrinsically human nature of humanities research means that humanists’ collaborative needs are imported into Virtual Organization and VRE architectures. A very good example of this is the Silchester Roman Town VRE project at Reading University (http://www.silchester.rdg.ac.uk/vre), which allows experts in a diffuse range of disciplines and from a range of diverse locations to access data from a range of server repositories over the Grid in a single interface. Thus, Silchester fits the description of both a Virtual Organization and a VRE. The conceptual and technological implications of this are likely to be very significant for the implementation of Grid technology in the arts and humanities.

This merging of two relatively well-established e-Science concepts will require a radical rethink of institutional, technological and political infrastructures for the future. As noted above, the AHDS is the UK’s principal national digital library for arts and humanities data. Currently it is constituted as a federation of five subject centres dedicated to particular arts and humanities disciplinary domains, and an Executive office. As it moves from a central server model of data curation to a Grid-based structure, this constitution will need to be reconsidered. A better infrastructure model would be for the AHDS to operate through a series of regional interdisciplinary centres, much as the e-Science Core Programme does for the physical science communities (http://www.nesc.ac.uk/centres). This would greatly encourage the cross-community trends described above, and give UK arts and humanities e-Science the cohesion it needs to interoperate with comparable international infrastructures such as EGEE (Enabling Grids for E-sciencE) and TeraGrid in the US.

3.4. Managing the data deluge

In the digital age, highly structured and complex humanities data from archives reaches terabyte sizes. The most famous example is the Shoah Foundation data archive [11]. This 180 terabyte multimedia archive of Shoah testimonies collates the histories of survivors so that future generations can learn the lessons of the atrocities in the camps. In the future, with the growing recording of human history in videos, images, databases, etc., we will see more archives of such size that are readily available in a digital format. The UK’s arts and humanities e-Science initiative investigates appropriate technologies that make human data accessible to a wider community of researchers and the public.

The London-based series of workshops titled Researching e-Science Analysis of Census Holdings (ReACH) is investigating how to make use of historical census data not only for academic research but also for the wider public interest in genealogical research. Historical census data is an example of human data that is already used in several computing applications. Digitized census data, however, is not readily available to researchers, either because it is not known to the community or because researchers do not have access to sufficient computing power to deal with such data. A full investigation into census data with advanced computing methods has not yet taken place. Such an investigation promises a better and more complete analysis of changes in population - a research question that becomes more and more relevant in a globalized world.

ReACH is led by the department of Library and Information Studies at University College London. Library and Information Studies is a humanities discipline that has a long history of complex computing applications. ReACH will initiate a discussion among library and archive specialists, history researchers, and computing specialists that will deal with the technical, legal, and managerial limitations on the accessibility of human data in archives and libraries.

The benefits are mutual. Library and Information Studies has the expertise in understanding and analyzing large-scale archives, while Grid technologies can provide answers to current limitations of the automated analysis of archival data. E-Science not only allows large data sets to be analyzed efficiently; the analysis can also be specified in such a way that complex research needs, such as the specific care required in dealing with human data, are addressed.

The ReACH workshops are unique in that they can rely on existing data and scope out the analysis of such data from a technical and also a managerial point of view, with one workshop dedicated solely to the latter. In the humanities, which work with ‘human data’, the managerial point of view requires particular attention. Not only intellectual property rights but also wide-ranging ethical concerns have to be addressed before any data analysis can be done. In the humanities the data itself needs to be used responsibly, whereas in the sciences the data is generally produced during simulations and experiments, and it is their results that can be ethically problematic.

Apart from the organizational and managerial side addressed in ReACH, there is a serious issue to be dealt with in terms of how data is stored, managed and made interoperable. This is true of all kinds and formats of data, but the need is felt particularly acutely in domains dealing with very technical and/or specialized data types, such as those of Geographical Information Systems (GIS). The Queen’s University Belfast’s AHRC workshop, Geographical Information System e-Science: developing a roadmap, is assessing this issue from the point of view of geospatial data. Through a series of four workshops, it will address current GIS usage in the arts and humanities, the difficulties experienced by humanities researchers employing the technology and the advantages the technology brings, and it will develop a roadmap for future usage. It will focus on the usage of digitizations of major established resources such as the English Place Names Survey, which provides all the alternative variants of a huge number of toponyms, and on how they can be used to best advantage on GIS platforms. It also highlights a need for basic connective infrastructure: common thesauri, chronological definitions, and controlled vocabularies to ensure that spatial data - particularly in the historical domains - is described in a managed way so that it can interoperate properly and be cross-searched and cross-analyzed.
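
The role of a controlled vocabulary for toponyms can be illustrated with a small sketch. The following Python fragment is not taken from any of the projects described here. It assumes a hypothetical in-memory gazetteer (the class Gazetteer, the identifier place:york and the approximate coordinates are illustrative inventions) and shows how variant place names might be folded onto one managed identifier, so that records using different spellings can be cross-searched.

```python
import unicodedata


def normalize(name: str) -> str:
    """Fold case, strip diacritics and collapse whitespace so that
    spelling variants of a toponym compare equal."""
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(stripped.casefold().split())


class Gazetteer:
    """Hypothetical controlled vocabulary: many variant names, one identifier."""

    def __init__(self):
        self._variant_to_id = {}
        self._records = {}

    def add_place(self, place_id, preferred, variants, lat, lon):
        # Store one managed record per place and index every known variant.
        self._records[place_id] = {"preferred": preferred, "lat": lat, "lon": lon}
        for name in [preferred, *variants]:
            self._variant_to_id[normalize(name)] = place_id

    def resolve(self, name):
        """Return the managed identifier for a variant, or None if unknown."""
        return self._variant_to_id.get(normalize(name))

    def record(self, place_id):
        return self._records[place_id]


def build_example_gazetteer():
    # Illustrative entry only: historical names of York, approximate coordinates.
    gaz = Gazetteer()
    gaz.add_place("place:york", "York", ["Eboracum", "Eoforwic", "Jórvík"], 53.96, -1.08)
    return gaz


if __name__ == "__main__":
    gaz = build_example_gazetteer()
    for variant in ["Eboracum", "JORVIK", "Londinium"]:
        print(variant, "->", gaz.resolve(variant))
```

Resolving every source record to such an identifier before loading it into a GIS platform is one possible way of making datasets that use different historical spellings searchable together; a production gazetteer would also need the chronological definitions mentioned above, which this sketch leaves out.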

The Access Grid will be explored further as a means of community building and data exploration by an AHRC workshop project at the University of Sheffield. This programme of four workshops will focus on the integration of different kinds of humanities data in Access Grid environments. As noted above, the disparate and unstandardized nature of humanities data is one of the principal ‘grand challenges’ for e-Science, and the Access Grid therefore provides an important test environment for the integration of such data into Grid infrastructures. The first workshop will examine digital images, and will be led by the Principal Investigator of the Virtual Vellum demonstrator. The second will examine digital texts, and in particular how Access Grids can facilitate shared interaction with textual data, with geographically dispersed scholars in a virtual venue collaborating in real time on textual material: this has important implications for the distinction, discussed above, between a virtual organization and a virtual research environment. Sound and moving images will be dealt with by a third meeting, which will compare the results of fairly mature projects, and will explicitly test the technical capabilities of Access Grids for humanists streaming sound and video material through a virtual venue. The final workshop will address virtual reality and reconstruction, a burgeoning and rapidly developing area of humanities e-Science.

The workshop at Sheffield is especially significant in that it sets the range of humanities data in the e-Science context most familiar in terms of collaboration. As humanist scholars move towards larger and more formal modes of collaboration, there will be an increasing need for a seamless and fluid environment both in which to transfer that data, and in which to collaboratively edit and annotate it. Access Grids, as a set of collaborative methods well established in the scientific domains, provide a logical backdrop against which to take forward the humanities e-Science agenda in this respect.

4. Conclusion

The activities within the UK’s arts and humanities e-Science community show that there are specific needs that must be addressed to make e-Science work within these disciplines. All projects agree that the data relevant to humanist research is particularly fuzzy and inconsistent, as it is not automatically produced, but is the result of human effort. Data in the arts and humanities is discursive [8] and not just a simple collection of facts. It is fragile and its presentation is often difficult, as with data in the performing arts that exists only as an event. However, with projects like the Shoah archives, more data like this can be expected in the future, and the technologies to make such data available to a larger public need to be examined.

The arts in particular, but also other humanities projects, explore the social use of the Grid and other advanced network technologies like the Semantic Web. They showcase how people interact across physical boundaries, where this interaction can become as complicated as distant master classes in music.

Another recurring topic in the early arts and humanities e-Science projects is how to foster and enhance interdisciplinarity not only between sciences, social sciences and humanities, but also within the humanities themselves. The division of the humanities into sometimes arbitrary disciplines is pronounced and often hinders research. In order to make the Grid work in the humanities, it is essential to reorganize their output so that it is interoperable and reusable, a standard long since reached in, for example, computer science with the concepts of module-based and object-oriented programming. Humanities research must learn again to work together and to see the benefit of doing so.

There are, however, also benefits that an arts and humanities e-Science agenda can have for the rest of the e-Science community. There are specific advantages, such as the data that performing artists can provide for body motion technologies or the exploration of the social use of Grid technologies. There is also a general benefit: an arts and humanities e-Science initiative reminds Grid developers that the most important nodes in networks are the human nodes, who up until now deliver most of the knowledge production. The sometimes-blurred distinction between a virtual organization and a virtual research environment, as it is apparent within the humanities Grid research projects, is a reminder that e-Science is about supporting human research and not about substituting for it.

5. Future work

The arts and humanities e-Science initiative will significantly widen its scope and activities with the new major funding and research opportunities starting in September 2006. As already mentioned, EPSRC, AHRC, and JISC have committed to fund several major research projects beyond the existing workshops and small-scale demonstrators, as well as studentships to support PhD work in the field. At the same time, institutions like the AHDS see the benefits of Grid-enabling technologies in moving towards a distributed infrastructure.

Current discussions within the initiative tackle questions about the shape the new e-infrastructure for humanities research could take and the technologies that could help address the specific issues of arts and humanities data. On the data consolidation level, technologies like the Storage Resource Broker (SRB) for easy distributed file access, or OGSA-DAI as middleware to assist with distributed database access, are considered to be highly useful. Shibboleth is an Internet2 technology that provides single sign-on access to data resources. It is currently being investigated by JISC as a future method of providing access to online digital resources. SRB, Shibboleth and Semantic Web technologies are being tested in early adopter projects at the AHDS for their use in providing access to and organizing arts and humanities data.

Another important development is the integration of metadata and ontologies as well as other advanced data management technologies in the newly developed e-infrastructure. As humanities data is so dispersed, metadata technologies could help cross institutional boundaries. Moreover, it has been shown that the development of domain ontologies can help researchers to understand research questions in the domain better [8]. Examples of potentially useful metadata initiatives are the Open Archives Initiative (http://www.openarchives.org/), the CIDOC Conceptual Reference Model (CRM) for concepts and relationships used in cultural heritage documentation (http://cidoc.ics.forth.gr/), and the Methods Taxonomy to describe humanities computing methods (http://ahds.ac.uk/about/projects/pmdb-extension/). To meet the challenges of inexact and inconsistent data that exists only as events and discourses, text and other data mining technologies such as those used in the US-based Nora project (http://nora.lis.uiuc.edu/) can be of help to researchers and might become useful tools within a virtual research environment. Probabilistic indexing and record linking are being investigated for the second stage of the ReACH project. Both technologies are already used successfully in social science and humanities projects (http://www.data-archive.ac.uk/randd/vpsreportforrrb.pdf).
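
To give a sense of what probabilistic record linking involves, the following is a minimal Python sketch of agreement-weight scoring in the style commonly attributed to Fellegi and Sunter. It is an illustration under stated assumptions, not the method chosen by ReACH: the field names, the m and u probabilities, the thresholds and the census records are all hypothetical values invented for the example.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldModel:
    m: float  # assumed P(field agrees | both records describe the same person)
    u: float  # assumed P(field agrees | records describe different people)

    def weight(self, agrees: bool) -> float:
        """Log-likelihood-ratio weight contributed by one field comparison."""
        if agrees:
            return math.log2(self.m / self.u)
        return math.log2((1 - self.m) / (1 - self.u))


# Hypothetical parameters; in practice these would be estimated from data.
FIELDS = {
    "surname": FieldModel(m=0.95, u=0.01),
    "forename": FieldModel(m=0.90, u=0.05),
    "birth_year": FieldModel(m=0.85, u=0.02),
    "birthplace": FieldModel(m=0.80, u=0.10),
}


def match_score(a: dict, b: dict, fields=FIELDS) -> float:
    """Sum the agreement or disagreement weights over all compared fields."""
    return sum(
        model.weight(a.get(name) == b.get(name)) for name, model in fields.items()
    )


def classify(score: float, upper: float = 8.0, lower: float = 0.0) -> str:
    """Assumed thresholds: scores between them are left for clerical review."""
    if score >= upper:
        return "link"
    if score <= lower:
        return "non-link"
    return "possible link"


# Fictional records, invented for illustration only.
RECORD_A = {"surname": "smith", "forename": "mary", "birth_year": 1850, "birthplace": "leeds"}
RECORD_B = {"surname": "smith", "forename": "mary", "birth_year": 1850, "birthplace": "york"}
RECORD_C = {"surname": "jones", "forename": "ann", "birth_year": 1862, "birthplace": "hull"}

if __name__ == "__main__":
    for other in (RECORD_B, RECORD_C):
        score = match_score(RECORD_A, other)
        print(round(score, 2), classify(score))
```

The design point is that a single disagreeing field, such as a differently recorded birthplace, lowers the score without ruling out a link, which suits the inexact and inconsistent data described above.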

References

[1] D. De Roure, N. R. Jennings, and N. R. Shadbolt. The semantic grid: a future e-science infrastructure. In F. Berman, A. Hey, and G. Fox, editors, Grid Computing: Making the Global Infrastructure a Reality, pages 437–470. John Wiley and Sons, Ltd, Hoboken, NJ, 2003.

[2] A. Doyle. Internet2 arts and humanities initiatives: Innovation in advanced networking and education, 13-06-2006.

[3] S. Dunn and A. Dunning. Community service: Collaborative technologies and the development of scholarly communities in the arts and humanities. In Digital Resources in the Humanities, Lancaster, 2005. Lancaster University.

[4] M. Fraser. Virtual research environments: Overview and activity. Ariadne, (44), July 2005.

[5] T. Hey and A. Trefethen. The data deluge: an e-science perspective. In F. Berman, A. Hey, and G. Fox, editors, Grid Computing: Making the Global Infrastructure a Reality. John Wiley and Sons, Hoboken, NJ, 2003.

[6] J. Kircz. E-based humanities and e-humanities on a surf platform. Technical report, SURF-DARE, 1 June 2004.

[7] W. McCarty. Depth, markup and modelling. CHWP (Computing in the Humanities Working Papers), (A25), 2003.

[8] G. Nagypal, R. Deswarte, and J. Oosthoek. Applying the semantic web: The vicodi experience in creating visual contextualisation for history. Literary and Linguistic Computing, pages 1–23, 2005.

[9] M. Nentwich. Cyberscience. Research in the Age of the Internet. Austrian Academy of Science Press, Vienna, 2003.

[10] S. Price, A. Piccini, S. Agarwal, B. Kershaw, B. Joyner, and L. Miller. Parip explorer - researching the researchers. In International Semantic Web Conference, Hiroshima, 2004.

[11] J. Unsworth. The draft report of the American Council of Learned Societies commission on cyberinfrastructure for humanities and social sciences, 2006.
