A. Sallans. "Understanding the Big Picture of e-Science." Presented at the 2011 eScience Bootcamp at the University of Virginia's Claude Moore Health Sciences Library. 4 March 2011
UNDERSTANDING THE
BIG PICTURE OF E-SCIENCE
Andrew Sallans
Head of Strategic Data Initiatives
University of Virginia Library
E-Science Bootcamp
Claude Moore Health Sciences Library, University of Virginia
4 March 2011
OUTLINE
What it's all about
Examples
Implications
UVA Libraries Response (Round 1)
WHAT IT'S ALL ABOUT (AROUND 1999)
"e-Science is about global collaboration in key areas
of science, and the next generation of
infrastructure that will enable it."
"e-Science will change the dynamic of the way
science is undertaken."
Dr Sir John Taylor
Director General of Research Councils,
Office of Science and Technology
United Kingdom
Source: http://webscience.org/person/8.html
WHAT MADE THIS POSSIBLE?
Internet/World Wide Web
Faster networking (fiber, special research
networks, advances in grids)
Better storage (higher capacity, faster access,
better reliability)
Cheap storage (costs keep decreasing)
Major funding initiatives
Broader interest in collaboration
SOME COMMON TERMS
Computational science
Scientific computing
Research computing
High-performance computing
Cyberscience
Cyberinfrastructure
CLIMATOLOGY RESEARCH
Sources:
1) Climate Simulation on Cray XT5 “Jaguar” supercomputer, ORNL
(http://www.ornl.gov/info/ornlreview/v42_3_09/article02.shtml)
2) Cray XT5 “Jaguar” supercomputer, ORNL
(http://www.ornl.gov/info/ornlreview/v42_1_09/images/a05_p04_xt5_full.jpg)
LARGE HADRON COLLIDER AT CERN
Circumference: 26,659 meters
Magnets: 9,300
Speed: protons move at 99.9999991% of the speed of light
Collisions/second: 600 million
Data produced: equivalent to 100,000 dual layer DVDs per year
LHC Grid: tens of thousands of computers around the world used collectively to analyze data (will take 15 years)
Source: CERN website (http://cdsweb.cern.ch/record/975468/files/its-2006-003.gif?subformat=icon)
BIOMEDICAL INFORMATICS GRID (CABIG)
Launched as test in 2004
Adopted by over 50 NCI-designated cancer centers
Focused on:
Connecting scientists and practitioners through a
shareable and interoperable infrastructure
Development of standard rules and a common
language to more easily share information
Building or adapting tools for collecting, analyzing,
integrating, and disseminating information associated
with cancer research and care
Source: caBIG website, National Cancer Institute (https://cabig.nci.nih.gov/)
CITIZEN SCIENCE…THE SOCIAL SIDE
Source: Zooniverse, Real Science Online (http://www.zooniverse.org/home)
34,617,406 clicks done by 82,931 users!
IMPLICATIONS FOR RESEARCH
Greater emphasis on technology
Increase in interdisciplinary research and
collaboration
Often bigger data, with far more complex
associated issues (storage, access, expertise,
funding, preservation, etc.)
Need for innovative approaches and integration
into education/curriculum
DATA TSUNAMI
Sources:
1) The Great Wave off Kanagawa, Katsushika Hokusai. Found on Wikipedia.
2) The Diverse and Exploding Digital Universe, IDC, May 2010 (http://www.emc.com/collateral/analyst-reports/diverse-exploding-digital-universe.pdf)
IDC estimate of about 1.7 zettabytes (1 zettabyte = 1 billion terabytes) around 2011
….twice the available space
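For a sense of scale, the unit arithmetic behind the IDC figure can be sketched as follows. This is a minimal illustration, assuming decimal (SI) units where 1 terabyte is 10^12 bytes and 1 zettabyte is 10^21 bytes; the function name is hypothetical and not part of the original presentation.

```python
# Decimal (SI) storage units, expressed in bytes (assumption: SI, not binary, prefixes).
TERABYTE = 10**12
ZETTABYTE = 10**21


def zettabytes_to_terabytes(zb: float) -> float:
    """Convert zettabytes to terabytes using SI (powers-of-ten) units."""
    return zb * ZETTABYTE / TERABYTE


estimate_zb = 1.7  # IDC figure quoted on the slide
print(f"{estimate_zb} ZB is about {zettabytes_to_terabytes(estimate_zb):,.0f} TB")
```

Under these assumptions, one zettabyte works out to one billion terabytes, so the quoted estimate is on the order of 1.7 billion terabytes.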
BUT, NOT ALL DATA IS EQUAL….
Source: Long Tail, Wikipedia (http://en.wikipedia.org/wiki/The_Long_Tail)
CASE STUDY: UVA LIBRARIES RESPONSE
(ROUND 1)
Collaboration established around 2005 through
discussions between ITC and Library, and
impetus of Frye Institute capstones.
Research Computing Support services were in need of
greater visibility; the Library was seeking ways to
support changes in scientific research; collocation
provides mutual benefits.
In 2006, staff moved to Library locations
(Research Computing Lab & Scholars' Lab) and
set up new service points and services.
RESEARCH IN THE E-SCIENCE WORLD
Heavy use of electronic information resources
Work is predominantly done from a lab/office, not
in the Library
Collaboration is fundamental, but researchers don't always
know people in other domains
Grad students are usually bringing new
technology/methods into the team (learning more
about grad students in a research study now)
IDENTIFIED E-SCIENCE TRENDS
Various components
Computationally intensive science
IT/software/infrastructure
Collaboration
Data
Often intertwined with Open Access initiatives
E-SCIENCE IN OTHER LIBRARIES
Purdue University
Focus on data curation
IATUL Conference, June 2010
University of Illinois at Urbana-Champaign
Focus on data curation
Summer Institute on Data Curation
Cornell University
Metadata consulting services
University of New Mexico
Major DataONE grant
RESEARCH COMPUTING LAB RESPONSE
Aiming to provide support across the entire
scientific research data lifecycle
Staff with expertise in:
Data
Quantitative data, statistics
Modeling, visualization
Scientific publishing
Emphasis on consulting, not drop-off services
Partnership with traditional librarians to help
ease transition to new support models
RCL OUTREACH
University Community
Speaker series 2006, 2007, 2008
Research 2.0 Symposium
Partnerships with courses, other units (e.g., MLBS)
Short course series each semester
Library Community
Panel at the ACCS Conference in 2007
Poster at ARL/CNI Forum in 2008
Poster at STS Section of ALA in 2009
Journal article in JLA in 2009
SAMPLE RCL CONSULTATIONS
STS Undergrad Environmental Justice (2008)
Development of technology solutions for empowering the citizen scientist
Web 2.0 tools, data collection/management
Data analysis
Economics Graduate Student (2008/2009)
Airline flight price modeling
Screen scraping, data collection/management
Data analysis
Mountain Lake Beetle Project (2009)
Mobile data acquisition/collection solution
Database development/management, programming
Data analysis
Archiving of dissertation data (2009)
EVSC student, ModelMaker 4.0 data
Biology student, IDL, Matlab, R code
SPECIFICS FOR MEDICAL CENTER
At least 600 RCL support requests from Medical
Center from October '07 through December '09
Medical Center patrons are heavy users of
computational software like Matlab, SAS,
LabView
Increasing emphasis on collaboration
(translational research)
Greater attention to open access (NIH policy)
Growing interest in areas like image integrity
TAKE-AWAYS
This is the future
Heavily growing space, lots of opportunity
Requires big investment and commitment, the
biggest being training and priority alignment
Libraries and institutions need to make decisions
on what to do and what not to do
It's a culture change for libraries,
institutions, and researchers
COMING LATER….(ROUND 2)
“Practical Applications of e-Science” in UVA
Libraries today
QUESTIONS?
Please feel free to contact me with questions:
434-243-2180
Twitter: asallans
ADDITIONAL INFORMATION
E-Science Talking Points for ARL Deans and
Directors, Elisabeth Jones, University of
Washington, October 2008
(http://www.arl.org/rtl/escience/)