Planning and managing your
research project
Dr Eddy Verbaan
Research Data Manager
(0114) 225 38 52
Open research
• Transition period
• From toll access to publications and undisclosed research data
• To free access for all to publications and research data
• Affects researchers, publishers and many others
• Open access to publications
• Research data management including open access to research data
Your thesis: Research Archive (SHURA), http://shura.shu.ac.uk
Your data: Research Data Archive (SHURDA), http://shurda.shu.ac.uk
Adsetts Library
Your project
Planning
• Year 1
• Planning how you will gather, archive and share your data
Executing
• Year 2 and 3
• Gathering and analysing data, with archiving and sharing in mind
• Writing up, with publishing your thesis online in mind
Closing
• Year 3
• Archiving and sharing the underpinning and any other data (SHURDA)
• Making your thesis open access (SHURA)
Before research During research After research
Session outline
1. Open access (OA)
– What is open access?
– Copyright and your electronic thesis
2. Research data management (RDM)
– What is research data management?
– Research data management at SHU
1. OPEN ACCESS AND ELECTRONIC THESES
What is Open Access?
• Open Access means access that is
– online
– free of charge (no access restrictions)
– free of most copyright and licensing restrictions
• Authors do not relinquish their copyright, but attach a licence to their work that usually requires attribution of the author through proper citation
• You would normally be able to download the work, copy it, distribute it, print it, search it, pass it as data through software, etc
• Open access is relevant for all formats, such as peer-reviewed journals, conference proceedings and doctoral theses
The problem
• A global movement that started around 2002 to address problems with commercial "toll access" publishing models
• Above-inflation price rises for journal subscriptions but stagnant library budgets mean access to relevant literature is increasingly limited
– price rises are on average four times inflation, 7.3% per annum at SHU
• An increase in published literature means academics can access a decreasing percentage of available literature
– published literature grows exponentially, by 5% per annum
• The only sustainable model is removing the cost of access to literature by changing publishers' business models
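To give a feel for what these annual rates mean over time, the following minimal sketch compounds them. It assumes the rates stay constant from year to year, which is a simplification; the function names are illustrative and not part of the original slides.

```python
import math


def doubling_time(annual_rate: float) -> float:
    """Years needed for a quantity to double at a constant annual growth rate."""
    return math.log(2) / math.log(1 + annual_rate)


def growth_factor(annual_rate: float, years: int) -> float:
    """Factor by which a quantity has grown after the given number of years."""
    return (1 + annual_rate) ** years


# Rates taken from the slide; holding them constant is an assumption.
SUBSCRIPTION_PRICE_RISE = 0.073  # 7.3% per annum
LITERATURE_GROWTH = 0.05         # 5% per annum

if __name__ == "__main__":
    print(f"Prices double in about {doubling_time(SUBSCRIPTION_PRICE_RISE):.1f} years")
    print(f"Literature doubles in about {doubling_time(LITERATURE_GROWTH):.1f} years")
    print(f"Price factor after 10 years: {growth_factor(SUBSCRIPTION_PRICE_RISE, 10):.2f}")
```

Under these assumptions subscription prices double in roughly ten years and the volume of literature in roughly fourteen, while a stagnant budget stays where it is.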
Open access types
Green open access
(Self archiving)
Authors publish in the normal
way and then deposit a
version of their work for free
public use in their institutional
repository. Often the
research is only publicly
available after an embargo
period.
Gold open access
Authors publish in a fully
open access or hybrid journal
that provides immediate open
access (often involves
payment of a fee)
Electronic theses in the UK
• Most universities in the UK require submission of an electronic version of a thesis
• Published in an institutional online repository
• SHU Research Archive (SHURA), http://shura.shu.ac.uk
• The thesis will then also be available via EThOS (Electronic Theses Online Service) from the British Library
• National repository for UK doctoral theses, http://ethos.bl.uk
• Over 400,000 records from over 120 institutions
• About 1/3 have full text PDF available for download via
– EThOS
– a link to an institutional repository
• You can also order scanned copies of older theses
Your thesis: Research Archive (SHURA) and EThOS
Your data: Research Data Archive (SHURDA)
Adsetts Library
Activity
• What are the benefits of making your
thesis available via open access?
• Can you think of any downsides?
Benefits of electronic theses
• Global audience: you can disseminate your research
widely with minimal effort
• Reputation: you may get citations
• Opportunities: further publications, funding opportunities,
collaborations
• You may need an embargo if
– the thesis contains commercially sensitive information about a company or sponsor that should remain confidential
– the thesis contains references to individuals who have not been anonymised
– you have an agreement with a publisher that prohibits prior publication
What is copyright?
• Authors automatically have copyright in anything they write or create; they do not need to apply for it or use the copyright mark ©
• This includes all materials you find on the Internet
• In Europe copyright lasts until 70 years after the end of the calendar year in which the last surviving author dies
• Authors can assign parts of their copyright to others, eg a publisher (this is common)
• Copyright includes the right to
– copy the work
– issue copies to the public
– perform, show or play the work
– broadcast the work
– adapt the work
– rent or lend the work
• If you want to do any of these things, you will need permission from the copyright holder
• Copyright, Designs and Patents Act (1988)
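The 70-year rule above can be worked through as a small calculation. This is a simplified sketch of the general rule as stated on the slide only; it ignores special cases (for example anonymous works or other categories of work), and the function names are illustrative.

```python
from datetime import date

COPYRIGHT_TERM_YEARS = 70  # term stated on the slide


def copyright_expiry(death_year_of_last_author: int) -> date:
    """Last day of copyright: 31 December, 70 years after the year of death."""
    return date(death_year_of_last_author + COPYRIGHT_TERM_YEARS, 12, 31)


def in_copyright(death_year_of_last_author: int, on: date) -> bool:
    """True if the work is still in copyright on the given day."""
    return on <= copyright_expiry(death_year_of_last_author)


if __name__ == "__main__":
    # Hypothetical example: last surviving author died in 1950.
    print(copyright_expiry(1950))                 # 2020-12-31
    print(in_copyright(1950, date(2021, 1, 1)))   # False
```

Because the term runs to the end of a calendar year, the exact date of death within that year does not matter.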
Copyright and your electronic thesis
• If you want to include any 'substantial' third party items in your electronic thesis, you have to make sure you are allowed to do so, and if not, ask permission
• 'Substantial' depends on the significance of the part in the whole item (eg recommendations and conclusions even if less than a page from an 80 page report)
• https://library.shu.ac.uk/lms/freebooks/shucopyrightelectronicthesis.pdf
Examples of substantial third party items
• long extracts of text from works by other people
• illustrations, photographs and images
• figures or tables
• maps and charts, even those you have redrawn yourself
• materials of your own that have been previously published
Keep track of third party materials
• Keep track of all substantial
third party items that you use,
such as images, tables and
maps
• Make sure you know where
you found the items, how to
attribute them, who the
copyright holder is, and what
you are allowed to do with
them (eg licences)
• Seek permission via email or
letter to include these items in
your electronic thesis as soon
as you realise that you need to
– Copyright holders may take a long
time to respond!
• Keep any responses; they need to be included when you submit your electronic thesis
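One way to keep track of these items is a simple log that can be exported as a spreadsheet. The sketch below is only an illustration: the field names and the CSV layout are assumptions, not a format required by the University.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class ThirdPartyItem:
    """One third party item used in the thesis (illustrative fields)."""
    description: str
    source: str                # where you found the item
    copyright_holder: str
    licence: str               # what you are allowed to do with it
    permission_requested: str  # ISO date of your request, or "" if none sent
    permission_received: str   # ISO date of the reply, or "" if none yet


def needs_follow_up(item: ThirdPartyItem) -> bool:
    """True if permission was requested but no response has been recorded."""
    return bool(item.permission_requested) and not item.permission_received


def to_csv(items: list[ThirdPartyItem]) -> str:
    """Serialise the log as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(ThirdPartyItem)])
    writer.writeheader()
    for item in items:
        writer.writerow(asdict(item))
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical entry, for illustration only.
    log = [ThirdPartyItem("Map of study area", "https://example.org/map",
                          "Example Mapping Ltd", "all rights reserved",
                          "2016-02-01", "")]
    print(to_csv(log))
    print([item.description for item in log if needs_follow_up(item)])
```

Keeping the dates of requests and replies in the same place as the attribution details makes it easier to chase slow responses and to gather the permissions when you submit.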
2. RESEARCH DATA MANAGEMENT
Research Data Management
• Long-term curation of digital resources
• Principles of open access applied not just to outputs but to the underlying resources or datasets, which should be made freely available for the purpose of
– scrutiny of research outputs
– re-use in new research projects
• Making data available to others requires careful planning and management of these resources during the research project
• Mandate from funders and journals since 2011 + good research practice
Research Data Management
Planning
• Including ethics and copyright
Managing
• Documenting and organising data
• Storing and backing up data
Archiving & Sharing
• Selecting which data to keep
• Preserving data
• Giving access to data
Before research During research After research
Re-use
Good research practice
• Direct benefits of managing live data
– Storing and backing up: avoiding the risk of data loss and
unauthorised access
– Documenting: usability of resources through documentation
– Organising: efficiency through logical folder structures, file
naming conventions, file versioning
• Data archiving and sharing
– Research integrity: openness and transparency
– Personal and institutional reputation: increase in citation rate of
associated research output of up to 69%, opportunities to
collaborate
– Altruistic benefit: combining datasets in new ways, may create
new insights and advance academic progress
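As an illustration of the file naming conventions and file versioning mentioned under "Organising", the sketch below builds consistent file names from a few components. The pattern (project, description, ISO date, two-digit version) is an assumed convention chosen for this example, not one prescribed in the slides.

```python
import re
from datetime import date


def slug(text: str) -> str:
    """Lower-case the text and replace runs of other characters with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def data_file_name(project: str, description: str, created: date,
                   version: int, extension: str) -> str:
    """Build a name like project_description_YYYY-MM-DD_v01.ext (assumed pattern)."""
    return (f"{slug(project)}_{slug(description)}_"
            f"{created.isoformat()}_v{version:02d}.{extension}")


if __name__ == "__main__":
    # Hypothetical project and file, for illustration only.
    print(data_file_name("Gait Study", "Interview transcripts",
                         date(2016, 3, 1), 2, "csv"))
    # gait-study_interview-transcripts_2016-03-01_v02.csv
```

ISO dates and zero-padded version numbers sort correctly in a folder listing, which is the main reason for choosing them here.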
Good research practice
• Planning: decisions made at the beginning determine what you can do with your data later on
– informed consent should allow for data sharing at the end of your project
– re-using secondary data may have certain restrictions on what you can do with that data
• "It took me a while to get my data into suitable formats. I’ve learnt a lesson for future work: think SHURDA from day one!"
Data sharing and management snafu in 3 short acts
Video produced by New York University: https://youtu.be/N2zK3sAtr-4
Activity
• What are the three most important things
you have learned about research data
management from this video?
This is what I came up with
• ways of sharing
• documenting your data
• file format obsolescence
Drivers
The data deluge
Data security
Research integrity
Open access
The data deluge
• The amount of data grows exponentially
– in the STEM fields 30% annually
• Sciences: sensor networks, satellites,
seismographs, simulations and computational
models, etc.
• Social sciences: government statistics, online
surveys, etc.
• Humanities: large bodies of text, distant
reading, digital images and video, models of
historic sites
• Problems of scale: metadata (discovery,
usability) and preservation (storage medium, file
format)
• The Library of Alexandria revisited: Vint Cerf
(vice-president of Google) warns of a digital
Dark Age
Data security
• Unwanted loss
– 6% of all PCs will suffer an episode of data loss in any given year (hard disk crashes)
– 31% of PC users have lost all their files due to events beyond their control
– 2005: fire at the University of Southampton causes significant data loss; not all data were backed up, and not all data could be recovered from the 70 damaged hard drives with the most critical data
– losing your portable device with all your work on it
• Unauthorised access
– e.g. losing a portable device holding personal and sensitive personal information covered by the Data Protection Act (1998)
Research integrity: openness and
transparency in a crisis of trust
• Recent cases of misconduct, e.g. in
social psychology
– Diederik Stapel (falsifying data for
dozens of papers) in 2011
– Dirk Smeesters (massaging data to
strengthen outcomes in his papers) in
2012
Crisis of trust: failure of replication
• The Reproducibility Project in psychology involved 270 academics replicating findings of 100 papers published in top peer-reviewed journals in 2008 - the findings were published in Science in September 2015
• Only 36 out of 100 papers could be replicated
• The measured effect was on average only half the size of that reported in the original publication
• Similar studies with similar outcomes for other disciplines
Crisis of trust: questionable practices
• A survey of psychologists (John et al. 2012) shows that
– 0.6% of psychologists surveyed admitted to falsifying data
– 22% round off p-values to get significance (if you get a result of 0.054 you round to 0.05 to get significance)
– 38.2% said that they decided whether to exclude data after looking at the effect of doing so
Responses to the crisis of trust
• Articles get retracted (http://retractionwatch.com): May 2012 review of 2,047 retracted articles concluded that:
• 21.3% were retracted because of errors
• 67.4% were retracted because of scientific misconduct
– fraud or suspected fraud (43.4%)
– duplicate publication (14.2%)
– plagiarism (9.8%)
• Many journals now encourage transparency through data publication (Nature since 2013)
Open Access and funder requirements
• Since 2011, an increasing number of research funders have requirements for publishing open access
• Improves the impact of
research
• "Publicly funded research data
are a public good, produced in
the public interest, and should
be made openly available"
• Funders usually expect:
• Timely release of data
– on publication, or soon after
data generation, or project end
• Data sharing
– as open as possible
– with a data availability
statement in research papers
• Preservation of data
– typically 10+ years if the data
are of long-term value
As open as possible: open data vs open access
• Different objects, different demands
• Open access publications are free to use by anyone but there
may be necessary restrictions to openness of data (RCUK
Common Principles on Data Policy):
– data may be subject to various legal, ethical and contractual
restrictions
– data producers should have the right of first exploration of those
data
• Dimensions of openness:
– what materials are made available (a selection of your data)
– when they are made available (when research outputs are
published, when research project finishes, with an embargo period)
– to whom they are available (unrestricted access versus controlled
access to bona fide researchers for a specific purpose)
– on what terms and conditions they are available
RESEARCH DATA
MANAGEMENT AT SHU
SHU RDM Policy
• SHU's Research Data Management Policy:
– draft a data management plan before the project commences
– store all active research data on the University networked storage system
– make arrangements for the long-term preservation of datasets that underpin a publication, are of potential long-term value, and/or support a patent application
– register all preserved datasets in the University's Research Data Repository
– share datasets where this is required by any funders or where it will be beneficial for the research community
– include a statement in any publication on how to access the supporting data
– formally cite any third party data that you use
• Mandatory for all publicly-funded research, good practice for all other research
• Responsibility lies with the Director of Studies: “It is their duty to ensure that all members of the research team with access to the research data adhere to good research data management practice.”
Planning
• Data Management Planning Tool: online tool for planning research data, also available as PDF
http://dmponline.dcc.ac.uk; http://research.shu.ac.uk/rdm/dmp.html#pgr
Managing
• Research Store (Q:\Research): safe and secure storage of 'live' research data
http://research.shu.ac.uk/rdm/research-store.html
Archiving & Sharing
• SHU Research Data Archive (SHURDA): archive for digital and non-digital research data
http://shurda.shu.ac.uk
Before research During research After research
Support
Guidance website, http://research.shu.ac.uk/rdm
Advisory service, [email protected]
Events (training, workshops, drop-ins)
Activity
• What are the advantages and dangers of the following storage options?
– Networked drives
– Local drives on PCs and laptops
– Cloud-based storage
– External portable storage
• What would be the best way to back up?
Research store (Q:\Research)
• Places to store your data:
– Local drives on PCs and laptops (risk of data corruption, data loss, unauthorised access if unencrypted)
– Cloud-based storage (host has access to all of your data, they may have the right to use/publish your information)
– External portable storage (risk of data corruption, data loss, unauthorised access if unencrypted)
• The Q:\Research drive is a networked storage facility which is
– secure (firewalls, passwords)
– safe (daily automatic backup to two remote locations)
– flexible (enough space, flexible access arrangements)
• Ask your supervisor for your own space on the
Research Store (Q:\Research drive).
• More info at http://research.shu.ac.uk/rdm/research-store.html
Backing up
• Back up regularly, preferably daily
• Use the 3-2-1 rule:
– (3) Keep 3 copies of important files (a primary
and two backups)
– (2) on 2 different media types (such as CDs,
hard drives, memory sticks and online
storage)
– (1) with 1 copy being stored offsite (or offline)
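The 3-2-1 rule can be expressed as a simple check against a list of copies. The sketch below is illustrative only: the class and field names are assumptions, and whether a given location counts as "offsite" is a judgement you have to make yourself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    """One copy of an important file (illustrative fields)."""
    location: str
    media_type: str  # e.g. "networked drive", "external hard drive"
    offsite: bool    # stored offsite (or offline)


def satisfies_3_2_1(copies: list[Copy]) -> bool:
    """Check: at least 3 copies, on at least 2 media types, at least 1 offsite."""
    enough_copies = len(copies) >= 3
    enough_media = len({c.media_type for c in copies}) >= 2
    one_offsite = any(c.offsite for c in copies)
    return enough_copies and enough_media and one_offsite


if __name__ == "__main__":
    # Hypothetical set-up, for illustration only.
    copies = [
        Copy("Research Store (Q:\\Research)", "networked drive", offsite=False),
        Copy("Office PC", "local hard drive", offsite=False),
        Copy("Drive kept at home", "external hard drive", offsite=True),
    ]
    print(satisfies_3_2_1(copies))  # True
```

Two copies on the same laptop, or three copies in the same room, would fail the check, which is the point of the rule.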
Support
Online self-help
• Portal, http://www.shu.ac.uk/research/rdm.html
– links to policy, guidance, and SHURDA
• Very informative guidance website, http://research.shu.ac.uk/rdm/
• Online learning module on Blackboard/shuspace as part of the Academic CPD Online Courses
– 30-60 minutes
Personal advice
• Advisory service, http://research.shu.ac.uk/rdm/advisory-service.html, [email protected]
– data management planning
– depositing data
– any other queries
Events
• Workshops on planning and on archiving research data
• Monthly drop-in sessions on both campuses
• http://research.shu.ac.uk/rdm/events.html
Activity
• You will make your thesis available via SHURA and your data via SHURDA as open access at the end of your doctoral project
• This has consequences for how you manage third party materials that you wish to include in your thesis and how you plan and manage your data
• What are you going to do next? Write down three things and discuss with your neighbour(s).
To do
• Do the online Blackboard module on research data management
• Talk to your supervisor about research data management
• If you have an external sponsor (eg English Institute of Sport) find out what their policies on open access and research data are
• Start thinking about ethics and copyright of your data as soon as you have made decisions about the data for your doctoral thesis
• Write a data management plan towards the end of year 1
• Ask your supervisor for a folder on the Research Store before you start working with data
• Make good note of any third party materials that you use