WGS Data management course Try-out 2012-09-24, Hugo Besemer


WGS Data management course


2012-09-24, Hugo Besemer


Short-term storage: file and path names

MS Windows and Mac OS allow very long names, but ...

Are your filenames descriptive?

Are your filenames unique?

The 8.3 convention (12345678.abc) is important, e.g. when burning CDs or DVDs

Avoid spaces for files that may go on the web

Avoid punctuation such as ( ) \ / : * ? " < > as these characters may be reserved in operating systems or programming languages
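The naming advice on this slide can be turned into a small check. The following is a minimal Python sketch, not a definitive rule set: the function name `check_filename` and the exact set of "risky" characters are assumptions taken from the list above, and the 8.3 test only looks at name length, not at what a particular disc format really accepts.

```python
import re

# Characters the slide advises against. Treating exactly this set as
# "risky" is an assumption made for this sketch.
RISKY_CHARS = set('()\\/:*?"<>')

# 8.3 pattern: base name of 1-8 characters, optional extension of 1-3.
EIGHT_DOT_THREE = re.compile(r"^[^.\s]{1,8}(\.[^.\s]{1,3})?$")


def check_filename(name: str) -> list[str]:
    """Return a list of warnings for a bare file name (no folder part)."""
    warnings = []
    if " " in name:
        warnings.append("contains spaces")
    bad = sorted(ch for ch in set(name) if ch in RISKY_CHARS)
    if bad:
        warnings.append("contains risky punctuation: " + " ".join(bad))
    if not EIGHT_DOT_THREE.match(name):
        warnings.append("does not fit the 8.3 convention")
    return warnings


if __name__ == "__main__":
    # Hypothetical example names.
    for candidate in ["12345678.abc", "field notes (final).docx"]:
        print(candidate, "->", check_filename(candidate) or "ok")
```

A name that returns an empty list passes all three checks; whether the 8.3 warning matters depends on where the file will end up.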


Short-term storage: descriptive file names

[Table columns: descriptive filename in a folder structure / not unique (across folders) / unique (across folders)]

This will work for relatively small numbers of files. If large numbers of files are produced automatically, non-descriptive filenames may be used. You then need something else (a DAMS, “Digital asset management system”) to keep track of what is what.
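Whether file names are unique only within a folder or also across folders can be checked with a short script. This is a sketch under assumptions: the function name `find_duplicate_names` is hypothetical, and names are compared case-insensitively because some file systems do not distinguish case.

```python
from collections import defaultdict
from pathlib import Path


def find_duplicate_names(root: str) -> dict[str, list[str]]:
    """Map each file name that occurs more than once under root to its paths."""
    seen = defaultdict(list)
    for path in Path(root).rglob("*"):
        if path.is_file():
            # Lower-casing is an assumption: "Data.csv" and "data.csv"
            # are treated as the same name.
            seen[path.name.lower()].append(str(path))
    return {name: sorted(paths) for name, paths in seen.items() if len(paths) > 1}


if __name__ == "__main__":
    # "." (the current folder) is only an example starting point.
    for name, paths in find_duplicate_names(".").items():
        print(name, "occurs in:", paths)
```

An empty result means every file name under the starting folder is unique across folders.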


Short-term storage: version control

Questions and Best practices

● Are you working alone or with others?

● Do you store files at different locations? (synchronisation)

● Keep track of ‘master files’ and ‘milestone files’ and store them in a single location (Dropbox?)

Identifying versions

● Use a naming convention that includes date or number (..._v1, ..._v2)

● Your software may be able to do (part of) the job
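A naming convention with a number or a date can be applied consistently by a small helper. The sketch below is one possible way to do it; the function names `next_version` and `dated_name` and the ISO date format (YYYY-MM-DD) are assumptions, not something the slide prescribes.

```python
import re
from datetime import date

# Matches a trailing version suffix such as "_v1" or "_v12".
VERSION_SUFFIX = re.compile(r"_v(\d+)$")


def next_version(stem: str) -> str:
    """Return the stem with its _vN suffix incremented, or _v1 appended."""
    match = VERSION_SUFFIX.search(stem)
    if match:
        number = int(match.group(1)) + 1
        return f"{stem[:match.start()]}_v{number}"
    return f"{stem}_v1"


def dated_name(stem: str, day: date) -> str:
    """Return the stem with an ISO date (YYYY-MM-DD) appended."""
    return f"{stem}_{day.isoformat()}"


if __name__ == "__main__":
    # "soil_samples" is a hypothetical file stem.
    print(next_version("soil_samples"))
    print(next_version("soil_samples_v1"))
    print(dated_name("soil_samples", date(2012, 9, 24)))
```

ISO dates have the side benefit that an alphabetical sort of names with the same stem is also a chronological sort.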



Stick to the agreed way of working within your group (if there is one)

The next slides give some points of view from the Wageningen UR IT department (FB-IT)


Backups: IT Data storage Continuity


• Data centre: secured against fire, power incidents and burglary

• 2 data centres in case of disaster

• The equipment is fail-safe

• 500 TB reserved, 300 TB in use, 1 PB available


Backups: ICT Data Products & Services



Price per GB

Volatile or static data; 50 GB

Databases or research data; €5,- without backup, €7,- with backup; 50 GB

Critical data; month + history, max 1 year; 1 GB

Mass reproducible; €520 / TB; 1 TB

Massive double; same as massive, high; €1000 / TB; 1 TB


Backups: Better alignment

(% is the total percentage of scores at that value or 1 point up or down)

Subject              Importance             Score FBIT
Ease of use          9 (85%)                8 (64%)
Backup/Restore       9 (64%)                7 (28%) very diverse
Share (internal)     9 (79%)                9 (71%)
Share (external)     6 (28%) very diverse   5 (21%) many n/a
Archive function     8 (50%) many n/a       5 (14%)
Findable             9 (79%)                7 (28%)
Price                9 (86%)                4 (28%) very diverse
Speed data transfer  9 (72%)                5 (21%) very diverse
Availability         9 (79%)                8 (64%)
Flexibility          8 (78%)                6 (28%)
Security             7 (50%) very diverse   8 (57%)
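One way to read the table above is to look at the difference between how important a subject is to users and how FB-IT scores on it. The sketch below computes that gap per subject; treating "importance minus score" as a measure of misalignment is an assumption of this sketch, not a method stated on the slide, and the percentages and remarks are left out.

```python
# (subject, importance, FB-IT score), copied from the table above.
RATINGS = [
    ("Ease of use", 9, 8),
    ("Backup/Restore", 9, 7),
    ("Share (internal)", 9, 9),
    ("Share (external)", 6, 5),
    ("Archive function", 8, 5),
    ("Findable", 9, 7),
    ("Price", 9, 4),
    ("Speed data transfer", 9, 5),
    ("Availability", 9, 8),
    ("Flexibility", 8, 6),
    ("Security", 7, 8),
]


def gaps(ratings):
    """Return (subject, importance - score) pairs, largest gap first."""
    pairs = [(subject, importance - score) for subject, importance, score in ratings]
    # sorted() is stable, so subjects with equal gaps keep the table order.
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for subject, gap in gaps(RATINGS):
        print(f"{subject:20} {gap:+d}")
```

On these figures the largest gaps are for price, data transfer speed and the archive function, while security is the only subject where the score exceeds the importance.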


Backups: Data storage workshop conclusions

Enhancement requests:

1. Lower the price

2. Set up an organisation-wide policy for information security

3. Higher flexibility (request period, use period, costing, etc.)

4. Accessibility for external people

5. Deliver a product for archiving

6. Higher throughput (data rate)

What is the next step?

● Building a roadmap for IT Storage and Products


Long term storage: Metadata

Metadata are structured data that provide a short summary about any information resource, print or electronic, and facilitate the location, identification, or discovery of that resource.

Content metadata: subject terms, titles

Context metadata: creator, place, time, project

Metadata serves different purposes:

Location. Metadata can indicate where an information resource is located, either physically or virtually.

Identification. Metadata can distinguish one information resource from another without describing the entire collection of information resources.

Resource discovery. Metadata can link a user's queries about a particular subject with those information resources about the same subject.
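The distinction between content and context metadata, and the three purposes listed above, can be made concrete with a small record structure. This is a minimal sketch: the class `DatasetMetadata`, its field names and the `discover` function are hypothetical and do not follow any particular metadata standard.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetMetadata:
    identifier: str  # identification: tells one resource from another
    location: str    # location: where the resource is, physically or virtually
    title: str       # content metadata
    subject_terms: list[str] = field(default_factory=list)  # content metadata
    creator: str = ""  # context metadata
    place: str = ""    # context metadata
    time: str = ""     # context metadata
    project: str = ""  # context metadata


def discover(records, subject):
    """Resource discovery: return records that carry the given subject term."""
    wanted = subject.lower()
    return [r for r in records if wanted in (t.lower() for t in r.subject_terms)]


if __name__ == "__main__":
    # All values below are made up for illustration.
    records = [
        DatasetMetadata("ds-001", "archive/ds-001", "Soil moisture readings",
                        ["soil", "moisture"], creator="A. Researcher"),
        DatasetMetadata("ds-002", "archive/ds-002", "Crop yield survey",
                        ["crops", "yield"]),
    ]
    print([r.identifier for r in discover(records, "Soil")])
```

The search only works as well as the subject terms that were entered, which is why descriptive metadata needs to be recorded when the dataset is stored.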


Long term storage: metadata and datasets


Long term storage: metadata and datasets 2


Long term storage: metadata and datasets 3

DANS: Dutch national repository for datasets

Unique ID


Long term storage: metadata, datasets and


It’s as open as you want it to be

In a sustainable format, independent of (version of) software

With proper documentation for re-use


Long term storage: selection




Subject content

Easy to reproduce

Cost of documentation / conversion acceptable

File size

Reliable

Authentic

Is it stored elsewhere?

Required for verification

Required for legal purposes

General interest



What does all this mean for your data management plan?