Joint SCAR/COMNAP Delegates Meeting (SCAR Lecture)

A Strategy for Data and Information Management in the 21st Century

9th July 2010

Kim Finney (Manager, Australian Antarctic Data Centre & Chief Officer, SCAR Standing Committee on Antarctic Data Management)

• In 1962 John F. Kennedy announced a man would be put on the moon by the end of the decade.

“Well, space is there, and we're going to climb it, and the moon and the planets are there, and new hopes for knowledge and peace are there” (JFK, 1962)

• But... if we wanted a challenge... we have one in our own backyard. 50 years on, we may now know more about the surface of the moon than we do about our Antarctic and Southern Ocean environments.

• Have better maps of the moon than Antarctica.
– LIMA (15 m spatial accuracy) of Antarctica
– Lunar Reconnaissance Orbiter (1.0 m spatial accuracy).

• Only a small fraction of the Southern Ocean seafloor topography has been surveyed by ships.
– Satellite altimetry is helping to fill in the broad-scale features >10-15 km in width (Sandwell and Smith).

• We can land a rover on Mars, but:
• Transport in Antarctica is still difficult, subject to severe restrictions and limitations resulting from weather and terrain extremes.
• We struggle with developing underwater technology for sampling biodiversity.
– Most promising AUVs still have power, sea-state, instrument, speed and navigation limitations. Sensors are mainly for physical parameter detection.

Images: Rover; AUV

• Why are we not further advanced in our understanding of, and access to, the Antarctic environment?
– Lack a grand collective vision?
– How well are we collaborating (scientifically and logistically)?
– Still like to treat Antarctica as a heroic frontier for testing the resilience of man?

• What has this to do with data management?
– It’s no longer about heroic polar men blazing trails into the unknown, collecting small amounts of data.
– It’s about launching autonomous mobile and fixed sensors to all points of interest, sharing the vast volumes of data generated, and piecing these data together at local, regional, continental and global scales.
– It’s the era of “networked data” and visionary collaboration. Heroes in this age will be those who have the skills, vision and technological innovation to build and exploit these data networks.

21st Century Data Management

• In the next 10 years most scientists working on Antarctic data will never travel to the ice.
– Advantageous perhaps for those countries without physical ice-based research facilities. Contributions instead to data network building?

Figure: Marine Sensor Network (courtesy of IMOS)

• Data managers won’t operate as an adjunct to science. The new polar scientist by default will be data management literate and proficient with data networks.

• Data production in many disciplines is doubling annually (UK e-infrastructure Steering Group, 2006).
– Data stores need to be optimised for the disciplines they support and the access paradigms expected by those communities.

Copied from Kirk Borne, 2008

• Computing power doubles every 18 months (100X in 10 yrs)

• I/O bandwidth increases @ 10% p/a (3X in 10 yrs)

• Data doubling every year (1000X in 10 yrs)

• NSCA example:
– 1st 19 yrs generated = 1 PB
– Year 20 (2007) = 2 PB
– Year 21 (2008) = 4 PB
– Year 2025 ? 10156 PB ???

Borne (2008)
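The rounded ten-year multiples quoted above follow from compound growth. A minimal Python sketch of that arithmetic is given here; the function names are illustrative and do not come from Borne (2008), and the only inputs are the rates quoted on the slide.

```python
def factor_from_doubling(years: float, doubling_time_years: float) -> float:
    """Growth multiple after `years` for a quantity that doubles every `doubling_time_years`."""
    return 2.0 ** (years / doubling_time_years)


def factor_from_rate(years: float, annual_rate: float) -> float:
    """Growth multiple after `years` for a quantity that grows by `annual_rate` each year."""
    return (1.0 + annual_rate) ** years


# Rates as quoted on the slide, each over a 10-year horizon.
computing = factor_from_doubling(10, 1.5)  # doubling every 18 months: roughly 100X
io_bandwidth = factor_from_rate(10, 0.10)  # 10% per annum: about 2.6X, quoted as 3X
data_volume = factor_from_doubling(10, 1)  # doubling every year: 1024X, quoted as 1000X

print(f"computing:     {computing:7.1f}X")
print(f"I/O bandwidth: {io_bandwidth:7.1f}X")
print(f"data volume:   {data_volume:7.1f}X")
```

The gap between the last two figures is the point of the slide: if these rates hold, data volumes outgrow the bandwidth available to move them by more than two orders of magnitude per decade.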

• Scientific communities will become dependent on very large, openly accessible databases.
– Necessitating stable financial support for repositories.

Datasets becoming very complex – multi or hyper-dimensional.
– Will require dimensionality reduction via machine discovery of patterns, substructures and correlations in the data (Djorgovski, 2009).
– Requires even more emphasis on: skills in data visualisation, algorithm development, data access, data description, stable repositories, distributed computing.

How Prepared Are We?

• SCAR/COMNAP Report Card:

1. Collaborative logistical infrastructure development and utilisation

2. Pan Antarctic observation network

3. National investment in polar data management repositories

4. Data sharing and access

5. Investment in building professional skills in data analysis and/or data management.

snowflake scores are out of 10

Investment In Repositories

Source: DIMS (2009)

Two or fewer staff in all but the UK and Australian Centres.

Approx 33 nations participate in SCAR.

Belgium – SCAR MarBIN Data Centre – only on temporary funding.

What Is Required?

• SCAR has already invested in developing a Data and Information Management Strategy (DIMS)
– Individual academic institutions are not best placed to manage long-term repositories (or develop sustainable national infrastructure).

– National Antarctic Programs (as represented through COMNAP) are better positioned.

– Suggest both SCAR and COMNAP have much to gain by pursuing DIMS in unison.

– SCAR has the vision BUT COMNAP has the capacity and capability.

Diagram labels: Antarctic Master Directory (NASA); Human Readable Metadata; registering metadata; National Data Centres.

A human user can search a metadata catalogue. But data might not be linked to these descriptive records (only 53% of records have data).

The DIMS Implementation Plan is designed to move us from a metadata-centric infrastructure to...

An infrastructure that delivers data/information through specialised, networked, national or institutional data portals.

Scientists are able to use a data discovery portal from one country that can also access data from another country’s data store.

Diagram labels: Antarctic Master Directory (NASA), with a registry interface; standard machine-to-machine interfaces; data stores; Data Portal; "Harvests from"; Service Registry (portals, standard interfaces, protocols).
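The idea of a portal in one country reaching another country's data store through a shared registry and a common interface can be sketched in a few lines of Python. This is only an in-memory illustration under stated assumptions: the class names (`DataStore`, `ServiceRegistry`, `Portal`), the `search` call and the sample records are all hypothetical and are not taken from DIMS or from any Antarctic Master Directory software.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Hit = Tuple[str, str, str]  # (country, dataset title, data location)


@dataclass
class DataStore:
    """Hypothetical national data store exposing one standard search call."""
    country: str
    records: Dict[str, str] = field(default_factory=dict)  # title -> location

    def search(self, keyword: str) -> List[Hit]:
        """Stand-in for a standard machine-to-machine interface."""
        kw = keyword.lower()
        return [(self.country, title, location)
                for title, location in self.records.items()
                if kw in title.lower()]


class ServiceRegistry:
    """Hypothetical registry listing the stores that portals may query."""

    def __init__(self) -> None:
        self._stores: List[DataStore] = []

    def register(self, store: DataStore) -> None:
        self._stores.append(store)

    def services(self) -> List[DataStore]:
        return list(self._stores)


class Portal:
    """Hypothetical national portal that queries every registered store."""

    def __init__(self, registry: ServiceRegistry) -> None:
        self.registry = registry

    def discover(self, keyword: str) -> List[Hit]:
        hits: List[Hit] = []
        for store in self.registry.services():
            hits.extend(store.search(keyword))
        return hits


# Made-up records, for illustration only.
registry = ServiceRegistry()
registry.register(DataStore("AU", {"Sea ice thickness 2009": "au-store/sea-ice-2009"}))
registry.register(DataStore("UK", {"Sea ice extent 2008": "uk-store/sea-ice-2008",
                                   "Penguin census": "uk-store/penguins"}))

portal = Portal(registry)
results = portal.discover("sea ice")
for country, title, location in results:
    print(country, title, location)
```

Because every store answers the same `search` call, the portal needs no country-specific code; adding a new national store only requires registering it.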

“What if I don’t have anywhere to put my data but I’m happy to share it?”

“I wonder what data is out there for the polar regions?”

Diagram labels: Tom; Jerry; data bucket; Internet Cloud; Submit data; Register in; Metadata Catalogue & Registry; Publish to; Search; Discover/retrieve; Polar data.

...AND provides virtual physical storage for orphan data... AND allows us to search for data using public search engines.

Possible approach:

– COMNAP’s Data Management Expert Group (DMEG) reviews the SCAR DIMS and Implementation Plan for its fit with COMNAP business objectives.

– National members collaborate to “resource” project(s) in the Plan focussed on delivering outputs/outcomes for a specific program of science that is supported by both SCAR and COMNAP members (this constrains infrastructure development to meeting immediate user needs).

– Projects are run as international “managed” collaborations – signed off by MOU and subject to project management.

– COMNAP and SCAR jointly review the function, structure and role of SCADM, SCAGI and DMEG with a view to streamlining approaches to Antarctic data infrastructure development.

Conclusion

• The data deluge is already here!

• Our ability to manage and harness this deluge will be a key determinant of the quantity of “high quality” science we can produce in the 21st century.

• Effectively sharing logistical/research management information underpins how we can collectively get more value out of existing and future investments made in deploying to Antarctica.

• Data infrastructures have to be planned, designed, funded and managed. They are expensive, but the pain can be shared!

• Let’s make sure we know more about Antarctica than the moon before we are due to land there again (2020?).