North Carolina Geospatial Data Archiving Project/NDIIPP: Collection and preservation of at-risk digital geospatial data Partners: NCSU Libraries Project Lead: Steve Morris NC Center for Geographic Information & Analysis Project Lead: Zsolt Nagy NSDI Partnership Community Meeting March 1, 2006

North Carolina Geospatial Data Archiving Project/NDIIPP:

Collection and preservation of at-risk digital geospatial data

Partners:

NCSU Libraries
Project Lead: Steve Morris

NC Center for Geographic Information & Analysis
Project Lead: Zsolt Nagy

NSDI Partnership Community Meeting March 1, 2006

Note: Percentages based on the actual number of respondents to each question

Outline

Risks to Digital Geospatial Data

Overview of NC Geospatial Data Archiving Project and NDIIPP

Preservation Challenges and Possible Solutions

Points of Engagement with Spatial Data Infrastructure and Industry

Risks to Digital Geospatial Data

.shp

.mif

.gml

.e00

.dwg

.dgn

.bsb

.bil

.sid

Risks to Digital Geospatial Data

Producer focus on current data

Archiving data does not guarantee “permanent access”

Future support of data formats in question

Need to migrate formats or allow for emulation

Data failure: “bit rot”, media failure

Preservation metadata requirements: descriptive, administrative, technical, DRM

Shift to “streaming data” for access
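The “bit rot” risk above is commonly addressed with fixity checks: record a checksum for each file at ingest, then recompute and compare later. The following is a minimal sketch of that idea, not the project's actual workflow; the function names and the choice of SHA-256 are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file under root (by relative path) to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def audit(root: Path, manifest: dict[str, str]) -> dict[str, list[str]]:
    """Compare the files on disk now against a manifest recorded earlier."""
    current = build_manifest(root)
    return {
        # Recorded at ingest but no longer present.
        "missing": sorted(set(manifest) - set(current)),
        # Present, but the bytes no longer match the recorded checksum.
        "changed": sorted(
            name for name in manifest
            if name in current and current[name] != manifest[name]
        ),
        # Present now but never recorded.
        "unexpected": sorted(set(current) - set(manifest)),
    }
```

A manifest built at ingest would be stored alongside the content; running `audit` periodically then reports files that have gone missing or silently changed.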

Time series – vector data

Parcel Boundary Changes 2001-2004, North Raleigh, NC

Temporal data to support business needs in:
Real estate analysis
Land use change analysis
Economic planning

Time series – Ortho imagery

Vicinity of Raleigh-Durham International Airport 1993-2002

Even static orthophotos are at risk.

Today’s geospatial data as tomorrow’s cultural heritage

Future uses of data are difficult to anticipate (as with Sanborn Maps).

NC Geospatial Data Archiving Project

Partnership between university library (NCSU) and state agency (NCCGIA), with Library of Congress under the National Digital Information Infrastructure and Preservation Program (NDIIPP)

One of 8 initial NDIIPP partnerships (only state project)

Focus on state and local geospatial content in North Carolina (state demonstration)

Tied to NC OneMap initiative, which provides for seamless access to data, metadata, and inventories

Objective: engage existing state/federal geospatial data infrastructures in preservation

Targeted Content

Resource Types:
GIS data (vector, etc.)
Digital orthophotography
Digital maps
Tabular data (e.g. assessment data)

Content Producers:
Mostly state, local, regional agencies
Some university, not-for-profit, commercial
Selected local federal projects

Work plan in a Nutshell

Work from existing data inventories

NC OneMap Data Sharing Agreements as the “blanket”, individual agreements as the “quilt”

Partnership: work with existing geospatial data infrastructures (state and federal)

Technical approach

Metadata: FGDC, METS, PREMIS?, GeoDRM?

Repository-independent: DSpace initially

Web services consumption for archival development (in future?)

NCGDAP Philosophy of Engagement

Take the data as is, in the manner in which it can be obtained

Provide feedback to producer organizations/inform state geospatial infrastructure

“Wrangle” and archive data

Note the ‘Project’ in ‘North Carolina Geospatial Data Archiving Project’ – the process, the learning experience, and the engagement with industry and infrastructure are more important than the archive

… What is the long term solution?

Big Technical Challenges

Format migration paths

Management of data versions over time

Preservation metadata

Harnessing geospatial web services

Preserving cartographic representation

Keeping content repository-agnostic

Preserving geodatabases

More …

Vector Data Format Issues

Vector data much more complicated than image data

‘Archiving’ vs. ‘Permanent access’

An ‘open’ pile of XML might make an archive, but if using it requires a team of programmers to do digital archaeology, then it does not provide permanent access

Piles of XML need to be widely understood piles

GML: need widely accepted application schemas (like OSMM?)

The Geodatabase conundrum:
Export feature classes, and lose topology, annotation, relationships, etc.
… or use the Geodatabase as the primary archival platform (some are now thinking this way)

Managing Time-versioned Content

Continuously updated data: frequency of snapshots? Different for various framework layers?
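Whatever snapshot frequency is chosen, a time-versioned archive has to answer the question "which snapshot was in effect on a given date?". The sketch below shows one way to do that lookup with a sorted list of snapshot dates; the yearly dates and the function name are hypothetical, chosen only to illustrate the idea.

```python
from bisect import bisect_right
from datetime import date
from typing import Optional


def snapshot_in_effect(snapshot_dates: list[date], when: date) -> Optional[date]:
    """Return the latest snapshot taken on or before `when`, or None if none exists."""
    dates = sorted(snapshot_dates)
    index = bisect_right(dates, when)
    return dates[index - 1] if index else None


# Hypothetical yearly snapshots of a parcel layer.
parcel_snapshots = [date(2001, 1, 1), date(2002, 1, 1), date(2003, 1, 1), date(2004, 1, 1)]
selected = snapshot_in_effect(parcel_snapshots, date(2003, 6, 15))
```

Layers that change at different rates would simply carry different date lists; the lookup stays the same.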

Metadata Availability – Limited at Local Level

February 2005

Harnessing Geospatial Web Services

Image atlases from WMS services?

Capturing cartographic representation?

Recording records from decision-making processes?

Later: data transfer via WFS & GML? Other?
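As a rough illustration of the "image atlas" idea, an area of interest can be split into a grid and one WMS GetMap request issued per cell. The sketch below only builds the request URLs using the standard WMS 1.1.1 GetMap parameters; the endpoint, layer name, grid size, and function names are assumptions for illustration, and no request is actually sent.

```python
from typing import Iterator
from urllib.parse import urlencode

BBox = tuple[float, float, float, float]  # (minx, miny, maxx, maxy)


def atlas_tiles(bbox: BBox, rows: int, cols: int) -> Iterator[BBox]:
    """Split a bounding box into rows x cols equal cells, row by row from the bottom."""
    minx, miny, maxx, maxy = bbox
    dx = (maxx - minx) / cols
    dy = (maxy - miny) / rows
    for r in range(rows):
        for c in range(cols):
            yield (minx + c * dx, miny + r * dy,
                   minx + (c + 1) * dx, miny + (r + 1) * dy)


def getmap_url(endpoint: str, layers: list[str], bbox: BBox,
               width: int = 512, height: int = 512,
               srs: str = "EPSG:4326", image_format: str = "image/png") -> str:
    """Build a WMS 1.1.1 GetMap URL for one atlas cell."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": ",".join(layers),
        "STYLES": "",  # empty value requests the server's default style
        "SRS": srs,
        "BBOX": ",".join(f"{value:.6f}" for value in bbox),
        "WIDTH": str(width),
        "HEIGHT": str(height),
        "FORMAT": image_format,
    }
    return f"{endpoint}?{urlencode(params)}"


# Hypothetical endpoint and layer name; a 2 x 2 atlas over a sample extent.
urls = [
    getmap_url("https://example.org/wms", ["parcels"], cell)
    for cell in atlas_tiles((-79.0, 35.5, -78.0, 36.5), rows=2, cols=2)
]
```

An atlas captured this way preserves the rendered cartography at one moment in time, but not the underlying features, which is why the slide treats WFS and GML transfer as a separate, later question.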

“Web mash-ups” and the New Mainstream Geospatial Web Services

How does temporal data fit into emerging WMS caching and tiling schemes?

Capture of tiles and caches for archive?

Preserving Cartographic Representation

Counterpart to the map is not just the dataset but also models, symbolization, classification, annotation, etc.

Needed: Efficient Content Replication

Content replication also needed for:
Disaster preparedness
State and federal data improvement projects
Aggregation by regional geospatial web service providers

WFS, e.g.: efficiency in complete content transfer?

Need rsync-like function, informed by: rights management, inventory processes, metadata management, data update cycles

Archiving delta files vs. complete replication – need to avoid requiring “digital archaeology” in the future
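The core of an "rsync-like function" is working out which items differ between a producer's holdings and a replica, so that only those are transferred. A minimal sketch follows, assuming each side can supply a manifest mapping item path to checksum; the manifest shape and the `plan_sync` name are illustrative assumptions, and the rights, inventory, and update-cycle inputs the slide mentions are not modeled here.

```python
def plan_sync(source: dict[str, str], replica: dict[str, str]) -> dict[str, list[str]]:
    """Compare two path -> checksum manifests and list what a replica needs.

    "transfer": items that are new at the source or whose checksum differs.
    "retired":  items the replica holds that the source no longer lists; an
                archive would typically keep these rather than delete them.
    """
    return {
        "transfer": sorted(
            path for path, checksum in source.items()
            if replica.get(path) != checksum
        ),
        "retired": sorted(path for path in replica if path not in source),
    }


# Hypothetical manifests with placeholder checksums.
producer = {"a.shp": "111", "b.shp": "222", "c.shp": "333"}
archive = {"a.shp": "111", "b.shp": "old", "d.shp": "444"}
plan = plan_sync(producer, archive)
```

Transferring only the listed items keeps the exchange efficient, while storing each transferred item whole (rather than as a delta against a prior version) avoids having to replay a chain of deltas later.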

Points of Engagement with the Open Geospatial Consortium (OGC)

GML for archiving (PDF/A version of GML?)

GeoDRM: adding preservation use cases

Content Packaging: will there be an industry solution?

Web Map Context Documents: can we save data state as well as application state?

Content Replication: is this a layer in the overall architecture?

Persistent Identifiers

Points of Engagement with Spatial Data Infrastructure

Framework data communities: snapshot frequency, naming schemes, classification, GML application schemas, format strategies

Metadata standards and outreach: persistent identifiers, versioning, feedback on metadata quality

Content replication/transfer: for data improvement projects, disaster preparedness, aggregation by regional service providers, … and archives

Where do archiving and preservation fit into the NSDI, GOS, etc.?

Points of Engagement with Industry

Software vendors

Better support for temporal data management

Tools for retrospective data conversion

Web mashup and open source communities

WMS caching schemes

Standard tiling schemes with temporal component?

Data vendors

Cultivate market for older data (scaled pricing?)

Tech transfer on archiving practices?

Project Status

Cultivating a market for older data.

Project Status

Cultivating tools for retrospective conversion.

Expected Project Outcomes

Demonstration archive

Outreach activity – planting seeds

International, national, state, local, commercial

Learning experience, informing:
Spatial data infrastructure
Commercial vendors (data/software/consulting)
Repository software communities
Metadata practice (both GIS & preservation)
Rights management developments
Data and interoperability standards

Project Status

Storage system and backup deployed

DSpace deployed

FGDC Metadata workflow finalized

Ingest workflow near finalization

Content migration workflow plan near finalization

Regional site visits planned for coming months

Wide range of outreach/collaboration: FGDC, ESRI, EDINA (JISC), USGS, OGC, TRB, etc.

Pilot project, georegistering digital archival geologic maps

Questions?

Contact:

Steve Morris
Head, Digital Library Initiatives
NCSU Libraries
[email protected]

Web site: http://www.lib.ncsu.edu/ncgdap/