The mapping process – some observations Robina Clayphan EDLF

Page 1: The mapping process – some observations Robina Clayphan EDLF

The mapping process – some observations

Robina Clayphan EDLF

Page 2: The mapping process – some observations Robina Clayphan EDLF

Local schemas > ESE

Data Flow

Page 3: The mapping process – some observations Robina Clayphan EDLF

Management of the process

• Sheer complexity of managing the hundreds of files going through the steps in the process
• Keeping track of the status of the files
  – straightforward ones: in the right place for the next step
  – problem ones: refer back to the provider or a developer
• Use of SharePoint document libraries and rapid establishment of procedures that all must adhere to
• The management of the process evolved during implementation: a very steep learning curve
• Maintenance of authority files
  – getting meta-metadata from the providers (types etc.)
  – collection IDs

Page 4: The mapping process – some observations Robina Clayphan EDLF

(sort of) Policy issues

• Inclusion criterion: must have a link giving direct access to the digital object
  – check whether URLs in the data actually resolve to the object described
• Often:
  – resolve to a metadata page with e.g. a PDF icon: how many clicks are acceptable? Need for a policy decision
  – granularity mismatch: link at title level only
• Sometimes:
  – 404 page not found: refer to provider; persistence of URLs
  – need a plug-in (e.g. DjVu): is that OK?
• Occasionally: a log-in required for restricted-access resources
• Need for providers to ensure they only provide links to resources that can be accessed
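The link checks listed above could be partly triaged by script before a person looks at the problem cases. The following is a minimal sketch, not the procedure the team used: the category names, the `HTML_TYPES` set and the functions `classify` and `check_url` are all assumptions made for illustration. It uses only the Python standard library.

```python
from urllib.error import HTTPError, URLError
from urllib.request import Request, urlopen

# Media types treated as "probably a metadata page" (assumed for illustration).
HTML_TYPES = {"text/html", "application/xhtml+xml"}


def classify(status, content_type):
    """Sort one HTTP response into a triage category (hypothetical names)."""
    if status in (401, 403):
        return "restricted"        # log-in required
    if status == 404:
        return "not_found"         # refer back to the provider
    if status >= 400:
        return "error"
    media_type = (content_type or "").split(";")[0].strip().lower()
    if not media_type:
        return "unknown"
    if media_type in HTML_TYPES:
        return "landing_page"      # may be a metadata page; needs a human look
    return "direct_object"         # e.g. an image or PDF served directly


def check_url(url, timeout=10):
    """Request a URL and return its triage category.

    Uses HEAD to avoid downloading the object; some servers reject HEAD,
    in which case the result will be "error" and a GET may be needed.
    """
    request = Request(url, method="HEAD")
    try:
        with urlopen(request, timeout=timeout) as response:
            return classify(response.status, response.headers.get("Content-Type"))
    except HTTPError as err:
        return classify(err.code, None)
    except URLError:
        return "unreachable"
```

A script like this cannot answer the policy questions (how many clicks are acceptable, whether a plug-in is OK); it only separates links that clearly fail from those that need a decision.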

Page 5: The mapping process – some observations Robina Clayphan EDLF

Data level problems 1

• Trying to understand the decision-making process of the original metadata creators
  – What they meant by e.g. dc:date, dc:source
• Trying to discern the (implicit) data model of the original metadata creators
  – What is the dc:relation referring to?
• Understanding data in a foreign language or foreign script
  – Is negyedévenként really Hungarian for terminally?
  – And, if so, why is it in dc:format?

Page 6: The mapping process – some observations Robina Clayphan EDLF

Data level problems 2

• Questions to developers that arose from examining the data
  – All records have two instances of dc:identifier: the first a URL, the second (possibly) a shelfmark. Need to map each instance to a different ESE element. Can it be done?
  – All records have two instances of dc:rights, the first appropriate, the second not. Is it possible to just display the first and ignore the second?
  – Where values had been divided between multiple instances of the same element, could they be concatenated with punctuation for a better display? E.g. spatial1, spatial2, spatial3 used for a geographic hierarchy. Another had up to 14 instances of dc:subject.
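All three questions can be answered "yes" in principle. The sketch below shows one way it might be done, assuming a record has already been parsed into a list of (element, value) pairs. The function `map_record`, the choice of `europeana:isShownAt` and `dcterms:spatial` as targets, the separators and the sample values are assumptions for illustration, not the mapping the developers actually implemented.

```python
def map_record(pairs):
    """Map a list of (element, value) pairs to a dict of target fields."""
    out = {}

    # Two dc:identifier instances: send the URL and the other value
    # (possibly a shelfmark) to different target elements.
    for value in [v for k, v in pairs if k == "dc:identifier"]:
        if value.startswith(("http://", "https://")):
            out.setdefault("europeana:isShownAt", value)
        else:
            out.setdefault("dc:identifier", value)

    # Several dc:rights instances: keep the first, ignore the rest.
    rights = [v for k, v in pairs if k == "dc:rights"]
    if rights:
        out["dc:rights"] = rights[0]

    # spatial1, spatial2, ...: join in numeric order as one hierarchy.
    prefix = "spatial"
    spatial = sorted(
        (int(k[len(prefix):]), v)
        for k, v in pairs
        if k.startswith(prefix) and k[len(prefix):].isdigit()
    )
    if spatial:
        out["dcterms:spatial"] = ", ".join(v for _, v in spatial)

    # Many dc:subject instances: concatenate with punctuation for display.
    subjects = [v for k, v in pairs if k == "dc:subject"]
    if subjects:
        out["dc:subject"] = "; ".join(subjects)

    return out


# Made-up sample record, for illustration only.
record = [
    ("dc:identifier", "http://example.org/item/1"),
    ("dc:identifier", "MS 123"),
    ("dc:rights", "Public domain"),
    ("dc:rights", "n/a"),
    ("spatial2", "Bavaria"),
    ("spatial1", "Germany"),
    ("spatial3", "Munich"),
    ("dc:subject", "maps"),
    ("dc:subject", "rivers"),
]
mapped = map_record(record)
```

Concatenating values improves display but loses the separate values for searching, so whether to do it is a mapping decision rather than a purely technical one.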

Page 7: The mapping process – some observations Robina Clayphan EDLF

Normalisation level

• At the normalisation stage you can see if your interpretation of the record actually makes sense when it has been processed against the source data.

• Apply the Quality Control Checklist
• Edit mapping and repeat!
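The check, edit and repeat cycle lends itself to a small script that reports which normalised records fail which checks. This is a sketch only: the contents of the Quality Control Checklist are not given here, so the two checks below (`has_title`, `has_object_link`) and the field names are hypothetical stand-ins, with the link check loosely based on the inclusion criterion stated earlier.

```python
def has_title(record):
    """Hypothetical check: the record has a non-empty title."""
    return bool(record.get("dc:title", "").strip())


def has_object_link(record):
    """Hypothetical check: the record carries an http(s) link to the object."""
    link = record.get("europeana:isShownAt", "")
    return link.startswith(("http://", "https://"))


# Problem description -> check function (assumed, not the real checklist).
CHECKS = {
    "missing title": has_title,
    "missing or malformed link": has_object_link,
}


def run_checks(records):
    """Return {problem: [record indexes]} for every failed check."""
    failures = {}
    for index, record in enumerate(records):
        for problem, check in CHECKS.items():
            if not check(record):
                failures.setdefault(problem, []).append(index)
    return failures
```

An empty result from `run_checks` would mean the mapping passes these particular checks; anything else points at the records to inspect before editing the mapping and running the normalisation again.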

Page 8: The mapping process – some observations Robina Clayphan EDLF

(my) Conclusion

• All indicates:
  – that it is easier if the mapping and normalising is done as close to source as possible, ideally by the providers
    • they are the ones who understand what the data means and can make sensible mapping decisions
    • they understand the language and script
  – Tools would be nice!

Page 9: The mapping process – some observations Robina Clayphan EDLF

Local schemas > ESE

Data Flow

[Diagram: data flow from local schemas to ESE. Recoverable labels: "Transform data to populate local repository" (#0); "Export data to Europeana" (#5); "Aggregator?"; "EuropeanaLocal"; "Aggregator with provider?" (twice).]

Page 10: The mapping process – some observations Robina Clayphan EDLF

EuropeanaLocal Content Provider Model - to illustrate movement of metadata only

Aggregator

EuropeanaLocal Parallel Test Environment

Aggregator

Europeana

Content provider repositories

Content provider local systems

Customised transformations to e.g. OAI-DC

Mapping and transformation to ESE, including <europeana> elements

Harvesting of e.g. OAI-DC

No metadata transformations

Page 11: The mapping process – some observations Robina Clayphan EDLF

Issues for EuropeanaLocal

• Currently a great deal of manual effort goes into metadata transformation:
  – at provider sites: local format to repository format
  – by the Europeana development team: harvested format to ESE
  – normalisation by the Europeana development team
• Where will this work happen in EuropeanaLocal?
  – feasibility of central Europeana staff handling hundreds more collections?
• Can we minimise the current manual overhead?
• What are the possibilities for automating all or some of the transformation work?