Web Observatories, Data Analytics and Archives. Professor Dame Wendy Hall, University of Southampton, UK and Kluge Chair, Library of Congress. 16th June 2016

Professor Dame Wendy Hall - Saving the Web



Library of Congress Web Archives

Over 18 million resources available via the LOC website

Books, images, drawings, newspapers, and websites

Over 11 thousand archived websites, and counting, supported by the Internet Archive

Library of Congress API

LOC exposes its resource data via an API so that services can be built on top

Building Web Observatory Services on the LOC website

The Web Observatory uses the schema.org vocabulary to describe different resources

simple, lightweight metadata (URL, name, description, author)

The Web Observatory lens can be added to existing repositories and data catalogues, helping make them discoverable across Web Observatories

Enriching LOC Resources with Schema.org

LOC API (JSON) → WO Schema.org (JSON-LD)
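The JSON to JSON-LD step can be sketched in Python as follows. This is a minimal illustration, not the Observatory's actual mapping: the input field names (`title`, `url`, `description`, `contributor`), the sample record, the choice of the `CreativeWork` type and the treatment of contributors as `Person` authors are all assumptions made for the example. The output carries only the lightweight metadata listed earlier (URL, name, description, author).

```python
import json


def loc_record_to_jsonld(record):
    """Map a LOC-style JSON record (assumed shape) to schema.org JSON-LD."""
    description = record.get("description", "")
    # Assumption: the description may arrive as a list of strings.
    if isinstance(description, list):
        description = " ".join(description)

    doc = {
        "@context": "https://schema.org",
        "@type": "CreativeWork",  # assumed type; a more specific one may fit
        "url": record["url"],
        "name": record["title"],
        "description": description,
    }

    # Assumption: contributors are people; organisations would need another type.
    authors = record.get("contributor", [])
    if authors:
        doc["author"] = [{"@type": "Person", "name": name} for name in authors]
    return doc


# Hypothetical record, shaped like the assumed input above.
sample_record = {
    "url": "https://example.org/item/123",
    "title": "Example archived website",
    "description": ["Captured pages", "from an example site."],
    "contributor": ["Example Author"],
}

print(json.dumps(loc_record_to_jsonld(sample_record), indent=2))
```

Keeping the mapping to four properties mirrors the "simple, lightweight metadata" approach: any repository that can supply a URL, a name, a description and an author can be described in the same way.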

Archives Unleashed 2.0: used datasets found through the Observatory

Archives Unleashed 2.0: the results can be made accessible through the Observatory

Studying the Web across continents

Astronomers obtain a very high resolution picture of the sky from small telescopes a long distance apart.

Many web research labs, contributing across the globe, help build an accurate picture of human activity at planetary scale.

Transcending parochial social, political, economic and legal interpretations

Understanding web evolution:

observation

experimentation

Web Observatory Infrastructure today

An initiative supported by the Web Science Trust: http://www.webscience.org

A growing number of sites using common metadata for hosted datasets and apps (schema.org): http://index.webobservatory.org

Some WO sites use purpose-built software that:

Allows their community members to list and share public or private datasets and apps

Provides for discovery and access to listed datasets and apps across WO sites

Provides APIs for app development using listed datasets: http://webobservatory.soton.ac.uk

Thanassis Tiropanis University of Southampton [email protected]


index.webobservatory.org

Follow us: @wo_team

Contact us: [email protected]


Web Observatory Infrastructure tomorrow

A distributed catalogue across WO sites

WO sites use common technical standards for:

Describing locally or remotely hosted datasets and apps

Accessing datasets and apps across sites

APIs for developing apps and visualisations

Meaningful terms and conditions

Implementing ethical practice

Thanassis Tiropanis University of Southampton [email protected]
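One way to picture a distributed catalogue is as a merge of the listings each WO site publishes about its datasets and apps. The Python sketch below is only an illustration under stated assumptions: the site names, the entry shape (a dict with `url` and `name`) and the helper `merge_catalogues` are hypothetical, and the real infrastructure may work quite differently.

```python
def merge_catalogues(site_catalogues):
    """Merge per-site listings into one catalogue keyed by entry URL.

    `site_catalogues` maps a site name to a list of entries; each entry is
    assumed to be a dict with at least a "url" key.
    """
    merged = {}
    for site, entries in site_catalogues.items():
        for entry in entries:
            # The first site to list a URL supplies the metadata; later sites
            # are only recorded as additional listers.
            record = merged.setdefault(entry["url"], {**entry, "listed_by": []})
            record["listed_by"].append(site)
    return merged


# Hypothetical listings from two hypothetical WO sites.
catalogues = {
    "site-a": [
        {"url": "https://example.org/ds/1", "name": "Wikipedia edits"},
        {"url": "https://example.org/ds/2", "name": "Archived websites"},
    ],
    "site-b": [
        {"url": "https://example.org/ds/1", "name": "Wikipedia edits"},
    ],
}

for url, record in merge_catalogues(catalogues).items():
    print(url, record["name"], record["listed_by"])
```

Keying on the URL means a dataset hosted remotely and listed by several sites appears once, with every listing site recorded against it.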


The Web Observatory: A Middle Layer for Broad Data. (2014). Tiropanis, T, Hall, W, Hendler, J A, De Larinaga, C. Big Data, 2(3).


Datasets

Apps

Who is editing what in Wikipedia?

The Web of Observatories

[Diagram: five nodes, each labelled "Web Observatory"]

We have a number of emerging Web Observatories

Observing the Web

How do we catalogue Observatories and content?

https://www.w3.org/wiki/WebSchemas/SchemaDotOrgProposals

https://www.w3.org/wiki/WebSchemas/WebObsSchema

We're building a crawler and a search engine
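To make the crawler and search engine idea concrete, here is a minimal Python sketch using only the standard library. It assumes that observatory pages embed their schema.org descriptions as JSON-LD in `<script type="application/ld+json">` elements; that assumption, the sample page and the helper names `extract_jsonld` and `build_index` are illustrative and do not describe the actual crawler.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect JSON-LD objects embedded in <script> elements of a page."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.items.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # skip malformed metadata rather than stop the crawl


def extract_jsonld(html):
    """Return every JSON-LD object found in the given HTML string."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.items


def build_index(items):
    """Build a tiny inverted index: lower-cased word -> list of item URLs."""
    index = {}
    for item in items:
        text = f"{item.get('name', '')} {item.get('description', '')}".lower()
        for word in set(text.split()):
            index.setdefault(word, []).append(item.get("url"))
    return index


# Hypothetical page, as a crawler might fetch it from an observatory site.
page = """<html><head><script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Dataset",
 "url": "https://example.org/ds/1", "name": "Wikipedia edits",
 "description": "Who is editing what"}
</script></head><body></body></html>"""

index = build_index(extract_jsonld(page))
print(index.get("wikipedia"))
```

A real crawler would also need to fetch pages, follow links between observatories and respect each site's terms and conditions; the sketch covers only metadata extraction and keyword lookup.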

© Web Science Trust 2013

Observing the Web

The ambition is to map the digital universe!

webscience.org/web-observatory

Follow us: @wo_team

Contact us: [email protected]

Digital Vellum

The Self Archiving Web

Who is archiving what?