Webrecorder: Building, Maintaining & Growing
IIPC WAC | November 14, 2018
Anna Perricci, Associate Director of Strategic Partnerships, Webrecorder at Rhizome
Rhizome at the New [email protected]
Webrecorder Team
Dragan Espenschied, Preservation Director, Rhizome
Ilya Kreymer, Software Director
Mark Beasley, Senior Front-End Developer
Pat Shiu, Associate Director of Design
Anna Perricci, Associate Director, Strategic Partnerships
John Berlin, Senior Back-End Developer
About Webrecorder
Create high-fidelity, interactive captures of any web pages you browse
http://webrecorder.io Webrecorder Player
A project by
with generous support from
Webrecorder Project
● Easy to use, robust tools
● Free to use
● Fully open source
● Using open standards
● Growing user community
● Quickly evolving
High fidelity web archiving
● Fidelity = similarity between original and capture
● Capture any web page loaded in the browser
● Archive interactive content (only available after user input)
● Same system for capture & browsing (web browser)
Webrecorder creates high fidelity web archives, including
elements that crawler-based systems often fail to capture,
such as interactive content
Collecting at human scale
• Collecting is done with intent by a person via web browser, one page at a time
• Can import and augment collections created by crawlers
• Anyone can use Webrecorder: web archiving for all!
The payoff for careful capture is an accurate representation of the original
Thoughts on human scale collecting
● Being mindful and observant while collecting is necessary and respectful (regardless of tool)
● Hand collecting can facilitate thoughtful acquisition & require one to be more aware of what is being gathered
● Crawler/machine-based collecting is done via software that is made with human influence
  ○ Any claim of neutrality because automated tools are used rings false (biases via algorithms)
Your data
● Web archiving is more than just capture – it is about description, stewardship and sharing, too
● You can save in a free account &/or take your data out of Webrecorder for storage (local &/or repository!)
● Downloaded WARCs accessible with free software (Webrecorder Player)
● Collections can be shared or kept private
What about social media?
● Webrecorder can capture content from social media sites (e.g. Instagram and Twitter)
● Some websites deliver content individualized for each user
● Webrecorder can capture the content you see when you are logged in to a social media profile
Login credentials
● Webrecorder.io is designed to not capture or retain your login credentials (e.g. for social media)
● But a non-personal account is not a bad idea, since there’s a possibility credentials could be written into a WARC file if Webrecorder is not successful in suspending capture during login
● Under no circumstances is your login exposed or discoverable via Webrecorder.io
Human scale collecting (one page at a time)
with no barrier to entry (via Webrecorder.io)
levels the playing field
Bound archive & symmetrical web archiving
● Bound archive → collection with firm boundaries. With web archiving, users can often quickly find boundaries via browsing
● Symmetrical web archiving → same user-facing software for collecting and access (a web browser)
Sustainability
● More tools to come, though some will not be free
● Subscription for more storage – Q1 2019
● Automated collecting & QA tools – Q3 2019
● Support to be better staffed but good design & remote meetings/presentations are helping a lot
● Income: subscriptions, grants, grant subawards (i.e. preservation tool for web projects), funded collaborations & custom work, priority support
Collaboration & partnerships in action
● User partnerships with new web archiving projects (Carnegie Hall, Stanford UP, Pelican Bomb) and leaders in the field
● Development work (British Library, Perma.cc)
● Collaborative development (NFB)
● Increased usage from journalists & professors
Conference: Ethics & Archiving the Web
● IMLS funded forum co-organized by Rhizome and
Doc Now colleagues
● Videos of panels freely available
https://eaw.rhizome.org/
● Stay tuned for recommended practices document
and blog post follow ups from EAW
Will we mirror or amplify the web’s problems?
● Let’s try not to
● Trauma or re-traumatization possible through collecting
● Opportunity to highlight voices that are not privileged by for-profit or institutional entities
● Decolonizing web archives
Workshop 3:30pm today!
Thank you