Why Heritrix?
Internet Archive’s web-scale, archival-quality web crawler project
Open-source and extensible
Written in Java and used in CiteSeer
Download/untar/cd bin
http://crawler.archive.org/index.html
Go to the SourceForge downloads page and get version 1.14.3
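The untar and cd bin steps can be sketched in a few lines of Python. This is only an illustration, not part of the Heritrix distribution: the helper name `unpack_and_find_bin`, the archive name `heritrix-1.14.3.tar.gz`, and the target directory `heritrix` are assumptions, so substitute whatever file the downloads page actually gives you.

```python
import tarfile
from pathlib import Path


def unpack_and_find_bin(archive_path, dest_dir):
    """Extract a .tar.gz archive into dest_dir and return its bin/ directory.

    Hypothetical helper for illustration; it assumes the archive
    contains a directory named "bin" somewhere inside it.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)

    # Equivalent of "tar xzf <archive>" into dest_dir.
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(dest)

    # Equivalent of locating the directory you would "cd" into.
    candidates = sorted(p for p in dest.rglob("bin") if p.is_dir())
    if not candidates:
        raise FileNotFoundError("no bin/ directory found in archive")
    return candidates[0]


# Example call (the archive name is an assumption, not taken from the slides):
# bin_dir = unpack_and_find_bin("heritrix-1.14.3.tar.gz", "heritrix")
```

The function returns the path of the bin directory rather than changing the working directory, so the caller decides whether to cd there or to launch scripts by full path.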