Why Heritrix?
Internet Archive’s web-scale, archival-quality web crawler project
Open-source and extensible
Written in Java and used in CiteSeer
Download/untar/cd bin
http://crawler.archive.org/index.html
Go to the SourceForge downloads page and get version 1.14.3.
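
The untar and cd bin steps can be sketched in Python as below. This is a minimal sketch, not part of the Heritrix distribution: the helper name untar_and_find_bin is hypothetical, and it assumes the downloaded archive is a gzip-compressed tarball that unpacks into a single top-level directory containing bin/.

```python
import tarfile
from pathlib import Path


def untar_and_find_bin(archive: Path, dest: Path) -> Path:
    """Extract a .tar.gz archive into dest and return the path of its bin/ directory."""
    # Only extract archives obtained from a source you trust.
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(dest)
    # Assumption: the archive unpacks into one top-level directory that holds bin/.
    candidates = sorted(p for p in dest.glob("*/bin") if p.is_dir())
    if not candidates:
        raise FileNotFoundError(f"no bin/ directory found under {dest}")
    return candidates[0]
```

A call might look like untar_and_find_bin(Path("heritrix-1.14.3.tar.gz"), Path(".")), where the archive file name is an assumption about how the download is saved locally. The returned path is the directory to cd into for the next step.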