Natural Language Processing in Archaeology: disciplinary impact and beyond.
Arts and Humanities E-Science Project Meeting, UCL, London, June 8th 2009.
Work package 1 - Advanced Faceted Classification / Geo-spatial browser – 1m+ records; primary facets - What, Where, When.
Work package 2 & 3 – Natural language processing / Data-mining of Grey Literature, Data-mining of Historic Literature; plus geoXwalk
A quick reminder about Archaeotools...
“WHAT”
• Records that have no subject information
• Records that use terms not found in TMT, so these records cannot be indexed (6,442 unique terms)
Records (1,001,407)
19,269 records (2%)
Records (1,001,407)
101,507 records (10.1%)
“WHEN”
• Records that have no temporal information
• Records that use period terms not found in MIDAS so these records cannot be indexed (457 types of irresolvable dates)
Records (1,001,407)
292,793 records (29.2%)
Records (1,001,407)
114,505 records (11.4%)
1066, 1001-1100, 11th Centuary, C11, 11C, Eleventh Century
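The variants listed above all point to the same century, which is why free-text period terms are hard to index. As an illustration only, and not the method used by Archaeotools, here is a minimal Python sketch of how such strings could be mapped to a single century number. The function names, the `ORDINAL_WORDS` table and the regular expressions are assumptions made for this example. It assumes AD dates only and tolerates misspellings such as "Centuary" by matching on the prefix "cent".

```python
import re
from typing import Optional

# Hypothetical lookup table for spelled-out ordinals (AD centuries only).
ORDINAL_WORDS = {
    "first": 1, "second": 2, "third": 3, "fourth": 4, "fifth": 5,
    "sixth": 6, "seventh": 7, "eighth": 8, "ninth": 9, "tenth": 10,
    "eleventh": 11, "twelfth": 12, "thirteenth": 13, "fourteenth": 14,
    "fifteenth": 15, "sixteenth": 16, "seventeenth": 17,
    "eighteenth": 18, "nineteenth": 19, "twentieth": 20,
}


def year_to_century(year: int) -> int:
    """Return the AD century containing a year (1001-1100 is the 11th)."""
    return (year - 1) // 100 + 1


def normalise_to_century(term: str) -> Optional[int]:
    """Map a free-text date term to a century number, or None if unresolved."""
    t = term.strip().lower()

    # Year range such as "1001-1100": accept only if both ends agree.
    m = re.fullmatch(r"(\d{3,4})\s*-\s*(\d{3,4})", t)
    if m:
        start = year_to_century(int(m.group(1)))
        end = year_to_century(int(m.group(2)))
        return start if start == end else None

    # Single year such as "1066".
    if re.fullmatch(r"\d{3,4}", t):
        return year_to_century(int(t))

    # Abbreviations such as "C11" or "11C".
    m = re.fullmatch(r"c(\d{1,2})", t) or re.fullmatch(r"(\d{1,2})c", t)
    if m:
        return int(m.group(1))

    # Numeric ordinal such as "11th century" (or the misspelt "centuary").
    m = re.fullmatch(r"(\d{1,2})(?:st|nd|rd|th)\s+cent\w*", t)
    if m:
        return int(m.group(1))

    # Spelled-out ordinal such as "eleventh century".
    m = re.fullmatch(r"([a-z]+)\s+cent\w*", t)
    if m:
        return ORDINAL_WORDS.get(m.group(1))

    return None


variants = ["1066", "1001-1100", "11th Centuary", "C11", "11C", "Eleventh Century"]
for v in variants:
    print(v, "->", normalise_to_century(v))
```

Terms that match none of the patterns return `None`, which mirrors the "irresolvable dates" category above: they would still need manual mapping or a controlled vocabulary.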
“WHERE”
• Records that have no spatial information
• Records that use terms not found in CDP, so these records cannot be indexed.
Records (1,001,407)
11,126 records (1.1%)
Records (1,001,407)
245,601 records (24.5%)
• Vast majority of UK archaeological work undertaken as part of the planning process, administered by local authority archaeologists.
• 4,500 fieldwork events each year in England alone.
• Use of different recording standards for events recording.
Downloads per quarter 2005-2009 (chart; vertical axis 0 to 45,000)
OASIS - Grey Literature Library
EH and University of Glamorgan SKOS browser project.
University of Edinburgh, Edina - Digimap
GeoXwalk service.
TALM – Transatlantic Archaeological Literature Mining – ADS, University of Sheffield, Arizona State University and Arkansas University (geosciences, computer science and ‘Digital Antiquity’, JISC, NEH & NSF.)
CDI Type II proposal - Symbiosis of automated knowledge extraction from scientific text and virtual community contribution leading to reasoning based scientific discovery (NSF, awaiting decision).
“By enormously increasing academic access to, and thereby academic use and appreciation of the results of archaeological work done in cultural resource management settings, it would foster a rapprochement between academic and consulting archaeology, resulting in more productive research in both sectors….the same tools would also serve to make the management of cultural resources more effective, because managers (largely in government) could make decisions with an improved ability to assess what is known, what is contested, and what is little investigated in a given management context”. Chitta Baral, ASU