Nikola Tesla Museum Clipping Library
Saša Malkov
Nenad Mitić
Žarko Mijajlović
3rd SEEDI Int.Conf.
Cetinje, Montenegro
14. September 2007.
Clipping Library
The Nikola Tesla Museum holds a rich collection of newspaper clippings on the life and work of Nikola Tesla
The clipping library was collected by Nikola Tesla himself, with the support of his personal secretary
Part of the library is organized in books, while many clippings remain unorganized
Digital Library Prototype
The Digitization Group at the Faculty of Mathematics undertook the development of a digital clipping library prototype
Primary goals:
– Analysis of the problem
– Identification of appropriate solutions
Problems
Significant variation in the sources and quality of the materials
Organization and modeling of data and metadata
Data access
Differences in sources and preservation level
Different digitization techniques give different results, depending on paper type, print type, and preservation level
Several target formats are considered:
– Digital image formats
– PDF
– DjVu format
Data organization
File systems are not appropriate:
– Complex data and metadata access
– Limited search capabilities
Databases allow:
– Simpler access
– Advanced searching
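To illustrate the contrast with a plain file system, here is a minimal sketch of clipping metadata held in a relational table and queried by several criteria at once. It uses Python's built-in sqlite3 module purely as a stand-in for a DBMS; the table name, column names, and sample rows are all hypothetical and are not taken from the prototype.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE clipping (
        id          INTEGER PRIMARY KEY,
        caption     TEXT NOT NULL,
        publication TEXT,
        language    TEXT,
        published   TEXT   -- ISO date string, e.g. '1899-05-20'
    )
""")

# Invented placeholder rows, not real clippings.
conn.executemany(
    "INSERT INTO clipping (caption, publication, language, published) "
    "VALUES (?, ?, ?, ?)",
    [
        ("Sample caption A", "Sample Daily", "en", "1899-05-20"),
        ("Sample caption B", "Sample Weekly", "de", "1901-03-02"),
        ("Sample caption C", "Sample Daily", "en", "1912-11-30"),
    ],
)

def find_clippings(conn, language, year_from, year_to):
    """Return captions of clippings in one language within a year range."""
    rows = conn.execute(
        "SELECT caption FROM clipping "
        "WHERE language = ? AND published BETWEEN ? AND ? "
        "ORDER BY published",
        (language, f"{year_from}-01-01", f"{year_to}-12-31"),
    )
    return [caption for (caption,) in rows]

results = find_clippings(conn, "en", 1895, 1905)
```

The same question asked of a directory tree would require a naming convention or a separate catalogue file, which is the access and search limitation noted above.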
Automatic text extraction
Primary problems are:
– Different languages
– Wide variety of fonts and heavy font stylization used in the corresponding time period
– Very low material quality, because of aging
Different OCR systems were evaluated:
– No OCR software gave satisfactory results, primarily because of the low readability of the material
– A significant amount of manual correction is necessary
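One common way to quantify how much manual correction an OCR result needs is the character error rate: the edit distance between the OCR output and the manually corrected text, divided by the length of the corrected text. The sketch below is an illustration of that measure only; it is not the evaluation procedure used for the prototype, and the function names are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]

def character_error_rate(ocr_text: str, corrected_text: str) -> float:
    """Edit distance relative to the length of the corrected text."""
    if not corrected_text:
        raise ValueError("corrected_text must not be empty")
    return edit_distance(ocr_text, corrected_text) / len(corrected_text)
```

For example, an OCR output of "Tes1a" against the corrected "Tesla" differs by one substitution in five characters.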
Searching
Searching by multiple criteria is essential, including searching by:
– Metadata: caption, keywords, publication, language, period
– The clipping content
  – Manual correction of the text is essential
  – Efficiency requires the application of some indexing method
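As an illustration of one possible indexing method for content search, the sketch below builds a simple inverted index (word to set of clipping ids) over corrected clipping text and combines it with a metadata filter. This is an assumption for illustration, not the indexing approach of the prototype; all names and the sample texts are hypothetical.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lower-case a text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())

def build_index(texts):
    """texts: dict of clipping id -> corrected text. Returns word -> set of ids."""
    index = defaultdict(set)
    for clipping_id, text in texts.items():
        for word in tokenize(text):
            index[word].add(clipping_id)
    return index

def search(index, metadata, words, language=None):
    """Ids of clippings containing all the words, optionally in one language."""
    word_list = tokenize(words)
    if not word_list:
        return []
    hits = set.intersection(*(index.get(w, set()) for w in word_list))
    if language is not None:
        hits = {i for i in hits if metadata[i]["language"] == language}
    return sorted(hits)

# Invented placeholder data, not real clippings.
texts = {
    1: "Wireless transmission of power",
    2: "Drahtlose Übertragung",
    3: "Power station at Niagara",
}
metadata = {1: {"language": "en"}, 2: {"language": "de"}, 3: {"language": "en"}}
index = build_index(texts)
```

A lookup then touches only the index entries for the query words instead of scanning every clipping's text, which is why the quality of the manually corrected text directly affects what can be found.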
The solution – DBMS
The prototype is based on the IBM DB2 DBMS:
– Advanced SQL implementation
– Efficient handling of binary content
– High concurrency level
– High reliability
– Good prior experience
– Free licensing terms
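The point about binary content can be sketched as storing a scanned image in a BLOB column next to its format and reading it back unchanged. The example uses Python's built-in sqlite3 module as a stand-in, since the DB2 setup of the prototype is not described here; the table, columns, and the placeholder bytes are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE clipping_image (
        clipping_id INTEGER PRIMARY KEY,
        format      TEXT NOT NULL,   -- e.g. 'png', 'pdf', 'djvu'
        content     BLOB NOT NULL    -- raw bytes of the digitized clipping
    )
""")

def store_image(conn, clipping_id, fmt, data: bytes):
    """Insert one digitized clipping; the transaction commits on success."""
    with conn:
        conn.execute(
            "INSERT INTO clipping_image (clipping_id, format, content) "
            "VALUES (?, ?, ?)",
            (clipping_id, fmt, data),
        )

def load_image(conn, clipping_id):
    """Return (format, bytes) for a clipping, or None if it is not stored."""
    return conn.execute(
        "SELECT format, content FROM clipping_image WHERE clipping_id = ?",
        (clipping_id,),
    ).fetchone()

scan = b"\x89PNG placeholder bytes"  # stands in for a real scanned image
store_image(conn, 1, "png", scan)
```

Keeping the image in the same database as its metadata means both are covered by the same transactions and backups.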
The solution – User interface
The Web application concept is:
– Rich in content and visual presentation
– Customizable
– Portable
– Relatively simple to implement
The solution – Application
The library prototype is implemented in the functional programming language Wafl
– Wafl is designed for automatic document generation and particularly customized for Web development
– It features very simple and efficient database access