Collaborative Digital Library Services in a Cloud
Kurt Maly, Harris Wu, Mohammad Zubair, Milena Mektesheva
Department of Computer Science, Old Dominion University, Norfolk, VA, USA
Service Computation 2010, November 21-26, 2010, Lisbon
Outline
1. Introduction: What is the main issue with traditional computing?
2. Background: The existing facet-based system and the compute-intensive nature of some of its features.
3. Evaluation and scaling issues: The evaluation of the Facet System; the scaling issues of traditional computing.
4. Cloud development architecture: The system on LAMP; the system with PHP and MySQL on Windows Azure.
5. Future Work
Introduction
We have developed a web-based system that allows users to collaboratively organize large online multimedia collections into an evolving faceted classification.
The system includes backend algorithms that systematically enrich the classification and automatically classify documents.
Evaluation of the prototype system (Facet System) shows promise, and identifies some issues.
Introduction
One major issue: the scalability of the system on traditional server implementations.
Traditional computing cannot support an ever-increasing number of users, documents, schema objects, schema history records, and automated classification processes without difficult, expensive, and time-consuming resource reconfiguration.
To address this problem, we propose moving our system to the cloud-based Microsoft Windows Azure platform as a collaborative cloud service.
Background – the existing Facet System
browsing screen
Background – the existing Facet System
Facet classification
The personal schema allows a user to have a personal, persistent, idiosyncratic view of the collection.
Background – the existing Facet System
Facet classification with both global and personal schemas.
Personal schema
Global schema
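The relationship between the global and personal schemas can be illustrated with a small sketch. It is not the Facet System's implementation: the `Schema` class, the `personal_view` and `browse` functions, and the rule that a personal view is the union of global and personal classifications are all assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Schema:
    """Hypothetical schema: facet name -> category name -> document ids."""
    facets: dict[str, dict[str, set[str]]] = field(default_factory=dict)

    def classify(self, facet: str, category: str, doc_id: str) -> None:
        # Create the facet and category on first use, then file the document.
        self.facets.setdefault(facet, {}).setdefault(category, set()).add(doc_id)

    def documents(self, facet: str, category: str) -> set[str]:
        # Return a copy so callers cannot mutate the schema by accident.
        return set(self.facets.get(facet, {}).get(category, set()))


def personal_view(global_schema: Schema, personal: Schema) -> Schema:
    """Overlay a personal schema on the global one (assumed rule: union)."""
    view = Schema()
    for schema in (global_schema, personal):
        for facet, categories in schema.facets.items():
            for category, docs in categories.items():
                for doc_id in docs:
                    view.classify(facet, category, doc_id)
    return view


def browse(schema: Schema, selections: dict[str, str]) -> set[str]:
    """Faceted browsing: intersect the documents of every selected category."""
    results = None
    for facet, category in selections.items():
        docs = schema.documents(facet, category)
        results = docs if results is None else results & docs
    return results or set()
```

In this sketch a user's own classifications widen what they see without touching the global schema, which is one way to read "a personal, persistent, idiosyncratic view of the collection".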
Background – the existing Facet System
The back-end algorithms use the metadata in personal schemas to enrich the global schema and to classify documents automatically.
When automated classification is enabled for the personal hierarchy (in the user preference settings), the backend algorithms take a significant amount of computing resources for each additional user.
Furthermore, our system supports schema history – which allows users to examine global or personal schema at any given point in time.
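One common way to support a point-in-time view is to log every schema change and replay the log up to the requested time. The sketch below takes that approach as an assumption; the slides do not say how schema history is stored, and `SchemaEvent` and `schema_at` are hypothetical names.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class SchemaEvent:
    """Hypothetical log record for one change to a schema."""
    timestamp: int   # assumed: any monotonically increasing clock value
    action: str      # "add" or "remove"
    facet: str
    category: str


def schema_at(log: list[SchemaEvent], when: int) -> dict[str, set[str]]:
    """Replay all events with timestamp <= when to rebuild facet -> categories."""
    schema: dict[str, set[str]] = {}
    for event in sorted(log, key=lambda e: e.timestamp):
        if event.timestamp > when:
            break  # later events are not part of the requested view
        categories = schema.setdefault(event.facet, set())
        if event.action == "add":
            categories.add(event.category)
        elif event.action == "remove":
            categories.discard(event.category)
    return schema
```

Under this design the cost of a historical query grows with the length of the log, and the log itself grows with every user and every edit, which is consistent with schema history being listed among the scaling pressures.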
Evaluation and scaling issues
The evaluation of the Facet System:
• We have evaluated the Facet System for over a year with over 300 students at Old Dominion University and the University of Delaware.
• We have tested the system by simulating a large number of users.
• The scaling issue proves to be a critical factor in expanding the evaluation and deploying our system for public use in a multimedia document repository.
Evaluation and scaling issues
The scaling issues of traditional computing:
• Traditional computing cannot support an ever-increasing number of users, associated personal schemas, schema history logging, schema enrichment, and automated classification processes.
• With traditional computing, resources are typically configured rigidly with respect to both hardware and software (including licenses) to handle expected usage for a fairly short time horizon.
Evaluation and scaling issues
Our long-term vision: the cloud-based document-organization approach may go beyond organizing an online multimedia collection to organizing knowledge bases in a large enterprise or a global research community.
The cloud not only eliminates the storage limitation of desktop computers and traditional file servers, but also reduces duplicate storage and allows for value-added services such as document version controls.
The Facet system on Windows Azure
Cloud development architecture
Current system: Joomla on LAMP (Linux, Apache, MySQL, and PHP).
The system using Azure: the Joomla system along with PHP and MySQL on Windows Azure.
• Web Role: runs the user-facing Facet System, which is programmed in PHP.
• Worker Role: runs the MySQL database and the backend schema enrichment and classification programs, which are written in Java.
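Splitting the user-facing front end from the compute-intensive backend lets each side scale separately. The sketch below shows one way such a hand-off could work, using an in-process queue and threads as stand-ins for the two role types. The slides do not say how the roles communicate, so the queue, the `classify` placeholder, and `run_backend` are all assumptions.

```python
from __future__ import annotations

import queue
import threading


def classify(doc_id: str) -> str:
    """Placeholder for the compute-intensive classification step."""
    return f"classified:{doc_id}"


def run_backend(doc_ids: list[str], workers: int = 2) -> dict[str, str]:
    """Submit jobs as a front end would and process them with N workers."""
    jobs: queue.Queue = queue.Queue()
    results: dict[str, str] = {}
    lock = threading.Lock()

    def worker() -> None:
        while True:
            doc_id = jobs.get()
            if doc_id is None:  # sentinel: no more work for this worker
                return
            label = classify(doc_id)
            with lock:
                results[doc_id] = label

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for doc_id in doc_ids:      # the front end only enqueues and returns
        jobs.put(doc_id)
    for _ in threads:           # one sentinel per worker
        jobs.put(None)
    for thread in threads:
        thread.join()
    return results
```

Raising `workers` here plays the part of adding worker instances: the submitting side does not change when backend capacity does.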
Cloud development architecture
Overview of the Azure Cloud
Cloud development architecture
In our deployment there are different Web Roles and Worker Roles.
Two web roles:
• The FacetUI instances serve the end-user interface to the Facet system.
• The FacetAdmin role contains administrative tools that administer the database and caches.
Three worker roles:
• The MySQL instances host the MySQL database that supports the core Joomla features.
• The MemCached instances host Memcached, a popular distributed object cache system.
• The FacetBackend role contains the systematic schema enrichment and automated classification algorithms, which operate with data in SQL Azure.
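An object cache such as Memcached is typically placed in front of a database using the cache-aside pattern: read from the cache first, fall back to the database on a miss, and invalidate the cached entry on write. The sketch below illustrates that pattern with plain dictionaries standing in for both Memcached and MySQL. `CachedStore` is a hypothetical name, and the slides do not state which caching strategy the deployment uses.

```python
from __future__ import annotations


class CachedStore:
    """Cache-aside sketch; dicts stand in for the cache and the database."""

    def __init__(self, database: dict[str, str]) -> None:
        self.database = database          # stand-in for the MySQL instances
        self.cache: dict[str, str] = {}   # stand-in for the MemCached instances
        self.db_reads = 0                 # counts how often the database is hit

    def get(self, key: str) -> str | None:
        if key in self.cache:
            return self.cache[key]        # cache hit: no database access
        self.db_reads += 1
        value = self.database.get(key)
        if value is not None:
            self.cache[key] = value       # populate the cache for later reads
        return value

    def put(self, key: str, value: str) -> None:
        self.database[key] = value
        self.cache.pop(key, None)         # invalidate so the next read is fresh
```

Repeated reads of the same key reach the database only once, which is the effect a cache tier is meant to have on a read-heavy browsing interface.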
Cloud development architecture
Architecture of proposed deployment on Windows Azure
Cloud development architecture
Deployment of Facet System on Azure Development
Future Work
On the user-oriented side: address issues that come with the large scale.
On the back-end: address scalability issues of schema enrichment and automated classification.
We will evaluate various aspects of system functionality, including both user interface and backend algorithms.
In parallel with code changes, we will develop a large test bed that allows us to test the scalability of the system.
……
Thank You!
Questions?