WEB SCRAPING FOR CMS WEBSITES IN ANDROID APPLICATION
Major Project Part-2
Problem Statement
As the number of users accessing the World Wide Web from smartphones grows exponentially, there is an urgent need to address the issues specific to this mode of access. These issues include limited display space and the need for quick access to only the relevant information, both of which can be addressed using web scraping. A major part of the World Wide Web today consists of web pages built with Content Management Systems such as WordPress and phpBB, so a generic approach can be developed for extracting relevant information from such pages using existing web scraping tools.
Assumptions
Assumption 1: The user knows how to operate an Android phone.
Assumption 2: The Android version is not lower than 4.4 (API level 19).
Approach to the Solution
WordPress is the most extensively used Content Management System, so we decided to develop a solution for web-scraping-based information retrieval from WordPress web pages.
We chose to deliver the solution as an Android application because Android is the world's most widely used smartphone platform. Android's share of the global smartphone market, led by Samsung products, was 64% in March 2013. In July 2013 there were 11,868 different Android devices, scores of screen sizes and eight OS versions simultaneously in use.
Because of its compatibility with the Android platform, we selected jsoup as the preferred web scraping tool and developed a model application for the website www.geeksforgeeks.org.
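The kind of extraction jsoup makes possible can be sketched as follows. This is a minimal illustration, not the project's actual code: the class name PostTitleScraper, the sample HTML and the selector "h2.entry-title a" are assumptions (many WordPress themes mark post titles with the entry-title class, but this depends on the theme). It needs the jsoup library on the classpath and parses an inline string so that it runs without network access.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class PostTitleScraper {

    // Assumed selector: theme-dependent, not taken from the original project.
    static final String TITLE_LINK_SELECTOR = "h2.entry-title a";

    /** Returns {title, absolute URL} pairs for every post link found in the HTML. */
    static List<String[]> extractPosts(String html, String baseUrl) {
        // The base URL lets jsoup resolve relative href values.
        Document doc = Jsoup.parse(html, baseUrl);
        List<String[]> posts = new ArrayList<>();
        for (Element link : doc.select(TITLE_LINK_SELECTOR)) {
            posts.add(new String[] { link.text(), link.absUrl("href") });
        }
        return posts;
    }

    public static void main(String[] args) {
        // Hypothetical page: a sidebar that should be ignored, plus two posts.
        String sampleHtml = "<html><body>"
            + "<div id=\"sidebar\">Ads and widgets</div>"
            + "<h2 class=\"entry-title\"><a href=\"/first-post/\">First post</a></h2>"
            + "<h2 class=\"entry-title\"><a href=\"/second-post/\">Second post</a></h2>"
            + "</body></html>";
        for (String[] post : extractPosts(sampleHtml, "https://example.com")) {
            System.out.println(post[0] + " -> " + post[1]);
        }
    }
}
```

For a live page, the document would be fetched with Jsoup.connect(url).get() instead of Jsoup.parse. On Android that call has to run off the main thread, since network access on the main thread raises NetworkOnMainThreadException.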
Algorithm
Tools and Technology
Android SDK
Eclipse
jsoup
Hardware Requirements
Personal computer/Laptop
Android phone (version 4.4, API level 19)
Limitations of the Solution
The application is available only to Android users.
It was developed only for the WordPress CMS, but can be extended to other CMSes.
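One way such an extension could be organised is to keep the CMS-specific part of the scraper in a lookup table, so that supporting another CMS means adding an entry rather than new scraping logic. The sketch below is only an illustration under that assumption: the class CmsSelectors and both selector strings are hypothetical and theme-dependent, not part of the project described here.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

public class CmsSelectors {

    // Hypothetical CSS selectors for post or topic title links, one per CMS.
    private static final Map<String, String> TITLE_SELECTORS = new LinkedHashMap<>();
    static {
        TITLE_SELECTORS.put("wordpress", "h2.entry-title a");
        TITLE_SELECTORS.put("phpbb", "a.topictitle");
    }

    /** Returns the title selector for a CMS, or null if the CMS is not supported yet. */
    static String titleSelectorFor(String cmsName) {
        return TITLE_SELECTORS.get(cmsName.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(titleSelectorFor("WordPress")); // a registered CMS
        System.out.println(titleSelectorFor("Drupal"));    // not registered
    }
}
```

The returned selector would then be passed to the scraping step in place of a hard-coded WordPress selector.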
Thank You!
A Presentation by:
Dolly Kharbanda (9910103471) F-3
Pradeep (9910103554) F-3
Under Mentorship Of
Mr. Sandeep Jain
Department of CSE and IT
JIIT Sec 128, Noida