ONTOLOGY BASED WEB CRAWLER

SUBMITTED BY: Sachin Murwariya (9910103457)

JIIT;Project 2013-14; CSE/IT


WHAT IS RSS

RSS is a defined standard for syndicating headlines and other content.

RSS is written in XML (eXtensible Markup Language), a markup language similar to HTML. Every field is defined, and tags denote what kind of field each one is.

As in HTML, a properly constructed feed requires that every tag is both opened and closed.

Example: <title> Title of Item in Feed </title>
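As a minimal sketch of how such tags can be read programmatically, the snippet below parses a small feed with Python's standard `xml.etree.ElementTree` module. The feed text, the URLs and the helper name `parse_items` are assumptions made up for illustration; they are not taken from the project.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed content, invented for illustration only.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example News</title>
    <link>http://example.com/</link>
    <description>Sample headlines</description>
    <item>
      <title> Title of Item in Feed </title>
      <link>http://example.com/item1</link>
      <description>Short summary of the first item.</description>
    </item>
    <item>
      <title>Second Item</title>
      <link>http://example.com/item2</link>
      <description>Short summary of the second item.</description>
    </item>
  </channel>
</rss>"""


def parse_items(feed_xml):
    """Return one dict per <item> element with its title, link and description."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iter("item"):
        items.append({
            # findtext returns None when a tag is missing, so fall back to "".
            "title": (item.findtext("title") or "").strip(),
            "link": (item.findtext("link") or "").strip(),
            "description": (item.findtext("description") or "").strip(),
        })
    return items


for entry in parse_items(SAMPLE_FEED):
    print(entry["title"], "->", entry["link"])
```

Because every opened tag is closed, the parser can recover each field by name. A feed with an unclosed tag would make `ET.fromstring` raise a `ParseError` instead.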

RSS has been around for more than a decade, but only recently has the standard been embraced by bloggers, webmasters and large news portals as a means of distributing information in a standardized format.

WHAT IS ONTOLOGY BASED WEB CRAWLER

We present news personalization using a semantic recommender: a news recommender system that applies Semantic Web technologies to describe and relate news content and user preferences in order to produce enhanced recommendations.

APPLICATIONS

◦ User profile construction
◦ Semantics-based recommendation
◦ Delivering categorised news items

BENEFITS:
◦ Helps with constant updates
◦ Ease of operation: the user can collect information from multiple sources into a single data stream.
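Collecting several sources into a single data stream can be sketched as follows. This is only an illustration under assumed data: the item dictionaries, their `published` field and the function name `merge_streams` are hypothetical and do not describe the project's actual code.

```python
from datetime import datetime

# Hypothetical items from two sources; titles and dates are invented.
SOURCE_A = [
    {"title": "Budget announced", "source": "A",
     "published": datetime(2014, 1, 10, 9, 0)},
    {"title": "Match report", "source": "A",
     "published": datetime(2014, 1, 9, 18, 30)},
]
SOURCE_B = [
    {"title": "Market update", "source": "B",
     "published": datetime(2014, 1, 10, 12, 15)},
]


def merge_streams(*sources):
    """Combine items from all sources into one list, newest first."""
    merged = [item for source in sources for item in source]
    merged.sort(key=lambda item: item["published"], reverse=True)
    return merged


for item in merge_streams(SOURCE_A, SOURCE_B):
    print(item["published"].isoformat(), item["source"], item["title"])
```

Sorting by publication time is one simple choice of ordering. A personalised system could instead order the merged stream by a relevance score.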

PROBLEM STATEMENT

The extremely large volume of online news has created an urgent need for tools that let users effectively and efficiently browse topics, detect temporal trends, and search news of interest.

To address this, we are building an ontology based web crawler to extract valuable information from large online news collections.

TEST PLAN

The purpose of testing is quality assurance, verification and validation, or reliability estimation.

Unit Testing

Component testing

Integration testing

Validation Testing

System Testing

ARCHITECTURE :

METHODS IN USE:
1. Crawling Algorithm
2. Concept Based Algorithm
3. Recommendation Algorithm

CRAWLING ALGORITHM:

Concept Based Algorithm

RECOMMENDATION ALGORITHM

Recommender systems typically produce a list of recommendations in one of two ways - through collaborative or content-based filtering.

Collaborative filtering approaches build a model from a user's past behavior (items previously purchased or selected), then use that model to predict items that the user may have an interest in.

Content-based filtering approaches utilize a series of discrete characteristics of an item in order to recommend additional items with similar properties.
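The content-based approach can be illustrated with a small sketch. It assumes that the "discrete characteristics" of a news item are simply the words in its headline and that similarity is measured with cosine similarity. Both choices, along with the function names and sample headlines, are assumptions for illustration and are not the project's recommendation algorithm.

```python
import math
from collections import Counter


def term_vector(text):
    """Lower-cased bag-of-words vector for a piece of text."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(read_items, candidates, top_n=2):
    """Rank candidates by similarity to the items the user has already read."""
    profile = Counter()
    for text in read_items:
        profile.update(term_vector(text))
    scored = [(cosine(profile, term_vector(c)), c) for c in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Drop candidates that share no terms with the profile.
    return [c for score, c in scored[:top_n] if score > 0]


# Hypothetical reading history and candidate headlines.
history = ["election results announced today",
           "parliament votes on election law"]
pool = ["cricket team wins final",
        "new election date announced",
        "stock market falls"]
print(recommend(history, pool))
```

A collaborative filter would replace the word-based profile with a model built from many users' selections. An ontology-based recommender would compare concepts rather than raw words.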

IMPLEMENTATION

Login Page

Search using keyword:


THANK YOU