7/29/2019 Knowledge Discovery From Weblogs
A
SEMINAR REPORT
ON
Knowledge Discovery From Weblogs
Submitted in partial fulfillment of degree of
BACHELOR OF TECHNOLOGY
In
Information Technology
2012-13
Guided by: Mr. Saurabh Anand, Lecturer, Department of IT
Submitted by: Avtar Kishore Gaur, B. Tech. (IT), VIII Semester, IT/09/53
DEPARTMENT OF INFORMATION TECHNOLOGY
POORNIMA COLLEGE OF ENGINEERING
ISI 06, RIICO INSTITUTIONAL AREA
JAIPUR - 302 022
CONTENTS

1. Introduction
2. Fields in Web Log File
3. Mining Web Logs for Path Profiles
   3.1 Web Content Mining
   3.2 Web Log Mining for Prefetching
   3.3 Web Object Prediction
4. Web Mining Taxonomy
   4.1 Web Content Mining
       4.1.1 Classification of Multimedia Content and Websites
       4.1.2 Focused Crawling
       4.1.3 Clustering Web Objects
   4.2 Web Structure Mining
   4.3 Web Usage Mining
       4.3.1 Data Preparation
       4.3.2 Data Mining
       4.3.3 Web Usage Data
       4.3.4 Web Server Data
       4.3.5 Application Server Data
       4.3.6 Application Level Data
5. Advantages / Merits
6. Disadvantages / Demerits
7. Applications
   7.1 Search Engines
   7.2 Similarity Measures
   7.3 Ontology
   7.4 Recognition Technology
   7.5 Summarization
   7.6 E-commerce
   7.7 Content Management
   7.8 Information Aggregation
8. Conclusion
9. References
1. Introduction

Web usage mining is the process of obtaining interesting, constructive and implicit knowledge from activities related to the World Wide Web. Web servers trace and gather information about user interactions every time a user requests a particular resource. Evaluating the Web access logs helps in predicting user behavior and also assists in formulating the web structure. From the application point of view, information extracted from Web usage patterns can be directly applied to efficiently manage activities related to e-business, e-services, e-education, on-line communities and so on. However, since the size and density of the data grow rapidly, existing Web log file analysis tools may provide insufficient information, and hence more intelligent mining techniques are needed. Several approaches to web usage mining are available in the literature, each with its own merits and demerits. This report focuses on the study and analysis of various existing web usage mining techniques.
2. Fields in Web Log File
The following fields are recorded in a typical web log, shown here with values from two sample entries:

a) Web server: Apache
b) IP address: 66.249.71.6 and 180.76.5.92
c) User name: - and -
d) Timestamp: [23/Feb/2012:06:23:46 -0600] and [23/Feb/2012:06:11:04 -0600] (time of visit recorded by the web server)
e) Access request: "GET /robots.txt HTTP/1.1" and "GET / HTTP/1.1"
f) Result status code: 500 and 500 (Internal Server Error)
g) Bytes transferred: 7370 and 7370
h) User agent: Mozilla/5.0
i) Referrer URL: (compatible; Googlebot/2.1; +http://www.google.com/bot.html) and (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)
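The fields above correspond to Apache's combined log format. As an illustrative sketch (the regular expression and field names are ours, not taken from any particular analysis tool), one such line can be parsed in Python as follows:

```python
import re

# Illustrative regex for Apache's Combined Log Format:
# IP, identity, user, timestamp, request, status, bytes, referrer, user agent
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) (?P<identity>\S+) (?P<user>\S+) '
    r'\[(?P<timestamp>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_log_line(line):
    """Return a dict of log fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None

line = ('66.249.71.6 - - [23/Feb/2012:06:23:46 -0600] '
        '"GET /robots.txt HTTP/1.1" 500 7370 "-" '
        '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"')
entry = parse_log_line(line)
print(entry['ip'], entry['status'], entry['request'])
```

Each named group then maps directly onto one of the fields listed above.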
3. Mining Web Logs for Path Profiles
3.1 Web Content Mining:

The steps involved in mining web logs for path profiles are:

a. Data cleaning on web log data
b. Mining web logs for path profiles
c. Web object prediction
d. Learning to prefetch web documents
3.2 Web Log Mining for Prefetching
Caching and prefetching are effective approaches to the explosive growth in network users and Web services, and have been widely used in Web proxies, P2P, grid computing and wireless networks. Bringing some of the more popular items closer to end-users can improve network performance and thereby reduce download latency and network congestion. Web caching and prefetching are based on the temporal locality of user request sequences. The Independent Reference Model (IRM) and the Markov Reference Model (MRM) are the models most used for Web caching at present, while Markov-based models are mostly used for prefetching. The design of a replacement policy is always based on the characteristics of request sequences. Therefore, modeling user request sequences and Web object properties exactly and simply is important, and we hope to find optimal policies under these factors in a systematic manner. This report first analyzes and compares the Web caching and prefetching models in use today, and then, based on the measurement of relative popularity and byte cost, presents an optimal Web caching and prefetching model (PR PPM) that satisfies different performance metrics. Visiting sessions are kept separate. A path profile consists of frequent subsequences drawn from the frequently occurring paths; it helps us predict the pages that are most likely to be requested next.
3.3 Web Object Prediction
It is possible to train a path-based model for predicting future URLs based on a sequence of current URL accesses. This can be done on a per-user basis or on a per-server basis. The former requires that user sessions be recognized and separated through a filtering system, while the latter takes the simplistic view that all accesses on a server form a single long thread.
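A first-order Markov model is one simple way to build such a path-based predictor. The sketch below (the session data is hypothetical) counts observed URL-to-URL transitions and predicts the most frequently observed successor of the current page:

```python
from collections import Counter, defaultdict

def train_markov(sessions):
    """Count transitions url -> next_url over all visiting sessions."""
    transitions = defaultdict(Counter)
    for session in sessions:
        for current, nxt in zip(session, session[1:]):
            transitions[current][nxt] += 1
    return transitions

def predict_next(transitions, current):
    """Return the most frequently observed successor of `current`, or None."""
    followers = transitions.get(current)
    return followers.most_common(1)[0][0] if followers else None

# Hypothetical per-user sessions extracted from a web log
sessions = [
    ['/company', '/company/products', '/company/product1'],
    ['/company', '/company/products', '/company/product2'],
    ['/company', '/company/new', '/company/products', '/company/product1'],
]
model = train_markov(sessions)
print(predict_next(model, '/company/products'))  # '/company/product1', seen twice
```

A prefetcher would fetch the predicted page into the cache before the user actually requests it.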
4. Web Mining Taxonomy:
Web Mining can be broadly divided into three distinct categories, according to the
kinds of data to be mined:
4.1 Web Content Mining:

Web content mining techniques:
4.1.1 Classification of Multimedia Content and Websites:
In order to retrieve relevant knowledge, a system has to analyze web content first. Classification of web objects offers an automatic way to decide the relevance of web objects. Our focus in this area is the classification of websites or hosts. Since websites represent information on a more general level (e.g. a complete company) and are usually represented by multiple pages, classifying websites on top of webpage classification demands new algorithms.
4.1.2 Focused Crawling:
A focused web crawler takes a set of well-selected web pages exemplifying the user's interest. Searching for further relevant web pages, the focused crawler starts from the given pages and recursively explores the linked web pages. We are especially interested in crawling to retrieve complete websites, a task demanding new crawl strategies. While the crawlers used for refreshing the indices of web search engines perform a breadth-first search of the whole web, a focused crawler explores only a small portion of the web using a best-first search guided by the user's interest. Furthermore, we are interested in crawling for multimedia content on the web, retrieving topic-specific multimedia content instead of plain HTML documents.
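The best-first strategy described above can be sketched with a priority queue. In the sketch below, the link graph and relevance scores are hypothetical stand-ins for fetched pages and a topic classifier; a real crawler would fetch pages over the network and score them on the fly:

```python
import heapq

def focused_crawl(graph, relevance, seeds, limit=10):
    """Best-first crawl of an in-memory link graph (stands in for the web).
    Pages with higher relevance scores are expanded first."""
    frontier = [(-relevance[s], s) for s in seeds]
    heapq.heapify(frontier)
    visited = []
    seen = set(seeds)
    while frontier and len(visited) < limit:
        _, page = heapq.heappop(frontier)   # most relevant unvisited page
        visited.append(page)
        for link in graph.get(page, []):    # enqueue its outgoing links
            if link not in seen:
                seen.add(link)
                heapq.heappush(frontier, (-relevance.get(link, 0.0), link))
    return visited

# Hypothetical link graph and topic-relevance scores
graph = {'/seed': ['/mining', '/sports'], '/mining': ['/weblogs'], '/sports': []}
relevance = {'/seed': 1.0, '/mining': 0.9, '/sports': 0.1, '/weblogs': 0.8}
print(focused_crawl(graph, relevance, ['/seed']))
```

Note how the low-relevance /sports page is visited last even though it is linked directly from the seed; a breadth-first crawler would have reached it earlier.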
4.1.3 Clustering Web Objects:
Focused crawling retrieves large numbers of relevant data items. In order to offer fast and more specific access to the query results, clustering is an established method for grouping the retrieved information to achieve better understanding. If the query results are websites, or combined objects such as images and their text descriptions, new algorithms are needed to handle these combined data types and find meaningful clusterings.
Clustering: Clustering is the process of grouping a set of physical or abstract objects into classes of similar objects.

Requirements of clustering in web mining:

1. Scalability
2. Ability to deal with different types of attributes
3. Discovery of clusters with arbitrary shape
4. Minimal requirements for domain knowledge to determine input parameters
5. Ability to deal with noisy data
6. High dimensionality
7. Interpretability and usability

Fig: clustering
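As an illustration of the grouping step, here is a minimal pure-Python k-means sketch over hypothetical session feature vectors (pages viewed, seconds on site). This is for exposition only; a real system would use a library implementation, richer features, and address the requirements listed above:

```python
import math
import random

def kmeans(points, k, iters=20, seed=0):
    """Minimal k-means: returns (centroids, labels). Educational sketch only."""
    random.seed(seed)
    centroids = random.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # assign each point to the nearest centroid
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # recompute each centroid as the mean of its cluster
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return centroids, labels

# Hypothetical session features: (pages viewed, seconds on site)
sessions = [(2, 30), (3, 45), (2, 40), (20, 600), (22, 650), (19, 580)]
centroids, labels = kmeans(sessions, k=2)
print(labels)  # casual visitors and engaged visitors fall into two clusters
```

With these well-separated inputs, the three short sessions and the three long sessions end up in different clusters.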
Association:
Association analysis identifies items or events that occur (or do not occur) together; it is used to search for frequent patterns. Suppose, for instance, that we are given the AllElectronics relational database of purchases. A web mining system may find association rules such as:

age(X, "20..29") ^ income(X, "20K..29K") => buys(X, "CD player")
[support = 2%, confidence = 60%]
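The support and confidence figures in such a rule can be computed directly by counting transactions. A small sketch (the transaction data is hypothetical, in the spirit of the AllElectronics example):

```python
def rule_stats(transactions, antecedent, consequent):
    """Support and confidence of antecedent -> consequent over transactions."""
    n = len(transactions)
    both = sum(1 for t in transactions
               if antecedent.issubset(t) and consequent.issubset(t))
    ante = sum(1 for t in transactions if antecedent.issubset(t))
    support = both / n                       # P(antecedent and consequent)
    confidence = both / ante if ante else 0  # P(consequent | antecedent)
    return support, confidence

# Hypothetical purchase records, each a set of attribute values and items
transactions = [
    {'age:20-29', 'income:20K-29K', 'CD player'},
    {'age:20-29', 'income:20K-29K', 'CD player'},
    {'age:20-29', 'income:20K-29K', 'laptop'},
    {'age:30-39', 'income:40K-49K', 'CD player'},
]
support, confidence = rule_stats(
    transactions, {'age:20-29', 'income:20K-29K'}, {'CD player'})
print(support, confidence)
```

Here the rule holds in 2 of 4 transactions (support 50%) and in 2 of the 3 transactions matching the antecedent (confidence about 67%).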
4.2 Web Structure Mining:
Web structure mining can be regarded as the process of discovering structure information from the Web. The structure of a typical Web graph consists of Web pages as nodes and hyperlinks as edges connecting related pages. This type of mining can be further divided into two kinds, based on the kind of structural data used.

There has been a significant body of work on hyperlink analysis.

Document structure: In addition, the content within a Web page can be organized in a tree-structured format, based on the various HTML and XML tags within the page. Mining efforts here have focused on automatically extracting document object model (DOM) structures out of documents.
Hyperlinks: A hyperlink is a structural unit that connects a Web page to a different location, either within the same Web page or on a different Web page. A hyperlink that connects to a different part of the same page is called an intra-document hyperlink, and a hyperlink that connects two different pages is called an inter-document hyperlink.
Web structure mining techniques:

Generating a structural summary about the web site and its web pages:
Categorizing the web pages and the related information at the inter-domain level depending upon the hyperlinks, discovering the web page structure, and discovering the nature of the hierarchy of hyperlinks in the website.

Finding information about web pages:
- Retrieving information about the relevance and the quality of a web page.
- Finding the authoritative sources on a topic and its content.
Inference on hyperlinks:
A web page contains not only information but also hyperlinks, which carry a huge amount of annotation. A hyperlink identifies the author's endorsement of the other web page.
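Link-based endorsement of this kind is what algorithms such as PageRank quantify: a page linked to by many (highly ranked) pages accumulates a high rank itself. A minimal iterative sketch over a hypothetical hyperlink graph:

```python
def pagerank(links, damping=0.85, iters=50):
    """Iterative PageRank over a dict page -> list of outgoing links."""
    pages = set(links) | {p for outs in links.values() for p in outs}
    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iters):
        new = {p: (1 - damping) / len(pages) for p in pages}
        for page in pages:
            outs = links.get(page, [])
            if outs:
                share = damping * rank[page] / len(outs)
                for target in outs:          # spread rank along hyperlinks
                    new[target] += share
            else:                            # dangling page: spread evenly
                for target in pages:
                    new[target] += damping * rank[page] / len(pages)
        rank = new
    return rank

# Hypothetical hyperlink graph: several pages endorse /home
links = {
    '/a': ['/home'],
    '/b': ['/home', '/a'],
    '/c': ['/home'],
    '/home': ['/a'],
}
rank = pagerank(links)
print(max(rank, key=rank.get))  # '/home' receives the most endorsements
```

The ranks form a probability distribution over pages, and the most-endorsed page scores highest.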
4.3 Web Usage Mining:
Web usage mining is the application of data mining techniques to discover interesting usage patterns from Web data, in order to understand and better serve the needs of Web-based applications. Usage data captures the identity or origin of Web users along with their browsing behavior at a Web site. Web usage mining itself can be classified further depending on the kind of usage data considered. The main web usage mining techniques are described below.
4.3.1 Data Preparation:
Data Collection:
Data collection is the first step of web usage mining; the authenticity and integrity of the data directly affect how smoothly the subsequent work is carried out and the quality of the final recommendations of characteristic services. Therefore, scientific, reasonable and advanced techniques must be used to gather the data. At present, web usage mining draws on three main data origins: server data, client data and intermediate data (proxy server data and packet detection).
Data Selection:
Data relevant to the analysis task is retrieved from the web.
Data Cleaning:
The purpose of data cleaning is to eliminate irrelevant items; such techniques are important for any type of web log analysis, not only data mining. According to the purposes of the particular mining application, irrelevant records in the web access log are eliminated during data cleaning. Since the target of web usage mining is to obtain the users' travel patterns, the following two kinds of records are unnecessary and should be removed:

1. Records of graphics, videos and format information. These records have filename suffixes such as GIF, JPEG and CSS, which can be found in the URI field of each record.
2. Records with a failed HTTP status code, identified by examining the status field of every record in the web access log.
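These two cleaning rules translate directly into a filter over log records. A sketch (the record layout and suffix list are illustrative, not from any particular tool):

```python
# Hypothetical cleaning pass implementing the two rules above:
# drop multimedia/format requests by suffix, drop failed requests by status.
IRRELEVANT_SUFFIXES = ('.gif', '.jpeg', '.jpg', '.css', '.js', '.png')

def clean_log(records):
    """Keep only records that describe successful page views."""
    kept = []
    for uri, status in records:
        if uri.lower().endswith(IRRELEVANT_SUFFIXES):
            continue              # rule 1: graphics / format files
        if not (200 <= status < 300):
            continue              # rule 2: failed HTTP status codes
        kept.append((uri, status))
    return kept

records = [('/index.html', 200), ('/logo.gif', 200),
           ('/style.css', 200), ('/page.html', 500), ('/about.html', 200)]
print(clean_log(records))  # only the two successful HTML page views remain
```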
4.3.2 Data Mining
Navigation Patterns:
Web page hierarchy of web site:
Example:
- 70% of users who accessed /company/product2 did so by starting at /company and proceeding through /company/new, /company/products and /company/product1.
- 80% of users who accessed the site started from /company/products.
- 65% of users left the site after four or fewer page references.
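Statistics like these come from counting navigation paths across sessions. A sketch of contiguous-subsequence (path) counting over hypothetical sessions mirroring the /company example:

```python
from collections import Counter

def frequent_paths(sessions, length, min_count):
    """Count contiguous subsequences (paths) of a given length across sessions,
    keeping only those occurring at least min_count times."""
    counts = Counter()
    for session in sessions:
        for i in range(len(session) - length + 1):
            counts[tuple(session[i:i + length])] += 1
    return {path: c for path, c in counts.items() if c >= min_count}

# Hypothetical sessions over the /company page hierarchy
sessions = [
    ['/company', '/company/new', '/company/products', '/company/product1'],
    ['/company', '/company/new', '/company/products', '/company/product2'],
    ['/company/products', '/company/product1'],
]
print(frequent_paths(sessions, length=2, min_count=2))
```

The surviving paths are exactly the frequent subsequences that make up a path profile.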
Sequential Patterns:

Fig: mining results
4.3.3 Web usage data:
The record of what actions a user takes with his mouse and keyboard while
visiting a site.
Sources:
- Server access logs
- Server referrer logs
- Agent logs
- Client-side cookies
- User profiles
- Search engine logs
- Database logs
Transfer/access log: The transfer/access log contains detailed information about each request that the server receives from users' web browsers.

Agent log: The agent log lists the browsers (including version number and platform) that people are using to connect to your server.

Referrer log: The referrer log contains the URLs of pages on other sites that link to your pages. That is, if a user reaches one of the server's pages by clicking on a link from another site, the URL of that site will appear in this log.
Error log: The error log keeps a record of errors and failed requests. A request may fail if the page contains a link to a file that does not exist, or if the user is not authorized to access a specific page or file.
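Before any of these logs can be mined per user, individual hits are typically grouped into sessions. A common heuristic, sketched below on hypothetical hits, starts a new session whenever an IP has been idle for more than 30 minutes:

```python
from datetime import datetime, timedelta

SESSION_TIMEOUT = timedelta(minutes=30)  # widely used heuristic threshold

def sessionize(hits):
    """Group (ip, timestamp, url) hits into sessions per IP, splitting a
    session whenever two consecutive hits are over the timeout apart."""
    sessions = {}
    for ip, ts, url in sorted(hits, key=lambda h: (h[0], h[1])):
        user_sessions = sessions.setdefault(ip, [])
        if user_sessions and ts - user_sessions[-1][-1][1] <= SESSION_TIMEOUT:
            user_sessions[-1].append((url, ts))   # continue current session
        else:
            user_sessions.append([(url, ts)])     # start a new session
    return sessions

t0 = datetime(2012, 2, 23, 6, 0)
hits = [('66.249.71.6', t0, '/'),
        ('66.249.71.6', t0 + timedelta(minutes=5), '/robots.txt'),
        ('66.249.71.6', t0 + timedelta(hours=2), '/'),   # idle gap: new session
        ('180.76.5.92', t0, '/')]
sessions = sessionize(hits)
print(len(sessions['66.249.71.6']))  # 2 sessions for the first IP
```

The resulting sessions are the input to the path-profile, clustering and prediction techniques described earlier.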
4.3.4 Web Server Data:
These correspond to the user logs collected at the Web server. Typical data collected at a Web server include IP addresses, page references, and the access times of users.
4.3.5 Application Server Data:
Commercial application servers (e.g. WebLogic) have significant features in their frameworks that enable e-commerce applications to be built on top of them with little effort. A key feature is the ability to track various kinds of business events and log them in application server logs.
4.3.6 Application Level Data:
Finally, new kinds of events can always be defined in an application, and logging can be turned on for them, generating histories of these specially defined events.
5. Advantages / Merits:
Web usage mining has many advantages that make this technology attractive to many corporations, including government agencies. The predictive capability of mining applications can benefit society by identifying criminal activities. Companies can establish better customer relationships by giving customers exactly what they need; they can understand customer needs better and react to them faster. Companies can find, attract and retain customers, and can save on production costs by utilizing the acquired insight into customer requirements. This technology has enabled e-commerce to do personalized marketing, which eventually results in higher trade volumes. Government agencies use this technology to classify threats and fight against terrorism. Companies can increase profitability through target pricing based on the profiles created. They can even identify customers who may defect to a competitor, and try to retain them by providing promotional offers, thus reducing the risk of losing those customers.
Further merits:
- Easy to implement.
- Improves the quality of both public and personalized search engines.
- Enables personalized search engines that understand a person's search queries in a personal way by analyzing and profiling the user's search behaviour.
6. Disadvantages/ Demerits:
Some mining algorithms might use controversial attributes such as sex, race, religion, or sexual orientation to categorize individuals. These practices might contravene anti-discrimination legislation. The applications make it hard to identify the use of such controversial attributes, and there is no strong rule against the usage of such algorithms with such attributes. This process could result in the denial of a service or a privilege to an individual based on race, religion or sexual orientation; at present this situation can be avoided only by the high ethical standards maintained by the data mining company. The collected data is made anonymous so that the obtained data and patterns cannot be traced back to an individual. Although it might look as if this poses no threat to one's privacy, much extra information can be inferred by the application by combining two separate pieces of data from the user. Another important concern is that companies collecting the data for a specific purpose might use it for a totally different purpose, which essentially violates the user's interests. Web usage mining by itself does not create issues, but this technology, when used on data of a personal nature, might cause them. The most criticized ethical issue involving web usage mining is the invasion of privacy. Privacy
is considered lost when information concerning an individual is obtained, used, or disseminated, especially if this occurs without the individual's knowledge or consent. The obtained data is analyzed and clustered to form profiles; the data is made anonymous before clustering so that no personal profiles are formed. Even so, these applications de-individualize users by judging them by their mouse clicks. De-individualization can be defined as a tendency to judge and treat people on the basis of group characteristics instead of on their own individual characteristics and merits. The growing trend of selling personal data as a commodity encourages website owners to trade personal data obtained from their sites. This trend has increased the amount of data being captured and traded, increasing the likelihood of one's privacy being invaded. The companies which buy the data are obliged to make it anonymous, and these companies are considered the authors of any specific release of mining patterns. They are legally responsible for the contents of a release; any inaccuracies in it can result in serious lawsuits, but there is no law preventing them from trading the data.
7. Applications:

a. Search engines
b. Similarity measures
c. Ontology
d. Matching techniques
e. Recognition technology
f. Summarization
g. E-commerce
h. Content management
i. Database querying
j. Information aggregation
7.1 Search Engines:

Given the rate of growth of the Web, scalability of search engines is a key issue, as the amount of hardware and network resources needed is large and expensive. In addition, search engines are popular tools, so they face heavy constraints on query answer time. Efficient use of resources can therefore improve both scalability and answer time. One tool for achieving these goals is Web mining, which has three branches: link mining, usage mining, and content mining. One important analysis in all these cases is dynamic behavior. Here we give examples of link and usage mining related to search engines, as well as the related Web dynamics.
7.2 Similarity Measures:

Ranking model construction is an important topic in information retrieval and web mining. Recently, many approaches based on the idea of learning to rank have been proposed for this task, and most of them attempt to score all documents of different queries with a single function. A distributional similarity measure has been proposed for query-dependent ranking. In the query-dependent ranking framework, an individual ranking model is constructed for each training query and its associated documents. When a new query is asked, the documents retrieved for it are ranked according to the scores determined by a joint ranking model which is
combined from the individual models of similar training queries. The distributional similarity measure is used to calculate the similarities between queries. Experimental results show that this method is more effective than other approaches.
7.3 Ontology:

The World Wide Web today provides users access to extremely large websites containing much information of educational and commercial value. Due to the unstructured and semi-structured nature of web pages and the idiosyncratic design of websites, it is a challenging task to develop digital libraries for organising and managing digital content from the web. Web mining research over the last ten years has, on the other hand, made significant progress in categorising and extracting content from the web. An ontology represents a set of concepts and their interrelationships relevant to some knowledge domain; the knowledge provided by an ontology is extremely useful in defining the structure and scope for mining web content.
7.4 Recognition Technology:

The explosive growth of the internet has made it increasingly necessary for users to employ automatic tools to find, extract, filter and evaluate the resources available over the internet. There are powerful tools for finding information by category or by content, such as Yahoo, Google, etc. For these searches we introduce keywords, and the tools determine the web pages that contain those words. In trying to satisfy users' requirements, these consultations often return inconsistent documents, or documents that fulfil the search criteria but not the user's interest.
There is thus a need for new technologies that help us use the content of the web more efficiently. For this reason, in recent years a series of techniques allowing advanced processing of data on the internet have been developed. These techniques carry out a deep analysis in an automatic way, and they belong to the area known as web mining.
7.5 Summarization:

Hypermedia has emerged as a primary means for storing and structuring information, yet due to the continuously increasing size of this infrastructure, it is becoming ever more difficult for users to understand and navigate through such sites. To overcome these obstacles it is essential to use techniques that recover the web author's intentions and superimpose them on the user's retrieval context when summarizing websites.

Although most of the developing world is likely to first access the Internet through mobile phones, mobile devices are constrained by screen space, bandwidth and the user's limited attention span. Single-document summarization techniques have the potential to simplify information consumption on mobile phones by presenting only the most relevant information contained in the document.
7.6 E-commerce:

Nowadays, the web is an important part of our daily life, and it is now the best medium for doing business. Large companies rethink their business strategies, using the web to improve business. Business carried out on the Web offers potential customers or partners a place where their products and specific business can be found. Business presence through a company
web site has several advantages, as it breaks the barriers of time and space compared with a physical office. To differentiate themselves in the Internet economy, winning companies have realized that e-commerce transactions are more than just buying and selling; appropriate strategies are key to improving competitive power. One effective technique used for this purpose is data mining, the process of extracting interesting knowledge from data; web mining is the use of data mining techniques to extract information from web data.
7.7 Content Management:

With the rapid growth in business size, today's businesses orient towards electronic technologies; Amazon.com and eBay.com are some of the major stakeholders in this regard. Unfortunately, the enormous amount of hugely unstructured data on the web, even for a single commodity, has become a cause of ambiguity for consumers. Extracting valuable information from such ever-increasing data is an extremely tedious task and is fast becoming critical to the success of businesses. Web content mining can play a major role in solving these issues. It involves using efficient algorithmic techniques to search and retrieve the desired information from the seemingly impossible-to-search unstructured data on the Internet. Application of web content mining can be very encouraging in the areas of customer relations modeling, billing records, logistics investigations, product cataloguing and quality management. Here we review some very interesting, efficient yet implementable techniques from the field of web content mining and study their impact in areas specific to business user needs, focusing both on the customer as well as the
producer. The techniques reviewed include mining by developing a knowledge-base repository of the domain, iterative refinement of user queries for personalized search, using a graph-based approach for the development of a web crawler, and filtering information for personalized search using website captions. These techniques have been analyzed and compared on the basis of their execution time and the relevance of the results they produce for a particular search.
7.8 Information Aggregation:

Web data extraction services provide robust, cutting-edge solutions for extracting data from websites. Web SQL, for example, is used for creating turnkey web extraction applications such as price collectors and patent information aggregators.

XML Miner is a system and class library for mining data and text expressed in XML, extracting knowledge and re-using it in products and applications in the form of fuzzy-logic expert system rules.
8. Conclusion

The purpose of this report is to advocate the discovery of actionable knowledge from Web logs. We have presented several examples of actionable Web log mining. In our future work, we will further explore other types of actionable knowledge in Web applications, including the extraction of content knowledge and
knowledge integration from multiple Web sites. The first method is to mine a Web log for Markov models that can be used to improve the caching and prefetching of Web objects. A second method is to use the mined knowledge to build better, adaptive user interfaces. A third application is to use knowledge mined from a query web log to improve the search performance of an Internet search engine. Actionable knowledge is particularly attractive for Web applications because it can be consumed by machines rather than human developers. Furthermore, the effectiveness of the knowledge can be immediately put to the test, placing the merits of each type of knowledge, and of the methods for discovering it, under more objective scrutiny than before.
9. References
1. Qingtian Han, Xiaoyan Gao, Wenguo Wu, "Study on Web Mining Algorithm Based on Usage Mining", 9th International Conference on Computer-Aided Industrial Design and Conceptual Design (CAID/CD 2008), pp. 1121-1124, 2008.
2. Heydari, M., Helal, R.A., Ghauth, K.I., "A Graph-Based Web Usage Mining Method Considering Client Side Data", International Conference on Electrical Engineering and Informatics (ICEEI '09), Vol. 1, pp. 147-153, 2009.
3. Salin, S., Senkul, P., "Using Semantic Information for Web Usage Mining Based Recommendation", 24th International Symposium on Computer and Information Sciences (ISCIS 2009), pp. 236-241, 2009.
4. Chih-Hung Wu, Yen-Liang Wu, Yuan-Ming Chang, Ming-Hung Hung, "Web Usage Mining on the Sequences of Clicking Patterns in a Grid Computing Environment", International Conference on Machine Learning and Cybernetics (ICMLC), Vol. 6, pp. 2909-2914, 2010.
5. Gang Fang, Jia-Le Wang, Hong Ying, Jiang Xiong, "A Double Algorithm of Web Usage Mining Based on Sequence Number", International Conference on Information Engineering and Computer Science (ICIECS), pp. 1-4, 2009.
6. Raghavendra, P.S., Chowdhury, S.R., Kameswari, S.V., "Comparative Study of Neural Networks and k-Means Classification in Web Usage Mining", International Conference for Internet Technology and Secured Transactions (ICITST), pp. 1-7, 2010.
7. Hussain, T., Asghar, S., Fong, S., "A Hierarchical Cluster Based Preprocessing Methodology for Web Usage Mining", 6th International Conference on Advanced Information Management and Service (IMS), pp. 472-477, 2010.
8. Khosravi, M., Tarokh, M.J., "Dynamic Mining of Users' Interest Navigation Patterns Using Naive Bayesian Method", IEEE International Conference on Intelligent Computer Communication and Processing (ICCP), pp. 119-122, 2010.
9. Etminani, K., Delui, A.R., Yanehsari, N.R., Rouhani, M., "Web Usage Mining: Discovery of the Users' Navigational Patterns Using SOM", First International Conference on Networked Digital Technologies (NDT '09), pp. 224-249, 2009.
10. Shinde, S.K., Kulkarni, U.V., International Conference on Advanced Computer Theory and Engineering, pp. 973-977, 2008.
11. Yang Bin, Dong Xiangjun, Shi Fufu, "Research of Web Usage Mining Based on Negative Association Rules", International Forum on Computer Science-Technology and Applications (IFCSTA '09), Vol. 1, pp. 196-199, 2009.
12. Hussain, T., Asghar, S., Masood, N., "Web Usage Mining: A Survey on Preprocessing of Web Log File", International Conference on Information and Emerging Technologies (ICIET), pp. 1-6, 2010.