WEB INTELLIGENCE Seminar Report Submitted in partial fulfilment of the requirements for the award of the degree of Bachelor of Technology in Computer Science Engineering of Cochin University Of Science And Technology by NIJIL Y (12080050) DIVISION OF COMPUTER SCIENCE SCHOOL OF ENGINEERING COCHIN UNIVERSITY OF SCIENCE AND TECHNOLOGY KOCHI-682022

Web intelligence-future of next generation web


DESCRIPTION

Web intelligence is the area of study and research concerned with applying artificial intelligence and information technology on the web in order to create the next generation of internet-based products, services and frameworks. This seminar was presented by Nijil Y of SOE, CUSAT.


WEB INTELLIGENCE Seminar Report

Submitted in partial fulfilment of the requirements

for the award of the degree of

Bachelor of Technology

in

Computer Science Engineering

of

Cochin University Of Science And Technology

by

NIJIL Y (12080050)

DIVISION OF COMPUTER SCIENCE

SCHOOL OF ENGINEERING

COCHIN UNIVERSITY OF SCIENCE AND TECHNOLOGY

KOCHI-682022



DIVISION OF COMPUTER SCIENCE

SCHOOL OF ENGINEERING

COCHIN UNIVERSITY OF SCIENCE AND TECHNOLOGY

KOCHI-682022

Certificate

Certified that this is a bonafide record of the seminar entitled

“WEB INTELLIGENCE”

Presented by the following student

NIJIL Y

of the VIIth semester, Computer Science and Engineering in the year 2010

in partial fulfillment of the requirements for the award of the Degree of

Bachelor of Technology in Computer Science and Engineering of Cochin

University of Science and Technology.

Mr. SUDEEP EDAYILAM, Seminar guide

Dr. DAVID PETER, Head Of Division


ACKNOWLEDGEMENT

I thank GOD almighty for guiding me throughout the seminar. I would like to thank all those who have contributed to the completion of the seminar and helped me with valuable suggestions for improvement.

I am extremely grateful to Dr. David Peter, Head Of Division, Division of Computer Science, for providing me with the best facilities and atmosphere for creative work, guidance and encouragement. I am profoundly indebted to my seminar guide Mr. Sudheep Elayidom, Sr. Lecturer, Division of Computer Science, for all the help and support extended to me. I thank all staff members of my college and my friends for extending their cooperation during my seminar.

Above all I would like to thank my parents, without whose blessings I would not have been able to accomplish my goal.

NIJIL Y


ABSTRACT

Web Intelligence is a new direction for scientific research and development that explores the fundamental roles as well as practical impacts of artificial intelligence and advanced information technology for the next generation of Web-empowered systems, services, and environments. Web Intelligence is regarded as the key research field for the development of the Wisdom Web (including the Semantic Web). The Web revolutionizes the way we gather, process, and use information. Despite current technological advances, we still cannot predict what the Web’s next paradigm shift will be. However, we propose that this change will transform the Web into an intelligent entity, hence the term Web intelligence.

The next-generation Web will go beyond improved information search and knowledge queries and will help people achieve better ways of living, working, playing, and learning. To fulfil its potential, the intelligent Web’s design and development must incorporate and integrate several fundamental capabilities. A few of its capabilities are Reflexive server propagation, Growth Specialization, Autocatalysis, etc. Intelligent Web agents can use the Problem Solver Mark-up Language (PSML) to specify their roles, settings, and relationships with any other services. The intelligent Web must also have the ability to process and understand natural language. It must understand and correctly judge the meaning of concepts expressed in words, such as “good,” “best,” and “season”. WI research incorporates knowledge from existing disciplines, such as artificial intelligence and information technology, in a totally new domain. At the same time, Web Intelligence research also enriches these established disciplines as it introduces new topics and challenges.


TABLE OF CONTENTS

CHAPTER NO. CHAPTER TITLE PAGE NO.

1 Introduction 1

2 Perspectives Of WI 4

3 Intelligence Exploration 8

3.1 A New Field Of Science, Technology And Engineering 8

3.2 Design Philosophy And Principles Of The Web 8

3.3 The Laws Of The Web 9

3.4 The Web Revolution: One Link At A Time 10

3.5 The More Things Change, The More They Stay The Same 11

4 Components Of Web Intelligence 13

4.1 Web Data 13

4.2 Representation 15

4.3 PSML And Web Inference Engine 17

4.4 Social Network Intelligence 17

5 Computational Web Intelligence 18

5.1 Web Uncertainty 19

5.2 Computational Web Intelligence For Web Uncertainty 19

5.3 Granular Web Intelligence For Web Uncertainty 21

6 Trends And Challenges Of WI Related Research And Development 23

6.1 Intelligent Web Agents 24

6.2 From Wa To Web-Based Services 25

7 Semantic Search Engine 28

8 Conclusion 29

References 30


Web Intelligence

Division Of Computer Science , SOE CUSAT Page 1

CHAPTER 1

INTRODUCTION

With the rapid growth of the Internet and the World Wide Web (WWW), we have now entered

into a new information age. The Web provides a totally new medium for communication, which goes

far beyond the traditional communication media, such as radio, telephone and television. The

Web has significant impacts on both academic research and ordinary daily life. It revolutionizes

the way in which information is gathered, stored, processed, presented, shared, and used. The

Web offers new opportunities and challenges for many areas, such as business, commerce,

marketing, finance, publishing, education, research and development. For computer scientists, the

Web introduces many new research topics and provides a new platform to reconsider old

problems. It might be high time to create a new sub-discipline of computer science covering

theories and technologies related to the Web. Web Intelligence is our proposal for this purpose.

Through the billions of Web pages created with HTML and XML, or generated

dynamically by underlying Web database service engines, the Web captures almost all aspects of

human endeavor and provides a fertile ground for data mining. However, searching,

comprehending, and using the semi-structured information stored on the Web pose a significant

challenge because this data is more sophisticated and dynamic than the information that

commercial database systems store. To supplement keyword-based indexing, which forms the

cornerstone for Web search engines, researchers have applied data mining to Web-page ranking.

In this context, data mining helps Web search engines find high-quality Web pages.
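
To illustrate how link data can feed Web-page ranking, the following is a minimal sketch of a PageRank-style power iteration. The report does not name a specific ranking algorithm, so the choice of PageRank, the function name pagerank, the damping value 0.85, and the four-page toy link graph are all assumptions made purely for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Rank pages by link structure with a simple power iteration.

    links maps each page to the list of pages it links to.
    """
    # Collect every page that appears as a source or as a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page receives a small baseline share of rank.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank on equally to the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical four-page web, used only for illustration.
toy_web = {
    "home": ["products", "about"],
    "products": ["home"],
    "about": ["home", "products"],
    "orphan": ["home"],
}
ranks = pagerank(toy_web)
best = max(ranks, key=ranks.get)
print(best, ranks)
```

In this toy graph the page that most other pages link to ends up with the highest score, while a page that nothing links to keeps only the baseline share.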

WI explores the fundamental and practical impact that artificial intelligence and advanced

information technology will have on the next generation of Web-empowered systems, services,

and environments. In an era dominated by the World Wide Web, Grid computing, intelligent-

agent technology, and ubiquitous social computing, WI represents information technology’s next

challenge.

Motivations and Justifications for WI

The introduction of Web Intelligence (WI) can

be motivated and justified from both academic and industrial perspectives. Two features of the

Web make it a useful and unique platform for computer applications and research: its size and

complexity. The Web contains a huge amount of interconnected Web documents known as Web

pages. For example, the popular search engine Google claims that it can search 1,346,966,000

pages as of February 2001. The sheer size of the Web leads to difficulties in the storage,


management, and efficient and effective retrieval of Web documents. The complexity of the Web,

in terms of connectivity and diversity of Web documents, forces us to reconsider many existing

information systems, as well as theories, methodologies and technologies underlying those

systems. One has to deal with a heterogeneous collection of structured, unstructured, semi-

structured, interrelated, and distributed Web documents consisting of texts, images and sounds,

instead of a homogeneous collection of structured and unrelated objects. The latter is the subject of

study of many conventional information systems, such as databases, information retrieval, and

multi-media systems. To accommodate the needs of the Web, one needs to study issues on the

design and implementation of the Web-based information systems by combining and extending

results from existing intelligent information systems. Existing theories and technologies need

to be modified or enhanced to deal with the complexity of the Web. Although individual Web-based

information systems are constantly being deployed, advanced issues and techniques for

developing and for benefiting from the Web remain to be systematically studied. The challenges

brought by the Web to computer scientists may justify the creation of the new sub-discipline, WI,

for carrying out Web-related research.

The Web increases the availability and accessibility of information to a much

larger community than any other computer application. The introduction of Personal Computers

(PCs) brought computational power to ordinary people. It is the Web that delivers

information more effectively to everyone's fingertips. The Web, no doubt, offers a new means for

sharing and transmitting information unmatched by other media. The revolution started by the

Web is just beginning. New business opportunities, such as e-commerce, e-banking, and

e-publication, will increase with the maturity of the Web. One can hardly overemphasize the

impact of the Web on the business and industrial world. The creation of a new sub-discipline

devoted to Web-related research and applications might have significant value in the future.

The need for WI may be further illustrated by the current fast-growing research and industrial

activities centered on it. We searched the Web by using the keyword “Web Intelligence” through

several search engines in February 2001.


What is Web Intelligence?

“Web Intelligence (WI) exploits Artificial Intelligence (AI) and advanced Information

Technology (IT) on the Web and Internet.”

This definition has the following implications. The basis of WI is AI and IT. The “I”

happens to be shared by both “AI” and “IT”, although with different meanings in them, and “W”

defines the platform on which WI research is carried out. The goal of WI is the joint goals of AI

and IT on the new platform of the Web. That is, WI applies AI and IT for the design and

implementation of Intelligent Web Information Systems (IWIS). An IWIS should be able to

perform functions normally associated with human intelligence, such as reasoning, learning, and

self-improvement. There might not be a standard and non-controversial definition of WI,

just as there is no standard definition of AI. One may argue that our definition of WI

focuses more on the software aspects of the Web. It is not our intention to exclude any research

topic using the proposed definition. The term, Web Intelligence, should be considered as an

umbrella or a label of a new branch of research centered on the Web. Our definition simply states

the scopes and goals of WI. This allows us to include any theories and technologies that either fall

in the scopes or aim at the same goals. To complement the formal definition, we try to make the

picture clearer by listing topics to be covered by WI.

WI will be an ever-changing research branch. It will evolve with the development of the

Web as a new medium for information gathering, storage, processing, delivery and utilization. It is

our expectation that WI will evolve into an inseparable research branch of computer science.

Although no one can predict the future in detail and without uncertainty, it is clear that WI will

have a huge impact on the application of computers, which in turn will affect our everyday lives.


CHAPTER 2

Perspectives of WI

As a new branch of research, Web Intelligence exploits Artificial Intelligence (AI) and

Information Technology (IT) on the Web. On the one hand, it may be viewed as applying results

from these existing disciplines to a totally new domain. On the other hand, WI may also introduce

new problems and challenges to the established disciplines. WI may also be viewed as an

enhancement or an extension of AI and IT. It remains to be seen if WI would become a sub-area

of AI and IT or a child of a successful marriage of AI and IT. However, no matter what happens,

studies on WI can benefit a great deal from the results, experience, success and lessons of AI and

IT. In their very popular textbook, Russell and Norvig examined different definitions of artificial

intelligence from eight other textbooks in order to decide what exactly AI is. They observed that

the definitions vary along two dimensions. One dimension deals with the functionality and

ability of an AI system, ranging from the thought processes and reasoning ability of the systems to the

behavior of the systems. The other dimension deals with the design philosophy of AI systems,

ranging from imitating human problem solving to making rational decisions. The combination of

the two dimensions results in four categories of AI systems, adopted from Russell and Norvig:

Systems that think like humans. | Systems that think rationally.

Systems that act like humans. | Systems that act rationally.

This classification provides a basis for the studies of various views and approaches for AI.

It also clearly defines goals in the design of AI systems. According to Russell and Norvig, the categories

correspond to four approaches: the cognitive modeling approach (thinking humanly), the Turing

test approach (acting humanly), the laws of thought approach (thinking rationally), and the

rational agent approach (acting rationally). The two rows separating AI systems in terms of

thinking and acting may not be the most suitable classification. Action is normally the final result of

a thinking process. One may argue that the class of systems acting humanly is a superset of the

class of systems thinking humanly. In contrast, the separation of the human-centered approach and the

rationality-centered approach may have significant implications for the study of AI. While earlier

research on AI focused more on the human-centered approach, the rationality-centered approach

has received more attention recently.


The first column is centered around humans and leads to the treatment of AI as an

empirical science involving hypothesis and experimental confirmation. A human-centered

approach represents the descriptive view of AI. Under this view, a system is designed by

imitating human problem solving. This implies that a system should have the usual human

capabilities such as knowledge representation, natural language processing, reasoning, planning

and learning. The performance of an AI system is measured or evaluated through the Turing

test. A system is said to be intelligent if it provides human-level performance. Such a descriptive

view dominates the majority of earlier studies of expert systems, a special type of AI system.

The second column represents the prescriptive or normative view of AI. It deals with theoretical

principles and laws that an AI system must follow, instead of imitating humans. That is, a

rationalist approach deals with an ideal concept of intelligence, which may be independent of

human problem solving. An AI system is rational if it does the right thing and makes the right

decision. The normative view of AI is based on well-established disciplines such as

mathematics, logic, and engineering. The descriptive and normative views also reflect the

experimental and theoretical aspects of AI research.

The experimental study represents the descriptive view. It covers theories and models for

the explanation of the workings of the human mind, and applications of AI to solving problems

that normally require human intelligence. The theoretic study aims at the development of theories

of rationality, and focuses on the foundations of AI. The two views are complementary to each

other. Studies in one direction may provide valuable insights into the other. Web Intelligence

concerns the design and development of intelligent Web information systems. The previous

framework for the study of AI can be immediately applied to that of Web Intelligence. More

specifically, we can cluster research in WI into the descriptive approach and the normative

approach, and cluster Web information systems in terms of thinking and acting. Various research

topics can be identified and grouped accordingly. Like AI, a foundation of WI can be established

by drawing results from many related disciplines:

• Mathematics: computation, logic, probability.

• Applied Mathematics and Statistics: algorithms, non-classical logics, decision theory,

information theory, measurement theory, utility theory, theories of uncertainty,

approximate reasoning.


• Psychology: cognitive psychology, cognitive science, human-machine interaction, user

interface.

• Linguistics: computational linguistics, natural language processing, machine translation.

• Information Technology: information science, databases, information retrieval systems,

knowledge discovery and data mining, expert systems, knowledge-based systems, decision

support systems, intelligent information agents.

The topics under each entry are only intended as examples. They do not form an exhaustive

list. In the development of AI, we have witnessed the formation of many new sub-branches,

such as knowledge-based systems, artificial neural networks, genetic algorithms, and

intelligent agents. Recently, non-classical AI topics have received much attention under the name

of computational intelligence. Computational intelligence focuses on the computational aspect of

intelligent systems. The application of AI in other disciplines also leads to new techniques in the

corresponding fields. For instance, Business Intelligence (BI) is a result of applying artificial


intelligence to the business domain. Artificial Intelligence in Medicine has also proved to be a

successful application. When viewing WI in such settings, we can identify at least two of its roles.

WI may be interpreted as “Web-based Artificial Intelligence”, the study of particular aspects of AI

in the context of the Web, in parallel to the study of computational intelligence.

WI may also be interpreted as “Artificial Intelligence on the Web”, which regards it as a

new application of AI. A more practical goal of WI is the design and implementation of intelligent

Web information systems (IWIS). It should be realized that an IWIS is an integrated system

containing many sub-systems. To design such a system, it is necessary to apply a variety of

theories and technologies.

In his work on vision, Marr convincingly made the point that a full understanding of an intelligent

system involves explanations at various levels. The same argument is applicable to the

development of an IWIS. We can identify at least two levels, the conceptual formulation and

physical implementation. The conceptual formulation deals with foundations of IWIS, while

physical implementation is concerned with the construction of an IWIS. The former depends on

mathematics and logic, and the latter depends on algorithms and programming. Each level may be

further divided into more sub-levels. Research in WI should include any topics at different levels.


CHAPTER 3

WEB INTELLIGENCE EXPLORATION

Web intelligence further explores the transformation of knowledge from information, and

wisdom from knowledge, in its search of the Wisdom Web. Some of the important issues,

although they may not be well conceived yet, are briefly discussed in this section.

3.1 A new field of science, technology and engineering

The Web, as a new technical and social phenomenon and a growing organism, creates a

new field of science that involves a multi-disciplinary study and enquiry for the understanding of

the Web and its relationships to us. The Web may be studied from many perspectives, such as

philosophical foundations, theoretical and technical foundations, applications, and social impacts.

Some examples are given below:

• Webology,

• Web Science,

• Web Technology,

• Web Engineering,

• Weblization.

The term webology was coined to label the study of the Web as a new field of science. By postfixing

the phrase science and technology, one clearly states the scope. By postfixing the phrase

engineering, one emphasizes the design and implementation aspects. Together, they are driving

forces of the information revolution. The term weblization concisely summarizes the development

of the Web and web-based systems so far. The process of weblization involves building the Web

itself and reconstructing existing tools and systems on the web platform.

3.2 Design philosophy and principles of the Web

The design philosophy and principles set the direction of web growth and its ultimate

destiny. It may be difficult to compile a non-controversial and complete list. However, examples

include the decentralization principle, the universalist principles, the minimum constraint principle,


and the separation of form and content principle. The decentralization principle is inherited from the

decentralization property of the Internet. The universalist principles cover universal connectivity,

universal accessibility, as well as diversity of web contents and users. The minimum constraint

principle suggests that the Web should be as un-constraining as possible to realize its universality.

The separation principle deals with the presentation of web documents, in order to achieve

location, machine, and application independence. The design principles ensure that the Web has

desirable properties such as decentralization, adaptability, evolvability, scalability, universal

connectivity and accessibility, affordability, anonymity, diversity, and many others. The Web is

able to support communication, collaboration, interaction, and intercreation.

3.3 The laws of the Web

Two sets of laws have been studied, namely, the set of laws governing the Web and the set of

empirical laws observable on the Web. The Web has given new meaning to publishing and

libraries, but has not changed their underlying principles. Noruzi argued that Ranganathan’s Five Laws of

Library Science are as applicable today as they were more than 70 years ago. Ranganathan’s Five

Laws of Library Science state:

• Books are for use.

• Every reader his or her book.

• Every book its reader.

• Save the time of the reader.

• The Library is a growing organism.

These laws describe a user-oriented, as well as a service-oriented, view of library science. The

Web consists of a massive collection of resources. By replacing “book”, “reader”, and “library”

with “web resource”, “user”, and “web”, respectively, Noruzi stated the Five Laws of the Web:

• Web resources are for use.

• Every user his or her web resource.

• Every web resource its user.

• Save the time of the user.

• The Web is a growing organism.


They concisely represent the underlying philosophy of the Web and web services. They also

describe the ideal Web - “of the people, by the people, for the people”. Many researchers have studied

empirical laws revealed by the Web, concerning its growth, web page distributions, or user surfing

patterns. An example set of such laws is reported by Huberman:

1. Power Law of Distribution.

2. Small World Law.

3. Law of Surfing.

4. Law of Congestion.

5. The Free Ride Law.

6. The Law of Downloading.

Website designers, webmasters, and organizations can apply such laws to design better

websites and web resources.

3.4 The Web revolution: one link at a time

The story of the invention of the Web and the revolution brought by the Web provides a

good case study for web intelligence. It poses a challenge: how to derive insights and wisdom

from the existing data, information, and knowledge. Regarding the pre-web uses of hypertext

links, Berners-Lee commented, “The research community had used the links between paper

documents for ages: Tables of contents, indexes, bibliographies, and reference sections are

hypertext links.” A crucial question is what we can get from this common knowledge and

practice. Two types of approaches have been proposed and studied. One focuses on the

exploration of the potential implications of such knowledge, which leads to the creation of a field

of science known as citation indexing and analysis. The other focuses on the representation,

storage, and access of similar types of data and knowledge using new media as they become

available, which leads to the invention of the Web.

A basic idea of citation indexing and analysis is to index and study the literature of science

based on how scientists cite each other. Although it mainly uses bibliographies, citation indexing


and analysis brings more insights into science, publishing, scientific research, and many more

fields. Information retrieval systems, based on citation indexing and analysis, have been

implemented and used by scientists for many years. The same methods have been applied or

rediscovered in many recent studies, such as web search engines, social network analysis, and so

on.
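
The basic idea of citation indexing described above can be sketched in a few lines. The sketch below is only an illustration under stated assumptions: the function names build_citation_index and co_citation_counts, and the tiny bibliography of made-up paper identifiers, are hypothetical and do not come from any particular citation indexing system. It inverts "who cites whom" into "who is cited by whom" and counts how often two papers are cited together.

```python
from collections import defaultdict


def build_citation_index(references):
    """Invert 'paper -> papers it cites' into 'paper -> papers citing it'."""
    cited_by = defaultdict(set)
    for paper, cited in references.items():
        for target in cited:
            cited_by[target].add(paper)
    return cited_by


def co_citation_counts(references):
    """Count how often two papers appear together in one reference list."""
    counts = defaultdict(int)
    for cited in references.values():
        unique = sorted(set(cited))
        for i, first in enumerate(unique):
            for second in unique[i + 1:]:
                counts[(first, second)] += 1
    return counts


# Hypothetical bibliography data, used only for illustration.
references = {
    "paper_a": ["paper_c", "paper_d"],
    "paper_b": ["paper_c", "paper_d"],
    "paper_e": ["paper_c"],
}
index = build_citation_index(references)
pairs = co_citation_counts(references)
print(sorted(index["paper_c"]))
print(pairs[("paper_c", "paper_d")])
```

The same two structures apply unchanged if the reference lists are replaced by the outgoing hyperlinks of web pages, which is one way to read the remark that these methods carry over to web search and social network analysis.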

A basic idea of the Web is to create a global space in which anything can be linked to

anything. The development of the Web emphasizes the implementation of this idea using

different types of machines and media. The Web attempts to make the existing associations and links

that people had used, either explicitly or implicitly, concrete and computer-manageable.

Similar concepts had been explored in the pre-web age. Vannevar Bush described a

photoelectromechanical machine called the Memex that can make and follow cross-references

among microfilm documents. Ted Nelson introduced the concept of hypertext, so that people can

use computers to read, write and publish non-linear texts. Doug Engelbart demonstrated a

collaborative workspace called NLS, which supports hypertext browsing, editing, email, and so on.

Thanks to the timely invention of the Internet, which provided global connectivity, the dream of the

Web became a reality. The revolution of the Web was brought about by a grassroots effort that builds the

Web link by link. There are recent research efforts in cross-applications of the two types of

approaches. The methods developed for citation indexing and analysis are used and extended to

analyze the links and connectivity of the web. Existing systems for citation indexing and analysis

are being moved to, and new such systems are being implemented on, the Web.

The above brief description, which is almost common knowledge, is repeated here to serve

one special purpose. It demonstrates that the great minds of our time bring revolutions by

analyzing what everyone has already known or by implementing, alternatively, what everyone has

already used. The question is: Can web intelligence help in the future?

3.5 The more things change, the more they stay the same

Now, we turn our attention to the other side of the same coin by investigating the things that the

revolutions do not change. In spite of the technological changes, the achievements of the current Web

and associated systems lie in the process of weblization. The weblization of a specific field or an

organization does not change its fundamental principles, although the field or organization may become more effective

and efficient, and operate at a different scale. For example, electronic commerce does

not change the principles of doing business, but does introduce more dynamics, opportunities,

flexibility, and other new properties. Another example is the Five Laws of the Web: the subject


matter is changed, but the philosophy remains the same. Both paper documents and the

Web use links.

The physical implementations are different, one on paper and the other on computer, but the

logical meanings stay more or less the same. The same analytical tools and methods apply to both.

The property of “unchangeness” makes it possible to apply the same principles again and again,

with possible adaptation and adjustment. The philosophy and principles that have proved to

be effective in the past can be applied to design and implement intelligent web information systems.

Some illustrative examples are listed here:

• Separation of logical view and physical view.

• Separation of knowledge and inference engine.

• Keep It Simple, Stupid!

The first two separation principles are along the same line as the separation of content and form

principle. The first one is widely used in the design and implementation of database systems. Its

application to the Web implies that one can generate many virtual logical views from the same

physical web. The second principle is a fundamental one in expert systems. It is applicable to the

design of web inference engines. The last rule, also known as the KISS principle, is universally

applicable. It has been applied throughout the design of the Web.
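
The separation of knowledge and inference engine can be made concrete with a small sketch. The code below is a minimal forward-chaining engine written under stated assumptions: the function name forward_chain, the rule format of (premises, conclusion) pairs, and the e-commerce style facts are all hypothetical and chosen only for illustration. The point is that the engine contains no domain knowledge; the rules and facts are plain data supplied from outside.

```python
def forward_chain(facts, rules):
    """Derive new facts until no rule adds anything.

    The engine knows nothing about the domain: every rule is data,
    a (premises, conclusion) pair supplied by the caller.
    """
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            # Fire a rule once all of its premises are known facts.
            if conclusion not in known and all(p in known for p in premises):
                known.add(conclusion)
                changed = True
    return known


# Hypothetical knowledge base, kept separate from the engine above.
rules = [
    (("visited_product_page", "added_to_cart"), "interested_buyer"),
    (("interested_buyer", "returning_user"), "offer_discount"),
]
facts = {"visited_product_page", "added_to_cart", "returning_user"}
derived = forward_chain(facts, rules)
print(sorted(derived))
```

Because the rules live outside the engine, the same forward_chain function could be reused with a different rule set without any change to its code, which is the practical benefit the separation principle aims at.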


CHAPTER 4

Components of Web Intelligence

4.1 Web Data

The data available in electronic commerce environments is three-fold and includes server

data in the form of log files, site-specific web meta-data representing the structure of the

web site, and marketing information, which depends on the products and services provided. Server

data is generated by the interactions between the persons browsing an individual site and the web

server. This data can be divided into log files and query data. Historically, web servers recording

server activity, errors and referrer information used a log file to record each event. It is now

standard for web servers to use a combined log file format, called the Common Log File Format. This

format combines the server and error logs into one file. More recently, the Extended Log File

Format has been used, which consolidates the Common format with additional information,

namely the referrer and cookie information. By incorporating referrer information, the output of

the mining of these log files becomes much more useful and actionable in marketing terms. Cookies

are tokens generated by the web server and held by the clients. The information stored in a

cookie helps to offset the stateless nature of web server HTTP interactions, enabling

servers to track client access across their hosted web pages. The logged cookie data is

customizable and can contain keys for relating the navigational data to the content of the

marketing data, including transactional data. Usually the following information is contained in a

cookie: User ID, source IP address, time-to-live, randomly generated unique ID and user defined

information. A fourth data source that is typically generated on electronic commerce sites is

query data to a web server. This data is usually generated when users of the web site use search or

product locator facilities on the web site to search for relevant pages/products. This is often user

interaction with a product database, via the company’s Internet site. The final source of data is

web meta-data. This data describes the structure of the web site and is usually generated

dynamically and automatically after a site update. Web meta-data generally includes neighbor

pages, leaf nodes and entry points. This information is usually implemented as a site-specific

index table, which represents a labeled, directed graph. Meta-data also provides information

whether a page has been created statically or dynamically and whether user interaction is required

or not. In addition to the structure of a site, web meta-data can also contain information of more

semantic nature, usually represented in XML.
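To make the server log data described above more concrete, the following is a minimal sketch of parsing a single access-log line that carries referrer information. It assumes the widely used "combined" access-log layout (host, identity, user, timestamp, request, status, size, referrer, user agent); the sample line, the pattern name LOG_PATTERN and the function parse_log_line are illustrative assumptions and are not taken from the report.

```python
import re
from datetime import datetime

# Assumed layout: host, identity, user, [timestamp], "request", status,
# size, "referrer", "user agent". Real server configurations may differ.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_log_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["time"] = datetime.strptime(record["time"], "%d/%b/%Y:%H:%M:%S %z")
    record["status"] = int(record["status"])
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    # The request field looks like: METHOD PATH PROTOCOL
    parts = record["request"].split()
    record["path"] = parts[1] if len(parts) >= 2 else ""
    return record


# Hypothetical sample line, invented for illustration only.
SAMPLE = ('192.0.2.10 - - [10/Oct/2010:13:55:36 +0530] '
          '"GET /products/item42.html HTTP/1.1" 200 2326 '
          '"http://www.example.com/search?q=camera" "Mozilla/5.0"')

record = parse_log_line(SAMPLE)
print(record["host"], record["path"], record["referrer"])
```

Keeping the referrer as a separate field is what allows a later mining step to relate a page view to the search or link that led to it.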


Web Mining Components of Web Intelligence

In the context of web intelligence, web mining may be defined as the application of data

mining techniques to Internet data. This definition is sometimes extended to include statistical,

database optimization, and artificial intelligence techniques. Web mining has been sub-divided

into web structure, web usage, and web content mining. Web structure mining is the application

of data mining techniques to web site structures. In many cases this may be the entire web, and

research in intelligent search engines and intelligent agents is described in many articles. In our

research, we define web structure mining as the mining of Internet data, together with data about

the structure of the site. This may be thought of as enriching the efficacy of the data mining

process with domain knowledge. The application of domain knowledge is further discussed in the

analytical process section. Web usage mining is the application of data mining to Internet web

server log file data, which is described in the earlier section on web data. Web usage mining

forms the core of our research in web mining for web intelligence, and log files provide the

foundation data for visitor analysis. This type of analysis of the visitors to a web site can be

subdivided into technographic and psychographic analysis. Technographic analysis focuses on

what is known about the visitor’s technical platform, i.e., operating system, browser, plug-ins,

user language, cookie information. On its own, this information is not a rich source of

discriminatory data for visitor profiling but in conjunction with the homogeneous data sets

available after extract, transform and load (ETL) operations into a data warehouse, it contributes

significantly. Psychographic analysis is the examination of what we know about the behavioral

patterns of web site visitors. This includes the routes taken by visitors through a site, the time

spent on each page, route differences based on differing entry points to site, aggregated route

behavior, general click stream behavior, etc. This is the information of most use to web marketers,

and is equivalent to marketing intelligence about where shoppers enter the store, where shoppers

go in the store, where they leave the store, what they look at but don’t buy, what they buy and

how quickly, etc.
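The psychographic measures listed above (routes, entry points, exit points and time spent per page) can be sketched as follows. This is only an illustration under simple assumptions: the clickstream tuples are invented, each visitor is treated as a single session, and the time on a page is approximated by the gap to the next click, so the last page of a route gets no dwell time.

```python
from collections import Counter, defaultdict

# Hypothetical clickstream: (visitor_id, seconds_since_start, page).
CLICKS = [
    ("v1", 0, "/home"), ("v1", 30, "/catalog"), ("v1", 90, "/item/7"),
    ("v2", 0, "/search"), ("v2", 20, "/item/7"), ("v2", 50, "/checkout"),
    ("v3", 5, "/home"), ("v3", 15, "/catalog"),
]


def build_routes(clicks):
    """Group clicks per visitor and order them by time."""
    by_visitor = defaultdict(list)
    for visitor, timestamp, page in clicks:
        by_visitor[visitor].append((timestamp, page))
    return {visitor: sorted(events) for visitor, events in by_visitor.items()}


def summarize(routes):
    """Count entry pages, exit pages, and total dwell time per page."""
    entries, exits, dwell = Counter(), Counter(), Counter()
    for events in routes.values():
        entries[events[0][1]] += 1
        exits[events[-1][1]] += 1
        # Dwell time is the gap to the next click; the last page has no gap.
        for (t0, page), (t1, _next_page) in zip(events, events[1:]):
            dwell[page] += t1 - t0
    return entries, exits, dwell


routes = build_routes(CLICKS)
entries, exits, dwell = summarize(routes)
print(entries.most_common(1), dict(exits), dict(dwell))
```

In the store analogy used above, the entry counter corresponds to where shoppers enter, the exit counter to where they leave, and the dwell totals to how long they look at each shelf.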

Web content mining is the application of data and text mining algorithms and techniques

to the contents of web pages, usually written in HTML. At its simplest, this entails the extraction

of text between HTML tags for headings and titles, or the extraction of the HTML Meta tag

content. Our research is based upon XML and RDF-based data schemas that help to ensure

correctness and proper context.
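The simplest form of web content mining mentioned above, extracting the text of titles and headings and the content of Meta tags, can be sketched with Python's standard html.parser module. The class name HeadingExtractor and the sample page are assumptions made for illustration; a production system would also need to handle malformed markup and character encodings.

```python
from html.parser import HTMLParser


class HeadingExtractor(HTMLParser):
    """Collect text inside <title> and <h1>-<h3>, plus named <meta> content."""

    TEXT_TAGS = {"title", "h1", "h2", "h3"}

    def __init__(self):
        super().__init__()
        self.current = None   # tag whose text is currently being collected
        self.found = []       # list of (tag, text) pairs in document order
        self.meta = {}        # meta name -> content

    def handle_starttag(self, tag, attrs):
        if tag in self.TEXT_TAGS:
            self.current = tag
        elif tag == "meta":
            attributes = dict(attrs)
            name = attributes.get("name")
            content = attributes.get("content")
            if name and content is not None:
                self.meta[name.lower()] = content

    def handle_endtag(self, tag):
        if tag == self.current:
            self.current = None

    def handle_data(self, data):
        text = data.strip()
        if self.current and text:
            self.found.append((self.current, text))


# Hypothetical page, invented for illustration only.
PAGE = """<html><head><title>Digital Cameras</title>
<meta name="keywords" content="camera, lens, tripod">
</head><body><h1>Compact models</h1><p>Body text.</p></body></html>"""

extractor = HeadingExtractor()
extractor.feed(PAGE)
print(extractor.found, extractor.meta)
```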


4.2 Representation

Intelligent Web agents can use the Problem Solver Markup Language (PSML) to specify

their roles, settings, and relationships with any other services. The intelligent Web must also have

the ability to process and understand natural language. It must understand and correctly judge the

meaning of concepts expressed in words, such as “good,” “best,” and “season.” Further, the

intelligent Web must grasp the granularities of these terms’ corresponding subjects and the

location of their ontology definitions.

Self-direction and learning

In addition to the semantic knowledge that an intelligent search can extract and

manipulate, intelligent Web agents must also incorporate a dynamically created source of meta-

knowledge that deals with the relationships between concepts and the spatial or temporal

constraint knowledge that planning and executing services use. This allows the agents to self-

resolve their conflicts. To solve specific problems, intelligent Web agents must be able to plan.

The planning process uses goals and associated subgoals, as well as constraints. In the intelligent

Web, ontologies alone will not be sufficient.

Personalization

The intelligent Web can personalize

interactions by remembering a particular user’s recent encounters and relating the topics and sites

that a user accesses during different online sessions. It may further identify other goals and

courses of action as a user’s interactions broaden and deepen, providing ever more data upon

which to base its recommendations. As part of its personalized approach to user services, the

intelligent Web will interact with the user when executing these tasks. In summary, semantics

contributes a vital aspect to the intelligent Web. We expect the Web to extend not only the

knowledge of artificial assistants, but also their intelligence.

WI’s Four Levels

We can study Web intelligence on at least four conceptual levels, ranging from the lower, hardware-centered level to the higher, application-centered level. This framework builds upon the fast development and application of various Web technologies.

• Internet-level communication, infrastructure, and security protocols.

At its core, the Web is a computer-network system. WI techniques for this level include Web data prefetching systems built upon Web surfing patterns to resolve latency issues. The intelligence of


the Web’s prefetching routines comes from an adaptive learning process based on observations of user surfing behavior.

• Interface-level multimedia presentation standards.

The Web functions as an interface for human-Internet interaction. At this level, the Web interfaces require adaptive cross-language processing, personalized-multimedia-representation, and multimodal-data-processing capabilities.

• Knowledge-level information processing and management tools.

The Web serves as a distributed data and knowledge base. Accessing and manipulating this information requires semantic markup languages to represent the Web’s contents in machine-understandable formats. Agent-based autonomic computing functions such as searching, aggregation, classification, filtering, managing, mining, and discovery can then use this data.

• Application-level ubiquitous computing and social intelligence environments.

The Web can form the basis for establishing social networks that contain communities of people, organizations, or other social entities. Social relationships such as friendship, co-working, or exchanging information about common interests connect these entities. The study of WI thus encompasses issues central to social network intelligence. Users access the Web’s multimedia content from stationary desktop computers and increasingly from mobile platforms as well. Ubiquitous Web access and computing from various wireless devices requires even greater adaptive personalization. WI should suit these needs well by providing techniques for use in constructing interest models derived from implicit inferences based on user behavior.
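The adaptive learning from surfing patterns described under the Internet level can be illustrated with a small next-page predictor. The report does not say which learning method is used, so the first-order transition counting below (predicting the next page only from the current page) is an assumed stand-in, and the class name PrefetchPredictor and the sample sessions are hypothetical.

```python
from collections import Counter, defaultdict


class PrefetchPredictor:
    """Learn page-to-page transition counts and suggest pages to fetch early."""

    def __init__(self):
        # transitions[page][next_page] = number of times the move was observed
        self.transitions = defaultdict(Counter)

    def observe(self, session):
        """Update counts from one ordered list of visited pages."""
        for current_page, next_page in zip(session, session[1:]):
            self.transitions[current_page][next_page] += 1

    def predict(self, page, top_n=1):
        """Return up to top_n most frequently observed successors of page."""
        successors = self.transitions.get(page, Counter())
        return [p for p, _count in successors.most_common(top_n)]


# Hypothetical surfing sessions, invented for illustration only.
SESSIONS = [
    ["/home", "/news", "/sports"],
    ["/home", "/news", "/weather"],
    ["/home", "/shop"],
    ["/news", "/sports"],
]

predictor = PrefetchPredictor()
for session in SESSIONS:
    predictor.observe(session)

print(predictor.predict("/home"), predictor.predict("/news"))
```

Because the counts are updated with every observed session, the suggestions adapt as user behavior changes, which is the property the text attributes to such routines.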


4.3 PSML and Web inference engine

Distributed inference engines form PSML’s core. These engines can perform automatic

reasoning on the Web by incorporating autonomically collected and transformed content and

meta-knowledge into locally operational knowledge and databases. A feasible way to implement

PSML is to use an existing Prolog-like logic language supplemented with agents that perform

dynamic-content updates, meta-knowledge.
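As a rough illustration of rule-based inference with the knowledge kept separate from the engine, the sketch below applies simple if-then rules to a set of facts until nothing new can be derived. It is not PSML and not Prolog: Prolog answers queries by backward chaining, whereas this example uses forward chaining over plain strings because it is short to show. The rule and fact names are hypothetical.

```python
def forward_chain(facts, rules):
    """Apply rules until no new fact can be derived.

    facts: set of strings.
    rules: list of (premises, conclusion) pairs, where premises is a set.
    """
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            # Fire a rule when all of its premises are already known.
            if conclusion not in known and premises <= known:
                known.add(conclusion)
                changed = True
    return known


# Knowledge base (hypothetical content), kept separate from the engine above.
RULES = [
    ({"visited_camera_pages", "searched_tripod"}, "interested_in_photography"),
    ({"interested_in_photography", "returning_visitor"}, "recommend_lens_offer"),
]
FACTS = {"visited_camera_pages", "searched_tripod", "returning_visitor"}

derived = forward_chain(FACTS, RULES)
print(sorted(derived - FACTS))
```

Agents that collect new content would only need to add facts or rules; the engine function itself stays unchanged.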

4.4 Social network intelligence

The social intelligence approach to Web computing presents new opportunities for WI

research and development. As the Web becomes an integral part of our society, WI can and

should support Web-based social networks at all levels. Study in this area must receive as much

attention as Web mining, Web agents, ontologies, and related topics.

Web-based computing

The intelligent Web seeks to provide not only a medium for seamless information exchange and

knowledge sharing, but also the sort of human-crafted resources that encourage sustainable

knowledge creation and scientific and social evolution. The intelligent Web will rely on Grid-like

service agencies that self-organize, learn, and evolve their courses of action to perform service

tasks and transform their identities and interrelationships in communities. These services will also

cooperate and compete among themselves to optimize their resources and utilities and those of

others.

4.5 Benchmark applications

To effectively develop and evaluate systems and applications that address WI research issues, we

must consider benchmark applications that will demonstrate these capabilities. Suppose we want

to conduct a Web-based search to compile the data and generate a market report for an existing

product or a potential new product. To perform these tasks, an information agent will mine and

integrate available Web information, which will in turn be passed to a market analysis agent.

The analysis will involve the quantitative simulation of customer behavior in a marketplace,

instantaneously handled by other service agencies involving a large number of Grid agents. Given

that the number of variables can number in the hundreds or thousands, generating one prediction

can easily require significant computer resources.


CHAPTER 5

Computational Web Intelligence and Granular Web Intelligence for Web Uncertainty

With the explosive growth of Web data on wired and wireless networks, a challenging

problem for a new generation of intelligent Web techniques is how to handle uncertain Web data

and make the right decisions under Web uncertainty. So it is necessary to develop new intelligent

Web techniques for Web applications under different types of uncertainty including probability,

possibility, fuzziness, roughness, randomness, etc. Web Intelligence (WI), a new direction for

scientific research and development, exploits Artificial Intelligence (AI) and advanced

Information Technology (IT) on the Web and Internet. In general, AI-based Web techniques can

be used to handle probabilistic Web data. Since there are lots of fuzzy Web data and other kinds

of uncertain Web data, we need to apply relevant intelligent techniques to process different

uncertain Web data that cannot be processed by traditional precise intelligent techniques like

Boolean logic. To promote the use of fuzzy logic on the Internet, Zadeh stated "fuzzy logic may

replace classical logic as what may be called the brainware of the Internet" at the 2001 BISC

International Workshop on Fuzzy Logic and the Internet (FLINT2001). Fuzzy intelligent

agents are used in smart e-Commerce applications. The conceptual fuzzy sets are applied to Web

search engines to improve quality of Web service. Clearly, the intelligent e-brainware based on

soft computing plays an important role in smart e-Business applications, and soft computing

techniques can play an important role in building the intelligent Web brain. Soft-computing-based

Web techniques can thus enhance Web QoI (Quality of Intelligence). In order to use CI

(Computational Intelligence) techniques to make intelligent wired and wireless systems with high

QoI, Computational Web Intelligence (CWI) was proposed at the special session on CWI at

FUZZ-IEEE'02 of the 2002 World Congress on Computational Intelligence. CWI is a hybrid

technology of CI and Web Technology (WT) dedicated to increasing the QoI of e-Business

application systems on wired and wireless networks. Main CWI techniques include

• Fuzzy Web Intelligence (FWI)

• Neural Web Intelligence (NWI)

• Evolutionary Web Intelligence (EWI)

• Granular Web Intelligence (GWI)

• Rough Web Intelligence (RWI)

• Probabilistic Web Intelligence (PWI)
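To illustrate how fuzzy techniques differ from a precise Boolean cutoff, the sketch below scores products for a vague query such as "cheap and well rated". The membership functions, their breakpoints and the product data are all assumptions chosen for illustration; the fuzzy AND is taken as the minimum of the two grades, which is one common choice among several.

```python
def falling(x, full, zero):
    """Membership 1 up to `full`, falling linearly to 0 at `zero`."""
    if x <= full:
        return 1.0
    if x >= zero:
        return 0.0
    return (zero - x) / (zero - full)


def rising(x, zero, full):
    """Membership 0 up to `zero`, rising linearly to 1 at `full`."""
    if x <= zero:
        return 0.0
    if x >= full:
        return 1.0
    return (x - zero) / (full - zero)


def score(price, rating):
    """Degree to which a product is 'cheap AND well rated'."""
    cheap = falling(price, full=100.0, zero=300.0)   # assumed breakpoints
    well_rated = rising(rating, zero=3.0, full=5.0)  # assumed breakpoints
    return min(cheap, well_rated)                    # fuzzy AND as minimum


# Hypothetical catalogue: name -> (price, rating out of 5).
PRODUCTS = {
    "camera_a": (120.0, 4.5),
    "camera_b": (210.0, 4.9),
    "camera_c": (90.0, 3.2),
}

ranked = sorted(PRODUCTS, key=lambda name: score(*PRODUCTS[name]), reverse=True)
print(ranked)
```

A Boolean rule such as "price below 100 and rating above 4" would reject all three products, while the fuzzy score still orders them by how well they fit the vague request.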


5.1 WEB UNCERTAINTY

The Web holds various data sets distributed on a huge number of computers just like a human

brain contains biological data stored on a large number of biological neurons. The biological data

in the human brain are not always precise but uncertain in most cases due to information

incompleteness, linguistic vagueness, imperfect measurement, knowledge limitations, etc.

Similarly, Web data on the Internet are usually not accurate but uncertain because of partial Web

information, dynamic Web data, fuzzy Web data, Web ontology, unpredictable Web information,

different Web users, different hardware environments, different data formats, etc. So the big

challenging problem is how to design intelligent Web techniques for Web-based applications with

uncertainty. With the explosive growth of wired and wireless networks, Web users suffer from

huge amounts of raw Web data because current Web tools still cannot find satisfactory

information and knowledge effectively and make decisions correctly because of uncertain Web

data, uncertain Web information, uncertain Web knowledge and uncertain Web intelligence. Now

the Internet and wireless networks connect an enormous number of computing devices including

computers, PDAs (Personal Digital Assistants), cell phones, home appliances, etc. CI is used in

telecommunication network applications. Clearly, such a huge networked computing system around

the world provides a complex, dynamic and global environment for developing new

distributed intelligent theory and technology based on AI, BI (Biological Intelligence) and CI.

Therefore, we must design an intelligent Web technology for dealing with Web uncertainty.

5.2 COMPUTATIONAL WEB INTELLIGENCE FOR WEB UNCERTAINTY

Zadeh states that traditional (hard) computing is the computational paradigm that underlies

artificial intelligence, whereas soft computing is the basis of CI. Based on the discussions on CI

and AI, the basic conclusion is that CI is different from AI, but CI and AI have a common overlap.

In general, hard computing and soft computing can be used in intelligent hard Web applications

and intelligent soft Web applications. To enhance the QoI (Quality of Intelligence) of e-Business,

Computational Web Intelligence (CWI) is proposed to use CI and Web Technology (WT) to make

intelligent e-Business applications on the Internet and wireless networks. So the concise relation

is given by CWI = CI + WT. Fuzzy logic, neural networks, evolutionary computation, granular

computing, rough sets and probabilistic methods are major CI techniques for intelligent

e-Applications on the Internet and wireless networks. Currently, six major research areas of CWI

are (1) Fuzzy WI (FWI), (2) Neural WI (NWI), (3) Evolutionary WI (EWI), (4) Probabilistic WI

(PWI), (5) Granular WI (GWI), and (6) Rough WI (RWI). In the future, more CWI research areas

will be added. The six current major CWI techniques are described below.

• FWI has two major techniques: fuzzy logic and WT. The main goal of FWI is to design

intelligent fuzzy e-agents to deal with fuzziness of Web data, Web information and Web

knowledge, and also make good decisions for e-Applications effectively.

• NWI has two major techniques: neural networks and WT. The main goal of NWI is to

design intelligent neural e-agents that can learn Web knowledge from Web data and

Web information and make smart decisions for e-Applications intelligently.

• EWI has two major techniques: evolutionary computing and WT. The main goal of EWI

is to design intelligent evolutionary e-agents to optimize e-Application tasks effectively.

• PWI has two major techniques: probabilistic computing and WT. The main goal of PWI is

to design intelligent probabilistic e-agents to deal with probability of Web data, Web

information and Web knowledge for e-Applications effectively.

• GWI has two major techniques: granular computing and WT. The main goal of GWI is to

design intelligent granular e-agents to deal with Web data granules, Web information

granules and Web knowledge granules for e-Applications effectively.

• RWI has two major techniques: rough sets and WT. The main goal of RWI is to design

intelligent rough e-agents to deal with roughness of Web data, Web information and Web

knowledge for e-Applications effectively.

CWI can be used to increase the QoI of e-Business applications. CWI has a lot of wired and

wireless applications in intelligent e-Business. Currently, FWI, NWI, EWI, PWI, GWI and RWI

are major CWI techniques. CWI can be used to deal with uncertainty and complexity of Web

applications. Hybrid Web Intelligence (HWI), a broader area


than CWI, can be applied to more complex e-Business applications. In summary, HWI including

CWI will play an important role in designing smart e-Application systems for wired and

wireless users. CWI technology is based on multiple CI techniques and WT.

Relevant CI techniques and WT are selected to make a powerful CWI system for a specific

e-Business application.

5.3 GRANULAR WEB INTELLIGENCE FOR WEB UNCERTAINTY

Granular computing technology can be used to do high-level information processing and

knowledge discovery based on data granules that are clustered intelligently from raw data with

uncertainty. Since there are huge amounts of Web data at different geographical places, it is

naturally necessary to use the granular computing technology to preprocess raw Web data, then do

granular Web data mining, and finally discover granular Web knowledge. So GWI is a general

intelligent technology for dealing with raw Web data with uncertainty. Mathematically speaking,

to handle Web uncertainty effectively, it is really necessary to develop a novel granular set theory.

Here, a general framework about granular sets is briefly described below to deal with data

uncertainty such as Web data uncertainty.

Definition 1 (A Granular Set) Let $X$ be a universal set of data elements. A granular set $A$ in $X$ is

characterized by $m$ granular membership functions $F_k(x)$ for $x \in X$, with $F_k(x) \in [0,1]$ and

$k = 1, 2, \ldots, m$.

For example:

If m=1, a granular set is a fuzzy set (a special case: a crisp set) since one membership function is

used. Traditional fuzzy sets use only truth values in [0, 1] to handle data

uncertainty.

If m=2, a granular set is an intuitionistic fuzzy set [25] since two membership functions are used.

Intuitionistic fuzzy sets use both truth values and falsity values in [0, 1] to deal with data

uncertainty.

If m=3, a granular set is a neutrosophic set since three membership functions are

used. For example, interval neutrosophic sets are defined on a truth-membership function, an

indeterminacy-membership function and a falsity-membership function. The major advantage of

interval neutrosophic sets is to reduce data uncertainty by using three types of information,

truth values, falsity values and indeterminacy values, in order to make the right decision.


We hope that new granular sets and new granular logical systems with four or more membership

functions will be developed in the future to handle Web uncertainty effectively and

fundamentally.
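The granular set of Definition 1 can be sketched as a small data structure that holds one or more membership functions and checks that every grade lies in [0, 1]. The class name GranularSet and the example membership functions are assumptions made for illustration; the sketch enforces only the range condition stated in the definition and no further constraints between the functions.

```python
class GranularSet:
    """A set over a universe, described by membership functions into [0, 1]."""

    def __init__(self, membership_functions):
        if not membership_functions:
            raise ValueError("at least one membership function is required")
        self.functions = list(membership_functions)

    @property
    def m(self):
        """Number of membership functions that characterize the set."""
        return len(self.functions)

    def grades(self, x):
        """Return the tuple of membership grades of element x."""
        values = tuple(f(x) for f in self.functions)
        for value in values:
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"membership grade {value} is outside [0, 1]")
        return values


# Hypothetical membership functions over a relevance score x in [0, 1].
def truth(x):
    return x


def falsity(x):
    return (1.0 - x) / 2.0


def indeterminacy(x):
    return 0.25


fuzzy = GranularSet([truth])                                  # one function
intuitionistic = GranularSet([truth, falsity])                # two functions
neutrosophic = GranularSet([truth, indeterminacy, falsity])   # three functions

print(fuzzy.grades(0.5), intuitionistic.grades(0.5), neutrosophic.grades(0.5))
```

A granular set with four or more functions, as hoped for in the text, would simply be constructed with a longer list.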

Web uncertainty is a long-term challenging problem related to many Web applications like

semantic Web, Web mining, Web knowledge discovery, Web agents, Web search engines, Web

security, e-Commerce, e-Business, etc. To handle Web uncertainty, we need to develop relevant

intelligent Web technology such as CWI and GWI. Importantly, we need to continue to create

new granular sets such as neutrosophic sets to try to solve Web uncertainty effectively.

Web uncertainty is a difficult long-term problem. So we need to use different intelligent

techniques together for this complicated problem. Hybrid Web Intelligence (HWI), a broad hybrid

research area, uses AI, CI, BI (Biological Intelligence) and WT to build hybrid intelligent Web

systems to handle Web uncertainty effectively and efficiently. In the future, HWI will have a lot

of intelligent Web applications under uncertainty. Main HWI applications include (1) intelligent

Web agents for e-Applications such as e-Commerce, e-Government, e-Education and e-Health,

(2) intelligent Web security systems such as intelligent homeland security systems, (3) intelligent

Web bioinformatics systems, (4) intelligent grid computing systems, (5) intelligent wireless

mobile agents, (6) intelligent Web expert systems, (7) intelligent Web entertainment systems, (8)

intelligent Web services, (9) Web data mining and Web knowledge discovery, (10) intelligent

distributed and parallel Web computing systems based on a large number of networked computing

resources, and so on.


CHAPTER 6

Trends and Challenges of WI Related Research and Development

Web Intelligence presents excellent opportunities and challenges for the research and

development of new generation Web-based information processing technology, as well as for

exploiting business intelligence. With the rapid growth of the Web, research and development on

WI have received much attention. We expect that more attention will be focused on WI in the

coming years. Many specific applications and systems have been proposed and studied. Several

dominant trends can be observed and are briefly reviewed in this section. E-commerce is one of

the most important applications of WI. The e-commerce activity that involves the end user is

undergoing a significant revolution. The ability to track users’ browsing behavior down to

individual mouse clicks has brought the vendor and end customer closer than ever before. It is

now possible for a vendor to personalize his product message for individual customers

at a massive scale. This is called targeted marketing or direct marketing.

Web mining and Web usage analysis play an important role in e-commerce for customer

relationship management (CRM) and targeted marketing. Web mining is the use of data mining

techniques to automatically discover and extract information from Web documents and services.

Zhong et al. proposed a way of mining peculiar data and peculiarity rules that can be used for

Web-log mining. They also proposed ways for targeted marketing by mining classification rules

and market value functions. A challenge is to explore the connection between Web mining and

the related agent paradigm such as Web farming, that is, the systematic refining of information

resources on the Web for business intelligence. Text analysis, retrieval, and Web-based digital

libraries form another fruitful research area in WI. Topics in this area include semantic models of the

Web, text mining, and automatic construction of citations. Abiteboul et al. systematically investigated the

data on the Web and the features of semi-structured data. Zhong et al. studied text mining on the

Web including automatic construction of ontology, e-mail filtering system, and Web-based

e-business systems. Web-based intelligent agents are aimed at improving a Web site or providing

help to a user. Liu et al. worked on e-commerce agents. Liu and Zhong worked on Web agents

and KDDA (Knowledge Discovery and Data Mining Agents). We believe that Web agents will be

a very important issue. It is therefore not surprising that we decide to hold the WI conference in


parallel to the Intelligent Agents conference. In the next section, we provide a more detailed

description of intelligent Web agents.

The Web itself has been studied from two aspects, the structure of the Web as a graph and

the semantics of the Web. Studies on Web structures investigate several structural properties of

graphs arising from the Web, including the graph of hyperlinks, and the graph induced by

connections between distributed search servants. The study of the Web as a graph is not only

fascinating in its own right, but also yields valuable insight into Web algorithms for crawling,

searching and community discovery, and the sociological phenomena which characterize its

evolution. Studies of the semantics of the Web were initiated by Tim Berners-Lee, the creator of

the World Wide Web. The Web is referred to as the “semantic Web”, where information will be

machine-processable in ways that support intelligent network services such as information brokers

and search agents.
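Viewing the Web as a graph of hyperlinks can be made concrete with a toy example. The sketch below performs a breadth-first crawl over an in-memory link graph and counts incoming links per page; the graph itself is invented, and a real crawler would additionally fetch pages over the network, respect access rules and handle far larger graphs.

```python
from collections import deque

# Hypothetical hyperlink graph: page -> pages it links to.
LINKS = {
    "A": ["B", "C"],
    "B": ["C", "D"],
    "C": ["A"],
    "D": [],
    "E": ["A"],
}


def crawl_order(links, seed):
    """Breadth-first traversal from seed, returning pages in visit order."""
    seen = {seed}
    order = []
    queue = deque([seed])
    while queue:
        page = queue.popleft()
        order.append(page)
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return order


def in_link_counts(links):
    """Count how many pages link to each page."""
    counts = {page: 0 for page in links}
    for targets in links.values():
        for target in targets:
            counts[target] = counts.get(target, 0) + 1
    return counts


order = crawl_order(LINKS, "A")
counts = in_link_counts(LINKS)
print(order, counts)
```

Page E is never reached from seed A because no visited page links to it, which shows how structural properties of the graph determine what a crawler can discover.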

The semantic Web requires interoperability standards that address not only the syntactic

form of documents but also the semantic content. A semantic Web also lets agents utilize all the

data on all Web pages, allowing them to gain knowledge from one site and apply it to logical

mappings on other sites for ontology-based Web retrieval and e-business intelligence. Ontologies

and agent technology can play a crucial role in enabling such Web-based knowledge processing,

sharing, and reuse between applications. A new DARPA program called DAML (DARPA Agent

Markup Language) is a step toward a “semantic Web” where agents, search engines and other

programs can read DAML mark-up to decipher meaning rather than just the content on a Web

site.
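The idea that agents read meaning rather than just content can be illustrated with facts stored as subject-predicate-object triples and a small class hierarchy. This is a plain-Python sketch and not DAML or RDF syntax; the triples, the predicate names "type" and "subClassOf", and the helper functions are assumptions made for illustration.

```python
# Hypothetical facts as (subject, predicate, object) triples.
TRIPLES = [
    ("camera42", "type", "DigitalCamera"),
    ("camera42", "price", 199),
    ("camera42", "soldBy", "shopA"),
    ("DigitalCamera", "subClassOf", "Camera"),
    ("shopA", "locatedIn", "Kochi"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return [
        (s, p, o) for (s, p, o) in triples
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


def is_instance_of(triples, item, cls):
    """Check class membership, following subClassOf links upward."""
    pending = [o for (_s, _p, o) in match(triples, item, "type")]
    seen = set()
    while pending:
        current = pending.pop()
        if current == cls:
            return True
        if current in seen:
            continue
        seen.add(current)
        pending.extend(
            o for (_s, _p, o) in match(triples, current, "subClassOf")
        )
    return False


print(match(TRIPLES, predicate="soldBy"))
print(is_instance_of(TRIPLES, "camera42", "Camera"))
```

An agent searching for cameras can find camera42 even though the page only labels it as a DigitalCamera, because the ontology link between the two classes is available as data.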

6.1 Intelligent Web Agents

Intelligent agents are computational entities that are capable of making decisions on behalf

of their users and self-improving their performance in dynamically changing and unpredictable

task environments. Liu provided a comprehensive overview of related research work in the

field of autonomous agents and multi-agent systems, with an emphasis on its theoretical and

computational foundations as well as in-depth discussions on the useful techniques for developing

various embodiments of agent-based systems, such as autonomous robots, collective vision and

motion, autonomous animation, and search and segmentation agents. The core of those techniques

is the notion of synthetic or emergent autonomy based on behavioral self-organization. Intelligent


Web Agents (WA) are software programs that primarily serve two important roles: (a)

autonomous entities for exploring and exploiting Web-based services, and (b) prototype entities

for exhibiting and explaining Web-generated regularities. These two roles are summarized below.

6.2 From WA to Web-Based Services

The first role for WA can be readily described and appreciated by examining the following

typical scenarios in which various tasks and objectives are achieved.

• Personalized Multimodal Interface: WA can provide users with a user-friendly style of

presentation that personalizes both the interaction with users and the content presentation.

This activity involves the creation of various cognitive aids, including tables, charts,

executive summaries, indices, and personalized visual assistants (e.g., graphically

animated personas and virtual-reality avatars). WA as interfaces must offer the ease of

using electronic services. The provided cognitive aids must be concise (i.e., accessible

with as few manipulations and as little memorization as possible) and

consistent (i.e., understandable based on users’ previously customized cognitive styles).

• Push and Pull: WA can play an important role in dynamically creating pull-and-push

advertising. Here, by pull-and-push advertising we mean that a user expresses his or her

favorites during the interaction with the agents (pull advertising) and in return the agents

search and deliver the information about the favorite items dynamically to the user (push

advertising). Such agents can also increase the positive externality of products, that is,

the better people are informed about certain products, the more likely the products will be

sold.

• Pattern Discovery and Self-Organization: WA will make it possible to detect which buying

patterns are forming among users and how they are structured, and hence to manage online

commerce effectively. Collaborative recommendation agents can help individual users aggregate into

groups, which can in turn form a dynamic marketplace.

• Information Gateway: WA can provide users with immediate access to the most relevant

information. This support encompasses a wide spectrum of information filtering and

delivery activities by manipulating various heterogeneous Web sources including

databases, data warehouses, newswire, financial reports, newsletters, newsgroups,

outbound emails, electronic bulletin boards, and hypermedia documents, and based on


users’ profiles, tailoring and delivering the retrieved information to the users. The

provided summary information must be just-in-time (i.e., delivered whenever it is needed),

relevant (i.e., focused on whichever topics the users are concerned with), and

up-to-the-minute (i.e., refreshed whenever a new piece of information arrives). An example of

applications with this type of agent support is comparison shopping that uses WA with

mobile and filtering capabilities. Some related experiences have been reported in the literature.

• Reward: WA can motivate users to enter and re-enter a certain electronic service. While an

ever-greater proliferation of content continues to consume individuals’ attention, e.g.,

through push technology to sell something or to support users, WA can play a crucial role

in creating a captive audience, in educating it constantly, and even in breaking

users’ old purchase habits. To be rewarding is to add value. The motivational rewards or

incentives can be created by offering free access to certain information and utility

resources (e.g., free software download), opportunities to participate in multi-user

information/commodity exchange activities (e.g., collaborative recommendation, chat,

bidding, and auction), and scheduled plans for promotional deals.

• Matchmaking: WA can serve as a new means for trading commodities. Since the interests

of users as well as the availability of products from dealers can change dynamically over

time, what usually happens in present-day electronic commerce is: (1) a dealer

sells his or her items simply because these are the only items that he or she has at the

moment, or (2) a user buys a certain item simply because it is the last item that he or she

can find that partially fits his or her need. WA-based customized business attempts to

change the existing online buying and selling into the following new scenarios: (1) a

dealer identifies and offers exactly what users are interested in, and (2) a user finds and

purchases what he or she really wants. Some technical issues related to matchmaking

have been addressed in the literature.

• Decision: WA can assist Web users in making decisions. Such decision support may take

the form of evaluations or recommendations on the various features of certain specific

items, cost-benefit analysis, inference support for optimizing utility and resources with

respect to functional, time, and cost requirements, and model-based trend analysis and

projections concerning new patterns of demand.

• Delegation: WA can act on behalf of Web users in online activities. The

tasks that may be delegated to WA include matchmaking, server monitoring,

negotiation, bidding, auction, transaction, transfer of goods, and follow-up support. This

scenario will enable a paradigm shift from user-centric to user-delegated

electronic business. The delegation of these tasks may be carried out in either a

semi-autonomous (with users’ intervention on decisions) or a fully autonomous manner. To this

end, various computational theories and models have been proposed and reported in the literature.

• Collaborative Work Support: WA can offer infrastructure support as well as the

necessary functions for collaboratively solving problems and managing workflow

activities.
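
As an illustration of the collaborative recommendation agents mentioned under Pattern Discovery and Self-Organization, the following is a minimal sketch of how users with similar buying patterns might be aggregated into groups. It is not taken from the cited literature: the Jaccard similarity measure, the 0.5 threshold, the function names, and the purchase data are all assumptions chosen for illustration.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Overlap between two purchase sets: |a & b| / |a | b|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def group_users(purchases: dict, threshold: float = 0.5) -> list:
    """Merge users into groups when their purchase sets are similar enough.

    Uses a small union-find structure so that similarity links chain:
    if A resembles B and B resembles C, all three share one group.
    """
    parent = {user: user for user in purchases}

    def find(user):
        while parent[user] != user:
            parent[user] = parent[parent[user]]  # path compression
            user = parent[user]
        return user

    for u, v in combinations(purchases, 2):
        if jaccard(purchases[u], purchases[v]) >= threshold:
            parent[find(u)] = find(v)

    groups = {}
    for user in purchases:
        groups.setdefault(find(user), set()).add(user)
    return sorted(sorted(members) for members in groups.values())


def recommend(user: str, purchases: dict, threshold: float = 0.5) -> list:
    """Items bought by the user's group that the user has not bought yet."""
    for group in group_users(purchases, threshold):
        if user in group:
            pooled = set().union(*(purchases[member] for member in group))
            return sorted(pooled - purchases[user])
    return []


# Hypothetical purchase histories, invented for this example.
PURCHASES = {
    "ann": {"camera", "tripod", "lens"},
    "bob": {"camera", "lens", "flash"},
    "cat": {"novel", "atlas"},
    "dan": {"novel", "atlas", "diary"},
}

if __name__ == "__main__":
    print(group_users(PURCHASES))
    print(recommend("ann", PURCHASES))
```

With this data the agent forms two groups, one around photography items and one around books, and each user is offered only what the rest of his or her group has already bought. A real agent would also need to handle changing interests over time, which this sketch ignores.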

CHAPTER 7

Semantic Search Engine

The framework’s search engine component queries the information generated by the annotation

component. It accepts queries posed in SPARQL and returns a set of links to matching resources.

A specialized search interface lets users develop an abstract model of a semantic query, pose it to

the engine, and then review the resulting matched documents. The search interface gives end

users (people who aren’t experts in Semantic Web technologies) a way to access the resources

filtered and annotated by the semantic annotator component. It is also possible to add and delete

entities and properties (with related values), so that a user can interact with the knowledge base to

fine-tune the query, making subsequent searches more accurate. The key aim for the query

interface is to give the user an intuitive and clear abstract query model that hides, as much as

possible, the underlying complexity of representation and reasoning. Furthermore, the agents in

the search engine multi-agent system exhibit various autonomic features that aim at making the

system more robust and scalable. The QS system has been deployed in two different commercial

test cases in the UK. In the first case, QS was used to examine specific Web-published documents

for commercial opportunities matching the business interests of the customer company. In the

second deployment, QS was used to perform knowledge-based searches over existing database

sources. In evaluating the performance of the search system in both applications, we could

see that by using ontological knowledge and ontology-based annotations, users could perform

more accurate queries while receiving up to 71 percent fewer documents than with a

keyword-based search engine, in the best cases eliminating more than 90 percent of the

irrelevant documents. We are now further refining these two deployments, and

we are planning more industrial deployments in the near future with other UK companies.
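
To make the difference between a semantic query and a keyword search concrete, here is a minimal sketch of triple-pattern matching over annotated documents. It is a simplified stand-in for a SPARQL basic graph pattern, not the engine described above: the annotation triples, the predicate names, and the `match` function are hypothetical and exist only for this example.

```python
# Hypothetical annotations produced by a semantic annotator, stored as
# (subject, predicate, object) triples. All names are invented.
TRIPLES = [
    ("doc1", "type", "Tender"),
    ("doc1", "sector", "Construction"),
    ("doc1", "region", "Wales"),
    ("doc2", "type", "Tender"),
    ("doc2", "sector", "Software"),
    ("doc3", "type", "NewsItem"),
    ("doc3", "sector", "Construction"),
]


def is_var(term: str) -> bool:
    """Query terms starting with '?' are variables, as in SPARQL."""
    return term.startswith("?")


def match(patterns, triples, binding=None):
    """Yield every variable binding that satisfies all triple patterns."""
    binding = binding or {}
    if not patterns:
        yield binding
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        candidate = dict(binding)
        consistent = True
        for query_term, value in zip(first, triple):
            if is_var(query_term):
                # Bind the variable, or check it against an earlier binding.
                if candidate.setdefault(query_term, value) != value:
                    consistent = False
                    break
            elif query_term != value:
                consistent = False
                break
        if consistent:
            yield from match(rest, triples, candidate)


if __name__ == "__main__":
    # Abstract query model: "tenders in the construction sector".
    query = [("?doc", "type", "Tender"), ("?doc", "sector", "Construction")]
    print([b["?doc"] for b in match(query, TRIPLES)])

    # A keyword-style lookup on "Construction" alone matches more documents.
    keyword = [("?doc", "sector", "Construction")]
    print([b["?doc"] for b in match(keyword, TRIPLES)])
```

Adding or deleting a pattern in the `query` list corresponds to the user adding or deleting entities and properties in the search interface: each extra pattern narrows the set of documents returned, which is the effect the evaluation figures above describe.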

CHAPTER 8

CONCLUSION

While it may be difficult to define exactly what Web Intelligence (WI) is, one can easily

argue for the need to create such a subfield of study in computer science. With the

rapid growth of the Web, we foresee a fast growing interest in Web Intelligence. Roughly

speaking, we define Web Intelligence as a field that “exploits Artificial Intelligence (AI) and

advanced Information Technology (IT) on the Web and Internet.” It may be viewed as a marriage

of artificial intelligence and information technology in the new setting of the Web. By examining

the scope and historical development of artificial intelligence, we discuss some fundamental

issues of Web Intelligence in a similar manner. There is no doubt in our mind that results from AI

and IT will influence the development of WI. Instead of searching for a precise and

non-controversial definition of WI, we list topics that might interest a researcher working on

Web-related issues. In particular, we identify some challenging issues of WI, including

e-commerce, studies of Web structures and Web semantics, Web information storage and retrieval,

Web mining, and intelligent Web agents. The aim is to examine the performance characteristics of

various approaches in Web-based intelligent information technology, and to cross-fertilize ideas on the

development of Web-based intelligent information systems among different domains.

This report is not intended to be a complete and systematic study of the field, but rather a record of

personal observations, scattered (perhaps immature) ideas, general comments, speculations, and

opinions. We hope that a careful study of these not yet well-connected points may lead to a web

of knowledge for web intelligence. From several perspectives, we examined the Web. This

enables us to see clearly the current status, the scope, and the future of web intelligence research.

The exploration of the Web through Web intelligence was then commented on from a few angles. A couple of

challenges were posed. Finally, Web-based Support Systems (WSS) were used to demonstrate the

ideas presented, which may further enhance the Web as a tool “of the people, by the people, for

the people”.

REFERENCES

[1] Research Challenges and Trends in the New Information Age,

Y.Y. Yao, Ning Zhong, Jiming Liu, and Setsuo Ohsuga, IEEE

[2] Web Intelligence: New Frontiers of Exploration, Yiyu (Y.Y.) Yao,

Department of Computer Science, University of Regina, Regina,

Saskatchewan, IEEE

[4] Education and the Semantic Web, Vladan Devedzic, Department of

Information Systems and Technologies, FON – School of Business

Administration, University of Belgrade

[5] Computational Web Intelligence and Granular Web Intelligence

for Web Uncertainty, Yan-Qing Zhang, Member, IEEE