4 IN 1 SEARCH ENGINE MASHUP

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission.

The First Mini-Conference in Web Technologies and Trends (WTT)

© 2009 Information Technology Department, CCIS, King Saud University, Riyadh, Saudi Arabia

Shahd I. AL-Foraih

Information Technology Department

College of Computer and Information Science, KSU

Riyadh, Saudi Arabia

[email protected]

ABSTRACT

Web 2.0 offers many technologies that help web developers create and develop web sites.

One of these technologies is the mashup, which combines two or more web sources and presents them in a new way.

This paper first defines the mashup, then describes the sources that can be combined, their formats, and the communication protocols used to reach them. It then introduces some editors that help beginners create a mashup application, discusses some existing examples, and finally presents and discusses a simple example that I created myself.

Keywords

Mashup, Web 2.0, API, RSS, ATOM, Screen Scraping, REST, SOAP

1. INTRODUCTION

The web has no shortage of sites that offer useful information or powerful functions, yet we still need more sites and services. So why not create a useful new application easily, quickly, and at low cost? Mashup techniques make this possible: with a mashup we can take the data or functions of existing sites and combine them into a new application.

The word mashup denotes combining two or more things and presenting them in a new way. In this context we are talking about web mashups, so what is combined is web content and the result is an application.

2. WHAT IS A WEB MASHUP?

A web mashup combines content or functionality from two or more sources (web services or web site feeds) to serve a new purpose in the form of a new application (a program, web site, or web service).

Mashups come in various types, such as mapping, search, and photos. Figure 1 shows the most common mashup types.

Figure 1. The most common mashup types as of 11 December 2008 [1]

2.1 Why do we use mashups?

The most important reason to use mashup technology when creating a new application is to save time and cost [2] by reusing data or functions that already exist.

2.2 Can a mashup come from one source?

The term mashup is derived from the idea of combining data from two or more sources and displaying it with a new look. However, a mashup can also use only a single source. For example, the site TwitterSpy pulls data only from Twitter.

TwitterSpy:
URL: http://twitspy.com/
Source: Twitter.
Description: Shows what people are posting on Twitter in real time by spying on the Twitter timeline. It also tracks website links within tweets [3].

3. WHAT SOURCES CAN I USE?

The sources of a mashup are typically other websites, and the developer may obtain their data in various ways, including but not limited to APIs, web feeds (RSS or Atom), and screen scraping [1].

3.1 API

An API (Application Programming Interface) is a set of functions that one program makes available to other programs so they can talk to it directly.

There are many types of APIs: operating system APIs, application APIs, toolkit APIs, and web site APIs.

The simplest example is an operating system such as Microsoft Windows, which provides services through hundreds of APIs. These operating system APIs are used by desktop applications such as word processors [1].

Another example is the Java API, which lets programmers use already-written methods or functions in their programs [2] instead of rewriting them from scratch, which saves time.

3.1.1 Web site APIs

Web site APIs provide access to external online services or databases.

Some websites offer web service APIs as a way of sharing some of their functionality and information across the Internet.

Many sites provide APIs for accessing their data or using their services in the development of mashups. Some of these sites offer their APIs for free, while others require permission before their APIs can be used. There are hundreds of APIs; the most popular are shown in Table 1 [2].
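As a rough illustration of how a mashup might use a web site API, the sketch below builds a request URL and decodes a JSON reply. The endpoint `https://api.example.com/search`, the parameter names (`q`, `key`, `limit`), and the shape of the response are all assumptions made for this example; a real provider documents its own. The reply is an inline sample so the code runs without network access.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint; a real provider documents its own URL and parameters.
BASE_URL = "https://api.example.com/search"


def build_request_url(query, api_key, limit=10):
    """Return the full URL for a search request (parameter names are assumed)."""
    params = {"q": query, "key": api_key, "limit": limit}
    return BASE_URL + "?" + urlencode(params)


def extract_titles(response_text):
    """Decode a JSON response and return the title of each result."""
    payload = json.loads(response_text)
    return [item["title"] for item in payload.get("results", [])]


# Stand-in for the text a provider might send back (assumed response shape).
SAMPLE_RESPONSE = '{"results": [{"title": "Riyadh map"}, {"title": "Riyadh photos"}]}'

if __name__ == "__main__":
    print(build_request_url("riyadh", "MY_KEY"))
    print(extract_titles(SAMPLE_RESPONSE))
```

The `api_key` parameter stands in for the permission step mentioned above: providers that require permission commonly identify the caller with some form of key.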

Table 1. The most popular APIs [1]

| Category | API | Description |
|---|---|---|
| Advertising | Google AdSense | Advertising management |
| Answers | Yahoo Answers | Community driven reference service |
| Blog Search | Technorati | Blog search services |
| Blogging | FeedBurner | Blog promotion tracking service |
| Bookmarks | del.icio.us | Social bookmarking |
| Calendar | Google Calendar | Calendar service |
| Chart | Google Chart | Chart creation service |
| Chat | MSN Messenger | Chat and messaging |
| Community | Twitter | Community site |
| | Facebook | Social networking service |
| Enterprise | Salesforce.com | CRM services |
| Events | Eventful | Events discovery and demand |
| Feeds | Google Ajax Feeds | Access RSS and Atom feeds with JavaScript |
| Internet | Amazon EC2 | Elastic Compute Cloud virtual hosting |
| | hostip.info | IP lookup |
| Job Search | indeed | Job search services |
| Mapping | Google Maps | Mapping services |
| | Microsoft Virtual Earth | Mapping services |
| | Yahoo Maps | Mapping services |
| Media Management | BBC | Multimedia archive database |
| Messaging | 411Sync | SMS, WAP, and email messaging |
| Music | Last.fm | Music playlist management |
| News | Digg | Community driven news links and ratings |
| Payment | PayPal | Online payments |
| Photos | Flickr | Photo sharing service |
| Search | Google Search | Search services |
| | Yahoo Search | Search services |
| | Google Ajax Search | Web search components |
| | Yahoo Image Search | Image search services |
| | Windows Live Search | Internet search |
| Shipping | FedEx | Package shipping |
| Shopping | Amazon eCommerce | Online retailer |
| | eBay | Online auction marketplace |
| Storage | Amazon S3 | Online storage services |
| Telephony | Skype | Internet communication |
| Tools | Google Mashup Editor | Mashup creation tool extensions API |
| Traffic | Yahoo Traffic | Traffic data and routing |
| Utility | Google Translate | Language translation service |
| Video | YouTube | Video sharing and search |
| Weather | WeatherBug | Weather forecast services |
| Widgets | Google Homepage | Portal gadgets |

3.2 Web feed

A web feed is a document (often XML) that contains content items with web links to longer versions. The two main web feed formats are RSS and Atom [4].

3.2.1 RSS

RSS (Really Simple Syndication) is a family of XML-based syndication formats. It enables a client to check the publisher's feed for new content and react to it in an appropriate manner [5].
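The idea of a client checking a feed for new content can be sketched as follows. This is a minimal example using Python's standard library on an inline RSS 2.0 sample; the feed contents, the `example.com` links, and the helper names `read_rss_items` and `new_items` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# A tiny inline RSS 2.0 document standing in for a publisher's feed.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Feed</title>
    <item>
      <title>First post</title>
      <link>http://example.com/1</link>
    </item>
    <item>
      <title>Second post</title>
      <link>http://example.com/2</link>
    </item>
  </channel>
</rss>"""


def read_rss_items(xml_text):
    """Return (title, link) pairs for every <item> in the feed."""
    root = ET.fromstring(xml_text)
    return [(item.findtext("title"), item.findtext("link"))
            for item in root.iter("item")]


def new_items(xml_text, seen_links):
    """Return only the items whose link the client has not seen before."""
    return [(title, link) for title, link in read_rss_items(xml_text)
            if link not in seen_links]


if __name__ == "__main__":
    # The client has already shown the first post, so only the second is new.
    print(new_items(SAMPLE_RSS, {"http://example.com/1"}))
```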

3.2.2 Atom

Atom is newer than RSS but similar. It is a proposed standard at the Internet Engineering Task Force (IETF) and seeks to maintain better metadata than RSS and to provide better and more rigorous documentation [5].
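To give a feel for the extra metadata, here is a small sketch that reads entries from an inline Atom sample, where each entry carries an identifier and an update timestamp alongside its title and link. The sample feed and the function name `read_atom_entries` are invented for this example; the namespace URI is the one Atom documents use.

```python
import xml.etree.ElementTree as ET

# Atom elements live in this XML namespace.
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

# A tiny inline Atom document standing in for a publisher's feed.
SAMPLE_ATOM = """<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example Feed</title>
  <updated>2008-12-11T10:00:00Z</updated>
  <entry>
    <title>First entry</title>
    <link href="http://example.com/1"/>
    <id>urn:example:1</id>
    <updated>2008-12-11T10:00:00Z</updated>
  </entry>
</feed>"""


def read_atom_entries(xml_text):
    """Return one dict per <entry> with its id, title, update time and link."""
    root = ET.fromstring(xml_text)
    entries = []
    for entry in root.findall("atom:entry", ATOM_NS):
        link = entry.find("atom:link", ATOM_NS)
        entries.append({
            "id": entry.findtext("atom:id", namespaces=ATOM_NS),
            "title": entry.findtext("atom:title", namespaces=ATOM_NS),
            "updated": entry.findtext("atom:updated", namespaces=ATOM_NS),
            # In Atom the URL is an attribute of <link>, not its text.
            "link": link.get("href") if link is not None else None,
        })
    return entries


if __name__ == "__main__":
    print(read_atom_entries(SAMPLE_ATOM))
```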

These technologies (RSS and Atom) are well suited to mashups that are based on frequently updated content.

3.3 Screen scraping

“Scraping is the process of using software tools to analyze content that was originally written for human in order to extract semantic data structures representative of that information that can be used and manipulated programmatically”. [4]

Screen scraping has two drawbacks. The first is that, unlike APIs with interfaces, scraping has no specific programmatic contract between content provider and content consumer. The second issue is the lack of sophisticated, reusable screen scraping toolkit software.
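A minimal sketch of screen scraping, using only Python's standard `html.parser` module, is shown below. The page markup, the `city` class name, and the `CityScraper` helper are all hypothetical. The last line of the demo illustrates the first drawback: because no contract binds the provider to this markup, a small change to the page silently breaks the scraper.

```python
from html.parser import HTMLParser

# Hypothetical page written for human readers, not for programs.
SAMPLE_PAGE = """
<html><body>
  <h1>Cities</h1>
  <table>
    <tr><td class="city">Riyadh</td><td class="country">Saudi Arabia</td></tr>
    <tr><td class="city">Jeddah</td><td class="country">Saudi Arabia</td></tr>
  </table>
</body></html>
"""


class CityScraper(HTMLParser):
    """Collect the text of every <td class="city"> cell."""

    def __init__(self):
        super().__init__()
        self.cities = []
        self._in_city_cell = False

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs.
        if tag == "td" and ("class", "city") in attrs:
            self._in_city_cell = True

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_city_cell = False

    def handle_data(self, data):
        if self._in_city_cell and data.strip():
            self.cities.append(data.strip())


def scrape_cities(html_text):
    """Run the scraper over one page and return the city names it found."""
    scraper = CityScraper()
    scraper.feed(html_text)
    return scraper.cities


if __name__ == "__main__":
    print(scrape_cities(SAMPLE_PAGE))
    # If the site renames the class, the scraper finds nothing.
    print(scrape_cities(SAMPLE_PAGE.replace('class="city"', 'class="town"')))
```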

3.3.1 Why do we use screen scraping?

Developers originally resorted to screen scraping mainly because APIs were lacking. Today they may still use it because some interesting data sources, such as Wikipedia and most government and public-domain web sites, do not (yet) provide APIs. Content from sites like these is therefore extracted by screen scraping.

4. WHICH PROTOCOLS DO WE USE?

The data retriever can communicate with remote services (providers) through web protocols such as REST and SOAP [5]. Figure 3 shows the most commonly used protocols.

Figure 3. The protocols most used by APIs as of 11 December 2008 [1]

4.1 SOAP

SOAP (Simple Object Access Protocol) is a fundamental technology of the Web Services paradigm. Its focus has shifted from object-based systems towards the interoperability of message exchange. There are two key components of the SOAP specification. The first is the use of an XML message format for platform-agnostic encoding, and the second is the message structure, which consists of a header and a body [5].
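The header-and-body structure can be sketched by building a small SOAP message with Python's standard library. The envelope namespace is the SOAP 1.1 one; the application namespace `http://example.com/search` and the element names `AuthToken`, `SearchRequest`, and `Query` are hypothetical and stand in for whatever a real service defines.

```python
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace.
SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
# Hypothetical namespace for the application's own elements.
APP_NS = "http://example.com/search"


def build_soap_request(query, token):
    """Return a SOAP envelope (as text) with one header entry and one body entry."""
    ET.register_namespace("soap", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")

    # Header: information about the message, here an assumed auth token.
    header = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Header")
    ET.SubElement(header, f"{{{APP_NS}}}AuthToken").text = token

    # Body: the actual request payload.
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    request = ET.SubElement(body, f"{{{APP_NS}}}SearchRequest")
    ET.SubElement(request, f"{{{APP_NS}}}Query").text = query

    return ET.tostring(envelope, encoding="unicode")


if __name__ == "__main__":
    print(build_soap_request("mashup", "secret-token"))
```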

4.2 REST

REST (Representational State Transfer) is a technique of web-based communication using just HTTP and XML. It is simpler than SOAP. REST fundamentally supports only a few operations (POST, GET, PUT, and DELETE) that are applicable to all pieces of information. These pieces of information are called resources. You can retrieve a record through a GET operation, update it through a PUT operation, and so on.
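As a sketch of how the four operations apply uniformly to a resource, the code below prepares (but does not send) HTTP requests with Python's standard `urllib`. The base URL `https://api.example.com/records`, the operation names, and the XML body are assumptions for illustration only.

```python
from urllib.request import Request

# Hypothetical collection of "record" resources.
BASE = "https://api.example.com/records"

# The same small set of HTTP methods applies to every resource.
METHODS = {"create": "POST", "read": "GET", "update": "PUT", "delete": "DELETE"}


def rest_request(operation, record_id=None, body=None):
    """Prepare the HTTP request for one operation on one resource (not sent)."""
    url = BASE if record_id is None else f"{BASE}/{record_id}"
    data = body.encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/xml"} if body is not None else {}
    return Request(url, data=data, headers=headers, method=METHODS[operation])


if __name__ == "__main__":
    for op, rid, body in [
        ("create", None, "<record><name>new</name></record>"),
        ("read", 7, None),
        ("update", 7, "<record><name>changed</name></record>"),
        ("delete", 7, None),
    ]:
        req = rest_request(op, rid, body)
        print(req.get_method(), req.full_url)
```

Passing a prepared request to `urllib.request.urlopen` would actually send it; that step is left out here because the endpoint is fictional.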

5. IS THERE ANY TOOL THAT CAN HELP TO CREATE A NEW MASHUP?

Yes, there are already several mashup editors that help users create or edit mashups.

Some of these editors work online, while others must be installed on your computer.

The most common mashup editors are Google Mashup Editor, Yahoo Pipes, Microsoft Popfly, Open Mashup Studio, and Liquid Apps.

6. MASHUP EXAMPLES

There are many mashups, and lists of them can be found on two sites dedicated to mashups [1], [6]. Here we will look at two examples.

6.1 2lingual

URL: http://www.2lingual.com/
Source: Google AJAX Language, Google Ajax Search.
Description: 2lingual lets users run bilingual searches of the World Wide Web. In addition, users can translate their search terms into 35 different languages.

6.2 WhereAmI.At

URL: http://whereami.at/
Source: Flickr, Google Maps, Google Search, hostip.info.
Description: WhereAmI.At can tell you where you are and display some images from Flickr for the town you are in.

7. 4 IN 1 SEARCH ENGINE MASHUP

It is easy to create a mashup: first settle on the idea, then look for the sources, and finally start coding. You can even create a mashup without much coding by using an editor such as Yahoo Pipes.

I created my example, the 4 in 1 search engine, using the Yahoo Pipes tools and RSS web feeds from the sites listed below.

4 in 1 Search Engine:
URL: http://pipes.yahoo.com/shahd/4in1_search_engine
Source: Yahoo Search, Yahoo Maps, Flickr, YouTube, Google Blogs Search.
Description: With the 4 in 1 search engine you can search for anything and get the results in four different formats from the sites it searches: videos from YouTube, blogs from Google Blogs Search, web results from the Yahoo search engine, or images from Flickr, with each image placed at its location on Yahoo Maps.

Image 1 shows a screenshot of a result from the 4 in 1 search engine.

Image 1. Screenshot of the 4 in 1 search engine.
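The mashup itself was built in Yahoo Pipes rather than in code, but the underlying idea of sending one query to several sources and merging the results can be sketched in Python. Everything below is an illustrative assumption, not the actual pipe: the `fetch_*` functions are stand-ins that return canned results instead of reading real feeds, and the field names and coordinates are made up.

```python
def search_all(query, sources):
    """Send one query to every source and label each result with its format."""
    combined = []
    for result_format, fetch in sources.items():
        for item in fetch(query):
            combined.append({"format": result_format, **item})
    return combined


# Hypothetical stand-ins for the four feeds; each returns a list of dicts.
def fetch_videos(query):
    return [{"title": f"{query} (video)", "link": "http://example.com/video/1"}]


def fetch_blogs(query):
    return [{"title": f"{query} (blog post)", "link": "http://example.com/blog/1"}]


def fetch_web(query):
    return [{"title": f"{query} (web page)", "link": "http://example.com/web/1"}]


def fetch_images(query):
    # Image results carry a location so they can be placed on a map.
    return [{"title": f"{query} (image)", "link": "http://example.com/img/1",
             "lat": 24.71, "lon": 46.68}]


SOURCES = {
    "video": fetch_videos,
    "blog": fetch_blogs,
    "web": fetch_web,
    "image": fetch_images,
}

if __name__ == "__main__":
    for result in search_all("riyadh", SOURCES):
        print(result["format"], result["title"], result["link"])
```

Keeping each source behind a function with the same shape means a new source can be added by registering one more entry in `SOURCES`.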

8. CONCLUSION

Web 2.0 offers many technologies that help web developers create and develop web sites. The mashup discussed here is one of them. But as we have seen, mashup technology cannot be used alone: it must be combined with other technologies, such as RSS and Atom, to develop a new web site.

9. REFERENCES

[1] Programmable Web. Located on the Internet at http://www.programmableweb.com/. Last visited: 11 December, 2008.

[2] Dive Into Web 2.0.

[3] Daniel Nations, What is a Mashup? Exploring Web Mashups, About.com: Web Trends. Located on the Internet at http://webtrends.about.com/od/webmashups/a/what-is-mashup.htm. Last visited: 4 December, 2008.

[4] Mashup (web application hybrid), Wikipedia. Located on the Internet at http://en.wikipedia.org/wiki/Mashup_(web_application_hybrid). Last visited: 4 December, 2008.

[5] Duane Merrill, Mashups: The new breed of Web app, IBM.com. Located on the Internet at http://www-128.ibm.com/developerworks/library/x-mashups.html?ca=dgr-lnxw16MashupChallenges. Last visited: 5 December, 2008.

[6] Mashup Awards. Located on the Internet at http://mashupawards.com/. Last visited: 11 December, 2008.