Cyboinsect Nov14 Issue-7

Intel has released an Arduino-certified development board called Galileo. It is a microcontroller board based on the Intel Quark SoC X1000 application processor, a 32-bit Intel Pentium-class system on a chip.

It is designed for the maker and education communities. Intel Galileo can be programmed from OS X, Microsoft Windows and Linux. Galileo is made to support shields that operate at either 3.3 V or 5 V. The core operating voltage of Galileo is 3.3 V; however, a jumper on the board enables voltage translation to 5 V at the I/O pins.

The Intel Galileo board is software-compatible with the Arduino software development environment, which makes getting started a snap. In addition, the Intel Galileo board has several PC-industry-standard I/O ports and features that expand native usage and capabilities beyond the Arduino shield ecosystem.

The Galileo board also supports 10/100 Mbit Ethernet, SD, a USB 2.0 device port and EHCI/OHCI USB host ports, a high-speed UART, an RS-232 serial port, 8 MB of programmable NOR flash, and a JTAG port for easy debugging. The 5 V jumper setting, which supports 5 V Uno shields, is the default behaviour.

Raspberry Pi vs. Galileo

The Galileo board is built around a 400 MHz Pentium-class system on a chip (SoC) called "Quark". The Raspberry Pi (RPi) is normally clocked at 700 MHz, but is easily overclocked. The Raspberry Pi is best for handling media such as photos or video, while Galileo is an excellent choice for projects involving sensors, monitoring or productivity-related applications.

The RPi model's price is about half the cost of the Galileo board, but there are hidden costs with the RPi. To use an RPi you need a USB power supply and an SD card with the boot code installed; you may also need a keyboard, a mouse and an HDMI-to-DVI cable. Galileo can boot from on-board memory, whereas the RPi boots from an SD card, so it requires formatting a card and copying the image before booting for the first time. Some further points of comparison:

- Both boards use a 32-bit instruction set. Galileo's Quark SoC carries 512 KB of embedded SRAM (plus 256 MB of DRAM on the board), while the RPi Model B has 512 MB of RAM.
- Galileo has a 16 KB L1 cache, whereas the RPi has a 32 KB L1 cache and a 128 KB L2 cache shared between the CPU and GPU.
- Galileo has no on-board video or audio output, whereas the RPi provides both, including a 3.5 mm stereo jack.
- Galileo has 6 analog inputs, whereas the RPi has 17 GPIO pins; they offer 14 and 8 digital I/O pins respectively.
- Galileo includes a reset button; the RPi does not.
- Galileo has no camera interface, while RPi boards are available with a camera.
- The Galileo board ships with a 15 W (5 V) power supply, whereas the RPi draws roughly 3.5 W.

- Rachana Solanki

WIRELESS GENERATION

The generation we are living in is called 'THE GENERATION OF WIRELESS COMMUNICATION.' Here, wireless refers to connections that need no physical wiring or electrical conductor between the communicating devices. It is as familiar as switching the television on and off with a remote.

The simplest and most common example of wireless technology is radio. Radio is a communication medium that connects people with the outside world. When radio was invented it quickly became popular, as it was the first wireless device and acted as a medium between people and their environment. Because it was easy to carry, nearly every house had its own radio at the time. After radio, cordless phones, wireless keyboards and mice, and similar devices became part of our daily lives. The world's first wireless telephone was invented in 1880 by Alexander Graham Bell and Charles Sumner Tainter, who demonstrated the photophone, a telephone that carried conversations wirelessly over a beam of light.

With the rapid development of the wireless field, the most talked-about progression is the five generations of mobile telecommunication, i.e. 1G, 2G, 3G, 4G and 5G, where G stands for GENERATION.

1G (First Generation)

When this generation was in its development stage, most organisations had not yet thought about standardising mobile technology. The mobile communication market was run by governments, and standardisation was not a priority for them. However, an interesting effort came from countries such as Denmark, Finland, Iceland and Norway, which began standardising mobile communications.

1G mobile communication had limited capacity, serving only niche markets such as the military, government and specialised industries. In the 1960s and 1970s the service was limited, and handsets were so large that they were usually installed in cars and trucks. This type of mobile communication was not ready for mass adoption because: (1) it had limited capacity to serve the public; (2) it had limited capacity to cover large areas; (3) mobile devices were large; and (4) mobile devices were expensive. Cellular technology is what differentiates 1G from the systems that came before it. Before 1G, mobile communications focused on building a base station that could serve a large area; the coverage of one base station was about 50 miles, enough to serve a metropolitan city. But within the given frequency band, only a small number of subscribers could use the mobile communication channel at the same time, and this constraint gave rise to the "cellular phone". As these problems were resolved, 1G communications based on analog cellular technology became feasible during the 1980s in countries such as Japan and the European nations.

2G (Second Generation)

While 1G was based on analog signals, 2G was based on digital signals. When mobile phones were first introduced, there were no texting services and users faced terrible connections. After 2G came into existence, it brought the capability of transferring data and retrieving information, but the speed was sluggish, 9.6 kb/s, slower than many expected. Slowly the transfer rates were raised as the technology improved; the speed increased up to 56 kb/s, and we thought it was fast until…!

3G (Third Generation)

… until 3G was developed. The third generation of mobile technology delivered high speed, roughly four times quicker than 2G. The initial speed was 200 kb/s, and a steady flow of improvements pushed the maximum up to 7.2 Mbps. The higher speeds were often just numbers, because you do not reach the top rates unless you are in the right spot at the right time. 3G modernised the telecommunications industry: besides faster access, it provided value-added services such as video calling, live streaming, mobile internet access and IPTV on mobiles. All of this is possible because a 3G connection supplies the required bandwidth. 3G technology is designed for multimedia communication and provides higher data transfer rates. One of its significant features is global roaming, which means the user can move across borders with the same number and handset.

4G (Fourth Generation)

4G is a set of standards for providing broadband internet access to devices such as cellphones and tablets. The main difference between 4G and earlier standards is a huge increase in data transfer speeds, letting people access many different types of media. Although it first became available in the US in 2009, no particular technologies had been designated as 4G until 2011; as the field developed, 4G became popular. A 4G mobile device must be IP-based and must provide data speeds up to 100 megabits per second (Mbps) while the user is moving and up to 1 gigabit per second (Gbps) while the device is stationary. The device should also support all-digital voice and streaming video, and transmissions have to be secured. There are further technical features a 4G device should have, such as the wireless standard, radio interface and frequency spectrum. As of 2011, only two technologies had been designated 4G: LTE-Advanced and WiMAX Release 2.

5G (Fifth Generation)

5G is the next major phase of mobile telecommunications. It will offer the best and most outstanding features compared with previous technologies, but no specific characteristics or features have been specified yet; updated standards defining capabilities beyond those of the current ITU 4G standards are still under consideration. Roughly every ten years a new mobile generation has appeared: the 1G system was introduced in 1981, followed by 2G in 1991, 3G in 2001 and 4G in 2012, each generation taking about a decade to develop. According to various sources, 5G is expected to be introduced in the early 2020s. No international projects have yet been set up for 5G, and there is no clear agreement on what exactly 5G will be. If introduced, it is expected to offer higher data volume per unit area, lower battery consumption, high bit rates over large portions of the coverage area, lower latencies, a higher number of supported devices, lower infrastructure costs, high versatility and scalability, and more reliable communications. These are some of the objectives expected from 5G.

- Priyanka Sable

Last Magazine Quiz Answers

1) word processors
2) software program
3) Vishal Sikka
4) Palo Alto; Andy Rubin
5) McAfee security
6) ASIMO
7) George Eastman
8) TIF, JPG, PNG, GIF
9) million
10) 1982
11) 1995

Starting with this issue, we are planning to introduce an online quiz every Friday, and the winner will be announced on our Facebook page.

To take part, visit our Facebook page: www.facebook.com/cyboinsect.

Digital Inspiration : A genesis of hot tech tips

Interested in little tips and tricks that can probably make your life a little easier and better? Then we have the à la carte style meal for you. Actually, it's not that à la carte: all the content comes as a stream of articles, and there is only one real category, or say, section: "tech!". But anyway, Digital Inspiration, at www.labnol.org, provides a great deal of interesting hot tech tips that you might actually need in daily life.

The tech guide features life hacks like using Evernote as a password-strength measuring tool. It also provides detailed tutorials and how-to articles such as "How to make money on the internet?". "Ahem-ahem"... Digital Inspiration provides you with tips and tricks about the newest versions of available software and related services out there.

When you first open the Digital Inspiration website, it literally inspires you and gives you all the motivation you need to download and install an adblock extension for your respective browser. Seriously, there are a lot of ad banners. But luckily, you can get rid of those annoying and distracting ads while reading a very well-written article about a cool tech trick by using the couch-mode option, which secretly translates to "ad-free" mode. On a mobile phone, the website is fluid and laminar. Also, there is one less ad.

Now let's talk about the most crucial thing, "content quality". Content quality of any media is of paramount importance. Digital Inspiration is basically a blog, eloquently written by Amit Agarwal. He claims to be the root of Indian professional blogging, with how-tos and tech tutorials dating back to 2004. The man has an active Flickr account!! Don't know what Flickr is... you're not alone. Honestly, in the tech field no one really cares that you were first; one must be really good, and hands down, Amit is extraordinary.

He is great at articulating his thoughts and emphatic in his writing. No wonder he gets glowing reviews from all over the globe praising his blog, which, if you forgot, is Digital Inspiration at labnol.org. He has a computer science degree from IIT, which is not really that shocking. Being a very skilled IT professional, he also offers services like WordPress optimisation, AdSense integration, website reviews and much more, at a hefty price.

So, next time you start a website or a blog that reaches a level of popularity at which you feel you need technical help, remember Amit Agarwal.

Also, his articles are featured on Flipboard. You probably might have read and enjoyed his articles but failed to remember, or even read, his name, you know, like normal people. Well, now you do know about him, and next time you see his name under the headline of a Flipboard article, be sure that the article will be a great read.

Amit Agarwal

Founder Of Digital Inspiration

Amit Agarwal holds an engineering degree in Computer Science from IIT and previously worked at ADP Inc. for clients such as Goldman Sachs and Merrill Lynch. In 2004, Amit quit his job to become India's first and only professional blogger.

- Utsav Jain

Cover Story

Google spiders unleashed on Web

The Google search engine has undoubtedly become the most powerful search engine. It would be practically impossible to find the information you need without a search engine. Every search engine has an algorithm for this task, and so does Google; about it, Google shares only general facts, keeping the details secret so as to remain competitively strong and to reduce the chance of someone finding out how to abuse the system.

How do you make idli, dosa, vada pav or a cake? What is the coldest place on earth? How do you impress a girl? Why does a person snore? Where do you get the best pizzas in your city?

Where do you go and look for answers to all these questions? The most accessible, accurate and efficient answers are provided by none other than "THE GREAT AND GRAND GOOGLE". Would it be wrong to say that, like roti, kapda, makaan and cell phones, Google too has become a basic necessity for most people of this era? But did we ever try to figure out the mystery of how GOOGLE figures out all the mysteries, confusions and queries generated by our minds?

A web crawler starts with a list of Uniform Resource Locators (URLs) to visit, called the seeds. As the crawler visits these URLs, it identifies all the hyperlinks in each page and adds them to the list of URLs to visit; this list is called the crawl frontier. URLs from the frontier are recursively visited according to a set of policies. If the crawler is performing archiving of websites, it copies and saves the information as it goes. Such archives are usually stored so that they can be viewed, read and navigated as they were on the live web, but they are preserved as 'snapshots'. The sheer volume of the web means a crawler can only download a limited number of pages within a given time, so it needs to prioritise its downloads. The web's high rate of change means that by the time the crawler reaches a page, it may already have been updated or even deleted.
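As a rough illustration of the seed-and-frontier idea described above (not Google's actual crawler), here is a minimal sketch in Python using only the standard library; the seed URL and page limit are placeholder choices.

from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen

class LinkExtractor(HTMLParser):
    """Collect the href value of every <a> tag on a page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def crawl(seeds, max_pages=50):
    frontier = deque(seeds)   # URLs still to visit: the "crawl frontier"
    visited = set()
    while frontier and len(visited) < max_pages:
        url = frontier.popleft()
        if url in visited:
            continue
        try:
            html = urlopen(url, timeout=5).read().decode("utf-8", errors="replace")
        except (OSError, ValueError):
            continue          # unreachable or unsupported URL: skip it
        visited.add(url)
        parser = LinkExtractor()
        parser.feed(html)
        for link in parser.links:
            absolute = urljoin(url, link)              # resolve relative links
            if absolute.startswith(("http://", "https://")) and absolute not in visited:
                frontier.append(absolute)              # grow the frontier
    return visited

# Placeholder seed list; a real crawler would load its seeds from configuration.
print(crawl(["https://example.com/"], max_pages=10))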


The number of possible URLs, many of them generated by server-side software, has also made it difficult for web crawlers to avoid retrieving duplicate content. Endless combinations of HTTP GET (URL-based) parameters exist, of which only a small selection will actually return unique content. For example, a simple online photo gallery may offer a few options to users, specified through HTTP GET parameters in the URL. If there are four ways to sort images, three choices of thumbnail size, two file formats, and an option to disable user-provided content, then the same set of content can be accessed through 48 different URLs, all of which may be linked on the site. This combinatorial explosion creates a problem for crawlers, as they must sort through endless combinations of relatively minor scripted changes in order to retrieve unique content.
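The 48 comes from multiplying the choices: 4 × 3 × 2 × 2. A tiny Python sketch (with made-up parameter names and a placeholder gallery URL) makes the explosion concrete:

from itertools import product
from urllib.parse import urlencode

# Hypothetical GET parameters for the photo-gallery example above.
sort_orders  = ["name", "date", "size", "rating"]   # 4 ways to sort
thumb_sizes  = ["small", "medium", "large"]          # 3 thumbnail sizes
formats      = ["jpg", "png"]                        # 2 file formats
user_content = ["on", "off"]                         # toggle user-provided content

urls = [
    "https://example.com/gallery?" + urlencode({"sort": s, "thumb": t, "fmt": f, "user": u})
    for s, t, f, u in product(sort_orders, thumb_sizes, formats, user_content)
]
print(len(urls))   # 4 * 3 * 2 * 2 = 48 distinct URLs for the same content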


Google relies on automated programs called 'spiders' or 'crawlers' to build a giant index of keywords and of where those words can be found. What sets Google apart is how it ranks search results, which in turn determines the order in which Google displays results on its search engine results page (SERP). Google uses a trademarked algorithm called PageRank, which assigns each web page a relevancy score.

As Edwards et al. noted, “Given that the bandwidth for conducting crawls is neither infinite nor free, it is becoming essential to crawl the Web in not only a scalable, but efficient way, if some reasonable measure of quality or freshness is to be maintained.” A crawler must carefully choose at each step which pages to visit next.

A) Revisit policy

Crawling the web can be very time-consuming and complicated; crawling even a fraction of the web can take weeks or months. By the time the crawler finishes a pass over the pages, many changes may already have been made, such as the creation of new pages, modifications and deletions.

From the search engine's point of view, there is a cost associated with not detecting an event, and thus having an outdated copy of a resource. The most-used cost functions are freshness and age.

Freshness: This is a binary measure that indicates whether the local copy is accurate or not. The freshness of a page p in the repository at time t is defined as: F_p(t) = 1 if the local copy of p is identical to the live page at time t, and F_p(t) = 0 otherwise.

Age: This is a measure that indicates how outdated the local copy is. The age of a page p in the repository at time t is defined as: A_p(t) = 0 if p has not been modified since it was last crawled, and A_p(t) = t − (time at which p was last modified) otherwise.
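Taken literally, the two measures are trivial to compute once the crawl and modification times are known. The sketch below is simply a transcription of the definitions above into Python, with timestamps passed in as plain numbers:

def freshness(copy_matches_live: bool) -> int:
    # Binary freshness: 1 if our stored copy equals the live page, else 0.
    return 1 if copy_matches_live else 0

def age(now: float, last_modified_on_web: float, last_crawled: float) -> float:
    # Age: 0 while our copy is still current (the page has not changed since
    # we crawled it); otherwise the time elapsed since the live page changed.
    if last_modified_on_web <= last_crawled:
        return 0.0
    return now - last_modified_on_web

# A page we crawled at t=90 that changed on the web at t=100; it is now t=130.
print(freshness(False))   # 0  -> our copy is stale
print(age(130, 100, 90))  # 30 -> the copy has been out of date for 30 time units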

Some simple revisit policies are:

1) Uniform policy: all pages in the collection are revisited with the same frequency, regardless of their rate of change.

2) Proportional policy: pages with a higher rate of change are revisited more often.

To improve freshness, the crawler should penalise the elements that change too often. The optimal revisiting policy is neither the uniform policy nor the proportional policy. The optimal method for keeping average freshness high includes ignoring the pages that change too often, and the optimal method for keeping average age low is to use access frequencies that monotonically (and sub-linearly) increase with the rate of change of each page. In both cases, the optimal policy is closer to the uniform policy than to the proportional policy.

Crawling policy

The behaviour of a web crawler is the outcome of a combination of policies:

• a selection policy – determines which pages to download;

• a re-visit policy – determines when to check pages for changes;

• a politeness policy – states how to avoid overloading websites;

• a parallelization policy – states how distributed crawlers are coordinated.

The factors which define the PageRank of any web page are:

1) The frequency and location of keywords within the web page: If the keyword only appears once within the body of a page, it'll receive a low score for that keyword.

2) How long the web page has existed: people create new web content every day, and not all of it sticks around for long. Google places more value on pages with a longer history.

3) The number of other web pages that link to the page in question: Google looks at how many web pages link to a given page to determine its relevance. Because Google treats a link to a web page as a vote, it is not simple to cheat the system. The best way to make your website rank high in Google's search results is to produce good content so that people link back to your page. The more links your page gets, the higher its PageRank score will be. If you attract the attention of websites that themselves have a high PageRank score, your score will grow faster.
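Google's production algorithm is secret, but the published PageRank idea, links acting as votes that are propagated iteratively, can be sketched in a few lines of Python. The four-page link graph below is invented purely for illustration:

def pagerank(links, damping=0.85, iterations=50):
    """links maps each page to the list of pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}              # start with equal scores
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outgoing in links.items():
            if not outgoing:                        # dangling page: share evenly
                for p in pages:
                    new_rank[p] += damping * rank[page] / n
            else:
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share       # each link acts as a vote
        rank = new_rank
    return rank

# Hypothetical four-page web: C is linked from every other page, so it ranks highest.
graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
print(pagerank(graph))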

B) Politeness policy

Crawlers are much quicker and more efficient at finding information than a human searcher, and this obviously has a deep and direct impact on the performance of any website. Needless to say, if a single crawler performs multiple requests per second and/or downloads large files, a server will have a hard time keeping up with requests from multiple crawlers.

Web crawlers are very useful and are capable of performing many tasks, but they come at a cost, which includes:

1) network resources;

2) server overload;

3) poorly written crawlers, which can crash servers or routers;

4) personal crawlers that, if deployed by too many users, can disrupt networks and web servers.

What can be the solution to this? Partially, the ROBOTS EXCLUSION PROTOCOL: a standard that lets administrators indicate which parts of their web servers should not be accessed by crawlers. Some crawlers also honour an extra 'Crawl-delay' parameter in the robots.txt file, indicating the number of seconds to wait between requests.
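Python's standard library already ships a parser for the Robots Exclusion Protocol, so a polite crawler can check both permissions and any requested crawl delay before fetching anything. The site URL and crawler name below are placeholders:

from urllib.robotparser import RobotFileParser

# Hypothetical site; a polite crawler fetches /robots.txt before anything else.
robots = RobotFileParser("https://example.com/robots.txt")
robots.read()

agent = "CyboBot"                        # made-up crawler name
if robots.can_fetch(agent, "https://example.com/private/report.html"):
    delay = robots.crawl_delay(agent)    # None if the site sets no Crawl-delay
    print("allowed; waiting", delay or 0, "seconds between requests")
else:
    print("disallowed by robots.txt")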

C) Parallelization policy

A crawler capable of running multiple crawling processes simultaneously is called a parallel crawler. The goal is to maximise the download rate while minimising the overhead from parallelization, and to avoid repeated downloads of the same page. Because the same URL can be found by two different crawling processes, the crawling system requires a policy for assigning the new URLs discovered during the crawl.
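One simple way to implement such an assignment policy, sketched below under the assumption of a fixed number of worker processes, is to hash each URL's host name so that every URL deterministically belongs to exactly one crawler:

import hashlib
from urllib.parse import urlparse

NUM_CRAWLERS = 4   # arbitrary number of parallel crawling processes

def assigned_crawler(url: str) -> int:
    # Deterministically map a URL's host to one crawler process, so two
    # processes never fetch the same site (and hence never the same page).
    host = urlparse(url).netloc.lower()
    digest = hashlib.sha1(host.encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_CRAWLERS

for u in ["https://example.com/a", "https://example.com/b", "https://example.org/"]:
    print(u, "-> crawler", assigned_crawler(u))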

ARCHITECTURE

A crawler needs a highly optimised architecture in addition to an effective crawling policy. As we have gathered, crawlers are the central part of any search engine, and like Google, all search engines hide the details of their algorithms and implementations for business reasons. Crawler designs are published in a way that withholds enough detail to prevent anyone else from reproducing the work. There are also emerging concerns about "search engine spamming", which prevent major search engines from publishing their ranking algorithms.

Crawler Identification

There is a User-Agent field in every HTTP request, and web crawlers identify themselves to a web server through it. Administrators can check which web crawlers have visited their website, and how often, using these User-Agent strings. It is important for web crawlers to identify themselves so that website administrators can contact the owner if needed; in some cases a crawler may be accidentally trapped in a crawler trap, or may be overloading a web server with requests, and the owner needs to stop it. Identification is also useful for administrators who want to know when their web pages are likely to be indexed by a particular search engine.
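Setting that identification string is a one-line affair when making a request; the crawler name and contact URL in the sketch below are invented for illustration:

from urllib.request import Request, urlopen

# Hypothetical identification string: name, version, and a contact address
# so an administrator can reach the crawler's owner if something goes wrong.
headers = {"User-Agent": "CyboBot/1.0 (+https://example.com/bot-info)"}

request = Request("https://example.com/", headers=headers)
with urlopen(request, timeout=5) as response:
    print(response.status, len(response.read()), "bytes fetched")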

There is a hell of a lot more to know about crawlers. But now you know who fetches the answers to your 'WHAT, WHY, HOW, WHERE' whenever you use Google (or any other search engine). Be grateful to our web spiders.

- Krupali Rana
