CAPTCHA Shashwat Shriparv [email protected] InfinitySoft


CAPTCHA

Shashwat Shriparv

[email protected]

InfinitySoft

CAPTCHA

• CAPTCHA is an acronym for "Completely Automated Public Turing test to tell Computers and Humans Apart".

• A CAPTCHA is a challenge-response test used in computing to determine whether the user is human.

• The term was coined in 2000 by Luis von Ahn, Manuel Blum, Nicholas Hopper and John Langford of Carnegie Mellon University, who developed the first CAPTCHA.

CAPTCHA

• A common type of CAPTCHA requires the user to type the letters shown in a distorted image, sometimes with the addition of an obscured sequence of letters or digits that appears on screen.

• The user has to type this string to submit a form. This is a simple problem for humans but a very hard one for computers, which have to use character recognition, because the displayed string is distorted in a way that makes it very hard for a computer to decode.

• Early CAPTCHAs, such as these distorted images generated by the EZ-Gimpy program, were used on Yahoo!.

CAPTCHA

A program that can generate and grade

tests that:

1. Most humans can pass

2. Current computer programs cannot pass

Contd…

• The concept of a CAPTCHA is motivated by real-world problems faced by internet companies such as Yahoo! and AltaVista.

• These companies offer free email accounts, intended for use by humans.

• However, they found that many online vendors were using "bots", computer programs that would sign up for thousands of email accounts, from which they could send out masses of junk email.

Origin

• The first discussion of automated tests which

distinguish humans from computers for the

purpose of controlling access to web services

appears in a 1996 manuscript of Moni Naor from

the Weizmann Institute of Science, entitled

"Verification of a human in the loop, or

Identification via the Turing Test".

• Primitive CAPTCHAs seem to have been developed later, in 1997, at AltaVista by Andrei Broder and his colleagues to prevent bots from adding URLs to their search engine.

Contd…

• In order to make the images resistant to

OCR (Optical Character Recognition), the

team simulated situations that scanner

manuals claimed resulted in bad OCR.

• In 2000, von Ahn and Blum developed

and publicized the notion of a CAPTCHA,

which included any program that can

distinguish humans from computers.

Characteristics

• A CAPTCHA system is an automated means of generating new challenges that current computers are unable to solve accurately, but most humans can solve.

• CAPTCHAs are by definition fully automated, requiring little human maintenance or intervention in administering the test.

• This has obvious benefits in cost and reliability. By definition, the algorithm used to create the CAPTCHA must be made public, though it may be covered by a patent.

Accessibility

• Because CAPTCHAs rely on perception, users unable to perceive a CAPTCHA due to a disability (such as blindness) will be unable to perform the task protected by a CAPTCHA. In certain cases, failing to provide a universally accessible means of bypassing the CAPTCHA could make site owners a target of litigation.

• In order to combat this problem, many implementations of CAPTCHAs permit users to opt for an audio CAPTCHA in addition to a text-based one.

Contd…

• While the combination of an audio and a visual CAPTCHA cannot satisfy all users (for example, those with deafblindness), the choice of adding a CAPTCHA to an application is a balance between ease of use for legitimate users and creating enough of a challenge for abusers that abusing the application is not worthwhile.

Contd…

• The inconvenience caused by a

CAPTCHA is sometimes higher for users

with disabilities. For some applications, the

potential for abuse is so high that the

application author feels that a CAPTCHA

is necessary. For other applications, the

need for accessibility outweighs the abuse

that a CAPTCHA would prevent.

EZ-Gimpy

• EZ-Gimpy and Gimpy, the CAPTCHAs that we have broken, are examples of word-based CAPTCHAs.

• In EZ-Gimpy, the CAPTCHA used by Yahoo!, the user is presented with an image of a single word.

• This image has been distorted, and a cluttered, textured background has been added.

• The distortion and clutter are sufficient to confuse current OCR (optical character recognition) software.

Contd…

• However, using our computer vision techniques

we are able to correctly identify the word 92% of

the time.

• Gimpy is a more difficult variant of a word-based

CAPTCHA. Ten words are presented in

distortion and clutter similar to EZ-Gimpy.

• The words are also overlapped, providing a

CAPTCHA test that can be challenging for

humans in some cases.

Generating CAPTCHAs: Bongo

Answer: left

Contd…

• Bongo displays two series of blocks, the Left and the Right.

• The blocks in the Left series differ from those in the Right, and the user must find the characteristic that sets them apart.

• Then, the user is presented with a single block and is asked to determine whether this block belongs to the Right series or to the Left.

Sound Based CAPTCHAs: Eco

• A sound-based CAPTCHA picks a word or a sequence of numbers at random, renders the word or the numbers into a sound clip, and distorts the sound clip.

• It then presents the distorted sound clip to its user and asks them to enter the contents of the sound clip.
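The pick, render, distort and grade steps above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: a real sound-based CAPTCHA renders speech, whereas the hypothetical `render_clip` below substitutes one sine tone per digit, and the names `SAMPLE_RATE`, `make_challenge` and `grade` are invented for this sketch.

```python
import math
import random

SAMPLE_RATE = 8000  # samples per second (assumed value for this sketch)


def pick_digits(rng, length=4):
    """Pick a random sequence of digits to use as the challenge answer."""
    return "".join(rng.choice("0123456789") for _ in range(length))


def render_clip(digits, tone_seconds=0.2):
    """Placeholder renderer: one sine tone per digit instead of real speech."""
    samples = []
    n = int(SAMPLE_RATE * tone_seconds)
    for d in digits:
        freq = 300 + 50 * int(d)  # arbitrary digit-to-frequency mapping
        samples.extend(
            math.sin(2 * math.pi * freq * i / SAMPLE_RATE) for i in range(n)
        )
    return samples


def distort(samples, rng, noise_level=0.3):
    """Add uniform noise to every sample and clamp the result to [-1, 1]."""
    return [
        max(-1.0, min(1.0, s + rng.uniform(-noise_level, noise_level)))
        for s in samples
    ]


def make_challenge(seed=None):
    """Return (answer, distorted_clip); only the clip is shown to the user."""
    # random.Random is not cryptographically secure; a real deployment
    # would draw the answer from a secure source such as the secrets module.
    rng = random.Random(seed)
    answer = pick_digits(rng)
    return answer, distort(render_clip(answer), rng)


def grade(answer, response):
    """The user passes if the typed response matches the hidden answer."""
    return response.strip() == answer
```

The server would keep `answer` to itself, play the clip to the user, and call `grade` on whatever the user types.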

Text Based CAPTCHAs

Character Recognition

• A number of research projects have attempted (often with success) to beat visual CAPTCHAs by creating programs that contain the following functionality:

1. Extraction of the image from the web page.

2. Removal of background clutter, for example with color filters and detection of thin lines.

3. Segmentation, i.e. splitting the image into segments containing a single letter.

4. Identifying the letter for each segment.

Contd…

• Steps 1, 2, and 4 are easy tasks for computers. The only part where humans still outperform computers is segmentation.

• If the background clutter consists of shapes similar to letter shapes, and the letters are connected by this clutter, the segmentation becomes nearly impossible with current software. Hence, an effective CAPTCHA should focus on the segmentation.
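To make the segmentation point concrete, here is a minimal Python sketch of one very simple segmentation technique, blank-column projection. It is not the method of any particular research project. The image format (a list of equal-length strings in which `#` is ink and `.` is background) and the names `segment_columns`, `CLEAN` and `CLUTTERED` are assumptions made for this illustration.

```python
def column_has_ink(image, x):
    """True if any row of the image has ink ('#') in column x."""
    return any(row[x] == "#" for row in image)


def segment_columns(image):
    """Split the image at fully blank columns.

    Returns a list of (start, end) column ranges, end exclusive,
    one per run of consecutive columns that contain ink.
    """
    width = len(image[0])
    segments = []
    start = None
    for x in range(width):
        if column_has_ink(image, x):
            if start is None:
                start = x  # a new segment begins here
        elif start is not None:
            segments.append((start, x))  # blank column closes the segment
            start = None
    if start is not None:
        segments.append((start, width))
    return segments


# Three letter-like shapes separated by blank columns.
CLEAN = [
    "###.#.#.###",
    "#...###..#.",
    "###.#.#..#.",
]

# The same shapes with a line of clutter drawn through the middle row,
# so that every column now contains ink.
CLUTTERED = [CLEAN[0], "#" * 11, CLEAN[2]]

print(segment_columns(CLEAN))      # three separate segments
print(segment_columns(CLUTTERED))  # one merged segment
```

On the clean image the shapes come apart at the blank columns. Once clutter connects them, this approach sees a single blob, which is why connecting the letters is a reasonable design goal for a text CAPTCHA.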

Graphic Based CAPTCHAs

Image-recognition CAPTCHAs

• Some researchers promote image recognition CAPTCHAs as a possible alternative for text based CAPTCHAs. To date, no major website has made use of an image based CAPTCHA. As such, the technology would be best described as in the stage of theoretical research. Image recognition CAPTCHAs face many potential problems which have not been fully studied:

• It is difficult for a small site to acquire a large dictionary of images which an attacker does not have access to. Without a means of automatically acquiring new labelled images, an image based challenge does not meet the definition of a CAPTCHA.

Principles

The principles behind CAPTCHA are as follows:

• The user is presented with a garbled image on which some text is displayed. This image is generated by the server using random text.

• The user must enter the letters shown in the image into a text field that is displayed on the protected form.

• When the form is submitted, the server checks if the text entered by the user matches the initial generated text. If it does, the transaction continues. Otherwise, an error message is displayed and the user has to enter a new code.
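The generate-and-check flow described above can be sketched in Python as follows. This is a minimal illustration, not a production design: `CaptchaStore` is a hypothetical in-memory stand-in for server-side session storage, rendering the text as a garbled image is left out, and the case-insensitive comparison is an assumption.

```python
import hmac
import secrets
import string

ALPHABET = string.ascii_uppercase + string.digits


class CaptchaStore:
    """Hypothetical in-memory store mapping challenge tokens to answers."""

    def __init__(self):
        self._pending = {}

    def issue(self, length=6):
        """Create a random challenge and return (token, text).

        The server would render `text` as a garbled image and send only
        the image and the token to the browser.
        """
        token = secrets.token_hex(8)
        text = "".join(secrets.choice(ALPHABET) for _ in range(length))
        self._pending[token] = text
        return token, text

    def check(self, token, response):
        """Return True if the response matches the generated text.

        Each challenge is single use: after any attempt the token is
        discarded, so a failed user has to be given a new code.
        """
        expected = self._pending.pop(token, None)
        if expected is None:
            return False
        given = response.strip().upper()
        return hmac.compare_digest(expected.encode(), given.encode())
```

A form handler would call `issue` when it renders the form and `check` when the form is submitted, continuing the transaction only when `check` returns `True`.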

CAPTCHA would look like…

• The captcha would look like this:

• On the main registration form a regular captcha is presented just like before. Users that can see the image may use this test. A link informs users that there is an alternative test.

• Clicking the link leads to the audio-based test form. This form provides access to an audio file and three input fields. The audio file contains three numbers that the user has to enter into the fields.

Applications

• Online polls

• Protecting Website Registration

• Preventing Comment Spam in Blogs.

• Search Engine Bots

• Worms and Spam

• Prevent Dictionary attacks

Applications

• Online polls

In November 1999, http://slashdot.com released an online poll asking which was the best graduate school in computer science. As is the case with most online polls, IP addresses of voters were recorded in order to prevent single users from voting more than once. However, students at Carnegie Mellon found a way to stuff the ballots by using programs that voted for CMU thousands of times.

Contd…

CMU's score started growing rapidly. The next day, students at MIT wrote their own voting program and the poll became a contest between voting "bots". MIT finished with 21,156 votes, Carnegie Mellon with 21,032 and every other school with fewer than 1,000.

Applications

• Protecting Website Registration

Several companies offer free email services. Until a few years ago, most of these services suffered from a specific type of attack: "bots" that would sign up for thousands of email accounts every minute. The solution to this problem was to use CAPTCHAs to ensure that only humans obtain free accounts.

Applications

• Preventing Comment spam in Blogs

Most bloggers are familiar with programs that submit bogus comments, usually for the purpose of raising the search engine ranks of some website. This is called comment spam. By using a CAPTCHA, only humans can enter comments on a blog. There is no need to make users sign up before they enter a comment, and no legitimate comments are ever lost!

Applications

• Search Engine Bots

It is sometimes desirable to keep webpages unindexed to prevent others from finding them easily. There is an HTML tag to prevent search engine bots from reading webpages.

Applications

• Worms and Spam

CAPTCHA tests also offer a plausible

solution against email worms and spam:

only accept an email message if you know

there is a human behind the other

computer.

Applications

• Preventing Dictionary attacks

CAPTCHAs can also be used to prevent dictionary attacks in password systems. The idea is simple: prevent a computer from being able to iterate through the entire space of passwords by requiring it to solve a CAPTCHA after a certain number of unsuccessful logins.
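A minimal Python sketch of this idea follows. The class name `LoginGuard`, the threshold of three failures and the string return values are all assumptions made for illustration. Password checking and CAPTCHA grading are assumed to happen elsewhere, and their results are passed in as booleans.

```python
class LoginGuard:
    """Require a solved CAPTCHA after too many failed logins per account."""

    def __init__(self, max_failures=3):
        self.max_failures = max_failures
        self._failures = {}  # username -> consecutive failed attempts

    def captcha_required(self, username):
        """True once the account has reached the failure threshold."""
        return self._failures.get(username, 0) >= self.max_failures

    def attempt(self, username, password_ok, captcha_ok=False):
        """Process one login attempt and return a status string."""
        if self.captcha_required(username) and not captcha_ok:
            # The password is not even considered until a CAPTCHA is solved.
            return "captcha_required"
        if password_ok:
            self._failures.pop(username, None)  # success resets the counter
            return "ok"
        self._failures[username] = self._failures.get(username, 0) + 1
        return "denied"
```

With this guard in place, a script guessing passwords gets only `max_failures` free attempts per account. Every further guess costs a solved CAPTCHA, which is what makes iterating over the password space impractical for a bot.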

Conclusion

• Interested in breaking a CAPTCHA?

• People have tried already!

THANK YOU

Shashwat Shriparv

[email protected]

InfinitySoft