Captcha Upload

  • 8/2/2019 Captcha Upload

    Are you Human? (Sorry, I had to ask)

    Preetam Joga

    08311A1228

    Agenda

    What is CAPTCHA?

    Types of CAPTCHA

    Where to use CAPTCHAs?

    Guidelines when making a CAPTCHA

    Ways to break CAPTCHAs

    reCAPTCHA

    Vulnerabilities

    HTTP does not distinguish between human and machine users.

    HTTP and SSL do not guarantee that the client software or user is benign.

    Malicious bots can be anonymous and distributed.

    Threats to the Web

    Content Theft -- stealing paid data

    Copyright Infringement -- scraping content from one site to display on another, out of context

    Poll Stuffing -- MIT vs. CMU

    Web Spam -- unsolicited commenting, abusing free email, scraping addresses

    CAPTCHA, the Acronym

    Completely Automated Public Turing Test to Tell Computers and Humans Apart

    CAPTCHA Origins

    1997: Andrei Broder at AltaVista wanted to prevent bots from automatically submitting sites for indexing. He decided to add a test to the submission page.

    2000: Luis von Ahn, Manuel Blum & John Langford at Carnegie Mellon University created CAPTCHA for Yahoo to prevent automated e-mail account registration.

    What is CAPTCHA?

    A program that can tell whether its user is a human or a computer.

    It uses a type of challenge-response test to determine that the response is not generated by a computer.

    A puzzle or problem that is easy for humans to solve and very difficult for computers.

    If the puzzle is solved correctly, you are considered human and can continue.

    Reverse Turing test -- computers finding humans, not humans finding computers

    Turing Test

    Standard interpretation:

    "Player C, the interrogator, is tasked with trying to determine which player, A or B, is a computer and which is a human."

    Reverse Turing Test

    A CAPTCHA is sometimes described as a reverse Turing test, because it is administered by a machine and targeted to a human.

    Making a CAPTCHA

    Pick a random string of characters (or words), e.g. ifhkfp

    Render it into a distorted image
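
    The two steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names `make_challenge` and `plan_perturbation`, the six-letter lowercase alphabet, and the angle and shift limits are all hypothetical and not taken from the slides. The sketch stops at computing a per-character distortion plan; actually drawing the image would need an imaging library, which is left out here.

```python
import random
import secrets
import string


def make_challenge(length=6, alphabet=string.ascii_lowercase):
    """Pick a random string of characters (step 1 on the slide)."""
    # secrets.choice draws from the OS random source, so the string is hard to predict.
    return "".join(secrets.choice(alphabet) for _ in range(length))


def plan_perturbation(text, max_angle=25.0, max_shift=4, rng=None):
    """Assign each character a random rotation and vertical shift (hypothetical limits).

    A renderer would draw every glyph with these parameters to distort the image.
    """
    rng = rng or random.Random()
    return [
        {
            "char": ch,
            "angle": rng.uniform(-max_angle, max_angle),  # degrees
            "shift": rng.randint(-max_shift, max_shift),  # pixels
        }
        for ch in text
    ]


challenge = make_challenge()
for glyph in plan_perturbation(challenge):
    print(glyph)
```

    The server would keep `challenge` on its side and compare it with what the user types; only the distorted image is sent to the browser.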

    Outperform the computers

    In many simple tasks, a typical 5-year-old can outperform the most powerful computers.

    Easier for computers: tasks like medical diagnosis and playing chess.

    Hard for computers: operations requiring vision, hearing, language or motor control.

    Type: Early CAPTCHAs

    Generated by the EZ-Gimpy program;

    Used previously on Yahoo!

    Type: Improved CAPTCHA

    high contrast for human readability;

    medium, per-character perturbation; random fonts per character;

    low background noise;

    Type: A modern CAPTCHA

    Rather than attempting to create a distorted background and high levels of warping on the text, focus on making segmentation difficult by adding an angled line.

    Type: A modern CAPTCHA

    Another way to make segmentation difficult is to crowd symbols together; this can be read by humans but cannot be segmented by bots.

    Printed CAPTCHA Types

    Gimpy-- text distortion used by Yahoo!

    Bongo-- visual puzzle

    http://www.captcha.net/captchas/gimpy/
    http://www.captcha.net/captchas/bongo/

    Other Types of CAPTCHA

    Animated CAPTCHAs

    3D CAPTCHA

    ASCII art

    Other: Cognitive Puzzles

    Distinguish pictures of dogs from cats

    Choose a word that relates to all the images

    Trivia questions

    Math and word problems

    3D Object CAPTCHA

    Solve failed OCR inputs

    Other: Distinguish pictures

    Microsoft Asirra (Animal Species Image Recognition for Restricting Access): http://research.microsoft.com/asirra/

    KittenAuth Project: http://www.thepcspy.com/kittenauth

    Other: Mathematical CAPTCHA

    Other: 3D Object CAPTCHA

    You must enter them in the exact sequence listed:

    The Head of the Walking Man,

    The Vase,

    The Back of the Chair.

    Other: 3D CAPTCHA

    Renders OTP in 3D space to image

    Reputedly the most difficult to crack

    Server needs good graphics card to be practical (rare)

    Can be combined with other methods

    Might see more in future

    Other: Jumble Game

    Other: Drupal Examples

    Other: Tests

    "Common sense" questions:

    "What is 3 + 5?"

    "What color is the sky?"

    Type the word 'orange';

    Require a valid email to approve;

    These attempts violate principles:

    they cannot be automatically generated;

    they can be easily cracked given the state of AI.
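
    To illustrate how easily such a test can be cracked, here is a minimal sketch of a bot that answers arithmetic questions of the "What is 3 + 5?" kind. The function name `solve_arithmetic_question` and the set of supported operators are assumptions made for this example; a real attacker would cover more phrasings.

```python
import re


def solve_arithmetic_question(question):
    """Return the answer to a simple 'a op b' question, or None if none is found."""
    # Look for two non-negative integers joined by +, - or * anywhere in the text.
    match = re.search(r"(\d+)\s*([+\-*])\s*(\d+)", question)
    if match is None:
        return None
    left, op, right = int(match.group(1)), match.group(2), int(match.group(3))
    if op == "+":
        return left + right
    if op == "-":
        return left - right
    return left * right


print(solve_arithmetic_question("What is 3 + 5?"))  # prints 8
```

    A few lines of pattern matching are enough here, which is why the slide counts these questions as weak tests.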

    Audio CAPTCHA

    Pick a word or a sequence of numbers at random

    Render them into an audio clip using TTS software

    Distort the audio clip

    Ask the user to identify and type the word or numbers

    Where to use CAPTCHAs?

    Data Collection.

    Worms and Spam.

    Preventing Comment Spam in Blogs.

    Protecting Email Addresses From Scrapers. A mechanism to hide your email address: require users to solve a CAPTCHA before showing your email address.

    Online Polls. You cannot trust the results of an online poll because anybody could just write a program to vote for their favorite option thousands of times.

    Where to use CAPTCHAs?

    Protecting Website Registration. (E-mail services: Yahoo, Microsoft, Google)

    Preventing Dictionary Attacks (in password systems). Prevent a computer from iterating through the entire space of passwords by requiring it to solve a CAPTCHA after a certain number of unsuccessful logins.

    Search Engine Bots. It is sometimes desirable to keep webpages unindexed to prevent others from finding them easily.
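
    The dictionary-attack defence listed above can be sketched as a per-account failure counter. This is a minimal illustration, not a production design: the class name `LoginGuard`, its methods, and the threshold of three free attempts are all hypothetical.

```python
MAX_FREE_ATTEMPTS = 3  # hypothetical threshold; the slides only say "a certain number"


class LoginGuard:
    """Track failed logins per account and decide when a CAPTCHA is required."""

    def __init__(self, max_free_attempts=MAX_FREE_ATTEMPTS):
        self.max_free_attempts = max_free_attempts
        self.failures = {}  # username -> consecutive failed logins

    def captcha_required(self, username):
        # Demand a CAPTCHA once the account has used up its free attempts.
        return self.failures.get(username, 0) >= self.max_free_attempts

    def record_attempt(self, username, success):
        if success:
            self.failures.pop(username, None)  # a good login resets the counter
        else:
            self.failures[username] = self.failures.get(username, 0) + 1


guard = LoginGuard()
for _ in range(3):
    guard.record_attempt("alice", success=False)
print(guard.captcha_required("alice"))  # prints True
```

    A human who mistypes a password a few times is barely slowed down, while a script iterating over a dictionary has to solve a CAPTCHA for every further guess.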

    Ways to break CAPTCHAs

    Exploiting bugs in the implementation that allow the attacker to completely bypass the CAPTCHA;

    Improving character recognition software (OCR, Optical Character Recognition);

    Using cheap human labor to process the tests (sweatshops).

    Break: Character Recognition

    Programs that have the following functions:

    Extraction of the image from the web page

    Removal of background clutter, for example with color filters and detection of thin lines;

    Segmentation, i.e. splitting the image into regions each containing a single letter;

    Identifying the letter for each region.
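
    The segmentation step can be illustrated with a simple column-projection sketch. It assumes the clutter has already been removed and the image is a binary bitmap (1 = ink, 0 = background); the function name `segment_columns` and the toy bitmap are made up for this example, and real attacks use more robust methods.

```python
def segment_columns(bitmap):
    """Split a binary bitmap into (start, end) column ranges that contain ink.

    `bitmap` is a list of equal-length rows of 0/1 values. `end` is exclusive.
    """
    width = len(bitmap[0]) if bitmap else 0
    # A column "has ink" if any row has a 1 in that column.
    ink = [any(row[x] for row in bitmap) for x in range(width)]
    regions = []
    start = None
    for x, has_ink in enumerate(ink):
        if has_ink and start is None:
            start = x  # a new character region begins
        elif not has_ink and start is not None:
            regions.append((start, x))  # an empty column closes the region
            start = None
    if start is not None:
        regions.append((start, width))
    return regions


image = [
    "0110001110",
    "0110001010",
    "0110001110",
]
bitmap = [[int(c) for c in row] for row in image]
print(segment_columns(bitmap))  # prints [(1, 3), (6, 9)]
```

    This approach depends on empty columns between letters. When symbols are crowded together or crossed by a line, the gaps disappear and the whole word comes back as a single region.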

    Break: Human solvers

    Attacks that use humans to solve the puzzles;

    Approaches:

    relaying the puzzles to a group of human operators who can solve CAPTCHAs;

    copying the CAPTCHA images and using them as CAPTCHAs for a high-traffic site owned by the attacker.

    How do CAPTCHA sweatshops work?

    reCAPTCHA (2007)

    New form of CAPTCHA that also helps digitize books;

    The words displayed to the user come directly from old books that are being digitized;

    Words that OCR could not identify;

    reCAPTCHA

    Pairs an unknown word with a known one;

    Distorts them both, puts a line through them, and then sends them to be proofread;

    Respondent answers both elements:

    half of effort validates the challenge;

    the other half is captured as work.
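
    The two-word scheme described above can be sketched as follows. This is an illustration only: the class name `TwoWordChallenge`, the case-insensitive comparison, and the rule that a transcription is accepted after three matching votes are assumptions, not details given in the slides.

```python
from collections import Counter


class TwoWordChallenge:
    """One known (control) word paired with one word that OCR could not read."""

    def __init__(self, control_answer):
        self.control_answer = control_answer.lower()
        self.unknown_votes = Counter()  # guesses collected for the unknown word

    def submit(self, control_guess, unknown_guess):
        """Validate the user on the control word; keep their other answer as work."""
        passed = control_guess.strip().lower() == self.control_answer
        if passed:
            # Only answers from users who passed the control word are trusted.
            self.unknown_votes[unknown_guess.strip().lower()] += 1
        return passed

    def consensus(self, min_votes=3):
        """Return the most common transcription once enough users agree (assumed rule)."""
        if not self.unknown_votes:
            return None
        word, votes = self.unknown_votes.most_common(1)[0]
        return word if votes >= min_votes else None


challenge = TwoWordChallenge(control_answer="morning")
for _ in range(3):
    challenge.submit("morning", "overlooks")
print(challenge.consensus())  # prints overlooks
```

    The user cannot tell which of the two words is the control, so the only way to pass reliably is to type both carefully.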

    reCAPTCHA

    Time spent

    Roughly 60 million CAPTCHAs are solved each day;

    On average, it takes about 10 seconds to solve a CAPTCHA;

    People around the world waste more than 150,000 hours each day on solving CAPTCHAs;
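
    As a quick check of the figures above, assuming the 60 million solves and 10 seconds per solve both refer to a single day:

```latex
\[
6 \times 10^{7}\ \text{CAPTCHAs/day} \times 10\ \text{s}
  = 6 \times 10^{8}\ \text{s/day},
\qquad
\frac{6 \times 10^{8}\ \text{s/day}}{3600\ \text{s/hour}}
  \approx 166{,}667\ \text{hours/day}.
\]
```

    This agrees with the slide's figure of more than 150,000 hours.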

    Time spent

    A fifth of those users would be giving 30,000 daily man-hours of work;

    It would constitute the world's fastest and most accurate character-recognition computer, processing 10 million words a day.

    Recreating the books word by word

    Still not thinking big enough

    "If we have that many people all doing some little part, we could do something insanely huge for humanity."

    "We'll never run out of things to digitize"

    Thank you!
