Web Captcha

  • 8/14/2019 Web Captcha

    1/31

    WEB CAPTCHA

    HUMAN OR SCRIPT?

    An AI approach to cryptography


    2/31

    Overview

    Vulnerabilities, Threats, Controls

    2 Precursors

    4 Proposals

    6 General Approaches

    3 Deployment Options

    If time: issues and links


    3/31

    Vulnerabilities

    HTTP does not distinguish between human & machine users.

    HTTP & SSL do not guarantee client software or user is benign.

    Malicious bots can be anonymous and distributed.

    Benign bots spider for searches, etc.


    4/31

    Threats to Web

    Content Theft-- stealing paid data

    Copyright Infringement-- scraping content from one site to display on another, out of context

    Unwanted spidering-- search engines may ignore robots.txt or nofollow tags

    Poll Stuffing-- MIT vs. CMU on /. [1]

    Web Spam-- unsolicited commenting, abusing free email, scraping addresses


    5/31

    Web Spam

    Web comments, discussions, guestbooks, Wikis, many public forms are open to spam messages.

    More eyeballs per message than e-mail

    E-mail spam is illegal, but most Web spam is legal.

    Bots collect email addresses on Web.


    6/31

    Motives

    Google-- more links, higher ranking

    Profit-- ads for real product/service

    Phishing-- bait and switch for identity theft, financial theft

    Astroturfing-- promote agenda by simulating grassroots word-of-mouth

    Vandalism-- competition, damage, thrill, revenge, activism, etc.


    7/31

    Cracked Controls

    IP tracking/banning-- repurposed DDoS scripts; IP masking, hijacking

    User Authentication-- if not easily cracked, use service like bugmenot.com

    Moderation (human review)-- script makes own moderator account in DB

    Good start, but may need more.

    http://www.bugmenot.com/

    8/31

    CAPTCHA

    Acronym for Completely Automated Public Turing test to tell Computers & Humans Apart-- Dr. Manuel Blum

    Reverse Turing test-- computers finding humans, not humans finding computers

    A category, not a specific solution


    9/31

    Precursors

    Unpublished manuscript by Moni Naor first mentions automated Turing test in 1997, but not proposed or formalized.

    AltaVista patent in 1998 first practical example of using slightly distorted images of text to deter bots, but only defeats stock OCR, not custom OCR


    10/31

    Definition

    In 2000, formalized by Luis von Ahn, Manuel Blum & Nicholas J. Hopper of Carn

    A CAPTCHA is a cryptographic protocol whose underlying hardness assumption is based on an AI problem. [1]

    www.captcha.net


    11/31

    Win-Win

    If cracked, AI is advanced because a very difficult (unsolved) AI problem has been solved;

    If not cracked, steganographic cryptography is advanced [1]


    12/31

    CAPTCHA.net Proposals

    Gimpy-- text distortion used by Yahoo! (routinely cracked & improved)

    Bongo-- visual puzzle, like Mensa tests (if 4 options, guess works 25%)

    Pix-- photographic recognition (need large image DB, or Google API)

    Sounds-- voice synthesis, distortion

    http://www.captcha.net/captchas/gimpy/

    http://www.captcha.net/captchas/bongo/

    http://www.captcha.net/captchas/pix/

    http://www.captcha.net/captchas/sounds/

    13/31

    Gimpy

    Images of distorted text.

    Frequently cracked and improved.

    In current version, 5 pairs of overlapped words. User identifies 3 words.

    Random placement, font, distortion, background pattern

    Overlapping words need no noise.
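
    The pass rule on this slide (name 3 of the displayed words) can be sketched as a server-side check. This is only an illustration, assuming the server remembers which words it drew into the image; `check_gimpy_answer`, `required` and the word list are hypothetical names, not part of the Gimpy code.

```python
def check_gimpy_answer(displayed_words, typed_words, required=3):
    """Accept if the user named at least `required` distinct displayed words."""
    shown = {w.lower() for w in displayed_words}
    # Ignore empty fields, case, and repeated entries of the same word.
    typed = {w.strip().lower() for w in typed_words if w.strip()}
    return len(shown & typed) >= required


# Ten words drawn as five overlapping pairs (illustrative word list).
shown_words = ["apple", "river", "stone", "cloud", "tiger",
               "lemon", "chair", "piano", "grass", "north"]

print(check_gimpy_answer(shown_words, ["River", "piano", "stone"]))  # True
print(check_gimpy_answer(shown_words, ["river", "river", "zebra"]))  # False
```

    Counting distinct words matters: without the set, typing one readable word three times would pass.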


    14/31

    Bongo

    Visual puzzle

    Computer can generate & display, but not solve.

    If too many choices, humans get it wrong.

    If not enough choices, computers can be effective with random guess.
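
    The trade-off between number of choices and blind guessing is simple arithmetic. A small sketch, assuming the bot picks uniformly at random; asking for several puzzles in a row is an assumption added here for comparison, not something the slides propose, and `blind_guess_success` is a hypothetical name.

```python
def blind_guess_success(choices, rounds=1):
    """Chance that uniform random guessing passes `rounds` independent puzzles."""
    if choices < 1 or rounds < 1:
        raise ValueError("choices and rounds must be positive")
    return (1.0 / choices) ** rounds


for choices in (2, 4, 10):
    for rounds in (1, 3):
        p = blind_guess_success(choices, rounds)
        print(f"{choices} choices, {rounds} round(s): {p:.4%} pass rate for a guessing bot")
```

    With 4 choices and one round the bot passes one time in four, which matches the 25% figure quoted for Bongo.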


    15/31

    Pix

    Photo Recognition

    Need large image DB

    Images need keywords

    Four images with same keyword shown

    Random subset of keywords as choices

    Poor implementations easy to crack (color of top left pixel unique, etc.)
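
    The challenge construction described on this slide can be sketched as follows. This is a minimal illustration, assuming the image DB is a plain mapping from keyword to image ids; `make_pix_challenge`, `image_db` and the ids are hypothetical, and a real deployment would also have to serve the images so they cannot be fingerprinted.

```python
import random


def make_pix_challenge(image_db, images_shown=4, choices_shown=6, rng=None):
    """Return (images, choices, answer) for one Pix-style challenge.

    image_db maps keyword -> list of image ids tagged with that keyword.
    """
    rng = rng or random.SystemRandom()
    # Only keywords with enough images can be the answer.
    eligible = [k for k, imgs in image_db.items() if len(imgs) >= images_shown]
    if not eligible:
        raise ValueError("no keyword has enough images")
    answer = rng.choice(eligible)
    images = rng.sample(image_db[answer], images_shown)
    # Decoy keywords: a random subset of the other keywords.
    others = [k for k in image_db if k != answer]
    decoys = rng.sample(others, min(choices_shown - 1, len(others)))
    choices = decoys + [answer]
    rng.shuffle(choices)
    return images, choices, answer


image_db = {
    "dog":  ["dog_01", "dog_02", "dog_03", "dog_04", "dog_05"],
    "tree": ["tree_01", "tree_02", "tree_03", "tree_04"],
    "car":  ["car_01", "car_02", "car_03", "car_04"],
    "boat": ["boat_01", "boat_02", "boat_03", "boat_04"],
    "bird": ["bird_01", "bird_02", "bird_03"],  # too few images to be an answer
}

images, choices, answer = make_pix_challenge(image_db, choices_shown=4)
print("show images:", images)
print("offer keywords:", choices)
```

    The answer stays on the server; only the image ids and the keyword choices go to the client.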


    16/31

    General Approaches

    Text (ASCII/Unicode)

    Image

    Speech

    Animation

    3-D

    Combinations of all above


    17/31

    ASCII/Unicode 4Pth4

    Change text to look-alike: SPAM is $P4M. Fools simplest text matching.

    Accented or non-English chars: Spm

    Chars to words: uce@ftc.gov --> uce at ftc dot gov

    URL/HTML entities: COPY becomes 0 or %430P%59

    Better than nothing, but easy to crack

    It is not technically CAPTCHA
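
    A short sketch of the text tricks listed above, and of why they are easy to crack: the encodings are reversed by one standard-library call each. The helper names, the look-alike table and the address `user@example.com` are illustrative assumptions.

```python
import html
from urllib.parse import unquote

# Hypothetical look-alike table; real ones are larger.
LOOKALIKES = {"S": "$", "A": "4", "O": "0"}


def lookalike(text):
    """Swap characters for visually similar ones."""
    return "".join(LOOKALIKES.get(ch, ch) for ch in text)


def spell_out_address(addr):
    """Replace the punctuation of an e-mail address with words."""
    return addr.replace("@", " at ").replace(".", " dot ")


def html_entities(text):
    """Encode every character as a numeric HTML entity."""
    return "".join(f"&#{ord(ch)};" for ch in text)


def percent_encode(text):
    """Percent-encode every byte, even unreserved ones."""
    return "".join(f"%{b:02X}" for b in text.encode("utf-8"))


print(lookalike("SPAM"))                      # $P4M
print(spell_out_address("user@example.com"))  # user at example dot com
print(html_entities("COPY"))                  # &#67;&#79;&#80;&#89;
print(percent_encode("COPY"))                 # %43%4F%50%59

# A scraper undoes the encodings with the standard library alone.
print(html.unescape(html_entities("COPY")))   # COPY
print(unquote(percent_encode("COPY")))        # COPY
```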


    18/31

    Image CAPTCHA

    Presents one-time-password as an image humans can read, but not scripts

    If image is too simple, OCR can crack; too complex, human cannot read.

    To beat OCR, vary position, warp, noise, background, colors, overlap, randomness, font, angles, language, methods used

    Show filtered photos as well as words

    Can deny accessibility to vision-impaired
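
    The one-time-password behind an image CAPTCHA can be sketched without any image code. This is a minimal illustration, assuming an in-memory store and leaving the rendering (warp, noise, fonts) to a separate component; `ChallengeStore`, the alphabet and the 120-second lifetime are assumptions, not taken from any package named in these slides.

```python
import secrets
import time

# Skip characters that humans confuse in distorted images (0/O, 1/I).
ALPHABET = "ABCDEFGHJKLMNPQRSTUVWXYZ23456789"


class ChallengeStore:
    """Issue one-time passwords and accept each of them at most once."""

    def __init__(self, ttl_seconds=120):
        self.ttl = ttl_seconds
        self._pending = {}  # challenge id -> (otp, expiry time)

    def issue(self, length=6):
        challenge_id = secrets.token_urlsafe(16)
        otp = "".join(secrets.choice(ALPHABET) for _ in range(length))
        self._pending[challenge_id] = (otp, time.monotonic() + self.ttl)
        # The id goes into the form; the otp goes only to the image renderer.
        return challenge_id, otp

    def verify(self, challenge_id, answer):
        # pop() makes the challenge single-use, right or wrong.
        entry = self._pending.pop(challenge_id, None)
        if entry is None:
            return False
        otp, expiry = entry
        if time.monotonic() > expiry:
            return False
        typed = answer.strip().upper()
        return secrets.compare_digest(otp.encode(), typed.encode())


store = ChallengeStore()
challenge_id, otp = store.issue()
print(store.verify(challenge_id, otp))  # True
print(store.verify(challenge_id, otp))  # False, already used
```

    Discarding the challenge on the first attempt keeps a bot from guessing repeatedly against the same image.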

    http://sam.zoy.org/pwntcha/

    19/31

    Considering Accessibility

    Government and everyone who does business with government must meet federal accessibility standards for disabilities. Serious legal penalties.

    Professional ethics requires everyone else to do the same, with lesser consequences.

    Often ignored by amateurs, but at risk of being considered rude.

    Very few CAPTCHAs are accessible.

    Solution (W3C): use both image & speech, manual approval; but chain is only as strong as weakest link.

    http://www.w3.org/TR/turingtest/

    20/31

    Speech CAPTCHA

    Usually spells out one-time-password in synthesized or recorded voices

    Voice recognition cracks simple case.

    Applied audio filters risk human misunderstanding.

    Used with image CAPTCHA for increased accessibility.

    If both use same OTP, easier to crack.

    http://captchas.net/

    21/31

    Animated CAPTCHA

    Can use Flash, MPEG, animated GIF

    Often combined with speech

    Weaknesses of Image CAPTCHA apply

    Usually easier to crack due to extra data for pattern matching to analyze

    Much higher processor and traffic load

    Not practical in most cases

    http://wanngehtdieweltunter.de/WOS/RalfMuellerJobLogDrei

    22/31

    3D

    Renders OTP in 3D space to image

    Reputedly the most difficult to crack

    Server needs good graphics card to be practical (rare)

    Can be combined with other methods

    Not yet common (tEABAG_3D)

    Might see more in future

    http://www.ocr-research.org.ua/index.php?action=teabag

    23/31

    Circumventing CAPTCHA

    Social engineering can foil most CAPTCHAs. How?

    Scrape captcha from origin, pose to human for free access to other content (adult, news, search, blogs)

    User unaware of helping spammers

    http://www.boingboing.net/2004/01/27/solving_and_creating.html

    24/31

    Which CAPTCHA?

    Even simplest CAPTCHA can beat vast majority of scripts

    Even best CAPTCHA can be cracked by dedicated, sophisticated coders

    Weigh strength vs. cost (compute cycles, bandwidth, dollars)

    Be careful not to violate accessibility laws or open new holes.

    http://www.mperfect.net/aiCaptcha/

    http://www.puremango.co.uk/cm_breaking_captcha_115.php

    25/31

    Deploying CAPTCHA

    Install existing software (pro or free)

    Use remote CAPTCHA service

    Develop own CAPTCHA or customize open source scripts.


    26/31

    Existing Software

    Hundreds or thousands of options

    Narrow choices by price, server requirements, standards compliance, third-party testing results

    Big targets: cracking a popular control opens hundreds of sites to spammers

    Like antivirus, ineffective unless frequently updated.

    http://en.wikipedia.org/wiki/Captcha

    27/31

    CAPTCHA Svc Providers

    Work even with servers not configured to generate images or sound.

    Server sends encrypted OTP to service, which sends image to client.

    Code is easy to embed (botblock)

    Service updates itself automatically.

    Saves bandwidth and processor time.

    captchaS.net (experimental, but free)

    Trust issues when outsourcing security.
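
    One way a site and a remote service could agree on the password without the site rendering anything is sketched below. Note the swap: instead of sending an encrypted OTP as the slide describes, both sides derive it from a shared secret and a random token with HMAC. This is an assumed design for illustration only, not the documented protocol of captchas.net or botblock, and every name in it is hypothetical.

```python
import hashlib
import hmac
import secrets

LETTERS = "abcdefghijklmnopqrstuvwxyz"


def new_challenge_token():
    """Random public token; the site puts it in the image URL and the form."""
    return secrets.token_hex(16)


def derive_otp(shared_secret, token, length=6):
    """Both the site and the service can compute this; a client cannot."""
    digest = hmac.new(shared_secret, token.encode("utf-8"), hashlib.sha256).digest()
    # Map bytes to letters; the modulo has a slight bias, acceptable in a sketch.
    return "".join(LETTERS[b % len(LETTERS)] for b in digest[:length])


def check_answer(shared_secret, token, answer):
    expected = derive_otp(shared_secret, token)
    typed = answer.strip().lower()
    return hmac.compare_digest(expected.encode(), typed.encode())


secret = b"demo-secret-known-to-site-and-service"  # placeholder value
token = new_challenge_token()
print("token in image URL:", token)
print("text the service would draw:", derive_otp(secret, token))
```

    The sketch does not track used tokens; the site would still need to reject a token after its first submission, and it has to trust the service with the shared secret, which is the trust issue noted on this slide.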

    http://captchas.net/

    http://www.chimetv.com/tv/products/botblock.shtml

    28/31

    Custom CAPTCHA

    Starting from Open Source or public domain code, not too difficult to customize.

    Customizing can make your implementation resistant to all but direct assaults.

    CAPTCHA volunteers may help you test and improve your algorithm.

    Can be stronger than using a service or preconfigured software.

    http://www.ocr-research.org.ua/index.php?action=list

    29/31

    CAPTCHA Beyond the Web

    Prevent dictionary attacks in any password system (Pinkas & Sander)

    Protect e-mail systems from worms, spam, other malware-- if sender not in address book or message is suspect, challenge sender with CAPTCHA.

    Deter unwanted macro-scripting of a standalone application.
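
    For the first item, a much simplified sketch of putting a CAPTCHA in front of password guessing: after a few failures, further attempts are refused unless a CAPTCHA was solved. This is an illustration only and not the Pinkas & Sander protocol itself; `LoginGate` and `free_attempts` are hypothetical names.

```python
class LoginGate:
    """Require a solved CAPTCHA once an account has too many failed logins."""

    def __init__(self, free_attempts=3):
        self.free_attempts = free_attempts
        self.failures = {}  # username -> consecutive failed attempts

    def captcha_required(self, username):
        return self.failures.get(username, 0) >= self.free_attempts

    def attempt(self, username, password_ok, captcha_ok=False):
        # Check the CAPTCHA first so a bot learns nothing about the password.
        if self.captcha_required(username) and not captcha_ok:
            return "captcha_required"
        if password_ok:
            self.failures.pop(username, None)
            return "ok"
        self.failures[username] = self.failures.get(username, 0) + 1
        return "denied"


gate = LoginGate(free_attempts=2)
print(gate.attempt("alice", password_ok=False))                  # denied
print(gate.attempt("alice", password_ok=False))                  # denied
print(gate.attempt("alice", password_ok=True))                   # captcha_required
print(gate.attempt("alice", password_ok=True, captcha_ok=True))  # ok
```

    A script can still try `free_attempts` passwords per account for free, so the threshold is a trade-off between user annoyance and guessing rate.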


    30/31

    My Project

    Survey CAPTCHA alternatives.

    Select and install one.

    Test on MAMP (Mac / PHP)

    Deploy on LAMP (Linux)

    Evaluate and submit to my company for use with Wiki-based CMS


    31/31

    Project Status

    Several false starts

    First few selections either did not install, did not meet requirements or failed accessibility tests

    Best bet now is on the service at http://www.captchas.net

    Asked for two-week extension to finish installation and paper.