Docs Slides Lecture18

  • 7/31/2019 Docs Slides Lecture18

    1/29

    Application example: Photo OCR

    Problem description and pipeline

    Machine Learning

  • 2/29

    The Photo OCR problem

  • 3/29

    Photo OCR pipeline

    1. Text detection

    2. Character segmentation

    3. Character classification

  • 4/29

    Photo OCR pipeline

    Image → Text detection → Character segmentation → Character recognition
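
The pipeline can be read as a chain of functions, each stage feeding the next. A minimal sketch, assuming hypothetical stage functions `detect_text`, `segment_characters`, and `classify_character` supplied by the caller (none of these names come from the slides):

```python
from typing import Callable, Iterable, List, TypeVar

Image = TypeVar("Image")
Region = TypeVar("Region")
Glyph = TypeVar("Glyph")


def photo_ocr(
    image: Image,
    detect_text: Callable[[Image], Iterable[Region]],
    segment_characters: Callable[[Region], Iterable[Glyph]],
    classify_character: Callable[[Glyph], str],
) -> List[str]:
    """Run the three stages in order and return one string per text region."""
    words = []
    for region in detect_text(image):            # 1. text detection
        glyphs = segment_characters(region)      # 2. character segmentation
        # 3. character classification, one label per glyph
        words.append("".join(classify_character(g) for g in glyphs))
    return words


# Toy stand-ins so the sketch runs: the "image" is a string, regions are
# whitespace-separated chunks, and each glyph is a single character.
print(photo_ocr("open 24h", str.split, list, str.upper))
```

Passing the stages in as arguments keeps each one replaceable, which is also what makes it possible to swap a stage for ground-truth output when analysing the pipeline later.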

  • 5/29

    Application example: Photo OCR

    Sliding windows

    Machine Learning

  • 6/29

    Text detection    Pedestrian detection

  • 7/29

    Supervised learning for pedestrian detection

    Pixels in 82x36 image patches

    Positive examples

    Negative examples

  • 8/29

    Sliding window detection

  • 9/29

    Sliding window detection

  • 10/29

    Sliding window detection

  • 11/29

    Sliding window detection
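
Sliding window detection can be sketched as two nested loops over window positions. The example below is a minimal illustration, not the lecture's implementation: the image is a plain list of pixel rows, the stride `step` and the `classifier` callback are assumptions, and the 82x36 default (read here as 82 rows by 36 columns) is taken from the pedestrian patches above.

```python
from typing import Callable, Iterator, List, Sequence, Tuple

Image = Sequence[Sequence[float]]  # rows of grayscale pixel values


def sliding_windows(img_h: int, img_w: int, win_h: int, win_w: int,
                    step: int) -> Iterator[Tuple[int, int]]:
    """Yield the (top, left) corner of every window that fits inside the image."""
    for top in range(0, img_h - win_h + 1, step):
        for left in range(0, img_w - win_w + 1, step):
            yield top, left


def crop(image: Image, top: int, left: int,
         win_h: int, win_w: int) -> List[List[float]]:
    """Cut a win_h x win_w patch out of the image."""
    return [list(row[left:left + win_w]) for row in image[top:top + win_h]]


def detect(image: Image, classifier: Callable[[List[List[float]]], bool],
           win_h: int = 82, win_w: int = 36,
           step: int = 4) -> List[Tuple[int, int]]:
    """Return the corners of all windows the classifier labels as positive."""
    img_h, img_w = len(image), len(image[0])
    hits = []
    for top, left in sliding_windows(img_h, img_w, win_h, win_w, step):
        if classifier(crop(image, top, left, win_h, win_w)):
            hits.append((top, left))
    return hits
```

A smaller `step` examines more windows and localizes objects more finely, at a proportionally higher cost in classifier calls.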

  • 12/29

    Text detection

  • 13/29

    Text detection

    Positive examples    Negative examples

  • 14/29

    Text detection

    [David Wu]

  • 15/29

    1D sliding window for character segmentation

    Positive examples    Negative examples
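
For character segmentation the window moves in one dimension only, left to right along a strip of text, and a classifier decides whether the window is centred on a gap between two characters. A rough sketch under stated assumptions: the strip is a list of pixel rows, and `blank_centre` is a hypothetical stand-in for the trained classifier that simply checks for an empty middle column.

```python
from typing import Callable, List, Sequence

Strip = Sequence[Sequence[int]]  # rows of pixels; 0 = background, 1 = ink


def find_splits(strip: Strip, is_split: Callable[[List[Sequence[int]]], bool],
                win_w: int, step: int = 1) -> List[int]:
    """Slide a window along the strip and return the centre columns flagged as splits."""
    width = len(strip[0])
    splits = []
    for left in range(0, width - win_w + 1, step):
        patch = [row[left:left + win_w] for row in strip]
        if is_split(patch):
            splits.append(left + win_w // 2)
    return splits


def blank_centre(patch: List[Sequence[int]]) -> bool:
    """Stand-in classifier: positive when the middle column holds no ink."""
    mid = len(patch[0]) // 2
    return all(row[mid] == 0 for row in patch)


# Two blobs of ink separated by one empty column (column 2).
strip = [[1, 1, 0, 1, 1],
         [1, 1, 0, 1, 1]]
print(find_splits(strip, blank_centre, win_w=3))
```

In practice the stand-in would be replaced by a classifier trained on the positive and negative examples above.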

  • 16/29

    Photo OCR pipeline

    1. Text detection

    2. Character segmentation

    3. Character classification

  • 17/29

    Application example: Photo OCR

    Getting lots of data: Artificial data synthesis

    Machine Learning

  • 18/29

    Character recognition

    [Figure: example character images (N, I, A, Q)]

  • 19/29

    Artificial data synthesis for photo OCR

    Real data

    [Figure: the text "Abcdefg" rendered in several fonts]

    [Adam Coates and Tao Wang]

  • 21/29

    Synthesizing data by introducing distortions

    [Adam Coates and Tao Wang]
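
The idea of multiplying one labelled example into several can be sketched in a few lines. This example uses a one-pixel translation as a stand-in distortion, which is an assumption of the sketch rather than the distortion shown on the slide; the function names and the default `offsets` are likewise hypothetical.

```python
from typing import List, Sequence, Tuple

Bitmap = List[List[int]]


def shift(bitmap: Sequence[Sequence[int]], dx: int, dy: int, fill: int = 0) -> Bitmap:
    """Translate a bitmap by (dx, dy) pixels, padding exposed cells with `fill`."""
    h, w = len(bitmap), len(bitmap[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                out[ny][nx] = bitmap[y][x]
    return out


def synthesize(bitmap: Sequence[Sequence[int]], label: str,
               offsets: Sequence[Tuple[int, int]] = ((1, 0), (-1, 0), (0, 1), (0, -1)),
               ) -> List[Tuple[Bitmap, str]]:
    """Create one distorted copy per offset; every copy keeps the original label."""
    return [(shift(bitmap, dx, dy), label) for dx, dy in offsets]


# A 3x3 vertical stroke labelled "I" becomes four extra training examples.
stroke = [[0, 1, 0],
          [0, 1, 0],
          [0, 1, 0]]
extra = synthesize(stroke, "I")
print(len(extra))
```

The key property is that the distortion changes the pixels but not the label, so each synthesized copy is a valid training example.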

  • 22/29

    Synthesizing data by introducing distortions: Speech recognition

    Original audio:

    Audio on bad cellphone connection

    Noisy background: Crowd

    Noisy background: Machinery

    [www.pdsounds.org]
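
For audio, one way to picture the synthesis is to add a background recording on top of the clean clip, sample by sample. A minimal sketch, assuming clips are lists of float samples at the same sample rate; `mix`, `augment`, and the `gain` parameter are illustrative names, not part of the slides.

```python
from typing import Dict, List, Sequence, Tuple

Clip = List[float]


def mix(clean: Sequence[float], background: Sequence[float], gain: float) -> Clip:
    """Add a scaled background to the clean clip, looping the background if it is shorter."""
    return [sample + gain * background[i % len(background)]
            for i, sample in enumerate(clean)]


def augment(examples: Sequence[Tuple[Clip, str]],
            backgrounds: Dict[str, Sequence[float]],
            gain: float = 0.3) -> List[Tuple[Clip, str]]:
    """Return the original examples plus one noisy copy per background, labels unchanged."""
    out = list(examples)
    for samples, label in examples:
        for background in backgrounds.values():
            out.append((mix(samples, background, gain), label))
    return out


examples = [([1.0, 0.0, -1.0], "three")]
backgrounds = {"crowd": [0.5], "machinery": [1.0, -1.0]}
print(len(augment(examples, backgrounds, gain=0.5)))
```

With two backgrounds, each original clip yields two additional labelled clips, so the training set triples in size.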

  • 23/29

    Synthesizing data by introducing distortions

    Distortion introduced should be representative of the type of noise/distortions in the test set.

    Audio: background noise, bad cellphone connection

    Usually does not help to add purely random/meaningless noise to your data.

    E.g. adding random noise to the intensity (brightness) of each pixel.

    [Adam Coates and Tao Wang]

  • 24/29

    Discussion on getting more data

    1. Make sure you have a low bias classifier before expending the effort. (Plot learning curves.) E.g. keep increasing the number of features/number of hidden units in the neural network until you have a low bias classifier.

    2. How much work would it be to get 10x as much data as we currently have?

    - Artificial data synthesis

    - Collect/label it yourself

    - Crowdsource (e.g. Amazon Mechanical Turk)
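
The question in item 2 is a back-of-the-envelope calculation. A sketch with made-up numbers: the 1,000 existing examples and 10 seconds of labelling time per example are assumptions for illustration, not figures from the slides.

```python
def labelling_hours(current_examples: int, factor: int,
                    seconds_per_example: float) -> float:
    """Hours of manual labelling needed to grow the data set to `factor` times its size."""
    extra_examples = current_examples * (factor - 1)  # examples still to be collected
    return extra_examples * seconds_per_example / 3600


# Going from 1,000 to 10,000 examples at 10 seconds each.
print(labelling_hours(current_examples=1000, factor=10, seconds_per_example=10))
```

If the answer comes out at a few days of work or less, collecting the data by hand may be cheaper than further tuning of the algorithm.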

  • 26/29

    Application example: Photo OCR

    Ceiling analysis: What part of the pipeline to work on next

    Machine Learning

  • 27/29

    Estimating the errors due to each component (ceiling analysis)

    Image → Text detection → Character segmentation → Character recognition

    What part of the pipeline should you spend the most time trying to improve?

    Component                 Accuracy
    Overall system            72%
    Text detection            89%
    Character segmentation    90%
    Character recognition     100%
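
The table can be turned into a ranking by taking differences between consecutive rows. This sketch assumes each row reports overall system accuracy once that stage, and every stage before it, is given ground-truth output; `ceiling_gains` is an illustrative helper name.

```python
from typing import Dict, List, Tuple

# Accuracy figures from the table, in pipeline order.
rows: List[Tuple[str, int]] = [
    ("Overall system", 72),
    ("Text detection", 89),
    ("Character segmentation", 90),
    ("Character recognition", 100),
]


def ceiling_gains(rows: List[Tuple[str, int]]) -> Dict[str, int]:
    """Accuracy gained (in percentage points) by making each stage perfect in turn."""
    gains = {}
    for (_, previous), (name, accuracy) in zip(rows, rows[1:]):
        gains[name] = accuracy - previous
    return gains


gains = ceiling_gains(rows)
for name, gain in sorted(gains.items(), key=lambda item: -item[1]):
    print(f"{name}: +{gain} points")
```

On these numbers, perfect text detection is worth 17 points, perfect character recognition 10, and perfect character segmentation only 1, so text detection has the most headroom.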

  • 28/29

    Another ceiling analysis example

    Face recognition from images (Artificial example)

    Camera image → Preprocess (remove background) → Face detection → Eyes segmentation, Nose segmentation, Mouth segmentation → Logistic regression

  • 29/29

    Another ceiling analysis example

    Camera image → Preprocess (remove background) → Face detection → Eyes segmentation, Nose segmentation, Mouth segmentation → Logistic regression

    Component
    Overall system
    Preprocess (remove background)
    Face detection
    Eyes segmentation
    Nose segmentation
    Mouth segmentation
    Logistic regression