DIP WISC 13 Recognition


    Object Recognition


    (C) 2010 by Yu Hen Hu

    An Example

    Consider classifying eggs into 3 categories with labels: medium, large, or jumbo.

    The classification will be based on the weight (W) and length (L) of each egg.

    Decision rules (see the code sketch below):
    1. If W < 10 g and L < 3 cm, then the egg is medium.
    2. If W > 20 g and L > 5 cm, then the egg is jumbo.
    3. Otherwise, the egg is large.

    Three components in a pattern classifier:
    - Category (target) label
    - Features
    - Decision rule

    (Figure: feature plane with axes W and L.)
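    As a minimal illustration of the three components above, the following Python sketch (thresholds taken directly from the decision rules; the function name is illustrative) encodes the features, the decision rule, and the category labels:

        def classify_egg(weight_g, length_cm):
            """Assign a category label to an egg from its features (W, L).

            Implements the three rules above: medium if W < 10 g and L < 3 cm,
            jumbo if W > 20 g and L > 5 cm, large otherwise.
            """
            if weight_g < 10 and length_cm < 3:
                return "medium"
            if weight_g > 20 and length_cm > 5:
                return "jumbo"
            return "large"

        # Examples
        print(classify_egg(8, 2.5))   # -> medium
        print(classify_egg(25, 6))    # -> jumbo
        print(classify_egg(15, 4))    # -> large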


    Pattern Classification


    Features


    Multi-Spectral Image Feature Vector


    Multi-spectral Pixel Classification

    FIGURE 12.13  Bayes classification of multispectral data. (a)-(d) Images in the visible blue, visible green, visible red, and near-infrared wavelengths. (e) Mask showing sample regions of water (1), urban development (2), and vegetation (3). (f) Results of classification; the black dots denote points classified incorrectly, while the other (white) points were classified correctly. (g) All image pixels classified as water (in white). (h) All image pixels classified as urban development (in white). (i) All image pixels classified as vegetation (in white).
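    A classifier of this kind treats each pixel as a 4-dimensional feature vector (one component per spectral band) and learns per-class statistics from the pixels under each mask region. Below is a minimal NumPy sketch of that data preparation, assuming hypothetical arrays bands (4 x H x W imagery) and mask (H x W region labels, 0 = unlabeled); the random data stands in for the real images:

        import numpy as np

        H, W = 512, 512
        bands = np.random.rand(4, H, W)           # blue, green, red, near-IR (stand-in data)
        mask = np.zeros((H, W), dtype=int)        # 0 = unlabeled
        mask[:100, :100] = 1                      # water sample region
        mask[200:300, 200:300] = 2                # urban-development sample region
        mask[400:, 400:] = 3                      # vegetation sample region

        # Stack the bands so that every pixel becomes one 4-D feature vector.
        features = bands.reshape(4, -1).T         # shape (H*W, 4)
        labels = mask.reshape(-1)                 # shape (H*W,)

        # Per-class training samples and the statistics a Bayes/ML rule would use.
        for c in (1, 2, 3):
            samples = features[labels == c]            # training pixels of class c
            mean_c = samples.mean(axis=0)              # 4-vector class mean
            cov_c = np.cov(samples, rowvar=False)      # 4 x 4 class covariance
            print(c, mean_c.round(2), cov_c.shape)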


    Confusion Matrix


    Decision Boundary


    Remote Sensing


    A Taxonomy of Ground Objects


    Template Matching


    Template Matching Example


    Statistical Pattern Classification

    Objective: to draw an optimal decision rule given a set of training samples.

    The decision rule is optimal because it is designed to minimize a cost function, called the expected risk, in making the classification decision.

    This is a learning problem!

    Assumptions:
    1. Features are given. The feature selection problem needs to be solved separately. Training samples are randomly chosen from a population.
    2. Target labels are given. Assume each sample is assigned a specific, unique label by nature, and that the labels of the training samples are known.


    Pattern Classification Problem

    Let X be the feature space, and C = {c(i), 1 ≤ i ≤ M} be M class labels. For each x ∈ X, it is assumed that nature assigns a class label t(x) ∈ C according to some probabilistic rule.

    Randomly draw a feature vector x from X:

    P(c(i)) = P(x ∈ c(i)) is the a priori probability that t(x) = c(i), without referring to x.

    P(c(i)|x) = P(x ∈ c(i) | x) is the a posteriori probability that t(x) = c(i), given the value of x.

    P(x|c(i)) = P(x | x ∈ c(i)) is the conditional probability (a.k.a. likelihood function) that x will assume its value, given that it is drawn from class c(i).

    P(x) is the marginal probability that x will assume its value, without referring to which class it belongs to.

    Using Bayes' rule, we have P(x|c(i)) P(c(i)) = P(c(i)|x) P(x). Also,

    P(c(i)|x) = P(x|c(i)) P(c(i)) / Σ_{j=1..M} P(x|c(j)) P(c(j))
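    A tiny numeric sketch of this relationship (the priors and likelihoods below are made up purely for illustration):

        # Hypothetical two-class problem: priors P(c(i)) and likelihoods P(x|c(i))
        # for one observed feature value x.
        priors = [0.7, 0.3]           # P(c(1)), P(c(2))
        likelihoods = [0.2, 0.6]      # P(x|c(1)), P(x|c(2))

        # Marginal probability P(x) = sum over j of P(x|c(j)) P(c(j))
        p_x = sum(l * p for l, p in zip(likelihoods, priors))

        # Posteriors P(c(i)|x) = P(x|c(i)) P(c(i)) / P(x)
        posteriors = [l * p / p_x for l, p in zip(likelihoods, priors)]
        print(posteriors)             # [0.4375, 0.5625]; they sum to 1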


    Decision Function and Probability of Mis-Classification

    Given a sample x ∈ X, the objective of statistical pattern classification is to design a decision rule g(x) ∈ C to assign a label to x. If g(x) = t(x), the naturally assigned class label, then it is a correct classification. Otherwise, it is a mis-classification.

    Define a 0-1 loss function:

    ε(x | g(x)) = 0 if g(x) = t(x), and 1 if g(x) ≠ t(x)

    Given that g(x) = c(i*), then

    P(ε(x | g(x) = c(i*)) = 0 | x) = P(t(x) = c(i*) | x) = P(c(i*) | x)

    Hence the probability of mis-classification for the specific decision g(x) = c(i*) is

    P(ε(x | g(x) = c(i*)) = 1 | x) = 1 - P(c(i*) | x)

    Clearly, to minimize the probability of mis-classification for a given x, the best choice is to choose g(x) = c(i*) if P(c(i*)|x) > P(c(i)|x) for i ≠ i*.


    MAP: Maximum A Posteriori Classifier

    The MAP classifier stipulates that the classifier that minimizes the probability of mis-classification should choose g(x) = c(i*) if P(c(i*)|x) > P(c(i)|x), i ≠ i*.

    This is an optimal decision rule. Unfortunately, in real-world applications it is often difficult to estimate P(c(i)|x).

    Fortunately, to derive the optimal MAP decision rule, one can instead estimate a discriminant function Gi(x) such that for any x ∈ X and i ≠ i*,

    Gi*(x) > Gi(x)  iff  P(c(i*)|x) > P(c(i)|x)

    Gi(x) can be an approximation of P(c(i)|x) or any function satisfying the above relationship.
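    A minimal sketch of the MAP rule through a discriminant function, here Gi(x) = log p(x|c(i)) + log P(c(i)), which is a monotone transform of the posterior and therefore preserves the ordering above (the priors, means, and standard deviations are hypothetical):

        import numpy as np
        from scipy.stats import norm

        # Hypothetical 1-D, two-class problem with Gaussian likelihoods.
        priors = np.array([0.7, 0.3])
        means = np.array([0.0, 2.0])
        stds = np.array([1.0, 1.0])

        def map_classify(x):
            """Return the index i* that maximizes G_i(x) = log p(x|c(i)) + log P(c(i))."""
            G = norm.logpdf(x, loc=means, scale=stds) + np.log(priors)
            return int(np.argmax(G))

        print(map_classify(0.2))   # -> 0 (close to mean 0, larger prior)
        print(map_classify(2.5))   # -> 1 (likelihood outweighs the smaller prior)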


    Maximum Likelihood Classifier

    Using Bayes' rule, p(c(i)|x) = p(x|c(i)) p(c(i)) / p(x). Hence the MAP decision rule can be expressed as: g(x) = c(i*) if p(c(i*)) p(x|c(i*)) > p(c(i)) p(x|c(i)), i ≠ i*.

    Under the assumption that the a priori probability is unknown, we may assume p(c(i)) = 1/M. As such, maximizing p(x|c(i)) is equivalent to maximizing p(c(i)|x).

    The likelihood function p(x|c(i)) may assume a uni-variate Gaussian model. That is,

    p(x|c(i)) ~ N(μi, σi)

    μi and σi can be estimated using samples from {x | t(x) = c(i)}.

    The a priori probability p(c(i)) can be estimated as:

    P(c(i)) ≈ #{x; x ∈ X s.t. t(x) = c(i)} / |X|
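    A short sketch of this estimate-then-decide procedure (univariate Gaussian per class, toy data, priors from label counts; all values are illustrative):

        import numpy as np

        # Toy labeled training data: feature value x and its class label t(x).
        x_train = np.array([1.1, 0.9, 1.3, 4.8, 5.2, 5.0, 4.9])
        t_train = np.array([0,   0,   0,   1,   1,   1,   1])
        classes = np.unique(t_train)

        # Estimate per-class Gaussian parameters and a priori probabilities.
        mu = np.array([x_train[t_train == c].mean() for c in classes])
        sigma = np.array([x_train[t_train == c].std(ddof=1) for c in classes])
        prior = np.array([(t_train == c).mean() for c in classes])   # #{t(x)=c(i)} / |X|

        def likelihood(x, m, s):
            """Univariate Gaussian density p(x|c(i)) with mean m and std s."""
            return np.exp(-0.5 * ((x - m) / s) ** 2) / (s * np.sqrt(2 * np.pi))

        def ml_classify(x):
            """Maximum likelihood: pick the class whose p(x|c(i)) is largest."""
            return int(classes[np.argmax(likelihood(x, mu, sigma))])

        print(ml_classify(1.0))   # -> 0
        print(ml_classify(4.7))   # -> 1
        print(prior)              # estimated a priori probabilities, approx. [0.43, 0.57]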


    Nearest-Neighbor Classifier

    Let {y(1), ..., y(n)} ⊂ X be n samples which have already been classified. Given a new sample x, the NN decision rule chooses g(x) = c(i) if the nearest stored sample

    y(i*),  where i* = arg min_{1 ≤ i ≤ n} ||y(i) - x||,

    is labeled with c(i). As n → ∞, the probability of mis-classification using the NN classifier is at most twice the probability of mis-classification of the optimal (MAP) classifier.

    The k-Nearest Neighbor classifier examines the k nearest classified samples and classifies x into the majority class among them.

    Problem of implementation: it requires large storage to keep ALL the training samples.
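    A compact sketch of both rules using a brute-force search over all stored samples, which is exactly the storage and computation issue noted above (the data and names are illustrative):

        import numpy as np
        from collections import Counter

        # Stored, already-classified samples y(1)..y(n) and their labels.
        y_train = np.array([[1.0, 1.2], [0.8, 1.0], [4.9, 5.1], [5.2, 4.8]])
        labels = np.array([0, 0, 1, 1])

        def nn_classify(x):
            """1-NN: copy the label of the stored sample closest to x."""
            d = np.linalg.norm(y_train - x, axis=1)    # ||y(i) - x|| for every i
            return int(labels[np.argmin(d)])

        def knn_classify(x, k=3):
            """k-NN: majority vote among the k nearest stored samples."""
            d = np.linalg.norm(y_train - x, axis=1)
            nearest = labels[np.argsort(d)[:k]]
            return int(Counter(nearest.tolist()).most_common(1)[0][0])

        print(nn_classify(np.array([1.1, 0.9])))    # -> 0
        print(knn_classify(np.array([4.5, 5.0])))   # -> 1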