Fast Class Rendering Using Multiresolution Classification in Discrete Cosine Transform Domain

Presented by Li-Jen Kao, July 2005

Outline

• Introduction
• Feature Extraction
• Classification Scheme
• Experimental Results
• Conclusion

1 Introduction

Classification of objects (or patterns) into a number of predefined classes has been studied extensively in a wide variety of applications, such as:
• optical character recognition (OCR)
• speech recognition
• face recognition

The design of a classification system can be considered in terms of two subproblems:
• feature extraction
• classification

Feature extraction
• Features are functions of the measurements performed on a class of objects.
• No general solution to feature extraction has been found for most applications.

Our purpose is to design a general classification scheme that is less dependent on domain-specific knowledge.
• Reliable and general features are required.

Discrete Cosine Transform (DCT)

It helps separate an image into parts of differing importance with respect to the image's visual quality.

Due to the energy compacting property of DCT, much of the signal energy has a tendency to lie at low frequencies.

Four advantages in applying DCT

• The features extracted by DCT are general and reliable, and can be applied to most vision-oriented applications.
• The amount of data to be stored can be reduced tremendously.
• Multiresolution classification and progressive matching are achieved naturally.
• The DCT is scale-invariant and less sensitive to noise and distortion.

Two philosophies of classification

• Statistical: the measurements that describe an object are treated only formally as statistical variables, neglecting their “meaning”.
• Structural: objects are regarded as compositions of structural units, usually called primitives.

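2 Feature Extraction via DCT

The DCT coefficients C(u, v) of an N×N image represented by x(i, j) can be defined as

$$C(u, v) = \frac{2}{N}\,\alpha(u)\,\alpha(v) \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} x(i, j) \cos\left(\frac{(2i+1)u\pi}{2N}\right) \cos\left(\frac{(2j+1)v\pi}{2N}\right),$$

where

$$\alpha(w) = \begin{cases} \dfrac{1}{\sqrt{2}} & \text{for } w = 0, \\ 1 & \text{otherwise.} \end{cases}$$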
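As a minimal sketch, the definition can be evaluated directly in Python. The function names `alpha` and `dct2` are illustrative and not taken from the slides, and the direct four-loop evaluation is chosen for readability rather than speed; a practical system would presumably use a fast DCT routine.

```python
import math


def alpha(w):
    """Normalization factor: 1/sqrt(2) for w == 0, 1 otherwise."""
    return 1.0 / math.sqrt(2.0) if w == 0 else 1.0


def dct2(x):
    """Direct O(N^4) 2-D DCT of an N x N image given as a list of rows."""
    n = len(x)
    # cos_table[u][i] = cos((2i + 1) * u * pi / (2N))
    cos_table = [
        [math.cos((2 * i + 1) * u * math.pi / (2 * n)) for i in range(n)]
        for u in range(n)
    ]
    c = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            total = 0.0
            for i in range(n):
                for j in range(n):
                    total += x[i][j] * cos_table[u][i] * cos_table[v][j]
            c[u][v] = (2.0 / n) * alpha(u) * alpha(v) * total
    return c


# A constant image has all of its energy in the DC coefficient C(0, 0).
flat = [[1.0] * 4 for _ in range(4)]
coeffs = dct2(flat)
```

With this normalization the transform is orthonormal, so the sum of squared pixel values equals the sum of squared coefficients.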

Figure 1. The DCT coefficients of the character image “為”.

Figure 2. Illustration of the multiresolution ability of DCT: (a) the original image of size 48×48; (b) the reconstructed image of size 8×8; (c) the reconstructed image of size 16×16; (d) the reconstructed image of size 32×32.
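The slides do not state how the panels of Figure 2 were produced. One plausible reading, sketched here under that assumption, is to keep only the top-left m×m block of DCT coefficients, zero the rest, and invert the transform. All function names are illustrative, and the 8×8 ramp image is a stand-in for a character bitmap.

```python
import math


def basis(n):
    """basis[u][i] = sqrt(2/N) * alpha(u) * cos((2i + 1) * u * pi / (2N))."""
    rows = []
    for u in range(n):
        scale = math.sqrt(1.0 / n) if u == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos((2 * i + 1) * u * math.pi / (2 * n))
                     for i in range(n)])
    return rows


def dct2(x):
    """Forward 2-D DCT of a square image (list of rows)."""
    n = len(x)
    b = basis(n)
    return [[sum(b[u][i] * b[v][j] * x[i][j] for i in range(n) for j in range(n))
             for v in range(n)] for u in range(n)]


def idct2(c):
    """Inverse 2-D DCT; exact inverse of dct2 because the basis is orthonormal."""
    n = len(c)
    b = basis(n)
    return [[sum(b[u][i] * b[v][j] * c[u][v] for u in range(n) for v in range(n))
             for j in range(n)] for i in range(n)]


def keep_low(c, m):
    """Zero every coefficient outside the top-left m x m block."""
    return [[value if u < m and v < m else 0.0 for v, value in enumerate(row)]
            for u, row in enumerate(c)]


def reconstruct(x, m):
    """Reconstruct the image from its m x m lowest-frequency coefficients."""
    return idct2(keep_low(dct2(x), m))


def squared_error(a, b):
    return sum((p - q) ** 2 for ra, rb in zip(a, b) for p, q in zip(ra, rb))


image = [[float(i + j) for j in range(8)] for i in range(8)]
errors = [squared_error(image, reconstruct(image, m)) for m in (1, 2, 4, 8)]
```

The reconstruction error shrinks as m grows and vanishes when all coefficients are kept, which is the behaviour the progressive matching scheme relies on.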

3. The Proposed Classification Scheme

The ultimate goal of classification is to assign an unknown pattern x to one of M possible classes (c1, c2, …, cM).

Each pattern is represented by a set of D features, viewed as a D-dimensional feature vector.

3.1. Our classification model

In the training mode:
• The feature extraction module finds the appropriate features for representing the input patterns, and the classifier is trained.

In the classification mode:
• The trained classifier assigns the input pattern to one of the pattern classes based on the measured features.

To alleviate the burden of the classification process, the process is usually divided into two stages:
• coarse classification
• fine classification

Figure 3. Model for multiresolution classification

3.2. Coarse classification module

In the training mode:
• The features of each training sample are first extracted by DCT and quantized.
• The D most significant quantized DCT features of each training sample are then transformed to a code, called the grid code (GC), which corresponds to a grid cell of the feature space as partitioned by the quantization method.
• Training samples with the same GC are similar and can be classified into the same coarse class.
• Therefore, the information about all possible GCs is gathered in the training mode.

In the classification mode:
• The classes with the same GC as that of the test sample are chosen as the candidates for the test sample.

3.2.1. Quantization

The 2-D DCT coefficient F(u,v) is quantized to F’(u,v) according to the following equation:

$$F'(u, v) = \frac{F(u, v)}{Q}$$

Most of the high-frequency coefficients will be quantized to zero, and only the most significant coefficients will be retained.
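The quantization step can be sketched as follows. The rounding operator is not legible in the source, so rounding F(u,v)/Q to the nearest integer (halves rounded up) is an assumption here, as are the function name `quantize` and the sample values.

```python
import math


def quantize(coeffs, q):
    """Divide each coefficient by the step q and round to the nearest integer.

    Assumption: the source does not show the rounding rule; round-half-up
    is used here purely for illustration.
    """
    return [[int(math.floor(value / q + 0.5)) for value in row] for row in coeffs]


# Hypothetical coefficients: large DC term, small high-frequency terms.
sample = [[100.0, -7.0], [3.0, 0.4]]
quantized = quantize(sample, 8)
```

A larger Q makes the grid of the feature space coarser, so more training samples fall into the same cell.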

3.2.2. Grid Code Transformation

After the quantization process, the D most significant quantized DCT features of sample Oi are obtained, say [qi1, qi2, …, qiD].

The significance of each DCT coefficient is decided according to the following zigzag order: F(0,0), F(0,1), F(1,0), F(2,0), F(1,1), F(0,2), F(0,3), F(1,2), F(2,1), F(3,0), F(3,1), …, and so on.

Because the value of qij may be negative, for ease of operation we transform qij to a positive integer dij by adding a number, say kj, to qij.

In this way, object Oi is transformed to a D-digit GC. This process is called the grid code transformation (GCT).
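The grid code transformation can be sketched as below. Representing the GC as a tuple of integers, the function names, and the offset values kj are all assumptions made for illustration; the slides do not say how the kj are chosen. For a 4×4 block, the generated order reproduces the zigzag sequence listed above.

```python
def zigzag(n):
    """Zigzag scan order over an n x n block as (u, v) index pairs."""
    order = []
    for s in range(2 * n - 1):          # s = u + v indexes the anti-diagonal
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        if s % 2 == 0:                  # even diagonals are walked bottom-up
            rows = reversed(rows)
        order.extend((u, s - u) for u in rows)
    return order


def grid_code(quantized, offsets):
    """Map the D most significant quantized coefficients to a D-component code.

    offsets[j] plays the role of k_j: it shifts q_ij so that d_ij is positive.
    D is taken to be len(offsets).
    """
    positions = zigzag(len(quantized))[:len(offsets)]
    return tuple(quantized[u][v] + k for (u, v), k in zip(positions, offsets))


# Hypothetical quantized 4 x 4 block and offsets.
q = [[3, -1, 0, 0],
     [2, 0, 0, 0],
     [-3, 0, 0, 0],
     [0, 0, 0, 0]]
gc = grid_code(q, offsets=[1, 5, 5, 5])
```

Tuples compare lexicographically in Python, so a list of such codes can be sorted and searched directly.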

3.2.3. Grid Code Sorting and Elimination

After the GCT, we obtain a list of triplets (Ti, Ci, GCi), where
• Ti is the ID of a training sample,
• Ci is the ID of the class the training sample belongs to,
• GCi is the grid code of the training sample.

The list is then sorted in ascending order of GC.

Given the GC of a test sample, we can get the list of candidate classes that share that GC.

Elimination of Redundancy

Redundancy occurs when training samples belonging to the same class have the same GC.

This redundancy can be eliminated by establishing an abstract lookup table that contains only the GCs and their corresponding classes.

Then, given a GC, the table returns the relevant classes very quickly by binary search.
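A minimal sketch of the lookup table, assuming grid codes are stored as tuples and the table as two parallel sorted lists (both choices are illustrative, not taken from the slides). Duplicate (GC, class) pairs are dropped while building, and the standard `bisect` module provides the binary search.

```python
from bisect import bisect_left


def build_lookup(triplets):
    """Build a sorted GC table from (sample_id, class_id, grid_code) triplets."""
    # The set removes samples of the same class that share a GC.
    pairs = sorted({(gc, class_id) for _, class_id, gc in triplets})
    gcs, classes = [], []
    for gc, class_id in pairs:
        if gcs and gcs[-1] == gc:
            classes[-1].append(class_id)
        else:
            gcs.append(gc)
            classes.append([class_id])
    return gcs, classes


def candidates(table, gc):
    """Return the candidate classes for a grid code, or [] if it is unseen."""
    gcs, classes = table
    pos = bisect_left(gcs, gc)
    if pos < len(gcs) and gcs[pos] == gc:
        return classes[pos]
    return []


# Hypothetical training triplets (T_i, C_i, GC_i).
training = [
    (1, "A", (4, 4, 7)),
    (2, "A", (4, 4, 7)),
    (3, "B", (4, 4, 7)),
    (4, "C", (2, 5, 1)),
]
table = build_lookup(training)
```

A hash map would also work for exact lookup; the sorted list is used here because the slides name binary search explicitly.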

3.3. The fine classification module

Progressive matching method
• Adding more DCT coefficients usually implies increasing the resolution level of an image.
• If the current resolution is not high enough to distinguish one character from the others, the resolution level has to be raised so that the discrimination power also improves.

The establishment of the templates for each class
• Templates are established in the DCT domain.
• For each class, the average DCT coefficients of size N×N are computed from the training samples of that class.
• In this way, M sets of average DCT coefficients are obtained and serve as the templates, one per class.
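Template construction amounts to an element-wise mean per class. The sketch below assumes each training sample is given as a (class ID, coefficient matrix) pair; the function name and the sample data are hypothetical.

```python
def class_templates(samples):
    """Average the DCT coefficient matrices of the samples in each class.

    samples: iterable of (class_id, coeffs), coeffs being a list of rows.
    Returns a dict mapping class_id to its averaged coefficient matrix.
    """
    sums, counts = {}, {}
    for class_id, coeffs in samples:
        if class_id not in sums:
            sums[class_id] = [[0.0] * len(row) for row in coeffs]
            counts[class_id] = 0
        acc = sums[class_id]
        for u, row in enumerate(coeffs):
            for v, value in enumerate(row):
                acc[u][v] += value
        counts[class_id] += 1
    return {
        class_id: [[value / counts[class_id] for value in row] for row in acc]
        for class_id, acc in sums.items()
    }


# Hypothetical 2 x 2 coefficient matrices for two classes.
templates = class_templates([
    ("A", [[2.0, 0.0], [0.0, 0.0]]),
    ("A", [[4.0, 2.0], [0.0, 0.0]]),
    ("B", [[1.0, 1.0], [1.0, 1.0]]),
])
```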

The sum of squared differences (SSD) is used as the matching criterion.

The matching of x and Ti is decomposed into K iterations, each of which corresponds to matching within a block of size n_k×n_k.

After the kth iteration, the block size is enlarged from n_k×n_k to n_{k+1}×n_{k+1}, where n_{k+1} = n_k + d.

The process is repeated until one of the stopping criteria is satisfied. The criteria are chosen 1) to preserve enough signal energy in the block, and 2) to reject unqualified classes as soon as possible.
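The slides do not give the exact stopping criteria, so the following is only a sketch under stated assumptions: a class is rejected once its accumulated SSD exceeds a fixed threshold `reject_above` (hypothetical), the best class so far is always kept, and matching stops when a single class survives or the block reaches `n_max`. Only the newly added coefficients are compared in each iteration, so earlier work is never repeated.

```python
def ring_ssd(a, b, lo, hi):
    """SSD over the coefficients added when the block grows from lo to hi."""
    total = 0.0
    for u in range(hi):
        for v in range(hi):
            if max(u, v) >= lo:
                diff = a[u][v] - b[u][v]
                total += diff * diff
    return total


def progressive_match(x, templates, n0, d, n_max, reject_above):
    """Return the class whose template best matches x (assumed stop rules)."""
    if not templates or d < 1:
        raise ValueError("need at least one template and a step d >= 1")
    ssd = {class_id: 0.0 for class_id in templates}
    prev, size = 0, min(n0, n_max)
    while True:
        for class_id in ssd:
            ssd[class_id] += ring_ssd(x, templates[class_id], prev, size)
        best = min(ssd, key=ssd.get)
        # Assumed rejection rule: drop classes above the threshold, keep the best.
        ssd = {c: s for c, s in ssd.items() if s <= reject_above or c == best}
        if len(ssd) == 1 or size >= n_max:
            return best
        prev, size = size, min(size + d, n_max)


# Hypothetical 4 x 4 DCT coefficients of a test pattern and three templates.
zeros = [[0.0] * 4 for _ in range(4)]
x = [[10.0, 2.0, 0.0, 0.0], [1.0, 0.0, 0.0, 0.0], zeros[2][:], zeros[3][:]]
templates = {
    "A": [[10.0, 2.0, 0.0, 0.0], [1.0, 0.0, 0.0, 0.0],
          [0.0, 0.0, 1.0, 0.0], zeros[3][:]],
    "B": zeros,
    "C": [[10.0, -2.0, 0.0, 0.0], [1.0, 0.0, 0.0, 0.0], zeros[2][:], zeros[3][:]],
}
winner = progressive_match(x, templates, n0=1, d=1, n_max=4, reject_above=5.0)
```

In this example class B is rejected at the 1×1 block and class C at the 2×2 block, so the remaining coefficients are never compared.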

4 Experimental Results

• 18600 samples (about 640 categories) were extracted from the Kin-Guan (金剛) bible.
• Each character image was transformed into a 48×48 bitmap.
• 1000 of the 18600 samples were used for testing and the others were used for training.
• The D most significant DCT coefficients were quantized and transformed to a GC for each sample.

Figure 4. Reduction and accuracy rate using our coarse classification scheme

Figure 5. Accuracy rate using both coarse and fine classification

5 Conclusions

This paper presents a multiresolution classification scheme based on DCT for vision-based applications.

The DCT features of a pattern can be extracted progressively according to their significance.

When an unknown object is classified, most of the improbable candidate classes for the object can be eliminated at lower resolution levels.

Experiments were conducted for recognizing handwritten characters in Chinese palaeography and showed that our approach performs well in this application domain.

Future Work

Since only a preliminary experiment has been carried out to test our approach, much work remains to improve this system. For example, since features of different types complement one another in classification performance, using different types of vision-oriented features simultaneously could improve classification accuracy.