TIMBER DEFECT DETECTION BASED ON SYSTEMATIC FEATURE
ANALYSIS AND ONE CLASS CLASSIFIER
UMMI RABA’AH BINTI HASHIM
A thesis submitted in fulfilment of the
requirements for the award of the degree of
Doctor of Philosophy (Computer Science)
Faculty of Computing
Universiti Teknologi Malaysia
DECEMBER 2015
DEDICATION
To my beloved husband, children, parents and brothers.
ACKNOWLEDGEMENT
In the name of Allah, most gracious, most merciful. Praise to Allah, for
guiding me in the right path, blessing me with the best in this life. It takes the efforts
and support of many to bring this research study to completion. I am indebted to the
dozens of people who guided and supported me throughout this study. I would like to
express my gratitude to the following special individuals:
1. My supervisor and co-supervisor, Assoc. Prof. Dr. Siti Zaiton binti Mohd
Hashim and Assoc. Prof. Dr. Azah Kamilah Muda, for their wonderful
guidance and continuous encouragement during the progression of my study.
2. Academicians of UTM, for their valuable teaching, comments, ideas and
motivation for this research.
3. Industry experts from Hasro Malaysia, Teras Puncak and Elegant Success
(Malaysian wood products manufacturers) for their co-operation, invaluable
consultation and kind support.
4. Universiti Teknikal Malaysia Melaka (UTeM) and Ministry of Education
Malaysia for their generous financial support.
5. My husband and children, for their patience and love.
6. My parents and brothers, for their blessing and care.
ABSTRACT
Substantial research effort has been devoted to automating timber defect
detection in order to improve the quality of timber products, optimise raw material
resources, increase productivity and reduce errors related to human labour. This study
extends the work on automated inspection of timber boards to Malaysian timber species,
in the hope that the outcome will benefit the local wood product industries. The study
aims to propose a timber surface defect detection approach that is robust in
detecting various defects on multiple timber species using significant texture
features, validated using data from local timber species. For the experiments, defective
samples of Malaysian hardwood were collected and labelled under the supervision of
industry experts. Additionally, this work gives new insight into the characterisation
of timber defect images by using statistical texture features from the
orientation-independent Grey Level Dependence Matrix (GLDM) with appropriate
parameter analysis. A Systematic Feature Analysis (SFA), which includes exploratory
and confirmatory multivariate analysis, was performed to investigate the
discriminative power of the proposed feature set. The SFA produces a feature set of
timber surface defects capable of providing significant discrimination between defect
and clear wood classes. Finally, a new concept in the domain of timber defect
detection, based on outlier detection, was introduced to overcome the problem of
imbalanced data. This study proposes a robust Mahalanobis one class classifier (MC)
with the Fast Minimum Covariance Determinant estimator (MC-FMCD) for
species-independent timber defect detection. The experimental results show that the
proposed approach achieved superior performance over the classical Mahalanobis
Distance (MD) and is robust in detecting many types of defects across timber species.
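As a rough illustration of the outlier-detection idea summarised above, the following Python sketch compares a classical Mahalanobis one class classifier with a robustly fitted one. It is not the implementation used in this thesis. It assumes only two hypothetical texture features per sample; the data, the function names (`mean_cov`, `sq_mahalanobis`, `concentrated_fit`, `is_defect`), the subset size `h` and the chi-square cut-off are all illustrative assumptions. The robust fit is a single-start concentration loop, a simplification of FMCD, which normally uses many random starting subsets and a consistency correction.

```python
import math

def mean_cov(points):
    """Sample mean and 2x2 covariance (sxx, sxy, syy) of (x, y) feature pairs."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)

def sq_mahalanobis(p, mean, cov):
    """Squared Mahalanobis distance, using the closed-form inverse of a 2x2 matrix."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = p[0] - mean[0], p[1] - mean[1]
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det

def concentrated_fit(points, h, max_iter=50):
    """Simplified robust fit (assumption): repeat concentration steps from one start.

    Each step refits the mean and covariance on the h samples closest to the
    current estimate, until the selected subset stops changing.
    """
    subset = list(points)
    for _ in range(max_iter):
        mean, cov = mean_cov(subset)
        ranked = sorted(points, key=lambda p: sq_mahalanobis(p, mean, cov))
        closest = ranked[:h]
        if sorted(closest) == sorted(subset):
            break
        subset = closest
    return mean_cov(subset)

def is_defect(p, mean, cov, coverage=0.975):
    """Flag p as an outlier. For 2 features the chi-square quantile is -2*ln(1 - coverage)."""
    threshold = -2.0 * math.log(1.0 - coverage)
    return sq_mahalanobis(p, mean, cov) > threshold

# Hypothetical training set: 25 clear wood samples contaminated by 5 defect samples.
grid = [-1.0, -0.5, 0.0, 0.5, 1.0]
clear = [(x, y) for x in grid for y in grid]
contaminants = [(8.0, 8.0), (8.5, 8.0), (8.0, 8.5), (9.0, 9.0), (8.5, 8.5)]
training = clear + contaminants

classic = mean_cov(training)               # classical estimate, pulled by contaminants
robust = concentrated_fit(training, h=22)  # estimate from the 22 most central samples

print("classic flags defect:", is_defect((8.0, 8.0), *classic))
print("robust flags defect: ", is_defect((8.0, 8.0), *robust))
print("robust flags clear:  ", is_defect((0.2, -0.3), *robust))
```

In this toy data the five contaminating samples inflate the classical covariance enough to mask a defect at (8, 8), whereas the estimate fitted on the most central samples flags it while still accepting a typical clear wood sample.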
ABSTRAK
Various research efforts have been carried out in automated timber defect
detection to improve the quality of timber products, optimise raw material
resources and increase productivity. Research in this field has been extended to
Malaysian timber species in the hope that the outcome will benefit the local wood
product industries. This study aims to propose a timber surface defect detection
approach that is robust in detecting various defects on multiple timber species
using significant texture features, validated using data from local timber species.
Defect samples from Malaysian hardwood species were collected and labelled under
the supervision of industry experts for use in this study. In addition, this study
gives new insight into the feature representation of timber defect images by using
statistical texture from the orientation-independent Grey Level Dependence Matrix
(GLDM) together with appropriate parameter analysis. A Systematic Feature Analysis
(SFA) comprising exploratory and confirmatory multivariate analysis was carried out
to examine the discriminative power of the proposed feature set. The SFA produced a
feature representation capable of significantly distinguishing between timber defect
classes and clear wood. Finally, a new concept in the domain of timber defect
detection, based on anomaly detection, was introduced to address the problem of
imbalanced data. This study proposes a robust Mahalanobis one class classifier (MC)
with the Fast Minimum Covariance Determinant estimator (MC-FMCD) for timber defect
detection regardless of timber species. The experimental results show that the
proposed approach achieved better performance than the classical Mahalanobis
Distance (MD) and is able to detect many types of defects across multiple timber
species.
TABLE OF CONTENTS
CHAPTER TITLE PAGE
DECLARATION ii
DEDICATION iii
ACKNOWLEDGEMENT iv
ABSTRACT v
ABSTRAK vi
TABLE OF CONTENTS vii
LIST OF TABLES xii
LIST OF FIGURES xiv
LIST OF ABBREVIATIONS xvii
LIST OF APPENDICES xx
TERMS AND DEFINITIONS xxi
1 INTRODUCTION 1
1.1 Overview 1
1.2 Research Background 2
1.3 Problem Statement and Research Aim 13
1.4 Research Objective 14
1.5 Research Scope 14
1.6 Significance of the Study 16
1.7 Research Methodology 17
1.8 Research Contribution 19
1.9 Thesis Structure 19
2 LITERATURE REVIEW 21
2.1 Introduction 21
2.2 Overview of Timber Process 26
2.3 Malaysian Timber Species 28
2.4 Timber Defects 31
2.5 Automated Vision Inspection (AVI) of Timber 33
2.5.1 Problem Background 33
2.5.2 AVI in Wood Industry 34
2.5.3 Sensors Used for AVI in Wood Industry 39
2.5.4 General Timber Defect Detection Approach 43
2.5.5 Feature Extraction on Defect Images 46
2.5.6 Defect Classification 50
2.5.7 Discussion 53
2.6 Statistical Texture Feature Based on Grey Level Dependence Matrix (GLDM) 55
2.6.1 Problem Background 55
2.6.2 Orientation Independent GLDM 58
2.6.3 Statistical Features of GLDM 63
2.7 One Class Classification for Imbalanced Data 71
2.7.1 Introduction and Problem Background 71
2.7.2 Distance-based One Class Classifier (OCC) 73
2.7.3 Fast Minimum Covariance Determinant as Robust Estimator 77
2.8 Summary 81
3 RESEARCH METHODOLOGY 82
3.1 Introduction 82
3.2 Problem Situation and Solution Concept 82
3.3 Research Design 87
3.3.1 Research Framework 87
3.3.2 Operational Framework 88
3.3.2.1 Phase 1: Construction of timber defect image dataset of Malaysian hardwood 89
3.3.2.2 Phase 2: Identification of significant texture feature set representing timber defect 90
3.3.2.3 Phase 3: Development of robust OCC with FMCD estimator for timber defect detection 91
3.3.3 Overall Research Plan 92
3.4 Evaluation Measurement 95
3.4.1 Multivariate Analysis of Variance (Manova) to Evaluate Feature Quality 95
3.4.2 Precision, Recall and F Measure to Measure Detection Performance 100
3.4.3 Over Detection and Under Detection Errors to Assess Segmentation Quality 102
3.5 Summary 103
4 CONSTRUCTION OF TIMBER SURFACE DEFECT IMAGE DATASET 104
4.1 Introduction 104
4.1 Timber Samples Collection 106
4.2 Image Acquisition Setup 106
4.3 Image Labelling and Processing 110
4.4 Findings 113
4.5 Summary 116
5 SIGNIFICANT FEATURE SET OF TIMBER SURFACE DEFECTS BASED ON STATISTICAL TEXTURE AND SYSTEMATIC FEATURE ANALYSIS 117
5.1 Introduction 117
5.2 Overview of Approach 118
5.3 Feature Extraction 121
5.3.1 Extracting Statistical Features from GLDM 121
5.3.2 Exploring Displacement and Quantization Parameter of GLDM 127
5.4 Evaluation of Feature Quality 133
5.4.1 Exploratory Feature Analysis 133
5.4.1.1 Univariate Feature Range Analysis 134
5.4.1.2 Bivariate Matrix of Scatter Plot 136
5.4.1.3 Multivariate Intra-Class and Inter-Class Distance between Clear Wood and Defects 137
5.4.2 Confirmatory Feature Analysis 139
5.4.2.1 Removing Linearly Dependent Features 141
5.4.2.2 Measuring Significant Difference between Defect Classes using Manova Statistics 143
5.4.2.3 Identifying Significant Features using Post-hoc Manova (Discriminant Analysis) 145
5.5 Performance Validation 149
5.5.1 Measuring Classification Performance across Feature Sets and Classifiers 150
5.5.2 Measuring Classification Performance of Individual Classes 153
5.5.3 Measuring Classification Accuracy across Timber Species 156
5.6 Discussion 158
5.7 Summary 159
6 ROBUST MAHALANOBIAN CLASSIFIER WITH FMCD ESTIMATOR (MC-FMCD) FOR TIMBER DEFECT DETECTION 160
6.1 Introduction 160
6.2 Overview of Approach 161
6.3 Experimental Setting for Simulated Datasets 163
6.4 Experimental Results for Simulated Datasets 165
6.4.1 Detection Performance across Various Defect Ratios 166
6.4.2 Detection Performance by Defect Type 170
6.4.3 Detection Performance between Classic MD and Robust MC-FMCD 174
6.4.4 Summary of Detection Performance across Timber Species 178
6.5 Expert Validation on Test Images 180
6.6 Discussion 185
6.7 Summary 186
7 CONCLUSION AND FUTURE RESEARCH 188
7.1 Summary of Research Finding 188
7.2 Research Contribution 191
7.3 Future Work Recommendation 193
7.4 Concluding Remark 195
REFERENCES 196
Appendices A - N 213 - 297
LIST OF TABLES
TABLE NO. TITLE PAGE
2.1 List of Malaysian timber classification based on density (MTIB, 2000) 29
2.2 Natural durability classification based on years (MTIB, 2000) 29
2.3 Characteristics of four types of timber species (MTIB, 2000) 30
2.4 List of common timber defect 32
2.5 Related works on automated inspection of wood products 36
2.6 Related studies on inspection of external wood defects 40
2.7 Images of directional matrices and rotation invariant matrix 61
3.1 Problem leading to solution 86
3.2 Overall research plan 92
3.3 Confusion matrix 102
4.1 List of data collection setting of past studies on timber surface defect detection 109
4.2 List of classes with example of sub-images collected 114
4.3 Number of samples collection across species 116
5.1 Example of sub-image and the corresponding dependence matrix 123
5.2 List of statistical texture features extracted 124
5.3 Example of extracted features (one sample per class, species=Meranti, d=1, q=32) 125
5.4 Texture characteristics of clear wood and defect 126
5.5 Distances between test samples and independent clear wood samples 142
5.6 List of feature correlation with r>0.99 142
5.7 List of features removed after correlation test 143
5.8 Box's test of equality of covariance matrices 144
5.9 Manova test 144
5.10 Pillai’s Trace value across multiple quantization levels and displacements 145
5.11 Eigenvalues and canonical correlations 146
5.12 Raw and standardized discriminant function coefficients (Root 1) 147
5.13 Correlation between features and canonical variable 148
5.14 List of remaining features after discriminant analysis 148
5.15 List of feature sets used for performance comparison 150
5.16 Confusion matrices for D7, D5 and D4 154
5.17 Samples mistakenly classified as clear wood (undetected defect) 155
5.18 Confusion matrices for Merbau, KSK and Rubberwood 157
6.1 Experimental Meranti dataset for various defect ratios 163
6.2 Detection performance by defect ratio 167
6.3 Detection performance by defect types 170
6.4 Detection performance on test images: Rubberwood 181
6.5 Detection performance on test images: KSK 182
6.6 Detection performance on test images: Meranti 183
6.7 Detection performance on test images: Merbau 184
LIST OF FIGURES
FIGURE NO. TITLE PAGE
1.1 Motivation of the study 12
1.2 Overview of research phases 18
2.1 Taxonomy of literature review 23
2.2 Timber process 26
2.3 Log cutting pattern (Cavette, 2006; Tom & Jeff, 2010) 27
2.4 The components of an AVI system in wood industry 35
2.5 Reference pixel, X with its 8 neighbouring pixels (Haralick et al., 1973) 59
2.6 Distribution of non-zero matrix element on the left, and contour plot showing joint probability density function of the spatial dependence matrix on the right. 62
2.7 Research solutions to the problem of classification of imbalanced data (Sun et al., 2009) 73
3.1 Solution concept for timber defect detection 85
3.2 Research framework 88
3.3 Operational research framework 89
4.1 Image acquisition setup 108
4.2 The process of dataset construction 111
4.3 Sample of acquired images 111
4.4 Subdivision of original image into sub-images 113
4.5 Distribution of defect samples across species 115
5.1 Proposed approach in determining significant feature set 120
5.2 Procedures for extracting statistical texture features based on GLDM 122
5.3 Pictorial representation of the orientation independent GLDM 128
5.4 Normalized feature means against displacement and quantization 131
5.5 Energy feature range analysis 134
5.6 Entropy feature range analysis 135
5.7 Contrast feature range analysis 135
5.8 Scatter plot matrix showing pairwise comparison of features 136
5.9 Intra-class distance between clear wood samples and inter-class distance between clear wood and defect samples 138
5.10 Procedures for confirmatory feature analysis 140
5.11 Classification accuracy of three proposed feature sets (D6, D7 and D8) 151
5.12 Classification accuracy between the proposed feature set (D7) and feature sets from previous studies 152
5.13 F scores for each class across datasets D4, D5 and D7 154
5.14 Classification accuracy across timber species 156
6.1 Flow of experiments for timber defect detection 161
6.2 Proposed MC-FMCD for robust timber defect detection 162
6.3 F score across defect ratio: (a) Meranti, (b) Rubberwood, (c) KSK, (d) Merbau 168
6.4 OD Error and UD Error across defect ratio: (a) Meranti, (b) Rubberwood, (c) KSK, (d) Merbau 169
6.5 F score by defect type: (a) Meranti, (b) Rubberwood, (c) KSK, (d) Merbau 172
6.6 OD Error and UD Error by defect type: (a) Meranti, (b) Rubberwood, (c) KSK, (d) Merbau 173
6.7 Detection performance for MC-FMCD and classic MD: Meranti dataset 174
6.8 Detection performance for MC-FMCD and classic MD: Rubberwood dataset 175
6.9 Detection performance for MC-FMCD and classic MD: KSK dataset 176
6.10 Detection performance for MC-FMCD and classic MD: Merbau dataset 177
6.11 Average detection performance by timber species 178
6.12 Average detection performance by defect type across timber species (a) F score comparison between timber species by defect type (b) Average F score by defect type 179
6.13 Average detection performance between MC-FMCD and classic MD 180
6.14 Average detection performance validated by an expert 185
LIST OF ABBREVIATIONS
ANN - Artificial Neural Network
AUTOC - Autocorrelation
AVI - Automated Vision Inspection
BR - Brown Stain
BS - Blue Stain
CAR - Causal Auto Regressive Model
CCD - Charge-Coupled Device
CL - Clear Wood
CONT - Contrast
COR - Correlation
CPROM - Cluster Prominence
CSHAD - Cluster Shade
CT - Computed Tomography
DENT - Difference entropy
DISS - Dissimilarity
DVAR - Difference variance
EN - Energy
ENT - Entropy
EPQ - Equal Probability Quantization
FMCD - Fast Minimum Covariance Determinant
FMMIS - Fuzzy Min-Max Neural Network for Image Segmentation
FN - False Negative
FP - False Positive
GA - Genetic Algorithm
GLDM - Grey Level Dependence Matrix
GPR - Ground Penetrating Radar
HL - Hole
HOMO - Homogeneity
IDMN - Inverse difference moment normalized
IDN - Inverse difference normalized
IMC1 - Information measures of correlation 1
IMC2 - Information measures of correlation 2
KN - Knot
KNN - K-nearest Neighbour
KSK - Kembang Semangkuk
LBP - Local Binary Pattern
MANOVA - Multivariate Analysis of Variance
MAXPR - Maximum probability
MCD - Minimum Covariance Determinant
MC-FMCD - Mahalanobian Classifier based on Robust FMCD
MD - Mahalanobis Distance
MGR - Malaysian Grading Rule
MIDA - Malaysian Investment Development Authority
MLP - Multi-layer Perceptron
MSE - Mean Square Error
MTIB - Malaysian Timber Industry Board
MVE - Minimum Volume Ellipsoid
MVV - Minimum Vector Variance
NATIP - National Timber Industry Policy
OCC - One Class Classifier
OD - Over Detection
PC - Pocket
RBFN - Radial Basis Function Network
RGB - Red Green Blue
RT - Rot
SAVG - Sum Average
SDM - Spatial Dependence Matrix
SENT - Sum Entropy
SOM - Self-organizing Map
SOSVH - Sum of Squares: Variance
SP - Split
SSCP - Sum of Squares Cross Product
SVAR - Sum Variance
TN - True Negative
TP - True Positive
UD - Under Detection
WN - Wane
LIST OF APPENDICES
APPENDIX TITLE PAGE
A Related studies on inspection of internal wood defects; Related studies on multi sensors approach to timber defect detection 213
B Example of orientation independent GLDM and normalized GLDM 216
C Plots of feature value against displacement and quantization parameter 219
D Univariate feature range analysis 236
E Matrix of scatter plots comparing feature distribution between classes 247
F Pairwise correlation between features and its corresponding significance, p value 249
G SPSS Manova output 252
H Experimental dataset for various defect ratios 260
I Expert validation sheet 267
J UTM letter of permission for data collection 280
K Biography of industry experts 284
L Letter of dataset certification 287
M Photo album 291
N List of Publication 297
TERMS AND DEFINITIONS
TERM - DEFINITION
Wood - A hard fibrous material that makes up most of the substance of a tree.
Log - A part of the trunk that has been cut off from a felled tree.
Timber - Wood boards sawn from logs.
Primary wood industry - Businesses that process logs or other tree sections directly into timber, veneer, plywood, wood chips or other primary wood products.
Sawmill - A factory where logs are sawn into timber.
Secondary wood industry - Businesses that process primary wood products such as timber into secondary wood products such as furniture, doors, and parquet flooring.
Rough mill - The first production area or stage in a secondary wood product industry, where timber is moulded and cut into rough-sized components or parts. At this stage, undesirable characteristics or defects are removed.
Defect - Flaws or anomalies found on timber that affect its properties and limit its possible use.
Natural defect - Biological defects that occur during the growth of the tree from which the timber originates.
Mechanical defect - Defects that are caused by the handling or processing of timber, such as during drying, sawing and moulding.
Internal defect - Defects that are found inside the timber structure.
External defect - Defects that are found on the surface of timber.