Binning and Indexing Biometric Records
Sharat S. Chikkerur
CUBS, University at Buffalo
ssc5@eng.buffalo.edu
Problem Description
Biometrics are being deployed for immigration and national ID applications:
  US-VISIT program
  Voter ID and national ID programs [3]
The potential database size can run into the millions of records
The largest study by NIST considers only 620,000 records [4]
Apart from accuracy, speed and efficiency also become important at this scale
Only biometric identification (1:N matching) can prevent duplicate enrollments
Problem Description (cont.)
Biometric templates have no natural order by which the records can be sorted
Biometric templates are inherently high-dimensional
Semantic features are not stored in the template
Identification Problem
Let FAR and FRR be the false acceptance rate and false reject rate for 1:1 matching
For 1:N matching,
$$FRR_N = FRR$$
$$FAR_N = 1 - (1 - FAR)^N \approx N \cdot FAR$$
The total number of false accepts is given by
$$\text{False accepts} = N \cdot FAR_N = N \left(1 - (1 - FAR)^N\right) \approx N^2 \cdot FAR$$
Even if FAR = 0.0001%, false accepts occur in about 1 in 10 identifications for N = 100,000 (lower bound)
No single biometric is capable of meeting this security requirement individually
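The arithmetic on this slide can be checked with a short sketch. The function names below are illustrative and do not come from the slides, and the code assumes, as the formula does, that the N comparisons are independent.

```python
def far_n(far: float, n: int) -> float:
    """Probability of at least one false accept in a 1:N search,
    assuming N independent 1:1 comparisons (same assumption as the slide)."""
    return 1.0 - (1.0 - far) ** n


def far_n_approx(far: float, n: int) -> float:
    """Small-FAR approximation N * FAR used on the slide."""
    return n * far


far = 1e-6      # FAR = 0.0001 %
n = 100_000     # database size from the slide

exact = far_n(far, n)
approx = far_n_approx(far, n)
print(f"exact FAR_N  = {exact:.4f}")   # a little below the approximation
print(f"approx FAR_N = {approx:.4f}")  # N * FAR, i.e. about 1 in 10
```

The approximation N·FAR slightly overestimates the exact value, but at this scale both are close to 1 in 10.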
Uses of Indexing and Binning
Ways to reduce identification errors:
  Reduce N
  Reduce FAR (limited by technology)
We can reduce N by pruning the records
Let PSYS be the penetration rate
For 1:N matching,
$$FRR_N = FRR$$
$$FAR_N = 1 - (1 - FAR)^{N \cdot P_{SYS}} \approx N \cdot P_{SYS} \cdot FAR$$
The total number of false accepts is given by
$$\text{False accepts} = N \cdot P_{SYS} \cdot FAR_N \approx (N \cdot P_{SYS})^2 \cdot FAR$$
State-of-the-art fingerprint systems have PSYS = 0.5 [6]
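To see how pruning changes the 1:N false accept rate, the sketch below evaluates it for several penetration rates. The function name and the sample values of FAR and N are assumptions chosen for illustration (the penetration rates are the ones that label the curves in the deck's plot).

```python
def far_n_pruned(far: float, n: int, p_sys: float) -> float:
    """1:N false accept rate when only a fraction p_sys of the
    database is actually compared (independent comparisons assumed)."""
    return 1.0 - (1.0 - far) ** (n * p_sys)


far = 1e-6      # assumed 1:1 false accept rate
n = 100_000     # assumed database size

for p_sys in (1.0, 0.75, 0.5, 0.3, 0.1):
    rate = far_n_pruned(far, n, p_sys)
    print(f"PSYS = {p_sys:4.2f}  ->  FAR_N = {rate:.4f}")
```

With PSYS = 1.0 the result reduces to the unpruned case, and FAR_N falls roughly in proportion to PSYS while N·PSYS·FAR stays small.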
Indexing and Binning (cont.)
Will allow us to screen immigrants at airports against a ‘watch list’
Will make biometric systems more user-friendly by eliminating the need to remember PINs and IDs
Will improve accuracy (FAR_N) and performance
[Figure: false accepts plotted against N (N axis from 0 to 10 x 10^5, false accepts axis from 0 to 10 x 10^7), with curves labeled 1.0, 0.75, 0.5, 0.3, and 0.1]
Binning Biometric Data
Vector Quantization Approach
In general a biometric template may be represented as a vector
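The objective is to classify the vectors into N distinct classes (code book vectors)
The code book vectors divide the feature space into N distinct Voronoi regions
Vector Quantization (cont.)
Template vector:
$$x_i = [x_{i1}, x_{i2}, x_{i3}, x_{i4}, \ldots, x_{ik}]$$
Code book:
$$Y = [Y_1, Y_2, Y_3, Y_4, \ldots, Y_N], \quad Y_j \in \mathbb{R}^k$$
Voronoi region associated with Y_i:
$$V_i = \{x : \|x - Y_i\|^2 \le \|x - Y_j\|^2 \ \ \forall j \ne i\}$$
Properties of the regions:
$$\bigcup_i V_i = \mathbb{R}^k \quad \text{and} \quad V_i \cap V_j = \emptyset \ \ (i \ne j)$$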
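Assigning a template to a Voronoi region amounts to finding its nearest code book vector. The following is a minimal sketch of that rule; the function names and the two-dimensional toy code book are hypothetical and serve only as illustration.

```python
def squared_distance(x, y):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def voronoi_index(x, codebook):
    """Index i of the code book vector Y_i closest to x, i.e. the
    Voronoi region V_i that contains x (ties go to the lowest index)."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(x, codebook[i]))


# Toy code book with two code vectors (illustrative values only).
codebook = [(0.0, 0.0), (10.0, 10.0)]

print(voronoi_index((1.0, 2.0), codebook))
print(voronoi_index((9.0, 8.0), codebook))
```

Breaking ties toward the lowest index is one way to make the regions disjoint on their shared boundaries.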
Vector Quantization-Voronoi Regions
Hand Geometry- Template Model
Experimental Evaluation
25x10 hand geometry features used
Each print is represented by a 21D vector
Data is divided equally between training and testing
Data is normalized using
$$X' = \frac{X - \mu}{\sigma}$$
(μ: feature mean, σ: feature standard deviation)
VQ is implemented using k-means clustering
The code book vectors are used on the test set
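The pipeline described on this slide (normalize, then build a code book with k-means) might look roughly like the sketch below. It is not the author's implementation: the helper names, the use of the first k points as initial centers, and the toy two-dimensional data are all assumptions made to keep the example small and deterministic.

```python
from statistics import mean, pstdev


def zscore(data):
    """Normalize each feature to zero mean and unit standard deviation."""
    cols = list(zip(*data))
    mu = [mean(c) for c in cols]
    sigma = [pstdev(c) or 1.0 for c in cols]  # guard against constant features
    return [[(v - m) / s for v, m, s in zip(row, mu, sigma)] for row in data]


def sqdist(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))


def assign(data, centers):
    """Label each vector with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda j: sqdist(p, centers[j]))
            for p in data]


def kmeans(data, k, iters=50):
    """Plain Lloyd iteration. Initial centers are the first k points,
    which is a simplifying assumption rather than a recommended choice."""
    centers = [list(p) for p in data[:k]]
    for _ in range(iters):
        labels = assign(data, centers)
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(data, labels) if lab == j]
            # Keep the previous center if a cluster loses all its members.
            new_centers.append([mean(c) for c in zip(*members)]
                               if members else centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return centers, assign(data, centers)


# Toy data: two well-separated groups, interleaved.
data = [[0, 0], [10, 10], [0, 1], [10, 11], [1, 0], [11, 10]]
codebook, bins = kmeans(zscore(data), k=2)
print(bins)
```

The returned centers play the role of the code book vectors; test vectors would be normalized with the training statistics and binned with `assign`.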
Normalization
[Figure: plot with axes FTR(1) and FTR(2)]
Observations
Data normalization leads to spreading of the data
Without normalization, clusters converge to a single center
Equivalent to measuring Mahalanobis distance [5]
Different instances of the same hand are misclassified
[Figure: plot with axes FTR'(1) and FTR'(2)]
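The remark that normalization is equivalent to measuring Mahalanobis distance can be illustrated numerically. The sketch below assumes a diagonal covariance matrix (features treated as uncorrelated), which is the case in which per-feature normalization and Mahalanobis distance coincide; the sample values are made up.

```python
from math import sqrt
from statistics import mean, pstdev

# Made-up two-feature samples, for illustration only.
samples = [[60.0, 20.0], [70.0, 22.0], [65.0, 30.0], [80.0, 25.0]]
cols = list(zip(*samples))
mu = [mean(c) for c in cols]
sigma = [pstdev(c) for c in cols]


def normalize(x):
    """Per-feature normalization (x - mu) / sigma."""
    return [(v - m) / s for v, m, s in zip(x, mu, sigma)]


def euclidean(a, b):
    return sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def mahalanobis_diag(a, b):
    """Mahalanobis distance under a diagonal covariance (assumption)."""
    return sqrt(sum(((p - q) / s) ** 2 for p, q, s in zip(a, b, sigma)))


a, b = samples[0], samples[3]
d_norm = euclidean(normalize(a), normalize(b))
d_mahal = mahalanobis_diag(a, b)
print(d_norm, d_mahal)  # the two values agree up to rounding
```

With correlated features the full covariance matrix would be needed, and the two distances would generally differ.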
Preliminary Results
[Figure: penetration rate PSYS (axis from 0 to 60) versus number of bins (axis from 0 to 25)]
Penetration rate (%) for each number of bins (columns); the last row is the column mean
Row    2      3      4       5       6       7       8      9       10     11     12     21
1      56.95  46.9   42.6    39.6    37.39   36.14   34.56  33.63   31.91  32.25  31.95  29.44
2      56.95  47.2   42.6    40.17   37.39   35.77   34.56  33.62   33.13  32.5   31.95  29.44
3      56.95  47.24  43.26   39.65   37.82   36.02   34.89  33.91   33.13  31.3   31.95  29.44
4      56.95  46.95  42.6    39.65   37.39   35.77   34.56  33.91   32.82  32.25  31.95  29.44
5      56.95  46.66  42.6    39.65   37.82   36.14   34.78  33.62   32.86  32.25  31.95  29.44
Mean   56.95  46.99  42.732  39.744  37.562  35.968  34.67  33.738  32.77  32.11  31.95  29.44
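The slides do not state how the penetration rate was measured. One common reading, used in the sketch below as an assumption, is the average fraction of the enrolled database that must be searched when each query is compared only against the records in its own bin. The function and variable names are hypothetical.

```python
from collections import Counter


def penetration_rate(enrolled_bins, query_bins):
    """Average fraction of enrolled records searched per query, when a
    query is matched only against the records sharing its bin."""
    sizes = Counter(enrolled_bins)
    total = len(enrolled_bins)
    searched = [sizes[b] / total for b in query_bins]
    return sum(searched) / len(searched)


# Toy example: four enrolled records in two bins, two queries.
enrolled = [0, 0, 0, 1]
queries = [0, 1]
print(penetration_rate(enrolled, queries))
```

Under this definition, unevenly filled bins give a higher penetration rate than the ideal 1/(number of bins), which would be consistent with a value above 50% for two bins.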
Indexing Biometric Data
Spatial Access Methods Approach
Introduction to Spatial Databases
Relational databases organize and store scalar data
  Have a planar organization
  Contain scalar data (excluding LOBs and binary data)
  Data can be ordered linearly
  Structured Query Language (SQL) is used to retrieve records
Spatial databases
  Contain multi-dimensional or vectorial data
  Relative positions may be explicit or inferred
  Linear proximity does not imply spatial proximity
Multi-dimensional data is used in computer vision, medical imaging, and BIOMETRICS
Original applications
  Point sets
  CAD, VLSI drawings
  Cartography, astronomy
Spatial Databases (cont.)
Difference from pattern classification: QUERIES
  Spatial searches
  Neighborhood searches
PAM/SAM
Point Access Methods
  Used on point databases
  Points may be multi-dimensional (vectors)
  Points have no spatial extent; intersection is undefined
  Each point is specified uniquely by its d coordinates
Spatial Access Methods
  Used on lines, polygons, solids
  Objects have spatial extent; intersection of objects is well defined
  A point may be occupied by more than one object
Problems with vectorial/spatial data
No standard algebra is defined on spatial data
  Union and intersection are not defined exactly
  Data operations are highly application specific
  Operators are not closed
Queries
  Need support for spatial queries (point and region queries)
  No standard spatial query language
No natural ordering
  An ordering that preserves spatial proximity does not exist
  There is no mapping from multi-dimensional space to 1D such that any two points that are close together in the higher-dimensional space are also close in the linear order [1]
  Is it possible to do this via PCA/KLT?
  Cannot extend single-key structures like the B-Tree
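The lack of a proximity-preserving linear order can be seen with any concrete 2D-to-1D mapping. The sketch below uses the Z-order (Morton) curve purely as an example; the slides do not mention it, and it is swapped in here only because it is a well-known space-filling curve.

```python
def morton(x: int, y: int, bits: int = 8) -> int:
    """Interleave the bits of x and y into a single Z-order key
    (x bits in even positions, y bits in odd positions)."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)
        code |= ((y >> i) & 1) << (2 * i + 1)
    return code


# Two horizontally adjacent grid cells ...
near_a, near_b = (3, 0), (4, 0)
# ... end up far apart in the 1D order.
print(morton(*near_a), morton(*near_b))

# Meanwhile two cells that are further apart spatially
# can receive closer keys.
far_a, far_b = (0, 0), (1, 1)
print(morton(*far_a), morton(*far_b))
```

Cells (3, 0) and (4, 0) are neighbors in the plane but their keys differ by more than those of (0, 0) and (1, 1), so a single-key index built on such a mapping cannot guarantee that neighborhood queries touch a contiguous key range.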
Requirements of a spatial database
Dynamic updates
  The structure should remain consistent as data is inserted and deleted
  Changes should be tracked
Independence of input data and insertion sequence
  Should handle skewed data
  Structure should be independent of insertion sequence (Compare tree)
Scalable
Efficiency
  Time efficiency: an efficient design will approach the performance of B-Trees
  Space efficiency: indexing overhead should be small
Types of structures
K-d Trees
  Binary tree in d-dimensional space
  (d-1)-dimensional hyperplanes separate the subspaces
  The directions alternate among the d possibilities
  Insertion and search are straightforward
  Deletion is cumbersome
  Structure is sensitive to insertion order
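A minimal k-d tree sketch makes the last point concrete: the same set of points produces trees of different height depending on insertion order. The class and function names are illustrative, and deletion is omitted because, as noted above, it is cumbersome.

```python
class Node:
    def __init__(self, point):
        self.point = point
        self.left = None
        self.right = None


def insert(root, point, depth=0):
    """Insert a point, splitting on axis (depth mod d) at each level."""
    if root is None:
        return Node(point)
    axis = depth % len(point)
    if point[axis] < root.point[axis]:
        root.left = insert(root.left, point, depth + 1)
    else:
        root.right = insert(root.right, point, depth + 1)
    return root


def contains(root, point, depth=0):
    """Exact-match search following the same splitting rule."""
    if root is None:
        return False
    if root.point == point:
        return True
    axis = depth % len(point)
    branch = root.left if point[axis] < root.point[axis] else root.right
    return contains(branch, point, depth + 1)


def height(root):
    return 0 if root is None else 1 + max(height(root.left), height(root.right))


def build(points):
    root = None
    for p in points:
        root = insert(root, p)
    return root


sorted_order = [(i, i) for i in range(1, 8)]
mixed_order = [(4, 4), (2, 2), (6, 6), (1, 1), (3, 3), (5, 5), (7, 7)]

print(height(build(sorted_order)))  # degenerates into a chain
print(height(build(mixed_order)))   # same points, much shallower tree
```

Inserting the points in sorted order degenerates the tree into a chain, while the mixed order yields a balanced tree over the same seven points.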
References
1. Gaede and Günther, “Multidimensional Access Methods”, ACM Computing Surveys, Vol. 30, No. 2, 1998
2. www.geocities.com/mohamedqasem/vectorquantization/vq.html
3. Bolle et al., Guide to Biometrics, Springer Verlag, 2003
4. NIST report to the United States Congress, “Summary of NIST Standards for Biometric Accuracy, Tamper Resistance and Interoperability”, http://www.itl.nist.gov/iad/894.03/NISTAPP_Nov02.pdf
5. http://www.galactic.com/Algorithms/discrim_mahaldist.htm
6. Dr. Wayman’s report, NIST