Binning and Indexing Biometric Records

Sharat S. Chikkerur

CUBS, University at Buffalo

ssc5@eng.buffalo.edu

Problem Description

Biometrics are being deployed for immigration and national ID applications

    US-VISIT program
    Voter ID and national ID programs [3]

Potential database size can run into millions of records

Largest study by NIST considers only 620,000 records[4]

Apart from accuracy, speed and efficiency also become important at this scale

Only biometric identification (1:N matching) can prevent duplicate enrollments

Problem Description (cont.)

In biometric templates, there is no natural order by which one can sort the biometric records

Biometric Templates are inherently higher dimensional

Semantic features are not stored in the template

Identification Problem

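Let FAR and FRR be the false acceptance rate and false reject rate for 1:1 matching

For 1:N matching,

$\mathrm{FRR}_N \approx \mathrm{FRR}$

$\mathrm{FAR}_N = 1 - (1 - \mathrm{FAR})^N \approx N \cdot \mathrm{FAR}$

The total number of false accepts is given by

$\text{False accepts} = N \cdot \mathrm{FAR}_N = N \, (1 - (1 - \mathrm{FAR})^N) \approx N^2 \cdot \mathrm{FAR}$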

Even if FAR = 0.0001%, false accepts = 1 in 10 searches for N = 100,000 (lower bound)
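The 1-in-10 figure can be checked numerically. The following is a minimal sketch, assuming the N comparisons are independent so that FAR_N = 1 - (1 - FAR)^N; the function name `far_n` is illustrative and not from the slides.

```python
def far_n(far: float, n: int) -> float:
    """Probability of at least one false accept in a 1:N search,
    assuming the N one-to-one comparisons are independent."""
    return 1.0 - (1.0 - far) ** n


far = 1e-6        # 0.0001 % written as a fraction
n = 100_000       # database size

exact = far_n(far, n)   # 1 - (1 - FAR)^N
approx = n * far        # linear approximation N * FAR

print(f"exact  FAR_N = {exact:.4f}")
print(f"approx N*FAR = {approx:.4f}")
```

Under these assumptions the exact value comes out near 0.095 and the linear approximation is 0.1, which is the 1-in-10 figure quoted above.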

No single biometric is capable of meeting this security requirement individually

Uses of Indexing and Binning

Ways to reduce identification errors:

    Reduce N
    Reduce FAR (limited by technology)

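We can reduce N by pruning the records

Let PSYS be the penetration rate. For 1:N matching, only a fraction of the records is searched:

$N' = \mathrm{PSYS} \cdot N$

$\mathrm{FRR}_N \approx \mathrm{FRR}$

$\mathrm{FAR}_N \approx N \cdot \mathrm{PSYS} \cdot \mathrm{FAR}$

The total number of false accepts is given by

$\text{False accepts} \approx (N \cdot \mathrm{PSYS})^2 \cdot \mathrm{FAR}$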

State-of-the-art fingerprint systems have PSYS = 0.5 [6]
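To see what a penetration rate of 0.5 buys, the quadratic approximation (N x PSYS)^2 x FAR can be evaluated directly. This is a rough sketch under that approximation only; the function name `expected_false_accepts` and the chosen FAR and N are illustrative values reused from the earlier slide, not measurements.

```python
def expected_false_accepts(far: float, n: int, p_sys: float = 1.0) -> float:
    """Approximate total false accepts, (N * PSYS)^2 * FAR.

    p_sys is the penetration rate: the fraction of the database
    that is actually searched after binning or indexing.
    """
    searched = p_sys * n          # records compared per search
    return searched * searched * far


far = 1e-6
n = 100_000

full = expected_false_accepts(far, n)               # no pruning, PSYS = 1.0
pruned = expected_false_accepts(far, n, p_sys=0.5)  # PSYS = 0.5

print(f"PSYS = 1.0: {full:.0f}")
print(f"PSYS = 0.5: {pruned:.0f}")
print(f"reduction factor: {full / pruned:.1f}")
```

Because the approximation is quadratic in the number of records searched, halving the penetration rate cuts the estimate by a factor of four.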

Indexing and Binning (cont.)

Will allow us to screen immigrants at airports against a ‘watch list’

Will make biometric systems more user-friendly by eliminating the need to remember PINs and IDs

Will improve accuracy (FAR_N) and performance

[Figure: plot with curves labeled 1.0 and 0.75; horizontal axis ticks 0 to 10]

Binning Biometric Data

Vector Quantization Approach

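In general, a biometric template may be represented as a vector

$x_i = [x_{i1}, x_{i2}, x_{i3}, x_{i4}, \ldots, x_{ik}], \quad x_i \in \mathbb{R}^k$

The objective is to classify the vectors into N distinct classes (codebook vectors)

$Y = [Y_1, Y_2, Y_3, Y_4, \ldots, Y_N], \quad Y_j \in \mathbb{R}^k$

Vector Quantization (cont.)

The codebook vectors divide the feature space into N distinct Voronoi regions

$V_i = \{\, x : \|x - Y_i\|^2 \le \|x - Y_j\|^2 \ \ \forall j \ne i \,\}$

Properties of the regions:

$\bigcup_i V_i = \mathbb{R}^k$ and $V_i \cap V_j = \emptyset$ for $i \ne j$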
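Assigning a template to a Voronoi region amounts to finding its nearest codebook vector. The sketch below illustrates this with NumPy; the function name `assign_to_codebook` and the toy 2-D codebook are made up for illustration and are not the hand geometry data.

```python
import numpy as np


def assign_to_codebook(x: np.ndarray, codebook: np.ndarray) -> np.ndarray:
    """Return, for each row of x, the index i of the Voronoi region V_i
    it falls in, i.e. the index of the nearest codebook vector."""
    x = np.atleast_2d(x)
    # Squared Euclidean distance from every vector to every codebook vector.
    d2 = ((x[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


# Toy codebook with N = 3 vectors in k = 2 dimensions (illustrative only).
codebook = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
templates = np.array([[1.0, 1.0], [9.0, 2.0], [2.0, 8.0]])

bins = assign_to_codebook(templates, codebook)
print(bins)
```

Each template lands in exactly one bin, which reflects the two region properties: the regions cover the whole space and do not overlap (ties on a boundary are broken by `argmin` taking the first minimum).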

Vector Quantization: Voronoi Regions

Hand Geometry: Template Model

Experimental Evaluation

25x10 hand geometry features used

Each print represented by a 21D vector

Data divided equally among training and testing

Data is normalized before clustering

VQ is implemented using k-means clustering

The codebook vectors are used on the test set
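The train-then-bin procedure can be sketched as follows. This is a plain Lloyd-style k-means written for illustration, not the author's implementation; the synthetic 21-D data, the function names `kmeans` and `bin_of`, the choice of 4 bins, and the seeds are all assumptions.

```python
import numpy as np


def bin_of(data: np.ndarray, codebook: np.ndarray) -> np.ndarray:
    """Index of the nearest codebook vector for each row of data."""
    d2 = ((data[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


def kmeans(data: np.ndarray, n_bins: int, n_iter: int = 50, seed: int = 0) -> np.ndarray:
    """Learn n_bins codebook vectors with basic k-means iterations."""
    rng = np.random.default_rng(seed)
    # Initialise the codebook with randomly chosen training vectors.
    codebook = data[rng.choice(len(data), size=n_bins, replace=False)].copy()
    for _ in range(n_iter):
        labels = bin_of(data, codebook)
        for j in range(n_bins):
            members = data[labels == j]
            if len(members):              # keep the old center if a bin is empty
                codebook[j] = members.mean(axis=0)
    return codebook


# Synthetic stand-in for 21-D hand geometry vectors (not the real data).
rng = np.random.default_rng(1)
data = rng.normal(size=(250, 21))
train, test = data[::2], data[1::2]       # split equally into train and test

codebook = kmeans(train, n_bins=4)        # learn the codebook on training data
test_bins = bin_of(test, codebook)        # apply it to the test set
counts = np.bincount(test_bins, minlength=4)
print(counts / len(test))                 # fraction of test vectors per bin
```

The fraction of records that fall in a probe's bin is what a 1:N search would have to examine, so the per-bin fractions give a feel for how evenly the codebook partitions the data.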

Normalization

[Figure: plots of FTR(1) (axis 0 to 80) and FTR'(1) (axis -3 to 4)]

Observations

    Data normalization leads to spreading of data
    Without normalization, clusters converge to a single center
    Equivalent to measuring Mahalanobis distance [5]
    Different instances of the same hand are misclassified
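The link between normalization and Mahalanobis distance can be illustrated numerically. The slides do not give the normalization formula, so the sketch below assumes per-feature z-score normalization (subtract the mean, divide by the standard deviation); under that assumption, Euclidean distance on normalized data equals Mahalanobis distance with a diagonal covariance matrix. The data are synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic features with very different scales (not the hand geometry data).
scales = np.array([50.0, 1.0, 0.1])
offsets = np.array([40.0, 0.0, 5.0])
data = rng.normal(size=(200, 3)) * scales + offsets

# Assumed normalization: per-feature z-score.
mu = data.mean(axis=0)
sigma = data.std(axis=0)
normalized = (data - mu) / sigma

# Euclidean distance between two vectors after normalization ...
euclid_after_norm = np.linalg.norm(normalized[0] - normalized[1])

# ... versus Mahalanobis distance on the raw vectors, using a diagonal
# covariance matrix built from the per-feature variances.
diff = data[0] - data[1]
inv_cov = np.diag(1.0 / sigma**2)
mahalanobis = np.sqrt(diff @ inv_cov @ diff)

print(euclid_after_norm, mahalanobis)
```

Without the rescaling, the first feature would dominate every distance computation, which is one plausible reading of why unnormalized clusters collapse toward a single center.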

Preliminary Results

[Figure: plot against number of bins; axis ticks 0 to 25]

Number of bins  2       3       4       5       6       7       8       9       10      11      12      21
1               56.95   46.9    42.6    39.6    37.39   36.14   34.56   33.63   31.91   32.25   31.95   29.44
2               56.95   47.2    42.6    40.17   37.39   35.77   34.56   33.62   33.13   32.5    31.95   29.44
3               56.95   47.24   43.26   39.65   37.82   36.02   34.89   33.91   33.13   31.3    31.95   29.44
4               56.95   46.95   42.6    39.65   37.39   35.77   34.56   33.91   32.82   32.25   31.95   29.44
5               56.95   46.66   42.6    39.65   37.82   36.14   34.78   33.62   32.86   32.25   31.95   29.44

Mean            56.95   46.99   42.732  39.744  37.562  35.968  34.67   33.738  32.77   32.11   31.95   29.44

Indexing Biometric Data

Spatial Access Methods Approach

Introduction to Spatial databases

Relational databases organize and store scalar data

    Have planar organization
    Contain scalar data (excluding LOBs, binary)
    Data can be ordered linearly
    Structured Query Language used to retrieve records

Spatial databases

    Contain multi-dimensional or vectorial data
    Relative positions may be explicit or inferred
    Linear proximity does not imply spatial proximity

Multi-dimensional data is used in computer vision, medical imaging, and BIOMETRICS

Original Applications

    Point sets
    CAD
    VLSI drawings
    Cartography, astronomy

Spatial databases (cont.)

Difference from pattern classification: QUERIES

    Spatial searches
    Neighborhood searches

PAM/SAM

Point Access Methods

    Used on point databases
    Points may be multi-dimensional (vectors)
    Points have no spatial extent; intersection is undefined
    Each point is specified uniquely by its d coordinates

Spatial Access Methods

    Used on lines, polygons, solids
    Objects have spatial extent; intersection of objects is well defined
    A point may be occupied by more than one object

Problems with vectorial/spatial data

No standard algebra defined on spatial data

    Union and intersection are not defined exactly
    Data operations are highly application specific
    Operators are not closed

Queries

    Need support for spatial queries: point and region queries
    No standard spatial query language

No natural ordering

    An ordering that preserves spatial proximity does not exist
    No mapping from multi-dimensional space to 1D exists such that two points that are close together in the higher-dimensional space are also close in the linear order [1]
    Is it possible to do this via PCA/KLT?
    Cannot extend single-key structures like the B-Tree
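The lack of a proximity-preserving linear order can be seen with any concrete mapping from 2-D to 1-D. The sketch below uses a Z-order (Morton) code purely as an illustration; it is not a method proposed in the slides, and the function name `morton2d` is made up here.

```python
def morton2d(x: int, y: int, bits: int = 16) -> int:
    """Interleave the bits of x and y into a single Z-order (Morton) key."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)        # x bits go to even positions
        code |= ((y >> i) & 1) << (2 * i + 1)    # y bits go to odd positions
    return code


# Two pairs of grid points, each pair exactly one step apart in 2-D.
pair_a = ((0, 0), (1, 0))
pair_b = ((7, 0), (8, 0))

gap_a = abs(morton2d(*pair_a[0]) - morton2d(*pair_a[1]))
gap_b = abs(morton2d(*pair_b[0]) - morton2d(*pair_b[1]))

print(gap_a, gap_b)
```

Both pairs are spatial neighbors, yet their keys are 1 apart in one case and 43 apart in the other, so a single-key structure sorted on such a key cannot answer neighborhood queries by scanning adjacent keys alone.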

Requirements of a spatial database

Dynamic updates

    The structure should be consistent as data is inserted and deleted
    Changes should be tracked

Independence of input data and insertion sequence

    Should handle skewed data
    Structure should be independent of insertion sequence (compare tree)

Scalable

Efficiency

    Time efficiency: an efficient design will approach the performance of B-Trees
    Space efficiency: indexing overhead should be small

Types of structures

K-d Trees

    Binary tree in d-dimensional space
    (d-1)-dimensional hyperplanes separate the subspaces
    The splitting directions alternate among the d possibilities
    Insertion and search are straightforward
    Deletion is cumbersome
    Structure is sensitive to insertion order
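The k-d tree properties listed above can be sketched in a few lines. This is a minimal illustrative implementation with insertion and exact-match search only (no deletion or rebalancing); the names `Node`, `insert`, `contains`, `height`, and `build`, and the sample points, are made up for this example.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

Point = Tuple[float, ...]


@dataclass
class Node:
    point: Point
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(root: Optional[Node], point: Point, depth: int = 0) -> Node:
    """Insert a point; the splitting axis cycles through the d dimensions."""
    if root is None:
        return Node(point)
    axis = depth % len(point)
    if point[axis] < root.point[axis]:
        root.left = insert(root.left, point, depth + 1)
    else:
        root.right = insert(root.right, point, depth + 1)
    return root


def contains(root: Optional[Node], point: Point, depth: int = 0) -> bool:
    """Exact-match search following the same axis rule as insert."""
    if root is None:
        return False
    if root.point == point:
        return True
    axis = depth % len(point)
    child = root.left if point[axis] < root.point[axis] else root.right
    return contains(child, point, depth + 1)


def height(root: Optional[Node]) -> int:
    return 0 if root is None else 1 + max(height(root.left), height(root.right))


def build(points: Sequence[Point]) -> Optional[Node]:
    root = None
    for p in points:
        root = insert(root, p)
    return root


points = [(5.0, 4.0), (2.0, 6.0), (8.0, 1.0), (1.0, 3.0),
          (3.0, 9.0), (7.0, 0.0), (9.0, 8.0)]

balanced = build(points)          # median-like point inserted first
skewed = build(sorted(points))    # same points, inserted in sorted order

print(height(balanced), height(skewed))
```

The same seven points give a tree of height 3 in one insertion order and height 5 in the other, which is the insertion-order sensitivity noted in the list.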

References

1. Gaede and Gunther, “Multidimensional Access Methods”, ACM Computing Surveys, Vol. 30, No. 2, 1998

2. www.geocities.com/mohamedqasem/ vectorquantization/vq.html

3. Bolle et al. Guide to Biometrics, Springer Verlag, 2003

4. NIST report to the United States Congress, “Summary of NIST Standards for Biometric Accuracy, Tamper Resistance and Interoperability”, http://www.itl.nist.gov/iad/894.03/NISTAPP_Nov02.pdf

5. http://www.galactic.com/Algorithms/discrim_mahaldist.htm

6. Dr. Wayman’s report, NIST

Thank You

ssc5@cedar.buffalo.edu
