bhavin-chandarana
Hello! I am Bhavin
I’m going to speak about “A Machine-learning based tool for identifying monosyllabic birds from their call”
Indian Forest Owlet
Status: Near Extinction
Considered extinct for 113 yrs, until it was rediscovered in 1997
3 surveys conducted till then
Exact number: unknown
Sociable Lapwing
Status: Critically Endangered
Steady decline in recent years
Reasons for decline: poorly understood
Population affected due to war in Middle East
Current population: unknown
Our Aim
Our aim is to leverage technology to collect quality datasets about bird population, migration patterns, anatomy, physiology, conservation status and any other parameter that can help ecologists, NGOs, govts and other institutions better tackle the challenge of bird conservation
We have compiled a database of 503 birds of India with 51 properties
Hosted at: www.avipulse.com
Current Study
The work presented here is part of our first (two) steps in this direction
Objective: Identify the bird species from its call using numerical methods & Machine Learning
In a nutshell
We procure calls for a large list of known bird species
For an unknown sample, we find the most probable species from the list
This effectively reduces the problem to one of classification
A bit about biology
Unlike mammals, birds do not have vocal cords. Their sound-producing organ is the syrinx, located at the base of the trachea.
Because the syrinx sits where the trachea forks toward the lungs, lateralization of birdsong is possible. The syrinx can also sustain multiple simultaneous oscillation modes.
Some songbirds can produce more than one sound at a time
Even single-syllable birds produce different kinds of calls for different purposes (e.g. food call, alarm call, mating call).
Calls also vary with season and gender
There is also very high temporal variation in bird calls
Scope of our work
No songbirds, No special calls
Only species with single syllable calls
Typical frame size: 3 msec with 50% overlap (human speech is processed at 20-30 msec)
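The framing step can be sketched as follows. This is a minimal illustration using only the standard library, not the authors' code: the function name `frame_signal`, the 44.1 kHz sample rate, and the rounding of the frame length are assumptions made for the example.

```python
def frame_signal(samples, sample_rate, frame_ms=3.0, overlap=0.5):
    """Split a sequence of samples into fixed-length, overlapping frames."""
    # Frame length in samples, e.g. 3 ms at 44.1 kHz is about 132 samples.
    frame_len = max(1, int(round(sample_rate * frame_ms / 1000.0)))
    # With 50% overlap, each frame starts half a frame after the previous one.
    hop = max(1, int(round(frame_len * (1.0 - overlap))))
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frames.append(samples[start:start + frame_len])
    return frames


sample_rate = 44100              # assumed sample rate, not stated in the slides
signal = [0.0] * sample_rate     # one second of placeholder audio
frames = frame_signal(signal, sample_rate)
```

A 3 msec frame is roughly a tenth of the 20-30 msec window used for speech, so one second of audio yields several hundred frames.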
Overview
90 species
98.73% accuracy
16 species
90.88% accuracy
Training data courtesy of Dr. Sharad Apte (birdcalls.info)
High-quality, noise-free WAV files with high SNR
10 files per bird, each with at least 10 syllables = at least 100 syllables per bird
Autosegmentation
Iterative Energy Thresholding[1]
where x(i) is the signal value at the ith time index in the frame f with n samples
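A simplified sketch of energy-based segmentation is shown here. It assumes the frame energy is the sum of squared samples, E(f) = sum of x(i)^2 for i = 1..n, expressed in dB, and it places the threshold halfway between the peak and the current noise estimate. Both choices, along with the function names, are assumptions made for illustration; the procedure in [1] differs in its details.

```python
import math


def frame_energies_db(samples, frame_len, hop):
    """Per-frame energy in dB, assuming E(f) = sum of x(i)^2 over the frame."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        energy = sum(x * x for x in samples[start:start + frame_len])
        energies.append(10.0 * math.log10(energy + 1e-12))  # avoid log(0)
    return energies


def segment_syllables(energies_db, max_iter=20):
    """Return (first_frame, last_frame) pairs for runs of high-energy frames."""
    peak = max(energies_db)
    noise = min(energies_db)          # initial noise-floor estimate
    active = []
    for _ in range(max_iter):
        threshold = (peak + noise) / 2.0   # assumed rule: midpoint in dB
        active = [e > threshold for e in energies_db]
        quiet = [e for e, a in zip(energies_db, active) if not a]
        new_noise = sum(quiet) / len(quiet) if quiet else noise
        if abs(new_noise - noise) < 1e-6:  # noise estimate has converged
            break
        noise = new_noise
    # Collect consecutive active frames into segments.
    segments, start = [], None
    for i, is_active in enumerate(active):
        if is_active and start is None:
            start = i
        elif not is_active and start is not None:
            segments.append((start, i - 1))
            start = None
    if start is not None:
        segments.append((start, len(active) - 1))
    return segments


# Synthetic test signal: a quiet background with two loud bursts.
signal = []
for i in range(800):
    loud = 200 <= i < 300 or 500 <= i < 600
    amplitude = 1.0 if loud else 0.001
    signal.append(amplitude * math.sin(2.0 * math.pi * 1000.0 * i / 8000.0))
segments = segment_syllables(frame_energies_db(signal, frame_len=20, hop=10))
```

Each returned pair marks one candidate syllable, which can then be passed to feature extraction.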
[1] Somervuo, Panu, Aki Härmä, and Seppo Fagerlund. “Parametric representations of bird sounds for automatic species recognition.”
Feature Extraction - MFCCs
The Mel scale is a perceptual scale of pitches
The mel-frequency cepstrum (MFC) is a representation of the short-term power spectrum of a sound, based on a linear cosine transform of a log power spectrum on a nonlinear mel scale of frequency
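The definition above can be made concrete with a small sketch: power spectrum, triangular filters spaced on the mel scale, logarithm, then a cosine transform. It uses only the standard library and a naive DFT, which is adequate for very short frames. The mel formula is the common 2595 * log10(1 + f/700) variant; the filter count, coefficient count, and function names are assumptions for illustration, not the settings used in the study.

```python
import cmath
import math


def hz_to_mel(hz):
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    """Naive DFT power spectrum for bins 0..N/2 (fine for short frames)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum


def mel_filterbank(num_filters, n_fft, sample_rate):
    """Triangular filters whose centres are evenly spaced on the mel scale."""
    top = hz_to_mel(sample_rate / 2.0)
    mel_points = [top * i / (num_filters + 1) for i in range(num_filters + 2)]
    # Fractional DFT-bin position of each filter edge and centre.
    bins = [mel_to_hz(m) * n_fft / sample_rate for m in mel_points]
    bank = []
    for j in range(1, num_filters + 1):
        left, centre, right = bins[j - 1], bins[j], bins[j + 1]
        weights = []
        for k in range(n_fft // 2 + 1):
            if left < k <= centre:
                weights.append((k - left) / (centre - left))
            elif centre < k < right:
                weights.append((right - k) / (right - centre))
            else:
                weights.append(0.0)
        bank.append(weights)
    return bank


def mfcc(frame, sample_rate, num_filters=20, num_coeffs=12):
    """Log mel-filterbank energies followed by a DCT-II."""
    spec = power_spectrum(frame)
    bank = mel_filterbank(num_filters, len(frame), sample_rate)
    log_energies = [math.log(sum(w * p for w, p in zip(weights, spec)) + 1e-12)
                    for weights in bank]
    return [sum(e * math.cos(math.pi * c * (m + 0.5) / num_filters)
                for m, e in enumerate(log_energies))
            for c in range(num_coeffs)]


# Example: one 132-sample frame (about 3 ms at an assumed 44.1 kHz) of a 3 kHz tone.
frame = [math.sin(2.0 * math.pi * 3000.0 * i / 44100.0) for i in range(132)]
coeffs = mfcc(frame, 44100)
```

In practice a library routine with an FFT would replace the naive DFT; the sketch only shows the order of operations.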
MFCCs have several advantages [2]:
▣ Better performance than linear-frequency features.
▣ MFCCs can be extracted from both aperiodic and periodic signals.
▣ Cepstral coefficients can achieve significant data reduction without the risk of much information loss.
▣ MFCC coefficients are almost perfectly uncorrelated with each other [3]
[2] Jinhai Kai, “Sensor Network for the Monitoring of Ecosystem: Bird Species Recognition.”
[3] Liao, Chao, Patricia P. Wang, and Yimin Zhang. “Mining association patterns between music and video clips in professional MTV.”
Feature Extraction - Classifiers
SVMs
A binary classifier that separates classes with a hyperplane in a higher-dimensional space
Kernel: The equation of the classifier
Linear:
Polynomial:
Radial Basis (RBF)
MLP
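The four kernels listed above can be sketched in their common textbook forms. The exact formulas used in the study are not given here, so these are assumptions: the polynomial kernel is taken as (1 + u·v)^d, the RBF kernel as exp(-||u - v||^2 / (2σ^2)), and the MLP kernel as tanh(a u·v + b), with a and b standing in for a two-value parameter pair such as [1 -1]. Function and parameter names are illustrative.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def linear_kernel(u, v):
    return dot(u, v)


def polynomial_kernel(u, v, degree=3):
    # Assumed form: (1 + u.v)^d
    return (1.0 + dot(u, v)) ** degree


def rbf_kernel(u, v, sigma=4.0):
    # Assumed form: exp(-||u - v||^2 / (2 * sigma^2))
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def mlp_kernel(u, v, scale=1.0, offset=-1.0):
    # Assumed form: tanh(scale * u.v + offset)
    return math.tanh(scale * dot(u, v) + offset)


u, v = [1.0, 2.0], [3.0, 4.0]
values = {
    "linear": linear_kernel(u, v),
    "polynomial": polynomial_kernel(u, v, degree=2),
    "rbf": rbf_kernel(u, v, sigma=4.0),
    "mlp": mlp_kernel(u, v),
}
```

A larger sigma makes the RBF kernel decay more slowly with distance, so more training points influence each decision.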
Naive Bayes
Gaussian distribution
Independent features
Neural Networks
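As an illustration of the Naive Bayes assumptions above (Gaussian class-conditional densities, independent features), here is a minimal from-scratch sketch. The class name, the `uniform_prior` flag, the small variance floor, and the toy feature vectors are assumptions for the example and do not reproduce the toolbox used in the study.

```python
import math
from collections import defaultdict


class GaussianNaiveBayes:
    """Each feature is modelled as an independent Gaussian per class."""

    def fit(self, features, labels, uniform_prior=False):
        grouped = defaultdict(list)
        for x, y in zip(features, labels):
            grouped[y].append(x)
        total = len(labels)
        self.priors, self.means, self.variances = {}, {}, {}
        for label, rows in grouped.items():
            n = len(rows)
            # Either equal priors for all classes or class frequencies.
            self.priors[label] = 1.0 / len(grouped) if uniform_prior else n / total
            columns = list(zip(*rows))
            self.means[label] = [sum(col) / n for col in columns]
            # Small floor keeps the variance strictly positive.
            self.variances[label] = [
                sum((v - m) ** 2 for v in col) / n + 1e-9
                for col, m in zip(columns, self.means[label])
            ]
        return self

    def log_posterior(self, x, label):
        """Unnormalised log posterior: log prior + sum of log Gaussian densities."""
        score = math.log(self.priors[label])
        for v, m, var in zip(x, self.means[label], self.variances[label]):
            score += -0.5 * math.log(2.0 * math.pi * var) - (v - m) ** 2 / (2.0 * var)
        return score

    def predict(self, x):
        return max(self.priors, key=lambda label: self.log_posterior(x, label))


# Toy two-feature data with hypothetical species labels.
train_x = [[1.0, 1.0], [1.2, 0.8], [0.9, 1.1], [5.0, 5.0], [5.2, 4.9], [4.8, 5.1]]
train_y = ["species_a"] * 3 + ["species_b"] * 3
model = GaussianNaiveBayes().fit(train_x, train_y)
prediction = model.predict([1.1, 1.0])
```

Because MFCC coefficients are close to uncorrelated, the independence assumption is less restrictive for them than for raw spectral features.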
Feature Extraction - Control
▣ 80% data => training
▣ 20% data => testing
▣ Confusion matrix measuring performance
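The evaluation protocol above can be sketched with the standard library alone. The shuffling, the fixed seed, and the helper names are assumptions for illustration; the slides do not say whether the split was random or stratified by species.

```python
import random
from collections import Counter


def train_test_split(items, train_fraction=0.8, seed=0):
    """Shuffle the items and split them into training and testing parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


def confusion_matrix(true_labels, predicted_labels):
    """Count (true, predicted) label pairs."""
    return Counter(zip(true_labels, predicted_labels))


def accuracy(matrix):
    """Fraction of samples on the diagonal of the confusion matrix."""
    total = sum(matrix.values())
    correct = sum(n for (true, pred), n in matrix.items() if true == pred)
    return correct / total if total else 0.0


train, test = train_test_split(range(100))
matrix = confusion_matrix(["a", "a", "b", "b"], ["a", "b", "b", "b"])
score = accuracy(matrix)
```

The off-diagonal entries of the matrix show which species are mistaken for which, information that a single accuracy figure hides.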
Results - Detailed
Sr#  Classifier Name                                   Accuracy (%)
1    NB                                                82.71
2    NB with Uniform Prior Probability                 81.98
3    NB with Kernel Smoothing (KS)                     80.66
4    NB with KS & Uniform Prior Probability (UPP)      81.39
5    NB with KS Box Function & UPP                     78.1
6    NB with KS Epanechnikov Function & UPP            78.47
7    NB with KS Triangle Function & UPP                78.47
8    SVM with linear kernel                            0
9    SVM with polynomial kernel of degree 3            88.62
10   SVM with polynomial kernel of degree 4            89.93
11   SVM with polynomial kernel of degree 4            88.18
12   SVM with polynomial kernel of degree 5            80.09
13   SVM with RBF kernel with sigma 0.5                9.52
14   SVM with RBF kernel with sigma 1                  36.54
15   SVM with RBF kernel with sigma 1.5                62.22
16   SVM with RBF kernel with sigma 2                  77.64
17   SVM with RBF kernel with sigma 3                  88.55
18   SVM with RBF kernel with sigma 4                  90.88
19   SVM with RBF kernel with sigma 5                  89.13
20   SVM with MLP kernel with parameters [1 -1]        13.21
21   SVM with MLP kernel with parameters [0.5 -0.5]    13.47
22   SVM with MLP kernel with parameters [2 -2]        13.58
23   Neural Network with 5 hidden layers               58.8
24   Neural Network with 10 hidden layers              49.02
25   Neural Network with 20 hidden layers              40.62
82.71% Pure Naive Bayes
90.88% SVM with RBF kernel with σ = 4
89.93% SVM with polynomial kernel with d = 3
The Future
Classifier & autoseg for songbirds
Location-based classifiers
Scale up database quantitatively (no. of species)
Cloud & IoT implementation & mobile apps
Pokédex complete!
Core Team
Data Scientists
Mentors
Bhavin, IIT Madras ‘14, Entrepreneur
Raunak, IIT Madras ‘13, UC Berkeley ‘17
Pallavi, Cummins Clg ‘10, Birdwatcher
Sutapa, IIT KGP ‘18 (Primary Author)
Ankita, IIT Bombay ‘16
Riddish, IIT Bombay ‘17
Prof. Anil Prabhakar, Electrical Engg., IIT Madras
Prof. Preeti Rao, Electrical Engg., IIT Bombay