Formant-based Acoustic Features for Cow’s Estrus Detection in Audio Surveillance System

Jonguk Lee, Shangsu Zuo, Youngwha Chung, Daihee Park*
Department of Computer and Information Science, College of Science and Technology, Korea University, Sejong, Republic of Korea
{eastwest9, mining2015, ychungy, dhpark}@korea.ac.kr

Hong-Hee Chang, Suk Kim
Gyeongsang National University, Jinju, Republic of Korea
{hhchang, kimsuk}@gnu.ac.kr

2014 11th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS)
978-1-4799-4871-0/14/$31.00 ©2014 IEEE


Abstract

    In this paper, we developed an optimal formant feature subset algorithm for the detection of cow’s estrus vocalizations and introduced a prototype system to distinguish the differences between estrus and normal sounds from pattern recognition perspectives. Primarily, we found that there exist 19 formants in a spectrogram of Korean native cow vocalization, and this important finding initiated us to introduce a formant-based feature subset selection algorithm. We obtained the optimal formant feature subset {F1, F2, F4, F7, F14, F19} for the detection of Korean native cow’s estrus. Finally, performance evaluation was conducted using real vocalizations in a commercial loose barn, in which the average detection accuracy reached 97.5%, with false positive rate and false negative rate on average approaching 5.0% and 2.5%, respectively, when AdaBoost.M1 was used as a detector.

1. Introduction

Early detection of estrus is of considerable importance.

When estrus is detected late or goes undetected in the herd management of cows, farmers’ profitability can be seriously damaged [1].

Today, the idea of using information hidden in data relationships has inspired researchers in agricultural fields to apply these techniques to predict future trends in agricultural processes [2]. In the recent literature, many methods (standing-heat observation, pedometers or activity meters, video recording and image analysis, body temperature, etc.) have been discussed for estrus detection in cattle [1, 3-4]. Among these methods, this study focuses on detecting cow’s estrus using sound data. An animal’s vocalization is a very important bio-signal, and it can be obtained easily by inexpensive sound sensors at a distance without causing any stress to the animals [5]. Sound analysis is therefore a very promising method.

* Corresponding author

Recently, two interesting papers using cow sound data for automatic detection of estrus were published. First, Chung et al. [6] proposed an efficient data mining approach for detecting cow’s estrus using sound data. In their method, they first extracted Mel Frequency Cepstral Coefficients (MFCC) from cow vocalizations, and then applied Support Vector Data Description (SVDD) to detect estrus as early as possible. MFCC is known to be a very good feature extraction method in sound analysis; in particular, it models human sound perception and is therefore widely used in human speech recognition. However, animal sound perception can differ from human sound perception [7], so other feature extraction methods might be more suitable for animal sounds. Second, Yeon et al. [8] compared the acoustic characteristics of cow vocalizations in different physiological states (estrus and feed-anticipating states). They found that formant analysis played an important role in distinguishing estrus calls from normal calls; that is, the formant variables of vocalizations could be used to discriminate between the two groups. However, Yeon et al. chose four formants {F1, F2, F3, F4} empirically, together with a few additional variables (duration, intensity, pitch).

In our study, we developed an optimal formant feature subset algorithm for the detection of cow’s estrus vocalizations and introduced a prototype system to distinguish between estrus and normal sounds from a pattern recognition perspective. Primarily, we found that there exist 19 formants in a spectrogram of Korean native cow vocalization, and this important finding led us to introduce a formant-based feature subset selection algorithm. We obtained the optimal formant feature subset {F1, F2, F4, F7, F14, F19} for the detection of Korean native cow estrus using sound data. Finally, the performance of the method was evaluated using real vocalizations in a commercial loose barn.

This paper is structured as follows: In Section 2, we introduce our proposed formant-based feature subset selection algorithm with some mathematical background and briefly review the AdaBoost.M1 machine learning-based detector that was used in the experiments. In Section 3, we present the experimental results, and Section 4 concludes the paper.

    2. Formant-based Feature Subset Selection and Machine Learning-based Detector

    2.1. Formant-based Acoustic Feature Subset Selection

Effective feature subset selection is of considerable importance in pattern recognition [9]. Feature subset selection is defined as selecting a subset of features from the universal feature set. By eliminating redundant, useless, or rarely used features, it yields a compact, precise, and fast recognizer without causing any harmful performance degradation [10].

Here, we introduce a new Formant-based Feature Subset Selection Algorithm (FFSSA) to detect cow’s estrus. FFSSA takes into account the features’ inter-correlations and predictive performance in searching for a good feature subset. To assess the usefulness of individual features, it uses a statistical t-test and information gain for predicting the class label, along with the inter-correlation among the features. FFSSA then performs a Sequential Forward Search (SFS) [11] in the feature subset space. The following summarizes FFSSA with some mathematical background [12-14].

1) t-test:

$$ t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_1^2/n_1 + s_2^2/n_2}} \qquad (1) $$

where $n_1$ and $n_2$ are the numbers of data points, the corresponding data means are $\bar{x}_1$ and $\bar{x}_2$, and the variances for each class are $s_1^2$ and $s_2^2$.
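As a quick sanity check, Eq. (1) (the two-sample t-test with unequal variances, i.e., Welch’s test) can be computed with SciPy. The formant values below are made-up illustrations, not data from the paper:

```python
import numpy as np
from scipy import stats

# Hypothetical F1 measurements (Hz) for a few normal and estrus calls
normal = np.array([793.0, 810.5, 779.2, 801.3, 795.8])
estrus = np.array([657.1, 660.4, 671.9, 648.0, 655.2])

# Welch's t-test (unequal variances), matching Eq. (1)
t, p = stats.ttest_ind(normal, estrus, equal_var=False)
print(t, p)  # large t, tiny p: this formant separates the two classes
```

A formant with p-value above 0.05 would be dropped in step 1 of FFSSA.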

2) Information Gain:

Let $D$ be a set of $d$ data samples with $m$ distinct classes $C_1, \ldots, C_m$. The expected information $I$ needed to classify a sample is given by:

$$ I(p_1, \ldots, p_m) = -\sum_{i=1}^{m} p_i \log_2 p_i \qquad (2) $$

In the formula, $m$ is the number of distinct classes and $p_i$ is the probability that an arbitrary sample belongs to class $C_i$. Let attribute $A$ have $v$ distinct values $\{a_1, \ldots, a_v\}$. Let $d_{ij}$ be the number of samples of class $C_i$ in a subset $D_j$, where $D_j$ contains those samples in $D$ that have value $a_j$ of $A$. The entropy, or expected information based on the partitioning into subsets by $A$, is given by

$$ E(A) = \sum_{j=1}^{v} \frac{d_{1j} + \cdots + d_{mj}}{d} \, I(d_{1j}, \ldots, d_{mj}) \qquad (3) $$

The encoding information that would be gained by branching on $A$ is

$$ \mathrm{Gain}(A) = I(p_1, \ldots, p_m) - E(A) \qquad (4) $$
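Eqs. (2)–(4) can be transcribed directly into a few lines of Python; the toy labels below are illustrative, not the paper’s data:

```python
import math
from collections import Counter

def expected_info(labels):
    # Eq. (2): I(p1, ..., pm) = -sum_i p_i * log2(p_i)
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def info_gain(values, labels):
    # Eqs. (3)-(4): Gain(A) = I(D) - E(A), partitioning D by the values of A
    n = len(labels)
    entropy = sum(
        (values.count(v) / n)
        * expected_info([l for x, l in zip(values, labels) if x == v])
        for v in set(values)
    )
    return expected_info(labels) - entropy

labels = ["estrus", "estrus", "normal", "normal"]
values = ["high", "high", "low", "low"]  # attribute perfectly predicts the class
print(info_gain(values, labels))  # 1.0 bit: maximal gain for two balanced classes
```

Step 2 of FFSSA uses such gains to rank the candidate formants.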

3) Correlation Analysis: For a pair of variables $(x, y)$, the linear correlation coefficient $r_{x,y}$ is given by the formula:

$$ r_{x,y} = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2} \sqrt{\sum_i (y_i - \bar{y})^2}} \qquad (5) $$

where $\bar{x}$ is the mean of $x$, and $\bar{y}$ is the mean of $y$.
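Eq. (5) is the familiar Pearson correlation coefficient, available directly in NumPy; the two vectors here are arbitrary illustrations of a redundant feature pair:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.0, 9.9])  # nearly linear in x

r = np.corrcoef(x, y)[0, 1]  # Eq. (5)
print(r)  # close to 1.0: x and y carry largely redundant information
```

In step 3 of FFSSA, one member of such a highly correlated pair would be pruned.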

4) Between Class Distance (BCD):

$$ \mathrm{BCD} = \frac{1}{n_1 n_2} \sum_{i=1}^{n_1} \sum_{j=1}^{n_2} d(x_i, y_j) \qquad (6) $$

In this equation, let $X$ and $Y$ be the sets of samples in classes $C_1$ and $C_2$, let $x_i$ be the $i$th sample in $X$ and $y_j$ the $j$th sample in $Y$, let $n_1$ and $n_2$ be the numbers of samples in each class, and let $d(x_i, y_j)$ be the Euclidean distance between the two points $x_i$ and $y_j$.
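Reading Eq. (6) as the average Euclidean distance over all cross-class sample pairs (the normalization by $n_1 n_2$ is our reconstruction of the garbled original), a vectorized sketch is:

```python
import numpy as np

def bcd(X, Y):
    # Eq. (6): mean Euclidean distance over all cross-class pairs (x_i, y_j)
    dists = np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=-1)
    return dists.mean()

X = np.array([[0.0, 0.0], [1.0, 0.0]])  # two class-1 samples
Y = np.array([[4.0, 0.0]])              # one class-2 sample
print(bcd(X, Y))  # (4.0 + 3.0) / 2 = 3.5
```

A larger BCD means the chosen formant subset separates estrus from normal calls more clearly, which is why step 4 of FFSSA maximizes it.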

Formant-based Feature Subset Selection Algorithm (FFSSA)

Definition: cows’ sound data set $D$, consisting of estrus sound samples and non-estrus sound samples. Universal formant feature set $F = \{F_1, F_2, \ldots, F_k\}$, where $k$ is the number of formants that can be extracted from $D$.
Input: universal formant feature set $F$
1. Compute the t-test and obtain the first candidate set $S_1$: if the p-value of $F_i$ exceeds 0.05, remove $F_i$ from $F$.
2. Compute the information gain of each formant in $S_1$.
3. Apply correlation analysis to $S_1$ using the information gain ranking and obtain the second candidate set $S_2$: if the absolute correlation value between two formants exceeds the threshold, select only the higher-ranked formant.
4. Apply SFS to $S_2$ with BCD as the evaluation function: search for the best formant feature subset.
Output: an optimal formant feature subset $S^*$
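The four steps above can be sketched end-to-end. This is a loose reading of FFSSA, not the authors’ code: the relevance ranking uses the absolute t statistic as a stand-in for information gain (which would require discretizing the continuous formant values), and the thresholds, array shapes, and target subset size are assumptions for illustration:

```python
import numpy as np
from scipy import stats

def bcd(X, Y):
    # Between Class Distance, Eq. (6): mean pairwise cross-class distance
    return np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=-1).mean()

def ffssa(normal, estrus, alpha=0.05, corr_threshold=0.9, target_size=6):
    """Sketch of FFSSA over (n_samples, n_formants) arrays of formant values."""
    t, p = stats.ttest_ind(normal, estrus, equal_var=False)
    # Step 1: t-test filter -> first candidate set
    cand = [j for j in range(normal.shape[1]) if p[j] <= alpha]
    # Step 2: rank by relevance (|t| here; the paper ranks by information gain)
    cand.sort(key=lambda j: -abs(t[j]))
    # Step 3: correlation pruning -> of two correlated formants, keep the higher-ranked
    X = np.vstack([normal, estrus])
    selected = []
    for j in cand:
        if all(abs(np.corrcoef(X[:, j], X[:, k])[0, 1]) < corr_threshold
               for k in selected):
            selected.append(j)
    # Step 4: sequential forward search, scored by BCD
    subset = []
    while len(subset) < min(target_size, len(selected)):
        best = max((f for f in selected if f not in subset),
                   key=lambda f: bcd(normal[:, subset + [f]],
                                     estrus[:, subset + [f]]))
        subset.append(best)
    return sorted(subset)

# Demo on synthetic data: only formant 0 actually separates the classes
rng = np.random.default_rng(0)
normal_demo = rng.normal(0.0, 1.0, (60, 5))
estrus_demo = rng.normal(0.0, 1.0, (60, 5))
estrus_demo[:, 0] += 5.0
chosen = ffssa(normal_demo, estrus_demo)
print(chosen)  # contains formant index 0
```

The filter steps (1–3) are cheap per-feature tests, so the expensive SFS search in step 4 only runs over the small pruned candidate set.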

2.2. Machine Learning-based Detector

In this section, we briefly review AdaBoost.M1, the machine learning-based detector that was used in the experiments.

AdaBoost.M1: It was proposed by Freund and Schapire [15]. AdaBoost.M1 is a straightforward generalization of two-class AdaBoost to the multiclass problem using multiclass base classifiers [16] (for more details, refer to [16]).

3. Experimental results

3.1. Data Collection and Data Sets

In our experiments, we used cows that were 24 ~ 70 months old. These are all multiparous Korean native cows. The data collection process was performed in a commercial loose barn located in Jinju, South Korea. The sounds emitted by cows were recorded using a digital camcorder (Sony HDR-XR160, Tokyo, Japan). We extracted the estrus sounds from the sounds emitted by cows. The recorded estrus sounds were extracted using a computer with a standard soundcard (Realtek AC97) at 16 bits and a 44,100 Hz sampling rate using Cool Edit (Adobe, San Jose, CA, USA). We used the recorded sounds as reference data in our experiments. We used the normal sounds of thirty-two head of cattle and the estrus sounds of fifteen head of cattle. In our experiments, we used 163 estrus sound and 140 normal sound data for the detection of estrus cow calls. We used the same data set from our earlier work (for more details, see [6]).

3.2. Finding Acoustic Formant Features

Figure 1 shows a waveform and spectrogram of a normal (moo) call using Praat 5.3.52 [17]. Here, the dotted lines show the mean formant contours, and the dark gray band represents a large concentration of sound energy at a specific time and frequency. In Figure 1, we can see many formant features {F1 ~ F19} in the entire frequency range from 0 ~ 21,000 Hz. Figure 2 shows two spectrograms of a normal sound and an estrus sound, respectively. Comparing the two spectrograms, we can see the difference in the distribution of frequency amplitude. This means that the formant variables of calls can discriminate between the two groups (estrus and normal calls), as indicated by Yeon et al. [8].

Figure 1: A waveform and spectrogram of a normal (moo) sound of a Korean native cow.

Figure 2: Comparison of spectrograms between normal sound (A) and estrus sound (B).

3.3. Formant-based Feature Subset Selection

We performed the optimal formant feature subset experiments with the proposed algorithm described in Section 2. Let the universal set be {F1 ~ F19}. First, a t-test was performed to select statistically significant features, and we obtained a subset from the universal set, which excluded {F3, F5, and F6} (see Table 1). Second, we computed the information gain and obtained a ranking of the attributes such as {F19, F14, F13, F11, F15, F9, F10, F8, F7, F12, F18, F2, F16, F17, F1, and F4} in descending order. We could then use this ranking in the relevance analysis described below. Some redundancies can be detected by correlation analysis. After the correlation analysis, we obtained the candidate formant feature set {F19, F14, F15, F9, F10, F7, F18, F2, F16, F1, and F4} for the detection of Korean native cow estrus using sound data. Finally, we obtained the optimal feature subset {F1, F2, F4, F7, F14, F19} using a SFS algorithm in the feature subset space.

|                    | F1             | F2             | F3             | …   | F19             |
|--------------------|----------------|----------------|----------------|-----|-----------------|
| Normal             | 793.81 (128.1) | 1587.3 (120.7) | 2372.3 (139.9) | …   | 17759.0 (237.2) |
| Estrus             | 657.67 (71.4)  | 1435.9 (78.5)  | 2311.2 (128.2) | …   | 18004.8 (252.3) |
| Species effect (p) | 0.000**        | 0.000**        | 0.107          | …   | 0.037           |

Note. The p levels are for a one-way analysis of variance for the main effect of species on the acoustic measure, α = 0.05. p* < 0.01, p** < 0.001. Levels of significance for tests are listed numerically.
Table 1: Means for each formant feature of cow’s normal and estrus sounds.

3.4. Estrus Detection using Formant-based Feature Subset

For the performance evaluation of the proposed method, experiments were conducted using the real vocalizations of Korean native cows in the audio surveillance system that was introduced in our earlier study [6]. We adopted three evaluation measures [13]: the Estrus Detection Rate (EDR), False Positive Rate (FPR), and False Negative Rate (FNR). The formulas are given by:

$$ \mathrm{EDR} = \frac{\sum T}{\sum I} \times 100 \qquad (7) $$

$$ \mathrm{FPR} = \frac{\sum P}{\sum N} \times 100 \qquad (8) $$

$$ \mathrm{FNR} = \frac{\sum F}{\sum I} \times 100 \qquad (9) $$

where $I$ is the individual estrus data and $N$ is the normal data. $T$ is the estrus data classified as such, $P$ is the normal data misclassified as estrus, and $F$ is the estrus data misclassified as normal.

A summary of the detection results for three different formant feature subsets with a machine learning-based detector is shown in Table 2. Our results indicate that the average detection accuracy reached 97.5%, with FPR and FNR on average approaching 5.0% and 2.5%, respectively, when AdaBoost.M1 was used with the obtained optimal formant feature subset. We used 10-fold cross validation in all our experiments. Comparing this result with those of the two other subsets, {F1, F2, F3, F4} (the most popular formant set in the literature) and {F1 ~ F19} (all formants in the spectrogram), we can see that {F1, F2, F4, F7, F14, F19} (the optimal formant subset) outperformed them in all cases.
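The evaluation workflow described above (a boosted detector with 10-fold cross validation) can be sketched with scikit-learn, whose AdaBoostClassifier (SAMME boosting) is a close stand-in for AdaBoost.M1 on a two-class problem. The features below are synthetic formant-like values, not the cow recordings, so the score only illustrates the pipeline, not the paper’s 97.5% result:

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(42)
# Six synthetic "formant" features per call; estrus calls shifted in two of them
normal = rng.normal(0.0, 1.0, size=(140, 6))
estrus = rng.normal(0.0, 1.0, size=(163, 6))
estrus[:, [0, 3]] += 2.0

X = np.vstack([normal, estrus])
y = np.array([0] * 140 + [1] * 163)  # 0 = normal, 1 = estrus

clf = AdaBoostClassifier(n_estimators=50, random_state=0)
scores = cross_val_score(clf, X, y, cv=10)  # 10-fold CV, as in the paper
print(scores.mean())
```

With real data, the six columns of X would be the selected formant frequencies {F1, F2, F4, F7, F14, F19} extracted from each call.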

4. Conclusions

In this paper, we discussed the methodology and results of the study we conducted to develop an optimal formant feature subset algorithm for the detection of cow’s estrus vocalizations, and we introduced a prototype system to distinguish between estrus and normal sounds from a pattern recognition perspective. First, we found that there exist 19 formants in spectrograms of Korean native cow vocalizations. Second, we introduced a new formant-based feature subset selection algorithm. Next, we obtained the optimal formant feature subset {F1, F2, F4, F7, F14, F19} for the detection of Korean native cow estrus using sound data. Finally, the performance was evaluated using real vocalizations in a commercial loose barn. Experiments showed that the average detection accuracy of the optimal formant feature subset approached 97.5%, with FPR and FNR on average reaching 5.0% and 2.5%, respectively, when AdaBoost.M1 was used.

Acknowledgement

This research was supported by the BK (Brain Korea) 21 Plus Program and the Advanced Production Technology Development Program, Ministry for Food, Agriculture, Forestry and Fisheries, Korea.

| Feature subset                                     | Dim. (range)      | EDR (%) | FPR (%) | FNR (%) |
|----------------------------------------------------|-------------------|---------|---------|---------|
| {F1, F2, F3, F4}: most popular formant set         | 4 (0~5,000 Hz)    | 85.3    | 20.0    | 14.7    |
| {F1 ~ F19}: all found formants                     | 19 (0~21,000 Hz)  | 93.7    | 5.7     | 6.3     |
| {F1, F2, F4, F7, F14, F19}: optimal formant subset | 6 (0~21,000 Hz)   | 97.5    | 5.0     | 2.5     |

Detector: AdaBoost.M1.
Table 2: Performance comparison among three different feature subsets.

References

[1] M. Saint-Dizier and S. Chastant-Maillard. Towards an automated detection of oestrus in dairy cattle. Reproduction in Domestic Animals, 47(6): 1056-1061, 2012.

[2] A. Mucherino, P. Papajorghi, and P. Pardalos. Data Mining in Agriculture. Springer, 2009.

[3] J. Alawneh, N. Williamson, and D. Bailey. Comparison of a camera-software system and typical farm management for detecting oestrus in dairy cattle at pasture. New Zealand Veterinary Journal, 54(2): 73-77, 2006.

[4] C. Hockey, J. Morton, S. Norman, and M. McGowan. Evaluation of a neck mounted 2-hourly activity meter system for detecting cows about to ovulate in two paddock-based Australian dairy herds. Reproduction in Domestic Animals, 45(5): e107-e117, 2010.

[5] J. Aerts, P. Jans, D. Halloy, P. Gustin, and D. Berckmans. Labeling of cough data from pigs for on-line disease monitoring by sound analysis. Transactions of the American Society of Agricultural Engineers, 48(1): 351-354, 2005.

[6] Y. Chung, J. Lee, S. Oh, D. Park, H. Chang, and S. Kim. Automatic detection of cow’s oestrus in audio surveillance system. Asian-Australasian Journal of Animal Sciences, 26(7): 1030-1037, 2013.

    [7] K. A. Steen, O. R. Therkildsen, H. Karstoft, and O. Green. A vocal-based analytical method for goose behaviour recognition. Sensors, 12(3): 3773-3788, 2012.

[8] S. C. Yeon, J. H. Jeon, K. A. Houpt, H. H. Chang, H. C. Lee, and H. J. Lee. Acoustic features of vocalizations of Korean native cows in two different conditions. Applied Animal Behaviour Science, 101(1): 1-9, 2006.

    [9] G. Klir and B. Yuan. Fuzzy Sets and Fuzzy Logic: Theory and Applications. Prentice-Hall, 1995.

    [10] M. A. Hall. Correlation-based feature selection for machine learning. Ph.D. Thesis, The University of Waikato, 1999.

    [11] S. F. Pratama, A. K. Muda, Y.-H. Choo, and N. A. Muda. A Comparative Study of Feature Selection Methods for Authorship Invarianceness in Writer Identification. International Journal of Computer Information Systems and Industrial Management Applications, 4: 467-476, 2012.

    [12] J. Marques de Sá. Applied Statistics using SPSS, Statistica, Matlab and R. Springer, 2007.

[13] J. Han, M. Kamber, and J. Pei. Data Mining: Concepts and Techniques. 3rd Edition, Morgan Kaufmann, 2012.

    [14] L. Yu and H. Liu. Feature selection for high-dimensional data: A fast correlation-based filter solution. In International Conference of Machine Learning, 3: 856-863, 2003.

    [15] Y. Freund and R. E. Schapire. A decision-theoretic generalization of on-line learning and an application to boosting. Journal of computer and system sciences, 55(1): 119-139, 1997.

[16] G. Eibl and K. P. Pfeiffer. How to make AdaBoost.M1 work for weak base classifiers by changing only one line of the code. In Machine Learning: ECML 2002, 72-83, 2002.

    [17] P. Boersma and D. Weenink. Praat: Doing phonetics by computer [Computer program]. Version 5.3.52, 2013.

