T-ASLP 2012-2010 (EDICS in Speech areas)
(and their likely conference paper origins)
2012
Chen, S.; Yang, J.; Chiang, C.; Liu, M.; Wang, Y., “A New Prosody-Assisted Mandarin ASR
System,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.20, no.6, pp.1669-1684,
2012
Likely origin:
Jyh-Her Yang, Ming-Chieh Liu, Hao-Hsiang Chang, Chen-Yu Chiang, Yih-Ru Wang, Sin-Horng Chen:
Enriching Mandarin speech recognition by incorporating a hierarchical prosody model. ICASSP
2011:5052-5055
Serizel, R.; Moonen, M.; Wouters, J.; Jensen, S.H., “A Zone-of-Quiet Based Approach to Integrated
Active Noise Control and Noise Reduction for Speech Enhancement in Hearing Aids,” Audio, Speech,
and Language Processing, IEEE Transactions on , vol.20, no.6, pp.1685-1697, 2012
Likely origin:
Romain Serizel, Marc Moonen, Jan Wouters, Søren Holdt Jensen: A zone of quiet based approach to
integrated active noise control and noise reduction in hearing aids. WASPAA 2009:229-232
Sigg, C.D.; Dikk, T.; Buhmann, J.M., “Speech Enhancement Using Generative Dictionary
Learning,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.20, no.6, pp.1698-1712,
2012
Likely origin:
Christian D. Sigg, Tomas Dikk, Joachim M. Buhmann: Speech enhancement with sparse coding in
learned dictionaries. ICASSP 2010:4758-4761
Zen, H.; Braunschweiler, N.; Buchholz, S.; Gales, M.J.F.; Knill, K.; Krstulovic, S.; Latorre, J., “Statistical
Parametric Speech Synthesis Based on Speaker and Language Factorization,” Audio, Speech, and
Language Processing, IEEE Transactions on , vol.20, no.6, pp.1713-1724, 2012
Likely origin:
Heiga Zen: Speaker and language adaptive training for HMM-based polyglot speech
synthesis. INTERSPEECH 2010:410-413
Schuldt, C.; Lindstrom, F.; Claesson, I., “A Delay-Based Double-Talk Detector,” Audio, Speech, and
Language Processing, IEEE Transactions on , vol.20, no.6, pp.1725-1733, 2012
Likely origin:
None
Saito, D.; Watanabe, S.; Nakamura, A.; Minematsu, N., “Statistical Voice Conversion Based on Noisy
Channel Model,” Audio, Speech, and Language Processing, IEEE Transactions on, vol.20, no.6, pp.1784-
1794, 2012
Likely origin:
Daisuke Saito, Shinji Watanabe, Atsushi Nakamura, Nobuaki Minematsu: Probabilistic integration of joint
density model and speaker model for voice conversion. INTERSPEECH 2010:1728-1731
Wang, T. T.; Quatieri, T. F., “Two-Dimensional Speech-Signal Modeling,” Audio, Speech, and Language
Processing, IEEE Transactions on, vol.20, no.6, pp. 1843 – 1856, 2012
Likely origin:
Tianyu T. Wang, Thomas F. Quatieri: Multi-pitch estimation by a joint 2-d representation of pitch and pitch
dynamics. INTERSPEECH 2010:645-648
Chao-Ling Hsu; DeLiang Wang; Jang, J.-S.R.; Ke Hu, “A Tandem Algorithm for Singing Pitch
Extraction and Voice Separation From Music Accompaniment,” Audio, Speech, and Language
Processing, IEEE Transactions on, vol.20, no.5, pp. 1482 – 1491, 2012
Likely origin:
Chao-Ling Hsu, DeLiang Wang, Jyh-Shing Roger Jang: A trend estimation algorithm for singing pitch
detection in musical recordings. ICASSP 2011:393-396
Zhen-Hua Ling; Li-Rong Dai, “Minimum Kullback–Leibler Divergence Parameter Generation for
HMM-Based Speech Synthesis,” Audio, Speech, and Language Processing, IEEE Transactions on,
vol.20, no.5, pp. 1492 – 1502, 2012
Likely origin:
None
Sanand, D.R.; Umesh, S. , “VTLN Using Analytically Determined Linear-Transformation on
Conventional MFCC,” Audio, Speech, and Language Processing, IEEE Transactions on, vol.20, no.5, pp.
1573 – 1584, 2012
Likely origin:
Doddipatla Rama Sanand, Ralf Schlüter, Hermann Ney: Revisiting VTLN using linear transformation on
conventional MFCC. INTERSPEECH 2010:538-541
Cumani, S.; Laface, P., “Analysis of Large-Scale SVM Training Algorithms for Language and
Speaker Recognition,” Audio, Speech, and Language Processing, IEEE Transactions on, vol.20, no.5, pp.
1585 – 1596, 2012
Likely origin:
None
Xiaojia Zhao; Yang Shao; DeLiang Wang , “CASA-Based Robust Speaker Identification,” Audio,
Speech, and Language Processing, IEEE Transactions on, vol.20, no.5, pp. 1608 - 1616, 2012
Likely origin:
Xiaojia Zhao, Yang Shao, DeLiang Wang: Robust speaker identification using a CASA front-end. ICASSP
2011:5468-5471
Giacobello, D.; Christensen, M.G.; Murthi, M.N.; Jensen, S.H.; Moonen, M. , “Sparse Linear Prediction
and Its Applications to Speech Processing,” Audio, Speech, and Language Processing, IEEE
Transactions on, vol.20, no.5, pp. 1644 – 1657, 2012
Likely origin:
Daniele Giacobello, Mads Græsbøll Christensen, Manohar N. Murthi, Søren Holdt Jensen, Marc Moonen:
Enhancing sparsity in linear prediction of speech by iteratively reweighted 1-norm minimization. ICASSP
2010:4650-4653
Nakagawa, S.; Longbiao Wang; Ohtsuka, S. , “Speaker Identification and Verification by Combining
MFCC and Phase Information,” Audio, Speech, and Language Processing, IEEE Transactions on, vol.20,
no.4, pp. 1085 – 1095, 2012
Likely origin:
Longbiao Wang, Kazue Minami, Kazumasa Yamamoto, Seiichi Nakagawa: Speaker identification by
combining MFCC and phase information in noisy environments. ICASSP 2010:4502-4505
Chu-Cheng Lin; Tsai, R.T.-H. , “A Generative Data Augmentation Model for Enhancing Chinese
Dialect Pronunciation Prediction,” Audio, Speech, and Language Processing, IEEE Transactions on,
vol.20, no.4, pp. 1109 – 1117, 2012
Likely origin:
None
Estellers, V.; Gurban, M.; Thiran, J. , “On Dynamic Stream Weighting for Audio-Visual Speech
Recognition,” Audio, Speech, and Language Processing, IEEE Transactions on, vol.20, no.4, pp. 1145 –
1157, 2012
Likely origin:
None
Rosenkranz, T.; Puder, H. , “Improving Robustness of Codebook-Based Noise Estimation Approaches
With Delta Codebooks,” Audio, Speech, and Language Processing, IEEE Transactions on, vol.20, no.4,
pp. 1177 – 1188, 2012
Likely origin:
None
Narwaria, M.; Weisi Lin; McLoughlin, I.V.; Emmanuel, S.; Liang-Tien Chia , “Nonintrusive Quality
Assessment of Noise Suppressed Speech With Mel-Filtered Energies and Support Vector
Regression,” Audio, Speech, and Language Processing, IEEE Transactions on, vol.20, no.4, pp. 1217 –
1232, 2012
Likely origin:
Manish Narwaria, Weisi Lin, Ian Vince McLoughlin, Sabu Emmanuel, Liang-Tien Chia: Non-intrusive
Speech Quality Assessment with Support Vector Regression. MMM 2010:325-335
Huang, Y.A.; Benesty, J. , “A Multi-Frame Approach to the Frequency-Domain Single-Channel Noise
Reduction Problem,” Audio, Speech, and Language Processing, IEEE Transactions on, vol.20, no.4, pp.
1256 – 1269, 2012
Likely origin:
Jacob Benesty, Yiteng Huang: A single-channel noise reduction MVDR filter. ICASSP 2011:273-276
Godoy, E.; Rosec, O.; Chonavel, T. , “Voice Conversion Using Dynamic Frequency Warping With
Amplitude Scaling, for Parallel or Nonparallel Corpora,” Audio, Speech, and Language Processing,
IEEE Transactions on, vol.20, no.4, pp. 1313 – 1323, 2012
Likely origin:
Elizabeth Godoy, Olivier Rosec, Thierry Chonavel: Spectral Envelope Transformation Using DFW and
Amplitude Scaling for Voice Conversion with Parallel or Nonparallel Corpora. INTERSPEECH 2011:673-
676
Ruofei Chen; Cheung-Fat Chan; Hing Cheung So , “Model-Based Speech Enhancement With Improved
Spectral Envelope Estimation via Dynamics Tracking,” Audio, Speech, and Language Processing,
IEEE Transactions on, vol.20, no.4, pp. 1324 - 1336, 2012
Likely origin:
Ruofei Chen, Cheung-Fat Chan: Analysis-synthesis based speech enhancement with improved spectrum
envelope estimation by tracking speech dynamics. ICASSP 2011:4644-4647
Qun Feng Tan; Narayanan, S.S. , “Novel Variations of Group Sparse Regularization Techniques With
Applications to Noise Robust Automatic Speech Recognition,” Audio, Speech, and Language
Processing, IEEE Transactions on, vol.20, no.4, pp. 1337 – 1346, 2012
Likely origin:
None
Solera-Urena, R.; Garcia-Moral, A.I.; Pelaez-Moreno, C.; Martinez-Ramon, M.; Diaz-de-Maria, F. , “Real-
Time Robust Automatic Speech Recognition Using Compact Support Vector Machines,” Audio,
Speech, and Language Processing, IEEE Transactions on, vol.20, no.4, pp. 1347 - 1361, 2012
Likely origin:
None
Fazel, A.; Chakrabartty, S. , “Sparse Auditory Reproducing Kernel (SPARK) Features for Noise-
Robust Speech Recognition,” Audio, Speech, and Language Processing, IEEE Transactions on, vol.20,
no.4, pp. 1362 - 1371, 2012
Likely origin:
Amin Fazel, Shantanu Chakrabartty: Sparse kernel cepstral coefficients (SKCC): Inner-product based
features for noise-robust speech recognition. ISCAS 2011:2401-2404
Yoshii, K.; Goto, M. , “A Nonparametric Bayesian Multipitch Analyzer Based on Infinite Latent
Harmonic Allocation,” Audio, Speech, and Language Processing, IEEE Transactions on, vol.20, no.3, pp.
717 – 730, 2012
Likely origin:
Kazuyoshi Yoshii, Masataka Goto: Infinite Latent Harmonic Allocation: A Nonparametric Bayesian
Approach to Multipitch Analysis. ISMIR 2010:309-314
Parlak, S.; Saraclar, M. , “Performance Analysis and Improvement of Turkish Broadcast News
Retrieval,” Audio, Speech, and Language Processing, IEEE Transactions on, vol.20, no.3, pp. 731 – 741,
2012
Likely origin:
S. Parlak and M. Saraclar, “Spoken information retrieval for Turkish broadcast news,” in Proc. SIGIR,
Boston, MA, 2009, pp. 782–783.
S. Parlak and M. Saraclar, “Spoken term detection for Turkish broadcast news,” in Proc. ICASSP, Las
Vegas, NV, 2008, pp. 5244–5247.
McLaren, M.; van Leeuwen, D. , “Source-Normalized LDA for Robust Speaker Recognition Using i-
Vectors From Multiple Speech Sources,” Audio, Speech, and Language Processing, IEEE Transactions
on, vol.20, no.3, pp. 755 – 766, 2012
Likely origin:
Mitchell McLaren, David A. van Leeuwen: Source-normalised-and-weighted LDA for robust speaker
recognition using i-vectors. ICASSP 2011:5456-5459
Mitchell McLaren, David A. van Leeuwen: Improved speaker recognition when using i-vectors from
multiple speech sources. ICASSP 2011:5460-5463
Qiang Fu; Yong Zhao; Biing-Hwang Juang , “Automatic Speech Recognition Based on Non-Uniform
Error Criteria,” Audio, Speech, and Language Processing, IEEE Transactions on, vol.20, no.3, pp. 780 –
793, 2012
Likely origin:
Qiang Fu, Dwi Sianto Mansjur, Biing-Hwang Juang: Non-Uniform error criteria for automatic pattern and
speech recognition. ICASSP 2008:1853-1856
Heiga Zen; Gales, M.J.F.; Nankaku, Y.; Tokuda, K. , “Product of Experts for Statistical Parametric
Speech Synthesis,” Audio, Speech, and Language Processing, IEEE Transactions on, vol.20, no.3, pp.
794 – 805, 2012
Likely origin:
Heiga Zen, Mark J. F. Gales, Yoshihiko Nankaku, Keiichi Tokuda: Statistical parametric speech synthesis
based on product of experts. ICASSP 2010:4242-4245
Helander, E.; Silen, H.; Virtanen, T.; Gabbouj, M. , “Voice Conversion Using Dynamic Kernel Partial
Least Squares Regression,” Audio, Speech, and Language Processing, IEEE Transactions on, vol.20,
no.3, pp. 806 – 817, 2012
Likely origin:
None
Ning Ma; Barker, J.; Christensen, H.; Green, P. , “Combining Speech Fragment Decoding and Adaptive
Noise Floor Modeling,” Audio, Speech, and Language Processing, IEEE Transactions on, vol.20, no.3,
pp. 818 - 827, 2012
Likely origin:
None
Liang-Che Sun; Lin-Shan Lee , “Modulation Spectrum Equalization for Improved Robust Speech
Recognition,” Audio, Speech, and Language Processing, IEEE Transactions on, vol.20, no.3, pp. 828 -
843, 2012
Likely origin:
Liang-Che Sun, Chang-Wen Hsu, Lin-Shan Lee: Evaluation of modulation spectrum equalization
techniques for large vocabulary robust speech recognition. INTERSPEECH 2008:1004-1007
Siniscalchi, S.M.; Dau-Cheng Lyu; Svendsen, T.; Chin-Hui Lee , “Experiments on Cross-Language
Attribute Detection and Phone Recognition With Minimal Target-Specific Training Data,” Audio,
Speech, and Language Processing, IEEE Transactions on, vol.20, no.3, pp. 875 - 887, 2012
Likely origin:
Dau-Cheng Lyu, Sabato Marco Siniscalchi, Tae-Yoon Kim, Chin-Hui Lee: Continuous phone recognition
without target language training data. INTERSPEECH 2008:2687-2690
Xiang Lin; Khong, A.W.H.; Naylor, P.A. , “A Forced Spectral Diversity Algorithm for Speech
Dereverberation in the Presence of Near-Common Zeros,” Audio, Speech, and Language Processing,
IEEE Transactions on, vol.20, no.3, pp. 888 – 899, 2012
Likely origin:
Xiang Lin, Andy W. H. Khong, Patrick A. Naylor: Blind system identification for speech dereverberation
with Forced Spectral Diversity. ICASSP 2009:3737-3740
Chiu, Y.B.; Raj, B.; Stern, R.M., “Learning-Based Auditory Encoding for Robust Speech
Recognition,” Audio, Speech, and Language Processing, IEEE Transactions on, vol.20, no.3, pp. 900 –
914, 2012
Likely origin:
Yu-Hsiang Bosco Chiu, Bhiksha Raj, Richard M. Stern: Learning-based auditory encoding for robust
speech recognition. ICASSP 2010:4278-4281
Wei Chu; Alwan, A., “SAFE: A Statistical Approach to F0 Estimation Under Clean and Noisy
Conditions,” Audio, Speech, and Language Processing, IEEE Transactions on, vol.20, no.3, pp. 933 – 944,
2012
Likely origin:
Wei Chu, Abeer Alwan: SAFE: a statistical algorithm for F0 estimation for both clean and noisy
speech. INTERSPEECH 2010:2590-2593
Panda, A.; Srikanthan, T. , “Psychoacoustic Model Compensation for Robust Speaker Verification in
Environmental Noise,” Audio, Speech, and Language Processing, IEEE Transactions on, vol.20, no.3, pp.
945 - 953, 2012
Likely origin:
None
Drugman, T.; Dutoit, T. , “The Deterministic Plus Stochastic Model of the Residual Signal and Its
Applications,” Audio, Speech, and Language Processing, IEEE Transactions on, vol.20, no.3, pp. 968 -
981, 2012
Likely origin:
Thomas Drugman, Geoffrey Wilfart, Thierry Dutoit: A deterministic plus stochastic model of the residual
signal for improved parametric speech synthesis. INTERSPEECH 2009:1779-1782
Drugman, T.; Thomas, M.; Gudnason, J.; Naylor, P.; Dutoit, T. , “Detection of Glottal Closure Instants
From Speech Signals: A Quantitative Review,” Audio, Speech, and Language Processing, IEEE
Transactions on, vol.20, no.3, pp. 994 – 1006, 2012
Likely origin:
None
Xianyu Zhao; Yuan Dong , “Variational Bayesian Joint Factor Analysis Models for Speaker
Verification,” Audio, Speech, and Language Processing, IEEE Transactions on, vol.20, no.3, pp. 1032 -
1042, 2012
Likely origin:
Xianyu Zhao, Yuan Dong, Jian Zhao, Liang Lu, Jiqing Liu, Haila Wang: Variational Bayesian Joint factor
analysis for speaker verification. ICASSP 2009:4049-4052
Pandey, A.; Mathews, V.J. , “Adaptive Gain Processing With Offending Frequency Suppression for
Digital Hearing Aids,” Audio, Speech, and Language Processing, IEEE Transactions on, vol.20, no.3, pp.
1043 - 1055, 2012
Likely origin:
Ashutosh Pandey, V. John Mathews: Offending frequency suppression with a reset algorithm to improve
feedback cancellation in digital hearing aids. ICASSP 2011:301-304
Shoham, T.; Malah, D.; Shechtman, S. , “Quality Preserving Compression of a Concatenative Text-To-
Speech Acoustic Database,” Audio, Speech, and Language Processing, IEEE Transactions on, vol.20,
no.3, pp. 1056 – 1068, 2012
Likely origin:
None
Despotovic, V.; Goertz, N.; Peric, Z. , “Nonlinear Long-Term Prediction of Speech Based on
Truncated Volterra Series,” Audio, Speech, and Language Processing, IEEE Transactions on, vol.20,
no.3, pp. 1069 – 1073, 2012
Likely origin:
None
Anguera Miro, X.; Bozonnet, S.; Evans, N.; Fredouille, C.; Friedland, G.; Vinyals, O. , “Speaker
Diarization: A Review of Recent Research,” Audio, Speech, and Language Processing, IEEE
Transactions on, vol.20, no.2, pp. 356 - 370, 2012
Likely origin:
None
Friedland, G.; Janin, A.; Imseng, D.; Anguera Miro, X.; Gottlieb, L.; Huijbregts, M.; Knox, M.T.; Vinyals,
O. , “The ICSI RT-09 Speaker Diarization System,” Audio, Speech, and Language Processing, IEEE
Transactions on, vol.20, no.2, pp. 371 – 381, 2012
Likely origin:
?
Evans, N.; Bozonnet, S.; Dong Wang; Fredouille, C.; Troncy, R. , “A Comparative Study of Bottom-Up
and Top-Down Approaches to Speaker Diarization,” Audio, Speech, and Language Processing, IEEE
Transactions on, vol.20, no.2, pp. 382 – 392, 2012
Likely origin:
Simon Bozonnet, Dong Wang, Nicholas W. D. Evans, Raphaël Troncy: Linguistic influences on bottom-up
and top-down clustering for speaker diarization. ICASSP 2011:4424-4427
Huijbregts, M.; van Leeuwen, D.A.; Wooters, C. , “Speaker Diarization Error Analysis Using Oracle
Components,” Audio, Speech, and Language Processing, IEEE Transactions on, vol.20, no.2, pp. 393 –
403, 2012
Likely origin:
None
Huijbregts, M.; van Leeuwen, D.A. , “Large-Scale Speaker Diarization for Long Recordings and Small
Collections,” Audio, Speech, and Language Processing, IEEE Transactions on, vol.20, no.2, pp. 404 –
413, 2012
Likely origin:
None
Ben-Harush, O.; Ben-Harush, O.; Lapidot, I.; Guterman, H. , “Initialization of Iterative-Based Speaker
Diarization Systems for Telephone Conversations,” Audio, Speech, and Language Processing, IEEE
Transactions on, vol.20, no.2, pp. 414 – 425, 2012
Likely origin:
Oshry Ben-Harush, Itshak Lapidot, Hugo Guterman: Incremental diarization of telephone
conversations. INTERSPEECH 2010:2226-2229
Pardo, J.M.; Barra-Chicote, R.; San-Segundo, R.; de Cordoba, R.; Martinez-Gonzalez, B. , “Speaker
Diarization Features: The UPM Contribution to the RT09 Evaluation,” Audio, Speech, and Language
Processing, IEEE Transactions on, vol.20, no.2, pp. 426 – 435, 2012
Likely origin:
None
Ishiguro, K.; Yamada, T.; Araki, S.; Nakatani, T.; Sawada, H. , “Probabilistic Speaker Diarization With
Bag-of-Words Representations of Speaker Angle Information,” Audio, Speech, and Language
Processing, IEEE Transactions on, vol.20, no.2, pp. 447 - 460, 2012
Likely origin:
Katsuhiko Ishiguro, Takeshi Yamada, Shoko Araki, Tomohiro Nakatani: A probabilistic speaker clustering
for DOA-based diarization. WASPAA 2009:241-244
Tin Lay Nwe; Hanwu Sun; Bin Ma; Haizhou Li , “Speaker Clustering and Cluster Purification
Methods for RT07 and RT09 Evaluation Meeting Data,” Audio, Speech, and Language Processing,
IEEE Transactions on, vol.20, no.2, pp. 461 – 473, 2012
Likely origin:
None
Batista, F.; Moniz, H.; Trancoso, I.; Mamede, N. , “Bilingual Experiments on Automatic Recovery of
Capitalization and Punctuation of Automatic Speech Transcripts,” Audio, Speech, and Language
Processing, IEEE Transactions on, vol.20, no.2, pp. 474 - 485, 2012
Likely origin:
None
Hain, T.; Burget, L.; Dines, J.; Garner, P.N.; Grezl, F.; Hannani, A.E.; Huijbregts, M.; Karafiat, M.;
Lincoln, M.; Wan, V. , “Transcribing Meetings With the AMIDA Systems,” Audio, Speech, and
Language Processing, IEEE Transactions on, vol.20, no.2, pp. 486 – 498, 2012
Likely origin:
Thomas Hain, Lukas Burget, John Dines, Philip N. Garner, Asmaa El Hannani, Marijn Huijbregts, Martin
Karafiát, Mike Lincoln, Vincent Wan: The AMIDA 2009 meeting transcription system. INTERSPEECH
2010:358-361
Hoffmeister, B.; Heigold, G.; Rybach, D.; Schluter, R.; Ney, H. , “WFST Enabled Solutions to ASR
Problems: Beyond HMM Decoding,” Audio, Speech, and Language Processing, IEEE Transactions on,
vol.20, no.2, pp. 551 - 564, 2012
Likely origin:
None
Sanchis, A.; Juan, A.; Vidal, E., “A Word-Based Naïve Bayes Classifier for Confidence Estimation in
Speech Recognition,” Audio, Speech, and Language Processing, IEEE Transactions on, vol.20, no.2, pp.
565 – 574, 2012
Likely origin:
None
Sungrack Yun; Yoo, C.D. , “Loss-Scaled Large-Margin Gaussian Mixture Models for Speech Emotion
Classification,” Audio, Speech, and Language Processing, IEEE Transactions on, vol.20, no.2, pp. 585 –
598, 2012
Likely origin:
Sungrack Yun, Chang Dong Yoo: Speech emotion recognition via a max-margin framework incorporating
a loss function based on the Watson and Tellegen's emotion model. ICASSP 2009:4169-4172
Yousefian, N.; Loizou, P.C., “A Dual-Microphone Speech Enhancement Algorithm Based on the
Coherence Function,” Audio, Speech, and Language Processing, IEEE Transactions on, vol.20, no.2, pp.
599 – 609, 2012
Likely origin:
None
Boucheron, L.E.; De Leon, P.L.; Sandoval, S., “Low Bit-Rate Speech Coding Through Quantization of
Mel-Frequency Cepstral Coefficients,” Audio, Speech, and Language Processing, IEEE Transactions on,
vol.20, no.2, pp. 610 – 619, 2012
Likely origin:
Laura E. Boucheron, Phillip L. De Leon, Steven Sandoval: Hybrid Scalar/Vector Quantization of Mel-
Frequency Cepstral Coefficients for Low Bit-Rate Coding of Speech. DCC 2011:103-112
Nam Soo Kim; Tae Gyoon Kang; Shin Jae Kang; Chang Woo Han; Doo Hwa Hong, “Speech Feature
Mapping Based on Switching Linear Dynamic System,” Audio, Speech, and Language Processing,
IEEE Transactions on, vol.20, no.2, pp. 620 – 631, 2012
Likely origin:
Chang Woo Han, Tae Gyoon Kang, Doo Hwa Hong, Nam Soo Kim, Kiwan Eom, Jaewon Lee: Switching
linear dynamic transducer for stereo data based speech feature mapping. ICASSP 2011:4776-4779
Yi-Cheng Pan; Hung-Yi Lee; Lin-Shan Lee, “Interactive Spoken Document Retrieval With Suggested
Key Terms Ranked by a Markov Decision Process,” Audio, Speech, and Language Processing, IEEE
Transactions on, vol.20, no.2, pp. 632 – 645, 2012
Likely origin:
None
Morgan, N., “Deep and Wide: Multiple Layers in Automatic Speech Recognition,” Audio, Speech, and
Language Processing, IEEE Transactions on, vol.20, no.1, pp. 7 – 13, 2012
Likely origin:
None
Mohamed, A.; Dahl, G.E.; Hinton, G., “Acoustic Modeling Using Deep Belief Networks,” Audio, Speech,
and Language Processing, IEEE Transactions on, vol.20, no.1, pp. 14 - 22, 2012
Definite Origin: Mohamed, A. R., Dahl, G. E. and Hinton, G. E. Deep belief networks for phone recognition. NIPS
workshop on deep learning for speech recognition, 2009.
Sivaram, G.S.V.S.; Hermansky, H., “Sparse Multilayer Perceptron for Phoneme Recognition,” Audio,
Speech, and Language Processing, IEEE Transactions on, vol.20, no.1, pp. 23 – 29, 2012
Definite Origin:
Garimella S. V. S. Sivaram, Hynek Hermansky: Multilayer perceptron with sparse hidden outputs for
phoneme recognition. ICASSP 2011:5336-5339
Dahl, G.E.; Dong Yu; Li Deng; Acero, A., “Context-Dependent Pre-Trained Deep Neural Networks for
Large-Vocabulary Speech Recognition,” Audio, Speech, and Language Processing, IEEE Transactions
on, vol.20, no.1, pp. 30 – 42, 2012
Definite Origin: G. Dahl, Dong Yu, Li Deng, and Alex Acero, Large Vocabulary Continuous Speech Recognition With Context-
Dependent DBN-HMMS, Proc. ICASSP, May 2011
Ozbek, I.Y.; Hasegawa-Johnson, M.; Demirekler, M., “On Improving Dynamic State Space Approaches
to Articulatory Inversion With MAP-Based Parameter Estimation,” Audio, Speech, and Language
Processing, IEEE Transactions on, vol.20, no.1, pp. 67 – 81, 2012
Likely origin:
I. Yücel Özbek, Mark Hasegawa-Johnson, Mübeccel Demirekler: Formant trajectories for acoustic-to-
articulatory inversion. INTERSPEECH 2009:2807-2810
Thomas, M.R.P.; Gudnason, J.; Naylor, P.A., “Estimation of Glottal Closing and Opening Instants in
Voiced Speech Using the YAGA Algorithm,” Audio, Speech, and Language Processing, IEEE
Transactions on, vol.20, no.1, pp. 82 – 91, 2012
Likely origin:
Mark R. P. Thomas, Jon Gudnason, Patrick A. Naylor, Bernd Geiser, Peter Vary: Voice source estimation
for artificial bandwidth extension of telephone speech. ICASSP 2010:4794-4797
Jensen, J.; Hendriks, R.C., “Spectral Magnitude Minimum Mean-Square Error Estimation Using
Binary and Continuous Gain Functions,” Audio, Speech, and Language Processing, IEEE Transactions
on, vol.20, no.1, pp. 92 – 102, 2012
Likely origin:
Jesper Jensen, Richard C. Hendriks: Spectral magnitude minimum mean-square error binary masks for
DFT based speech enhancement. ICASSP 2011:4736-4739
Hen-Geul Yeh; Rangel-Ruiz, C., “Fixed-Point Implementation of Cascaded Forward–Backward
Adaptive Predictors,” Audio, Speech, and Language Processing, IEEE Transactions on, vol.20, no.1, pp.
103 – 107, 2012
Likely origin:
Hen-Geul Yeh, Carlos Rangel Ruiz: Cascaded Forward-Backward Least Mean Square Adaptive
Predictors. ISM 2008:568-573
May, T.; van de Par, S.; Kohlrausch, A., “Noise-Robust Speaker Recognition Combining Missing Data
Techniques and Universal Background Modeling,” Audio, Speech, and Language Processing, IEEE
Transactions on, vol.20, no.1, pp. 108 – 121, 2012
Likely origin:
None
Mossavat, I.; Petkov, P.N.; Kleijn, W.B.; Amft, O., “A Hierarchical Bayesian Approach to Modeling
Heterogeneity in Speech Quality Assessment,” Audio, Speech, and Language Processing, IEEE
Transactions on, vol.20, no.1, pp. 136 – 146, 2012
Likely origin:
Petko N. Petkov, Iman S. Mossavat, W. Bastiaan Kleijn: A Bayesian approach to non-intrusive quality
assessment of speech. INTERSPEECH 2009:2875-2878
Christiansen, T.U.; Greenberg, S., “Perceptual Confusions Among Consonants, Revisited—Cross-
Spectral Integration of Phonetic-Feature Information and Consonant Recognition,” Audio, Speech,
and Language Processing, IEEE Transactions on, vol.20, no.1, pp. 147 – 161, 2012
Likely origin:
None
Berlin Chen; Shih-Hsiang Lin, “A Risk-Aware Modeling Framework for Speech Summarization,”
Audio, Speech, and Language Processing, IEEE Transactions on, vol.20, no.1, pp. 211 - 222, 2012
Likely origin:
Shih-Hsiang Lin, Berlin Chen: A Risk Minimization Framework for Extractive Speech
Summarization. ACL 2010:79-87
Hendriks, R.C.; Gerkmann, T., “Noise Correlation Matrix Estimation for Multi-Microphone Speech
Enhancement,” Audio, Speech, and Language Processing, IEEE Transactions on, vol.20, no.1, pp. 223 -
233, 2012
Likely origin:
Richard C. Hendriks, Timo Gerkmann: Estimation of the noise correlation matrix. ICASSP 2011:4740-
4743
Sicuranza, G.L.; Carini, A., “On the BIBO Stability Condition of Adaptive Recursive FLANN Filters
With Application to Nonlinear Active Noise Control,” Audio, Speech, and Language Processing, IEEE
Transactions on, vol.20, no.1, pp. 234 - 245, 2012
Likely origin:
Giovanni L. Sicuranza, Alberto Carini: Adaptive recursive FLANN filters for nonlinear active noise
control. ICASSP 2011:4312-4315
Lei Xie; Lilei Zheng; Zihan Liu; Yanning Zhang, “Laplacian Eigenmaps for Automatic Story
Segmentation of Broadcast News,” Audio, Speech, and Language Processing, IEEE Transactions on,
vol.20, no.1, pp. 276 - 289, 2012
Likely origin:
None
Shahnaz, C.; Wei-Ping Zhu; Ahmad, M.O., “Pitch Estimation Based on a Harmonic Sinusoidal
Autocorrelation Model and a Time-Domain Matching Scheme,” Audio, Speech, and Language
Processing, IEEE Transactions on, vol.20, no.1, pp. 322 - 335, 2012
Likely origin:
Celia Shahnaz, Wei-Ping Zhu, M. Omair Ahmad: A Spectral Matching Method for Pitch Estimation from
Noise-corrupted Speech. ISCAS 2009:1413-1416
Garreton, C.; Yoma, N.B., “Telephone Channel Compensation in Speaker Verification Using a
Polynomial Approximation in the Log-Filter-Bank Energy Domain,” Audio, Speech, and Language
Processing, IEEE Transactions on, vol.20, no.1, pp. 336 - 341, 2012
Likely origin:
Claudio Garretón, Néstor Becerra Yoma: On enhancing feature sequence filtering with filter-bank energy
transformation in speaker verification with telephone speech. INTERSPEECH 2010:1461-1464
2011
Oudre, L.; Fevotte, C.; Grenier, Y.; , “Probabilistic Template-Based Chord Recognition,” Audio,
Speech, and Language Processing, IEEE Transactions on , vol.19, no.8, pp.2249-2259, Nov. 2011
Likely origin: Chord recognition using measures of fit, chord templates and filtering methods
Oudre, L.; Grenier, Y.; Fevotte, C.
Applications of Signal Processing to Audio and Acoustics, 2009. WASPAA '09. IEEE Workshop on
Digital Object Identifier: 10.1109/ASPAA.2009.5346546
Publication Year: 2009 , Page(s): 9 - 12
Benesty, J.; Jingdong Chen; Yiteng Huang; , “Binaural Noise Reduction in the Time Domain With a
Stereo Setup,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.19, no.8, pp.2260-
2272, Nov. 2011
Likely origin: A minimum speech distortion multichannel algorithm for noise reduction
Benesty, J.; Jingdong Chen; Yiteng Huang
Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference on
Digital Object Identifier: 10.1109/ICASSP.2008.4517611
Publication Year: 2008 , Page(s): 321 - 324
Zengli Yang; Zheng, Y.R.; Grant, S.L.; , “Proportionate Affine Projection Sign Algorithms for
Network Echo Cancellation,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.19,
no.8, pp.2273-2284, Nov. 2011
Likely origin: Proportionate affine projection sign algorithms for sparse system identification in impulsive interference
Zengli Yang; Zheng, Y.R.; Grant, S.L.
Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on
Digital Object Identifier: 10.1109/ICASSP.2011.5947246
Publication Year: 2011 , Page(s): 4068 - 4071
Jun Du; Qiang Huo; , “A Feature Compensation Approach Using High-Order Vector Taylor Series
Approximation of an Explicit Distortion Model for Noisy Speech Recognition,” Audio, Speech, and
Language Processing, IEEE Transactions on , vol.19, no.8, pp.2285-2293, Nov. 2011
Likely origin: Evaluation of a Feature Compensation Approach Using High-Order Vector Taylor Series Approximation of
an Explicit Distortion Model on Aurora2, Aurora3, and Aurora4 Tasks
Du, J.; Qiang Huo; Yu Hu
Chinese Spoken Language Processing, 2008. ISCSLP '08. 6th International Symposium on
Digital Object Identifier: 10.1109/CHINSL.2008.ECP.32
Publication Year: 2008 , Page(s): 1 - 4
Can, D.; Saraclar, M.; , “Lattice Indexing for Spoken Term Detection,” Audio, Speech, and Language
Processing, IEEE Transactions on , vol.19, no.8, pp.2338-2347, Nov. 2011
Likely origin: Effect of pronounciations on OOV queries in spoken term detection
Can, D.; Cooper, E.; Sethy, A.; White, C.; Ramabhadran, B.; Saraclar, M.
Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. IEEE International Conference on
Digital Object Identifier: 10.1109/ICASSP.2009.4960494
Publication Year: 2009 , Page(s): 3957 - 3960
Penagarikano, M.; Varona, A.; Rodriguez-Fuentes, L.J.; Bordel, G.; , “Improved Modeling of Cross-
Decoder Phone Co-Occurrences in SVM-Based Phonotactic Language Recognition,” Audio, Speech,
and Language Processing, IEEE Transactions on , vol.19, no.8, pp.2348-2363, Nov. 2011
Likely origin: A dynamic approach to the selection of high order n-grams in phonotactic language recognition
Penagarikano, M.; Varona, A.; Rodriguez-Fuentes, L.J.; Bordel, G.
Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on
Digital Object Identifier: 10.1109/ICASSP.2011.5947332
Publication Year: 2011 , Page(s): 4412 - 4415
Burton, T.G.; Goubran, R.A.; , “A Generalized Proportionate Subband Adaptive Second-Order
Volterra Filter for Acoustic Echo Cancellation in Changing Environments,” Audio, Speech, and
Language Processing, IEEE Transactions on , vol.19, no.8, pp.2364-2373, Nov. 2011
Likely origin: Nonlinear System Identification Using a Subband Adaptive Volterra Filter
Burton, T.; Beaucoup, F.; Goubran, R.
Instrumentation and Measurement Technology Conference Proceedings, 2008. IMTC 2008. IEEE
Publication Year: 2008 , Page(s): 939 - 944
-A New Structure for Combining Echo Cancellation and Beamforming in Changing Acoustical
Environments
Burton, T.; Goubran, R.
Acoustics, Speech and Signal Processing, 2007. ICASSP 2007. IEEE International Conference on
Volume: 1, Publication Year: 2007 , Page(s): I-77 - I-80
Hioka, Y.; Niwa, K.; Sakauchi, S.; Furuya, K.; Haneda, Y.; , “Estimating Direct-to-Reverberant Energy
Ratio Using D/R Spatial Correlation Matrix Model,” Audio, Speech, and Language Processing, IEEE
Transactions on , vol.19, no.8, pp.2374-2384, Nov. 2011
Likely origin: Estimation of sound source orientation using eigenspace of spatial correlation matrix
Niwa, K.; Hioka, Y.; Sakauchi, S.; Furuya, K.; Haneda, Y.
Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
Publication Year: 2010 , Page(s): 129 – 132
-Estimating direct-to-reverberant energy ratio based on spatial correlation model segregating direct sound
and reverberation
Hioka, Y.; Niwa, K.; Sakauchi, S.; Furuya, K.; Haneda, Y.
Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
Publication Year: 2010 , Page(s): 149 - 152
Joder, C.; Essid, S.; Richard, G.; , “A Conditional Random Field Framework for Robust and Scalable
Audio-to-Score Matching,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.19,
no.8, pp.2385-2397, Nov. 2011
Likely origin: Hidden Discrete Tempo Model: A tempo-aware timing model for audio-to-score alignment
Joder, C.; Essid, S.; Richard, G.
Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on
Publication Year: 2011 , Page(s): 397 - 400
-A comparative study of tonal acoustic features for a symbolic level music-to-score alignment
Joder, C.; Essid, S.; Richard, G.
Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
Publication Year: 2010 , Page(s): 409 - 412
-Optimizing the mapping from a symbolic to an audio representation for music-to-score alignment
Joder, C.; Essid, S.; Richard, G.
Applications of Signal Processing to Audio and Acoustics (WASPAA), 2011 IEEE Workshop on
Publication Year: 2011 , Page(s): 121 - 124
Turner, R.E.; Sahani, M.; , “Demodulation as Probabilistic Inference,” Audio, Speech, and Language
Processing, IEEE Transactions on , vol.19, no.8, pp.2398-2411, Nov. 2011
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5741712&isnumber=5989936
Likely origin: Statistical inference for single- and multi-band Probabilistic Amplitude Demodulation
Turner, R.E.; Sahani, M.
Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
Publication Year: 2010 , Page(s): 5466 - 5469
Sicuranza, G.L.; Carini, A.; , “A Generalized FLANN Filter for Nonlinear Active Noise Control,”
Audio, Speech, and Language Processing, IEEE Transactions on , vol.19, no.8, pp.2412-2417, Nov. 2011
Likely origin: Adaptive recursive FLANN filters for nonlinear active noise control
Sicuranza, G.L.; Carini, A.
Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on
Publication Year: 2011 , Page(s): 4312 - 4315
Rafaely, B.; , “Bessel Nulls Recovery in Spherical Microphone Arrays for Time-Limited Signals,”
Audio, Speech, and Language Processing, IEEE Transactions on , vol.19, no.8, pp.2430-2438, Nov. 2011
Likely origin: Spherical microphone array with multiple nulls for analysis of directional room impulse responses
Rafaely, B.
Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference on
Digital Object Identifier: 10.1109/ICASSP.2008.4517601
Publication Year: 2008 , Page(s): 281 - 284
Dong Yu; Jinyu Li; Li Deng; , “Calibration of Confidence Measures in Speech Recognition,” Audio,
Speech, and Language Processing, IEEE Transactions on , vol.19, no.8, pp.2461-2473, Nov. 2011
Definite origin:
Semantic confidence calibration for spoken dialog applications
Dong Yu; Li Deng
Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
Digital Object Identifier: 10.1109/ICASSP.2010.5495607
Publication Year: 2010 , Page(s): 4450 - 4453
Cong Liu; Yu Hu; Li-Rong Dai; Hui Jiang; , “Trust Region-Based Optimization for Maximum Mutual
Information Estimation of HMMs in Speech Recognition,” Audio, Speech, and Language Processing,
IEEE Transactions on , vol.19, no.8, pp.2474-2485, Nov. 2011
Likely origin: A bounded trust region optimization for discriminative training of HMMs in speech recognition
Cong Liu; Yu Hu; Hui Jiang; Li-Rong Dai
Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
Publication Year: 2010 , Page(s): 4914 - 4917
Nunes, L.O.; Biscainho, L.W.P.; Bowon Lee; Said, A.; Kalker, T.; Schafer, R.W.; , “Degradation Type
Classifier for Full Band Speech Contaminated With Echo, Broadband Noise, and Reverberation,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.19, no.8, pp.2516-2526, Nov. 2011
Likely origin: An objective method for quality assessment of ultra-wideband speech corrupted by echo
Biscainho, L.W.P.; Esquef, P.A.A.; Freeland, F.P.; Nunes, L.O.; Tygel, A.F.; Lee, B.; Said, A.; Kalker, T.;
Schafer, R.W.
Multimedia Signal Processing, 2009. MMSP '09. IEEE International Workshop on
Digital Object Identifier: 10.1109/MMSP.2009.5293349
Parthasarathi, S.H.K.; Gatica-Perez, D.; Bourlard, H.; Magimai-Doss, M.; , “Privacy-Sensitive Audio
Features for Speech/Nonspeech Detection,” Audio, Speech, and Language Processing, IEEE
Transactions on , vol.19, no.8, pp.2538-2551, Nov. 2011
Likely origin: Evaluating the robustness of privacy-sensitive audio features for speech detection in personal audio log
scenarios
Parthasarathi, S.H.K.; Magimai-Doss, M.; Bourlard, H.; Gatica-Perez, D.
Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
Digital Object Identifier: 10.1109/ICASSP.2010.5495596
Publication Year: 2010 , Page(s): 4474 - 4477
Prasanna, S.R.M.; Pradhan, G.; , “Significance of Vowel-Like Regions for Speaker Verification Under
Degraded Conditions,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.19, no.8,
pp.2552-2565, Nov. 2011
Likely origin: Multi-variability speech database for robust speaker recognition
Haris, B.C.; Pradhan, G.; Misra, A.; Shukla, S.; Sinha, R.; Prasanna, S.R.M.
Communications (NCC), 2011 National Conference on
Publication Year: 2011 , Page(s): 1 - 5
-Significance of speaker information in wideband speech
Pradhan, G.; Mahadeva Prasanna, S.R.
Communications (NCC), 2011 National Conference on
Publication Year: 2011 , Page(s): 1 - 5
Borgstrom, B.J.; Alwan, A.; , “A Unified Framework for Designing Optimal STSA Estimators
Assuming Maximum Likelihood Phase Equivalence of Speech and Noise,” Audio, Speech, and
Language Processing, IEEE Transactions on , vol.19, no.8, pp.2579-2590, Nov. 2011
Likely origin: Log-spectral amplitude estimation with Generalized Gamma distributions for speech enhancement
Borgstrom, B.J.; Alwan, A.
Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on
Digital Object Identifier: 10.1109/ICASSP.2011.5947418
Publication Year: 2011 , Page(s): 4756 - 4759
King, B.J.; Atlas, L.; , “Single-Channel Source Separation Using Complex Matrix Factorization,”
Audio, Speech, and Language Processing, IEEE Transactions on , vol.19, no.8, pp.2591-2597, Nov. 2011
doi: 10.1109/TASL.2011.2156786
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5771055&isnumber=5989936
Likely origin: Single-channel source separation using simplified-training complex matrix factorization
King, B.; Atlas, L.
Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
Digital Object Identifier: 10.1109/ICASSP.2010.5495699
Publication Year: 2010 , Page(s): 4206 - 4209
Sheng-Yi Kong; Lin-Shan Lee; , “Semantic Analysis and Organization of Spoken Documents Based on
Parameters Derived From Latent Topics,” Audio, Speech, and Language Processing, IEEE Transactions
on , vol.19, no.7, pp.1875-1889, Sept. 2011
Likely origin: Improved Spoken Document Summarization Using Probabilistic Latent Semantic Analysis (PLSA)
Sheng-Yi Kong; Lin-shan Lee
Acoustics, Speech and Signal Processing, 2006. ICASSP 2006 Proceedings. 2006 IEEE International
Conference on
Volume: 1
Digital Object Identifier: 10.1109/ICASSP.2006.1660177
Publication Year: 2006 , Page(s): I - I
Hasan, T.; Hansen, J.H.L.; , “A Study on Universal Background Model Training in Speaker
Verification,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.19, no.7, pp.1890-
1899, Sept. 2011
Likely origin: A novel feature sub-sampling method for efficient universal background model training in speaker
verification
Hasan, T.; Yun Lei; Chandrasekaran, A.; Hansen, J.H.L.
Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
Digital Object Identifier: 10.1109/ICASSP.2010.5495601
Publication Year: 2010 , Page(s): 4494 - 4497
Madhu, N.; Martin, R.; , “A Versatile Framework for Speaker Separation Using a Model-Based
Speaker Localization Approach,” Audio, Speech, and Language Processing, IEEE Transactions on ,
vol.19, no.7, pp.1900-1912, Sept. 2011
Likely origin: Temporal smoothing of spectral masks in the cepstral domain for speech separation
Madhu, N.; Breithaupt, C.; Martin, R.
Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference on
Digital Object Identifier: 10.1109/ICASSP.2008.4517542
Publication Year: 2008 , Page(s): 45 - 48
Markaki, M.; Stylianou, Y.; , “Voice Pathology Detection and Discrimination Based on Modulation
Spectral Features,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.19, no.7,
pp.1938-1948, Sept. 2011
Likely origin: Dysphonia detection based on modulation spectral features and cepstral coefficients
Markaki, M.; Stylianou, Y.; Arias-Londoño, J.D.; Godino-Llorente, J.I.
Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
Publication Year: 2010 , Page(s): 5162 - 5165
-Modulation spectral features for objective voice quality assessment
Markaki, M.; Stylianou, Y.
Communications, Control and Signal Processing (ISCCSP), 2010 4th International Symposium on
Publication Year: 2010 , Page(s): 1 - 4
-Using modulation spectra for voice pathology detection and classification
Markaki, M.; Stylianou, Y.
Engineering in Medicine and Biology Society, 2009. EMBC 2009. Annual International Conference of the
IEEE
Publication Year: 2009 , Page(s): 2514 - 2517
Sungwoong Kim; Sungrack Yun; Yoo, C.D.; , "Large Margin Discriminative Semi-Markov Model for
Phonetic Recognition," Audio, Speech, and Language Processing, IEEE Transactions on , vol.19, no.7,
pp.1999-2012, Sept. 2011
Likely origin: Large margin training of semi-Markov model for phonetic recognition
Sungwoong Kim; Sungrack Yun; Yoo, C.D.
Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
Digital Object Identifier: 10.1109/ICASSP.2010.5495329
Publication Year: 2010 , Page(s): 1910 - 1913
Akhtar, M.T.; Mitsuhashi, W.; , "Improving Performance of Hybrid Active Noise Control Systems for
Uncorrelated Narrowband Disturbances," Audio, Speech, and Language Processing, IEEE Transactions on ,
vol.19, no.7, pp.2058-2066, Sept. 2011
doi: 10.1109/TASL.2011.2112349
Likely origin: Robust adaptive algorithm for active noise control of impulsive noise
Akhtar, M.T.; Mitsuhashi, W.
Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. IEEE International Conference on
Digital Object Identifier: 10.1109/ICASSP.2009.4959570
Publication Year: 2009 , Page(s): 261 - 264
Taal, C.H.; Hendriks, R.C.; Heusdens, R.; Jensen, J.; , "An Algorithm for Intelligibility Prediction of Time-
Frequency Weighted Noisy Speech," Audio, Speech, and Language Processing, IEEE Transactions on ,
vol.19, no.7, pp.2125-2136, Sept. 2011
Likely origin: A short-time objective intelligibility measure for time-frequency weighted noisy speech
Taal, C.H.; Hendriks, R.C.; Heusdens, R.; Jensen, J.
Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
Publication Year: 2010 , Page(s): 4214 - 4217
Nishimura, R.; Mokhtari, P.; Takemoto, H.; Kato, H.; , "An Attempt to Calibrate Headphones for
Reproduction of Sound Pressure at the Eardrum," Audio, Speech, and Language Processing, IEEE
Transactions on , vol.19, no.7, pp.2137-2145, Sept. 2011
doi: 10.1109/TASL.2011.2118203
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5720282&isnumber=5954033
Likely origin: Median Plane Mislocalization of Virtual Sound Presented through Headphones
Nishimura, R.; Kato, H.; Mokhtari, P.; Takemoto, H.
Intelligent Information Hiding and Multimedia Signal Processing (IIH-MSP), 2011 Seventh International
Conference on
Publication Year: 2011 , Page(s): 302 - 305
Pulakka, H.; Alku, P.; , "Bandwidth Extension of Telephone Speech Using a Neural Network and a Filter
Bank Implementation for Highband Mel Spectrum," Audio, Speech, and Language Processing, IEEE
Transactions on , vol.19, no.7, pp.2170-2183, Sept. 2011
Likely origin: Speech bandwidth extension using Gaussian mixture model-based estimation of the highband mel spectrum
Pulakka, H.; Rentes, U.; Palomaki, K.; Kurimo, M.; Alku, P.
Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on
Digital Object Identifier: 10.1109/ICASSP.2011.5947504
Publication Year: 2011 , Page(s): 5100 - 5103
Yi-Hsuan Yang; Chen, H.H.; , "Prediction of the Distribution of Perceived Music Emotions Using Discrete
Samples," Audio, Speech, and Language Processing, IEEE Transactions on , vol.19, no.7, pp.2184-2196,
Sept. 2011
Likely origins:
Music emotion ranking
Yi-Hsuan Yang; Chen, H.H.
Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. IEEE International Conference on
Publication Year: 2009 , Page(s): 1657 - 1660
-Exploiting genre for music emotion classification
Yu-Ching Lin; Yi-Hsuan Yang; Chen, H.H.; I-Bin Liao; Yeh-Chin Ho
Multimedia and Expo, 2009. ICME 2009. IEEE International Conference on
Digital Object Identifier: 10.1109/ICME.2009.5202572
Publication Year: 2009 , Page(s): 618 - 621
Oudre, L.; Grenier, Y.; Fevotte, C.; , "Chord Recognition by Fitting Rescaled Chroma Vectors to Chord
Templates," Audio, Speech, and Language Processing, IEEE Transactions on , vol.19, no.7, pp.2222-2233,
Sept. 2011
Likely origin: Chord recognition using measures of fit, chord templates and filtering methods
Oudre, L.; Grenier, Y.; Fevotte, C.
Applications of Signal Processing to Audio and Acoustics, 2009. WASPAA '09. IEEE Workshop on
Publication Year: 2009 , Page(s): 9 - 12
Rafaely, B.; Khaykin, D.; , "Optimal Model-Based Beamforming and Independent Steering for Spherical
Loudspeaker Arrays," Audio, Speech, and Language Processing, IEEE Transactions on , vol.19, no.7,
pp.2234-2238, Sept. 2011
Likely origin: Coherent signals direction-of-arrival estimation using a spherical microphone array: Frequency smoothing
approach
Khaykin, D.; Rafaely, B.
Applications of Signal Processing to Audio and Acoustics, 2009. WASPAA '09. IEEE Workshop on
Digital Object Identifier: 10.1109/ASPAA.2009.5346492
Publication Year: 2009 , Page(s): 221 - 224
Christensen, M.G.; Jensen, S.H.; , "New Results on Perceptual Distortion Minimization and Nonlinear
Least-Squares Frequency Estimation," Audio, Speech, and Language Processing, IEEE Transactions on ,
vol.19, no.7, pp.2239-2244, Sept. 2011
Likely origin:
A single snapshot optimal filtering method for fundamental frequency estimation
Jensen, J.R.; Christensen, M.G.; Jensen, S.H.
Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on
Digital Object Identifier: 10.1109/ICASSP.2011.5947297
Publication Year: 2011 , Page(s): 4272 - 4275
wa Maina, C.; Walsh, J.M.; , "Joint Speech Enhancement and Speaker Identification Using Approximate
Bayesian Inference," Audio, Speech, and Language Processing, IEEE Transactions on , vol.19, no.6,
pp.1517-1529, Aug. 2011
Likely origin: Compensating for noise and mismatch in speaker verification systems using approximate Bayesian
inference
wa Maina, C.; Walsh, J.M.
Information Sciences and Systems (CISS), 2011 45th Annual Conference on
Digital Object Identifier: 10.1109/CISS.2011.5766174
Publication Year: 2011 , Page(s): 1 - 6
Cecchi, S.; Romoli, L.; Peretti, P.; Piazza, F.; , "A Combined Psychoacoustic Approach for Stereo Acoustic
Echo Cancellation," Audio, Speech, and Language Processing, IEEE Transactions on , vol.19, no.6,
pp.1530-1539, Aug. 2011
Likely origin: A novel approach to channel decorrelation for stereo Acoustic Echo Cancellation based on missing
fundamental theory
Romoli, L.; Cecchi, S.; Palestini, L.; Peretti, P.; Piazza, F.
Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
Digital Object Identifier: 10.1109/ICASSP.2010.5495884
Publication Year: 2010 , Page(s): 329 - 332
Tran, H.D.; Haizhou Li; , "Sound Event Recognition With Probabilistic Distance SVMs," Audio, Speech,
and Language Processing, IEEE Transactions on , vol.19, no.6, pp.1556-1568, Aug. 2011
Likely origin: Sound event classification based on Feature Integration, Recursive Feature Elimination and Structured
Classification
Tran, H.D.; Haizhou Li
Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. IEEE International Conference on
Digital Object Identifier: 10.1109/ICASSP.2009.4959549
Publication Year: 2009 , Page(s): 177 - 180
Talmon, R.; Cohen, I.; Gannot, S.; , "Transient Noise Reduction Using Nonlocal Diffusion Filters," Audio,
Speech, and Language Processing, IEEE Transactions on , vol.19, no.6, pp.1584-1599, Aug. 2011
Likely origin: Speech enhancement in transient noise environment using diffusion filtering
Talmon, R.; Cohen, I.; Gannot, S.
Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
Publication Year: 2010 , Page(s): 4782 - 4785
Ke Hu; DeLiang Wang; , "Unvoiced Speech Segregation From Nonspeech Interference via CASA and
Spectral Subtraction," Audio, Speech, and Language Processing, IEEE Transactions on , vol.19, no.6,
pp.1600-1609, Aug. 2011
Likely origin: Incorporating spectral subtraction and noise type for unvoiced speech segregation
Ke Hu; DeLiang Wang
Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. IEEE International Conference on
Publication Year: 2009 , Page(s): 4425 - 4428
Almajai, I.; Milner, B.; , "Visually Derived Wiener Filters for Speech Enhancement," Audio, Speech, and
Language Processing, IEEE Transactions on , vol.19, no.6, pp.1642-1651, Aug. 2011
Likely origin: Visually-Derived Wiener Filters for Speech Enhancement
Almajai, I.; Milner, B.; Darch, J.; Vaseghi, S.
Acoustics, Speech and Signal Processing, 2007. ICASSP 2007. IEEE International Conference on
Volume: 4
Publication Year: 2007 , Page(s): IV-585 - IV-588
Haitian Xu; Gales, M.J.F.; Chin, K.K.; , "Joint Uncertainty Decoding With Predictive Methods for Noise
Robust Speech Recognition," Audio, Speech, and Language Processing, IEEE Transactions on , vol.19,
no.6, pp.1665-1676, Aug. 2011
Likely origin: Rapid joint speaker and noise compensation for robust speech recognition
Chin, K.K.; Haitian Xu; Gales, M.J.F.; Breslin, C.; Knill, K.
Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on
Publication Year: 2011 , Page(s): 5500 - 5503
Wallace, R.; Baker, B.; Vogt, R.; Sridharan, S.; , "Discriminative Optimization of the Figure of Merit for
Phonetic Spoken Term Detection," Audio, Speech, and Language Processing, IEEE Transactions on ,
vol.19, no.6, pp.1677-1687, Aug. 2011
Likely origin: Optimising Figure of Merit for phonetic spoken term detection
Wallace, R.; Vogt, R.; Baker, B.; Sridharan, S.
Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
Publication Year: 2010 , Page(s): 5298 - 5301
Grosche, P.; Muller, M.; , "Extracting Predominant Local Pulse Information From Music Recordings,"
Audio, Speech, and Language Processing, IEEE Transactions on , vol.19, no.6, pp.1688-1701, Aug. 2011
Likely origin: Computing predominant local periodicity information in music recordings
Grosche, P.; Muller, M.
Applications of Signal Processing to Audio and Acoustics, 2009. WASPAA '09. IEEE Workshop on
Publication Year: 2009 , Page(s): 33 - 36
Wu, Y.J.; Abhayapala, T.D.; , "Spatial Multizone Soundfield Reproduction: Theory and Design," Audio,
Speech, and Language Processing, IEEE Transactions on , vol.19, no.6, pp.1711-1720, Aug. 2011
Likely origin: Spatial multizone soundfield reproduction
Wu, Y.J.; Abhayapala, T.D.
Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. IEEE International Conference on
Publication Year: 2009 , Page(s): 93 - 96
Parvaix, M.; Girin, L.; , "Informed Source Separation of Linear Instantaneous Under-Determined Audio
Mixtures by Source Index Embedding," Audio, Speech, and Language Processing, IEEE Transactions on ,
vol.19, no.6, pp.1721-1733, Aug. 2011
Likely origin: Informed source separation of underdetermined instantaneous stereo mixtures using source index
embedding
Parvaix, M.; Girin, L.
Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
Publication Year: 2010 , Page(s): 245 - 248
Bekrani, M.; Khong, A.W.H.; Lotfizad, M.; , "A Linear Neural Network-Based Approach to Stereophonic
Acoustic Echo Cancellation," Audio, Speech, and Language Processing, IEEE Transactions on , vol.19,
no.6, pp.1743-1753, Aug. 2011
Likely origin: Neural network based adaptive echo cancellation for stereophonic teleconferencing application
Bekrani, M.; Khong, A.W.H.; Lotfizad, M.
Multimedia and Expo (ICME), 2010 IEEE International Conference on
Digital Object Identifier: 10.1109/ICME.2010.5583025
Publication Year: 2010 , Page(s): 1172 - 1177
Peeters, G.; Papadopoulos, H.; , "Simultaneous Beat and Downbeat-Tracking Using a Probabilistic
Framework: Theory and Large-Scale Evaluation," Audio, Speech, and Language Processing, IEEE
Transactions on , vol.19, no.6, pp.1754-1769, Aug. 2011
Likely origin: Simultaneous estimation of chord progression and downbeats from an audio file
Papadopoulos, H.; Peeters, G.
Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference on
Digital Object Identifier: 10.1109/ICASSP.2008.4517561
Publication Year: 2008 , Page(s): 121 - 124
Ki-Seung Lee; Seok-Pil Lee; , "A Relevant Distance Criterion for Interpolation of Head-Related Transfer
Functions," Audio, Speech, and Language Processing, IEEE Transactions on , vol.19, no.6, pp.1780-1790,
Aug. 2011
Likely origin: A novel adaptive stereo sound system with self-generating sound-based listener tracking
Seungsoo Yoo; Yeongmoon Kim; Ki-Seung Lee; Kyoungro Yoon; Sun Yong Kim; Seok-Pil Lee
Consumer Electronics (ISCE), 2010 IEEE 14th International Symposium on
Digital Object Identifier: 10.1109/ISCE.2010.5523690
Publication Year: 2010 , Page(s): 1 - 5
Qi Li; Yan Huang; , "An Auditory-Based Feature Extraction Algorithm for Robust Speaker Identification
Under Mismatched Conditions," Audio, Speech, and Language Processing, IEEE Transactions on , vol.19,
no.6, pp.1791-1801, Aug. 2011
Likely origin: Robust speaker identification using an auditory-based feature
Qi Li; Yan Huang
Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
Digital Object Identifier: 10.1109/ICASSP.2010.5495589
Publication Year: 2010 , Page(s): 4514 - 4517
Chatterjee, S.; Kleijn, W.B.; , "Auditory Model-Based Design and Optimization of Feature Vectors for
Automatic Speech Recognition," Audio, Speech, and Language Processing, IEEE Transactions on , vol.19,
no.6, pp.1813-1825, Aug. 2011
Likely origin: Selecting static and dynamic features using an advanced auditory model for speech recognition
Koniaris, C.; Chatterjee, S.; Kleijn, W.B.
Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
Publication Year: 2010 , Page(s): 4342 - 4345
Bekrani, M.; Khong, A.W.H.; Lotfizad, M.; , "A Clipping-Based Selective-Tap Adaptive Filtering
Approach to Stereophonic Acoustic Echo Cancellation," Audio, Speech, and Language Processing, IEEE
Transactions on , vol.19, no.6, pp.1826-1836, Aug. 2011
Likely origin: Neural network based adaptive echo cancellation for stereophonic teleconferencing application
Bekrani, M.; Khong, A.W.H.; Lotfizad, M.
Multimedia and Expo (ICME), 2010 IEEE International Conference on
Digital Object Identifier: 10.1109/ICME.2010.5583025
Publication Year: 2010 , Page(s): 1172 - 1177
Lachambre, H.; Andre-Obrecht, R.; Pinquier, J.; , "Distinguishing Monophonies From Polyphonies Using
Weibull Bivariate Distributions," Audio, Speech, and Language Processing, IEEE Transactions on , vol.19,
no.6, pp.1837-1842, Aug. 2011
Likely origin: Monophony vs Polyphony: A New Method Based on Weibull Bivariate Models
Lachambre, H.; Andre-Obrecht, R.; Pinquier, J.
Content-Based Multimedia Indexing, 2009. CBMI '09. Seventh International Workshop on
Digital Object Identifier: 10.1109/CBMI.2009.24
Publication Year: 2009 , Page(s): 68 - 72
“A Framework for Automatic Human Emotion
Classification Using Emotion Profiles,” Audio, Speech, and Language Processing, IEEE Transactions
on , vol.19, no.5, pp.1057-1070, July 2011
Likely origin:
Emily Mower, Kyu Jeong Han, Sungbok Lee, Shrikanth S. Narayanan: A cluster-profile representation of
emotion using agglomerative hierarchical clustering. INTERSPEECH 2010:797-800
Kai Yu; Young, S.; , “Continuous F0 Modeling for HMM Based Statistical Parametric Speech
Synthesis,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.19, no.5, pp.1071-1079,
July 2011
Likely origin:
Kai Yu, François Mairesse, Steve Young: Word-level emphasis modelling in HMM-based speech
synthesis. ICASSP 2010:4238-4241
Degottex, G.; Roebel, A.; Rodet, X.; , “Phase Minimization for Glottal Model Estimation,” Audio,
Speech, and Language Processing, IEEE Transactions on , vol.19, no.5, pp.1080-1090, July 2011
Likely origin:
Gilles Degottex, Axel Röbel, Xavier Rodet: Joint estimate of shape and time-synchronization of a glottal
source model by phase flatness. ICASSP 2010:5058-5061
Zhaozhang Jin; DeLiang Wang; , “HMM-Based Multipitch Tracking for Noisy and Reverberant
Speech,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.19, no.5, pp.1091-1102,
July 2011
Likely origin:
Zhaozhang Jin, DeLiang Wang: Learning to maximize signal-to-noise ratio for reverberant speech
segregation. ICASSP 2009:4689-4692
Schluter, R.; Nussbaum-Thom, M.; Ney, H.; , “On the Relationship Between Bayes Risk and Word
Error Rate in ASR,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.19, no.5,
pp.1103-1112, July 2011
Likely origin:
Ralf Schlüter, Markus Nußbaum-Thom, Hermann Ney: On the relation of Bayes risk, word error, and word
posteriors in ASR. INTERSPEECH 2010:230-233
Bellegarda, J.R.; , “A Data-Driven Affective Analysis Framework Toward Naturally Expressive
Speech Synthesis,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.19, no.5,
pp.1113-1122, July 2011
Likely origin:
None.
Yang Lu; Loizou, P.C.; , “Estimators of the Magnitude-Squared Spectrum and Methods for
Incorporating SNR Uncertainty,” Audio, Speech, and Language Processing, IEEE Transactions on ,
vol.19, no.5, pp.1123-1137, July 2011
Likely origin:
Yang Lu, Philipos C. Loizou: Speech enhancement by combining statistical estimators of speech and
noise. ICASSP 2010:4754-4757
Heigold, G.; Ney, H.; Lehnen, P.; Gass, T.; Schluter, R.; , “Equivalence of Generative and Log-Linear
Models,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.19, no.5, pp.1138-1148,
July 2011
Likely origin:
Georg Heigold, Simon Wiesler, Markus Nußbaum-Thom, Patrick Lehnen, Ralf Schlüter, Hermann Ney:
Discriminative HMMs, log-linear models, and CRFs: What is the difference? ICASSP 2010:5546-5549
Shasha Xie; Yang Liu; , “Using N-Best Lists and Confusion Networks for Meeting
Summarization,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.19, no.5,
pp.1160-1169, July 2011
Likely origin:
Yang Liu, Shasha Xie, Fei Liu: Using n-best recognition output for extractive summarization and keyword
extraction in meeting speech. ICASSP 2010:5310-5313
Shasha Xie, Yang Liu: Using Confusion Networks for Speech Summarization. HLT-NAACL 2010:46-54
Charoenruengkit, W.; Erdol, N.; , “The Effect of Spectral Estimation on Speech Enhancement
Performance,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.19, no.5, pp.1170-
1179, July 2011
Likely origin:
None
Ozbek, I.Y.; Hasegawa-Johnson, M.; Demirekler, M.; , “Estimation of Articulatory Trajectories Based
on Gaussian Mixture Model (GMM) With Audio-Visual Information Fusion and Dynamic Kalman
Smoothing,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.19, no.5, pp.1180-
1195, July 2011
Likely origin:
I. Yücel Özbek, Mark Hasegawa-Johnson, Mübeccel Demirekler: Formant trajectories for acoustic-to-
articulatory inversion. INTERSPEECH 2009:2807-2810
“Efficient MMSE Estimation and
Uncertainty Processing for Multienvironment Robust Speech Recognition,” Audio, Speech, and
Language Processing, IEEE Transactions on , vol.19, no.5, pp.1206-1220, July 2011
Likely origin:
None
Chi-Yueh Lin; Hsiao-Chuan Wang; , “Burst Onset Landmark Detection and Its Application to Speech
Recognition,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.19, no.5, pp.1253-
1264, July 2011
Likely origin:
Chi-Yueh Lin, Hsiao-Chuan Wang: Using burst onset information to improve stop/affricate phone
recognition. ICASSP 2010:4862-4865
Mowlaee, P.; Christensen, M.G.; Jensen, S.H.; , “New Results on Single-Channel Speech Separation
Using Sinusoidal Modeling,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.19,
no.5, pp.1265-1277, July 2011
Likely origin:
Pejman Mowlaee, Rahim Saeidi, Zheng-Hua Tan, Mads Græsbøll Christensen, Tomi Kinnunen, Pasi
Fränti, Søren Holdt Jensen: Sinusoidal Approach for the Single-Channel Speech Separation and
Recognition Challenge. INTERSPEECH 2011:677-680
Tiomkin, S.; Malah, D.; Shechtman, S.; Kons, Z.; , “A Hybrid Text-to-Speech System That Combines
Concatenative and Statistical Synthesis Units,” Audio, Speech, and Language Processing, IEEE
Transactions on , vol.19, no.5, pp.1278-1288, July 2011
Likely origin:
Stas Tiomkin, David Malah: Statistical text-to-speech synthesis with improved dynamics. INTERSPEECH
2008:1841-1844
Han-Ping Shen; Jui-Feng Yeh; Chung-Hsien Wu; , “Speaker Clustering Using Decision Tree-Based
Phone Cluster Models With Multi-Space Probability Distributions,” Audio, Speech, and Language
Processing, IEEE Transactions on , vol.19, no.5, pp.1289-1300, July 2011
Likely origin:
None
Etame, T.; Le Bouquin Jeannes, R.; Quinquis, C.; Gros, L.; Faucon, G.; , “Towards a New Reference
Impairment System in the Subjective Evaluation of Speech Codecs,” Audio, Speech, and Language
Processing, IEEE Transactions on , vol.19, no.5, pp.1301-1315, July 2011
Likely origin:
None
Chengyuan Ma; Chin-Hui Lee; , “A Regularized Maximum Figure-of-Merit (rMFoM) Approach to
Supervised and Semi-Supervised Learning,” Audio, Speech, and Language Processing, IEEE
Transactions on , vol.19, no.5, pp.1316-1327, July 2011
Likely origin:
None
Ringeval, F.; Demouy, J.; Szaszak, G.; Chetouani, M.; Robel, L.; Xavier, J.; Cohen, D.; Plaza, M.; ,
“Automatic Intonation Recognition for the Prosodic Assessment of Language-Impaired
Children,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.19, no.5, pp.1328-1342,
July 2011
Likely origin:
None
Lehr, M.; Shafran, I.; , “Learning a Discriminative Weighted Finite-State Transducer for Speech
Recognition,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.19, no.5, pp.1360-
1367, July 2011
Likely origin:
Maider Lehr, Izhak Shafran: Discriminatively estimated joint acoustic, duration, and language model for
speech recognition. ICASSP 2010:5542-5545
Cornelis, B.; Moonen, M.; Wouters, J.; , “Performance Analysis of Multichannel Wiener Filter-Based
Noise Reduction in Hearing Aids Under Second Order Statistics Estimation Errors,” Audio, Speech,
and Language Processing, IEEE Transactions on , vol.19, no.5, pp.1368-1381, July 2011
Likely origin:
Bram Cornelis, Marc Moonen, Jan Wouters: Comparison of frequency domain noise reduction strategies
based on multichannel Wiener filtering and spatial prediction. ICASSP 2009:129-132
Yousafzai, J.; Sollich, P.; Cvetkovic, Z.; Bin Yu; , “Combined Features and Kernel Design for Noise
Robust Phoneme Classification Using Support Vector Machines,” Audio, Speech, and Language
Processing, IEEE Transactions on , vol.19, no.5, pp.1396-1407, July 2011
Likely origin:
Jibran Yousafzai, Zoran Cvetkovic, Peter Sollich: Towards robust phoneme classification with hybrid
features. ISIT 2010:1643-1647
Jibran Yousafzai, Zoran Cvetkovic, Peter Sollich: Tuning support vector machines for robust phoneme
classification with acoustic waveforms. INTERSPEECH 2009:2391-2394
Xing Fan; Hansen, J.H.L.; , “Speaker Identification Within Whispered Speech Audio Streams,” Audio,
Speech, and Language Processing, IEEE Transactions on, vol.19, no.5, pp.1408-1421, July 2011
Likely origin:
Xing Fan, Keith W. Godin, John H. L. Hansen: Acoustic Analysis of Whispered Speech for Phoneme and
Speaker Dependency. INTERSPEECH 2011:181-184
Birkholz, P.; Kroger, B.J.; Neuschaefer-Rube, C.; , “Model-Based Reproduction of Articulatory
Trajectories for Consonant–Vowel Sequences,” Audio, Speech, and Language Processing, IEEE
Transactions on , vol.19, no.5, pp.1422-1433, July 2011
Likely origin:
Xing Fan, John H. L. Hansen: Speaker Identification for Whispered Speech Using a Training Feature
Transformation from Neutral to Whisper. INTERSPEECH 2011:2425-2428
Wooil Kim; Hansen, J.H.L.; , “A Novel Mask Estimation Method Employing Posterior-Based
Representative Mean Estimate for Missing-Feature Speech Recognition,” Audio, Speech, and
Language Processing, IEEE Transactions on , vol.19, no.5, pp.1434-1443, July 2011
Likely origin:
Wooil Kim, John H. L. Hansen: Feature Compensation for Speech Recognition in Severely Adverse
Environments Due to Background Noise and Channel Distortion. INTERSPEECH 2011:1653-1656
Prahallad, K.; Black, A.W.; , “Segmentation of Monologues in Audio Books for Building Synthetic
Voices,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.19, no.5, pp.1444-1449,
July 2011
Likely origin:
None
Guilin Ma; Gran, F.; Jacobsen, F.; Agerkvist, F.T.; , “Adaptive Feedback Cancellation With Band-
Limited LPC Vocoder in Digital Hearing Aids,” Audio, Speech, and Language Processing, IEEE
Transactions on , vol.19, no.4, pp.677-687, May 2011
Likely origin:
Guilin Ma, Fredrik Gran, Finn Jacobsen, Finn T. Agerkvist: A new approach for modelling the dynamic
feedback path of digital hearing aids. ICASSP 2009:209-212
Dong Wang; King, S.; Frankel, J.; , “Stochastic Pronunciation Modeling for Out-of-Vocabulary
Spoken Term Detection,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.19, no.4,
pp.688-698, May 2011
Likely origin:
Dong Wang, Simon King, Joe Frankel, Peter Bell: Stochastic pronunciation modelling and soft match for
out-of-vocabulary spoken term detection. ICASSP 2010:5294-5297
Pandey, A.; Mathews, V.J.; , “Low-Delay Signal Processing for Digital Hearing Aids,” Audio, Speech,
and Language Processing, IEEE Transactions on , vol.19, no.4, pp.699-710, May 2011
Likely origin:
Ashutosh Pandey, V. John Mathews: Improving adaptive feedback cancellation in digital hearing aids
through offending frequency suppression. ICASSP 2010:173-176
Creusere, C.D.; Hardin, J.C.; , “Assessing the Quality of Audio Containing Temporally Varying
Distortions,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.19, no.4, pp.711-720,
May 2011
Likely origin:
Srivatsan Kandadai, Joseph C. Hardin, Charles D. Creusere: Audio quality assessment using the mean
structural similarity measure. ICASSP 2008:221-224
Matusov, E.; Ney, H.; , “Lattice-Based ASR-MT Interface for Speech Translation,” Audio, Speech, and
Language Processing, IEEE Transactions on , vol.19, no.4, pp.721-732, May 2011
Likely origin:
Evgeny Matusov, Björn Hoffmeister, Hermann Ney: Spoken language translation systems: ASR word
lattice translation with exhaustive reordering is possible. INTERSPEECH 2008:2342-2345
van Dalen, R.C.; Gales, M.J.F.; , “Extended VTS for Noise-Robust Speech Recognition,” Audio, Speech,
and Language Processing, IEEE Transactions on , vol.19, no.4, pp.733-743, May 2011
Likely origin:
Rogier C. van Dalen, Mark J. F. Gales: Extended VTS for noise-robust speech recognition. ICASSP
2009:3829-3832
Barra-Chicote, R.; Pardo, J.M.; Ferreiros, J.; Montero, J.M.; , “Speaker Diarization Based on Intensity
Channel Contribution,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.19, no.4,
pp.754-761, May 2011
Likely origin:
None
Haque, M.A.; Islam, T.; Hasan, M.K.; , “Robust Speech Dereverberation Based on Blind Adaptive
Estimation of Acoustic Channels,” Audio, Speech, and Language Processing, IEEE Transactions on ,
vol.19, no.4, pp.775-787, May 2011
Likely origin:
None
Dehak, N.; Kenny, P.J.; Dehak, R.; Dumouchel, P.; Ouellet, P.; , “Front-End Factor Analysis for
Speaker Verification,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.19, no.4,
pp.788-798, May 2011
Likely origin:
Najim Dehak, Réda Dehak, Patrick Kenny, Niko Brümmer, Pierre Ouellet, Pierre Dumouchel: Support
vector machines versus fast scoring in the low-dimensional total variability space for speaker
verification. INTERSPEECH 2009:1559-1562
Wohlmayr, M.; Stark, M.; Pernkopf, F.; , “A Probabilistic Interaction Model for Multipitch Tracking
With Factorial Hidden Markov Models,” Audio, Speech, and Language Processing, IEEE Transactions
on , vol.19, no.4, pp.799-810, May 2011
Likely origin:
Michael Wohlmayr, Michael Stark, Franz Pernkopf: A mixture maximization approach to multipitch
tracking with factorial hidden Markov models. ICASSP 2010:5070-5073
Groot, P.C.; Heskes, T.; Dijkstra, T.M.H.; Kates, J.M.; , “Predicting Preference Judgments of Individual
Normal and Hearing-Impaired Listeners With Gaussian Processes,” Audio, Speech, and Language
Processing, IEEE Transactions on , vol.19, no.4, pp.811-821, May 2011
Likely origin:
Adriana Birlutiu, Perry Groot, Tom Heskes: Multi-task Preference learning with Gaussian
Processes. ESANN 2009
Ji Ming; Srinivasan, R.; Crookes, D.; , “A Corpus-Based Approach to Speech Enhancement From
Nonstationary Noise,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.19, no.4,
pp.822-836, May 2011
Likely origin:
Ji Ming, Ramji Srinivasan, Danny Crookes: A corpus-based approach to speech enhancement from
nonstationary noise. INTERSPEECH 2010:1097-1100
Hung, H.; Yan Huang; Friedland, G.; Gatica-Perez, D.; , “Estimating Dominance in Multi-Party
Meetings Using Speaker Diarization,” Audio, Speech, and Language Processing, IEEE Transactions on ,
vol.19, no.4, pp.847-860, May 2011
Likely origin:
Hayley Hung, Yan Huang, Gerald Friedland, Daniel Gatica-Perez: Estimating the dominant person in
multi-party conversations using speaker diarization strategies. ICASSP 2008:2197-2200
Kong Aik Lee; Chang Huai You; Haizhou Li; Kinnunen, T.; Khe Chai Sim; , “Using Discrete
Probabilities With Bhattacharyya Measure for SVM-Based Speaker Verification,” Audio, Speech, and
Language Processing, IEEE Transactions on , vol.19, no.4, pp.861-870, May 2011
Likely origin:
Chang Huai You, Kong-Aik Lee, Haizhou Li: A GMM supervector Kernel with the Bhattacharyya distance
for SVM based speaker recognition. ICASSP 2009:4221-4224
Shih-Hsiang Lin; Yao-Ming Yeh; Berlin Chen; , “Leveraging Kullback–Leibler Divergence Measures
and Information-Rich Cues for Speech Summarization,” Audio, Speech, and Language Processing,
IEEE Transactions on , vol.19, no.4, pp.871-882, May 2011
Likely origin:
Shih-Hsiang Lin, Berlin Chen: Improved speech summarization with multiple-hypothesis representations
and Kullback-Leibler divergence measures. INTERSPEECH 2009:1847-1850
Chi Zhang; Hansen, J.H.L.; , “Whisper-Island Detection Based on Unsupervised Segmentation With
Entropy-Based Speech Feature Processing,” Audio, Speech, and Language Processing, IEEE
Transactions on , vol.19, no.4, pp.883-894, May 2011
Likely origin:
Chi Zhang, John H. L. Hansen: Advancements in whisper-island detection using the linear predictive
residual. ICASSP 2010:5170-5173
Gibson, M.; Byrne, W.; , “Unsupervised Intralingual and Cross-Lingual Speaker Adaptation for
HMM-Based Speech Synthesis Using Two-Pass Decision Tree Construction,” Audio, Speech, and
Language Processing, IEEE Transactions on , vol.19, no.4, pp.895-904, May 2011
Likely origin:
Matthew Gibson, Teemu Hirsimäki, Reima Karhila, Mikko Kurimo, William Byrne: Unsupervised cross-
lingual speaker adaptation for HMM-based speech synthesis using two-pass decision tree
construction. ICASSP 2010:4642-4645
Min-Seok Choi; Hong-Goo Kang; , “A Two-Channel Noise Estimator for Speech Enhancement in a
Highly Nonstationary Environment,” Audio, Speech, and Language Processing, IEEE Transactions on ,
vol.19, no.4, pp.905-915, May 2011
Likely origin:
Ho Seon Shin, Min-Seok Choi, Taesu Kim, Hong-Goo Kang: Binaural loudness based speech
reinforcement with a closed-form solution. ICASSP 2010:4274-4277
Mousazadeh, S.; Cohen, I.; , “AR-GARCH in Presence of Noise: Parameter Estimation and Its
Application to Voice Activity Detection,” Audio, Speech, and Language Processing, IEEE Transactions
on , vol.19, no.4, pp.916-926, May 2011
Likely origin:
None
Qiang Wu; Liqing Zhang; Guangchuan Shi; , “Robust Multifactor Speech Feature Extraction Based on
Gabor Analysis,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.19, no.4, pp.927-
936, May 2011
Likely origin:
Qiang Wu, Liqing Zhang, Guangchuan Shi: Robust speech feature extraction based on Gabor filtering and
tensor factorization. ICASSP 2009:4649-4652
Rudzicz, F.; , “Articulatory Knowledge in the Recognition of Dysarthric Speech,” Audio, Speech, and
Language Processing, IEEE Transactions on , vol.19, no.4, pp.947-960, May 2011
Likely origin:
Frank Rudzicz: Correcting Errors in Speech Recognition with Articulatory Dynamics. ACL 2010:60-68
Bin Gao; Woo, W.L.; Dlay, S.S.; , “Single-Channel Source Separation Using EMD-Subband Variable
Regularized Sparse Features,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.19,
no.4, pp.961-976, May 2011
Likely origin:
Abd Majid Darsono, Bin Gao, Wai Lok Woo, Satnam Singh Dlay: Nonlinear single channel source
separation. CSNDSP 2010:507-511
Rudoy, D.; Quatieri, T.F.; Wolfe, P.J.; , “Time-Varying Autoregressions in Speech: Detection Theory
and Applications,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.19, no.4,
pp.977-989, May 2011
Likely origin:
Daniel Rudoy, Thomas F. Quatieri, Patrick J. Wolfe: Time-varying autoregressive tests for multiscale
speech analysis. INTERSPEECH 2009:2839-2842
Black, M.P.; Tepperman, J.; Narayanan, S.S.; , “Automatic Prediction of Children's Reading Ability for
High-Level Literacy Assessment,” Audio, Speech, and Language Processing, IEEE Transactions on ,
vol.19, no.4, pp.1015-1028, May 2011
Likely origin:
Matthew Black, Joseph Tepperman, Abe Kazemzadeh, Sungbok Lee, Shrikanth Narayanan: Automatic
pronunciation verification of English letter-names for early literacy assessment of preliterate
children. ICASSP 2009:4861-4864
Dongho Kim; Kim, J.H.; Kee-Eung Kim; , “Robust Performance Evaluation of POMDP-Based
Dialogue Systems,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.19, no.4,
pp.1029-1040, May 2011
Likely origin:
None
Dmour, M.A.; Davies, M.; , “A New Framework for Underdetermined Speech Extraction Using
Mixture of Beamformers,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.19,
no.3, pp.445-457, March 2011
Likely origin:
None
Garcia-Moral, A.I.; Solera-Urena, R.; Pelaez-Moreno, C.; Diaz-de-Maria, F.; , “Data Balancing for
Efficient Training of Hybrid ANN/HMM Automatic Speech Recognition Systems,” Audio, Speech,
and Language Processing, IEEE Transactions on , vol.19, no.3, pp.468-481, March 2011
Likely origin:
None
Jen-Tzung Chien; Chuang-Hua Chueh; , “Dirichlet Class Language Models for Speech
Recognition,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.19, no.3, pp.482-495,
March 2011
Likely origin:
Ying-Lang Chang, Jen-Tzung Chien: Latent Dirichlet learning for document summarization. ICASSP
2009:1689-1692
Feipeng Li; Allen, J.B.; , “Manipulation of Consonants in Natural Speech,” Audio, Speech, and
Language Processing, IEEE Transactions on , vol.19, no.3, pp.496-504, March 2011
Likely origin:
None
Donglai Zhu; Bin Ma; Haizhou Li; , “Speaker Verification With Feature-Space MAPLR
Parameters,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.19, no.3, pp.505-515,
March 2011
Likely origin:
Donglai Zhu, Bin Ma, Kong-Aik Lee, Cheung-Chi Leung, Haizhou Li: MAP estimation of subspace
transform for speaker recognition. INTERSPEECH 2010:1465-1468
Fei Liu; Feifan Liu; Yang Liu; , “A Supervised Framework for Keyword Extraction From Meeting
Transcripts,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.19, no.3, pp.538-548,
March 2011
Likely origin:
None
Lin Wang; Heping Ding; Fuliang Yin; , “A Region-Growing Permutation Alignment Approach in
Frequency-Domain Blind Source Separation of Speech Mixtures,” Audio, Speech, and Language
Processing, IEEE Transactions on , vol.19, no.3, pp.549-557, March 2011
Likely origin:
None
Ekman, L.A.; Grancharov, V.; Kleijn, W.B.; , “Double-Ended Quality Assessment System for Super-
Wideband Speech,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.19, no.3,
pp.558-569, March 2011
Likely origin:
None
Jia Jia; Shen Zhang; Fanbo Meng; Yongxin Wang; Lianhong Cai; , “Emotional Audio-Visual Speech
Synthesis Based on PAD,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.19, no.3,
pp.570-582, March 2011
Likely origin:
Shen Zhang, Jia Jia, Yingjin Xu, Lianhong Cai: Emotional talking agent: System and evaluation. ICNC
2010:3573-3577
Nesta, F.; Wada, T.S.; Biing-Hwang Juang; , “Batch-Online Semi-Blind Source Separation Applied to
Multi-Channel Acoustic Echo Cancellation,” Audio, Speech, and Language Processing, IEEE
Transactions on , vol.19, no.3, pp.583-599, March 2011
Likely origin:
Francesco Nesta, Ted S. Wada, Shigeki Miyabe, Biing-Hwang Juang: On the non-uniqueness problem and
the semi-blind source separation. WASPAA 2009:101-104
Ghosh, P.K.; Tsiartas, A.; Narayanan, S.; , “Robust Voice Activity Detection Using Long-Term Signal
Variability,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.19, no.3, pp.600-613,
March 2011
Likely origin:
Andreas Tsiartas, Prasanta Kumar Ghosh, Panayiotis G. Georgiou, Shrikanth S. Narayanan: Robust word
boundary detection in spontaneous speech using acoustic and lexical cues. ICASSP 2009:4785-4788
Sheng Wu; Xiaojun Qiu; Ming Wu; , “Stereo Acoustic Echo Cancellation Employing Frequency-
Domain Preprocessing and Adaptive Filter,” Audio, Speech, and Language Processing, IEEE
Transactions on , vol.19, no.3, pp.614-623, March 2011
Likely origin:
None
Morales-Cordovilla, J.A.; Peinado, A.M.; Sanchez, V.; Gonzalez, J.A.; , “Feature Extraction Based on
Pitch-Synchronous Averaging for Robust Speech Recognition,” Audio, Speech, and Language
Processing, IEEE Transactions on , vol.19, no.3, pp.640-651, March 2011
Likely origin:
Juan Andres Morales-Cordovilla, Ning Ma, Victoria E. Sánchez, José L. Carmona, Antonio M. Peinado, Jon
Barker: A pitch based noise estimation technique for robust speech recognition with Missing Data. ICASSP
2011:4808-4811
Ferrer, M.; Gonzalez, A.; de Diego, M.; Pinero, G.; , “Transient Analysis of the Conventional Filtered-x
Affine Projection Algorithm for Active Noise Control,” Audio, Speech, and Language Processing, IEEE
Transactions on , vol.19, no.3, pp.652-657, March 2011
Likely origin:
None
Pinto, J.; Garimella, S.; Magimai-Doss, M.; Hermansky, H.; Bourlard, H.; , “Analysis of MLP-Based
Hierarchical Phoneme Posterior Probability Estimator,” Audio, Speech, and Language Processing,
IEEE Transactions on , vol.19, no.2, pp.225-241, Feb. 2011
Likely origin:
Joel Pinto, Garimella S. V. S. Sivaram, Hynek Hermansky, Mathew Magimai-Doss: Volterra series for
analyzing MLP based phoneme posterior estimator. ICASSP 2009:1813-1816
Stark, M.; Wohlmayr, M.; Pernkopf, F.; , “Source–Filter-Based Single-Channel Speech Separation
Using Pitch Information,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.19, no.2,
pp.242-255, Feb. 2011
Likely origin:
Michael Stark, Michael Wohlmayr, Franz Pernkopf: Single Channel Speech Separation Using Source-Filter
Representation. ICPR 2010:826-829
Wei-Qiang Zhang; Liang He; Yan Deng; Jia Liu; Johnson, M.T.; , “Time–Frequency Cepstral Features
and Heteroscedastic Linear Discriminant Analysis for Language Recognition,” Audio, Speech, and
Language Processing, IEEE Transactions on , vol.19, no.2, pp.266-276, Feb. 2011
Likely origin:
None
Breithaupt, C.; Martin, R.; , “Analysis of the Decision-Directed SNR Estimator for Speech
Enhancement With Respect to Low-SNR and Transient Conditions,” Audio, Speech, and Language
Processing, IEEE Transactions on , vol.19, no.2, pp.277-289, Feb. 2011
Likely origin:
None
Pantazis, Y.; Rosec, O.; Stylianou, Y.; , “Adaptive AM–FM Signal Decomposition With Application to
Speech Analysis,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.19, no.2, pp.290-
300, Feb. 2011
Likely origin:
Yannis Pantazis, Georgios Tzedakis, Olivier Rosec, Yannis Stylianou: Analysis/synthesis of speech based
on an adaptive quasi-harmonic plus noise model. ICASSP 2010:4246-4249
Kim, D.K.; Gales, M.J.F.; , “Noisy Constrained Maximum-Likelihood Linear Regression for Noise-
Robust Speech Recognition,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.19,
no.2, pp.315-325, Feb. 2011
Likely origin:
D. K. Kim, M. J. F. Gales: Adaptive training with noisy constrained maximum likelihood linear regression
for noise robust speech recognition. INTERSPEECH 2009:2383-2386
Namgook Cho; Kuo, C.-C.J.; , “Sparse Music Representation With Source-Specific Dictionaries and
Its Application to Signal Separation,” Audio, Speech, and Language Processing, IEEE Transactions on ,
vol.19, no.2, pp.326-337, Feb. 2011
Likely origin:
Namgook Cho, Yu Shiu, C. C. Jay Kuo: Efficient music representation with content adaptive
dictionaries. ISCAS 2008:3254-3257
Milner, B.; Darch, J.; , “Robust Acoustic Speech Feature Prediction From Noisy Mel-Frequency
Cepstral Coefficients,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.19, no.2,
pp.338-347, Feb. 2011
Likely origin:
Ben Milner, Jonathan Darch, Ibrahim Almajai: Reconstructing clean speech from noisy MFCC
vectors. INTERSPEECH 2009:1943-1946
Tepperman, J.; Sungbok Lee; Narayanan, S.; Alwan, A.; , “A Generative Student Model for Scoring
Word Reading Skills,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.19, no.2,
pp.348-360, Feb. 2011
Likely origin:
Matthew Black, Joseph Tepperman, Abe Kazemzadeh, Sungbok Lee, Shrikanth Narayanan: Automatic
pronunciation verification of English letter-names for early literacy assessment of preliterate
children. ICASSP 2009:4861-4864
Kuhne, M.; Togneri, R.; Nordholm, S.; , “A New Evidence Model for Missing Data Speech Recognition
With Applications in Reverberant Multi-Source Environments,” Audio, Speech, and Language
Processing, IEEE Transactions on , vol.19, no.2, pp.372-384, Feb. 2011
Likely origin:
Marco Kühne, Roberto Togneri, Sven Nordholm: Evidence Modeling for Missing Data Speech
Recognition Using Small Microphone Arrays. Robust Speech Recognition of Uncertain or Missing Data
2011:293-318
Zen, H.; Nankaku, Y.; Tokuda, K.; , “Continuous Stochastic Feature Mapping Based on Trajectory
HMMs,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.19, no.2, pp.417-430, Feb.
2011
Likely origin:
Heiga Zen, Yoshihiko Nankaku, Keiichi Tokuda: Stereo-based stochastic noise compensation based on
trajectory GMMs. ICASSP 2009:4577-4580
Vijayasenan, D.; Valente, F.; Bourlard, H.; , “An Information Theoretic Combination of MFCC and
TDOA Features for Speaker Diarization,” Audio, Speech, and Language Processing, IEEE Transactions
on , vol.19, no.2, pp.431-438, Feb. 2011
Likely origin:
Deepu Vijayasenan, Fabio Valente, Hervé Bourlard: Multistream speaker diarization beyond two acoustic
feature streams. ICASSP 2010:4950-4953
den Brinker, A.C.; Krishnamoorthi, H.; Verbitskiy, E.A.; , “Similarities and Differences Between
Warped Linear Prediction and Laguerre Linear Prediction,” Audio, Speech, and Language Processing,
IEEE Transactions on , vol.19, no.1, pp.24-33, Jan. 2011
Likely origin:
None
Loizou, P.C.; Gibak Kim; , “Reasons why Current Speech-Enhancement Algorithms do not Improve
Speech Intelligibility and Suggested Solutions,” Audio, Speech, and Language Processing, IEEE
Transactions on , vol.19, no.1, pp.47-56, Jan. 2011
Likely origin:
Gibak Kim, Philipos C. Loizou: Why do speech-enhancement algorithms not improve speech
intelligibility? ICASSP 2010:4738-4741
Yoshioka, T.; Nakatani, T.; Miyoshi, M.; Okuno, H.G.; , “Blind Separation and Dereverberation of
Speech Mixtures by Joint Optimization,” Audio, Speech, and Language Processing, IEEE Transactions
on , vol.19, no.1, pp.69-84, Jan. 2011
Likely origin:
None
Yun Lei; Hansen, J.H.L.; , “Dialect Classification via Text-Independent Training and Testing for
Arabic, Spanish, and Chinese,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.19,
no.1, pp.85-96, Jan. 2011
Likely origin:
Yun Lei, John H. L. Hansen: Dialect classification via discriminative training. INTERSPEECH 2008:735-
738
Yun Lei, John H. L. Hansen: Factor analysis-based information integration for Arabic dialect
identification. ICASSP 2009:4337-4340
Van Segbroeck, M.; Van Hamme, H.; , “Advances in Missing Feature Techniques for Robust Large-
Vocabulary Continuous Speech Recognition,” Audio, Speech, and Language Processing, IEEE
Transactions on , vol.19, no.1, pp.123-137, Jan. 2011
Likely origin:
Jort F. Gemmeke, Maarten Van Segbroeck, Yujun Wang, Bert Cranen, Hugo Van hamme: Automatic
Speech Recognition Using Missing Data Techniques: Handling of Real-World Data. Robust Speech
Recognition of Uncertain or Missing Data 2011:157-185
Raitio, T.; Suni, A.; Yamagishi, J.; Pulakka, H.; Nurminen, J.; Vainio, M.; Alku, P.; , “HMM-Based
Speech Synthesis Utilizing Glottal Inverse Filtering,” Audio, Speech, and Language Processing, IEEE
Transactions on , vol.19, no.1, pp.153-165, Jan. 2011
Likely origin:
Tuomo Raitio, Antti Suni, Hannu Pulakka, Martti Vainio, Paavo Alku: Utilizing glottal source pulse library
for generating improved excitation signal for HMM-based speech synthesis. ICASSP 2011:4564-4567
Tuomo Raitio, Antti Suni, Martti Vainio, Paavo Alku: Analysis of HMM-Based Lombard Speech
Synthesis. INTERSPEECH 2011:2781-2784
Suhadi, S.; Last, C.; Fingscheidt, T.; , “A Data-Driven Approach to A Priori SNR Estimation,” Audio,
Speech, and Language Processing, IEEE Transactions on, vol.19, no.1, pp.186-195, Jan. 2011
Likely origin:
None
Ning Wang; Ching, P.C.; Nengheng Zheng; Tan Lee; , “Robust Speaker Recognition Using Denoised
Vocal Source and Vocal Tract Features,” Audio, Speech, and Language Processing, IEEE Transactions
on , vol.19, no.1, pp.196-205, Jan. 2011
Likely origin:
Ning Wang, P. C. Ching, Tan Lee: Exploration of vocal excitation modulation features for speaker
recognition. INTERSPEECH 2009:892-895
Krueger, A.; Warsitz, E.; Haeb-Umbach, R.; , “Speech Enhancement With a GSC-Like Structure
Employing Eigenvector-Based Transfer Function Ratios Estimation,” Audio, Speech, and Language
Processing, IEEE Transactions on , vol.19, no.1, pp.206-219, Jan. 2011
Likely origin:
Ernst Warsitz, Alexander Krueger, Reinhold Haeb-Umbach: Speech enhancement with a new generalized
eigenvector blocking matrix for application in a generalized sidelobe canceller. ICASSP 2008:73-76
2010 (From Dec to Jan)
Kalinli, O.; Seltzer, M.L.; Droppo, J.; Acero, A.; , “Noise Adaptive Training for Robust Automatic
Speech Recognition,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.18, no.8,
pp.1889-1901, Nov. 2010
Likely origin:
Ozlem Kalinli, Michael L. Seltzer, Alex Acero: Noise adaptive training using a vector Taylor series
approach for noise robust automatic speech recognition. ICASSP 2009:3825-3828
Songfang Huang; Renals, S.; , “Hierarchical Bayesian Language Models for Conversational Speech
Recognition,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.18, no.8, pp.1941-
1954, Nov. 2010
Likely origin:
Songfang Huang and Steve Renals, Towards the Application of Hierarchical Bayesian Models on
Language Models for Automatic Speech Recognition, the Nonparametric Bayes workshop at ICML'08,
Helsinki, Finland, July 2008.
Chi-Chun Hsia; Chung-Hsien Wu; Jung-Yun Wu; , “Exploiting Prosody Hierarchy and Dynamic
Features for Pitch Modeling and Generation in HMM-Based Speech Synthesis,” Audio, Speech, and
Language Processing, IEEE Transactions on , vol.18, no.8, pp.1994-2003, Nov. 2010
Likely origin:
None
Huijun Ding; Ing Yann Soon; Chai Kiat Yeo; , “Over-Attenuated Components Regeneration for Speech
Enhancement,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.18, no.8, pp.2004-
2014, Nov. 2010
Likely origin:
Huijun Ding, Ing Yann Soon, Soo Ngee Koh, Chai Kiat Yeo: A post-processing technique for regeneration
of over-attenuated speech. ICASSP 2009:3889-3892
Reddy, A.; Rose, R.C.; , “Integration of Statistical Models for Dictation of Document Translations in a
Machine-Aided Human Translation Task,” Audio, Speech, and Language Processing, IEEE
Transactions on , vol.18, no.8, pp.2015-2027, Nov. 2010
Likely origin:
Aarthi Reddy, Richard C. Rose: Towards domain independence in machine aided human translation.
INTERSPEECH 2008:2358-2361
Aarthi Reddy, Richard C. Rose, Alain Désilets: Integration of ASR and machine translation models in a
document translation task. INTERSPEECH 2007:2457-2460
Imseng, D.; Friedland, G.; , “Tuning-Robust Initialization Methods for Speaker Diarization,” Audio,
Speech, and Language Processing, IEEE Transactions on , vol.18, no.8, pp.2028-2037, Nov. 2010
Likely origin:
David Imseng, Gerald Friedland: An adaptive initialization method for speaker diarization based on
prosodic features. ICASSP 2010:4946-4949
Guoning Hu; DeLiang Wang; , “A Tandem Algorithm for Pitch Estimation and Voiced Speech
Segregation,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.18, no.8, pp.2067-
2079, Nov. 2010
Likely origin:
None
Seokhwan Jo; Yoo, C.D.; , “Psychoacoustically Constrained and Distortion Minimized Speech
Enhancement,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.18, no.8, pp.2099-
2110, Nov. 2010
Likely origin:
Seokhwan Jo, Chang D. Yoo: Psychoacoustically constrained and distortion minimized speech
enhancement algorithm. ICASSP 2009:4669-4672
Wooil Kim; Hansen, J.; , “Missing-Feature Reconstruction by Leveraging Temporal Spectral
Correlation for Robust Speech Recognition in Background Noise Conditions,” Audio, Speech, and
Language Processing, IEEE Transactions on , vol.18, no.8, pp.2111-2120, Nov. 2010
Likely origin:
None
Zhiyao Duan; Pardo, B.; Changshui Zhang; , “Multiple Fundamental Frequency Estimation by
Modeling Spectral Peaks and Non-Peak Regions,” Audio, Speech, and Language Processing, IEEE
Transactions on , vol.18, no.8, pp.2121-2133, Nov. 2010
Likely origin:
Z. Duan and C. Zhang "A maximum likelihood approach to multiple fundamental frequency estimation
from the amplitude spectrum peaks", Proc. Neural Inf. Process. Syst. (NIPS) Workshop Music, Brain,
Cognition, 2007
Bassiou, N.; Moschou, V.; Kotropoulos, C.; , “Speaker Diarization Exploiting the Eigengap Criterion
and Cluster Ensembles,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.18, no.8,
pp.2134-2144, Nov. 2010
Likely origin:
None
Rao, V.; Rao, P.; , “Vocal Melody Extraction in the Presence of Pitched Accompaniment in
Polyphonic Music,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.18, no.8,
pp.2145-2154, Nov. 2010
Likely origin:
V. Rao and P. Rao "Vocal melody detection in the presence of pitched accompaniment using harmonic
matching methods", Proc. 11th Int. Conf. Digital Audio Effects (DAFx-08), 2008
Laska, B.N.M.; Bolic, M.; Goubran, R.A.; , “Particle Filter Enhancement of Speech Spectral
Amplitudes,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.18, no.8, pp.2155-
2167, Nov. 2010
Likely origin:
None
Sehr, A.; Maas, R.; Kellermann, W.; , “Reverberation Model-Based Decoding in the Logmelspec
Domain for Robust Distant-Talking Speech Recognition,” Audio, Speech, and Language Processing,
IEEE Transactions on , vol.18, no.7, pp.1676-1691, Sept. 2010
Likely origin:
Armin Sehr, Roland Maas, Walter Kellermann: Model-based dereverberation in the logmelspec domain for
robust distant-talking speech recognition. ICASSP 2010:4298-4301
Krueger, A.; Haeb-Umbach, R.; , “Model-Based Feature Enhancement for Reverberant Speech
Recognition,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.18, no.7, pp.1692-
1707, Sept. 2010
Likely origin:
Alexander Krueger, Reinhold Haeb-Umbach: Model based feature enhancement for automatic speech
recognition in reverberant environments. INTERSPEECH 2009:1231-1234
Gomez, R.; Kawahara, T.; , “Robust Speech Recognition Based on Dereverberation Parameter
Optimization Using Acoustic Model Likelihood,” Audio, Speech, and Language Processing, IEEE
Transactions on , vol.18, no.7, pp.1708-1716, Sept. 2010
Likely origin:
Randy Gomez, Tatsuya Kawahara: Optimization of dereverberation parameters based on likelihood of
speech recognizer. INTERSPEECH 2009:1223-1226
Nakatani, T.; Yoshioka, T.; Kinoshita, K.; Miyoshi, M.; Biing-Hwang Juang; , “Speech Dereverberation
Based on Variance-Normalized Delayed Linear Prediction,” Audio, Speech, and Language Processing,
IEEE Transactions on , vol.18, no.7, pp.1717-1731, Sept. 2010
Likely origin:
None
Jeub, M.; Schafer, M.; Esch, T.; Vary, P.; , “Model-Based Dereverberation Preserving Binaural
Cues,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.18, no.7, pp.1732-1745, Sept.
2010
Likely origin:
Marco Jeub, Peter Vary: Binaural dereverberation based on a dual-channel Wiener filter with optimized
noise field coherence. ICASSP 2010:4710-4713
Erkelens, J.S.; Heusdens, R.; , “Correlation-Based and Model-Based Blind Single-Channel Late-
Reverberation Suppression in Noisy Time-Varying Acoustical Environments,” Audio, Speech, and
Language Processing, IEEE Transactions on , vol.18, no.7, pp.1746-1765, Sept. 2010
Likely origin:
Jan S. Erkelens, Richard Heusdens: Noise and late-reverberation suppression in time-varying acoustical
environments. ICASSP 2010:4706-4709
Falk, T.H.; Chenxi Zheng; Wai-Yip Chan; , “A Non-Intrusive Quality and Intelligibility Measure of
Reverberant and Dereverberated Speech,” Audio, Speech, and Language Processing, IEEE
Transactions on , vol.18, no.7, pp.1766-1774, Sept. 2010
Likely origin:
T. H. Falk and W.-Y. Chan "A non-intrusive quality measure of dereverberated speech", Proc. Int.
Workshop Acoust. Echo Noise Control, 2008
Arai, T.; Hodoshima, N.; Yasu, K.; , “Using Steady-State Suppression to Improve Speech Intelligibility
in Reverberant Environments for Elderly Listeners,” Audio, Speech, and Language Processing, IEEE
Transactions on , vol.18, no.7, pp.1775-1780, Sept. 2010
Likely origin:
T. Arai, K. Kinoshita, N. Hodoshima, A. Kusumoto and T. Kitamura "Effects of suppressing steady-state
portions of speech on intelligibility in reverberant environments", Proc. Autumn Meet. Acoust. Soc.
Jpn., vol. 1, pp.449, 2001
Ribeiro, F.; Cha Zhang; Florencio, D.A.; Ba, D.E.; , “Using Reverberation to Improve Range and
Elevation Discrimination for Small Array Sound Source Localization,” Audio, Speech, and Language
Processing, IEEE Transactions on , vol.18, no.7, pp.1781-1792, Sept. 2010
Likely origin:
None
Yan-Chen Lu; Cooke, M.; , “Binaural Estimation of Sound Source Distance via the Direct-to-
Reverberant Energy Ratio for Static and Moving Sources,” Audio, Speech, and Language Processing,
IEEE Transactions on , vol.18, no.7, pp.1793-1805, Sept. 2010
Likely origin:
None
Talantzis, F.; , “An Acoustic Source Localization and Tracking Framework Using Particle Filtering
and Information Theory,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.18, no.7,
pp.1806-1817, Sept. 2010
Likely origin:
F. Talantzis and A. G. Constantinides "Using information theory to detect voice activity", Proc. IEEE Int.
Conf. Acoust., Speech, Signal Process. (ICASSP), pp.4613, 2009
Kowalski, M.; Vincent, E.; Gribonval, R.; , “Beyond the Narrowband Approximation: Wideband
Convex Methods for Under-Determined Reverberant Audio Source Separation,” Audio, Speech, and
Language Processing, IEEE Transactions on , vol.18, no.7, pp.1818-1829, Sept. 2010
Likely origin:
None
Duong, N.Q.K.; Vincent, E.; Gribonval, R.; , “Under-Determined Reverberant Audio Source
Separation Using a Full-Rank Spatial Covariance Model,” Audio, Speech, and Language Processing,
IEEE Transactions on , vol.18, no.7, pp.1830-1840, Sept. 2010
Likely origin:
Simon Arberet, Alexey Ozerov, Ngoc Q. K. Duong, Emmanuel Vincent, Rémi Gribonval, Frédéric Bimbot,
Pierre Vandergheynst: Nonnegative matrix factorization and spatial covariance model for under-determined
reverberant audio source separation. ISSPA 2010:1-4
Ngoc Q. K. Duong, Emmanuel Vincent, Rémi Gribonval: Under-Determined Reverberant Audio Source
Separation Using Local Observed Covariance and Auditory-Motivated Time-Frequency Representation.
LVA/ICA 2010:73-80
Masnadi-Shirazi, A.; Wenyi Zhang; Rao, B.D.; , “Glimpsing IVA: A Framework for
Overcomplete/Complete/Undercomplete Convolutive Source Separation,” Audio, Speech, and
Language Processing, IEEE Transactions on , vol.18, no.7, pp.1841-1855, Sept. 2010
Likely origin:
None
Woodruff, J.; DeLiang Wang; , “Sequential Organization of Speech in Reverberant Environments by
Integrating Monaural Grouping and Binaural Localization,” Audio, Speech, and Language Processing,
IEEE Transactions on , vol.18, no.7, pp.1856-1866, Sept. 2010
Likely origin:
John Woodruff, Rohit Prabhavalkar, Eric Fosler-Lussier, DeLiang Wang: Combining monaural and
binaural evidence for reverberant speech segregation. INTERSPEECH 2010: 406-409
John Woodruff, DeLiang Wang: Integrating monaural and binaural analysis for localizing multiple
reverberant sound sources. ICASSP 2010: 2706-2709
Hummersone, C.; Mason, R.; Brookes, T.; , “Dynamic Precedence Effect Modeling for Source
Separation in Reverberant Environments,” Audio, Speech, and Language Processing, IEEE
Transactions on , vol.18, no.7, pp.1867-1871, Sept. 2010
Likely origin:
C. Hummersone , R. Mason and T. Brookes "A comparison of computational precedence models for
source separation in reverberant environments", Proc. 128th Audio Eng. Soc. Conv., 2010
Mandel, M.I.; Bressler, S.; Shinn-Cunningham, B.; Ellis, D.P.W.; , “Evaluating Source Separation
Algorithms With Reverberant Speech,” Audio, Speech, and Language Processing, IEEE Transactions
on , vol.18, no.7, pp.1872-1883, Sept. 2010
Likely origin:
None
Ketabdar, H.; Bourlard, H.; , “Enhanced Phone Posteriors for Improving Speech Recognition
Systems,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.18, no.6, pp.1094-1106,
Aug. 2010
Likely origin:
Hamed Ketabdar, Mirko Hannemann, Hynek Hermansky: Detection of out-of-vocabulary words in
posterior based ASR. INTERSPEECH 2007: 1757-1760
Hamed Ketabdar, Hervé Bourlard: Hierarchical integration of phonetic and lexical knowledge in phone
posterior estimation. ICASSP 2008: 4065-4068
Canazza, S.; De Poli, G.; Mian, G.A.; , “Restoration of Audio Documents by Means of Extended
Kalman Filter,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.18, no.6, pp.1107-
1115, Aug. 2010
Likely origin:
None
Chunghsin Yeh; Roebel, A.; Rodet, X.; , “Multiple Fundamental Frequency Estimation and Polyphony
Inference of Polyphonic Music Signals,” Audio, Speech, and Language Processing, IEEE Transactions
on , vol.18, no.6, pp.1116-1126, Aug. 2010
Likely origin:
Chunghsin Yeh, Niels Bogaards, Axel Röbel: Synthesized Polyphonic Music Database with Verifiable
Ground Truth for Multiple F0 Estimation. ISMIR 2007:393-398
Jiucang Hao; Te-Won Lee; Sejnowski, T.J.; , “Speech Enhancement Using Gaussian Scale Mixture
Models,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.18, no.6, pp.1127-1136,
Aug. 2010
Likely origin:
None
Serizel, R.; Moonen, M.; Wouters, J.; Jensen, S.H.; , “Integrated Active Noise Control and Noise
Reduction in Hearing Aids,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.18,
no.6, pp.1137-1146, Aug. 2010
Likely origin:
Romain Serizel, Marc Moonen, Jan Wouters, Søren Holdt Jensen: A zone of quiet based approach to
integrated active noise control and noise reduction in hearing aids. WASPAA 2009:229-232
Zhang, J.J.; Chan, R.H.Y.; Fung, P.; , “Extractive Speech Summarization Using Shallow Rhetorical
Structure Modeling,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.18, no.6,
pp.1147-1157, Aug. 2010
Likely origin:
Pascale Fung, Ricky Ho Yin Chan, Justin Jian Zhang: Rhetorical-State Hidden Markov Models for
extractive speech summarization. ICASSP 2008: 4957-4960
Justin Jian Zhang, Pascale Fung: Learning deep rhetorical structure for extractive speech summarization.
ICASSP 2010: 5302-5305
Xiong Xiao; Jinyu Li; Eng Siong Chng; Haizhou Li; Chin-Hui Lee; , “A Study on the Generalization
Capability of Acoustic Models for Robust Speech Recognition,” Audio, Speech, and Language
Processing, IEEE Transactions on , vol.18, no.6, pp.1158-1169, Aug. 2010
Likely origin:
None
Chung-Hsien Wu; Chao-Hong Liu; Harris, M.; Liang-Chih Yu; , “Sentence Correction Incorporating
Relative Position and Parse Template Language Models,” Audio, Speech, and Language Processing,
IEEE Transactions on , vol.18, no.6, pp.1170-1181, Aug. 2010
Likely origin:
None
Vogt, R.; Sridharan, S.; Mason, M.; , “Making Confident Speaker Verification Decisions With Minimal
Speech,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.18, no.6, pp.1182-1192,
Aug. 2010
Likely origin:
R. Vogt , S. Sridharan and M. Mason "Making confident speaker verification decisions with minimal
speech", Proc. Interspeech, pp.1405 2008
Nion, D.; Mokios, K.N.; Sidiropoulos, N.D.; Potamianos, A.; , “Batch and Adaptive PARAFAC-Based
Blind Separation of Convolutive Speech Mixtures,” Audio, Speech, and Language Processing, IEEE
Transactions on , vol.18, no.6, pp.1193-1207, Aug. 2010
Likely origin:
Dimitri Nion, Nicholas D. Sidiropoulos: A PARAFAC-based technique for detection and localization of
multiple targets in a MIMO radar system. ICASSP 2009:2077-2080
Eksler, V.; Jelinek, M.; , “Glottal-Shape Codebook to Improve Robustness of CELP Codecs,” Audio,
Speech, and Language Processing, IEEE Transactions on , vol.18, no.6, pp.1208-1217, Aug. 2010
Likely origin:
None
Moo Young Kim; Kleijn, W.B.; , “Reduction of the Impact of Distortion Outliers and Source
Mismatch in Resolution-Constrained Quantization,” Audio, Speech, and Language Processing, IEEE
Transactions on , vol.18, no.6, pp.1218-1227, Aug. 2010
Likely origin:
M. Y. Kim and W. B. Kleijn "Resolution-constrained quantization with JND-based perceptual-distortion
measures", IEEE Signal Process. Lett., vol. 13, pp.703 2006
Fallon, M.F.; Godsill, S.; , “Acoustic Source Localization and Tracking Using Track Before
Detect,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.18, no.6, pp.1228-1242,
Aug. 2010
Likely origin:
M. Fallon and S. Godsill "Multi-target acoustic source tracking using track before detect", Proc.
Workshop Applicat. Signal Process. Audio Acoust. (WASPAA), pp.77 2007
Xiao, X.; Nickel, R.M.; , “Speech Enhancement With Inventory Style Speech Resynthesis,” Audio,
Speech, and Language Processing, IEEE Transactions on , vol.18, no.6, pp.1243-1257, Aug. 2010
Likely origin:
Xiaoqiang Xiao, Peng Lee, Robert M. Nickel: Inventory based speech enhancement for speaker dedicated
speech communication systems. ICASSP 2009:3877-3880
“A Multipulse-Based Forward Error
Correction Technique for Robust CELP-Coded Speech Transmission Over Erasure
Channels,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.18, no.6, pp.1258-1268,
Aug. 2010
Likely origin:
José L. Carmona, Angel M. Gomez, Antonio M. Peinado, José L. Pérez-Córdoba, José A. González: A
multipulse FEC scheme based on amplitude estimation for CELP codecs over packet networks.
INTERSPEECH 2010:2386-2389
Gibson, M.; Hain, T.; , “Error Approximation and Minimum Phone Error Acoustic Model
Estimation,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.18, no.6, pp.1269-
1279, Aug. 2010
Likely origin:
None
Chang Huai You; Kong Aik Lee; Haizhou Li; , “GMM-SVM Kernel With a Bhattacharyya-Based
Distance for Speaker Recognition,” Audio, Speech, and Language Processing, IEEE Transactions on ,
vol.18, no.6, pp.1300-1312, Aug. 2010
Likely origin:
Chang Huai You, Kong-Aik Lee, Haizhou Li: A GMM supervector Kernel with the Bhattacharyya distance
for SVM based speaker recognition. ICASSP 2009:4221-4224
El-Kahlout, I.D.; Oflazer, K.; , “Exploiting Morphology and Local Word Reordering in English-to-
Turkish Phrase-Based Statistical Machine Translation,” Audio, Speech, and Language Processing,
IEEE Transactions on , vol.18, no.6, pp.1313-1322, Aug. 2010
Likely origin:
Kemal Oflazer: Statistical Machine Translation into a Morphologically Complex Language. CICLing 2008:
376-387
Reyyan Yeniterzi, Kemal Oflazer: Syntax-to-Morphology Mapping in Factored Phrase-Based Statistical
Machine Translation from English to Turkish. ACL 2010: 454-464
Zhu, J.; Wang, H.; Tsou, B.K.; Ma, M.; , “Active Learning With Sampling by Uncertainty and Density
for Data Annotations,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.18, no.6,
pp.1323-1331, Aug. 2010
Likely origin:
None
Chi-Sang Jung; Moo Young Kim; Hong-Goo Kang; , “Selecting Feature Frames for Automatic Speaker
Recognition Using Mutual Information,” Audio, Speech, and Language Processing, IEEE Transactions
on , vol.18, no.6, pp.1332-1340, Aug. 2010
Likely origin:
None
“MMSE-Based Packet Loss
Concealment for CELP-Coded Speech Recognition,” Audio, Speech, and Language Processing, IEEE
Transactions on , vol.18, no.6, pp.1341-1353, Aug. 2010
Likely origin:
José L. Carmona, Angel M. Gomez, Antonio M. Peinado, José L. Pérez-Córdoba, José A. González: A
multipulse FEC scheme based on amplitude estimation for CELP codecs over packet networks.
INTERSPEECH 2010: 2386-2389
Ishizuka, K.; Araki, S.; Kawahara, T.; , “Speech Activity Detection for Multi-Party Conversation
Analyses Based on Likelihood Ratio Test on Spatial Magnitude,” Audio, Speech, and Language
Processing, IEEE Transactions on , vol.18, no.6, pp.1354-1365, Aug. 2010
Likely origin:
None
Ferras, M.; Cheung-Chi Leung; Barras, C.; Gauvain, J.-L.; , “Comparison of Speaker Adaptation
Methods as Feature Extraction for SVM-Based Speaker Recognition,” Audio, Speech, and Language
Processing, IEEE Transactions on , vol.18, no.6, pp.1366-1378, Aug. 2010
Likely origin:
M. Ferras , C. C. Leung , C. Barras and J.-L. Gauvain "Constrained MLLR for speaker recognition", Proc.
Int. Conf. Acoust., Speech, Signal Process. (ICASSP), 2007
M. Ferras , C. C. Leung , C. Barras and J.-L. Gauvain "MLLR techniques for speaker recognition", Proc.
IEEE Speaker Odyssey Workshop, pp.21 2008
Boril, H.; Hansen, J.H.L.; , “Unsupervised Equalization of Lombard Effect for Speech Recognition in
Noisy Adverse Environments,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.18,
no.6, pp.1379-1393, Aug. 2010
Likely origin:
Hynek Boril, John H. L. Hansen: Unsupervised equalization of Lombard effect for speech recognition in
noisy adverse environment. ICASSP 2009: 3937-3940
Chung-Hsien Wu; Chi-Chun Hsia; Chung-Han Lee; Mai-Chun Lin; , “Hierarchical Prosody Conversion
Using Regression-Based Clustering for Emotional Speech Synthesis,” Audio, Speech, and Language
Processing, IEEE Transactions on , vol.18, no.6, pp.1394-1405, Aug. 2010
Likely origin:
Chung-Han Lee, Chi-Chun Hsia, Chung-Hsien Wu, Mai-Chun Lin: Regression-based clustering for
hierarchical pitch conversion. ICASSP 2009: 3593-3596
Lee, K.; Ellis, D.P.W.; , “Audio-Based Semantic Concept Classification for Consumer Video,” Audio,
Speech, and Language Processing, IEEE Transactions on, vol.18, no.6, pp.1406-1416, Aug. 2010
Likely origin:
Wei Jiang, Courtenay V. Cotton, Shih-Fu Chang, Dan Ellis, Alexander C. Loui: Audio-visual atoms for
generic video concept classification. TOMCCAP 6(3): (2010)
Tantibundhit, C.; Pernkopf, F.; Kubin, G.; , “Joint Time–Frequency Segmentation Algorithm for
Transient Speech Decomposition and Speech Enhancement,” Audio, Speech, and Language Processing,
IEEE Transactions on , vol.18, no.6, pp.1417-1428, Aug. 2010
Likely origin:
Charturong Tantibundhit, Franz Pernkopf, Gernot Kubin: Speech enhancement based on joint time-
frequency segmentation. ICASSP 2009:4673-4676
Charturong Tantibundhit, Gernot Kubin: Joint time-frequency segmentation for transient decomposition.
INTERSPEECH 2008:2502-2505
Birkenes, Ø.; Matsui, T.; Tanabe, K.; Siniscalchi, S.M.; Myrvoll, T.A.; Johnsen, M.H.; , “Penalized
Logistic Regression With HMM Log-Likelihood Regressors for Speech Recognition,” Audio, Speech,
and Language Processing, IEEE Transactions on , vol.18, no.6, pp.1440-1454, Aug. 2010
Likely origin:
Ø. Birkenes , T. Matsui , K. Tanabe and T. A. Myrvoll "N-best rescoring for speech recognition using
penalized logistic regression machines with garbage class", Proc. IEEE Int. Conf. Acoust., Speech, Signal
Process. (ICASSP), pp.449 2007
Bellegarda, J.R.; , “A Dynamic Cost Weighting Framework for Unit Selection Text–to–Speech
Synthesis,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.18, no.6, pp.1455-1463,
Aug. 2010
Likely origin:
Jerome R. Bellegarda: A novel approach to cost weighting in unit selection TTS. INTERSPEECH
2009:744-747
Akbacak, M.; Hansen, J.H.L.; , “Spoken Proper Name Retrieval for Limited Resource Languages
Using Multilingual Hybrid Representations,” Audio, Speech, and Language Processing, IEEE
Transactions on , vol.18, no.6, pp.1486-1495, Aug. 2010
Likely origin:
M. Akbacak and J. H. L. Hansen "Spoken proper name retrieval in audio streams for limited-resource
languages via lattice based search using hybrid representations", Proc. IEEE Conf. Acoustics, Speech,
Signal Process. (ICASSP), pp.113 2006
McLaren, M.; Vogt, R.; Baker, B.; Sridharan, S.; , “Data-Driven Background Dataset Selection for
SVM-Based Speaker Verification,” Audio, Speech, and Language Processing, IEEE Transactions on ,
vol.18, no.6, pp.1496-1506, Aug. 2010
Likely origin:
Mitchell McLaren, Brendan Baker, Robbie Vogt, Sridha Sridharan: Improved SVM speaker verification
through data-driven background dataset collection. ICASSP 2009: 4041-4044
Kameoka, H.; Ono, N.; Sagayama, S.; , “Speech Spectrum Modeling for Joint Estimation of Spectral
Envelope and Fundamental Frequency,” Audio, Speech, and Language Processing, IEEE Transactions
on , vol.18, no.6, pp.1507-1516, Aug. 2010
Likely origin:
Hirokazu Kameoka, Jonathan Le Roux, Nobutaka Ono, Shigeki Sagayama: Speech analyzer using a joint
estimation model of spectral envelope and fine structure. INTERSPEECH 2006
Holzapfel, A.; Stylianou, Y.; Gedik, A.C.; Bozkurt, B.; , “Three Dimensions of Pitched Instrument
Onset Detection,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.18, no.6,
pp.1517-1527, Aug. 2010
Likely origin:
Emmanouil Benetos, Andre Holzapfel, Yannis Stylianou: Pitched Instrument Onset Detection based on
Auditory Spectra. ISMIR 2009:105-110
Heracleous, P.; Tran, V.-A.; Nagai, T.; Shikano, K.; , “Analysis and Recognition of NAM Speech Using
HMM Distances and Visual Information,” Audio, Speech, and Language Processing, IEEE Transactions
on , vol.18, no.6, pp.1528-1538, Aug. 2010
Likely origin:
None
Akita, Y.; Kawahara, T.; , “Statistical Transformation of Language and Pronunciation Models for
Spontaneous Speech Recognition,” Audio, Speech, and Language Processing, IEEE Transactions on ,
vol.18, no.6, pp.1539-1549, Aug. 2010
Likely origin:
Graham Neubig, Yuya Akita, Shinsuke Mori, Tatsuya Kawahara: Improved statistical models for SMT-
based speaking style transformation. ICASSP 2010: 5206-5209
Yi-cheng Pan; Lin-shan Lee; , “Performance Analysis for Lattice-Based Speech Indexing Approaches
Using Words and Subword Units,” Audio, Speech, and Language Processing, IEEE Transactions on ,
vol.18, no.6, pp.1562-1574, Aug. 2010
Likely origin:
Yi-Cheng Pan, Hung-lin Chang, Berlin Chen, Lin-Shan Lee: Subword-based position specific posterior
lattices (s-PSPL) for indexing speech information. INTERSPEECH 2007:318-321
Souden, M.; Benesty, J.; Affes, S.; , “Broadband Source Localization From an Eigenanalysis
Perspective,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.18, no.6, pp.1575-
1587, Aug. 2010
Likely origin:
Mehrez Souden, Jacob Benesty, Sofiène Affes: Eigenanalysis-based broadband source localization.
ICASSP 2010: 81-84
“Improved Recognition of Spontaneous
Hungarian Speech—Morphological and Acoustic Modeling Techniques for a Less Resourced
Task,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.18, no.6, pp.1588-1600, Aug.
2010
Likely origin:
P. Mihajlik , T. Fegyó , Z. Tüske and P. Ircing "A morpho-graphemic approach for the recognition of
spontaneous speech in agglutinative languages like Hungarian", Proc. Interspeech, pp.1497 2007
Tur, G.; Stolcke, A.; Voss, L.; Peters, S.; Hakkani-Tur, D.; Dowding, J.; Favre, B.; Fernandez, R.;
Frampton, M.; Frandsen, M.; Frederickson, C.; Graciarena, M.; Kintzing, D.; Leveque, K.; Mason, S.;
Niekrasz, J.; Purver, M.; Riedhammer, K.; Shriberg, E.; Jing Tien; Vergyri, D.; Fan Yang; , “The CALO
Meeting Assistant System,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.18,
no.6, pp.1601-1611, Aug. 2010
Likely origin:
G. Tur , A. Stolcke , L. Voss , J. Dowding , B. Favre , R. Fernandez , M. Frampton , M. Frandsen , C.
Frederickson , M. Graciarena , D. Hakkani-Tür , D. Kintzing , K. Leveque , S. Mason , J. Niekrasz , S.
Peters , M. Purver , K. Riedhammer , E. Shriberg , J. Tien , D. Vergyri and F. Yang "The CALO meeting
speech recognition and understanding system", Proc. IEEE/ACL SLT Workshop, 2008
Borgstrom, B.J.; Alwan, A.; , “HMM-Based Reconstruction of Unreliable Spectrographic Data for
Noise Robust Speech Recognition,” Audio, Speech, and Language Processing, IEEE Transactions on ,
vol.18, no.6, pp.1612-1623, Aug. 2010
Likely origin:
Bengt J. Borgström, Abeer Alwan: Improved Speech Presence Probabilities Using HMM-Based Inference,
With Applications to Speech Enhancement and ASR. J. Sel. Topics Signal Processing 4(5): 808-815 (2010)
Bengt J. Borgström, Per Henrik Borgström, Abeer Alwan: Efficient HMM-based estimation of missing
features, with applications to packet loss concealment. INTERSPEECH 2010: 2394-2397
Helander, E.; Virtanen, T.; Nurminen, J.; Gabbouj, M.; , “Voice Conversion Using Partial Least Squares
Regression,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.18, no.5, pp.912-921,
July 2010
Likely origin:
None
Erro, D.; Moreno, A.; Bonafonte, A.; , “Voice Conversion Based on Weighted Frequency
Warping,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.18, no.5, pp.922-931,
July 2010
Likely origin:
Daniel Erro, Asunción Moreno: Weighted frequency warping for voice conversion. INTERSPEECH 2007:
1965-1968
Jianhua Tao; Meng Zhang; Nurminen, J.; Jilei Tian; Xia Wang; , “Supervisory Data Alignment for Text-
Independent Voice Conversion,” Audio, Speech, and Language Processing, IEEE Transactions on ,
vol.18, no.5, pp.932-943, July 2010
Likely origin:
None
Erro, D.; Moreno, A.; Bonafonte, A.; , “INCA Algorithm for Training Voice Conversion Systems From
Nonparallel Corpora,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.18, no.5,
pp.944-953, July 2010
Likely origin:
None
Desai, S.; Black, A.W.; Yegnanarayana, B.; Prahallad, K.; , “Spectral Mapping Using Artificial Neural
Networks for Voice Conversion,” Audio, Speech, and Language Processing, IEEE Transactions on ,
vol.18, no.5, pp.954-964, July 2010
Likely origin:
Srinivas Desai, E. Veera Raghavendra, B. Yegnanarayana, Alan W. Black, Kishore Prahallad: Voice
conversion using Artificial Neural Networks. ICASSP 2009: 3893-3896
Turk, O.; Schroder, M.; , “Evaluation of Expressive Speech Synthesis With Voice Conversion and
Copy Resynthesis Techniques,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.18,
no.5, pp.965-973, July 2010
Likely origin:
Oytun Türk, Marc Schröder: A comparison of voice conversion methods for transforming voice quality in
emotional speech synthesis. INTERSPEECH 2008: 2282-2285
Erro, “Emotion Conversion Based on Prosodic Unit
Selection,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.18, no.5, pp.974-983,
July 2010
Likely origin:
None
Yamagishi, J.; Usabaev, B.; King, S.; Watts, O.; Dines, J.; Jilei Tian; Yong Guan; Rile Hu; Oura, K.; Yi-
Jian Wu; Tokuda, K.; Karhila, R.; Kurimo, M.; , “Thousands of Voices for HMM-Based Speech
Synthesis–Analysis and Application of TTS Systems Built on Various ASR Corpora,” Audio, Speech,
and Language Processing, IEEE Transactions on , vol.18, no.5, pp.984-1004, July 2010
Likely origin:
John Dines, Junichi Yamagishi, Simon King: Measuring the Gap Between HMM-Based ASR and TTS. J.
Sel. Topics Signal Processing 4(6): 1046-1058 (2010)
Watts, O.; Yamagishi, J.; King, S.; Berkling, K.; , “Synthesis of Child Speech With HMM Adaptation
and Voice Conversion,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.18, no.5,
pp.1005-1016, July 2010
Likely origin:
Oliver Watts, Junichi Yamagishi, Simon King, Kay Berkling: HMM adaptation and voice conversion for
the synthesis of child speech: a comparison. INTERSPEECH 2009: 2627-2630
Bedenbaugh, P.; Sarko, D.K.; Roth, H.L.; Martin, E.M.; , “Prosody-Preserving Voice Transformation to
Evaluate Brain Representations of Speech Sounds,” Audio, Speech, and Language Processing, IEEE
Transactions on , vol.18, no.5, pp.1017-1029, July 2010
Likely origin:
None
Felps, D.; Gutierrez-Osuna, R.; , “Developing Objective Measures of Foreign-Accent
Conversion,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.18, no.5, pp.1030-
1040, July 2010
Likely origin:
None
“Maximum Entropy-Based
Reinforcement Learning Using a Confidence Measure in Speech Recognition for Telephone
Speech,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.18, no.5, pp.1041-1052,
July 2010
Likely origin:
Carlos Molina, Néstor Becerra Yoma, Fernando Huenupán, Claudio Garretón: Unsupervised re-scoring of
observation probability based on maximum entropy criterion by using confidence measure with telephone
speech. INTERSPEECH 2008: 1016-1019
Xugang Lu; Jianwu Dang; , “Vowel Production Manifold: Intrinsic Factor Analysis of Vowel
Articulation,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.18, no.5, pp.1053-
1062, July 2010
Likely origin:
None
Cong-Thanh Do; Pastor, D.; Goalic, A.; , “On the Recognition of Cochlear Implant-Like Spectrally
Reduced Speech With MFCC and HMM-Based ASR,” Audio, Speech, and Language Processing, IEEE
Transactions on , vol.18, no.5, pp.1065-1068, July 2010
Likely origin:
Cong-Thanh Do, Dominique Pastor, Gaël Le Lan, André Goalic: Recognizing cochlear implant-like
spectrally reduced speech with HMM-based ASR: experiments with MFCCs and PLP coefficients.
INTERSPEECH 2010: 2634-2637
Mokhtari, P.; Takemoto, H.; Nishimura, R.; Kato, H.; , “Optimum Loss Factor for a Perfectly Matched
Layer in Finite-Difference Time-Domain Acoustic Simulation,” Audio, Speech, and Language
Processing, IEEE Transactions on , vol.18, no.5, pp.1068-1071, July 2010
Likely origin:
None
Souden, M.; Jingdong Chen; Benesty, J.; Affes, S.; , “Gaussian Model-Based Multichannel Speech
Presence Probability,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.18, no.5,
pp.1072-1077, July 2010
Likely origin:
None
Tiomkin, S.; Malah, D.; Shechtman, S.; , “Statistical Text-to-Speech Synthesis Based on Segment-Wise
Representation With a Norm Constraint,” Audio, Speech, and Language Processing, IEEE Transactions
on , vol.18, no.5, pp.1077-1082, July 2010
Likely origin:
Stas Tiomkin, David Malah: Statistical text-to-speech synthesis with improved dynamics. INTERSPEECH
2008: 1841-1844
Garreton, C.; Yoma, N.B.; Torres, M.; , “Channel Robust Feature Transformation Based on Filter-
Bank Energy Filtering,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.18, no.5,
pp.1082-1086, July 2010
Likely origin:
Claudio Garretón, Néstor Becerra Yoma: On enhancing feature sequence filtering with filter-bank energy
transformation in speaker verification with telephone speech. INTERSPEECH 2010: 1461-1464
Glaser, C.; Heckmann, M.; Joublin, F.; Goerick, C.; , “Combining Auditory Preprocessing and Bayesian
Estimation for Robust Formant Tracking,” Audio, Speech, and Language Processing, IEEE
Transactions on , vol.18, no.2, pp.224-236, Feb. 2010
Likely origin:
None
Marelli, D.; Balazs, P.; , “On Pole-Zero Model Estimation Methods Minimizing a Logarithmic
Criterion for Speech Analysis,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.18,
no.2, pp.237-248, Feb. 2010
Likely origin:
None
Souden, M.; Benesty, J.; Affes, S.; , “On Optimal Frequency-Domain Multichannel Linear Filtering
for Noise Reduction,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.18, no.2,
pp.260-276, Feb. 2010
Likely origin:
Mehrez Souden, Jacob Benesty, Sofiène Affes: Linear filtering for noise reduction and interference
rejection. ICASSP 2010: 89-92
Buera, L.; Miguel, A.; Saz, O.; Ortega, A.; Lleida, E.; , “Unsupervised Data-Driven Feature Vector
Normalization With Acoustic Model Adaptation for Robust Speech Recognition,” Audio, Speech, and
Language Processing, IEEE Transactions on , vol.18, no.2, pp.296-309, Feb. 2010
Likely origin:
None
Guz, U.; Cuendet, S.; Hakkani-Tur, D.; Tur, G.; , “Multi-View Semi-Supervised Learning for Dialog
Act Segmentation of Speech,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.18,
no.2, pp.320-329, Feb. 2010
Likely origin:
None
Cornelis, B.; Doclo, S.; Van den Bogaert, T.; Moonen, M.; Wouters, J.; , “Theoretical Analysis of
Binaural Multimicrophone Noise Reduction Techniques,” Audio, Speech, and Language Processing,
IEEE Transactions on , vol.18, no.2, pp.342-355, Feb. 2010
Likely origin:
None
Wen Jin; Xin Liu; Scordilis, M.S.; Lu Han; , “Speech Enhancement Using Harmonic Emphasis and
Adaptive Comb Filtering,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.18,
no.2, pp.356-368, Feb. 2010
Likely origin:
None
Camelin, N.; Bechet, F.; Damnati, G.; De Mori, R.; , “Detection and Interpretation of Opinion
Expressions in Spoken Surveys,” Audio, Speech, and Language Processing, IEEE Transactions on ,
vol.18, no.2, pp.369-381, Feb. 2010
Likely origin:
N. Camelin , G. Damnati , F. Bechet and R. De Mori "Opinion mining in a telephone survey corpus", Proc.
Int. Conf. Spoken Lang. Process., pp.1041 2006
Mandel, M.I.; Weiss, R.J.; Ellis, D.; , “Model-Based Expectation-Maximization Source Separation and
Localization,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.18, no.2, pp.382-394,
Feb. 2010
Likely origin:
M. I. Mandel and D. P. W. Ellis "EM localization and separation using interaural level and phase
cues", Proc. IEEE Workshop Applicat. Signal Process. Audio Acoust., pp.275 2007
Watanabe, S.; Nakamura, A.; , “Predictor–Corrector Adaptation by Using Time Evolution System
With Macroscopic Time Scale,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.18,
no.2, pp.395-406, Feb. 2010
Likely origin:
Shinji Watanabe, Atsushi Nakamura: On-line adaptation and Bayesian detection of environmental changes
based on a macroscopic time evolution system. ICASSP 2009: 4373-4376
Kumaresan, R.; Panchal, N.; , “Encoding Bandpass Signals Using Zero/Level Crossings: A Model-
Based Approach,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.18, no.1, pp.17-
33, Jan. 2010
Likely origin:
None
Balazs, P.; Laback, B.; Eckel, G.; Deutsch, W.A.; , “Time–Frequency Sparsity by Removing
Perceptually Irrelevant Components Using a Simple Model of Simultaneous Masking,” Audio, Speech,
and Language Processing, IEEE Transactions on , vol.18, no.1, pp.34-49, Jan. 2010
Likely origin:
None
Valin, J.-M.; Terriberry, T.B.; Montgomery, C.; Maxwell, G.; , “A High-Quality Speech and Audio
Codec With Less Than 10-ms Delay,” Audio, Speech, and Language Processing, IEEE Transactions on ,
vol.18, no.1, pp.58-67, Jan. 2010
Likely origin:
None
Falk, T.H.; Wai-Yip Chan; , “Modulation Spectral Features for Robust Far-Field Speaker
Identification,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.18, no.1, pp.90-100,
Jan. 2010
Likely origin:
Tiago H. Falk, Wai-Yip Chan: Spectro-temporal features for robust far-field speaker identification.
INTERSPEECH 2008: 634-637
Reju, V.G.; Soo Ngee Koh; Ing Yann Soon; , “Underdetermined Convolutive Blind Source Separation
via Time–Frequency Masking,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.18,
no.1, pp.101-116, Jan. 2010
Likely origin:
None
McLoughlin, I.; , “Vowel Intelligibility in Chinese,” Audio, Speech, and Language Processing, IEEE
Transactions on , vol.18, no.1, pp.117-125, Jan. 2010
Likely origin:
None
Matschkal, B.; Huber, J.B.; , “Spherical Logarithmic Quantization,” Audio, Speech, and Language
Processing, IEEE Transactions on , vol.18, no.1, pp.126-140, Jan. 2010
Likely origin:
J. B. Huber and B. Matschkal "Spherical logarithmic quantization and its application for DPCM", Proc.
5th Int. ITG Conf. Source Channel Coding (SCC), pp.349 2004
Shih-Sian Cheng; Hsin-Min Wang; Hsin-Chia Fu; , “BIC-Based Speaker Segmentation Using Divide-
and-Conquer Strategies With Application to Speaker Diarization,” Audio, Speech, and Language
Processing, IEEE Transactions on , vol.18, no.1, pp.141-157, Jan. 2010
Likely origin:
Shih-Sian Cheng, Chun-Han Tseng, Chia-Ping Chen, Hsin-Min Wang: Speaker diarization using divide-
and-conquer. INTERSPEECH 2009: 1055-1058
Wang, T.T.; Quatieri, T.F.; , “High-Pitch Formant Estimation by Exploiting Temporal Change of
Pitch,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.18, no.1, pp.171-186, Jan.
2010
Likely origin:
Tianyu T. Wang, Thomas F. Quatieri: Multi-pitch estimation by a joint 2-d representation of pitch and pitch
dynamics. INTERSPEECH 2010: 645-648
Feifan Liu; Yang Liu; , “Exploring Correlation Between ROUGE and Human Evaluation on Meeting
Summaries,” Audio, Speech, and Language Processing, IEEE Transactions on , vol.18, no.1, pp.187-196,
Jan. 2010
Likely origin:
Feifan Liu, Yang Liu: Correlation between ROUGE and Human Evaluation of Extractive Meeting
Summaries. ACL (Short Papers) 2008: 201-204