


IEEE Transactions on Audio, Speech & Language Processing, Volume 20
Volume 20, Number 1, 2012
- Helen Meng:
  Farewell Editorial. 1
- Li Deng:
  Inaugural Editorial: Riding the Tidal Wave of Human-Centric Information Processing - Innovate, Outreach, Collaborate, Connect, Expand, and Win. 2-3
- Dong Yu, Geoffrey E. Hinton, Nelson Morgan, Jen-Tzung Chien, Shigeki Sagayama:
  Introduction to the Special Section on Deep Learning for Speech and Language Processing. 4-6
- Nelson Morgan:
  Deep and Wide: Multiple Layers in Automatic Speech Recognition. 7-13
- Abdel-rahman Mohamed, George E. Dahl, Geoffrey E. Hinton:
  Acoustic Modeling Using Deep Belief Networks. 14-22
- Garimella S. V. S. Sivaram, Hynek Hermansky:
  Sparse Multilayer Perceptron for Phoneme Recognition. 23-29
- George E. Dahl, Dong Yu, Li Deng, Alex Acero:
  Context-Dependent Pre-Trained Deep Neural Networks for Large-Vocabulary Speech Recognition. 30-42
- George Saon, Jen-Tzung Chien:
  Bayesian Sensing Hidden Markov Models. 43-54
- Jen-Tzung Chien, Chuang-Hua Chueh:
  Topic-Based Hierarchical Segmentation. 55-66
- I. Yücel Özbek, Mark Hasegawa-Johnson, Mübeccel Demirekler:
  On Improving Dynamic State Space Approaches to Articulatory Inversion With MAP-Based Parameter Estimation. 67-81
- Mark R. P. Thomas, Jón Guðnason, Patrick A. Naylor:
  Estimation of Glottal Closing and Opening Instants in Voiced Speech Using the YAGA Algorithm. 82-91
- Jesper Jensen, Richard C. Hendriks:
  Spectral Magnitude Minimum Mean-Square Error Estimation Using Binary and Continuous Gain Functions. 92-102
- Hen-Geul Yeh, Carlos Rangel Ruiz:
  Fixed-Point Implementation of Cascaded Forward-Backward Adaptive Predictors. 103-107
- Tobias May, Steven van de Par, Armin Kohlrausch:
  Noise-Robust Speaker Recognition Combining Missing Data Techniques and Universal Background Modeling. 108-121
- Alberto Carini, Stefania Cecchi, Francesco Piazza, Ivan Omiciuolo, Giovanni L. Sicuranza:
  Multiple Position Room Response Equalization in Frequency Domain. 122-135
- Iman S. Mossavat, Petko Nikolov Petkov, W. Bastiaan Kleijn, Oliver Amft:
  A Hierarchical Bayesian Approach to Modeling Heterogeneity in Speech Quality Assessment. 136-146
- Thomas Ulrich Christiansen, Steven Greenberg:
  Perceptual Confusions Among Consonants, Revisited - Cross-Spectral Integration of Phonetic-Feature Information and Consonant Recognition. 147-161
- Enzo De Sena, Hüseyin Hacihabiboglu, Zoran Cvetkovic:
  On the Design and Implementation of Higher Order Differential Microphones. 162-174
- Ted S. Wada, Biing-Hwang Juang:
  Enhancement of Residual Echo for Robust Acoustic Echo Cancellation. 175-189
- Adam M. Stark, Mark D. Plumbley:
  Performance Following: Real-Time Prediction of Musical Sequences Without a Score. 190-199
- Matthias Mauch, Hiromasa Fujihara, Masataka Goto:
  Integrating Additional Chord Information Into HMM-Based Lyrics-to-Audio Alignment. 200-210
- Berlin Chen, Shih-Hsiang Lin:
  A Risk-Aware Modeling Framework for Speech Summarization. 211-222
- Richard C. Hendriks, Timo Gerkmann:
  Noise Correlation Matrix Estimation for Multi-Microphone Speech Enhancement. 223-233
- Giovanni L. Sicuranza, Alberto Carini:
  On the BIBO Stability Condition of Adaptive Recursive FLANN Filters With Application to Nonlinear Active Noise Control. 234-245
- Francesco Nesta, Maurizio Omologo:
  Generalized State Coherence Transform for Multidimensional TDOA Estimation of Multiple Sources. 246-260
- Yasmín Montenegro M., José Carlos M. Bermudez:
  Transient Mean-Square Analysis of Prediction Error Method-Based Adaptive Feedback Cancellation in Hearing Aids. 261-275
- Lei Xie, Lilei Zheng, Zihan Liu, Yanning Zhang:
  Laplacian Eigenmaps for Automatic Story Segmentation of Broadcast News. 276-289
- Norberto Degara, Enrique Argones-Rúa, Antonio S. Pena, Soledad Torres-Guijarro, Matthew E. P. Davies, Mark D. Plumbley:
  Reliability-Informed Beat Tracking of Musical Signals. 290-301
- Jen-Tzung Chien, Hsin-Lung Hsieh:
  Convex Divergence ICA for Blind Source Separation. 302-313
- Hangil Moon:
  A Low-Complexity Design for an MP3 Multi-Channel Audio Decoding System. 314-321
- Celia Shahnaz, Wei-Ping Zhu, M. Omair Ahmad:
  Pitch Estimation Based on a Harmonic Sinusoidal Autocorrelation Model and a Time-Domain Matching Scheme. 322-335
- Claudio Garretón, Néstor Becerra Yoma:
  Telephone Channel Compensation in Speaker Verification Using a Polynomial Approximation in the Log-Filter-Bank Energy Domain. 336-341
- Vishweshwara Rao, Pradeep Gaddipati, Preeti Rao:
  Signal-Driven Window-Length Adaptation for Sinusoid Detection in Polyphonic Music. 342-348
Volume 20, Number 2, February 2012
- Xavier Anguera Miró, Simon Bozonnet, Nicholas W. D. Evans, Corinne Fredouille, Gerald Friedland, Oriol Vinyals:
  Speaker Diarization: A Review of Recent Research. 356-370
- Gerald Friedland, Adam Janin, David Imseng, Xavier Anguera Miró, Luke R. Gottlieb, Marijn Huijbregts, Mary Tai Knox, Oriol Vinyals:
  The ICSI RT-09 Speaker Diarization System. 371-381
- Nicholas W. D. Evans, Simon Bozonnet, Dong Wang, Corinne Fredouille, Raphaël Troncy:
  A Comparative Study of Bottom-Up and Top-Down Approaches to Speaker Diarization. 382-392
- Marijn Huijbregts, David A. van Leeuwen, Chuck Wooters:
  Speaker Diarization Error Analysis Using Oracle Components. 393-403
- Marijn Huijbregts, David A. van Leeuwen:
  Large-Scale Speaker Diarization for Long Recordings and Small Collections. 404-413
- Oshry Ben-Harush, Itshak Lapidot, Hugo Guterman:
  Initialization of Iterative-Based Speaker Diarization Systems for Telephone Conversations. 414-425
- José Manuel Pardo, Roberto Barra-Chicote, Rubén San Segundo, Ricardo de Córdoba, Beatriz Martínez-González:
  Speaker Diarization Features: The UPM Contribution to the RT09 Evaluation. 426-435
- Martin Zelenák, Carlos Segura, Jordi Luque, Javier Hernando:
  Simultaneous Speech Detection With Spatial Features for Speaker Diarization. 436-446
- Katsuhiko Ishiguro, Takeshi Yamada, Shoko Araki, Tomohiro Nakatani, Hiroshi Sawada:
  Probabilistic Speaker Diarization With Bag-of-Words Representations of Speaker Angle Information. 447-460
- Tin Lay Nwe, Hanwu Sun, Bin Ma, Haizhou Li:
  Speaker Clustering and Cluster Purification Methods for RT07 and RT09 Evaluation Meeting Data. 461-473
- Fernando Batista, Helena Moniz, Isabel Trancoso, Nuno J. Mamede:
  Bilingual Experiments on Automatic Recovery of Capitalization and Punctuation of Automatic Speech Transcripts. 474-485
- Thomas Hain, Lukás Burget, John Dines, Philip N. Garner, Frantisek Grézl, Asmaa El Hannani, Marijn Huijbregts, Martin Karafiát, Mike Lincoln, Vincent Wan:
  Transcribing Meetings With the AMIDA Systems. 486-498
- Takaaki Hori, Shoko Araki, Takuya Yoshioka, Masakiyo Fujimoto, Shinji Watanabe, Takanobu Oba, Atsunori Ogawa, Kazuhiro Otsuka, Dan Mikami, Keisuke Kinoshita, Tomohiro Nakatani, Atsushi Nakamura, Junji Yamato:
  Low-Latency Real-Time Meeting Recognition and Understanding Using Distant Microphones and Omni-Directional Camera. 499-513
- Joan Serrà, Holger Kantz, Xavier Serra, Ralph G. Andrzejak:
  Predictability of Music Descriptor Time Series and its Application to Cover Song Detection. 514-525
- Marco Dinarelli, Alessandro Moschitti, Giuseppe Riccardi:
  Discriminative Reranking for Spoken Language Understanding. 526-539
- Ebru Arisoy, Murat Saraclar, Brian Roark, Izhak Shafran:
  Discriminative Language Modeling With Linguistic and Statistically Derived Features. 540-550
- Björn Hoffmeister, Georg Heigold, David Rybach, Ralf Schlüter, Hermann Ney:
  WFST Enabled Solutions to ASR Problems: Beyond HMM Decoding. 551-564
- Alberto Sanchís, Alfons Juan, Enrique Vidal:
  A Word-Based Naïve Bayes Classifier for Confidence Estimation in Speech Recognition. 565-574
- Wen Zhang, Mengqiu Zhang, Rodney A. Kennedy, Thushara D. Abhayapala:
  On High-Resolution Head-Related Transfer Function Measurements: An Efficient Sampling Scheme. 575-584
- Sungrack Yun, Chang D. Yoo:
  Loss-Scaled Large-Margin Gaussian Mixture Models for Speech Emotion Classification. 585-598
- Nima Yousefian, Philipos C. Loizou:
  A Dual-Microphone Speech Enhancement Algorithm Based on the Coherence Function. 599-609
- Laura E. Boucheron, Phillip L. De Leon, Steven Sandoval:
  Low Bit-Rate Speech Coding Through Quantization of Mel-Frequency Cepstral Coefficients. 610-619
- Nam Soo Kim, Tae Gyoon Kang, Shin Jae Kang, Chang Woo Han, Doo Hwa Hong:
  Speech Feature Mapping Based on Switching Linear Dynamic System. 620-631
- Yi-Cheng Pan, Hung-yi Lee, Lin-Shan Lee:
  Interactive Spoken Document Retrieval With Suggested Key Terms Ranked by a Markov Decision Process. 632-645
- Jake Gunther:
  Learning Echo Paths During Continuous Double-Talk Using Semi-Blind Source Separation. 646-660
- Meng Yu, Wenye Ma, Jack Xin, Stanley J. Osher:
  Multi-Channel l1 Regularized Convex Speech Enhancement Model and Fast Computation by the Split Bregman Method. 661-675
- Hüseyin Hacihabiboglu, Zoran Cvetkovic:
  Multichannel Dereverberation Theorems and Robustness Issues. 676-689
- Laura Romoli, Stefania Cecchi, Paolo Peretti, Francesco Piazza:
  A Mixed Decorrelation Approach for Stereo Acoustic Echo Cancellation Based on the Estimation of the Fundamental Frequency. 690-698
- Jacob Benesty, Mehrez Souden, Yiteng Huang:
  A Perspective on Differential Microphone Arrays in the Context of Noise Reduction. 699-704
- Frédéric Mustière, Martin Bouchard, Miodrag Bolic:
  All-Pole Modeling of Discrete Spectral Powers: A Unified Approach. 705-708
- Takayuki Arai, Nao Hodoshima, Keiichi Yasu:
  Errata to "Using Steady-State Suppression to Improve Speech Intelligibility in Reverberant Environments for Elderly Listeners". 709
Volume 20, Number 3, March 2012
- Kazuyoshi Yoshii, Masataka Goto:
  A Nonparametric Bayesian Multipitch Analyzer Based on Infinite Latent Harmonic Allocation. 717-730
- Siddika Parlak, Murat Saraclar:
  Performance Analysis and Improvement of Turkish Broadcast News Retrieval. 731-741
- Haohai Sun, Shefeng Yan, U. Peter Svensson:
  Optimal Higher Order Ambisonics Encoding With Predefined Constraints. 742-754
- Mitchell McLaren, David A. van Leeuwen:
  Source-Normalized LDA for Robust Speaker Recognition Using i-Vectors From Multiple Speech Sources. 755-766
- Elias K. Kokkinis, Joshua D. Reiss, John Mourjopoulos:
  A Wiener Filter Approach to Microphone Leakage Reduction in Close-Microphone Applications. 767-779
- Qiang Fu, Yong Zhao, Biing-Hwang Juang:
  Automatic Speech Recognition Based on Non-Uniform Error Criteria. 780-793
- Heiga Zen, Mark J. F. Gales, Yoshihiko Nankaku, Keiichi Tokuda:
  Product of Experts for Statistical Parametric Speech Synthesis. 794-805
- Elina Helander, Hanna Silén, Tuomas Virtanen, Moncef Gabbouj:
  Voice Conversion Using Dynamic Kernel Partial Least Squares Regression. 806-817
- Ning Ma, Jon Barker, Heidi Christensen, Phil D. Green:
  Combining Speech Fragment Decoding and Adaptive Noise Floor Modeling. 818-827
- Liang-Che Sun, Lin-Shan Lee:
  Modulation Spectrum Equalization for Improved Robust Speech Recognition. 828-843
- Matija Marolt:
  Automatic Transcription of Bell Chiming Recordings. 844-853
- Emanuël Anco Peter Habets, Jacob Benesty, Patrick A. Naylor:
  A Speech Distortion and Interference Rejection Constraint Beamformer. 854-867
- Yousheng Chen, Qin Gong:
  A Normalized Beamforming Algorithm for Broadband Speech Using a Continuous Interleaved Sampling Strategy. 868-874
- Sabato Marco Siniscalchi, Dau-Cheng Lyu, Torbjørn Svendsen, Chin-Hui Lee:
  Experiments on Cross-Language Attribute Detection and Phone Recognition With Minimal Target-Specific Training Data. 875-887
- Xiang Lin, Andy W. H. Khong, Patrick A. Naylor:
  A Forced Spectral Diversity Algorithm for Speech Dereverberation in the Presence of Near-Common Zeros. 888-899
- Yu-Hsiang Bosco Chiu, Bhiksha Raj, Richard M. Stern:
  Learning-Based Auditory Encoding for Robust Speech Recognition. 900-914
- Amir Adler, Valentin Emiya, Maria G. Jafari, Michael Elad, Rémi Gribonval, Mark D. Plumbley:
  Audio Inpainting. 922-932
- Ana M. Barbancho, Anssi Klapuri, Lorenzo J. Tardón, Isabel Barbancho:
  Automatic Transcription of Guitar Chords and Fingering From Audio. 915-921
- Wei Chu, Abeer Alwan:
  SAFE: A Statistical Approach to F0 Estimation Under Clean and Noisy Conditions. 933-944
- Ashish Panda, Thambipillai Srikanthan:
  Psychoacoustic Model Compensation for Robust Speaker Verification in Environmental Noise. 945-953
- Emanuël A. P. Habets, Jacob Benesty:
  A Perspective on Frequency-Domain Beamformers in Room Acoustics. 947-960
- Thomas Drugman, Thierry Dutoit:
  The Deterministic Plus Stochastic Model of the Residual Signal and Its Applications. 968-981
- Shing-Chow Chan, Y. Chu:
  Performance Analysis and Design of FxLMS Algorithm in Broadband ANC System With Online Secondary-Path Modeling. 982-993
- Thomas Drugman, Mark R. P. Thomas, Jón Guðnason, Patrick A. Naylor, Thierry Dutoit:
  Detection of Glottal Closure Instants From Speech Signals: A Quantitative Review. 994-1006
- Alfonso Pérez Carrillo, Jordi Bonada, Esteban Maestre, Enric Guaus, Merlijn Blaauw:
  Performance Control Driven Violin Timbre Model Based on Neural Networks. 1007-1021
- Ravi K. Chivukula, Yuriy A. Reznik, Venkat Devarajan, Mythreya Jayendra-Lakshman:
  Fast Algorithms for Low-Delay SBR Filterbanks in MPEG-4 AAC-ELD. 1022-1031
- Xianyu Zhao, Yuan Dong:
  Variational Bayesian Joint Factor Analysis Models for Speaker Verification. 1032-1042
- Ashutosh Pandey, V. John Mathews:
  Adaptive Gain Processing With Offending Frequency Suppression for Digital Hearing Aids. 1043-1055
- Tamar Shoham, David Malah, Slava Shechtman:
  Quality Preserving Compression of a Concatenative Text-To-Speech Acoustic Database. 1056-1068
- Vladimir Despotovic, Norbert Goertz, Zoran Peric:
  Nonlinear Long-Term Prediction of Speech Based on Truncated Volterra Series. 1069-1073
- Siow Yong Low, Svetha Venkatesh, Sven Nordholm:
  A Spectral Slit Approach to Doubletalk Detection. 1074-1080
Volume 20, Number 4, May 2012
- Seiichi Nakagawa, Longbiao Wang, Shinji Ohtsuka:
  Speaker Identification and Verification by Combining MFCC and Phase Information. 1085-1095
- Riccardo Miotto, Gert R. G. Lanckriet:
  A Generative Context Model for Semantic Music Annotation and Retrieval. 1096-1108
- Chu-Cheng Lin, Richard Tzong-Han Tsai:
  A Generative Data Augmentation Model for Enhancing Chinese Dialect Pronunciation Prediction. 1109-1117
- Alexey Ozerov, Emmanuel Vincent, Frédéric Bimbot:
  A General Flexible Framework for the Handling of Prior Information in Audio Source Separation. 1118-1133
- Jia-Min Ren, Jyh-Shing Roger Jang:
  Discovering Time-Constrained Sequential Patterns for Music Genre Classification. 1134-1144
- Virginia Estellers, Mihai Gurban, Jean-Philippe Thiran:
  On Dynamic Stream Weighting for Audio-Visual Speech Recognition. 1145-1157
- Navin Chatlani, John J. Soraghan:
  EMD-Based Filtering (EMDF) of Low-Frequency Noise for Speech Enhancement. 1158-1166
- Haiyan Shu, Haibin Huang, Susanto Rahardja:
  Analysis of Bit-Plane Probability for Generalized Gaussian Distribution and its Application in Audio Coding. 1167-1176
- Tobias Rosenkranz, Henning Puder:
  Improving Robustness of Codebook-Based Noise Estimation Approaches With Delta Codebooks. 1177-1188
- Ines Hafizovic, Carl-Inge Colombo Nilsen, Sverre Holm:
  Transformation Between Uniform Linear and Spherical Microphone Arrays With Symmetric Responses. 1189-1195
- Xiaohong Yang, Yufang Yang:
  Prosodic Realization of Rhetorical Structure in Chinese Discourse. 1196-1206
- David T. Yeh:
  Automated Physical Modeling of Nonlinear Audio Circuits for Real-Time Audio Effects - Part II: BJT and Vacuum Tube Examples. 1207-1216
- Manish Narwaria, Weisi Lin, Ian Vince McLoughlin, Sabu Emmanuel, Liang-Tien Chia:
  Nonintrusive Quality Assessment of Noise Suppressed Speech With Mel-Filtered Energies and Support Vector Regression. 1217-1232
- Wei-Ho Tsai, Hsin-Chieh Lee:
  Automatic Evaluation of Karaoke Singing Based on Pitch, Volume, and Rhythm Features. 1233-1243
- Takanobu Oba, Takaaki Hori, Atsushi Nakamura, Akinori Ito:
  Round-Robin Duel Discriminative Language Models. 1244-1255
- Yiteng Arden Huang, Jacob Benesty:
  A Multi-Frame Approach to the Frequency-Domain Single-Channel Noise Reduction Problem. 1256-1269
- Miroslav Zivanovic, Johan Schoukens:
  Single and Piecewise Polynomials for Modeling of Pitched Sounds. 1270-1281
- Yaakov Bucris, Israel Cohen, Miriam A. Doron:
  Bayesian Focusing for Coherent Wideband Beamforming. 1282-1296
- Hélène Papadopoulos, Geoffroy Peeters:
  Local Key Estimation From an Audio Signal Relying on Harmonic and Metrical Structures. 1297-1312
- Elizabeth Godoy, Olivier Rosec, Thierry Chonavel:
  Voice Conversion Using Dynamic Frequency Warping With Amplitude Scaling, for Parallel or Nonparallel Corpora. 1313-1323
- Ruofei Chen, Cheung-Fat Chan, Hing-Cheung So:
  Model-Based Speech Enhancement With Improved Spectral Envelope Estimation via Dynamics Tracking. 1324-1336
- Qun Feng Tan, Shrikanth S. Narayanan:
  Novel Variations of Group Sparse Regularization Techniques With Applications to Noise Robust Automatic Speech Recognition. 1337-1346
- Rubén Solera-Ureña, Ana I. García-Moral, Carmen Peláez-Moreno, Manel Martínez-Ramón, Fernando Díaz-de-María:
  Real-Time Robust Automatic Speech Recognition Using Compact Support Vector Machines. 1347-1361
- Amin Fazel, Shantanu Chakrabartty:
  Sparse Auditory Reproducing Kernel (SPARK) Features for Noise-Robust Speech Recognition. 1362-1371
- Jorge I. Marin-Hurtado, Devangi N. Parikh, David V. Anderson:
  Perceptually Inspired Noise-Reduction Method for Binaural Hearing Aids. 1372-1382
- Timo Gerkmann, Richard C. Hendriks:
  Unbiased MMSE-Based Noise Power Estimation With Low Complexity and Low Tracking Delay. 1383-1393
- Haiquan Zhao, Xiangping Zeng, Xiaoqiang Zhang, Zhengyou He, Tianrui Li, Weidong Jin:
  Adaptive Extended Pipelined Second-Order Volterra Filter for Nonlinear Active Noise Controller. 1394-1399
- Damián Marelli, Mitsuko Aramaki, Richard Kronland-Martinet, Charles Verron:
  An Efficient Time-Frequency Method for Synthesizing Noisy Sounds With Short Transients and Narrow Spectral Components. 1400-1408
- Maurice F. Fallon, Simon J. Godsill:
  Acoustic Source Localization and Tracking of a Time-Varying Number of Speakers. 1409-1415
Volume 20, Number 5, July 2012
- Vesa Välimäki, Julian D. Parker, Lauri Savioja, Julius O. Smith III, Jonathan S. Abel:
  Fifty Years of Artificial Reverberation. 1421-1448
- Flavio P. Ribeiro, Dinei A. F. Florêncio, Demba E. Ba, Cha Zhang:
  Geometrically Constrained Room Modeling With Compact Microphone Arrays. 1449-1460
- Wenliang Chen, Jun'ichi Kazama, Min Zhang, Yoshimasa Tsuruoka, Yujie Zhang, Yiou Wang, Kentaro Torisawa, Haizhou Li:
  Bitext Dependency Parsing With Auto-Generated Bilingual Treebank. 1461-1472
- K. Lakhdhar, Roch Lefebvre:
  Context-Based Adaptive Arithmetic Encoding of EAVQ Indices. 1473-1481
- Chao-Ling Hsu, DeLiang Wang, Jyh-Shing Roger Jang, Ke Hu:
  A Tandem Algorithm for Singing Pitch Extraction and Voice Separation From Music Accompaniment. 1482-1491
- Zhen-Hua Ling, Li-Rong Dai:
  Minimum Kullback-Leibler Divergence Parameter Generation for HMM-Based Speech Synthesis. 1492-1502
- John Woodruff, DeLiang Wang:
  Binaural Localization of Multiple Sources in Reverberant and Noisy Environments. 1503-1512
- Welly Naptali, Masatoshi Tsuchiya, Seiichi Nakagawa:
  Topic-Dependent-Class-Based $n$-Gram Language Model. 1513-1525
- Jesper Rindom Jensen, Jacob Benesty, Mads Græsbøll Christensen, Søren Holdt Jensen:
  Non-Causal Time-Domain Filters for Single-Channel Noise Reduction. 1526-1541
- Kamil Adiloglu, Robert Anniés, Elio Wahlen, Hendrik Purwins, Klaus Obermayer:
  A Graphical Representation and Dissimilarity Measure for Basic Everyday Sound Events. 1542-1552
- Cees H. Taal, Richard C. Hendriks, Richard Heusdens:
  A Low-Complexity Spectro-Temporal Distortion Measure for Audio Processing Applications. 1553-1564
- Huawei Chen, Wee Ser, Jianjiang Zhou:
  Robust Nearfield Wideband Beamformer Design Using Worst Case Mean Performance Optimization With Passband Response Variance Constraint. 1565-1572
- D. Rama Sanand, Srinivasan Umesh:
  VTLN Using Analytically Determined Linear-Transformation on Conventional MFCC. 1573-1584
- Sandro Cumani, Pietro Laface:
  Analysis of Large-Scale SVM Training Algorithms for Language and Speaker Recognition. 1585-1596
- Xiaoyan Cai, Wenjie Li:
  Mutually Reinforced Manifold-Ranking Based Relevance Propagation Model for Query-Focused Multi-Document Summarization. 1597-1607
- Xiaojia Zhao, Yang Shao, DeLiang Wang:
  CASA-Based Robust Speaker Identification. 1608-1616
- Saeed Mosayyebpour, Hamid Sheikhzadeh, T. Aaron Gulliver, Morteza Esmaeili:
  Single-Microphone LP Residual Skewness-Based Inverse Filtering of the Room Impulse Response. 1617-1632
- Upendra V. Chaudhari, Michael Picheny:
  Matching Criteria for Vocabulary-Independent Search. 1633-1643
- Daniele Giacobello, Mads Græsbøll Christensen, Manohar N. Murthi, Søren Holdt Jensen, Marc Moonen:
  Sparse Linear Prediction and Its Applications to Speech Processing. 1644-1657
- Stefan Bilbao:
  Optimized FDTD Schemes for 3-D Acoustic Wave Propagation. 1658-1663
Volume 20, Number 6, August 2012
- Sin-Horng Chen, Jyh-Her Yang, Chen-Yu Chiang, Ming-Chieh Liu, Yih-Ru Wang:
  A New Prosody-Assisted Mandarin ASR System. 1669-1684
- Romain Serizel, Marc Moonen, Jan Wouters, Søren Holdt Jensen:
  A Zone-of-Quiet Based Approach to Integrated Active Noise Control and Noise Reduction for Speech Enhancement in Hearing Aids. 1685-1697
- Christian D. Sigg, Tomas Dikk, Joachim M. Buhmann:
  Speech Enhancement Using Generative Dictionary Learning. 1698-1712
- Heiga Zen, Norbert Braunschweiler, Sabine Buchholz, Mark J. F. Gales, Kate M. Knill, Sacha Krstulovic, Javier Latorre:
  Statistical Parametric Speech Synthesis Based on Speaker and Language Factorization. 1713-1724
- Christian Schüldt, Fredric Lindström, Ingvar Claesson:
  A Delay-Based Double-Talk Detector. 1725-1733
- Alastair J. Manders, David M. Simpson, Steven L. Bell:
  Objective Prediction of the Sound Quality of Music Processed by an Adaptive Feedback Canceller. 1734-1745
- Shoichi Koyama, Ken'ichi Furuya, Yusuke Hiwasaki, Yoichi Haneda:
  Reproducing Virtual Sound Sources in Front of a Loudspeaker Array Using Inverse Wave Propagator. 1746-1758
- Justin Salamon, Emilia Gómez:
  Melody Extraction From Polyphonic Music Signals Using Pitch Contour Characteristics. 1759-1770
- Yizhao Ni, Matt McVicar, Raúl Santos-Rodriguez, Tijl De Bie:
  An End-to-End Machine Learning System for Harmonic Analysis of Music. 1771-1783
- Daisuke Saito, Shinji Watanabe, Atsushi Nakamura, Nobuaki Minematsu:
  Statistical Voice Conversion Based on Noisy Channel Model. 1784-1794
- Daniel Angus, Andrew E. Smith, Janet Wiles:
  Human Communication as Coupled Time Series: Quantifying Multi-Participant Recurrence. 1795-1807
- Claire Masterson, Gavin Kearney, Marcin Gorzel, Francis M. Boland:
  HRIR Order Reduction Using Approximate Factorization. 1808-1817
- Jan Vanek, Jan Trmal, Josef V. Psutka, Josef Psutka:
  Optimized Acoustic Likelihoods Computation for NVIDIA and ATI/AMD Graphics Processors. 1818-1828
- Jan Ole Jungmann, Radoslaw Mazur, Markus Kallinger, Tiemin Mei, Alfred Mertins:
  Combined Acoustic MIMO Channel Crosstalk Cancellation and Room Impulse Response Reshaping. 1829-1842
- Tianyu T. Wang, Thomas F. Quatieri:
  Two-Dimensional Speech-Signal Modeling. 1843-1856
- Isabel Barbancho, Lorenzo J. Tardón, Simone Sammartino, Ana M. Barbancho:
  Inharmonicity-Based Method for the Automatic Generation of Guitar Tablature. 1857-1868
- Amit Das, John H. L. Hansen:
  Constrained Iterative Speech Enhancement Using Phonetic Classes. 1869-1883
- Abbas Keshavarz, Saeed Mosayyebpour, Mehrzad Biguesh, T. Aaron Gulliver, Morteza Esmaeili:
  Speech-Model Based Accurate Blind Reverberation Time Estimation Using an LPC Filter. 1884-1893
- Anil Kumar Vuppala, Jainath Yadav, Saswat Chakrabarti, K. Sreenivasa Rao:
  Vowel Onset Point Detection for Low Bit Rate Coded Speech. 1894-1903
Volume 20, Number 7, 2012
- Theodoros Giannakopoulos, Sergios Petridis:
  Fisher Linear Semi-Discriminant Analysis for Speaker Diarization. 1913-1922
- Xiaodong Cui, Jing Huang, Jen-Tzung Chien:
  Multi-View and Multi-Objective Semi-Supervised Learning for HMM-Based Automatic Speech Recognition. 1923-1935
- Jacob L. Newman, Stephen J. Cox:
  Language Identification Using Visual Features. 1936-1947
- Jesper Rindom Jensen, Jacob Benesty, Mads Græsbøll Christensen, Søren Holdt Jensen:
  Enhancement of Single-Channel Periodic Signals in the Time-Domain. 1948-1963
- Marco Compagnoni, Paolo Bestagini, Fabio Antonacci, Augusto Sarti, Stefano Tubaro:
  Localization of Acoustic Sources Through the Fitting of Propagation Cones Using Multiple Independent Arrays. 1964-1975
- Jung-Woo Choi, Yang-Hann Kim:
  Integral Approach for Reproduction of Virtual Sound Source Surrounded by Loudspeaker Array. 1976-1989
- Tomi Kinnunen, Rahim Saeidi, Filip Sedlak, Kong-Aik Lee, Johan Sandberg, Maria Hansson-Sandsten, Haizhou Li:
  Low-Variance Multitaper MFCC Features: A Case Study in Robust Speaker Verification. 1990-2001
- Wen-Lin Zhang, Weiqiang Zhang, Bi-Cheng Li, Dan Qu, Michael T. Johnson:
  Bayesian Speaker Adaptation Based on a New Hierarchical Probabilistic Model. 2002-2015
- Tobias May, Steven van de Par, Armin Kohlrausch:
  A Binaural Scene Analyzer for Joint Localization and Recognition of Speakers in the Presence of Interfering Noise Sources and Reverberation. 2016-2030
- Armando Muscariello, Guillaume Gravier, Frédéric Bimbot:
  Unsupervised Motif Acquisition in Speech via Seeded Discovery and Template Matching Combination. 2031-2044
- César González Ferreras, David Escudero Mancebo, Carlos Vivaracho-Pascual, Valentín Cardeñoso-Payo:
  Improving Automatic Classification of Prosodic Events by Pairwise Coupling. 2045-2058
- Maximo Cobos, José J. López:
  Maximum a Posteriori Binary Mask Estimation for Underdetermined Source Separation Using Smoothed Posteriors. 2059-2064
- Sarmad Malik, Gerald Enzner:
  State-Space Frequency-Domain Adaptive Filtering for Nonlinear Acoustic Echo Cancellation. 2065-2079
- Ryoichi Miyazaki, Hiroshi Saruwatari, Takayuki Inoue, Yu Takahashi, Kiyohiro Shikano, Kazunobu Kondo:
  Musical-Noise-Free Speech Enhancement Based on Optimized Iterative Spectral Subtraction. 2080-2094
- Hung-yi Lee, Chia-Ping Chen, Lin-Shan Lee:
  Integrating Recognition and Retrieval With Relevance Feedback for Spoken Term Detection. 2095-2110
- Mohamed I. Alkanhal, Mohamed Al-Badrashiny, Mansour M. Alghamdi, Abdulaziz O. Al-Qabbany:
  Automatic Stochastic Arabic Spelling Correction With Emphasis on Space Insertions and Deletions. 2111-2122
- Stephen J. Elliott, Jordan Cheer, Jung-Woo Choi, Youngtae Kim:
  Robustness and Regularization of Personal Audio Systems. 2123-2133
- Lakshmi Babu Saheer, John Dines, Philip N. Garner:
  Vocal Tract Length Normalization for Statistical Parametric Speech Synthesis. 2134-2148
- Yongqiang Wang, Mark J. F. Gales:
  Speaker and Noise Factorization for Robust Speech Recognition. 2149-2158
Volume 20, Number 8, October 2012
- Leonardo O. Nunes, Flávio R. Avila, Alan Freihof Tygel, Luiz W. P. Biscainho, Bowon Lee, Amir Said, Ronald W. Schafer:
  A Parametric Objective Quality Assessment Tool for Speech Signals Degraded by Acoustic Echo. 2181-2190
- Yong Zhao, Biing-Hwang Juang:
  Nonlinear Compensation Using the Gauss-Newton Method for Noise-Robust Speech Recognition. 2191-2206
- Brian McFee, Luke Barrington, Gert R. G. Lanckriet:
  Learning Content Similarity for Music Recommendation. 2207-2218
- Hannu Pulakka, Ulpu Remes, Santeri Yrttiaho, Kalle J. Palomäki, Mikko Kurimo, Paavo Alku:
  Bandwidth Extension of Telephone Speech to Low Frequencies Using Sinusoidal Synthesis and a Gaussian Mixture Model. 2219-2231
- Iynkaran Natgunanathan, Yong Xiang, Yue Rong, Wanlei Zhou, Song Guo:
  Robust Patchwork-Based Embedding and Decoding Scheme for Digital Audio Watermarking. 2232-2239
- Yotaro Kubo, Shinji Watanabe, Takaaki Hori, Atsushi Nakamura:
  Structural Classification Methods Based on Weighted Finite-State Transducers for Automatic Speech Recognition. 2240-2251
- Xiaodong Cui, Jian Xue, Xin Chen, Peder A. Olsen, Pierre L. Dognin, Upendra V. Chaudhari, John R. Hershey, Bowen Zhou:
  Hidden Markov Acoustic Modeling With Bootstrap and Restructuring for Low-Resourced Languages. 2252-2264
- Amit Das, John H. L. Hansen:
  Phoneme Selective Speech Enhancement Using Parametric Estimators and the Mixture Maximum Model: A Unifying Approach. 2265-2279
- Phillip L. De Leon, Michael Pucher, Junichi Yamagishi, Inma Hernáez, Ibon Saratxaga:
  Evaluation of Speaker Verification Security and Detection of HMM-Based Synthetic Speech. 2280-2290
- Wei-Ho Tsai, Hsin-Chieh Lee:
  Singer Identification Based on Spoken Data in Voice Characterization. 2291-2300
- Daniel Felps, Christian Geng, Ricardo Gutierrez-Osuna:
  Foreign Accent Conversion Through Concatenative Synthesis in the Articulatory Domain. 2301-2312
- Gustavo Reis, Francisco Fernández de Vega, Aníbal J. S. Ferreira:
  Automatic Transcription of Polyphonic Piano Music Using Genetic Algorithms, Adaptive Spectral Envelope Modeling, and Dynamic Noise Level Estimation. 2313-2328
- Soroosh Mariooryad, Carlos Busso:
  Generating Human-Like Behaviors Using Joint, Speech-Driven Models for Conversational Agents. 2329-2340
- Hasim Sak, Murat Saraclar, Tunga Gungor:
  Morpholexical and Discriminative Language Models for Turkish Automatic Speech Recognition. 2341-2351
- Yongwon Jeong:
  Adaptation of Hidden Markov Models Using Model-as-Matrix Representation. 2352-2364
- Seyedmahdad Mirsamadi, Shabnam Ghaffarzadegan, Hamid Sheikhzadeh, Seyed Mohammad Ahadi, Amir Hossein Rezaie:
  Efficient Frequency Domain Implementation of Noncausal Multichannel Blind Deconvolution for Convolutive Mixtures of Speech. 2365-2377
- Barry-John Theobald, Iain A. Matthews:
  Relating Objective and Subjective Performance Measures for AAM-Based Visual Speech Synthesis. 2378-2387
- Terence Betlehem, Christopher S. Withers:
  Sound Field Reproduction With Energy Constraint on Loudspeaker Weights. 2388-2392
Volume 20, Number 9, November 2012
- Nikolaos Mitianoudis:
  A Generalized Directional Laplacian Distribution: Estimation, Mixture Models and Audio Source Separation. 2397-2408
- Janne Pylkkönen, Mikko Kurimo:
  Analysis of Extended Baum-Welch and Constrained Optimization for Discriminative Training of HMMs. 2409-2419
- Alex Southern, Damian T. Murphy, Lauri Savioja:
  Spatial Encoding of Finite Difference Time Domain Acoustic Models for Auralization. 2420-2432
- Marco Crocco, Andrea Trucco:
  Stochastic and Analytic Optimization of Sparse Aperiodic Arrays and Broadband Beamformers With Robust Superdirective Patterns. 2433-2447
- Masashi Okada, Takao Onoye, Wataru Kobayashi:
  A Ray Tracing Simulation of Sound Diffraction Based on the Analytic Secondary Source Model. 2448-2460
- Ryouichi Nishimura:
  Audio Watermarking Using Spatial Masking and Ambisonics. 2461-2469
- Flávio R. Avila, Luiz W. P. Biscainho:
  Bayesian Restoration of Audio Signals Degraded by Impulsive Noise Modeled as Individual Pulses. 2470-2481
- Woojay Jeon, Changxue Ma, Dusan Macho:
  Statistical Utterance Comparison for Speaker Clustering Using Factor Analysis. 2482-2491
- Justin Jian Zhang, Pascale Fung:
  Automatic Parliamentary Meeting Minute Generation Using Rhetorical Structure Modeling. 2492-2504
- Tomoki Toda, Mikihiro Nakagiri, Kiyohiro Shikano:
  Statistical Voice Conversion Techniques for Body-Conducted Unvoiced Speech Enhancement. 2505-2517
- Arun Narayanan, DeLiang Wang:
  A CASA-Based System for Long-Term SNR Estimation. 2518-2527
- Ronen Talmon, Israel Cohen, Sharon Gannot, Ronald R. Coifman:
  Supervised Graph-Based Processing for Sequential Transient Interference Suppression. 2528-2538
- Andre Holzapfel, Matthew E. P. Davies, José Ricardo Zapata, João Lobato Oliveira, Fabien Gouyon:
  Selective Sampling for Beat Tracking Evaluation. 2539-2548
- Meng Guo, Søren Holdt Jensen, Jesper Jensen:
  Novel Acoustic Feedback Cancellation Approaches in Hearing Aid Applications Using Probe Noise and Probe Noise Enhancement. 2549-2563
- Jens Ahrens, Sascha Spors:
  A Modal Analysis of Spatial Discretization of Spherical Loudspeaker Distributions Used for Sound Field Synthesis. 2564-2574
- Vladimir Tourbabin, Morag Agmon, Boaz Rafaely, Joseph Tabrikian:
  Optimal Real-Weighted Beamforming With Application to Linear and Spherical Arrays. 2575-2585
- Pejman Mowlaee, Rahim Saeidi, Mads Græsbøll Christensen, Zheng-Hua Tan, Tomi Kinnunen, Pasi Fränti, Søren Holdt Jensen:
  A Joint Approach for Single-Channel Speaker Identification and Speech Separation. 2586-2601
- Berlin Chen, Kuan-Yu Chen, Pei-Ning Chen, Yi-Wen Chen:
  Spoken Document Retrieval With Unsupervised Query Modeling Techniques. 2602-2612
- Kruthiventi S. S. Srinivas, Kishore Prahallad:
  An FIR Implementation of Zero Frequency Filtering of Speech Signals. 2613-2617
Volume 20, Number 10, December 2012
- Mari Ostendorf:
  A Message from the Vice President of Publications on New Developments in Signal Processing Society Publications. 2625
- Sundar Harshavardhan, Chandra Sekhar Seelamantula, Thippur V. Sreenivas:
  A Mixture Model Approach for Formant Tracking and the Robustness of Student's-t Distribution. 2626-2636
- Steven Hargreaves, Anssi Klapuri, Mark B. Sandler:
  Structural Segmentation of Multitrack Audio. 2637-2647
- Matthew Gibson, Thomas Hain:
  Correctness-Adjusted Unsupervised Discriminative Acoustic Model Adaptation. 2648-2656
- Bruno Defraene, Toon van Waterschoot, Hans Joachim Ferreau, Moritz Diehl, Marc Moonen:
  Real-Time Perception-Based Clipping of Audio Signals Using Convex Optimization. 2657-2671
- Gopal Ananthakrishnan, Olov Engwall, Daniel Neiberg:
  Exploring the Predictability of Non-Unique Acoustic-to-Articulatory Mappings. 2672-2682
- Fabio Antonacci, Jason Filos, Mark R. P. Thomas, Emanuël Anco Peter Habets, Augusto Sarti, Patrick A. Naylor, Stefano Tubaro:
  Inference of Room Geometry From Acoustic Impulse Responses. 2683-2695
- João Lobato Oliveira, Matthew E. P. Davies, Fabien Gouyon, Luís Paulo Reis:
  Beat Tracking for Multiple Applications: A Multi-Agent System Architecture With State Recovery. 2696-2706
- Takuya Yoshioka, Tomohiro Nakatani:
  Generalization of Multi-Channel Linear Prediction Methods for Blind MIMO Impulse Response Shortening. 2707-2720















