


IEEE/ACM Transactions on Audio, Speech and Language Processing, Volume 24
Volume 24, Number 1, January 2016
- Sandrine Brognaux, Thomas Drugman: HMM-Based Speech Segmentation: Improvements of Fully Automatic Approaches. 5-15
- Marie Tahon, Laurence Devillers: Towards a Small Set of Robust Acoustic Features for Emotion Recognition: Challenges. 16-28
- Hamid Behravan, Ville Hautamäki, Sabato Marco Siniscalchi, Tomi Kinnunen, Chin-Hui Lee: i-Vector Modeling of Speech Attributes for Automatic Foreign Accent Recognition. 29-41
- Rahim Saeidi, Paavo Alku, Tom Bäckström: Feature Extraction Using Power-Law Adjusted Linear Prediction With Application to Speaker Recognition Under Severe Vocal Effort Mismatch. 42-53
- Iman Tabatabaei Ardekani, Jari P. Kaipio, Alireza Nasiri, Hamid R. Sharifzadeh, Waleed H. Abdulla: A Statistical Inverse Problem Approach to Online Secondary Path Modeling in Active Noise Control. 54-64
- Themos Stafylakis, Patrick Kenny, Md. Jahangir Alam, Marcel Kockmann: Speaker and Channel Factors in Text-Dependent Speaker Recognition. 65-78
- Yanzhang He, Peter Baumann, Hao Fang, Brian Hutchinson, Aaron Jaech, Mari Ostendorf, Eric Fosler-Lussier, Janet B. Pierrehumbert: Using Pronunciation-Based Morphological Subword Units to Improve OOV Handling in Keyword Search. 79-92
- Meng Sun, Xiongwei Zhang, Hugo Van hamme, Thomas Fang Zheng: Unseen Noise Estimation Using Separable Deep Auto Encoder for Speech Enhancement. 93-104
- Luciana Ferrer, Yun Lei, Mitchell McLaren, Nicolas Scheffer: Study of Senone-Based Deep Neural Network Approaches for Spoken Language Recognition. 105-116
- Stefan Ingi Adalbjornsson, Ted Kronvall, Simon Burgess, Kalle Åström, Andreas Jakobsson: Sparse Localization of Harmonic Audio Sources. 117-129
- Man-Wai Mak, Xiaomin Pang, Jen-Tzung Chien: Mixture of PLDA for Noise Robust I-Vector Speaker Verification. 130-142
- Craig A. Anderson, Paul D. Teal, Mark A. Poletti: Spatial Correlation of Radial Gaussian and Uniform Spherical Volume Near-Field Source Distributions. 143-150
- Humberto M. Torres, Jorge A. Gurlekian: Novel Estimation Method for the Superpositional Intonation Model. 151-160
- Stefan Bilbao, Brian Hamilton, Jonathan Botts, Lauri Savioja: Finite Volume Time Domain Room Acoustics Simulation under General Impedance Boundary Conditions. 161-173
- Amir Hossein Harati Nejad Torbati, Joseph Picone: A Doubly Hierarchical Dirichlet Process Hidden Markov Model with a Non-Ergodic Structure. 174-184
- Jen-Tzung Chien, Po-Kai Yang: Bayesian Factorization and Learning for Monaural Source Separation. 185-195
- David Lou Alon, Boaz Rafaely: Beamforming with Optimal Aliasing Cancellation in Spherical Microphone Arrays. 196-210
Volume 24, Number 2, February 2016
- Eugen Rasumow, Martin Hansen, Steven van de Par, Dirk Puschel, Volker Mellert, Simon Doclo, Matthias Blau: Regularization Approaches for Synthesizing HRTF Directivity Patterns. 215-225
- Chao Pan, Jacob Benesty, Jingdong Chen: Design of Directivity Patterns with a Unique Null of Maximum Multiplicity. 226-235
- Jeih-Weih Hung, Hsin-Ju Hsieh, Berlin Chen: Robust Speech Recognition via Enhancing the Complex-Valued Acoustic Spectrum in Modulation Domain. 236-251
- Xiao-Lei Zhang, DeLiang Wang: Boosting Contextual Information for Deep Neural Network Based Voice Activity Detection. 252-264
- M. A. Tugtekin Turan, Engin Erzin: Source and Filter Estimation for Throat-Microphone Speech Enhancement. 265-275
- Nasser Mohammadiha, Simon Doclo: Speech Dereverberation Using Non-Negative Convolutive Transfer Function and Spectro-Temporal Modeling. 276-289
- Anil Sharma, Sanjit K. Kaul: Two-Stage Supervised Learning-Based Method to Detect Screams and Cries in Urban Environments. 290-299
- Xiaoguang Wu, Huawei Chen: Directivity Factors of the First-Order Steerable Differential Array With Microphone Mismatches: Deterministic and Worst-Case Analysis. 300-315
- Andreas I. Koutrouvelis, George P. Kafentzis, Nikolay D. Gaubitch, Richard Heusdens: A Fast Method for High-Resolution Voiced/Unvoiced Detection and Glottal Closure/Opening Instant Estimation of Speech. 316-328
- Tomohiko Nakamura, Eita Nakamura, Shigeki Sagayama: Real-Time Audio-to-Score Alignment of Music Performances Containing Errors and Arbitrary Repeats and Skips. 329-339
- Adrian Bahne, Anders Ahlén: Optimizing the Similarity of Loudspeaker-Room Responses in Multiple Listening Positions. 340-353
- James M. Kates, Kathryn Hoberg Arehart: The Hearing-Aid Audio Quality Index (HAAQI). 354-365
- Henning F. Schepker, Simon Doclo: A Semidefinite Programming Approach to Min-max Estimation of the Common Part of Acoustic Feedback Paths in Hearing Aids. 366-377
- Bong-Ki Lee, Joon-Hyuk Chang: Packet Loss Concealment Based on Deep Neural Networks for Digital Speech Transmission. 378-387
- Luisa Bentivogli, Nicola Bertoldi, Mauro Cettolo, Marcello Federico, Matteo Negri, Marco Turchi: On the Evaluation of Adaptive Machine Translation for Human Post-Editing. 388-399
Volume 24, Number 3, March 2016
- Reinhard Sonnleitner, Gerhard Widmer: Robust Quad-Based Audio Fingerprinting. 409-421
- Li Dong, Furu Wei, Ke Xu, Shixia Liu, Ming Zhou: Adaptive Multi-Compositionality for Recursive Neural Network Models. 422-431
- Zheng Lin, Xiaolong Jin, Xueke Xu, Yuanzhuo Wang, Xueqi Cheng, Weiping Wang, Dan Meng: An Unsupervised Cross-Lingual Topic Model Framework for Sentiment Classification. 432-444
- Anil M. Nagathil, Claus Weihs, Rainer Martin: Spectral Complexity Reduction of Music Signals for Mitigating Effects of Cochlear Hearing Loss. 445-458
- Tian Tan, Yanmin Qian, Kai Yu: Cluster Adaptive Training for Deep Neural Network Based Acoustic Model. 459-468
- Arne Leijon, Gustav Eje Henter, Martin Dahlquist: Bayesian Analysis of Phoneme Confusion Matrices. 469-482
- Donald S. Williamson, Yuxuan Wang, DeLiang Wang: Complex Ratio Masking for Monaural Speech Separation. 483-492
- Johannes Traa, David Wingate, Noah D. Stein, Paris Smaragdis: Robust Source Localization and Enhancement With a Probabilistic Steered Response Power Model. 493-503
- Sven Ewan Shepstone, Kong-Aik Lee, Haizhou Li, Zheng-Hua Tan, Søren Holdt Jensen: Total Variability Modeling Using Source-Specific Priors. 504-517
- Martin Schneider, Walter Kellermann: Multichannel Acoustic Echo Cancellation in the Wave Domain With Increased Robustness to Nonuniqueness. 518-529
- Ken O'Hanlon, Hidehisa Nagano, Nicolas Keriven, Mark D. Plumbley: Non-Negative Group Sparsity with Subspace Note Modelling for Polyphonic Transcription. 530-542
- Elior Hadad, Simon Doclo, Sharon Gannot: The Binaural LCMV Beamformer and its Performance Analysis. 543-558
- Felipe Grijalva, Luiz Martini, Dinei Florêncio, Siome Goldenstein: A Manifold Learning Approach for Personalizing HRTFs from Anthropometric Features. 559-570
- Lin Wang, Simon Doclo: Correlation Maximization-Based Sampling Rate Offset Estimation for Distributed Microphone Arrays. 571-582
- Nasim Radmanesh, Ian S. Burnett, Bhaskar D. Rao: A Lasso-LS Optimization with a Frequency Variable Dictionary in a Multizone Sound System. 583-593
- Xin Liu, Changchun Bao: Audio Bandwidth Extension Based on Ensemble Echo State Networks with Temporal Evolution. 594-607
Volume 24, Number 4, April 2016
- Peifeng Li, Guodong Zhou: Joint Argument Inference in Chinese Event Extraction with Argument Consistency and Event Relevance. 612-622
- Jianming Liu, Steven L. Grant: Proportionate Adaptive Filtering for Block-Sparse System Identification. 623-630
- Jesper Rindom Jensen, Jacob Benesty, Mads Græsbøll Christensen: Noise Reduction with Optimal Variable Span Linear Filters. 631-644
- Sidsel Marie Nørholm, Jesper Rindom Jensen, Mads Græsbøll Christensen: Enhancement and Noise Statistics Estimation for Non-Stationary Voiced Speech. 645-658
- Daryush D. Mehta, Jarrad H. Van Stan, Robert E. Hillman: Relationships Between Vocal Function Measures Derived from an Acoustic Microphone and a Subglottal Neck-Surface Accelerometer. 659-668
- Herman Kamper, Aren Jansen, Sharon Goldwater: Unsupervised Word Segmentation and Lexicon Discovery Using Acoustic Word Embeddings. 669-679
- Ina Kodrasi, Simon Doclo: Joint Dereverberation and Noise Reduction Based on Acoustic Multi-Channel Equalization. 680-693
- Hamid Palangi, Li Deng, Yelong Shen, Jianfeng Gao, Xiaodong He, Jianshu Chen, Xinying Song, Rabab K. Ward: Deep Sentence Embedding Using Long Short-Term Memory Networks: Analysis and Application to Information Retrieval. 694-707
- Michael Jeffet, Noam R. Shabtai, Boaz Rafaely: Theory and Perceptual Evaluation of the Binaural Reproduction and Beamforming Tradeoff in the Generalized Spherical Array Beamformer. 708-718
- Pablo Peso Parada, Dushyant Sharma, Jose Lainez, Daniel Barreda, Toon van Waterschoot, Patrick A. Naylor: A Single-Channel Non-Intrusive C50 Estimator Correlated With Speech Recognition Performance. 719-732
- Ming-Hsiang Su, Chung-Hsien Wu, Yu-Ting Zheng: Exploiting Turn-Taking Temporal Evolution for Personality Trait Perception in Dyadic Conversations. 733-744
- Sadaf Abdul-Rauf, Holger Schwenk, Patrik Lambert, Mohammad Nawaz: Empirical Use of Information Retrieval to Build Synthetic Data for SMT Domain Adaptation. 745-754
- Shinnosuke Takamichi, Tomoki Toda, Alan W. Black, Graham Neubig, Sakriani Sakti, Satoshi Nakamura: Postfilters to Modify the Modulation Spectrum for Statistical Parametric Speech Synthesis. 755-767
- Zhizheng Wu, Phillip L. De Leon, Cenk Demiroglu, Ali Khodabakhsh, Simon King, Zhen-Hua Ling, Daisuke Saito, Bryan Stewart, Tomoki Toda, Mirjam Wester, Junichi Yamagishi: Anti-Spoofing for Text-Independent Speaker Verification: An Initial Database, Comparison of Countermeasures, and Human Performance. 768-783
- Kristian Timm Andersen, Marc Moonen: Adaptive Time-Frequency Analysis for Noise Reduction in an Audio Filter Bank With Low Delay. 784-795
- Zhong-Qiu Wang, DeLiang Wang: A Joint Training Framework for Robust Automatic Speech Recognition. 796-806
- Huy Phan, Lars Hertel, Marco Maaß, Radoslaw Mazur, Alfred Mertins: Learning Representations for Nonspeech Audio Events Through Their Similarities to Speech Patterns. 807-822
Volume 24, Number 5, May 2016
- T. J. Tsai, Andreas Stolcke: Robust and Efficient Multiple Alignment of Unsynchronized Meeting Recordings. 833-845
- Simon Receveur, Robin Weib, Tim Fingscheidt: Turbo Automatic Speech Recognition. 846-862
- Ricard Marxer, Hendrik Purwins: Unsupervised Incremental Online Learning and Prediction of Musical Audio Signals. 863-874
- Mohammad Adeli, Jean Rouat, Sean U. N. Wood, Stephane Molotchnikoff, Eric Plourde: A Flexible Bio-Inspired Hierarchical Model for Analyzing Musical Timbre. 875-889
- Geliang Zhang, Simon J. Godsill: Fundamental Frequency Estimation in Speech Signals With Variable Rate Particle Filters. 890-900
- Nadine Kroher, Emilia Gómez: Automatic Transcription of Flamenco Singing From Polyphonic Music Recordings. 901-913
- Fiete Winter, Jens Ahrens, Sascha Spors: On Analytic Methods for 2.5-D Local Sound Field Synthesis Using Circular Distributions of Secondary Sources. 914-926
- Siddharth Sigtia, Emmanouil Benetos, Simon Dixon: An End-to-End Neural Network for Polyphonic Piano Music Transcription. 927-939
- Martin Krawczyk-Becker, Timo Gerkmann: Fundamental Frequency Informed Speech Enhancement in a Flexible Statistical Framework. 940-951
- Joseph Szurley, Alexander Bertrand, Bas van Dijk, Marc Moonen: Binaural Noise Cue Preservation in a Binaural Noise Reduction System With a Remote Microphone Signal. 952-966
- Xiao-Lei Zhang, DeLiang Wang: A Deep Ensemble Learning Method for Monaural Speech Separation. 967-977
- Haotian Xu, Zhijian Ou: Scalable Discovery of Audio Fingerprint Motifs in Broadcast Streams With Determinantal Point Process Based Motif Clustering. 978-989
Volume 24, Number 6, June 2016
- Asli Celikyilmaz, Ruhi Sarikaya, Minwoo Jeong, Anoop Deoras: An Empirical Investigation of Word Class-Based Features for Natural Language Understanding. 994-1005
- Duc Hoang Ha Nguyen, Xiong Xiao, Eng Siong Chng, Haizhou Li: Feature Adaptation Using Linear Spectro-Temporal Transform for Robust Speech Recognition. 1006-1019
- Xiaojun Qian, Helen M. Meng, Frank K. Soong: A Two-Pass Framework of Mispronunciation Detection and Diagnosis for Computer-Aided Pronunciation Training. 1020-1028
- Lijiang Chen, Xia Mao, Hong Yan: Text-Independent Phoneme Segmentation Combining EGG and Speech Data. 1029-1037
- Vincent Mohammad Tavakoli, Jesper Rindom Jensen, Mads Græsbøll Christensen, Jacob Benesty: A Framework for Speech Enhancement With Ad Hoc Microphone Arrays. 1038-1051
- Yan-You Chen, Chung-Hsien Wu, Yi-Chin Huang, Shih-Lun Lin, Jhing-Fa Wang: Candidate Expansion and Prosody Adjustment for Natural Speech Synthesis Using a Small Corpus. 1052-1065
- Xueliang Zhang, Hui Zhang, Shuai Nie, Guanglai Gao, Wenju Liu: A Pairwise Algorithm Using the Deep Stacking Network for Speech Separation and Pitch Estimation. 1066-1078
- Lin Wang, Tsz-Kin Hon, Joshua D. Reiss, Andrea Cavallaro: An Iterative Approach to Source Counting and Localization Using Two Distant Microphones. 1079-1093
- Seán O'Leary, Axel Röbel: A Montage Approach to Sound Texture Synthesis. 1094-1105
- Chahid Ouali, Pierre Dumouchel, Vishwa Gupta: Fast Audio Fingerprinting System Using GPU and a Clustering-Based Technique. 1106-1118
- Francisco Raposo, Ricardo Ribeiro, David Martins de Matos: Using Generic Summarization to Improve Music Information Retrieval Tasks. 1119-1128
- Lantian Li, Dong Wang, Chenhao Zhang, Thomas Fang Zheng: Improving Short Utterance Speaker Recognition by Modeling Speech Unit Classes. 1129-1139
- Jalal Taghia, Rainer Martin: A Frequency-Domain Adaptive Line Enhancer With Step-Size Control Based on Mutual Information for Harmonic Noise Reduction. 1140-1154
Volume 24, Number 7, July 2016
- Min Gao, Jing Lu, Xiaojun Qiu: A Simplified Subband ANC Algorithm Without Secondary Path Modeling. 1164-1174
- Ryo Aihara, Tetsuya Takiguchi, Yasuo Ariki: Multiple Non-Negative Matrix Factorization for Many-to-Many Voice Conversion. 1175-1184
- Kai Chen, Qiang Huo: Training Deep Bidirectional LSTM Acoustic Model for LVCSR by a Context-Sensitive-Chunk BPTT Approach. 1185-1193
- Themos Stafylakis, Md. Jahangir Alam, Patrick Kenny: Text-Dependent Speaker Recognition With Random Digit Strings. 1194-1203
- K. T. Deepak, S. R. Mahadeva Prasanna: Foreground Speech Segmentation and Enhancement Using Glottal Closure Instants and Mel Cepstral Coefficients. 1204-1218
- Habib Hajimolahoseini, Rassoul Amirfattahi, Saeed Gazor, Hamid Soltanian-Zadeh: Robust Estimation and Tracking of Pitch Period Using an Efficient Bayesian Filter. 1219-1229
- Subhasmita Sahoo, Aurobinda Routray: A Novel Method of Glottal Inverse Filtering. 1230-1241
- Gilles Degottex, Luc Ardaillon, Axel Roebel: Multi-Frame Amplitude Envelope Estimation for Modification of Singing Voice. 1242-1254
- Zhizheng Wu, Simon King: Improving Trajectory Modelling for DNN-Based Speech Synthesis by Using Stacked Bottleneck Features and Minimum Generation Error Training. 1255-1265
- Xabier Jaureguiberry, Emmanuel Vincent, Gaël Richard: Fusion Methods for Speech Enhancement and Audio Source Separation. 1266-1279
- Rajib Lochan Das, Mrityunjoy Chakraborty: Improving the Performance of the PNLMS Algorithm Using l1 Norm Regularization. 1280-1290
- Maja Taseska, Emanuël A. P. Habets: Spotforming: Spatial Filtering With Distributed Arrays for Position-Selective Sound Acquisition. 1291-1304
- Guangyou Zhou, Zhiwen Xie, Tingting He, Jun Zhao, Xiaohua Tony Hu: Learning the Multilingual Translation Representations for Question Retrieval in Community Question Answering via Non-Negative Matrix Factorization. 1305-1314
- Chanwoo Kim, Richard M. Stern: Power-Normalized Cepstral Coefficients (PNCC) for Robust Speech Recognition. 1315-1329
Volume 24, Number 8, August 2016
- Henning F. Schepker, Simon Doclo: Least-Squares Estimation of the Common Pole-Zero Filter of Acoustic Feedback Paths in Hearing Aids. 1334-1347
- Hannes Pessentheiner, Martin Hagmüller, Gernot Kubin: Localization and Characterization of Multiple Harmonic Sources. 1348-1363
- Hanieh Khalilian, Ivan V. Bajic, Rodney G. Vaughan: Comparison of Loudspeaker Placement Methods for Sound Field Reproduction. 1364-1379
- Cheng-Yen Yang, Chih-Wei Liu, Shyh-Jye Jou: A Systematic ANSI S1.11 Filter Bank Specification Relaxation and Its Efficient Multirate Architecture for Hearing-Aid Systems. 1380-1392
- Bracha Laufer-Goldshtein, Ronen Talmon, Sharon Gannot: Semi-Supervised Sound Source Localization Based on Manifold Regularization. 1393-1407
- Dionyssos Kounades-Bastian, Laurent Girin, Xavier Alameda-Pineda, Sharon Gannot, Radu Horaud: A Variational EM Algorithm for the Separation of Time-Varying Convolutive Audio Mixtures. 1408-1423
- Jun Du, Yanhui Tu, Li-Rong Dai, Chin-Hui Lee: A Regression Approach to Single-Channel Speech Separation Via High-Resolution Deep Neural Networks. 1424-1437
- Xunying Liu, Xie Chen, Yongqiang Wang, Mark J. F. Gales, Philip C. Woodland: Two Efficient Lattice Rescoring Methods Using Recurrent Neural Network Language Models. 1438-1449
- Pawel Swietojanski, Jinyu Li, Steve Renals: Learning Hidden Unit Contributions for Unsupervised Acoustic Model Adaptation. 1450-1463
- Meng Zhang, Yang Liu, Huanbo Luan, Maosong Sun: Listwise Ranking Functions for Statistical Machine Translation. 1464-1472
Volume 24, Number 9, September 2016
- Daniel Cruz Cavalieri, Sira E. Palazuelos-Cagigas, Teodiano Freire Bastos-Filho, Mário Sarcinelli Filho: Combination of Language Models for Word Prediction: An Exponential Approach. 1481-1494
- Ofer Schwartz, Sharon Gannot, Emanuël A. P. Habets: An Expectation-Maximization Algorithm for Multimicrophone Speech Dereverberation and Noise Reduction With Coherence Matrix Estimation. 1495-1510
- Symeon Delikaris-Manias, Juha Vilkamo, Ville Pulkki: Signal-Dependent Spatial Filtering Based on Weighted-Orthogonal Beamformers in the Spherical Harmonic Domain. 1511-1523
- Sheng Li, Yuya Akita, Tatsuya Kawahara: Semi-Supervised Acoustic Model Training by Discriminative Data Selection From Multiple ASR Systems' Hypotheses. 1524-1534
- Christian Dittmar, Meinard Müller: Reverse Engineering the Amen Break - Score-Informed Separation and Restoration Applied to Drum Recordings. 1535-1547
- Chao Pan, Jingdong Chen, Jacob Benesty: Reduced-Order Robust Superdirective Beamforming With Uniform Linear Microphone Arrays. 1548-1559
- Derry Fitzgerald, Antoine Liutkus, Roland Badeau: Projection-Based Demixing of Spatial Audio. 1560-1572
- Lin Wang, Joshua D. Reiss, Andrea Cavallaro: Over-Determined Source Separation and Localization Using Distributed Microphones. 1573-1588
- Yang Liu, Sujian Li, Furu Wei, Heng Ji: Relation Classification Via Modeling Augmented Dependency Paths. 1589-1598
- Adam Kuklasinski, Simon Doclo, Søren Holdt Jensen, Jesper Jensen: Maximum Likelihood PSD Estimation for Speech Enhancement in Reverberation and Noise. 1599-1612
- Sam Karimian-Azari, Jesper Rindom Jensen, Mads Græsbøll Christensen: Computationally Efficient and Noise Robust DOA and Pitch Estimation. 1613-1625
- Daichi Kitamura, Nobutaka Ono, Hiroshi Sawada, Hirokazu Kameoka, Hiroshi Saruwatari: Determined Blind Source Separation Unifying Independent Vector Analysis and Nonnegative Matrix Factorization. 1626-1641
- Nicolas Obin, Axel Roebel: Similarity Search of Acted Voices for Automatic Voice Casting. 1642-1651
- Aditya Arie Nugraha, Antoine Liutkus, Emmanuel Vincent: Multichannel Audio Source Separation With Deep Neural Networks. 1652-1664
- Stephen H. Shum, David F. Harwath, Najim Dehak, James R. Glass: On the Use of Acoustic Unit Discovery for Language Recognition. 1665-1676
Volume 24, Number 10, October 2016
- James Eaton, Nikolay D. Gaubitch, Alastair H. Moore, Patrick A. Naylor: Estimation of Room Acoustic Parameters: The ACE Challenge. 1681-1693
- Takashi Nose: Efficient Implementation of Global Variance Compensation for Parametric Speech Synthesis. 1694-1704
- Shabnam Ghaffarzadegan, Hynek Boril, John H. L. Hansen: Generative Modeling of Pseudo-Whisper for Robust Whispered Speech Recognition. 1705-1720
- Seyedmahdad Mirsamadi, John H. L. Hansen: A Generalized Nonnegative Tensor Factorization Approach for Distant Speech Recognition With Distributed Microphones. 1721-1731
- Laura Fuster, Maria de Diego, Luis Antonio Azpicueta-Ruiz, Miguel Ferrer: Adaptive Filtered-x Algorithms for Room Equalization Based on Block-Based Combination Schemes. 1732-1745
- Kamil Adiloglu, Emmanuel Vincent: Variational Bayesian Inference for Source Separation and Robust Feature Extraction. 1746-1758
- Steffen Kortlang, Giso Grimm, Volker Hohmann, Birger Kollmeier, Stephan Dieter Ewert: Auditory Model-Based Dynamic Compression Controlled by Subband Instantaneous Frequency and Speech Presence Probability Estimates. 1759-1772
- Pawel Swietojanski, Steve Renals: Differentiable Pooling for Unsupervised Acoustic Model Adaptation. 1773-1784
- Kenta Niwa, Yusuke Hioka, Kazunori Kobayashi: Optimal Microphone Array Observation for Clear Recording of Distant Sound Sources. 1785-1795
- Nicolas Epain, Craig T. Jin: Spherical Harmonic Signal Covariance and Sound Field Diffuseness. 1796-1807
- Tudor-Catalin Zorila, Yannis Stylianou, Tatsuma Ishihara, Masami Akamine: Near and Far Field Speech-in-Noise Intelligibility Improvements Based on a Time-Frequency Energy Reallocation Approach. 1808-1818
- Xi Ma, Dong Wang, Javier Tejedor: Similar Word Model for Unfrequent Word Enhancement in Speech Recognition. 1819-1830
- Mohammad Hadi Bokaei, Hossein Sameti, Yang Liu: Summarizing Meeting Transcripts Based on Functional Segmentation. 1831-1841
- Jiajun Zhang, Yu Zhou, Chengqing Zong: Abstractive Cross-Language Summarization via Translation Model Enhanced Predicate Argument Structure Fusing. 1842-1853
- Grégoire Lafay, Mathieu Lagrange, Mathias Rossignol, Emmanouil Benetos, Axel Roebel: A Morphological Model for Simulating Acoustic Scenes and Its Application to Sound Event Detection. 1854-1864
- An Ji, Michael T. Johnson, Jeffrey J. Berry: Parallel Reference Speaker Weighting for Kinematic-Independent Acoustic-to-Articulatory Inversion. 1865-1875
Volume 24, Number 11, November 2016
- Aggelos Gkiokas, Vassilis Katsouros, George Carayannis: Towards Multi-Purpose Spectral Rhythm Features: An Application to Dance Style, Meter and Tempo Estimation. 1885-1896
- Yi-Chin Huang, Chung-Hsien Wu, Si-Ting Weng: Improving Mandarin Prosody Generation Using Alternative Smoothing Techniques. 1897-1907
- Asger Heidemann Andersen, Jan Mark de Haan, Zheng-Hua Tan, Jesper Jensen: Predicting the Intelligibility of Noisy and Nonlinearly Processed Binaural Speech. 1908-1920
- Qiaoling Zhang, Zhe Chen, Fuliang Yin: Distributed Marginalized Auxiliary Particle Filter for Speaker Tracking in Distributed Microphone Networks. 1921-1934
- Marc Ferras, Srikanth R. Madikeri, Hervé Bourlard: Speaker Diarization and Linking of Meeting Data. 1935-1945
- Yuzong Liu, Katrin Kirchhoff: Graph-Based Semisupervised Learning for Acoustic Modeling in Automatic Speech Recognition. 1946-1956
- Jin Wang, Liang-Chih Yu, K. Robert Lai, Xuejie Zhang: Community-Based Weighted Graph Model for Valence-Arousal Prediction of Affective Words. 1957-1968
- Alberto Carini, Stefania Cecchi, Laura Romoli: Robust Room Impulse Response Measurement Using Perfect Sequences for Legendre Nonlinear Filters. 1969-1982
- Sebastian Ewert, Mark B. Sandler: Piano Transcription in the Studio Using an Extensible Alternating Directions Framework. 1983-1997
- Yu-Ren Chien, Hsin-Min Wang, Shyh-Kang Jeng: Alignment of Lyrics With Accompanied Singing Audio Based on Acoustic-Phonetic Vowel Likelihood Modeling. 1998-2008
- Jesper Jensen, Cees H. Taal: An Algorithm for Predicting the Intelligibility of Speech Masked by Modulated Noise Maskers. 2009-2022
- Xiaodong Cui, Vaibhava Goel: Maximum Likelihood Nonlinear Transformations Based on Deep Neural Networks. 2023-2031
- Toru Nakashika, Tetsuya Takiguchi, Yasuhiro Minami: Non-Parallel Training in Voice Conversion Using an Adaptive Restricted Boltzmann Machine. 2032-2045
- I-Bin Liao, Chen-Yu Chiang, Yih-Ru Wang, Sin-Horng Chen: Speaker Adaptation of SR-HPM for Speaking Rate-Controlled Mandarin TTS. 2046-2058
- Hiroki Ouchi, Kevin Duh, Hiroyuki Shindo, Yuji Matsumoto: Transition-Based Dependency Parsing Exploiting Supertags. 2059-2068
- Tong Xiao, Derek F. Wong, Jingbo Zhu: A Loss-Augmented Approach to Training Syntactic Machine Translation Systems. 2069-2083
- Yukara Ikemiya, Katsutoshi Itoyama, Kazuyoshi Yoshii: Singing Voice Separation and Vocal F0 Estimation Based on Mutual Combination of Robust Principal Component Analysis and Subharmonic Summation. 2084-2095
- Siddharth Sigtia, Adam M. Stark, Sacha Krstulovic, Mark D. Plumbley: Automatic Environmental Sound Recognition: Performance Versus Computational Cost. 2096-2107
- Srinivas Parthasarathy, Roddy Cowie, Carlos Busso: Using Agreement on Direction of Change to Build Rank-Based Emotion Classifiers. 2108-2121
- Jia-Ching Wang, Yuan-Shan Lee, Chang-Hong Lin, Shu-Fan Wang, Chih-Hao Shih, Chung-Hsien Wu: Compressive Sensing-Based Speech Enhancement. 2122-2131
- Siying Wang, Sebastian Ewert, Simon Dixon: Robust and Efficient Joint Alignment of Multiple Musical Performances. 2132-2145
- Xie Chen, Xunying Liu, Yongqiang Wang, Mark J. F. Gales, Philip C. Woodland: Efficient Training and Evaluation of Recurrent Neural Network Language Models for Automatic Speech Recognition. 2146-2157
- Ping-Keng Jao, Li Su, Yi-Hsuan Yang, Brendt Wohlberg: Monaural Music Source Separation Using Convolutional Sparse Coding. 2158-2170
- Xiaofei Li, Laurent Girin, Radu Horaud, Sharon Gannot: Estimation of the Direct-Path Relative Transfer Function for Supervised Sound-Source Localization. 2171-2186
- Duc Le, Keli Licata, Carol Persad, Emily Mower Provost: Automatic Assessment of Speech Intelligibility for Individuals With Aphasia. 2187-2199
- Thijs van de Laar, Bert de Vries: A Probabilistic Modeling Approach to Hearing Loss Compensation. 2200-2213
Volume 24, Number 12, December 2016
- Andrea Cogliati, Zhiyao Duan, Brendt Wohlberg: Context-Dependent Piano Music Transcription With Convolutional Sparse Coding. 2218-2230
- Yanmin Qian, Tian Tan, Dong Yu: Neural Network Based Multi-Factor Aware Joint Training for Robust Speech Recognition. 2231-2240
- Lahiru Samarakoon, Khe Chai Sim: Factorized Hidden Layer Adaptation for Deep Neural Network Based Acoustic Modeling. 2241-2250
- Martin Krawczyk-Becker, Timo Gerkmann: On MMSE-Based Estimation of Amplitude and Complex Speech Spectral Coefficients Under Phase-Uncertainty. 2251-2262
- Yanmin Qian, Mengxiao Bi, Tian Tan, Kai Yu: Very Deep Convolutional Neural Networks for Noise Robust Speech Recognition. 2263-2276
- Yi-Chan Wu, Homer H. Chen: Generation of Affective Accompaniment in Accordance With Emotion Flow. 2277-2287
- Mahmood Movassagh, Peter Kabal: Scalable Audio Coding Using Trellis-Based Optimized Joint Entropy Coding and Quantization. 2288-2300
- Milos Cernak, Alexandros Lazaridis, Afsaneh Asaei, Philip N. Garner: Composition of Deep and Spiking Neural Networks for Very Low Bit Rate Speech Coding. 2301-2312
- David Dov, Ronen Talmon, Israel Cohen: Kernel Method for Voice Activity Detection in the Presence of Transients. 2313-2326
- Jesús Antonio Villalba López, Antonio Miguel, Alfonso Ortega, Eduardo Lleida: Bayesian Networks to Model the Variability of Speaker Verification Scores in Adverse Environments. 2327-2340
- Hardik B. Sailor, Hemant A. Patil: Novel Unsupervised Auditory Filterbank Learning Using Convolutional RBM for Speech Recognition. 2341-2353
- Sidsel Marie Nørholm, Jesper Rindom Jensen, Mads Græsbøll Christensen: Instantaneous Fundamental Frequency Estimation With Optimal Segmentation for Nonstationary Voiced Speech. 2354-2367
- Sheng Zhang, Jiashu Zhang, Hongyu Han: Robust Variable Step-Size Decorrelation Normalized Least-Mean-Square Algorithm and its Application to Acoustic Echo Cancellation. 2368-2376
- Tom Barker, Tuomas Virtanen: Blind Separation of Audio Mixtures Through Nonnegative Tensor Factorization of Modulation Spectrograms. 2377-2389
- Jinxin Liu, Xuefeng Chen: Adaptive Compensation of Misequalization in Narrowband Active Noise Equalizer Systems. 2390-2399
- Atsunori Ogawa, Takaaki Hori, Atsushi Nakamura: Estimating Speech Recognition Accuracy Based on Error Type Classification. 2400-2413
- Finnian Kelly, John H. L. Hansen: Score-Aging Calibration for Speaker Verification. 2414-2424
- Bochen Li, Zhiyao Duan: An Approach to Score Following for Piano Performances With the Sustained Effect. 2425-2438
- Niko Moritz, Birger Kollmeier, Jörn Anemüller: Integration of Optimized Modulation Filter Sets Into Deep Neural Networks for Automatic Speech Recognition. 2439-2452
- Simon Leglaive, Roland Badeau, Gaël Richard: Multichannel Audio Source Separation With Probabilistic Reverberation Priors. 2453-2465
- Sakari Tervo: Single Snapshot Detection and Estimation of Reflections From Room Impulse Responses in the Spherical Harmonic Domain. 2466-2480
- Dejan Markovic, Fabio Antonacci, Lucio Bianchi, Stefano Tubaro, Augusto Sarti: Extraction of Acoustic Sources Through the Processing of Sound Field Maps in the Ray Space. 2481-2494
- Pavlos Papadopoulos, Andreas Tsiartas, Shrikanth S. Narayanan: Long-Term SNR Estimation of Speech Signals in Known and Unknown Channel Conditions. 2495-2506
- Ingo R. Titze, Anil Palaparthi: Sensitivity of Source-Filter Interaction to Specific Vocal Tract Shapes. 2507-2515
- Shlomo E. Chazan, Jacob Goldberger, Sharon Gannot: A Hybrid Approach for Speech Enhancement Using MoG Model and Neural Network Phoneme Classifier. 2516-2530
- Gongping Huang, Jacob Benesty, Jingdong Chen: Superdirective Beamforming Based on the Krylov Matrix. 2531-2543















