IEEE/ACM Transactions on Audio, Speech and Language Processing, Volume 26
Volume 26, Number 1, January 2018
- Dianna Yee, A. Homayoun Kamkar-Parsi, Rainer Martin, Henning Puder:
A Noise Reduction Postfilter for Binaurally Linked Single-Microphone Hearing Aids Utilizing a Nearby External Microphone. 5-18
- Tom Bäckström, Johannes Fischer:
Fast Randomization for Distributed Low-Bitrate Coding of Speech and Audio. 19-30
- Jun Deng, Xinzhou Xu, Zixing Zhang, Sascha Frühholz, Björn W. Schuller:
Semisupervised Autoencoders for Speech Emotion Recognition. 31-43
- Md. Sahidullah, Dennis Alexander Lehmann Thomsen, Rosa González Hautamäki, Tomi Kinnunen, Zheng-Hua Tan, Robert Parts, Martti Pitkänen:
Robust Voice Liveness Detection and Speaker Verification Using Throat Microphones. 44-56
- Gilles Degottex, Pierre Lanchantin, Mark J. F. Gales:
A Log Domain Pulse Model for Parametric Speech Synthesis. 57-70
- Johannes Abel, Tim Fingscheidt:
Artificial Speech Bandwidth Extension Using Deep Neural Networks for Wideband Spectral Envelope Estimation. 71-83
- Yuki Saito, Shinnosuke Takamichi, Hiroshi Saruwatari:
Statistical Parametric Speech Synthesis Incorporating Generative Adversarial Networks. 84-96
- Kristian Timm Andersen, Marc Moonen:
Robust Speech-Distortion Weighted Interframe Wiener Filters for Single-Channel Noise Reduction. 97-107
- Chen-Yu Chiang:
Cross-Dialect Adaptation Framework for Constructing Prosodic Models for Chinese Dialect Text-to-Speech Systems. 108-121
- Bingquan Liu, Zhen Xu, Chengjie Sun, Baoxun Wang, Xiaolong Wang, Derek F. Wong, Min Zhang:
Content-Oriented User Modeling for Personalized Response Ranking in Chatbots. 122-133
- Zhiyuan Tang, Dong Wang, Yixiang Chen, Lantian Li, Andrew Abel:
Phonetic Temporal Neural Model for Language Identification. 134-144
- Soumitro Chakrabarty, Emanuël A. P. Habets:
A Bayesian Approach to Informed Spatial Filtering With Robustness Against DOA Estimation Errors. 145-160
- Kuan-Yu Chen, Shih-Hung Liu, Berlin Chen, Hsin-Min Wang:
An Information Distillation Framework for Extractive Summarization. 161-170
- Ma Jin, Yan Song, Ian McLoughlin, Li-Rong Dai:
LID-Senones and Their Statistics for Language Identification. 171-183
- Zhehuai Chen, Jasha Droppo, Jinyu Li, Wayne Xiong:
Progressive Joint Modeling in Unsupervised Single-Channel Overlapped Speech Recognition. 184-196
- Shivesh Ranjan, John H. L. Hansen:
Curriculum Learning Based Approaches for Noise Robust Speaker Recognition. 197-210
Volume 26, Number 2, February 2018
- Yoshiaki Bando, Katsutoshi Itoyama, Masashi Konyo, Satoshi Tadokoro, Kazuhiro Nakadai, Kazuyoshi Yoshii, Tatsuya Kawahara, Hiroshi G. Okuno:
Speech Enhancement Based on Bayesian Low-Rank and Sparse Decomposition of Multichannel Magnitude Spectrograms. 215-230
- Yu-Ping Ruan, Qian Chen, Zhen-Hua Ling:
A Sequential Neural Encoder With Latent Structured Description for Modeling Sentences. 231-242
- Amelia Jane Gully, Helena Daffern, Damian T. Murphy:
Diphthong Synthesis Using the Dynamic 3D Digital Waveguide Mesh. 243-255
- Chunyang Wu, Mark J. F. Gales, Anton Ragni, Penny Karanasou, Khe Chai Sim:
Improving Interpretability and Regularization in Deep Learning. 256-265
- Kehai Chen, Tiejun Zhao, Muyun Yang, Lemao Liu, Akihiro Tamura, Rui Wang, Masao Utiyama, Eiichiro Sumita:
A Neural Approach to Source Dependence Based Context Model for Statistical Machine Translation. 266-280
- Joonas Nikunen, Aleksandr Diment, Tuomas Virtanen:
Separation of Moving Sound Sources Using Multichannel NMF and Acoustic Tracking. 281-295
- Johan Sward, Hongbin Li, Andreas Jakobsson:
Off-Grid Fundamental Frequency Estimation. 296-303
- Dylan Menzies, Marcos F. Simón Gálvez, Filippo Maria Fazi:
A Low-Frequency Panning Method With Compensation for Head Rotation. 304-317
- Branimir Dropuljic, Igor Mijic, Davor Petrinovic, Tanja Jovanovic, Kresimir Cosic:
Vocal Analysis of Acoustic Startle Responses. 318-329
- Philipp Aichinger, Martin Hagmüller, Berit Schneider-Stickler, Jean Schoentgen, Franz Pernkopf:
Tracking of Multiple Fundamental Frequencies in Diplophonic Voices. 330-341
- Anastasios Alexandridis, Athanasios Mouchtaris:
Multiple Sound Source Location Estimation in Wireless Acoustic Sensor Networks Using DOA Estimates: The Data-Association Problem. 342-356
- Robert Rehr, Timo Gerkmann:
On the Importance of Super-Gaussian Speech Priors for Machine-Learning Based Speech Enhancement. 357-366
- Sonia Djaziri Larbi, Gaël Mahé, Imen Marrakchi-Mezghani, Monia Turki, Meriem Jaïdane:
Watermark-Driven Acoustic Echo Cancellation. 367-378
- Annamaria Mesaros, Toni Heittola, Emmanouil Benetos, Peter Foster, Mathieu Lagrange, Tuomas Virtanen, Mark D. Plumbley:
Detection and Classification of Acoustic Scenes and Events: Outcome of the DCASE 2016 Challenge. 379-393
- Cheng-Tao Chung, Lin-Shan Lee:
Unsupervised Discovery of Structured Acoustic Tokens With Applications to Spoken Term Detection. 394-405
- Tobias May:
Robust Speech Dereverberation With a Neural Network-Based Post-Filter That Exploits Multi-Conditional Training of Binaural Cues. 406-414
- Majid Mirbagheri, Les Atlas, Adrian K. C. Lee:
Regression Factor Analysis With an Application to Continuous HRIR Measurement. 415-421
- Jen-Tzung Chien:
Bayesian Nonparametric Learning for Hierarchical and Sparse Topics. 422-435
- Johannes Stahl, Pejman Mowlaee:
A Pitch-Synchronous Simultaneous Detection-Estimation Framework for Speech Enhancement. 436-450
Volume 26, Number 3, March 2018
- César D. Salvador, Shuichi Sakamoto, Jorge Treviño, Yôiti Suzuki:
Boundary Matching Filters for Spherical Microphone and Loudspeaker Arrays. 461-474
- Ahmed Hussen Abdelaziz:
Comparing Fusion Models for DNN-Based Audiovisual Continuous Speech Recognition. 475-484
- Satoru Emura:
Residual Echo Reduction for Multichannel Acoustic Echo Cancelers With a Complex-Valued Residual Echo Estimate. 485-500
- Van Hai Do, Nancy F. Chen, Boon Pang Lim, Mark A. Hasegawa-Johnson:
Multitask Learning for Phone Recognition of Underresourced Languages Using Mismatched Transcription. 501-514
- Mehdi Zohourian, Gerald Enzner, Rainer Martin:
Binaural Speaker Localization Integrated Into an Adaptive Beamformer for Hearing Aids. 515-528
- Yong Xiang, Iynkaran Natgunanathan, Dezhong Peng, Guang Hua, Bo Liu:
Spread Spectrum Audio Watermarking Using Multiple Orthogonal PN Sequences and Variable Embedding Strengths and Polarities. 529-539
- Chuanqi Tan, Furu Wei, Qingyu Zhou, Nan Yang, Bowen Du, Weifeng Lv, Ming Zhou:
Context-Aware Answer Sentence Selection With Hierarchical Gated Recurrent Neural Networks. 540-549
- Jie Zhang, Sundeep Prabhakar Chepuri, Richard Christian Hendriks, Richard Heusdens:
Microphone Subset Selection for MVDR Beamformer Based Noise Reduction. 550-563
- Syu-Siang Wang, Payton Lin, Yu Tsao, Jeih-Weih Hung, Borching Su:
Suppression by Selecting Wavelets for Feature Compression in Distributed Speech Recognition. 564-579
- Yu Wang, Mike Brookes:
Model-Based Speech Enhancement in the Modulation Domain. 580-594
- Christian Huemmer, Christian Hofmann, Roland Maas, Walter Kellermann:
Estimating Parameters of Nonlinear Systems Using the Elitist Particle Filter Based on Evolutionary Strategies. 595-608
- Daniele Salvati, Carlo Drioli, Gian Luca Foresti:
A Low-Complexity Robust Beamforming Using Diagonal Unloading for Acoustic Source Localization. 609-622
- Jinsong Su, Jiali Zeng, Deyi Xiong, Yang Liu, Mingxuan Wang, Jun Xie:
A Hierarchy-to-Sequence Attentional Neural Machine Translation Model. 623-632
- Waad Ben Kheder, Driss Matrouf, Moez Ajili, Jean-François Bonastre:
A Unified Joint Model to Deal With Nuisance Variabilities in the i-Vector Space. 633-645
- Gregory Gelly, Jean-Luc Gauvain:
Optimization of RNN-Based Speech Activity Detection. 646-656
- Maja Taseska, Emanuël A. P. Habets:
Blind Source Separation of Moving Sources Using Sparsity-Based Source Detection and Tracking. 657-670
- Liang-Chih Yu, Jin Wang, K. Robert Lai, Xuejie Zhang:
Refining Word Embeddings Using Intensity Scores for Sentiment Analysis. 671-681
- Yuval Dorfan, Axel Plinge, Gershon Hazan, Sharon Gannot:
Distributed Expectation-Maximization Algorithm for Speaker Localization in Reverberant Environments. 682-695
Volume 26, Number 4, April 2018
- Zhili Tan, Man-Wai Mak, Brian Kan-Wing Mak:
DNN-Based Score Calibration With Multitask Learning for Noise Robust Speaker Verification. 700-712
- Ya-Jun Hu, Zhen-Hua Ling:
Extracting Spectral Features Using Deep Autoencoders With Binary Distributed Hidden Units for Statistical Parametric Speech Synthesis. 713-724
- Bracha Laufer-Goldshtein, Ronen Talmon, Sharon Gannot:
A Hybrid Approach for Speaker Tracking Based on TDOA and Data-Driven Models. 725-735
- Sandro Cumani, Pietro Laface:
Speaker Recognition Using e-Vectors. 736-748
- Longting Xu, Kong-Aik Lee, Haizhou Li, Zhen Yang:
Generalizing I-Vector Estimation for Rapid Speaker Recognition. 749-759
- Yaakov Buchris, Israel Cohen, Jacob Benesty:
Frequency-Domain Design of Asymmetric Circular Differential Microphone Arrays. 760-773
- Jihui Zhang, Thushara D. Abhayapala, Wen Zhang, Prasanga N. Samarasinghe, Shouda Jiang:
Active Noise Control Over Space: A Wave Domain Approach. 774-786
- Yi Luo, Zhuo Chen, Nima Mesgarani:
Speaker-Independent Speech Separation With Deep Attractor Network. 787-796
- Neethu Mariam Joy, Sandeep Reddy Kothinti, Srinivasan Umesh:
FMLLR Speaker Normalization With i-Vector: In Pseudo-FMLLR and Distillation Framework. 797-805
- Swati Chandna, Wenwu Wang:
Bootstrap Averaging for Model-Based Source Separation in Reverberant Conditions. 806-819
- Zhili Tan, Man-Wai Mak, Brian Kan-Wing Mak, Yingke Zhu:
Denoised Senone I-Vectors for Robust Speaker Verification. 820-830
- Kousuke Itakura, Yoshiaki Bando, Eita Nakamura, Katsutoshi Itoyama, Kazuyoshi Yoshii, Tatsuya Kawahara:
Bayesian Multichannel Audio Source Separation Based on Integrated Source and Spatial Models. 831-846
Volume 26, Number 5, May 2018
- Youssef El Baba, Andreas Walther, Emanuël A. P. Habets:
3D Room Geometry Inference Based on Room Impulse Response Stacks. 857-872
- Qian Zhang, John H. L. Hansen:
Language/Dialect Recognition Based on Unsupervised Deep Learning. 873-882
- Zhen-Hua Ling, Yang Ai, Yu Gu, Li-Rong Dai:
Waveform Modeling and Generation Using Hierarchical Recurrent Neural Networks for Speech Bandwidth Extension. 883-894
- Marc Delcroix, Keisuke Kinoshita, Atsunori Ogawa, Christian Huemmer, Tomohiro Nakatani:
Context Adaptive Neural Network Based Acoustic Models for Rapid Adaptation. 895-908
- Linh Thi Thuc Tran, Sven Erik Nordholm, Henning F. Schepker, Hai Huyen Dam, Simon Doclo:
Two-Microphone Hearing Aids Using Prediction Error Method for Adaptive Feedback Control. 909-923
- Jiho Chang, Marton Marschall:
Periphony-Lattice Mixed-Order Ambisonic Scheme for Spherical Microphone Arrays. 924-936
- Nikolaos Dionelis, Mike Brookes:
Phase-Aware Single-Channel Speech Enhancement With Modulation-Domain Kalman Filtering. 937-950
- Chengshi Zheng, Antoine Deleforge, Xiaodong Li, Walter Kellermann:
Statistical Analysis of the Multichannel Wiener Filter Using a Bivariate Normal Distribution for Sample Covariance Matrices. 951-966
- Colin Vaz, Vikram Ramanarayanan, Shrikanth S. Narayanan:
Acoustic Denoising Using Dictionary Learning With Spectral and Temporal Regularization. 967-980
- Lin Wang, Andrea Cavallaro:
Pseudo-Determined Blind Source Separation for Ad-hoc Microphone Networks. 981-994
- Sandro Cumani, Pietro Laface:
Scoring Heterogeneous Speaker Vectors Using Nonlinear Transformations and Tied PLDA Models. 995-1009
- Giuliano Bernardi, Toon van Waterschoot, Jan Wouters, Marc Moonen:
Subjective and Objective Sound-Quality Evaluation of Adaptive Feedback Cancellation Algorithms. 1010-1024
Volume 26, Number 6, June 2018
- Hirokazu Kameoka, Takuya Higuchi, Mikihiro Tanaka, Li Li:
Nonnegative Matrix Factorization With Basis Clustering Using Cepstral Distance Regularization. 1025-1036
- Jacob Donley, Christian H. Ritz, W. Bastiaan Kleijn:
Multizone Soundfield Reproduction With Privacy- and Quality-Based Speech Masking Filters. 1037-1051
- Sebastian Braun, Adam Kuklasinski, Ofer Schwartz, Oliver Thiergart, Emanuël A. P. Habets, Sharon Gannot, Simon Doclo, Jesper Jensen:
Evaluation and Comparison of Late Reverberation Power Spectral Density Estimators. 1052-1067
- Elie-Laurent Benaroya, Nicolas Obin, Marco Liuni, Axel Roebel, Wilson Raumel, Sylvain Argentieri:
Binaural Localization of Multiple Sound Sources by Non-Negative Tensor Factorization. 1068-1078
- Nathanaël Perraudin, Nicki Holighaus, Piotr Majdak, Péter Balázs:
Inpainting of Long Audio Segments With Similarity Graphs. 1079-1090
- Paul Magron, Roland Badeau, Bertrand David:
Model-Based STFT Phase Recovery for Audio Source Separation. 1091-1101
- Ina Kodrasi, Simon Doclo:
Analysis of Eigenvalue Decomposition-Based Late Reverberation Power Spectral Density Estimation. 1102-1114
- Sebastian Braun, Emanuël A. P. Habets:
Linear Prediction-Based Online Dereverberation and Noise Reduction Using Alternating Kalman Filters. 1115-1125
- Dhananjay Ram, Afsaneh Asaei, Hervé Bourlard:
Sparse Subspace Modeling for Query by Example Spoken Term Detection. 1126-1139
- Martin Krawczyk-Becker, Timo Gerkmann:
On Speech Enhancement Under PSD Uncertainty. 1140-1149
- Simon Leglaive, Roland Badeau, Gaël Richard:
Student's t Source and Mixing Models for Multichannel Audio Source Separation. 1150-1164
Volume 26, Number 7, July 2018
- Takenori Yoshimura, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, Keiichi Tokuda:
Mel-Cepstrum-Based Quantization Noise Shaping Applied to Neural-Network-Based Speech Waveform Synthesis. 1173-1180
- Qing Wang, Jun Du, Li-Rong Dai, Chin-Hui Lee:
A Multiobjective Learning and Ensembling Approach to High-Performance Speech Enhancement With Compact Neural Network Architectures. 1181-1193
- Miguel Ángel del Agua, Adrià Giménez, Alberto Sanchís, Jorge Civera, Alfons Juan:
Speaker-Adapted Confidence Measures for ASR Using Deep Bidirectional Recurrent Neural Networks. 1194-1202
- Jorge Proença, Carla Lopes, Michael Tjalve, Andreas Stolcke, Sara Candeias, Fernando Perdigão:
Mispronunciation Detection in Children's Reading of Sentences. 1203-1215
- Ljubisa Stankovic, Milos Brajovic:
Analysis of the Reconstruction of Sparse Signals in the DCT Domain Applied to Audio Signals. 1216-1231
- João Felipe Santos, Tiago H. Falk:
Speech Dereverberation With Context-Aware Recurrent Neural Networks. 1232-1242
- Michele Geronazzo, Simone Spagnol, Federico Avanzini:
Do We Need Individual Head-Related Transfer Functions for Vertical Localization? The Case Study of a Spectral Notch Distance Metric. 1243-1256
- Daniel Marquardt, Simon Doclo:
Interaural Coherence Preservation for Binaural Noise Reduction Using Partial Noise Estimation and Spectral Postfiltering. 1257-1270
- Mojtaba Farmani, Michael Syskind Pedersen, Zheng-Hua Tan, Jesper Jensen:
Bias-Compensated Informed Sound Source Localization Using Relative Transfer Functions. 1271-1285
- Fei Tao, Carlos Busso:
Gating Neural Network for Large Vocabulary Audiovisual Speech Recognition. 1286-1298
Volume 26, Number 8, August 2018
- Zafar Rafii, Antoine Liutkus, Fabian-Robert Stöter, Stylianos Ioannis Mimilakis, Derry FitzGerald, Bryan Pardo:
An Overview of Lead and Accompaniment Separation in Music. 1307-1335
- Chien-Yao Wang, Jia-Ching Wang, Andri Santoso, Chin-Chin Chiang, Chung-Hsien Wu:
Sound Event Recognition Using Auditory-Receptive-Field Binary Pattern and Hierarchical-Diving Deep Belief Network. 1336-1351
- Liner Yang, Meishan Zhang, Yang Liu, Maosong Sun, Nan Yu, Guohong Fu:
Joint POS Tagging and Dependence Parsing With Transition-Based Neural Networks. 1352-1358
- Kai Yu, Zijian Zhao, Xueyang Wu, Hongtao Lin, Xuan Liu:
Rich Short Text Conversation Using Semantic-Key-Controlled Sequence Generation. 1359-1368
- Bernhard Lehner, Jan Schlüter, Gerhard Widmer:
Online, Loudness-Invariant Vocal Detection in Mixed Music Signals. 1369-1380
- Simon Stone, Michael Marxen, Peter Birkholz:
Construction and Evaluation of a Parametric One-Dimensional Vocal Tract Model. 1381-1392
- Tian Tan, Yanmin Qian, Hu Hu, Ying Zhou, Wen Ding, Kai Yu:
Adaptive Very Deep Convolutional Residual Network for Noise Robust Speech Recognition. 1393-1405
- Xin Wang, Shinji Takaki, Junichi Yamagishi:
Autoregressive Neural F0 Model for Statistical Parametric Speech Synthesis. 1406-1419
- Cassia Valentini-Botinhao, Junichi Yamagishi:
Speech Enhancement of Noisy and Reverberant Speech for Text-to-Speech. 1420-1433
- Andreas I. Koutrouvelis, Thomas W. Sherson, Richard Heusdens, Richard C. Hendriks:
A Low-Cost Robust Distributed Linearly Constrained Beamformer for Wireless Acoustic Sensor Networks With Arbitrary Topology. 1434-1448
Volume 26, Number 9, September 2018
- Chih-Wei Wu, Christian Dittmar, Carl Southall, Richard Vogl, Gerhard Widmer, Jason Hockman, Meinard Müller, Alexander Lerch:
A Review of Automatic Drum Transcription. 1457-1483
- Christine Evers, Patrick A. Naylor:
Acoustic SLAM. 1484-1498
- Clement Laroche, Matthieu Kowalski, Hélène Papadopoulos, Gaël Richard:
Hybrid Projective Nonnegative Matrix Factorization With Drum Dictionaries for Harmonic/Percussive Source Separation. 1499-1511
- Julio J. Carabias-Orti, Joonas Nikunen, Tuomas Virtanen, Pedro Vera-Candeas:
Multichannel Blind Sound Source Separation Using Spatial Covariance Model With Level and Time Differences and Nonnegative Matrix Factorization. 1512-1527
- Meishan Zhang, Nan Yu, Guohong Fu:
A Simple and Effective Neural Model for Joint Word Segmentation and POS Tagging. 1528-1538
- Dylan Menzies,