IEEE/ACM Transactions on Audio, Speech and Language Processing, Volume 30
Volume 30, 2022
- Qianying Liu, Wenyu Guan, Sujian Li, Fei Cheng, Daisuke Kawahara, Sadao Kurohashi:
RODA: Reverse Operation Based Data Augmentation for Solving Math Word Problems. 1-11
- Kai Zhen, Jongmo Sung, Mi Suk Lee, Seungkwon Beack, Minje Kim:
Scalable and Efficient Neural Speech Coding: A Hybrid Design. 12-25
- Sen Yang, Yang Liu, Dawei Feng, Dongsheng Li:
Text Generation From Data With Dynamic Planning. 26-34
- Stefan Liebich, Peter Vary:
Occlusion Effect Cancellation in Headphones and Hearing Devices - The Sister of Active Noise Cancellation. 35-48
- Zhuosheng Zhang, Haojie Yu, Hai Zhao, Masao Utiyama:
Which Apple Keeps Which Doctor Away? Colorful Word Representations With Visual Oracles. 49-59
- Zhenyu Wang, John H. L. Hansen:
Multi-Source Domain Adaptation for Text-Independent Forensic Speaker Recognition. 60-75
- Kengtao Zheng, Nankai Lin, Shengyi Jiang:
Unsupervised Character Embedding Correction and Candidate Word Denoising. 76-86
- Bing Ma, Haifeng Sun, Jingyu Wang, Qi Qi, Jianxin Liao:
Extractive Dialogue Summarization Without Annotation Based on Distantly Supervised Machine Reading Comprehension in Customer Service. 87-97
- Shengcai Liu, Ning Lu, Cheng Chen, Ke Tang:
Efficient Combinatorial Optimization for Word-Level Adversarial Textual Attack. 98-111
- Alessandro Terenzi, Nicola Ortolani, Inês Nolasco, Emmanouil Benetos, Stefania Cecchi:
Comparison of Feature Extraction Methods for Sound-Based Classification of Honey Bee Activity. 112-122
- Shuiyang Mao, P. C. Ching, Tan Lee:
Enhancing Segment-Based Speech Emotion Recognition by Iterative Self-Learning. 123-134
- Abdolreza Sabzi Shahrebabaki, Giampiero Salvi, Torbjørn Svendsen, Sabato Marco Siniscalchi:
Acoustic-to-Articulatory Mapping With Joint Optimization of Deep Speech Enhancement and Articulatory Inversion Models. 135-147
- Javier Jorge, Adrià Giménez, Joan Albert Silvestre-Cerdà, Jorge Civera, Alberto Sanchís, Alfons Juan:
Live Streaming Speech Recognition Using Deep Bidirectional LSTM Acoustic Models and Interpolated Language Models. 148-161
- P. V. Muhammed Shifas, Catalin Zorila, Yannis Stylianou:
End-to-End Neural Based Modification of Noisy Speech for Speech-in-Noise Intelligibility Improvement. 162-173
- Joon-Young Yang, Joon-Hyuk Chang:
VACE-WPE: Virtual Acoustic Channel Expansion Based on Neural Networks for Weighted Prediction Error-Based Speech Dereverberation. 174-189
- Chenpeng Du, Kai Yu:
Phone-Level Prosody Modelling With GMM-Based MDN for Diverse and Controllable Speech Synthesis. 190-201
- Haibin Wu, Xu Li, Andy T. Liu, Zhiyong Wu, Helen Meng, Hung-Yi Lee:
Improving the Adversarial Robustness for Speaker Verification by Self-Supervised Learning. 202-217
- Mixiao Hou, Zheng Zhang, Qi Cao, David Zhang, Guangming Lu:
Multi-View Speech Emotion Recognition Via Collective Relation Construction. 218-229
- Da-Rong Liu, Po-Chun Hsu, Yi-Chen Chen, Sung-Feng Huang, Shun-Po Chuang, Da-Yi Wu, Hung-yi Lee:
Learning Phone Recognition From Unpaired Audio and Phone Sequences Based on Generative Adversarial Network. 230-243
- Yuting Zhao, Mamoru Komachi, Tomoyuki Kajiwara, Chenhui Chu:
Word-Region Alignment-Guided Multimodal Neural Machine Translation. 244-259
- Zhuosheng Zhang, Yiqing Zhang, Hai Zhao:
Syntax-Aware Multi-Spans Generation for Reading Comprehension. 260-268
- Pengfei Zhu, Zhuosheng Zhang, Hai Zhao, Xiaoguang Li:
DUMA: Reading Comprehension With Transposition Thinking. 269-279
- Jiayuan Xie, Ningxin Peng, Yi Cai, Tao Wang, Qingbao Huang:
Diverse Distractor Generation for Constructing High-Quality Multiple Choice Questions. 280-291
- Jie Zhang, Guanghui Zhang:
A Parametric Unconstrained Beamformer Based Binaural Noise Reduction for Assistive Hearing. 292-304
- Luca Turchet, Johan Pauwels:
Music Emotion Recognition: Intention of Composers-Performers Versus Perception of Musicians, Non-Musicians, and Listening Machines. 305-316
- Wenxin Hou, Han Zhu, Yidong Wang, Jindong Wang, Tao Qin, Renjun Xu, Takahiro Shinozaki:
Exploiting Adapters for Cross-Lingual Low-Resource Speech Recognition. 317-329
- Kehai Chen, Rui Wang, Masao Utiyama, Eiichiro Sumita:
Integrating Prior Translation Knowledge Into Neural Machine Translation. 330-339
- Keqi Deng, Gaofeng Cheng, Runyan Yang, Yonghong Yan:
Alleviating ASR Long-Tailed Problem by Decoupling the Learning of Representation and Classification. 340-354
- Zuchao Li, Junru Zhou, Hai Zhao, Kevin Parnow:
HPSG-Inspired Joint Neural Constituent and Dependency Parsing in O($n^3$) Time Complexity. 355-366
- Xuan Shi, Erica Cooper, Junichi Yamagishi:
Use of Speaker Recognition Approaches for Learning and Evaluating Embedding Representations of Musical Instrument Sounds. 367-377
- Zengwei Yao, Wenjie Pei, Fanglin Chen, Guangming Lu, David Zhang:
Stepwise-Refining Speech Separation Network via Fine-Grained Encoding in High-Order Latent Domain. 378-393
- Yanmin Qian, Zhikai Zhou:
Optimizing Data Usage for Low-Resource Speech Recognition. 394-403
- Narla John Metilda Sagaya Mary, Srinivasan Umesh, Sandesh Varadaraju Katta:
S-Vectors and TESA: Speaker Embeddings and a Speaker Authenticator Based on Transformer Encoder. 404-413
- Bengt J. Borgström:
Bayesian Estimation of PLDA in the Presence of Noisy Training Labels, With Applications to Speaker Verification. 414-428
- Menglong Lu, Zhen Huang, Binyang Li, Yunxiang Zhao, Zheng Qin, Dong Sheng Li:
SIFTER: A Framework for Robust Rumor Detection. 429-442
- Lantian Li, Dong Wang, Jiawen Kang, Renyu Wang, Jing Wu, Zhendong Gao, Xiao Chen:
A Principle Solution for Enroll-Test Mismatch in Speaker Recognition. 443-455
- Feiran Yang:
Analysis of Deficient-Length Partitioned-Block Frequency-Domain Adaptive Filters. 456-467
- Hui Jiang, Linfeng Song, Yubin Ge, Fandong Meng, Junfeng Yao, Jinsong Su:
An AST Structure Enhanced Decoder for Code Generation. 468-476
- Anssi Kanervisto, Ville Hautamäki, Tomi Kinnunen, Junichi Yamagishi:
Optimizing Tandem Speaker Verification and Anti-Spoofing Systems. 477-488
- Xin Ni, Jia Ren:
FC-U2-Net: A Novel Deep Neural Network for Singing Voice Separation. 489-494
- Neil Zeghidour, Alejandro Luebs, Ahmed Omran, Jan Skoglund, Marco Tagliasacchi:
SoundStream: An End-to-End Neural Audio Codec. 495-507
- Wageesha Manamperi, Thushara D. Abhayapala, Jihui Zhang, Prasanga N. Samarasinghe:
Drone Audition: Sound Source Localization Using On-Board Microphones. 508-519
- Qian Li, Hao Peng, Jianxin Li, Jia Wu, Yuanxing Ning, Lihong Wang, Philip S. Yu, Zheng Wang:
Reinforcement Learning-Based Dialogue Guided Event Extraction to Exploit Argument Relations. 520-533
- Santiago Ruiz, Toon van Waterschoot, Marc Moonen:
Distributed Combined Acoustic Echo Cancellation and Noise Reduction in Wireless Acoustic Sensor and Actuator Networks. 534-547
- Lukas Grinewitschus, Peter Jung:
The Harmonic Shift Algorithm for Efficient Multi-Pitch Detection. 548-561
- Ziyao Lu, Xiang Li, Yang Liu, Chulun Zhou, Jianwei Cui, Bin Wang, Min Zhang, Jinsong Su:
Exploring Multi-Stage Information Interactions for Multi-Source Neural Machine Translation. 562-570
- Jingxuan Yang, Si Li, Sheng Gao, Jun Guo:
CorefDPR: A Joint Model for Coreference Resolution and Dropped Pronoun Recovery in Chinese Conversations. 571-581
- Timuçin Berk Atalay, Zühre Sü Gül, Enzo De Sena, Zoran Cvetkovic, Hüseyin Hacihabiboglu:
Scattering Delay Network Simulator of Coupled Volume Acoustics. 582-593
- Yi Zhang, Lei Li, Yunfang Wu, Qi Su, Xu Sun:
Alleviating the Knowledge-Language Inconsistency: A Study for Deep Commonsense Knowledge. 594-604
- Ke Tan, Zhong-Qiu Wang, DeLiang Wang:
Neural Spectrospatial Filtering. 605-621
- Qianren Mao, Jianxin Li, Chenghua Lin, Congwen Chen, Hao Peng, Lihong Wang, Philip S. Yu:
Adaptive Pre-Training and Collaborative Fine-Tuning: A Win-Win Strategy to Improve Review Analysis Tasks. 622-634
- Zifeng Cheng, Zhiwei Jiang, Yafeng Yin, Cong Wang, Qing Gu:
Learning to Classify Open Intent via Soft Labeling and Manifold Mixup. 635-645
- Xiaochun An, Frank K. Soong, Lei Xie:
Disentangling Style and Speaker Attributes for TTS Style Transfer. 646-658
- Zhuang Chen, Tieyun Qian:
Retrieve-and-Edit Domain Adaptation for End2End Aspect Based Sentiment Analysis. 659-672
- Jian Liu, Mengshi Yu, Yufeng Chen, Jinan Xu:
Cross-Domain Slot Filling as Machine Reading Comprehension: A New Perspective. 673-685
- Yongkang Liu, Qingbao Huang, Jing Li, Linzhang Mo, Yi Cai, Qing Li:
SSAP: Storylines and Sentiment Aware Pre-Trained Model for Story Ending Generation. 686-694
- Ying Zhou, Xuefeng Liang, Yu Gu, Yifei Yin, Longshan Yao:
Multi-Classifier Interactive Learning for Ambiguous Speech Emotion Recognition. 695-705
- Poul Hoang, Jan Mark de Haan, Zheng-Hua Tan, Jesper Jensen:
Multichannel Speech Enhancement With Own Voice-Based Interfering Speech Suppression for Hearing Assistive Devices. 706-720
- Weijie Yu, Chen Xu, Jun Xu, Liang Pang, Ji-Rong Wen:
Distribution Distance Regularized Sequence Representation for Text Matching in Asymmetrical Domains. 721-733
- Heming Wang, DeLiang Wang:
Neural Cascade Architecture With Triple-Domain Loss for Speech Enhancement. 734-743
- Riccardo R. De Lucia, Antonio Canclini, Fabio Antonacci, Augusto Sarti:
Group Dictionary Equivalent Source Method for Sparse Nearfield Acoustic Holography. 744-757
- Tong Ma, Ying Wei, Xin Lou:
Reconfigurable Nonuniform Filter Bank for Hearing Aid Systems. 758-771
- Victoria Mingote, Antonio Miguel, Dayana Ribas, Alfonso Ortega, Eduardo Lleida:
aDCF Loss Function for Deep Metric Learning in End-to-End Text-Dependent Speaker Verification Systems. 772-784
- Quansheng Tu, Huawei Chen:
Theoretical Lower Bounds on the Performance of the First-Order Differential Microphone Arrays With Sensor Imperfections. 785-801
- Taihui Wang, Feiran Yang, Jun Yang:
Convolutive Transfer Function-Based Multichannel Nonnegative Matrix Factorization for Overdetermined Blind Source Separation. 802-815
- Yi Zhang, Guangyou Zhou, Zhiwen Xie, Jimmy Xiangji Huang:
HGEN: Learning Hierarchical Heterogeneous Graph Encoding for Math Word Problem Solving. 816-828
- Eduardo Fonseca, Xavier Favory, Jordi Pons, Frederic Font, Xavier Serra:
FSD50K: An Open Dataset of Human-Labeled Sound Events. 829-852
- Yi Lei, Shan Yang, Xinsheng Wang, Lei Xie:
MsEmoTTS: Multi-Scale Emotion Transfer, Prediction, and Control for Emotional Speech Synthesis. 853-864
- Tao Wang, Ruibo Fu, Jiangyan Yi, Jianhua Tao, Zhengqi Wen:
NeuralDPS: Neural Deterministic Plus Stochastic Model With Multiband Excitation for Noise-Controllable Waveform Generation. 865-878
- Simon Stone, Yingming Gao, Peter Birkholz:
Articulatory Synthesis of Vocalized /r/ Allophones in German. 879-889
- Prashant Serai, Vishal Sunder, Eric Fosler-Lussier:
Hallucination of Speech Recognition Errors With Sequence to Sequence Learning. 890-900
- Bin Wu, Sakriani Sakti, Jinsong Zhang, Satoshi Nakamura:
Modeling Unsupervised Empirical Adaptation by DPGMM and DPGMM-RNN Hybrid Model to Extract Perceptual Features for Low-Resource ASR. 901-916
- Mi Zhang, Tieyun Qian, Bing Liu:
Exploit Feature and Relation Hierarchy for Relation Extraction. 917-930
- Wenxiang Jiao, Xing Wang, Shilin He, Zhaopeng Tu, Irwin King, Michael R. Lyu:
Exploiting Inactive Examples for Natural Language Generation With Data Rejuvenation. 931-943
- Youzhi Tu, Man-Wai Mak:
Aggregating Frame-Level Information in the Spectral Domain With Self-Attention for Speaker Embedding. 944-957
- Zhixing Tan, Zeyuan Yang, Meng Zhang, Qun Liu, Maosong Sun, Yang Liu:
Dynamic Multi-Branch Layers for On-Device Neural Machine Translation. 958-967
- Weiwei Lin, Man-Wai Mak:
Mixture Representation Learning for Deep Speaker Embedding. 968-978
- Peng Zhu, Dawei Cheng, Fangzhou Yang, Yifeng Luo, Dingjiang Huang, Weining Qian, Aoying Zhou:
Improving Chinese Named Entity Recognition by Large-Scale Syntactic Dependency Graph. 979-991
- Xiaobo Liang, Lijun Wu, Juntao Li, Tao Qin, Min Zhang, Tie-Yan Liu:
Multi-Teacher Distillation With Single Model for Neural Machine Translation. 992-1002
- Xiaofeng Chen, Guohua Wang, Haopeng Ren, Yi Cai, Ho-fung Leung, Tao Wang:
Task-Adaptive Feature Fusion for Generalized Few-Shot Relation Classification in an Open World Environment. 1003-1015
- Yu-Chen Lin, Cheng Yu, Yi-Te Hsu, Szu-Wei Fu, Yu Tsao, Tei-Wei Kuo:
SEOFP-NET: Compression and Acceleration of Deep Neural Networks for Speech Enhancement Using Sign-Exponent-Only Floating-Points. 1016-1031
- Tomohiro Nakatani, Rintaro Ikeshita, Keisuke Kinoshita, Hiroshi Sawada, Naoyuki Kamo, Shoko Araki:
Switching Independent Vector Analysis and its Extension to Blind and Spatially Guided Convolutional Beamforming Algorithms. 1032-1047
- Jianhua Geng, Sifan Wang, Qinglai Liu, Xin Lou:
Multi-Level Time-Frequency Bins Selection for Direction of Arrival Estimation Using a Single Acoustic Vector Sensor. 1048-1060
- Qinzhuo Wu, Qi Zhang, Xuanjing Huang:
Automatic Math Word Problem Generation With Topic-Expression Co-Attention Mechanism and Reinforcement Learning. 1061-1072
- Michael Nigro, Sridhar Krishnan:
Multimodal System for Audio Scene Source Counting and Analysis. 1073-1082
- Yishu Peng, Sheng Zhang, Jiashu Zhang, Wei Xing Zheng:
Combined-Sample Multiband-Structured Subband Filtering Algorithms. 1083-1092
- Shoukang Hu, Xurong Xie, Mingyu Cui, Jiajun Deng, Shansong Liu, Jianwei Yu, Mengzhe Geng, Xunying Liu, Helen Meng:
Neural Architecture Search for LF-MMI Trained Time Delay Neural Networks. 1093-1107
- Xudong Dang, Wen Ma, Emanuël A. P. Habets, Hongyan Zhu:
TDOA-Based Robust Sound Source Localization With Sparse Regularization in Wireless Acoustic Sensor Networks. 1108-1123
- Shan Gao, Jing Lin, Xihong Wu, Tianshu Qu:
Sparse DNN Model for Frequency Expanding of Higher Order Ambisonics Encoding Process. 1124-1135
- Giovanni Pepe, Leonardo Gabrielli, Stefano Squartini, Carlo Tripodi, Nicolo Strozzi:
Deep Optimization of Parametric IIR Filters for Audio Equalization. 1136-1149
- Moa Lee, Junmo Lee, Joon-Hyuk Chang:
Non-Autoregressive Fully Parallel Deep Convolutional Neural Speech Synthesis. 1150-1159
- Liam Barrett, Junchao Hu, Peter Howell:
Systematic Review of Machine Learning Approaches for Detecting Developmental Stuttering. 1160-1172
- Sang-Hoon Lee, Hyeong-Rae Noh, Woo-Jeoung Nam, Seong-Whan Lee:
Duration Controllable Voice Conversion via Phoneme-Based Information Bottleneck. 1173-1183
- Zhihong Shao, Zhongqin Wu, Minlie Huang:
AdvExpander: Generating Natural Language Adversarial Examples by Expanding Text. 1184-1196
- Dhanunjaya Varma Devalraju, Padmanabhan Rajan:
Multiview Embeddings for Soundscape Classification. 1197-1206
- Chengyu Wang, Suyang Dai, Yipeng Wang, Fei Yang, Minghui Qiu, Kehan Chen, Wei Zhou, Jun Huang:
ARoBERT: An ASR Robust Pre-Trained Language Model for Spoken Language Understanding. 1207-1218
- Jonah Ong, Ba-Tuong Vo, Sven Nordholm, Ba-Ngu Vo, Diluka Moratuwage, Changbeom Shim:
Audio-Visual Based Online Multi-Source Separation. 1219-1234
- Leyang Cui, Yafu Li, Yue Zhang:
Label Attention Network for Structured Prediction. 1235-1248
- Sarinah Sutojo, Tobias May, Steven van de Par:
Segmentation of Multitalker Mixtures Based on Local Feature Contrasts and Auditory Glimpses. 1249-1262
- Hao Gao, Xuelei Feng, Yong Shen:
Weighted Loudspeaker Placement Method for Sound Field Reproduction. 1263-1276
- Gongping Huang, Jacob Benesty, Israel Cohen, Jingdong Chen:
Kronecker Product Multichannel Linear Filtering for Adaptive Weighted Prediction Error-Based Speech Dereverberation. 1277-1289
- Takehiro Sugimoto:
Loudness-Level-Chasing Algorithm for Multiformat Live Audio Production. 1290-1304
- Junshuang Wu, Richong Zhang, Yongyi Mao, Jinpeng Huai:
Dealing With Hierarchical Types and Label Noise in Fine-Grained Entity Typing. 1305-1318
- Anton Ragni, Mark J. F. Gales, Oliver Rose, Katherine M. Knill, Alexandros Kastanos, Qiujia Li, Preben Ness:
Increasing Context for Estimating Confidence Scores in Automatic Speech Recognition. 1319-1329
- Zhongxin Bai, Jianyu Wang, Xiao-Lei Zhang, Jingdong Chen:
End-to-End Speaker Verification via Curriculum Bipartite Ranking Weighted Binary Cross-Entropy. 1330-1344
- Shang-Yi Chuang, Hsin-Min Wang, Yu Tsao:
Improved Lite Audio-Visual Speech Enhancement. 1345-1359
- Gaofeng Cheng, Haoran Miao, Runyan Yang, Keqi Deng, Yonghong Yan:
ETEH: Unified Attention-Based End-to-End ASR and KWS Architecture. 1360-1373
- Ashutosh Pandey, DeLiang Wang:
Self-Attending RNN for Speech Enhancement to Improve Cross-Corpus Generalization. 1374-1385
- Di Jin, Shuyang Gao, Seokhwan Kim, Yang Liu, Dilek Hakkani-Tür:
Towards Textual Out-of-Domain Detection Without In-Domain Labels. 1386-1395
- K. Mrinalini, P. Vijayalakshmi, T. Nagarajan:
SBSim: A Sentence-BERT Similarity-Based Evaluation Metric for Indian Language Neural Machine Translation Systems. 1396-1406
- Changhong Wang, Emmanouil Benetos, Vincent Lostanlen, Elaine Chew:
Adaptive Scattering Transforms for Playing Technique Recognition. 1407-1421
- Danwei Cai, Weiqing Wang, Ming Li:
Incorporating Visual Information in Audio Based Self-Supervised Speaker Recognition. 1422-1435