


IEEE/ACM Transactions on Audio, Speech and Language Processing, Volume 30
Volume 30, 2022
- Qianying Liu, Wenyu Guan, Sujian Li, Fei Cheng, Daisuke Kawahara, Sadao Kurohashi:
RODA: Reverse Operation Based Data Augmentation for Solving Math Word Problems. 1-11
- Kai Zhen, Jongmo Sung, Mi Suk Lee, Seungkwon Beack, Minje Kim:
Scalable and Efficient Neural Speech Coding: A Hybrid Design. 12-25
- Sen Yang, Yang Liu, Dawei Feng, Dongsheng Li:
Text Generation From Data With Dynamic Planning. 26-34
- Stefan Liebich, Peter Vary:
Occlusion Effect Cancellation in Headphones and Hearing Devices - The Sister of Active Noise Cancellation. 35-48
- Zhuosheng Zhang, Haojie Yu, Hai Zhao, Masao Utiyama:
Which Apple Keeps Which Doctor Away? Colorful Word Representations With Visual Oracles. 49-59
- Zhenyu Wang, John H. L. Hansen:
Multi-Source Domain Adaptation for Text-Independent Forensic Speaker Recognition. 60-75
- Kengtao Zheng, Nankai Lin, Shengyi Jiang:
Unsupervised Character Embedding Correction and Candidate Word Denoising. 76-86
- Bing Ma, Haifeng Sun, Jingyu Wang, Qi Qi, Jianxin Liao:
Extractive Dialogue Summarization Without Annotation Based on Distantly Supervised Machine Reading Comprehension in Customer Service. 87-97
- Shengcai Liu, Ning Lu, Cheng Chen, Ke Tang:
Efficient Combinatorial Optimization for Word-Level Adversarial Textual Attack. 98-111
- Alessandro Terenzi, Nicola Ortolani, Inês Nolasco, Emmanouil Benetos, Stefania Cecchi:
Comparison of Feature Extraction Methods for Sound-Based Classification of Honey Bee Activity. 112-122
- Shuiyang Mao, P. C. Ching, Tan Lee:
Enhancing Segment-Based Speech Emotion Recognition by Iterative Self-Learning. 123-134
- Abdolreza Sabzi Shahrebabaki, Giampiero Salvi, Torbjørn Svendsen, Sabato Marco Siniscalchi:
Acoustic-to-Articulatory Mapping With Joint Optimization of Deep Speech Enhancement and Articulatory Inversion Models. 135-147
- Javier Jorge, Adrià Giménez, Joan Albert Silvestre-Cerdà, Jorge Civera, Alberto Sanchís, Alfons Juan:
Live Streaming Speech Recognition Using Deep Bidirectional LSTM Acoustic Models and Interpolated Language Models. 148-161
- P. V. Muhammed Shifas, Catalin Zorila, Yannis Stylianou:
End-to-End Neural Based Modification of Noisy Speech for Speech-in-Noise Intelligibility Improvement. 162-173
- Joon-Young Yang, Joon-Hyuk Chang:
VACE-WPE: Virtual Acoustic Channel Expansion Based on Neural Networks for Weighted Prediction Error-Based Speech Dereverberation. 174-189
- Chenpeng Du, Kai Yu:
Phone-Level Prosody Modelling With GMM-Based MDN for Diverse and Controllable Speech Synthesis. 190-201
- Haibin Wu, Xu Li, Andy T. Liu, Zhiyong Wu, Helen Meng, Hung-yi Lee:
Improving the Adversarial Robustness for Speaker Verification by Self-Supervised Learning. 202-217
- Mixiao Hou, Zheng Zhang, Qi Cao, David Zhang, Guangming Lu:
Multi-View Speech Emotion Recognition Via Collective Relation Construction. 218-229
- Da-Rong Liu, Po-Chun Hsu, Yi-Chen Chen, Sung-Feng Huang, Shun-Po Chuang, Da-Yi Wu, Hung-yi Lee:
Learning Phone Recognition From Unpaired Audio and Phone Sequences Based on Generative Adversarial Network. 230-243
- Yuting Zhao, Mamoru Komachi, Tomoyuki Kajiwara, Chenhui Chu:
Word-Region Alignment-Guided Multimodal Neural Machine Translation. 244-259
- Zhuosheng Zhang, Yiqing Zhang, Hai Zhao:
Syntax-Aware Multi-Spans Generation for Reading Comprehension. 260-268
- Pengfei Zhu, Zhuosheng Zhang, Hai Zhao, Xiaoguang Li:
DUMA: Reading Comprehension With Transposition Thinking. 269-279
- Jiayuan Xie, Ningxin Peng, Yi Cai, Tao Wang, Qingbao Huang:
Diverse Distractor Generation for Constructing High-Quality Multiple Choice Questions. 280-291
- Jie Zhang, Guanghui Zhang:
A Parametric Unconstrained Beamformer Based Binaural Noise Reduction for Assistive Hearing. 292-304
- Luca Turchet, Johan Pauwels:
Music Emotion Recognition: Intention of Composers-Performers Versus Perception of Musicians, Non-Musicians, and Listening Machines. 305-316
- Wenxin Hou, Han Zhu, Yidong Wang, Jindong Wang, Tao Qin, Renjun Xu, Takahiro Shinozaki:
Exploiting Adapters for Cross-Lingual Low-Resource Speech Recognition. 317-329
- Kehai Chen, Rui Wang, Masao Utiyama, Eiichiro Sumita:
Integrating Prior Translation Knowledge Into Neural Machine Translation. 330-339
- Keqi Deng, Gaofeng Cheng, Runyan Yang, Yonghong Yan:
Alleviating ASR Long-Tailed Problem by Decoupling the Learning of Representation and Classification. 340-354
- Zuchao Li, Junru Zhou, Hai Zhao, Kevin Parnow:
HPSG-Inspired Joint Neural Constituent and Dependency Parsing in O($n^3$) Time Complexity. 355-366
- Xuan Shi, Erica Cooper, Junichi Yamagishi:
Use of Speaker Recognition Approaches for Learning and Evaluating Embedding Representations of Musical Instrument Sounds. 367-377
- Zengwei Yao, Wenjie Pei, Fanglin Chen, Guangming Lu, David Zhang:
Stepwise-Refining Speech Separation Network via Fine-Grained Encoding in High-Order Latent Domain. 378-393
- Yanmin Qian, Zhikai Zhou:
Optimizing Data Usage for Low-Resource Speech Recognition. 394-403
- Narla John Metilda Sagaya Mary, Srinivasan Umesh, Sandesh Varadaraju Katta:
S-Vectors and TESA: Speaker Embeddings and a Speaker Authenticator Based on Transformer Encoder. 404-413
- Bengt J. Borgström:
Bayesian Estimation of PLDA in the Presence of Noisy Training Labels, With Applications to Speaker Verification. 414-428
- Menglong Lu, Zhen Huang, Binyang Li, Yunxiang Zhao, Zheng Qin, Dong Sheng Li:
SIFTER: A Framework for Robust Rumor Detection. 429-442
- Lantian Li, Dong Wang, Jiawen Kang, Renyu Wang, Jing Wu, Zhendong Gao, Xiao Chen:
A Principle Solution for Enroll-Test Mismatch in Speaker Recognition. 443-455
- Feiran Yang:
Analysis of Deficient-Length Partitioned-Block Frequency-Domain Adaptive Filters. 456-467
- Hui Jiang, Linfeng Song, Yubin Ge, Fandong Meng, Junfeng Yao, Jinsong Su:
An AST Structure Enhanced Decoder for Code Generation. 468-476
- Anssi Kanervisto, Ville Hautamäki, Tomi Kinnunen, Junichi Yamagishi:
Optimizing Tandem Speaker Verification and Anti-Spoofing Systems. 477-488
- Xin Ni, Jia Ren:
FC-U2-Net: A Novel Deep Neural Network for Singing Voice Separation. 489-494
- Neil Zeghidour, Alejandro Luebs, Ahmed Omran, Jan Skoglund, Marco Tagliasacchi:
SoundStream: An End-to-End Neural Audio Codec. 495-507
- Wageesha Manamperi, Thushara D. Abhayapala, Jihui Zhang, Prasanga N. Samarasinghe:
Drone Audition: Sound Source Localization Using On-Board Microphones. 508-519
- Qian Li, Hao Peng, Jianxin Li, Jia Wu, Yuanxing Ning, Lihong Wang, Philip S. Yu, Zheng Wang:
Reinforcement Learning-Based Dialogue Guided Event Extraction to Exploit Argument Relations. 520-533
- Santiago Ruiz, Toon van Waterschoot, Marc Moonen:
Distributed Combined Acoustic Echo Cancellation and Noise Reduction in Wireless Acoustic Sensor and Actuator Networks. 534-547
- Lukas Grinewitschus, Peter Jung:
The Harmonic Shift Algorithm for Efficient Multi-Pitch Detection. 548-561
- Ziyao Lu, Xiang Li, Yang Liu, Chulun Zhou, Jianwei Cui, Bin Wang, Min Zhang, Jinsong Su:
Exploring Multi-Stage Information Interactions for Multi-Source Neural Machine Translation. 562-570
- Jingxuan Yang, Si Li, Sheng Gao, Jun Guo:
CorefDPR: A Joint Model for Coreference Resolution and Dropped Pronoun Recovery in Chinese Conversations. 571-581
- Timuçin Berk Atalay, Zühre Sü Gül, Enzo De Sena, Zoran Cvetkovic, Hüseyin Hacihabiboglu:
Scattering Delay Network Simulator of Coupled Volume Acoustics. 582-593
- Yi Zhang, Lei Li, Yunfang Wu, Qi Su, Xu Sun:
Alleviating the Knowledge-Language Inconsistency: A Study for Deep Commonsense Knowledge. 594-604
- Ke Tan, Zhong-Qiu Wang, DeLiang Wang:
Neural Spectrospatial Filtering. 605-621
- Qianren Mao, Jianxin Li, Chenghua Lin, Congwen Chen, Hao Peng, Lihong Wang, Philip S. Yu:
Adaptive Pre-Training and Collaborative Fine-Tuning: A Win-Win Strategy to Improve Review Analysis Tasks. 622-634
- Zifeng Cheng, Zhiwei Jiang, Yafeng Yin, Cong Wang, Qing Gu:
Learning to Classify Open Intent via Soft Labeling and Manifold Mixup. 635-645
- Xiaochun An, Frank K. Soong, Lei Xie:
Disentangling Style and Speaker Attributes for TTS Style Transfer. 646-658
- Zhuang Chen, Tieyun Qian:
Retrieve-and-Edit Domain Adaptation for End2End Aspect Based Sentiment Analysis. 659-672
- Jian Liu, Mengshi Yu, Yufeng Chen, Jinan Xu:
Cross-Domain Slot Filling as Machine Reading Comprehension: A New Perspective. 673-685
- Yongkang Liu, Qingbao Huang, Jing Li, Linzhang Mo, Yi Cai, Qing Li:
SSAP: Storylines and Sentiment Aware Pre-Trained Model for Story Ending Generation. 686-694
- Ying Zhou, Xuefeng Liang, Yu Gu, Yifei Yin, Longshan Yao:
Multi-Classifier Interactive Learning for Ambiguous Speech Emotion Recognition. 695-705
- Poul Hoang, Jan Mark de Haan, Zheng-Hua Tan, Jesper Jensen:
Multichannel Speech Enhancement With Own Voice-Based Interfering Speech Suppression for Hearing Assistive Devices. 706-720
- Weijie Yu, Chen Xu, Jun Xu, Liang Pang, Ji-Rong Wen:
Distribution Distance Regularized Sequence Representation for Text Matching in Asymmetrical Domains. 721-733
- Heming Wang, DeLiang Wang:
Neural Cascade Architecture With Triple-Domain Loss for Speech Enhancement. 734-743
- Riccardo R. De Lucia, Antonio Canclini, Fabio Antonacci, Augusto Sarti:
Group Dictionary Equivalent Source Method for Sparse Nearfield Acoustic Holography. 744-757
- Tong Ma, Ying Wei, Xin Lou:
Reconfigurable Nonuniform Filter Bank for Hearing Aid Systems. 758-771
- Victoria Mingote, Antonio Miguel, Dayana Ribas, Alfonso Ortega, Eduardo Lleida:
aDCF Loss Function for Deep Metric Learning in End-to-End Text-Dependent Speaker Verification Systems. 772-784
- Quansheng Tu, Huawei Chen:
Theoretical Lower Bounds on the Performance of the First-Order Differential Microphone Arrays With Sensor Imperfections. 785-801
- Taihui Wang, Feiran Yang, Jun Yang:
Convolutive Transfer Function-Based Multichannel Nonnegative Matrix Factorization for Overdetermined Blind Source Separation. 802-815
- Yi Zhang, Guangyou Zhou, Zhiwen Xie, Jimmy Xiangji Huang:
HGEN: Learning Hierarchical Heterogeneous Graph Encoding for Math Word Problem Solving. 816-828
- Eduardo Fonseca, Xavier Favory, Jordi Pons, Frederic Font, Xavier Serra:
FSD50K: An Open Dataset of Human-Labeled Sound Events. 829-852
- Yi Lei, Shan Yang, Xinsheng Wang, Lei Xie:
MsEmoTTS: Multi-Scale Emotion Transfer, Prediction, and Control for Emotional Speech Synthesis. 853-864
- Tao Wang, Ruibo Fu, Jiangyan Yi, Jianhua Tao, Zhengqi Wen:
NeuralDPS: Neural Deterministic Plus Stochastic Model With Multiband Excitation for Noise-Controllable Waveform Generation. 865-878
- Simon Stone, Yingming Gao, Peter Birkholz:
Articulatory Synthesis of Vocalized /r/ Allophones in German. 879-889
- Prashant Serai, Vishal Sunder, Eric Fosler-Lussier:
Hallucination of Speech Recognition Errors With Sequence to Sequence Learning. 890-900
- Bin Wu, Sakriani Sakti, Jinsong Zhang, Satoshi Nakamura:
Modeling Unsupervised Empirical Adaptation by DPGMM and DPGMM-RNN Hybrid Model to Extract Perceptual Features for Low-Resource ASR. 901-916
- Mi Zhang, Tieyun Qian, Bing Liu:
Exploit Feature and Relation Hierarchy for Relation Extraction. 917-930
- Wenxiang Jiao, Xing Wang, Shilin He, Zhaopeng Tu, Irwin King, Michael R. Lyu:
Exploiting Inactive Examples for Natural Language Generation With Data Rejuvenation. 931-943
- Youzhi Tu, Man-Wai Mak:
Aggregating Frame-Level Information in the Spectral Domain With Self-Attention for Speaker Embedding. 944-957
- Zhixing Tan, Zeyuan Yang, Meng Zhang, Qun Liu, Maosong Sun, Yang Liu:
Dynamic Multi-Branch Layers for On-Device Neural Machine Translation. 958-967
- Weiwei Lin, Man-Wai Mak:
Mixture Representation Learning for Deep Speaker Embedding. 968-978
- Peng Zhu, Dawei Cheng, Fangzhou Yang, Yifeng Luo, Dingjiang Huang, Weining Qian, Aoying Zhou:
Improving Chinese Named Entity Recognition by Large-Scale Syntactic Dependency Graph. 979-991
- Xiaobo Liang, Lijun Wu, Juntao Li, Tao Qin, Min Zhang, Tie-Yan Liu:
Multi-Teacher Distillation With Single Model for Neural Machine Translation. 992-1002
- Xiaofeng Chen, Guohua Wang, Haopeng Ren, Yi Cai, Ho-fung Leung, Tao Wang:
Task-Adaptive Feature Fusion for Generalized Few-Shot Relation Classification in an Open World Environment. 1003-1015
- Yu-Chen Lin, Cheng Yu, Yi-Te Hsu, Szu-Wei Fu, Yu Tsao, Tei-Wei Kuo:
SEOFP-NET: Compression and Acceleration of Deep Neural Networks for Speech Enhancement Using Sign-Exponent-Only Floating-Points. 1016-1031
- Tomohiro Nakatani, Rintaro Ikeshita, Keisuke Kinoshita, Hiroshi Sawada, Naoyuki Kamo, Shoko Araki:
Switching Independent Vector Analysis and its Extension to Blind and Spatially Guided Convolutional Beamforming Algorithms. 1032-1047
- Jianhua Geng, Sifan Wang, Qinglai Liu, Xin Lou:
Multi-Level Time-Frequency Bins Selection for Direction of Arrival Estimation Using a Single Acoustic Vector Sensor. 1048-1060
- Qinzhuo Wu, Qi Zhang, Xuanjing Huang:
Automatic Math Word Problem Generation With Topic-Expression Co-Attention Mechanism and Reinforcement Learning. 1061-1072
- Michael Nigro, Sridhar Krishnan:
Multimodal System for Audio Scene Source Counting and Analysis. 1073-1082
- Yishu Peng, Sheng Zhang, Jiashu Zhang, Wei Xing Zheng:
Combined-Sample Multiband-Structured Subband Filtering Algorithms. 1083-1092
- Shoukang Hu, Xurong Xie, Mingyu Cui, Jiajun Deng, Shansong Liu, Jianwei Yu, Mengzhe Geng, Xunying Liu, Helen Meng:
Neural Architecture Search for LF-MMI Trained Time Delay Neural Networks. 1093-1107
- Xudong Dang, Wen Ma, Emanuël A. P. Habets, Hongyan Zhu:
TDOA-Based Robust Sound Source Localization With Sparse Regularization in Wireless Acoustic Sensor Networks. 1108-1123
- Shan Gao, Jing Lin, Xihong Wu, Tianshu Qu:
Sparse DNN Model for Frequency Expanding of Higher Order Ambisonics Encoding Process. 1124-1135
- Giovanni Pepe, Leonardo Gabrielli, Stefano Squartini, Carlo Tripodi, Nicolo Strozzi:
Deep Optimization of Parametric IIR Filters for Audio Equalization. 1136-1149
- Moa Lee, Junmo Lee, Joon-Hyuk Chang:
Non-Autoregressive Fully Parallel Deep Convolutional Neural Speech Synthesis. 1150-1159
- Liam Barrett, Junchao Hu, Peter Howell:
Systematic Review of Machine Learning Approaches for Detecting Developmental Stuttering. 1160-1172
- Sang-Hoon Lee, Hyeong-Rae Noh, Woo-Jeoung Nam, Seong-Whan Lee:
Duration Controllable Voice Conversion via Phoneme-Based Information Bottleneck. 1173-1183
- Zhihong Shao, Zhongqin Wu, Minlie Huang:
AdvExpander: Generating Natural Language Adversarial Examples by Expanding Text. 1184-1196
- Dhanunjaya Varma Devalraju, Padmanabhan Rajan:
Multiview Embeddings for Soundscape Classification. 1197-1206
- Chengyu Wang, Suyang Dai, Yipeng Wang, Fei Yang, Minghui Qiu, Kehan Chen, Wei Zhou, Jun Huang:
ARoBERT: An ASR Robust Pre-Trained Language Model for Spoken Language Understanding. 1207-1218
- Jonah Ong, Ba-Tuong Vo, Sven Nordholm, Ba-Ngu Vo, Diluka Moratuwage, Changbeom Shim:
Audio-Visual Based Online Multi-Source Separation. 1219-1234
- Leyang Cui, Yafu Li, Yue Zhang:
Label Attention Network for Structured Prediction. 1235-1248
- Sarinah Sutojo, Tobias May, Steven van de Par:
Segmentation of Multitalker Mixtures Based on Local Feature Contrasts and Auditory Glimpses. 1249-1262
- Hao Gao, Xuelei Feng, Yong Shen:
Weighted Loudspeaker Placement Method for Sound Field Reproduction. 1263-1276
- Gongping Huang, Jacob Benesty, Israel Cohen, Jingdong Chen:
Kronecker Product Multichannel Linear Filtering for Adaptive Weighted Prediction Error-Based Speech Dereverberation. 1277-1289
- Takehiro Sugimoto:
Loudness-Level-Chasing Algorithm for Multiformat Live Audio Production. 1290-1304
- Junshuang Wu, Richong Zhang, Yongyi Mao, Jinpeng Huai:
Dealing With Hierarchical Types and Label Noise in Fine-Grained Entity Typing. 1305-1318
- Anton Ragni, Mark J. F. Gales, Oliver Rose, Katherine M. Knill, Alexandros Kastanos, Qiujia Li, Preben Ness:
Increasing Context for Estimating Confidence Scores in Automatic Speech Recognition. 1319-1329
- Zhongxin Bai, Jianyu Wang, Xiao-Lei Zhang, Jingdong Chen:
End-to-End Speaker Verification via Curriculum Bipartite Ranking Weighted Binary Cross-Entropy. 1330-1344
- Shang-Yi Chuang, Hsin-Min Wang, Yu Tsao:
Improved Lite Audio-Visual Speech Enhancement. 1345-1359
- Gaofeng Cheng, Haoran Miao, Runyan Yang, Keqi Deng, Yonghong Yan:
ETEH: Unified Attention-Based End-to-End ASR and KWS Architecture. 1360-1373
- Ashutosh Pandey, DeLiang Wang:
Self-Attending RNN for Speech Enhancement to Improve Cross-Corpus Generalization. 1374-1385
- Di Jin, Shuyang Gao, Seokhwan Kim, Yang Liu, Dilek Hakkani-Tür:
Towards Textual Out-of-Domain Detection Without In-Domain Labels. 1386-1395
- K. Mrinalini, P. Vijayalakshmi, T. Nagarajan:
SBSim: A Sentence-BERT Similarity-Based Evaluation Metric for Indian Language Neural Machine Translation Systems. 1396-1406
- Changhong Wang, Emmanouil Benetos, Vincent Lostanlen, Elaine Chew:
Adaptive Scattering Transforms for Playing Technique Recognition. 1407-1421
- Danwei Cai, Weiqing Wang, Ming Li:
Incorporating Visual Information in Audio Based Self-Supervised Speaker Recognition. 1422-1435
- Yu Luo, Lina Pu:
EC-ANC: Edge Case-Enhanced Active Noise Cancellation for True Wireless Stereo Earbuds. 1436-1447
- Tao Li, Xinsheng Wang, Qicong Xie, Zhichao Wang, Lei Xie:
Cross-Speaker Emotion Disentangling and Transfer for End-to-End Speech Synthesis. 1448-1460
- Yilin Zhao, Zhuosheng Zhang, Hai Zhao:
Reference Knowledgeable Network for Machine Reading Comprehension. 1461-1473
- Fu-Hao Yu, Kuan-Yu Chen, Ke-Han Lu:
Non-Autoregressive ASR Modeling Using Pre-Trained Language Models for Chinese Speech Recognition. 1474-1482
- Yiming Cui, Ting Liu, Wanxiang Che, Zhigang Chen, Shijin Wang:
Teaching Machines to Read, Answer and Explain. 1483-1492
- Shota Horiguchi, Yusuke Fujita, Shinji Watanabe, Yawen Xue, Paola García:
Encoder-Decoder Based Attractors for End-to-End Neural Diarization. 1493-1507
- Chenda Li, Zhuo Chen, Yanmin Qian:
Dual-Path Modeling With Memory Embedding Model for Continuous Speech Separation. 1508-1520
- Yu Tong, Jingzhi Guo, Jizhe Zhou:
Separation Inference: A Unified Framework for Word Segmentation in East Asian Languages. 1521-1530
- Hitoshi Suda, Daisuke Saito, Satoru Fukayama, Tomoyasu Nakano, Masataka Goto:
Singer Diarization for Polyphonic Music With Unison Singing. 1531-1545
- Xinnian Liang, Jing Li, Shuangzhi Wu, Mu Li, Zhoujun Li:
Improving Unsupervised Extractive Summarization by Jointly Modeling Facet and Redundancy. 1546-1557
- Sung-Feng Huang, Chyi-Jiunn Lin, Da-Rong Liu, Yi-Chen Chen, Hung-yi Lee:
Meta-TTS: Meta-Learning for Few-Shot Speaker Adaptive Text-to-Speech. 1558-1571
- Ziyi Xu, Maximilian Strake, Tim Fingscheidt:
Deep Noise Suppression Maximizing Non-Differentiable PESQ Mediated by a Non-Intrusive PESQNet. 1572-1585
- Lin Li, Fuchuan Tong, Qingyang Hong:
When Speaker Recognition Meets Noisy Labels: Optimizations for Front-Ends and Back-Ends. 1586-1599
- Vinay Kothapally, John H. L. Hansen:
SkipConvGAN: Monaural Speech Dereverberation Using Generative Adversarial Networks via Complex Time-Frequency Masking. 1600-1613
- Hiromu Yakura, Kento Watanabe, Masataka Goto:
Self-Supervised Contrastive Learning for Singing Voices. 1614-1623
- Chen Gong, Zhenghua Li, Min Zhang:
Neural Coupled Sequence Labeling for Heterogeneous Annotation Conversion. 1624-1636
- Fangfang Su, Yue Zhang, Fei Li, Donghong Ji:
Balancing Precision and Recall for Neural Biomedical Event Extraction. 1637-1649
- Zexu Pan, Ruijie Tao, Chenglin Xu, Haizhou Li:
Selective Listening by Synchronizing Speech With Lips. 1650-1664
- Qianren Mao, Jianxin Li, Hao Peng, Shizhu He, Lihong Wang, Philip S. Yu, Zheng Wang:
Fact-Driven Abstractive Summarization by Utilizing Multi-Granular Multi-Relational Knowledge. 1665-1678
- Neeraj Kumar, Ankur Narang, Brejesh Lall:
Zero-Shot Normalization Driven Multi-Speaker Text to Speech Synthesis. 1679-1693
- Carlos Tarjano, Valdecy Pereira:
An Efficient Algorithm for Segmenting Quasi-Periodic Digital Signals Into Pseudo Cycles: Application in Lossy Audio Compression. 1694-1703
- Han Wang, Hongling Sun, Jianfeng Guo, Ming Wu, Jun Yang:
Analysis of the Frequency Interference in the Narrowband Active Noise Control System. 1704-1717
- Maryam Hosseini, Luca Celotti, Eric Plourde:
End-to-End Brain-Driven Speech Enhancement in Multi-Talker Conditions. 1718-1733
- Mathieu Fontaine, Kouhei Sekiguchi, Aditya Arie Nugraha, Yoshiaki Bando, Kazuyoshi Yoshii:
Generalized Fast Multichannel Nonnegative Matrix Factorization Based on Gaussian Scale Mixtures for Blind Source Separation. 1734-1748
- Thi Ngoc Tho Nguyen, Karn N. Watcharasupat, Ngoc Khanh Nguyen, Douglas L. Jones, Woon-Seng Gan:
SALSA: Spatial Cue-Augmented Log-Spectrogram Features for Polyphonic Sound Event Localization and Detection. 1749-1762
- Changfeng Gao, Gaofeng Cheng, Ta Li, Pengyuan Zhang, Yonghong Yan:
Self-Supervised Pre-Training for Attention-Based Encoder-Decoder ASR Model. 1763-1774
- Jiacheng Ye, Xiang Zhou, Xiaoqing Zheng, Tao Gui, Qi Zhang:
Uncertainty-Aware Sequence Labeling. 1775-1788
- Rui Liu, Berrak Sisman, Guanglai Gao, Haizhou Li:
Decoding Knowledge Transfer for Neural Text-to-Speech Training. 1789-1802
- Weiquan Fan, Xiangmin Xu, Bolun Cai, Xiaofen Xing:
ISNet: Individual Standardization Network for Speech Emotion Recognition. 1803-1814
- Jianfeng Wu, Sijie Mai, Haifeng Hu:
Interpretable Multimodal Capsule Fusion. 1815-1826
- Qian Wang, Jiajun Zhang, Chengqing Zong:
Synchronous Inference for Multilingual Neural Machine Translation. 1827-1839
- Jilu Jin, Jacob Benesty, Gongping Huang, Jingdong Chen:
On Differential Beamforming With Nonuniform Linear Microphone Arrays. 1840-1852
- Chiranjibi Sitaula, Jinyuan He, Archana Priyadarshi, Mark B. Tracy, Omid Kavehei, Murray Hinder, Anusha Withana, Alistair Lee McEwan, Faezeh Marzbanrad:
Neonatal Bowel Sound Detection Using Convolutional Neural Network and Laplace Hidden Semi-Markov Model. 1853-1864
- Chao Pan, Jingdong Chen, Jacob Benesty:
Microphone Array Beamforming With High Flexible Interference Attenuation and Noise Reduction. 1865-1876
- Han Li, Kean Chen, Bernhard U. Seeber:
Gestalt Principles Emerge When Learning Universal Sound Source Separation. 1877-1891
- Juan Manuel Miramont, Marcelo Alejandro Colominas, Gastón Schlotthauer:
Emulating Perceptual Evaluation of Voice Using Scattering Transform Based Features. 1892-1901
- Lucas Ondel, Bolaji Yusuf, Lukás Burget, Murat Saraçlar:
Non-Parametric Bayesian Subspace Models for Acoustic Unit Discovery. 1902-1917
- Jialu Li, Mark Hasegawa-Johnson:
Autosegmental Neural Nets 2.0: An Extensive Study of Training Synchronous and Asynchronous Phones and Tones for Under-Resourced Tonal Languages. 1918-1926
- Yu Lu, Jiajun Zhang, Jiali Zeng, Shuangzhi Wu, Chengqing Zong:
Attention Analysis and Calibration for Transformer in Natural Language Generation. 1927-1938
- Minseung Kim, Jong Won Shin:
Improved Speech Enhancement Considering Speech PSD Uncertainty. 1939-1951
- Avital Kleiman, Israel Cohen, Baruch Berdugo:
Constant-Beamwidth Beamforming With Nonuniform Concentric Ring Arrays. 1952-1962
- Jacopo de Berardinis, Angelo Cangelosi, Eduardo Coutinho:
Measuring the Structural Complexity of Music: From Structural Segmentations to the Automatic Evaluation of Models for Music Generation. 1963-1976
- Dino Oglic, Zoran Cvetkovic, Peter Sollich, Steve Renals, Bin Yu:
Towards Robust Waveform-Based Acoustic Models. 1977-1992
- Bo Chen, Chenpeng Du, Kai Yu:
Neural Fusion for Voice Cloning. 1993-2001
- Saurabhchand Bhati, Jesús Villalba, Piotr Zelasko, Laureano Moro-Velázquez, Najim Dehak:
Unsupervised Speech Segmentation and Variable Rate Representation Learning Using Segmental Contrastive Predictive Coding. 2002-2014
- Bo Yang, Lijun Wu, Jinhua Zhu, Bo Shao, Xiaola Lin, Tie-Yan Liu:
Multimodal Sentiment Analysis With Two-Phase Multi-Task Learning. 2015-2024
- Songbin Li, Jingang Wang, Peng Liu:
General Frame-Wise Steganalysis of Compressed Speech Based on Dual-Domain Representation and Intra-Frame Correlation Leaching. 2025-2035
- Yang Ai, Zhen-Hua Ling, Wei-Lu Wu, Ang Li:
Denoising-and-Dereverberation Hierarchical Neural Vocoder for Statistical Parametric Speech Synthesis. 2036-2048
- Alastair H. Moore, Sina Hafezi, Rebecca R. Vos, Patrick A. Naylor, Mike Brookes:
A Compact Noise Covariance Matrix Model for MVDR Beamforming. 2049-2061
- Leo McCormack, Archontis Politis, Raimundo Gonzalez, Tapio Lokki, Ville Pulkki:
Parametric Ambisonic Encoding of Arbitrary Microphone Arrays. 2062-2075
- Adriana Fernandez-Lopez, Federico M. Sukno:
End-to-End Lip-Reading Without Large-Scale Data. 2076-2090
- Jinwon An, Sungzoon Cho, Junseong Bang, Misuk Kim:
Domain-Slot Relationship Modeling Using a Pre-Trained Language Encoder for Multi-Domain Dialogue State Tracking. 2091-2102
- Roberto San Millán-Castillo, Luca Martino, Eduardo Morgado, Fernando Llorente:
An Exhaustive Variable Selection Study for Linear Models of Soundscape Emotions: Rankings and Gibbs Analysis. 2460-2474
- Kunkun SongGong, Wenwu Wang, Huawei Chen:
Acoustic Source Localization in the Circular Harmonic Domain Using Deep Learning Architecture. 2475-2491
- Yanjue Song, Nilesh Madhu:
Improved CEM for Speech Harmonic Enhancement in Single Channel Noise Suppression. 2492-2503
- Anderson Queiroz, Rosângela Coelho:
Noisy Speech Based Temporal Decomposition to Improve Fundamental Frequency Estimation. 2504-2513
- Myeongjun Jang, Thomas Lukasiewicz:
NoiER: An Approach for Training More Reliable Fine-Tuned Downstream Task Models. 2514-2525
- Jiayi Wang, Rongzhou Bao, Zhuosheng Zhang, Hai Zhao:
Rethinking Textual Adversarial Defense for Pre-Trained Language Models. 2526-2540
- Sungho Lee, Hyeong-Seok Choi, Kyogu Lee:
Differentiable Artificial Reverberation. 2541-2556
- Chuang Fan, Jiaming Li, Xuan Luo, Ruifeng Xu:
Enhancing Structure Preservation in Coreference Resolution by Constrained Graph Encoding. 2557-2567
- Richong Zhang, Qianben Chen, Yaowei Zheng, Samuel Mensah, Yongyi Mao:
Aspect-Level Sentiment Analysis via a Syntax-Based Neural Network. 2568-2583
- Moti Lugasi, Anjali Menon, Vladimir Tourbabin, Boaz Rafaely:
Spatial Audio Signal Enhancement by a Two-Stage Source - System Estimation With Frequency Smoothing for Improved Perception. 2584-2596
- Mengzhe Geng, Xurong Xie, Zi Ye, Tianzi Wang, Guinan Li, Shujie Hu, Xunying Liu, Helen Meng:
Speaker Adaptation Using Spectro-Temporal Deep Features for Dysarthric and Elderly Speech Recognition. 2597-2611
- Elior Hadad, Simon Doclo, Sven Nordholm, Sharon Gannot:
A Class of Pareto Optimal Binaural Beamformers. 2612-2628
- Guochen Yu, Andong Li, Hui Wang, Yutian Wang, Yuxuan Ke, Chengshi Zheng:
DBT-Net: Dual-Branch Federative Magnitude and Phase Estimation With Attention-in-Attention Transformer for Monaural Speech Enhancement. 2629-2644
- Weiqing Wang, Qingjian Lin, Danwei Cai, Ming Li:
Similarity Measurement of Segment-Level Speaker Embeddings in Speaker Diarization. 2645-2658
- Sunwoo Kim, Minje Kim:
Boosted Locality Sensitive Hashing: Discriminative, Efficient, and Scalable Binary Codes for Source Separation. 2659-2672
- Sashi Novitasari, Sakriani Sakti, Satoshi Nakamura:
A Machine Speech Chain Approach for Dynamically Adaptive Lombard TTS in Static and Dynamic Noise Environments. 2673-2688
- Qiupu Chen, Guimin Huang, Yabing Wang:
The Weighted Cross-Modal Attention Mechanism With Sentiment Prediction Auxiliary Task for Multimodal Sentiment Analysis. 2689-2695
- Leilei Gan, Zhiyang Teng, Yue Zhang, Linchao Zhu, Fei Wu, Yi Yang:
SemGloVe: Semantic Co-Occurrences for GloVe From BERT. 2696-2704
- Suliang Bu, Yunxin Zhao, Tuo Zhao, Shaojun Wang, Mei Han:
Modeling Speech Structure to Improve T-F Masks for Speech Enhancement and Recognition. 2705-2715
- Hendrik Schröter, Tobias Rosenkranz, Alberto N. Escalante-B., Andreas K. Maier:
Low Latency Speech Enhancement for Hearing Aids Using Deep Filtering. 2716-2728
- Zhihao Zhang, Yuan Zuo, Junjie Wu:
Aspect Sentiment Triplet Extraction: A Seq2Seq Approach With Span Copy Enhanced Dual Decoder. 2729-2742
- Leonardo Gabrielli, Stefano D'Angelo, Pier Paolo La Pastina, Stefano Squartini:
Antiderivative Antialiasing for Arbitrary Waveform Generation. 2743-2753
- Huadong Wang, Xin Shen, Mei Tu, Yimeng Zhuang, Zhiyuan Liu:
Improved Transformer With Multi-Head Dense Collaboration. 2754-2767
- Chuang Shi, Feiyu Du, Qianyang Wu:
A Digital Twin Architecture for Wireless Networked Adaptive Active Noise Control. 2768-2777
- Wenmeng Xiong, Changchun Bao, Mao-shen Jia, José Picheral:
Speech Enhancement With Robust Beamforming for Spatially Overlapped and Distributed Sources. 2778-2790
- Hassan Taherian, Ke Tan, DeLiang Wang:
Multi-Channel Talker-Independent Speaker Separation Through Location-Based Training. 2791-2800
- Han Zhang, Bin Liang, Min Yang, Hui Wang, Ruifeng Xu:
Prompt-Based Prototypical Framework for Continual Relation Extraction. 2801-2813
- Christof Weiß, Geoffroy Peeters:
Comparing Deep Models and Evaluation Strategies for Multi-Pitch Estimation in Music Recordings. 2814-2827
- Laura-Maria Dogariu, Jacob Benesty, Constantin Paleologu, Silviu Ciochina:
Identification of Room Acoustic Impulse Responses via Kronecker Product Decompositions. 2828-2841
- Yanmin Qian, Xun Gong, Houjun Huang:
Layer-Wise Fast Adaptation for End-to-End Multi-Accent Speech Recognition. 2842-2853
- Liumeng Xue, Frank K. Soong, Shaofei Zhang, Lei Xie:
ParaTTS: Learning Linguistic and Prosodic Cross-Sentence Information in Paragraph-Based TTS. 2854-2864
- Soojoong Hwang, Minseung Kim, Jong Won Shin:
Dual Microphone Speech Enhancement Based on Statistical Modeling of Interchannel Phase Difference. 2865-2874
- Chao Pan, Jingdong Chen:
A Framework of Directional-Gain Beamforming and a White-Noise-Gain-Controlled Solution. 2875-2887
- Fei He, Xiaoyi Hu, Ce Zhu, Ying Li, Yipeng Liu:
Multi-Scale Spatial and Temporal Speech Associations to Swallowing for Dysphagia Screening. 2888-2899
- Boyang Xue, Shoukang Hu, Junhao Xu, Mengzhe Geng, Xunying Liu, Helen Meng:
Bayesian Neural Network Language Modeling for Speech Recognition. 2900-2917
- Kang Xu, Fei Li, Dongdong Xie, Donghong Ji:
Revisiting Aspect-Sentiment-Opinion Triplet Extraction: Detailed Analyses Towards a Simple and Effective Span-Based Model. 2918-2927
- Koichi Saito, Tomohiko Nakamura, Kohei Yatabe, Hiroshi Saruwatari:
Sampling-Frequency-Independent Convolutional Layer and its Application to Audio Source Separation. 2928-2943
- Juliano G. C. Ribeiro, Natsuki Ueno, Shoichi Koyama, Hiroshi Saruwatari:
Region-to-Region Kernel Interpolation of Acoustic Transfer Functions Constrained by Physical Properties. 2944-2954
- Ruixin Hong, Hongming Zhang, Xintong Yu, Changshui Zhang:
Learning Event Extraction From a Few Guideline Examples. 2955-2967
- Zhengjun Yue, Erfan Loweimi, Heidi Christensen, Jon Barker, Zoran Cvetkovic:
Acoustic Modelling From Raw Source and Filter Components for Dysarthric Speech Recognition. 2968-2980
- Gaku Kotani, Daisuke Saito, Nobuaki Minematsu:
Voice Conversion Based on Deep Neural Networks for Time-Variant Linear Transformations. 2981-2992
- Xiaoyu Bie, Simon Leglaive, Xavier Alameda-Pineda, Laurent Girin:
Unsupervised Speech Enhancement Using Dynamical Variational Autoencoders. 2993-3007
- Yi Luo:
A Time-Domain Real-Valued Generalized Wiener Filter for Multi-Channel Neural Separation Systems. 3008-3019
- Kaile Shi, Xiaoyan Cai, Libin Yang, Jintao Zhao, Shirui Pan:
StarSum: A Star Architecture Based Model for Extractive Summarization. 3020-3031
- Zexu Pan, Meng Ge, Haizhou Li:
USEV: Universal Speaker Extraction With Visual Cue. 3032-3045
- Yutao Xie, Qiyu Wu, Wei Chen, Tengjiao Wang:
Stable Contrastive Learning for Self-Supervised Sentence Embeddings With Pseudo-Siamese Mutual Learning. 3046-3059 - Lei Luo

, Wenzhao Zhu:
An Optimized Zero-Attracting LMS Algorithm for the Identification of Sparse System. 3060-3073 - Gongping Huang

, Jacob Benesty
, Jingdong Chen
:
Fundamental Approaches to Robust Differential Beamforming With High Directivity Factors. 3074-3088 - Xiaoqiang Wang

, Yanqing Liu, Jinyu Li
, Veljko Miljanic, Sheng Zhao, Hosam Khalil:
Towards Contextual Spelling Correction for Customization of End-to-End Speech Recognition Systems. 3089-3097 - Jiaxin Zhong

, Tao Zhuang
, Ray Kirby
, Mahmoud Karimi
, Xiaojun Qiu
, Haishan Zou
, Jing Lu
:
Low Frequency Audio Sound Field Generated by a Focusing Parametric Array Loudspeaker. 3098-3109 - Jens Ahrens

, Hannes Helmholz
, David Lou Alon, Sebastià Vicenc Amengual Garí
:
Spherical Harmonic Decomposition of a Sound Field Using Microphones on a Circumferential Contour Around a Non-Spherical Baffle. 3110-3119 - Yonggang Hu

, Prasanga N. Samarasinghe
, Sharon Gannot
, Thushara D. Abhayapala
:
Decoupled Multiple Speaker Direction-of-Arrival Estimator Under Reverberant Environments. 3120-3133 - Heming Wang

, Xueliang Zhang
, DeLiang Wang
:
Fusing Bone-Conduction and Air-Conduction Sensors for Complex-Domain Speech Enhancement. 3134-3143 - Joon-Young Yang

, Joon-Hyuk Chang
:
Task-Specific Optimization of Virtual Channel Linear Prediction-Based Speech Dereverberation Front-End for Far-Field Speaker Verification. 3144-3159 - Jian Liu

, Yufeng Chen, Jinan Xu:
MRCAug: Data Augmentation via Machine Reading Comprehension for Document-Level Event Argument Extraction. 3160-3172 - Wangyou Zhang

, Xuankai Chang
, Christoph Böddeker, Tomohiro Nakatani
, Shinji Watanabe
, Yanmin Qian
:
End-to-End Dereverberation, Beamforming, and Speech Recognition in a Cocktail Party. 3173-3188 - Michele Ducceschi

, Stefan Bilbao
:
Non-Iterative Simulation Methods for Virtual Analog Modelling. 3189-3198 - Miguel Ferrer

, Maria de Diego
, Amin Hassani
, Marc Moonen
, Gema Piñero
, Alberto González
:
Multi-Tone Active Noise Equalizer With Spatially Distributed User-Selected Profiles. 3199-3213 - Gasper Begus

, Alan Zhou
:
Interpreting Intermediate Convolutional Layers of Generative CNNs Trained on Waveforms. 3214-3229 - Sixing Wu

, Ying Li
, Dawei Zhang, Zhonghai Wu:
Generating Rational Commonsense Knowledge-Aware Dialogue Responses With Channel-Aware Knowledge Fusing Network. 3230-3239 - Donghui Zhu

, Ning Chen
:
Multi-Source Domain Adaptation and Fusion for Speaker Verification. 2103-2116 - Daniel Yang

, Thaxter Shaw
, Timothy Tsai
:
A Study of Parallelizable Alternatives to Dynamic Time Warping for Aligning Long Sequences. 2117-2127 - Yi Yu

, Hongsen He
, Rodrigo C. de Lamare
, Badong Chen
:
General Robust Subband Adaptive Filtering: Algorithms and Applications. 2128-2140 - Mahdie Karbasi

, Steffen Zeiler, Dorothea Kolossa
:
Microscopic and Blind Prediction of Speech Intelligibility: Theory and Practice. 2141-2155 - Andong Li

, Chengshi Zheng
, Guochen Yu
, Juanjuan Cai
, Xiaodong Li
:
Filtering and Refining: A Collaborative-Style Framework for Single-Channel Speech Enhancement. 2156-2172 - Silin Gao

, Ryuichi Takanobu
, Antoine Bosselut
, Minlie Huang
:
End-to-End Task-Oriented Dialog Modeling With Semi-Structured Knowledge Management. 2173-2187 - Yingying Zhu, Haiquan Zhao

, Xiaoqiong He, Zeliang Shu, Badong Chen
:
Cascaded Random Fourier Filter for Robust Nonlinear Active Noise Control. 2188-2200 - Siyuan Wang

, Zhongkun Liu, Wanjun Zhong, Ming Zhou
, Zhongyu Wei
, Zhumin Chen, Nan Duan
:
From LSAT: The Progress and Challenges of Complex Reasoning. 2201-2216 - Cheng Lu

, Yuan Zong
, Wenming Zheng
, Yang Li
, Chuangao Tang
, Björn W. Schuller
:
Domain Invariant Feature Learning for Speaker-Independent Speech Emotion Recognition. 2217-2230 - Bo Zhang

, Jian Wang
, Hongfei Lin
, Hui Ma
, Bo Xu
:
Exploiting Pairwise Mutual Information for Knowledge-Grounded Dialogue. 2231-2240 - Tao Wang

, Jiangyan Yi
, Ruibo Fu
, Jianhua Tao
, Zhengqi Wen:
CampNet: Context-Aware Mask Prediction for End-to-End Text-Based Speech Editing. 2241-2254 - Ying-Ren Chien

, Chih-Hsiang Yu, Hen-Wai Tsao
:
Affine-Projection-Like Maximum Correntropy Criteria Algorithm for Robust Active Noise Control. 2255-2266 - Rui Wang

, Zhihua Wei
, Haoran Duan
, Shouling Ji
, Yang Long, Zhen Hong
:
EfficientTDNN: Efficient Architecture Search for Speaker Recognition. 2267-2279 - Xiaoxue Gao

, Chitralekha Gupta
, Haizhou Li
:
Automatic Lyrics Transcription of Polyphonic Music With Lyrics-Chord Multi-Task Learning. 2280-2294 - Jirí Málek

, Jakub Janský, Zbynek Koldovský
, Tomás Kounovský, Jaroslav Cmejla
, Jindrich Zdánský:
Target Speech Extraction: Independent Vector Extraction Guided by Supervised Speaker Identification. 2295-2309 - Jung-Woo Choi

, Franz Zotter
, Byeongho Jo
, Jae-Hyoun Yoo
:
Multiarray Eigenbeam-ESPRIT for 3D Sound Source Localization With Multiple Spherical Microphone Arrays. 2310-2325 - Hao Zhang

, DeLiang Wang
:
Neural Cascade Architecture for Multi-Channel Acoustic Echo Suppression. 2326-2336 - Alessandro Opinto

, Marco Martalò
, Alessandro Costalunga, Nicolo Strozzi
, Carlo Tripodi
, Riccardo Raheli
:
Experimental Analysis and Design Guidelines for Microphone Virtualization in Automotive Scenarios. 2337-2346 - Xing Tian

, Jie Huang, Xuelei Feng
, Yong Shen
:
An Intermittent FxLMS Algorithm for Active Noise Control Systems With Saturation Nonlinearity. 2347-2356 - Haichao Zhu

, Li Dong, Furu Wei, Bing Qin
, Ting Liu:
Transforming Wikipedia Into Augmented Data for Query-Focused Summarization. 2357-2367 - Kouhei Sekiguchi

, Yoshiaki Bando
, Aditya Arie Nugraha
, Mathieu Fontaine
, Kazuyoshi Yoshii
, Tatsuya Kawahara
:
Autoregressive Moving Average Jointly-Diagonalizable Spatial Covariance Analysis for Joint Source Separation and Dereverberation. 2368-2382 - Brij Mohan Lal Srivastava

, Mohamed Maouche
, Md. Sahidullah
, Emmanuel Vincent
, Aurélien Bellet, Marc Tommasi
, Natalia A. Tomashenko
, Xin Wang
, Junichi Yamagishi
:
Privacy and Utility of X-Vector Based Speaker Anonymization. 2383-2395 - Luciana Ferrer

, Diego Castán, Mitchell McLaren, Aaron Lawson:
A Discriminative Hierarchical PLDA-Based Model for Spoken Language Recognition. 2396-2410 - Xiaohuai Le

, Tong Lei
, Kai Chen, Jing Lu
:
Inference Skipping for More Efficient Real-Time Speech Enhancement With Parallel RNNs. 2411-2421 - Chitralekha Gupta

, Haizhou Li
, Masataka Goto
:
Deep Learning Approaches in Topics of Singing Information Processing. 2422-2451 - Atharva Anand Joshi

, Harshavardhan Settibhaktini
, Ananthakrishna Chintanpalli
:
Modeling Concurrent Vowel Scores Using the Time Delay Neural Network and Multitask Learning. 2452-2459
