ICASSP 2021: Toronto, ON, Canada
- IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2021, Toronto, ON, Canada, June 6-11, 2021. IEEE 2021, ISBN 978-1-7281-7606-2
- Yi Luo, Zhuo Chen, Cong Han, Chenda Li, Tianyan Zhou, Nima Mesgarani: Rethinking The Separation Layers In Speech Separation Networks. 1-5
- Xiaoyu Liu, Jordi Pons: On Permutation Invariant Training For Speech Source Separation. 6-10
- Zhong-Qiu Wang, DeLiang Wang: Count And Separate: Incorporating Speaker Counting For Continuous Speaker Separation. 11-15
- Yi Luo, Cong Han, Nima Mesgarani: Ultra-Lightweight Speech Separation Via Group Communication. 16-20
- Cem Subakan, Mirco Ravanelli, Samuele Cornell, Mirko Bronzi, Jianyuan Zhong: Attention Is All You Need In Speech Separation. 21-25
- Aidan O. T. Hogg, Christine Evers, Patrick A. Naylor: Multichannel Overlapping Speaker Segmentation Using Multiple Hypothesis Tracking Of Acoustic And Spatial Features. 26-30
- Zhepei Wang, Ritwik Giri, Umut Isik, Jean-Marc Valin, Arvindh Krishnaswamy: Semi-Supervised Singing Voice Separation With Noisy Self-Training. 31-35
- Giorgia Cantisani, Slim Essid, Gaël Richard: Neuro-Steered Music Source Separation With EEG-Based Auditory Attention Decoding And Contrastive-NMF. 36-40
- Yixuan Zhang, Yuzhou Liu, DeLiang Wang: Complex Ratio Masking For Singing Voice Separation. 41-45
- Yun-Ning Hung, Gordon Wichern, Jonathan Le Roux: Transcription Is All You Need: Learning To Separate Musical Mixtures With Score As Supervision. 46-50
- Ryosuke Sawata, Stefan Uhlich, Shusuke Takahashi, Yuki Mitsufuji: All For One And One For All: Improving Music Separation By Bridging Networks. 51-55
- Yongwei Gao, Xingjian Du, Bilei Zhu, Xiaoheng Sun, Wei Li, Zejun Ma: An HRNet-BLSTM Model With Two-Stage Training For Singing Melody Extraction. 56-60
- Satwinder Singh, Ruili Wang, Yuanhang Qiu: DeepF0: End-To-End Fundamental Frequency Estimation for Music and Speech Signals. 61-65
- Marco A. Martínez Ramírez, Oliver Wang, Paris Smaragdis, Nicholas J. Bryan: Differentiable Signal Processing With Black-Box Audio Effects. 66-70
- Christian J. Steinmetz, Jordi Pons, Santiago Pascual, Joan Serrà: Automatic Multitrack Mixing With A Differentiable Mixing Console Of Neural Audio Effects. 71-75
- Jiatong Shi, Shuai Guo, Nan Huo, Yuekai Zhang, Qin Jin: Sequence-To-Sequence Singing Voice Synthesis With Perceptual Entropy Loss. 76-80
- Junghyun Koo, Seungryeol Paik, Kyogu Lee: Reverb Conversion Of Mixed Vocal Tracks Using An End-To-End Convolutional Deep Neural Network. 81-85
- Bo-Wei Tseng, Yih-Liang Shen, Tai-Shih Chi: Extending Music Based On Emotion And Tonality Via Generative Adversarial Network. 86-90
- William Vickers, Ben Milner, Robert Lee: Improving The Robustness Of Right Whale Detection In Noisy Conditions Using Denoising Autoencoders And Augmented Training. 91-95
- Ondrej Cífka, Alexey Ozerov, Umut Simsekli, Gaël Richard: Self-Supervised VQ-VAE for One-Shot Music Style Transfer. 96-100
- Hongwei Song, Jiqing Han, Shiwen Deng, Zhihao Du: Capturing Temporal Dependencies Through Future Prediction for CNN-Based Audio Classifiers. 101-105
- T. J. Tsai: Segmental DTW: A Parallelizable Alternative to Dynamic Time Warping. 106-110
- Keitaro Tanaka, Ryo Nishikimi, Yoshiaki Bando, Kazuyoshi Yoshii, Shigeo Morishima: Pitch-Timbre Disentanglement Of Musical Instrument Sounds Based On VAE-Based Metric Learning. 111-115
- Robert Ayrapetian, Philip Hilmes, Mohamed Mansour, Trausti Kristjansson, Carlo Murgia: Asynchronous Acoustic Echo Cancellation Over Wireless Channels. 116-120
- Mhd Modar Halimeh, Thomas Haubner, Annika Briegleb, Alexander Schmidt, Walter Kellermann: Combining Adaptive Filtering And Complex-Valued Deep Postfiltering For Acoustic Echo Cancellation. 121-125
- Amir Ivry, Israel Cohen, Baruch Berdugo: Deep Residual Echo Suppression With A Tunable Tradeoff Between Signal Distortion And Echo Suppression. 126-130
- Saeed Bagheri, Daniele Giacobello: Robust STFT Domain Multi-Channel Acoustic Echo Cancellation with Adaptive Decorrelation of the Reference Signals. 131-135
- Meng Guo: A Method for Determining Periodically Time-Varying Bias and Its Applications in Acoustic Feedback Cancellation. 136-140
- Ziteng Wang, Yueyue Na, Zhang Liu, Biao Tian, Qiang Fu: Weighted Recursive Least Square Filter and Neural Network Based Residual Echo Suppression for the AEC-Challenge. 141-145
- Renhua Peng, Linjuan Cheng, Chengshi Zheng, Xiaodong Li: ICASSP 2021 Acoustic Echo Cancellation Challenge: Integrated Adaptive Echo Cancellation with Time Alignment and Deep Learning-Based Residual Echo Plus Noise Suppression. 146-150
- Kusha Sridhar, Ross Cutler, Ando Saabas, Tanel Pärnamaa, Markus Loide, Hannes Gamper, Sebastian Braun, Robert Aichner, Sriram Srinivasan: ICASSP 2021 Acoustic Echo Cancellation Challenge: Datasets, Testing Framework, and Results. 151-155
- Jan Franzen, Ernst Seidel, Tim Fingscheidt: AEC in a Netshell: On Target and Topology Choices for FCRN Acoustic Echo Cancellation. 156-160
- Jesper Brunnström, Shoichi Koyama: Kernel-Interpolation-Based Filtered-X Least Mean Square for Spatial Active Noise Control In Time Domain. 161-165
- Jian Xu, Kean Chen, Yunhe Li: Wave-Domain Optimization of Secondary Source Placement Free From Information of Error Sensor Positions. 166-170
- Woo-Sung Choi, Minseok Kim, Jaehwa Chung, Soonyoung Jung: LaSAFT: Latent Source Attentive Frequency Transformation For Conditioned Source Separation. 171-175
- Robin Scheibler, Masahito Togami: Surrogate Source Model Learning for Determined Source Separation. 176-180
- Han Li, Kean Chen, Bernhard U. Seeber: Auditory Filterbanks Benefit Universal Sound Source Separation. 181-185
- Scott Wisdom, Hakan Erdogan, Daniel P. W. Ellis, Romain Serizel, Nicolas Turpault, Eduardo Fonseca, Justin Salamon, Prem Seetharaman, John R. Hershey: What's all the Fuss about Free Universal Sound Separation Data? 186-190
- Shota Inoue, Hirokazu Kameoka, Li Li, Shoji Makino: SepNet: A Deep Separation Matrix Prediction Network for Multichannel Audio Source Separation. 191-195
- Pranay Manocha, Zeyu Jin, Richard Zhang, Adam Finkelstein: CDPAM: Contrastive Learning for Perceptual Audio Similarity. 196-200
- Soichiro Oyabu, Daichi Kitamura, Kohei Yatabe: Linear Multichannel Blind Source Separation based on Time-Frequency Mask Obtained by Harmonic/Percussive Sound Separation. 201-205
- Daniel Arteaga, Jordi Pons: Multichannel-based Learning for Audio Object Extraction. 206-210
- Ali Aroudi, Sebastian Braun: DBnet: DOA-Driven Beamforming Network for End-to-End Reverberant Sound Source Separation. 211-215
- Taishi Nakashima, Robin Scheibler, Masahito Togami, Nobutaka Ono: Joint Dereverberation and Separation With Iterative Source Steering. 216-220
- Ingvi Örnolfsson, Torsten Dau, Ning Ma, Tobias May: Exploiting Non-Negative Matrix Factorization for Binaural Sound Localization in the Presence of Directional Interference. 221-225
- Jirí Málek, Jakub Janský, Tomás Kounovský, Zbynek Koldovský, Jindrich Zdánský: Blind Extraction of Moving Audio Source in a Challenging Environment Supported by Speaker Identification Via X-Vectors. 226-230
- Ashvala Vinay, Alexander Lerch, Grace Leslie: Mind the Beat: Detecting Audio Onsets from EEG Recordings of Music Listening. 231-235
- Mojtaba Heydari, Zhiyao Duan: Don't Look Back: An Online Beat Tracking Method Using RNN and Enhanced Particle Filtering. 236-240
- Xingjian Du, Bilei Zhu, Qiuqiang Kong, Zejun Ma: Singing Melody Extraction from Polyphonic Music based on Spectral Correlation Modeling. 241-245
- I-Chieh Wei, Chih-Wei Wu, Li Su: Improving Automatic Drum Transcription Using Large-Scale Audio-to-MIDI Aligned Data. 246-250
- Shuai Yu, Xiaoheng Sun, Yi Yu, Wei Li: Frequency-Temporal Attention Network for Singing Melody Extraction. 251-255
- Yuki Hiramatsu, Go Shibata, Ryo Nishikimi, Eita Nakamura, Kazuyoshi Yoshii: Statistical Correction of Transcribed Melody Notes Based on Probabilistic Integration of a Music Language Model and a Transcription Error Model. 256-260
- Sebastian Rosenzweig, Frank Scherbaum, Meinard Müller: Reliability Assessment of Singing Voice F0-Estimates Using Multiple Algorithms. 261-265
- Sakya Basak, Shrutina Agarwal, Sriram Ganapathy, Naoya Takahashi: End-to-End Lyrics Recognition with Voice to Singing Style Transfer. 266-270
- Lenny Renault, Andrea Vaglio, Romain Hennequin: Singing Language Identification Using a Deep Phonotactic Approach. 271-275
- Jun-You Wang, Jyh-Shing Roger Jang: On the Preparation and Validation of a Large-Scale Dataset of Singing Transcription. 276-280
- Lele Liu, Veronica Morfi, Emmanouil Benetos: Joint Multi-Pitch Detection and Score Transcription for Polyphonic Piano Music. 281-285
- Yuan Wang, Shigeki Tanaka, Keita Yokoyama, Hsin-Tai Wu, Yi Fang: Karaoke Key Recommendation Via Personalized Competence-Based Rating Prediction. 286-290
- Afagh Farhadi, Skyler G. Jennings, Elizabeth A. Strickland, Laurel H. Carney: A Closed-Loop Gain-Control Feedback Model for The Medial Efferent System of The Descending Auditory Pathway. 291-295
- Zehai Tu, Ning Ma, Jon Barker: DHASP: Differentiable Hearing Aid Speech Processing. 296-300
- Anil M. Nagathil, Florian Göbel, Alexandru Nelus, Ian C. Bruce: Computationally Efficient DNN-Based Approximation of an Auditory Model for Applications in Speech Processing. 301-305
- Hideki Kawahara, Kohei Yatabe: Cascaded All-Pass Filters with Randomized Center Frequencies and Phase Polarity for Acoustic and Speech Measurement and Data Augmentation. 306-310
- Danni Ma, Neville Ryant, Mark Liberman: Probing Acoustic Representations for Phonetic Properties. 311-315
- Zhuohuang Zhang, Piyush Vyas, Xuan Dong, Donald S. Williamson: An End-To-End Non-Intrusive Model for Subjective and Objective Real-World Speech Assessment Using a Multi-Task Framework. 316-320
- Yu Wang, Nicholas J. Bryan, Mark Cartwright, Juan Pablo Bello, Justin Salamon: Few-Shot Continual Learning for Audio Classification. 321-325
- Huang Xie, Okko Räsänen, Tuomas Virtanen: Zero-Shot Audio Classification with Factored Linear and Nonlinear Acoustic-Semantic Projections. 326-330
- Hsin-Ping Huang, Krishna C. Puvvada, Ming Sun, Chao Wang: Unsupervised and Semi-Supervised Few-Shot Acoustic Event Classification. 331-335
- Kota Dohi, Takashi Endo, Harsh Purohit, Ryo Tanabe, Yohei Kawaguchi: Flow-Based Self-Supervised Density Estimation for Anomalous Sound Detection. 336-340
- Sangwook Park, Ashwin Bellur, David K. Han, Mounya Elhilali: Self-Training for Sound Event Detection in Audio Mixtures. 341-345
- Shubhr Singh, Helen L. Bear, Emmanouil Benetos: Prototypical Networks for Domain Adaptation in Acoustic Scene Classification. 346-350
- Helin Wang, Yuexian Zou, Wenwu Wang: A Global-Local Attention Framework for Weakly Labelled Audio Tagging. 351-355
- Xu Zheng, Yan Song, Ian McLoughlin, Lin Liu, Li-Rong Dai: An Improved Mean Teacher Based Method for Large Scale Weakly Labeled Semi-Supervised Sound Event Detection. 356-360
- Léo Cances, Thomas Pellegrini: Comparison of Deep Co-Training and Mean-Teacher Approaches for Semi-Supervised Audio Tagging. 361-365
- Shawn Hershey, Daniel P. W. Ellis, Eduardo Fonseca, Aren Jansen, Caroline Liu, R. Channing Moore, Manoj Plakal: The Benefit of Temporally-Strong Labels in Audio Event Classification. 366-370
- Eduardo Fonseca, Diego Ortego, Kevin McGuinness, Noel E. O'Connor, Xavier Serra: Unsupervised Contrastive Learning of Sound Event Representations. 371-375
- Chih-Yuan Koh, You-Siang Chen, Yi-Wen Liu, Mingsian R. Bai: Sound Event Detection by Consistency Training and Pseudo-Labeling With Feature-Pyramid Convolutional Recurrent Neural Networks. 376-380
- Joan Serrà, Jordi Pons, Santiago Pascual: SESQA: Semi-Supervised Learning for Speech Quality Assessment. 381-385
- Helmer Nylén, Saikat Chatterjee, Sten Ternström: Detecting Signal Corruptions in Voice Recordings For Speech Therapy. 386-390
- Yichong Leng, Xu Tan, Sheng Zhao, Frank K. Soong, Xiang-Yang Li, Tao Qin: MBNET: MOS Prediction for Synthesized Speech with Mean-Bias Network. 391-395
- Jana Roßbach, Saskia Röttges, Christopher F. Hauth, Thomas Brand, Bernd T. Meyer: Non-Intrusive Binaural Prediction of Speech Intelligibility Based on Phoneme Classification. 396-400
- Wissam A. Jassim, Jan Skoglund, Michael Chinen, Andrew Hines: Warp-Q: Quality Prediction for Generative Neural Speech Codecs. 401-405
- Ross Cutler, Babak Nadari, Markus Loide, Sten Sootla, Ando Saabas: Crowdsourcing Approach for Subjective Evaluation of Echo Impairment. 406-410
- Shoichi Koyama, Takashi Amakasu, Natsuki Ueno, Hiroshi Saruwatari: Amplitude Matching: Majorization-Minimization Algorithm for Sound Field Control Only with Amplitude Constraint. 411-415
- Huanyu Zuo, Thushara D. Abhayapala, Prasanga N. Samarasinghe: 3D Multizone Soundfield Reproduction in a Reverberant Environment Using Intensity Matching Method. 416-420
- Jens Ahrens, Hannes Helmholz, David Lou Alon, Sebastià V. Amengual Garí: The Far-Field Equatorial Array for Binaural Rendering. 421-425
- Fabrice Katzberg, Marco Maaß, Alfred Mertins: Spherical Harmonic Representation for Dynamic Sound-Field Measurements. 426-430
- Adrian Herzog, Daniele Mirabilii, Emanuël A. P. Habets: Direction Preserving Wind Noise Reduction Of B-Format Signals. 431-435
- Robin Scheibler, Masahito Togami: Refinement of Direction of Arrival Estimators by Majorization-Minimization Optimization on the Array Manifold. 436-440
- Yaxuan Zhou, Hao Jiang, Vamsi Krishna Ithapu: On the Predictability of HRTFs from Ear Shapes Using Deep Networks. 441-445
- Lior Arbel, Zamir Ben-Hur, David Lou Alon, Boaz Rafaely: Applied Methods for Sparse Sampling of Head-Related Transfer Functions. 446-450
- Mengfan Zhang, Jui-Hsien Wang, Doug L. James: Personalized HRTF Modeling Using DNN-Augmented BEM. 451-455
- Fabian Hübner, Wolfgang Mack, Emanuël A. P. Habets: Efficient Training Data Generation for Phase-Based DOA Estimation. 456-460
- Giovanni Bologni, Richard Heusdens, Jorge Martínez: Acoustic Reflectors Localization from Stereo Recordings Using Neural Networks. 461-465
- Usama Saqib, Antoine Deleforge, Jesper Rindom Jensen: Detecting Acoustic Reflectors Using A Robot's Ego-Noise. 466-470
- Ziqi Fan, Vibhav Vineet, Chenshen Lu, T. W. Wu, Kyla A. McMullen: Prediction of Object Geometry from Acoustic Scattering Using Convolutional Neural Networks. 471-475
- Tom Shlomo, Boaz Rafaely: Blind Amplitude Estimation of Early Room Reflections Using Alternating Least Squares. 476-480
- Thomas McKenzie, Sebastian J. Schlecht, Ville Pulkki: Acoustic Analysis and Dataset of Transitions Between Coupled Rooms. 481-485
- Yuying Li, Yuchen Liu, Donald S. Williamson: On Loss Functions for Deep-Learning Based T60 Estimation. 486-490
- Hideyuki Tachibana: Towards Listening to 10 People Simultaneously: An Efficient Permutation Invariant Training of Audio Source Separation Using Sinkhorn's Algorithm. 491-495
- Andreas Brendel, Walter Kellermann: Accelerating Auxiliary Function-Based Independent Vector Analysis. 496-500
- Beat Gfeller, Dominik Roblek, Marco Tagliasacchi: One-Shot Conditional Audio Filtering of Arbitrary Sounds. 501-505
- Tetsuya Ueda, Tomohiro Nakatani, Rintaro Ikeshita, Keisuke Kinoshita, Shoko Araki, Shoji Makino: Low Latency Online Blind Source Separation Based on Joint Optimization with Blind Dereverberation. 506-510
- Kouhei Sekiguchi, Yoshiaki Bando, Aditya Arie Nugraha, Mathieu Fontaine, Kazuyoshi Yoshii: Autoregressive Fast Multichannel Nonnegative Matrix Factorization For Joint Blind Source Separation And Dereverberation. 511-515
- Paul Magron, Pierre-Hugo Vial, Thomas Oberlin, Cédric Févotte: Phase Recovery with Bregman Divergences for Audio Source Separation. 516-520
- Naoya Takahashi, Shota Inoue, Yuki Mitsufuji: Adversarial Attacks on Audio Source Separation. 521-525
- Mieszko Fras, Konrad Kowalczyk: Maximum a Posteriori Estimator for Convolutive Sound Source Separation with Sub-Source Based NTF Model and the Localization Probabilistic Prior on the Mixing Matrix. 526-530
- Efthymios Tzinis, Dimitrios Bralios, Paris Smaragdis: Unified Gradient Reweighting for Model Biasing with Applications to Source Separation. 531-535
- Andres Ferraro, Yuntae Kim, Soohyeon Lee, Biho Kim, Namjun Jo, Semi Lim, Suyon Lim, Jungtaek Jang, Sehwan Kim, Xavier Serra, Dmitry Bogdanov: Melon Playlist Dataset: A Public Dataset for Audio-Based Playlist Generation and Music Tagging. 536-540
- Furkan Yesiler, Emilio Molina, Joan Serrà, Emilia Gómez: Investigating the Efficacy of Music Version Retrieval Systems for Setlist Identification. 541-545
- Kevin Ji, Daniel Yang, T. J. Tsai: Instrument Classification of Solo Sheet Music Images. 546-550
- Xingjian Du, Zhesong Yu, Bilei Zhu, Xiaoou Chen, Zejun Ma: Bytecover: Cover Song Identification Via Multi-Loss Training. 551-555
- Ho-Hsiang Wu, Chieh-Chi Kao, Qingming Tang, Ming Sun, Brian McFee, Juan Pablo Bello, Chao Wang: Multi-Task Self-Supervised Pre-Training for Music Classification. 556-560
- Shreyan Chowdhury, Gerhard Widmer: Towards Explaining Expressive Qualities in Piano Recordings: Transfer of Explanatory Features Via Acoustic Domain Adaptation. 561-565
- Ju-Chiang Wang, Jordan B. L. Smith, Jitong Chen, Xuchen Song, Yuxuan Wang: Supervised Chorus Detection for Popular Music Using Convolutional Neural Network and Multi-Task Learning. 566-570
- Ruchit Agrawal, Daniel Wolff, Simon Dixon: Structure-Aware Audio-to-Score Alignment Using Progressively Dilated Convolutional Neural Networks. 571-575