EURASIP Journal on Audio, Speech, and Music Processing, Volume 2024, Number 1, December 2024
- Yunfei Shao, Xinxin Ma, Yong Ma, Weiqiang Zhang: Deep semantic learning for acoustic scene classification. 1
- Khomdet Phapatanaburi, Longbiao Wang, Meng Liu, Seiichi Nakagawa, Talit Jumphoo, Peerapong Uthansakul: Significance of relative phase features for shouted and normal speech classification. 2
- Junya Koguchi, Masanori Morise: Neural electric bass guitar synthesis framework enabling attack-sustain-representation-based technique control. 3
- Shangda Wu, Yue Yang, Zhaowen Wang, Xiaobing Li, Maosong Sun: Generating chord progression from melody with flexible harmonic rhythm and controllable harmonic density. 4
- Stijn Kindt, Jenthe Thienpondt, Luca Becker, Nilesh Madhu: Correction: Robustness of ad hoc microphone clustering using speaker embeddings: evaluation under realistic and challenging scenarios. 5
- Gebremichael Kibret Sheferaw, Waweru Mwangi, Michael W. Kimwele, Adane Letta Mamuye: Gated recurrent unit predictor model-based adaptive differential pulse code modulation speech decoder. 6
- Lingyun Xie, Yuehong Wang, Yan Gao: Acoustical feature analysis and optimization for aesthetic recognition of Chinese traditional music. 7
- Sivaramakrishna Yechuri, Sunny Dayal Vanambathina: Sub-convolutional U-Net with transformer attention network for end-to-end single-channel speech enhancement. 8
- Reemt Hinrichs, Kevin Gerkens, Alexander Lange, Jörn Ostermann: Blind extraction of guitar effects through blind system inversion and neural guitar effect modeling. 9
- Priyanka Gupta, Hemant A. Patil, Rodrigo Capobianco Guido: Vulnerability issues in Automatic Speaker Verification (ASV) systems. 10
- Huda Barakat, Oytun Türk, Cenk Demiroglu: Deep learning-based expressive speech synthesis: a systematic review of approaches, challenges, and resources. 11
- Marcos Lazaro Alvarez, Laura Arjona, Miguel Enrique Iglesias Martínez, Alfonso Bahillo: Automatic classification of the physical surface in sound uroflowmetry using machine learning methods. 12
- Zining Liang, Wen Zhang, Thushara D. Abhayapala: Sound field reconstruction using neural processes with dynamic kernels. 13
- Serhat Hizlisoy, Recep Sinan Arslan, Emel Çolakoglu: Singer identification model using data augmentation and enhanced feature conversion with hybrid feature vector and machine learning. 14
- Javier Tejedor, Doroteo T. Toledano: Whisper-based spoken term detection systems for search on speech ALBAYZIN evaluation challenge. 15
- Shivam Saini, Isaac Engel, Jürgen Peissig: An end-to-end approach for blindly rendering a virtual sound source in an audio augmented reality environment. 16
- Luca Comanducci, Fabio Antonacci, Augusto Sarti: Synthesis of soundfields through irregular loudspeaker arrays based on convolutional neural networks. 17
- Rabbia Mahum, Aun Irtaza, Ali Javed, Haitham A. Mahmoud, Haseeb Hassan: DeepDet: YAMNet with BottleNeck Attention Module (BAM) TTS synthesis detection. 18
- Sandeep Reddy Kothinti, Mounya Elhilali: Multi-rate modulation encoding via unsupervised learning for audio event detection. 19
- Zehua Zhang, Lu Zhang, Xuyi Zhuang, Yukun Qian, Mingjiang Wang: Supervised Attention Multi-Scale Temporal Convolutional Network for monaural speech enhancement. 20
- Rabbia Mahum, Aun Irtaza, Ali Javed, Haitham A. Mahmoud, Haseeb Hassan: Correction: DeepDet: YAMNet with BottleNeck Attention Module (BAM) for TTS synthesis detection. 21
- Usama Saqib, Mads Græsbøll Christensen, Jesper Rindom Jensen: Robust acoustic reflector localization using a modified EM algorithm. 22
- Chunxi Wang, Maoshen Jia, Meiran Li, Changchun Bao, Wenyu Jin: Exploring the power of pure attention mechanisms in blind room parameter estimation. 23
- Tomasz Wojnar, Jaroslaw Hryszko, Adam Roman: Mi-Go: tool which uses YouTube as data source for evaluating general-purpose speech recognition machine learning models. 24
- David Gimeno-Gómez, Carlos David Martínez-Hinarejos: Continuous lipreading based on acoustic temporal alignments. 25
- Otto Mikkonen, Alec Wright, Vesa Välimäki: Sampling the user controls in neural modeling of audio devices. 26
- Joanna Luberadzka, Hendrik Kayser, Jörg Lücke, Volker Hohmann: Towards multidimensional attentive voice tracking - estimating voice state from auditory glimpses with regression neural networks and Monte Carlo sampling. 27
- Zhiyong Chen, Zhiqi Ai, Youxuan Ma, Xinnuo Li, Shugong Xu: Optimizing feature fusion for improved zero-shot adaptation in text-to-speech synthesis. 28
- Yunpeng Liu, Xukui Yang, Dan Qu: Exploration of Whisper fine-tuning strategies for low-resource ASR. 29
- Jeremiah Abimbola, Daniel Kostrzewa, Pawel Kasprowski: Music time signature detection using ResNet18. 30
- Marcin Lewandowski: Estimating the first and second derivatives of discrete audio data. 31
- Adam Kujawski, Art J. R. Pelling, Ennes Sarradj: MIRACLE - a microphone array impulse response dataset for acoustic learning. 32
- Shaik Sajiha, Kodali Radha, Dhulipalla Venkata Rao, Nammi Sneha, Gunnam Suryanarayana, Durga Prasad Bavirisetti: Automatic dysarthria detection and severity level assessment using CWT-layered CNN model. 33
- Mengzhen Ma, Ying Hu, Liang He, Hao Huang: GLFER-Net: a polyphonic sound source localization and detection network based on global-local feature extraction and recalibration. 34
- Tahira Kanwal, Rabbia Mahum, AbdulMalik Al-Salman, Mohamed Sharaf, Haseeb Hassan: Fake speech detection using VGGish with attention block. 35
- Xin Feng, Yue Zhao, Wei Zong, Xiaona Xu: Adaptive multi-task learning for speech to text translation. 36
- Yigang Liu, Yue Zhao, Xiaona Xu, Liang Xu, Xubei Zhang, Qiang Ji: Exploring task-diverse meta-learning on Tibetan multi-dialect speech recognition. 37
- Samuel Poirot, Stefan Bilbao, Richard Kronland-Martinet: A simplified and controllable model of mode coupling for addressing nonlinear phenomena in sound synthesis processes. 38
- Ryosuke Sawata, Naoya Takahashi, Stefan Uhlich, Shusuke Takahashi, Yuki Mitsufuji: The whole is greater than the sum of its parts: improving music source separation by bridging networks. 39
- Daiki Mori, Kengo Ohta, Ryota Nishimura, Atsunori Ogawa, Norihide Kitaoka: Recognition of target domain Japanese speech using language model replacement. 40
- Samuel A. Verburg, Filip Elvander, Toon van Waterschoot, Efren Fernandez-Grande: Optimal sensor placement for the spatial reconstruction of sound fields. 41
- Marco Olivieri, Xenofon Karakonstantis, Mirco Pezzoli, Fabio Antonacci, Augusto Sarti, Efren Fernandez-Grande: Physics-informed neural network for volumetric sound field reconstruction of speech signals. 42
- Juliano G. C. Ribeiro, Shoichi Koyama, Hiroshi Saruwatari: Physics-constrained adaptive kernel interpolation for region-to-region acoustic transfer function: a Bayesian approach. 43
- Zijin Li, Wenwu Wang, Kejun Zhang, Mengyao Zhu: Guest editorial: AI for computational audition - sound and music processing. 44
- Martin Jälmby, Filip Elvander, Toon van Waterschoot: Compression of room impulse responses for compact storage and fast low-latency convolution. 45
- Yuma Kinoshita, Nobutaka Ono: End-to-end training of acoustic scene classification using distributed sound-to-light conversion devices: verification through simulation experiments. 46
- Xiao Zeng, Shiyun Xu, Mingjiang Wang: A time-frequency fusion model for multi-channel speech enhancement. 47
- Chaoyang Zhang, Yan Hua: Dance2Music-Diffusion: leveraging latent diffusion models for music generation from dance videos. 48
- Stefano Damiano, Luca Bondi, Andre Guntoro, Toon van Waterschoot: A framework for the acoustic simulation of passing vehicles using variable length delay lines. 49