


Wenwu Wang 0001
Person information
- affiliation: University of Surrey, Guildford, UK
Other persons with the same name
- Wenwu Wang 0002: Qufu Normal University, Qufu, Shandong, China
- Wenwu Wang 0003: Xidian University, Xi'an, China
- Wenwu Wang 0004: Wuhan University, Wuhan, China
- Wenwu Wang 0005: Harbin Institute of Technology, Harbin, China
- Wenwu Wang 0006: Chinese Academy of Sciences, Institute of Microelectronics, Beijing, China
- Wenwu Wang 0007: Sichuan University, School of Mechanical Engineering, Chengdu, China
- Wenwu Wang 0008: Wuhan University of Science and Technology, School of Information Science and Engineering, China
- Wenwu Wang 0009: Chinese Academy of Sciences, Institute of Microelectronics, Beijing, China
- Wenwu Wang 0010: Guangxi University of Science and Technology, China
2020 – today
- 2025
- [j119]Wujiang Zhu, Xinyuan Zhou, Shiyong Lan, Wenwu Wang, Zhiang Hou, Yao Ren, Tianyi Pan:
A dual branch graph neural network based spatial interpolation method for traffic data inference in unobserved locations. Inf. Fusion 114: 102703 (2025)
- [j118]Kai Wu, Jing Dong, Guifu Hu, Chang Liu, Wenwu Wang:
TDU-DLNet: A transformer-based deep unfolding network for dictionary learning. Signal Process. 231: 109886 (2025)
- [i118]Yuanbo Hou, Qiaoqiao Ren, Wenwu Wang, Dick Botteldooren:
Sound-Based Recognition of Touch Gestures and Emotions for Enhanced Human-Robot Interaction. CoRR abs/2501.00038 (2025)
- 2024
- [j117]Kanghao Li, Shuguo Yang, Li Zhao, Wenwu Wang:
Weakly labeled sound event detection with a capsule-transformer model. Digit. Signal Process. 146: 104347 (2024)
- [j116]Zijin Li, Wenwu Wang, Kejun Zhang, Mengyao Zhu:
Guest editorial: AI for computational audition - sound and music processing. EURASIP J. Audio Speech Music. Process. 2024(1): 44 (2024)
- [j115]Wei Ma, Yao Li, Shiyong Lan, Wenwu Wang, Weikang Huang, Wujiang Zhu:
Semantic-aware normalizing flow with feature fusion for image anomaly detection. Neurocomputing 590: 127728 (2024)
- [j114]Yiming Zhang, Ruoyi Du, Zheng-Hua Tan, Wenwu Wang, Zhanyu Ma:
Generating Accurate and Diverse Audio Captions Through Variational Autoencoder Framework. IEEE Signal Process. Lett. 31: 2520-2524 (2024)
- [j113]Yuanbo Hou, Bo Kang, Andrew Mitchell, Wenwu Wang, Jian Kang, Dick Botteldooren:
Cooperative Scene-Event Modelling for Acoustic Scene Classification. IEEE ACM Trans. Audio Speech Lang. Process. 32: 68-82 (2024)
- [j112]Haohe Liu, Yi Yuan, Xubo Liu, Xinhao Mei, Qiuqiang Kong, Qiao Tian, Yuping Wang, Wenwu Wang, Yuxuan Wang, Mark D. Plumbley:
AudioLDM 2: Learning Holistic Audio Generation With Self-Supervised Pretraining. IEEE ACM Trans. Audio Speech Lang. Process. 32: 2871-2883 (2024)
- [j111]Xinhao Mei, Xubo Liu, Jianyuan Sun, Mark D. Plumbley, Wenwu Wang:
Towards Generating Diverse Audio Captions via Adversarial Training. IEEE ACM Trans. Audio Speech Lang. Process. 32: 3311-3323 (2024)
- [j110]Xinhao Mei, Chutong Meng, Haohe Liu, Qiuqiang Kong, Tom Ko, Chengqi Zhao, Mark D. Plumbley, Yuexian Zou, Wenwu Wang:
WavCaps: A ChatGPT-Assisted Weakly-Labelled Audio Captioning Dataset for Audio-Language Multimodal Research. IEEE ACM Trans. Audio Speech Lang. Process. 32: 3339-3354 (2024)
- [j109]Sara Atito Ali Ahmed, Muhammad Awais, Wenwu Wang, Mark D. Plumbley, Josef Kittler:
ASiT: Local-Global Audio Spectrogram Vision Transformer for Event Classification. IEEE ACM Trans. Audio Speech Lang. Process. 32: 3684-3693 (2024)
- [j108]Shengxi Li, Xuelong Li, Leonardo Chiariglione, Jiebo Luo, Wenwu Wang, Zhengyuan Yang, Danilo P. Mandic, Hamido Fujita:
Introduction to the Special Issue on AI-Generated Content for Multimedia. IEEE Trans. Circuits Syst. Video Technol. 34(8): 6809-6813 (2024)
- [j107]Fatemeh Nazarieh, Zhenhua Feng, Muhammad Awais, Wenwu Wang, Josef Kittler:
A Survey of Cross-Modal Visual Content Generation. IEEE Trans. Circuits Syst. Video Technol. 34(8): 6814-6832 (2024)
- [j106]Feng Zhan, Wenwu Wang, Qian Chen, Yina Guo, Lidan He, Lili Wang:
Three-Direction Fusion for Accurate Volumetric Liver and Tumor Segmentation. IEEE J. Biomed. Health Informatics 28(4): 2175-2186 (2024)
- [j105]Yang Liu, Yong Xu, Peipei Wu, Wenwu Wang:
Labelled Non-Zero Diffusion Particle Flow SMC-PHD Filtering for Multi-Speaker Tracking. IEEE Trans. Multim. 26: 2544-2559 (2024)
- [j104]Zhaogeng Liu, Jielong Yang, Xionghu Zhong, Wenwu Wang, Hechang Chen, Yi Chang:
A Novel Composite Graph Neural Network. IEEE Trans. Neural Networks Learn. Syst. 35(10): 13411-13425 (2024)
- [j103]Shili Peng, Wenwu Wang, Yinli Chen, Xueling Zhong, Qinghua Hu:
Regression-Based Hyperparameter Learning for Support Vector Machines. IEEE Trans. Neural Networks Learn. Syst. 35(12): 18799-18813 (2024)
- [c209]Haohe Liu, Xubo Liu, Qiuqiang Kong, Wenwu Wang, Mark D. Plumbley:
Learning Temporal Resolution in Spectrogram for Audio Classification. AAAI 2024: 13873-13881
- [c208]Qiushi Huang, Xubo Liu, Tom Ko, Bo Wu, Wenwu Wang, Yu Zhang, Lilian Tang:
Selective Prompting Tuning for Personalized Conversations with LLMs. ACL (Findings) 2024: 16212-16226
- [c207]Junqi Zhao, Xubo Liu, Jinzheng Zhao, Yi Yuan, Qiuqiang Kong, Mark D. Plumbley, Wenwu Wang:
Universal Sound Separation with Self-Supervised Audio Masked Autoencoder. EUSIPCO 2024: 1-5
- [c206]Jinzheng Zhao, Xinyuan Qian, Yong Xu, Haohe Liu, Yin Cao, Davide Berghi, Wenwu Wang:
Text-Queried Target Sound Event Localization. EUSIPCO 2024: 261-265
- [c205]John-Joseph Brady, Yuhui Luo, Wenwu Wang, Víctor Elvira, Yunpeng Li:
Regime Learning for Differentiable Particle Filters. FUSION 2024: 1-6
- [c204]Yi Yuan, Haohe Liu, Xubo Liu, Qiushi Huang, Mark D. Plumbley, Wenwu Wang:
Retrieval-Augmented Text-to-Audio Generation. ICASSP 2024: 581-585
- [c203]Yuanbo Hou, Qiaoqiao Ren, Siyang Song, Yuxin Song, Wenwu Wang, Dick Botteldooren:
Multi-Level Graph Learning For Audio Event Classification And Human-Perceived Annoyance Rating Prediction. ICASSP 2024: 716-720
- [c202]Haohe Liu, Ke Chen, Qiao Tian, Wenwu Wang, Mark D. Plumbley:
Audiosr: Versatile Audio Super-Resolution at Scale. ICASSP 2024: 1076-1080
- [c201]Hejing Zhang, Qiaoxi Zhu, Jian Guan, Haohe Liu, Feiyang Xiao, Jiantong Tian, Xinhao Mei, Xubo Liu, Wenwu Wang:
First-Shot Unsupervised Anomalous Sound Detection with Unknown Anomalies Estimated by Metadata-Assisted Audio Generation. ICASSP 2024: 1271-1275
- [c200]Haiyan Lan, Qiaoxi Zhu, Jian Guan, Yuming Wei, Wenwu Wang:
Hierarchical Metadata Information Constrained Self-Supervised Learning for Anomalous Sound Detection under Domain Shift. ICASSP 2024: 7670-7674
- [c199]Yaru Chen, Ruohao Guo, Xubo Liu, Peipei Wu, Guangyao Li, Zhenbo Li, Wenwu Wang:
CM-PIE: Cross-Modal Perception for Interactive-Enhanced Audio-Visual Video Parsing. ICASSP 2024: 8421-8425
- [c198]Kunkun SongGong, Pufen Zhang, Xiongwei Zhang, Meng Sun, Wenwu Wang:
Multi-Speaker Localization in the Circular Harmonic Domain on Small Aperture Microphone Arrays Using Deep Convolutional Networks. ICASSP 2024: 8586-8590
- [c197]Davide Berghi, Peipei Wu, Jinzheng Zhao, Wenwu Wang, Philip J. B. Jackson:
Fusion of Audio and Visual Embeddings for Sound Event Localization and Detection. ICASSP 2024: 8816-8820
- [c196]Xuenan Xu, Arshdeep Singh, Mengyue Wu, Wenwu Wang, Mark D. Plumbley:
Investigating Passive Filter Pruning for Efficient CNN-Transformer Audio Captioning. MLSP 2024: 1-6
- [c195]Yi Yuan, Zhuo Chen, Xubo Liu, Haohe Liu, Xuenan Xu, Dongya Jia, Yuanzhe Chen, Mark D. Plumbley, Wenwu Wang:
T-CLAP: Temporal-Enhanced Contrastive Language-Audio Pretraining. MLSP 2024: 1-6
- [c194]Seyed Ahmad Soleymani, Mohammad Shojafar, Chaun Heng Foh, Shidrokh Goudarzi, Wenwu Wang:
Secure Target-Tracking by UAVs in O-RAN Environment. IFIP Networking 2024: 204-212
- [i117]Yi Yuan, Zhuo Chen, Xubo Liu, Haohe Liu, Xuenan Xu, Dongya Jia, Yuanzhe Chen, Mark D. Plumbley, Wenwu Wang:
T-CLAP: Temporal-Enhanced Contrastive Language-Audio Pretraining. CoRR abs/2404.17806 (2024)
- [i116]Qixin Deng, Qikai Yang, Ruibin Yuan, Yipeng Huang, Yi Wang, Xubo Liu, Zeyue Tian, Jiahao Pan, Ge Zhang, Hanfeng Lin, Yizhi Li, Yinghao Ma, Jie Fu, Chenghua Lin, Emmanouil Benetos, Wenwu Wang, Guangyu Xia, Wei Xue, Yike Guo:
ComposerX: Multi-Agent Symbolic Music Composition with LLMs. CoRR abs/2404.18081 (2024)
- [i115]Haohe Liu, Xuenan Xu, Yi Yuan, Mengyue Wu, Wenwu Wang, Mark D. Plumbley:
SemantiCodec: An Ultra Low Bitrate Semantic Audio Codec for General Sound. CoRR abs/2405.00233 (2024)
- [i114]John-Joseph Brady, Yuhui Luo, Wenwu Wang, Victor Elvira, Yunpeng Li:
Regime Learning for Differentiable Particle Filters. CoRR abs/2405.04865 (2024)
- [i113]Yuanbo Hou, Qiaoqiao Ren, Andrew Mitchell, Wenwu Wang, Jian Kang, Tony Belpaeme, Dick Botteldooren:
Soundscape Captioning using Sound Affective Quality Network and Large Language Model. CoRR abs/2406.05914 (2024)
- [i112]Yiming Zhang, Xuenan Xu, Ruoyi Du, Haohe Liu, Yuan Dong, Zheng-Hua Tan, Wenwu Wang, Zhanyu Ma:
Zero-Shot Audio Captioning Using Soft and Hard Prompts. CoRR abs/2406.06295 (2024)
- [i111]Meng Cui, Xubo Liu, Haohe Liu, Jinzheng Zhao, Daoliang Li, Wenwu Wang:
Fish Tracking, Counting, and Behaviour Analysis in Digital Aquaculture: A Comprehensive Review. CoRR abs/2406.17800 (2024)
- [i110]Qiushi Huang, Xubo Liu, Tom Ko, Bo Wu, Wenwu Wang, Yu Zhang, Lilian Tang:
Selective Prompting Tuning for Personalized Conversations with LLMs. CoRR abs/2406.18187 (2024)
- [i109]Qiushi Huang, Shuai Fu, Xubo Liu, Wenwu Wang, Tom Ko, Yu Zhang, Lilian Tang:
Learning Retrieval Augmentation for Personalized Dialogue Generation. CoRR abs/2406.18847 (2024)
- [i108]Yi Yuan, Dongya Jia, Xiaobin Zhuang, Yuanzhe Chen, Zhengxi Liu, Zhuo Chen, Yuping Wang, Yuxuan Wang, Xubo Liu, Mark D. Plumbley, Wenwu Wang:
Improving Audio Generation with Visual Enhanced Caption. CoRR abs/2407.04416 (2024)
- [i107]Feiyang Xiao, Jian Guan, Qiaoxi Zhu, Xubo Liu, Wenbo Wang, Shuhan Qi, Kejia Zhang, Jianyuan Sun, Wenwu Wang:
A Reference-free Metric for Language-Queried Audio Source Separation using Contrastive Language-Audio Pretraining. CoRR abs/2407.04936 (2024)
- [i106]Junqi Zhao, Xubo Liu, Jinzheng Zhao, Yi Yuan, Qiuqiang Kong, Mark D. Plumbley, Wenwu Wang:
Universal Sound Separation with Self-Supervised Audio Masked Autoencoder. CoRR abs/2407.11745 (2024)
- [i105]Xuenan Xu, Haohe Liu, Mengyue Wu, Wenwu Wang, Mark D. Plumbley:
Efficient Audio Captioning with Encoder-Level Knowledge Distillation. CoRR abs/2407.14329 (2024)
- [i104]Yi Yuan, Xubo Liu, Haohe Liu, Mark D. Plumbley, Wenwu Wang:
FlowSep: Language-Queried Sound Separation with Rectified Flow Matching. CoRR abs/2409.07614 (2024)
- [i103]John-Joseph Brady, Yuhui Luo, Wenwu Wang, Víctor Elvira, Yunpeng Li:
Differentiable Interacting Multiple Model Particle Filtering. CoRR abs/2410.00620 (2024)
- [i102]Jinbo Hu, Yin Cao, Ming Wu, Fang Kang, Feiran Yang, Wenwu Wang, Mark D. Plumbley, Jun Yang:
PSELDNets: Pre-trained Neural Networks on Large-scale Synthetic Datasets for Sound Event Localization and Detection. CoRR abs/2411.06399 (2024)
- [i101]Haohe Liu, Gaël Le Lan, Xinhao Mei, Zhaoheng Ni, Anurag Kumar, Varun Nagaraja, Wenwu Wang, Mark D. Plumbley, Yangyang Shi, Vikas Chandra:
SyncFlow: Toward Temporally Aligned Joint Audio-Video Generation from Text. CoRR abs/2412.15220 (2024)
- 2023
- [j102]Mukunthan Tharmakulasingam, Wenwu Wang, Michael Kerby, Roberto La Ragione, Anil Fernando:
TransAMR: An Interpretable Transformer Model for Accurate Prediction of Antimicrobial Resistance Using Antibiotic Administration Data. IEEE Access 11: 75337-75350 (2023)
- [j101]Jian Guan, Youde Liu, Qiuqiang Kong, Feiyang Xiao, Qiaoxi Zhu, Jiantong Tian, Wenwu Wang:
Transformer-based autoencoder with ID constraint for unsupervised anomalous sound detection. EURASIP J. Audio Speech Music. Process. 2023(1): 42 (2023)
- [j100]Yina Guo, Ting Liu, Xiaofei Zhang, Anhong Wang, Wenwu Wang:
End-to-end translation of human neural activity to speech with a dual-dual generative adversarial network. Knowl. Based Syst. 277: 110837 (2023)
- [j99]Jing Dong, Kai Wu, Chang Liu, Xue Mei, Wenwu Wang:
Discriminative analysis dictionary learning with adaptively ordinal locality preserving. Neural Networks 165: 298-309 (2023)
- [j98]Liming Shi, Xinheng Wang, Limin Yu, Wenwu Wang, Zhi Wang, Muddesar Iqbal, Charalampos C. Tsimenidis, Shahid Mumtaz:
A long-range aerial acoustic communication scheme. Phys. Commun. 60: 102135 (2023)
- [j97]Feiyang Xiao, Jian Guan, Qiaoxi Zhu, Wenwu Wang:
Graph Attention for Automated Audio Captioning. IEEE Signal Process. Lett. 30: 413-417 (2023)
- [j96]Yuanbo Hou, Siyang Song, Chuang Yu, Wenwu Wang, Dick Botteldooren:
Audio Event-Relational Graph Representation Learning for Acoustic Scene Classification. IEEE Signal Process. Lett. 30: 1382-1386 (2023)
- [j95]Shidrokh Goudarzi, Seyed Ahmad Soleymani, Wenwu Wang, Pei Xiao:
UAV-Enabled Mobile Edge Computing for Resource Allocation Using Cooperative Evolutionary Computation. IEEE Trans. Aerosp. Electron. Syst. 59(5): 5134-5147 (2023)
- [j94]Yi Li, Yang Sun, Wenwu Wang, Syed Mohsen Naqvi:
U-Shaped Transformer With Frequency-Band Aware Attention for Speech Enhancement. IEEE ACM Trans. Audio Speech Lang. Process. 31: 1511-1521 (2023)
- [j93]Yiming Zhang, Hong Yu, Ruoyi Du, Zheng-Hua Tan, Wenwu Wang, Zhanyu Ma, Yuan Dong:
ACTUAL: Audio Captioning With Caption Feature Space Regularization. IEEE ACM Trans. Audio Speech Lang. Process. 31: 2643-2657 (2023)
- [j92]Weitao Yuan, Shengbei Wang, Jianming Wang, Masashi Unoki, Wenwu Wang:
Unsupervised Deep Unfolded Representation Learning for Singing Voice Separation. IEEE ACM Trans. Audio Speech Lang. Process. 31: 3206-3220 (2023)
- [j91]Cheng Xue, Xionghu Zhong, Minjie Cai, Hao Chen, Wenwu Wang:
Audio-Visual Event Localization by Learning Spatial and Semantic Co-Attention. IEEE Trans. Multim. 25: 418-429 (2023)
- [c193]Qiushi Huang, Yu Zhang, Tom Ko, Xubo Liu, Bo Wu, Wenwu Wang, H. Lilian Tang:
Personalized Dialogue Generation with Persona-Adaptive Attention. AAAI 2023: 12916-12923
- [c192]Qiushi Huang, Shuai Fu, Xubo Liu, Wenwu Wang, Tom Ko, Yu Zhang, Lilian Tang:
Learning Retrieval Augmentation for Personalized Dialogue Generation. EMNLP 2023: 2523-2540
- [c191]Özkan Çayli, Xubo Liu, Volkan Kiliç, Wenwu Wang:
Knowledge Distillation for Efficient Audio-Visual Video Captioning. EUSIPCO 2023: 745-749
- [c190]Feiyang Xiao, Qiaoxi Zhu, Jian Guan, Wenwu Wang:
Enhancing Audio Retrieval with Attention-based Encoder for Audio Feature Representation. EUSIPCO 2023: 755-759
- [c189]Yi Yuan, Haohe Liu, Jinhua Liang, Xubo Liu, Mark D. Plumbley, Wenwu Wang:
Leveraging Pre-Trained AudioLDM for Sound Generation: A Benchmark Study. EUSIPCO 2023: 765-769
- [c188]Bowei Pu, Shiyong Lan, Wenwu Wang, Caiying Yang, Wei Pan, Hongyu Yang, Wei Ma:
GanNeXt: A New Convolutional GAN for Anomaly Detection. ICANN (3) 2023: 39-49
- [c187]Xinyuan Zhou, Shiyong Lan, Wenwu Wang, Xinyang Li, Siyuan Zhou, Hongyu Yang:
Visual-Haptic-Kinesthetic Object Recognition with Multimodal Transformer. ICANN (7) 2023: 233-245
- [c186]Piaoyang Li, Shiyong Lan, Shipeng Sun, Wenwu Wang, Yongyang Gao, Yongyu Yang, Guangyu Yu:
Siamese Network Based on MLP and Multi-head Cross Attention for Visual Object Tracking. ICANN (10) 2023: 420-431
- [c185]Jian Guan, Youde Liu, Qiaoxi Zhu, Tieran Zheng, Jiqing Han, Wenwu Wang:
Time-Weighted Frequency Domain Audio Representation with GMM Estimator for Anomalous Sound Detection. ICASSP 2023: 1-5
- [c184]Jian Guan, Feiyang Xiao, Youde Liu, Qiaoxi Zhu, Wenwu Wang:
Anomalous Sound Detection Using Audio Representation with Machine ID Based Contrastive Learning Pretraining. ICASSP 2023: 1-5
- [c183]Yuanbo Hou, Yun Wang, Wenwu Wang, Dick Botteldooren:
Gct: Gated Contextual Transformer for Sequential Audio Tagging. ICASSP 2023: 1-5
- [c182]Xubo Liu, Haohe Liu, Qiuqiang Kong, Xinhao Mei, Mark D. Plumbley, Wenwu Wang:
Simple Pooling Front-Ends for Efficient Audio Classification. ICASSP 2023: 1-5
- [c181]Weitao Yuan, Yuren Bian, Shengbei Wang, Masashi Unoki, Wenwu Wang:
An Improved Optimal Transport Kernel Embedding Method with Gating Mechanism for Singing Voice Separation and Speaker Identification. ICASSP 2023: 1-5
- [c180]Xiaoxiao Yin, Shiyong Lan, Weikang Huang, Yitong Ma, Wenwu Wang, Hongyu Yang, Yilin Zheng:
DLAHSD: Dynamic Label Adopted In Auxiliary Head for SAR Detection. ICIP 2023: 3434-3438
- [c179]Wei Ma, Shiyong Lan, Weikang Huang, Wenwu Wang, Hongyu Yang, Yitong Ma, Yongjie Ma:
A Semantics-Aware Normalizing Flow Model for Anomaly Detection. ICME 2023: 2207-2212
- [c178]Haohe Liu, Zehua Chen, Yi Yuan, Xinhao Mei, Xubo Liu, Danilo P. Mandic, Wenwu Wang, Mark D. Plumbley:
AudioLDM: Text-to-Audio Generation with Latent Diffusion Models. ICML 2023: 21450-21474
- [c177]Jinhua Liang, Xubo Liu, Haohe Liu, Huy Phan, Emmanouil Benetos, Mark D. Plumbley, Wenwu Wang:
Adapting Language-Audio Models as Few-Shot Audio Learners. INTERSPEECH 2023: 276-280
- [c176]Yuanbo Hou, Siyang Song, Cheng Luo, Andrew Mitchell, Qiaoqiao Ren, Weicheng Xie, Jian Kang, Wenwu Wang, Dick Botteldooren:
Joint Prediction of Audio Event and Annoyance Rating in an Urban Soundscape by Hierarchical Graph Representation Learning. INTERSPEECH 2023: 331-335
- [c175]Xubo Liu, Qiushi Huang, Xinhao Mei, Haohe Liu, Qiuqiang Kong, Jianyuan Sun, Shengchen Li, Tom Ko, Yu Zhang, H. Lilian Tang, Mark D. Plumbley, Volkan Kiliç, Wenwu Wang:
Visually-Aware Audio Captioning With Adaptive Audio-Visual Attention. INTERSPEECH 2023: 2838-2842
- [c174]Haohe Liu, Qiuqiang Kong, Xubo Liu, Xinhao Mei, Wenwu Wang, Mark D. Plumbley:
Ontology-aware Learning and Evaluation for Audio Tagging. INTERSPEECH 2023: 3799-3803
- [c173]Jianyuan Sun, Xubo Liu, Xinhao Mei, Volkan Kiliç, Mark D. Plumbley, Wenwu Wang:
Dual Transformer Decoder based Features Fusion Network for Automated Audio Captioning. INTERSPEECH 2023: 4164-4168
- [c172]Wenhan Li, Xiongjie Chen, Wenwu Wang, Víctor Elvira, Yunpeng Li:
Differentiable Bootstrap Particle Filters for Regime-Switching Models. SSP 2023: 200-204
- [i100]Haohe Liu, Zehua Chen, Yi Yuan, Xinhao Mei, Xubo Liu, Danilo P. Mandic, Wenwu Wang, Mark D. Plumbley:
AudioLDM: Text-to-Audio Generation with Latent Diffusion Models. CoRR abs/2301.12503 (2023)
- [i99]Wenhan Li, Xiongjie Chen, Wenwu Wang, Víctor Elvira, Yunpeng Li:
Differentiable Bootstrap Particle Filters for Regime-Switching Models. CoRR abs/2302.10319 (2023)
- [i98]Yi Yuan, Haohe Liu, Jinhua Liang, Xubo Liu, Mark D. Plumbley, Wenwu Wang:
Leveraging Pre-trained AudioLDM for Text to Sound Generation: A Benchmark Study. CoRR abs/2303.03857 (2023)
- [i97]Xinhao Mei, Chutong Meng, Haohe Liu, Qiuqiang Kong, Tom Ko, Chengqi Zhao, Mark D. Plumbley, Yuexian Zou, Wenwu Wang:
WavCaps: A ChatGPT-Assisted Weakly-Labelled Audio Captioning Dataset for Audio-Language Multimodal Research. CoRR abs/2303.17395 (2023)
- [i96]Feiyang Xiao, Jian Guan, Qiaoxi Zhu, Wenwu Wang:
Graph Attention for Automated Audio Captioning. CoRR abs/2304.03586 (2023)
- [i95]Jian Guan, Feiyang Xiao, Youde Liu, Qiaoxi Zhu, Wenwu Wang:
Anomalous Sound Detection using Audio Representation with Machine ID based Contrastive Learning Pretraining. CoRR abs/2304.03588 (2023)
- [i94]Jian Guan, Youde Liu, Qiaoxi Zhu, Tieran Zheng, Jiqing Han, Wenwu Wang:
Time-weighted Frequency Domain Audio Representation with GMM Estimator for Anomalous Sound Detection. CoRR abs/2305.03328 (2023)
- [i93]Yi Yuan, Haohe Liu, Xubo Liu, Xiyuan Kang, Mark D. Plumbley, Wenwu Wang:
Latent Diffusion Model Based Foley Sound Generation System For DCASE Challenge 2023 Task 7. CoRR abs/2305.15905 (2023)
- [i92]Jinhua Liang, Xubo Liu, Haohe Liu, Huy Phan, Emmanouil Benetos, Mark D. Plumbley, Wenwu Wang:
Adapting Language-Audio Models as Few-Shot Audio Learners. CoRR abs/2305.17719 (2023)
- [i91]Jianyuan Sun, Xubo Liu, Xinhao Mei, Volkan Kiliç, Mark D. Plumbley, Wenwu Wang:
Dual Transformer Decoder based Features Fusion Network for Automated Audio Captioning. CoRR abs/2305.18753 (2023)
- [i90]Yi Yuan, Haohe Liu, Xubo Liu, Xiyuan Kang, Peipei Wu, Mark D. Plumbley, Wenwu Wang:
Text-Driven Foley Sound Generation With Latent Diffusion Model. CoRR abs/2306.10359 (2023)
- [i89]Xubo Liu, Zhongkai Zhu, Haohe Liu, Yi Yuan, Meng Cui, Qiushi Huang, Jinhua Liang, Yin Cao, Qiuqiang Kong, Mark D. Plumbley, Wenwu Wang:
WavJourney: Compositional Audio Creation with Large Language Models. CoRR abs/2307.14335 (2023)
- [i88]Xubo Liu, Qiuqiang Kong, Yan Zhao, Haohe Liu, Yi Yuan, Yuzhuo Liu, Rui Xia, Yuxuan Wang, Mark D. Plumbley, Wenwu Wang:
Separate Anything You Describe. CoRR abs/2308.05037 (2023)
- [i87]Haohe Liu, Qiao Tian, Yi Yuan, Xubo Liu, Xinhao Mei, Qiuqiang Kong, Yuping Wang, Wenwu Wang, Yuxuan Wang, Mark D. Plumbley:
AudioLDM 2: Learning Holistic Audio Generation with Self-supervised Pretraining. CoRR abs/2308.05734 (2023)
- [i86]Jinbo Hu, Yin Cao, Ming Wu, Feiran Yang, Ziying Yu, Wenwu Wang, Mark D. Plumbley, Jun Yang:
META-SELD: Meta-Learning for Fast Adaptation to the new environment in Sound Event Localization and Detection. CoRR abs/2308.08847 (2023)
- [i85]Yuanbo Hou, Siyang Song, Cheng Luo, Andrew Mitchell, Qiaoqiao Ren, Weicheng Xie, Jian Kang, Wenwu Wang, Dick Botteldooren:
Joint Prediction of Audio Event and Annoyance Rating in an Urban Soundscape by Hierarchical Graph Representation Learning. CoRR abs/2308.11980 (2023)
- [i84]Siddique Latif, Moazzam Shoukat, Fahad Shamshad, Muhammad Usama, Yi Ren, Heriberto Cuayáhuitl, Wenwu Wang, Xulong Zhang, Roberto Togneri, Erik Cambria, Björn W. Schuller:
Sparks of Large Audio Models: A Survey and Outlook. CoRR abs/2308.12792 (2023)
- [i83]Meng Cui, Xubo Liu, Haohe Liu, Zhuangzhuang Du, Tao Chen, Guoping Lian, Daoliang Li, Wenwu Wang:
Multimodal Fish Feeding Intensity Assessment in Aquaculture. CoRR abs/2309.05058 (2023)
- [i82]Haohe Liu, Ke Chen, Qiao Tian, Wenwu Wang, Mark D. Plumbley:
AudioSR: Versatile Audio Super-resolution at Scale. CoRR abs/2309.07314 (2023)
- [i81]Haiyan Lan, Qiaoxi Zhu, Jian Guan, Yuming Wei, Wenwu Wang:
Hierarchical Metadata Information Constrained Self-Supervised Learning for Anomalous Sound Detection Under Domain Shift. CoRR abs/2309.07498 (2023)
- [i80]Yi Yuan, Haohe Liu, Xubo Liu, Qiushi Huang, Mark D. Plumbley, Wenwu Wang:
Retrieval-Augmented Text-to-Audio Generation. CoRR abs/2309.08051 (2023)
- [i79]Feiyang Xiao, Qiaoxi Zhu, Jian Guan, Xubo Liu, Haohe Liu, Kejia Zhang, Wenwu Wang:
Synth-AC: Enhancing Audio Captioning with Synthetic Supervision. CoRR abs/2309.09705 (2023)
- [i78]