Zhisheng Zheng

2020 – today

2025
[c15]
https://dblp.org/rec/conf/icassp/Chen0LXLZ0025
Wenxi Chen, Ziyang Ma, Xiquan Li, Xuenan Xu, Yuzhe Liang, Zhisheng Zheng, Kai Yu, Xie Chen:
SLAM-AAC: Enhancing Audio Captioning with Paraphrasing Augmentation and CLAP-Refine through LLMs. ICASSP 2025: 1-5
[c14]
https://dblp.org/rec/conf/icassp/LiC0XLZK025
Xiquan Li, Wenxi Chen, Ziyang Ma, Xuenan Xu, Yuzhe Liang, Zhisheng Zheng, Qiuqiang Kong, Xie Chen:
DRCap: Decoding CLAP Latents with Retrieval-Augmented Generation for Zero-shot Audio Captioning. ICASSP 2025: 1-5
[c13]
https://dblp.org/rec/conf/ictai/GaoZK25
Xiang Gao, Zhisheng Zheng, Alois Knoll:
Compositional Neural Distance Field with Latent Code Embedding for Dynamic Objects. ICTAI 2025: 889-896
[i18]
https://dblp.org/rec/journals/corr/abs-2503-04713
Anuj Diwan, Zhisheng Zheng, David Harwath, Eunsol Choi:
Scaling Rich Style-Prompted Text-to-Speech Datasets. CoRR abs/2503.04713 (2025)
[i17]
https://dblp.org/rec/journals/corr/abs-2505-13032
Ziyang Ma, Yinghao Ma, Yanqiao Zhu, Chen Yang, Yi-Wen Chao, Ruiyang Xu, Wenxi Chen, Yuanzhe Chen, Zhuo Chen, Jian Cong, Kai Li, Keliang Li, Siyou Li, Xinfeng Li, Xiquan Li, Zheng Lian, Yuzhe Liang, Minghao Liu, Zhikang Niu, Tianrui Wang, Yuping Wang, Yuxuan Wang, Yihao Wu, Guanrou Yang, Jianwei Yu, Ruibin Yuan, Zhisheng Zheng, Ziya Zhou, Haina Zhu, Wei Xue, Emmanouil Benetos, Kai Yu, Chng Eng Siong, Xie Chen:
MMAR: A Challenging Benchmark for Deep Reasoning in Speech, Audio, Music, and Their Mix. CoRR abs/2505.13032 (2025)
[i16]
https://dblp.org/rec/journals/corr/abs-2510-24693
Zihan Liu, Zhikang Niu, Qiuyang Xiao, Zhisheng Zheng, Ruoqi Yuan, Yuhang Zang, Yuhang Cao, Xiaoyi Dong, Jianze Liang, Xie Chen, Leilei Sun, Dahua Lin, Jiaqi Wang:
STAR-Bench: Probing Deep Spatio-Temporal Reasoning as Audio 4D Intelligence. CoRR abs/2510.24693 (2025)
[i15]
https://dblp.org/rec/journals/corr/abs-2511-12347
Zhisheng Zheng, Puyuan Peng, Anuj Diwan, Cong Phuoc Huynh, Xiaohang Sun, Zhu Liu, Vimal Bhat, David Harwath:
VoiceCraft-X: Unifying Multilingual, Voice-Cloning Speech Synthesis and Speech Editing. CoRR abs/2511.12347 (2025)
[i14]
https://dblp.org/rec/journals/corr/abs-2511-20974
Zhisheng Zheng, Xiaohang Sun, Tuan Dinh, Abhishek Yanamandra, Abhinav Jain, Zhu Liu, Sunil Hadap, Vimal Bhat, Manoj Aggarwal, Gérard G. Medioni, David Harwath:
RosettaSpeech: Zero-Shot Speech-to-Speech Translation from Monolingual Data. CoRR abs/2511.20974 (2025)
2024
[c12]
https://dblp.org/rec/conf/acl/MaZYLGZ024
Ziyang Ma, Zhisheng Zheng, Jiaxin Ye, Jinchao Li, Zhifu Gao, Shiliang Zhang, Xie Chen:
emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation. ACL (Findings) 2024: 15747-15760
[c11]
https://dblp.org/rec/conf/icassp/MaWZG0Z024
Ziyang Ma, Wen Wu, Zhisheng Zheng, Yiwei Guo, Qian Chen, Shiliang Zhang, Xie Chen:
Leveraging Speech PTM, Text LLM, And Emotional TTS For Speech Emotion Recognition. ICASSP 2024: 11146-11150
[c10]
https://dblp.org/rec/conf/icml/ZhengPM0CH24
Zhisheng Zheng, Puyuan Peng, Ziyang Ma, Xie Chen, Eunsol Choi, David Harwath:
BAT: Learning to Reason about Spatial Sounds with Large Language Models. ICML 2024
[c9]
https://dblp.org/rec/conf/ijcai/ChenLMZ024
Wenxi Chen, Yuzhe Liang, Ziyang Ma, Zhisheng Zheng, Xie Chen:
EAT: Self-Supervised Pre-Training with Efficient Audio Transformer. IJCAI 2024: 3807-3815
[c8]
https://dblp.org/rec/conf/interspeech/MaCZZCLY0H24
Ziyang Ma, Mingjie Chen, Hezhao Zhang, Zhisheng Zheng, Wenxi Chen, Xiquan Li, Jiaxin Ye, Xie Chen, Thomas Hain:
EmoBox: Multilingual Multi-corpus Speech Emotion Recognition Toolkit and Benchmark. INTERSPEECH 2024
[c7]
https://dblp.org/rec/conf/iscslp/XiaMZ024
Zhengshun Xia, Ziyang Ma, Zhisheng Zheng, Xie Chen:
Improving Emotion Recognition with Pre-Trained Models, Multimodality, and Contextual Information. ISCSLP 2024: 636-640
[i13]
https://dblp.org/rec/journals/corr/abs-2401-03497
Wenxi Chen, Yuzhe Liang, Ziyang Ma, Zhisheng Zheng, Xie Chen:
EAT: Self-Supervised Pre-Training with Efficient Audio Transformer. CoRR abs/2401.03497 (2024)
[i12]
https://dblp.org/rec/journals/corr/abs-2402-01591
Zhisheng Zheng, Puyuan Peng, Ziyang Ma, Xie Chen, Eunsol Choi, David Harwath:
BAT: Learning to Reason about Spatial Sounds with Large Language Models. CoRR abs/2402.01591 (2024)
[i11]
https://dblp.org/rec/journals/corr/abs-2406-07162
Ziyang Ma, Mingjie Chen, Hezhao Zhang, Zhisheng Zheng, Wenxi Chen, Xiquan Li, Jiaxin Ye, Xie Chen, Thomas Hain:
EmoBox: Multilingual Multi-corpus Speech Emotion Recognition Toolkit and Benchmark. CoRR abs/2406.07162 (2024)
[i10]
https://dblp.org/rec/journals/corr/abs-2410-09472
Xiquan Li, Wenxi Chen, Ziyang Ma, Xuenan Xu, Yuzhe Liang, Zhisheng Zheng, Qiuqiang Kong, Xie Chen:
DRCap: Decoding CLAP Latents with Retrieval-augmented Generation for Zero-shot Audio Captioning. CoRR abs/2410.09472 (2024)
[i9]
https://dblp.org/rec/journals/corr/abs-2410-09503
Wenxi Chen, Ziyang Ma, Xiquan Li, Xuenan Xu, Yuzhe Liang, Zhisheng Zheng, Kai Yu, Xie Chen:
SLAM-AAC: Enhancing Audio Captioning with Paraphrasing Augmentation and CLAP-Refine through LLMs. CoRR abs/2410.09503 (2024)
2023
[c6]
https://dblp.org/rec/conf/asru/WangTMZCZ23
Yujin Wang, Changli Tang, Ziyang Ma, Zhisheng Zheng, Xie Chen, Wei-Qiang Zhang:
Exploring Effective Distillation of Self-Supervised Speech Models for Automatic Speech Recognition. ASRU 2023: 1-6
[c5]
https://dblp.org/rec/conf/asru/YangMZSNC23
Guanrou Yang, Ziyang Ma, Zhisheng Zheng, Yakun Song, Zhikang Niu, Xie Chen:
Fast-Hubert: an Efficient Training Framework for Self-Supervised Speech Representation Learning. ASRU 2023: 1-7
[c4]
https://dblp.org/rec/conf/icassp/ChenMTWZ23
Xie Chen, Ziyang Ma, Changli Tang, Yujin Wang, Zhisheng Zheng:
Front-End Adapter: Adapting Front-End Input of Speech Based Self-Supervised Learning for Speech Recognition. ICASSP 2023: 1-5
[c3]
https://dblp.org/rec/conf/interspeech/MaZTW023
Ziyang Ma, Zhisheng Zheng, Changli Tang, Yujin Wang, Xie Chen:
MT4SSL: Boosting Self-Supervised Speech Representation Learning by Integrating Multiple Targets. INTERSPEECH 2023: 82-86
[c2]
https://dblp.org/rec/conf/interspeech/MaZY00023
Ziyang Ma, Zhisheng Zheng, Guanrou Yang, Yu Wang, Chao Zhang, Xie Chen:
Pushing the Limits of Unsupervised Unit Discovery for SSL Speech Representation. INTERSPEECH 2023: 1269-1273
[c1]
https://dblp.org/rec/conf/interspeech/ZhengM0023
Zhisheng Zheng, Ziyang Ma, Yu Wang, Xie Chen:
Unsupervised Active Learning: Optimizing Labeling Cost-Effectiveness for Automatic Speech Recognition. INTERSPEECH 2023: 3307-3311
[i8]
https://dblp.org/rec/journals/corr/abs-2302-09331
Xie Chen, Ziyang Ma, Changli Tang, Yujin Wang, Zhisheng Zheng:
Front-End Adapter: Adapting Front-End Input of Speech based Self-Supervised Learning for Speech Recognition. CoRR abs/2302.09331 (2023)
[i7]
https://dblp.org/rec/journals/corr/abs-2306-08920
Ziyang Ma, Zhisheng Zheng, Guanrou Yang, Yu Wang, Chao Zhang, Xie Chen:
Pushing the Limits of Unsupervised Unit Discovery for SSL Speech Representation. CoRR abs/2306.08920 (2023)
[i6]
https://dblp.org/rec/journals/corr/abs-2308-14814
Zhisheng Zheng, Ziyang Ma, Yu Wang, Xie Chen:
Unsupervised Active Learning: Optimizing Labeling Cost-Effectiveness for Automatic Speech Recognition. CoRR abs/2308.14814 (2023)
[i5]
https://dblp.org/rec/journals/corr/abs-2309-10294
Ziyang Ma, Wen Wu, Zhisheng Zheng, Yiwei Guo, Qian Chen, Shiliang Zhang, Xie Chen:
Leveraging Speech PTM, Text LLM, and Emotional TTS for Speech Emotion Recognition. CoRR abs/2309.10294 (2023)
[i4]
https://dblp.org/rec/journals/corr/abs-2309-13860
Guanrou Yang, Ziyang Ma, Zhisheng Zheng, Yakun Song, Zhikang Niu, Xie Chen:
Fast-HuBERT: An Efficient Training Framework for Self-Supervised Speech Representation Learning. CoRR abs/2309.13860 (2023)
[i3]
https://dblp.org/rec/journals/corr/abs-2312-15185
Ziyang Ma, Zhisheng Zheng, Jiaxin Ye, Jinchao Li, Zhifu Gao, Shiliang Zhang, Xie Chen:
emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation. CoRR abs/2312.15185 (2023)
2022
[j1]
https://dblp.org/rec/journals/tgrs/LiuZ22
Yang Liu, Zhisheng Zheng:
Noniterative f-x-y Streaming Prediction Filtering for Random Noise Attenuation on Seismic Data. IEEE Trans. Geosci. Remote. Sens. 60: 1-9 (2022)
[i2]
https://dblp.org/rec/journals/corr/abs-2210-15631
Yujin Wang, Changli Tang, Ziyang Ma, Zhisheng Zheng, Xie Chen, Wei-Qiang Zhang:
Exploring Effective Distillation of Self-Supervised Speech Models for Automatic Speech Recognition. CoRR abs/2210.15631 (2022)
[i1]
https://dblp.org/rec/journals/corr/abs-2211-07321
Ziyang Ma, Zhisheng Zheng, Changli Tang, Yujin Wang, Xie Chen:
MT4SSL: Boosting Self-Supervised Speech Representation Learning by Integrating Multiple Targets. CoRR abs/2211.07321 (2022)
