


Xiaoda Yang
2020 – today
- 2025
- [c8] Yuhang Ma, Wenting Xu, Chaoyi Zhao, Keqiang Sun, Qinfeng Jin, Xiaoda Yang, Zeng Zhao, Changjie Fan, Zhipeng Hu:
  Storynizor: Consistent Story Generation via Inter-Frame Synchronized and Shuffled ID Injection. AAAI 2025: 6027-6035
- [c7] Wenrui Liu, Jionghao Bai, Xize Cheng, Jialong Zuo, Ziyue Jiang, Shengpeng Ji, Minghui Fang, Xiaoda Yang, Qian Yang, Zhou Zhao:
  VoxpopuliTTS: a large-scale multilingual TTS corpus for zero-shot speech generation. COLING 2025: 10293-10297
- [c6] Xize Cheng, Ruofan Hu, Xiaoda Yang, Jingyu Lu, Dongjie Fu, Zehan Wang, Shengpeng Ji, Rongjie Huang, Boyang Zhang, Tao Jin, Zhou Zhao:
  VoxDialogue: Can Spoken Dialogue Systems Understand Information Beyond Words? ICLR 2025
- [c5] Weicai Yan, Wang Lin, Zirun Guo, Ye Wang, Fangming Feng, Xiaoda Yang, Zehan Wang, Tao Jin:
  Diff-Prompt: Diffusion-Driven Prompt Generator with Mask Supervision. ICLR 2025
- [c4] Minjie Hong, Yan Xia, Zehan Wang, Jieming Zhu, Ye Wang, Sihang Cai, Xiaoda Yang, Quanyu Dai, Zhenhua Dong, Zhimeng Zhang, Zhou Zhao:
  EAGER-LLM: Enhancing Large Language Models as Recommenders through Exogenous Behavior-Semantic Integration. WWW 2025: 2754-2762
- [i10] Xize Cheng, Dongjie Fu, Xiaoda Yang, Minghui Fang, Ruofan Hu, Jingyu Lu, Jionghao Bai, Zehan Wang, Shengpeng Ji, Rongjie Huang, Linjun Li, Yu Chen, Tao Jin, Zhou Zhao:
  OmniChat: Enhancing Spoken Dialogue Systems with Scalable Synthetic Data for Diverse Scenarios. CoRR abs/2501.01384 (2025)
- [i9] Minjie Hong, Yan Xia, Zehan Wang, Jieming Zhu, Ye Wang, Sihang Cai, Xiaoda Yang, Quanyu Dai, Zhenhua Dong, Zhimeng Zhang, Zhou Zhao:
  EAGER-LLM: Enhancing Large Language Models as Recommenders through Exogenous Behavior-Semantic Integration. CoRR abs/2502.14735 (2025)
- [i8] Ziyue Jiang, Yi Ren, Ruiqi Li, Shengpeng Ji, Zhenhui Ye, Chen Zhang, Jionghao Bai, Xiaoda Yang, Jialong Zuo, Yu Zhang, Rui Liu, Xiang Yin, Zhou Zhao:
  Sparse Alignment Enhanced Latent Diffusion Transformer for Zero-Shot Speech Synthesis. CoRR abs/2502.18924 (2025)
- [i7] Xiaoda Yang, Junyu Lu, Hongshun Qiu, Sijing Li, Hao Li, Shengpeng Ji, Xudong Tang, Jiayang Xu, Jiaqi Duan, Ziyue Jiang, Cong Lin, Sihang Cai, Zejian Xie, Zhuoyang Song, Songxin Zhang:
  Astrea: A MOE-based Visual Understanding Model with Progressive Alignment. CoRR abs/2503.09445 (2025)
- [i6] Xiaoda Yang, Jiayang Xu, Kaixuan Luan, Xinyu Zhan, Hongshun Qiu, Shijun Shi, Hao Li, Shuai Yang, Li Zhang, Checheng Yu, Cewu Lu, Lixin Yang:
  OmniCam: Unified Multimodal Video Generation via Camera Control. CoRR abs/2504.02312 (2025)
- [i5] Sijing Li, Tianwei Lin, Lingshuai Lin, Wenqiao Zhang, Jiang Liu, Xiaoda Yang, Juncheng Li, Yucheng He, Xiaohui Song, Jun Xiao, Yueting Zhuang, Beng Chin Ooi:
  EyecareGPT: Boosting Comprehensive Ophthalmology Understanding with Tailored Dataset, Benchmark and Model. CoRR abs/2504.13650 (2025)
- [i4] Weicai Yan, Wang Lin, Zirun Guo, Ye Wang, Fangming Feng, Xiaoda Yang, Zehan Wang, Tao Jin:
  Diff-Prompt: Diffusion-Driven Prompt Generator with Mask Supervision. CoRR abs/2504.21423 (2025)
- 2024
- [c3] Xiaoda Yang, Xize Cheng, Jiaqi Duan, Hongshun Qiu, Minjie Hong, Minghui Fang, Shengpeng Ji, Jialong Zuo, Zhiqing Hong, Zhimeng Zhang, Tao Jin:
  AudioVSR: Enhancing Video Speech Recognition with Audio Data. EMNLP 2024: 15352-15361
- [c2] Dongjie Fu, Xize Cheng, Xiaoda Yang, Hanting Wang, Zhou Zhao, Tao Jin:
  Boosting Speech Recognition Robustness to Modality-Distortion with Contrast-Augmented Prompts. ACM Multimedia 2024: 3838-3847
- [c1] Xiaoda Yang, Xize Cheng, Dongjie Fu, Minghui Fang, Jialong Zuo, Shengpeng Ji, Zhou Zhao, Tao Jin:
  SyncTalklip: Highly Synchronized Lip-Readable Speaker Generation with Multi-Task Learning. ACM Multimedia 2024: 8149-8158
- [i3] Minghui Fang, Shengpeng Ji, Jialong Zuo, Hai Huang, Yan Xia, Jieming Zhu, Xize Cheng, Xiaoda Yang, Wenrui Liu, Gang Wang, Zhenhua Dong, Zhou Zhao:
  ACE: A Generative Cross-Modal Retrieval Framework with Coarse-To-Fine Semantic Modeling. CoRR abs/2406.17507 (2024)
- [i2] Shengpeng Ji, Ziyue Jiang, Xize Cheng, Yifu Chen, Minghui Fang, Jialong Zuo, Qian Yang, Ruiqi Li, Ziang Zhang, Xiaoda Yang, Rongjie Huang, Yidi Jiang, Qian Chen, Siqi Zheng, Wen Wang, Zhou Zhao:
  WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling. CoRR abs/2408.16532 (2024)
- [i1] Shengpeng Ji, Yifu Chen, Minghui Fang, Jialong Zuo, Jingyu Lu, Hanting Wang, Ziyue Jiang, Long Zhou, Shujie Liu, Xize Cheng, Xiaoda Yang, Zehan Wang, Qian Yang, Jian Li, Yidi Jiang, Jingzhen He, Yunfei Chu, Jin Xu, Zhou Zhao:
  WavChat: A Survey of Spoken Dialogue Models. CoRR abs/2411.13577 (2024)
last updated on 2025-05-28 20:34 CEST by the dblp team
all metadata released as open data under CC0 1.0 license