Yifan Yang 0005

Person information

affiliation: Xiaomi Corp., Beijing, China
affiliation: Shanghai Jiao Tong University, X-LANCE Lab, AI Institute, MoE Key Lab of Artificial Intelligence, Shanghai, China

Journal Articles

2025
[j1]
Hui Wang, Yifan Yang, Shujie Liu, Jinyu Li, Lingwei Meng, Tie-Yan Liu, Jiaming Zhou, Haoqin Sun, Yan Lu, Yong Qin:
StreamMel: Real-Time Zero-Shot Text-to-Speech Via Interleaved Continuous Autoregressive Modeling. IEEE Signal Process. Lett. 32: 3530-3534 (2025)

Conference and Workshop Papers

2025
[c22]
Ziyang Ma, Guanrou Yang, Yifan Yang, Zhifu Gao, Jiaming Wang, Zhihao Du, Fan Yu, Qian Chen, Siqi Zheng, Shiliang Zhang, Xie Chen:
Speech Recognition Meets Large Language Model: Benchmarking, Models, and Exploration. AAAI 2025: 24840-24848
[c21]
Wenxi Chen, Ziyang Ma, Ruiqi Yan, Yuzhe Liang, Xiquan Li, Ruiyang Xu, Zhikang Niu, Yanqiao Zhu, Yifan Yang, Zhanxun Liu, Kai Yu, Yuxuan Hu, Jinyu Li, Yan Lu, Shujie Liu, Xie Chen:
SLAM-Omni: Timbre-Controllable Voice Interaction System with Single-Stage Training. ACL (Findings) 2025: 2262-2282
[c20]
Yifan Yang, Zheshu Song, Jianheng Zhuo, Mingyu Cui, Jinpeng Li, Bo Yang, Yexing Du, Ziyang Ma, Xunying Liu, Ziyuan Wang, Ke Li, Shuai Fan, Kai Yu, Wei-Qiang Zhang, Guoguo Chen, Xie Chen:
GigaSpeech 2: An Evolving, Large-Scale and Multi-domain ASR Corpus for Low-Resource Languages with Automated Crawling, Transcription and Refinement. ACL (1) 2025: 2673-2686
[c19]
Yexing Du, Youcheng Pan, Ziyang Ma, Bo Yang, Yifan Yang, Keqi Deng, Xie Chen, Yang Xiang, Ming Liu, Bing Qin:
Making LLMs Better Many-to-Many Speech-to-Text Translators with Curriculum Learning. ACL (1) 2025: 12466-12478
[c18]
Chenpeng Du, Yiwei Guo, Hankun Wang, Yifan Yang, Zhikang Niu, Shuai Wang, Hui Zhang, Xie Chen, Kai Yu:
VALL-T: Decoder-Only Generative Transducer for Robust and Decoding-Controllable Text-to-Speech. ICASSP 2025: 1-5
[c17]
Yifan Yang, Jianheng Zhuo, Zengrui Jin, Ziyang Ma, Xiaoyu Yang, Zengwei Yao, Liyong Guo, Wei Kang, Fangjun Kuang, Long Lin, Daniel Povey, Xie Chen:
k2SSL: A Faster and Better Framework for Self-Supervised Speech Representation Learning. ICME 2025: 1-6
2024
[c16]
Yifan Yang, Feiyu Shen, Chenpeng Du, Ziyang Ma, Kai Yu, Daniel Povey, Xie Chen:
Towards Universal Speech Discrete Tokens: A Case Study for ASR and TTS. ICASSP 2024: 10401-10405
[c15]
Yifan Yang, Yice Zhang, Ruifeng Xu:
Enhancing Generative Aspect-Based Sentiment Analysis with Relation-Level Supervision and Prompt. ICASSP 2024: 10526-10530
[c14]
Xiaoyu Yang, Wei Kang, Zengwei Yao, Yifan Yang, Liyong Guo, Fangjun Kuang, Long Lin, Daniel Povey:
PromptASR for Contextualized ASR with Controllable Style. ICASSP 2024: 10536-10540
[c13]
Wei Kang, Xiaoyu Yang, Zengwei Yao, Fangjun Kuang, Yifan Yang, Liyong Guo, Long Lin, Daniel Povey:
Libriheavy: A 50,000 Hours ASR Corpus with Punctuation Casing and Context. ICASSP 2024: 10991-10995
[c12]
Zengwei Yao, Liyong Guo, Xiaoyu Yang, Wei Kang, Fangjun Kuang, Yifan Yang, Zengrui Jin, Long Lin, Daniel Povey:
Zipformer: A faster and better encoder for automatic speech recognition. ICLR 2024
[c11]
Zengrui Jin, Yifan Yang, Mohan Shi, Wei Kang, Xiaoyu Yang, Zengwei Yao, Fangjun Kuang, Liyong Guo, Lingwei Meng, Long Lin, Yong Xu, Shi-Xiong Zhang, Daniel Povey:
LibriheavyMix: A 20,000-Hour Dataset for Single-Channel Reverberant Multi-Talker Speech Separation, ASR and Speaker Diarization. INTERSPEECH 2024
[c10]
Zheshu Song, Jianheng Zhuo, Yifan Yang, Ziyang Ma, Shixiong Zhang, Xie Chen:
LoRA-Whisper: Parameter-Efficient and Extensible Multilingual ASR. INTERSPEECH 2024
[c9]
Peng Wang, Yifan Yang, Zheng Liang, Tian Tan, Shiliang Zhang, Xie Chen:
Incorporating Class-based Language Model for Named Entity Recognition in Factorized Neural Transducer. INTERSPEECH 2024
[c8]
Yiwei Guo, Chenrun Wang, Yifan Yang, Hankun Wang, Ziyang Ma, Chenpeng Du, Shuai Wang, Hanzheng Li, Xu Li, Shuai Fan, Hui Zhang, Xie Chen, Kai Yu:
The X-Lance Technical Report for Interspeech 2024 Speech Processing using Discrete Speech Unit Challenge. ISCSLP 2024: 641-645
2023
[c7]
Yice Zhang, Yifan Yang, Bin Liang, Shiwei Chen, Bing Qin, Ruifeng Xu:
An Empirical Study of Sentiment-Enhanced Pre-Training for Aspect-Based Sentiment Analysis. ACL (Findings) 2023: 9633-9651
[c6]
Yihui Li, Yice Zhang, Yifan Yang, Ruifeng Xu:
A Generative Model for Structured Sentiment Analysis. AIMS 2023: 28-38
[c5]
Yice Zhang, Yifan Yang, Meng Li, Bin Liang, Shiwei Chen, Ruifeng Xu:
Target-to-Source Augmentation for Aspect Sentiment Triplet Extraction. EMNLP 2023: 12165-12177
[c4]
Zengwei Yao, Wei Kang, Fangjun Kuang, Liyong Guo, Xiaoyu Yang, Yifan Yang, Long Lin, Daniel Povey:
Delay-penalized CTC Implemented Based on Finite State Transducer. INTERSPEECH 2023: 1329-1333
[c3]
Yifan Yang, Xiaoyu Yang, Liyong Guo, Zengwei Yao, Wei Kang, Fangjun Kuang, Long Lin, Xie Chen, Daniel Povey:
Blank-regularized CTC for Frame Skipping in Neural Transducer. INTERSPEECH 2023: 4409-4413
2022
[c2]
Yice Zhang, Yifan Yang, Yihui Li, Bin Liang, Shiwei Chen, Yixue Dang, Min Yang, Ruifeng Xu:
Boundary-Driven Table-Filling for Aspect Sentiment Triplet Extraction. EMNLP 2022: 6485-6498
[c1]
Yihui Li, Yifan Yang, Yice Zhang, Ruifeng Xu:
HITSZ-HLT at SemEval-2022 Task 10: A Span-Relation Extraction Framework for Structured Sentiment Analysis. SemEval@NAACL 2022: 1406-1411

Informal and Other Publications

2025
[i27]
Hui Wang, Shujie Liu, Lingwei Meng, Jinyu Li, Yifan Yang, Shiwan Zhao, Haiyang Sun, Yanqing Liu, Haoqin Sun, Jiaming Zhou, Yan Lu, Yong Qin:
FELLE: Autoregressive Speech Synthesis with Token-Wise Coarse-to-Fine Flow Matching. CoRR abs/2502.11128 (2025)
[i26]
Abdelrahman Abouelenin, Atabak Ashfaq, Adam Atkinson, Hany Awadalla, Nguyen Bach, Jianmin Bao, Alon Benhaim, Martin Cai, Vishrav Chaudhary, Congcong Chen, Dong Chen, Dongdong Chen, Jun-Kun Chen, Weizhu Chen, Yen-Chun Chen, Yi-ling Chen, Qi Dai, Xiyang Dai, Ruchao Fan, Mei Gao, Min Gao, Amit Garg, Abhishek Goswami, Junheng Hao, Amr Hendy, Yuxuan Hu, Xin Jin, Mahmoud Khademi, Dongwoo Kim, Young Jin Kim, Gina Lee, Jinyu Li, Yunsheng Li, Chen Liang, Xihui Lin, Zeqi Lin, Mengchen Liu, Yang Liu, Gilsinia Lopez, Chong Luo, Piyush Madan, Vadim Mazalov, Arindam Mitra, Ali Mousavi, Anh Nguyen, Jing Pan, Daniel Perez-Becker, Jacob Platin, Thomas Portet, Kai Qiu, Bo Ren, Liliang Ren, Sambuddha Roy, Ning Shang, Yelong Shen, Saksham Singhal, Subhojit Som, Xia Song, Tetyana Sych, Praneetha Vaddamanu, Shuohang Wang, Yiming Wang, Zhenghao Wang, Haibin Wu, Haoran Xu, Weijian Xu, Yifan Yang, Ziyi Yang, Donghan Yu, Ishmam Zabir, Jianwen Zhang, Li Lyna Zhang, Yunan Zhang, Xiren Zhou:
Phi-4-Mini Technical Report: Compact yet Powerful Multimodal Language Models via Mixture-of-LoRAs. CoRR abs/2503.01743 (2025)
[i25]
Yifan Yang, Shujie Liu, Jinyu Li, Yuxuan Hu, Haibin Wu, Hui Wang, Jianwei Yu, Lingwei Meng, Haiyang Sun, Yanqing Liu, Yan Lu, Kai Yu, Xie Chen:
Pseudo-Autoregressive Neural Codec Language Models for Efficient Zero-Shot Text-to-Speech Synthesis. CoRR abs/2504.10352 (2025)
[i24]
Guanrou Yang, Chen Yang, Qian Chen, Ziyang Ma, Wenxi Chen, Wen Wang, Tianrui Wang, Yifan Yang, Zhikang Niu, Wenrui Liu, Fan Yu, Zhihao Du, Zhifu Gao, Shiliang Zhang, Xie Chen:
EmoVoice: LLM-based Emotional Text-To-Speech Model with Freestyle Text Prompting. CoRR abs/2504.12867 (2025)
[i23]
Haiyang Sun, Shujie Hu, Shujie Liu, Lingwei Meng, Hui Wang, Bing Han, Yifan Yang, Yanqing Liu, Sheng Zhao, Yan Lu, Yanmin Qian:
Zero-Shot Streaming Text to Speech Synthesis with Transducer and Auto-Regressive Modeling. CoRR abs/2505.19669 (2025)
[i22]
Jianheng Zhuo, Yifan Yang, Yiwen Shao, Yong Xu, Dong Yu, Kai Yu, Xie Chen:
VietASR: Achieving Industry-level Vietnamese ASR with 50-hour labeled data and Large-Scale Speech Pretraining. CoRR abs/2505.21527 (2025)
[i21]
Yu Zhang, Yunqi Li, Yifan Yang, Rui Wang, Yuqing Yang, Dai Qi, Jianmin Bao, Dongdong Chen, Chong Luo, Lili Qiu:
ReasonGen-R1: CoT for Autoregressive Image generation models through SFT and RL. CoRR abs/2505.24875 (2025)
[i20]
Hui Wang, Yifan Yang, Shujie Liu, Jinyu Li, Lingwei Meng, Yanqing Liu, Jiaming Zhou, Haoqin Sun, Yan Lu, Yong Qin:
StreamMel: Real-Time Zero-shot Text-to-Speech via Interleaved Continuous Autoregressive Modeling. CoRR abs/2506.12570 (2025)
[i19]
Miaosen Zhang, Ziqiang Xu, Jialiang Zhu, Qi Dai, Kai Qiu, Yifan Yang, Chong Luo, Tianyi Chen, Justin Wagle, Tim Franklin, Baining Guo:
Phi-Ground Tech Report: Advancing Perception in GUI Grounding. CoRR abs/2507.23779 (2025)
[i18]
Hui Wang, Jinghua Zhao, Yifan Yang, Shujie Liu, Junyang Chen, Yanzhe Zhang, Shiwan Zhao, Jinyu Li, Jiaming Zhou, Haoqin Sun, Yan Lu, Yong Qin:
SpeechLLM-as-Judges: Towards General and Interpretable Speech Quality Evaluation. CoRR abs/2510.14664 (2025)
2024
[i17]
Chenpeng Du, Yiwei Guo, Hankun Wang, Yifan Yang, Zhikang Niu, Shuai Wang, Hui Zhang, Xie Chen, Kai Yu:
VALL-T: Decoder-Only Generative Transducer for Robust and Decoding-Controllable Text-to-Speech. CoRR abs/2401.14321 (2024)
[i16]
Ziyang Ma, Guanrou Yang, Yifan Yang, Zhifu Gao, Jiaming Wang, Zhihao Du, Fan Yu, Qian Chen, Siqi Zheng, Shiliang Zhang, Xie Chen:
An Embarrassingly Simple Approach for LLM with Strong ASR Capacity. CoRR abs/2402.08846 (2024)
[i15]
Yiwei Guo, Chenrun Wang, Yifan Yang, Hankun Wang, Ziyang Ma, Chenpeng Du, Shuai Wang, Hanzheng Li, Shuai Fan, Hui Zhang, Xie Chen, Kai Yu:
The X-LANCE Technical Report for Interspeech 2024 Speech Processing Using Discrete Speech Unit Challenge. CoRR abs/2404.06079 (2024)
[i14]
Zheshu Song, Jianheng Zhuo, Yifan Yang, Ziyang Ma, Shixiong Zhang, Xie Chen:
LoRA-Whisper: Parameter-Efficient and Extensible Multilingual ASR. CoRR abs/2406.06619 (2024)
[i13]
Yifan Yang, Zheshu Song, Jianheng Zhuo, Mingyu Cui, Jinpeng Li, Bo Yang, Yexing Du, Ziyang Ma, Xunying Liu, Ziyuan Wang, Ke Li, Shuai Fan, Kai Yu, Wei-Qiang Zhang, Guoguo Chen, Xie Chen:
GigaSpeech 2: An Evolving, Large-Scale and Multi-domain ASR Corpus for Low-Resource Languages with Automated Crawling, Transcription and Refinement. CoRR abs/2406.11546 (2024)
[i12]
Zengrui Jin, Yifan Yang, Mohan Shi, Wei Kang, Xiaoyu Yang, Zengwei Yao, Fangjun Kuang, Liyong Guo, Lingwei Meng, Long Lin, Yong Xu, Shi-Xiong Zhang, Daniel Povey:
LibriheavyMix: A 20,000-Hour Dataset for Single-Channel Reverberant Multi-Talker Speech Separation, ASR and Speaker Diarization. CoRR abs/2409.00819 (2024)
[i11]
Mingyu Cui, Yifan Yang, Jiajun Deng, Jiawen Kang, Shujie Hu, Tianzi Wang, Zhaoqing Li, Shiliang Zhang, Xie Chen, Xunying Liu:
Exploring SSL Discrete Speech Features for Zipformer-based Contextual ASR. CoRR abs/2409.08797 (2024)
[i10]
Mingyu Cui, Daxin Tan, Yifan Yang, Dingdong Wang, Huimeng Wang, Xiao Chen, Xie Chen, Xunying Liu:
Exploring SSL Discrete Tokens for Multilingual ASR. CoRR abs/2409.08805 (2024)
[i9]
Yexing Du, Ziyang Ma, Yifan Yang, Keqi Deng, Xie Chen, Bo Yang, Yang Xiang, Ming Liu, Bing Qin:
CoT-ST: Enhancing LLM-based Speech Translation with Multimodal Chain-of-Thought. CoRR abs/2409.19510 (2024)
[i8]
Rui Tian, Qi Dai, Jianmin Bao, Kai Qiu, Yifan Yang, Chong Luo, Zuxuan Wu, Yu-Gang Jiang:
REDUCIO! Generating 1024×1024 Video within 16 Seconds using Extremely Compressed Motion Latents. CoRR abs/2411.13552 (2024)
[i7]
Miaosen Zhang, Qi Dai, Yifan Yang, Jianmin Bao, Dongdong Chen, Kai Qiu, Chong Luo, Xin Geng, Baining Guo:
MageBench: Bridging Large Multimodal Models to Agents. CoRR abs/2412.04531 (2024)
2023
[i6]
Yifan Yang, Xiaoyu Yang, Liyong Guo, Zengwei Yao, Wei Kang, Fangjun Kuang, Long Lin, Xie Chen, Daniel Povey:
Blank-regularized CTC for Frame Skipping in Neural Transducer. CoRR abs/2305.11558 (2023)
[i5]
Yifan Yang, Feiyu Shen, Chenpeng Du, Ziyang Ma, Kai Yu, Daniel Povey, Xie Chen:
Towards Universal Speech Discrete Tokens: A Case Study for ASR and TTS. CoRR abs/2309.07377 (2023)
[i4]
Xiaoyu Yang, Wei Kang, Zengwei Yao, Yifan Yang, Liyong Guo, Fangjun Kuang, Long Lin, Daniel Povey:
PromptASR for contextualized ASR with controllable style. CoRR abs/2309.07414 (2023)
[i3]
Peng Wang, Yifan Yang, Zheng Liang, Tian Tan, Shiliang Zhang, Xie Chen:
Incorporating Class-based Language Model for Named Entity Recognition in Factorized Neural Transducer. CoRR abs/2309.07648 (2023)
[i2]
Wei Kang, Xiaoyu Yang, Zengwei Yao, Fangjun Kuang, Yifan Yang, Liyong Guo, Long Lin, Daniel Povey:
Libriheavy: a 50,000 hours ASR corpus with punctuation casing and context. CoRR abs/2309.08105 (2023)
[i1]
Zengwei Yao, Liyong Guo, Xiaoyu Yang, Wei Kang, Fangjun Kuang, Yifan Yang, Zengrui Jin, Long Lin, Daniel Povey:
Zipformer: A faster and better encoder for automatic speech recognition. CoRR abs/2310.11230 (2023)
