


Songyang Gao
Conference and Workshop Papers
- 2025
[c19]Shihan Dou, Yan Liu, Enyu Zhou, Songyang Gao, Tianlong Li, Limao Xiong, Xin Zhao, Haoxiang Jia, Junjie Ye, Rui Zheng, Tao Gui, Qi Zhang, Xuanjing Huang:
Alleviating Shifted Distribution in Human Preference Alignment through Meta-Learning. AAAI 2025: 23805-23813
[c18]Junnan Liu, Hongwei Liu, Linchen Xiao, Ziyi Wang, Kuikun Liu, Songyang Gao, Wenwei Zhang, Songyang Zhang, Kai Chen:
Are Your LLMs Capable of Stable Reasoning? ACL (Findings) 2025: 17594-17632
[c17]Qiming Ge, Shuhao Xing, Songyang Gao, Yunhua Zhou, Yicheng Zou, Songyang Zhang, Zhi Chen, Hang Yan, Qi Zhang, Qipeng Guo, Kai Chen:
Capability Salience Vector: Fine-grained Alignment of Loss and Capabilities for Downstream Task Scaling Law. ACL (1) 2025: 23746-23761
[c16]Zhiheng Xi, Yiwen Ding, Wenxiang Chen, Boyang Hong, Honglin Guo, Junzhe Wang, Xin Guo, Dingwen Yang, Chenyang Liao, Wei He, Songyang Gao, Lu Chen, Rui Zheng, Yicheng Zou, Tao Gui, Qi Zhang, Xipeng Qiu, Xuanjing Huang, Zuxuan Wu, Yu-Gang Jiang:
AgentGym: Evaluating and Training Large Language Model-based Agents across Diverse Environments. ACL (1) 2025: 27914-27961
[c15]Junjie Ye, Guanyu Li, Songyang Gao, Caishuang Huang, Yilong Wu, Sixian Li, Xiaoran Fan, Shihan Dou, Tao Ji, Qi Zhang, Tao Gui, Xuanjing Huang:
ToolEyes: Fine-Grained Evaluation for Tool Learning Capabilities of Large Language Models in Real-world Scenarios. COLING 2025: 156-187
- 2024
[c14]Shihan Dou, Enyu Zhou, Yan Liu, Songyang Gao, Wei Shen, Limao Xiong, Yuhao Zhou, Xiao Wang, Zhiheng Xi, Xiaoran Fan, Shiliang Pu, Jiang Zhu, Rui Zheng, Tao Gui, Qi Zhang, Xuanjing Huang:
LoRAMoE: Alleviating World Knowledge Forgetting in Large Language Models via MoE-Style Plugin. ACL (1) 2024: 1932-1945
[c13]Junjie Ye, Sixian Li, Guanyu Li, Caishuang Huang, Songyang Gao, Yilong Wu, Qi Zhang, Tao Gui, Xuanjing Huang:
ToolSword: Unveiling Safety Issues of Large Language Models in Tool Learning Across Three Stages. ACL (1) 2024: 2181-2211
[c12]Chenyu Shi, Xiao Wang, Qiming Ge, Songyang Gao, Xianjun Yang, Tao Gui, Qi Zhang, Xuanjing Huang, Xun Zhao, Dahua Lin:
Navigating the OverKill in Large Language Models. ACL (1) 2024: 4602-4614
[c11]Junjie Ye, Yilong Wu, Songyang Gao, Caishuang Huang, Sixian Li, Guanyu Li, Xiaoran Fan, Qi Zhang, Tao Gui, Xuanjing Huang:
RoTBench: A Multi-Level Benchmark for Evaluating the Robustness of Large Language Models in Tool Learning. EMNLP 2024: 313-333
[c10]Han Xia, Songyang Gao, Qiming Ge, Zhiheng Xi, Qi Zhang, Xuanjing Huang:
Inverse-Q*: Token Level Reinforcement Learning for Aligning Large Language Models Without Preference Data. EMNLP (Findings) 2024: 8178-8188
[c9]Songyang Gao, Qiming Ge, Wei Shen, Shihan Dou, Junjie Ye, Xiao Wang, Rui Zheng, Yicheng Zou, Zhi Chen, Hang Yan, Qi Zhang, Dahua Lin:
Linear Alignment: A Closed-form Solution for Aligning Human Preferences without Tuning and Feedback. ICML 2024
[c8]Shihan Dou, Songyang Gao, Tao Gui, Qi Zhang:
CausalAPM: Generalizable Literal Disentanglement for NLU Debiasing. NLPCC (1) 2024: 284-297
- 2023
[c7]Xiao Wang, Weikang Zhou, Qi Zhang, Jie Zhou, Songyang Gao, Junzhe Wang, Menghan Zhang, Xiang Gao, Yunwen Chen, Tao Gui:
Farewell to Aimless Large-scale Pretraining: Influential Subset Selection for Language Model. ACL (Findings) 2023: 555-568
[c6]Songyang Gao, Shihan Dou, Yan Liu, Xiao Wang, Qi Zhang, Zhongyu Wei, Jin Ma, Ying Shan:
DSRM: Boost Textual Adversarial Training with Distribution Shift Risk Minimization. ACL (1) 2023: 12177-12189
[c5]Songyang Gao, Shihan Dou, Qi Zhang, Xuanjing Huang, Jin Ma, Ying Shan:
On the Universal Adversarial Perturbations for Efficient Data-free Adversarial Detection. ACL (Findings) 2023: 13573-13581
[c4]Enyu Zhou, Rui Zheng, Zhiheng Xi, Songyang Gao, Xiaoran Fan, Zichu Fei, Jingting Ye, Tao Gui, Qi Zhang, Xuanjing Huang:
RealBehavior: A Framework for Faithfully Characterizing Foundation Models' Human-like Behavior Mechanisms. EMNLP (Findings) 2023: 10262-10274
[c3]Zhiheng Xi, Senjie Jin, Yuhao Zhou, Rui Zheng, Songyang Gao, Jia Liu, Tao Gui, Qi Zhang, Xuanjing Huang:
Self-Polish: Enhance Reasoning in Large Language Models via Problem Refinement. EMNLP (Findings) 2023: 11383-11406
- 2022
[c2]Shihan Dou, Rui Zheng, Ting Wu, Songyang Gao, Junjie Shan, Qi Zhang, Yueming Wu, Xuanjing Huang:
Decorrelate Irrelevant, Purify Relevant: Overcome Textual Spurious Correlations from a Feature Perspective. COLING 2022: 2278-2287
[c1]Songyang Gao, Shihan Dou, Qi Zhang, Xuanjing Huang:
Kernel-Whitening: Overcome Dataset Bias with Isotropic Sentence Embedding. EMNLP 2022: 4112-4122
Informal and Other Publications
- 2025
[i33]Chengqi Lyu, Songyang Gao, Yuzhe Gu, Wenwei Zhang, Jianfei Gao, Kuikun Liu, Ziyi Wang, Shuaibin Li, Qian Zhao, Haian Huang, Weihan Cao, Jiangning Liu, Hongwei Liu, Junnan Liu, Songyang Zhang, Dahua Lin, Kai Chen:
Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning. CoRR abs/2502.06781 (2025)
[i32]Xiaomin Yu, Pengxiang Ding, Wenjie Zhang, Siteng Huang, Songyang Gao, Chengwei Qin, Kejian Wu, Zhaoxin Fan, Ziyue Qiao, Donglin Wang:
Unicorn: Text-Only Data Synthesis for Vision Language Model Training. CoRR abs/2503.22655 (2025)
[i31]Jialun Zhong, Wei Shen, Yanzeng Li, Songyang Gao, Hua Lu, Yicheng Chen, Yang Zhang, Wei Zhou, Jinjie Gu, Lei Zou:
A Comprehensive Survey of Reward Models: Taxonomy, Applications, Challenges, and Future. CoRR abs/2504.12328 (2025)
[i30]Qiming Ge, Shuhao Xing, Songyang Gao, Yunhua Zhou, Yicheng Zou, Songyang Zhang, Zhi Chen, Hang Yan, Qi Zhang, Qipeng Guo, Kai Chen:
Capability Salience Vector: Fine-grained Alignment of Loss and Capabilities for Downstream Task Scaling Law. CoRR abs/2506.13216 (2025)
[i29]Shihan Dou, Shichun Liu, Yuming Yang, Yicheng Zou, Yunhua Zhou, Shuhao Xing, Chenhao Huang, Qiming Ge, Demin Song, Haijun Lv, Songyang Gao, Chengqi Lv, Enyu Zhou, Honglin Guo, Zhiheng Xi, Wenwei Zhang, Qipeng Guo, Qi Zhang, Xipeng Qiu, Xuanjing Huang, Tao Gui, Kai Chen:
Pre-Trained Policy Discriminators are General Reward Models. CoRR abs/2507.05197 (2025)
[i28]Zhouqi Hua, Wenwei Zhang, Chengqi Lyu, Yuzhe Gu, Songyang Gao, Kuikun Liu, Kai Chen:
The Imitation Game: Turing Machine Imitator is Length Generalizable Reasoner. CoRR abs/2507.13332 (2025)
[i27]Junhao Shen, Haiteng Zhao, Yuzhe Gu, Songyang Gao, Kuikun Liu, Haian Huang, Jianfei Gao, Dahua Lin, Wenwei Zhang, Kai Chen:
Semi-off-Policy Reinforcement Learning for Vision-Language Slow-thinking Reasoning. CoRR abs/2507.16814 (2025)
[i26]Shudong Liu, Hongwei Liu, Junnan Liu, Linchen Xiao, Songyang Gao, Chengqi Lyu, Yuzhe Gu, Wenwei Zhang, Derek F. Wong, Songyang Zhang, Kai Chen:
CompassVerifier: A Unified and Robust Verifier for LLMs Evaluation and Outcome Reward. CoRR abs/2508.03686 (2025)
[i25]Lei Bai, Zhongrui Cai, Yuhang Cao, Maosong Cao, Weihan Cao, Chiyu Chen, Haojiong Chen, Kai Chen, Pengcheng Chen, Ying Chen, Yongkang Chen, Yu Cheng, Pei Chu, Tao Chu, Erfei Cui, Ganqu Cui, Long Cui, Ziyun Cui, Nianchen Deng, Ning Ding, Nanqing Dong, Peijie Dong, Shihan Dou, Sinan Du, Haodong Duan, Caihua Fan, Ben Gao, Changjiang Gao, Jianfei Gao, Songyang Gao, Yang Gao, Zhangwei Gao, Jiaye Ge, Qiming Ge, Lixin Gu, Yuzhe Gu, Aijia Guo, Qipeng Guo, Xu Guo, Conghui He, Junjun He, Yili Hong, Siyuan Hou, Caiyu Hu, Hanglei Hu, Jucheng Hu, Ming Hu, Zhouqi Hua, Haian Huang, Junhao Huang, Xu Huang, Zixian Huang, Zhe Jiang, Lingkai Kong, Linyang Li, Peiji Li, Pengze Li, Shuaibin Li, Tianbin Li, Wei Li, Yuqiang Li, Dahua Lin, Junyao Lin, Tianyi Lin, Zhishan Lin, Hongwei Liu, Jiangning Liu, Jiyao Liu, Junnan Liu, Kai Liu, Kaiwen Liu, Kuikun Liu, Shichun Liu, Shudong Liu, Wei Liu, Xinyao Liu, Yuhong Liu, Zhan Liu, Yinquan Lu, Haijun Lv, Hongxia Lv, Huijie Lv, Qitan Lv, Ying Lv, Chengqi Lyu, Chenglong Ma, Jianpeng Ma, Ren Ma, Runmin Ma, Runyuan Ma, Xinzhu Ma, Yichuan Ma, Zihan Ma, Sixuan Mi, Junzhi Ning, Wenchang Ning, Xinle Pang, Jiahui Peng, Runyu Peng, Yu Qiao:
Intern-S1: A Scientific Multimodal Foundation Model. CoRR abs/2508.15763 (2025)
- 2024
[i24]Junjie Ye, Guanyu Li, Songyang Gao, Caishuang Huang, Yilong Wu, Sixian Li, Xiaoran Fan, Shihan Dou, Qi Zhang, Tao Gui, Xuanjing Huang:
ToolEyes: Fine-Grained Evaluation for Tool Learning Capabilities of Large Language Models in Real-world Scenarios. CoRR abs/2401.00741 (2024)
[i23]Binghai Wang, Rui Zheng, Lu Chen, Yan Liu, Shihan Dou, Caishuang Huang, Wei Shen, Senjie Jin, Enyu Zhou, Chenyu Shi, Songyang Gao, Nuo Xu, Yuhao Zhou, Xiaoran Fan, Zhiheng Xi, Jun Zhao, Xiao Wang, Tao Ji, Hang Yan, Lixing Shen, Zhan Chen, Tao Gui, Qi Zhang, Xipeng Qiu, Xuanjing Huang, Zuxuan Wu, Yu-Gang Jiang:
Secrets of RLHF in Large Language Models Part II: Reward Modeling. CoRR abs/2401.06080 (2024)
[i22]Junjie Ye, Yilong Wu, Songyang Gao, Caishuang Huang, Sixian Li, Guanyu Li, Xiaoran Fan, Qi Zhang, Tao Gui, Xuanjing Huang:
RoTBench: A Multi-Level Benchmark for Evaluating the Robustness of Large Language Models in Tool Learning. CoRR abs/2401.08326 (2024)
[i21]Songyang Gao, Qiming Ge, Wei Shen, Shihan Dou, Junjie Ye, Xiao Wang, Rui Zheng, Yicheng Zou, Zhi Chen, Hang Yan, Qi Zhang, Dahua Lin:
Linear Alignment: A Closed-form Solution for Aligning Human Preferences without Tuning and Feedback. CoRR abs/2401.11458 (2024)
[i20]Chenyu Shi, Xiao Wang, Qiming Ge, Songyang Gao, Xianjun Yang, Tao Gui, Qi Zhang, Xuanjing Huang, Xun Zhao, Dahua Lin:
Navigating the OverKill in Large Language Models. CoRR abs/2401.17633 (2024)
[i19]Junjie Ye, Sixian Li, Guanyu Li, Caishuang Huang, Songyang Gao, Yilong Wu, Qi Zhang, Tao Gui, Xuanjing Huang:
ToolSword: Unveiling Safety Issues of Large Language Models in Tool Learning Across Three Stages. CoRR abs/2402.10753 (2024)
[i18]Weikang Zhou, Xiao Wang, Limao Xiong, Han Xia, Yingshuang Gu, Mingxu Chai, Fukang Zhu, Caishuang Huang, Shihan Dou, Zhiheng Xi, Rui Zheng, Songyang Gao, Yicheng Zou, Hang Yan, Yifan Le, Ruohui Wang, Lijun Li, Jing Shao, Tao Gui, Qi Zhang, Xuanjing Huang:
EasyJailbreak: A Unified Framework for Jailbreaking Large Language Models. CoRR abs/2403.12171 (2024)
[i17]Chen Yang, Junzhuo Li, Xinyao Niu, Xinrun Du, Songyang Gao, Haoran Zhang, Zhaoliang Chen, Xingwei Qu, Ruibin Yuan, Yizhi Li, Jiaheng Liu, Stephen W. Huang, Shawn Yue, Wenhu Chen, Jie Fu, Ge Zhang:
The Fine Line: Navigating Large Language Model Pretraining with Down-streaming Capability Analysis. CoRR abs/2404.01204 (2024)
[i16]Xinrun Du, Zhouliang Yu, Songyang Gao, Ding Pan, Yuyang Cheng, Ziyang Ma, Ruibin Yuan, Xingwei Qu, Jiaheng Liu, Tianyu Zheng, Xinchen Luo, Guorui Zhou, Binhang Yuan, Wenhu Chen, Jie Fu, Ge Zhang:
Chinese Tiny LLM: Pretraining a Chinese-Centric Large Language Model. CoRR abs/2404.04167 (2024)
[i15]Zhiheng Xi, Yiwen Ding, Wenxiang Chen, Boyang Hong, Honglin Guo, Junzhe Wang, Dingwen Yang, Chenyang Liao, Xin Guo, Wei He, Songyang Gao, Lu Chen, Rui Zheng, Yicheng Zou, Tao Gui, Qi Zhang, Xipeng Qiu, Xuanjing Huang, Zuxuan Wu, Yu-Gang Jiang:
AgentGym: Evolving Large Language Model-based Agents across Diverse Environments. CoRR abs/2406.04151 (2024)
[i14]Han Xia, Songyang Gao, Qiming Ge, Zhiheng Xi, Qi Zhang, Xuanjing Huang:
Inverse-Q*: Token Level Reinforcement Learning for Aligning Large Language Models Without Preference Data. CoRR abs/2408.14874 (2024)
[i13]Junnan Liu, Hongwei Liu, Linchen Xiao, Ziyi Wang, Kuikun Liu, Songyang Gao, Wenwei Zhang, Songyang Zhang, Kai Chen:
Are Your LLMs Capable of Stable Reasoning? CoRR abs/2412.13147 (2024)
- 2023
[i12]Songyang Gao, Shihan Dou, Junjie Shan, Qi Zhang, Xuanjing Huang:
CausalAPM: Generalizable Literal Disentanglement for NLU Debiasing. CoRR abs/2305.02865 (2023)
[i11]Xiao Wang, Weikang Zhou, Qi Zhang, Jie Zhou, Songyang Gao, Junzhe Wang, Menghan Zhang, Xiang Gao, Yunwen Chen, Tao Gui:
Farewell to Aimless Large-scale Pretraining: Influential Subset Selection for Language Model. CoRR abs/2305.12816 (2023)
[i10]Zhiheng Xi, Senjie Jin, Yuhao Zhou, Rui Zheng, Songyang Gao, Tao Gui, Qi Zhang, Xuanjing Huang:
Self-Polish: Enhance Reasoning in Large Language Models via Problem Refinement. CoRR abs/2305.14497 (2023)
[i9]Songyang Gao, Shihan Dou, Yan Liu, Xiao Wang, Qi Zhang, Zhongyu Wei, Jin Ma, Ying Shan:
DSRM: Boost Textual Adversarial Training with Distribution Shift Risk Minimization. CoRR abs/2306.15164 (2023)
[i8]Songyang Gao, Shihan Dou, Qi Zhang, Xuanjing Huang, Jin Ma, Ying Shan:
On the Universal Adversarial Perturbations for Efficient Data-free Adversarial Detection. CoRR abs/2306.15705 (2023)
[i7]Rui Zheng, Shihan Dou, Songyang Gao, Yuan Hua, Wei Shen, Binghai Wang, Yan Liu, Senjie Jin, Qin Liu, Yuhao Zhou, Limao Xiong, Lu Chen, Zhiheng Xi, Nuo Xu, Wenbin Lai, Minghao Zhu, Cheng Chang, Zhangyue Yin, Rongxiang Weng, Wensen Cheng, Haoran Huang, Tianxiang Sun, Hang Yan, Tao Gui, Qi Zhang, Xipeng Qiu, Xuanjing Huang:
Secrets of RLHF in Large Language Models Part I: PPO. CoRR abs/2307.04964 (2023)
[i6]Sizhou Chen, Songyang Gao, Sen Fang:
Echotune: A Modular Extractor Leveraging the Variable-Length Nature of Speech in ASR Tasks. CoRR abs/2309.07765 (2023)
[i5]Xiao Wang, Yuansen Zhang, Tianze Chen, Songyang Gao, Senjie Jin, Xianjun Yang, Zhiheng Xi, Rui Zheng, Yicheng Zou, Tao Gui, Qi Zhang, Xuanjing Huang:
TRACE: A Comprehensive Benchmark for Continual Learning in Large Language Models. CoRR abs/2310.06762 (2023)
[i4]Enyu Zhou, Rui Zheng, Zhiheng Xi, Songyang Gao, Xiaoran Fan, Zichu Fei, Jingting Ye, Tao Gui, Qi Zhang, Xuanjing Huang:
RealBehavior: A Framework for Faithfully Characterizing Foundation Models' Human-like Behavior Mechanisms. CoRR abs/2310.11227 (2023)
[i3]Shihan Dou, Enyu Zhou, Yan Liu, Songyang Gao, Jun Zhao, Wei Shen, Yuhao Zhou, Zhiheng Xi, Xiao Wang, Xiaoran Fan, Shiliang Pu, Jiang Zhu, Rui Zheng, Tao Gui, Qi Zhang, Xuanjing Huang:
LoRAMoE: Revolutionizing Mixture of Experts for Maintaining World Knowledge in Language Model Alignment. CoRR abs/2312.09979 (2023)
- 2022
[i2]Shihan Dou, Rui Zheng, Ting Wu, Songyang Gao, Qi Zhang, Yueming Wu, Xuanjing Huang:
Decorrelate Irrelevant, Purify Relevant: Overcome Textual Spurious Correlations from a Feature Perspective. CoRR abs/2202.08048 (2022)
[i1]Songyang Gao, Shihan Dou, Qi Zhang, Xuanjing Huang:
Kernel-Whitening: Overcome Dataset Bias with Isotropic Sentence Embedding. CoRR abs/2210.07547 (2022)
last updated on 2025-11-04 22:28 CET by the dblp team
all metadata released as open data under CC0 1.0 license







