


Juntao Dai
2020 – today
- 2026
[j2]Jiaming Ji, Tianyi Qiu, Boyuan Chen, Jiayi Zhou, Borong Zhang, Donghai Hong, Hantao Lou, Kaile Wang, Yawen Duan, Zhonghao He, Lukas Vierling, Zhaowei Zhang, Fanzhi Zeng, Juntao Dai, Xuehai Pan, Hua Xu, Aidan O'Gara, Kwan Ng, Brian Tse, Jie Fu, Stephen McAleer, Yanfeng Wang, Mingchuan Yang, Yunhuai Liu, Yizhou Wang, Song-Chun Zhu, Yike Guo, Yaodong Yang, Wen Gao:
AI Alignment: A Contemporary Survey. ACM Comput. Surv. 58(5): 132:1-132:38 (2026)
- 2025
[j1]Bei Sun, Zhixuan Peng, Juntao Dai, Yonggang Li:
A control-oriented operation mode recognizing method using fuzzy evaluation and attention LSTM networks. Appl. Soft Comput. 180: 113326 (2025)
[c6]Juntao Dai, Taiye Chen, Yaodong Yang, Qian Zheng, Gang Pan:
Mitigating Reward Over-Optimization in RLHF via Behavior-Supported Regularization. ICLR 2025
[i27]Pengcheng Wen, Jiaming Ji, Chi-Min Chan, Juntao Dai, Donghai Hong, Yaodong Yang, Sirui Han, Yike Guo:
ThinkPatterns-21k: A Systematic Study on the Impact of Thinking Patterns in LLMs. CoRR abs/2503.12918 (2025)
[i26]Jiaming Ji, Xinyu Chen, Rui Pan, Han Zhu, Conghui Zhang, Jiahao Li, Donghai Hong, Boyuan Chen, Jiayi Zhou, Kaile Wang, Juntao Dai, Chi-Min Chan, Sirui Han, Yike Guo, Yaodong Yang:
Safe RLHF-V: Safe Reinforcement Learning from Human Feedback in Multimodal Large Language Models. CoRR abs/2503.17682 (2025)
[i25]Juntao Dai, Taiye Chen, Yaodong Yang, Qian Zheng, Gang Pan:
Mitigating Reward Over-Optimization in RLHF via Behavior-Supported Regularization. CoRR abs/2503.18130 (2025)
[i24]Chuxue Cao, Zhenghao Zhu, Junqi Zhu, Guoying Lu, Siyu Peng, Juntao Dai, Weijie Shi, Sirui Han, Yike Guo:
Measuring Hong Kong Massive Multi-Task Language Understanding. CoRR abs/2505.02177 (2025)
[i23]Jiaming Ji, Wenqi Chen, Kaile Wang, Donghai Hong, Sitong Fang, Boyuan Chen, Jiayi Zhou, Juntao Dai, Sirui Han, Yike Guo, Yaodong Yang:
Mitigating Deceptive Alignment via Self-Monitoring. CoRR abs/2505.18807 (2025)
[i22]Jiaming Ji, Sitong Fang, Wenjing Cao, Jiahao Li, Xuyao Wang, Juntao Dai, Chi-Min Chan, Sirui Han, Yike Guo, Yaodong Yang:
The Mirage of Multimodality: Where Truth is Tested and Honesty Unravels. CoRR abs/2505.20214 (2025)
[i21]Boyuan Chen, Donghai Hong, Jiaming Ji, Jiacheng Zheng, Bowen Dong, Jiayi Zhou, Kaile Wang, Juntao Dai, Xuyao Wang, Wenqi Chen, Qirui Zheng, Wenxin Li, Sirui Han, Yike Guo, Yaodong Yang:
InterMT: Multi-Turn Interleaved Preference Alignment with Human Feedback. CoRR abs/2505.23950 (2025)
[i20]Chuxue Cao, Han Zhu, Jiaming Ji, Qichao Sun, Zhenghao Zhu, Yinyu Wu, Juntao Dai, Yaodong Yang, Sirui Han, Yike Guo:
SafeLawBench: Towards Safe Alignment of Large Language Models. CoRR abs/2506.06636 (2025)
[i19]Guoxi Zhang, Jiawei Chen, Tianzhuo Yang, Jiaming Ji, Yaodong Yang, Juntao Dai:
A Game-Theoretic Negotiation Framework for Cross-Cultural Consensus in LLMs. CoRR abs/2506.13245 (2025)
[i18]Chuxue Cao, Mengze Li, Juntao Dai, Jinluan Yang, Zijian Zhao, Shengyu Zhang, Weijie Shi, Chengzhong Liu, Sirui Han, Yike Guo:
Towards Advanced Mathematical Reasoning for LLMs via First-Order Logic Theorem Proving. CoRR abs/2506.17104 (2025)
[i17]Yoshua Bengio, Tegan Maharaj, Luke Ong, Stuart Russell, Dawn Song, Max Tegmark, Lan Xue, Ya-Qin Zhang, Stephen Casper, Wan Sie Lee, Sören Mindermann, Vanessa Wilfred, Vidhisha Balachandran, Fazl Barez, Michael Belinsky, Imane Bello, Malo Bourgon, Mark Brakel, Siméon Campos, Duncan Cass-Beggs, Jiahao Chen, Rumman Chowdhury, Kuan Chua Seah, Jeff Clune, Juntao Dai, Agnès Delaborde, Nouha Dziri, Francisco Eiras, Joshua Engels, Jinyu Fan, Adam Gleave, Noah Goodman, Fynn Heide, Johannes Heidecke, Dan Hendrycks, Cyrus Hodes, Bryan Low Kian Hsiang, Minlie Huang, Sami Jawhar, Wang Jingyu, Adam Tauman Kalai, Meindert Kamphuis, Mohan S. Kankanhalli, Subhash Kantamneni, Mathias Bonde Kirk, Thomas Kwa, Jeffrey Ladish, Kwok-Yan Lam, Wan Lee Sie, Taewhi Lee, Xiaojian Li, Jiajun Liu, Chaochao Lu, Yifan Mai, Richard Mallah, Julian Michael, Nick Moës, Simon Möller, Kihyuk Nam, Kwan Yee Ng, Mark Nitzberg, Besmira Nushi, Seán Ó hÉigeartaigh, Alejandro Ortega, Pierre Peigné, James Petrie, Benjamin Prud'homme, Reihaneh Rabbany, Nayat Sánchez-Pi, Sarah Schwettmann, Buck Shlegeris, Saad Siddiqui, Aradhana Sinha, Martín Soto, Cheston Tan, Dong Ting, William-Chandra Tjhi, Robert Trager, Brian Tse, Anthony Tung K. H., John Willes, Denise Wong, Wei Xu, Rongwu Xu, Yi Zeng, HongJiang Zhang, Djordje Zikelic:
The Singapore Consensus on Global AI Safety Research Priorities. CoRR abs/2506.20702 (2025)
[i16]Han Zhu, Juntao Dai, Jiaming Ji, Haoran Li, Chengkun Cai, Pengcheng Wen, Chi-Min Chan, Boyuan Chen, Yaodong Yang, Sirui Han, Yike Guo:
SafeMT: Multi-turn Safety for Multimodal Language Models. CoRR abs/2510.12133 (2025)
[i15]Yakun Cui, Fushuo Huo, Weijie Shi, Juntao Dai, Hang Du, Zhenghao Zhu, Sirui Han, Yike Guo:
Perception, Understanding and Reasoning, A Multimodal Benchmark for Video Fake News Detection. CoRR abs/2510.24816 (2025)
[i14]Ruiyang Zhang, Jiahao Luo, Xiaoru Feng, Qiufan Pang, Yaodong Yang, Juntao Dai:
SafeEditor: Unified MLLM for Efficient Post-hoc T2I Safety Editing. CoRR abs/2510.24820 (2025)
[i13]Boyuan Chen, Sitong Fang, Jiaming Ji, Yanxu Zhu, Pengcheng Wen, Jinzhou Wu, Yingshui Tan, Boren Zheng, Mengying Yuan, Wenqi Chen, Donghai Hong, Alex Qiu, Xin Chen, Jiayi Zhou, Kaile Wang, Juntao Dai, Borong Zhang, Tianzhuo Yang, Saad Siddiqui, Isabella Duan, Yawen Duan, Brian Tse, Jen-Tse Huang, Kun Wang, Baihui Zheng, Jiaheng Liu, Jian Yang, Yiming Li, Wenting Chen, Dongrui Liu, Lukas Vierling, Zhiheng Xi, Haobo Fu, Wenxuan Wang, Jitao Sang, Zhengyan Shi, Chi-Min Chan, Eugenie Shi, Simin Li, Juncheng Li, Jian Yang, Wei Ji, Dong Li, Jinglin Yang, Jun Song, Yinpeng Dong, Jie Fu, Bo Zheng, Min Yang, Yike Guo, Philip Torr, Robert Trager, Yi Zeng, Zhongyuan Wang, Yaodong Yang, Tiejun Huang, Ya-Qin Zhang, Hongjiang Zhang, Andrew Yao:
AI Deception: Risks, Dynamics, and Controls. CoRR abs/2511.22619 (2025)
[i12]Dadi Guo, Qingyu Liu, Dongrui Liu, Qihan Ren, Shuai Shao, Tianyi Qiu, Haoran Li, Yi R. Fung, Zhongjie Ba, Juntao Dai, Jiaming Ji, Zhikai Chen, Jialing Tao, Yaodong Yang, Jing Shao, Xia Hu:
Are Your Agents Upward Deceivers? CoRR abs/2512.04864 (2025)
[i11]Borong Zhang, Jiahao Li, Jiachen Shen, Yishuai Cai, Yuhao Zhang, Yuanpei Chen, Juntao Dai, Jiaming Ji, Yaodong Yang:
VLA-Arena: An Open-Source Framework for Benchmarking Vision-Language-Action Models. CoRR abs/2512.22539 (2025)
- 2024
[c5]Juntao Dai, Yaodong Yang, Qian Zheng, Gang Pan:
Safe Reinforcement Learning using Finite-Horizon Gradient-based Estimation. ICML 2024
[c4]Juntao Dai, Tianle Chen, Xuyao Wang, Ziran Yang, Taiye Chen, Jiaming Ji, Yaodong Yang:
SafeSora: Towards Safety Alignment of Text2Video Generation via a Human Preference Dataset. NeurIPS 2024
[c3]Jiaming Ji, Boyuan Chen, Hantao Lou, Donghai Hong, Borong Zhang, Xuehai Pan, Tianyi Qiu, Juntao Dai, Yaodong Yang:
Aligner: Efficient Alignment by Learning to Correct. NeurIPS 2024
[i10]Jiaming Ji, Boyuan Chen, Hantao Lou, Donghai Hong, Borong Zhang, Xuehai Pan, Juntao Dai, Yaodong Yang:
Aligner: Achieving Efficient Alignment through Weak-to-Strong Correction. CoRR abs/2402.02416 (2024)
[i9]Jiayi Zhou, Jiaming Ji, Juntao Dai, Yaodong Yang:
Sequence to Sequence Reward Modeling: Improving RLHF by Language Feedback. CoRR abs/2409.00162 (2024)
[i8]Juntao Dai, Yaodong Yang, Qian Zheng, Gang Pan:
Safe Reinforcement Learning using Finite-Horizon Gradient-based Estimation. CoRR abs/2412.11138 (2024)
- 2023
[c2]Juntao Dai, Jiaming Ji, Long Yang, Qian Zheng, Gang Pan:
Augmented Proximal Policy Optimization for Safe Reinforcement Learning. AAAI 2023: 7288-7295
[i7]Jiaming Ji, Jiayi Zhou, Borong Zhang, Juntao Dai, Xuehai Pan, Ruiyang Sun, Weidong Huang, Yiran Geng, Mickel Liu, Yaodong Yang:
OmniSafe: An Infrastructure for Accelerating Safe Reinforcement Learning Research. CoRR abs/2305.09304 (2023)
[i6]Jiaming Ji, Mickel Liu, Juntao Dai, Xuehai Pan, Chi Zhang, Ce Bian, Boyuan Zhang, Ruiyang Sun, Yizhou Wang, Yaodong Yang:
BeaverTails: Towards Improved Safety Alignment of LLM via a Human-Preference Dataset. CoRR abs/2307.04657 (2023)
[i5]Aiyuan Yang, Bin Xiao, Bingning Wang, Borong Zhang, Ce Bian, Chao Yin, Chenxu Lv, Da Pan, Dian Wang, Dong Yan, Fan Yang, Fei Deng, Feng Wang, Feng Liu, Guangwei Ai, Guosheng Dong, Haizhou Zhao, Hang Xu, Haoze Sun, Hongda Zhang, Hui Liu, Jiaming Ji, Jian Xie, Juntao Dai, Kun Fang, Lei Su, Liang Song, Lifeng Liu, Liyun Ru, Luyao Ma, Mang Wang, Mickel Liu, MingAn Lin, Nuolan Nie, Peidong Guo, Ruiyang Sun, Tao Zhang, Tianpeng Li, Tianyu Li, Wei Cheng, Weipeng Chen, Xiangrong Zeng, Xiaochuan Wang, Xiaoxi Chen, Xin Men, Xin Yu, Xuehai Pan, Yanjun Shen, Yiding Wang, Yiyu Li, Youxin Jiang, Yuchen Gao, Yupeng Zhang, Zenan Zhou, Zhiying Wu:
Baichuan 2: Open Large-scale Language Models. CoRR abs/2309.10305 (2023)
[i4]Jiaming Ji, Borong Zhang, Jiayi Zhou, Xuehai Pan, Weidong Huang, Ruiyang Sun, Yiran Geng, Yifan Zhong, Juntao Dai, Yaodong Yang:
Safety-Gymnasium: A Unified Safe Reinforcement Learning Benchmark. CoRR abs/2310.12567 (2023)
[i3]Jiaming Ji, Tianyi Qiu, Boyuan Chen, Borong Zhang, Hantao Lou, Kaile Wang, Yawen Duan, Zhonghao He, Jiayi Zhou, Zhaowei Zhang, Fanzhi Zeng, Kwan Yee Ng, Juntao Dai, Xuehai Pan, Aidan O'Gara, Yingshan Lei, Hua Xu, Brian Tse, Jie Fu, Stephen McAleer, Yaodong Yang, Yizhou Wang, Song-Chun Zhu, Yike Guo, Wen Gao:
AI Alignment: A Comprehensive Survey. CoRR abs/2310.19852 (2023)
- 2022
[c1]Long Yang, Jiaming Ji, Juntao Dai, Linrui Zhang, Binbin Zhou, Pengfei Li, Yaodong Yang, Gang Pan:
Constrained Update Projection Approach to Safe Policy Optimization. NeurIPS 2022
[i2]Long Yang, Jiaming Ji, Juntao Dai, Yu Zhang, Pengfei Li, Gang Pan:
CUP: A Conservative Update Policy Algorithm for Safe Reinforcement Learning. CoRR abs/2202.07565 (2022)
[i1]Long Yang, Jiaming Ji, Juntao Dai, Linrui Zhang, Binbin Zhou, Pengfei Li, Yaodong Yang, Gang Pan:
Constrained Update Projection Approach to Safe Policy Optimization. CoRR abs/2209.07089 (2022)
last updated on 2026-01-28 04:57 CET by the dblp team
all metadata released as open data under CC0 1.0 license







