Yang Yu 0001
Person information
- affiliation (PhD 2011): Nanjing University, State Key Laboratory for Novel Software Technology, China
- affiliation: Pazhou Lab, Guangzhou, China
Other persons with the same name
- Yang Yu — disambiguation page
- Yang Yu 0002
— University of Technology Sydney, Faculty of Engineering and Information Technology, NSW, Australia (and 1 more)
- Yang Yu 0003
— North China Electric Power University, State Key Laboratory of Alternate Electrical Power System with Renewable Energy Sources, Baoding, China
- Yang Yu 0004
— Rochester Institute of Technology, Saunders College of Business, Rochester, NY, USA (and 1 more)
- Yang Yu 0005
— Jiangsu University of Technology, School of Electric Information Engineering, Changzhou, China
- Yang Yu 0006
— National University of Defense Technology, College of Electrical Science and Engineering, National Key Laboratory of Science and Technology on ATR, Changsha, China
- Yang Yu 0007
— National University of Defense Technology, College of Computer, Changsha, China
- Yang Yu 0008
— Tsinghua University, Department of Computer Science and Technology, Beijing, China (and 1 more)
- Yang Yu 0009 — Motorola Labs, Schaumburg, IL, USA (and 1 more)
- Yang Yu 0010
— Rutgers University, Department of Computer Science, Piscataway, NJ, USA
- Yang Yu 0011
— Tsinghua University, Institute for Interdisciplinary Information Sciences, Beijing, China (and 1 more)
- Yang Yu 0012 — University of Sheffield, UK
- Yang Yu 0013
— Nanjing University of Posts and Telecommunications, College of Automation / College of Artificial Intelligence, China (and 1 more)
- Yang Yu 0014
— National University of Defense Technology, College of Intelligence Science and Technology, Changsha, China (and 1 more)
- Yang Yu 0015
— Harbin Institute of Technology, Department of Automatic Test and Control, Harbin, China
- Yang Yu 0016
— Northeastern University, College of Information Science and Engineering, Shenyang, China
- Yang Yu 0017
— Changchun University of Technology, School of Mechatronic Engineering, Changchun, China
- Yang Yu 0018
— Harbin Jiancheng Group Company, Harbin, China
- Yang Yu 0019
— Shanghai Jiao Tong University, School of Mechanical Engineering, State Key Laboratory of Mechanical System and Vibration, Shanghai, China
- Yang Yu 0020
— Tongji University, State Key Laboratory of Marine Geology, Shanghai, China
- Yang Yu 0021
— University of Technology Sydney, School of Civil and Environmental Engineering, Sydney, Australia
- Yang Yu 0022
— Hebei University of Technology, School of Computer Science and Engineering, Tianjin, China
- Yang Yu 0023
— Wuhan University, School of Urban Design, Department of Urban Planning, Wuhan, China
- Yang Yu 0024
— Tongji University, Department of Control Science and Engineering, Shanghai, China
- Yang Yu 0025 — Rutgers University, Department of Mathematics, Piscataway, NJ, USA
- Yang Yu 0026
— China Agricultural University, College of Engineering, Beijing, China
- Yang Yu 0027
— Sun Yat-sen University, School of Data and Computer Science, Guangzhou, China
- Yang Yu 0028
— Hong Kong University of Science and Technology, Department of Electronic and Computer Engineering, Robotics and Multi-Perception Laboratory, Hong Kong
- Yang Yu 0029 — Google, Mountain View, CA, USA (and 3 more)
- Yang Yu 0030
— Tianjin University, College of Intelligence and Computing, China
- Yang Yu 0031
— Southwest Forestry University, School of Machinery and Transportation, Kunming, China (and 1 more)
- Yang Yu 0032 — National University of Defense Technology, Center of Material Science, College of Liberal Arts and Sciences, College of Advanced Interdisciplinary Studies, College of Sciences, Changsha, China
- Yang Yu 0033
— Guizhou Medical University, School of Biology and Engineering, Guiyang, China (and 1 more)
- Yang Yu 0034 — Nanjing University of Posts and Telecommunications, College of Communication & Information Engineering, China
- Yang Yu 0035
— Royal Institute of Technology, Stockholm, Sweden
- Yang Yu 0036
— University of Duisburg-Essen, Germany
- Yang Yu 0037
— Auckland University of Technology, Institute of Biomedical Technologies, New Zealand
- Yang Yu 0038
— University of Science and Technology of China, State Key Laboratory of Cognitive Intelligence, Hefei, China
- Yang Yu 0039
— Beijing Jiaotong University, Institute of Information Science, Beijing Key Laboratory of Advanced Information Science and Network Technology, Beijing, China
- Yang Yu 0040
— Northwestern Polytechnical University, School of Marine Science and Technology, Xi'an, China
- Yang Yu 0041 — Shanghai Jiao Tong University, Department of Electronic Engineering, Network Coding and Transmission Laboratory, Shanghai, China (and 1 more)
- Yang Yu 0042
— Kookmin University, Department of Computer Science, Seoul, South Korea
- Yang Yu 0043
— Qingdao University, School of Automation, Shandong Key Laboratory of Industrial Control Technology, Qingdao, China
- Yang Yu 0044
— Zhengzhou University of Light Industry, Software Engineering College, Zhengzhou, China
- Yang Yu 0045
— Chinese Academy of Sciences, Shanghai Institute of Technical Physics, Key Laboratory of Infrared System Detecting and Imaging Technology, Shanghai, China
- Yang Yu 0046
— Japan Advanced Institute of Science and Technology (JAIST), School of Knowledge Science, Nomi, Japan
- Yang Yu 0047
— Hubei Three Gorges Polytechnic, Electronic Information School, Yichang, China
- Yang Yu 0048
— Northwestern Polytechnical University, School of Electronics and Information, Xi'an, China
- Yang Yu 0049
— Lanzhou Jiaotong University, School of Traffic and Transportation, Lanzhou, China
- Yang Yu 0050
— Shanghai Jiao Tong University, Antai College of Economics and Management, Shanghai, China
- Yang Yu 0051
— Tianjin University, Tianjin Key Laboratory of Port and Ocean Engineering, State Key Laboratory of Hydraulic Engineering Simulation and Safety, Tianjin, China
- Yang Yu 0052 — Purdue University, Department of Statistics, West Lafayette, IN, USA
- Yang Yu 0053 — University of North Carolina at Chapel Hill, Department of Statistics and Operations Research, Chapel Hill, NC, USA
- Yang Yu 0054
— Halliburton Ltd, Singapore (and 1 more)
- Yang Yu 0055
— Taylor Hobson Ltd. AMETEK Ultra Precision Technologies, Leicester, UK (and 1 more)
- Yang Yu 0056
— University of Chinese Academy of Sciences, School of Artificial Intelligence, Beijing, China (and 1 more)
- Yang Yu 0057
— Qilu University of Technology, Shandong Computer Science Center, Shandong Provincial Key Laboratory of Computer Networks, Jinan, China
- Yang Yu 0058
— Beijing Jiaotong University, Institute of Data Science and Intelligent Decision Support, Beijing, China (and 2 more)
- Yang Yu 0059
— Liaoning Institute of Science and Engineering, School of Management Engineering, Jinzhou, China
- Yang Yu 0060
— Wuhan University of Technology, School of Information Engineering, Wuhan, China
- Yang Yu 0061
— Nanjing University of Posts and Telecommunications, Institute of Signal Processing Transmission, Nanjing, China
- Yang Yu 0062
— University of California Davis, Department of Land Air and Water Resources, Davis, CA, USA (and 1 more)
- Yang Yu 0063
— Pennsylvania State University, Department of Architectural Engineering, University Park, PA, USA
- Yang Yu 0064
— Beihang University, School of Aeronautic Science and Engineering, Beijing, China
- Yang Yu 0065
— Shanghai Conservatory of Music, Shanghai Key Laboratory for Music Acoustic, Shanghai, China
- Yang Yu 0066
— Semiconductor Manufacturing International Corporation, R&D Department, Shanghai, China
- Yang Yu 0067
— Tianjin University of Science and Technology, College of Artificial Intelligence, Tianjin, China
- Yang Yu 0068
— Chinese Academy of Sciences, National Space Science Center, Key Laboratory of Microwave Remote Sensing, Beijing, China (and 3 more)
- Yang Yu 0069
— Northwestern Polytechnical University, School of Astronautics, National Key Laboratory of Aerospace Flight Dynamics, Xi'an, China
- Yang Yu 0070
— Chinese University of Hong Kong, Department of Computer Science and Engineering, Hong Kong
- Yang Yu 0071
— Weifang People's Hospital, Department of Stomatology, Weifang, China
- Yang Yu 0072
— Shandong University of Technology, School of Transportation and Vehicle Engineering, Zibo, China
- Yang Yu 0073
— Wuhan University, School of Cyber Science and Engineering, Key Laboratory of Aerospace Information Security and Trusted Computing, Wuhan, China
- Yang Yu 0074
— Victoria University of Wellington, School of Marketing and International Business, Wellington, New Zealand
- Yang Yu 0075 — Victoria University of Wellington, School of Engineering and Computer Science, Wellington, New Zealand
- Yang Yu 0076
— Jilin Communications Polytechnic, Department of Physical Education, Changchun, China
- Yang Yu 0077
— Shenyang Aerospace University, School of Automation, Shenyang, China
- Yang Yu 0078
— Northwestern University, Department of Statistics, Evanston, IL, USA
2020 – today
- 2025
- [j45]Cong Guan, Ke Xue, Chunpeng Fan, Feng Chen, Lichao Zhang, Lei Yuan, Chao Qian, Yang Yu:
Open and real-world human-AI coordination by heterogeneous training with communication. Frontiers Comput. Sci. 19(4): 194314 (2025) - 2024
- [j44]Fan-Ming Luo, Tian Xu, Hang Lai, Xiong-Hui Chen, Weinan Zhang, Yang Yu:
A survey on model-based reinforcement learning. Sci. China Inf. Sci. 67(2) (2024) - [j43]Chengxing Jia, Fuxiang Zhang, Tian Xu, Jing-Cheng Pang, Zongzhang Zhang, Yang Yu:
Model gradient: unified model and policy learning in model-based reinforcement learning. Frontiers Comput. Sci. 18(4): 184339 (2024) - [j42]Lei Yuan, Feng Chen, Zongzhang Zhang, Yang Yu:
Communication-robust multi-agent learning by adaptable auxiliary multi-agent adversary generation. Frontiers Comput. Sci. 18(6) (2024) - [j41]Ruo-Ze Liu, Yanjie Shen, Yang Yu, Tong Lu:
Revisiting of AlphaStar. IEEE Trans. Games 16(2): 317-330 (2024) - [j40]Zijian Zhang, Xin Lu, Meng Li, Jincheng An, Yang Yu, Hao Yin, Liehuang Zhu, Yong Liu, Jiamou Liu, Bakh Khoussainov:
A Blockchain-Based Privacy-Preserving Scheme for Sealed-Bid Auction. IEEE Trans. Dependable Secur. Comput. 21(5): 4668-4683 (2024) - [j39]Ming Yang, Yiming Wang, Yang Yu, Mingliang Zhou, Leong Hou U:
MixLight: Mixed-Agent Cooperative Reinforcement Learning for Traffic Light Control. IEEE Trans. Ind. Informatics 20(2): 2653-2661 (2024) - [j38]Zhengbang Zhu, Rongjun Qin, Junjie Huang, Xinyi Dai, Yang Yu, Yong Yu, Weinan Zhang:
Understanding or Manipulation: Rethinking Online Performance Gains of Modern Recommender Systems. ACM Trans. Inf. Syst. 42(4): 90:1-90:32 (2024) - [c140]Chao Chen, Jiacheng Xu, Weijian Liao, Hao Ding, Zongzhang Zhang, Yang Yu, Rui Zhao:
Focus-Then-Decide: Segmentation-Assisted Reinforcement Learning. AAAI 2024: 11240-11248 - [c139]Chenxiao Gao, Chenyang Wu, Mingjun Cao, Rui Kong, Zongzhang Zhang, Yang Yu:
ACT: Empowering Decision Transformer with Dynamic Programming via Advantage Conditioning. AAAI 2024: 12127-12135 - [c138]Haoxin Lin, Hongqiu Wu, Jiaji Zhang, Yihao Sun, Junyin Ye, Yang Yu:
Episodic Return Decomposition by Difference of Implicitly Assigned Sub-trajectory Reward. AAAI 2024: 13808-13816 - [c137]Renzhe Zhou, Chenxiao Gao, Zongzhang Zhang, Yang Yu:
Generalizable Task Representation Learning for Offline Meta-Reinforcement Learning with Data Limitations. AAAI 2024: 17132-17140 - [c136]Chao Chen, Dawei Wang, Feng Mao, Jiacheng Xu, Zongzhang Zhang, Yang Yu:
Deep Anomaly Detection via Active Anomaly Search. AAMAS 2024: 308-316 - [c135]Ruifeng Chen, Xu-Hui Liu, Tian-Shuo Liu, Shengyi Jiang, Feng Xu, Yang Yu:
Foresight Distribution Adjustment for Off-policy Reinforcement Learning. AAMAS 2024: 317-325 - [c134]Cong Guan, Ruiqi Xue, Ziqian Zhang, Lihe Li, Yi-Chen Li, Lei Yuan, Yang Yu:
Cost-aware Offline Safe Meta Reinforcement Learning with Robust In-Distribution Online Task Adaptation. AAMAS 2024: 743-751 - [c133]Chengxing Jia, Fuxiang Zhang, Yi-Chen Li, Chenxiao Gao, Xu-Hui Liu, Lei Yuan, Zongzhang Zhang, Yang Yu:
Disentangling Policy from Offline Task Representation Learning via Adversarial Data Augmentation. AAMAS 2024: 944-953 - [c132]Chengxing Jia, Chenxiao Gao, Hao Yin, Fuxiang Zhang, Xiong-Hui Chen, Tian Xu, Lei Yuan, Zongzhang Zhang, Zhi-Hua Zhou, Yang Yu:
Policy Rehearsing: Training Generalizable Policies for Reinforcement Learning. ICLR 2024 - [c131]Ziniu Li, Tian Xu, Yang Yu:
When is RL better than DPO in RLHF? A Representation and Optimization Perspective. Tiny Papers @ ICLR 2024 - [c130]Fan-Ming Luo, Tian Xu, Xingchen Cao, Yang Yu:
Reward-Consistent Dynamics Models are Strongly Generalizable for Offline Reinforcement Learning. ICLR 2024 - [c129]Jing-Cheng Pang, Pengyuan Wang, Kaiyuan Li, Xiong-Hui Chen, Jiacheng Xu, Zongzhang Zhang, Yang Yu:
Language Model Self-improvement by Reinforcement Learning Contemplation. ICLR 2024 - [c128]Zhilong Zhang, Yihao Sun, Junyin Ye, Tian-Shuo Liu, Jiaji Zhang, Yang Yu:
Flow to Better: Offline Preference-based Reinforcement Learning via Preferred Trajectory Generation. ICLR 2024 - [c127]Ruifeng Chen, Xiong-Hui Chen, Yihao Sun, Siyuan Xiao, Minhui Li, Yang Yu:
Policy-conditioned Environment Models are More Generalizable. ICML 2024 - [c126]Ruifeng Chen, Chengxing Jia, Zefang Huang, Tian-Shuo Liu, Xu-Hui Liu, Yang Yu:
Offline Transition Modeling via Contrastive Energy Learning. ICML 2024 - [c125]Xingchen Cao, Fan-Ming Luo, Junyin Ye, Tian Xu, Zhilong Zhang, Yang Yu:
Limited Preference Aided Imitation Learning from Imperfect Demonstrations. ICML 2024 - [c124]Xiong-Hui Chen, Junyin Ye, Hang Zhao, Yi-Chen Li, XuHui Liu, Haoran Shi, Yu-Yan Xu, Zhihao Ye, Si-Hang Yang, Yang Yu, Anqi Huang, Kai Xu, Zongzhang Zhang:
Deep Demonstration Tracing: Learning Generalizable Imitator Policy for Runtime Imitation from a Single Demonstration. ICML 2024 - [c123]Ziniu Li, Tian Xu, Yushun Zhang, Zhihang Lin, Yang Yu, Ruoyu Sun, Zhi-Quan Luo:
ReMax: A Simple, Effective, and Efficient Reinforcement Learning Method for Aligning Large Language Models. ICML 2024 - [c122]Xu-Hui Liu, Tian-Shuo Liu, Shengyi Jiang, Ruifeng Chen, Zhilong Zhang, Xinwei Chen, Yang Yu:
Energy-Guided Diffusion Sampling for Offline-to-Online Reinforcement Learning. ICML 2024 - [c121]Xinyu Zhang, Wenjie Qiu, Yi-Chen Li, Lei Yuan, Chengxing Jia, Zongzhang Zhang, Yang Yu:
Debiased Offline Representation Learning for Fast Online Adaptation in Non-stationary Dynamics. ICML 2024 - [c120]Lihe Li, Ruotong Chen, Ziqian Zhang, Zhichao Wu, Yi-Chen Li, Cong Guan, Yang Yu, Lei Yuan:
Continual Multi-Objective Reinforcement Learning via Reward Model Rehearsal. IJCAI 2024: 4434-4442 - [c119]Zhi-Hao Tan, Jian-Dong Liu, Xiao-Dong Bi, Peng Tan, Qin-Cheng Zheng, Hai-Tian Liu, Yi Xie, Xiao-Chuan Zou, Yang Yu, Zhi-Hua Zhou:
Beimingwu: A Learnware Dock System. KDD 2024: 5773-5782 - [c118]Ruiqi Xue, Ziqian Zhang, Lihe Li, Feng Chen, Yi-Chen Li, Yang Yu, Lei Yuan:
Dynamics Adaptive Safe Reinforcement Learning with a Misspecified Simulator. ECML/PKDD (7) 2024: 74-91 - [i97]Zhi-Hao Tan, Jian-Dong Liu, Xiao-Dong Bi, Peng Tan, Qin-Cheng Zheng, Hai-Tian Liu, Yi Xie, Xiao-Chuan Zou, Yang Yu, Zhi-Hua Zhou:
Beimingwu: A Learnware Dock System. CoRR abs/2401.14427 (2024) - [i96]Jing-Cheng Pang, Heng-Bo Fan, Pengyuan Wang, Jiahao Xiao, Nan Tang, Si-Hang Yang, Chengxing Jia, Sheng-Jun Huang, Yang Yu:
Empowering Language Models with Active Inquiry for Deeper Understanding. CoRR abs/2402.03719 (2024) - [i95]Xinyu Zhang, Wenjie Qiu, Yi-Chen Li, Lei Yuan, Chengxing Jia, Zongzhang Zhang, Yang Yu:
Debiased Offline Representation Learning for Fast Online Adaptation in Non-stationary Dynamics. CoRR abs/2402.11317 (2024) - [i94]Chengxing Jia, Fuxiang Zhang, Yi-Chen Li, Chenxiao Gao, Xu-Hui Liu, Lei Yuan, Zongzhang Zhang, Yang Yu:
Disentangling Policy from Offline Task Representation Learning via Adversarial Data Augmentation. CoRR abs/2403.07261 (2024) - [i93]Jing-Cheng Pang, Si-Hang Yang, Kaiyuan Li, Jiaji Zhang, Xiong-Hui Chen, Nan Tang, Yang Yu:
Knowledgeable Agents by Offline Reinforcement Learning from Large Language Model Rollouts. CoRR abs/2404.09248 (2024) - [i92]Fan-Ming Luo, Zuolin Tu, Zefang Huang, Yang Yu:
Efficient Recurrent Off-Policy RL Requires a Context-Encoder-Specific Learning Rate. CoRR abs/2405.15384 (2024) - [i91]Haoxin Lin, Yu-Yan Xu, Yihao Sun, Zhilong Zhang, Yi-Chen Li, Chengxing Jia, Junyin Ye, Jiaji Zhang, Yang Yu:
Any-step Dynamics Model Improves Future Predictions for Online and Offline Reinforcement Learning. CoRR abs/2405.17031 (2024) - [i90]Chengxing Jia, Pengyuan Wang, Ziniu Li, Yi-Chen Li, Zhilong Zhang, Nan Tang, Yang Yu:
BWArea Model: Learning World Model, Inverse Dynamics, and Policy for Controllable Language Generation. CoRR abs/2405.17039 (2024) - [i89]Yi-Chen Li, Fuxiang Zhang, Wenjie Qiu, Lei Yuan, Chengxing Jia, Zongzhang Zhang, Yang Yu:
Q-Adapter: Training Your LLM Adapter as a Residual Q-Function. CoRR abs/2407.03856 (2024) - [i88]Fuxiang Zhang, Junyou Li, Yi-Chen Li, Zongzhang Zhang, Yang Yu, Deheng Ye:
Improving Sample Efficiency of Reinforcement Learning with Background Knowledge from Large Language Models. CoRR abs/2407.03964 (2024) - [i87]Chen-Xiao Gao, Shengjun Fang, Chenjun Xiao, Yang Yu, Zongzhang Zhang:
Hindsight Preference Learning for Offline Preference-based Reinforcement Learning. CoRR abs/2407.04451 (2024) - [i86]Xu-Hui Liu, Tian-Shuo Liu, Shengyi Jiang, Ruifeng Chen, Zhilong Zhang, Xinwei Chen, Yang Yu:
Energy-Guided Diffusion Sampling for Offline-to-Online Reinforcement Learning. CoRR abs/2407.12448 (2024) - [i85]Zhilong Zhang, Ruifeng Chen, Junyin Ye, Yihao Sun, Pengyuan Wang, Jingcheng Pang, Kaiyuan Li, Tianshuo Liu, Haoxin Lin, Yang Yu, Zhi-Hua Zhou:
WHALE: Towards Generalizable and Scalable World Models for Embodied Decision-making. CoRR abs/2411.05619 (2024) - [i84]Feng Chen, Fuguang Han, Cong Guan, Lei Yuan, Zhilong Zhang, Yang Yu, Zongzhang Zhang:
Stable Continual Reinforcement Learning via Diffusion-based Trajectory Replay. CoRR abs/2411.10809 (2024) - 2023
- [j37]Hua Yang, Minghao Zhao, Lei Yuan, Yang Yu, Zhenhua Li, Ming Gu:
Memory-efficient Transformer-based network model for Traveling Salesman Problem. Neural Networks 161: 589-597 (2023) - [j36]Xiong-Hui Chen, Fan-Ming Luo, Yang Yu, Qingyang Li, Zhiwei Qin, Wenjie Shang, Jieping Ye:
Offline Model-Based Adaptable Policy Learning for Decision-Making in Out-of-Support Regions. IEEE Trans. Pattern Anal. Mach. Intell. 45(12): 15260-15274 (2023) - [j35]Guangda Huzhang, Zhen-Jia Pang, Yongqing Gao, Yawen Liu, Weijie Shen, Wen-Ji Zhou, Qianying Lin, Qing Da, Anxiang Zeng, Han Yu, Yang Yu, Zhi-Hua Zhou:
AliExpress Learning-to-Rank: Maximizing Online Model Performance Without Going Online. IEEE Trans. Knowl. Data Eng. 35(2): 1214-1226 (2023) - [j34]Han Wang, Yang Yu, Yuan Jiang:
Fully Decentralized Multiagent Communication via Causal Inference. IEEE Trans. Neural Networks Learn. Syst. 34(12): 10193-10202 (2023) - [j33]Hang Zhao, Zherong Pan, Yang Yu, Kai Xu:
Learning Physically Realizable Skills for Online Packing of General 3D Shapes. ACM Trans. Graph. 42(5): 165:1-165:21 (2023) - [c117]Yang Yu, Qi Liu, Likang Wu, Runlong Yu, Sanshi Lei Yu, Zaixi Zhang:
Untargeted Attack against Federated Recommendation Systems via Poisonous Item Embeddings and the Defense. AAAI 2023: 4854-4863 - [c116]Weijian Liao, Zongzhang Zhang, Yang Yu:
Policy-Independent Behavioral Metric-Based Representation for Deep Reinforcement Learning. AAAI 2023: 8746-8754 - [c115]Lei Yuan, Ziqian Zhang, Ke Xue, Hao Yin, Feng Chen, Cong Guan, Lihe Li, Chao Qian, Yang Yu:
Robust Multi-Agent Coordination via Evolutionary Generation of Auxiliary Adversarial Attackers. AAAI 2023: 11753-11762 - [c114]Chao Chen, Dawei Wang, Feng Mao, Zongzhang Zhang, Yang Yu:
Deep Anomaly Detection and Search via Reinforcement Learning (Student Abstract). AAAI 2023: 16180-16181 - [c113]Yi-Chen Li, Wen-Jie Shen, Boyu Zhang, Feng Mao, Zongzhang Zhang, Yang Yu:
Learning Generalizable Batch Active Learning Strategies via Deep Q-networks (Student Abstract). AAAI 2023: 16258-16259 - [c112]Aoran Wang, Hongyang Yang, Feng Mao, Zongzhang Zhang, Yang Yu, Xiaoyang Liu:
Anti-drifting Feature Selection via Deep Reinforcement Learning (Student Abstract). AAAI 2023: 16356-16357 - [c111]Renzhe Zhou, Zongzhang Zhang, Yang Yu:
Model-Based Offline Weighted Policy Optimization (Student Abstract). AAAI 2023: 16392-16393 - [c110]Shaowei Zhang, Jiahan Cao, Lei Yuan, Yang Yu, De-Chuan Zhan:
Self-Motivated Multi-Agent Exploration. AAMAS 2023: 476-484 - [c109]Xu-Hui Liu, Feng Xu, Xinyu Zhang, Tianyuan Liu, Shengyi Jiang, Ruifeng Chen, Zongzhang Zhang, Yang Yu:
How To Guide Your Learner: Imitation Learning with Active Adaptive Expert Involvement. AAMAS 2023: 1276-1284 - [c108]Lei Yuan, Lihe Li, Ziqian Zhang, Feng Chen, Tianyi Zhang, Cong Guan, Yang Yu, Zhi-Hua Zhou:
Learning to Coordinate with Anyone. DAI 2023: 4:1-4:9 - [c107]Haoxin Lin, Yihao Sun, Jiaji Zhang, Yang Yu:
Model-Based Reinforcement Learning with Multi-Step Plan Value Estimation. ECAI 2023: 1481-1488 - [c106]Huakang Lu, Hong Qian, Yupeng Wu, Ziqi Liu, Ya-Lin Zhang, Aimin Zhou, Yang Yu:
Degradation-Resistant Offline Optimization via Accumulative Risk Control. ECAI 2023: 1609-1616 - [c105]Xiong-Hui Chen, Bowei He, Yang Yu, Qingyang Li, Zhiwei Tony Qin, Wenjie Shang, Jieping Ye, Chen Ma:
Sim2Rec: A Simulator-based Decision-making Approach to Optimize Real-World Long-term User Engagement in Sequential Recommender Systems. ICDE 2023: 3389-3402 - [c104]Fuxiang Zhang, Chengxing Jia, Yi-Chen Li, Lei Yuan, Yang Yu, Zongzhang Zhang:
Discovering Generalizable Multi-agent Coordination Skills from Multi-task Offline Data. ICLR 2023 - [c103]Yuhang Ran, Yi-Chen Li, Fuxiang Zhang, Zongzhang Zhang, Yang Yu:
Policy Regularization with Dataset Constraint for Offline Reinforcement Learning. ICML 2023: 28701-28717 - [c102]Yihao Sun, Jiaji Zhang, Chengxing Jia, Haoxin Lin, Junyin Ye, Yang Yu:
Model-Bellman Inconsistency for Model-based Offline Reinforcement Learning. ICML 2023: 33177-33194 - [c101]Jing-Cheng Pang, Si-Hang Yang, Xiong-Hui Chen, Xinyu Yang, Yang Yu, Mas Ma, Ziqi Guo, Howard Yang, Bill Huang:
Object-Oriented Option Framework for Robotics Manipulation in Clutter. IROS 2023: 1230-1237 - [c100]Jiacheng Xu, Chao Chen, Fuxiang Zhang, Lei Yuan, Zongzhang Zhang, Yang Yu:
Internal Logical Induction for Pixel-Symbolic Reinforcement Learning. KDD 2023: 2825-2837 - [c99]Xiong-Hui Chen, Yang Yu, Zhengmao Zhu, Zhihua Yu, Zhenjun Chen, Chenghe Wang, Yinan Wu, Rong-Jun Qin, Hongqiu Wu, Ruijin Ding, Fangsheng Huang:
Adversarial Counterfactual Environment Model Learning. NeurIPS 2023 - [c98]Ziniu Li, Tian Xu, Zeyu Qin, Yang Yu, Zhi-Quan Luo:
Imitation Learning from Imperfection: Theoretical Justifications and Algorithms. NeurIPS 2023 - [c97]Yuren Liu, Biwei Huang, Zhengmao Zhu, Hong-Long Tian, Mingming Gong, Yang Yu, Kun Zhang:
Learning World Models with Identifiable Factorization. NeurIPS 2023 - [c96]Jing-Cheng Pang, Xinyu Yang, Si-Hang Yang, Xiong-Hui Chen, Yang Yu:
Natural Language Instruction-following with Task-related Language Development and Translation. NeurIPS 2023 - [c95]Tian Xu, Ziniu Li, Yang Yu, Zhi-Quan Luo:
Provably Efficient Adversarial Imitation Learning with Unknown Transitions. UAI 2023: 2367-2378 - [c94]Ziqian Zhang, Lei Yuan, Lihe Li, Ke Xue, Chengxing Jia, Cong Guan, Chao Qian, Yang Yu:
Fast Teammate Adaptation in the Presence of Sudden Policy Change. UAI 2023: 2465-2476 - [i83]Shaowei Zhang, Jiahan Cao, Lei Yuan, Yang Yu, De-Chuan Zhan:
Self-Motivated Multi-Agent Exploration. CoRR abs/2301.02083 (2023) - [i82]Ziniu Li, Tian Xu, Yang Yu, Zhi-Quan Luo:
Theoretical Analysis of Offline Imitation With Supplementary Dataset. CoRR abs/2301.11687 (2023) - [i81]Jing-Cheng Pang, Xinyu Yang, Si-Hang Yang, Yang Yu:
Natural Language-conditioned Reinforcement Learning with Inside-out Task Language Development and Translation. CoRR abs/2302.09368 (2023) - [i80]Cong Guan, Feng Chen, Lei Yuan, Zongzhang Zhang, Yang Yu:
Efficient Communication via Self-supervised Information Aggregation for Online and Offline Multi-agent Reinforcement Learning. CoRR abs/2302.09605 (2023) - [i79]Xu-Hui Liu, Feng Xu, Xinyu Zhang, Tianyuan Liu, Shengyi Jiang, Ruifeng Chen, Zongzhang Zhang, Yang Yu:
How To Guide Your Learner: Imitation Learning with Active Adaptive Expert Involvement. CoRR abs/2303.02073 (2023) - [i78]Zheng-Mao Zhu, Yu-Ren Liu, Hong-Long Tian, Yang Yu, Kun Zhang:
Beware of Instantaneous Dependence in Reinforcement Learning. CoRR abs/2303.05458 (2023) - [i77]Xiong-Hui Chen, Bowei He, Yang Yu, Qingyang Li, Zhiwei Tony Qin, Wenjie Shang, Jieping Ye, Chen Ma:
Sim2Rec: A Simulator-based Decision-making Approach to Optimize Real-World Long-term User Engagement in Sequential Recommender Systems. CoRR abs/2305.04832 (2023) - [i76]Lei Yuan, Feng Chen, Zongzhang Zhang, Yang Yu:
Communication-Robust Multi-Agent Learning by Adaptable Auxiliary Multi-Agent Adversary Generation. CoRR abs/2305.05116 (2023) - [i75]Lei Yuan, Ziqian Zhang, Ke Xue, Hao Yin, Feng Chen, Cong Guan, Lihe Li, Chao Qian, Yang Yu:
Robust multi-agent coordination via evolutionary generation of auxiliary adversarial attackers. CoRR abs/2305.05909 (2023) - [i74]Ziqian Zhang, Lei Yuan, Lihe Li, Ke Xue, Chengxing Jia, Cong Guan, Chao Qian, Yang Yu:
Fast Teammate Adaptation in the Presence of Sudden Policy Change. CoRR abs/2305.05911 (2023) - [i73]Lei Yuan, Tao Jiang, Lihe Li, Feng Chen, Zongzhang Zhang, Yang Yu:
Robust Multi-agent Communication via Multi-view Message Certification. CoRR abs/2305.13936 (2023) - [i72]Lei Yuan, Lihe Li, Ziqian Zhang, Fuxiang Zhang, Cong Guan, Yang Yu:
Multi-agent Continual Coordination via Progressive Task Contextualization. CoRR abs/2305.13937 (2023) - [i71]Jing-Cheng Pang, Pengyuan Wang, Kaiyuan Li, Xiong-Hui Chen, Jiacheng Xu, Zongzhang Zhang, Yang Yu:
Language Model Self-improvement by Reinforcement Learning Contemplation. CoRR abs/2305.14483 (2023) - [i70]Yu-Ren Liu, Biwei Huang, Zheng-Mao Zhu, Hong-Long Tian, Mingming Gong, Yang Yu, Kun Zhang:
Learning World Models with Identifiable Factorization. CoRR abs/2306.06561 (2023) - [i69]Tian Xu, Ziniu Li, Yang Yu, Zhi-Quan Luo:
Provably Efficient Adversarial Imitation Learning with Unknown Transitions. CoRR abs/2306.06563 (2023) - [i68]Yuhang Ran, Yi-Chen Li, Fuxiang Zhang, Zongzhang Zhang, Yang Yu:
Policy Regularization with Dataset Constraint for Offline Reinforcement Learning. CoRR abs/2306.06569 (2023) - [i67]Chenxiao Gao, Chenyang Wu, Mingjun Cao, Rui Kong, Zongzhang Zhang, Yang Yu:
ACT: Empowering Decision Transformer with Dynamic Programming via Advantage Conditioning. CoRR abs/2309.05915 (2023) - [i66]Lei Yuan, Lihe Li, Ziqian Zhang, Feng Chen, Tianyi Zhang, Cong Guan, Yang Yu, Zhi-Hua Zhou:
Learning to Coordinate with Anyone. CoRR abs/2309.12633 (2023) - [i65]Fan-Ming Luo, Tian Xu, Xingchen Cao, Yang Yu:
Reward-Consistent Dynamics Models are Strongly Generalizable for Offline Reinforcement Learning. CoRR abs/2310.05422 (2023) - [i64]Xiong-Hui Chen, Junyin Ye, Hang Zhao, Yi-Chen Li, Haoran Shi, Yu-Yan Xu, Zhihao Ye, Si-Hang Yang, Anqi Huang, Kai Xu, Zongzhang Zhang, Yang Yu:
Imitator Learning: Achieve Out-of-the-Box Imitation Ability in Variable Environments. CoRR abs/2310.05712 (2023) - [i63]Ziniu Li, Tian Xu, Yushun Zhang, Yang Yu, Ruoyu Sun, Zhi-Quan Luo:
ReMax: A Simple, Effective, and Efficient Reinforcement Learning Method for Aligning Large Language Models. CoRR abs/2310.10505 (2023) - [i62]Cong Guan, Lichao Zhang, Chunpeng Fan, Yichen Li, Feng Chen, Lihe Li, Yunjia Tian, Lei Yuan, Yang Yu:
Efficient Human-AI Coordination via Preparatory Language-based Convention. CoRR abs/2311.00416 (2023) - [i61]Lei Yuan, Ziqian Zhang, Lihe Li, Cong Guan, Yang Yu:
A Survey of Progress on Cooperative Multi-agent Reinforcement Learning in Open Environment. CoRR abs/2312.01058 (2023) - [i60]Ziniu Li, Tian Xu, Yang Yu:
Policy Optimization in RLHF: The Impact of Out-of-preference Data. CoRR abs/2312.10584 (2023) - [i59]Haoxin Lin, Hongqiu Wu, Jiaji Zhang, Yihao Sun, Junyin Ye, Yang Yu:
Episodic Return Decomposition by Difference of Implicitly Assigned Sub-Trajectory Reward. CoRR abs/2312.10642 (2023) - [i58]Renzhe Zhou, Chenxiao Gao, Zongzhang Zhang, Yang Yu:
Generalizable Task Representation Learning for Offline Meta-Reinforcement Learning with Data Limitations. CoRR abs/2312.15909 (2023) - 2022
- [j32]Yu-Ren Liu, Yi-Qi Hu, Hong Qian, Chao Qian, Yang Yu:
ZOOpt: a toolbox for derivative-free optimization. Sci. China Inf. Sci. 65(10) (2022) - [j31]Ruo-Ze Liu, Zhen-Jia Pang, Zhou-Yu Meng, Wenhai Wang, Yang Yu, Tong Lu:
On Efficient Reinforcement Learning for Full-length Game of StarCraft II. J. Artif. Intell. Res. 75: 213-260 (2022) - [j30]Yi-Feng Zhang, Fan-Ming Luo, Yang Yu:
Improve generated adversarial imitation learning with reward variance regularization. Mach. Learn. 111(3): 977-995 (2022) - [j29]Yi-Qi Hu, Xu-Hui Liu, Shu-Qiao Li, Yang Yu:
Cascaded Algorithm Selection With Extreme-Region UCB Bandit. IEEE Trans. Pattern Anal. Mach. Intell. 44(10): 6782-6794 (2022) - [j28]Tian Xu, Ziniu Li, Yang Yu:
Error Bounds of Imitating Policies and Environments for Reinforcement Learning. IEEE Trans. Pattern Anal. Mach. Intell. 44(10): 6968-6980 (2022) - [j27]Ruo-Ze Liu, Haifeng Guo, Xiaozhong Ji, Yang Yu, Zhen-Jia Pang, Zitai Xiao, Yuzhou Wu, Tong Lu:
Efficient Reinforcement Learning for StarCraft by Abstract Forward Models and Transfer Learning. IEEE Trans. Games 14(2): 294-307 (2022) - [j26]