Weizhu Chen
2020 – today
- 2024
- [c109]Yiming Huang, Zhenghao Lin, Xiao Liu, Yeyun Gong, Shuai Lu, Fangyu Lei, Yaobo Liang, Yelong Shen, Chen Lin, Nan Duan, Weizhu Chen:
Competition-Level Problems are Effective LLM Evaluators. ACL (Findings) 2024: 13526-13544 - [c108]Shengnan An, Zexiong Ma, Siqi Cai, Zeqi Lin, Nanning Zheng, Jian-Guang Lou, Weizhu Chen:
Can LLMs Learn From Mistakes? An Empirical Study on Reasoning Tasks. EMNLP (Findings) 2024: 833-854 - [c107]Weihao Zeng, Can Xu, Yingxiu Zhao, Jian-Guang Lou, Weizhu Chen:
Automatic Instruction Evolving for Large Language Models. EMNLP 2024: 6998-7018 - [c106]Ming Zhong, Chenxin An, Weizhu Chen, Jiawei Han, Pengcheng He:
Seeking Neural Nuggets: Knowledge Transfer in Large Language Models from a Parametric Perspective. ICLR 2024 - [c105]Zhibin Gou, Zhihong Shao, Yeyun Gong, Yelong Shen, Yujiu Yang, Nan Duan, Weizhu Chen:
CRITIC: Large Language Models Can Self-Correct with Tool-Interactive Critiquing. ICLR 2024 - [c104]Zhibin Gou, Zhihong Shao, Yeyun Gong, Yelong Shen, Yujiu Yang, Minlie Huang, Nan Duan, Weizhu Chen:
ToRA: A Tool-Integrated Reasoning Agent for Mathematical Problem Solving. ICLR 2024 - [c103]Yixiao Li, Yifan Yu, Chen Liang, Nikos Karampatziakis, Pengcheng He, Weizhu Chen, Tuo Zhao:
LoftQ: LoRA-Fine-Tuning-aware Quantization for Large Language Models. ICLR 2024 - [c102]Linyi Yang, Shuibai Zhang, Zhuohao Yu, Guangsheng Bao, Yidong Wang, Jindong Wang, Ruochen Xu, Wei Ye, Xing Xie, Weizhu Chen, Yue Zhang:
Supervised Knowledge Makes Large Language Models Better In-context Learners. ICLR 2024 - [c101]Xingwei He, Zhenghao Lin, Yeyun Gong, A-Long Jin, Hang Zhang, Chen Lin, Jian Jiao, Siu Ming Yiu, Nan Duan, Weizhu Chen:
AnnoLLM: Making Large Language Models to Be Better Crowdsourced Annotators. NAACL (Industry Track) 2024: 165-190 - [c100]Wanjun Zhong, Ruixiang Cui, Yiduo Guo, Yaobo Liang, Shuai Lu, Yanlin Wang, Amin Saied, Weizhu Chen, Nan Duan:
AGIEval: A Human-Centric Benchmark for Evaluating Foundation Models. NAACL-HLT (Findings) 2024: 2299-2314 - [c99]Jiazhan Feng, Ruochen Xu, Junheng Hao, Hiteshi Sharma, Yelong Shen, Dongyan Zhao, Weizhu Chen:
Language Models can be Deductive Solvers. NAACL-HLT (Findings) 2024: 4026-4042 - [i115]Yueqin Yin, Zhendong Wang, Yi Gu, Hai Huang, Weizhu Chen, Mingyuan Zhou:
Relative Preference Optimization: Enhancing LLM Alignment through Contrasting Responses across Identical and Diverse Prompts. CoRR abs/2402.10958 (2024) - [i114]Yubo Ma, Zhibin Gou, Junheng Hao, Ruochen Xu, Shuohang Wang, Liangming Pan, Yujiu Yang, Yixin Cao, Aixin Sun, Hany Hassan Awadalla, Weizhu Chen:
SciAgent: Tool-augmented Language Models for Scientific Reasoning. CoRR abs/2402.11451 (2024) - [i113]Ming Zhong, Yelong Shen, Shuohang Wang, Yadong Lu, Yizhu Jiao, Siru Ouyang, Donghan Yu, Jiawei Han, Weizhu Chen:
Multi-LoRA Composition for Image Generation. CoRR abs/2402.16843 (2024) - [i112]Yiming Huang, Xiao Liu, Yeyun Gong, Zhibin Gou, Yelong Shen, Nan Duan, Weizhu Chen:
Key-Point-Driven Data Synthesis with its Enhancement on Mathematical Reasoning. CoRR abs/2403.02333 (2024) - [i111]Xinzhe Ni, Yeyun Gong, Zhibin Gou, Yelong Shen, Yujiu Yang, Nan Duan, Weizhu Chen:
Exploring the Mystery of Influential Data for Mathematical Reasoning. CoRR abs/2404.01067 (2024) - [i110]Vlad Fomenko, Han Yu, Jongho Lee, Stanley Hsieh, Weizhu Chen:
A Note on LoRA. CoRR abs/2404.05086 (2024) - [i109]Zhenghao Lin, Zhibin Gou, Yeyun Gong, Xiao Liu, Yelong Shen, Ruochen Xu, Chen Lin, Yujiu Yang, Jian Jiao, Nan Duan, Weizhu Chen:
Rho-1: Not All Tokens Are What You Need. CoRR abs/2404.07965 (2024) - [i108]Marah I Abdin, Sam Ade Jacobs, Ammar Ahmad Awan, Jyoti Aneja, Ahmed Awadallah, Hany Awadalla, Nguyen Bach, Amit Bahree, Arash Bakhtiari, Harkirat S. Behl, Alon Benhaim, Misha Bilenko, Johan Bjorck, Sébastien Bubeck, Martin Cai, Caio César Teodoro Mendes, Weizhu Chen, Vishrav Chaudhary, Parul Chopra, Allie Del Giorno, Gustavo de Rosa, Matthew Dixon, Ronen Eldan, Dan Iter, Amit Garg, Abhishek Goswami, Suriya Gunasekar, Emman Haider, Junheng Hao, Russell J. Hewett, Jamie Huynh, Mojan Javaheripi, Xin Jin, Piero Kauffmann, Nikos Karampatziakis, Dongwoo Kim, Mahoud Khademi, Lev Kurilenko, James R. Lee, Yin Tat Lee, Yuanzhi Li, Chen Liang, Weishung Liu, Eric Lin, Zeqi Lin, Piyush Madan, Arindam Mitra, Hardik Modi, Anh Nguyen, Brandon Norick, Barun Patra, Daniel Perez-Becker, Thomas Portet, Reid Pryzant, Heyang Qin, Marko Radmilac, Corby Rosset, Sambudha Roy, Olatunji Ruwase, Olli Saarikivi, Amin Saied, Adil Salim, Michael Santacroce, Shital Shah, Ning Shang, Hiteshi Sharma, Xia Song, Masahiro Tanaka, Xin Wang, Rachel Ward, Guanhua Wang, Philipp Witte, Michael Wyatt, Can Xu, Jiahang Xu, Sonali Yadav, Fan Yang, Ziyi Yang, Donghan Yu, Chengruidong Zhang, Cyril Zhang, Jianwen Zhang, Li Lyna Zhang, Yi Zhang, Yue Zhang, Yunan Zhang, Xiren Zhou:
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone. CoRR abs/2404.14219 (2024) - [i107]Yueqin Yin, Zhendong Wang, Yujia Xie, Weizhu Chen, Mingyuan Zhou:
Self-Augmented Preference Optimization: Off-Policy Paradigms for Language Model Alignment. CoRR abs/2405.20830 (2024) - [i106]Weihao Zeng, Can Xu, Yingxiu Zhao, Jian-Guang Lou, Weizhu Chen:
Automatic Instruction Evolving for Large Language Models. CoRR abs/2406.00770 (2024) - [i105]Liliang Ren, Yang Liu, Yadong Lu, Yelong Shen, Chen Liang, Weizhu Chen:
Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling. CoRR abs/2406.07522 (2024) - [i104]Haipeng Luo, Qingfeng Sun, Can Xu, Pu Zhao, Qingwei Lin, Jianguang Lou, Shifeng Chen, Yansong Tang, Weizhu Chen:
Arena Learning: Build Data Flywheel for LLMs Post-training via Simulated Chatbot Arena. CoRR abs/2407.10627 (2024) - [i103]Liyuan Liu, Young Jin Kim, Shuohang Wang, Chen Liang, Yelong Shen, Hao Cheng, Xiaodong Liu, Masahiro Tanaka, Xiaoxia Wu, Wenxiang Hu, Vishrav Chaudhary, Zeqi Lin, Chengruidong Zhang, Jilong Xue, Hany Awadalla, Jianfeng Gao, Weizhu Chen:
GRIN: GRadient-INformed MoE. CoRR abs/2409.12136 (2024) - [i102]Yaming Yang, Dilxat Muhtar, Yelong Shen, Yuefeng Zhan, Jianfeng Liu, Yujing Wang, Hao Sun, Denvy Deng, Feng Sun, Qi Zhang, Weizhu Chen, Yunhai Tong:
MTL-LoRA: Low-Rank Adaptation for Multi-Task Learning. CoRR abs/2410.09437 (2024) - [i101]Dilxat Muhtar, Yelong Shen, Yaming Yang, Xiaodong Liu, Yadong Lu, Jianfeng Liu, Yuefeng Zhan, Hao Sun, Weiwei Deng, Feng Sun, Xueliang Zhang, Jianfeng Gao, Weizhu Chen, Qi Zhang:
StreamAdapter: Efficient Test Time Adaptation from Contextual Streams. CoRR abs/2411.09289 (2024)
- 2023
- [c98]Chenxiao Liu, Shuai Lu, Weizhu Chen, Daxin Jiang, Alexey Svyatkovskiy, Shengyu Fu, Neel Sundaresan, Nan Duan:
Code Execution with Pre-trained Language Models. ACL (Findings) 2023: 4984-4999 - [c97]Yifei Li, Zeqi Lin, Shizhuo Zhang, Qiang Fu, Bei Chen, Jian-Guang Lou, Weizhu Chen:
Making Language Models Better Reasoners with Step-Aware Verifier. ACL (1) 2023: 5315-5333 - [c96]Weizhou Shen, Yeyun Gong, Yelong Shen, Song Wang, Xiaojun Quan, Nan Duan, Weizhu Chen:
Joint Generator-Ranker Learning for Natural Language Generation. ACL (Findings) 2023: 7681-7699 - [c95]Xuxi Chen, Tianlong Chen, Weizhu Chen, Ahmed Hassan Awadallah, Zhangyang Wang, Yu Cheng:
DSEE: Dually Sparsity-embedded Efficient Tuning of Pre-trained Language Models. ACL (1) 2023: 8208-8222 - [c94]Fengji Zhang, Bei Chen, Yue Zhang, Jacky Keung, Jin Liu, Daoguang Zan, Yi Mao, Jian-Guang Lou, Weizhu Chen:
RepoCoder: Repository-Level Code Completion Through Iterative Retrieval and Generation. EMNLP 2023: 2471-2484 - [c93]Zhihong Shao, Yeyun Gong, Yelong Shen, Minlie Huang, Nan Duan, Weizhu Chen:
Enhancing Retrieval-Augmented Large Language Models with Iterative Retrieval-Generation Synergy. EMNLP (Findings) 2023: 9248-9274 - [c92]Shengnan An, Bo Zhou, Zeqi Lin, Qiang Fu, Bei Chen, Nanning Zheng, Weizhu Chen, Jian-Guang Lou:
Skill-Based Few-Shot Selection for In-Context Learning. EMNLP 2023: 13472-13492 - [c91]Bei Chen, Fengji Zhang, Anh Nguyen, Daoguang Zan, Zeqi Lin, Jian-Guang Lou, Weizhu Chen:
CodeT: Code Generation with Generated Tests. ICLR 2023 - [c90]Pengcheng He, Jianfeng Gao, Weizhu Chen:
DeBERTaV3: Improving DeBERTa using ELECTRA-Style Pre-Training with Gradient-Disentangled Embedding Sharing. ICLR 2023 - [c89]Zhendong Wang, Huangjie Zheng, Pengcheng He, Weizhu Chen, Mingyuan Zhou:
Diffusion-GAN: Training GANs with Diffusion. ICLR 2023 - [c88]Qingru Zhang, Minshuo Chen, Alexander Bukharin, Pengcheng He, Yu Cheng, Weizhu Chen, Tuo Zhao:
Adaptive Budget Allocation for Parameter-Efficient Fine-Tuning. ICLR 2023 - [c87]Huangjie Zheng, Pengcheng He, Weizhu Chen, Mingyuan Zhou:
Truncated Diffusion Probabilistic Models and Diffusion-based Adversarial Auto-Encoders. ICLR 2023 - [c86]Yixiao Li, Yifan Yu, Qingru Zhang, Chen Liang, Pengcheng He, Weizhu Chen, Tuo Zhao:
LoSparse: Structured Compression of Large Language Models based on Low-Rank and Sparse Approximation. ICML 2023: 20336-20350 - [c85]Chen Liang, Simiao Zuo, Qingru Zhang, Pengcheng He, Weizhu Chen, Tuo Zhao:
Less is More: Task-aware Layer-wise Distillation for Language Model Compression. ICML 2023: 20852-20867 - [c84]Zhenghao Lin, Yeyun Gong, Yelong Shen, Tong Wu, Zhihao Fan, Chen Lin, Nan Duan, Weizhu Chen:
Text Generation with Diffusion Language Models: A Pre-training Approach with Continuous Paragraph Denoise. ICML 2023: 21051-21064 - [c83]Jason Phang, Yi Mao, Pengcheng He, Weizhu Chen:
HyperTuning: Toward Adapting Large Language Models without Back-propagation. ICML 2023: 27854-27875 - [c82]Zhihong Shao, Yeyun Gong, Yelong Shen, Minlie Huang, Nan Duan, Weizhu Chen:
Synthetic Prompting: Generating Chain-of-Thought Demonstrations for Large Language Models. ICML 2023: 30706-30775 - [c81]Anh Nguyen, Nikos Karampatziakis, Weizhu Chen:
Meet in the Middle: A New Pre-training Paradigm. NeurIPS 2023 - [c80]Zhendong Wang, Yifan Jiang, Yadong Lu, Yelong Shen, Pengcheng He, Weizhu Chen, Zhangyang (Atlas) Wang, Mingyuan Zhou:
In-Context Learning Unlocked for Diffusion Models. NeurIPS 2023 - [c79]Zhendong Wang, Yifan Jiang, Huangjie Zheng, Peihao Wang, Pengcheng He, Zhangyang Wang, Weizhu Chen, Mingyuan Zhou:
Patch Diffusion: Faster and More Data-Efficient Training of Diffusion Models. NeurIPS 2023 - [c78]Tong Wu, Zhihao Fan, Xiao Liu, Hai-Tao Zheng, Yeyun Gong, Yelong Shen, Jian Jiao, Juntao Li, Zhongyu Wei, Jian Guo, Nan Duan, Weizhu Chen:
AR-Diffusion: Auto-Regressive Diffusion Model for Text Generation. NeurIPS 2023 - [i100]Zhihong Shao, Yeyun Gong, Yelong Shen, Minlie Huang, Nan Duan, Weizhu Chen:
Synthetic Prompting: Generating Chain-of-Thought Demonstrations for Large Language Models. CoRR abs/2302.00618 (2023) - [i99]Baolin Peng, Michel Galley, Pengcheng He, Hao Cheng, Yujia Xie, Yu Hu, Qiuyuan Huang, Lars Liden, Zhou Yu, Weizhu Chen, Jianfeng Gao:
Check Your Facts and Try Again: Improving Large Language Models with External Knowledge and Automated Feedback. CoRR abs/2302.12813 (2023) - [i98]Anh Nguyen, Nikos Karampatziakis, Weizhu Chen:
Meet in the Middle: A New Pre-training Paradigm. CoRR abs/2303.07295 (2023) - [i97]Qingru Zhang, Minshuo Chen, Alexander Bukharin, Pengcheng He, Yu Cheng, Weizhu Chen, Tuo Zhao:
Adaptive Budget Allocation for Parameter-Efficient Fine-Tuning. CoRR abs/2303.10512 (2023) - [i96]Fengji Zhang, Bei Chen, Yue Zhang, Jin Liu, Daoguang Zan, Yi Mao, Jian-Guang Lou, Weizhu Chen:
RepoCoder: Repository-Level Code Completion Through Iterative Retrieval and Generation. CoRR abs/2303.12570 (2023) - [i95]Xingwei He, Zhenghao Lin, Yeyun Gong, A-Long Jin, Hang Zhang, Chen Lin, Jian Jiao, Siu Ming Yiu, Nan Duan, Weizhu Chen:
AnnoLLM: Making Large Language Models to Be Better Crowdsourced Annotators. CoRR abs/2303.16854 (2023) - [i94]Wanjun Zhong, Ruixiang Cui, Yiduo Guo, Yaobo Liang, Shuai Lu, Yanlin Wang, Amin Saied, Weizhu Chen, Nan Duan:
AGIEval: A Human-Centric Benchmark for Evaluating Foundation Models. CoRR abs/2304.06364 (2023) - [i93]Zhendong Wang, Yifan Jiang, Huangjie Zheng, Peihao Wang, Pengcheng He, Zhangyang Wang, Weizhu Chen, Mingyuan Zhou:
Patch Diffusion: Faster and More Data-Efficient Training of Diffusion Models. CoRR abs/2304.12526 (2023) - [i92]Zhendong Wang, Yifan Jiang, Yadong Lu, Yelong Shen, Pengcheng He, Weizhu Chen, Zhangyang Wang, Mingyuan Zhou:
In-Context Learning Unlocked for Diffusion Models. CoRR abs/2305.01115 (2023) - [i91]Chenxiao Liu, Shuai Lu, Weizhu Chen, Daxin Jiang, Alexey Svyatkovskiy, Shengyu Fu, Neel Sundaresan, Nan Duan:
Code Execution with Pre-trained Language Models. CoRR abs/2305.05383 (2023) - [i90]Tong Wu, Zhihao Fan, Xiao Liu, Yeyun Gong, Yelong Shen, Jian Jiao, Hai-Tao Zheng, Juntao Li, Zhongyu Wei, Jian Guo, Nan Duan, Weizhu Chen:
AR-Diffusion: Auto-Regressive Diffusion Model for Text Generation. CoRR abs/2305.09515 (2023) - [i89]Zhibin Gou, Zhihong Shao, Yeyun Gong, Yelong Shen, Yujiu Yang, Nan Duan, Weizhu Chen:
CRITIC: Large Language Models Can Self-Correct with Tool-Interactive Critiquing. CoRR abs/2305.11738 (2023) - [i88]Shengnan An, Bo Zhou, Zeqi Lin, Qiang Fu, Bei Chen, Nanning Zheng, Weizhu Chen, Jian-Guang Lou:
Skill-Based Few-Shot Selection for In-Context Learning. CoRR abs/2305.14210 (2023) - [i87]Woojeong Jin, Subhabrata Mukherjee, Yu Cheng, Yelong Shen, Weizhu Chen, Ahmed Hassan Awadallah, Damien Jose, Xiang Ren:
GRILL: Grounded Vision-language Pre-training via Aligning Text and Image Regions. CoRR abs/2305.14676 (2023) - [i86]Zhihong Shao, Yeyun Gong, Yelong Shen, Minlie Huang, Nan Duan, Weizhu Chen:
Enhancing Retrieval-Augmented Large Language Models with Iterative Retrieval-Generation Synergy. CoRR abs/2305.15294 (2023) - [i85]Yixiao Li, Yifan Yu, Qingru Zhang, Chen Liang, Pengcheng He, Weizhu Chen, Tuo Zhao:
LoSparse: Structured Compression of Large Language Models based on Low-Rank and Sparse Approximation. CoRR abs/2306.11222 (2023) - [i84]Alexander Bukharin, Yixiao Li, Pengcheng He, Weizhu Chen, Tuo Zhao:
Deep Reinforcement Learning from Hierarchical Weak Preference Feedback. CoRR abs/2309.02632 (2023) - [i83]Baizhou Huang, Shuai Lu, Weizhu Chen, Xiaojun Wan, Nan Duan:
Enhancing Large Language Models in Coding Through Multi-Perspective Self-Consistency. CoRR abs/2309.17272 (2023) - [i82]Zhibin Gou, Zhihong Shao, Yeyun Gong, Yelong Shen, Yujiu Yang, Minlie Huang, Nan Duan, Weizhu Chen:
ToRA: A Tool-Integrated Reasoning Agent for Mathematical Problem Solving. CoRR abs/2309.17452 (2023) - [i81]Liyuan Liu, Jianfeng Gao, Weizhu Chen:
Sparse Backpropagation for MoE Training. CoRR abs/2310.00811 (2023) - [i80]Yixiao Li, Yifan Yu, Chen Liang, Pengcheng He, Nikos Karampatziakis, Weizhu Chen, Tuo Zhao:
LoftQ: LoRA-Fine-Tuning-Aware Quantization for Large Language Models. CoRR abs/2310.08659 (2023) - [i79]Ming Zhong, Chenxin An, Weizhu Chen, Jiawei Han, Pengcheng He:
Seeking Neural Nuggets: Knowledge Transfer in Large Language Models from a Parametric Perspective. CoRR abs/2310.11451 (2023) - [i78]Shengnan An, Zexiong Ma, Zeqi Lin, Nanning Zheng, Jian-Guang Lou, Weizhu Chen:
Learning From Mistakes Makes LLM Better Reasoner. CoRR abs/2310.20689 (2023) - [i77]Jiazhan Feng, Ruochen Xu, Junheng Hao, Hiteshi Sharma, Yelong Shen, Dongyan Zhao, Weizhu Chen:
Language Models can be Logical Solvers. CoRR abs/2311.06158 (2023) - [i76]Yiming Huang, Zhenghao Lin, Xiao Liu, Yeyun Gong, Shuai Lu, Fangyu Lei, Yaobo Liang, Yelong Shen, Chen Lin, Nan Duan, Weizhu Chen:
Competition-Level Problems are Effective LLM Evaluators. CoRR abs/2312.02143 (2023) - [i75]Linyi Yang, Shuibai Zhang, Zhuohao Yu, Guangsheng Bao, Yidong Wang, Jindong Wang, Ruochen Xu, Wei Ye, Xing Xie, Weizhu Chen, Yue Zhang:
Supervised Knowledge Makes Large Language Models Better In-context Learners. CoRR abs/2312.15918 (2023)
- 2022
- [j2]Caihong Mu, Weizhu Chen, Yi Liu, Dongchang Lei, Ruochen Liu:
Virtual information core optimization for collaborative filtering recommendation based on clustering and evolutionary algorithms. Appl. Soft Comput. 116: 108355 (2022) - [c77]Xiaoze Jiang, Yaobo Liang, Weizhu Chen, Nan Duan:
XLM-K: Improving Cross-Lingual Language Model Pre-training with Multilingual Knowledge. AAAI 2022: 10840-10848 - [c76]Zhuocheng Gong, Di He, Yelong Shen, Tie-Yan Liu, Weizhu Chen, Dongyan Zhao, Ji-Rong Wen, Rui Yan:
Finding the Dominant Winning Ticket in Pre-Trained Language Models. ACL (Findings) 2022: 1459-1472 - [c75]Woojeong Jin, Yu Cheng, Yelong Shen, Weizhu Chen, Xiang Ren:
A Good Prompt Is Worth Millions of Parameters: Low-resource Prompt-based Learning for Vision-Language Models. ACL (1) 2022: 2763-2775 - [c74]Jing Qian, Li Dong, Yelong Shen, Furu Wei, Weizhu Chen:
Controllable Natural Language Generation with Contrastive Prefixes. ACL (Findings) 2022: 2912-2924 - [c73]Wei Chen, Yeyun Gong, Song Wang, Bolun Yao, Weizhen Qi, Zhongyu Wei, Xiaowu Hu, Bartuer Zhou, Yi Mao, Weizhu Chen, Biao Cheng, Nan Duan:
DialogVED: A Pre-trained Latent Variable Encoder-Decoder Model for Dialog Response Generation. ACL (1) 2022: 4852-4864 - [c72]Tianyu Liu, Yizhe Zhang, Chris Brockett, Yi Mao, Zhifang Sui, Weizhu Chen, Bill Dolan:
A Token-level Reference-free Hallucination Detection Benchmark for Free-form Text Generation. ACL (1) 2022: 6723-6737 - [c71]Chen Liang, Pengcheng He, Yelong Shen, Weizhu Chen, Tuo Zhao:
CAMERO: Consistency Regularized Ensemble of Perturbed Language Models with Weight Sharing. ACL (1) 2022: 7162-7175 - [c70]Jiachang Liu, Dinghan Shen, Yizhe Zhang, Bill Dolan, Lawrence Carin, Weizhu Chen:
What Makes Good In-Context Examples for GPT-3? DeeLIO@ACL 2022: 100-114 - [c69]Xuxi Chen, Tianlong Chen, Yu Cheng, Weizhu Chen, Ahmed Awadallah, Zhangyang Wang:
Scalable Learning to Optimize: A Learned Optimizer Can Train Big Models. ECCV (23) 2022: 389-405 - [c68]Xiaonan Li, Daya Guo, Yeyun Gong, Yun Lin, Yelong Shen, Xipeng Qiu, Daxin Jiang, Weizhu Chen, Nan Duan:
Soft-Labeled Contrastive Pre-Training for Function-Level Code Representation. EMNLP (Findings) 2022: 118-129 - [c67]Xinyu Pi, Qian Liu, Bei Chen, Morteza Ziyadi, Zeqi Lin, Qiang Fu, Yan Gao, Jian-Guang Lou, Weizhu Chen:
Reasoning Like Program Executors. EMNLP 2022: 761-779 - [c66]Xiaonan Li, Yeyun Gong, Yelong Shen, Xipeng Qiu, Hang Zhang, Bolun Yao, Weizhen Qi, Daxin Jiang, Weizhu Chen, Nan Duan:
CodeRetriever: A Large Scale Contrastive Pre-Training Method for Code Search. EMNLP 2022: 2898-2910 - [c65]Edward J. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, Weizhu Chen:
LoRA: Low-Rank Adaptation of Large Language Models. ICLR 2022 - [c64]Chen Liang, Haoming Jiang, Simiao Zuo, Pengcheng He, Xiaodong Liu, Jianfeng Gao, Weizhu Chen, Tuo Zhao:
No Parameters Left Behind: Sensitivity Guided Adaptive Learning Rate for Training Large Transformer Models. ICLR 2022 - [c63]Qian Liu, Bei Chen, Jiaqi Guo, Morteza Ziyadi, Zeqi Lin, Weizhu Chen, Jian-Guang Lou:
TAPEX: Table Pre-training via Learning a Neural SQL Executor. ICLR 2022 - [c62]Hang Zhang, Yeyun Gong, Yelong Shen, Jiancheng Lv, Nan Duan, Weizhu Chen:
Adversarial Retriever-Ranker for Dense Text Retrieval. ICLR 2022 - [c61]Qingru Zhang, Simiao Zuo, Chen Liang, Alexander Bukharin, Pengcheng He, Weizhu Chen, Tuo Zhao:
PLATON: Pruning Large Transformer Models with Upper Confidence Bound of Weight Importance. ICML 2022: 26809-26823 - [c60]Daoguang Zan, Bei Chen, Dejian Yang, Zeqi Lin, Minsu Kim, Bei Guan, Yongji Wang, Weizhu Chen, Jian-Guang Lou:
CERT: Continual Pre-training on Sketches for Library-oriented Code Generation. IJCAI 2022: 2369-2375 - [c59]Zhengbao Jiang, Yi Mao, Pengcheng He, Graham Neubig, Weizhu Chen:
OmniTab: Pretraining with Natural and Synthetic Data for Few-shot Table-based Question Answering. NAACL-HLT 2022: 932-942 - [c58]Shujian Zhang, Chengyue Gong, Xingchao Liu, Pengcheng He, Weizhu Chen, Mingyuan Zhou:
ALLSH: Active Learning Guided by Local Sensitivity and Hardness. NAACL-HLT (Findings) 2022: 1328-1342 - [c57]Simiao Zuo, Qingru Zhang, Chen Liang, Pengcheng He, Tuo Zhao, Weizhu Chen:
MoEBERT: from BERT to Mixture-of-Experts via Importance-Guided Adaptation. NAACL-HLT 2022: 1610-1623 - [i74]Xiaonan Li, Yeyun Gong, Yelong Shen, Xipeng Qiu, Hang Zhang, Bolun Yao, Weizhen Qi, Daxin Jiang, Weizhu Chen, Nan Duan:
CodeRetriever: Unimodal and Bimodal Contrastive Learning. CoRR abs/2201.10866 (2022) - [i73]Xinyu Pi, Qian Liu, Bei Chen, Morteza Ziyadi, Zeqi Lin, Yan Gao, Qiang Fu, Jian-Guang Lou, Weizhu Chen:
Reasoning Like Program Executors. CoRR abs/2201.11473 (2022) - [i72]Chen Liang, Haoming Jiang, Simiao Zuo, Pengcheng He, Xiaodong Liu, Jianfeng Gao, Weizhu Chen, Tuo Zhao:
No Parameters Left Behind: Sensitivity Guided Adaptive Learning Rate for Training Large Transformer Models. CoRR abs/2202.02664 (2022) - [i71]Huangjie Zheng, Pengcheng He, Weizhu Chen, Mingyuan Zhou:
Mixing and Shifting: Exploiting Global and Local Dependencies in Vision MLPs. CoRR abs/2202.06510 (2022) - [i70]Huangjie Zheng, Pengcheng He, Weizhu Chen, Mingyuan Zhou:
Truncated Diffusion Probabilistic Models. CoRR abs/2202.09671 (2022) - [i69]Jing Qian, Li Dong, Yelong Shen, Furu Wei, Weizhu Chen:
Controllable Natural Language Generation with Contrastive Prefixes. CoRR abs/2202.13257 (2022) - [i68]Shengnan An, Yifei Li, Zeqi Lin, Qian Liu, Bei Chen, Qiang Fu, Weizhu Chen, Nanning Zheng, Jian-Guang Lou:
Input-Tuning: Adapting Unfamiliar Inputs to Frozen Pretrained Models. CoRR abs/2203.03131 (2022) - [i67]Greg Yang, Edward J. Hu, Igor Babuschkin, Szymon Sidor, Xiaodong Liu, David Farhi, Nick Ryder, Jakub Pachocki, Weizhu Chen, Jianfeng Gao:
Tensor Programs V: Tuning Large Neural Networks via Zero-Shot Hyperparameter Transfer. CoRR abs/2203.03466 (2022) - [i66]Chen Liang, Pengcheng He, Yelong Shen, Weizhu Chen, Tuo Zhao:
CAMERO: Consistency Regularized Ensemble of Perturbed Language Models with Weight Sharing. CoRR abs/2204.06625 (2022) - [i65]Simiao Zuo, Qingru Zhang, Chen Liang, Pengcheng He, Tuo Zhao, Weizhu Chen:
MoEBERT: from BERT to Mixture-of-Experts via Importance-Guided Adaptation. CoRR abs/2204.07675 (2022) - [i64]Wei Chen, Yeyun Gong, Song Wang, Bolun Yao, Weizhen Qi, Zhongyu Wei, Xiaowu Hu, Bartuer Zhou, Yi Mao, Weizhu Chen, Biao Cheng, Nan Duan:
DialogVED: A Pre-trained Latent Variable Encoder-Decoder Model for Dialog Response Generation. CoRR abs/2204.13031 (2022) - [i63]Shujian Zhang, Chengyue Gong, Xingchao Liu, Pengcheng He, Weizhu Chen, Mingyuan Zhou:
ALLSH: Active Learning Guided by Local Sensitivity and Hardness. CoRR abs/2205.04980 (2022) - [i62]Weizhen Qi, Yeyun Gong, Yelong Shen, Jian Jiao, Yu Yan, Houqiang Li, Ruofei Zhang, Weizhu Chen, Nan Duan:
A Self-Paced Mixed Distillation Method for Non-Autoregressive Generation. CoRR abs/2205.11162 (2022) - [i61]