Nathan Lambert 0001
Person information
- affiliation: University of California Berkeley, Department of Electrical Engineering and Computer Sciences, CA, USA
Other persons with the same name
- Nathan Lambert 0002 — University of Victoria, Department of Computer Science, BC, Canada
2020 – today
- 2024
- [j5] Alon Albalak, Yanai Elazar, Sang Michael Xie, Shayne Longpre, Nathan Lambert, Xinyi Wang, Niklas Muennighoff, Bairu Hou, Liangming Pan, Haewon Jeong, Colin Raffel, Shiyu Chang, Tatsunori Hashimoto, William Yang Wang:
A Survey on Data Selection for Language Models. Trans. Mach. Learn. Res. 2024 (2024)
- [j4] Ran Wei, Nathan Lambert, Anthony D. McDonald, Alfredo García, Roberto Calandra:
A Unified View on Solving Objective Mismatch in Model-Based Reinforcement Learning. Trans. Mach. Learn. Res. 2024 (2024)
- [c10] Luca Soldaini, Rodney Kinney, Akshita Bhagia, Dustin Schwenk, David Atkinson, Russell Authur, Ben Bogin, Khyathi Raghavi Chandu, Jennifer Dumas, Yanai Elazar, Valentin Hofmann, Ananya Harsh Jha, Sachin Kumar, Li Lucy, Xinxi Lyu, Nathan Lambert, Ian Magnusson, Jacob Morrison, Niklas Muennighoff, Aakanksha Naik, Crystal Nam, Matthew E. Peters, Abhilasha Ravichander, Kyle Richardson, Zejiang Shen, Emma Strubell, Nishant Subramani, Oyvind Tafjord, Evan Pete Walsh, Luke Zettlemoyer, Noah A. Smith, Hannaneh Hajishirzi, Iz Beltagy, Dirk Groeneveld, Jesse Dodge, Kyle Lo:
Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research. ACL (1) 2024: 15725-15788
- [c9] Dirk Groeneveld, Iz Beltagy, Evan Pete Walsh, Akshita Bhagia, Rodney Kinney, Oyvind Tafjord, Ananya Harsh Jha, Hamish Ivison, Ian Magnusson, Yizhong Wang, Shane Arora, David Atkinson, Russell Authur, Khyathi Raghavi Chandu, Arman Cohan, Jennifer Dumas, Yanai Elazar, Yuling Gu, Jack Hessel, Tushar Khot, William Merrill, Jacob Morrison, Niklas Muennighoff, Aakanksha Naik, Crystal Nam, Matthew E. Peters, Valentina Pyatkin, Abhilasha Ravichander, Dustin Schwenk, Saurabh Shah, Will Smith, Emma Strubell, Nishant Subramani, Mitchell Wortsman, Pradeep Dasigi, Nathan Lambert, Kyle Richardson, Luke Zettlemoyer, Jesse Dodge, Kyle Lo, Luca Soldaini, Noah A. Smith, Hannaneh Hajishirzi:
OLMo: Accelerating the Science of Language Models. ACL (1) 2024: 15789-15809
- [c8] Vincent Conitzer, Rachel Freedman, Jobst Heitzig, Wesley H. Holliday, Bob M. Jacobs, Nathan Lambert, Milan Mossé, Eric Pacuit, Stuart Russell, Hailey Schoelkopf, Emanuel Tewolde, William S. Zwicker:
Position: Social Choice Should Guide AI Alignment in Dealing with Diverse Human Feedback. ICML 2024
- [i34] Luca Soldaini, Rodney Kinney, Akshita Bhagia, Dustin Schwenk, David Atkinson, Russell Authur, Ben Bogin, Khyathi Raghavi Chandu, Jennifer Dumas, Yanai Elazar, Valentin Hofmann, Ananya Harsh Jha, Sachin Kumar, Li Lucy, Xinxi Lyu, Nathan Lambert, Ian Magnusson, Jacob Morrison, Niklas Muennighoff, Aakanksha Naik, Crystal Nam, Matthew E. Peters, Abhilasha Ravichander, Kyle Richardson, Zejiang Shen, Emma Strubell, Nishant Subramani, Oyvind Tafjord, Pete Walsh, Luke Zettlemoyer, Noah A. Smith, Hannaneh Hajishirzi, Iz Beltagy, Dirk Groeneveld, Jesse Dodge, Kyle Lo:
Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research. CoRR abs/2402.00159 (2024)
- [i33] Dirk Groeneveld, Iz Beltagy, Pete Walsh, Akshita Bhagia, Rodney Kinney, Oyvind Tafjord, Ananya Harsh Jha, Hamish Ivison, Ian Magnusson, Yizhong Wang, Shane Arora, David Atkinson, Russell Authur, Khyathi Raghavi Chandu, Arman Cohan, Jennifer Dumas, Yanai Elazar, Yuling Gu, Jack Hessel, Tushar Khot, William Merrill, Jacob Morrison, Niklas Muennighoff, Aakanksha Naik, Crystal Nam, Matthew E. Peters, Valentina Pyatkin, Abhilasha Ravichander, Dustin Schwenk, Saurabh Shah, Will Smith, Emma Strubell, Nishant Subramani, Mitchell Wortsman, Pradeep Dasigi, Nathan Lambert, Kyle Richardson, Luke Zettlemoyer, Jesse Dodge, Kyle Lo, Luca Soldaini, Noah A. Smith, Hannaneh Hajishirzi:
OLMo: Accelerating the Science of Language Models. CoRR abs/2402.00838 (2024)
- [i32] Alon Albalak, Yanai Elazar, Sang Michael Xie, Shayne Longpre, Nathan Lambert, Xinyi Wang, Niklas Muennighoff, Bairu Hou, Liangming Pan, Haewon Jeong, Colin Raffel, Shiyu Chang, Tatsunori Hashimoto, William Yang Wang:
A Survey on Data Selection for Language Models. CoRR abs/2402.16827 (2024)
- [i31] Nathan Lambert, Valentina Pyatkin, Jacob Morrison, LJ Miranda, Bill Yuchen Lin, Khyathi Raghavi Chandu, Nouha Dziri, Sachin Kumar, Tom Zick, Yejin Choi, Noah A. Smith, Hannaneh Hajishirzi:
RewardBench: Evaluating Reward Models for Language Modeling. CoRR abs/2403.13787 (2024)
- [i30] Vincent Conitzer, Rachel Freedman, Jobst Heitzig, Wesley H. Holliday, Bob M. Jacobs, Nathan Lambert, Milan Mossé, Eric Pacuit, Stuart Russell, Hailey Schoelkopf, Emanuel Tewolde, William S. Zwicker:
Social Choice for AI Alignment: Dealing with Diverse Human Feedback. CoRR abs/2404.10271 (2024)
- [i29] Prasann Singhal, Nathan Lambert, Scott Niekum, Tanya Goyal, Greg Durrett:
D2PO: Discriminator-Guided DPO with Response Evaluation Models. CoRR abs/2405.01511 (2024)
- [i28] Adrien Basdevant, Camille François, Victor Storchan, Kevin Bankston, Ayah Bdeir, Brian Behlendorf, Mérouane Debbah, Sayash Kapoor, Yann LeCun, Mark Surman, Helen King-Turvey, Nathan Lambert, Stefano Maffulli, Nik Marda, Govind Shivkumar, Justine Tunney:
Towards a Framework for Openness in Foundation Models: Proceedings from the Columbia Convening on Openness in Artificial Intelligence. CoRR abs/2405.15802 (2024)
- [i27] Hamish Ivison, Yizhong Wang, Jiacheng Liu, Zeqiu Wu, Valentina Pyatkin, Nathan Lambert, Noah A. Smith, Yejin Choi, Hannaneh Hajishirzi:
Unpacking DPO and PPO: Disentangling Best Practices for Learning from Preference Feedback. CoRR abs/2406.09279 (2024)
- [i26] Seungju Han, Kavel Rao, Allyson Ettinger, Liwei Jiang, Bill Yuchen Lin, Nathan Lambert, Yejin Choi, Nouha Dziri:
WildGuard: Open One-Stop Moderation Tools for Safety Risks, Jailbreaks, and Refusals of LLMs. CoRR abs/2406.18495 (2024)
- [i25] Nathan Lambert, Hailey Schoelkopf, Aaron Gokaslan, Luca Soldaini, Valentina Pyatkin, Louis Castricato:
Self-Directed Synthetic Dialogues and Revisions Technical Report. CoRR abs/2407.18421 (2024)
- [i24] Niklas Muennighoff, Luca Soldaini, Dirk Groeneveld, Kyle Lo, Jacob Morrison, Sewon Min, Weijia Shi, Pete Walsh, Oyvind Tafjord, Nathan Lambert, Yuling Gu, Shane Arora, Akshita Bhagia, Dustin Schwenk, David Wadden, Alexander Wettig, Binyuan Hui, Tim Dettmers, Douwe Kiela, Ali Farhadi, Noah A. Smith, Pang Wei Koh, Amanpreet Singh, Hannaneh Hajishirzi:
OLMoE: Open Mixture-of-Experts Language Models. CoRR abs/2409.02060 (2024)
- 2023
- [c7] Thomas Krendl Gilbert, Nathan Lambert, Sarah Dean, Tom Zick, Aaron J. Snoswell, Soham Mehta:
Reward Reports for Reinforcement Learning. AIES 2023: 84-130
- [i23] Alexander N. Alvara, Lydia Lee, Emmanuel Sin, Nathan O. Lambert, Andrew J. Westphal, Kristofer S. J. Pister:
BLISS: Interplanetary Exploration with Swarms of Low-Cost Spacecraft. CoRR abs/2307.11226 (2023)
- [i22] Sarah Shoker, Andrew W. Reddie, Sarah Barrington, Ruby Booth, Miles Brundage, Husanjot Chahal, Michael Depp, Bill Drexel, Ritwik Gupta, Marina Favaro, Jake Hecla, Alan Hickey, Margarita Konaev, Kirthi Kumar, Nathan Lambert, Andrew Lohn, Cullen O'Keefe, Nazneen Rajani, Michael Sellitto, Robert Trager, Leah Walker, Alexa Wehsener, Jessica Young:
Confidence-Building Measures for Artificial Intelligence: Workshop Proceedings. CoRR abs/2308.00862 (2023)
- [i21] Ran Wei, Nathan O. Lambert, Anthony D. McDonald, Alfredo García, Roberto Calandra:
A Unified View on Solving Objective Mismatch in Model-Based Reinforcement Learning. CoRR abs/2310.06253 (2023)
- [i20] Nathan Lambert, Thomas Krendl Gilbert, Tom Zick:
Entangled Preferences: The History and Risks of Reinforcement Learning and Human Feedback. CoRR abs/2310.13595 (2023)
- [i19] Lewis Tunstall, Edward Beeching, Nathan Lambert, Nazneen Rajani, Kashif Rasul, Younes Belkada, Shengyi Huang, Leandro von Werra, Clémentine Fourrier, Nathan Habib, Nathan Sarrazin, Omar Sanseviero, Alexander M. Rush, Thomas Wolf:
Zephyr: Direct Distillation of LM Alignment. CoRR abs/2310.16944 (2023)
- [i18] Nathan O. Lambert, Roberto Calandra:
The Alignment Ceiling: Objective Mismatch in Reinforcement Learning from Human Feedback. CoRR abs/2311.00168 (2023)
- [i17] Hamish Ivison, Yizhong Wang, Valentina Pyatkin, Nathan Lambert, Matthew E. Peters, Pradeep Dasigi, Joel Jang, David Wadden, Noah A. Smith, Iz Beltagy, Hannaneh Hajishirzi:
Camels in a Changing Climate: Enhancing LM Adaptation with Tulu 2. CoRR abs/2311.10702 (2023)
- 2022
- [i16] Nathan Lambert, Markus Wulfmeier, William F. Whitney, Arunkumar Byravan, Michael Bloesch, Vibhavari Dasagi, Tim Hertweck, Martin A. Riedmiller:
The Challenges of Exploration for Offline Reinforcement Learning. CoRR abs/2201.11861 (2022)
- [i15] Thomas Krendl Gilbert, Sarah Dean, Tom Zick, Nathan Lambert:
Choices, Risks, and Reward Reports: Charting Public Policy for Reinforcement Learning Systems. CoRR abs/2202.05716 (2022)
- [i14] Nathan O. Lambert, Kristofer S. J. Pister, Roberto Calandra:
Investigating Compounding Prediction Errors in Learned Dynamics Models. CoRR abs/2203.09637 (2022)
- [i13] Thomas Krendl Gilbert, Sarah Dean, Nathan Lambert, Tom Zick, Aaron J. Snoswell:
Reward Reports for Reinforcement Learning. CoRR abs/2204.10817 (2022)
- [i12] Margaret Mitchell, Alexandra Sasha Luccioni, Nathan Lambert, Marissa Gerchick, Angelina McMillan-Major, Ezinwanne Ozoani, Nazneen Rajani, Tristan Thrush, Yacine Jernite, Douwe Kiela:
Measuring Data. CoRR abs/2212.05129 (2022)
- 2021
- [j3] Nathan O. Lambert, Craig B. Schindler, Daniel S. Drew, Kristofer S. J. Pister:
Nonholonomic Yaw Control of an Underactuated Flying Robot With Model-Based Reinforcement Learning. IEEE Robotics Autom. Lett. 6(1): 455-461 (2021)
- [c6] Baohe Zhang, Raghu Rajan, Luis Pineda, Nathan O. Lambert, André Biedenkapp, Kurtland Chua, Frank Hutter, Roberto Calandra:
On the Importance of Hyperparameter Optimization for Model-based Reinforcement Learning. AISTATS 2021: 4015-4023
- [c5] Nathan O. Lambert, Albert Wilcox, Howard Zhang, Kristofer S. J. Pister, Roberto Calandra:
Learning Accurate Long-term Dynamics for Model-based Reinforcement Learning. CDC 2021: 2880-2887
- [c4] Mark Selden, Jason Zhou, Felipe Campos, Nathan O. Lambert, Daniel S. Drew, Kristofer S. J. Pister:
BotNet: A Simulator for Studying the Effects of Accurate Communication Models on Multi-Agent and Swarm Control. MRS 2021: 101-109
- [i11] McKane Andrus, Sarah Dean, Thomas Krendl Gilbert, Nathan Lambert, Tom Zick:
AI Development for the Public Interest: From Abstraction Traps to Sociotechnical Risks. CoRR abs/2102.04255 (2021)
- [i10] Baohe Zhang, Raghu Rajan, Luis Pineda, Nathan O. Lambert, André Biedenkapp, Kurtland Chua, Frank Hutter, Roberto Calandra:
On the Importance of Hyperparameter Optimization for Model-based Reinforcement Learning. CoRR abs/2102.13651 (2021)
- [i9] Luis Pineda, Brandon Amos, Amy Zhang, Nathan O. Lambert, Roberto Calandra:
MBRL-Lib: A Modular Library for Model-based Reinforcement Learning. CoRR abs/2104.10159 (2021)
- [i8] Sarah Dean, Thomas Krendl Gilbert, Nathan Lambert, Tom Zick:
Axes for Sociotechnical Inquiry in AI Research. CoRR abs/2105.06551 (2021)
- [i7] Mark Selden, Jason Zhou, Felipe Campos, Nathan O. Lambert, Daniel S. Drew, Kristofer S. J. Pister:
BotNet: A Simulator for Studying the Effects of Accurate Communication Models on Multi-agent and Swarm Control. CoRR abs/2108.13606 (2021)
- 2020
- [c3] Tianyu Li, Nathan O. Lambert, Roberto Calandra, Franziska Meier, Akshara Rai:
Learning Generalizable Locomotion Skills with Hierarchical Reinforcement Learning. ICRA 2020: 413-419
- [c2] McKane Andrus, Sarah Dean, Thomas Krendl Gilbert, Nathan Lambert, Tom Zick:
AI Development for the Public Interest: From Abstraction Traps to Sociotechnical Risks. ISTAS 2020: 72-79
- [c1] Nathan O. Lambert, Brandon Amos, Omry Yadan, Roberto Calandra:
Objective Mismatch in Model-based Reinforcement Learning. L4DC 2020: 761-770
- [i6] Nathan O. Lambert, Brandon Amos, Omry Yadan, Roberto Calandra:
Objective Mismatch in Model-based Reinforcement Learning. CoRR abs/2002.04523 (2020)
- [i5] Nathan O. Lambert, Farhan Toddywala, Brian Liao, Eric Zhu, Lydia Lee, Kristofer S. J. Pister:
Learning for Microrobot Exploration: Model-based Locomotion, Sparse-robust Navigation, and Low-power Deep Classification. CoRR abs/2004.13194 (2020)
- [i4] Nathan O. Lambert, Craig B. Schindler, Daniel S. Drew, Kristofer S. J. Pister:
Nonholonomic Yaw Control of an Underactuated Flying Robot with Model-based Reinforcement Learning. CoRR abs/2009.01221 (2020)
- [i3] Nathan O. Lambert, Albert Wilcox, Howard Zhang, Kristofer S. J. Pister, Roberto Calandra:
Learning Accurate Long-term Dynamics for Model-based Reinforcement Learning. CoRR abs/2012.09156 (2020)
2010 – 2019
- 2019
- [j2] Nathan O. Lambert, Daniel S. Drew, Joseph Yaconelli, Sergey Levine, Roberto Calandra, Kristofer S. J. Pister:
Low-Level Control of a Quadrotor With Deep Model-Based Reinforcement Learning. IEEE Robotics Autom. Lett. 4(4): 4224-4230 (2019)
- [i2] Nathan O. Lambert, Daniel S. Drew, Joseph Yaconelli, Roberto Calandra, Sergey Levine, Kristofer S. J. Pister:
Low Level Control of a Quadrotor with Deep Model-Based Reinforcement learning. CoRR abs/1901.03737 (2019)
- [i1] Tianyu Li, Nathan O. Lambert, Roberto Calandra, Franziska Meier, Akshara Rai:
Learning Generalizable Locomotion Skills with Hierarchical Reinforcement Learning. CoRR abs/1909.12324 (2019)
- 2018
- [j1] Daniel S. Drew, Nathan O. Lambert, Craig B. Schindler, Kristofer S. J. Pister:
Toward Controlled Flight of the Ionocraft: A Flying Microrobot Using Electrohydrodynamic Thrust With Onboard Sensing and No Moving Parts. IEEE Robotics Autom. Lett. 3(4): 2807-2813 (2018)
last updated on 2024-10-07 21:21 CEST by the dblp team
all metadata released as open data under CC0 1.0 license