12th ICLR 2024: Vienna, Austria
- The Twelfth International Conference on Learning Representations, ICLR 2024, Vienna, Austria, May 7-11, 2024. OpenReview.net 2024
Accept (oral)
- Yonatan Oren, Nicole Meister, Niladri S. Chatterji, Faisal Ladhak, Tatsunori Hashimoto:
Proving Test Set Contamination in Black-Box Language Models.
- Yapei Chang, Kyle Lo, Tanya Goyal, Mohit Iyyer:
BooookScore: A systematic exploration of book-length summarization in the era of LLMs.
- Zahra Kadkhodaie, Florentin Guth, Eero P. Simoncelli, Stéphane Mallat:
Generalization in diffusion models arises from geometry-adaptive harmonic representations.
- Satwik Bhattamishra, Arkil Patel, Phil Blunsom, Varun Kanade:
Understanding In-Context Learning in Transformers and LLMs by Learning to Learn Discrete Functions.
- Gautam Reddy:
The mechanistic basis of data dependence and abrupt learning in an in-context classification task.
- Yang Song, Prafulla Dhariwal:
Improved Techniques for Training Consistency Models.
- Thaddäus Wiedemer, Jack Brady, Alexander Panfilov, Attila Juhos, Matthias Bethge, Wieland Brendel:
Provable Compositional Generalization for Object-Centric Learning.
- Ching Fang, Kim Stachenfeld:
Predictive auxiliary objectives in deep RL mimic learning in the brain.
- Haoqi Yuan, Zhancun Mu, Feiyang Xie, Zongqing Lu:
Pre-Training Goal-based Models for Sample-Efficient Reinforcement Learning.
- Xiangyu Qi, Yi Zeng, Tinghao Xie, Pin-Yu Chen, Ruoxi Jia, Prateek Mittal, Peter Henderson:
Fine-tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend To!
- Ido Amos, Jonathan Berant, Ankit Gupta:
Never Train from Scratch: Fair Comparison of Long-Sequence Models Requires Data-Driven Priors.
- Yixiao Li, Yifan Yu, Chen Liang, Nikos Karampatziakis, Pengcheng He, Weizhu Chen, Tuo Zhao:
LoftQ: LoRA-Fine-Tuning-aware Quantization for Large Language Models.
- Miltiadis Kofinas, Boris Knyazev, Yan Zhang, Yunlu Chen, Gertjan J. Burghouts, Efstratios Gavves, Cees G. M. Snoek, David W. Zhang:
Graph Neural Networks for Learning Equivariant Representations of Neural Networks.
- Zaishuo Xia, Han Yang, Binghui Wang, Jinyuan Jia:
GNNCert: Deterministic Certification of Graph Neural Networks against Adversarial Perturbations.
- Hyungho Na, Yunkyeong Seo, Il-Chul Moon:
Efficient Episodic Memory Utilization of Cooperative Multi-Agent Reinforcement Learning.
- Yogesh Verma, Markus Heinonen, Vikas Garg:
ClimODE: Climate and Weather Forecasting with Physics-informed Neural ODEs.
- Hengrui Zhang, Jiani Zhang, Zhengyuan Shen, Balasubramaniam Srinivasan, Xiao Qin, Christos Faloutsos, Huzefa Rangwala, George Karypis:
Mixed-Type Tabular Data Synthesis with Score-based Diffusion in Latent Space.
- Linlu Qiu, Liwei Jiang, Ximing Lu, Melanie Sclar, Valentina Pyatkin, Chandra Bhagavatula, Bailin Wang, Yoon Kim, Yejin Choi, Nouha Dziri, Xiang Ren:
Phenomenal Yet Puzzling: Testing Inductive Reasoning Capabilities of Language Models with Hypothesis Refinement.
- Bohang Zhang, Jingchu Gai, Yiheng Du, Qiwei Ye, Di He, Liwei Wang:
Beyond Weisfeiler-Lehman: A Quantitative Framework for GNN Expressiveness.
- Pan Lu, Hritik Bansal, Tony Xia, Jiacheng Liu, Chunyuan Li, Hannaneh Hajishirzi, Hao Cheng, Kai-Wei Chang, Michel Galley, Jianfeng Gao:
MathVista: Evaluating Mathematical Reasoning of Foundation Models in Visual Contexts.
- Nathan C. Frey, Daniel Berenberg, Karina Zadorozhny, Joseph Kleinhenz, Julien Lafrance-Vanasse, Isidro Hötzel, Yan Wu, Stephen Ra, Richard Bonneau, Kyunghyun Cho, Andreas Loukas, Vladimir Gligorijevic, Saeed Saremi:
Protein Discovery with Discrete Walk-Jump Sampling.
- Kensen Shi, Joey Hong, Yinlin Deng, Pengcheng Yin, Manzil Zaheer, Charles Sutton:
ExeDec: Execution Decomposition for Compositional Generalization in Neural Program Synthesis.
- Yeming Wen, Swarat Chaudhuri:
Batched Low-Rank Adaptation of Foundation Models.
- Atsushi Shimizu, Xiaoou Cheng, Christopher Musco, Jonathan Weare:
Improved Active Learning via Dependent Leverage Score Sampling.
- Suyu Ge, Yunan Zhang, Liyuan Liu, Minjia Zhang, Jiawei Han, Jianfeng Gao:
Model Tells You What to Discard: Adaptive KV Cache Compression for LLMs.
- Galen Andrew, Peter Kairouz, Sewoong Oh, Alina Oprea, Hugh Brendan McMahan, Vinith Menon Suriyakumar:
One-shot Empirical Privacy Estimation for Federated Learning.
- Carlos E. Jimenez, John Yang, Alexander Wettig, Shunyu Yao, Kexin Pei, Ofir Press, Karthik R. Narasimhan:
SWE-bench: Can Language Models Resolve Real-world Github Issues?
- Seyed-Iman Mirzadeh, Keivan Alizadeh-Vahid, Sachin Mehta, Carlo C. del Mundo, Oncel Tuzel, Golnoosh Samei, Mohammad Rastegari, Mehrdad Farajtabar:
ReLU Strikes Back: Exploiting Activation Sparsity in Large Language Models.
- Yiding Jiang, Christina Baek, J. Zico Kolter:
On the Joint Interaction of Models, Data, and Features.
- Ismail Yunus Akhalwaya, Shashanka Ubaru, Kenneth L. Clarkson, Mark S. Squillante, Vishnu Jejjala, Yang-Hui He, Kugendran Naidoo, Vasileios Kalantzis, Lior Horesh:
Topological data analysis on noisy quantum computers.
- Akari Asai, Zeqiu Wu, Yizhong Wang, Avirup Sil, Hannaneh Hajishirzi:
Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection.
- Anshuman Chhabra, Peizhao Li, Prasant Mohapatra, Hongfu Liu:
"What Data Benefits My Classifier?" Enhancing Model Performance and Interpretability through Influence-Based Data Selection.
- Tianrong Chen, Jiatao Gu, Laurent Dinh, Evangelos A. Theodorou, Joshua M. Susskind, Shuangfei Zhai:
Generative Modeling with Phase Stochastic Bridge.
- Zengwei Yao, Liyong Guo, Xiaoyu Yang, Wei Kang, Fangjun Kuang, Yifan Yang, Zengrui Jin, Long Lin, Daniel Povey:
Zipformer: A faster and better encoder for automatic speech recognition.
- Sirui Hong, Mingchen Zhuge, Jonathan Chen, Xiawu Zheng, Yuheng Cheng, Jinlin Wang, Ceyao Zhang, Zili Wang, Steven Ka Shing Yau, Zijuan Lin, Liyang Zhou, Chenyu Ran, Lingfeng Xiao, Chenglin Wu, Jürgen Schmidhuber:
MetaGPT: Meta Programming for A Multi-Agent Collaborative Framework.
- Kim-Celine Kahl, Carsten T. Lüth, Maximilian Zenk, Klaus H. Maier-Hein, Paul F. Jaeger:
ValUES: A Framework for Systematic Validation of Uncertainty Estimation in Semantic Segmentation.
- Xudong Shen, Chao Du, Tianyu Pang, Min Lin, Yongkang Wong, Mohan S. Kankanhalli:
Finetuning Text-to-Image Diffusion Models for Fairness.
- Shangbin Feng, Weijia Shi, Yuyang Bai, Vidhisha Balachandran, Tianxing He, Yulia Tsvetkov:
Knowledge Card: Filling LLMs' Knowledge Gaps with Plug-in Specialized Language Models.
- Seohong Park, Oleh Rybkin, Sergey Levine:
METRA: Scalable Unsupervised RL with Metric-Aware Abstraction.
- Yichen Wu, Long-Kai Huang, Renzhen Wang, Deyu Meng, Ying Wei:
Meta Continual Learning Revisited: Implicitly Enhancing Online Hessian Approximation via Variance Reduction.
- Bo Zhao, Robert M. Gower, Robin Walters, Rose Yu:
Improving Convergence and Generalization Using Parameter Symmetries.
- Ricky T. Q. Chen, Yaron Lipman:
Flow Matching on General Geometries.
- Zhen Liu, Yao Feng, Yuliang Xiu, Weiyang Liu, Liam Paull, Michael J. Black, Bernhard Schölkopf:
Ghost on the Shell: An Expressive Representation of General 3D Shapes.
- Pablo Pernias, Dominic Rampas, Mats L. Richter, Christopher Pal, Marc Aubreville:
Würstchen: An Efficient Architecture for Large-Scale Text-to-Image Diffusion Models.
- Yuxuan Song, Jingjing Gong, Hao Zhou, Mingyue Zheng, Jingjing Liu, Wei-Ying Ma:
Unified Generative Modeling of 3D Molecules with Bayesian Flow Networks.
- Mitchell Wortsman, Peter J. Liu, Lechao Xiao, Katie E. Everett, Alexander A. Alemi, Ben Adlam, John D. Co-Reyes, Izzeddin Gur, Abhishek Kumar, Roman Novak, Jeffrey Pennington, Jascha Sohl-Dickstein, Kelvin Xu, Jaehoon Lee, Justin Gilmer, Simon Kornblith:
Small-scale proxies for large-scale Transformer training instabilities.
- Pascal Chang, Jingwei Tang, Markus Gross, Vinicius C. Azevedo:
How I Warped Your Noise: a Temporally-Correlated Noise Prior for Diffusion Models.
- Timothée Darcet, Maxime Oquab, Julien Mairal, Piotr Bojanowski:
Vision Transformers Need Registers.
- Sergei Solonets, Daniil Sinitsyn, Lukas von Stumberg, Nikita Araslanov, Daniel Cremers:
An Analytical Solution to Gauss-Newton Loss for Direct Image Alignment.
- Hyosoon Jang, Minsu Kim, Sungsoo Ahn:
Learning Energy Decompositions for Partial Inference in GFlowNets.
- Ian Gemp, Luke Marris, Georgios Piliouras:
Approximating Nash Equilibria in Normal-Form Games via Stochastic Optimization.
- Giorgio Mariani, Irene Tallini, Emilian Postolache, Michele Mancusi, Luca Cosmo, Emanuele Rodolà:
Multi-Source Diffusion Models for Simultaneous Music Generation and Separation.
- Haiming Wang, Huajian Xin, Chuanyang Zheng, Zhengying Liu, Qingxing Cao, Yinya Huang, Jing Xiong, Han Shi, Enze Xie, Jian Yin, Zhenguo Li, Xiaodan Liang:
LEGO-Prover: Neural Theorem Proving with Growing Libraries.
- Marius Memmel, Andrew Wagenmaker, Chuning Zhu, Dieter Fox, Abhishek Gupta:
ASID: Active Exploration for System Identification in Robotic Manipulation.
- Germain Kolossov, Andrea Montanari, Pulkit Tandon:
Towards a statistical theory of data selection under weak supervision.
- Mohammad Reza Samsami, Artem Zholus, Janarthanan Rajendran, Sarath Chandar:
Mastering Memory Tasks with World Models.
- Gabriel Cardoso, Yazid Janati El Idrissi, Sylvain Le Corff, Eric Moulines:
Monte Carlo guided Denoising Diffusion models for Bayesian linear inverse problems.
- Xian Li, Ping Yu, Chunting Zhou, Timo Schick, Omer Levy, Luke Zettlemoyer, Jason Weston, Mike Lewis:
Self-Alignment with Instruction Backtranslation.
- Sherry Yang, Yilun Du, Seyed Kamyar Seyed Ghasemipour, Jonathan Tompson, Leslie Pack Kaelbling, Dale Schuurmans, Pieter Abbeel:
Learning Interactive Real-World Simulators.
- Shuo He, Chaojie Wang, Guowu Yang, Lei Feng:
Candidate Label Set Pruning: A Data-centric Perspective for Deep Partial-label Learning.
- Jonathan Richens, Tom Everitt:
Robust agents learn causal world models.
- Jen-tse Huang, Wenxuan Wang, Eric John Li, Man Ho Lam, Shujie Ren, Youliang Yuan, Wenxiang Jiao, Zhaopeng Tu, Michael R. Lyu:
On the Humanity of Conversational AI: Evaluating the Psychological Portrayal of LLMs.
- Jisu Nam, Gyuseong Lee, Sunwoo Kim, Hyeonsu Kim, Hyoungwon Cho, Seyeon Kim, Seungryong Kim:
Diffusion Model for Dense Matching.
- Shashanka Venkataramanan, Mamshad Nayeem Rizve, João Carreira, Yuki M. Asano, Yannis Avrithis:
Is ImageNet worth 1 video? Learning strong image encoders from 1 long unlabelled video.
- Panagiotis Eustratiadis, Lukasz Dudziak, Da Li, Timothy M. Hospedales:
Neural Fine-Tuning Search for Few-Shot Learning.
- Qiuhao Zeng, Changjian Shui, Long-Kai Huang, Peng Liu, Xi Chen, Charles Ling, Boyu Wang:
Latent Trajectory Learning for Limited Timestamps under Distribution Shift over Time.
- Ruoyu Chen, Hua Zhang, Siyuan Liang, Jingzhi Li, Xiaochun Cao:
Less is More: Fewer Interpretable Region via Submodular Subset Selection.
- Jason Y. Zhang, Amy Lin, Moneish Kumar, Tzu-Hsuan Yang, Deva Ramanan, Shubham Tulsiani:
Cameras as Rays: Pose Estimation via Ray Diffusion.
- Jie Hu, Vishwaraj Doshi, Do Young Eun:
Accelerating Distributed Stochastic Optimization via Self-Repellent Random Walks.
- Yuxin Wen, Yuchen Liu, Chen Chen, Lingjuan Lyu:
Detecting, Explaining, and Mitigating Memorization in Diffusion Models.
- Sebastian Pineda-Arango, Fabio Ferreira, Arlind Kadra, Frank Hutter, Josif Grabocka:
Quick-Tune: Quickly Learning Which Pretrained Model to Finetune and How.
- Yukang Chen, Shengju Qian, Haotian Tang, Xin Lai, Zhijian Liu, Song Han, Jiaya Jia:
LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models.
- Edward J. Hu, Moksh Jain, Eric Elmoznino, Younesse Kaddar, Guillaume Lajoie, Yoshua Bengio, Nikolay Malkin:
Amortizing intractable inference in large language models.
- Ahmad Faiz, Sotaro Kaneda, Ruhan Wang, Rita Chukwunyere Osi, Prateek Sharma, Fan Chen, Lei Jiang:
LLMCarbon: Modeling the End-to-End Carbon Footprint of Large Language Models.
- Izzeddin Gur, Hiroki Furuta, Austin V. Huang, Mustafa Safdari, Yutaka Matsuo, Douglas Eck, Aleksandra Faust:
A Real-World WebAgent with Planning, Long Context Understanding, and Program Synthesis.
- Zhantao Yang, Ruili Feng, Han Zhang, Yujun Shen, Kai Zhu, Lianghua Huang, Yifei Zhang, Yu Liu, Deli Zhao, Jingren Zhou, Fan Cheng:
Lipschitz Singularities in Diffusion Models.
- Yossi Gandelsman, Alexei A. Efros, Jacob Steinhardt:
Interpreting CLIP's Image Representation via Text-Based Decomposition.
- Yang He, Lingao Xiao, Joey Tianyi Zhou, Ivor W. Tsang:
Multisize Dataset Condensation.
- Jiaxiang Tang, Jiawei Ren, Hang Zhou, Ziwei Liu, Gang Zeng:
DreamGaussian: Generative Gaussian Splatting for Efficient 3D Content Creation.
- Yicong Hong, Kai Zhang, Jiuxiang Gu, Sai Bi, Yang Zhou, Difan Liu, Feng Liu, Kalyan Sunkavalli, Trung Bui, Hao Tan:
LRM: Large Reconstruction Model for Single Image to 3D.
- Wenxuan Li, Alan L. Yuille, Zongwei Zhou:
How Well Do Supervised 3D Models Transfer to Medical Imaging Tasks?
- Haoyue Dai, Ignavier Ng, Gongxu Luo, Peter Spirtes, Petar Stojanov, Kun Zhang:
Gene Regulatory Network Inference in the Presence of Dropouts: a Causal View.
- Yubo Zhuang, Xiaohui Chen, Yun Yang, Richard Y. Zhang:
Statistically Optimal K-means Clustering via Nonnegative Low-rank Semidefinite Programming.
- André F. Cruz, Moritz Hardt:
Unprocessing Seven Years of Algorithmic Fairness.
- Ziheng Qin, Kai Wang, Zangwei Zheng, Jianyang Gu, Xiangyu Peng, Zhaopan Xu, Daquan Zhou, Lei Shang, Baigui Sun, Xuansong Xie, Yang You:
InfoBatch: Lossless Training Speed Up by Unbiased Dynamic Data Pruning.
- Yijie Lin, Jie Zhang, Zhenyu Huang, Jia Liu, Zujie Wen, Xi Peng:
Multi-granularity Correspondence Learning from Long-term Noisy Videos.
Accept (spotlight)
- Sravanthi Gurugubelli, Sundeep Prabhakar Chepuri:
SaNN: Simple Yet Powerful Simplicial-aware Neural Networks.
- Robin Staab, Mark Vero, Mislav Balunovic, Martin T. Vechev:
Beyond Memorization: Violating Privacy via Inference with Large Language Models.
- Jasper Dekoninck, Marc Fischer, Luca Beurer-Kellner, Martin T. Vechev:
Controlled Text Generation via Language Model Arithmetic.
- Nan Chen, Zemin Liu, Bryan Hooi, Bingsheng He, Rizal Fathony, Jun Hu, Jia Chen:
Consistency Training with Learnable Data Augmentation for Graph Anomaly Detection with Limited Supervision.
- Juno Kim, Kakei Yamamoto, Kazusato Oko, Zhuoran Yang, Taiji Suzuki:
Symmetric Mean-field Langevin Dynamics for Distributional Minimax Problems.
- Suhan Shetty, Teng Xue, Sylvain Calinon:
Generalized Policy Iteration using Tensor Approximation for Hybrid Control.
- Maksim Velikanov, Maxim Panov, Dmitry Yarotsky:
Generalization error of spectral algorithms.
- Haoxuan Li, Chunyuan Zheng, Yanghao Xiao, Peng Wu, Zhi Geng, Xu Chen, Peng Cui:
Debiased Collaborative Filtering with Kernel-Based Causal Balancing.
- Cassidy Laidlaw, Banghua Zhu, Stuart Russell, Anca D. Dragan:
The Effective Horizon Explains Deep RL Performance in Stochastic Environments.
- Ainaz Eftekhar, Kuo-Hao Zeng, Jiafei Duan, Ali Farhadi, Aniruddha Kembhavi, Ranjay Krishna:
Selective Visual Representations Improve Convergence and Generalization for Embodied AI.
- Rui Zheng, Wei Shen, Yuan Hua, Wenbin Lai, Shihan Dou, Yuhao Zhou, Zhiheng Xi, Xiao Wang, Haoran Huang, Tao Gui, Qi Zhang, Xuanjing Huang:
Improving Generalization of Alignment with Human Preferences through Group Invariant Learning.
- Gregory Kang Ruey Lau, Apivich Hemachandra, See-Kiong Ng, Bryan Kian Hsiang Low:
PINNACLE: PINN Adaptive ColLocation and Experimental points selection.
- Mikhail Khodak, Edmond Chow, Maria-Florina Balcan, Ameet Talwalkar:
Learning to Relax: Setting Solver Parameters Across a Sequence of Linear System Instances.
- Guodong Wang, Yunhong Wang, Xiuguo Bao, Di Huang:
Rotation Has Two Sides: Evaluating Data Augmentation for Deep One-class Classification.
- Lin-Han Jia, Lan-Zhe Guo, Zhi Zhou, Yu-Feng Li:
Realistic Evaluation of Semi-supervised Learning Algorithms in Open Environments.
- Denizalp Goktas, Amy Greenwald, Sadie Zhao, Alec Koppel, Sumitra Ganesh:
Efficient Inverse Multiagent Learning.
- Tianqi Du, Yifei Wang, Yisen Wang:
On the Role of Discrete Tokenization in Visual Representation Learning.
- Athul Paul Jacob, Yikang Shen, Gabriele Farina, Jacob Andreas:
The Consensus Game: Language Model Generation via Equilibrium Search.
- Jake Grigsby, Linxi Fan, Yuke Zhu:
AMAGO: Scalable In-Context Reinforcement Learning for Adaptive Agents.
- Zhuqing Liu, Xin Zhang, Jia Liu, Zhengyuan Zhu, Songtao Lu:
PILOT: An $\mathcal{O}(1/K)$-Convergent Approach for Policy Evaluation with Nonlinear Function Approximation.
- Ted Moskovitz, Aaditya K. Singh, DJ Strouse, Tuomas Sandholm, Ruslan Salakhutdinov, Anca D. Dragan, Stephen Marcus McAleer:
Confronting Reward Model Overoptimization with Constrained RLHF.
- Vimal Thilak, Chen Huang, Omid Saremi, Laurent Dinh, Hanlin Goh, Preetum Nakkiran, Joshua M. Susskind, Etai Littwin:
LiDAR: Sensing Linear Probing Performance in Joint Embedding SSL Architectures.
- Jiayang Liu, Yiming Bu, Daniel Tso, Qinru Qiu:
Improved Efficiency Based on Learned Saccade and Continuous Scene Reconstruction From Foveated Visual Sampling.
- Danny Halawi, Jean-Stanislas Denain, Jacob Steinhardt:
Overthinking the Truth: Understanding how Language Models Process False Demonstrations.
- Ibraheem Muhammad Moosa, Rui Zhang, Wenpeng Yin:
MT-Ranker: Reference-free machine translation evaluation by inter-system ranking.
- Zayne Sprague, Xi Ye, Kaj Bostrom, Swarat Chaudhuri, Greg Durrett:
MuSR: Testing the Limits of Chain-of-thought with Multistep Soft Reasoning.
- Philip Amortila, Dylan J. Foster, Nan Jiang, Ayush Sekhari, Tengyang Xie:
Harnessing Density Ratios for Online Reinforcement Learning.
- Hanqi Zhou, Robert Bamler, Charley M. Wu, Álvaro Tejero-Cantero:
Predictive, scalable and interpretable knowledge tracing on structured domains.
- Irene Cannistraci, Luca Moschella, Marco Fumero, Valentino Maiorca, Emanuele Rodolà:
From Bricks to Bridges: Product of Invariances to Enhance Latent Space Communication.
- Sumeet Batra, Bryon Tjanaka, Matthew Christopher Fontaine, Aleksei Petrenko, Stefanos Nikolaidis, Gaurav S. Sukhatme:
Proximal Policy Gradient Arborescence for Quality Diversity Reinforcement Learning.
- Sadegh Mahdavi, Renjie Liao, Christos Thrampoulidis:
Memorization Capacity of Multi-Head Attention in Transformers.
- Jack Merullo, Carsten Eickhoff, Ellie Pavlick:
Circuit Component Reuse Across Tasks in Transformer Language Models.
- Henry Li, Ronen Basri, Yuval Kluger:
Likelihood Training of Cascaded Diffusion Models via Hierarchical Volume-preserving Maps.
- Divyat Mahajan, Ioannis Mitliagkas, Brady Neal, Vasilis Syrgkanis:
Empirical Analysis of Model Selection for Heterogeneous Causal Effect Estimation.
- Ali Shahin Shamsabadi, Gefei Tan, Tudor Cebere, Aurélien Bellet, Hamed Haddadi, Nicolas Papernot, Xiao Wang, Adrian Weller:
Confidential-DPproof: Confidential Proof of Differentially Private Training.
- Weijia Shi, Sewon Min, Maria Lomeli, Chunting Zhou, Margaret Li, Xi Victoria Lin, Noah A. Smith, Luke Zettlemoyer, Wen-tau Yih, Mike Lewis:
In-Context Pretraining: Language Modeling Beyond Document Boundaries.
- Yanai Elazar, Akshita Bhagia, Ian Magnusson, Abhilasha Ravichander, Dustin Schwenk, Alane Suhr, Evan Pete Walsh, Dirk Groeneveld, Luca Soldaini, Sameer Singh, Hannaneh Hajishirzi, Noah A. Smith, Jesse Dodge:
What's In My Big Data?