Sanjiv Kumar
2020 – today
- 2024
- [j17] Michal Lukasik, Vaishnavh Nagarajan, Ankit Singh Rawat, Aditya Krishna Menon, Sanjiv Kumar: What do larger image classifiers memorise? Trans. Mach. Learn. Res. 2024 (2024)
- [c111] Sadeep Jayasumana, Srikumar Ramalingam, Andreas Veit, Daniel Glasner, Ayan Chakrabarti, Sanjiv Kumar: Rethinking FID: Towards a Better Evaluation Metric for Image Generation. CVPR 2024: 9307-9315
- [c110] Sadeep Jayasumana, Daniel Glasner, Srikumar Ramalingam, Andreas Veit, Ayan Chakrabarti, Sanjiv Kumar: MarkovGen: Structured Prediction for Efficient Text-to-Image Generation. CVPR 2024: 9316-9325
- [c109] Michal Lukasik, Harikrishna Narasimhan, Aditya Krishna Menon, Felix Yu, Sanjiv Kumar: Regression Aware Inference with LLMs. EMNLP (Findings) 2024: 13667-13678
- [c108] Lin Chen, Michal Lukasik, Wittawat Jitkrittum, Chong You, Sanjiv Kumar: On Bias-Variance Alignment in Deep Models. ICLR 2024
- [c107] Sachin Goyal, Ziwei Ji, Ankit Singh Rawat, Aditya Krishna Menon, Sanjiv Kumar, Vaishnavh Nagarajan: Think before you speak: Training Language Models With Pause Tokens. ICLR 2024
- [c106] Neha Gupta, Harikrishna Narasimhan, Wittawat Jitkrittum, Ankit Singh Rawat, Aditya Krishna Menon, Sanjiv Kumar: Language Model Cascades: Token-Level Uncertainty And Beyond. ICLR 2024
- [c105] Shanda Li, Chong You, Guru Guruganesh, Joshua Ainslie, Santiago Ontañón, Manzil Zaheer, Sumit Sanghai, Yiming Yang, Sanjiv Kumar, Srinadh Bhojanapalli: Functional Interpolation for Relative Positions improves Long Context Transformers. ICLR 2024
- [c104] Harikrishna Narasimhan, Aditya Krishna Menon, Wittawat Jitkrittum, Neha Gupta, Sanjiv Kumar: Learning to Reject Meets Long-tail Learning. ICLR 2024
- [c103] Harikrishna Narasimhan, Aditya Krishna Menon, Wittawat Jitkrittum, Sanjiv Kumar: Plugin estimators for selective classification with out-of-distribution detection. ICLR 2024
- [c102] Yihan Wang, Si Si, Daliang Li, Michal Lukasik, Felix Yu, Cho-Jui Hsieh, Inderjit S. Dhillon, Sanjiv Kumar: Two-stage LLM Fine-tuning with Less Specialization and More Generalization. ICLR 2024
- [c101] Yongchao Zhou, Kaifeng Lyu, Ankit Singh Rawat, Aditya Krishna Menon, Afshin Rostamizadeh, Sanjiv Kumar, Jean-François Kagy, Rishabh Agarwal: DistillSpec: Improving Speculative Decoding via Knowledge Distillation. ICLR 2024
- [c100] Khashayar Gatmiry, Nikunj Saunshi, Sashank J. Reddi, Stefanie Jegelka, Sanjiv Kumar: Can Looped Transformers Learn to Implement Multi-step Gradient Descent for In-context Learning? ICML 2024
- [c99] Seungyeon Kim, Ankit Singh Rawat, Manzil Zaheer, Wittawat Jitkrittum, Veeranjaneyulu Sadhanala, Sadeep Jayasumana, Aditya Krishna Menon, Rob Fergus, Sanjiv Kumar: USTAD: Unified Single-model Training Achieving Diverse Scores for Information Retrieval. ICML 2024
- [c98] Yuchen Li, Alexandre Kirchmeyer, Aashay Mehta, Yilong Qin, Boris Dadachev, Kishore Papineni, Sanjiv Kumar, Andrej Risteski: Promises and Pitfalls of Generative Masked Language Modeling: Theoretical Framework and Practical Guidelines. ICML 2024
- [c97] Aishwarya P. S., Pranav Ajit Nair, Yashas Samaga, Toby Boyd, Sanjiv Kumar, Prateek Jain, Praneeth Netrapalli: Tandem Transformers for Inference Efficient LLMs. ICML 2024
- [i112] Sadeep Jayasumana, Srikumar Ramalingam, Andreas Veit, Daniel Glasner, Ayan Chakrabarti, Sanjiv Kumar: Rethinking FID: Towards a Better Evaluation Metric for Image Generation. CoRR abs/2401.09603 (2024)
- [i111] Ke Ye, Heinrich Jiang, Afshin Rostamizadeh, Ayan Chakrabarti, Giulia DeSalvo, Jean-François Kagy, Lazaros Karydas, Gui Citovsky, Sanjiv Kumar: SpacTor-T5: Pre-training T5 Models with Span Corruption and Replaced Token Detection. CoRR abs/2401.13160 (2024)
- [i110] Abhishek Panigrahi, Nikunj Saunshi, Kaifeng Lyu, Sobhan Miryoosefi, Sashank J. Reddi, Satyen Kale, Sanjiv Kumar: Efficient Stagewise Pretraining via Progressive Subnetworks. CoRR abs/2402.05913 (2024)
- [i109] Aishwarya P. S., Pranav Ajit Nair, Yashas Samaga, Toby Boyd, Sanjiv Kumar, Prateek Jain, Praneeth Netrapalli: Tandem Transformers for Inference Efficient LLMs. CoRR abs/2402.08644 (2024)
- [i108] Yashas Samaga, Varun Yerram, Chong You, Srinadh Bhojanapalli, Sanjiv Kumar, Prateek Jain, Praneeth Netrapalli: HiRE: High Recall Approximate Top-k Estimation for Efficient LLM Inference. CoRR abs/2402.09360 (2024)
- [i107] Michal Lukasik, Harikrishna Narasimhan, Aditya Krishna Menon, Felix X. Yu, Sanjiv Kumar: Metric-aware LLM inference. CoRR abs/2403.04182 (2024)
- [i106] Philip Sun, David Simcha, Dave Dopson, Ruiqi Guo, Sanjiv Kumar: SOAR: Improved Indexing for Approximate Nearest Neighbor Search. CoRR abs/2404.00774 (2024)
- [i105] Taehyeon Kim, Ananda Theertha Suresh, Kishore Papineni, Michael Riley, Sanjiv Kumar, Adrian Benton: Towards Fast Inference: Exploring and Improving Blockwise Parallel Drafts. CoRR abs/2404.09221 (2024)
- [i104] Neha Gupta, Harikrishna Narasimhan, Wittawat Jitkrittum, Ankit Singh Rawat, Aditya Krishna Menon, Sanjiv Kumar: Language Model Cascades: Token-level uncertainty and beyond. CoRR abs/2404.10136 (2024)
- [i103] Harikrishna Narasimhan, Wittawat Jitkrittum, Ankit Singh Rawat, Seungyeon Kim, Neha Gupta, Aditya Krishna Menon, Sanjiv Kumar: Faster Cascades via Speculative Decoding. CoRR abs/2405.19261 (2024)
- [i102] Stefani Karp, Nikunj Saunshi, Sobhan Miryoosefi, Sashank J. Reddi, Sanjiv Kumar: Landscape-Aware Growing: The Power of a Little LAG. CoRR abs/2406.02469 (2024)
- [i101] Ziwei Ji, Himanshu Jain, Andreas Veit, Sashank J. Reddi, Sadeep Jayasumana, Ankit Singh Rawat, Aditya Krishna Menon, Felix Yu, Sanjiv Kumar: Efficient Document Ranking with Learnable Late Interactions. CoRR abs/2406.17968 (2024)
- [i100] Yuchen Li, Alexandre Kirchmeyer, Aashay Mehta, Yilong Qin, Boris Dadachev, Kishore Papineni, Sanjiv Kumar, Andrej Risteski: Promises and Pitfalls of Generative Masked Language Modeling: Theoretical Framework and Practical Guidelines. CoRR abs/2407.21046 (2024)
- [i99] Nikunj Saunshi, Stefani Karp, Shankar Krishnan, Sobhan Miryoosefi, Sashank J. Reddi, Sanjiv Kumar: On the Inductive Bias of Stacking Towards Improving Reasoning. CoRR abs/2409.19044 (2024)
- [i98] Khashayar Gatmiry, Nikunj Saunshi, Sashank J. Reddi, Stefanie Jegelka, Sanjiv Kumar: Can Looped Transformers Learn to Implement Multi-step Gradient Descent for In-context Learning? CoRR abs/2410.08292 (2024)
- [i97] Asher Trockman, Hrayr Harutyunyan, J. Zico Kolter, Sanjiv Kumar, Srinadh Bhojanapalli: Mimetic Initialization Helps State Space Models Learn to Recall. CoRR abs/2410.11135 (2024)
- [i96] Giulia DeSalvo, Jean-François Kagy, Lazaros Karydas, Afshin Rostamizadeh, Sanjiv Kumar: No more hard prompts: SoftSRV prompting for synthetic data generation. CoRR abs/2410.16534 (2024)
- [i95] Ankit Singh Rawat, Veeranjaneyulu Sadhanala, Afshin Rostamizadeh, Ayan Chakrabarti, Wittawat Jitkrittum, Vladimir Feinberg, Seungyeon Kim, Hrayr Harutyunyan, Nikunj Saunshi, Zachary Nado, Rakesh Shivanna, Sashank J. Reddi, Aditya Krishna Menon, Rohan Anil, Sanjiv Kumar: A Little Help Goes a Long Way: Efficient LLM Training by Leveraging Small LMs. CoRR abs/2410.18779 (2024)
- [i94] Jui-Nan Yen, Si Si, Zhao Meng, Felix X. Yu, Sai Surya Duvvuri, Inderjit S. Dhillon, Cho-Jui Hsieh, Sanjiv Kumar: LoRA Done RITE: Robust Invariant Transformation Equilibration for LoRA Optimization. CoRR abs/2410.20625 (2024)
- [i93] Khashayar Gatmiry, Nikunj Saunshi, Sashank J. Reddi, Stefanie Jegelka, Sanjiv Kumar: On the Role of Depth and Looping for In-Context Learning with Task Diversity. CoRR abs/2410.21698 (2024)
- [i92] Gaurav Menghani, Ravi Kumar, Sanjiv Kumar: LAuReL: Learned Augmented Residual Layer. CoRR abs/2411.07501 (2024)
- 2023
- [c96] Daliang Li, Ankit Singh Rawat, Manzil Zaheer, Xin Wang, Michal Lukasik, Andreas Veit, Felix X. Yu, Sanjiv Kumar: Large Language Models with Controllable Working Memory. ACL (Findings) 2023: 1774-1793
- [c95] Gui Citovsky, Giulia DeSalvo, Sanjiv Kumar, Srikumar Ramalingam, Afshin Rostamizadeh, Yunjuan Wang: Leveraging Importance Weights in Subset Selection. ICLR 2023
- [c94] Hrayr Harutyunyan, Ankit Singh Rawat, Aditya Krishna Menon, Seungyeon Kim, Sanjiv Kumar: Supervision Complexity and its Role in Knowledge Distillation. ICLR 2023
- [c93] Zonglin Li, Chong You, Srinadh Bhojanapalli, Daliang Li, Ankit Singh Rawat, Sashank J. Reddi, Ke Ye, Felix Chern, Felix X. Yu, Ruiqi Guo, Sanjiv Kumar: The Lazy Neuron Phenomenon: On Emergence of Activation Sparsity in Transformers. ICLR 2023
- [c92] Si Si, Felix X. Yu, Ankit Singh Rawat, Cho-Jui Hsieh, Sanjiv Kumar: Serving Graph Compression for Graph Neural Networks. ICLR 2023
- [c91] Philip Sun, Ruiqi Guo, Sanjiv Kumar: Automating Nearest Neighbor Search Configuration with Constrained Optimization. ICLR 2023
- [c90] Manzil Zaheer, Ankit Singh Rawat, Seungyeon Kim, Chong You, Himanshu Jain, Andreas Veit, Rob Fergus, Sanjiv Kumar: Teacher Guided Training: An Efficient Framework for Knowledge Transfer. ICLR 2023
- [c89] Sashank J. Reddi, Sobhan Miryoosefi, Stefani Karp, Shankar Krishnan, Satyen Kale, Seungyeon Kim, Sanjiv Kumar: Efficient Training of Language Models using Few-Shot Learning. ICML 2023: 14553-14568
- [c88] Guna Shekar M., Wonjun Lee, Sanjiv Kumar, Yanan Duan, Imtiaz Rangwala: A Framework for Developing the Next Generation Interactive Soil Moisture Forecasting System Using the Long-Short Term Memory Model. ICMLA 2023: 1986-1993
- [c87] Wittawat Jitkrittum, Neha Gupta, Aditya Krishna Menon, Harikrishna Narasimhan, Ankit Singh Rawat, Sanjiv Kumar: When Does Confidence-Based Cascade Deferral Suffice? NeurIPS 2023
- [c86] Vaishnavh Nagarajan, Aditya Krishna Menon, Srinadh Bhojanapalli, Hossein Mobahi, Sanjiv Kumar: On student-teacher deviations in distillation: does it pay to disobey? NeurIPS 2023
- [c85] Philip Sun, David Simcha, Dave Dopson, Ruiqi Guo, Sanjiv Kumar: SOAR: Improved Indexing for Approximate Nearest Neighbor Search. NeurIPS 2023
- [c84] Zitong Yang, Michal Lukasik, Vaishnavh Nagarajan, Zonglin Li, Ankit Singh Rawat, Manzil Zaheer, Aditya Krishna Menon, Sanjiv Kumar: ResMem: Learn what you can and memorize the rest. NeurIPS 2023
- [i91] Philip Sun, Ruiqi Guo, Sanjiv Kumar: Automating Nearest Neighbor Search Configuration with Constrained Optimization. CoRR abs/2301.01702 (2023)
- [i90] Seungyeon Kim, Ankit Singh Rawat, Manzil Zaheer, Sadeep Jayasumana, Veeranjaneyulu Sadhanala, Wittawat Jitkrittum, Aditya Krishna Menon, Rob Fergus, Sanjiv Kumar: EmbedDistill: A Geometric Knowledge Distillation for Information Retrieval. CoRR abs/2301.12005 (2023)
- [i89] Gui Citovsky, Giulia DeSalvo, Sanjiv Kumar, Srikumar Ramalingam, Afshin Rostamizadeh, Yunjuan Wang: Leveraging Importance Weights in Subset Selection. CoRR abs/2301.12052 (2023)
- [i88] Hrayr Harutyunyan, Ankit Singh Rawat, Aditya Krishna Menon, Seungyeon Kim, Sanjiv Kumar: Supervision Complexity and its Role in Knowledge Distillation. CoRR abs/2301.12245 (2023)
- [i87] Harikrishna Narasimhan, Aditya Krishna Menon, Wittawat Jitkrittum, Sanjiv Kumar: Learning to reject meets OOD detection: Are all abstentions created equal? CoRR abs/2301.12386 (2023)
- [i86] Vaishnavh Nagarajan, Aditya Krishna Menon, Srinadh Bhojanapalli, Hossein Mobahi, Sanjiv Kumar: On student-teacher deviations in distillation: does it pay to disobey? CoRR abs/2301.12923 (2023)
- [i85] Zitong Yang, Michal Lukasik, Vaishnavh Nagarajan, Zonglin Li, Ankit Singh Rawat, Manzil Zaheer, Aditya Krishna Menon, Sanjiv Kumar: ResMem: Learn what you can and memorize the rest. CoRR abs/2302.01576 (2023)
- [i84] Samy Jelassi, Boris Hanin, Ziwei Ji, Sashank J. Reddi, Srinadh Bhojanapalli, Sanjiv Kumar: Depth Dependence of μP Learning Rates in ReLU MLPs. CoRR abs/2305.07810 (2023)
- [i83] Wittawat Jitkrittum, Neha Gupta, Aditya Krishna Menon, Harikrishna Narasimhan, Ankit Singh Rawat, Sanjiv Kumar: When Does Confidence-Based Cascade Deferral Suffice? CoRR abs/2307.02764 (2023)
- [i82] Sadeep Jayasumana, Daniel Glasner, Srikumar Ramalingam, Andreas Veit, Ayan Chakrabarti, Sanjiv Kumar: SPEGTI: Structured Prediction for Efficient Generative Text-to-Image Models. CoRR abs/2308.10997 (2023)
- [i81] Sachin Goyal, Ziwei Ji, Ankit Singh Rawat, Aditya Krishna Menon, Sanjiv Kumar, Vaishnavh Nagarajan: Think before you speak: Training Language Models With Pause Tokens. CoRR abs/2310.02226 (2023)
- [i80] Shanda Li, Chong You, Guru Guruganesh, Joshua Ainslie, Santiago Ontañón, Manzil Zaheer, Sumit Sanghai, Yiming Yang, Sanjiv Kumar, Srinadh Bhojanapalli: Functional Interpolation for Relative Positions Improves Long Context Transformers. CoRR abs/2310.04418 (2023)
- [i79] Michal Lukasik, Vaishnavh Nagarajan, Ankit Singh Rawat, Aditya Krishna Menon, Sanjiv Kumar: What do larger image classifiers memorise? CoRR abs/2310.05337 (2023)
- [i78] Yongchao Zhou, Kaifeng Lyu, Ankit Singh Rawat, Aditya Krishna Menon, Afshin Rostamizadeh, Sanjiv Kumar, Jean-François Kagy, Rishabh Agarwal: DistillSpec: Improving Speculative Decoding via Knowledge Distillation. CoRR abs/2310.08461 (2023)
- [i77] Lin Chen, Michal Lukasik, Wittawat Jitkrittum, Chong You, Sanjiv Kumar: It's an Alignment, Not a Trade-off: Revisiting Bias and Variance in Deep Models. CoRR abs/2310.09250 (2023)
- [i76] Renat Aksitov, Sobhan Miryoosefi, Zonglin Li, Daliang Li, Sheila Babayan, Kavya Kopparapu, Zachary Fisher, Ruiqi Guo, Sushant Prakash, Pranesh Srinivasan, Manzil Zaheer, Felix X. Yu, Sanjiv Kumar: ReST meets ReAct: Self-Improvement for Multi-Step Reasoning LLM Agent. CoRR abs/2312.10003 (2023)
- [i75] Srikumar Ramalingam, Pranjal Awasthi, Sanjiv Kumar: A Weighted K-Center Algorithm for Data Subset Selection. CoRR abs/2312.10602 (2023)
- 2022
- [j16] Michal Lukasik, Srinadh Bhojanapalli, Aditya Krishna Menon, Sanjiv Kumar: Teacher's pet: understanding and mitigating biases in distillation. Trans. Mach. Learn. Res. 2022 (2022)
- [c83] Gautam Gautam, Neeraj Sharma, Sanjiv Kumar: Radio over FSO System for 5G Wireless Communication. ICCCNT 2022: 1-6
- [c82] Zhiyuan Li, Srinadh Bhojanapalli, Manzil Zaheer, Sashank J. Reddi, Sanjiv Kumar: Robust Training of Neural Networks Using Scale Invariant Architectures. ICML 2022: 12656-12684
- [c81] Aditya Krishna Menon, Sadeep Jayasumana, Ankit Singh Rawat, Seungyeon Kim, Sashank J. Reddi, Sanjiv Kumar: In defense of dual-encoders for neural ranking. ICML 2022: 15376-15400
- [c80] Felix Chern, Blake Hechtman, Andy Davis, Ruiqi Guo, David Majnemer, Sanjiv Kumar: TPU-KNN: K Nearest Neighbor Search at Peak FLOP/s. NeurIPS 2022
- [c79] Zonglin Li, Ruiqi Guo, Sanjiv Kumar: Decoupled Context Processing for Context Augmented Language Modeling. NeurIPS 2022
- [c78] Harikrishna Narasimhan, Wittawat Jitkrittum, Aditya Krishna Menon, Ankit Singh Rawat, Sanjiv Kumar: Post-hoc estimators for learning to defer to an expert. NeurIPS 2022
- [i74] Zhiyuan Li, Srinadh Bhojanapalli, Manzil Zaheer, Sashank J. Reddi, Sanjiv Kumar: Robust Training of Neural Networks using Scale Invariant Architectures. CoRR abs/2202.00980 (2022)
- [i73] Taman Narayan, Heinrich Jiang, Sen Zhao, Sanjiv Kumar: Predicting on the Edge: Identifying Where a Larger Model Does Better. CoRR abs/2202.07652 (2022)
- [i72] Wittawat Jitkrittum, Aditya Krishna Menon, Ankit Singh Rawat, Sanjiv Kumar: ELM: Embedding and Logit Margins for Long-Tail Learning. CoRR abs/2204.13208 (2022)
- [i71] Felix Chern, Blake Hechtman, Andy Davis, Ruiqi Guo, David Majnemer, Sanjiv Kumar: TPU-KNN: K Nearest Neighbor Search at Peak FLOP/s. CoRR abs/2206.14286 (2022)
- [i70] Manzil Zaheer, Ankit Singh Rawat, Seungyeon Kim, Chong You, Himanshu Jain, Andreas Veit, Rob Fergus, Sanjiv Kumar: Teacher Guided Training: An Efficient Framework for Knowledge Transfer. CoRR abs/2208.06825 (2022)
- [i69] Zonglin Li, Ruiqi Guo, Sanjiv Kumar: Decoupled Context Processing for Context Augmented Language Modeling. CoRR abs/2210.05758 (2022)
- [i68] Zonglin Li, Chong You, Srinadh Bhojanapalli, Daliang Li, Ankit Singh Rawat, Sashank J. Reddi, Ke Ye, Felix Chern, Felix X. Yu, Ruiqi Guo, Sanjiv Kumar: Large Models are Parsimonious Learners: Activation Sparsity in Trained Transformers. CoRR abs/2210.06313 (2022)
- [i67] Arslan Chaudhry, Aditya Krishna Menon, Andreas Veit, Sadeep Jayasumana, Srikumar Ramalingam, Sanjiv Kumar: When does mixup promote local linearity in learned representations? CoRR abs/2210.16413 (2022)
- [i66] Yihan Wang, Si Si, Daliang Li, Michal Lukasik, Felix X. Yu, Cho-Jui Hsieh, Inderjit S. Dhillon, Sanjiv Kumar: Preserving In-Context Learning ability in Large Language Model Fine-tuning. CoRR abs/2211.00635 (2022)
- [i65] Daliang Li, Ankit Singh Rawat, Manzil Zaheer, Xin Wang, Michal Lukasik, Andreas Veit, Felix X. Yu, Sanjiv Kumar: Large Language Models with Controllable Working Memory. CoRR abs/2211.05110 (2022)
- 2021
- [j15] Ankita Verma, Savita, Sanjiv Kumar: Routing Protocols in Delay Tolerant Networks: Comparative and Empirical Analysis. Wirel. Pers. Commun. 118(1): 551-574 (2021)
- [c77] Sashank J. Reddi, Rama Kumar Pasumarthi, Aditya Krishna Menon, Ankit Singh Rawat, Felix X. Yu, Seungyeon Kim, Andreas Veit, Sanjiv Kumar: RankDistil: Knowledge Distillation for Ranking. AISTATS 2021: 2368-2376
- [c76] Cheng-Yu Hsieh, Chih-Kuan Yeh, Xuanqing Liu, Pradeep Kumar Ravikumar, Seungyeon Kim, Sanjiv Kumar, Cho-Jui Hsieh: Evaluations and Methods for Explanation through Robustness Analysis. ICLR 2021
- [c75] Aditya Krishna Menon, Sadeep Jayasumana, Ankit Singh Rawat, Himanshu Jain, Andreas Veit, Sanjiv Kumar: Long-tail learning via logit adjustment. ICLR 2021
- [c74] Aditya Krishna Menon, Ankit Singh Rawat, Sanjiv Kumar: Overparameterisation and worst-case generalisation: friend or foe? ICLR 2021
- [c73] Sashank J. Reddi, Zachary Charles, Manzil Zaheer, Zachary Garrett, Keith Rush, Jakub Konečný, Sanjiv Kumar, Hugh Brendan McMahan: Adaptive Federated Optimization. ICLR 2021
- [c72] Jingzhao Zhang, Aditya Krishna Menon, Andreas Veit, Srinadh Bhojanapalli, Sanjiv Kumar, Suvrit Sra: Coping with Label Shift via Distributionally Robust Optimisation. ICLR 2021
- [c71] Aditya Krishna Menon, Ankit Singh Rawat, Sashank J. Reddi, Seungyeon Kim, Sanjiv Kumar: A statistical perspective on distillation. ICML 2021: 7632-7642
- [c70] Ankit Singh Rawat, Aditya Krishna Menon, Wittawat Jitkrittum, Sadeep Jayasumana, Felix X. Yu, Sashank J. Reddi, Sanjiv Kumar: Disentangling Sampling and Labeling Bias for Learning in Large-output Spaces. ICML 2021: 8890-8901
- [c69] Erik Lindgren, Sashank J. Reddi, Ruiqi Guo, Sanjiv Kumar: Efficient Training of Retrieval Models using Negative Cache. NeurIPS 2021: 4134-4146
- [c68] Gui Citovsky, Giulia DeSalvo, Claudio Gentile, Lazaros Karydas, Anand Rajagopalan, Afshin Rostamizadeh, Sanjiv Kumar: Batch Active Learning at Scale. NeurIPS 2021: 11933-11944
- [i64] Srinadh Bhojanapalli, Kimberly Wilber, Andreas Veit, Ankit Singh Rawat, Seungyeon Kim, Aditya Krishna Menon, Sanjiv Kumar: On the Reproducibility of Neural Network Predictions. CoRR abs/2102.03349 (2021)
- [i63] Srikumar Ramalingam, Daniel Glasner, Kaushal Patel, Raviteja Vemulapalli, Sadeep Jayasumana, Sanjiv Kumar: Balancing Constraints and Submodularity in Data Subset Selection. CoRR abs/2104.12835 (2021)
- [i62] Ankit Singh Rawat, Aditya Krishna Menon, Wittawat Jitkrittum, Sadeep Jayasumana, Felix X. Yu, Sashank J. Reddi, Sanjiv Kumar: Disentangling Sampling and Labeling Bias for Learning in Large-Output Spaces. CoRR abs/2105.05736 (2021)
- [i61] Seungyeon Kim, Daniel Glasner, Srikumar Ramalingam, Cho-Jui Hsieh, Kishore Papineni, Sanjiv Kumar: Balancing Robustness and Sensitivity using Feature Contrastive Learning. CoRR abs/2105.09394 (2021)
- [i60] Baris Sumengen, Anand Rajagopalan, Gui Citovsky, David Simcha, Olivier Bachem, Pradipta Mitra, Sam Blasiak, Mason Liang, Sanjiv Kumar: Scaling Hierarchical Agglomerative Clustering to Billion-sized Datasets. CoRR abs/2105.11653 (2021)
- [i59] Srinadh Bhojanapalli, Ayan Chakrabarti, Himanshu Jain, Sanjiv Kumar, Michal Lukasik, Andreas Veit: Eigen Analysis of Self-Attention and its Reconstruction from Partial Computation. CoRR abs/2106.08823 (2021)
- [i58] Michal Lukasik, Srinadh Bhojanapalli, Aditya Krishna Menon, Sanjiv Kumar: Teacher's pet: understanding and mitigating biases in distillation. CoRR abs/2106.10494 (2021)
- [i57] Gui Citovsky, Giulia DeSalvo, Claudio Gentile, Lazaros Karydas, Anand Rajagopalan, Afshin Rostamizadeh, Sanjiv Kumar: Batch Active Learning at Scale. CoRR abs/2107.14263 (2021)
- [i56] Srinadh Bhojanapalli, Ayan Chakrabarti, Andreas Veit, Michal Lukasik, Himanshu Jain, Frederick Liu, Yin-Wen Chang, Sanjiv Kumar: Leveraging redundancy in attention with Reuse Transformers. CoRR abs/2110.06821 (2021)
- [i55] Ankit Singh Rawat, Manzil Zaheer, Aditya Krishna Menon, Amr Ahmed, Sanjiv Kumar: When in Doubt, Summon the Titans: Efficient Inference with Large Models. CoRR abs/2110.10305 (2021)
- 2020
- [c67] Quan Geng, Wei Ding, Ruiqi Guo, Sanjiv Kumar: Tight Analysis of Privacy and Utility Tradeoff in Approximate Differential Privacy. AISTATS 2020: 89-99
- [c66] Xuanqing Liu, Tesi Xiao, Si Si, Qin Cao, Sanjiv Kumar, Cho-Jui Hsieh: How Does Noise Help Robustness? Explanation and Exploration under the Neural SDE Framework. CVPR 2020: 279-287
- [c65] Michal Lukasik, Himanshu Jain, Aditya Krishna Menon, Seungyeon Kim, Srinadh Bhojanapalli, Felix X. Yu, Sanjiv Kumar: Semantic Label Smoothing for Sequence to Sequence Problems. EMNLP (1) 2020: 4992-4998
- [c64] Wei-Cheng Chang, Felix X. Yu, Yin-Wen Chang, Yiming Yang, Sanjiv Kumar: Pre-training Tasks for Embedding-based Large-scale Retrieval. ICLR 2020
- [c63] Aditya Krishna Menon, Ankit Singh Rawat, Sashank J. Reddi, Sanjiv Kumar: Can gradient clipping mitigate label noise? ICLR 2020
- [c62] Yangjun Ruan, Yuanhao Xiong, Sashank J. Reddi, Sanjiv Kumar, Cho-Jui Hsieh: Learning to Learn by Zeroth-Order Oracle. ICLR 2020
- [c61] Yang You, Jing Li, Sashank J. Reddi, Jonathan Hseu, Sanjiv Kumar, Srinadh Bhojanapalli, Xiaodan Song, James Demmel, Kurt Keutzer, Cho-Jui Hsieh: Large Batch Optimization for Deep Learning: Training BERT in 76 minutes. ICLR 2020
- [c60] Chulhee Yun, Srinadh Bhojanapalli, Ankit Singh Rawat, Sashank J. Reddi, Sanjiv Kumar: Are Transformers universal approximators of sequence-to-sequence functions? ICLR 2020
- [c59] Srinadh Bhojanapalli, Chulhee Yun, Ankit Singh Rawat, Sashank J. Reddi, Sanjiv Kumar: Low-Rank Bottleneck in Multi-head Attention Models. ICML 2020: 864-873
- [c58] Ruiqi Guo, Philip Sun, Erik Lindgren, Quan Geng, David Simcha, Felix Chern, Sanjiv Kumar: Accelerating Large-Scale Inference with Anisotropic Vector Quantization. ICML 2020: 3887-3896
- [c57]