default search action
Jay Mahadeokar
Person information
Refine list
refinements active!
zoomed in on ?? of ?? records
view refined list in
export refined list as
2020 – today
- 2024
- [c28]Yuan Shangguan, Haichuan Yang, Danni Li, Chunyang Wu, Yassir Fathullah, Dilin Wang, Ayushi Dalmia, Raghuraman Krishnamoorthi, Ozlem Kalinli, Junteng Jia, Jay Mahadeokar, Xin Lei, Mike Seltzer, Vikas Chandra:
TODM: Train Once Deploy Many Efficient Supernet-Based RNN-T Compression For On-Device ASR Models. ICASSP 2024: 10216-10220 - [c27]Jiamin Xie, Ke Li, Jinxi Guo, Andros Tjandra, Yuan Shangguan, Leda Sari, Chunyang Wu, Junteng Jia, Jay Mahadeokar, Ozlem Kalinli:
Dynamic ASR Pathways: An Adaptive Masking Approach Towards Efficient Pruning of a Multilingual ASR Model. ICASSP 2024: 12201-12205 - [c26]Jinxi Guo, Niko Moritz, Yingyi Ma, Frank Seide, Chunyang Wu, Jay Mahadeokar, Ozlem Kalinli, Christian Fuegen, Mike Seltzer:
Effective Internal Language Model Training and Fusion for Factorized Transducer Model. ICASSP 2024: 12687-12691 - [c25]Yassir Fathullah, Chunyang Wu, Egor Lakomkin, Junteng Jia, Yuan Shangguan, Ke Li, Jinxi Guo, Wenhan Xiong, Jay Mahadeokar, Ozlem Kalinli, Christian Fuegen, Mike Seltzer:
Prompting Large Language Models with Speech Recognition Abilities. ICASSP 2024: 13351-13355 - [c24]Yassir Fathullah, Chunyang Wu, Egor Lakomkin, Ke Li, Junteng Jia, Yuan Shangguan, Jay Mahadeokar, Ozlem Kalinli, Christian Fuegen, Mike Seltzer:
AudioChatLlama: Towards General-Purpose Speech Abilities for LLMs. NAACL-HLT 2024: 5522-5532 - [i32]Jinxi Guo, Niko Moritz, Yingyi Ma, Frank Seide, Chunyang Wu, Jay Mahadeokar, Ozlem Kalinli, Christian Fuegen, Mike Seltzer:
Effective internal language model training and fusion for factorized transducer model. CoRR abs/2404.01716 (2024) - [i31]Laxmi Pandey, Ke Li, Jinxi Guo, Debjyoti Paul, Arthur Guo, Jay Mahadeokar, Xuedong Zhang:
Towards scalable efficient on-device ASR with transfer learning. CoRR abs/2407.16664 (2024) - [i30]Abhimanyu Dubey, Abhinav Jauhri, Abhinav Pandey, Abhishek Kadian, Ahmad Al-Dahle, Aiesha Letman, Akhil Mathur, Alan Schelten, Amy Yang, Angela Fan, Anirudh Goyal, Anthony Hartshorn, Aobo Yang, Archi Mitra, Archie Sravankumar, Artem Korenev, Arthur Hinsvark, Arun Rao, Aston Zhang, Aurélien Rodriguez, Austen Gregerson, Ava Spataru, Baptiste Rozière, Bethany Biron, Binh Tang, Bobbie Chern, Charlotte Caucheteux, Chaya Nayak, Chloe Bi, Chris Marra, Chris McConnell, Christian Keller, Christophe Touret, Chunyang Wu, Corinne Wong, Cristian Canton Ferrer, Cyrus Nikolaidis, Damien Allonsius, Daniel Song, Danielle Pintz, Danny Livshits, David Esiobu, Dhruv Choudhary, Dhruv Mahajan, Diego Garcia-Olano, Diego Perino, Dieuwke Hupkes, Egor Lakomkin, Ehab AlBadawy, Elina Lobanova, Emily Dinan, Eric Michael Smith, Filip Radenovic, Frank Zhang, Gabriel Synnaeve, Gabrielle Lee, Georgia Lewis Anderson, Graeme Nail, Grégoire Mialon, Guan Pang, Guillem Cucurell, Hailey Nguyen, Hannah Korevaar, Hu Xu, Hugo Touvron, Iliyan Zarov, Imanol Arrieta Ibarra, Isabel M. Kloumann, Ishan Misra, Ivan Evtimov, Jade Copet, Jaewon Lee, Jan Geffert, Jana Vranes, Jason Park, Jay Mahadeokar, Jeet Shah, Jelmer van der Linde, Jennifer Billock, Jenny Hong, Jenya Lee, Jeremy Fu, Jianfeng Chi, Jianyu Huang, Jiawen Liu, Jie Wang, Jiecao Yu, Joanna Bitton, Joe Spisak, Jongsoo Park, Joseph Rocca, Joshua Johnstun, Joshua Saxe, Junteng Jia, Kalyan Vasuden Alwala, Kartikeya Upasani, Kate Plawiak, Ke Li, Kenneth Heafield, Kevin Stone, et al.:
The Llama 3 Herd of Models. CoRR abs/2407.21783 (2024) - [i29]Desh Raj, Gil Keren, Junteng Jia, Jay Mahadeokar, Ozlem Kalinli:
Faster Speech-LLaMA Inference with Multi-token Prediction. CoRR abs/2409.08148 (2024) - [i28]Yufeng Yang, Desh Raj, Ju Lin, Niko Moritz, Junteng Jia, Gil Keren, Egor Lakomkin, Yiteng Huang, Jacob Donley, Jay Mahadeokar, Ozlem Kalinli:
M-BEST-RQ: A Multi-Channel Speech Foundation Model for Smart Glasses. CoRR abs/2409.11494 (2024) - 2023
- [c23]Junteng Jia, Ke Li, Mani Malek, Kshitiz Malik, Jay Mahadeokar, Ozlem Kalinli, Frank Seide:
Joint Federated Learning and Personalization for on-Device ASR. ASRU 2023: 1-8 - [c22]Ke Li, Jay Mahadeokar, Jinxi Guo, Yangyang Shi, Gil Keren, Ozlem Kalinli, Michael L. Seltzer, Duc Le:
Improving fast-slow Encoder based Transducer with Streaming Deliberation. ICASSP 2023: 1-5 - [c21]Dawei Liang, Hang Su, Tarun Singh, Jay Mahadeokar, Shanil Puri, Jiedan Zhu, Edison Thomaz, Mike Seltzer:
Dynamic Speech Endpoint Detection with Regression Targets. ICASSP 2023: 1-5 - [c20]Desh Raj, Junteng Jia, Jay Mahadeokar, Chunyang Wu, Niko Moritz, Xiaohui Zhang, Ozlem Kalinli:
Anchored Speech Recognition with Neural Transducers. ICASSP 2023: 1-5 - [c19]Yassir Fathullah, Chunyang Wu, Yuan Shangguan, Junteng Jia, Wenhan Xiong, Jay Mahadeokar, Chunxi Liu, Yangyang Shi, Ozlem Kalinli, Mike Seltzer, Mark J. F. Gales:
Multi-Head State Space Model for Speech Recognition. INTERSPEECH 2023: 241-245 - [c18]Matthew Le, Apoorv Vyas, Bowen Shi, Brian Karrer, Leda Sari, Rashel Moritz, Mary Williamson, Vimal Manohar, Yossi Adi, Jay Mahadeokar, Wei-Ning Hsu:
Voicebox: Text-Guided Multilingual Universal Speech Generation at Scale. NeurIPS 2023 - [i27]Yassir Fathullah, Chunyang Wu, Yuan Shangguan, Junteng Jia, Wenhan Xiong, Jay Mahadeokar, Chunxi Liu, Yangyang Shi, Ozlem Kalinli, Mike Seltzer, Mark J. F. Gales:
Multi-Head State Space Model for Speech Recognition. CoRR abs/2305.12498 (2023) - [i26]Shuo Liu, Leda Sari, Chunyang Wu, Gil Keren, Yuan Shangguan, Jay Mahadeokar, Ozlem Kalinli:
Towards Selection of Text-to-speech Data to Augment ASR Training. CoRR abs/2306.00998 (2023) - [i25]Matthew Le, Apoorv Vyas, Bowen Shi, Brian Karrer, Leda Sari, Rashel Moritz, Mary Williamson, Vimal Manohar, Yossi Adi, Jay Mahadeokar, Wei-Ning Hsu:
Voicebox: Text-Guided Multilingual Universal Speech Generation at Scale. CoRR abs/2306.15687 (2023) - [i24]Yassir Fathullah, Chunyang Wu, Egor Lakomkin, Junteng Jia, Yuan Shangguan, Ke Li, Jinxi Guo, Wenhan Xiong, Jay Mahadeokar, Ozlem Kalinli, Christian Fuegen, Mike Seltzer:
Prompting Large Language Models with Speech Recognition Abilities. CoRR abs/2307.11795 (2023) - [i23]Yuan Shangguan, Haichuan Yang, Danni Li, Chunyang Wu, Yassir Fathullah, Dilin Wang, Ayushi Dalmia, Raghuraman Krishnamoorthi, Ozlem Kalinli, Junteng Jia, Jay Mahadeokar, Xin Lei, Mike Seltzer, Vikas Chandra:
TODM: Train Once Deploy Many Efficient Supernet-Based RNN-T Compression For On-device ASR Models. CoRR abs/2309.01947 (2023) - [i22]Jiamin Xie, Ke Li, Jinxi Guo, Andros Tjandra, Yuan Shangguan, Leda Sari, Chunyang Wu, Junteng Jia, Jay Mahadeokar, Ozlem Kalinli:
Dynamic ASR Pathways: An Adaptive Masking Approach Towards Efficient Pruning of A Multilingual ASR Model. CoRR abs/2309.13018 (2023) - [i21]Yassir Fathullah, Chunyang Wu, Egor Lakomkin, Junteng Jia, Yuan Shangguan, Jay Mahadeokar, Ozlem Kalinli, Christian Fuegen, Mike Seltzer:
Towards General-Purpose Speech Abilities for Large Language Models Using Unpaired Data. CoRR abs/2311.06753 (2023) - 2022
- [c17]Yangyang Shi, Chunyang Wu, Dilin Wang, Alex Xiao, Jay Mahadeokar, Xiaohui Zhang, Chunxi Liu, Ke Li, Yuan Shangguan, Varun Nagaraja, Ozlem Kalinli, Mike Seltzer:
Streaming Transformer Transducer based Speech Recognition Using Non-Causal Convolution. ICASSP 2022: 8277-8281 - [c16]Junteng Jia, Jay Mahadeokar, Weiyi Zheng, Yuan Shangguan, Ozlem Kalinli, Frank Seide:
Federated Domain Adaptation for ASR with Full Self-Supervision. INTERSPEECH 2022: 536-540 - [c15]Jay Mahadeokar, Yangyang Shi, Ke Li, Duc Le, Jiedan Zhu, Vikas Chandra, Ozlem Kalinli, Michael L. Seltzer:
Streaming parallel transducer beam search with fast slow cascaded encoders. INTERSPEECH 2022: 2083-2087 - [c14]Niko Moritz, Frank Seide, Duc Le, Jay Mahadeokar, Christian Fuegen:
An Investigation of Monotonic Transducers for Large-Scale Automatic Speech Recognition. SLT 2022: 324-330 - [i20]Jay Mahadeokar, Yangyang Shi, Ke Li, Duc Le, Jiedan Zhu, Vikas Chandra, Ozlem Kalinli, Michael L. Seltzer:
Streaming parallel transducer beam search with fast-slow cascaded encoders. CoRR abs/2203.15773 (2022) - [i19]Junteng Jia, Jay Mahadeokar, Weiyi Zheng, Yuan Shangguan, Ozlem Kalinli, Frank Seide:
Federated Domain Adaptation for ASR with Full Self-Supervision. CoRR abs/2203.15966 (2022) - [i18]Niko Moritz, Frank Seide, Duc Le, Jay Mahadeokar, Christian Fuegen:
An Investigation of Monotonic Transducers for Large-Scale Automatic Speech Recognition. CoRR abs/2204.08858 (2022) - [i17]Desh Raj, Junteng Jia, Jay Mahadeokar, Chunyang Wu, Niko Moritz, Xiaohui Zhang, Ozlem Kalinli:
Anchored Speech Recognition with Neural Transducers. CoRR abs/2210.11588 (2022) - [i16]Dawei Liang, Hang Su, Tarun Singh, Jay Mahadeokar, Shanil Puri, Jiedan Zhu, Edison Thomaz, Mike Seltzer:
Dynamic Speech Endpoint Detection with Regression Targets. CoRR abs/2210.14252 (2022) - 2021
- [c13]Suyoun Kim, Yuan Shangguan, Jay Mahadeokar, Antoine Bruguier, Christian Fuegen, Michael L. Seltzer, Duc Le:
Improved Neural Language Model Fusion for Streaming Recurrent Neural Network Transducer. ICASSP 2021: 7333-7337 - [c12]Ganesh Venkatesh, Alagappan Valliappan, Jay Mahadeokar, Yuan Shangguan, Christian Fuegen, Michael L. Seltzer, Vikas Chandra:
Memory-Efficient Speech Recognition on Smart Devices. ICASSP 2021: 8368-8372 - [c11]Duc Le, Mahaveer Jain, Gil Keren, Suyoun Kim, Yangyang Shi, Jay Mahadeokar, Julian Chan, Yuan Shangguan, Christian Fuegen, Ozlem Kalinli, Yatharth Saraf, Michael L. Seltzer:
Contextualized Streaming End-to-End Speech Recognition with Trie-Based Deep Biasing and Shallow Fusion. Interspeech 2021: 1772-1776 - [c10]Yangyang Shi, Varun Nagaraja, Chunyang Wu, Jay Mahadeokar, Duc Le, Rohit Prabhavalkar, Alex Xiao, Ching-Feng Yeh, Julian Chan, Christian Fuegen, Ozlem Kalinli, Michael L. Seltzer:
Dynamic Encoder Transducer: A Flexible Solution for Trading Off Accuracy for Latency. Interspeech 2021: 2042-2046 - [c9]Jay Mahadeokar, Yangyang Shi, Yuan Shangguan, Chunyang Wu, Alex Xiao, Hang Su, Duc Le, Ozlem Kalinli, Christian Fuegen, Michael L. Seltzer:
Flexi-Transducer: Optimizing Latency, Accuracy and Compute for Multi-Domain On-Device Scenarios. Interspeech 2021: 2107-2111 - [c8]Yuan Shangguan, Rohit Prabhavalkar, Hang Su, Jay Mahadeokar, Yangyang Shi, Jiatong Zhou, Chunyang Wu, Duc Le, Ozlem Kalinli, Christian Fuegen, Michael L. Seltzer:
Dissecting User-Perceived Latency of On-Device E2E Speech Recognition. Interspeech 2021: 4553-4557 - [c7]Jay Mahadeokar, Yuan Shangguan, Duc Le, Gil Keren, Hang Su, Thong Le, Ching-Feng Yeh, Christian Fuegen, Michael L. Seltzer:
Alignment Restricted Streaming Recurrent Neural Network Transducer. SLT 2021: 52-59 - [c6]Duc Le, Gil Keren, Julian Chan, Jay Mahadeokar, Christian Fuegen, Michael L. Seltzer:
Deep Shallow Fusion for RNN-T Personalization. SLT 2021: 251-257 - [i15]Ganesh Venkatesh, Alagappan Valliappan, Jay Mahadeokar, Yuan Shangguan, Christian Fuegen, Michael L. Seltzer, Vikas Chandra:
Memory-efficient Speech Recognition on Smart Devices. CoRR abs/2102.11531 (2021) - [i14]Yangyang Shi, Varun Nagaraja, Chunyang Wu, Jay Mahadeokar, Duc Le, Rohit Prabhavalkar, Alex Xiao, Ching-Feng Yeh, Julian Chan, Christian Fuegen, Ozlem Kalinli, Michael L. Seltzer:
Dynamic Encoder Transducer: A Flexible Solution For Trading Off Accuracy For Latency. CoRR abs/2104.02176 (2021) - [i13]Duc Le, Mahaveer Jain, Gil Keren, Suyoun Kim, Yangyang Shi, Jay Mahadeokar, Julian Chan, Yuan Shangguan, Christian Fuegen, Ozlem Kalinli, Yatharth Saraf, Michael L. Seltzer:
Contextualized Streaming End-to-End Speech Recognition with Trie-Based Deep Biasing and Shallow Fusion. CoRR abs/2104.02194 (2021) - [i12]Yuan Shangguan, Rohit Prabhavalkar, Hang Su, Jay Mahadeokar, Yangyang Shi, Jiatong Zhou, Chunyang Wu, Duc Le, Ozlem Kalinli, Christian Fuegen, Michael L. Seltzer:
Dissecting User-Perceived Latency of On-Device E2E Speech Recognition. CoRR abs/2104.02207 (2021) - [i11]Jay Mahadeokar, Yangyang Shi, Yuan Shangguan, Chunyang Wu, Alex Xiao, Hang Su, Duc Le, Ozlem Kalinli, Christian Fuegen, Michael L. Seltzer:
Flexi-Transducer: Optimizing Latency, Accuracy and Compute forMulti-Domain On-Device Scenarios. CoRR abs/2104.02232 (2021) - [i10]Yangyang Shi, Chunyang Wu, Dilin Wang, Alex Xiao, Jay Mahadeokar, Xiaohui Zhang, Chunxi Liu, Ke Li, Yuan Shangguan, Varun Nagaraja, Ozlem Kalinli, Mike Seltzer:
Streaming Transformer Transducer Based Speech Recognition Using Non-Causal Convolution. CoRR abs/2110.05241 (2021) - [i9]Yao-Yuan Yang, Moto Hira, Zhaoheng Ni, Anjali Chourdia, Artyom Astafurov, Caroline Chen, Ching-Feng Yeh, Christian Puhrsch, David Pollack, Dmitriy Genzel, Donny Greenberg, Edward Z. Yang, Jason Lian, Jay Mahadeokar, Jeff Hwang, Ji Chen, Peter Goldsborough, Prabhat Roy, Sean Narenthiran, Shinji Watanabe, Soumith Chintala, Vincent Quenneville-Bélair, Yangyang Shi:
TorchAudio: Building Blocks for Audio and Speech Processing. CoRR abs/2110.15018 (2021) - 2020
- [c5]Yongqiang Wang, Abdelrahman Mohamed, Duc Le, Chunxi Liu, Alex Xiao, Jay Mahadeokar, Hongzhao Huang, Andros Tjandra, Xiaohui Zhang, Frank Zhang, Christian Fuegen, Geoffrey Zweig, Michael L. Seltzer:
Transformer-Based Acoustic Modeling for Hybrid Speech Recognition. ICASSP 2020: 6874-6878 - [c4]Weipeng He, Lu Lu, Biqiao Zhang, Jay Mahadeokar, Kaustubh Kalgaonkar, Christian Fuegen:
Spatial Attention for Far-Field Speech Recognition with Deep Beamforming Neural Networks. ICASSP 2020: 7499-7503 - [c3]Mahaveer Jain, Gil Keren, Jay Mahadeokar, Geoffrey Zweig, Florian Metze, Yatharth Saraf:
Contextual RNN-T for Open Domain ASR. INTERSPEECH 2020: 11-15 - [i8]Mahaveer Jain, Gil Keren, Jay Mahadeokar, Yatharth Saraf:
Contextual RNN-T For Open Domain ASR. CoRR abs/2006.03411 (2020) - [i7]Suyoun Kim, Yuan Shangguan, Jay Mahadeokar, Antoine Bruguier, Christian Fuegen, Michael L. Seltzer, Duc Le:
Improved Neural Language Model Fusion for Streaming Recurrent Neural Network Transducer. CoRR abs/2010.13878 (2020) - [i6]Jay Mahadeokar, Yuan Shangguan, Duc Le, Gil Keren, Hang Su, Thong Le, Ching-Feng Yeh, Christian Fuegen, Michael L. Seltzer:
Alignment Restricted Streaming Recurrent Neural Network Transducer. CoRR abs/2011.03072 (2020) - [i5]Duc Le, Gil Keren, Julian Chan, Jay Mahadeokar, Christian Fuegen, Michael L. Seltzer:
Deep Shallow Fusion for RNN-T Personalization. CoRR abs/2011.07754 (2020)
2010 – 2019
- 2019
- [i4]Yongqiang Wang, Abdelrahman Mohamed, Duc Le, Chunxi Liu, Alex Xiao, Jay Mahadeokar, Hongzhao Huang, Andros Tjandra, Xiaohui Zhang, Frank Zhang, Christian Fuegen, Geoffrey Zweig, Michael L. Seltzer:
Transformer-based Acoustic Modeling for Hybrid Speech Recognition. CoRR abs/1910.09799 (2019) - [i3]Ching-Feng Yeh, Jay Mahadeokar, Kaustubh Kalgaonkar, Yongqiang Wang, Duc Le, Mahaveer Jain, Kjell Schubert, Christian Fuegen, Michael L. Seltzer:
Transformer-Transducer: End-to-End Speech Recognition with Self-Attention. CoRR abs/1910.12977 (2019) - [i2]Mahaveer Jain, Kjell Schubert, Jay Mahadeokar, Ching-Feng Yeh, Kaustubh Kalgaonkar, Anuroop Sriram, Christian Fuegen, Michael L. Seltzer:
RNN-T For Latency Controlled ASR With Improved Beam Search. CoRR abs/1911.01629 (2019) - [i1]Weipeng He, Lu Lu, Biqiao Zhang, Jay Mahadeokar, Kaustubh Kalgaonkar, Christian Fuegen:
Spatial Attention for Far-field Speech Recognition with Deep Beamforming Neural Networks. CoRR abs/1911.02115 (2019) - 2014
- [j2]Jay Mahadeokar, Sanjeev Saxena:
Faster algorithm to find anti-risk path between two nodes of an undirected graph. J. Comb. Optim. 27(4): 798-807 (2014) - [c2]Vidit Jain, Jay Mahadeokar:
Short-text representation using diffusion wavelets. WWW (Companion Volume) 2014: 301-302 - 2013
- [j1]Jay Mahadeokar, Sanjeev Saxena:
Faster replacement paths algorithms in case of edge or node failure for undirected, positive integer weighted graphs. J. Discrete Algorithms 23: 54-62 (2013) - 2012
- [c1]Jay Mahadeokar, Sanjeev Saxena:
Faster Replacement Paths Algorithm for Undirected, Positive Integer Weighted Graphs with Small Diameter. IWOCA 2012: 81-85
Coauthor Index
manage site settings
To protect your privacy, all features that rely on external API calls from your browser are turned off by default. You need to opt-in for them to become active. All settings here will be stored as cookies with your web browser. For more information see our F.A.Q.
Unpaywalled article links
Add open access links from to the list of external document links (if available).
Privacy notice: By enabling the option above, your browser will contact the API of unpaywall.org to load hyperlinks to open access articles. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the Unpaywall privacy policy.
Archived links via Wayback Machine
For web page which are no longer available, try to retrieve content from the of the Internet Archive (if available).
Privacy notice: By enabling the option above, your browser will contact the API of archive.org to check for archived content of web pages that are no longer available. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the Internet Archive privacy policy.
Reference lists
Add a list of references from , , and to record detail pages.
load references from crossref.org and opencitations.net
Privacy notice: By enabling the option above, your browser will contact the APIs of crossref.org, opencitations.net, and semanticscholar.org to load article reference information. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the Crossref privacy policy and the OpenCitations privacy policy, as well as the AI2 Privacy Policy covering Semantic Scholar.
Citation data
Add a list of citing articles from and to record detail pages.
load citations from opencitations.net
Privacy notice: By enabling the option above, your browser will contact the API of opencitations.net and semanticscholar.org to load citation information. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the OpenCitations privacy policy as well as the AI2 Privacy Policy covering Semantic Scholar.
OpenAlex data
Load additional information about publications from .
Privacy notice: By enabling the option above, your browser will contact the API of openalex.org to load additional information. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the information given by OpenAlex.
last updated on 2024-10-22 20:17 CEST by the dblp team
all metadata released as open data under CC0 1.0 license
see also: Terms of Use | Privacy Policy | Imprint