Andrew Rosenberg
Journal Articles
- 2022
- [j9] Murali Karthick Baskar, Andrew Rosenberg, Bhuvana Ramabhadran, Yu Zhang, Pedro J. Moreno:
Ask2Mask: Guided Data Selection for Masked Speech Modeling. IEEE J. Sel. Top. Signal Process. 16(6): 1357-1366 (2022)
- 2017
- [j8] Kartik Audhkhasi, Andrew Rosenberg, George Saon, Abhinav Sethy, Bhuvana Ramabhadran, Stanley F. Chen, Michael Picheny:
Recent progress in deep end-to-end models for spoken language processing. IBM J. Res. Dev. 61(4-5): 2:1-2:10 (2017)
- [j7] Adam Goodkind, David Guy Brizan, Andrew Rosenberg:
Utilizing overt and latent linguistic structure to improve keystroke-based authentication. Image Vis. Comput. 58: 230-238 (2017)
- [j6] Kartik Audhkhasi, Andrew Rosenberg, Abhinav Sethy, Bhuvana Ramabhadran, Brian Kingsbury:
End-to-End ASR-Free Keyword Search From Speech. IEEE J. Sel. Top. Signal Process. 11(8): 1351-1359 (2017)
- 2015
- [j5] David Guy Brizan, Adam Goodkind, Patrick Koch, Kiran S. Balagani, Vir V. Phoha, Andrew Rosenberg:
Utilizing linguistically enhanced keystroke dynamics to predict typist cognition and demographics. Int. J. Hum. Comput. Stud. 82: 57-68 (2015)
- 2013
- [j4] William Yang Wang, Fadi Biadsy, Andrew Rosenberg, Julia Hirschberg:
Automatic detection of speaker state: Lexical, prosodic, and phonetic approaches to level-of-interest and intoxication classification. Comput. Speech Lang. 27(1): 168-189 (2013)
- [j3] Abdul Serwadda, Zibo Wang, Patrick Koch, Sathya Govindarajan, Raviteja Pokala, Adam Goodkind, David Guy Brizan, Andrew Rosenberg, Vir V. Phoha, Kiran S. Balagani:
Scan-Based Evaluation of Continuous Keystroke Authentication Systems. IT Prof. 15(4): 20-23 (2013)
- 2009
- [j2] Andrew Rosenberg, Julia Hirschberg:
Charisma perception from text and speech. Speech Commun. 51(7): 640-655 (2009)
- 2008
- [j1] Mari Ostendorf, Benoît Favre, Ralph Grishman, Dilek Hakkani-Tür, Mary P. Harper, Dustin Hillard, Julia Hirschberg, Heng Ji, Jeremy G. Kahn, Yang Liu, Sameer Maskey, Evgeny Matusov, Hermann Ney, Andrew Rosenberg, Elizabeth Shriberg, Wen Wang, Chuck Wooters:
Speech segmentation and spoken document processing. IEEE Signal Process. Mag. 25(3): 59-69 (2008)
Conference and Workshop Papers
- 2024
- [c92] Takaaki Saeki, Gary Wang, Nobuyuki Morioka, Isaac Elias, Kyle Kastner, Andrew Rosenberg, Bhuvana Ramabhadran, Heiga Zen, Françoise Beaufays, Hadar Shemtov:
Extending Multilingual Speech Synthesis to 100+ Languages without Transcribed Data. ICASSP 2024: 11546-11550
- 2023
- [c91] Yosuke Higuchi, Andrew Rosenberg, Yuan Wang, Murali Karthick Baskar, Bhuvana Ramabhadran:
Mask-Conformer: Augmenting Conformer with Mask-Predict Decoder. ASRU 2023: 1-8
- [c90] Zhong Meng, Weiran Wang, Rohit Prabhavalkar, Tara N. Sainath, Tongzhou Chen, Ehsan Variani, Yu Zhang, Bo Li, Andrew Rosenberg, Bhuvana Ramabhadran:
JEIT: Joint End-to-End Model and Internal Language Model Training for Speech Recognition. ICASSP 2023: 1-5
- [c89] Takaaki Saeki, Heiga Zen, Zhehuai Chen, Nobuyuki Morioka, Gary Wang, Yu Zhang, Ankur Bapna, Andrew Rosenberg, Bhuvana Ramabhadran:
Virtuoso: Massive Multilingual Speech-Text Joint Semi-Supervised Learning for Text-to-Speech. ICASSP 2023: 1-5
- [c88] Gary Wang, Kyle Kastner, Ankur Bapna, Zhehuai Chen, Andrew Rosenberg, Bhuvana Ramabhadran, Yu Zhang:
Understanding Shared Speech-Text Representations. ICASSP 2023: 1-5
- [c87] Murali Karthick Baskar, Andrew Rosenberg, Bhuvana Ramabhadran, Kartik Audhkhasi:
O-1: Self-training with Oracle and 1-best Hypothesis. INTERSPEECH 2023: 77-81
- [c86] Yochai Blau, Rohan Agrawal, Lior Madmony, Gary Wang, Andrew Rosenberg, Zhehuai Chen, Zorik Gekhman, Genady Beryozkin, Parisa Haghani, Bhuvana Ramabhadran:
Using Text Injection to Improve Recognition of Personal Identifiers in Speech. INTERSPEECH 2023: 191-195
- [c85] Cal Peyser, Zhong Meng, Rohit Prabhavalkar, Andrew Rosenberg, Tara N. Sainath, Michael Picheny, Kyunghyun Cho, Ke Hu:
Improving Joint Speech-Text Representations Without Alignment. INTERSPEECH 2023: 1354-1358
- 2022
- [c84] Zhehuai Chen, Yu Zhang, Andrew Rosenberg, Bhuvana Ramabhadran, Pedro J. Moreno, Gary Wang:
Tts4pretrain 2.0: Advancing the use of Text and Speech in ASR Pretraining with Consistency and Contrastive Losses. ICASSP 2022: 7677-7681
- [c83] Murali Karthick Baskar, Andrew Rosenberg, Bhuvana Ramabhadran, Yu Zhang, Nicolás Serrano:
Reducing Domain mismatch in Self-supervised speech pre-training. INTERSPEECH 2022: 3028-3032
- [c82] Gary Wang, Andrew Rosenberg, Bhuvana Ramabhadran, Fadi Biadsy, Jesse Emond, Yinghui Huang, Pedro J. Moreno:
Non-Parallel Voice Conversion for ASR Augmentation. INTERSPEECH 2022: 3408-3412
- [c81] Cal Peyser, W. Ronny Huang, Andrew Rosenberg, Tara N. Sainath, Michael Picheny, Kyunghyun Cho:
Towards Disentangled Speech Representations. INTERSPEECH 2022: 3603-3607
- [c80] Zhehuai Chen, Yu Zhang, Andrew Rosenberg, Bhuvana Ramabhadran, Pedro J. Moreno, Ankur Bapna, Heiga Zen:
MAESTRO: Matched Speech Text Representations through Modality Matching. INTERSPEECH 2022: 4093-4097
- [c79] Fadi Biadsy, Youzheng Chen, Xia Zhang, Oleg Rybakov, Andrew Rosenberg, Pedro J. Moreno:
A Scalable Model Specialization Framework for Training and Inference using Submodels and its Application to Speech Model Personalization. INTERSPEECH 2022: 5125-5129
- [c78] Gary Wang, Ekin D. Cubuk, Andrew Rosenberg, Shuyang Cheng, Ron J. Weiss, Bhuvana Ramabhadran, Pedro J. Moreno, Quoc V. Le, Daniel S. Park:
G-Augment: Searching for the Meta-Structure of Data Augmentation Policies for ASR. SLT 2022: 23-30
- [c77] Zhehuai Chen, Ankur Bapna, Andrew Rosenberg, Yu Zhang, Bhuvana Ramabhadran, Pedro J. Moreno, Nanxin Chen:
Maestro-U: Leveraging Joint Speech-Text Representation Learning for Zero Supervised Speech ASR. SLT 2022: 68-75
- 2021
- [c76] Zhehuai Chen, Yu Zhang, Andrew Rosenberg, Bhuvana Ramabhadran, Gary Wang, Pedro J. Moreno:
Injecting Text in Self-Supervised Speech Pretraining. ASRU 2021: 251-258
- [c75] Rohan Doshi, Youzheng Chen, Liyang Jiang, Xia Zhang, Fadi Biadsy, Bhuvana Ramabhadran, Fang Chu, Andrew Rosenberg, Pedro J. Moreno:
Extending Parrotron: An End-to-End, Speech Conversion and Speech Recognition Model for Atypical Speech. ICASSP 2021: 6988-6992
- [c74] Zhehuai Chen, Andrew Rosenberg, Yu Zhang, Heiga Zen, Mohammadreza Ghodsi, Yinghui Huang, Jesse Emond, Gary Wang, Bhuvana Ramabhadran, Pedro J. Moreno:
Semi-Supervision in ASR: Sequential MixMatch and Factorized TTS-Based Augmentation. Interspeech 2021: 736-740
- 2020
- [c73] Guangzhi Sun, Yu Zhang, Ron J. Weiss, Yuan Cao, Heiga Zen, Andrew Rosenberg, Bhuvana Ramabhadran, Yonghui Wu:
Generating Diverse and Natural Text-to-Speech Samples Using a Quantized Fine-Grained VAE and Autoregressive Prosody Prior. ICASSP 2020: 6699-6703
- [c72] Gary Wang, Andrew Rosenberg, Zhehuai Chen, Yu Zhang, Bhuvana Ramabhadran, Yonghui Wu, Pedro J. Moreno:
Improving Speech Recognition Using Consistent Predictions on Synthesized Speech. ICASSP 2020: 7029-7033
- [c71] Zhehuai Chen, Andrew Rosenberg, Yu Zhang, Gary Wang, Bhuvana Ramabhadran, Pedro J. Moreno:
Improving Speech Recognition Using GAN-Based Speech Synthesis and Contrastive Unspoken Text Selection. INTERSPEECH 2020: 556-560
- [c70] Gary Wang, Andrew Rosenberg, Zhehuai Chen, Yu Zhang, Bhuvana Ramabhadran, Pedro J. Moreno:
SCADA: Stochastic, Consistent and Adversarial Data Augmentation to Improve ASR. INTERSPEECH 2020: 2832-2836
- 2019
- [c69] Andrew Rosenberg, Yu Zhang, Bhuvana Ramabhadran, Ye Jia, Pedro J. Moreno, Yonghui Wu, Zelin Wu:
Speech Recognition with Augmented Synthesized Speech. ASRU 2019: 996-1002
- [c68] Min Ma, Bhuvana Ramabhadran, Jesse Emond, Andrew Rosenberg, Fadi Biadsy:
Comparison of Data Augmentation and Adaptation Strategies for Code-switched Automatic Speech Recognition. ICASSP 2019: 6081-6085
- [c67] Yu Zhang, Ron J. Weiss, Heiga Zen, Yonghui Wu, Zhifeng Chen, R. J. Skerry-Ryan, Ye Jia, Andrew Rosenberg, Bhuvana Ramabhadran:
Learning to Speak Fluently in a Foreign Language: Multilingual Speech Synthesis and Cross-Language Voice Cloning. INTERSPEECH 2019: 2080-2084
- 2018
- [c66] Andrew Rosenberg, Raul Fernandez, Bhuvana Ramabhadran:
Measuring the Effect of Linguistic Resources on Prosody Modeling for Speech Synthesis. ICASSP 2018: 5114-5118
- [c65] Xuesong Yang, Kartik Audhkhasi, Andrew Rosenberg, Samuel Thomas, Bhuvana Ramabhadran, Mark Hasegawa-Johnson:
Joint Modeling of Accents and Acoustics for Multi-Accent Speech Recognition. ICASSP 2018: 5989-5993
- [c64] Takashi Fukuda, Raul Fernandez, Andrew Rosenberg, Samuel Thomas, Bhuvana Ramabhadran, Alexander Sorin, Gakuto Kurata:
Data Augmentation Improves Recognition of Foreign Accented Speech. INTERSPEECH 2018: 2409-2413
- [c63] Denys Katerenchuk, David Guy Brizan, Andrew Rosenberg:
Interpersonal Relationship Labels for the CALLHOME Corpus. LREC 2018
- [c62] Raul Fernandez, Andrew Rosenberg:
Comparing Prosodic Frameworks: Investigating the Acoustic-Symbolic Relationship in ToBI and RaP. SLT 2018: 312-318
- 2017
- [c61] Rachel Rakov, Andrew Rosenberg:
Investigating native and non-native English classification and transfer effects using Legendre polynomial coefficient clustering. ASRU 2017: 637-643
- [c60] Jia Cui, Brian Kingsbury, Bhuvana Ramabhadran, George Saon, Tom Sercu, Kartik Audhkhasi, Abhinav Sethy, Markus Nußbaum-Thom, Andrew Rosenberg:
Knowledge distillation across ensembles of multilingual models for low-resource languages. ICASSP 2017: 4825-4829
- [c59] Kartik Audhkhasi, Andrew Rosenberg, Abhinav Sethy, Bhuvana Ramabhadran, Brian Kingsbury:
End-to-end ASR-free keyword search from speech. ICASSP 2017: 4840-4844
- [c58] Andrew Rosenberg, Kartik Audhkhasi, Abhinav Sethy, Bhuvana Ramabhadran, Michael Picheny:
End-to-end speech recognition and keyword search on low-resource languages. ICASSP 2017: 5280-5284
- [c57] Ali Raza Syed, Andrew Rosenberg, Michael I. Mandel:
Active learning for low-resource speech recognition: Impact of selection size and language modeling data. ICASSP 2017: 5315-5319
- [c56] Raul Fernandez, Andrew Rosenberg, Alexander Sorin, Bhuvana Ramabhadran, Ron Hoory:
Voice-transformation-based data augmentation for prosodic classification. ICASSP 2017: 5530-5534
- [c55] Asaf Rendel, Raul Fernandez, Zvi Kons, Andrew Rosenberg, Ron Hoory, Bhuvana Ramabhadran:
Weakly-Supervised Phrase Assignment from Text in a Speech-Synthesis System Using Noisy Labels. INTERSPEECH 2017: 759-763
- [c54] Andrew Rosenberg, Bhuvana Ramabhadran:
Bias and Statistical Significance in Evaluating Speech Synthesis with Mean Opinion Scores. INTERSPEECH 2017: 3976-3980
- 2016
- [c53] Denys Katerenchuk, Andrew Rosenberg:
Hierarchy Prediction in Online Communities. AAAI 2016: 4224-4225
- [c52] Ali Raza Syed, Andrew Rosenberg, Ellen Kislal:
Supervised and unsupervised active learning for automatic speech recognition of low-resource languages. ICASSP 2016: 5320-5324
- [c51] Guozhen An, Sarah Ita Levitan, Rivka Levitan, Andrew Rosenberg, Michelle Levine, Julia Hirschberg:
Automatically Classifying Self-Rated Personality Scores from Speech. INTERSPEECH 2016: 1412-1416
- [c50] Sarah Ita Levitan, Guozhen An, Min Ma, Rivka Levitan, Andrew Rosenberg, Julia Hirschberg:
Combining Acoustic-Prosodic, Lexical, and Phonotactic Features for Automatic Deception Detection. INTERSPEECH 2016: 2006-2010
- [c49] Denys Katerenchuk, Andrew Rosenberg:
RankDCG: Rank-Ordering Evaluation Measure. LREC 2016
- 2015
- [c48] Adam Goodkind, David Guy Brizan, Andrew Rosenberg:
Improvements to keystroke-based authentication by adding linguistic context. BTAS 2015: 1-6
- [c47] Sarah Ita Levitan, Guozhen An, Mandi Wang, Gideon Mendels, Julia Hirschberg, Michelle Levine, Andrew Rosenberg:
Cross-Cultural Production and Detection of Deception from Speech. WMDD@ICMI 2015: 1-8
- [c46] Guozhen An, David Guy Brizan, Min Ma, Michelle Morales, Ali Raza Syed, Andrew Rosenberg:
Automatic recognition of unified parkinson's disease rating from speech with acoustic, i-vector and phonotactic features. INTERSPEECH 2015: 508-512
- [c45] Andrew Rosenberg, Raul Fernandez, Bhuvana Ramabhadran:
Modeling phrasing and prominence using deep recurrent learning. INTERSPEECH 2015: 3066-3070
- [c44] Min Ma, Andrew Rosenberg:
CUNY Systems for the Query-by-Example Search on Speech Task at MediaEval 2015. MediaEval 2015
- [c43] Adam Goodkind, Andrew Rosenberg:
Muddying The Multiword Expression Waters: How Cognitive Demand Affects Multiword Expression Production. MWE@NAACL-HLT 2015: 87-95
- 2014
- [c42] Victor Soto, Erica Cooper, Lidia Mangu, Andrew Rosenberg, Julia Hirschberg:
Rescoring Confusion Networks for Keyword Search. ICASSP 2014: 7088-7092
- [c41] Justin Richards, Min Ma, Andrew Rosenberg:
Using word burst analysis to rescore keyword search candidates on low-resource languages. ICASSP 2014: 7824-7828
- [c40] Hilbert Locklear, Sathya Govindarajan, Zdenka Sitova, Adam Goodkind, David Guy Brizan, Andrew Rosenberg, Vir V. Phoha, Paolo Gasti, Kiran S. Balagani:
Continuous authentication with cognition-centric text production and revision features. IJCB 2014: 1-8
- [c39] Denys Katerenchuk, Andrew Rosenberg:
Improving named entity recognition with prosodic features. INTERSPEECH 2014: 293-297
- [c38] Raul Fernandez, Jia Cui, Andrew Rosenberg, Bhuvana Ramabhadran, Xiaodong Cui:
Exploiting vocal-source features to improve ASR accuracy for low-resource languages. INTERSPEECH 2014: 805-809
- [c37] Jia Cui, Bhuvana Ramabhadran, Xiaodong Cui, Andrew Rosenberg, Brian Kingsbury, Abhinav Sethy:
Recent improvements in neural network acoustic modeling for LVCSR in low resource languages. INTERSPEECH 2014: 840-844
- [c36] Denys Katerenchuk, David Guy Brizan, Andrew Rosenberg:
"was that your mother on the phone?": classifying interpersonal relationships between dialog participants with lexical and acoustic properties. INTERSPEECH 2014: 1831-1835
- [c35] Xiaodong Cui, Brian Kingsbury, Jia Cui, Bhuvana Ramabhadran, Andrew Rosenberg, Mohammad Sadegh Rasooli, Owen Rambow, Nizar Habash, Vaibhava Goel:
Improving deep neural network acoustic modeling for audio corpus indexing under the IARPA babel program. INTERSPEECH 2014: 2103-2107
- [c34] Victor Soto, Lidia Mangu, Andrew Rosenberg, Julia Hirschberg:
A comparison of multiple methods for rescoring keyword search lists for low resource languages. INTERSPEECH 2014: 2464-2468
- [c33] Min Ma, Justin Richards, Victor Soto, Julia Hirschberg, Andrew Rosenberg:
Strategies for rescoring keyword search results using word-burst and acoustic features. INTERSPEECH 2014: 2769-2773
- 2013
- [c32] Victor Soto, Erica Cooper, Andrew Rosenberg, Julia Hirschberg:
Cross-language phrase boundary detection. ICASSP 2013: 8460-8464
- [c31] Gouzhen An, David Guy Brizan, Andrew Rosenberg:
Detecting laughter and filled pauses using syllable-based features. INTERSPEECH 2013: 178-181
- [c30] Félix Grèzes, Justin Richards, Andrew Rosenberg:
Let me finish: automatic conflict detection using speaker overlap. INTERSPEECH 2013: 200-204
- [c29] Andrew Rosenberg:
Modeling prosodic sequences with k-means and dirichlet process GMMs. INTERSPEECH 2013: 520-524
- [c28] Rachel Rakov, Andrew Rosenberg:
"sure, i did the right thing": a system for sarcasm detection in speech. INTERSPEECH 2013: 842-846
- 2012
- [c27] Andrew Rosenberg:
Rethinking The Corpus: Moving towards Dynamic Linguistic Resources. INTERSPEECH 2012: 1392-1395
- [c26] Andrew Rosenberg:
Classifying Skewed Data: Importance Weighting to Optimize Average Recall. INTERSPEECH 2012: 2242-2245
- [c25] Sameer Maskey, Andrew Rosenberg:
Power Mean Pyramid Scores for Summarization Evaluation. INTERSPEECH 2012: 2378-2381
- [c24] Andrew Rosenberg:
Using Prominence and Phrasing Predictions to Improve Weighted Dictionary Pronunciation Models. INTERSPEECH 2012: 2410-2413
- [c23] Andrew Rosenberg, Raul Fernandez, Bhuvana Ramabhadran:
Phrase Boundary Assignment from Text in Multiple Domains. INTERSPEECH 2012: 2558-2561
- [c22] Andrew Rosenberg:
Modeling intensity contours and the interaction of pitch and intensity to improve automatic prosodic event detection and classification. SLT 2012: 376-381
- [c21] Ian Kaplan, Andrew Rosenberg:
Analysis of speech transcripts to predict winners of U.S. Presidential and Vice-Presidential debates. SLT 2012: 449-454
- 2011
- [c20] Matt Huenerfauth, Pengfei Lu, Andrew Rosenberg:
Evaluating importance of facial expression in american sign language and pidgin signed english animations. ASSETS 2011: 99-106
- [c19] Ilknur Icke, Andrew Rosenberg:
Multi-objective Genetic Programming for Visual Analytics. EuroGP 2011: 322-334
- [c18] Ilknur Icke, Andrew Rosenberg:
Automated measures for interpretable dimensionality reduction for visual classification: A user study. IEEE VAST 2011: 281-282
- [c17] Andrew Rosenberg:
Symbolic and Direct Sequential Modeling of Prosody for Classification of Speaking-Style and Nativeness. INTERSPEECH 2011: 1065-1068
- [c16] Andrew Rosenberg:
Using Mutual Information to Identify Regions of Analysis for Prosodic Analysis. INTERSPEECH 2011: 1377-1380
- [c15] Andrew Rosenberg, Raul Fernandez, Bhuvana Ramabhadran:
"What is... Dengue Fever?" - Modeling and Predicting Pronunciation Errors in a Text-to-Speech System. INTERSPEECH 2011: 2189-2192
- [c14] Fadi Biadsy, William Yang Wang, Andrew Rosenberg, Julia Hirschberg:
Intoxication Detection Using Phonetic, Phonotactic and Prosodic Cues. INTERSPEECH 2011: 3209-3212
- 2010
- [c13] Ilknur Icke, Andrew Rosenberg:
Dimensionality reduction using symbolic regression. GECCO (Companion) 2010: 2085-2086
- [c12] Andrew Rosenberg:
AutoBI - a tool for automatic toBI annotation. INTERSPEECH 2010: 146-149
- [c11] Andrew Rosenberg:
Classification of Prosodic Events using Quantized Contour Modeling. HLT-NAACL 2010: 721-724
- 2009
- [c10] Andrew Rosenberg, Julia Hirschberg:
Detecting Pitch Accents at the Word, Syllable and Vowel Level. HLT-NAACL (Short Papers) 2009: 81-84
- 2008
- [c9] Sameer Maskey, Andrew Rosenberg, Julia Hirschberg:
Intonational phrases for speech summarization. INTERSPEECH 2008: 2430-2433
- 2007
- [c8] Andrew Rosenberg, Julia Hirschberg:
V-Measure: A Conditional Entropy-Based External Cluster Evaluation Measure. EMNLP-CoNLL 2007: 410-420
- [c7] Fadi Biadsy, Julia Hirschberg, Andrew Rosenberg, Wisam Dakka:
Comparing american and palestinian perceptions of charisma using acoustic-prosodic and lexical analysis. INTERSPEECH 2007: 2221-2224
- [c6] Andrew Rosenberg, Mehrbod Sharifi, Julia Hirschberg:
Varying input segmentation for story boundary detection in English, Arabic and Mandarin broadcast news. INTERSPEECH 2007: 2589-2592
- [c5] Andrew Rosenberg, Julia Hirschberg:
Detecting pitch accent using pitch-corrected energy-based predictors. INTERSPEECH 2007: 2777-2780
- 2006
- [c4] Andrew Rosenberg, Julia Hirschberg:
On the correlation between energy and pitch accent in read English speech. INTERSPEECH 2006
- [c3] Andrew Rosenberg, Julia Hirschberg:
Story Segmentation of Broadcast News in English, Mandarin and Arabic. HLT-NAACL 2006
- 2005
- [c2] Andrew Rosenberg, Julia Hirschberg:
Acoustic/prosodic and lexical correlates of charismatic speech. INTERSPEECH 2005: 513-516
- 2004
- [c1] Andrew Rosenberg, Ed Binkowski:
Augmenting the kappa statistic to determine interannotator reliability for multiply labeled data points. HLT-NAACL (Short Papers) 2004
Informal and Other Publications
- 2024
- [i33] Christopher Li, Gary Wang, Kyle Kastner, Heng Su, Allen Chen, Andrew Rosenberg, Zhehuai Chen, Zelin Wu, Leonid Velikovich, Pat Rondon, Diamantino Caseiro, Petar S. Aleksic:
High-precision Voice Search Query Correction via Retrievable Speech-text Embedings. CoRR abs/2401.04235 (2024)
- [i32] Takaaki Saeki, Gary Wang, Nobuyuki Morioka, Isaac Elias, Kyle Kastner, Andrew Rosenberg, Bhuvana Ramabhadran, Heiga Zen, Françoise Beaufays, Hadar Shemtov:
Extending Multilingual Speech Synthesis to 100+ Languages without Transcribed Data. CoRR abs/2402.18932 (2024)
- [i31] Neeraj Gaur, Rohan Agrawal, Gary Wang, Parisa Haghani, Andrew Rosenberg, Bhuvana Ramabhadran:
ASTRA: Aligning Speech and Text Representations for Asr without Sampling. CoRR abs/2406.06664 (2024)
- [i30] Murali Karthick Baskar, Andrew Rosenberg, Bhuvana Ramabhadran, Neeraj Gaur, Zhong Meng:
Speech Prefix-Tuning with RNNT Loss for Improving LLM Predictions. CoRR abs/2406.14701 (2024)
- [i29] Bolaji Yusuf, Murali Karthick Baskar, Andrew Rosenberg, Bhuvana Ramabhadran:
Speculative Speech Recognition by Audio-Prefixed Low-Rank Adaptation of Language Models. CoRR abs/2407.04641 (2024)
- [i28] Hyun Jin Park, Dhruuv Agarwal, Neng Chen, Rentao Sun, Kurt Partridge, Justin Chen, Harry Zhang, Pai Zhu, Jacob Bartel, Kyle Kastner, Gary Wang, Andrew Rosenberg, Quan Wang:
Utilizing TTS Synthesized Data for Efficient Development of Keyword Spotting Model. CoRR abs/2407.18879 (2024)
- [i27] Hyun Jin Park, Dhruuv Agarwal, Neng Chen, Rentao Sun, Kurt Partridge, Justin Chen, Harry Zhang, Pai Zhu, Jacob Bartel, Kyle Kastner, Gary Wang, Andrew Rosenberg, Quan Wang:
Adversarial training of Keyword Spotting to Minimize TTS Data Overfitting. CoRR abs/2408.10463 (2024)
- [i26] Shikhar Vashishth, Harman Singh, Shikhar Bharadwaj, Sriram Ganapathy, Chulayuth Asawaroengchai, Kartik Audhkhasi, Andrew Rosenberg, Ankur Bapna, Bhuvana Ramabhadran:
STAB: Speech Tokenizer Assessment Benchmark. CoRR abs/2409.02384 (2024)
- [i25] Fadi Biadsy, Youzheng Chen, Isaac Elias, Kyle Kastner, Gary Wang, Andrew Rosenberg, Bhuvana Ramabhadran:
Zero-shot Cross-lingual Voice Transfer for TTS. CoRR abs/2409.13910 (2024)
- 2023
- [i24] Zhong Meng, Weiran Wang, Rohit Prabhavalkar, Tara N. Sainath, Tongzhou Chen, Ehsan Variani, Yu Zhang, Bo Li, Andrew Rosenberg, Bhuvana Ramabhadran:
JEIT: Joint End-to-End Model and Internal Language Model Training for Speech Recognition. CoRR abs/2302.08583 (2023)
- [i23] Yu Zhang, Wei Han, James Qin, Yongqiang Wang, Ankur Bapna, Zhehuai Chen, Nanxin Chen, Bo Li, Vera Axelrod, Gary Wang, Zhong Meng, Ke Hu, Andrew Rosenberg, Rohit Prabhavalkar, Daniel S. Park, Parisa Haghani, Jason Riesa, Ginger Perng, Hagen Soltau, Trevor Strohman, Bhuvana Ramabhadran, Tara N. Sainath, Pedro J. Moreno, Chung-Cheng Chiu, Johan Schalkwyk, Françoise Beaufays, Yonghui Wu:
Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages. CoRR abs/2303.01037 (2023)
- [i22] Gary Wang, Kyle Kastner, Ankur Bapna, Zhehuai Chen, Andrew Rosenberg, Bhuvana Ramabhadran, Yu Zhang:
Understanding Shared Speech-Text Representations. CoRR abs/2304.14514 (2023)
- [i21] Cal Peyser, Zhong Meng, Ke Hu, Rohit Prabhavalkar, Andrew Rosenberg, Tara N. Sainath, Michael Picheny, Kyunghyun Cho:
Improving Joint Speech-Text Representations Without Alignment. CoRR abs/2308.06125 (2023)
- [i20] Yochai Blau, Rohan Agrawal, Lior Madmony, Gary Wang, Andrew Rosenberg, Zhehuai Chen, Zorik Gekhman, Genady Beryozkin, Parisa Haghani, Bhuvana Ramabhadran:
Using Text Injection to Improve Recognition of Personal Identifiers in Speech. CoRR abs/2308.07393 (2023)
- [i19] Murali Karthick Baskar, Andrew Rosenberg, Bhuvana Ramabhadran, Kartik Audhkhasi:
O-1: Self-training with Oracle and 1-best Hypothesis. CoRR abs/2308.07486 (2023)
- 2022
- [i18] Murali Karthick Baskar, Andrew Rosenberg, Bhuvana Ramabhadran, Yu Zhang, Pedro J. Moreno:
Ask2Mask: Guided Data Selection for Masked Speech Modeling. CoRR abs/2202.12719 (2022)
- [i17] Fadi Biadsy, Youzheng Chen, Xia Zhang, Oleg Rybakov, Andrew Rosenberg, Pedro J. Moreno:
A Scalable Model Specialization Framework for Training and Inference using Submodels and its Application to Speech Model Personalization. CoRR abs/2203.12559 (2022)
- [i16] Zhehuai Chen, Yu Zhang, Andrew Rosenberg, Bhuvana Ramabhadran, Pedro J. Moreno, Ankur Bapna, Heiga Zen:
MAESTRO: Matched Speech Text Representations through Modality Matching. CoRR abs/2204.03409 (2022)
- [i15] Alëna Aksënova, Zhehuai Chen, Chung-Cheng Chiu, Daan van Esch, Pavel Golik, Wei Han, Levi King, Bhuvana Ramabhadran, Andrew Rosenberg, Suzan Schwartz, Gary Wang:
Accented Speech Recognition: Benchmarking, Pre-training, and Diverse Data. CoRR abs/2205.08014 (2022)
- [i14] Cal Peyser, W. Ronny Huang, Andrew Rosenberg, Tara N. Sainath, Michael Picheny, Kyunghyun Cho:
Towards Disentangled Speech Representations. CoRR abs/2208.13191 (2022)
- [i13] Gary Wang, Andrew Rosenberg, Bhuvana Ramabhadran, Fadi Biadsy, Yinghui Huang, Jesse Emond, Pedro Moreno Mengibar:
Non-Parallel Voice Conversion for ASR Augmentation. CoRR abs/2209.06987 (2022)
- [i12] Zhehuai Chen, Ankur Bapna, Andrew Rosenberg, Yu Zhang, Bhuvana Ramabhadran, Pedro J. Moreno, Nanxin Chen:
Maestro-U: Leveraging joint speech-text representation learning for zero supervised speech ASR. CoRR abs/2210.10027 (2022)
- [i11] Gary Wang, Ekin D. Cubuk, Andrew Rosenberg, Shuyang Cheng, Ron J. Weiss, Bhuvana Ramabhadran, Pedro J. Moreno, Quoc V. Le, Daniel S. Park:
G-Augment: Searching for the Meta-Structure of Data Augmentation Policies for ASR. CoRR abs/2210.10879 (2022)
- [i10] Takaaki Saeki, Heiga Zen, Zhehuai Chen, Nobuyuki Morioka, Gary Wang, Yu Zhang, Ankur Bapna, Andrew Rosenberg, Bhuvana Ramabhadran:
Virtuoso: Massive Multilingual Speech-Text Joint Semi-Supervised Learning for Text-To-Speech. CoRR abs/2210.15447 (2022)
- 2021
- [i9] Zhehuai Chen, Yu Zhang, Andrew Rosenberg, Bhuvana Ramabhadran, Gary Wang, Pedro J. Moreno:
Injecting Text in Self-Supervised Speech Pretraining. CoRR abs/2108.12226 (2021)
- 2020
- [i8] Guangzhi Sun, Yu Zhang, Ron J. Weiss, Yuan Cao, Heiga Zen, Andrew Rosenberg, Bhuvana Ramabhadran, Yonghui Wu:
Generating diverse and natural text-to-speech samples using a quantized fine-grained VAE and auto-regressive prosody prior. CoRR abs/2002.03788 (2020)
- 2019
- [i7] Yu Zhang, Ron J. Weiss, Heiga Zen, Yonghui Wu, Zhifeng Chen, R. J. Skerry-Ryan, Ye Jia, Andrew Rosenberg, Bhuvana Ramabhadran:
Learning to Speak Fluently in a Foreign Language: Multilingual Speech Synthesis and Cross-Language Voice Cloning. CoRR abs/1907.04448 (2019)
- [i6] Andrew Rosenberg, Yu Zhang, Bhuvana Ramabhadran, Ye Jia, Pedro J. Moreno, Yonghui Wu, Zelin Wu:
Speech Recognition with Augmented Synthesized Speech. CoRR abs/1909.11699 (2019)
- 2018
- [i5] Xuesong Yang, Kartik Audhkhasi, Andrew Rosenberg, Samuel Thomas, Bhuvana Ramabhadran, Mark Hasegawa-Johnson:
Joint Modeling of Accents and Acoustics for Multi-Accent Speech Recognition. CoRR abs/1802.02656 (2018)
- [i4] Denys Katerenchuk, Andrew Rosenberg:
RankDCG: Rank-Ordering Evaluation Measure. CoRR abs/1803.00719 (2018)
- 2017
- [i3] Kartik Audhkhasi, Andrew Rosenberg, Abhinav Sethy, Bhuvana Ramabhadran, Brian Kingsbury:
End-to-End ASR-free Keyword Search from Speech. CoRR abs/1701.04313 (2017)
- 2012
- [i2] Ilknur Icke, Andrew Rosenberg:
Visual and semantic interpretability of projections of high dimensional data for classification tasks. CoRR abs/1205.4776 (2012)
- 2010
- [i1] Ilknur Icke, Andrew Rosenberg:
Multi-Objective Genetic Programming Projection Pursuit for Exploratory Data Modeling. CoRR abs/1010.1888 (2010)
last updated on 2024-10-22 20:17 CEST by the dblp team
all metadata released as open data under CC0 1.0 license