
Michael L. Seltzer
2020 – today
- 2020
- [c62]Duc Le, Thilo Köhler, Christian Fuegen, Michael L. Seltzer:
G2G: TTS-Driven Pronunciation Learning for Graphemic Hybrid ASR. ICASSP 2020: 6869-6873
- [c61]Yongqiang Wang, Abdelrahman Mohamed, Duc Le, Chunxi Liu, Alex Xiao, Jay Mahadeokar, Hongzhao Huang, Andros Tjandra, Xiaohui Zhang, Frank Zhang, Christian Fuegen, Geoffrey Zweig, Michael L. Seltzer:
Transformer-Based Acoustic Modeling for Hybrid Speech Recognition. ICASSP 2020: 6874-6878
- [c60]Yi-Chen Chen, Zhaojun Yang, Ching-Feng Yeh, Mahaveer Jain, Michael L. Seltzer:
Aipnet: Generative Adversarial Pre-Training of Accent-Invariant Networks for End-To-End Speech Recognition. ICASSP 2020: 6979-6983
- [c59]Yangyang Shi, Yongqiang Wang, Chunyang Wu, Christian Fuegen, Frank Zhang, Duc Le, Ching-Feng Yeh, Michael L. Seltzer:
Weak-Attention Suppression for Transformer Based Speech Recognition. INTERSPEECH 2020: 4996-5000
- [i16]Yangyang Shi, Yongqiang Wang, Chunyang Wu, Christian Fuegen, Frank Zhang, Duc Le, Ching-Feng Yeh, Michael L. Seltzer:
Weak-Attention Suppression For Transformer Based Speech Recognition. CoRR abs/2005.09137 (2020)
- [i15]Yangyang Shi, Yongqiang Wang, Chunyang Wu, Ching-Feng Yeh, Julian Chan, Frank Zhang, Duc Le, Michael L. Seltzer:
Emformer: Efficient Memory Transformer Based Acoustic Model For Low Latency Streaming Speech Recognition. CoRR abs/2010.10759 (2020)
- [i14]Suyoun Kim, Yuan Shangguan, Jay Mahadeokar, Antoine Bruguier, Christian Fuegen, Michael L. Seltzer, Duc Le:
Improved Neural Language Model Fusion for Streaming Recurrent Neural Network Transducer. CoRR abs/2010.13878 (2020)
- [i13]Jay Mahadeokar, Yuan Shangguan, Duc Le, Gil Keren, Hang Su, Thong Le, Ching-Feng Yeh, Christian Fuegen, Michael L. Seltzer:
Alignment Restricted Streaming Recurrent Neural Network Transducer. CoRR abs/2011.03072 (2020)
- [i12]Ching-Feng Yeh, Yongqiang Wang, Yangyang Shi, Chunyang Wu, Frank Zhang, Julian Chan, Michael L. Seltzer:
Streaming Attention-Based Models with Augmented Memory for End-to-End Speech Recognition. CoRR abs/2011.07120 (2020)
- [i11]Duc Le, Gil Keren, Julian Chan, Jay Mahadeokar, Christian Fuegen, Michael L. Seltzer:
Deep Shallow Fusion for RNN-T Personalization. CoRR abs/2011.07754 (2020)
2010 – 2019
- 2019
- [j12]Shinji Watanabe, Shoko Araki, Michiel Bacchiani, Reinhold Haeb-Umbach, Michael L. Seltzer:
Introduction to the Issue on Far-Field Speech Processing in the Era of Deep Learning: Speech Enhancement, Separation, and Recognition. IEEE J. Sel. Top. Signal Process. 13(4): 785-786 (2019)
- [j11]Reinhold Haeb-Umbach, Shinji Watanabe, Tomohiro Nakatani, Michiel Bacchiani, Björn Hoffmeister, Michael L. Seltzer, Heiga Zen, Mehrez Souden:
Speech Processing for Digital Home Assistants: Combining signal processing with deep-learning techniques. IEEE Signal Process. Mag. 36(6): 111-124 (2019)
- [c58]Duc Le, Xiaohui Zhang, Weiyi Zheng, Christian Fügen, Geoffrey Zweig, Michael L. Seltzer:
From Senones to Chenones: Tied Context-Dependent Graphemes for Hybrid Speech Recognition. ASRU 2019: 457-464
- [c57]Zhehuai Chen, Mahaveer Jain, Yongqiang Wang, Michael L. Seltzer, Christian Fuegen:
End-to-end Contextual Speech Recognition Using Class Language Models and a Token Passing Decoder. ICASSP 2019: 6186-6190
- [c56]Zhehuai Chen, Mahaveer Jain, Yongqiang Wang, Michael L. Seltzer, Christian Fuegen:
Joint Grapheme and Phoneme Embeddings for Contextual End-to-End ASR. INTERSPEECH 2019: 3490-3494
- [i10]Duc Le, Xiaohui Zhang, Weiyi Zheng, Christian Fügen, Geoffrey Zweig, Michael L. Seltzer:
From Senones to Chenones: Tied Context-Dependent Graphemes for Hybrid Speech Recognition. CoRR abs/1910.01493 (2019)
- [i9]Yongqiang Wang, Abdelrahman Mohamed, Duc Le, Chunxi Liu, Alex Xiao, Jay Mahadeokar, Hongzhao Huang, Andros Tjandra, Xiaohui Zhang, Frank Zhang, Christian Fuegen, Geoffrey Zweig, Michael L. Seltzer:
Transformer-based Acoustic Modeling for Hybrid Speech Recognition. CoRR abs/1910.09799 (2019)
- [i8]Duc Le, Thilo Köhler, Christian Fuegen, Michael L. Seltzer:
G2G: TTS-Driven Pronunciation Learning for Graphemic Hybrid ASR. CoRR abs/1910.12612 (2019)
- [i7]Ching-Feng Yeh, Jay Mahadeokar, Kaustubh Kalgaonkar, Yongqiang Wang, Duc Le, Mahaveer Jain, Kjell Schubert, Christian Fuegen, Michael L. Seltzer:
Transformer-Transducer: End-to-End Speech Recognition with Self-Attention. CoRR abs/1910.12977 (2019)
- [i6]Mahaveer Jain, Kjell Schubert, Jay Mahadeokar, Ching-Feng Yeh, Kaustubh Kalgaonkar, Anuroop Sriram, Christian Fuegen, Michael L. Seltzer:
RNN-T For Latency Controlled ASR With Improved Beam Search. CoRR abs/1911.01629 (2019)
- [i5]Yi-Chen Chen, Zhaojun Yang, Ching-Feng Yeh, Mahaveer Jain, Michael L. Seltzer:
AIPNet: Generative Adversarial Pre-training of Accent-invariant Networks for End-to-end Speech Recognition. CoRR abs/1911.11935 (2019)
- 2018
- [c55]Suyoun Kim, Michael L. Seltzer:
Towards Language-Universal End-to-End Speech Recognition. ICASSP 2018: 4914-4918
- [c54]Zhuo Chen, Takuya Yoshioka, Xiong Xiao, Jinyu Li, Michael L. Seltzer, Yifan Gong:
Efficient Integration of Fixed Beamformers and Speech Separation Networks for Multi-Channel Far-Field Speech Separation. ICASSP 2018: 5384-5388
- [c53]Suyoun Kim, Michael L. Seltzer, Jinyu Li, Rui Zhao:
Improved Training for Online End-to-end Speech Recognition Systems. INTERSPEECH 2018: 2913-2917
- [i4]Zhehuai Chen, Mahaveer Jain, Yongqiang Wang, Michael L. Seltzer, Christian Fuegen:
End-to-end contextual speech recognition using class language models and a token passing decoder. CoRR abs/1812.02142 (2018)
- 2017
- [j10]Wayne Xiong, Jasha Droppo, Xuedong Huang, Frank Seide, Michael L. Seltzer, Andreas Stolcke, Dong Yu, Geoffrey Zweig:
Toward Human Parity in Conversational Speech Recognition. IEEE ACM Trans. Audio Speech Lang. Process. 25(12): 2410-2423 (2017)
- [c52]Baolin Peng, Michael L. Seltzer, Y. C. Ju, Geoffrey Zweig, Kam-Fai Wong:
May I take your order? A Neural Model for Extracting Structured Information from Conversations. EACL (1) 2017: 450-459
- [c51]Tom Ko, Vijayaditya Peddinti, Daniel Povey, Michael L. Seltzer, Sanjeev Khudanpur:
A study on data augmentation of reverberant speech for robust speech recognition. ICASSP 2017: 5220-5224
- [c50]Jinyu Li, Michael L. Seltzer, Xi Wang, Rui Zhao, Yifan Gong:
Large-Scale Domain Adaptation via Teacher-Student Learning. INTERSPEECH 2017: 2386-2390
- [p3]Xiong Xiao, Shinji Watanabe, Hakan Erdogan, Michael I. Mandel, Liang Lu, John R. Hershey, Michael L. Seltzer, Guoguo Chen, Yu Zhang, Dong Yu:
Discriminative Beamforming with Phase-Aware Neural Networks for Speech Enhancement and Recognition. New Era for Robust Speech Recognition, Exploiting Deep Learning 2017: 79-104
- [i3]Jinyu Li, Michael L. Seltzer, Xi Wang, Rui Zhao, Yifan Gong:
Large-Scale Domain Adaptation via Teacher-Student Learning. CoRR abs/1708.05466 (2017)
- [i2]Suyoun Kim, Michael L. Seltzer:
Towards Language-Universal End-to-End Speech Recognition. CoRR abs/1711.02207 (2017)
- [i1]Suyoun Kim, Michael L. Seltzer, Jinyu Li, Rui Zhao:
Improved training for online end-to-end speech recognition systems. CoRR abs/1711.02212 (2017)
- 2016
- [c49]Pegah Ghahremani, Jasha Droppo, Michael L. Seltzer:
Linearly augmented deep neural network. ICASSP 2016: 5085-5089
- [c48]Xiong Xiao, Shinji Watanabe, Hakan Erdogan, Liang Lu, John R. Hershey, Michael L. Seltzer, Guoguo Chen, Yu Zhang, Michael I. Mandel, Dong Yu:
Deep beamforming networks for multi-channel speech recognition. ICASSP 2016: 5745-5749
- [c47]Tasha Nagamine, Michael L. Seltzer, Nima Mesgarani:
On the Role of Nonlinear Transformations in Deep Neural Network Acoustic Models. INTERSPEECH 2016: 803-807
- 2015
- [j9]Chao Weng, Dong Yu, Michael L. Seltzer, Jasha Droppo:
Deep Neural Networks for Single-Channel Multi-Talker Speech Recognition. IEEE ACM Trans. Audio Speech Lang. Process. 23(10): 1670-1679 (2015)
- [c46]Yu Zhang, Dong Yu, Michael L. Seltzer, Jasha Droppo:
Speech recognition with prediction-adaptation-correction recurrent neural networks. ICASSP 2015: 5004-5008
- [c45]Ritwik Giri, Michael L. Seltzer, Jasha Droppo, Dong Yu:
Improving speech recognition in reverberation using a room-aware deep neural network and multi-task learning. ICASSP 2015: 5014-5018
- [c44]Tasha Nagamine, Michael L. Seltzer, Nima Mesgarani:
Exploring how deep neural networks form phonemic categories. INTERSPEECH 2015: 1912-1916
- 2014
- [c43]Hyunson Seo, Hong-Goo Kang, Michael L. Seltzer:
Factored adaptation of speaker and environment using orthogonal subspace transforms. ICASSP 2014: 3251-3255
- [c42]Chao Weng, Dong Yu, Michael L. Seltzer, Jasha Droppo:
Single-channel mixed speech recognition using deep neural networks. ICASSP 2014: 5632-5636
- [c41]Yan Huang, Malcolm Slaney, Michael L. Seltzer, Yifan Gong:
Towards better performance with heterogeneous training data in acoustic modeling using deep neural networks. INTERSPEECH 2014: 845-849
- [c40]Malcolm Slaney, Michael L. Seltzer:
The influence of pitch and noise on the discriminability of filterbank features. INTERSPEECH 2014: 2263-2267
- [c39]Dong Yu, Adam Eversole, Michael L. Seltzer, Kaisheng Yao, Brian Guenter, Oleksii Kuchaiev, Frank Seide, Huaming Wang, Jasha Droppo, Zhiheng Huang, Geoffrey Zweig, Christopher J. Rossbach, Jon Currey:
An introduction to computational networks and the computational network toolkit (invited talk). INTERSPEECH 2014
- 2013
- [c38]Samuel Thomas, Michael L. Seltzer, Kenneth Church, Hynek Hermansky:
Deep neural network features and semi-supervised training for low resource speech recognition. ICASSP 2013: 6704-6708
- [c37]Michael L. Seltzer, Jasha Droppo:
Multi-task learning in deep neural networks for improved phoneme recognition. ICASSP 2013: 6965-6969
- [c36]Michael L. Seltzer, Dong Yu, Yongqiang Wang:
An investigation of deep neural networks for noise robust speech recognition. ICASSP 2013: 7398-7402
- [c35]Li Deng, Jinyu Li, Jui-Ting Huang, Kaisheng Yao, Dong Yu, Frank Seide, Michael L. Seltzer, Geoffrey Zweig, Xiaodong He, Jason D. Williams, Yifan Gong, Alex Acero:
Recent advances in deep learning for speech research at Microsoft. ICASSP 2013: 8604-8608
- [c34]Dong Yu, Michael L. Seltzer, Jinyu Li, Jui-Ting Huang, Frank Seide:
Feature Learning in Deep Neural Networks - A Study on Speech Recognition Tasks. ICLR 2013
- 2012
- [c33]Jinyu Li, Michael L. Seltzer, Yifan Gong:
Improvements to VTS feature enhancement. ICASSP 2012: 4677-4680
- [c32]Michael L. Seltzer, Alex Acero:
Factored adaptation using a combination of feature-space and model-space transforms. INTERSPEECH 2012: 1792-1795
- [c31]Jinyu Li, Michael L. Seltzer, Yifan Gong:
Efficient VTS Adaptation Using Jacobian Approximation. INTERSPEECH 2012: 1906-1909
- [p2]Michael L. Seltzer:
Acoustic Model Training for Robust Speech Recognition. Techniques for Noise Robustness in Automatic Speech Recognition 2012: 347-368
- 2011
- [j8]Michael L. Seltzer, Yun-Cheng Ju, Ivan Tashev, Ye-Yi Wang, Dong Yu:
In-Car Media Search. IEEE Signal Process. Mag. 28(4): 50-60 (2011)
- [c30]Michael L. Seltzer, Alex Acero:
Factored adaptation for separable compensation of speaker and environmental variability. ASRU 2011: 146-151
- [c29]Flavio P. Ribeiro, Dinei A. F. Florêncio, Cha Zhang, Michael L. Seltzer:
CROWDMOS: An approach for crowdsourcing mean opinion score studies. ICASSP 2011: 2416-2419
- [c28]Xing Fan, Michael L. Seltzer, Jasha Droppo, Henrique S. Malvar, Alex Acero:
Joint encoding of the waveform and speech recognition features using a transform codec. ICASSP 2011: 5148-5151
- [c27]Dong Yu, Michael L. Seltzer:
Improved Bottleneck Features Using Pretrained Deep Neural Networks. INTERSPEECH 2011: 237-240
- [c26]Michael L. Seltzer, Alex Acero:
Separating Speaker and Environmental Variability Using Factored Transforms. INTERSPEECH 2011: 1097-1100
- 2010
- [j7]Ozlem Kalinli, Michael L. Seltzer, Jasha Droppo, Alex Acero:
Noise Adaptive Training for Robust Automatic Speech Recognition. IEEE Trans. Speech Audio Process. 18(8): 1889-1901 (2010)
- [c25]Michael L. Seltzer, Alex Acero, Kaustubh Kalgaonkar:
Acoustic model adaptation via Linear Spline Interpolation for robust speech recognition. ICASSP 2010: 4550-4553
- [c24]Michael L. Seltzer, Alex Acero:
HMM adaptation using linear spline interpolation with integrated spline parameter training for robust speech recognition. INTERSPEECH 2010: 1664-1667
- [c23]Li Deng, Michael L. Seltzer, Dong Yu, Alex Acero, Abdel-rahman Mohamed, Geoffrey E. Hinton:
Binary coding of speech spectrograms using a deep auto-encoder. INTERSPEECH 2010: 1692-1695
2000 – 2009
- 2009
- [c22]Kaustubh Kalgaonkar, Michael L. Seltzer, Alex Acero:
Noise robust model adaptation using linear spline interpolation. ASRU 2009: 199-204
- [c21]Michael L. Seltzer, Lei Zhang:
The data deluge: Challenges and opportunities of unlimited data in statistical signal processing. ICASSP 2009: 3701-3704
- [c20]Ozlem Kalinli, Michael L. Seltzer, Alex Acero:
Noise adaptive training using a vector taylor series approach for noise robust automatic speech recognition. ICASSP 2009: 3825-3828
- [c19]Young-In Song, Ye-Yi Wang, Yun-Cheng Ju, Michael L. Seltzer, Ivan Tashev, Alex Acero:
Voice search of structured media data. ICASSP 2009: 3941-3944
- [c18]Yun-Cheng Ju, Michael L. Seltzer, Ivan Tashev:
Improving perceived accuracy for in-car media search. INTERSPEECH 2009: 979-982
- 2008
- [c17]Ivan Tashev, Jasha Droppo, Michael L. Seltzer, Alex Acero:
Robust design of wideband loudspeaker arrays. ICASSP 2008: 381-384
- [c16]Graham W. Taylor, Michael L. Seltzer, Alex Acero:
Maximum a posteriori ICA: Applying prior knowledge to the separation of acoustic sources. ICASSP 2008: 1821-1824
- [c15]Jasha Droppo, Michael L. Seltzer, Alex Acero, Yu-Hsiang Bosco Chiu:
Towards a non-parametric acoustic model: an acoustic decision tree for observation probability calculation. INTERSPEECH 2008: 289-292
- 2007
- [j6]Amarnag Subramanya, Michael L. Seltzer, Alejandro Acero:
Automatic Removal of Typed Keystrokes From Speech Signals. IEEE Signal Process. Lett. 14(5): 363-366 (2007)
- [j5]Michael L. Seltzer, Alex Acero:
Training Wideband Acoustic Models Using Mixed-Bandwidth Training Data for Speech Recognition. IEEE Trans. Speech Audio Process. 15(1): 235-245 (2007)
- [c14]Michael L. Seltzer, Ivan Tashev, Alex Acero:
Microphone Array Post-Filter using Incremental Bayes Learning to Track the Spatial Distributions of Speech and Noise. ICASSP (1) 2007: 29-32
- [c13]Michael L. Seltzer, Yun-Cheng Ju, Ivan Tashev, Alex Acero:
Robust location understanding in spoken dialog systems using intersections. INTERSPEECH 2007: 2813-2816
- [c12]Ivan Tashev, Michael L. Seltzer, Yun-Cheng Ju, Dong Yu, Alex Acero:
Commute UX: Telephone Dialog System for Location-based Services. SIGdial 2007: 87-94
- 2006
- [j4]Michael L. Seltzer, Richard M. Stern:
Subband Likelihood-Maximizing Beamforming for Speech Recognition in Reverberant Environments. IEEE Trans. Speech Audio Process. 14(6): 2109-2121 (2006)
- [c11]Amarnag Subramanya, Michael L. Seltzer, Alex Acero:
Automatic removal of typed keystrokes from speech signals. INTERSPEECH 2006
- 2005
- [c10]Michael L. Seltzer, Alex Acero:
Training Wideband Acoustic Models using Mixed-Bandwidth Training Data via Feature Bandwidth Extension. ICASSP (1) 2005: 921-924
- [c9]Michael L. Seltzer, Alex Acero, Jasha Droppo:
Robust bandwidth extension of noise-corrupted narrowband speech. INTERSPEECH 2005: 1509-1512
- [p1]Bhiksha Raj, Michael L. Seltzer, Manuel Jesus Reyes-Gomez:
Speech Recognizer Based Maximum Likelihood Beamforming. Speech Separation by Humans and Machines 2005: 65-82
- 2004
- [j3]Bhiksha Raj, Michael L. Seltzer, Richard M. Stern:
Reconstruction of missing features for robust speech recognition. Speech Commun. 43(4): 275-296 (2004)
- [j2]Michael L. Seltzer, Bhiksha Raj, Richard M. Stern:
A Bayesian classifier for spectrographic mask estimation for missing feature speech recognition. Speech Commun. 43(4): 379-393 (2004)
- [j1]Michael L. Seltzer, Bhiksha Raj, Richard M. Stern:
Likelihood-maximizing beamforming for robust hands-free speech recognition. IEEE Trans. Speech Audio Process. 12(5): 489-498 (2004)
- [c8]Michael L. Seltzer, Richard M. Stern:
Parameter sharing in subband likelihood-maximizing beamforming for speech recognition using microphone arrays. ICASSP (1) 2004: 881-884
- 2003
- [c7]Michael L. Seltzer, Richard M. Stern:
Subband parameter optimization of microphone arrays for speech recognition in reverberant environments. ICASSP (1) 2003: 408-411
- [c6]Michael L. Seltzer, Jasha Droppo, Alex Acero:
A harmonic-model-based front end for robust speech recognition. INTERSPEECH 2003
- 2002
- [c5]Michael L. Seltzer, Bhiksha Raj, Richard M. Stern:
Speech recognizer-based microphone array processing for robust hands-free speech recognition. ICASSP 2002: 897-900
- 2001
- [c4]Rita Singh, Michael L. Seltzer, Bhiksha Raj, Richard M. Stern:
Speech in Noisy Environments: robust automatic segmentation, feature extraction, and hypothesis combination. ICASSP 2001: 273-276
- [c3]Michael L. Seltzer, Bhiksha Raj:
Calibration of microphone arrays for improved speech recognition. INTERSPEECH 2001: 1005-1008
- 2000
- [c2]Bhiksha Raj, Michael L. Seltzer, Richard M. Stern:
Reconstruction of damaged spectrographic features for robust speech recognition. INTERSPEECH 2000: 357-360
- [c1]Michael L. Seltzer, Bhiksha Raj, Richard M. Stern:
Classifier-based mask estimation for missing feature methods of robust speech recognition. INTERSPEECH 2000: 538-541
last updated on 2021-01-19 22:28 CET by the dblp team
all metadata released as open data under CC0 1.0 license