Shinsuke Sakai
2020 – today
- 2022
- [c47]Hayato Futami, Hirofumi Inaguma, Sei Ueno, Masato Mimura, Shinsuke Sakai, Tatsuya Kawahara:
Non-autoregressive Error Correction for CTC-based ASR with Phone-conditioned Masked LM. INTERSPEECH 2022: 3889-3893
- [i7]Hayato Futami, Hirofumi Inaguma, Masato Mimura, Shinsuke Sakai, Tatsuya Kawahara:
Distilling the Knowledge of BERT for CTC-based ASR. CoRR abs/2209.02030 (2022)
- [i6]Hayato Futami, Hirofumi Inaguma, Sei Ueno, Masato Mimura, Shinsuke Sakai, Tatsuya Kawahara:
Non-autoregressive Error Correction for CTC-based ASR with Phone-conditioned Masked LM. CoRR abs/2209.04062 (2022)
- 2021
- [c46]Masato Mimura, Shinsuke Sakai, Tatsuya Kawahara:
An End-To-End Model from Speech to Clean Transcript for Parliamentary Meetings. APSIPA ASC 2021: 465-470
- [c45]Sei Ueno, Masato Mimura, Shinsuke Sakai, Tatsuya Kawahara:
Data Augmentation for ASR Using TTS Via a Discrete Representation. ASRU 2021: 68-75
- [c44]Hayato Futami, Hirofumi Inaguma, Masato Mimura, Shinsuke Sakai, Tatsuya Kawahara:
ASR Rescoring and Confidence Estimation with Electra. ASRU 2021: 380-387
- [i5]Hayato Futami, Hirofumi Inaguma, Masato Mimura, Shinsuke Sakai, Tatsuya Kawahara:
ASR Rescoring and Confidence Estimation with ELECTRA. CoRR abs/2110.01857 (2021)
- 2020
- [c43]Kohei Matsuura, Masato Mimura, Shinsuke Sakai, Tatsuya Kawahara:
Generative Adversarial Training Data Adaptation for Very Low-Resource Automatic Speech Recognition. INTERSPEECH 2020: 2737-2741
- [c42]Hayato Futami, Hirofumi Inaguma, Sei Ueno, Masato Mimura, Shinsuke Sakai, Tatsuya Kawahara:
Distilling the Knowledge of BERT for Sequence-to-Sequence ASR. INTERSPEECH 2020: 3635-3639
- [c41]Kohei Matsuura, Sei Ueno, Masato Mimura, Shinsuke Sakai, Tatsuya Kawahara:
Speech Corpus of Ainu Folklore and End-to-end Speech Recognition for Ainu Language. LREC 2020: 2622-2628
- [i4]Kohei Matsuura, Sei Ueno, Masato Mimura, Shinsuke Sakai, Tatsuya Kawahara:
Speech Corpus of Ainu Folklore and End-to-end Speech Recognition for Ainu Language. CoRR abs/2002.06675 (2020)
- [i3]Kohei Matsuura, Masato Mimura, Shinsuke Sakai, Tatsuya Kawahara:
Generative Adversarial Training Data Adaptation for Very Low-resource Automatic Speech Recognition. CoRR abs/2005.09256 (2020)
- [i2]Hayato Futami, Hirofumi Inaguma, Sei Ueno, Masato Mimura, Shinsuke Sakai, Tatsuya Kawahara:
Distilling the Knowledge of BERT for Sequence-to-Sequence ASR. CoRR abs/2008.03822 (2020)
2010 – 2019
- 2019
- [c40]Sei Ueno, Masato Mimura, Shinsuke Sakai, Tatsuya Kawahara:
Multi-speaker Sequence-to-sequence Speech Synthesis for Data Augmentation in Acoustic-to-word Speech Recognition. ICASSP 2019: 6161-6165
- [i1]Hirofumi Inaguma, Masato Mimura, Shinsuke Sakai, Tatsuya Kawahara:
Improving OOV Detection and Resolution with External Language Models in Acoustic-to-Word ASR. CoRR abs/1909.09993 (2019)
- 2018
- [c39]Masato Mimura, Shinsuke Sakai, Tatsuya Kawahara:
Forward-Backward Attention Decoder. INTERSPEECH 2018: 2232-2236
- [c38]Sei Ueno, Takafumi Moriya, Masato Mimura, Shinsuke Sakai, Yusuke Shinohara, Yoshikazu Yamaguchi, Yushi Aono, Tatsuya Kawahara:
Encoder Transfer for Attention-based Acoustic-to-word Speech Recognition. INTERSPEECH 2018: 2424-2428
- [c37]Hirofumi Inaguma, Masato Mimura, Shinsuke Sakai, Tatsuya Kawahara:
Improving OOV Detection and Resolution with External Language Models in Acoustic-to-Word ASR. SLT 2018: 212-218
- [c36]Masato Mimura, Sei Ueno, Hirofumi Inaguma, Shinsuke Sakai, Tatsuya Kawahara:
Leveraging Sequence-to-Sequence Speech Synthesis for Enhancing Acoustic-to-Word Speech Recognition. SLT 2018: 477-484
- 2017
- [c35]Masato Mimura, Shinsuke Sakai, Tatsuya Kawahara:
Cross-domain speech recognition using nonparallel corpora with cycle-consistent adversarial networks. ASRU 2017: 134-140
- [c34]Sheng Li, Xugang Lu, Shinsuke Sakai, Masato Mimura, Tatsuya Kawahara:
Semi-supervised ensemble DNN acoustic model training. ICASSP 2017: 5270-5274
- [c33]Masato Mimura, Yoshiaki Bando, Kazuki Shimada, Shinsuke Sakai, Kazuyoshi Yoshii, Tatsuya Kawahara:
Combined Multi-Channel NMF-Based Robust Beamforming for Noisy Speech Recognition. INTERSPEECH 2017: 2451-2455
- 2016
- [c32]Masato Mimura, Shinsuke Sakai, Tatsuya Kawahara:
Joint Optimization of Denoising Autoencoder and DNN Acoustic Model Based on Multi-Target Learning for Noisy Speech Recognition. INTERSPEECH 2016: 3803-3807
- 2015
- [j6]Masato Mimura, Shinsuke Sakai, Tatsuya Kawahara:
Reverberant speech recognition combining deep neural networks and deep autoencoders augmented with a phone-class feature. EURASIP J. Adv. Signal Process. 2015: 62 (2015)
- [c31]Masato Mimura, Shinsuke Sakai, Tatsuya Kawahara:
Deep autoencoders augmented with phone-class feature for reverberant speech recognition. ICASSP 2015: 4365-4369
- [c30]Masato Mimura, Shinsuke Sakai, Tatsuya Kawahara:
Speech dereverberation using long short-term memory. INTERSPEECH 2015: 2435-2439
- 2014
- [c29]Masato Mimura, Shinsuke Sakai, Tatsuya Kawahara:
Exploring deep neural networks and deep autoencoders in reverberant speech recognition. HSCMA 2014: 197-201
- 2013
- [j5]Sakriani Sakti, Michael Paul, Andrew M. Finch, Shinsuke Sakai, Thang Tat Vu, Noriyuki Kimura, Chiori Hori, Eiichiro Sumita, Satoshi Nakamura, Jun Park, Chai Wutiwiwatchai, Bo Xu, Hammam Riza, Karunesh Arora, Chi Mai Luong, Haizhou Li:
A-STAR: Toward translating Asian spoken languages. Comput. Speech Lang. 27(2): 509-527 (2013)
- [j4]Shinsuke Sakai, Tatsuya Kawahara:
Admissible Stopping in Viterbi Beam Search for Unit Selection Speech Synthesis. IEICE Trans. Inf. Syst. 96-D(6): 1359-1367 (2013)
- 2011
- [j3]Shinsuke Sakai, Tatsuya Kawahara, Hisashi Kawai:
Probabilistic Concatenation Modeling for Corpus-Based Speech Synthesis. IEICE Trans. Inf. Syst. 94-D(10): 2006-2014 (2011)
- [c28]Yu Tsao, Shigeki Matsuda, Shinsuke Sakai, Ryosuke Isotani, Hisashi Kawai, Satoshi Nakamura:
A sampling-based environment population projection approach for rapid acoustic model adaptation. ICASSP 2011: 5504-5507
- 2010
- [c27]Yoshinori Shiga, Tomoki Toda, Shinsuke Sakai, Jinfu Ni, Hisashi Kawai, Keiichi Tokuda, Minoru Tsuzaki, Satoshi Nakamura:
NICT Blizzard Challenge 2010 Entry. Blizzard Challenge 2010
- [c26]Yoshinori Shiga, Tomoki Toda, Shinsuke Sakai, Hisashi Kawai:
Improved training of excitation for HMM-based parametric speech synthesis. INTERSPEECH 2010: 809-812
2000 – 2009
- 2009
- [c25]Ranniery Maia, Tomoki Toda, Shinsuke Sakai, Yoshinori Shiga, Jinfu Ni, Hisashi Kawai, Keiichi Tokuda, Minoru Tsuzaki, Satoshi Nakamura:
The NICT Entry for the Blizzard Challenge 2009: an Enhanced HMM-based Speech Synthesis System with Trajectory Training considering Global Variance and State-Dependent Mixed Excitation. Blizzard Challenge 2009
- [c24]Shinsuke Sakai, Tatsuya Kawahara, Tohru Shimizu, Satoshi Nakamura:
Optimal learning of P-Layer additive F0 models with cross-validation. ICASSP 2009: 4245-4248
- [c23]Jinfu Ni, Shinsuke Sakai, Tohru Shimizu, Satoshi Nakamura:
CART-based modeling of Chinese tonal patterns with a functional model tracing the fundamental frequency trajectories. ICASSP 2009: 4253-4256
- [c22]Shinsuke Sakai, Ranniery Maia, Hisashi Kawai, Satoshi Nakamura:
A close look into the probabilistic concatenation model for corpus-based speech synthesis. INTERSPEECH 2009: 752-755
- [c21]Ranniery Maia, Tomoki Toda, Keiichi Tokuda, Shinsuke Sakai, Satoshi Nakamura:
A decision tree-based clustering approach to state definition in an excitation modeling framework for HMM-based speech synthesis. INTERSPEECH 2009: 1783-1786
- [c20]Jinfu Ni, Shinsuke Sakai, Hisashi Kawai, Satoshi Nakamura:
Hyperbolic structure of fundamental frequency contour. IUCS 2009: 389-394
- 2008
- [c19]Ranniery Maia, Jinfu Ni, Shinsuke Sakai, Tomoki Toda, Keiichi Tokuda, Tohru Shimizu, Satoshi Nakamura:
The NICT/ATR speech synthesis system for the Blizzard Challenge 2008. Blizzard Challenge 2008
- [c18]Sakriani Sakti, Eka Kelana, Hammam Riza, Shinsuke Sakai, Konstantin Markov, Satoshi Nakamura:
Development of Indonesian Large Vocabulary Continuous Speech Recognition System within A-STAR Project. IJCNLP 2008: 19-24
- [c17]Keiichiro Oura, Yoshihiko Nankaku, Tomoki Toda, Keiichi Tokuda, Ranniery Maia, Shinsuke Sakai, Satoshi Nakamura:
Simultaneous Acoustic, Prosodic, and Phrasing Model Training for TTs Conversion Systems. ISCSLP 2008: 1-4
- [c16]Jinfu Ni, Shinsuke Sakai, Tohru Shimizu, Satoshi Nakamura:
Frequency Modulation Technique for Prosodic Modification. ISCSLP 2008: 117-120
- [c15]Jinfu Ni, Shinsuke Sakai, Tohru Shimizu, Satoshi Nakamura:
Prosody Modeling from Tone to Intonation in Chinese using a Functional F0 Model. ISUC 2008: 397-404
- 2007
- [c14]Jinfu Ni, Toshio Hirai, Hisashi Kawai, Tomoki Toda, Keiichi Tokuda, Minoru Tsuzaki, Shinsuke Sakai, Ranniery Maia, Satoshi Nakamura:
ATRECSS - ATR English speech corpus for speech synthesis. Blizzard Challenge 2007
- [c13]Shinsuke Sakai, Jinfu Ni, Ranniery Maia, Keiichi Tokuda, Minoru Tsuzaki, Tomoki Toda, Hisashi Kawai, Satoshi Nakamura:
Communicative speech synthesis with XIMERA: a first step. SSW 2007: 28-33
- 2006
- [c12]Shinsuke Sakai, Tatsuya Kawahara:
Decision tree-based training of probabilistic concatenation models for corpus-based speech synthesis. INTERSPEECH 2006
- 2005
- [j2]Shinsuke Sakai:
Fundamental Frequency Modeling for Speech Synthesis Based on a Statistical Learning Technique. IEICE Trans. Inf. Syst. 88-D(3): 489-495 (2005)
- [c11]Shinsuke Sakai:
Additive Modeling of English F0 Contour for Speech Synthesis. ICASSP (1) 2005: 277-280
- [c10]Shinsuke Sakai, Han Shu:
A probabilistic approach to unit selection for corpus-based speech synthesis. INTERSPEECH 2005: 81-84
- 2004
- [c9]Shinsuke Sakai:
F0 modeling with multi-layer additive modeling based on a statistical learning technique. SSW 2004: 151-154
- 2000
- [c8]Ken Hanazawa, Shinsuke Sakai:
Continuous speech recognition with parse filtering. INTERSPEECH 2000: 262-265
- [c7]Takao Watanabe, Akitoshi Okumura, Shinsuke Sakai, Kiyoshi Yamabana, Shinichi Doi, Ken Hanazawa:
An automatic interpretation system for travel conversation. INTERSPEECH 2000: 444-447
1990 – 1999
- 1995
- [j1]James R. Glass, Giovanni Flammia, David Goodine, Michael S. Phillips, Joseph Polifroni, Shinsuke Sakai, Stephanie Seneff, Victor Zue:
Multilingual spoken-language understanding in the MIT Voyager system. Speech Commun. 17(1-2): 1-18 (1995)
- 1994
- [c6]Jun Noguchi, Shinsuke Sakai, Kaichiro Hatazaki, Ken-ichi Iso, Takao Watanabe:
An automatic voice dialing system developed on PC speech i/o platform. ICSLP 1994: 699-702
- 1993
- [c5]James R. Glass, David Goodine, Michael S. Phillips, Shinsuke Sakai, Stephanie Seneff, Victor W. Zue:
A bilingual Voyager system. EUROSPEECH 1993: 2063-2066
- [c4]Shinsuke Sakai, Michael S. Phillips:
J-SUMMIT: Japanese spontaneous speech recognition. EUROSPEECH 1993: 2151-2154
- [c3]David Goodine, Michael S. Phillips, Shinsuke Sakai, Stephanie Seneff, Victor Zue:
A Bilingual VOYAGER System. HLT 1993
- 1992
- [c2]Shinsuke Sakai, Michael S. Phillips:
J-SUMMIT: a Japanese segment-based speech recognition system. ICSLP 1992: 1515-1518
- 1990
- [c1]Shinsuke Sakai, Kazunori Muraki:
From interlingua to speech: generating prosodic information from conceptual representation. ICASSP 1990: 329-332
last updated on 2024-09-21 23:41 CEST by the dblp team
all metadata released as open data under CC0 1.0 license