Yuki Saito 0001

齋藤佑樹

Person information

unicode name: 齋藤佑樹
affiliation (PhD 2021): University of Tokyo, Department of Information Physics and Computing, Tokyo, Japan

Journal Articles

2024
[j10]
  persistent URL:
  - https://dblp.org/rec/journals/access/XinJTSAS24
Detai Xin, Junfeng Jiang, Shinnosuke Takamichi, Yuki Saito, Akiko Aizawa, Hiroshi Saruwatari:
JVNV: A Corpus of Japanese Emotional Speech With Verbal Content and Nonverbal Expressions. IEEE Access 12: 19752-19764 (2024)
2021
[j9]
  persistent URL:
  - https://dblp.org/rec/journals/ieicetd/SaekiSTS21
Takaaki Saeki, Yuki Saito, Shinnosuke Takamichi, Hiroshi Saruwatari:
Real-Time Full-Band Voice Conversion with Sub-Band Modeling and Data-Driven Phase Estimation of Spectral Differentials. IEICE Trans. Inf. Syst. 104-D(7): 1002-1016 (2021)
[j8]
  persistent URL:
  - https://dblp.org/rec/journals/ieicetd/MizoguchiSTS21
Satoshi Mizoguchi, Yuki Saito, Shinnosuke Takamichi, Hiroshi Saruwatari:
DNN-Based Low-Musical-Noise Single-Channel Speech Enhancement Based on Higher-Order-Moments Matching. IEICE Trans. Inf. Syst. 104-D(11): 1971-1980 (2021)
[j7]
  persistent URL:
  - https://dblp.org/rec/journals/taslp/SaitoTS21
Yuki Saito, Shinnosuke Takamichi, Hiroshi Saruwatari:
Perceptual-Similarity-Aware Deep Speaker Representation Learning for Multi-Speaker Generative Modeling. IEEE ACM Trans. Audio Speech Lang. Process. 29: 1033-1048 (2021)
2020
[j6]
  persistent URL:
  - https://dblp.org/rec/journals/ieicet/TamaruSTKS20
Hiroki Tamaru, Yuki Saito, Shinnosuke Takamichi, Tomoki Koriyama, Hiroshi Saruwatari:
Generative Moment Matching Network-Based Neural Double-Tracking for Synthesized and Natural Singing Voices. IEICE Trans. Inf. Syst. 103-D(3): 639-647 (2020)
[j5]
  persistent URL:
  - https://dblp.org/rec/journals/ieicetd/SaitoAT20
Yuki Saito, Kei Akuzawa, Kentaro Tachibana:
Joint Adversarial Training of Speech Recognition and Synthesis Models for Many-to-One Voice Conversion Using Phonetic Posteriorgrams. IEICE Trans. Inf. Syst. 103-D(9): 1978-1987 (2020)
[j4]
  persistent URL:
  - https://dblp.org/rec/journals/sigpro/TakamichiSTKS20
Shinnosuke Takamichi, Yuki Saito, Norihiro Takamune, Daichi Kitamura, Hiroshi Saruwatari:
Phase reconstruction from amplitude spectrograms based on directional-statistics deep neural networks. Signal Process. 169: 107368 (2020)
2019
[j3]
  persistent URL:
  - https://dblp.org/rec/journals/csl/SaitoTS19
Yuki Saito, Shinnosuke Takamichi, Hiroshi Saruwatari:
Vocoder-free text-to-speech synthesis incorporating generative adversarial networks using low-/multi-frequency STFT amplitude spectra. Comput. Speech Lang. 58: 347-363 (2019)
2018
[j2]
  persistent URL:
  - https://dblp.org/rec/journals/taslp/SaitoTS18
Yuki Saito, Shinnosuke Takamichi, Hiroshi Saruwatari:
Statistical Parametric Speech Synthesis Incorporating Generative Adversarial Networks. IEEE ACM Trans. Audio Speech Lang. Process. 26(1): 84-96 (2018)
2017
[j1]
  persistent URL:
  - https://dblp.org/rec/journals/ieicet/SaitoTS17
Yuki Saito, Shinnosuke Takamichi, Hiroshi Saruwatari:
Voice Conversion Using Input-to-Output Highway Networks. IEICE Trans. Inf. Syst. 100-D(8): 1925-1928 (2017)

Conference and Workshop Papers

2025
[c41]
  persistent URL:
  - https://dblp.org/rec/conf/icassp/Tsunoo0NS25
Emiru Tsunoo, Yuki Saito, Wataru Nakata, Hiroshi Saruwatari:
Causal Speech Enhancement with Predicting Semantics based on Quantized Self-supervised Learning Features. ICASSP 2025: 1-5
2024
[c40]
  persistent URL:
  - https://dblp.org/rec/conf/apsipa/IshikawaTNTSTS24
Yuto Ishikawa, Osamu Take, Tomohiko Nakamura, Norihiro Takamune, Yuki Saito, Shinnosuke Takamichi, Hiroshi Saruwatari:
Real-Time Noise Estimation for Lombard-Effect Speech Synthesis in Human-Avatar Dialogue Systems. APSIPA 2024: 1-6
[c39]
  persistent URL:
  - https://dblp.org/rec/conf/apsipa/NakataSSTS24
Wataru Nakata, Takaaki Saeki, Yuki Saito, Shinnosuke Takamichi, Hiroshi Saruwatari:
NecoBERT: Self-Supervised Learning Model Trained by Masked Language Modeling on Rich Acoustic Features Derived from Neural Audio Codec. APSIPA 2024: 1-6
[c38]
  persistent URL:
  - https://dblp.org/rec/conf/icassp/YamauchiIS24
Kazuki Yamauchi, Yusuke Ijima, Yuki Saito:
StyleCap: Automatic Speaking-Style Captioning from Speech Based on Speech and Language Self-Supervised Learning Models. ICASSP 2024: 11261-11265
[c37]
  persistent URL:
  - https://dblp.org/rec/conf/interspeech/IgarashiSSTYTS24
Takuto Igarashi, Yuki Saito, Kentaro Seki, Shinnosuke Takamichi, Ryuichi Yamamoto, Kentaro Tachibana, Hiroshi Saruwatari:
Noise-Robust Voice Conversion by Conditional Denoising Training Using Latent Variables of Recording Quality and Environment. INTERSPEECH 2024
[c36]
  persistent URL:
  - https://dblp.org/rec/conf/interspeech/SaitoISTYTS24
Yuki Saito, Takuto Igarashi, Kentaro Seki, Shinnosuke Takamichi, Ryuichi Yamamoto, Kentaro Tachibana, Hiroshi Saruwatari:
SRC4VC: Smartphone-Recorded Corpus for Voice Conversion Benchmark. INTERSPEECH 2024
[c35]
  persistent URL:
  - https://dblp.org/rec/conf/interspeech/SekiTTSIS24
Kentaro Seki, Shinnosuke Takamichi, Norihiro Takamune, Yuki Saito, Kanami Imamura, Hiroshi Saruwatari:
Spatial Voice Conversion: Voice Conversion Preserving Spatial Information and Non-target Signals. INTERSPEECH 2024
[c34]
  persistent URL:
  - https://dblp.org/rec/conf/interspeech/YangKS24
Dong Yang, Tomoki Koriyama, Yuki Saito:
Frame-Wise Breath Detection with Self-Training: An Exploration of Enhancing Breath Naturalness in Text-to-Speech. INTERSPEECH 2024
[c33]
  persistent URL:
  - https://dblp.org/rec/conf/slt/YamauchiSS24
Kazuki Yamauchi, Yuki Saito, Hiroshi Saruwatari:
Cross-Dialect Text-to-Speech in Pitch-Accent Language Incorporating Multi-Dialect Phoneme-Level BERT. SLT 2024: 750-757
[c32]
  persistent URL:
  - https://dblp.org/rec/conf/slt/BabaNSS24
Kaito Baba, Wataru Nakata, Yuki Saito, Hiroshi Saruwatari:
The T05 System for the VoiceMOS Challenge 2024: Transfer Learning from Deep Image Classifier to Naturalness MOS Prediction of High-Quality Synthetic Speech. SLT 2024: 818-824
2023
[c31]
  persistent URL:
  - https://dblp.org/rec/conf/asru/WatanabeTSNXS23
Aya Watanabe, Shinnosuke Takamichi, Yuki Saito, Wataru Nakata, Detai Xin, Hiroshi Saruwatari:
Coco-Nut: Corpus of Japanese Utterance and Voice Characteristics Description for Prompt-Based Control. ASRU 2023: 1-8
[c30]
  persistent URL:
  - https://dblp.org/rec/conf/icassp/WatanabeTSXS23
Aya Watanabe, Shinnosuke Takamichi, Yuki Saito, Detai Xin, Hiroshi Saruwatari:
Mid-Attribute Speaker Generation Using Optimal-Transport-Based Interpolation of Gaussian Mixture Models. ICASSP 2023: 1-5
[c29]
  persistent URL:
  - https://dblp.org/rec/conf/icassp/YangKSSXS23
Dong Yang, Tomoki Koriyama, Yuki Saito, Takaaki Saeki, Detai Xin, Hiroshi Saruwatari:
Duration-Aware Pause Insertion Using Pre-Trained Language Model for Multi-Speaker Text-To-Speech. ICASSP 2023: 1-5
[c28]
  persistent URL:
  - https://dblp.org/rec/conf/interspeech/SaitoTITS23
Yuki Saito, Shinnosuke Takamichi, Eiji Iimori, Kentaro Tachibana, Hiroshi Saruwatari:
ChatGPT-EDSS: Empathetic Dialogue Speech Synthesis Trained from ChatGPT-derived Context Word Embeddings. INTERSPEECH 2023: 3048-3052
[c27]
  persistent URL:
  - https://dblp.org/rec/conf/interspeech/UedaTSTS23
Yota Ueda, Shinnosuke Takamichi, Yuki Saito, Norihiro Takamune, Hiroshi Saruwatari:
HumanDiffusion: diffusion model using perceptual gradients. INTERSPEECH 2023: 4264-4268
[c26]
  persistent URL:
  - https://dblp.org/rec/conf/interspeech/SaitoITTS23
Yuki Saito, Eiji Iimori, Shinnosuke Takamichi, Kentaro Tachibana, Hiroshi Saruwatari:
CALLS: Japanese Empathetic Dialogue Speech Corpus of Complaint Handling and Attentive Listening in Customer Center. INTERSPEECH 2023: 5561-5565
[c25]
  persistent URL:
  - https://dblp.org/rec/conf/ssw/HiraiSS23
Ryunosuke Hirai, Yuki Saito, Hiroshi Saruwatari:
Federated Learning for Human-in-the-Loop Many-to-Many Voice Conversion. SSW 2023: 94-99
2022
[c24]
  persistent URL:
  - https://dblp.org/rec/conf/interspeech/UdagawaSS22
Kenta Udagawa, Yuki Saito, Hiroshi Saruwatari:
Human-in-the-loop Speaker Adaptation for DNN-based Multi-speaker TTS. INTERSPEECH 2022: 2968-2972
[c23]
  persistent URL:
  - https://dblp.org/rec/conf/interspeech/NishimuraSTTS22
Yuto Nishimura, Yuki Saito, Shinnosuke Takamichi, Kentaro Tachibana, Hiroshi Saruwatari:
Acoustic Modeling for End-to-End Empathetic Dialogue Speech Synthesis Using Linguistic and Prosodic Contexts of Dialogue History. INTERSPEECH 2022: 3373-3377
[c22]
  persistent URL:
  - https://dblp.org/rec/conf/interspeech/NakataKTSIMS22
Wataru Nakata, Tomoki Koriyama, Shinnosuke Takamichi, Yuki Saito, Yusuke Ijima, Ryo Masumura, Hiroshi Saruwatari:
Predicting VQVAE-based Character Acting Style from Quotation-Annotated Text for Audiobook Speech Synthesis. INTERSPEECH 2022: 4551-4555
[c21]
  persistent URL:
  - https://dblp.org/rec/conf/interspeech/SaitoNTTS22
Yuki Saito, Yuto Nishimura, Shinnosuke Takamichi, Kentaro Tachibana, Hiroshi Saruwatari:
STUDIES: Corpus of Japanese Empathetic Dialogue Speech Towards Friendly Voice Agent. INTERSPEECH 2022: 5155-5159
2021
[c20]
  persistent URL:
  - https://dblp.org/rec/conf/apsipa/LuoTKSS21
Xuan Luo, Shinnosuke Takamichi, Tomoki Koriyama, Yuki Saito, Hiroshi Saruwatari:
Emotion-Controllable Speech Synthesis Using Emotion Soft Labels and Fine-Grained Prosody Factors. APSIPA ASC 2021: 794-799
[c19]
  persistent URL:
  - https://dblp.org/rec/conf/icassp/UedaFSTBS21
Yota Ueda, Kazuki Fujii, Yuki Saito, Shinnosuke Takamichi, Yukino Baba, Hiroshi Saruwatari:
HumanACGAN: Conditional Generative Adversarial Network with Human-Based Auxiliary Classifier and its Evaluation in Phoneme Perception. ICASSP 2021: 6468-6472
[c18]
  persistent URL:
  - https://dblp.org/rec/conf/interspeech/XinSTKS21
Detai Xin, Yuki Saito, Shinnosuke Takamichi, Tomoki Koriyama, Hiroshi Saruwatari:
Cross-Lingual Speaker Adaptation Using Domain Adaptation and Speaker Consistency Loss for Text-To-Speech Synthesis. Interspeech 2021: 1614-1618
2020
[c17]
  persistent URL:
  - https://dblp.org/rec/conf/icassp/FujiiSTBS20
Kazuki Fujii, Yuki Saito, Shinnosuke Takamichi, Yukino Baba, Hiroshi Saruwatari:
HumanGAN: Generative Adversarial Network with Human-Based Discriminator and Its Evaluation in Speech Perception Modeling. ICASSP 2020: 6239-6243
[c16]
  persistent URL:
  - https://dblp.org/rec/conf/icassp/SaekiSTS20
Takaaki Saeki, Yuki Saito, Shinnosuke Takamichi, Hiroshi Saruwatari:
Lifter Training and Sub-Band Modeling for Computationally Efficient and High-Quality Voice Conversion Using Spectral Differentials. ICASSP 2020: 7784-7788
[c15]
  persistent URL:
  - https://dblp.org/rec/conf/interspeech/SaekiSTS20
Takaaki Saeki, Yuki Saito, Shinnosuke Takamichi, Hiroshi Saruwatari:
Real-Time, Full-Band, Online DNN-Based Voice Conversion System Using a Single CPU. INTERSPEECH 2020: 1021-1022
[c14]
  persistent URL:
  - https://dblp.org/rec/conf/interspeech/GotoOSTM20
Shunsuke Goto, Kotaro Onishi, Yuki Saito, Kentaro Tachibana, Koichiro Mori:
Face2Speech: Towards Multi-Speaker Text-to-Speech Synthesis Using an Embedding Vector Predicted from a Face Image. INTERSPEECH 2020: 1321-1325
[c13]
  persistent URL:
  - https://dblp.org/rec/conf/interspeech/XinSTKS20
Detai Xin, Yuki Saito, Shinnosuke Takamichi, Tomoki Koriyama, Hiroshi Saruwatari:
Cross-Lingual Text-To-Speech Synthesis via Domain Adaptation and Perceptual Similarity Regression in Speaker Space. INTERSPEECH 2020: 2947-2951
[c12]
  persistent URL:
  - https://dblp.org/rec/conf/interspeech/YamashitaKSTIMS20
Yuki Yamashita, Tomoki Koriyama, Yuki Saito, Shinnosuke Takamichi, Yusuke Ijima, Ryo Masumura, Hiroshi Saruwatari:
Investigating Effective Additional Contextual Factors in DNN-Based Spontaneous Speech Synthesis. INTERSPEECH 2020: 3201-3205
[c11]
  persistent URL:
  - https://dblp.org/rec/conf/lrec/YamashitaKSTIMS20
Yuki Yamashita, Tomoki Koriyama, Yuki Saito, Shinnosuke Takamichi, Yusuke Ijima, Ryo Masumura, Hiroshi Saruwatari:
DNN-based Speech Synthesis Using Abundant Tags of Spontaneous Speech Corpus. LREC 2020: 6438-6443
[c10]
  persistent URL:
  - https://dblp.org/rec/conf/lrec/SaitoTS20
Yuki Saito, Shinnosuke Takamichi, Hiroshi Saruwatari:
SMASH Corpus: A Spontaneous Speech Corpus Recording Third-person Audio Commentaries on Gameplay. LREC 2020: 6571-6577
2019
[c9]
  persistent URL:
  - https://dblp.org/rec/conf/icassp/TamaruSTKS19
Hiroki Tamaru, Yuki Saito, Shinnosuke Takamichi, Tomoki Koriyama, Hiroshi Saruwatari:
Generative Moment Matching Network-based Random Modulation Post-filter for DNN-based Singing Voice Synthesis and Neural Double-tracking. ICASSP 2019: 7070-7074
[c8]
  persistent URL:
  - https://dblp.org/rec/conf/ssw/SaitoTS19
Yuki Saito, Shinnosuke Takamichi, Hiroshi Saruwatari:
DNN-based Speaker Embedding Using Subjective Inter-speaker Similarity for Multi-speaker Modeling in Speech Synthesis. SSW 2019: 51-56
[c7]
  persistent URL:
  - https://dblp.org/rec/conf/ssw/NakamuraSTIS19
Taiki Nakamura, Yuki Saito, Shinnosuke Takamichi, Yusuke Ijima, Hiroshi Saruwatari:
V2S attack: building DNN-based voice conversion from automatic speaker verification. SSW 2019: 161-165
2018
[c6]
  persistent URL:
  - https://dblp.org/rec/conf/apsipa/UneSTKMS18
Masakazu Une, Yuki Saito, Shinnosuke Takamichi, Daichi Kitamura, Ryoichi Miyazaki, Hiroshi Saruwatari:
Generative approach using the noise generation models for DNN-based speech synthesis trained from noisy speech. APSIPA 2018: 340-344
[c5]
  persistent URL:
  - https://dblp.org/rec/conf/icassp/SaitoINT18
Yuki Saito, Yusuke Ijima, Kyosuke Nishida, Shinnosuke Takamichi:
Non-Parallel Voice Conversion Using Variational Autoencoders Conditioned by Phonetic Posteriorgrams and D-Vectors. ICASSP 2018: 5274-5278
[c4]
  persistent URL:
  - https://dblp.org/rec/conf/icassp/SaitoTS18
Yuki Saito, Shinnosuke Takamichi, Hiroshi Saruwatari:
Text-to-Speech Synthesis Using STFT Spectra Based on Low-/Multi-Resolution Generative Adversarial Networks. ICASSP 2018: 5299-5303
[c3]
  persistent URL:
  - https://dblp.org/rec/conf/iwaenc/TakamichiSTKS18
Shinnosuke Takamichi, Yuki Saito, Norihiro Takamune, Daichi Kitamura, Hiroshi Saruwatari:
Phase Reconstruction from Amplitude Spectrograms Based on Von-Mises-Distribution Deep Neural Network. IWAENC 2018: 286-290
2017
[c2]
  persistent URL:
  - https://dblp.org/rec/conf/icassp/SaitoTS17
Yuki Saito, Shinnosuke Takamichi, Hiroshi Saruwatari:
Training algorithm to deceive Anti-Spoofing Verification for DNN-based speech synthesis. ICASSP 2017: 4900-4904
[c1]
  persistent URL:
  - https://dblp.org/rec/conf/interspeech/MiyoshiSTS17
Hiroyuki Miyoshi, Yuki Saito, Shinnosuke Takamichi, Hiroshi Saruwatari:
Voice Conversion Using Sequence-to-Sequence Learning of Context Posterior Probabilities. INTERSPEECH 2017: 1268-1272

Data and Artifacts

2023
[d1]
  persistent URL:
  - https://dblp.org/rec/data/10/XinJTSAS23
Detai Xin, Junfeng Jiang, Shinnosuke Takamichi, Yuki Saito, Akiko Aizawa, Hiroshi Saruwatari:
JVNV: A Corpus of Japanese Emotional Speech with Verbal Content and Nonverbal Expressions. IEEE DataPort, 2023

Informal and Other Publications

2025
[i35]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2505-12226
Dong Yang, Yiyi Cai, Yuki Saito, Lixu Wang, Hiroshi Saruwatari:
Shallow Flow Matching for Coarse-to-Fine Text-to-Speech Synthesis. CoRR abs/2505.12226 (2025)
[i34]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2506-23553
Taisei Takano, Yuki Okamoto, Yusuke Kanamori, Yuki Saito, Ryotaro Nagase, Hiroshi Saruwatari:
Human-CLAP: Human-perception-based contrastive language-audio pretraining. CoRR abs/2506.23553 (2025)
[i33]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2506-23582
Yusuke Kanamori, Yuki Okamoto, Taisei Takano, Shinnosuke Takamichi, Yuki Saito, Hiroshi Saruwatari:
RELATE: Subjective evaluation dataset for automatic evaluation of relevance between text and audio. CoRR abs/2506.23582 (2025)
2024
[i32]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2402-00288
Dong Yang, Tomoki Koriyama, Yuki Saito:
Frame-Wise Breath Detection with Self-Training: An Exploration of Enhancing Breath Naturalness in Text-to-Speech. CoRR abs/2402.00288 (2024)
[i31]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2403-13353
Aya Watanabe, Shinnosuke Takamichi, Yuki Saito, Wataru Nakata, Detai Xin, Hiroshi Saruwatari:
Building speech corpus with diverse voice characteristics for its prompt-based representation. CoRR abs/2403.13353 (2024)
[i30]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2403-13720
Wataru Nakata, Kazuki Yamauchi, Dong Yang, Hiroaki Hyodo, Yuki Saito:
UTDUSS: UTokyo-SaruLab System for Interspeech2024 Speech Processing Using Discrete Speech Unit Challenge. CoRR abs/2403.13720 (2024)
[i29]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2406-07254
Yuki Saito, Takuto Igarashi, Kentaro Seki, Shinnosuke Takamichi, Ryuichi Yamamoto, Kentaro Tachibana, Hiroshi Saruwatari:
SRC4VC: Smartphone-Recorded Corpus for Voice Conversion Benchmark. CoRR abs/2406.07254 (2024)
[i28]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2406-07280
Takuto Igarashi, Yuki Saito, Kentaro Seki, Shinnosuke Takamichi, Ryuichi Yamamoto, Kentaro Tachibana, Hiroshi Saruwatari:
Noise-Robust Voice Conversion by Conditional Denoising Training Using Latent Variables of Recording Quality and Environment. CoRR abs/2406.07280 (2024)
[i27]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2406-17722
Kentaro Seki, Shinnosuke Takamichi, Norihiro Takamune, Yuki Saito, Kanami Imamura, Hiroshi Saruwatari:
Spatial Voice Conversion: Voice Conversion Preserving Spatial Information and Non-target Signals. CoRR abs/2406.17722 (2024)
[i26]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2407-15828
Wataru Nakata, Kentaro Seki, Hitomi Yanaka, Yuki Saito, Shinnosuke Takamichi, Hiroshi Saruwatari:
J-CHAT: Japanese Large-scale Spoken Dialogue Corpus for Spoken Dialogue Language Modeling. CoRR abs/2407.15828 (2024)
[i25]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2409-07265
Kazuki Yamauchi, Yuki Saito, Hiroshi Saruwatari:
Cross-Dialect Text-To-Speech in Pitch-Accent Language Incorporating Multi-Dialect Phoneme-Level BERT. CoRR abs/2409.07265 (2024)
[i24]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2409-09305
Kaito Baba, Wataru Nakata, Yuki Saito, Hiroshi Saruwatari:
The T05 System for The VoiceMOS Challenge 2024: Transfer Learning from Deep Image Classifier to Naturalness MOS Prediction of High-Quality Synthetic Speech. CoRR abs/2409.09305 (2024)
[i23]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2412-19248
Emiru Tsunoo, Yuki Saito, Wataru Nakata, Hiroshi Saruwatari:
Causal Speech Enhancement with Predicting Semantics based on Quantized Self-supervised Learning Features. CoRR abs/2412.19248 (2024)
2023
[i22]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2302-13652
Dong Yang, Tomoki Koriyama, Yuki Saito, Takaaki Saeki, Detai Xin, Hiroshi Saruwatari:
Duration-aware pause insertion using pre-trained language model for multi-speaker text-to-speech. CoRR abs/2302.13652 (2023)
[i21]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2305-13713
Yuki Saito, Eiji Iimori, Shinnosuke Takamichi, Kentaro Tachibana, Hiroshi Saruwatari:
CALLS: Japanese Empathetic Dialogue Speech Corpus of Complaint Handling and Attentive Listening in Customer Center. CoRR abs/2305.13713 (2023)
[i20]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2305-13724
Yuki Saito, Shinnosuke Takamichi, Eiji Iimori, Kentaro Tachibana, Hiroshi Saruwatari:
ChatGPT-EDSS: Empathetic Dialogue Speech Synthesis Trained from ChatGPT-derived Context Word Embeddings. CoRR abs/2305.13724 (2023)
[i19]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2306-12169
Yota Ueda, Shinnosuke Takamichi, Yuki Saito, Norihiro Takamune, Hiroshi Saruwatari:
HumanDiffusion: diffusion model using perceptual gradients. CoRR abs/2306.12169 (2023)
[i18]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2309-13509
Aya Watanabe, Shinnosuke Takamichi, Yuki Saito, Wataru Nakata, Detai Xin, Hiroshi Saruwatari:
Coco-Nut: Corpus of Japanese Utterance and Voice Characteristics Description for Prompt-based Control. CoRR abs/2309.13509 (2023)
[i17]
Detai Xin, Junfeng Jiang, Shinnosuke Takamichi, Yuki Saito, Akiko Aizawa, Hiroshi Saruwatari:
JVNV: A Corpus of Japanese Emotional Speech with Verbal Content and Nonverbal Expressions. CoRR abs/2310.06072 (2023)
[i16]
Kazuki Yamauchi, Yusuke Ijima, Yuki Saito:
StyleCap: Automatic Speaking-Style Captioning from Speech Based on Speech and Language Self-supervised Learning Models. CoRR abs/2311.16509 (2023)
2022
[i15]
Yuki Saito, Yuto Nishimura, Shinnosuke Takamichi, Kentaro Tachibana, Hiroshi Saruwatari:
STUDIES: Corpus of Japanese Empathetic Dialogue Speech Towards Friendly Voice Agent. CoRR abs/2203.14757 (2022)
[i14]
Yuto Nishimura, Yuki Saito, Shinnosuke Takamichi, Kentaro Tachibana, Hiroshi Saruwatari:
Acoustic Modeling for End-to-End Empathetic Dialogue Speech Synthesis Using Linguistic and Prosodic Contexts of Dialogue History. CoRR abs/2206.08039 (2022)
[i13]
Kenta Udagawa, Yuki Saito, Hiroshi Saruwatari:
Human-in-the-loop Speaker Adaptation for DNN-based Multi-speaker TTS. CoRR abs/2206.10256 (2022)
[i12]
Yusuke Nakai, Yuki Saito, Kenta Udagawa, Hiroshi Saruwatari:
Multi-Task Adversarial Training Algorithm for Multi-Speaker Neural Text-to-Speech. CoRR abs/2209.12549 (2022)
[i11]
Aya Watanabe, Shinnosuke Takamichi, Yuki Saito, Detai Xin, Hiroshi Saruwatari:
Mid-attribute speaker generation using optimal-transport-based interpolation of Gaussian mixture models. CoRR abs/2210.09916 (2022)
2021
[i10]
Yota Ueda, Kazuki Fujii, Yuki Saito, Shinnosuke Takamichi, Yukino Baba, Hiroshi Saruwatari:
HumanACGAN: conditional generative adversarial network with human-based auxiliary classifier and its evaluation in phoneme perception. CoRR abs/2102.04051 (2021)
2020
[i9]
Takaaki Saeki, Yuki Saito, Shinnosuke Takamichi, Hiroshi Saruwatari:
Lifter Training and Sub-band Modeling for Computationally Efficient and High-Quality Voice Conversion Using Spectral Differentials. CoRR abs/2002.06778 (2020)
2019
[i8]
Hiroki Tamaru, Yuki Saito, Shinnosuke Takamichi, Tomoki Koriyama, Hiroshi Saruwatari:
Generative Moment Matching Network-based Random Modulation Post-filter for DNN-based Singing Voice Synthesis and Neural Double-tracking. CoRR abs/1902.03389 (2019)
[i7]
Yuki Saito, Shinnosuke Takamichi, Hiroshi Saruwatari:
DNN-based Speaker Embedding Using Subjective Inter-speaker Similarity for Multi-speaker Modeling in Speech Synthesis. CoRR abs/1907.08294 (2019)
[i6]
Taiki Nakamura, Yuki Saito, Shinnosuke Takamichi, Yusuke Ijima, Hiroshi Saruwatari:
V2S attack: building DNN-based voice conversion from automatic speaker verification. CoRR abs/1908.01454 (2019)
[i5]
Shinnosuke Takamichi, Kentaro Mitsui, Yuki Saito, Tomoki Koriyama, Naoko Tanji, Hiroshi Saruwatari:
JVS corpus: free Japanese multi-speaker voice corpus. CoRR abs/1908.06248 (2019)
[i4]
Kazuki Fujii, Yuki Saito, Shinnosuke Takamichi, Yukino Baba, Hiroshi Saruwatari:
HumanGAN: generative adversarial network with human-based discriminator and its evaluation in speech perception modeling. CoRR abs/1909.11391 (2019)
2018
[i3]
Shinnosuke Takamichi, Yuki Saito, Norihiro Takamune, Daichi Kitamura, Hiroshi Saruwatari:
Phase reconstruction from amplitude spectrograms based on von-Mises-distribution deep neural network. CoRR abs/1807.03474 (2018)
2017
[i2]
Hiroyuki Miyoshi, Yuki Saito, Shinnosuke Takamichi, Hiroshi Saruwatari:
Voice Conversion Using Sequence-to-Sequence Learning of Context Posterior Probabilities. CoRR abs/1704.02360 (2017)
[i1]
Yuki Saito, Shinnosuke Takamichi, Hiroshi Saruwatari:
Statistical Parametric Speech Synthesis Incorporating Generative Adversarial Networks. CoRR abs/1709.08041 (2017)

