Jiatong Shi

2020 – today

2026
[i93]
- dblp key: journals/corr/abs-2601-00160
- persistent URL: https://dblp.org/rec/journals/corr/abs-2601-00160
Zhuoran Zhuang, Ye Chen, Chao Luo, Tian-Hao Zhang, Xuewei Zhang, Jian Ma, Jiatong Shi, Wei Zhang:
IKFST: IOO and KOO Algorithms for Accelerated and Precise WFST-based End-to-End Automatic Speech Recognition. CoRR abs/2601.00160 (2026)
[i92]
- dblp key: journals/corr/abs-2601-12205
- persistent URL: https://dblp.org/rec/journals/corr/abs-2601-12205
Shih-Heng Wang, Jiatong Shi, Jinchuan Tian, Haibin Wu, Shinji Watanabe:
Do Neural Codecs Generalize? A Controlled Study Across Unseen Languages and Non-Speech Tasks. CoRR abs/2601.12205 (2026)
[i91]
- dblp key: journals/corr/abs-2601-19063
- persistent URL: https://dblp.org/rec/journals/corr/abs-2601-19063
Siddhant Arora, Jinchuan Tian, Jiatong Shi, Hayato Futami, Yosuke Kashiwagi, Emiru Tsunoo, Shinji Watanabe:
Optimizing Conversational Quality in Spoken Dialogue Systems with Reinforcement Learning from AI Feedback. CoRR abs/2601.19063 (2026)
[i90]
- dblp key: journals/corr/abs-2602-05220
- persistent URL: https://dblp.org/rec/journals/corr/abs-2602-05220
Jinchuan Tian, Haoran Wang, Bo-Hao Su, Chien-Yu Huang, Qingzheng Wang, Jiatong Shi, William Chen, Xun Gong, Siddhant Arora, Chin-Jou Li, Masao Someki, Takashi Maekaku, Keita Goto, Yusuke Shinohara, Jin Sakuma, Chao-Han Huck Yang, Shinji Watanabe:
Bagpiper: Solving Open-Ended Audio Tasks via Rich Captions. CoRR abs/2602.05220 (2026)
2025
[j6]
- dblp key: journals/tmlr/MousaviMMPSWYKPMRELLSWK25
- persistent URL: https://dblp.org/rec/journals/tmlr/MousaviMMPSWYKPMRELLSWK25
Pooneh Mousavi, Gallil Maimon, Adel Moumen, Darius Petermann, Jiatong Shi, Haibin Wu, Haici Yang, Anastasia Kuznetsova, Artem Ploujnikov, Ricard Marxer, Bhuvana Ramabhadran, Benjamin Elizalde, Loren Lugosch, Jinyu Li, Cem Subakan, Philip C. Woodland, Minje Kim, Hung-yi Lee, Shinji Watanabe, Yossi Adi, Mirco Ravanelli:
Discrete Audio Tokens: More Than a Survey! Trans. Mach. Learn. Res. 2025 (2025)
[c85]
- dblp key: conf/icassp/TianZSZY0025
- persistent URL: https://dblp.org/rec/conf/icassp/TianZSZY0025
Jinchuan Tian, Chunlei Zhang, Jiatong Shi, Hao Zhang, Jianwei Yu, Shinji Watanabe, Dong Yu:
Preference Alignment Improves Language Model-Based TTS. ICASSP 2025: 1-5
[c84]
- dblp key: conf/iclr/HuangCYLLLTDSSC25
- persistent URL: https://dblp.org/rec/conf/iclr/HuangCYLLLTDSSC25
Chien-yu Huang, Wei-Chih Chen, Shu-Wen Yang, Andy T. Liu, Chen-An Li, Yu-Xiang Lin, Wei-Cheng Tseng, Anuj Diwan, Yi-Jen Shih, Jiatong Shi, William Chen, Chih-Kai Yang, Xuanjun Chen, Chi-Yuan Hsiao, Puyuan Peng, Shih-Heng Wang, Chun-Yi Kuan, Ke-Han Lu, Kai-Wei Chang, Fabian Alejandro Ritter Gutierrez, et al.:
Dynamic-SUPERB Phase-2: A Collaboratively Expanding Benchmark for Measuring the Capabilities of Spoken Language Models with 180 Tasks. ICLR 2025
[c83]
- dblp key: conf/interspeech/AroraTFJSKT025
- persistent URL: https://dblp.org/rec/conf/interspeech/AroraTFJSKT025
Siddhant Arora, Jinchuan Tian, Hayato Futami, Jee-weon Jung, Jiatong Shi, Yosuke Kashiwagi, Emiru Tsunoo, Shinji Watanabe:
Chain-of-Thought Training for Open E2E Spoken Dialogue Systems. INTERSPEECH 2025
[c82]
- dblp key: conf/interspeech/ChenMSBWWMHJALL25
- persistent URL: https://dblp.org/rec/conf/interspeech/ChenMSBWWMHJALL25
William Chen, Chutong Meng, Jiatong Shi, Martijn Bartelds, Shih-Heng Wang, Hsiu-Hsuan Wang, Rafael Mosquera, Sara Hincapie, Dan Jurafsky, Antonis Anastasopoulos, Hung-yi Lee, Karen Livescu, Shinji Watanabe:
The ML-SUPERB 2.0 Challenge: Towards Inclusive ASR Benchmarking for All Language Varieties. INTERSPEECH 2025
[c81]
- dblp key: conf/interspeech/ChengZS25
- persistent URL: https://dblp.org/rec/conf/interspeech/ChengZS25
Yifan Cheng, Ruoyi Zhang, Jiatong Shi:
MIKU-PAL: An Automated and Standardized Multimodal Method for Speech Paralinguistic and Affect Labeling. INTERSPEECH 2025
[c80]
- dblp key: conf/interspeech/LiuS25
- persistent URL: https://dblp.org/rec/conf/interspeech/LiuS25
Mingda Liu, Jiatong Shi:
Bridging Speech and Singing: Multi-stage Speech-Prompted Singing Voice Conversion with Speaker Embedding Adaptation. INTERSPEECH 2025
[c79]
- dblp key: conf/interspeech/SheikhSASCL025
- persistent URL: https://dblp.org/rec/conf/interspeech/SheikhSASCL025
Zaid Sheikh, Shuichiro Shimizu, Siddhant Arora, Jiatong Shi, Samuele Cornell, Xinjian Li, Shinji Watanabe:
Scalable Spontaneous Speech Dataset (SSSD): Crowdsourcing Data Collection to Promote Dialogue Research. INTERSPEECH 2025
[c78]
- dblp key: conf/interspeech/ShiS025
- persistent URL: https://dblp.org/rec/conf/interspeech/ShiS025
Jiatong Shi, Hye-jin Shim, Shinji Watanabe:
Uni-VERSA: Versatile Speech Assessment with a Unified Network. INTERSPEECH 2025
[c77]
- dblp key: conf/interspeech/TianC0SABMSGYY025
- persistent URL: https://dblp.org/rec/conf/interspeech/TianC0SABMSGYY025
Jinchuan Tian, William Chen, Yifan Peng, Jiatong Shi, Siddhant Arora, Shikhar Bharadwaj, Takashi Maekaku, Yusuke Shinohara, Keita Goto, Xiang Yue, Huck Yang, Shinji Watanabe:
OpusLM: A Family of Open Unified Speech Language Models. INTERSPEECH 2025
[c76]
- dblp key: conf/ismir/HuangNSS0MTD25
- persistent URL: https://dblp.org/rec/conf/ismir/HuangNSS0MTD25
Yichen Huang, Zachary Novack, Koichi Saito, Jiatong Shi, Shinji Watanabe, Yuki Mitsufuji, John Thickstun, Chris Donahue:
Aligning Text-to-Music Evaluation with Human Preferences. ISMIR 2025: 174-181
[c75]
- dblp key: conf/kdd/HuangLSSSLM025
- persistent URL: https://dblp.org/rec/conf/kdd/HuangLSSSLM025
Tenghao Huang, Dong Hee Lee, John Sweeney, Jiatong Shi, Emily Steliotes, Matthew Lange, Jonathan May, Muhao Chen:
FoodPuzzle: Toward Developing Large Language Model Agents as Autonomous Flavor Scientists. KDD (2) 2025: 5493-5504
[c74]
- dblp key: conf/naacl/TianSCAMMWPBZC025
- persistent URL: https://dblp.org/rec/conf/naacl/TianSCAMMWPBZC025
Jinchuan Tian, Jiatong Shi, William Chen, Siddhant Arora, Yoshiki Masuyama, Takashi Maekaku, Yihan Wu, Junyi Peng, Shikhar Bharadwaj, Yiwen Zhao, Samuele Cornell, Yifan Peng, Xiang Yue, Chao-Han Huck Yang, Graham Neubig, Shinji Watanabe:
ESPnet-SpeechLM: An Open Speech Language Model Toolkit. NAACL (System Demonstrations) 2025: 116-124
[c73]
- dblp key: conf/naacl/ShiSTAWPYZTZAHS25
- persistent URL: https://dblp.org/rec/conf/naacl/ShiSTAWPYZTZAHS25
Jiatong Shi, Hye-jin Shim, Jinchuan Tian, Siddhant Arora, Haibin Wu, Darius Petermann, Jia Qi Yip, You Zhang, Yuxun Tang, Wangyou Zhang, Dareen Alharthi, Yichen Huang, Koichi Saito, Jionghao Han, Yiwen Zhao, Chris Donahue, Shinji Watanabe:
VERSA: A Versatile Evaluation Toolkit for Speech, Audio, and Music. NAACL (System Demonstrations) 2025: 191-209
[c72]
- dblp key: conf/naacl/Arora0STCBFKTSS25
- persistent URL: https://dblp.org/rec/conf/naacl/Arora0STCBFKTSS25
Siddhant Arora, Yifan Peng, Jiatong Shi, Jinchuan Tian, William Chen, Shikhar Bharadwaj, Hayato Futami, Yosuke Kashiwagi, Emiru Tsunoo, Shuichiro Shimizu, Vaibhav Srivastav, Shinji Watanabe:
ESPnet-SDS: Unified Toolkit and Demo for Spoken Dialogue Systems. NAACL (System Demonstrations) 2025: 248-259
[i89]
- dblp key: journals/corr/abs-2502-15218
- persistent URL: https://dblp.org/rec/journals/corr/abs-2502-15218
Jinchuan Tian, Jiatong Shi, William Chen, Siddhant Arora, Yoshiki Masuyama, Takashi Maekaku, Yihan Wu, Junyi Peng, Shikhar Bharadwaj, Yiwen Zhao, Samuele Cornell, Yifan Peng, Xiang Yue, Chao-Han Huck Yang, Graham Neubig, Shinji Watanabe:
ESPnet-SpeechLM: An Open Speech Language Model Toolkit. CoRR abs/2502.15218 (2025)
[i88]
- dblp key: journals/corr/abs-2503-08533
- persistent URL: https://dblp.org/rec/journals/corr/abs-2503-08533
Siddhant Arora, Yifan Peng, Jiatong Shi, Jinchuan Tian, William Chen, Shikhar Bharadwaj, Hayato Futami, Yosuke Kashiwagi, Emiru Tsunoo, Shuichiro Shimizu, Vaibhav Srivastav, Shinji Watanabe:
ESPnet-SDS: Unified Toolkit and Demo for Spoken Dialogue Systems. CoRR abs/2503.08533 (2025)
[i87]
- dblp key: journals/corr/abs-2503-16669
- persistent URL: https://dblp.org/rec/journals/corr/abs-2503-16669
Yichen Huang, Zachary Novack, Koichi Saito, Jiatong Shi, Shinji Watanabe, Yuki Mitsufuji, John Thickstun, Chris Donahue:
Aligning Text-to-Music Evaluation with Human Preferences. CoRR abs/2503.16669 (2025)
[i86]
- dblp key: journals/corr/abs-2505-15772
- persistent URL: https://dblp.org/rec/journals/corr/abs-2505-15772
Yifan Cheng, Ruoyi Zhang, Jiatong Shi:
MIKU-PAL: An Automated and Standardized Multi-Modal Method for Speech Paralinguistic and Affect Labeling. CoRR abs/2505.15772 (2025)
[i85]
- dblp key: journals/corr/abs-2505-20741
- persistent URL: https://dblp.org/rec/journals/corr/abs-2505-20741
Jiatong Shi, Hye-Jin Shim, Shinji Watanabe:
Uni-VERSA: Versatile Speech Assessment with a Unified Network. CoRR abs/2505.20741 (2025)
[i84]
- dblp key: journals/corr/abs-2505-24518
- persistent URL: https://dblp.org/rec/journals/corr/abs-2505-24518
Jiatong Shi, Yifan Cheng, Bo-Hao Su, Hye-jin Shim, Jinchuan Tian, Samuele Cornell, Yiwen Zhao, Siddhant Arora, Shinji Watanabe:
ARECHO: Autoregressive Evaluation via Chain-Based Hypothesis Optimization for Speech Multi-Metric Estimation. CoRR abs/2505.24518 (2025)
[i83]
- dblp key: journals/corr/abs-2506-00722
- persistent URL: https://dblp.org/rec/journals/corr/abs-2506-00722
Siddhant Arora, Jinchuan Tian, Hayato Futami, Jee-weon Jung, Jiatong Shi, Yosuke Kashiwagi, Emiru Tsunoo, Shinji Watanabe:
Chain-of-Thought Training for Open E2E Spoken Dialogue Systems. CoRR abs/2506.00722 (2025)
[i82]
- dblp key: journals/corr/abs-2506-06930
- persistent URL: https://dblp.org/rec/journals/corr/abs-2506-06930
Alexander Spangher, Tenghao Huang, Jialiang Gu, Jiatong Shi, Muhao Chen:
DiscoSum: Discourse-aware News Summarization. CoRR abs/2506.06930 (2025)
[i81]
- dblp key: journals/corr/abs-2506-10274
- persistent URL: https://dblp.org/rec/journals/corr/abs-2506-10274
Pooneh Mousavi, Gallil Maimon, Adel Moumen, Darius Petermann, Jiatong Shi, Haibin Wu, Haici Yang, Anastasia Kuznetsova, Artem Ploujnikov, Ricard Marxer, Bhuvana Ramabhadran, Benjamin Elizalde, Loren Lugosch, Jinyu Li, Cem Subakan, Philip C. Woodland, Minje Kim, Hung-yi Lee, Shinji Watanabe, Yossi Adi, Mirco Ravanelli:
Discrete Audio Tokens: More Than a Survey! CoRR abs/2506.10274 (2025)
[i80]
- dblp key: journals/corr/abs-2506-12260
- persistent URL: https://dblp.org/rec/journals/corr/abs-2506-12260
Wei Wang, Wangyou Zhang, Chenda Li, Jiatong Shi, Shinji Watanabe, Yanmin Qian:
Improving Speech Enhancement with Multi-Metric Supervision from Learned Quality Assessment. CoRR abs/2506.12260 (2025)
[i79]
- dblp key: journals/corr/abs-2506-17611
- persistent URL: https://dblp.org/rec/journals/corr/abs-2506-17611
Jinchuan Tian, William Chen, Yifan Peng, Jiatong Shi, Siddhant Arora, Shikhar Bharadwaj, Takashi Maekaku, Yusuke Shinohara, Keita Goto, Xiang Yue, Huck Yang, Shinji Watanabe:
OpusLM: A Family of Open Unified Speech Language Models. CoRR abs/2506.17611 (2025)
[i78]
- dblp key: journals/corr/abs-2509-07139
- persistent URL: https://dblp.org/rec/journals/corr/abs-2509-07139
William Chen, Chutong Meng, Jiatong Shi, Martijn Bartelds, Shih-Heng Wang, Hsiu-Hsuan Wang, Rafael Mosquera, Sara Hincapie, Dan Jurafsky, Antonis Anastasopoulos, Hung-yi Lee, Karen Livescu, Shinji Watanabe:
The ML-SUPERB 2.0 Challenge: Towards Inclusive ASR Benchmarking for All Language Varieties. CoRR abs/2509.07139 (2025)
[i77]
- dblp key: journals/corr/abs-2509-15629
- persistent URL: https://dblp.org/rec/journals/corr/abs-2509-15629
Lester Phillip Violeta, Xueyao Zhang, Jiatong Shi, Yusuke Yasuda, Wen-Chin Huang, Zhizheng Wu, Tomoki Toda:
The Singing Voice Conversion Challenge 2025: From Singer Identity Conversion To Singing Style Conversion. CoRR abs/2509.15629 (2025)
[i76]
- dblp key: journals/corr/abs-2510-01812
- persistent URL: https://dblp.org/rec/journals/corr/abs-2510-01812
Yuxun Tang, Lan Liu, Wenhao Feng, Yiwen Zhao, Jionghao Han, Yifeng Yu, Jiatong Shi, Qin Jin:
SingMOS-Pro: An Comprehensive Benchmark for Singing Quality Assessment. CoRR abs/2510.01812 (2025)
[i75]
- dblp key: journals/corr/abs-2510-02066
- persistent URL: https://dblp.org/rec/journals/corr/abs-2510-02066
Siddhant Arora, Jinchuan Tian, Hayato Futami, Jiatong Shi, Yosuke Kashiwagi, Emiru Tsunoo, Shinji Watanabe:
Chain-of-Thought Reasoning in Streaming Full-Duplex End-to-End Spoken Dialogue Systems. CoRR abs/2510.02066 (2025)
[i74]
- dblp key: journals/corr/abs-2511-01261
- persistent URL: https://dblp.org/rec/journals/corr/abs-2511-01261
Jiatong Shi, Jionghao Han, Yichen Lu, Santiago Pascual, Pengfei Wu, Chenye Cui, Shinji Watanabe, Chao Weng, Cong Zhou:
Speech-DRAME: A Framework for Human-Aligned Benchmarks in Speech Role-Play. CoRR abs/2511.01261 (2025)
[i73]
- dblp key: journals/corr/abs-2511-20972
- persistent URL: https://dblp.org/rec/journals/corr/abs-2511-20972
Jionghao Han, Jiatong Shi, Masao Someki, Yuxun Tang, Lan Liu, Yiwen Zhao, Wenhao Feng, Shinji Watanabe:
SingingSDS: A Singing-Capable Spoken Dialogue System for Conversational Roleplay Applications. CoRR abs/2511.20972 (2025)
[i72]
- dblp key: journals/corr/abs-2511-21045
- persistent URL: https://dblp.org/rec/journals/corr/abs-2511-21045
Jionghao Han, Jiatong Shi, Zhuoyan Tao, Yuxun Tang, Yiwen Zhao, Gus Xia, Shinji Watanabe:
CartoonSing: Unifying Human and Nonhuman Timbres in Singing Generation. CoRR abs/2511.21045 (2025)
[i71]
- dblp key: journals/corr/abs-2511-22687
- persistent URL: https://dblp.org/rec/journals/corr/abs-2511-22687
Jiatong Shi, Haoran Wang, William Chen, Chenda Li, Wangyou Zhang, Jinchuan Tian, Shinji Watanabe:
PURE Codec: Progressive Unfolding of Residual Entropy for Speech Codec Learning. CoRR abs/2511.22687 (2025)
[i70]
- dblp key: journals/corr/abs-2512-14653
- persistent URL: https://dblp.org/rec/journals/corr/abs-2512-14653
Yiwen Zhao, Jiatong Shi, Yuxun Tang, William Chen, Shinji Watanabe:
Robust Training of Singing Voice Synthesis Using Prior and Posterior Uncertainty. CoRR abs/2512.14653 (2025)
[i69]
- dblp key: journals/corr/abs-2512-14657
- persistent URL: https://dblp.org/rec/journals/corr/abs-2512-14657
Yiwen Zhao, Jiatong Shi, Jinchuan Tian, Yuxun Tang, Jiarui Hai, Jionghao Han, Shinji Watanabe:
Adapting Speech Language Model to Singing Voice Synthesis. CoRR abs/2512.14657 (2025)
2024
[j5]
- dblp key: journals/taslp/YangCHLLWSCTHFCLCHTLLMWL24
- persistent URL: https://dblp.org/rec/journals/taslp/YangCHLLWSCTHFCLCHTLLMWL24
Shu-Wen Yang, Heng-Jui Chang, Zili Huang, Andy T. Liu, Cheng-I Lai, Haibin Wu, Jiatong Shi, Xuankai Chang, Hsiang-Sheng Tsai, Wen-Chin Huang, Tzu-hsun Feng, Po-Han Chi, Yist Y. Lin, Yung-Sung Chuang, Tzu-Hsien Huang, Wei-Cheng Tseng, Kushal Lakhotia, Shang-Wen Li, Abdelrahman Mohamed, Shinji Watanabe, Hung-yi Lee:
A Large-Scale Evaluation of Speech Foundation Models. IEEE ACM Trans. Audio Speech Lang. Process. 32: 2884-2899 (2024)
[c71]
- dblp key: conf/aaai/HuangLYSCYWHHLR24
- persistent URL: https://dblp.org/rec/conf/aaai/HuangLYSCYWHHLR24
Rongjie Huang, Mingze Li, Dongchao Yang, Jiatong Shi, Xuankai Chang, Zhenhui Ye, Yuning Wu, Zhiqing Hong, Jiawei Huang, Jinglin Liu, Yi Ren, Yuexian Zou, Zhou Zhao, Shinji Watanabe:
AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head. AAAI 2024: 23802-23804
[c70]
- dblp key: conf/acl/HeCTRS0NML24
- persistent URL: https://dblp.org/rec/conf/acl/HeCTRS0NML24
Taiqi He, Kwanghee Choi, Lindia Tjuatja, Nathaniel R. Robinson, Jiatong Shi, Shinji Watanabe, Graham Neubig, David R. Mortensen, Lori S. Levin:
Wav2Gloss: Generating Interlinear Glossed Text from Speech. ACL (1) 2024: 568-582
[c69]
- dblp key: conf/acl/HuangZWYTYLW0CS24
- persistent URL: https://dblp.org/rec/conf/acl/HuangZWYTYLW0CS24
Rongjie Huang, Chunlei Zhang, Yongqi Wang, Dongchao Yang, Jinchuan Tian, Zhenhui Ye, Luping Liu, Zehan Wang, Ziyue Jiang, Xuankai Chang, Jiatong Shi, Chao Weng, Zhou Zhao, Dong Yu:
Make-A-Voice: Revisiting Voice Large Language Models as Scalable Multilingual and Multitask Learners. ACL (1) 2024: 10929-10942
[c68]
- dblp key: conf/emnlp/ChenZPLTSCML024
- persistent URL: https://dblp.org/rec/conf/emnlp/ChenZPLTSCML024
William Chen, Wangyou Zhang, Yifan Peng, Xinjian Li, Jinchuan Tian, Jiatong Shi, Xuankai Chang, Soumi Maiti, Karen Livescu, Shinji Watanabe:
Towards Robust Speech Representation Learning for Thousands of Languages. EMNLP 2024: 10205-10224
[c67]
- dblp key: conf/healthcom/QiSSBL24
- persistent URL: https://dblp.org/rec/conf/healthcom/QiSSBL24
Kristin Qi, Jiatong Shi, Caroline Summerour, John A. Batsis, Xiaohui Liang:
Exploiting Longitudinal Speech Sessions via Voice Assistant Systems for Early Detection of Cognitive Decline. HealthCom 2024: 1-6
[c66]
- dblp key: conf/icassp/ChangYCJLMSST0F24
- persistent URL: https://dblp.org/rec/conf/icassp/ChangYCJLMSST0F24
Xuankai Chang, Brian Yan, Kwanghee Choi, Jee-Weon Jung, Yichen Lu, Soumi Maiti, Roshan S. Sharma, Jiatong Shi, Jinchuan Tian, Shinji Watanabe, Yuya Fujita, Takashi Maekaku, Pengcheng Guo, Yao-Fei Cheng, Pavel Denisov, Kohei Saijo, Hsiu-Hsuan Wang:
Exploring Speech Recognition, Translation, and Understanding with Discrete Speech Units: A Comparative Study. ICASSP 2024: 11481-11485
[c65]
- dblp key: conf/icassp/MaekakuSCF024
- persistent URL: https://dblp.org/rec/conf/icassp/MaekakuSCF024
Takashi Maekaku, Jiatong Shi, Xuankai Chang, Yuya Fujita, Shinji Watanabe:
Hubertopic: Enhancing Semantic Representation of Hubert Through Self-Supervision Utilizing Topic Model. ICASSP 2024: 11741-11745
[c64]
- dblp key: conf/icassp/HuangLWHKWACSPS24
- persistent URL: https://dblp.org/rec/conf/icassp/HuangLWHKWACSPS24
Chien-Yu Huang, Ke-Han Lu, Shih-Heng Wang, Chi-Yuan Hsiao, Chun-Yi Kuan, Haibin Wu, Siddhant Arora, Kai-Wei Chang, Jiatong Shi, Yifan Peng, Roshan S. Sharma, Shinji Watanabe, Bhiksha Ramakrishnan, Shady Shehata, Hung-Yi Lee:
Dynamic-Superb: Towards a Dynamic, Collaborative, and Comprehensive Instruction-Tuning Benchmark For Speech. ICASSP 2024: 12136-12140
[c63]
- dblp key: conf/iclr/ShiIMKS24
- persistent URL: https://dblp.org/rec/conf/iclr/ShiIMKS24
Jiatong Shi, Hirofumi Inaguma, Xutai Ma, Ilia Kulikov, Anna Y. Sun:
Multi-resolution HuBERT: Multi-resolution Speech Self-Supervised Learning with Masked Unit Prediction. ICLR 2024
[c62]
- dblp key: conf/icml/YangT0HLGCSZ0ZW24
- persistent URL: https://dblp.org/rec/conf/icml/YangT0HLGCSZ0ZW24
Dongchao Yang, Jinchuan Tian, Xu Tan, Rongjie Huang, Songxiang Liu, Haohan Guo, Xuankai Chang, Jiatong Shi, Sheng Zhao, Jiang Bian, Zhou Zhao, Xixin Wu, Helen M. Meng:
UniAudio: Towards Universal Audio Generation with Large Language Models. ICML 2024: 56422-56447
[c61]
- dblp key: conf/interspeech/ChangCSCHSM24
- persistent URL: https://dblp.org/rec/conf/interspeech/ChangCSCHSM24
Kalvin Chang, Yi-Hui Chou, Jiatong Shi, Hsuan-Ming Chen, Nicole Holliday, Odette Scharenborg, David R. Mortensen:
Self-supervised Speech Representations Still Struggle with African American Vernacular English. INTERSPEECH 2024
[c60]
- dblp key: conf/interspeech/ChangSTWTW0A0J24
- persistent URL: https://dblp.org/rec/conf/interspeech/ChangSTWTW0A0J24
Xuankai Chang, Jiatong Shi, Jinchuan Tian, Yuning Wu, Yuxun Tang, Yihan Wu, Shinji Watanabe, Yossi Adi, Xie Chen, Qin Jin:
The Interspeech 2024 Challenge on Speech Processing Using Discrete Units. INTERSPEECH 2024
[c59]
- dblp key: conf/interspeech/JungZSAHGTA024
- persistent URL: https://dblp.org/rec/conf/interspeech/JungZSAHGTA024
Jee-weon Jung, Wangyou Zhang, Jiatong Shi, Zakaria Aldeneh, Takuya Higuchi, Alex Gichamba, Barry-John Theobald, Ahmed Hussen Abdelaziz, Shinji Watanabe:
ESPnet-SPK: full pipeline speaker embedding toolkit with reproducible recipes, self-supervised front-ends, and off-the-shelf models. INTERSPEECH 2024
[c58]
- dblp key: conf/interspeech/LiMS24
- persistent URL: https://dblp.org/rec/conf/interspeech/LiMS24
Shuhua Li, Qirong Mao, Jiatong Shi:
PL-TTS: A Generalizable Prompt-based Diffusion TTS Augmented by Large Language Model. INTERSPEECH 2024
[c57]
- dblp key: conf/interspeech/PengTCAYS0CSCJ024
- persistent URL: https://dblp.org/rec/conf/interspeech/PengTCAYS0CSCJ024
Yifan Peng, Jinchuan Tian, William Chen, Siddhant Arora, Brian Yan, Yui Sudo, Muhammad Shakeel, Kwanghee Choi, Jiatong Shi, Xuankai Chang, Jee-weon Jung, Shinji Watanabe:
OWSM v3.1: Better and Faster Open Whisper-Style Speech Models based on E-Branchformer. INTERSPEECH 2024
[c56]
- dblp key: conf/interspeech/ShiLBZWTYJ024
- persistent URL: https://dblp.org/rec/conf/interspeech/ShiLBZWTYJ024
Jiatong Shi, Yueqian Lin, Xinyi Bai, Keyi Zhang, Yuning Wu, Yuxun Tang, Yifeng Yu, Qin Jin, Shinji Watanabe:
Singing Voice Data Scaling-up: An Introduction to ACE-Opencpop and ACE-KiSing. INTERSPEECH 2024
[c55]
- dblp key: conf/interspeech/ShiMIS024
- persistent URL: https://dblp.org/rec/conf/interspeech/ShiMIS024
Jiatong Shi, Xutai Ma, Hirofumi Inaguma, Anna Sun, Shinji Watanabe:
MMM: Multi-Layer Multi-Residual Multi-Stream Discrete Speech Representation from Self-supervised Learning Model. INTERSPEECH 2024
[c54]
- dblp key: conf/interspeech/ShiWCBKTCJLL024
- persistent URL: https://dblp.org/rec/conf/interspeech/ShiWCBKTCJLL024
Jiatong Shi, Shih-Heng Wang, William Chen, Martijn Bartelds, Vanya Bannihatti Kumar, Jinchuan Tian, Xuankai Chang, Dan Jurafsky, Karen Livescu, Hung-yi Lee, Shinji Watanabe:
ML-SUPERB 2.0: Benchmarking Multilingual Speech Models Across Modeling Constraints, Languages, and Datasets. INTERSPEECH 2024
[c53]
- dblp key: conf/interspeech/SrivastavaSC024
- persistent URL: https://dblp.org/rec/conf/interspeech/SrivastavaSC024
Tejes Srivastava, Jiatong Shi, William Chen, Shinji Watanabe:
EFFUSE: Efficient Self-Supervised Feature Fusion for E2E ASR in Low Resource and Multilingual Scenarios. INTERSPEECH 2024
[c52]
- dblp key: conf/interspeech/TangWSJ24
- persistent URL: https://dblp.org/rec/conf/interspeech/TangWSJ24
Yuxun Tang, Yuning Wu, Jiatong Shi, Qin Jin:
SingOMD: Singing Oriented Multi-resolution Discrete Representation Construction from Speech Models. INTERSPEECH 2024
[c51]
- dblp key: conf/interspeech/WuZSTYJ24
- persistent URL: https://dblp.org/rec/conf/interspeech/WuZSTYJ24
Yuning Wu, Chunlei Zhang, Jiatong Shi, Yuxun Tang, Shan Yang, Qin Jin:
TokSing: Singing Voice Synthesis based on Discrete Tokens. INTERSPEECH 2024
[c50]
- dblp key: conf/interspeech/ZangS0YHTXZGTD24
- persistent URL: https://dblp.org/rec/conf/interspeech/ZangS0YHTXZGTD24
Yongyi Zang, Jiatong Shi, You Zhang, Ryuichi Yamamoto, Jionghao Han, Yuxun Tang, Shengyuan Xu, Wenxiao Zhao, Jing Guo, Tomoki Toda, Zhiyao Duan:
CtrSVDD: A Benchmark Dataset and Baseline Analysis for Controlled Singing Voice Deepfake Detection. INTERSPEECH 2024
[c49]
- dblp key: conf/iscslp/WuYSQJ24
- persistent URL: https://dblp.org/rec/conf/iscslp/WuYSQJ24
Yuning Wu, Yifeng Yu, Jiatong Shi, Tao Qian, Qin Jin:
A Systematic Exploration of Joint-Training for Singing Voice Synthesis. ISCSLP 2024: 289-293
[c48]
- dblp key: conf/iscslp/TangSWJ24
- persistent URL: https://dblp.org/rec/conf/iscslp/TangSWJ24
Yuxun Tang, Jiatong Shi, Yuning Wu, Qin Jin:
An Exploration on Singing MOS Prediction. ISCSLP 2024: 651-655
[c47]
  - https://dblp.org/rec/conf/mm/WuSYTQLHB0J24
Yuning Wu, Jiatong Shi, Yifeng Yu, Yuxun Tang, Tao Qian, Yueqian Lin, Jionghao Han, Xinyi Bai, Shinji Watanabe, Qin Jin:
Muskits-ESPnet: A Comprehensive Toolkit for Singing Voice Synthesis in New Paradigm. ACM Multimedia 2024: 11279-11281
[c46]
  - https://dblp.org/rec/conf/slt/WangSHWL24
Shih-Heng Wang, Jiatong Shi, Chien-Yu Huang, Shinji Watanabe, Hung-Yi Lee:
Fusion Of Discrete Representations and Self-Augmented Representations for Multilingual Automatic Speech Recognition. SLT 2024: 247-254
[c45]
  - https://dblp.org/rec/conf/slt/ShiTWJYMCWTBAZDSWLRJSW24
Jiatong Shi, Jinchuan Tian, Yihan Wu, Jee-Weon Jung, Jia Qi Yip, Yoshiki Masuyama, William Chen, Yuning Wu, Yuxun Tang, Massa Baali, Dareen Alharthi, Dong Zhang, Ruifan Deng, Tejes Srivastava, Haibin Wu, Alexander H. Liu, Bhiksha Raj, Qin Jin, Ruihua Song, Shinji Watanabe:
ESPnet-Codec: Comprehensive Training and Evaluation of Neural Codecs For Audio, Music, and Speech. SLT 2024: 562-569
[c44]
  - https://dblp.org/rec/conf/slt/YuSWTW24
Yifeng Yu, Jiatong Shi, Yuning Wu, Yuxun Tang, Shinji Watanabe:
VISinger2+: End-to-End Singing Voice Synthesis Augmented by Self-Supervised Learning Representation. SLT 2024: 719-726
[c43]
  - https://dblp.org/rec/conf/slt/ZhangZSYTD24
You Zhang, Yongyi Zang, Jiatong Shi, Ryuichi Yamamoto, Tomoki Toda, Zhiyao Duan:
SVDD 2024: The Inaugural Singing Voice Deepfake Detection Challenge. SLT 2024: 782-787
[c42]
  - https://dblp.org/rec/conf/slt/SomekiCACCHPSSW24
Masao Someki, Kwanghee Choi, Siddhant Arora, William Chen, Samuele Cornell, Jionghao Han, Yifan Peng, Jiatong Shi, Vaibhav Srivastav, Shinji Watanabe:
ESPnet-EZ: Python-Only ESPnet For Easy Fine-Tuning And Integration. SLT 2024: 863-870
[i68]
  - https://dblp.org/rec/journals/corr/abs-2401-16658
Yifan Peng, Jinchuan Tian, William Chen, Siddhant Arora, Brian Yan, Yui Sudo, Muhammad Shakeel, Kwanghee Choi, Jiatong Shi, Xuankai Chang, Jee-weon Jung, Shinji Watanabe:
OWSM v3.1: Better and Faster Open Whisper-Style Speech Models based on E-Branchformer. CoRR abs/2401.16658 (2024)
[i67]
  - https://dblp.org/rec/journals/corr/abs-2401-17230
Jee-weon Jung, Wangyou Zhang, Jiatong Shi, Zakaria Aldeneh, Takuya Higuchi, Barry-John Theobald, Ahmed Hussen Abdelaziz, Shinji Watanabe:
ESPnet-SPK: full pipeline speaker embedding toolkit with reproducible recipes, self-supervised front-ends, and off-the-shelf models. CoRR abs/2401.17230 (2024)
[i66]
  - https://dblp.org/rec/journals/corr/abs-2401-17619
Jiatong Shi, Yueqian Lin, Xinyi Bai, Keyi Zhang, Yuning Wu, Yuxun Tang, Yifeng Yu, Qin Jin, Shinji Watanabe:
Singing Voice Data Scaling-up: An Introduction to ACE-Opencpop and KiSing-v2. CoRR abs/2401.17619 (2024)
[i65]
  - https://dblp.org/rec/journals/corr/abs-2403-13169
Taiqi He, Kwanghee Choi, Lindia Tjuatja, Nathaniel R. Robinson, Jiatong Shi, Shinji Watanabe, Graham Neubig, David R. Mortensen, Lori S. Levin:
Wav2Gloss: Generating Interlinear Glossed Text from Speech. CoRR abs/2403.13169 (2024)
[i64]
  - https://dblp.org/rec/journals/corr/abs-2404-09385
Shu-Wen Yang, Heng-Jui Chang, Zili Huang, Andy T. Liu, Cheng-I Lai, Haibin Wu, Jiatong Shi, Xuankai Chang, Hsiang-Sheng Tsai, Wen-Chin Huang, Tzu-hsun Feng, Po-Han Chi, Yist Y. Lin, Yung-Sung Chuang, Tzu-Hsien Huang, Wei-Cheng Tseng, Kushal Lakhotia, Shang-Wen Li, Abdelrahman Mohamed, Shinji Watanabe, Hung-yi Lee:
A Large-Scale Evaluation of Speech Foundation Models. CoRR abs/2404.09385 (2024)
[i63]
  - https://dblp.org/rec/journals/corr/abs-2405-05244
You Zhang, Yongyi Zang, Jiatong Shi, Ryuichi Yamamoto, Jionghao Han, Yuxun Tang, Tomoki Toda, Zhiyao Duan:
SVDD Challenge 2024: A Singing Voice Deepfake Detection Challenge Evaluation Plan. CoRR abs/2405.05244 (2024)
[i62]
  - https://dblp.org/rec/journals/corr/abs-2406-02438
Yongyi Zang, Jiatong Shi, You Zhang, Ryuichi Yamamoto, Jionghao Han, Yuxun Tang, Shengyuan Xu, Wenxiao Zhao, Jing Guo, Tomoki Toda, Zhiyao Duan:
CtrSVDD: A Benchmark Dataset and Baseline Analysis for Controlled Singing Voice Deepfake Detection. CoRR abs/2406.02438 (2024)
[i61]
  - https://dblp.org/rec/journals/corr/abs-2406-02950
Yui Sudo, Muhammad Shakeel, Yosuke Fukumoto, Brian Yan, Jiatong Shi, Yifan Peng, Shinji Watanabe:
4D ASR: Joint Beam Search Integrating CTC, Attention, Transducer, and Mask Predict Decoders. CoRR abs/2406.02950 (2024)
[i60]
  - https://dblp.org/rec/journals/corr/abs-2406-07725
Xuankai Chang, Jiatong Shi, Jinchuan Tian, Yuning Wu, Yuxun Tang, Yihan Wu, Shinji Watanabe, Yossi Adi, Xie Chen, Qin Jin:
The Interspeech 2024 Challenge on Speech Processing Using Discrete Units. CoRR abs/2406.07725 (2024)
[i59]
  - https://dblp.org/rec/journals/corr/abs-2406-08416
Yuning Wu, Chunlei Zhang, Jiatong Shi, Yuxun Tang, Shan Yang, Qin Jin:
TokSing: Singing Voice Synthesis based on Discrete Tokens. CoRR abs/2406.08416 (2024)
[i58]
  - https://dblp.org/rec/journals/corr/abs-2406-08641
Jiatong Shi, Shih-Heng Wang, William Chen, Martijn Bartelds, Vanya Bannihatti Kumar, Jinchuan Tian, Xuankai Chang, Dan Jurafsky, Karen Livescu, Hung-yi Lee, Shinji Watanabe:
ML-SUPERB 2.0: Benchmarking Multilingual Speech Models Across Modeling Constraints, Languages, and Datasets. CoRR abs/2406.08641 (2024)
[i57]
  - https://dblp.org/rec/journals/corr/abs-2406-08761
Yifeng Yu, Jiatong Shi, Yuning Wu, Shinji Watanabe:
VISinger2+: End-to-End Singing Voice Synthesis Augmented by Self-Supervised Learning Representation. CoRR abs/2406.08761 (2024)
[i56]
  - https://dblp.org/rec/journals/corr/abs-2406-08905
Yuxun Tang, Yuning Wu, Jiatong Shi, Qin Jin:
SingOMD: Singing Oriented Multi-resolution Discrete Representation Construction from Speech Models. CoRR abs/2406.08905 (2024)
[i55]
  - https://dblp.org/rec/journals/corr/abs-2406-09869
Jiatong Shi, Xutai Ma, Hirofumi Inaguma, Anna Y. Sun, Shinji Watanabe:
MMM: Multi-Layer Multi-Residual Multi-Stream Discrete Speech Representation from Self-supervised Learning Model. CoRR abs/2406.09869 (2024)
[i54]
  - https://dblp.org/rec/journals/corr/abs-2406-10911
Yuxun Tang, Jiatong Shi, Yuning Wu, Qin Jin:
SingMOS: An extensive Open-Source Singing Voice Dataset for MOS Prediction. CoRR abs/2406.10911 (2024)
[i53]
  - https://dblp.org/rec/journals/corr/abs-2407-00837
William Chen, Wangyou Zhang, Yifan Peng, Xinjian Li, Jinchuan Tian, Jiatong Shi, Xuankai Chang, Soumi Maiti, Karen Livescu, Shinji Watanabe:
Towards Robust Speech Representation Learning for Thousands of Languages. CoRR abs/2407.00837 (2024)
[i52]
  - https://dblp.org/rec/journals/corr/abs-2408-14262
Kalvin Chang, Yi-Hui Chou, Jiatong Shi, Hsuan-Ming Chen, Nicole Holliday, Odette Scharenborg, David R. Mortensen:
Self-supervised Speech Representations Still Struggle with African American Vernacular English. CoRR abs/2408.14262 (2024)
[i51]
  - https://dblp.org/rec/journals/corr/abs-2408-16132
You Zhang, Yongyi Zang, Jiatong Shi, Ryuichi Yamamoto, Tomoki Toda, Zhiyao Duan:
SVDD 2024: The Inaugural Singing Voice Deepfake Detection Challenge. CoRR abs/2408.16132 (2024)
[i50]
  - https://dblp.org/rec/journals/corr/abs-2409-07226
Yuning Wu, Jiatong Shi, Yifeng Yu, Yuxun Tang, Tao Qian, Yueqian Lin, Jionghao Han, Xinyi Bai, Shinji Watanabe, Qin Jin:
Muskits-ESPnet: A Comprehensive Toolkit for Singing Voice Synthesis in New Paradigm. CoRR abs/2409.07226 (2024)
[i49]
  - https://dblp.org/rec/journals/corr/abs-2409-09506
Masao Someki, Kwanghee Choi, Siddhant Arora, William Chen, Samuele Cornell, Jionghao Han, Yifan Peng, Jiatong Shi, Vaibhav Srivastav, Shinji Watanabe:
ESPnet-EZ: Python-only ESPnet for Easy Fine-tuning and Integration. CoRR abs/2409.09506 (2024)
[i48]
  - https://dblp.org/rec/journals/corr/abs-2409-12403
Jinchuan Tian, Chunlei Zhang, Jiatong Shi, Hao Zhang, Jianwei Yu, Shinji Watanabe, Dong Yu:
Preference Alignment Improves Language Model-Based TTS. CoRR abs/2409.12403 (2024)
[i47]
  - https://dblp.org/rec/journals/corr/abs-2409-12832
Tenghao Huang, Donghee Lee, John Sweeney, Jiatong Shi, Emily Steliotes, Matthew Lange, Jonathan May, Muhao Chen:
FoodPuzzle: Developing Large Language Model Agents as Flavor Scientists. CoRR abs/2409.12832 (2024)
[i46]
  - https://dblp.org/rec/journals/corr/abs-2409-15897
Jiatong Shi, Jinchuan Tian, Yihan Wu, Jee-weon Jung, Jia Qi Yip, Yoshiki Masuyama, William Chen, Yuning Wu, Yuxun Tang, Massa Baali, Dareen Alharthi, Dong Zhang, Ruifan Deng, Tejes Srivastava, Haibin Wu, Alexander H. Liu, Bhiksha Raj, Qin Jin, Ruihua Song, Shinji Watanabe:
ESPnet-Codec: Comprehensive Training and Evaluation of Neural Codecs for Audio, Music, and Speech. CoRR abs/2409.15897 (2024)
[i45]
  - https://dblp.org/rec/journals/corr/abs-2410-12885
Kristin Qi, Jiatong Shi, Caroline Summerour, John A. Batsis, Xiaohui Liang:
Exploiting Longitudinal Speech Sessions via Voice Assistant Systems for Early Detection of Cognitive Decline. CoRR abs/2410.12885 (2024)
[i44]
  - https://dblp.org/rec/journals/corr/abs-2411-05088
Ibrahim Said Ahmad, Antonios Anastasopoulos, Ondrej Bojar, Claudia Borg, Marine Carpuat, Roldano Cattoni, Mauro Cettolo, William Chen, Qianqian Dong, Marcello Federico, Barry Haddow, Dávid Javorský, Mateusz Krubinski, Tsz Kin Lam, Xutai Ma, Prashant Mathur, Evgeny Matusov, Chandresh Maurya, John P. McCrae, Kenton Murray, Satoshi Nakamura, Matteo Negri, Jan Niehues, Xing Niu, Atul Kr. Ojha, John E. Ortega, Sara Papi, Peter Polák, Adam Pospísil, Pavel Pecina, Elizabeth Salesky, Nivedita Sethiya, Balaram Sarkar, Jiatong Shi, Claytone Sikasote, Matthias Sperber, Sebastian Stüker, Katsuhito Sudoh, Brian Thompson, Marco Turchi, Alex Waibel, Shinji Watanabe, Patrick Wilken, Petr Zemánek, Rodolfo Zevallos:
Findings of the IWSLT 2024 Evaluation Campaign. CoRR abs/2411.05088 (2024)
[i43]
  - https://dblp.org/rec/journals/corr/abs-2411-05361
Chien-yu Huang, Wei-Chih Chen, Shu-Wen Yang, Andy T. Liu, Chen-An Li, Yu-Xiang Lin, Wei-Cheng Tseng, Anuj Diwan, Yi-Jen Shih, Jiatong Shi, William Chen, Xuanjun Chen, Chi-Yuan Hsiao, Puyuan Peng, Shih-Heng Wang, Chun-Yi Kuan, Ke-Han Lu, Kai-Wei Chang, Chih-Kai Yang, Fabian Ritter Gutierrez, Ming To Chuang, Kuan-Po Huang, Siddhant Arora, You-Kuan Lin, Eunjung Yeo, Kalvin Chang, Chung-Ming Chien, Kwanghee Choi, Cheng-Hsiu Hsieh, Yi-Cheng Lin, Chee-En Yu, I-Hsiang Chiu, Heitor R. Guimarães, Jionghao Han, Tzu-Quan Lin, Tzu-Yuan Lin, Homu Chang, Ting-Wu Chang, Chun Wei Chen, Shou-Jen Chen, Yu-Hua Chen, Hsi-Chun Cheng, Kunal Dhawan, Jia-Lin Fang, Shi-Xin Fang, Kuan-Yu Fang Chiang, Chi An Fu, Hsien-Fu Hsiao, Ching Yu Hsu, Shao-Syuan Huang, Lee Chen Wei, Hsi-Che Lin, Hsuan-Hao Lin, Hsuan-Ting Lin, Jian-Ren Lin, Ting-Chun Liu, Li-Chun Lu, Tsung-Min Pai, Ankita Pasad, Shih-Yun Shan Kuan, Suwon Shon, Yuxun Tang, Yun-Shao Tsai, Jui-Chiang Wei, Tzu-Chieh Wei, Chengxi Wu, Dien-Ruei Wu, Chao-Han Huck Yang, Chieh-Chi Yang, Jia Qi Yip, Shao-Xiang Yuan, Vahid Noroozi, Zhehuai Chen, Haibin Wu, Karen Livescu, David Harwath, Shinji Watanabe, Hung-yi Lee:
Dynamic-SUPERB Phase-2: A Collaboratively Expanding Benchmark for Measuring the Capabilities of Spoken Language Models with 180 Tasks. CoRR abs/2411.05361 (2024)
[i42]
  - https://dblp.org/rec/journals/corr/abs-2411-18107
Shih-Heng Wang, Jiatong Shi, Chien-yu Huang, Shinji Watanabe, Hung-yi Lee:
Fusion of Discrete Representations and Self-Augmented Representations for Multilingual Automatic Speech Recognition. CoRR abs/2411.18107 (2024)
[i41]
  - https://dblp.org/rec/journals/corr/abs-2411-18217
Shih-Heng Wang, Zih-Ching Chen, Jiatong Shi, Ming-To Chuang, Guan-Ting Lin, Kuan-Po Huang, David Harwath, Shang-Wen Li, Hung-yi Lee:
How to Learn a New Language? An Efficient Solution for Self-Supervised Learning Models Unseen Languages Adaption in Low-Resource Scenario. CoRR abs/2411.18217 (2024)
[i40]
  - https://dblp.org/rec/journals/corr/abs-2412-17667
Jiatong Shi, Hye-jin Shim, Jinchuan Tian, Siddhant Arora, Haibin Wu, Darius Petermann, Jia Qi Yip, You Zhang, Yuxun Tang, Wangyou Zhang, Dareen Alharthi, Yichen Huang, Koichi Saito, Jionghao Han, Yiwen Zhao, Chris Donahue, Shinji Watanabe:
VERSA: A Versatile Evaluation Toolkit for Speech, Audio, and Music. CoRR abs/2412.17667 (2024)
2023
[j4]
  - https://dblp.org/rec/journals/eaai/CaoSW23
Feilong Cao, Jiatong Shi, Chenglin Wen:
A dynamic graph aggregation framework for 3D point cloud registration. Eng. Appl. Artif. Intell. 120: 105817 (2023)
[j3]
  - https://dblp.org/rec/journals/ijon/ShiYYC23
Jiatong Shi, Hailiang Ye, Bing Yang, Feilong Cao:
An iteration-based interactive attention network for 3D point cloud registration. Neurocomputing 560: 126822 (2023)
[c41]
  - https://dblp.org/rec/conf/acl/YanS0IPDPFBHZNH23
Brian Yan, Jiatong Shi, Yun Tang, Hirofumi Inaguma, Yifan Peng, Siddharth Dalmia, Peter Polak, Patrick Fernandes, Dan Berrebbi, Tomoki Hayashi, Xiaohui Zhang, Zhaoheng Ni, Moto Hira, Soumi Maiti, Juan Pino, Shinji Watanabe:
ESPnet-ST-v2: Multipurpose Spoken Language Translation Toolkit. ACL (demo) 2023: 400-411
[c40]
  - https://dblp.org/rec/conf/acl/QianLSWGYJ23
Tao Qian, Fan Lou, Jiatong Shi, Yuning Wu, Shuai Guo, Xiang Yin, Qin Jin:
UniLG: A Unified Structure-aware Framework for Lyrics Generation. ACL (1) 2023: 983-1001
[c39]
  - https://dblp.org/rec/conf/asru/ChenSYBZPCMW23
William Chen, Jiatong Shi, Brian Yan, Dan Berrebbi, Wangyou Zhang, Yifan Peng, Xuankai Chang, Soumi Maiti, Shinji Watanabe:
Joint Prediction and Denoising for Large-Scale Multilingual Self-Supervised Learning. ASRU 2023: 1-8
[c38]
  - https://dblp.org/rec/conf/asru/ChouCWOBYCPYCPCCCS23
Yi-Hui Chou, Kalvin Chang, Meng-Ju Wu, Winston Ou, Alice Wen-Hsin Bi, Carol Yang, Bryan Y. Chen, Rong-Wei Pai, Po-Yen Yeh, Jo-Peng Chiang, Iu-Tshian Phoann, Winnie Chang, Chenxuan Cui, Noel Chen, Jiatong Shi:
Evaluating Self-Supervised Speech Models on a Taiwanese Hokkien Corpus. ASRU 2023: 1-7
[c37]
  - https://dblp.org/rec/conf/asru/HuangVLST23
Wen-Chin Huang, Lester Phillip Violeta, Songxiang Liu, Jiatong Shi, Tomoki Toda:
The Singing Voice Conversion Challenge 2023. ASRU 2023: 1-8
[c36]
  - https://dblp.org/rec/conf/asru/PengTYBCLSACSZSSJMW23
Yifan Peng, Jinchuan Tian, Brian Yan, Dan Berrebbi, Xuankai Chang, Xinjian Li, Jiatong Shi, Siddhant Arora, William Chen, Roshan S. Sharma, Wangyou Zhang, Yui Sudo, Muhammad Shakeel, Jee-Weon Jung, Soumi Maiti, Shinji Watanabe:
Reproducing Whisper-Style Training Using An Open-Source Toolkit And Publicly Available Data. ASRU 2023: 1-8
[c35]
  - https://dblp.org/rec/conf/asru/ShiCBWHHCCTLMLW23
Jiatong Shi, William Chen, Dan Berrebbi, Hsiu-Hsuan Wang, Wei-Ping Huang, En-Pei Hu, Ho-Lam Chuang, Xuankai Chang, Yuxun Tang, Shang-Wen Li, Abdelrahman Mohamed, Hung-Yi Lee, Shinji Watanabe:
Findings of the 2023 ML-SUPERB Challenge: Pre-Training And Evaluation Over More Languages And Beyond. ASRU 2023: 1-8
[c34]
  - https://dblp.org/rec/conf/icassp/ChenYSPMW23
William Chen, Brian Yan, Jiatong Shi, Yifan Peng, Soumi Maiti, Shinji Watanabe:
Improving Massively Multilingual ASR with Auxiliary CTC Objectives. ICASSP 2023: 1-5
[c33]
  - https://dblp.org/rec/conf/icassp/GaoSCGLWK23
Dongji Gao, Jiatong Shi, Shun-Po Chuang, Leibny Paola García, Hung-Yi Lee, Shinji Watanabe, Sanjeev Khudanpur:
EURO: ESPnet Unsupervised ASR Open-Source Toolkit. ICASSP 2023: 1-5
[c32]
  - https://dblp.org/rec/conf/icassp/ShiHCGGWLL23
Jiatong Shi, Chan-Jan Hsu, Ho-Lam Chung, Dongji Gao, Paola García, Shinji Watanabe, Ann Lee, Hung-Yi Lee:
Bridging Speech and Textual Pre-Trained Models With Unsupervised ASR. ICASSP 2023: 1-5
[c31]
  - https://dblp.org/rec/conf/icassp/ShiTLIWPW23
Jiatong Shi, Yun Tang, Ann Lee, Hirofumi Inaguma, Changhan Wang, Juan Pino, Shinji Watanabe:
Enhancing Speech-To-Speech Translation with Multiple TTS Targets. ICASSP 2023: 1-5
[c30]
  - https://dblp.org/rec/conf/icassp/WuSQGJ23
Yuning Wu, Jiatong Shi, Tao Qian, Dongji Gao, Qin Jin:
PHONEix: Acoustic Feature Processing Strategy for Enhanced Singing Pronunciation With Phoneme Distribution Predictor. ICASSP 2023: 1-5
[c29]
  - https://dblp.org/rec/conf/interspeech/ShiBCHHCC0ML023
Jiatong Shi, Dan Berrebbi, William Chen, En-Pei Hu, Wei-Ping Huang, Ho-Lam Chung, Xuankai Chang, Shang-Wen Li, Abdelrahman Mohamed, Hung-yi Lee, Shinji Watanabe:
ML-SUPERB: Multilingual Speech Universal PERformance Benchmark. INTERSPEECH 2023: 884-888
[c28]
  - https://dblp.org/rec/conf/interspeech/Shi0IG0023
Jiatong Shi, Yun Tang, Hirofumi Inaguma, Hongyu Gong, Juan Pino, Shinji Watanabe:
Exploration on HuBERT with Multiple Resolution. INTERSPEECH 2023: 3287-3291
[c27]
  - https://dblp.org/rec/conf/interspeech/Sudo0YS023
Yui Sudo, Muhammad Shakeel, Brian Yan, Jiatong Shi, Shinji Watanabe:
4D ASR: Joint modeling of CTC, Attention, Transducer, and Mask-Predict decoders. INTERSPEECH 2023: 3312-3316
[c26]
  - https://dblp.org/rec/conf/iwslt/AgrawalABBBCCCC23
Sweta Agrawal, Antonios Anastasopoulos, Luisa Bentivogli, Ondrej Bojar, Claudia Borg, Marine Carpuat, Roldano Cattoni, Mauro Cettolo, Mingda Chen, William Chen, Khalid Choukri, Alexandra Chronopoulou, Anna Currey, Thierry Declerck, Qianqian Dong, Kevin Duh, Yannick Estève, Marcello Federico, Souhir Gahbiche, Barry Haddow, Benjamin Hsu, Phu Mon Htut, Hirofumi Inaguma, Dávid Javorský, John Judge, Yasumasa Kano, Tom Ko, Rishu Kumar, Pengwei Li, Xutai Ma, Prashant Mathur, Evgeny Matusov, Paul McNamee, John P. McCrae, Kenton Murray, Maria Nadejde, Satoshi Nakamura, Matteo Negri, Ha Nguyen, Jan Niehues, Xing Niu, Atul Kr. Ojha, John E. Ortega, Proyag Pal, Juan Pino, Lonneke van der Plas, Peter Polák, Elijah Rippeth, Elizabeth Salesky, Jiatong Shi, Matthias Sperber, Sebastian Stüker, Katsuhito Sudoh, Yun Tang, Brian Thompson, Kevin Tran, Marco Turchi, Alex Waibel, Mingxuan Wang, Shinji Watanabe, Rodolfo Zevallos:
Findings of the IWSLT 2023 Evaluation Campaign. IWSLT@ACL 2023: 1-61
[c25]
  - https://dblp.org/rec/conf/iwslt/YanSMCLPA023
Brian Yan, Jiatong Shi, Soumi Maiti, William Chen, Xinjian Li, Yifan Peng, Siddhant Arora, Shinji Watanabe:
CMU's IWSLT 2023 Simultaneous Speech Translation System. IWSLT@ACL 2023: 235-240
[i39]
  - https://dblp.org/rec/journals/corr/abs-2302-12829
William Chen, Brian Yan, Jiatong Shi, Yifan Peng, Soumi Maiti, Shinji Watanabe:
Improving Massively Multilingual ASR With Auxiliary CTC Objectives. CoRR abs/2302.12829 (2023)
[i38]
  - https://dblp.org/rec/journals/corr/abs-2303-08607
Yuning Wu, Jiatong Shi, Tao Qian, Dongji Gao, Qin Jin:
PHONEix: Acoustic Feature Processing Strategy for Enhanced Singing Pronunciation with Phoneme Distribution Predictor. CoRR abs/2303.08607 (2023)
[i37]
  - https://dblp.org/rec/journals/corr/abs-2304-04618
Jiatong Shi, Yun Tang, Ann Lee, Hirofumi Inaguma, Changhan Wang, Juan Pino, Shinji Watanabe:
Enhancing Speech-to-Speech Translation with Multiple TTS Targets. CoRR abs/2304.04618 (2023)
[i36]
  - https://dblp.org/rec/journals/corr/abs-2304-12995
Rongjie Huang, Mingze Li, Dongchao Yang, Jiatong Shi, Xuankai Chang, Zhenhui Ye, Yuning Wu, Zhiqing Hong, Jiawei Huang, Jinglin Liu, Yi Ren, Zhou Zhao, Shinji Watanabe:
AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head. CoRR abs/2304.12995 (2023)
[i35]
  - https://dblp.org/rec/journals/corr/abs-2305-07455
Yu-Kuan Fu, Liang-Hsuan Tseng, Jiatong Shi, Chen-An Li, Tsu-Yuan Hsu, Shinji Watanabe, Hung-Yi Lee:
Improving Cascaded Unsupervised Speech Translation with Denoising Back-translation. CoRR abs/2305.07455 (2023)
[i34]
  - https://dblp.org/rec/journals/corr/abs-2305-10615
Jiatong Shi, Dan Berrebbi, William Chen, Ho-Lam Chung, En-Pei Hu, Wei-Ping Huang, Xuankai Chang, Shang-Wen Li, Abdelrahman Mohamed, Hung-yi Lee, Shinji Watanabe:
ML-SUPERB: Multilingual Speech Universal PERformance Benchmark. CoRR abs/2305.10615 (2023)
[i33]
  - https://dblp.org/rec/journals/corr/abs-2306-01084
Jiatong Shi, Yun Tang, Hirofumi Inaguma, Hongyu Gong, Juan Pino, Shinji Watanabe:
Exploration on HuBERT with Multiple Resolutions. CoRR abs/2306.01084 (2023)
[i32]
  - https://dblp.org/rec/journals/corr/abs-2306-14422
Wen-Chin Huang, Lester Phillip Violeta, Songxiang Liu, Jiatong Shi, Yusuke Yasuda, Tomoki Toda:
The Singing Voice Conversion Challenge 2023. CoRR abs/2306.14422 (2023)
[i31]
  - https://dblp.org/rec/journals/corr/abs-2308-02867
Yuning Wu, Yifeng Yu, Jiatong Shi, Tao Qian, Qin Jin:
A Systematic Exploration of Joint-training for Singing Voice Synthesis. CoRR abs/2308.02867 (2023)
[i30]
  - https://dblp.org/rec/journals/corr/abs-2309-09510
Chien-yu Huang, Ke-Han Lu, Shih-Heng Wang, Chi-Yuan Hsiao, Chun-Yi Kuan, Haibin Wu, Siddhant Arora, Kai-Wei Chang, Jiatong Shi, Yifan Peng, Roshan S. Sharma, Shinji Watanabe, Bhiksha Ramakrishnan, Shady Shehata, Hung-yi Lee:
Dynamic-SUPERB: Towards A Dynamic, Collaborative, and Comprehensive Instruction-Tuning Benchmark for Speech. CoRR abs/2309.09510 (2023)
[i29]
Yifan Peng, Jinchuan Tian, Brian Yan, Dan Berrebbi, Xuankai Chang, Xinjian Li, Jiatong Shi, Siddhant Arora, William Chen, Roshan S. Sharma, Wangyou Zhang, Yui Sudo, Muhammad Shakeel, Jee-weon Jung, Soumi Maiti, Shinji Watanabe:
Reproducing Whisper-Style Training Using an Open-Source Toolkit and Publicly Available Data. CoRR abs/2309.13876 (2023)
[i28]
William Chen, Jiatong Shi, Brian Yan, Dan Berrebbi, Wangyou Zhang, Yifan Peng, Xuankai Chang, Soumi Maiti, Shinji Watanabe:
Joint Prediction and Denoising for Large-scale Multilingual Self-supervised Learning. CoRR abs/2309.15317 (2023)
[i27]
Xuankai Chang, Brian Yan, Kwanghee Choi, Jee-Weon Jung, Yichen Lu, Soumi Maiti, Roshan S. Sharma, Jiatong Shi, Jinchuan Tian, Shinji Watanabe, Yuya Fujita, Takashi Maekaku, Pengcheng Guo, Yao-Fei Cheng, Pavel Denisov, Kohei Saijo, Hsiu-Hsuan Wang:
Exploring Speech Recognition, Translation, and Understanding with Discrete Speech Units: A Comparative Study. CoRR abs/2309.15800 (2023)
[i26]
Dongchao Yang, Jinchuan Tian, Xu Tan, Rongjie Huang, Songxiang Liu, Xuankai Chang, Jiatong Shi, Sheng Zhao, Jiang Bian, Xixin Wu, Zhou Zhao, Shinji Watanabe, Helen Meng:
UniAudio: An Audio Foundation Model Toward Universal Audio Generation. CoRR abs/2310.00704 (2023)
[i25]
Jiatong Shi, Hirofumi Inaguma, Xutai Ma, Ilia Kulikov, Anna Y. Sun:
Multi-resolution HuBERT: Multi-resolution Speech Self-Supervised Learning with Masked Unit Prediction. CoRR abs/2310.02720 (2023)
[i24]
Tejes Srivastava, Jiatong Shi, William Chen, Shinji Watanabe:
EFFUSE: Efficient Self-Supervised Feature Fusion for E2E ASR in Multilingual and Low Resource Scenarios. CoRR abs/2310.03938 (2023)
[i23]
Takashi Maekaku, Jiatong Shi, Xuankai Chang, Yuya Fujita, Shinji Watanabe:
HuBERTopic: Enhancing Semantic Representation of HuBERT through Self-supervision Utilizing Topic Model. CoRR abs/2310.03975 (2023)
[i22]
Jiatong Shi, William Chen, Dan Berrebbi, Hsiu-Hsuan Wang, Wei-Ping Huang, En-Pei Hu, Ho-Lam Chung, Xuankai Chang, Yuxun Tang, Shang-Wen Li, Abdelrahman Mohamed, Hung-yi Lee, Shinji Watanabe:
Findings of the 2023 ML-SUPERB Challenge: Pre-Training and Evaluation over More Languages and Beyond. CoRR abs/2310.05513 (2023)
[i21]
Yi-Hui Chou, Kalvin Chang, Meng-Ju Wu, Winston Ou, Alice Wen-Hsin Bi, Carol Yang, Bryan Y. Chen, Rong-Wei Pai, Po-Yen Yeh, Jo-Peng Chiang, Iu-Tshian Phoann, Winnie Chang, Chenxuan Cui, Noel Chen, Jiatong Shi:
Evaluating Self-supervised Speech Models on a Taiwanese Hokkien Corpus. CoRR abs/2312.06668 (2023)
2022
[j2]
Jiatong Shi, Chunlei Zhang, Chao Weng, Shinji Watanabe, Meng Yu, Dong Yu:
An investigation of neural uncertainty estimation for target speaker extraction equipped RNN transducer. Comput. Speech Lang. 73: 101327 (2022)
[c24]
Hsiang-Sheng Tsai, Heng-Jui Chang, Wen-Chin Huang, Zili Huang, Kushal Lakhotia, Shu-Wen Yang, Shuyan Dong, Andy T. Liu, Cheng-I Lai, Jiatong Shi, Xuankai Chang, Phil Hall, Hsuan-Jui Chen, Shang-Wen Li, Shinji Watanabe, Abdelrahman Mohamed, Hung-yi Lee:
SUPERB-SG: Enhanced Speech processing Universal PERformance Benchmark for Semantic and Generative Capabilities. ACL (1) 2022: 8479-8492
[c23]
Tao Qian, Jiatong Shi, Shuai Guo, Peter Wu, Qin Jin:
Training Strategies for Automatic Song Writing: A Unified Framework Perspective. ICASSP 2022: 4738-4742
[c22]
Chunlei Zhang, Jiatong Shi, Chao Weng, Meng Yu, Dong Yu:
Towards end-to-end Speaker Diarization with Generalized Neural Speaker Clustering. ICASSP 2022: 8372-8376
[c21]
Jiatong Shi, George Saon, David Haws, Shinji Watanabe, Brian Kingsbury:
VQ-T: RNN Transducers using Vector-Quantized Prediction Network States. INTERSPEECH 2022: 1656-1660
[c20]
Keqi Deng, Shinji Watanabe, Jiatong Shi, Siddhant Arora:
Blockwise Streaming Transformer for Spoken Language Understanding and Simultaneous Speech Translation. INTERSPEECH 2022: 1746-1750
[c19]
Dan Berrebbi, Jiatong Shi, Brian Yan, Osbel López-Francisco, Jonathan D. Amith, Shinji Watanabe:
Combining Spectral and Self-Supervised Features for Low Resource Speech Recognition and Translation. INTERSPEECH 2022: 3533-3537
[c18]
Shuai Guo, Jiatong Shi, Tao Qian, Shinji Watanabe, Qin Jin:
SingAug: Data Augmentation for Singing Voice Synthesis with Cycle-consistent Training Strategy. INTERSPEECH 2022: 4272-4276
[c17]
Jiatong Shi, Shuai Guo, Tao Qian, Tomoki Hayashi, Yuning Wu, Fangzheng Xu, Xuankai Chang, Huazhe Li, Peter Wu, Shinji Watanabe, Qin Jin:
Muskits: an End-to-end Music Processing Toolkit for Singing Voice Synthesis. INTERSPEECH 2022: 4277-4281
[c16]
Yui Sudo, Muhammad Shakeel, Kazuhiro Nakadai, Jiatong Shi, Shinji Watanabe:
Streaming Automatic Speech Recognition with Re-blocking Processing Based on Integrated Voice Activity Detection. INTERSPEECH 2022: 4641-4645
[c15]
Antonios Anastasopoulos, Loïc Barrault, Luisa Bentivogli, Marcely Zanon Boito, Ondrej Bojar, Roldano Cattoni, Anna Currey, Georgiana Dinu, Kevin Duh, Maha Elbayad, Clara Emmanuel, Yannick Estève, Marcello Federico, Christian Federmann, Souhir Gahbiche, Hongyu Gong, Roman Grundkiewicz, Barry Haddow, Benjamin Hsu, Dávid Javorský, Vera Kloudová, Surafel Melaku Lakew, Xutai Ma, Prashant Mathur, Paul McNamee, Kenton Murray, Maria Nadejde, Satoshi Nakamura, Matteo Negri, Jan Niehues, Xing Niu, John Ortega, Juan Miguel Pino, Elizabeth Salesky, Jiatong Shi, Matthias Sperber, Sebastian Stüker, Katsuhito Sudoh, Marco Turchi, Yogesh Virkar, Alexander Waibel, Changhan Wang, Shinji Watanabe:
Findings of the IWSLT 2022 Evaluation Campaign. IWSLT@ACL 2022: 98-157
[c14]
Brian Yan, Patrick Fernandes, Siddharth Dalmia, Jiatong Shi, Yifan Peng, Dan Berrebbi, Xinyi Wang, Graham Neubig, Shinji Watanabe:
CMU's IWSLT 2022 Dialect Speech Translation System. IWSLT@ACL 2022: 298-307
[c13]
Tzu-hsun Feng, Shuyan Annie Dong, Ching-Feng Yeh, Shu-Wen Yang, Tzu-Quan Lin, Jiatong Shi, Kai-Wei Chang, Zili Huang, Haibin Wu, Xuankai Chang, Shinji Watanabe, Abdelrahman Mohamed, Shang-Wen Li, Hung-yi Lee:
SUPERB @ SLT 2022: Challenge on Generalization and Efficiency of Self-Supervised Speech Representation Learning. SLT 2022: 1096-1103
[c12]
Yen Meng, Hsuan-Jui Chen, Jiatong Shi, Shinji Watanabe, Paola García, Hung-yi Lee, Hao Tang:
On Compressing Sequences for Self-Supervised Speech Models. SLT 2022: 1128-1135
[i20]
Hsiang-Sheng Tsai, Heng-Jui Chang, Wen-Chin Huang, Zili Huang, Kushal Lakhotia, Shu-Wen Yang, Shuyan Dong, Andy T. Liu, Cheng-I Jeff Lai, Jiatong Shi, Xuankai Chang, Phil Hall, Hsuan-Jui Chen, Shang-Wen Li, Shinji Watanabe, Abdelrahman Mohamed, Hung-yi Lee:
SUPERB-SG: Enhanced Speech processing Universal PERformance Benchmark for Semantic and Generative Capabilities. CoRR abs/2203.06849 (2022)
[i19]
Shuai Guo, Jiatong Shi, Tao Qian, Shinji Watanabe, Qin Jin:
SingAug: Data Augmentation for Singing Voice Synthesis with Cycle-consistent Training Strategy. CoRR abs/2203.17001 (2022)
[i18]
Dan Berrebbi, Jiatong Shi, Brian Yan, Osbel López-Francisco, Jonathan D. Amith, Shinji Watanabe:
Combining Spectral and Self-Supervised Features for Low Resource Speech Recognition and Translation. CoRR abs/2204.02470 (2022)
[i17]
Keqi Deng, Shinji Watanabe, Jiatong Shi, Siddhant Arora:
Blockwise Streaming Transformer for Spoken Language Understanding and Simultaneous Speech Translation. CoRR abs/2204.08920 (2022)
[i16]
Jiatong Shi, Shuai Guo, Tao Qian, Nan Huo, Tomoki Hayashi, Yuning Wu, Frank Xu, Xuankai Chang, Huazhe Li, Peter Wu, Shinji Watanabe, Qin Jin:
Muskits: an End-to-End Music Processing Toolkit for Singing Voice Synthesis. CoRR abs/2205.04029 (2022)
[i15]
Jiatong Shi, George Saon, David Haws, Shinji Watanabe, Brian Kingsbury:
VQ-T: RNN Transducers using Vector-Quantized Prediction Network States. CoRR abs/2208.01818 (2022)
[i14]
Yen Meng, Hsuan-Jui Chen, Jiatong Shi, Shinji Watanabe, Paola García, Hung-yi Lee, Hao Tang:
On Compressing Sequences for Self-Supervised Speech Models. CoRR abs/2210.07189 (2022)
[i13]
Tzu-hsun Feng, Shuyan Annie Dong, Ching-Feng Yeh, Shu-Wen Yang, Tzu-Quan Lin, Jiatong Shi, Kai-Wei Chang, Zili Huang, Haibin Wu, Xuankai Chang, Shinji Watanabe, Abdelrahman Mohamed, Shang-Wen Li, Hung-yi Lee:
SUPERB @ SLT 2022: Challenge on Generalization and Efficiency of Self-Supervised Speech Representation Learning. CoRR abs/2210.08634 (2022)
[i12]
Jiatong Shi, Chan-Jan Hsu, Ho-Lam Chung, Dongji Gao, Paola García, Shinji Watanabe, Ann Lee, Hung-yi Lee:
Bridging Speech and Textual Pre-trained Models with Unsupervised ASR. CoRR abs/2211.03025 (2022)
[i11]
Dongji Gao, Jiatong Shi, Shun-Po Chuang, Leibny Paola García, Hung-yi Lee, Shinji Watanabe, Sanjeev Khudanpur:
EURO: ESPnet Unsupervised ASR Open-source Toolkit. CoRR abs/2211.17196 (2022)
[i10]
Yui Sudo, Muhammad Shakeel, Brian Yan, Jiatong Shi, Shinji Watanabe:
4D ASR: Joint modeling of CTC, Attention, Transducer, and Mask-Predict decoders. CoRR abs/2212.10818 (2022)
2021
[j1]
Jiatong Shi, Kunlin Yang, Wei Xu, Mingming Wang:
Leveraging deep learning with audio analytics to predict the success of crowdfunding projects. J. Supercomput. 77(7): 7833-7853 (2021)
[c11]
Peter Wu, Paul Pu Liang, Jiatong Shi, Ruslan Salakhutdinov, Shinji Watanabe, Louis-Philippe Morency:
Understanding the Tradeoffs in Client-side Privacy for Downstream Speech Tasks. APSIPA ASC 2021: 841-848
[c10]
Peter Wu, Jiatong Shi, Yifan Zhong, Shinji Watanabe, Alan W. Black:
Cross-Lingual Transfer for Speech Processing Using Acoustic Language Similarity. ASRU 2021: 1050-1057
[c9]
Jiatong Shi, Jonathan D. Amith, Rey Castillo García, Esteban Guadalupe Sierra, Kevin Duh, Shinji Watanabe:
Leveraging End-to-End ASR for Endangered Language Documentation: An Empirical Study on Yolóxochitl Mixtec. EACL 2021: 1134-1145
[c8]
Jiatong Shi, Shuai Guo, Nan Huo, Yuekai Zhang, Qin Jin:
Sequence-To-Sequence Singing Voice Synthesis With Perceptual Entropy Loss. ICASSP 2021: 76-80
[c7]
Pengcheng Guo, Florian Boyer, Xuankai Chang, Tomoki Hayashi, Yosuke Higuchi, Hirofumi Inaguma, Naoyuki Kamo, Chenda Li, Daniel Garcia-Romero, Jiatong Shi, Jing Shi, Shinji Watanabe, Kun Wei, Wangyou Zhang, Yuekai Zhang:
Recent Developments on ESPnet Toolkit Boosted By Conformer. ICASSP 2021: 5874-5878
[c6]
Jiatong Shi, Chunlei Zhang, Chao Weng, Shinji Watanabe, Meng Yu, Dong Yu:
Improving RNN Transducer with Target Speaker Extraction and Neural Uncertainty Estimation. ICASSP 2021: 6908-6912
[c5]
Shu-Wen Yang, Po-Han Chi, Yung-Sung Chuang, Cheng-I Jeff Lai, Kushal Lakhotia, Yist Y. Lin, Andy T. Liu, Jiatong Shi, Xuankai Chang, Guan-Ting Lin, Tzu-Hsien Huang, Wei-Cheng Tseng, Ko-tik Lee, Da-Rong Liu, Zili Huang, Shuyan Dong, Shang-Wen Li, Shinji Watanabe, Abdelrahman Mohamed, Hung-yi Lee:
SUPERB: Speech Processing Universal PERformance Benchmark. Interspeech 2021: 1194-1198
[c4]
Hirofumi Inaguma, Brian Yan, Siddharth Dalmia, Pengcheng Guo, Jiatong Shi, Kevin Duh, Shinji Watanabe:
ESPnet-ST IWSLT 2021 Offline Speech Translation System. IWSLT 2021: 100-109
[i9]
Jiatong Shi, Jonathan D. Amith, Rey Castillo García, Esteban Guadalupe Sierra, Kevin Duh, Shinji Watanabe:
Leveraging End-to-End ASR for Endangered Language Documentation: An Empirical Study on Yoloxóchitl Mixtec. CoRR abs/2101.10877 (2021)
[i8]
Shu-Wen Yang, Po-Han Chi, Yung-Sung Chuang, Cheng-I Jeff Lai, Kushal Lakhotia, Yist Y. Lin, Andy T. Liu, Jiatong Shi, Xuankai Chang, Guan-Ting Lin, Tzu-Hsien Huang, Wei-Cheng Tseng, Ko-tik Lee, Da-Rong Liu, Zili Huang, Shuyan Dong, Shang-Wen Li, Shinji Watanabe, Abdelrahman Mohamed, Hung-yi Lee:
SUPERB: Speech processing Universal PERformance Benchmark. CoRR abs/2105.01051 (2021)
[i7]
Hirofumi Inaguma, Brian Yan, Siddharth Dalmia, Pengcheng Guo, Jiatong Shi, Kevin Duh, Shinji Watanabe:
ESPnet-ST IWSLT 2021 Offline Speech Translation System. CoRR abs/2107.00636 (2021)
[i6]
Tomoki Hayashi, Ryuichi Yamamoto, Takenori Yoshimura, Peter Wu, Jiatong Shi, Takaaki Saeki, Yooncheol Ju, Yusuke Yasuda, Shinnosuke Takamichi, Shinji Watanabe:
ESPnet2-TTS: Extending the Edge of TTS Research. CoRR abs/2110.07840 (2021)
[i5]
Peter Wu, Jiatong Shi, Yifan Zhong, Shinji Watanabe, Alan W. Black:
Cross-lingual Transfer for Speech Processing using Acoustic Language Similarity. CoRR abs/2111.01326 (2021)
2020
[c3]
Wenxin Hou, Yue Dong, Bairong Zhuang, Longfei Yang, Jiatong Shi, Takahiro Shinozaki:
Large-Scale End-to-End Multilingual Speech Recognition and Language Identification with Multi-Task Learning. INTERSPEECH 2020: 1037-1041
[c2]
Jiatong Shi, Nan Huo, Qin Jin:
Context-Aware Goodness of Pronunciation for Computer-Assisted Pronunciation Training. INTERSPEECH 2020: 3057-3061
[i4]
Jiatong Shi, Nan Huo, Qin Jin:
Context-aware Goodness of Pronunciation for Computer-Assisted Pronunciation Training. CoRR abs/2008.08647 (2020)
[i3]
Jiatong Shi, Shuai Guo, Nan Huo, Yuekai Zhang, Qin Jin:
Sequence-to-sequence Singing Voice Synthesis with Perceptual Entropy Loss. CoRR abs/2010.12024 (2020)
[i2]
Pengcheng Guo, Florian Boyer, Xuankai Chang, Tomoki Hayashi, Yosuke Higuchi, Hirofumi Inaguma, Naoyuki Kamo, Chenda Li, Daniel Garcia-Romero, Jiatong Shi, Jing Shi, Shinji Watanabe, Kun Wei, Wangyou Zhang, Yuekai Zhang:
Recent Developments on ESPnet Toolkit Boosted by Conformer. CoRR abs/2010.13956 (2020)
[i1]
Jiatong Shi, Chunlei Zhang, Chao Weng, Shinji Watanabe, Meng Yu, Dong Yu:
Improving RNN Transducer With Target Speaker Extraction and Neural Uncertainty Estimation. CoRR abs/2011.13393 (2020)

2010 – 2019

2018
[c1]
Jiatong Shi, Wei Du, Wei Xu:
Identifying Impact Factors of Question Quality in Online Health Q&A Communities: an Empirical Analysis on MedHelp. PACIS 2018: 173