9th SSW 2016: Sunnyvale, CA, USA
- Alan W. Black:
The 9th ISCA Speech Synthesis Workshop, SSW 2016, Sunnyvale, CA, USA, September 13-15, 2016. ISCA 2016
Keynote Session 1
- Oriol Guasch:
Large-scale finite element simulations of the physics of voice.
Oral Session 1: Prosody
- Mahsa Sadat Elyasi Langarani, Jan P. H. van Santen:
Automatic, model-based detection of pause-less phrase boundaries from fundamental frequency and duration features. 1-6
- Rasmus Dall, Marcus Tomalin, Mirjam Wester:
Synthesising Filled Pauses: Representation and Datamixing. 7-13
- Pierre-Edouard Honnet, Philip N. Garner:
Emphasis recreation for TTS using intonation atoms. 14-20
- Eva Vanmassenhove, João P. Cabral, Fasih Haider:
Prediction of Emotions from Text using Sentiment Analysis for Expressive Speech Synthesis. 21-26
Poster Session 1
- Yasuhiro Hamada, Nobutaka Ono, Shigeki Sagayama:
Non-filter waveform generation from cepstrum using spectral phase reconstruction. 27-31
- Alexandros Lazaridis, Milos Cernak, Pierre-Edouard Honnet, Philip N. Garner:
Investigating Spectral Amplitude Modulation Phase Hierarchy Features in Speech Synthesis. 32-37
- Mirjam Wester, Zhizheng Wu, Junichi Yamagishi:
Multidimensional scaling of systems in the Voice Conversion Challenge 2016. 38-43
- Dong-Yan Huang, Lei Xie, Yvonne Siu Wa Lee, Jie Wu, Huaiping Ming, Xiaohai Tian, Shaofei Zhang, Chuang Ding, Mei Li, Nguyen Quy Hy, Minghui Dong, Haizhou Li:
An Automatic Voice Conversion Evaluation Strategy Based on Perceptual Background Noise Distortion and Speaker Similarity. 44-51
- Yusuke Tajiri, Tomoki Toda:
Nonaudible murmur enhancement based on statistical voice conversion and noise suppression with external noise monitoring. 52-58
- Igor Jauk, Antonio Bonafonte:
Prosodic and Spectral iVectors for Expressive Speech Synthesis. 59-63
- Michael Pucher, Fernando Villavicencio, Junichi Yamagishi:
Development of a statistical parametric synthesis system for operatic singing in German. 64-69
- Srikanth Ronanki, Siva Reddy Gangireddy, Bajibabu Bollepalli, Simon King:
DNN-based Speech Synthesis for Indian Languages from ASCII text. 70-75
- Sunayana Sitaram, Sai Krishna Rallabandi, Shruti Rijhwani, Alan W. Black:
Experiments with Cross-lingual Systems for Synthesis of Code-Mixed Text. 76-81
- Avni Rajpal, Hemant A. Patil:
Jerk Minimization for Acoustic-To-Articulatory Inversion. 82-87
- Sunhee Kim:
How to select a good voice for TTS. 88-92
- John Andersson, Sebastian Berlin, André Costa, Harald Berthelsen, Hanna Lindgren, Nikolaj Lindberg, Jonas Beskow, Jens Edlund, Joakim Gustafson:
WikiSpeech - enabling open source text-to-speech for Wikipedia. 93-99
Keynote Session 2
- Alex Acero:
Siri's voice gets deep learning.
Oral Session 2: Deep Learning in Speech Synthesis
- Manuel Sam Ribeiro, Oliver Watts, Junichi Yamagishi:
Parallel and cascaded deep neural networks for text-to-speech synthesis. 100-105
- Keiichi Tokuda, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku:
Temporal modeling in neural network based statistical parametric speech synthesis. 106-111
- Santiago Pascual, Antonio Bonafonte:
Multi-output RNN-LSTM for multiple speaker speech synthesis with α-interpolation model. 112-117
- Xin Wang, Shinji Takaki, Junichi Yamagishi:
A Comparative Study of the Performance of HMM, DNN, and RNN based Speech Synthesis Systems Trained on Very Large Speaker-Dependent Corpora. 118-121
Demo Session
- Nobuaki Minematsu, Daisuke Saito, Nobuyuki Nishizawa:
Prosodic Reading Tutor of Japanese, Suzuki-kun: The first and only educational tool to teach the formal Japanese. 122
- Hideki Kawahara:
Aliasing-free L-F model and its application to an interactive MATLAB tool and test signal generation for speech analysis procedures. 123
- Srikanth Ronanki, Zhizheng Wu, Oliver Watts, Simon King:
A Demonstration of the Merlin Open Source Neural Network Speech Synthesis System. 124
- Aäron van den Oord, Sander Dieleman, Heiga Zen, Karen Simonyan, Oriol Vinyals, Alex Graves, Nal Kalchbrenner, Andrew W. Senior, Koray Kavukcuoglu:
WaveNet: A Generative Model for Raw Audio. 125
- Blaise Potard, Matthew P. Aylett, David A. Baude:
Demo of Idlak Tangle, An Open Source DNN-Based Parametric Speech Synthesiser. 126
Poster Session 2
- Meet H. Soni, Hemant A. Patil:
Non-intrusive Quality Assessment of Synthesized Speech using Spectral Features and Support Vector Regression. 127-133
- Sushant V. Rao, Nirmesh J. Shah, Hemant A. Patil:
Novel Pre-processing using Outlier Removal in Voice Conversion. 134-139
- Zhaojie Luo, Tetsuya Takiguchi, Yasuo Ariki:
Emotional Voice Conversion Using Neural Networks with Different Temporal Scales of F0 based on Wavelet Transform. 140-145
- Cassia Valentini-Botinhao, Xin Wang, Shinji Takaki, Junichi Yamagishi:
Investigating RNN-based speech enhancement methods for noise-robust Text-to-Speech. 146-152
- Shinji Takaki, Sangjin Kim, Junichi Yamagishi:
Speaker Adaptation of Various Components in Deep Neural Network based Speech Synthesis. 153-159
- Zhengchen Zhang, Fuxiang Wu, Chenyu Yang, Minghui Dong, Fugen Zhou:
Mandarin Prosodic Phrase Prediction based on Syntactic Trees. 160-165
- Xin Wang, Shinji Takaki, Junichi Yamagishi:
Investigating Very Deep Highway Networks for Parametric Speech Synthesis. 166-171
- Sivanand Achanta, Rambabu Banoth, Ayushi Pandey, Anandaswarup Vadapalli, Suryakanth V. Gangashetty:
Contextual Representation using Recurrent Neural Network Hidden State for Statistical Parametric Speech Synthesis. 172-177
- Nobuyuki Nishizawa, Tomonori Yazaki:
Wide Passband Design for Cosine-Modulated Filter Banks in Sinusoidal Speech Synthesis. 178-183
- Pallavi Baljekar, Alan W. Black:
Utterance Selection Techniques for TTS Systems Using Found Speech. 184-189
- Andrew Wilkinson, Alok Parlikar, Sunayana Sitaram, Tim White, Alan W. Black, Suresh Bazaj:
Open-Source Consumer-Grade Indic Text To Speech. 190-195
- Mei Li, Zhizheng Wu, Lei Xie:
On the impact of phoneme alignment in DNN-based speech synthesis. 196-201
- Zhizheng Wu, Oliver Watts, Simon King:
Merlin: An Open Source Neural Network Speech Synthesis System. 202-207
Keynote Session 3
- Quoc V. Le:
End-to-end Learning for Text and Speech.
Oral Session 3: Analysis and Modeling for Speech Synthesis
- Jonas Beskow, Harald Berthelsen:
A hybrid harmonics-and-bursts modelling approach to speech synthesis. 208-213
- Gilles Degottex, Pierre Lanchantin, Mark J. F. Gales:
A Pulse Model in Log-domain for a Uniform Synthesizer. 214-220
- Hideki Kawahara, Yannis Agiomyrgiannakis, Heiga Zen:
Using instantaneous frequency and aperiodicity detection to estimate F0 for high-quality speech synthesis. 221-228
- Slava Shechtman, Alexander Sorin:
Wideband Harmonic Model: Alignment and Noise Modeling for High Quality Speech Synthesis. 229-234