Xixin Wu
Journal Articles
- 2024
- [j11] Xiaohan Feng, Xixin Wu, Helen Meng: Injecting Linguistic Knowledge Into BERT for Dialogue State Tracking. IEEE Access 12: 93761-93770 (2024)
- [j10] Zihao Yang, Xixin Wu, Xindang He, Xiaofei Guan: A multiscale analysis-assisted two-stage reduced-order deep learning approach for effective thermal conductivity of arbitrary contrast heterogeneous materials. Eng. Appl. Artif. Intell. 136: 108916 (2024)
- 2023
- [j9] Wen Wu, Chao Zhang, Xixin Wu, Philip C. Woodland: Estimating the Uncertainty in Emotion Class Labels With Utterance-Specific Dirichlet Priors. IEEE Trans. Affect. Comput. 14(4): 2810-2822 (2023)
- [j8] Haohan Guo, Fenglong Xie, Xixin Wu, Frank K. Soong, Helen Meng: MSMC-TTS: Multi-Stage Multi-Codebook VQ-VAE Based Neural TTS. IEEE ACM Trans. Audio Speech Lang. Process. 31: 1811-1824 (2023)
- [j7] Shun Lei, Yixuan Zhou, Liyang Chen, Zhiyong Wu, Xixin Wu, Shiyin Kang, Helen Meng: MSStyleTTS: Multi-Scale Style Modeling With Hierarchical Context Information for Expressive Speech Synthesis. IEEE ACM Trans. Audio Speech Lang. Process. 31: 3290-3303 (2023)
- [j6] Xixin Wu, Hui Lu, Kun Li, Zhiyong Wu, Xunying Liu, Helen Meng: Hiformer: Sequence Modeling Networks With Hierarchical Attention Mechanisms. IEEE ACM Trans. Audio Speech Lang. Process. 31: 3993-4003 (2023)
- 2021
- [j5] Xixin Wu, Yuewen Cao, Hui Lu, Songxiang Liu, Shiyin Kang, Zhiyong Wu, Xunying Liu, Helen Meng: Exemplar-Based Emotive Speech Synthesis. IEEE ACM Trans. Audio Speech Lang. Process. 29: 874-886 (2021)
- [j4] Songxiang Liu, Yuewen Cao, Disong Wang, Xixin Wu, Xunying Liu, Helen Meng: Any-to-Many Voice Conversion With Location-Relative Sequence-to-Sequence Modeling. IEEE ACM Trans. Audio Speech Lang. Process. 29: 1717-1728 (2021)
- [j3] Xixin Wu, Yuewen Cao, Hui Lu, Songxiang Liu, Disong Wang, Zhiyong Wu, Xunying Liu, Helen Meng: Speech Emotion Recognition Using Sequential Capsule Networks. IEEE ACM Trans. Audio Speech Lang. Process. 29: 3280-3291 (2021)
- 2017
- [j2] Kun Li, Xixin Wu, Helen M. Meng: Intonation classification for L2 English speech using multi-distribution deep neural networks. Comput. Speech Lang. 43: 18-33 (2017)
- 2015
- [j1] Zhiyong Wu, Kai Zhao, Xixin Wu, Xinyu Lan, Helen Meng: Acoustic to articulatory mapping with deep neural network. Multim. Tools Appl. 74(22): 9889-9907 (2015)
Conference and Workshop Papers
- 2024
- [c74] Boshi Tang, Zhiyong Wu, Xixin Wu, Qiaochu Huang, Jun Chen, Shun Lei, Helen Meng: SimCalib: Graph Neural Network Calibration Based on Similarity between Nodes. AAAI 2024: 15267-15275
- [c73] Hui Lu, Xixin Wu, Haohan Guo, Songxiang Liu, Zhiyong Wu, Helen Meng: Unifying One-Shot Voice Conversion and Cloning with Disentangled Speech Representations. ICASSP 2024: 11141-11145
- [c72] Jiawen Kang, Lingwei Meng, Mingyu Cui, Haohan Guo, Xixin Wu, Xunying Liu, Helen Meng: Cross-Speaker Encoding Network for Multi-Talker Speech Recognition. ICASSP 2024: 11986-11990
- [c71] Yuejiao Wang, Xixin Wu, Disong Wang, Lingwei Meng, Helen Meng: UNIT-DSR: Dysarthric Speech Reconstruction System Using Speech Unit Normalization. ICASSP 2024: 12306-12310
- [c70] Xueyuan Chen, Xi Wang, Shaofei Zhang, Lei He, Zhiyong Wu, Xixin Wu, Helen Meng: Stylespeech: Self-Supervised Style Enhancing with VQ-VAE-Based Pre-Training for Expressive Audiobook Speech Synthesis. ICASSP 2024: 12316-12320
- [c69] Xueyuan Chen, Yuejiao Wang, Xixin Wu, Disong Wang, Zhiyong Wu, Xunying Liu, Helen Meng: Exploiting Audio-Visual Features with Pretrained AV-HuBERT for Multi-Modal Dysarthric Speech Reconstruction. ICASSP 2024: 12341-12345
- [c68] Shun Lei, Yixuan Zhou, Liyang Chen, Dan Luo, Zhiyong Wu, Xixin Wu, Shiyin Kang, Tao Jiang, Yahui Zhou, Yuxing Han, Helen Meng: Improving Language Model-Based Zero-Shot Text-to-Speech Synthesis with Multi-Scale Acoustic Prompts. ICASSP 2024: 12662-12666
- [c67] Dongchao Yang, Jinchuan Tian, Xu Tan, Rongjie Huang, Songxiang Liu, Haohan Guo, Xuankai Chang, Jiatong Shi, Sheng Zhao, Jiang Bian, Zhou Zhao, Xixin Wu, Helen M. Meng: UniAudio: Towards Universal Audio Generation with Large Language Models. ICML 2024
- [c66] Wenxuan Wu, Xueyuan Chen, Xixin Wu, Haizhou Li, Helen Meng: Target Speech Extraction with Pre-trained AV-HuBERT and Mask-And-Recover Strategy. IJCNN 2024: 1-8
- [c65] Jingyan Zhou, Minda Hu, Junan Li, Xiaoying Zhang, Xixin Wu, Irwin King, Helen Meng: Rethinking Machine Ethics - Can LLMs Perform Moral Reasoning through the Lens of Moral Theories? NAACL-HLT (Findings) 2024: 2227-2242
- [c64] Tianhua Zhang, Jiaxin Ge, Hongyin Luo, Yung-Sung Chuang, Mingye Gao, Yuan Gong, Yoon Kim, Xixin Wu, Helen Meng, Jim Glass: Natural Language Embedded Programs for Hybrid Language Symbolic Reasoning. NAACL-HLT (Findings) 2024: 4131-4155
- 2023
- [c63] Hongyin Luo, Tianhua Zhang, Yung-Sung Chuang, Yuan Gong, Yoon Kim, Xixin Wu, Helen Meng, James R. Glass: Search Augmented Instruction Learning. EMNLP (Findings) 2023: 3717-3729
- [c62] Jinchao Li, Kaitao Song, Junan Li, Bo Zheng, Dongsheng Li, Xixin Wu, Xunying Liu, Helen Meng: Leveraging Pretrained Representations With Task-Related Keywords for Alzheimer's Disease Detection. ICASSP 2023: 1-5
- [c61] Jinchao Li, Xixin Wu, Kaitao Song, Dongsheng Li, Xunying Liu, Helen Meng: A Hierarchical Regression Chain Framework for Affective Vocal Burst Recognition. ICASSP 2023: 1-5
- [c60] Yuhao Liu, Cheng Gong, Longbiao Wang, Xixin Wu, Qiuyu Liu, Jianwu Dang: VF-Taco2: Towards Fast and Lightweight Synthesis for Autoregressive Models with Variation Autoencoder and Feature Distillation. ICASSP 2023: 1-5
- [c59] Lingwei Meng, Jiawen Kang, Mingyu Cui, Yuejiao Wang, Xixin Wu, Helen Meng: A Sidecar Separator Can Convert A Single-Talker Speech Recognition System to A Multi-Talker One. ICASSP 2023: 1-5
- [c58] Helen Meng, Brian Mak, Man-Wai Mak, Helene H. Fung, Xianmin Gong, Timothy C. Y. Kwok, Xunying Liu, Vincent C. T. Mok, Patrick C. M. Wong, Jean Woo, Xixin Wu, Ka Ho Wong, Sean Shensheng Xu, Naijun Zheng, Ranzo Huang, Jiawen Kang, Xiaoquan Ke, Junan Li, Jinchao Li, Yi Wang: Integrated and Enhanced Pipeline System to Support Spoken Language Analytics for Screening Neurocognitive Disorders. INTERSPEECH 2023: 1713-1717
- [c57] Yunxiang Li, Pengfei Liu, Xixin Wu, Helen Meng: PunCantonese: A Benchmark Corpus for Low-Resource Cantonese Punctuation Restoration from Speech Transcripts. INTERSPEECH 2023: 2183-2187
- [c56] Lingwei Meng, Jiawen Kang, Mingyu Cui, Haibin Wu, Xixin Wu, Helen Meng: Unified Modeling of Multi-Talker Overlapped Speech Recognition and Diarization with a Sidecar Separator. INTERSPEECH 2023: 3467-3471
- [c55] Hui Lu, Xixin Wu, Zhiyong Wu, Helen Meng: SpeechTripleNet: End-to-End Disentangled Speech Representation Learning for Content, Timbre and Prosody. ACM Multimedia 2023: 2829-2837
- 2022
- [c54] Kun Li, Tianhua Zhang, Liping Tang, Junan Li, Hongyuan Lu, Xixin Wu, Helen Meng: Grounded Dialogue Generation with Cross-encoding Re-ranker, Grounding Span Prediction, and Passage Dropout. DialDoc@ACL 2022: 123-129
- [c53] Haibin Wu, Bo Zheng, Xu Li, Xixin Wu, Hung-Yi Lee, Helen Meng: Characterizing the Adversarial Vulnerability of Speech self-Supervised Learning. ICASSP 2022: 3164-3168
- [c52] Disong Wang, Songxiang Liu, Xixin Wu, Hui Lu, Lifa Sun, Xunying Liu, Helen Meng: Speaker Identity Preservation in Dysarthric Speech Reconstruction by Adversarial Speaker Adaptation. ICASSP 2022: 6677-6681
- [c51] Xixin Wu, Shoukang Hu, Zhiyong Wu, Xunying Liu, Helen Meng: Neural Architecture Search for Speech Emotion Recognition. ICASSP 2022: 6902-6906
- [c50] Hang Su, Danyang Zhao, Long Dang, Minglei Li, Xixin Wu, Xunying Liu, Helen Meng: A Multitask Learning Framework for Speaker Change Detection with Content Information from Unsupervised Speech Decomposition. ICASSP 2022: 8087-8091
- [c49] Naijun Zheng, Na Li, Xixin Wu, Lingwei Meng, Jiawen Kang, Haibin Wu, Chao Weng, Dan Su, Helen Meng: The CUHK-Tencent Speaker Diarization System for the ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Challenge. ICASSP 2022: 9161-9165
- [c48] Jie Chen, Changhe Song, Deyi Tuo, Xixin Wu, Shiyin Kang, Zhiyong Wu, Helen Meng: Improving Mandarin Prosodic Structure Prediction with Multi-level Contextual Information. INTERSPEECH 2022: 426-430
- [c47] Haohan Guo, Hui Lu, Xixin Wu, Helen Meng: A Multi-Scale Time-Frequency Spectrogram Discriminator for GAN-based Non-Autoregressive TTS. INTERSPEECH 2022: 1566-1570
- [c46] Haohan Guo, Feng-Long Xie, Frank K. Soong, Xixin Wu, Helen Meng: A Multi-Stage Multi-Codebook VQ-VAE Approach to High-Performance Neural TTS. INTERSPEECH 2022: 1611-1615
- [c45] Yi Wang, Tianzi Wang, Zi Ye, Lingwei Meng, Shoukang Hu, Xixin Wu, Xunying Liu, Helen Meng: Exploring linguistic feature and model combination for speech recognition based automatic AD detection. INTERSPEECH 2022: 3328-3332
- [c44] Haibin Wu, Lingwei Meng, Jiawen Kang, Jinchao Li, Xu Li, Xixin Wu, Hung-yi Lee, Helen Meng: Spoofing-Aware Speaker Verification by Multi-Level Fusion. INTERSPEECH 2022: 4357-4361
- [c43] HoLam Chung, Junan Li, Pengfei Liu, Wai-Kim Leung, Xixin Wu, Helen Meng: Improving Rare Words Recognition through Homophone Extension and Unified Writing for Low-resource Cantonese Speech Recognition. ISCSLP 2022: 26-30
- [c42] Xueyuan Chen, Qiaochu Huang, Xixin Wu, Zhiyong Wu, Helen Meng: HILvoice: Human-in-the-Loop Style Selection for Elder-Facing Speech Synthesis. ISCSLP 2022: 86-90
- [c41] Jingbei Li, Yi Meng, Xixin Wu, Zhiyong Wu, Jia Jia, Helen Meng, Qiao Tian, Yuping Wang, Yuxuan Wang: Inferring Speaking Styles from Multi-modal Conversational Context by Multi-scale Relational Graph Convolutional Networks. ACM Multimedia 2022: 5811-5820
- [c40] Haibin Wu, Jiawen Kang, Lingwei Meng, Yang Zhang, Xixin Wu, Zhiyong Wu, Hung-yi Lee, Helen Meng: Tackling Spoofing-Aware Speaker Verification with Multi-Model Fusion. Odyssey 2022: 92-99
- [c39] Jixiu Li, Yisen Huang, Wing Yin Ng, Truman Cheng, Xixin Wu, Qi Dou, Helen Meng, Pheng-Ann Heng, Yunhui Liu, Shannon Melissa Chan, David Navarro-Alarcon, Calvin Sze Hang Ng, Philip Wai Yan Chiu, Zheng Li: Speech-Vision Based Multi-Modal AI Control of a Magnetic Anchored and Actuated Endoscope. ROBIO 2022: 403-408
- [c38] Hui Lu, Disong Wang, Xixin Wu, Zhiyong Wu, Xunying Liu, Helen Meng: Disentangled Speech Representation Learning for One-Shot Cross-Lingual Voice Conversion Using β-VAE. SLT 2022: 814-821
- 2021
- [c37] Qingyun Dou, Xixin Wu, Moquan Wan, Yiting Lu, Mark J. F. Gales: Deliberation-Based Multi-Pass Speech Synthesis. Interspeech 2021: 136-140
- [c36] Hui Lu, Zhiyong Wu, Xixin Wu, Xu Li, Shiyin Kang, Xunying Liu, Helen Meng: VAENAR-TTS: Variational Auto-Encoder Based Non-AutoRegressive Text-to-Speech Synthesis. Interspeech 2021: 3775-3779
- [c35] Xu Li, Xixin Wu, Hui Lu, Xunying Liu, Helen Meng: Channel-Wise Gated Res2Net: Towards Robust Detection of Synthetic Speech Attacks. Interspeech 2021: 4314-4318
- [c34] Disong Wang, Songxiang Liu, Lifa Sun, Xixin Wu, Xunying Liu, Helen Meng: Learning Explicit Prosody Models and Deep Speaker Embeddings for Atypical Voice Conversion. Interspeech 2021: 4813-4817
- [c33] Disong Wang, Jianwei Yu, Xixin Wu, Lifa Sun, Xunying Liu, Helen Meng: Improved End-to-End Dysarthric Speech Recognition via Meta-learning Based Model Re-initialization. ISCSLP 2021: 1-5
- 2020
- [c32] Songxiang Liu, Disong Wang, Yuewen Cao, Lifa Sun, Xixin Wu, Shiyin Kang, Zhiyong Wu, Xunying Liu, Dan Su, Dong Yu, Helen Meng: End-To-End Accent Conversion Without Using Native Utterances. ICASSP 2020: 6289-6293
- [c31] Xu Li, Jinghua Zhong, Xixin Wu, Jianwei Yu, Xunying Liu, Helen Meng: Adversarial Attacks on GMM I-Vector Based Speaker Verification Systems. ICASSP 2020: 6579-6583
- [c30] Yuewen Cao, Songxiang Liu, Xixin Wu, Shiyin Kang, Peng Liu, Zhiyong Wu, Xunying Liu, Dan Su, Dong Yu, Helen Meng: Code-Switched Speech Synthesis Using Bilingual Phonetic Posteriorgram with Only Monolingual Corpora. ICASSP 2020: 7619-7623
- [c29] Disong Wang, Jianwei Yu, Xixin Wu, Songxiang Liu, Lifa Sun, Xunying Liu, Helen Meng: End-To-End Voice Conversion Via Cross-Modal Knowledge Distillation for Dysarthric Speech Reconstruction. ICASSP 2020: 7744-7748
- [c28] Kate M. Knill, Linlin Wang, Yu Wang, Xixin Wu, Mark J. F. Gales: Non-Native Children's Automatic Speech Recognition: The INTERSPEECH 2020 Shared Task ALTA Systems. INTERSPEECH 2020: 255-259
- [c27] Xu Li, Na Li, Jinghua Zhong, Xixin Wu, Xunying Liu, Dan Su, Dong Yu, Helen Meng: Investigating Robustness of Adversarial Samples Detection for Automatic Speaker Verification. INTERSPEECH 2020: 1540-1544
- [c26] Naijun Zheng, Xixin Wu, Jinghua Zhong, Xunying Liu, Helen Meng: Speaker-Aware Linear Discriminant Analysis in Speaker Verification. INTERSPEECH 2020: 3012-3016
- [c25] Xixin Wu, Kate M. Knill, Mark J. F. Gales, Andrey Malinin: Ensemble Approaches for Uncertainty in Spoken Language Assessment. INTERSPEECH 2020: 3860-3864
- [c24] Xu Li, Jinghua Zhong, Jianwei Yu, Shoukang Hu, Xixin Wu, Xunying Liu, Helen Meng: Bayesian x-vector: Bayesian Neural Network based x-vector System for Speaker Verification. Odyssey 2020: 365-371
- 2019
- [c23] Ming Liao, Jing Li, Haisong Zhang, Lingzhi Wang, Xixin Wu, Kam-Fai Wong: Coupling Global and Local Context for Unsupervised Aspect Extraction. EMNLP/IJCNLP (1) 2019: 4578-4588
- [c22] Shoukang Hu, Max W. Y. Lam, Xurong Xie, Shansong Liu, Jianwei Yu, Xixin Wu, Xunying Liu, Helen Meng: Bayesian and Gaussian Process Neural Networks for Large Vocabulary Continuous Speech Recognition. ICASSP 2019: 6555-6559
- [c21] Xixin Wu, Songxiang Liu, Yuewen Cao, Xu Li, Jianwei Yu, Dongyang Dai, Xi Ma, Shoukang Hu, Zhiyong Wu, Xunying Liu, Helen Meng: Speech Emotion Recognition Using Capsule Networks. ICASSP 2019: 6695-6699
- [c20] Yuewen Cao, Xixin Wu, Songxiang Liu, Jianwei Yu, Xu Li, Zhiyong Wu, Xunying Liu, Helen Meng: End-to-end Code-switched TTS with Mix of Monolingual Recordings. ICASSP 2019: 6935-6939
- [c19] Mu Wang, Xixin Wu, Zhiyong Wu, Shiyin Kang, Deyi Tuo, Guangzhi Li, Dan Su, Dong Yu, Helen Meng: Quasi-fully Convolutional Neural Network with Variational Inference for Speech Synthesis. ICASSP 2019: 7060-7064
- [c18] Jianwei Yu, Max W. Y. Lam, Xie Chen, Shoukang Hu, Songxiang Liu, Xixin Wu, Xunying Liu, Helen Meng: Recurrent Neural Network Language Model Training Using Natural Gradient. ICASSP 2019: 7260-7264
- [c17] Dongyang Dai, Zhiyong Wu, Runnan Li, Xixin Wu, Jia Jia, Helen Meng: Learning Discriminative Features from Spectrograms Using Center Loss for Speech Emotion Recognition. ICASSP 2019: 7405-7409
- [c16] Songxiang Liu, Yuewen Cao, Xixin Wu, Lifa Sun, Xunying Liu, Helen Meng: Jointly Trained Conversion Model and WaveNet Vocoder for Non-Parallel Voice Conversion Using Mel-Spectrograms and Phonetic Posteriorgrams. INTERSPEECH 2019: 714-718
- [c15] Dongyang Dai, Zhiyong Wu, Shiyin Kang, Xixin Wu, Jia Jia, Dan Su, Dong Yu, Helen Meng: Disambiguation of Chinese Polyphones in an End-to-End Framework with Semantic Features Extracted by Pre-Trained BERT. INTERSPEECH 2019: 2090-2094
- [c14] Shoukang Hu, Xurong Xie, Shansong Liu, Max W. Y. Lam, Jianwei Yu, Xixin Wu, Xunying Liu, Helen Meng: LF-MMI Training of Bayesian and Gaussian Process Time Delay Neural Networks for Speech Recognition. INTERSPEECH 2019: 2793-2797
- [c13] Hang Su, Borislav Dzodzo, Xixin Wu, Xunying Liu, Helen Meng: Unsupervised Methods for Audio Classification from Lecture Discussion Recordings. INTERSPEECH 2019: 3347-3351
- [c12] Jianwei Yu, Max W. Y. Lam, Shoukang Hu, Xixin Wu, Xu Li, Yuewen Cao, Xunying Liu, Helen Meng: Comparative Study of Parametric and Representation Uncertainty Modeling for Recurrent Neural Network Language Models. INTERSPEECH 2019: 3510-3514
- 2018
- [c11] Xixin Wu, Lifa Sun, Shiyin Kang, Songxiang Liu, Zhiyong Wu, Xunying Liu, Helen Meng: Feature Based Adaptation for Speaking Style Synthesis. ICASSP 2018: 5304-5308
- [c10] Shaoguang Mao, Zhiyong Wu, Xu Li, Runnan Li, Xixin Wu, Helen Meng: Integrating Articulatory Features into Acoustic-Phonemic Model for Mispronunciation Detection and Diagnosis in L2 English Speech. ICME 2018: 1-6
- [c9] Songxiang Liu, Jinghua Zhong, Lifa Sun, Xixin Wu, Xunying Liu, Helen Meng: Voice Conversion Across Arbitrary Speakers Based on a Single Target-Speaker Utterance. INTERSPEECH 2018: 496-500
- [c8] Xu Li, Shaoguang Mao, Xixin Wu, Kun Li, Xunying Liu, Helen Meng: Unsupervised Discovery of Non-native Phonetic Patterns in L2 English Speech for Mispronunciation Detection and Diagnosis. INTERSPEECH 2018: 2554-2558
- [c7] Jianwei Yu, Xurong Xie, Shansong Liu, Shoukang Hu, Max W. Y. Lam, Xixin Wu, Ka Ho Wong, Xunying Liu, Helen Meng: Development of the CUHK Dysarthric Speech Recognition System for the UA Speech Corpus. INTERSPEECH 2018: 2938-2942
- [c6] Xixin Wu, Yuewen Cao, Mu Wang, Songxiang Liu, Shiyin Kang, Zhiyong Wu, Xunying Liu, Dan Su, Dong Yu, Helen Meng: Rapid Style Adaptation Using Residual Error Embedding for Expressive Speech Synthesis. INTERSPEECH 2018: 3072-3076
- [c5] Mu Wang, Zhiyong Wu, Shiyin Kang, Xixin Wu, Jia Jia, Dan Su, Dong Yu, Helen Meng: Speech Super-Resolution Using Parallel WaveNet. ISCSLP 2018: 260-264
- [c4] Songxiang Liu, Lifa Sun, Xixin Wu, Xunying Liu, Helen Meng: The HCCL-CUHK System for the Voice Conversion Challenge 2018. Odyssey 2018: 248-254
- 2015
- [c3] Xixin Wu, Zhiyong Wu, Yishuang Ning, Jia Jia, Lianhong Cai, Helen M. Meng: Understanding speaking styles of internet speech data with LSTM and low-resource training. ACII 2015: 815-820
- 2014
- [c2] Xixin Wu, Zhiyong Wu, Jia Jia, Helen M. Meng, Lianhong Cai, Weifeng Li: Automatic speech data clustering with human perception based weighted distance. ISCSLP 2014: 216-220
- 2012
- [c1] Xixin Wu, Zhiyong Wu, Jia Jia, Lianhong Cai: Adaptive named entity recognition based on conditional random fields with automatic updated dynamic gazetteers. ISCSLP 2012: 363-367
Informal and Other Publications
- 2024
- [i63] Jiawen Kang, Lingwei Meng, Mingyu Cui, Haohan Guo, Xixin Wu, Xunying Liu, Helen Meng: Cross-Speaker Encoding Network for Multi-Talker Speech Recognition. CoRR abs/2401.04152 (2024)
- [i62] Yuejiao Wang, Xixin Wu, Disong Wang, Lingwei Meng, Helen Meng: UNIT-DSR: Dysarthric Speech Reconstruction System Using Speech Unit Normalization. CoRR abs/2401.14664 (2024)
- [i61] Xueyuan Chen, Yuejiao Wang, Xixin Wu, Disong Wang, Zhiyong Wu, Xunying Liu, Helen Meng: Exploiting Audio-Visual Features with Pretrained AV-HuBERT for Multi-Modal Dysarthric Speech Reconstruction. CoRR abs/2401.17796 (2024)
- [i60] Wenxuan Wu, Xueyuan Chen, Xixin Wu, Haizhou Li, Helen Meng: Target Speech Extraction with Pre-trained AV-HuBERT and Mask-And-Recover Strategy. CoRR abs/2403.16078 (2024)
- [i59] Dongchao Yang, Dingdong Wang, Haohan Guo, Xueyuan Chen, Xixin Wu, Helen Meng: SimpleSpeech: Towards Simple and Efficient Text-to-Speech with Scalar Latent Transformer Diffusion Models. CoRR abs/2406.02328 (2024)
- [i58] Haohan Guo, Fenglong Xie, Dongchao Yang, Hui Lu, Xixin Wu, Helen Meng: Addressing Index Collapse of Large-Codebook Speech Tokenizer with Dual-Decoding Product-Quantized Variational Auto-Encoder. CoRR abs/2406.02940 (2024)
- [i57] Xueyuan Chen, Dongchao Yang, Dingdong Wang, Xixin Wu, Zhiyong Wu, Helen Meng: CoLM-DSR: Leveraging Neural Codec Language Modeling for Multi-Modal Dysarthric Speech Reconstruction. CoRR abs/2406.08336 (2024)
- [i56] Dongchao Yang, Haohan Guo, Yuanyuan Wang, Rongjie Huang, Xiang Li, Xu Tan, Xixin Wu, Helen Meng: UniAudio 1.5: Large Language Model-driven Audio Codec is A Few-shot Audio Task Learner. CoRR abs/2406.10056 (2024)
- [i55] Tianhua Zhang, Kun Li, Hongyin Luo, Xixin Wu, James R. Glass, Helen Meng: Adaptive Query Rewriting: Aligning Rewriters through Marginal Probability of Conversational Answers. CoRR abs/2406.10991 (2024)
- [i54] Jing Xu, Minglin Wu, Xixin Wu, Helen Meng: Seamless Language Expansion: Enhancing Multilingual Mastery in Self-Supervised Models. CoRR abs/2406.14092 (2024)
- [i53] Jingyan Zhou, Kun Li, Junan Li, Jiawen Kang, Minda Hu, Xixin Wu, Helen Meng: Purple-teaming LLMs with Adversarial Defender Training. CoRR abs/2407.01850 (2024)
- [i52] Lingwei Meng, Long Zhou, Shujie Liu, Sanyuan Chen, Bing Han, Shujie Hu, Yanqing Liu, Jinyu Li, Sheng Zhao, Xixin Wu, Helen Meng, Furu Wei: Autoregressive Speech Synthesis without Vector Quantization. CoRR abs/2407.08551 (2024)
- [i51] Lingwei Meng, Jiawen Kang, Yuejiao Wang, Zengrui Jin, Xixin Wu, Xunying Liu, Helen Meng: Empowering Whisper as a Joint Multi-Talker and Target-Talker Speech Recognition System. CoRR abs/2407.09817 (2024)
- [i50] Yuejiao Wang, Xianmin Gong, Lingwei Meng, Xixin Wu, Helen Meng: Large Language Model-based FMRI Encoding of Language Functions for Subjects with Neurocognitive Disorder. CoRR abs/2407.10376 (2024)
- [i49] Weiqin Li, Peiji Yang, Yicheng Zhong, Yixuan Zhou, Zhisheng Wang, Zhiyong Wu, Xixin Wu, Helen Meng: Spontaneous Style Text-to-Speech Synthesis with Controllable Spontaneous Behaviors Based on Language Models. CoRR abs/2407.13509 (2024)
- [i48] Dongchao Yang, Rongjie Huang, Yuanyuan Wang, Haohan Guo, Dading Chong, Songxiang Liu, Xixin Wu, Helen Meng: SimpleSpeech 2: Towards Simple and Efficient Text-to-Speech with Flow-based Scalar Latent Transformer Diffusion Models. CoRR abs/2408.13893 (2024)
- [i47] Haohan Guo, Fenglong Xie, Kun Xie, Dongchao Yang, Dake Guo, Xixin Wu, Helen Meng: SoCodec: A Semantic-Ordered Multi-Stream Speech Codec for Efficient Language Model Based Text-to-Speech Synthesis. CoRR abs/2409.00933 (2024)
- [i46] Lingwei Meng, Shujie Hu, Jiawen Kang, Zhaoqing Li, Yuejiao Wang, Wenxuan Wu, Xixin Wu, Xunying Liu, Helen Meng: Large Language Model Can Transcribe Speech in Multi-Talker Scenarios with Versatile Instructions. CoRR abs/2409.08596 (2024)
- [i45] Haohan Guo, Fenglong Xie, Dongchao Yang, Xixin Wu, Helen Meng: Speaking from Coarse to Fine: Improving Neural Codec Language Model via Multi-Scale Speech Coding and Generation. CoRR abs/2409.11630 (2024)
- [i44] Jiawen Kang, Lingwei Meng, Mingyu Cui, Yuejiao Wang, Xixin Wu, Xunying Liu, Helen Meng: Disentangling Speakers in Multi-Talker Speech Recognition with Speaker-Aware CTC. CoRR abs/2409.12388 (2024)
- [i43] Yuanyuan Wang, Hangting Chen, Dongchao Yang, Zhiyong Wu, Helen Meng, Xixin Wu: AudioComposer: Towards Fine-grained Audio Generation with Natural Language Descriptions. CoRR abs/2409.12560 (2024)
- [i42] Jiawen Kang, Dongrui Han, Lingwei Meng, Jingyan Zhou, Jinchao Li, Xixin Wu, Helen Meng: Towards Within-Class Variation in Alzheimer's Disease Detection from Spontaneous Speech. CoRR abs/2409.16322 (2024)
- [i41] Siheng Li, Cheng Yang, Taiqiang Wu, Chufan Shi, Yuji Zhang, Xinyu Zhu, Zesen Cheng, Deng Cai, Mo Yu, Lemao Liu, Jie Zhou, Yujiu Yang, Ngai Wong, Xixin Wu, Wai Lam: A Survey on the Honesty of Large Language Models. CoRR abs/2409.18786 (2024)
- 2023
- [i40] HoLam Chung, Junan Li, Pengfei Liu, Wai-Kim Leung, Xixin Wu, Helen Meng: Improving Rare Words Recognition through Homophone Extension and Unified Writing for Low-resource Cantonese Speech Recognition. CoRR abs/2302.00836 (2023)
- [i39] Lingwei Meng, Jiawen Kang, Mingyu Cui, Yuejiao Wang, Xixin Wu, Helen Meng: A Sidecar Separator Can Convert a Single-Talker Speech Recognition System to a Multi-Talker One. CoRR abs/2302.09908 (2023)
- [i38] Jinchao Li, Kaitao Song, Junan Li, Bo Zheng, Dongsheng Li, Xixin Wu, Xunying Liu, Helen Meng: Leveraging Pretrained Representations with Task-related Keywords for Alzheimer's Disease Detection. CoRR abs/2303.08019 (2023)
- [i37] Jinchao Li, Xixin Wu, Kaitao Song, Dongsheng Li, Xunying Liu, Helen Meng: A Hierarchical Regression Chain Framework for Affective Vocal Burst Recognition. CoRR abs/2303.08027 (2023)
- [i36] Tianhua Zhang, Hongyin Luo, Yung-Sung Chuang, Wei Fang, Luc Gaitskell, Thomas Hartvigsen, Xixin Wu, Danny Fox, Helen Meng, James R. Glass: Interpretable Unified Language Checking. CoRR abs/2304.03728 (2023)
- [i35] Hongyin Luo, Yung-Sung Chuang, Yuan Gong, Tianhua Zhang, Yoon Kim, Xixin Wu, Danny Fox, Helen Meng, James R. Glass: SAIL: Search-Augmented Instruction Learning. CoRR abs/2305.15225 (2023)
- [i34] Lingwei Meng, Jiawen Kang, Mingyu Cui, Haibin Wu, Xixin Wu, Helen Meng: Unified Modeling of Multi-Talker Overlapped Speech Recognition and Diarization with a Sidecar Separator. CoRR abs/2305.16263 (2023)
- [i33] Shun Lei, Yixuan Zhou, Liyang Chen, Zhiyong Wu, Xixin Wu, Shiyin Kang, Helen Meng: MSStyleTTS: Multi-Scale Style Modeling with Hierarchical Context Information for Expressive Speech Synthesis. CoRR abs/2307.16012 (2023)
- [i32] Jingyan Zhou, Minda Hu, Junan Li, Xiaoying Zhang, Xixin Wu, Irwin King, Helen Meng: Rethinking Machine Ethics - Can LLMs Perform Moral Reasoning through the Lens of Moral Theories? CoRR abs/2308.15399 (2023)
- [i31] Jie Chen, Changhe Song, Deyi Tuo, Xixin Wu, Shiyin Kang, Zhiyong Wu, Helen Meng: Improving Mandarin Prosodic Structure Prediction with Multi-level Contextual Information. CoRR abs/2308.16577 (2023)
- [i30] Haohan Guo, Fenglong Xie, Jiawen Kang, Yujia Xiao, Xixin Wu, Helen Meng: QS-TTS: Towards Semi-Supervised Text-to-Speech Synthesis via Vector-Quantized Self-Supervised Speech Representation Learning. CoRR abs/2309.00126 (2023)
- [i29] Tianhua Zhang, Jiaxin Ge, Hongyin Luo, Yung-Sung Chuang, Mingye Gao, Yuan Gong, Xixin Wu, Yoon Kim, Helen Meng, James R. Glass: Natural Language Embedded Programs for Hybrid Language Symbolic Reasoning. CoRR abs/2309.10814 (2023)
- [i28] Shun Lei, Yixuan Zhou, Liyang Chen, Dan Luo, Zhiyong Wu, Xixin Wu, Shiyin Kang, Tao Jiang, Yahui Zhou, Yuxing Han, Helen Meng: Improving Language Model-Based Zero-Shot Text-to-Speech Synthesis with Multi-Scale Acoustic Prompts. CoRR abs/2309.11977 (2023)
- [i27] Dongchao Yang, Jinchuan Tian, Xu Tan, Rongjie Huang, Songxiang Liu, Xuankai Chang, Jiatong Shi, Sheng Zhao, Jiang Bian, Xixin Wu, Zhou Zhao, Shinji Watanabe, Helen Meng: UniAudio: An Audio Foundation Model Toward Universal Audio Generation. CoRR abs/2310.00704 (2023)
- [i26] Xiaohan Feng, Xixin Wu, Helen Meng: Injecting linguistic knowledge into BERT for Dialogue State Tracking. CoRR abs/2311.15623 (2023)
- [i25] Boshi Tang, Zhiyong Wu, Xixin Wu, Qiaochu Huang, Jun Chen, Shun Lei, Helen Meng: SimCalib: Graph Neural Network Calibration based on Similarity between Nodes. CoRR abs/2312.11858 (2023)
- [i24] Xueyuan Chen, Xi Wang, Shaofei Zhang, Lei He, Zhiyong Wu, Xixin Wu, Helen Meng: StyleSpeech: Self-supervised Style Enhancing with VQ-VAE-based Pre-training for Expressive Audiobook Speech Synthesis. CoRR abs/2312.12181 (2023)
- 2022
- [i23] Naijun Zheng, Na Li, Xixin Wu, Lingwei Meng, Jiawen Kang, Haibin Wu, Chao Weng, Dan Su, Helen Meng: The CUHK-TENCENT speaker diarization system for the ICASSP 2022 multi-channel multi-party meeting transcription challenge. CoRR abs/2202.01986 (2022)
- [i22] Disong Wang, Songxiang Liu, Xixin Wu, Hui Lu, Lifa Sun, Xunying Liu, Helen Meng: Speaker Identity Preservation in Dysarthric Speech Reconstruction by Adversarial Speaker Adaptation. CoRR abs/2202.09082 (2022)
- [i21] Haohan Guo, Hui Lu, Xixin Wu, Helen Meng: A Multi-Scale Time-Frequency Spectrogram Discriminator for GAN-based Non-Autoregressive TTS. CoRR abs/2203.01080 (2022)
- [i20] Wen Wu, Chao Zhang, Xixin Wu, Philip C. Woodland: Estimating the Uncertainty in Emotion Class Labels with Utterance-Specific Dirichlet Priors. CoRR abs/2203.04443 (2022)
- [i19] Haibin Wu, Lingwei Meng, Jiawen Kang, Jinchao Li, Xu Li, Xixin Wu, Hung-yi Lee, Helen Meng: Spoofing-Aware Speaker Verification by Multi-Level Fusion. CoRR abs/2203.15377 (2022)
- [i18] Xixin Wu, Shoukang Hu, Zhiyong Wu, Xunying Liu, Helen Meng: Neural Architecture Search for Speech Emotion Recognition. CoRR abs/2203.16928 (2022)
- [i17] Haibin Wu, Jiawen Kang, Lingwei Meng, Yang Zhang, Xixin Wu, Zhiyong Wu, Hung-yi Lee, Helen Meng: Tackling Spoofing-Aware Speaker Verification with Multi-Model Fusion. CoRR abs/2206.09131 (2022)
- [i16] Yi Wang, Tianzi Wang, Zi Ye, Lingwei Meng, Shoukang Hu, Xixin Wu, Xunying Liu, Helen Meng: Exploring linguistic feature and model combination for speech recognition based automatic AD detection. CoRR abs/2206.13758 (2022)
- [i15] Haohan Guo, Feng-Long Xie, Frank K. Soong, Xixin Wu, Helen Meng: A Multi-Stage Multi-Codebook VQ-VAE Approach to High-Performance Neural TTS. CoRR abs/2209.10887 (2022)
- [i14] Hui Lu, Disong Wang, Xixin Wu, Zhiyong Wu, Xunying Liu, Helen Meng: Disentangled Speech Representation Learning for One-Shot Cross-lingual Voice Conversion Using β-VAE. CoRR abs/2210.13771 (2022)
- [i13] Haohan Guo, Fenglong Xie, Xixin Wu, Hui Lu, Helen Meng: Towards High-Quality Neural TTS for Low-Resource Languages by Learning Compact Speech Representations. CoRR abs/2210.15131 (2022)
- 2021
- [i12] Xixin Wu, Mark J. F. Gales: Should Ensemble Members Be Calibrated? CoRR abs/2101.05397 (2021)
- [i11] Qingyun Dou, Yiting Lu, Potsawee Manakul, Xixin Wu, Mark J. F. Gales: Attention Forcing for Machine Translation. CoRR abs/2104.01264 (2021)
- [i10] Hui Lu, Zhiyong Wu, Xixin Wu, Xu Li, Shiyin Kang, Xunying Liu, Helen Meng: VAENAR-TTS: Variational Auto-Encoder based Non-AutoRegressive Text-to-Speech Synthesis. CoRR abs/2107.03298 (2021)
- [i9] Xu Li, Xixin Wu, Hui Lu, Xunying Liu, Helen Meng: Channel-wise Gated Res2Net: Towards Robust Detection of Synthetic Speech Attacks. CoRR abs/2107.08803 (2021)
- [i8] Haibin Wu, Bo Zheng, Xu Li, Xixin Wu, Hung-yi Lee, Helen Meng: Characterizing the adversarial vulnerability of speech self-supervised learning. CoRR abs/2111.04330 (2021)
- 2020
- [i7] Xu Li, Xixin Wu, Xunying Liu, Helen Meng: Deep segmental phonetic posterior-grams based discovery of non-categories in L2 English speech. CoRR abs/2002.00205 (2020)
- [i6] Xu Li, Jinghua Zhong, Jianwei Yu, Shoukang Hu, Xixin Wu, Xunying Liu, Helen Meng: Bayesian x-vector: Bayesian Neural Network based x-vector System for Speaker Verification. CoRR abs/2004.04014 (2020)
- [i5] Xu Li, Na Li, Jinghua Zhong, Xixin Wu, Xunying Liu, Dan Su, Dong Yu, Helen Meng: Investigating Robustness of Adversarial Samples Detection for Automatic Speaker Verification. CoRR abs/2006.06186 (2020)
- [i4] Songxiang Liu, Yuewen Cao, Disong Wang, Xixin Wu, Xunying Liu, Helen Meng: Any-to-Many Voice Conversion with Location-Relative Sequence-to-Sequence Modeling. CoRR abs/2009.02725 (2020)
- [i3] Disong Wang, Songxiang Liu, Lifa Sun, Xixin Wu, Xunying Liu, Helen Meng: Learning Explicit Prosody Models and Deep Speaker Embeddings for Atypical Voice Conversion. CoRR abs/2011.01678 (2020)
- 2019
- [i2] Peng Liu, Xixin Wu, Shiyin Kang, Guangzhi Li, Dan Su, Dong Yu: Maximizing Mutual Information for Tacotron. CoRR abs/1909.01145 (2019)
- [i1] Xu Li, Jinghua Zhong, Xixin Wu, Jianwei Yu, Xunying Liu, Helen Meng: Adversarial Attacks on GMM i-vector based Speaker Verification Systems. CoRR abs/1911.03078 (2019)
last updated on 2024-10-22 20:12 CEST by the dblp team
all metadata released as open data under CC0 1.0 license