About Me

Hi there! I received my Ph.D. degree from the Human-Computer Communications Laboratory (HCCL) at The Chinese University of Hong Kong (CUHK), supervised by Prof. Helen Meng. My research focuses mainly on speech generation, e.g., voice conversion and text-to-speech synthesis. Before joining CUHK, I received my M.S. degree from Peking University (PKU), supervised by Prof. Yuexian Zou, and my B.S. degree from the University of Electronic Science and Technology of China (UESTC).

Selected Publications

I. Voice conversion and speech synthesis

  • Disong Wang, Shan Yang, Dan Su, Xunying Liu, Dong Yu, Helen Meng, “VCVTS: Multi-speaker Video-to-Speech synthesis via cross-modal knowledge transfer from voice conversion”, ICASSP 2022. [paper][demo]
  • Disong Wang, Songxiang Liu, Xixin Wu, Hui Lu, Lifa Sun, Xunying Liu, Helen Meng, “Speaker Identity Preservation in Dysarthric Speech Reconstruction by Adversarial Speaker Adaptation”, ICASSP 2022. [paper][demo]
  • Disong Wang, Liqun Deng, Yu Ting Yeung, Xiao Chen, Xunying Liu, Helen Meng, “VQMIVC: Vector Quantization and Mutual Information-Based Unsupervised Speech Representation Disentanglement for One-shot Voice Conversion”, Interspeech 2021. [paper][code][demo]
  • Disong Wang, Songxiang Liu, Lifa Sun, Xixin Wu, Xunying Liu, Helen Meng, “Learning Explicit Prosody Models and Deep Speaker Embeddings for Atypical Voice Conversion”, Interspeech 2021. [paper][demo]
  • Songxiang Liu, Yuewen Cao, Disong Wang, Xixin Wu, Xunying Liu, Helen Meng, “Any-to-Many Voice Conversion with Location-Relative Sequence-to-Sequence Modeling”, IEEE/ACM Transactions on Audio Speech and Language Processing, 2021. [paper][code][demo]
  • Disong Wang, Liqun Deng, Yang Zhang, Nianzu Zheng, Yu Ting Yeung, Xiao Chen, Xunying Liu, Helen Meng, “FCL-Taco2: Towards Fast, Controllable and Lightweight Text-to-Speech synthesis”, ICASSP 2021. [paper][code][demo]
  • Disong Wang, Jianwei Yu, Xixin Wu, Songxiang Liu, Lifa Sun, Xunying Liu, Helen Meng, “End-To-End Voice Conversion Via Cross-Modal Knowledge Distillation for Dysarthric Speech Reconstruction”, ICASSP 2020. [paper][demo]
  • Songxiang Liu, Disong Wang, Yuewen Cao, Lifa Sun, Xixin Wu, Shiyin Kang, Zhiyong Wu, Xunying Liu, Dan Su, Dong Yu, Helen Meng, “End-to-End Accent Conversion Without Using Native Utterances”, ICASSP 2020. [paper][demo]

II. Disordered speech recognition and detection

  • Disong Wang, Liqun Deng, Yu Ting Yeung, Xiao Chen, Xunying Liu, Helen Meng, “Unsupervised Domain Adaptation for Dysarthric Speech Detection via Domain Adversarial Training and Mutual Information Minimization”, Interspeech 2021. [paper]
  • Disong Wang, Jianwei Yu, Xixin Wu, Lifa Sun, Xunying Liu, Helen Meng, “Improved End-to-End Dysarthric Speech Recognition via Meta-learning Based Model Re-initialization”, ISCSLP 2021. [paper]

III. Miscellaneous (Speaker DOA estimation, speech dereverberation, speaker verification, etc.)

  • Disong Wang, Yuexian Zou, “Joint Noise and Reverberation Adaptive Learning for Robust Speaker DOA Estimation with An Acoustic Vector Sensor”, Interspeech 2018. [paper]
  • Disong Wang, Yuexian Zou, Wenwu Wang, “Learning soft mask with DNN and DNN-SVM for multi-speaker DOA estimation using an acoustic vector sensor”, Journal of the Franklin Institute, 2018. [paper]
  • Disong Wang, Yuexian Zou, Wei Shi, “A Deep Convolutional Encoder-Decoder Model for Robust Speech Dereverberation”, DSP 2017. [paper]
  • Disong Wang, Yuexian Zou, Junhong Liu, Yichi Huang, “A Robust DBN-vector based Speaker Verification System under Channel Mismatch Conditions”, DSP 2016. [paper]
  • Disong Wang, Xiansheng Guo, Yuexian Zou, “Accurate and Robust Device-Free Localization Approach via Sparse Representation in Presence of Noise and Outliers”, DSP 2016. [paper]