Explanipedia

GraphPB: Graphical Representations of Prosody Boundary in Speech Synthesis

Aolan Sun, Jianzong Wang, Ning Cheng, H.Y. Peng, Zhen Zeng, et al. · 2024

Computer science

This paper introduces a graphical representation approach of prosody boundary (GraphPB) in the task of Chinese speech synthesis, intending to parse the semantic and syntactic relationship of input sequences in a graphical domain for improv…

FastGraphTTS: An Ultrafast Syntax-Aware Speech Synthesis Framework

Jianzong Wang, Xulong Zhang, Aolan Sun, Ning Cheng, Jing Xiao · 2023

Computer science Philosophy

This paper integrates graph-to-sequence into an end-to-end text-to-speech framework for syntax-aware modelling with syntactic information of input text. Specifically, the input text is parsed by a dependency parsing module to form a syntac…

SAR: Self-Supervised Anti-Distortion Representation for End-To-End Speech Model

Jianzong Wang, Xulong Zhang, Haobin Tang, Aolan Sun, Ning Cheng, et al. · 2023

Computer science Political science

In recent Text-to-Speech (TTS) systems, a neural vocoder often generates speech samples by solely conditioning on acoustic features predicted from an acoustic model. However, there are always distortions existing in the predicted acoustic …

Pre-Avatar: An Automatic Presentation Generation Framework Leveraging Talking Avatar

Aolan Sun, Xulong Zhang, Tiandong Ling, Jianzong Wang, Ning Cheng, et al. · 2022

Computer science Philosophy Medicine

Since the beginning of the COVID-19 pandemic, remote conferencing and school-teaching have become important tools. The previous applications aim to save the commuting cost with real-time interactions. However, our application is going to l…

Speech Representation Disentanglement with Adversarial Mutual Information Learning for One-shot Voice Conversion

SiCheng Yang, Methawee Tantrawenith, Haolin Zhuang, Zhiyong Wu, Aolan Sun, et al. · 2022

Computer science Art Political science

One-shot voice conversion (VC) with only a single target speaker's speech for reference has become a hot research topic. Existing works generally disentangle timbre, while information about pitch, rhythm and content is still mixed together…

GraphPB: Graphical Representations of Prosody Boundary in Speech Synthesis

Aolan Sun, Jianzong Wang, Ning Cheng, Huayi Peng, Zhen Zeng, et al. · 2020

Computer science

This paper introduces a graphical representation approach of prosody boundary (GraphPB) in the task of Chinese speech synthesis, intending to parse the semantic and syntactic relationship of input sequences in a graphical domain for improv…

GraphTTS: Graph-to-Sequence Modelling in Neural Text-to-Speech

Aolan Sun, Jianzong Wang, Ning Cheng, Huayi Peng, Zhen Zeng, et al. · 2020

Computer science Biology

This paper leverages the graph-to-sequence method in neural text-to-speech (GraphTTS), which maps the graph embedding of the input sequence to spectrograms. The graphical inputs consist of node and edge representations constructed from inp…
