Mingyu Cui
Measuring Prosody Diversity in Zero-Shot TTS: A New Metric, Benchmark, and Exploration Open
Prosody diversity is essential for achieving naturalness and expressiveness in zero-shot text-to-speech (TTS). However, frequently used acoustic metrics capture only partial views of prosodic variation and correlate poorly with human perce…
Advances in the interrelated nature of vaginal microecology, HPV infection, and cervical lesions Open
Vaginal microecology serves as a crucial defense mechanism in women’s reproductive health. It encompasses vaginal anatomy, microbial flora, endocrine regulation, and immune responses. Lactobacillus species dominate this ecosystem, maintain…
Sedentary Leisure Behaviour, Physical Activity, and Gastroesophageal Reflux Disease: Evidence From a Mendelian Randomization Analysis Open
Background and Aims Gastroesophageal reflux disease (GERD) is common worldwide. Although associations between sedentary behaviour (LSBs), physical activity (PAs), and GERD have been reported, their causal relationships remain unclear. This…
Supplementary Table S2 from Construction and Validation of a Novel Forecasting Nomogram to the Risk of Colorectal Adenomas: Preventing Colorectal Cancer at Its Origin Open
Supplementary Table S2: Baseline clinical characteristics of participants in the training and validation cohorts
Data from Construction and Validation of a Novel Forecasting Nomogram to the Risk of Colorectal Adenomas: Preventing Colorectal Cancer at Its Origin Open
Colorectal adenomas are responsible for the origin of most colorectal cancers. Early detection together with active intervention of colorectal adenomas plays a crucial role in the prevention of colorectal cancer. This study aimed to constr…
Supplementary Table S1 from Construction and Validation of a Novel Forecasting Nomogram to the Risk of Colorectal Adenomas: Preventing Colorectal Cancer at Its Origin Open
Supplementary Table S1: Baseline clinical characteristics of subjects in the colorectal adenoma and no adenoma groups
A Comparative Study of Discrete Speech Tokens for Semantic-Related Tasks with Large Language Models Open
With the rise of Speech Large Language Models (Speech LLMs), there has been growing interest in discrete speech tokens for their ability to integrate with text-based tokens seamlessly. Compared to most studies that focus on continuous spee…
Exploring SSL Discrete Tokens for Multilingual ASR Open
With the advancement of Self-supervised Learning (SSL) in speech-related tasks, there has been growing interest in utilizing discrete tokens generated by SSL for automatic speech recognition (ASR), as they offer faster processing technique…
Self-supervised ASR Models and Features For Dysarthric and Elderly Speech Recognition Open
Self-supervised learning (SSL) based speech foundation models have been applied to a wide range of ASR tasks. However, their application to dysarthric and elderly speech via data-intensive parameter fine-tuning is confronted by in-domain d…
Unveiling the driving role of pH on community stability and function during lignocellulose degradation in paddy soil Open
Introduction Crop straw, a major by-product of agricultural production, is pivotal in maintaining soil health and preserving the ecological environment. While straw incorporation is widely recognized as a sustainable practice, the incomple…
Cross-Speaker Encoding Network for Multi-Talker Speech Recognition Open
End-to-end multi-talker speech recognition has garnered great interest as an effective approach to directly transcribe overlapped speech from multiple speakers. Current methods typically adopt either 1) single-input multiple-output (SIMO) …
Self-Supervised ASR Models and Features for Dysarthric and Elderly Speech Recognition Open
Self-supervised learning (SSL) based speech foundation models have been applied to a wide range of ASR tasks. However, their application to dysarthric and elderly speech via data-intensive parameter fine-tuning is confronted by in-domain d…
Cutaneous manifestations of inflammatory bowel disease: basic characteristics, therapy, and potential pathophysiological associations Open
Inflammatory bowel disease (IBD) is a chronic inflammatory disease typically involving the gastrointestinal tract but not limited to it. IBD can be subdivided into Crohn’s disease (CD) and ulcerative colitis (UC). Extraintestinal manifesta…
Audio-visual End-to-end Multi-channel Speech Separation, Dereverberation and Recognition Open
Accurate recognition of cocktail party speech containing overlapping speakers, noise and reverberation remains a highly challenging task to date. Motivated by the invariance of visual modality to acoustic signal corruption, an audio-visual…
Research on the Current Situation, Challenges and Path of Green Finance Development in China under the Background of "Double Carbon" Open
As the world's largest carbon emission country, China has put forward the "double carbon" goal in order to better respond to climate change, and the development of green finance is an important measure to implement sustainable development …
Exploring Self-supervised Pre-trained ASR Models For Dysarthric and Elderly Speech Recognition Open
Automatic recognition of disordered and elderly speech remains a highly challenging task to date due to the difficulty in collecting such data in large quantities. This paper explores a series of approaches to integrate domain adapted SSL …
A Sidecar Separator Can Convert a Single-Talker Speech Recognition System to a Multi-Talker One Open
Although automatic speech recognition (ASR) can perform well in common non-overlapping environments, sustaining performance in multi-talker overlapping speech recognition remains challenging. Recent research revealed that ASR model's encod…
Confidence Score Based Speaker Adaptation of Conformer Speech Recognition Systems Open
Speaker adaptation techniques provide a powerful solution to customise automatic speech recognition (ASR) systems for individual users. Practical application of unsupervised model-based speaker adaptation techniques to data intensive end-t…
Research on the influencing factors of female teachers' fertility intention under the "three-child" policy based on grounded theory Open
Since the implementation of the "three-child" policy in China, its influence and response are far less than that of the "two-child" policy. Therefore, in order to deeply study the current situation and root causes of women's response to th…
Classification and identification of glass artifacts based on fuzzy clustering and binary logit regression analysis Open
This paper establishes a mathematical model based on the existing data of glass types, summarizes the classification rules of high potassium glass and lead-barium glass, and selects the appropriate chemical composition and subclasses for h…
Two-pass Decoding and Cross-adaptation Based System Combination of End-to-end Conformer and Hybrid TDNN ASR Systems Open
Fundamental modelling differences between hybrid and end-to-end (E2E) automatic speech recognition (ASR) systems create large diversity and complementarity among them. This paper investigates multi-pass rescoring and cross adaptation ba…
Confidence Score Based Conformer Speaker Adaptation for Speech Recognition Open
A key challenge for automatic speech recognition (ASR) systems is to model the speaker level variability. In this paper, compact speaker dependent learning hidden unit contributions (LHUC) are used to facilitate both speaker adaptive train…
Two-pass Decoding and Cross-adaptation Based System Combination of End-to-end Conformer and Hybrid TDNN ASR Systems Open
Fundamental modelling differences between hybrid and end-to-end (E2E) automatic speech recognition (ASR) systems create large diversity and complementarity among them. This paper investigates multi-pass rescoring and cross adaptation based…