Shuai Wang
3C conjugates: a highly sensitive platform for antibody internalization assessment in ADC development Open
Antibody-drug conjugates (ADCs) rely on antibody-mediated internalization to deliver cytotoxic payloads into tumor cells. Therefore, quantitative assessment of antibody internalization is essential for ADC development, particularly during …
The temporal evolution of family educational priorities and their impact on children’s sports participation: evidence from the HAPC model Open
Introduction: This study examines how family educational priorities influence children’s sports participation in China. Drawing on social learning theory and ecological systems theory, it conceptualizes parental emphasis on education as a p…
Summary on The Multilingual Conversational Speech Language Model Challenge: Datasets, Tasks, Baselines, and Methods Open
This paper summarizes the Interspeech2025 Multilingual Conversational Speech Language Model (MLC-SLM) challenge, which aims to advance the exploration of building effective multilingual conversational speech LLMs (SLLMs). We provide a deta…
Study on English Reading Teaching for Non-English Majors under the Guidance of Discourse Analysis Theory Open
Within the spectrum of English learning endeavors, English reading undoubtedly occupies a pivotal position. Nevertheless, conventional instructional paradigms in English reading tend to overemphasize grammatical rules and lexical items, wh…
Accent Normalization Using Self-Supervised Discrete Tokens with Non-Parallel Data Open
Accent normalization converts foreign-accented speech into native-like speech while preserving speaker identity. We propose a novel pipeline using self-supervised discrete tokens and non-parallel training data. The system extracts tokens f…
Multi-Step Prediction and Control of Hierarchical Emotion Distribution in Text-to-Speech Synthesis Open
We investigate hierarchical emotion distribution (ED) for achieving multi-level quantitative control of emotion rendering in text-to-speech synthesis (TTS). We introduce a novel multi-step hierarchical ED prediction module that quantifies …
AWCDL: Automatic weight calibration deep learning for detecting HER2 status in whole-slide breast cancer image Open
The rumen microbiome and its metabolome together with the host metabolome regulate the growth performance of crossbred cattle Open
A Dual Contrastive Learning with Auxiliary Supervision for Community Detection Open
Chain-of-Jailbreak Attack for Image Generation Models via Step by Step Editing Open
Multi-step Prediction and Control of Hierarchical Emotion Distribution in Text-to-speech Synthesis Open
Hierarchical Control of Emotion Rendering in Speech Synthesis Open
Emotional text-to-speech synthesis (TTS) aims to generate realistic emotional speech from input text. However, quantitatively controlling multi-level emotion rendering remains challenging. In this paper, we propose a flow-matching based em…
Lucky Imaging Based Blind Deconvolution Algorithm for Wide Field-of-view Solar GLAO Image Open
This paper proposes a lucky imaging based blind deconvolution algorithm for wide field-of-view (FoV) ground layer adaptive optics (GLAO) solar images. Our method effectively combines the advantages of traditional lucky imaging and blind de…
Chain-of-Jailbreak Attack for Image Generation Models via Editing Step by Step Open
Text-based image generation models, such as Stable Diffusion and DALL-E 3, hold significant potential in content creation and publishing workflows, making them the focus in recent years. Despite their remarkable capability to generate dive…
MacST: Multi-Accent Speech Synthesis via Text Transliteration for Accent Conversion Open
In accented voice conversion or accent conversion, we seek to convert the accent in speech from one to another while preserving speaker identity and semantic content. In this study, we formulate a novel method for creating multi-accented spee…
DFDG: Data-Free Dual-Generator Adversarial Distillation for One-Shot Federated Learning Open
Federated Learning (FL) is a distributed machine learning scheme in which clients jointly participate in the collaborative training of a global model by sharing model information rather than their private datasets. In light of concerns ass…
Overview of Speaker Modeling and Its Applications: From the Lens of Deep Speaker Representation Learning Open
Speaker individuality information is among the most critical elements within speech signals. By thoroughly and accurately modeling this information, it can be utilized in various intelligent speech applications, such as speaker recognition…
Early determination of potential critical quality attributes of therapeutic antibodies in developability studies through surface plasmon resonance-based relative binding activity assessment Open
Precise measurement of the binding activity changes of therapeutic antibodies is important to determine the potential critical quality attributes (CQAs) in developability assessment at the early stage of antibody development. Here, we repo…
Breaking Boundaries: A Universal Wavefront Reconstruction Approach for High-resolution Solar Imaging Open
This Letter proposes a universal wavefront reconstruction approach based on a coupled data set and neural network, aiming to overcome the limitations of current algorithms in terms of universality and wavefront sensing accuracy for variabl…
Autoregressive Diffusion Transformer for Text-to-Speech Synthesis Open
Audio language models have recently emerged as a promising approach for various audio generation tasks, relying on audio tokenizers to encode waveforms into sequences of discrete symbols. Audio tokenization often poses a necessary compromi…
Hierarchical Emotion Prediction and Control in Text-to-Speech Synthesis Open
It remains a challenge to effectively control the emotion rendering in text-to-speech (TTS) synthesis. Prior studies have primarily focused on learning a global prosodic representation at the utterance level, which strongly correlates w…
Fine-Grained Quantitative Emotion Editing for Speech Generation Open
It remains a significant challenge how to quantitatively control the expressiveness of speech emotion in speech generation. In this work, we present a novel approach for manipulating the rendering of emotions for speech generation. We prop…
Cascaded Temporal and Spatial Attention Network for solar adaptive optics image restoration Open
Context. Atmospheric turbulence severely degrades the quality of images observed through a ground-based telescope. An adaptive optics (AO) system only partially improves the image quality by correcting certain level wavefronts, making post…
Target-independent dynamic wavefront sensing method based on distorted grating and deep learning Open
A real-time wavefront sensing method for arbitrary targets is proposed, which provides an effective way for diversified wavefront sensing application scenarios. By using a distorted grating, the positive and negative defocus images are sim…
Application of machine learning in the prediction of deficient mismatch repair in patients with colorectal cancer based on routine preoperative characterization Open
Simple summary: Detecting deficient mismatch repair (dMMR) in patients with colorectal cancer is essential for clinical decision-making, including evaluation of prognosis, guidance of adjuvant chemotherapy and immunotherapy, and primary scr…
Beamforming against main lobe interference based on radial basis function neural network Open
Aiming at the problem that the performance of traditional beamforming algorithm deteriorates sharply in the presence of main lobe interference, a beamforming algorithm based on radial basis function (RBF) neural network is proposed. Firstl…
Neurobiological substrates of major psychiatry disorders: transdiagnostic associations between white matter abnormalities, neuregulin 1 and clinical manifestation Open
Background: Schizophrenia, bipolar disorder and major depressive disorder are increasingly being conceptualized as a transdiagnostic continuum. Disruption of white matter is a common alteration in these psychiatric disorders, but the molec…
Blind restoration of solar images via the Channel Sharing Spatio-temporal Network Open
Context. Due to the presence of atmospheric turbulence, the quality of solar images tends to be significantly degraded when observed by ground-based telescopes. The adaptive optics (AO) system can achieve partial correction but stops short…
Grid‐Based Whole Trajectory Clustering in Road Networks Environment Open
In the data mining of road networks, trajectory clustering of moving objects plays an important role in many applications. Most existing algorithms for this problem are based on every position point in a trajectory and face a significant c…
End-to-End Speaker-Dependent Voice Activity Detection Open
Voice activity detection (VAD) is an essential pre-processing step for tasks such as automatic speech recognition (ASR) and speaker recognition. A basic goal is to remove silent segments within an audio, while a more general VAD system cou…