Jinglin Liu
Expanding the phenotypic spectrum of Rauch-Steindl syndrome: A novel NSD2 variant with atrial septal defect in a Chinese patient
Background: Rauch-Steindl syndrome (RSS) is a very rare autosomal dominant disorder caused by pathogenic variants in the NSD2 gene, characterized by dysmorphic facial features, prenatal and postnatal growth retardation, and variable develop…
Magnetic Circuit Analysis and Design Optimized for Cost-Effectiveness of Surface-Inserted Rare Earth Consequent-Pole Permanent Magnet Machines
In consequent-pole permanent magnet (CPPM) machines, the configuration where PM poles and iron poles are alternately arranged causes distortion in the air-gap magnetic field. This results in significant differences in magnetic circuit char…
Analytical Modeling and Analysis of Halbach Array Permanent Magnet Synchronous Motor
The Halbach array permanent magnet can improve the power density of motors. This paper uses analytical modeling to analyze and optimize the Halbach array permanent magnet synchronous motor (PMSM). Firstly, a general motor model is establis…
DualHet-YOLO: A Dual-Backbone Heterogeneous YOLO Network for Inspection Robots to Recognize Yellow-Feathered Chicken Behavior in Floor-Raised House
The behavior of floor-raised chickens is closely linked to their health status and environmental comfort. As a type of broiler chicken with special behaviors, understanding the daily actions of yellow-feathered chickens is crucial for accu…
Deep Learning-Based Detection and Digital Twin Implementation of Beak Deformities in Caged Layer Chickens
With the increasing urgency for digital transformation in large-scale caged layer farms, traditional methods for monitoring the environment and chicken health, which often rely on human experience, face challenges related to low efficiency…
Reference Prototype of High Lift Motor for Distributed Electric Propulsion All‐Electric Aircraft
The all‐electric aircraft with a distributed electric propulsion system being studied in this article uses 11 motors, of which 10 are specifically used for lift enhancement during takeoff and landing. This paper describes the iterative des…
MimicTalk: Mimicking a personalized and expressive 3D talking face in minutes
Talking face generation (TFG) aims to animate a target identity's face to create realistic talking videos. Personalized TFG is a variant that emphasizes the perceptual identity similarity of the synthesized result (from the perspective of …
MulliVC: Multi-lingual Voice Conversion With Cycle Consistency
Voice conversion aims to modify the source speaker's voice to resemble the target speaker while preserving the original speech content. Despite notable advancements in voice conversion these days, multi-lingual voice conversion (including …
A compensation method for PMSM sensorless control with parameter identification considering SMO observation error
Sensorless control of permanent magnet synchronous motor (PMSM) can increase the reliability of electric actuators of more electrical aircraft. Numerical online parameter estimation method will enhance the performance for sensorless contro…
AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head
Large language models (LLMs) have exhibited remarkable capabilities across a variety of domains and tasks, challenging our understanding of learning and cognition. Despite the recent success, current LLMs are not capable of processing comp…
Layered co-continuous structure in bone scaffold fabricated by laser additive manufacturing for enhancing electro-responsive shape memory properties
Porous scaffold based on electro-responsive shape memory polymers (ESMPs) possesses great potential applications in minimally invasive surgery for bone defect repair because it provides the ability for remote control and internal heating. …
Genotype characterization of tetrahydrobiopterin deficiency in two Tibetan children
We identified and treated two cases of BH4D in Tibetan populations in China, marking the first confirmed instances. Our report emphasizes the significance of conducting differential diagnosis tests for BH4D.
Real3D-Portrait: One-shot Realistic 3D Talking Portrait Synthesis
One-shot 3D talking portrait generation aims to reconstruct a 3D avatar from an unseen image, and then animate it with a reference video or audio to generate a talking portrait video. The existing methods fail to simultaneously achieve the…
C2G2: Controllable Co-speech Gesture Generation with Latent Diffusion Model
Co-speech gesture generation is crucial for automatic digital avatar animation. However, existing methods suffer from issues such as unstable training and temporal inconsistency, particularly in generating high-fidelity and comprehensive g…
Mega-TTS 2: Boosting Prompting Mechanisms for Zero-Shot Speech Synthesis
Zero-shot text-to-speech (TTS) aims to synthesize voices with unseen speech prompts, which significantly reduces the data and computation requirements for voice cloning by skipping the fine-tuning process. However, the prompting mechanisms…
Ada-TTA: Towards Adaptive High-Quality Text-to-Talking Avatar Synthesis
We are interested in a novel task, namely low-resource text-to-talking avatar. Given only a few-minute-long talking person video with the audio track as the training data and arbitrary texts as the driving input, we aim to synthesize high-…
Mega-TTS: Zero-Shot Text-to-Speech at Scale with Intrinsic Inductive Bias
Scaling text-to-speech to a large and wild dataset has been proven to be highly effective in achieving timbre and speech style generalization, particularly in zero-shot TTS. However, previous works usually encode speech into latent using a…
Make-An-Audio 2: Temporal-Enhanced Text-to-Audio Generation
Large diffusion models have been successful in text-to-audio (T2A) synthesis tasks, but they often suffer from common issues such as semantic misalignment and poor temporal consistency due to limited natural language understanding and data…
AV-TranSpeech: Audio-Visual Robust Speech-to-Speech Translation
Direct speech-to-speech translation (S2ST) aims to convert speech from one language into another, and has demonstrated significant progress to date. Despite the recent success, current S2ST models still suffer from distinct degradation in …
CLAPSpeech: Learning Prosody from Text Context with Contrastive Language-Audio Pre-training
Improving text representation has attracted much attention to achieve expressive text-to-speech (TTS). However, existing works only implicitly learn the prosody with masked token reconstruction tasks, which leads to low training efficiency…
RMSSinger: Realistic-Music-Score based Singing Voice Synthesis
We are interested in a challenging task, Realistic-Music-Score based Singing Voice Synthesis (RMS-SVS). RMS-SVS aims to generate high-quality singing voices given realistic music scores with different note types (grace, slur, rest, etc.). …
AlignSTS: Speech-to-Singing Conversion via Cross-Modal Alignment
The speech-to-singing (STS) voice conversion task aims to generate singing samples corresponding to speech recordings while facing a major challenge: the alignment between the target (singing) pitch contour and the source (speech) content …
GeneFace++: Generalized and Stable Real-Time Audio-Driven 3D Talking Face Generation
Generating talking person portraits with arbitrary speech audio is a crucial problem in the field of digital human and metaverse. A modern talking face generation method is expected to achieve the goals of generalized audio-lip synchroniza…
Overview of the ICASSP 2023 General Meeting Understanding and Generation Challenge (MUG)
ICASSP2023 General Meeting Understanding and Generation Challenge (MUG) focuses on prompting a wide range of spoken language processing (SLP) research on meeting transcripts, as SLP applications are critical to improve users' efficiency in…
MUG: A General Meeting Understanding and Generation Benchmark
Listening to long video/audio recordings from video conferencing and online courses for acquiring information is extremely inefficient. Even after ASR systems transcribe recordings into long-form spoken language documents, reading ASR tran…
GeneFace: Generalized and High-Fidelity Audio-Driven 3D Talking Face Synthesis
Generating photo-realistic video portrait with arbitrary speech audio is a crucial problem in film-making and virtual reality. Recently, several works explore the usage of neural radiance field in this task to improve 3D realness and image…
Make-An-Audio: Text-To-Audio Generation with Prompt-Enhanced Diffusion Models
Large-scale multimodal generative modeling has created milestones in text-to-image and text-to-video generation. Its application to audio still lags behind for two main reasons: the lack of large-scale datasets with high-quality text-audio…
CLAPSpeech: Learning Prosody from Text Context with Contrastive Language-Audio Pre-Training
Zhenhui Ye, Rongjie Huang, Yi Ren, Ziyue Jiang, Jinglin Liu, Jinzheng He, Xiang Yin, Zhou Zhao. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2023.