Yuan Yao
Perception-as-Control: Fine-grained Controllable Image Animation with 3D-aware Motion Representation Open
Motion-controllable image animation is a fundamental task with a wide range of potential applications. Recent works have made progress in controlling camera or object motion via various motion representations, while they still struggle to …
From Detection to Explanation: Integrating Temporal and Spatial Features for Rumor Detection and Explaining Results Using LLMs Open
Evidence for Redefined Prenatal Screening: Demand-Driven Trio-WES with Four-Dimensional Risk Stratification Enables Comprehensive Fetus Assessment Open
Prediction of Option Prices by BP Neural Network Based on Principal Component Analysis Open
With the development of artificial intelligence, this paper adopts a neural network algorithm based on principal component analysis to fit and predict the option prices of the Huaxia SSE 50ETF. It also compares the fi…
Loss of glymphatic homeostasis in heart failure Open
Heart failure is associated with progressive reduction in cerebral blood flow and neurodegenerative changes leading to cognitive decline. The glymphatic system is crucial for the brain’s waste removal, and its dysfunction is linked to neur…
Squeezing Context into Patches: Towards Memory-Efficient Ultra-High Resolution Semantic Segmentation Open
Segmenting ultra-high-resolution (UHR) images poses a significant challenge due to constraints on GPU memory, leading to a trade-off between detailed local information and a comprehensive contextual understanding. Current UHR methods often…
CPT: Colorful Prompt Tuning for pre-trained vision-language models Open
Vision-Language Pre-training (VLP) models have shown promising capabilities in grounding natural language in image data, facilitating a broad range of cross-modal tasks. However, we note that there exists a significant gap between the obje…
M$^3$Net: Multilevel, Mixed and Multistage Attention Network for Salient Object Detection Open
Most existing salient object detection methods use U-Net or feature pyramid structures, which simply aggregate feature maps of different scales, ignoring their uniqueness and interdependence and their respective contributions …
sDREAMER: Self-distilled Mixture-of-Modality-Experts Transformer for Automatic Sleep Staging Open
Automatic sleep staging based on electroencephalography (EEG) and electromyography (EMG) signals is an important aspect of sleep-related research. Current sleep staging methods suffer from two major drawbacks. First, there are limited info…
WT-YOLOX: An Efficient Detection Algorithm for Wind Turbine Blade Damage Based on YOLOX Open
Wind turbine blades will suffer various surface damages due to their operating environment and high-speed rotation. Accurate identification in the early stage of damage formation is crucial. The damage detection of wind turbine blades is a…
CTANet: Confidence-Based Threshold Adaption Network for Semi-Supervised Segmentation of Uterine Regions from MR Images for HIFU Treatment Open
OpenMonkeyChallenge: Dataset and Benchmark Challenges for Pose Estimation of Non-human Primates Open
CPT: Colorful Prompt Tuning for Pre-trained Vision-Language Models Open
Pre-Trained Vision-Language Models (VL-PTMs) have shown promising capabilities in grounding natural language in image data, facilitating a broad variety of cross-modal tasks. However, we note that there exists a significant gap between the…
Image-to-Video Generation via 3D Facial Dynamics Open
We present a versatile model, FaceAnime, for various video generation tasks from still images. Video generation from a single face image is an interesting problem and usually tackled by utilizing Generative Adversarial Networks (GANs) to i…
Video Playback Rate Perception for Self-supervised Spatio-Temporal Representation Learning Open
In self-supervised spatio-temporal representation learning, the temporal resolution and long-short term characteristics are not yet fully explored, which limits representation capabilities of learned models. In this paper, we propose a nov…
Boosting Semantic Human Matting with Coarse Annotations Open
Semantic human matting aims to estimate the per-pixel opacity of the foreground human regions. It is quite challenging and usually requires user-interactive trimaps and plenty of high-quality annotated data. Annotating this kind of data is…
A Human Target Infrared Image Segmentation Approach Based on Convolution Neural Network Open
In order to effectively segment the human target under complex background constraints, we present an infrared target segmentation method based on a deep convolution neural network and propose a loss function based on the intersection-ove…
Intelligent Object Recognition of Urban Water Bodies Based on Deep Learning for Multi-Source and Multi-Temporal High Spatial Resolution Remote Sensing Imagery Open
High spatial resolution remote sensing image (HSRRSI) data provide rich texture, geometric structure, and spatial distribution information for surface water bodies. The rich detail information provides better representation of the internal…
MONET: Multiview Semi-Supervised Keypoint Detection via Epipolar Divergence Open
This paper presents MONET -- an end-to-end semi-supervised learning framework for a keypoint detector using multiview image streams. In particular, we consider general subjects such as non-human species where attaining a large scale annota…
Multiview Cross-supervision for Semantic Segmentation Open
This paper presents a semi-supervised learning framework for a customized semantic segmentation task using multiview image streams. A key challenge of the customized task lies in the limited accessibility of the labeled data due to the req…
MONET: Multiview Semi-supervised Keypoint via Epipolar Divergence. Open
This paper presents MONET -- an end-to-end semi-supervised learning framework for a keypoint detector using multiview image streams. In particular, we consider general subjects such as non-human species where attaining a large scale annota…
Airport Detection Using End-to-End Convolutional Neural Network with Hard Example Mining Open
Deep convolutional neural network (CNN) achieves outstanding performance in the field of target detection. As one of the most typical targets in remote sensing images (RSIs), airport has attracted increasing attention in recent years. Howe…
Visual Attribute Transfer through Deep Image Analogy Open
We propose a new technique for visual attribute transfer across images that may have very different appearance but have perceptually similar semantic structure. By visual attribute transfer, we mean transfer of visual information (such as …
An Improved Model based on Viewer Response to Time-varying Video Quality for Video Telephony over LTE Open
The advent of LTE network’s full deployment has led to a proliferation of mobile video services due to the greatly improved network conditions. One area of intense research is video telephony. Apparently operators are highly concerned abou…
Routine screening for fetal limb abnormalities in the first trimester Open
Objective: We aim to determine the accuracy of first‐trimester ultrasonography in detecting fetal limb abnormalities. Methods: This is a retrospective study of all women undergoing fetal nuchal translucency (NT) assessment and detailed fetal…
Discriminative Learning for Automatic Staging of Placental Maturity via Multi-layer Fisher Vector Open