Explanipedia

Perception-as-Control: Fine-grained Controllable Image Animation with 3D-aware Motion Representation Open

Yingjie Chen, Yifang Men, Yuan Yao, Miaomiao Cui, Liefeng Bo · 2025

Motion-controllable image animation is a fundamental task with a wide range of potential applications. Recent works have made progress in controlling camera or object motion via various motion representations, while they still struggle to …

From Detection to Explanation: Integrating Temporal and Spatial Features for Rumor Detection and Explaining Results Using LLMs Open

Nanjiang Zhong, Xiudi Jiang, Yuan Yao · 2025

Evidence for Redefined Prenatal Screening: Demand-Driven Trio-WES with Four-Dimensional Risk Stratification Enables Comprehensive Fetus Assessment Open

Zhichun Feng, Huanyun Li, Wenxin Liu, Xiaofan Zhu, Zhi Gao, et al. · 2025

Prediction of Option Prices by BP Neural Network Based on Principal Component Analysis Open

Yuan Yao · 2024

Under the development of artificial intelligence, this paper adopts the neural network algorithm based on principal component analysis to perform data fitting and prediction on the option prices of Huaxia SSE 50ETF. It also compares the fi…

Loss of glymphatic homeostasis in heart failure Open

Marios Kritsilis, Lotte Vanherle, Marko Rosenholm, René in ‘t Zandt, Yuan Yao, et al. · 2024

Heart failure is associated with progressive reduction in cerebral blood flow and neurodegenerative changes leading to cognitive decline. The glymphatic system is crucial for the brain’s waste removal, and its dysfunction is linked to neur…

Squeezing Context into Patches: Towards Memory-Efficient Ultra-High Resolution Semantic Segmentation Open

Yuan Yao, Wutao Liu, Pan Gao, Qun Dai, Jie Qin · 2024

Segmenting ultra-high-resolution (UHR) images poses a significant challenge due to constraints on GPU memory, leading to a trade-off between detailed local information and a comprehensive contextual understanding. Current UHR methods often…

CPT: Colorful Prompt Tuning for pre-trained vision-language models Open

Yuan Yao, Ao Zhang, Zhengyan Zhang, Zhiyuan Liu, Tat‐Seng Chua, et al. · 2024

Vision-Language Pre-training (VLP) models have shown promising capabilities in grounding natural language in image data, facilitating a broad range of cross-modal tasks. However, we note that there exists a significant gap between the obje…

M$^3$Net: Multilevel, Mixed and Multistage Attention Network for Salient Object Detection Open

Yuan Yao, Pan Gao, Xiaoyang Tan · 2023

Most existing salient object detection methods mostly use U-Net or feature pyramid structure, which simply aggregates feature maps of different scales, ignoring the uniqueness and interdependence of them and their respective contributions …

sDREAMER: Self-distilled Mixture-of-Modality-Experts Transformer for Automatic Sleep Staging Open

Jingyuan Chen, Yuan Yao, Mie Anderson, Natalie Hauglund, Celia Kjærby, et al. · 2023

Automatic sleep staging based on electroencephalography (EEG) and electromyography (EMG) signals is an important aspect of sleep-related research. Current sleep staging methods suffer from two major drawbacks. First, there are limited info…

WT-YOLOX: An Efficient Detection Algorithm for Wind Turbine Blade Damage Based on YOLOX Open

Yuan Yao, Guozhong Wang, Jinhui Fan · 2023

Wind turbine blades will suffer various surface damages due to their operating environment and high-speed rotation. Accurate identification in the early stage of damage formation is crucial. The damage detection of wind turbine blades is a…

CTANet: Confidence-Based Threshold Adaption Network for Semi-Supervised Segmentation of Uterine Regions from MR Images for HIFU Treatment Open

Chen Zhang, Guanyu Yang, Fuqiang Li, Wen Yang, Yuan Yao, et al. · 2023

OpenMonkeyChallenge: Dataset and Benchmark Challenges for Pose Estimation of Non-human Primates Open

Yuan Yao, Praneet Bala, Abhiraj Mohan, Eliza Bliss‐Moreau, Kristine Coleman, et al. · 2022

CPT: Colorful Prompt Tuning for Pre-trained Vision-Language Models Open

Yuan Yao, Ao Zhang, Zhengyan Zhang, Zhiyuan Liu, Tat‐Seng Chua, et al. · 2021

Pre-Trained Vision-Language Models (VL-PTMs) have shown promising capabilities in grounding natural language in image data, facilitating a broad variety of cross-modal tasks. However, we note that there exists a significant gap between the…

Image-to-Video Generation via 3D Facial Dynamics Open

Xiaoguang Tu, Yingtian Zou, Jian Zhao, Wenjie Ai, Jian Dong, et al. · 2021

We present a versatile model, FaceAnime, for various video generation tasks from still images. Video generation from a single face image is an interesting problem and usually tackled by utilizing Generative Adversarial Networks (GANs) to i…

Video Playback Rate Perception for Self-supervised Spatio-Temporal Representation Learning Open

Yuan Yao, Chang Liu, Dezhao Luo, Yu Zhou, Qixiang Ye · 2020

In self-supervised spatio-temporal representation learning, the temporal resolution and long-short term characteristics are not yet fully explored, which limits representation capabilities of learned models. In this paper, we propose a nov…

Boosting Semantic Human Matting with Coarse Annotations Open

Jinlin Liu, Yuan Yao, Wendi Hou, Miaomiao Cui, Xuansong Xie, et al. · 2020

Semantic human matting aims to estimate the per-pixel opacity of the foreground human regions. It is quite challenging and usually requires user interactive trimaps and plenty of high quality annotated data. Annotating such kind of data is…

A Human Target Infrared Image Segmentation Approach Based on Convolution Neural Network Open

Chao Liu, Qingping Hu, Yuan Yao · 2020

In order to effectively segment the human target under complex background constraints, we present an infrared target segmentation method based on a deep convolution neural network, and propose the loss function based on the intersection-ove…

Intelligent Object Recognition of Urban Water Bodies Based on Deep Learning for Multi-Source and Multi-Temporal High Spatial Resolution Remote Sensing Imagery Open

Shiran Song, Jianhua Liu, Yuan Liu, Guoqiang Feng, Hui Han, et al. · 2020

High spatial resolution remote sensing image (HSRRSI) data provide rich texture, geometric structure, and spatial distribution information for surface water bodies. The rich detail information provides better representation of the internal…

MONET: Multiview Semi-Supervised Keypoint Detection via Epipolar Divergence Open

Yuan Yao, Yasamin Jafarian, Hyun Soo Park · 2019

This paper presents MONET -- an end-to-end semi-supervised learning framework for a keypoint detector using multiview image streams. In particular, we consider general subjects such as non-human species where attaining a large scale annota…

Multiview Cross-supervision for Semantic Segmentation Open

Yuan Yao, Hyun Soo Park · 2018

This paper presents a semi-supervised learning framework for a customized semantic segmentation task using multiview image streams. A key challenge of the customized task lies in the limited accessibility of the labeled data due to the req…

MONET: Multiview Semi-supervised Keypoint via Epipolar Divergence Open

Yasamin Jafarian, Yuan Yao, Hyun Soo Park · 2018

This paper presents MONET -- an end-to-end semi-supervised learning framework for a keypoint detector using multiview image streams. In particular, we consider general subjects such as non-human species where attaining a large scale annota…

Airport Detection Using End-to-End Convolutional Neural Network with Hard Example Mining Open

Bowen Cai, Zhiguo Jiang, Haopeng Zhang, Danpei Zhao, Yuan Yao · 2017

Deep convolutional neural network (CNN) achieves outstanding performance in the field of target detection. As one of the most typical targets in remote sensing images (RSIs), airport has attracted increasing attention in recent years. Howe…

Visual Attribute Transfer through Deep Image Analogy Open

Jing Liao, Yuan Yao, Lu Yuan, Gang Hua, Sing Bing Kang · 2017

We propose a new technique for visual attribute transfer across images that may have very different appearance but have perceptually similar semantic structure. By visual attribute transfer, we mean transfer of visual information (such as …

An Improved Model based on Viewer Response to Time-varying Video Quality for Video Telephony over LTE Open

Yao Sun, Fei Wang, Yuan Yao, Jing Wang, Zesong Fei · 2017

The advent of LTE network’s full deployment has led to a proliferation of mobile video services due to the greatly improved network conditions. One area of intense research is video telephony. Apparently operators are highly concerned abou…

Routine screening for fetal limb abnormalities in the first trimester Open

Yimei Liao, Shengli Li, Guoyang Luo, Huaxuan Wen, Shuyuan Ouyang, et al. · 2015

Objective: We aim to determine the accuracy of first‐trimester ultrasonography in detecting fetal limb abnormalities. Methods: This is a retrospective study of all women undergoing fetal nuchal translucency (NT) assessment and detailed fetal…

Discriminative Learning for Automatic Staging of Placental Maturity via Multi-layer Fisher Vector Open

Baiying Lei, Yuan Yao, Siping Chen, Shengli Li, Wanjun Li, et al. · 2015

Yuan Yao