Xi Li
Comparative Study on Photosynthetic Characteristics and Leaf Structure of Paphiopedilum parishii in Different Growth Periods Open
This study investigates the differences in photosynthetic characteristics of Paphiopedilum parishii (Rchb.f.) Stein during its reproductive and nutrient growth periods. Using plants from the same individual, we compared light response curv…
Proton-Pump Inhibitors Versus H2-Receptor Antagonists on the Risk of Major Osteoporotic Fractures: A Global Propensity-Score-Matched Cohort Study Open
Background Long-term suppression of gastric acid is crucial for various conditions; however, proton-pump inhibitors (PPIs) may negatively impact skeletal health more than histamine-2 receptor antagonists (H2RAs). There is limited head-to-h…
Comparative Study on Leaf Functional Traits and Environmental Adaptability of Seedlings of the Endangered Plants Ormosia olivacea, Ormosia pachycarpa, and Ormosia sericeolucida Open
To investigate the photosynthetic characteristics and leaf anatomical structures of seedlings from the endangered plants Ormosia olivacea, Ormosia pachycarpa, and Ormosia sericeolucida, this study aimed to elucidate the influence of leaf s…
A Curated and Re-annotated Peripheral Blood Cell Dataset Integrating Four Public Resources Open
We present TXL-PBC, a curated and re-annotated peripheral blood cell dataset constructed by integrating four publicly available resources: Blood Cell Count and Detection (BCCD), Blood Cell Detection Dataset (BCDD), Peripheral Blood Cells (…
InfVSR: Breaking Length Limits of Generic Video Super-Resolution Open
Real-world videos often extend over thousands of frames. Existing video super-resolution (VSR) approaches, however, face two persistent challenges when processing long sequences: (1) inefficiency due to the heavy cost of multi-step denoisi…
Architecture Considerations for ISAC in 6G Open
ISAC is emerging as a foundational capability in 6G, enabling mobile networks to not only offer communication services but also to sense and perceive their environment at scale. This paper explores architectural considerations to enable se…
Compact Attention: Exploiting Structured Spatio-Temporal Sparsity for Fast Video Generation Open
The computational demands of self-attention mechanisms pose a critical challenge for transformer-based video generation, particularly in synthesizing ultra-long sequences. Current approaches, such as factorized attention and fixed sparse p…
WGGLFA: Wavelet-Guided Global–Local Feature Aggregation Network for Facial Expression Recognition Open
Facial expression plays an important role in human–computer interaction and affective computing. However, existing expression recognition methods cannot effectively capture multi-scale structural details contained in facial expressions, le…
Dynamic-DINO: Fine-Grained Mixture of Experts Tuning for Real-time Open-Vocabulary Object Detection Open
The Mixture of Experts (MoE) architecture has excelled in Large Vision-Language Models (LVLMs), yet its potential in real-time open-vocabulary object detectors, which also leverage large-scale vision-language datasets but smaller models, r…
SphereDrag: Spherical Geometry-Aware Panoramic Image Editing Open
Image editing has made great progress on planar images, but panoramic image editing remains underexplored. Due to their spherical geometry and projection distortions, panoramic images present three key challenges: boundary discontinuity, t…
Visibility-Uncertainty-guided 3D Gaussian Inpainting via Scene Conceptional Learning Open
3D Gaussian Splatting (3DGS) has emerged as a powerful and efficient 3D representation for novel view synthesis. This paper extends 3DGS capabilities to inpainting, where masked objects in a scene are replaced with new contents that blend …
CustomCrafter: Customized Video Generation with Preserving Motion and Concept Composition Abilities Open
Customized video generation aims to generate high-quality videos guided by text prompts and subject's reference images. However, since it is only trained on static images, the fine-tuning process of subject learning disrupts abilities of v…
RealCam-Vid: High-resolution Video Dataset with Dynamic Scenes and Metric-scale Camera Movements Open
Recent advances in camera-controllable video generation have been constrained by the reliance on static-scene datasets with relative-scale camera annotations, such as RealEstate10K. While these datasets enable basic viewpoint control, they…
Energy-Guided Optimization for Personalized Image Editing with Pretrained Text-to-Image Diffusion Models Open
The rapid advancement of pretrained text-driven diffusion models has significantly enriched applications in image generation and editing. However, as the demand for personalized content editing increases, new challenges emerge especially w…
RealCam-I2V: Real-World Image-to-Video Generation with Interactive Complex Camera Control Open
Recent advancements in camera-trajectory-guided image-to-video generation offer higher precision and better support for complex camera control compared to text-based approaches. However, they also introduce significant usability challenges…
Network Digital Twin for 5G-Enabled Mobile Robots Open
The maturity and commercial roll-out of 5G networks and their deployment for private networks make 5G a key enabler for various vertical industries and applications, including robotics. Providing ultra-low latency, high data rates, and ubiq…
An Innovative NOx Emissions Prediction Model Based on Random Forest Feature Selection and Evolutionary Reformer Open
Developing more precise NOx emission prediction models is pivotal for effectively controlling NOx emissions from gas turbines. In this paper, a Reformer is combined with random forest (RF) feature selection and the chaos game optimization …
Analysis and comparison for image colorization with machine learning based on PyTorch and ChromaGAN Open
Mdsd: Multi-Turn Diverse Synthetic Dialog Generation for Domain Specific Incomplete Requests Understanding Open
English text topic classification using BERT-based model Open
The rapid development of big data and artificial intelligence has made text topic classification an important part of natural language processing research, and it has also promoted the optimization of pre-trained model performance. In orde…
VideoMaker: Zero-shot Customized Video Generation with the Inherent Force of Video Diffusion Models Open
Zero-shot customized video generation has gained significant attention due to its substantial application potential. Existing methods rely on additional models to extract and inject reference subject features, assuming that the Video Diffu…
Linezolid combined Strontium substituted hydroxyapatite-Bi polymeric composite for Osteomyelitis affected bone regeneration analysis Open
The primary objective of this investigation is to rectify bacterial infections in bone (osteomyelitis) and bone regeneration by utilizing an antibiotic-loaded hydroxyapatite polymer composite. In this regard, strontium (Sr)-substituted hyd…
Securing Federated Learning against Backdoor Threats with Foundation Model Integration Open
Federated Learning (FL) enables decentralized model training while preserving privacy. Recently, the integration of Foundation Models (FMs) into FL has enhanced performance but introduced a novel backdoor attack mechanism. Attackers can ex…
CamI2V: Camera-Controlled Image-to-Video Diffusion Model Open
Recent advancements have integrated camera pose as a user-friendly and physics-informed condition in video diffusion models, enabling precise camera control. In this paper, we identify one of the key challenges as effectively modeling nois…
Associate Everything Detected: Facilitating Tracking-by-Detection to the Unknown Open
Multi-object tracking (MOT) emerges as a pivotal and highly promising branch in the field of computer vision. Classical closed-vocabulary MOT (CV-MOT) methods aim to track objects of predefined categories. Recently, some open-vocabulary MO…
Hybrid Mask Generation for Infrared Small Target Detection with Single-Point Supervision Open
Single-frame infrared small target (SIRST) detection poses a significant challenge due to the requirement to discern minute targets amidst complex infrared background clutter. In this paper, we focus on a weakly-supervised paradigm to obta…
CLASH: Complementary Learning with Neural Architecture Search for Gait Recognition Open
Gait recognition, which aims at identifying individuals by their walking patterns, has achieved great success based on silhouette. The binary silhouette sequence encodes the walking pattern within the sparse boundary representation. Theref…
Solving Token Gradient Conflict in Mixture-of-Experts for Large Vision-Language Model Open
The Mixture-of-Experts (MoE) has gained increasing attention in studying Large Vision-Language Models (LVLMs). It uses a sparse model to replace the dense model, achieving comparable performance while activating fewer parameters during inf…
Effects of simulated multi-sensory stimulation integration on physiological and psychological restoration in virtual urban green space environment Open
Virtual urban green environment images and audio stimuli have been shown to have restorative effects on subjects’ physical and mental health. In this area, researchers have predominantly focused on visual, auditory and olfactory aspects, while …