Explanipedia

Efficient Point-Based Neural Network For Finish Line Extraction

Mengxun Li, Xinhua Pan, Gui-Song Xia, Cui Fen Huang, Yuan Gao, et al. · 2025

Aim or purpose: Accurate extraction of finish lines from prepared teeth is critical for optimizing CAD/CAM workflows and is essential for enabling controlled automatic crown design. This study proposes a lightweight point-based neural netw…

DragOSM: Extract Building Roofs and Footprints from Aerial Images by Aligning Historical Labels

Kai Li, Xingxing Weng, Yupeng Deng, Yu Meng, Chao Pang, et al. · 2025

Extracting polygonal roofs and footprints from remote sensing images is critical for large-scale urban analysis. Most existing methods rely on segmentation-based models that assume clear semantic boundaries of roofs, but these approaches s…

Creative Abrasion: Cultural Distance and Migrant Entrepreneurship

Nan Li, Gui-Song Xia, Yanzhao Tang · 2025

CC-Former: Urban Flood Mapping from InSAR Coherence with Vision Transformer: Libya and Storm Daniel as Test Case

Tamer Saleh, Shimaa Holail, Mina Al-Saad, Fang Xu, Mohamed Zahran, et al. · 2025

Urban flooding is a recurring and distressing issue with severe consequences, including the destruction of densely populated infrastructure and loss of life. Mapping inundated urban areas using synthetic aperture radar (SAR) data is crucia…

Evolving superpixel-level affinity based on contrastive learning and good neighbors for hyperspectral image clustering

Yao Qin, Gui-Song Xia, Kun Li, Yuanxin Ye, Weiping Ni · 2025

Vision-Language Modeling Meets Remote Sensing: Models, datasets, and perspectives

Xingxing Weng, Chao Pang, Gui-Song Xia · 2025

Vision-language modeling (VLM) aims to bridge the information gap between images and natural language. Under the new paradigm of first pre-training on massive image-text pairs and then fine-tuning on task-specific data, VLM in the remote s…

Seeing through Satellite Images at Street Views

Ming Qian, Qiuyu Wang, Xianwei Zheng, Hanjiang Xiong, Gui-Song Xia, et al. · 2025

This paper studies the task of SatStreet-view synthesis, which aims to render photorealistic street-view panorama images and videos given any satellite image and specified camera positions or trajectories. We formulate to learn neural radi…

SegEarth-R1: Geospatial Pixel Reasoning via Large Language Model

Kaiyu Li, Zepeng Xin, Pang Li, Chao Pang, Yupeng Deng, et al. · 2025

Remote sensing has become critical for understanding environmental dynamics, urban planning, and disaster management. However, traditional remote sensing workflows often rely on explicit segmentation or detection methods, which struggle to…

VHM: Versatile and Honest Vision Language Model for Remote Sensing Image Analysis

Chao Pang, Xingxing Weng, Jiang Wu, Jiayu Li, Yi Liu, et al. · 2025

This paper develops a Versatile and Honest vision language Model (VHM) for remote sensing image analysis. VHM is built on a large-scale remote sensing image-text dataset with rich-content captions (VersaD), and an honest instruction datase…

Learning Fine-grained Domain Generalization via Hyperbolic State Space Hallucination

Qi Bi, Jingjun Yi, Haolan Zhan, Wei Ji, Gui-Song Xia · 2025

Fine-grained domain generalization (FGDG) aims to learn a fine-grained representation that can be well generalized to unseen target domains when only trained on the source domain data. Compared with generic domain generalization, FGDG is p…

UniCalib: Targetless LiDAR-Camera Calibration via Probabilistic Flow on Unified Depth Representations

Han Shu, Xubo Zhu, Ji Wu, Xin Cai, Hong Yu, et al. · 2025

Precise LiDAR-camera calibration is crucial for integrating these two sensors into robotic systems to achieve robust perception. In applications like autonomous driving, online targetless calibration enables a prompt sensor misalignment co…

Model Hemorrhage and the Robustness Limits of Large Language Models

Lefei Zhang, Gui-Song Xia, Liangpei Zhang · 2025

Large language models (LLMs) demonstrate strong performance across natural language processing tasks, yet undergo significant performance degradation when modified for deployment through quantization, pruning, or decoding strategy adjustme…

Mitigating the Impact of Prominent Position Shift in Drone-based RGBT Object Detection

Yan Zhang, Wen Yang, Chang Xu, Qian Hu, Fang Xu, et al. · 2025

Drone-based RGBT object detection plays a crucial role in many around-the-clock applications. However, real-world drone-viewed RGBT data suffers from the prominent position shift problem, i.e., the position of a tiny object differs greatly…

RefDrone: A Challenging Benchmark for Referring Expression Comprehension in Drone Scenes

Zhichao Sun, Yepeng Liu, Huachao Zhu, Yuliang Gu, Yuda Zou, et al. · 2025

Drones have become prevalent robotic platforms with diverse applications, showing significant potential in Embodied Artificial Intelligence (Embodied AI). Referring Expression Comprehension (REC) enables drones to locate objects based on n…

Mask Clustering-Based Annotation Engine for Large-Scale Submeter Land Cover Mapping

Hao Chen, Fang Xu, Tamer Saleh, Weifeng Hao, Gui-Song Xia · 2025

Recent advances in remote sensing technology have made submeter resolution imagery increasingly accessible, offering remarkable detail for fine-grained land cover analysis. However, its full potential remains underutilized - particularly f…

Bypass Back-propagation: Optimization-based Structural Pruning for Large Language Models via Policy Gradient

Yuan Gao, Zujing Liu, Weizhong Zhang, Bo Du, Gui-Song Xia · 2025

UNIC-Adapter: Unified Image-instruction Adapter with Multi-modal Transformer for Image Generation

Lunhao Duan, Shanshan Zhao, Wenjun Yan, Yinglun Li, Qing-Guo Chen, et al. · 2024

Recently, text-to-image generation models have achieved remarkable advancements, particularly with diffusion models facilitating high-quality image synthesis from textual descriptions. However, these models often struggle with achieving pr…

Institutional Environment and Productive Entrepreneurship

Nan Li, Yanzhao Tang, Gui-Song Xia, Hongqin Tang, Li He · 2024

In the context of intensifying global competition, productive entrepreneurship plays an important role in industrial upgrading and sustainable economic development. This study explores how the institutional environment affects productive e…

Oriented Tiny Object Detection: A Dataset, Benchmark, and Dynamic Unbiased Learning

Chang Xu, Ruixiang Zhang, Wen Yang, Haoran Zhu, Fang Xu, et al. · 2024

Detecting oriented tiny objects, which are limited in appearance information yet prevalent in real-world applications, remains an intricate and under-explored problem. To address this, we systematically introduce a new dataset, benchmark, an…

Tiny Object Detection with Single Point Supervision

Haoran Zhu, Chang Xu, Ruixiang Zhang, Fang Xu, Wenzhen Yang, et al. · 2024

Tiny objects, with their limited spatial resolution, often resemble point-like distributions. As a result, bounding box prediction using point-level supervision emerges as a natural and cost-effective alternative to traditional box-level s…

QuadricsReg: Large-Scale Point Cloud Registration using Quadric Primitives

Ji Wu, Hong Yu, Han Shu, Xin Cai, Mingfeng Wang, et al. · 2024

In the realm of large-scale point cloud registration, designing a compact symbolic representation is crucial for efficiently processing vast amounts of data, ensuring registration robustness against significant viewpoint variations and occ…

MCVO: A Generic Visual Odometry for Arbitrarily Arranged Multi-Cameras

Huai Yu, Junhao Wang, Yao He, Wen Yang, Gui-Song Xia · 2024

Making multi-camera visual SLAM systems easier to set up and more robust to the environment is attractive for vision robots. Existing monocular and binocular vision SLAM systems have narrow sensing Field-of-View (FoV), resulting in degener…

LaVIDE: A Language-Vision Discriminator for Detecting Changes in Satellite Image with Map References

Shuguo Jiang, Fang Xu, Sen Jia, Gui-Song Xia · 2024

Change detection, which typically relies on the comparison of bi-temporal images, is significantly hindered when only a single image is available. Comparing a single image with an existing map, such as OpenStreetMap, which is continuously …

Intensity Field Decomposition for Tissue-Guided Neural Tomography

Mengxun Li, Jin-Gang Yu, Yuan Gao, Cui Huang, Gui-Song Xia · 2024

Cone-beam computed tomography (CBCT) typically requires hundreds of X-ray projections, which raises concerns about radiation exposure. While sparse-view reconstruction reduces the exposure by using fewer projections, it struggles to achiev…

Partial Distribution Matching via Partial Wasserstein Adversarial Networks

Ziming Wang, Nan Xue, Lei Liu, Rebecka Jörnsten, Gui-Song Xia · 2024

This paper studies the problem of distribution matching (DM), which is a fundamental machine learning problem seeking to robustly align two probability distributions. Our approach is established on a relaxed formulation, called partial dis…

Exploring Scene Affinity for Semi-Supervised LiDAR Semantic Segmentation

Chuandong Liu, Xingxing Weng, Shuguo Jiang, Pengcheng Li, Lei Yu, et al. · 2024

This paper explores scene affinity (AIScene), namely intra-scene consistency and inter-scene correlation, for semi-supervised LiDAR semantic segmentation in driving scenes. Adopting teacher-student training, AIScene employs a teacher netwo…

Time-series satellite remote sensing reveals gradually increasing war damage in the Gaza Strip

Shimaa Holail, Tamer Saleh, Xiongwu Xiao, Jing Xiao, Gui-Song Xia, et al. · 2024

War-related urban destruction is a significant global concern, impacting national security, social stability, people’s survival and economic development. The effects of urban geomorphology and complex geological contexts during conflicts, …

DMTG: One-Shot Differentiable Multi-Task Grouping

Yuan Gao, Shuguo Jiang, Moran Li, Jin-Gang Yu, Gui-Song Xia · 2024

We aim to address Multi-Task Learning (MTL) with a large number of tasks by Multi-Task Grouping (MTG). Given N tasks, we propose to simultaneously identify the best task groups from 2^N candidates and train the model weights simultaneously…

Towards Human-Level 3D Relative Pose Estimation: Generalizable, Training-Free, with Single Reference

Yuan Gao, Yajing Luo, Junhong Wang, Kui Jia, Gui-Song Xia · 2024

Humans can easily deduce the relative pose of a previously unseen object, without labeling or training, given only a single query-reference image pair. This is arguably achieved by incorporating i) 3D/2.5D shape perception from a single im…

Gui-Song Xia