Jinchang Ren
M3OT: A Multi-Drone Multi-Modality dataset for Multi-Object Tracking
We provide a dataset for object detection and tracking in aerial imagery, namely “M3OT”. M3OT is a multi-modality vehicle detection and tracking dataset acquired by two Unmanned Aerial Vehicles (UAVs) in a high-altitude region, consisting …
TGPO: Tree-Guided Preference Optimization for Robust Web Agent Reinforcement Learning
With the rapid advancement of large language models and vision-language models, employing large models as Web Agents has become essential for automated web interaction. However, training Web Agents with reinforcement learning faces critica…
NFFLS: Rapid and Accurate Underwater 3-D Reconstruction With Neural Fields for Forward-Looking Sonar
Forward-looking sonar (FLS) can capture high-resolution acoustic images of underwater scenes, maintaining performance even in turbid water and poor lighting. Although neural fields have become popular for 3-D reconstruct…
MT$^{3}$: Scaling MLLM-based Text Image Machine Translation via Multi-Task Reinforcement Learning
Text Image Machine Translation (TIMT), the task of translating textual content embedded in images, is critical for applications in accessibility, cross-lingual information access, and real-world document understanding. However, TIMT remains …
Inheritance and protection of intangible cultural heritage in drama category based on AI human–computer interaction and digital technology
To address the challenges of insufficient content interactivity and poor user experience in the inheritance of traditional drama intangible cultural heritage in modern society, this study proposes an innovative model—the Intelligent Herita…
Vision Mamba in Remote Sensing: A Comprehensive Survey of Techniques, Applications and Outlook
Deep learning has profoundly transformed remote sensing, yet prevailing architectures like Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) remain constrained by critical trade-offs: CNNs suffer from limited receptive fi…
Enhanced Partially Relevant Video Retrieval through Inter- and Intra-Sample Analysis with Coherence Prediction
Partially Relevant Video Retrieval (PRVR) aims to retrieve the target video that is partially relevant to the text query. The primary challenge in PRVR arises from the semantic asymmetry between textual and visual modalities, as videos oft…
Hyperspectral Image Classification Using a Multi-Scale CNN Architecture with Asymmetric Convolutions from Small to Large Kernels
Deep learning-based hyperspectral image (HSI) classification methods, such as Transformers and Mambas, have attracted considerable attention. However, several challenges persist, e.g., (1) Transformers suffer from quadratic computational c…
MT-R1-Zero: Advancing LLM-based Machine Translation via R1-Zero-like Reinforcement Learning
Large-scale reinforcement learning (RL) methods have proven highly effective in enhancing the reasoning abilities of large language models (LLMs), particularly for tasks with verifiable solutions such as mathematics and coding. However, ap…
MT-RewardTree: A Comprehensive Framework for Advancing LLM-Based Machine Translation via Reward Modeling
Process reward models (PRMs) have shown success in complex reasoning tasks for large language models (LLMs). However, their application to machine translation (MT) remains underexplored due to the lack of systematic methodologies and evalu…
SCA3D: Enhancing Cross-modal 3D Retrieval via 3D Shape and Caption Paired Data Augmentation
The cross-modal 3D retrieval task aims to achieve mutual matching between text descriptions and 3D shapes. This has the potential to enhance the interaction between natural language and the 3D environment, especially within the realms of r…
Diversified Augmentation with Domain Adaptation for Debiased Video Temporal Grounding
Temporal sentence grounding in videos (TSGV) faces challenges due to public TSGV datasets containing significant temporal biases, which are attributed to the uneven temporal distributions of target moments. Existing methods generate augmen…
FusDreamer: Label-Efficient Remote Sensing World Model for Multimodal Data Classification
World models significantly enhance hierarchical understanding, improving data integration and learning efficiency. To explore the potential of the world model in the remote sensing (RS) field, this paper proposes a label-efficient remote s…
Frequency-Domain Guided Swin Transformer and Global–Local Feature Integration for Remote Sensing Images Semantic Segmentation
Convolutional Neural Networks (CNNs), transformers, and hybrid methods have seen significant application in remote sensing. However, existing methods are limited in effectively modeling frequency domain information, which affects their…
Binary Quantization Vision Transformer for Effective Segmentation of Red Tide in Multispectral Remote Sensing Imagery
As a global marine disaster, red tides pose serious threats to marine ecology and the blue economy, making their monitoring crucial for preventing harmful algal blooms and protecting the marine environment. In this study, satellite remote …
LKVHAN: Multiscale Large Kernel Vertical-Horizontal Attention Network for Hyperspectral Image Classification
Among deep learning-based hyperspectral image (HSI) classification models, convolutional neural networks (CNNs), Transformers, Mamba, and large kernel CNNs (LKCNNs) have been widely explored. Nonetheless, thes…
MSLKCNN: A Simple and Powerful Multiscale Large Kernel CNN for Hyperspectral Image Classification
Deep learning-based hyperspectral image (HSI) classification models typically utilize multiple feature extraction layers to learn the features of land covers. Nevertheless, they encounter challenges, e.g., 1) Transformers require substanti…
Research on the Policy of Introducing Young Talents to Hengqin -- Analysis Based on CGSS2021
This paper examines policies for introducing young talents to Hengqin. Based on the national survey data CGSS2021, it conducts an in-depth analysis of the factors influencing youth employment in China. Firstly, the p…
ChangeDA: Depth-Augmented Multitask Network for Remote Sensing Change Detection via Differential Analysis
In the field of Remote Sensing Change Detection (RSCD), accurately identifying significant changes between bitemporal images is essential for environmental monitoring, urban planning, and disaster assessment. In recent years, advancements …
Prototype-Guided Spatial–Spectral Interaction Network for Hyperspectral Anomaly Detection
In recent years, deep learning has emerged as one of the most widely utilized techniques in hyperspectral anomaly detection (HAD) with an impressive detection accuracy. However, the investigation into the diverse background representation …
Blind Super-Resolution Based on Interframe Information Compensation for Satellite Video
Super-Resolution (SR) of satellite video has long been a critical research direction in the field of remote sensing video processing and analysis, and blind SR has attracted increasing attention in the face of satellite video with unknown …
M-MAD: Multidimensional Multi-Agent Debate for Advanced Machine Translation Evaluation
Recent advancements in large language models (LLMs) have given rise to the LLM-as-a-judge paradigm, showcasing their potential to deliver human-like judgments. However, in the field of machine translation (MT) evaluation, current LLM-as-a-…
ICSF: Integrating Inter-Modal and Cross-Modal Learning Framework for Self-Supervised Heterogeneous Change Detection
Heterogeneous change detection (HCD) is a process to determine the change information by analyzing heterogeneous images of the same geographic location taken at different times, which plays an important role in remote sensing applications …
Dual Teacher: Improving the Reliability of Pseudo Labels for Semi-Supervised Oriented Object Detection
Oriented object detection in remote sensing is a critical task for accurate location and measurement of targets of interest. Despite their success in object detection, deep learning-based detectors rely heavily on extensive data anno…
Enhancing underwater situational awareness: RealSense camera integration with deep learning for improved depth perception and distance measurement
This work presents a depth image refinement technique designed to enhance the usability of a commercial camera in underwater environments. Stereo vision-based depth cameras offer dense data that is well-suited for accurate environmental un…