Jiale Cao
Integrating proton co-storage in iron-based anodes for high-performance nickel-iron batteries
The growing demand for sustainable energy solutions has intensified the need for efficient, cost-effective, and scalable energy storage technologies. Among candidate systems, nickel‑iron (Ni-Fe) batteries stand out due to their low cost, a…
Solar Trap‐Adsorption Photocathode for Highly Stable 2.4 V Dual‐Ion Solid‐State Iodine Batteries
Rechargeable aqueous iodine‐based electrochemical energy storage systems offer a cost‐effective alternative to conventional alkali metal batteries for grid‐scale applications. However, their practical deployment is hindered by sluggish iod…
DANet: spatial gene expression prediction from H&E histology images through dynamic alignment
Predicting spatial gene expression from Hematoxylin and Eosin histology images offers a promising approach to significantly reduce the time and cost associated with gene expression sequencing, thereby facilitating a deeper understanding of…
SSLFusion: Scale and Space Aligned Latent Fusion Model for Multimodal 3D Object Detection
Multimodal 3D object detection based on deep neural networks has made significant progress. However, it still faces challenges due to the misalignment of scale and spatial information between features extracted from 2D images and th…
Glad: A Streaming Scene Generator for Autonomous Driving
The generation and simulation of diverse real-world scenes have significant application value in the field of autonomous driving, especially for corner cases. Recently, researchers have explored employing neural radiance fields or diff…
Dual-Domain Low-Light Image Enhancement Network Via Frequency Selection and Structure-Guided Attention
Performance prediction of a solar pure water and hot water hybrid system based on IGWA-BP neural network
Tspnet: Two-Stage Network for Progressive Low-Light Image Enhancement
CLIPer: Hierarchically Improving Spatial Representation of CLIP for Open-Vocabulary Semantic Segmentation
Contrastive Language-Image Pre-training (CLIP) exhibits strong zero-shot classification ability on various image-level tasks, which has led to research on adapting CLIP for pixel-level open-vocabulary semantic segmentation without additional tr…
VideoGLaMM: A Large Multimodal Model for Pixel-Level Visual Grounding in Videos
Fine-grained alignment between videos and text is challenging due to complex spatial and temporal dynamics in videos. Existing video-based Large Multimodal Models (LMMs) handle basic conversations but struggle with precise pixel-level grou…
DB-SAM: Delving into High Quality Universal Medical Image Segmentation
Recently, the Segment Anything Model (SAM) has demonstrated promising segmentation capabilities in a variety of downstream segmentation tasks. However, in the context of universal medical image segmentation, there exists a notable performanc…
iSeg: An Iterative Refinement-based Framework for Training-free Segmentation
Stable diffusion has demonstrated a strong ability to synthesize images from text descriptions, suggesting that it contains strong semantic clues for grouping objects. Researchers have explored employing stable diffusion for training-free seg…
Multi-Granularity Language-Guided Training for Multi-Object Tracking
Most existing multi-object tracking methods typically learn visual tracking features via maximizing dis-similarities of different instances and minimizing similarities of the same instance. While such a feature learning scheme achieves pro…
CLIP-VIS: Adapting CLIP for Open-Vocabulary Video Instance Segmentation
Open-vocabulary video instance segmentation strives to segment and track instances belonging to an open set of categories in a video. The vision-language model Contrastive Language-Image Pre-training (CLIP) has shown robust zero-shot clas…
Quantitative Alignment and Study of Low-Frequency Excitation Parameters Based on Walnut Tree (Juglans Regia L.) Shape Characteristics
SED: A Simple Encoder-Decoder for Open-Vocabulary Semantic Segmentation
Open-vocabulary semantic segmentation strives to distinguish pixels into different semantic groups from an open set of categories. Most existing methods explore utilizing pre-trained vision-language models, in which the key is to adopt the…
CINFormer: Transformer network with multi-stage CNN feature injection for surface defect segmentation
Surface defect inspection is of great importance for industrial manufacture and production. Though defect inspection methods based on deep learning have made significant progress, there are still some challenges for these methods, such as …
Global Context Aggregation Network for Lightweight Saliency Detection of Surface Defects
Surface defect inspection is a very challenging task in which surface defects usually show weak appearances or exist under complex backgrounds. Most high-accuracy defect detection methods require expensive computation and storage overhead,…
A Spatial-Temporal Deformable Attention based Framework for Breast Lesion Detection in Videos
Detecting breast lesions in videos is crucial for computer-aided diagnosis. Existing video-based breast lesion detection approaches typically perform temporal feature aggregation of deep backbone features based on the self-attention operati…
DFormer: Diffusion-guided Transformer for Universal Image Segmentation
This paper introduces an approach, named DFormer, for universal image segmentation. The proposed DFormer views the universal image segmentation task as a denoising process using a diffusion model. DFormer first adds various levels of Gaussian …
Transformer-based stereo-aware 3D object detection from binocular images
Transformers have shown promising progress in various visual object detection tasks, including monocular 2D/3D detection and surround-view 3D detection. More importantly, the attention mechanism in the Transformer model and the 3D informat…
LEAPS: End-to-End One-Step Person Search With Learnable Proposals
We propose an end-to-end one-step person search approach with learnable proposals, named LEAPS. Given a set of sparse and learnable proposals, LEAPS employs a dynamic person search head to directly perform person detection and correspondin…
Deep Intra-Image Contrastive Learning for Weakly Supervised One-Step Person Search
Weakly supervised person search aims to perform joint pedestrian detection and re-identification (re-id) with only person bounding-box annotations. Recently, the idea of contrastive learning was first applied to weakly supervised person…
Vibration Response of Walnuts under Vibration Harvesting
Vibration harvesting is a promising method for walnut production owing to its low cost and high efficiency. However, current research focuses on simulation analysis and lacks a theoretical model explaining the walnuts’ specific vibration r…
Cross-Modal Local Calibration and Global Context Modeling Network for RGB–Infrared Remote-Sensing Object Detection
RGB–infrared object detection in remote-sensing images is crucial for achieving around-the-clock surveillance of unmanned aerial vehicles. RGB–infrared remote-sensing object detection methods based on deep learning usually mine the complementa…
Dynamic and Quantitative Risk Assessment of Cruise Ship Propulsion System Failure an Integrated Type-2 Fuzzy-Bayesian Approach
Non-Destructive Detection of Moldy Walnuts Based on Hyperspectral Imaging Technology
In-shell walnuts are a popular agricultural product in China. However, walnuts that develop mildew during growth can sometimes end up being processed into foods. It is difficult to visually determine which walnuts have mildew without breaking the shells. A non-de…
3D Vision with Transformers: A Survey
The success of the transformer architecture in natural language processing has recently triggered attention in the computer vision field. The transformer has been used as a replacement for the widely used convolution operators, due to its …
PSTR: End-to-End One-Step Person Search With Transformers
We propose a novel one-step transformer-based person search framework, PSTR, that jointly performs person detection and re-identification (re-id) in a single architecture. PSTR comprises a person search-specialized (PSS) module that contai…
Video Instance Segmentation via Multi-scale Spatio-temporal Split Attention Transformer
State-of-the-art transformer-based video instance segmentation (VIS) approaches typically utilize either single-scale spatio-temporal features or per-frame multi-scale features during the attention computations. We argue that such an atten…