Binglu Wang
Visual-guided human-object interaction detection Open
The aim of human-object interaction (HOI) detection is to identify the triplets consisting of a human, a verb, and an object. Although existing methods leverage vision-language models (e.g., CLIP) to transfer textual information for unseen…
Peer Review and the Diffusion of Ideas Open
This study examines a fundamental yet overlooked function of peer review: its role in exposing reviewers to new and unexpected ideas. Leveraging a natural experiment involving over half a million peer review invitations covering both accep…
FinePhys: Fine-grained Human Action Generation by Explicitly Incorporating Physical Laws for Effective Skeletal Guidance Open
Despite significant advances in video generation, synthesizing physically plausible human actions remains a persistent challenge, particularly in modeling fine-grained semantics and complex temporal dynamics. For instance, generating gymna…
Multimodal Large Models Are Effective Action Anticipators Open
The task of long-term action anticipation demands solutions that can effectively model temporal dynamics over extended periods while deeply understanding the inherent semantics of actions. Traditional approaches, which primarily rely on re…
Prevalence of co-morbid anxiety and depression in pregnancy and postpartum: a systematic review and meta-analysis Open
The prevalence of co-morbid anxiety and depression varies greatly between research studies, making it difficult to understand and estimate the magnitude of this problem. This systematic review and meta-analysis aim to provide up-to-date in…
Enhanced Window-Based Self-Attention with Global and Multi-Scale Representations for Remote Sensing Image Super-Resolution Open
Transformers have recently gained significant attention in low-level vision tasks, particularly for remote sensing image super-resolution (RSISR). The vanilla vision transformer aims to establish long-range dependencies between image patch…
Boosting Gaze Object Prediction via Pixel-level Supervision from Vision Foundation Model Open
Gaze object prediction (GOP) aims to predict the category and location of the object that a human is looking at. Previous methods utilized box-level supervision to identify the object that a person is looking at, but struggled with semanti…
Vision-based Wearable Steering Assistance for People with Impaired Vision in Jogging Open
Outdoor sports pose a challenge for people with impaired vision. The demand for higher-speed mobility inspired us to develop a vision-based wearable steering assistance. To ensure broad applicability, we focused on a representative sports …
Explainable Bayesian Recurrent Neural Smoother to Capture Global State Evolutionary Correlations Open
Through integrating the evolutionary correlations across global states in the bidirectional recursion, an explainable Bayesian recurrent neural smoother (EBRNS) is proposed for offline data-assisted fixed-interval state smoothing. At first…
TransGOP: Transformer-Based Gaze Object Prediction Open
Gaze object prediction aims to predict the location and category of the object that is watched by a human. Previous gaze object prediction works use CNN-based object detectors to predict the object's location. However, we find that Transfo…
PneumoLLM: Harnessing the Power of Large Language Model for Pneumoconiosis Diagnosis Open
The conventional pretraining-and-finetuning paradigm, while effective for common diseases with ample data, faces challenges in diagnosing data-scarce occupational diseases like pneumoconiosis. Recently, large language models (LLMs) have ex…
A Dual-Attention Deep Discriminative Domain Generalization Model for Hyperspectral Image Classification Open
Recently, hyperspectral image classification has made great progress with the development of convolutional neural networks. However, due to the challenges of distribution shifts and data redundancies, the classification accuracy is low. So…
Temporal Action Localization in the Deep Learning Era: A Survey Open
Temporal action localization research aims to discover action instances in untrimmed videos, representing a fundamental step in the field of intelligent video understanding. With the advent of deep learning, backbone networks have be…
Explainable Gated Bayesian Recurrent Neural Network for Non-Markov State Estimation Open
The optimality of Bayesian filtering relies on the completeness of prior models, while deep learning holds a distinct advantage in learning models from offline data. Nevertheless, the current fusion of these two methodologies remains large…
Multi-granularity Backprojection Transformer for Remote Sensing Image Super-Resolution Open
Backprojection networks have achieved promising super-resolution performance for natural images but have not been well explored in the remote sensing image super-resolution (RSISR) field due to the high computation costs. In this paper, we propose…
Can large language models provide useful feedback on research papers? A large-scale empirical analysis Open
Expert feedback lays the foundation of rigorous research. However, the rapid growth of scholarly production and intricate knowledge specialization challenge the conventional scientific feedback mechanisms. High-quality peer reviews are inc…
YOLO-DCTI: Small Object Detection in Remote Sensing Base on Contextual Transformer Enhancement Open
Object detection for remote sensing is a fundamental task in image processing of remote sensing; as one of the core components, small or tiny object detection plays an important role. Despite the considerable advancements achieved in small…
CORE: Cooperative Reconstruction for Multi-Agent Perception Open
This paper presents CORE, a conceptually simple, effective and communication-efficient model for multi-agent cooperative perception. It addresses the task from a novel perspective of cooperative reconstruction, based on two key insights: 1…
Cross-Spatial Pixel Integration and Cross-Stage Feature Fusion Based Transformer Network for Remote Sensing Image Super-Resolution Open
Remote sensing image super-resolution (RSISR) plays a vital role in enhancing spatial details and improving the quality of satellite imagery. Recently, Transformer-based models have shown competitive performance in RSISR. To mitigate the q…
Hybrid Attention-Based U-Shaped Network for Remote Sensing Image Super-Resolution Open
Recently, remote sensing image super-resolution (RSISR) has drawn considerable attention and made great breakthroughs based on convolutional neural networks (CNNs). Due to the scale and richness of texture and structural information freque…
GaTector: A Unified Framework for Gaze Object Prediction Open
Gaze object prediction is a newly proposed task that aims to discover the objects being stared at by humans. It is of great application significance but still lacks a unified solution framework. An intuitive solution is to incorporate an o…
Learning Pixel-Adaptive Weights for Portrait Photo Retouching Open
Portrait photo retouching is a photo retouching task that emphasizes human-region priority and group-level consistency. The lookup table-based method achieves promising retouching performance by learning image-adaptive weights to combine 3…
I2Net: Mining intra-video and inter-video attention for temporal action localization Open
This paper focuses on two challenges for the temporal action localization community, i.e., the lack of long-term relationships and action pattern uncertainty. The former prevents the cooperation among multiple action instances within a video, while…
Document- and Keyword-based Author Co-citation Analysis Open
In the field of scientometrics, the principal purpose for author co-citation analysis (ACA) is to map knowledge domains by quantifying the relationship between co-cited author pairs. However, traditional ACA has been criticized since its i…