Hao Zhang
A Comprehensive Survey on Cross-Domain Recommendation: Taxonomy, Progress, and Prospects
Reverse Modeling in Large Language Models
EFE-YOLO: A YOLO11n-based Efficient Feature Enhanced Framework for Steel Surface Defect Detection
AUCAD: Automated Construction of Alignment Dataset from Log-Related Issues for Enhancing LLM-based Log Generation
Log statements have become an integral part of modern software systems. Prior research efforts have focused on supporting the decisions of placing log statements, such as where/what to log. With the increasing adoption of Large Language Mo…
Adaptive Hypergraph-Augmented Graph Convolution Network for Skeleton-based Action Recognition
Multi‐stage image inpainting using improved partial convolutions
In recent years, deep learning models have dramatically influenced image inpainting. However, many existing studies still suffer from over‐smoothed or blurred textures when missing regions are large or contain rich visual details. To resto…
MR-Ben: A Meta-Reasoning Benchmark for Evaluating System-2 Thinking in LLMs
Large language models (LLMs) have shown increasing capability in problem-solving and decision-making, largely based on the step-by-step chain-of-thought reasoning processes. However, evaluating these reasoning abilities has become increasi…
Blind Image Deblurring: When Patch-wise Minimal Pixels Prior Meets Fractional-Order Method
Blind image deblurring is a challenging issue in image processing. In blind image deblurring, the typical approach involves iteratively estimating both the blur kernel and latent image until convergence to the blur kernel of the observed i…
Evaluating the External and Parametric Knowledge Fusion of Large Language Models
Integrating external knowledge into large language models (LLMs) presents a promising solution to overcome the limitations imposed by their antiquated and static parametric memory. Prior studies, however, have tended to over-rely on ex…
MMT-Bench: A Comprehensive Multimodal Benchmark for Evaluating Large Vision-Language Models Towards Multitask AGI
Large Vision-Language Models (LVLMs) show significant strides in general-purpose multimodal applications such as visual dialogue and embodied navigation. However, existing multimodal evaluation benchmarks cover a limited number of multimod…
A Non-Destructive Detection and Grading Method of the Internal Quality of Preserved Eggs Based on an Improved ConvNext
As a traditional delicacy in China, preserved eggs inevitably experience instances of substandard quality during the production process. Chinese preserved egg production facilities can only rely on experienced workers to select the preserv…
Empowering Sequential Recommendation from Collaborative Signals and Semantic Relatedness
Sequential recommender systems (SRS) can capture dynamic user preferences by modeling historical behaviors ordered in time. Despite their effectiveness, focusing only on the collaborative signals from behaviors does not fully grasp us…
Deep Unfolding Network with Spatial Alignment for multi-modal MRI reconstruction
Multi-modal Magnetic Resonance Imaging (MRI) offers complementary diagnostic information, but some modalities are limited by the long scanning time. To accelerate the whole acquisition process, MRI reconstruction of one modality from highl…
Visual sentiment analysis with semantic correlation enhancement
Visual sentiment analysis is in great demand as it provides a computational method to recognize sentiment information in abundant visual content from social media sites. Most existing methods use CNNs to extract varying visual attribut…
Interpretable Geoscience Artificial Intelligence (XGeoS-AI): Application to Demystify Image Recognition
As Earth science enters the era of big data, artificial intelligence (AI) not only offers great potential for solving geoscience problems, but also plays a critical role in accelerating the understanding of the complex, interactive, and mu…
Text-Guided Generation and Editing of Compositional 3D Avatars
Our goal is to create a realistic 3D facial avatar with hair and accessories using only a text description. While this challenge has attracted significant recent interest, existing methods either lack realism, produce unrealistic shapes, o…
Visual Sentiment Analysis with Semantic Correlation Enhancement
The moving target tracking and segmentation method based on space-time fusion
At present, the target tracking method based on the correlation operation mainly uses deep learning to extract spatial information from video frames and then performs correlations on this basis. However, it does not extract the motion feat…
Learning multi-level representations for affective image recognition
Images can convey intense affective experiences and affect people on an affective level. With the prevalence of online pictures and videos, evaluating emotions from visual content has attracted considerable attention. Affective image recog…
Fine-grained Sentiment Classification of Chinese Microblogs Combining Dual Weight Mechanism and Graph Convolutional Neural Network
Using deep learning models and attention mechanisms to classify fine-grained emotions of Chinese microblogs has become a research hotspot. However, the existing attention mechanisms consider the impact of words on words, and lack effective in…
Graph transformer network with temporal kernel attention for skeleton-based action recognition
Skeleton-based human action recognition has attracted wide attention, as skeleton data can robustly adapt to dynamic circumstances such as camera view changes and background interference, thus allowing recognition methods to focus on robust feat…
MAM: A multipath attention mechanism for image recognition
Attention mechanisms have shown excellent performance in many computer vision tasks, but the previous literature may not adequately consider different types of attention mechanisms, or designs them individually and elaborately for a specific network.…
Contrastive learning for a single historical painting’s blind super-resolution
OsaMOT: Occlusion and scale‐aware multi‐object tracking algorithm for low viewpoint
Multi‐object tracking (MOT), which uses the context information of image sequences to locate, maintain identities and generate trajectories of multiple targets in each frame, is a key technology in the field of computer vision. To address th…
Ant_ViBe: Improved ViBe Algorithm Based on Ant Colony Clustering under Dynamic Background
Foreground target detection algorithm (FTDA) is a fundamental preprocessing step in computer vision and video processing. A universal background subtraction algorithm for video sequences (ViBe) is a fast, simple, efficient and with optimal…
Target Tracking Method Based on Adaptive Structured Sparse Representation With Attention
Considering the problems of motion blur, partial occlusion and fast motion in target tracking, a target tracking method based on adaptive structured sparse representation with attention is proposed. Under the framework of particle filterin…