Pengguang Chen
MTMamba++: Enhancing Multi-Task Dense Scene Understanding via Mamba-Based Decoders
Multi-task dense scene understanding, which trains a model for multiple dense prediction tasks, has a wide range of application scenarios. Capturing long-range dependency and enhancing cross-task interactions are crucial to multi-task dens…
MTMamba: Enhancing Multi-Task Dense Scene Understanding by Mamba-Based Decoders
Multi-task dense scene understanding, which learns a model for multiple dense prediction tasks, has a wide range of application scenarios. Modeling long-range dependency and enhancing cross-task interactions are crucial to multi-task dense…
MR-Ben: A Meta-Reasoning Benchmark for Evaluating System-2 Thinking in LLMs
Large language models (LLMs) have shown increasing capability in problem-solving and decision-making, largely based on step-by-step chain-of-thought reasoning. However, evaluating these reasoning abilities has become increasi…
VLPose: Bridging the Domain Gap in Pose Estimation with Language-Vision Tuning
Thanks to advances in deep learning techniques, Human Pose Estimation (HPE) has achieved significant progress in natural scenarios. However, these models perform poorly in artificial scenarios such as painting and sculpture due to the doma…
MOODv2: Masked Image Modeling for Out-of-Distribution Detection
The crux of effective out-of-distribution (OOD) detection lies in acquiring a robust in-distribution (ID) representation, distinct from OOD samples. While previous methods predominantly leaned on recognition-based techniques for this purpo…
MR-GSM8K: A Meta-Reasoning Benchmark for Large Language Model Evaluation
In this work, we introduce a novel evaluation paradigm for Large Language Models (LLMs) that compels them to transition from a traditional question-answering role, akin to a student, to a solution-scoring role, akin to a teacher. This para…
BAL: Balancing Diversity and Novelty for Active Learning
The objective of Active Learning is to strategically label a subset of the dataset to maximize performance within a predetermined labeling budget. In this study, we harness features acquired through self-supervised learning. We introduce a…
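The abstract above is cut off, but diversity-driven selection over self-supervised features is a standard ingredient in active learning. A minimal numpy sketch of one generic diversity criterion, farthest-point (k-center greedy) selection, is shown below; the function name and the feature toy data are illustrative, and this is not claimed to be BAL's exact diversity/novelty balance:

```python
import numpy as np

def kcenter_greedy(features, budget, seed=0):
    """Pick `budget` samples by farthest-point traversal: each new pick
    is the point farthest from the already-selected set, a common
    diversity criterion in active learning (a generic sketch, not the
    paper's specific method)."""
    selected = [seed]
    dists = np.linalg.norm(features - features[seed], axis=1)
    while len(selected) < budget:
        nxt = int(np.argmax(dists))  # farthest point from current picks
        selected.append(nxt)
        # Each point keeps its distance to its nearest selected sample.
        dists = np.minimum(dists, np.linalg.norm(features - features[nxt], axis=1))
    return selected

# Toy features: index 1 nearly duplicates index 0 and should be skipped.
pts = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 0.0], [0.0, 5.0]])
picked = kcenter_greedy(pts, 3)
```

The near-duplicate point is never chosen, which is exactly the redundancy a diversity criterion is meant to avoid under a fixed labeling budget.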
MoTCoder: Elevating Large Language Models with Modular of Thought for Challenging Programming Tasks
Large Language Models (LLMs) have showcased impressive capabilities in handling straightforward programming tasks. However, their performance tends to falter when confronted with more challenging programming problems. We observe that conve…
Defect Spectrum: A Granular Look of Large-Scale Defect Datasets with Rich Semantics
Defect inspection is paramount within the closed-loop manufacturing system. However, existing datasets for defect inspection often lack the precision and semantic granularity required for practical applications. In this paper, we introduce the…
Dual-Balancing for Multi-Task Learning
Multi-task learning aims to learn multiple related tasks simultaneously and has achieved great success in various fields. However, the disparity in loss and gradient scales among tasks often leads to performance compromises, and the balanc…
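The abstract points to loss- and gradient-scale disparity across tasks as the core difficulty. As a generic illustration of the gradient side of that problem (not the paper's exact dual-balancing algorithm), one common remedy is to rescale each task's gradient to a comparable norm before combining them; the function below and its toy gradients are assumptions for illustration:

```python
import numpy as np

def balance_gradients(task_grads):
    """Rescale each task's gradient to unit L2 norm before averaging,
    so no single task dominates the shared update (a generic
    gradient-balancing step, not the paper's specific method)."""
    balanced = []
    for g in task_grads:
        norm = np.linalg.norm(g)
        balanced.append(g / norm if norm > 0 else g)
    return np.mean(balanced, axis=0)

# Two tasks whose raw gradients differ by 100x in scale.
g1 = np.array([100.0, 0.0])
g2 = np.array([0.0, 1.0])
update = balance_gradients([g1, g2])
```

Without the rescaling, the naive average would be dominated by the first task; after balancing, both tasks contribute equally to the shared update.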
TagCLIP: Improving Discrimination Ability of Open-Vocabulary Semantic Segmentation
Contrastive Language-Image Pre-training (CLIP) has recently shown great promise in pixel-level zero-shot learning tasks. However, existing approaches utilizing CLIP's text and patch embeddings to generate semantic masks often misidentify i…
Rethinking Out-of-distribution (OOD) Detection: Masked Image Modeling is All You Need
The core of out-of-distribution (OOD) detection is to learn the in-distribution (ID) representation, which is distinguishable from OOD samples. Previous work applied recognition-based methods to learn the ID features, which tend to learn s…
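The abstract frames OOD detection as learning ID representations that OOD samples fall far from. A minimal numpy sketch of that feature-distance idea, scoring a sample by its similarity to an in-distribution feature bank, follows; the scoring rule and toy features are illustrative assumptions, not the paper's masked-image-modeling detector:

```python
import numpy as np

def ood_score(feature, id_features):
    """Score a sample by its maximum cosine similarity to an
    in-distribution feature bank; lower similarity means more OOD.
    (A generic feature-distance detector, not the paper's method.)"""
    f = feature / np.linalg.norm(feature)
    bank = id_features / np.linalg.norm(id_features, axis=1, keepdims=True)
    return 1.0 - np.max(bank @ f)  # 0 = identical to some ID sample

id_bank = np.array([[1.0, 0.0], [0.0, 1.0]])
near = ood_score(np.array([0.9, 0.1]), id_bank)   # close to an ID direction
far = ood_score(np.array([-1.0, -1.0]), id_bank)  # far from every ID sample
```

A sample aligned with the ID bank scores near zero, while one pointing away scores high, which is the separation any such representation-based detector relies on.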
Adversarial Attacks on ML Defense Models Competition
Due to the vulnerability of deep neural networks (DNNs) to adversarial examples, a large number of defense techniques have been proposed to alleviate this problem in recent years. However, the progress of building more robust models is usu…
Deep Structured Instance Graph for Distilling Object Detectors
Effectively structuring deep knowledge plays a pivotal role in transfer from teacher to student, especially in semantic vision tasks. In this paper, we present a simple knowledge structure to exploit and encode information inside the detec…
Exploring and Improving Mobile Level Vision Transformers
In this paper we study vision transformer structures at the mobile level and find a dramatic performance drop. We analyze the reason behind this phenomenon and propose a novel irregular patch embedding module and adaptive patch fusion…
Distilling Knowledge via Knowledge Review
Knowledge distillation transfers knowledge from the teacher network to the student one, with the goal of greatly improving the performance of the student network. Previous methods mostly focus on proposing feature transformation and loss f…
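The abstract describes teacher-to-student transfer; the paper's contribution is a feature-level "knowledge review" mechanism, which the snippet does not detail. As background, the classic logit-distillation baseline (Hinton et al.) that such work builds on can be sketched in numpy as the temperature-scaled KL divergence between teacher and student outputs; the temperature value below is an arbitrary choice for illustration:

```python
import numpy as np

def softmax(z, T=1.0):
    """Numerically stable softmax with temperature T."""
    z = np.asarray(z, dtype=float) / T
    z -= z.max()
    e = np.exp(z)
    return e / e.sum()

def kd_loss(student_logits, teacher_logits, T=4.0):
    """Classic logit distillation: KL divergence between
    temperature-softened teacher and student distributions,
    scaled by T^2 (the standard baseline, not this paper's
    feature-review mechanism)."""
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return float(T * T * np.sum(p * (np.log(p) - np.log(q))))

t = [2.0, 1.0, 0.1]
same = kd_loss(t, t)                 # student matches teacher exactly
diff = kd_loss([0.0, 0.0, 0.0], t)   # uniform student, nonzero loss
```

A student that reproduces the teacher's distribution incurs zero loss, so minimizing this objective pulls the student's soft predictions toward the teacher's.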
Jigsaw Clustering for Unsupervised Visual Representation Learning
Unsupervised representation learning with contrastive learning has achieved great success. This line of methods duplicates each training batch to construct contrastive pairs, making each training batch and its augmented version forwarded simult…
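The contrastive pairs the abstract refers to are typically trained with an InfoNCE-style objective, where each sample must identify its positive against the rest of the batch. A minimal numpy sketch of that standard objective follows; it is the generic loss such methods build on, not the jigsaw-specific single-batch construction this paper proposes, and the temperature and toy embeddings are assumptions:

```python
import numpy as np

def info_nce(z1, z2, tau=0.5):
    """InfoNCE loss for paired embeddings z1[i] <-> z2[i]: each row of
    z1 must pick out its positive in z2 against the rest of the batch
    (the generic contrastive objective, not the paper's jigsaw variant)."""
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = (z1 @ z2.T) / tau                  # pairwise similarities
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))   # positives on the diagonal

z = np.array([[1.0, 0.0], [0.0, 1.0]])
aligned = info_nce(z, z)                # positives identical
shuffled = info_nce(z, z[::-1].copy())  # positives mismatched
```

Correctly matched positives yield a lower loss than mismatched ones, which is the signal that drives the duplicated-batch training the abstract describes.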