Yunkai Dang
Explainable and Interpretable Multimodal Large Language Models: A Comprehensive Survey Open
The rapid development of Artificial Intelligence (AI) has revolutionized numerous fields, with large language models (LLMs) and computer vision (CV) systems driving advancements in natural language understanding and visual processing, resp…
RLAIF-V: Open-Source AI Feedback Leads to Super GPT-4V Trustworthiness Open
Traditional feedback learning for hallucination reduction relies on labor-intensive manual labeling or expensive proprietary models. This leaves the community without foundational knowledge about how to build high-quality feedback with ope…
FILM: How can Few-Shot Image Classification Benefit from Pre-Trained Language Models? Open
Few-shot learning aims to train models that can generalize to novel classes with only a few samples. Recently, a line of work has been proposed to enhance few-shot learning with accessible semantic information from class names. However, th…
Multi-Level Correlation Network For Few-Shot Image Classification Open
Few-shot image classification (FSIC) aims to recognize novel classes given few labeled images from base classes. Recent works have achieved promising classification performance, especially for metric-learning methods, where a measure at …