2025-02-07
ELITE: Enhanced Language-Image Toxicity Evaluation for Safety
2025-02-07 • Wonjun Lee, Daewoo Lee, Eugene Choi, Shuishan Yu, Ashkan Yousefpour, Haon Park, Bumsub Ham, Suhyun Kim
Current Vision Language Models (VLMs) remain vulnerable to malicious prompts that induce harmful outputs. Existing safety benchmarks for VLMs primarily rely on automated evaluation methods, but these methods struggle to detect implicit harmful content or produce inaccurate evaluations. Therefore, we found that existing benchmarks have low levels of harmfulness, ambiguous data, and limited diversity in image-text pair combinations. To address these issues, we propose the ELITE benchmark, a high-quality safety evalu…