Waffling around for Performance: Visual Classification with Random Words and Broad Concepts

arXiv (Cornell University), June 2023
Karsten Roth, Jae Myung Kim, A. Sophia Koepke, Oriol Vinyals, Cordelia Schmid, Zeynep Akata

The visual classification performance of vision-language models such as CLIP has been shown to benefit from additional semantic knowledge from large language models (LLMs) such as GPT-3. In particular, averaging over LLM-generated class descriptors, e.g., "waffle, which has a round shape", can notably improve generalization performance. In this work, we critically study this behavior and propose WaffleCLIP, a framework for zero-shot visual classification which simply replaces LLM-generated descriptors with random c…
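The descriptor averaging mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: `toy_embed` is a hypothetical stand-in for a real text encoder such as CLIP's, the embedding size, vocabulary, and all function names are assumptions made for illustration, and averaging per-prompt cosine similarities is only one plausible way to combine descriptors. The prompt template follows the "waffle, which has a round shape" example, and the random-word variant follows the "random words" idea named in the title.

```python
import hashlib
import math
import random

DIM = 16  # toy embedding size (assumption, not from the paper)


def normalize(vec):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def toy_embed(text):
    """Hypothetical stand-in for a text encoder: a deterministic unit vector
    derived from a hash of the text. A real setup would use a CLIP encoder."""
    seed = int.from_bytes(hashlib.sha256(text.encode("utf-8")).digest()[:8], "big")
    rng = random.Random(seed)
    return normalize([rng.gauss(0.0, 1.0) for _ in range(DIM)])


def class_score(image_vec, class_name, descriptors):
    """Average the cosine similarity between the image embedding and each
    '<class>, which has <descriptor>' prompt embedding."""
    sims = []
    for descriptor in descriptors:
        prompt_vec = toy_embed(f"{class_name}, which has {descriptor}")
        sims.append(sum(a * b for a, b in zip(image_vec, prompt_vec)))
    return sum(sims) / len(sims)


def random_word_descriptors(vocab, count, words_per_descriptor, rng):
    """Build descriptors from randomly drawn words instead of LLM output."""
    return [
        " ".join(rng.choice(vocab) for _ in range(words_per_descriptor))
        for _ in range(count)
    ]


def classify(image_vec, class_to_descriptors):
    """Return the best-scoring class and the full score table."""
    scores = {
        name: class_score(image_vec, name, descriptors)
        for name, descriptors in class_to_descriptors.items()
    }
    return max(scores, key=scores.get), scores


# Usage: every class shares the same random-word descriptors (an assumption).
rng = random.Random(0)
vocab = ["lamp", "river", "quiet", "seven", "orbit", "maple"]  # illustrative only
shared = random_word_descriptors(vocab, count=4, words_per_descriptor=2, rng=rng)
image_vec = toy_embed("placeholder image")  # stands in for an image embedding
label, scores = classify(image_vec, {"waffle": shared, "pancake": shared})
print(label, scores)
```

Because the encoder here is a hash-based toy, the printed scores carry no meaning; the sketch only shows where descriptors enter the prompt and how per-prompt scores are averaged before taking the arg-max over classes.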

