Ki, Dayeon
Can They Dixit? Yes they Can! Dixit as a Playground for Multimodal Language Model Capabilities
Multi-modal large language models (MLMs) are often assessed on static, individual benchmarks -- which cannot jointly assess MLM capabilities in a single task -- or rely on human or model pairwise comparisons -- which are highly subjective, …
Linguistic Nepotism: Trading-off Quality for Language Preference in Multilingual RAG
Multilingual Retrieval-Augmented Generation (mRAG) systems enable language models to answer knowledge-intensive queries with citation-supported responses across languages. While such systems have been proposed, an open question is whether…
GraphicBench: A Planning Benchmark for Graphic Design with Language Agents
Large Language Model (LLM)-powered agents have unlocked new possibilities for automating human tasks. While prior work has focused on well-defined tasks with specified goals, the capabilities of agents in creative design tasks with open-en…
AskQE: Question Answering as Automatic Evaluation for Machine Translation
How can a monolingual English speaker determine whether an automatic translation in French is good enough to be shared? Existing MT error detection and quality estimation (QE) techniques do not address this practical scenario. We introduce…