arXiv (Cornell University)
Lego-Edit: A General Image Editing Framework with Model-Level Bricks and MLLM Builder
September 2025 • Qiong Jia, Yu Liu, Yajie Chai, Xintong Yao, Qiming Lu, Yasen Zhang, R. S. Shi, Ying Huang, Guoquan Zhang
Instruction-based image editing has garnered significant attention due to its direct interaction with users. However, real-world user instructions are immensely diverse, and existing methods often fail to generalize to instructions outside their training domain, which limits their practical application. To address this, we propose Lego-Edit, which leverages the generalization capability of a Multi-modal Large Language Model (MLLM) to organize a suite of model-level editing tools. Leg…