Sam Dodge

Apple Intelligence Foundation Language Models: Tech Report 2025

Anders Larsen, Xiyou Zhou, Jun Qin, Margit Bowler, Eray Yildiz, et al. · 2025

We introduce two multilingual, multimodal foundation language models that power Apple Intelligence features across Apple devices and services: (i) a 3B-parameter on-device model optimized for Apple silicon through architectural innovations s…

MM1.5: Methods, Analysis & Insights from Multimodal LLM Fine-tuning

Haotian Zhang, Mingfei Gao, Zhe Gan, Philipp Dufter, Nina Wenzel, et al. · 2024

Computer science

We present MM1.5, a new family of multimodal large language models (MLLMs) designed to enhance capabilities in text-rich image understanding, visual referring and grounding, and multi-image reasoning. Building upon the MM1 architecture, MM…

MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training

Brandon McKinzie, Zhe Gan, Jean-Philippe Fauconnier, Sam Dodge, Bowen Zhang, et al. · 2024

Computer science Psychology Geography

In this work, we discuss building performant Multimodal Large Language Models (MLLMs). In particular, we study the importance of various architecture components and data choices. Through careful and comprehensive ablations of the image enc…
