Giovanni Monea
Separating Tongue from Thought: Activation Patching Reveals Language-Agnostic Concept Representations in Transformers
A central question in multilingual language modeling is whether large language models (LLMs) develop a universal concept representation, disentangled from specific languages. In this paper, we address this question by analyzing latent representations…
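To make the method named in the title concrete, here is a minimal sketch of activation patching using PyTorch forward hooks. The toy model, the choice of layer, and the random inputs are illustrative assumptions, not the paper's actual multilingual setup.

```python
# A minimal sketch of activation patching with PyTorch forward hooks.
# The toy model, layer choice, and random inputs are illustrative
# assumptions, not the paper's experimental setup.
import torch
import torch.nn as nn

torch.manual_seed(0)

class ToyBlock(nn.Module):
    """Stand-in for one transformer layer."""
    def __init__(self, d):
        super().__init__()
        self.linear = nn.Linear(d, d)

    def forward(self, x):
        return torch.relu(self.linear(x))

class ToyModel(nn.Module):
    def __init__(self, d=8, n_layers=4):
        super().__init__()
        self.blocks = nn.ModuleList([ToyBlock(d) for _ in range(n_layers)])
        self.head = nn.Linear(d, d)

    def forward(self, x):
        for block in self.blocks:
            x = block(x)
        return self.head(x)

model = ToyModel()
source_input = torch.randn(1, 8)  # e.g. a prompt in language A
target_input = torch.randn(1, 8)  # e.g. the same concept in language B

cached = {}
layer = model.blocks[2]  # hypothetical layer of interest

def cache_hook(module, inputs, output):
    # Record the layer's activation on the source run
    # (returning None leaves the forward pass unchanged).
    cached["act"] = output.detach()

handle = layer.register_forward_hook(cache_hook)
model(source_input)
handle.remove()

def patch_hook(module, inputs, output):
    # Returning a tensor from a forward hook replaces the layer's output,
    # so the target run continues from the source activation.
    return cached["act"]

handle = layer.register_forward_hook(patch_hook)
patched_out = model(target_input)
handle.remove()

clean_out = model(target_input)
print("Effect of the patch:", (patched_out - clean_out).norm().item())
```

Comparing the patched and clean outputs on the target run is the basic readout: if swapping in the source-language activation steers the target-language run, the patched layer carries information shared across the two inputs.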
Do Llamas Work in English? On the Latent Language of Multilingual Transformers
We ask whether multilingual language models trained on unbalanced, English-dominated corpora use English as an internal pivot language -- a question of key importance for understanding how language models function and the origins of linguistic…
PaSS: Parallel Speculative Sampling
Scaling the size of language models to tens of billions of parameters has led to impressive performance on a wide range of tasks. At generation, these models are used auto-regressively, requiring a forward pass for each generated token, and…
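The per-token forward pass described here is the cost that speculative methods attack. Below is a generic sketch of the draft-then-verify idea with a toy scoring function; note that PaSS itself drafts with the same model in parallel rather than a separate drafter, so this only illustrates the general scheme the abstract motivates. The `toy_logits` stand-in and the greedy acceptance rule are assumptions for the demo.

```python
# A generic sketch of draft-then-verify speculative decoding with a toy
# scoring function. This is NOT the PaSS algorithm itself; it illustrates
# the baseline autoregressive loop and the generic speculative step.
import torch

torch.manual_seed(0)
VOCAB = 100

def toy_logits(tokens: list[int]) -> torch.Tensor:
    """Stand-in for a full-model forward pass: logits at every position."""
    return torch.randn(len(tokens), VOCAB)

def autoregressive(prompt: list[int], n_new: int) -> list[int]:
    # Baseline: one forward pass of the large model per generated token.
    tokens = list(prompt)
    for _ in range(n_new):
        logits = toy_logits(tokens)
        tokens.append(int(logits[-1].argmax()))
    return tokens

def speculative_step(tokens: list[int], draft_len: int = 4) -> list[int]:
    # 1) Cheaply draft draft_len candidate tokens (here: random).
    draft = torch.randint(0, VOCAB, (draft_len,)).tolist()
    # 2) A single forward pass scores all drafted positions at once.
    logits = toy_logits(tokens + draft)
    # 3) Accept the longest draft prefix the full model agrees with;
    #    on the first mismatch, keep the model's own prediction instead.
    accepted = []
    for i, d in enumerate(draft):
        predicted = int(logits[len(tokens) - 1 + i].argmax())
        if predicted != d:
            accepted.append(predicted)
            break
        accepted.append(d)
    return tokens + accepted

print(autoregressive([1, 2, 3], n_new=5))
print(speculative_step([1, 2, 3]))
```

The payoff is that one verification pass can accept several tokens at once, so the number of full-model forward passes per generated token drops below one whenever drafts are frequently accepted.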