Taiming Lu
Stronger Normalization-Free Transformers
Although normalization layers have long been viewed as indispensable components of deep learning architectures, the recent introduction of Dynamic Tanh (DyT) has demonstrated that alternatives are possible. The point-wise function DyT cons…
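For reference, the DyT layer introduced in prior work is a point-wise, drop-in substitute for a normalization layer, defined as DyT(x) = γ · tanh(αx) + β with a learnable scalar α and per-channel affine parameters γ, β. The sketch below illustrates that baseline formulation in PyTorch; the class name, the default α initialization, and any details of the stronger variant studied in this article are assumptions, not the article's exact design.

```python
import torch
import torch.nn as nn

class DyT(nn.Module):
    """Minimal sketch of the baseline Dynamic Tanh (DyT) layer:
    DyT(x) = gamma * tanh(alpha * x) + beta.
    No statistics are computed across tokens or channels."""

    def __init__(self, dim: int, init_alpha: float = 0.5):
        super().__init__()
        self.alpha = nn.Parameter(torch.ones(1) * init_alpha)  # learnable scalar inside tanh
        self.gamma = nn.Parameter(torch.ones(dim))              # per-channel scale
        self.beta = nn.Parameter(torch.zeros(dim))              # per-channel shift

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Element-wise squashing followed by an affine transform,
        # playing the role a LayerNorm would otherwise play.
        return self.gamma * torch.tanh(self.alpha * x) + self.beta
```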
World-in-World: World Models in a Closed-Loop World
Generative world models (WMs) can now simulate worlds with striking visual realism, which naturally raises the question of whether they can endow embodied agents with predictive perception for decision making. Progress on this question has…
GenEx: Generating an Explorable World
Understanding, navigating, and exploring the 3D physical real world has long been a central challenge in the development of artificial intelligence. In this work, we take a step toward this goal by introducing GenEx, a system capable of pl…
Generative World Explorer
Planning with partial observation is a central challenge in embodied AI. A majority of prior works have tackled this challenge by developing agents that physically explore their environment to update their beliefs about the world state. In…
Insights into LLM Long-Context Failures: When Transformers Know but Don't Tell
Large Language Models (LLMs) exhibit positional bias, struggling to utilize information from the middle or end of long contexts. Our study explores LLMs' long-context reasoning by probing their hidden representations. We find that while LL…
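The abstract describes probing hidden representations; a generic linear-probe setup, sketched below, trains a small classifier on frozen LLM activations to test what a layer encodes. The class and function names, feature shapes, and training hyperparameters are illustrative assumptions, not the authors' exact protocol.

```python
import torch
import torch.nn as nn

class LinearProbe(nn.Module):
    """A linear classifier trained on frozen hidden states."""

    def __init__(self, hidden_dim: int, num_classes: int):
        super().__init__()
        self.head = nn.Linear(hidden_dim, num_classes)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # hidden_states: (batch, hidden_dim) activations from a frozen LLM layer.
        return self.head(hidden_states)

def train_probe(probe, features, labels, epochs=10, lr=1e-3):
    # features: (N, hidden_dim) frozen activations; labels: (N,) integer targets
    # (e.g., which context segment holds the relevant information).
    opt = torch.optim.Adam(probe.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(probe(features), labels)
        loss.backward()
        opt.step()
    return probe
```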