Essential AI
Essential-Web v1.0: 24T tokens of organized web data
Data plays the most prominent role in how language models acquire skills and knowledge. The lack of massive, well-organized pre-training datasets results in costly and inaccessible data pipelines. We present Essential-Web v1.0, a 24-trilli…
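The "organized" part refers to per-document metadata that makes domain-specific subsets cheap to carve out. A minimal sketch of that workflow, assuming a Hugging Face release; the dataset id and the taxonomy field names below are illustrative, so check the actual release card for the real schema:

```python
from datasets import load_dataset

# Assumptions: the dataset id "EssentialAI/essential-web-v1.0" and the
# "eai_taxonomy"/"subject" fields are hypothetical placeholders for
# whatever schema the actual release uses.
ds = load_dataset("EssentialAI/essential-web-v1.0", split="train", streaming=True)

def is_math(example):
    # Hypothetical taxonomy field: a top-level subject label per document.
    return example.get("eai_taxonomy", {}).get("subject") == "mathematics"

# Stream the 24T-token corpus and keep only math-labeled documents,
# without downloading the full dataset.
math_subset = filter(is_math, ds)
for doc in math_subset:
    print(doc["text"][:200])
    break
```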
Practical Efficiency of Muon for Pretraining
We demonstrate that Muon, the simplest instantiation of a second-order optimizer, explicitly expands the Pareto frontier over AdamW on the compute-time tradeoff. We find that Muon is more effective than AdamW in retaining data efficiency a…
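For context, Muon takes a heavy-ball momentum step and then approximately orthogonalizes each 2D weight update with a few Newton-Schulz iterations. A minimal PyTorch sketch following the widely circulated open-source Muon recipe; the quintic coefficients, momentum flavor, and absence of per-layer scaling here are assumptions and may differ from the paper's exact variant:

```python
import torch

def newton_schulz_orthogonalize(G: torch.Tensor, steps: int = 5,
                                eps: float = 1e-7) -> torch.Tensor:
    # Quintic Newton-Schulz coefficients from the open-source Muon recipe.
    a, b, c = 3.4445, -4.7750, 2.0315
    X = G / (G.norm() + eps)            # normalize so the iteration converges
    transposed = X.shape[0] > X.shape[1]
    if transposed:                      # iterate on the wide orientation
        X = X.T
    for _ in range(steps):
        A = X @ X.T
        X = a * X + (b * A + c * (A @ A)) @ X
    return X.T if transposed else X

@torch.no_grad()
def muon_step(param: torch.Tensor, momentum_buf: torch.Tensor,
              lr: float = 0.02, beta: float = 0.95) -> None:
    # Plain heavy-ball momentum (some implementations use Nesterov instead),
    # then replace the update with its orthogonalized counterpart.
    momentum_buf.mul_(beta).add_(param.grad)
    update = newton_schulz_orthogonalize(momentum_buf)
    param.add_(update, alpha=-lr)
```

In the common recipe, Muon is applied only to the 2D hidden-layer weight matrices, while AdamW continues to handle embeddings, gains, and biases.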
Rethinking Reflection in Pre-Training
A language model's ability to reflect on its own reasoning provides a key advantage for solving complex problems. While most recent research has focused on how this ability develops during reinforcement learning, we show that it actually b…
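One way to make this claim concrete is to probe a base checkpoint with a deliberately flawed chain of thought and look for explicit self-correction in the continuation. A toy sketch that assumes nothing about the paper's actual task suite; the prompt construction and reflection cues below are illustrative, and `gpt2` stands in for any base checkpoint:

```python
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

# Deliberately wrong partial reasoning (17 + 28 is 45, not 35).
prompt = (
    "Q: What is 17 + 28?\n"
    "A: 17 + 28 = 35. Let me double-check: "
)
continuation = generator(prompt, max_new_tokens=40,
                         do_sample=False)[0]["generated_text"]

# Count a completion as reflective if it contains an explicit
# self-correction cue (an illustrative, not exhaustive, list).
cues = ["wait", "actually", "that's wrong", "let me reconsider"]
reflects = any(cue in continuation.lower() for cue in cues)
print(continuation)
print("explicit reflection detected:", reflects)
```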