Jacob Hatef
Scaling Large Language Model Training on Frontier with Low-Bandwidth Partitioning
Scaling up Large Language Model (LLM) training involves fitting a tremendous number of training parameters across a limited number of workers. However, methods like ZeRO-3 that drastically reduce GPU memory pressure often incur heavy commun…
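As an illustrative sketch only (not taken from the paper), ZeRO stage-3 partitioning of the kind discussed here is typically enabled through a DeepSpeed configuration; all values below are assumptions for demonstration.

```python
# Minimal sketch, assuming DeepSpeed is installed and the script is launched
# with a distributed launcher. Values are placeholders, not the paper's settings.
import deepspeed
import torch

ds_config = {
    "train_micro_batch_size_per_gpu": 4,       # assumed batch size
    "bf16": {"enabled": True},                 # assumed mixed-precision setting
    "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
    "zero_optimization": {
        "stage": 3,                            # partition parameters, gradients, and optimizer states
        "overlap_comm": True,                  # overlap collectives with computation
        "reduce_bucket_size": 500_000_000,     # assumed bucket size; larger buckets mean fewer, bigger collectives
    },
}

model = torch.nn.Linear(4096, 4096)            # stand-in for a real LLM
engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)
```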
Demystifying the Communication Characteristics for Distributed Transformer Models
Deep learning (DL) models based on the transformer architecture have revolutionized many DL applications such as large language models (LLMs), vision transformers, audio generation, and time series prediction. Much of this progress has bee…
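A minimal microbenchmark sketch (not from the paper) of the kind used to characterize such communication: timing all-reduce for several message sizes with torch.distributed, assuming the script is launched with torchrun on GPU nodes.

```python
# Sketch only: measure all-reduce latency at a few message sizes to see how
# collective cost scales in data-parallel transformer training.
import time
import torch
import torch.distributed as dist

dist.init_process_group(backend="nccl")
device = torch.device(f"cuda:{dist.get_rank() % torch.cuda.device_count()}")

for numel in (1 << 20, 1 << 24, 1 << 28):      # 1M, 16M, 256M fp32 elements
    buf = torch.ones(numel, device=device)
    torch.cuda.synchronize()
    start = time.perf_counter()
    dist.all_reduce(buf)                       # sum across all ranks
    torch.cuda.synchronize()
    if dist.get_rank() == 0:
        print(f"all_reduce of {numel} elements: {(time.perf_counter() - start) * 1e3:.2f} ms")

dist.destroy_process_group()
```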
The Case for Co-Designing Model Architectures with Hardware
While GPUs are responsible for training the vast majority of state-of-the-art deep learning models, the implications of their architecture are often overlooked when designing new deep learning (DL) models. As a consequence, modifying a DL …
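As a hedged illustration of the co-design idea (not the paper's code), one common consequence is choosing layer dimensions that divide evenly into GPU GEMM tile sizes; the multiple of 64 below is an assumption that depends on the hardware and data type.

```python
# Sketch only: round transformer dimensions up to tile-friendly multiples so
# GEMMs map cleanly onto tensor cores.
def round_up(value: int, multiple: int = 64) -> int:
    """Round value up to the nearest multiple of `multiple`."""
    return ((value + multiple - 1) // multiple) * multiple

hidden_size = round_up(5000)         # -> 5056, divisible by 64
vocab_size = round_up(50257, 128)    # -> 50304, divisible by 128
print(hidden_size, vocab_size)
```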