arXiv (Cornell University)
Chopper: A Multi-Level GPU Characterization Tool & Derived Insights Into LLM Training Inefficiency
December 2025 • Marco Kurzynski, Shaizeen Aga, Di Wu
Training large language models (LLMs) efficiently requires a deep understanding of how modern GPU systems behave under real-world distributed training workloads. While prior work has focused primarily on kernel-level performance or single-GPU microbenchmarks, the complex interaction between communication, computation, memory behavior, and power management in multi-GPU LLM training remains poorly characterized. In this work, we introduce Chopper, a profiling and analysis framework that collects, aligns, and visualizes …
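The abstract describes a framework that collects and aligns computation, communication, memory, and power signals across a multi-GPU training run. As a rough illustration only, and not Chopper's actual interface, the Python sketch below samples per-GPU power, memory, and utilization counters via NVML on a background thread while tagging each training step with an NVTX range, so the counter samples can later be lined up against kernel and communication traces; every function and variable name here is hypothetical.

```python
# Hypothetical sketch of multi-level GPU characterization (not Chopper's API).
# Assumes an NVIDIA GPU with pynvml and PyTorch installed.
import time
import threading

import pynvml
import torch

def sample_gpu_counters(stop_event, samples, device_index=0, period_s=0.05):
    """Poll NVML power/memory/utilization counters with wall-clock timestamps."""
    pynvml.nvmlInit()
    handle = pynvml.nvmlDeviceGetHandleByIndex(device_index)
    while not stop_event.is_set():
        mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
        util = pynvml.nvmlDeviceGetUtilizationRates(handle)
        samples.append({
            "t": time.time(),  # timestamp used later to align with kernel traces
            "power_w": pynvml.nvmlDeviceGetPowerUsage(handle) / 1000.0,
            "mem_used_gib": mem.used / 2**30,
            "sm_util_pct": util.gpu,
        })
        time.sleep(period_s)
    pynvml.nvmlShutdown()

def train_step(model, batch, optimizer):
    """One training step wrapped in an NVTX range so profilers can align it."""
    torch.cuda.nvtx.range_push("train_step")
    loss = model(batch).sum()  # stand-in loss for illustration only
    loss.backward()
    optimizer.step()
    optimizer.zero_grad(set_to_none=True)
    torch.cuda.nvtx.range_pop()

if __name__ == "__main__":
    samples, stop = [], threading.Event()
    sampler = threading.Thread(target=sample_gpu_counters, args=(stop, samples))
    sampler.start()

    model = torch.nn.Linear(4096, 4096).cuda()
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
    for _ in range(10):
        train_step(model, torch.randn(64, 4096, device="cuda"), optimizer)
    torch.cuda.synchronize()

    stop.set()
    sampler.join()
    print(f"collected {len(samples)} counter samples")
```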
Computer Science
X-Inefficiency
Architecture
Artificial Intelligence
Parallel Computing
Computer Engineering
Training, Validation, and Test Data Sets
Embedded System
Machine Learning
Data Structure
Visualization (Graphics)