Explanipedia

Zorse: Optimizing LLM Training Efficiency on Heterogeneous GPU Clusters

Runsheng Benson Guo, Utkarsh Anand, Khuzaima Daudjee, Rathijit Sen · 2025

Large language models (LLMs) require vast amounts of GPU compute to train, but limited availability and high costs of GPUs make homogeneous clusters impractical for many organizations. Instead, assembling heterogeneous clusters by pooling …

Cephalo: Harnessing Heterogeneous GPU Clusters for Training Transformer Models

Runsheng Benson Guo, Utkarsh Anand, Arthur Chen, Khuzaima Daudjee · 2024

Training transformer models requires substantial GPU compute and memory resources. In homogeneous clusters, distributed strategies allocate resources evenly, but this approach is inefficient for heterogeneous clusters, where GPUs differ in…

FreeRide: Harvesting Bubbles in Pipeline Parallelism

Jiashu Zhang, Zihan Pan, Molly, Kang Xu, Khuzaima Daudjee, et al. · 2024

Computer science, Environmental science

The occurrence of bubbles in pipeline parallelism is an inherent limitation that can account for more than 40% of the large language model (LLM) training time and is one of the main reasons for the underutilization of GPU resources in LLM …

Practical Hardware Transactional vEB Trees

Mohammad Khalaji, Trevor Brown, Khuzaima Daudjee, Vitaly Aksenov · 2024

Computer science, Mathematics

van Emde Boas (vEB) trees are sequential data structures optimized for extremely fast predecessor and successor queries. Such queries are an important incentive to use ordered sets or maps such as vEB trees. All operations in a vEB tree ar…

The future is big graphs

Sherif Sakr, Angela Bonifati, Hannes Voigt, Alexandru Iosup, Khaled Ammar, et al. · 2021

Computer science

Ensuring the success of big graph processing for the next decade and beyond.

Klink: Progress-Aware Scheduling for Streaming Data Systems

Omar Farhat, Khuzaima Daudjee, Leonardo Querzoni · 2021

Computer science, Economics

Modern stream processing engines (SPEs) process large volumes of events propagated at high velocity through multiple queries. To improve performance, existing SPEs generally aim to minimize query output latency by minimizing, in turn, the …

The Future is Big Graphs! A Community View on Graph Processing Systems

Sherif Sakr, Angela Bonifati, Hannes Voigt, Alexandru Iosup, Khaled Ammar, et al. · 2020

Computer science

Graphs are by nature unifying abstractions that can leverage interconnectedness to represent, explore, predict, and explain real- and digital-world phenomena. Although real users and consumers of graph instances and graph workloads underst…

Providing Serializability for Pregel-like Graph Processing Systems

Minyang Han, Khuzaima Daudjee · 2016

Computer science

There is considerable interest in the design and development of distributed systems that can execute algorithms to process large graphs. Serializability guarantees that parallel executions of a graph algorithm produce the same results as s…

Khuzaima Daudjee