Ali R. Butt
Multi-Agent Code-Orchestrated Generation for Reliable Infrastructure-as-Code Open
The increasing complexity of cloud-native infrastructure has made Infrastructure-as-Code (IaC) essential for reproducible and scalable deployments. While large language models (LLMs) have shown promise in generating IaC snippets from natur…
User-based I/O Profiling for Leadership Scale HPC Workloads Open
I/O constitutes a significant portion of most of the application runtime. Spawning many such applications concurrently on an HPC system leads to severe I/O contention. Thus, understanding and subsequently reducing I/O contention induced by…
Ensuring Fair LLM Serving Amid Diverse Applications Open
In a multi-tenant large language model (LLM) serving platform hosting diverse applications, some users may submit an excessive number of requests, causing the service to become unavailable to other users and creating unfairness. Existing f…
CIWARS: a web server for waterborne antibiotic resistance surveillance using longitudinal metagenomic data Open
The rise of antibiotic resistance (AR) is a major global health crisis, exacerbated by the overuse and misuse of antibiotics, leading to the rapid spread of antibiotic resistance genes (ARGs) in bacterial pathogens. This phenomenon poses s…
FLOAT: Federated Learning Optimizations with Automated Tuning Open
Federated Learning (FL) has emerged as a powerful approach that enables collaborative distributed model training without the need for data sharing. However, FL grapples with inherent heterogeneity challenges leading to issues such as strag…
Tarazu: An Adaptive End-to-end I/O Load-balancing Framework for Large-scale Parallel File Systems Open
The imbalanced I/O load on large parallel file systems affects the parallel I/O performance of high-performance computing (HPC) applications. One of the main reasons for I/O imbalances is the lack of a global view of system-wide resource c…
An End-to-end High-performance Deduplication Scheme for Docker Registries and Docker Container Storage Systems Open
The wide adoption of Docker containers for supporting agile and elastic enterprise applications has led to a broad proliferation of container images. The associated storage performance and capacity requirements place a high pressure on the…
Towards Persistent Memory based Stateful Serverless Computing for Big Data Applications Open
The Function-as-a-service (FaaS) computing model has recently seen significant growth especially for highly scalable, event-driven applications. The easy-to-deploy and cost-efficient fine-grained billing of FaaS is highly attractive to big…
A Survey on Attacks and Their Countermeasures in Deep Learning: Applications in Deep Neural Networks, Federated, Transfer, and Deep Reinforcement Learning Open
Deep Learning (DL) techniques are being used in various critical applications like self-driving cars. DL techniques such as Deep Neural Networks (DNN), Deep Reinforcement Learning (DRL), Federated Learning (FL), and Transfer Learning (TL) …
Towards cost-effective and resource-aware aggregation at Edge for Federated Learning Open
Federated Learning (FL) is a machine learning approach that addresses privacy and data transfer costs by computing data at the source. It's particularly popular for Edge and IoT applications where the aggregator server of FL is in resource…
An Analysis of System Balance and Architectural Trends Based on Top500 Supercomputers Open
Supercomputer design is a complex, multi-dimensional optimization process, wherein several subsystems need to be reconciled to meet a desired figure of merit performance for a portfolio of applications and a budget constraint. However, ove…
Prediction of high-performance computing input/output variability and its application to optimization for system configurations Open
Performance variability is an important measure for a reliable high performance computing (HPC) system. Performance variability is affected by complicated interactions between numerous factors, such as CPU frequency, the number of input/ou…
Prediction of High-Performance Computing Input/Output Variability and Its Application to Optimization for System Configurations Open
Performance variability is an important measure for a reliable high performance computing (HPC) system. Performance variability is affected by complicated interactions between numerous factors, such as CPU frequency, the number of input/ou…
Understanding HPC Application I/O Behavior Using System Level Statistics Open
The processor performance of high performance computing (HPC) systems is increasing at a much higher rate than storage performance. This imbalance leads to I/O performance bottlenecks in massively parallel HPC applications. Therefore, ther…
Algorithm 1012 Open
DELAUNAYSPARSE contains both serial and parallel codes written in Fortran 2003 (with OpenMP) for performing medium- to high-dimensional interpolation via the Delaunay triangulation. To accommodate the exponential growth in the size of the …
MARBLE: A Multi-GPU Aware Job Scheduler for Deep Learning on HPC Systems Open
Deep learning (DL) has become a key tool for solving complex scientific problems. However, managing the multi-dimensional large-scale data associated with DL, especially atop extant multiple graphics processing units (GPUs) in modern super…
An Integrated Indexing and Search Service for Distributed File Systems Open
Data services such as search, discovery, and management in scalable distributed environments have traditionally been decoupled from the underlying file systems, and are often deployed using external databases and indexing services. However…
Customizable Scale-Out Key-Value Stores Open
Enterprise KV stores are often not well suited for HPC applications, and thus cumbersome end-to-end KV design customization is required to meet the needs of modern HPC applications. To this end, in this article we present bespoKV, an adapt…
A Quantitative Study of Deep Learning Training on Heterogeneous Supercomputers Open
Deep learning (DL) has become a key technique for solving complex problems in scientific research and discovery. DL training for science is substantially challenging because it has to deal with massive quantities of multi-dimensional data.…
iez: Resource Contention Aware Load Balancing for Large-Scale Parallel File Systems Open
Parallel I/O performance is crucial to sustaining scientific applications on large-scale High-Performance Computing (HPC) systems. However, I/O load imbalance in the underlying distributed and shared storage systems can significantly reduc…
BESPOKV: Application Tailored Scale-Out Key-Value Stores Open
Enterprise KV stores are not well suited for HPC applications, and entail customization and cumbersome end-to-end KV design to extract the HPC application needs. To this end, in this paper we present BESPOKV, an adaptive, extensible, and s…
A Heterogeneity-Aware Task Scheduler for Spark Open
Big data processing systems such as Spark are employed in an increasing number of diverse applications—such as machine learning, graph computation, and scientific computing—each with dynamic and different resource needs. These applications…
An Analysis Workflow-Aware Storage System for Multi-Core Active Flash Arrays Open
Here, the need for novel data analysis is urgent in the face of a data deluge from modern applications. Traditional approaches to data analysis incur significant data movement costs, moving data back and forth between the storage system an…
Sizing Buffers of IoT Edge Routers Open
In typical IoT systems, sensors and actuators are connected to small embedded computers, called IoT devices, and the IoT devices are connected to one or more appropriate cloud services over the internet through an edge access router. A ver…
Chameleon: An Adaptive Wear Balancer for Flash Clusters Open
NAND flash-based Solid State Devices (SSDs) offer the desirable features of high performance, energy efficiency, and fast growing capacity. Thus, the use of SSDs is increasing in distributed storage systems. A key obstacle in this context …
Toward Transparent Data Management in Multi-Layer Storage Hierarchy of HPC Systems Open
Upcoming exascale high performance computing (HPC) systems are expected to comprise multi-tier storage hierarchy, and thus will necessitate innovative storage and I/O mechanisms. Traditional disk and block-based interfaces and file systems…
Scaling up data-parallel analytics platforms: Linear algebraic operation cases Open
Linear algebraic operations such as matrix manipulations form the kernel of many machine learning and other crucial algorithms. Scaling up as well as scaling out such algorithms are key to supporting large scale data analysis that require …