Olivia Hsu
Ember: A Compiler for Efficient Embedding Operations on Decoupled Access-Execute Architectures
Irregular embedding lookups are a critical bottleneck in recommender models, sparse large language models, and graph learning models. In this paper, we first demonstrate that, by offloading these lookups to specialized access units, Decoup…
Adaptive Self-improvement LLM Agentic System for ML Library Development
ML libraries, often written in architecture-specific programming languages (ASPLs) that target domain-specific architectures, are key to efficient ML systems. However, writing these high-performance ML libraries is challenging because it r…
DFModel: Design Space Optimization of Large-Scale Systems Exploiting Dataflow Mappings
We propose DFModel, a modeling framework for mapping dataflow computation graphs onto large-scale systems. Mapping a workload to a system requires optimizing dataflow mappings at various levels, including the inter-chip (between chips) lev…
Compilation of Modular and General Sparse Workspaces
Recent years have seen considerable work on compiling sparse tensor algebra expressions. This paper addresses a shortcoming in that work, namely how to generate efficient code (in time and space) that scatters values into a sparse result t…
Mosaic: An Interoperable Compiler for Tensor Algebra
We introduce Mosaic, a sparse tensor algebra compiler that can bind tensor expressions to external functions of other tensor algebra libraries and compilers. Users can extend Mosaic by adding new functions and bind a sub-expression to a fu…
BaCO: A Fast and Portable Bayesian Compiler Optimization Framework
We introduce the Bayesian Compiler Optimization framework (BaCO), a general purpose autotuner for modern compilers targeting CPUs, GPUs, and FPGAs. BaCO provides the flexibility needed to handle the requirements of modern autotuning tasks.…
The Sparse Abstract Machine
We propose the Sparse Abstract Machine (SAM), an abstract machine model for targeting sparse tensor algebra to reconfigurable and fixed-function spatial dataflow accelerators. SAM defines a streaming dataflow abstraction with sparse primit…
Inclusive Study Group Formation at Scale
Underrepresented students face many significant challenges in their education. In particular, they often have a harder time than their peers from majority groups in building long-term high-quality study groups. This challenge is exacerbate…
Stardust: Compiling Sparse Tensor Algebra to a Reconfigurable Dataflow Architecture
We introduce Stardust, a compiler that compiles sparse tensor algebra to reconfigurable dataflow architectures (RDAs). Stardust introduces new user-provided data representation and scheduling language constructs for mapping to resource-con…
Compilation of sparse array programming models
This paper shows how to compile sparse array programming languages. A sparse array programming language is an array programming language that supports element-wise application, reduction, and broadcasting of arbitrary functions over dense …