Henry Hoffmann
WatchHAR: Real-time On-device Human Activity Recognition System for Smartwatches
Despite advances in practical and multimodal fine-grained Human Activity Recognition (HAR), a system that runs entirely on smartwatches in unconstrained environments remains elusive. We present WatchHAR, an audio and inertial-based HAR sys…
SwiftSpec: Ultra-Low Latency LLM Decoding by Scaling Asynchronous Speculative Decoding
Low-latency decoding for large language models (LLMs) is crucial for applications like chatbots and code assistants, yet generating long outputs remains slow in single-query settings. Prior work on speculative decoding (which combines a sm…
A Deep Probabilistic Framework for Continuous Time Dynamic Graph Generation
Recent advancements in graph representation learning have shifted attention towards dynamic graphs, which exhibit evolving topologies and features over time. The increased use of such graphs creates a paramount need for generative models s…
Quality Measures for Dynamic Graph Generative Models
Deep generative models have recently achieved significant success in modeling graph data, including dynamic graphs, where topology and features evolve over time. However, unlike in vision and natural language domains, evaluating generative…
MobilePoser: Real-Time Full-Body Pose Estimation and 3D Human Translation from IMUs in Mobile Consumer Devices
There has been a continued trend towards minimizing instrumentation for full-body motion capture, going from specialized rooms and equipment, to arrays of worn sensors and recently sparse inertial pose capture methods. However, as these te…
CacheGen: KV Cache Compression and Streaming for Fast Large Language Model Serving
As large language models (LLMs) take on complex tasks, their inputs are supplemented with longer contexts that incorporate domain knowledge. Yet using long contexts is challenging, as nothing can be generated until the whole context is proc…
UpDown: Programmable fine-grained Events for Scalable Performance on Irregular Applications
Applications with irregular data structures, data-dependent control flows and fine-grained data transfers (e.g., real-world graph computations) perform poorly on cache-based systems. We propose the UpDown accelerator that supports fine-gra…
Keeper: Automated Testing and Fixing of Machine Learning Software
The increasing number of software applications incorporating machine learning (ML) solutions has led to the need for testing techniques. However, testing ML software requires tremendous human effort to design realistic and relevant test in…
SEASONS: Signal and Energy Aware Sensing on iNtermittent Systems
Both energy-aware, batteryless intermittent systems and signal-aware adaptive sampling algorithms (ASA) aim to maximize sensor data accuracy under energy constraints in edge devices. Intuitively, combining both into a signal- & energy-awar…
Acoustic Keystroke Leakage on Smart Televisions
Smart Televisions (TVs) are internet-connected TVs that support video streaming applications and web browsers. Users enter information into Smart TVs through on-screen virtual keyboards. These keyboards require users to navigate between keys…
DPS: Adaptive Power Management for Overprovisioned Systems
Maximizing performance under a power budget is essential for HPC systems and has inspired the development of many power management frameworks. These can be broadly characterized into two groups: model-based and stateless. Model-based frame…
Run-Time Prevention of Software Integration Failures of Machine Learning APIs
Due to the under-specified interfaces, developers face challenges in correctly integrating machine learning (ML) APIs in software. Even when the ML API and the software are well designed on their own, the resulting application misbehaves w…
Automatic and Efficient Customization of Neural Networks for ML Applications
ML APIs have greatly relieved application developers of the burden to design and train their own neural network models -- classifying objects in an image can now be as simple as one line of Python code to call an API. However, these APIs o…
Navigating the Dynamic Noise Landscape of Variational Quantum Algorithms with QISMET
In the Noisy Intermediate Scale Quantum (NISQ) era, the dynamic nature of quantum systems causes noise sources to constantly vary over time. Transient errors from the dynamic NISQ noise landscape are challenging to comprehend and are espec…
Apparatus and method for optimizing quantifiable behavior in configurable devices and systems
An apparatus and method are provided to perform constrained optimization of a constrained property of an apparatus, which is complex due to having several components, and these components are configurable in real-time. The optimization is …
CAFQA: A Classical Simulation Bootstrap for Variational Quantum Algorithms
Classical computing plays a critical role in the advancement of quantum frontiers in the NISQ era. In this spirit, this work uses classical simulation to bootstrap Variational Quantum Algorithms (VQAs). VQAs rely upon the iterative optimiz…
Acela: Predictable Datacenter-level Maintenance Job Scheduling
Datacenter operators ensure fair and regular server maintenance by using automated processes to schedule maintenance jobs to complete within a strict time budget. Automating this scheduling problem is challenging because maintenance job du…
CALORIE: A Constraint Language and Optimizing Runtime for Exascale Power Management (Final Report)
This final technical report summarizes the key accomplishments on the CALORIE project, a DOE Early Career award received by PI Henry Hoffmann at the University of Chicago. CALORIE’s main goal was to create principled methodologies, tools, …
Navigating the dynamic noise landscape of variational quantum algorithms with QISMET
Transient errors from the dynamic NISQ noise landscape are challenging to comprehend and are especially detrimental to classes of applications that are iterative and/or long-running, and therefore their timely mitigation is important for q…
SCOPE: Safe Exploration for Dynamic Computer Systems Optimization
Modern computer systems need to execute under strict safety constraints (e.g., a power limit), but doing so often conflicts with their ability to deliver high performance (i.e. minimal latency). Prior work uses machine learning to automati…
Cello: Efficient Computer Systems Optimization with Predictive Early Termination and Censored Regression
Sample-efficient machine learning (SEML) has been widely applied to find optimal latency and power tradeoffs for configurable computer systems. Instead of randomly sampling from the configuration space, SEML reduces the search cost by dram…
NURD: Negative-Unlabeled Learning for Online Datacenter Straggler Prediction
Datacenters execute large computational jobs, which are composed of smaller tasks. A job completes when all its tasks finish, so stragglers -- rare, yet extremely slow tasks -- are a major impediment to datacenter performance. Accurately p…
CAFQA: A classical simulation bootstrap for variational quantum algorithms
This work tackles the problem of finding a good ansatz initialization for Variational Quantum Algorithms (VQAs), by proposing CAFQA, a Clifford Ansatz For Quantum Accuracy. The CAFQA ansatz is a hardware-efficient circuit built with only C…
Generalizable and interpretable learning for configuration extrapolation
Modern software applications are increasingly configurable, which puts a burden on users to tune these configurations for their target hardware and workloads. To help users, machine learning techniques can model the complex relationships b…
Proxima
Atomistic-scale simulations are prominent scientific applications that require the repetitive execution of a computationally expensive routine to calculate a system's potential energy. Prior work shows that these expensive routines can be …
Neighborhood street activity and greenspace usage uniquely contribute to predicting crime
Crime is a costly societal issue. While many factors influence urban crime, one less-studied but potentially important factor is neighborhood greenspace. Research has shown that greenspace is often negatively associated with crime. Measuri…