Explanipedia

Combining Machine Learning and Lifetime-Based Resource Management for Memory Allocation and Beyond Open

Martin Maas, David G. Andersen, Michael Isard, Mohammad Mahdi Javanmard, Kathryn S. McKinley, et al. · 2024

Computer science

Memory management is fundamental to the performance of all applications. On modern server architectures, an application's memory allocator needs to balance memory utilization against the ability to use 2MB huge pages, which are crucial for…

RAIZN: Redundant Array of Independent Zoned Namespaces Open

Thomas Kim, Jekyeom Jeon, Nikhil Arora, Huaicheng Li, Michael Kaminsky, et al. · 2023

Computer science

Zoned Namespace (ZNS) SSDs are the latest evolution of host-managed flash storage, enabling improved performance at a lower cost-per-byte than traditional block interface (conventional) SSDs. To date, there is no support for arranging thes…

Lightweight Preemptible Functions Open

Sol Boucher, Anuj Kalia, David G. Andersen, Michael Kaminsky · 2022

Computer science

We introduce novel programming abstractions for isolation of both time and memory. They operate at finer granularity than traditional primitives, supporting preemption at sub-millisecond timescales and tasks defined at the level of a funct…

Succinct range filters Open

Huanchen Zhang, Hyeontaek Lim, Viktor Leis, David G. Andersen, Michael Kaminsky, et al. · 2021

Computer science Physics Materials science

We present the Succinct Range Filter (SuRF), a fast and compact data structure for approximate membership tests. Unlike traditional Bloom filters, SuRF supports both single-key lookups and common range queries, such as range counts. SuRF i…

High availability in cheap distributed key value storage Open

Thomas Kim, Daniel Wong, Gregory R. Ganger, Michael Kaminsky, David G. Andersen · 2020

Computer science

Memory-based storage currently offers the highest-performance distributed storage, keeping the primary copy of all data in DRAM. Recent advances in non-volatile main memory (NVMM) technologies promise latency similar to DRAM at reduced cos…

Cuckoo index Open

Andreas Kipf, Damian Chromejko, Alexander Hall, Peter Boncz, David G. Andersen · 2020

Computer science Mathematics Biology

In modern data warehousing, data skipping is essential for high query performance. While index structures such as B-trees or hash tables allow for precise pruning, their large storage requirements make them impractical for indexing seconda…

Succinct Range Filters Open

Huanchen Zhang, Hyeontaek Lim, Viktor Leis, David G. Andersen, Michael Kaminsky, et al. · 2020

Computer science Mathematics Materials science

We present the Succinct Range Filter (SuRF), a fast and compact data structure for approximate membership tests. Unlike traditional Bloom filters, SuRF supports both single-key lookups and common range queries: open-range queries, closed-r…

Order-Preserving Key Compression for In-Memory Search Trees Open

Huanchen Zhang, Xiaoxuan Liu, David G. Andersen, Michael Kaminsky, Kimberly Keeton, et al. · 2020

Computer science Engineering Physics

We present the High-speed Order-Preserving Encoder (HOPE) for in-memory search trees. HOPE is a fast dictionary-based compressor that encodes arbitrary keys while preserving their order. HOPE's approach is to identify common key patterns a…

Improving Approximate Nearest Neighbor Search through Learned Adaptive Early Termination Open

Conglong Li, Minjia Zhang, David G. Andersen, Yuxiong He · 2020

Computer science

In applications ranging from image search to recommendation systems, the problem of identifying a set of "similar" real-valued vectors to a query vector plays a critical role. However, retrieving these vectors and computing the correspondi…

Learning-based Memory Allocation for C++ Server Workloads Open

Martin Maas, David G. Andersen, Michael Isard, Mohammad Mahdi Javanmard, Kathryn S. McKinley, et al. · 2020

Computer science

Modern C++ servers have memory footprints that vary widely over time, causing persistent heap fragmentation of up to 2x from long-lived objects allocated during peak memory usage. This fragmentation is exacerbated by the use of huge (2MB) …

Order-Preserving Key Compression for In-Memory Search Trees Open

Huanchen Zhang, Xiaoxuan Liu, David G. Andersen, Michael Kaminsky, Kimberly Keeton, et al. · 2020

Computer science Mathematics Engineering

We present the High-speed Order-Preserving Encoder (HOPE) for in-memory search trees. HOPE is a fast dictionary-based compressor that encodes arbitrary keys while preserving their order. HOPE's approach is to identify common key patterns a…

Accelerating Deep Learning by Focusing on the Biggest Losers Open

Angela H. Jiang, Daniel Wong, Giulio Zhou, David G. Andersen, Jay B. Dean, et al. · 2019

Computer science Physics

This paper introduces Selective-Backprop, a technique that accelerates the training of deep neural networks (DNNs) by prioritizing examples with high loss at each iteration. Selective-Backprop uses the output of a training example's forwar…

Scaling Video Analytics on Constrained Edge Nodes Open

Christopher Canel, Thomas Kim, Giulio Zhou, Conglong Li, Hyeontaek Lim, et al. · 2019

Computer science Geography Economics

As video camera deployments continue to grow, the need to process large volumes of real-time data strains wide area network infrastructure. When per-camera bandwidth is limited, it is infeasible for applications such as traffic monitoring …

MLSys: The New Frontier of Machine Learning Systems Open

Alexander Ratner, Dan Alistarh, Gustavo Alonso, David G. Andersen, Peter Bailis, et al. · 2019

Computer science Engineering Political science

Machine learning (ML) techniques are enjoying rapidly increasing adoption. However, designing and implementing the systems that support ML models in real-world deployments remains a significant obstacle, in large part due to the radically …

EDF: Ensemble, Distill, and Fuse for Easy Video Labeling Open

Giulio Zhou, Subramanya R. Dulloor, David G. Andersen, Michael Kaminsky · 2018

Computer science Mathematics Engineering

We present a way to rapidly bootstrap object detection on unseen videos using minimal human annotations. We accomplish this by combining two complementary sources of knowledge (one generic and the other specific) using bounding box merging…

Motivating the Rules of the Game for Adversarial Example Research Open

Justin Gilmer, Ryan P. Adams, Ian Goodfellow, David G. Andersen, George E. Dahl · 2018

Computer science Engineering Biology

Advances in machine learning have led to broad deployment of systems with impressive performance on important problems. Nonetheless, these systems can be induced to make errors on data that are surprisingly similar to examples the learned …

Datacenter RPCs can be General and Fast Open

Anuj Kalia, Michael Kaminsky, David G. Andersen · 2018

Computer science

It is commonly believed that datacenter networking software must sacrifice generality to attain high performance. The popularity of specialized distributed systems designed specifically for niche technologies such as RDMA, lossless network…

SuRF Open

Huanchen Zhang, Hyeontaek Lim, Viktor Leis, David G. Andersen, Michael Kaminsky, et al. · 2018

Computer science Mathematics Engineering

We present the Succinct Range Filter (SuRF), a fast and compact data structure for approximate membership tests. Unlike traditional Bloom filters, SuRF supports both single-key lookups and common range queries: open-range queries, closed-r…

3LC: Lightweight and Effective Traffic Compression for Distributed Machine Learning Open

Hyeontaek Lim, David G. Andersen, Michael Kaminsky · 2018

Computer science Materials science

The performance and efficiency of distributed machine learning (ML) depends significantly on how long it takes for nodes to exchange state changes. Overly-aggressive attempts to reduce communication often sacrifice final model accuracy and…

FastPass: Providing First-Packet Delivery Open

Dan Wendlandt, David G. Andersen, Adrian Perrig · 2018

Computer science Psychology Mathematics

This paper introduces FastPass, an architecture that thwarts flooding attacks by providing destinations with total control over their upstream network capacity. FastPass explores an extreme design point, providing complete resistance to di…

An Architecture for Internet Data Transfer Open

Niraj H. Tolia, Michael Kaminsky, David G. Andersen, Swapnil Patil · 2018

Computer science Art Economics

This paper presents the design and implementation of DOT, a flexible architecture for data transfer. This architecture separates content negotiation from the data transfer itself. Applications determine what data they need to send and then…

Learning to Protect Communications with Adversarial Neural Cryptography Open

Martín Abadi, David G. Andersen · 2016

Computer science Physics

We ask whether neural networks can learn to use secret keys to protect information from other neural networks. Specifically, we focus on ensuring confidentiality properties in a multiagent system, and we specify those properties in terms o…

Full-Stack Architecting to Achieve a Billion-Requests-Per-Second Throughput on a Single Key-Value Store Server Platform Open

Sheng Li, Hyeontaek Lim, Victor W. Lee, Jung Ho Ahn, Anuj Kalia, et al. · 2016

Computer science Engineering

Distributed in-memory key-value stores (KVSs), such as memcached, have become a critical data serving layer in modern Internet-oriented data center infrastructure. Their performance and efficiency directly affect the QoS of web services an…

NetMemex: Providing Full-Fidelity Traffic Archival Open

Hyeontaek Lim, Vyas Sekar, Yoshihisa Abe, David G. Andersen · 2016

Computer science Philosophy

NetMemex explores efficient network traffic archival without any loss of information. Unlike NetFlow-like aggregation, NetMemex allows retrieving the entire packet data including full payload, which makes it useful in forensic analysis, ne…

Scheduling techniques for hybrid circuit/packet networks Open

He Liu, Matthew K. Mukerjee, Conglong Li, Nicolas Feltman, George C. Papen, et al. · 2015

Computer science Engineering

A range of new datacenter switch designs combine wireless or optical circuit technologies with electrical packet switching to deliver higher performance at lower cost than traditional packet-switched networks. These "hybrid" networks sched…

Scaling Up Clustered Network Appliances with ScaleBricks Open

Dong Zhou, Bin Fan, Hyeontaek Lim, David G. Andersen, Michael Kaminsky, et al. · 2015

Computer science Engineering

This paper presents ScaleBricks, a new design for building scalable, clustered network appliances that must "pin" flow state to a specific handling node without being able to choose which node that should be. ScaleBricks applies a new, com…

David G. Andersen