Dave Dice
Semaphores Augmented with a Waiting Array
Semaphores are a widely used and foundational synchronization and coordination construct for shared-memory multithreaded programming. They are a keystone concept, in the sense that most other synchronization constructs can be implemen…
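The "keystone" claim lends itself to a small illustration. The following is a minimal sketch of our own, not taken from the paper: a mutual-exclusion lock built from a C++20 binary semaphore initialized with one permit.

```cpp
// Sketch: a mutex implemented on top of a binary semaphore.
// Not the paper's construction; just an illustration of semaphores
// serving as a building block for other synchronization primitives.
#include <semaphore>

class SemMutex {
    std::binary_semaphore sem_{1};      // one permit available => lock is free
public:
    void lock()   { sem_.acquire(); }   // take the permit, blocking if it is held
    void unlock() { sem_.release(); }   // return the permit, admitting one waiter
};
```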
Reciprocating Locks
We present "Reciprocating Locks", a novel mutual exclusion locking algorithm, targeting cache-coherent shared memory (CC), that enjoys a number of desirable properties. The doorway arrival phase and the release operation both run in consta…
Exploring Time-Space trade-offs for synchronized in Lilliput
In the context of Project Lilliput, which attempts to reduce the size of the object header in the HotSpot Java Virtual Machine (JVM), we explore a curated set of synchronization algorithms. Each of the algorithms could serve as a potential rep…
FedPerm: Private and Robust Federated Learning by Parameter Permutation
Federated Learning (FL) is a distributed learning paradigm that enables mutually untrusting clients to collaboratively train a common machine learning model. Client data privacy is paramount in FL. At the same time, the model must be prote…
Intra-process Caching and Reuse of Threads
Creating and destroying threads on modern Linux systems incurs high latency even absent concurrency, and fails to scale as we increase concurrency. To address this concern we introduce a process-local cache of idle threads. Specifically, inste…
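The core idea can be conveyed with a small, hedged sketch of our own (assumptions, not the paper's implementation): rather than letting a worker thread terminate when its task finishes, park it in a process-local pool so a later spawn can hand it new work instead of paying for thread creation again.

```cpp
// Sketch: a process-local cache of idle threads. Illustrative only.
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

class CachedThreads {
public:
    // Run fn on a cached idle thread if one exists, else on a fresh thread.
    void spawn(std::function<void()> fn) {
        std::lock_guard<std::mutex> lk(m_);
        pending_.push_back(std::move(fn));
        if (idle_ > 0) cv_.notify_one();                 // reuse a parked thread
        else workers_.emplace_back([this] { serve(); }); // fall back to creation
    }
    ~CachedThreads() {
        { std::lock_guard<std::mutex> lk(m_); stop_ = true; }
        cv_.notify_all();
        for (auto& t : workers_) t.join();
    }
private:
    void serve() {
        std::unique_lock<std::mutex> lk(m_);
        for (;;) {
            if (!pending_.empty()) {
                auto fn = std::move(pending_.front());
                pending_.pop_front();
                lk.unlock();
                fn();                                    // run outside the lock
                lk.lock();
                continue;
            }
            if (stop_) return;
            ++idle_;                                     // park instead of exiting
            cv_.wait(lk, [this] { return stop_ || !pending_.empty(); });
            --idle_;
        }
    }
    std::mutex m_;
    std::condition_variable cv_;
    std::deque<std::function<void()>> pending_;
    std::vector<std::thread> workers_;
    int idle_ = 0;
    bool stop_ = false;
};
```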
Ready When You Are: Efficient Condition Variables via Delegated Condition Evaluation
Multithreaded applications commonly utilize condition variables for communication between threads. Condition variables allow threads to block and wait until a certain condition holds, and also enable threads to wake up their blocked peers n…
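As a point of reference, the conventional wait/notify idiom the paper starts from looks roughly like the sketch below (standard C++ API, illustrative names; the paper's delegated condition evaluation is not shown):

```cpp
// Sketch of the classic condition-variable pattern.
#include <condition_variable>
#include <mutex>

std::mutex m;
std::condition_variable cv;
bool ready = false;          // the "certain condition" threads wait on

void waiter() {
    std::unique_lock<std::mutex> lk(m);
    cv.wait(lk, [] { return ready; });   // block until the condition holds
    // ... proceed while holding the lock ...
}

void signaler() {
    {
        std::lock_guard<std::mutex> lk(m);
        ready = true;                    // change the condition under the lock
    }
    cv.notify_one();                     // wake a blocked peer
}
```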
Optimizing Inference Performance of Transformers on CPUs
The Transformer architecture revolutionized the field of natural language processing (NLP). Transformers-based models (e.g., BERT) power many important Web services, such as search, translation, question-answering, etc. While enormous rese…
Hemlock: Compact and Scalable Mutual Exclusion
We present Hemlock, a novel mutual exclusion locking algorithm that is extremely compact, requiring just one word per thread plus one word per lock, but which still provides local spinning in most circumstances, high throughput under conte…
Compact Java Monitors
For scope and context, the idea we'll describe below, Compact Java Monitors, is intended as a potential replacement implementation for the "synchronized" construct in the HotSpot JVM. The reader is assumed to be familiar with current HotS…
Scalable range locks for scalable address spaces and beyond
Range locks are a synchronization construct designed to provide multiple threads (or processes) with concurrent access to disjoint parts of a shared resource. Originally conceived in the file system context, range locks are gaining increasin…
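To make the construct concrete, here is a deliberately naive sketch of our own (not the paper's scalable design): threads lock half-open [start, end) intervals and may proceed concurrently whenever their intervals do not overlap.

```cpp
// Sketch: a naive range lock guarded by a single mutex. Illustrative only.
#include <condition_variable>
#include <cstdint>
#include <list>
#include <mutex>

class NaiveRangeLock {
    struct Range { uint64_t start, end; };
    std::mutex m_;
    std::condition_variable cv_;
    std::list<Range> held_;                 // currently locked ranges

    bool overlaps(uint64_t s, uint64_t e) const {
        for (const auto& r : held_)
            if (s < r.end && r.start < e) return true;
        return false;
    }
public:
    void lock(uint64_t start, uint64_t end) {
        std::unique_lock<std::mutex> lk(m_);
        cv_.wait(lk, [&] { return !overlaps(start, end); });
        held_.push_back({start, end});      // admit disjoint ranges concurrently
    }
    void unlock(uint64_t start, uint64_t end) {
        std::lock_guard<std::mutex> lk(m_);
        held_.remove_if([&](const Range& r) {
            return r.start == start && r.end == end;
        });
        cv_.notify_all();                   // let overlapping waiters recheck
    }
};
```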
Fissile Locks
Classic test-and-set (TS) mutual exclusion locks are simple, and enjoy high performance and low latency of ownership transfer under light or no contention. However, they do not scale gracefully under high contention and do not provide any…
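For reference, the classic test-and-set spin lock that serves as the starting point can be sketched as follows (our illustration; Fissile Locks itself is not shown):

```cpp
// Sketch: classic test-and-set (TS) spin lock.
#include <atomic>

class TsLock {
    std::atomic<bool> held_{false};
public:
    void lock() {
        // Spin, repeatedly trying to flip held_ from false to true atomically.
        while (held_.exchange(true, std::memory_order_acquire)) {
            // A real implementation would pause or back off here.
        }
    }
    void unlock() { held_.store(false, std::memory_order_release); }
};
```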
Avoiding Scalability Collapse by Restricting Concurrency
Saturated locks often degrade the performance of a multithreaded application, leading to a so-called scalability collapse problem. This problem arises when a growing number of threads circulating through a saturated lock causes the overall…
Compact NUMA-Aware Locks
Modern multi-socket architectures exhibit non-uniform memory access (NUMA) behavior, where access by a core to data cached locally on a socket is much faster than access to data cached on a remote socket. Prior work offers several efficien…
TWA -- Ticket Locks Augmented with a Waiting Array
The classic ticket lock consists of ticket and grant fields. Arriving threads atomically fetch-and-increment ticket and then wait for grant to become equal to the value returned by the fetch-and-increment primitive, at which point the thre…
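The classic ticket lock described above can be sketched as follows (our illustration; the waiting array that TWA adds is not shown):

```cpp
// Sketch: classic ticket lock with ticket and grant fields.
#include <atomic>
#include <cstdint>

class TicketLock {
    std::atomic<uint64_t> ticket_{0};   // next ticket to hand out
    std::atomic<uint64_t> grant_{0};    // ticket currently allowed to enter
public:
    void lock() {
        // Arriving thread atomically fetch-and-increments ticket ...
        uint64_t my = ticket_.fetch_add(1, std::memory_order_relaxed);
        // ... then waits until grant equals the value it received.
        while (grant_.load(std::memory_order_acquire) != my) { /* spin */ }
    }
    void unlock() {
        // Pass ownership to the next ticket in FIFO order.
        grant_.fetch_add(1, std::memory_order_release);
    }
};
```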
Improving Parallelism in Hardware Transactional Memory
Today’s hardware transactional memory (HTM) systems rely on existing coherence protocols, which implement a requester-wins strategy. This, in turn, leads to poor performance when transactions frequently conflict, causing them to resort to …
Persistent Memory Transactions
This paper presents a comprehensive analysis of performance trade-offs between implementation choices for transaction runtime systems on persistent memory. We compare three implementations of transaction runtimes: undo logging, redo loggin…
Malthusian Locks
Applications running in modern multithreaded environments are sometimes "over-threaded". The excess threads do not improve performance, and in fact may act to degrade performance via "scalability collapse". Often, such software a…
The Influence of Malloc Placement on TSX Hardware Transactional Memory
The hardware transactional memory (HTM) implementation in Intel's i7-4770 "Haswell" processor tracks the transactional read-set in the L1 (level-1), L2 (level-2) and L3 (level-3) caches and the write-set in the L1 cache. Displacement or ev…