Demand paging
Efficient Memory Management for Large Language Model Serving with PagedAttention
High throughput serving of large language models (LLMs) requires batching sufficiently many requests at a time. However, existing systems struggle because the key-value cache (KV cache) memory for each request is huge and grows and shrinks…
Nimble Page Management for Tiered Memory Systems
Software-controlled heterogeneous memory systems have the potential to increase the performance and cost efficiency of computing systems. However, they can only deliver on this promise if supported by efficient page management policies and …
VAULT
Intel's SGX offers state-of-the-art security features, including confidentiality, integrity, and authentication (CIA) when accessing sensitive pages in memory. Sensitive pages are placed in an Enclave Page Cache (EPC) within the physical m…
A Framework for Memory Oversubscription Management in Graphics Processing Units
Modern discrete GPUs support unified memory and demand paging. Automatic management of data movement between CPU memory and GPU memory dramatically reduces developer effort. However, when application working sets exceed physical memory cap…
MEMTIS: Efficient Memory Tiering with Dynamic Page Classification and Page Size Determination
The ever-growing memory demand fueled by datacenter workloads is the driving force behind new memory technology innovations (e.g., NVM, CXL). Tiered memory is a promising solution that harnesses multiple such memory types with varying capa…
Static Memory Deduplication for Performance Optimization in Cloud Computing
In a cloud computing environment, the number of virtual machines (VMs) on a single physical server and the number of applications running on each VM are continuously growing. This has led to an enormous increase in the demand for memory cap…
FaaSnap
FaaSnap is a VM snapshot-based platform that uses a set of complementary optimizations to improve function cold-start performance for Function-as-a-Service (FaaS) applications. Compact loading set files take better advantage of prefetching…
Windows Memory Forensics: Detecting (Un)Intentionally Hidden Injected Code by Examining Page Table Entries
Malware uses code injection techniques either to manipulate other processes (e.g., as banking trojans do) or to hide its existence. With some exceptions, such as ROP gadgets, the injected code needs to be executable by the CPU (at least …
Perforated Page: Supporting Fragmented Memory Allocation for Large Pages
The availability of large pages has dramatically improved the efficiency of address translation for applications that use large contiguous regions of memory. However, large pages can be difficult to allocate due to fragmented memory, non-m…
Deconstructing the Energy Consumption of the Mobile Page Load
Mobile Web page performance is critical to content providers, service providers, and users, as Web browsers are one of the most popular apps on phones. Slow Web pages are known to adversely affect profits and lead to user abandonment. Whil…
Secure Page Fusion with VUsion
To reduce memory pressure, modern operating systems and hypervisors such as Linux/KVM deploy page-level memory fusion to merge physical memory pages with the same content (i.e., page fusion). A write to a fused memory page triggers a copy-…
On-demand-fork
Fork has long been the process creation system call for Unix. At its inception, fork was hailed as an efficient system call due to its use of copy-on-write on memory shared between parent and child processes. However, application memory de…
Adaptive Page Migration Policy With Huge Pages in Tiered Memory Systems
To accommodate the growing demand for memory capacity in a cost-effective way, multiple types of memory are incorporated in a single system. In such tiered memory systems consisting of small fast and large slow memory components, accuratel…
TPP: Transparent Page Placement for CXL-Enabled Tiered-Memory
The increasing demand for memory in hyperscale applications has led to memory becoming a large portion of the overall datacenter spend. The emergence of coherent interfaces like CXL enables main memory expansion and offers an efficient sol…
Page Size Aware Cache Prefetching
The increase in working set sizes of contemporary applications outpaces the growth in cache sizes, resulting in frequent main memory accesses that deteriorate system performance due to the disparity between processor and memory speeds. P…
WIRD: An Efficiency Migration Scheme in Hybrid DRAM and PCM Main Memory for Image Processing Applications
Using a hybrid main memory in embedded systems to run image processing applications has become an irresistible trend. However, the performance deficiencies (lower write endurance and relatively longer write latency) in phase change memory…
Reducing Minor Page Fault Overheads through Enhanced Page Walker
Application virtual memory footprints are growing rapidly in all systems from servers down to smartphones. To address this growing demand, system integrators are incorporating ever larger amounts of main memory, warranting rethinking of me…
InvisiPage
State-of-the-art secure processors like Intel SGX remain susceptible to leaking the page-level address trace of an application via the page fault channel, in which a malicious OS induces spurious page faults and deduces the application's secrets from i…
Tight Bounds for Parallel Paging and Green Paging
In the parallel paging problem, there are p processors that share a cache of size k. The goal is to partition the cache among the processors over time in order to minimize their average completion time. For this long-standing open problem,…
DPW-LRU: An Efficient Buffer Management Policy Based on Dynamic Page Weight for Flash Memory in Cyber-Physical Systems
Owing to its high performance, small size, and low energy consumption, NAND flash memory has been extensively adopted in cyber-physical systems. However, the inherent characteristics of flash memory, including not-in-place update and asymm…
Contiguity Representation in Page Table for Memory Management Units
Conventional page-based memory management schemes have certain overheads related to system performance and memory utilization mainly due to page table walks. In addition, conventional translation look-aside buffers (TLBs) often suffer from…
Revisiting Swapping in User-Space With Lightweight Threading
Memory-intensive applications, such as in-memory databases, caching systems and key-value stores, are increasingly demanding larger main memory to fit their working sets. Conventional swapping can enlarge the memory capacity by paging out …
Fine-grain Quantitative Analysis of Demand Paging in Unified Virtual Memory
The abstraction of a shared memory space over separate CPU and GPU memory domains has eased the burden of portability for many HPC codebases. However, users pay for ease of use provided by system-managed memory with a moderate-to-high perf…
Flexible Page-level Memory Access Monitoring Based on Virtualization Hardware
Page protection is often used to achieve memory access monitoring in many applications, dealing with program-analysis, checkpoint-based failure recovery, and garbage collection in managed runtime systems. Typically, low overhead access mon…
A Novel Longest Distance First Page Replacement Algorithm
Objectives: To improve the performance of a computer in program execution by employing the Longest Distance First page replacement algorithm in memory management. Method: There are many traditional page replacement algorithms used in virtual mem…
Smart scene management for IoT-based constrained devices using checkpointing
Online Parallel Paging with Optimal Makespan
The classical paging problem can be described as follows: given a cache that can hold up to k pages (or blocks) and a sequence of requests to pages, how should we manage the cache so as to maximize performance, or in other words, complete …
Mitosis: Transparently Self-Replicating Page-Tables for Large-Memory Machines
This repository contains artifacts of the paper Mitosis: Transparently Self-Replicating Page-Tables for Large-Memory Machines by Reto Achermann, Jayneel Gandhi, Timothy Roscoe, Abhishek Bhattacharjee, and Ashish Panwar to appear in the 25t…