Explanipedia

DRISA

Shuangchen Li, Dimin Niu, Krishna T. Malladi, Hongzhong Zheng, Bob Brennan, et al. · 2017

Computer science

Data movement between the processing units and the memory in traditional von Neumann architecture is creating the "memory wall" problem. To bridge the gap, two approaches, the memory-rich processor (more on-chip memory) and the compute-cap…

Can far memory improve job throughput?

Emmanuel Amaro, Christopher Branner-Augmon, Zhihong Luo, Amy Ousterhout, Marcos K. Aguilera, et al. · 2020

Computer science

As memory requirements grow, and advances in memory technology slow, the availability of sufficient main memory is increasingly the bottleneck in large compute clusters. One solution to this is memory disaggregation, where jobs can remotel…

Clio: a hardware-software co-designed disaggregated memory system

Zhiyuan Guo, Yizhou Shan, Xuhao Luo, Yutong Huang, Yiying Zhang · 2022

Computer science

Memory disaggregation has attracted great attention recently because of its benefits in efficient memory utilization and ease of management. So far, memory disaggregation research has all taken one of two approaches: building/emulating mem…

In-Memory Data Parallel Processor

Daichi Fujiki, Scott Mahlke, Reetuparna Das · 2018

Computer science

Recent developments in Non-Volatile Memories (NVMs) have opened up a new horizon for in-memory computing. Despite the significant performance gain offered by computational NVMs, previous works have relied on manual mapping of specialized k…

Umpire: Application-focused management and coordination of complex hierarchical memory

David Beckingsale, Marty McFadden, Johann Dahm, Ramesh Pankajakshan, Richard D. Hornung · 2019

Computer science

Advanced architectures like Sierra provide a wide range of memory resources that must often be carefully controlled by the user. These resources have varying capacities, access timing rules, and visibility to different compute resources. A…

A New Approach to Automatic Memory Banking using Trace-Based Address Mining

Yuan Zhou, Khalid Al-Hawaj, Zhiru Zhang · 2017

Computer science

Recent years have seen an increased deployment of FPGAs as programmable accelerators for improving the performance and energy efficiency of compute-intensive applications. A well-known "secret sauce" of achieving highly efficient FPGA acce…

An MIG-based compiler for programmable logic-in-memory architectures

Mathias Soeken, Saeideh Shirinzadeh, Pierre‐Emmanuel Gaillardon, Luca Amarù, Rolf Drechsler, et al. · 2016

Computer science

Resistive memories have gained high research attention for enabling design of in-memory computing circuits and systems. We propose for the first time an automatic compilation methodology suited to a recently proposed computer architecture …

On the Memory Underutilization: Exploring Disaggregated Memory on HPC Systems

Ivy Peng, Roger Pearce, Maya Gokhale · 2020

Computer science Engineering

Large-scale high-performance computing (HPC) systems consist of massive compute and memory resources tightly coupled in nodes. We perform a large-scale study of memory utilization on four production HPC clusters. Our results show that more…

A Primer on Memory Consistency and Cache Coherence, Second Edition

Vijay Nagarajan, Daniel J. Sorin, Mark D. Hill, David A. Wood · 2020

Computer science Physics

Many modern computer systems, including homogeneous and heterogeneous architectures, support shared memory in hardware. In a shared memory system, each of the processor cores may read and write to a single shared address space. For a share…

Contention-Aware Dynamic Memory Bandwidth Isolation with Predictability in COTS Multicores: An Avionics Case Study

Ankit Agrawal, Gerhard Fohler, Johannes Freitag, Jan Nowotsch, Sascha Uhrig, et al. · 2017

Computer science Materials science Economics

Airbus is investigating COTS multicore platforms for safety-critical avionics applications, pursuing helicopter-style autonomous and electric aircraft. These aircraft need to be ultra-lightweight for future mobility in the urban city lands…

Memory Sizing of a Scalable SRAM In-Memory Computing Tile Based Architecture

Roman Gauchi, Maha Kooli, Pascal Vivet, Jean-Philippe Noël, Édith Beigné, et al. · 2019

Computer science

Modern computing applications require more and more data to be processed. Unfortunately, the trend in memory technologies does not scale as fast as the computing performances, leading to the so called memory wall. New architectures are cur…

Disaggregated Cloud Memory with Elastic Block Management

Kwangwon Koh, Kangho Kim, Seunghyub Jeon, Jaehyuk Huh · 2018

Computer science

With the growing importance of in-memory data processing, cloud service providers have launched large memory virtual machine services to accommodate memory intensive workloads. Such large memory services using low volume scaled-up machines…

Partial Failure Resilient Memory Management System for (CXL-based) Distributed Shared Memory

Mingxing Zhang, Teng Ma, J.L. Hua, Zheng Liu, Kang Chen, et al. · 2023

Computer science Political science

The efficiency of distributed shared memory (DSM) has been greatly improved by recent hardware technologies. But, the difficulty of distributed memory management can still be a major obstacle to the democratization of DSM, especially when …

NumaMMA

François Trahay, Manuel Selva, Lionel Morel, Kévin Marquet · 2018

Computer science

Testing Computation-in-Memory Architectures Based on Emerging Memories

Said Hamdioui, Moritz Fieback, Surya Nagarajan, Mottaqiallah Taouil · 2019

Computer science Engineering Art

Today's computing architectures and device technologies are incapable of meeting the increasingly stringent demands on energy and performance posed by evolving applications. Therefore, alternative novel post-CMOS computing architectures ar…

Reducing Data Movement on Large Shared Memory Systems by Exploiting Computation Dependencies

Isaac Sánchez Barrera, Miquel Moretó, Eduard Ayguadé, Jesús Labarta, Mateo Valero, et al. · 2018

Computer science

Shared memory systems are becoming increasingly complex as they typically integrate several storage devices. That brings different access latencies or bandwidth rates depending on the proximity between the cores where memory accesses are i…

Survey on memory management techniques in heterogeneous computing systems

Anakhi Hazarika, Soumyajit Poddar, Hafizur Rahaman · 2019

Computer science

A major issue faced by data scientists today is how to scale up their processing infrastructure to meet the challenge of big data and high‐performance computing (HPC) workloads. With today's HPC domain, it is required to connect multiple g…

Virtualizing Deep Neural Networks for Memory-Efficient Neural Network Design

Minsoo Rhu, Natalia Gimelshein, Jason Clemons, Arslan Zulfiqar, Stephen W. Keckler · 2016

Computer science Mathematics

The most widely used machine learning frameworks require users to carefully tune their memory usage so that the deep neural network (DNN) fits into the DRAM capacity of a GPU. This restriction hampers a researcher's flexibility to study di…

A memory scheduling strategy for eliminating memory access interference in heterogeneous system

Juan Fang, Mengxuan Wang, Zelin Wei · 2020

Computer science Economics

Multiple CPUs and GPUs are integrated on the same chip to share memory, and access requests between cores are interfering with each other. Memory requests from the GPU seriously interfere with the CPU memory access performance. Requests be…

Automation in Distributed Shared Memory Testing for Multi-Processor Systems

Swethasri Kavuri · 2019

Computer science Engineering

This research paper explores the critical domain of automated testing for Distributed Shared Memory (DSM) systems in multi-processor environments. As the complexity of multi-core and distributed computing systems continues to grow, ensurin…

Hardware Implementation and Analysis of Gen-Z Protocol for Memory-Centric Architecture

Seokbin Hong, Wonok Kwon, Myeong‐Hoon Oh · 2020

Computer science

With the increase in memory-intensive applications, a memory-centric architecture has been proposed in which the central processing units (CPUs) access a pool of fabric-attached memory. This architecture eliminates the dependency of system…

Empirical Memory-Access Cost Models in Multicore NUMA Architectures

Patrick McCormick, Ryan Karl Braithwaite, Wu-chun Feng · 2024

Computer science Art Economics

Data location is of prime importance when scheduling tasks in a non-uniform memory access (NUMA) architecture. The characteristics of the NUMA architecture must be understood so tasks can be scheduled onto processors that are close to the …

PIM-trie: A Skew-resistant Trie for Processing-in-Memory

Hongbo Kang, Yiwei Zhao, Guy E. Blelloch, Laxman Dhulipala, Yan Gu, et al. · 2023

Computer science

Memory latency and bandwidth are significant bottlenecks in designing in-memory indexes. Processing-in-memory (PIM), an emerging hardware design approach, alleviates this problem by embedding processors in memory modules, enabling low-late…

Effectively Prefetching Remote Memory with Leap

Hasan Al Maruf, Mosharaf Chowdhury · 2019

Computer science

Memory disaggregation over RDMA can improve the performance of memory-constrained applications by replacing disk swapping with remote memory accesses. However, state-of-the-art memory disaggregation solutions still use data path components…

Understanding object-level memory access patterns across the spectrum

Xu Ji, Chao Wang, Nosayba El-Sayed, Xiaosong Ma, Youngjae Kim, et al. · 2017

Computer science Biology

Memory accesses limit the performance and scalability of countless applications. Many design and optimization efforts will benefit from an in-depth understanding of memory access behavior, which is not offered by extant access tracing and …

An Efficient GPU Cache Architecture for Applications with Irregular Memory Access Patterns

Bingchao Li, Jizeng Wei, Jizhou Sun, Murali Annavaram, Nam Sung Kim · 2019

Computer science

GPUs provide high-bandwidth/low-latency on-chip shared memory and L1 cache to efficiently service a large number of concurrent memory requests. Specifically, concurrent memory requests accessing contiguous memory space are coalesced into w…

Miss Penalty Aware Cache Replacement for Hybrid Memory Systems

Hai Jin, Di Chen, Haikun Liu, Xiaofei Liao, Rentong Guo, et al. · 2020

Computer science

Current DRAM-based memory systems face scalability challenges in terms of memory density, energy consumption, and monetary cost. Hybrid memory architectures composed of emerging nonvolatile memory (NVM) and DRAM are a promising approach…

CrypTag: Thwarting Physical and Logical Memory Vulnerabilities using Cryptographically Colored Memory

Pascal Nasahl, Robert Schilling, Mario Werner, Jan Hoogerbrugge, Marcel Medwed, et al. · 2021

Computer science

Memory vulnerabilities are a major threat to many computing systems. To effectively thwart spatial and temporal memory vulnerabilities, full logical memory safety is required. However, current mitigation techniques for memory safety are…

Harnessing Integrated CPU-GPU System Memory for HPC: a first look into Grace Hopper

Gabin Schieffer, Jacob Wahlgren, Jie Ren, Jennifer Faj, Ivy Peng · 2024

Computer science

Memory management across discrete CPU and GPU physical memory is traditionally achieved through explicit GPU allocations and data copy or unified virtual memory. The Grace Hopper Superchip, for the first time, supports an integrated CPU-GP…

Distributed-Memory FastFlow Building Blocks

Nicolò Tonci, Massimo Torquati, Gabriele Mencagli, Marco Danelutto · 2022

Computer science Economics

We present the new distributed-memory run-time system (RTS) of the C++-based open-source structured parallel programming library FastFlow. The new RTS enables the execution of FastFlow shared-memory applications written using its Building…

Uniform memory access