A Complexity-Effective Local Delta Prefetcher

Agustín Navarro-Torres, Biswabandan Panda, Jesús Alastruey-Benedé, Pablo Ibáñez, Víctor Viñals-Yúfera, et al. · 2025

Computer science

Data prefetching is crucial for performance in modern processors by effectively masking long-latency memory accesses. Over the past decades, numerous data prefetching mechanisms have been proposed, which have continuously reduced the acces…

Flexible Swapping for the Cloud

Milan Pandurov, Lukas Humbel, Dmitry Sepp, Adamos Ttofari, Leon Thomm, et al. · 2024

Computer science Business

Memory has become the primary cost driver in cloud data centers. Yet, a significant portion of memory allocated to VMs in public clouds remains unused. To optimize this resource, "cold" memory can be reclaimed from VMs and stored on slower…

Alternate Path μ-op Cache Prefetching

Sawan Singh, Arthur Pérais, Alexandra Jimborean, Alberto Ros · 2024

Computer science

International audience

Bounding Speculative Execution of Atomic Regions to a Single Retry

Eduardo José Gómez-Hernández, Juan M. Cebrián, Stefanos Kaxiras, Alberto Ros · 2024

Computer science

Mutual exclusion has long served as a fundamental construct in parallel programs. Despite a long history of optimizing the lower-level lock and unlock operations used to enforce mutual exclusion, such operations largely dictate performance…

Improved Converted Traces from Rebasing Microarchitectural Research with Industry Traces

Josué Feliu, Arthur Pérais, Daniel A. Jiménez, Alberto Ros · 2023

Computer science

Improved converted traces of the paper "Rebasing Microarchitectural Research with Industry Traces", published at the 2023 IEEE International Symposium on Workload Characterization. It includes the CVP-1 traces used in the paper converted w…

On the interactions between ILP and TLP with hardware transactional memory

Víctor Nicolás-Conesa, Rubén Titos-Gil, Ricardo Fernández‐Pascual, Alberto Ros, Manuel E. Acacio · 2023

Computer science Psychology

Hardware implementations of Transactional Memory (HTM) are designed to facilitate efficient thread synchronization in parallel programs, encouraging the use of larger critical sections. By employing optimistic concurrency control to execut…

Rebasing Microarchitectural Research with Industry Traces

Josué Feliu, Arthur Pérais, Daniel A. Jiménez, Alberto Ros · 2023

Computer science Engineering Geography

International audience

Data Artifact: Rebasing Microarchitectural Research with Industry Traces

Josué Feliu, Arthur Pérais, Daniel A. Jiménez, Alberto Ros · 2023

Computer science

Data Artifact of the paper "Rebasing Microarchitectural Research with Industry Traces", published at the 2023 IEEE International Symposium on Workload Characterization. It includes the original CVP-1 traces used in the paper. Note: the imp…

Towards faster, greener and easier to program computers

Alberto Ros · 2023

Computer science Economics Physics

The ERC Consolidator Grant project ECHO (Extending Coherence for Hardware-Driven Optimizations in Multicore Architectures) aims to change the events that occur in multiprocessors such…

Speculative inter-thread store-to-load forwarding in SMT architectures

Josué Feliu, Alberto Ros, Manuel E. Acacio, Stefanos Kaxiras · 2022

Computer science

Applications running on out-of-order cores have benefited for decades from store-to-load forwarding, which accelerates communication of store values to loads of the same thread. Despite threads running on a simultaneous multithreading (SMT) c…

Exploring Instruction Fusion Opportunities in General Purpose Processors

Sawan Singh, Arthur Pérais, Alexandra Jimborean, Alberto Ros · 2022

Computer science Engineering Philosophy

International audience

Berti: an Accurate Local-Delta Data Prefetcher

Agustín Navarro-Torres, Biswabandan Panda, Jesús Alastruey-Benedé, Pablo Ibáñez, Víctor Viñals, et al. · 2022

Computer science

Data prefetching is a technique that plays a crucial role in modern high-performance processors by hiding long latency memory accesses. Several state-of-the-art hardware prefetchers exploit the concept of deltas, defined as the difference …

Free atomics

Ashkan Asgharzadeh, Juan M. Cebrián, Arthur Pérais, Stefanos Kaxiras, Alberto Ros · 2022

Computer science Philosophy

International audience

Do Not Predict – Recompute! How Value Recomputation Can Truly Boost the Performance of Invisible Speculation

Christos Sakalis, Zamshed I. Chowdhury, Shayne Wadle, İsmail Aktürk, Alberto Ros, et al. · 2021

Computer science Economics Biology

Recent architectural approaches that address speculative side-channel attacks aim to prevent software from exposing the microarchitectural state changes of transient execution. The Delay-on-Miss technique is one such approach, which simply…

Compiler-Assisted Compaction/Restoration of SIMD Instructions

Juan M. Cebrián, Thibaud Balem, Adrián Barredo, Marc Casas, Miquel Moretó, et al. · 2021

Computer science

All the supercomputers in the world exploit data-level parallelism (DLP), for example by using single instructions to operate over several data elements. Improving vector processing is therefore key for exascale computing. Control flow div…

On Value Recomputation to Accelerate Invisible Speculation

Christos Sakalis, Zamshed I. Chowdhury, Shayne Wadle, İsmail Aktürk, Alberto Ros, et al. · 2021

Computer science Economics Biology

Recent architectural approaches that address speculative side-channel attacks aim to prevent software from exposing the microarchitectural state changes of transient execution. The Delay-on-Miss technique is one such approach, which simply…

Boosting Store Buffer Efficiency with Store-Prefetch Bursts

Juan M. Cebrián, Stefanos Kaxiras, Alberto Ros · 2020

Computer science

Virtually all processors today employ a store buffer (SB) to hide store latency. However, when the store buffer is full, store latency is exposed to the processor causing pipeline stalls. The default strategies to mitigate these stalls are…

Speculative Enforcement of Store Atomicity

Alberto Ros, Stefanos Kaxiras · 2020

Computer science Political science

Various memory consistency model implementations (e.g., x86, SPARC) willfully allow a core to see its own stores while they are in limbo, i.e., executed (and perhaps retired) but not yet inserted in memory order. This is known as store-to-…

Regional Out-of-Order Writes in Total Store Order

Sawan Singh, Alexandra Jimborean, Alberto Ros · 2020

Computer science Business

The store buffer, an essential component in today's processors, is designed to hide memory latency by moving stores off the processor's critical path. Furthermore, under the Total Store Order (TSO) memory model, the store buffer ensures th…

The Entangling Instruction Prefetcher

Alberto Ros, Alexandra Jimborean · 2020

Computer science

Prefetching instructions is a fundamental technique for designing high-performance computers. There are three key properties to consider when designing an efficient and effective prefetcher: timeliness, coverage, and accuracy. Timeliness is …

Efficient invisible speculative execution through selective delay and value prediction

Christos Sakalis, Stefanos Kaxiras, Alberto Ros, Alexandra Jimborean, Magnus Själander · 2019

Computer science Economics

Speculative execution, the base on which modern high-performance general-purpose CPUs are built, has recently been shown to enable a slew of security attacks. All these attacks are centered around a common set of behaviors: During specu…

Filter caching for free

Ricardo N. Alves, Alberto Ros, David Black-Schaffer, Stefanos Kaxiras · 2019

Computer science

Modern processors contain store-buffers to allow stores to retire under a miss, thus hiding store-miss latency. The store-buffer needs to be large (for performance) and searched on every load (for correctness), thereby making it a costly s…

Way Combination for an Adaptive and Scalable Coherence Directory

Rubén Titos-Gil, Antonio Flores, Ricardo Fernández‐Pascual, Alberto Ros, Salvador Petit, et al. · 2019

Computer science Physics

© 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, cre…

Ghost loads

Christos Sakalis, Mehdi Alipour, Alberto Ros, Alexandra Jimborean, Stefanos Kaxiras, et al. · 2019

Computer science Biology Economics

Speculative execution is necessary for achieving high performance on modern general-purpose CPUs but, starting with Spectre and Meltdown, it has also been proven to cause severe security flaws. In case of a misspeculation, the architectura…

The Superfluous Load Queue

Alberto Ros, Stefanos Kaxiras · 2018

Computer science

In an out-of-order core, the load queue (LQ), the store queue (SQ), and the store buffer (SB) are responsible for ensuring: i) correct forwarding of stores to loads and ii) correct ordering among loads (with respect to external stores). Th…

Non-Speculative Store Coalescing in Total Store Order

Alberto Ros, Stefanos Kaxiras · 2018

Computer science Economics

We present a non-speculative solution for a coalescing store buffer in total store order (TSO) consistency. Coalescing violates TSO with respect to both conflicting loads and conflicting stores, if partial state is exposed to the memory sy…

Mending Fences with Self-Invalidation and Self-Downgrade

Parosh Aziz Abdulla, Mohamed Faouzi Atig, Stefanos Kaxiras, Carl Leonardsson, Alberto Ros, et al. · 2018

Computer science Philosophy Materials science

Cache coherence protocols based on self-invalidation and self-downgrade have recently seen increased popularity due to their simplicity, potential performance efficiency, and low energy consumption. However, such protocols result in memory…

Alberto Ros