Computer memory
The FlashAttention Paradigm: Re-architecting Transformers for Memory-Optimal Scalability
The Transformer architecture has revolutionized deep learning, particularly in natural language processing and computer vision. However, its core self-attention mechanism suffers from quadratic memory and computational complexity with resp…
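The memory saving that this line of work targets is easiest to see in miniature: rather than materializing the full N×N score matrix, attention can be computed over key/value tiles with a running (online) softmax. The following is an illustrative NumPy sketch of that general idea, not the paper's fused kernel; the tile size and function name are chosen for exposition only.

```python
import numpy as np

def tiled_attention(Q, K, V, tile=128):
    """Attention without materializing the full N x N score matrix.

    Processes K/V in tiles while maintaining a running row max and
    softmax normalizer (online softmax), so peak extra memory is
    O(N * tile) per step rather than O(N^2). Sketch only -- real
    kernels fuse this loop on-chip in SRAM.
    """
    N, d = Q.shape
    out = np.zeros_like(Q)              # running weighted sum of values
    m = np.full(N, -np.inf)             # running row max of scores
    l = np.zeros(N)                     # running softmax normalizer
    for start in range(0, K.shape[0], tile):
        Kt, Vt = K[start:start + tile], V[start:start + tile]
        S = Q @ Kt.T / np.sqrt(d)       # (N, tile) scores for this tile
        m_new = np.maximum(m, S.max(axis=1))
        scale = np.exp(m - m_new)       # rescale previous accumulators
        P = np.exp(S - m_new[:, None])  # tile-local unnormalized probs
        out = out * scale[:, None] + P @ Vt
        l = l * scale + P.sum(axis=1)
        m = m_new
    return out / l[:, None]

# Agrees with dense softmax attention up to floating-point error:
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((512, 64)) for _ in range(3))
S = Q @ K.T / np.sqrt(64)
dense = np.exp(S - S.max(axis=1, keepdims=True))
dense = dense / dense.sum(axis=1, keepdims=True) @ V
assert np.allclose(tiled_attention(Q, K, V), dense, atol=1e-6)
```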
Elastic Gradient Checkpointing: Scaling Deep Learning Beyond Conventional Memory Limits
The relentless pursuit of larger and more complex deep learning models has increasingly encountered a fundamental bottleneck: the finite memory capacity of conventional hardware accelerators. As model architectures scale in depth, width, a…
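Gradient checkpointing, the baseline such elastic schemes build on, trades compute for memory by discarding intermediate activations in the forward pass and recomputing them during backward. A minimal PyTorch sketch follows; `torch.utils.checkpoint.checkpoint_sequential` is the standard API, while the model, sizes, and segment count are illustrative rather than the paper's elastic policy.

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint_sequential

# A deep stack whose activations would normally all be kept for backward.
model = nn.Sequential(*[nn.Sequential(nn.Linear(1024, 1024), nn.ReLU())
                        for _ in range(24)])
x = torch.randn(32, 1024, requires_grad=True)

# Split the stack into 4 segments: only segment-boundary activations are
# stored; each segment's internals are recomputed during backward. Stored
# activations drop from O(L) layers to roughly O(segments + L/segments),
# at the cost of one extra forward pass.
out = checkpoint_sequential(model, 4, x, use_reentrant=False)
out.sum().backward()
```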
FlashAttention: Breaking the Memory Wall for Efficient Self-Attention Scaling
The self-attention mechanism is a cornerstone of the Transformer architecture, driving significant advancements across natural language processing, computer vision, and other domains. However, its quadratic computational complexity and, cr…
Flash Attention: Unlocking Bandwidth-Optimal Self-Attention for Trillion-Parameter Models
The Transformer architecture, with its cornerstone self-attention mechanism, has revolutionized deep learning, particularly in natural language processing. However, as models scale towards trillions of parameters and sequence lengths grow,…
Compute-in-Memory Based on Emerging Non-Volatile Memories: RRAM, MRAM, and FeRAM
In the era of artificial intelligence, Internet of things and big data, processing massive data puts forward unprecedented requirements for the throughput and energy efficiency of computing systems. In traditional von Neumann architectures…
A Memory-Constrained Bayesian Optimization via Robust Online Memory Estimation
Bayesian optimization (BO) is a memory-intensive algorithm that requires training and evaluating an expensive objective function. In contrast to previous works that use an offline memory estimation to make BO memory-efficient, we propose a…
CAMformer: Associative Memory is All You Need
Transformers face scalability challenges due to the quadratic cost of attention, which involves dense similarity computations between queries and keys. We propose CAMformer, a novel accelerator that reinterprets attention as an associative…
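The reinterpretation is easiest to see in a toy form: with binarized queries and keys, attention's dot products reduce to counting bit agreements, which is exactly the match operation a content-addressable memory (CAM) array performs in place. The sketch below shows only that correspondence; it is a speculative illustration, not CAMformer's accelerator design.

```python
import numpy as np

def cam_attention(Q, K, V):
    """Attention viewed as an associative-memory lookup.

    Binarizing queries/keys to {-1, +1} makes the score Qb @ Kb.T equal
    to (bit agreements - disagreements) = d - 2 * Hamming distance,
    i.e. the match score a CAM computes in hardware. Illustration only.
    """
    Qb = np.sign(Q)                      # binarized query patterns
    Kb = np.sign(K)                      # binarized stored key patterns
    scores = Qb @ Kb.T                   # CAM-style match scores
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ V                   # retrieve associated values

rng = np.random.default_rng(1)
Q, K, V = (rng.standard_normal((8, 32)) for _ in range(3))
print(cam_attention(Q, K, V).shape)      # (8, 32)
```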
Indexed Parametric Memory (IPM): A New Paradigm for Lossless Lifelong LLM Memory
We introduce Indexed Parametric Memory (IPM), the third memory paradigm for large language models after non-parametric retrieval (RAG) and traditional parametric memory. In IPM: Raw conversation history is stored externally with unique IDs…
mohamedorhan/Electromagnetic-Energy-Memory-EEM-: Electromagnetic Energy Memory (EEM) — Official v1.0.0 Release
Electromagnetic Energy Memory (EEM), Official v1.0.0 Scientific Release. This release contains the complete implementation and documentation for the Electromagnetic Energy Memory (EEM) model — a resonant, non-chemical energy-storage framewor…
EONSim: An NPU Simulator for On-Chip Memory and Embedding Vector Operations
Embedding vector operations are a key component of modern deep neural network workloads. Unlike matrix operations with deterministic access patterns, embedding vector operations exhibit input data-dependent and non-deterministic memory acc…
Implementation of Low Power Memristor Content Addressable Memory Using FinFET
Research interest is turning rapidly toward the large-scale development of memristor devices for industrial applications, and future technologies are expected to build on upcoming advances in memristor-based devices. A memristor regulates t…
Revisiting Memory Hierarchies with CMM-H: Use Device-side Caching to Integrate DRAM and SSD for a Hybrid CXL Memory
Influence of Proton Radiation on the Degradation and Failures of RAM Chips
The influence of proton irradiation on semiconductor random-access memory (RAM) chips in the space environment is examined. The relevance of the topic stems from the fact that cosmic-ray protons are capable of causing both instantaneous ma…
Physical complexity and black hole quantum computers
The theory of computational complexity is based on the tradeoff between two computational resources, memory space and computer time. This paper investigates the physical counterparts of these resources. Memory space is the number of bits o…
Atomically-Thin Freestanding Racetrack Memory Devices
Advances in freestanding membranes allow novel heterostructures to be formed from distinct families of materials in 2D or 3D configurations. Recently, this technique has been used to form a 3D racetrack memory device by transferring a comp…
Storage Class Memory is Dead, All Hail Managed-Retention Memory: Rethinking Memory for the AI Era
Design and Implementation of Memory Controller for Byte Access from Data Memory for SoC’s Devices
Modern computing systems, particularly System-on-Chip (SoC) architectures, incorporate multiple processors, integrated memory, and control logic to enhance efficiency. These architectures are prevalent in contemporary electroni…
Self-Refresh Memory in Pixel Circuit With 18-bit Color Depth for Liquid Crystal Displays
A Survey on Computing-in-Memory (CiM) and Emerging Nonvolatile Memory (NVM) Simulators
Modern computer applications have become highly data-intensive, giving rise to an increase in data traffic between the processor and memory units. Computing-in-Memory (CiM) has shown great promise as a solution to this aptly named von Neum…
3D NAND flash memory for a Pseudo quantum computer platform
Overcoming sensory-memory interference in working memory circuits
Memories of recent stimuli are crucial for guiding behavior, but the sensory pathways responsible for encoding these memories are continuously bombarded by new sensory experiences. How the brain overcomes interference between sensory input…