Alessio Devoto
Universal Properties of Activation Sparsity in Modern Large Language Models
Input-dependent activation sparsity is a notable property of deep learning models, which has been extensively studied in networks with ReLU activations and is associated with efficiency, robustness, and interpretability. However, the appro…
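As a concrete illustration of the property described above (a generic sketch, not code from the paper), the following PyTorch snippet measures input-dependent activation sparsity as the fraction of post-ReLU activations that are exactly zero for a given batch; the layer sizes and the two-layer MLP are arbitrary assumptions.

```python
import torch
import torch.nn as nn

# Hypothetical two-layer MLP, standing in for a feed-forward block in a larger model.
mlp = nn.Sequential(nn.Linear(64, 256), nn.ReLU(), nn.Linear(256, 64))

def activation_sparsity(model: nn.Sequential, x: torch.Tensor) -> float:
    """Fraction of post-ReLU activations that are exactly zero for input x."""
    hidden = model[1](model[0](x))  # pre-activation, then ReLU
    return (hidden == 0).float().mean().item()

x = torch.randn(8, 64)  # a batch of 8 random inputs
print(f"sparsity: {activation_sparsity(mlp, x):.2%}")  # roughly 50% at random init
```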
Mixture-of-experts graph transformers for interpretable particle collision detection
The Large Hadron Collider at CERN produces immense volumes of complex data from high-energy particle collisions, demanding sophisticated analytical techniques for effective interpretation. Neural Networks, including Graph Neural Networks, …
Interpretable classification of Levantine ceramic thin sections via neural networks
Classification of ceramic thin sections is fundamental for understanding ancient pottery production techniques, provenance, and trade networks. Although effective, traditional petrographic analysis is time-consuming. This study explores th…
Adaptive Computation Modules: Granular Conditional Computation for Efficient Inference
While transformer models have been highly successful, they are computationally inefficient. We observe that for each layer, the full width of the layer may be needed only for a small subset of tokens inside a batch and that the "effective"…
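To make the observation concrete, here is a hypothetical sketch of token-level conditional computation in the spirit the abstract describes: a learned scorer picks a subset of tokens for the expensive path, while the rest pass through on the residual connection. The class name, the top-k rule, and all sizes are illustrative assumptions, not the Adaptive Computation Modules design itself.

```python
import torch
import torch.nn as nn

class GatedBlock(nn.Module):
    """Toy conditional-computation block: only the top-k scored tokens pass
    through the expensive sub-module; the rest are forwarded unchanged."""
    def __init__(self, dim: int, keep_ratio: float = 0.25):
        super().__init__()
        self.scorer = nn.Linear(dim, 1)  # per-token importance score
        self.ffn = nn.Sequential(
            nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim)
        )
        self.keep_ratio = keep_ratio

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (tokens, dim)
        k = max(1, int(self.keep_ratio * x.size(0)))
        scores = self.scorer(x).squeeze(-1)   # (tokens,)
        idx = scores.topk(k).indices          # tokens that get full compute
        out = x.clone()
        out[idx] = x[idx] + self.ffn(x[idx])  # heavy path only for selected tokens
        return out

x = torch.randn(16, 32)
print(GatedBlock(32)(x).shape)  # torch.Size([16, 32])
```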
Q-Filters: Leveraging QK Geometry for Efficient KV Cache Compression
Autoregressive language models rely on a Key-Value (KV) Cache, which avoids re-computing past hidden states during generation, making it faster. As model sizes and context lengths grow, the KV Cache becomes a significant memory bottleneck,…
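For context on the bottleneck the abstract refers to, here is a minimal sketch of a KV cache during autoregressive decoding, showing why its memory footprint grows linearly with generated length. This is illustrative only; it does not implement Q-Filters itself, and the single-head setup and dimensions are assumptions.

```python
import torch

dim, cache_k, cache_v = 64, [], []  # toy single-head attention state

def decode_step(q, k, v):
    """Append this step's key/value, then attend over the whole cache.
    Memory grows linearly with the number of generated tokens."""
    cache_k.append(k)
    cache_v.append(v)
    K = torch.stack(cache_k)  # (t, dim), where t grows every step
    V = torch.stack(cache_v)
    attn = torch.softmax(q @ K.T / dim**0.5, dim=-1)
    return attn @ V

for _ in range(16):  # 16 decoding steps
    out = decode_step(torch.randn(dim), torch.randn(dim), torch.randn(dim))
print(len(cache_k), "cached key/value pairs")  # 16
```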
Goal-oriented Communications based on Recursive Early Exit Neural Networks
This paper presents a novel framework for goal-oriented semantic communications leveraging recursive early exit models. The proposed approach is built on two key components. First, we introduce an innovative early exit strategy that dynami…
Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering
Large language models (LLMs) can store a significant amount of factual knowledge in their parameters. However, their parametric knowledge may conflict with the information provided in the context; this phenomenon, known as context-…
Analysing the Residual Stream of Language Models Under Knowledge Conflicts
Large language models (LLMs) can store a significant amount of factual knowledge in their parameters. However, their parametric knowledge may conflict with the information provided in the context. Such conflicts can lead to undesirable mod…
Adaptive Layer Selection for Efficient Vision Transformer Fine-Tuning
Recently, foundation models based on Vision Transformers (ViTs) have become widely available. However, their fine-tuning process is highly resource-intensive, which hinders their adoption in several edge or low-energy applications. To thi…
Conditional computation in neural networks: Principles and research trends
This article summarizes principles and ideas from the emerging area of applying conditional computation methods to the design of neural networks. In particular, we focus on neural networks that can dynamically activate or de-activate parts…
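One of the simplest forms of conditional computation is early exiting, where later layers are de-activated once an intermediate prediction is confident. A generic sketch follows; the architecture, sizes, and confidence threshold are arbitrary assumptions, not a specific design from the article.

```python
import torch
import torch.nn as nn

class EarlyExitNet(nn.Module):
    """Generic early-exit classifier: after each block, an auxiliary head
    predicts; if its confidence clears a threshold, later blocks are skipped."""
    def __init__(self, dim: int = 32, n_classes: int = 10, n_blocks: int = 4):
        super().__init__()
        self.blocks = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, dim), nn.ReLU()) for _ in range(n_blocks)
        )
        self.heads = nn.ModuleList(nn.Linear(dim, n_classes) for _ in range(n_blocks))

    def forward(self, x: torch.Tensor, threshold: float = 0.9) -> torch.Tensor:
        for block, head in zip(self.blocks, self.heads):
            x = block(x)
            probs = head(x).softmax(dim=-1)
            if probs.max() >= threshold:  # confident enough: skip remaining blocks
                return probs
        return probs  # fell through: every block was used

net = EarlyExitNet()
print(net(torch.randn(32)).shape)  # torch.Size([10])
```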
A Simple and Effective $L_2$ Norm-Based Strategy for KV Cache Compression
The deployment of large language models (LLMs) is often hindered by the extensive memory requirements of the Key-Value (KV) cache, especially as context lengths increase. Existing approaches to reduce the KV cache size involve either fine-…
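A hedged sketch of what such a compression step might look like, assuming (as the title suggests) that cached entries are scored by the $L_2$ norm of their key vectors and the lowest-norm entries are retained; the function name, shapes, and budget are illustrative.

```python
import torch

def compress_kv(keys: torch.Tensor, values: torch.Tensor, keep: int):
    """Score each cached position by the L2 norm of its key vector and
    retain the `keep` lowest-norm entries. Shapes: (t, dim)."""
    norms = keys.norm(dim=-1)                      # (t,) one score per position
    idx = norms.topk(keep, largest=False).indices  # lowest-norm positions
    idx, _ = idx.sort()                            # preserve original token order
    return keys[idx], values[idx]

keys, values = torch.randn(128, 64), torch.randn(128, 64)
k2, v2 = compress_kv(keys, values, keep=32)  # 4x smaller cache
```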
Are We Done with MMLU?
Maybe not. We identify and analyse errors in the popular Massive Multitask Language Understanding (MMLU) benchmark. Even though MMLU is widely adopted, our analysis demonstrates numerous ground truth errors that obscure the true capabiliti…
Adaptive Semantic Token Selection for AI-native Goal-oriented Communications
In this paper, we propose a novel design for AI-native goal-oriented communications, exploiting transformer neural networks under dynamic inference constraints on bandwidth and computation. Transformers have become the standard architectur…
Class incremental learning with probability dampening and cascaded gated classifier
Humans are capable of acquiring new knowledge and transferring it to different domains while incurring little forgetting. The same ability, called Continual Learning, is challenging to achieve when operating with neural networ…
On the robustness of vision transformers for in-flight monocular depth estimation
Monocular depth estimation (MDE) has shown impressive performance recently, even in zero-shot or few-shot scenarios. In this paper, we consider the use of MDE on board low-altitude drone flights, which is required in a number of safety-cri…
Reidentification of Objects From Aerial Photos With Hybrid Siamese Neural Networks
In this paper, we consider the task of re-identifying the same object in different photos taken from separate positions and angles during aerial reconnaissance, which is a crucial task for the maintenance and surveillance of critical large…