Simone Scardapane
Communication Efficient Split Learning of ViTs with Attention-based Double Compression
This paper proposes a novel communication-efficient Split Learning (SL) framework, named Attention-based Double Compression (ADC), which reduces the communication overhead required for transmitting intermediate Vision Transformers activati…
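The snippet describes compressing intermediate ViT activations before they are sent from client to server. A minimal sketch of attention-guided token compression in that spirit, assuming CLS-attention scores are available at the split point (the function name, keep ratio, and merging rule below are illustrative, not the paper's exact ADC procedure):

```python
import torch

def compress_activations(acts, cls_attn, keep_ratio=0.25):
    """Illustrative attention-guided compression of ViT activations.

    acts:     (batch, tokens, dim) intermediate activations at the split point
    cls_attn: (batch, tokens) attention each token receives from the CLS token
    Tokens with the highest CLS attention are kept; the rest are merged into
    a single average token, shrinking what the client must transmit.
    """
    b, n, d = acts.shape
    k = max(1, int(n * keep_ratio))
    idx = cls_attn.topk(k, dim=1).indices                            # (b, k) most-attended tokens
    keep = torch.gather(acts, 1, idx.unsqueeze(-1).expand(-1, -1, d))
    mask = torch.ones(b, n, dtype=torch.bool)
    mask.scatter_(1, idx, False)
    merged = acts[mask].view(b, n - k, d).mean(dim=1, keepdim=True)  # one summary token
    return torch.cat([keep, merged], dim=1)                          # (b, k + 1, d)

acts = torch.randn(2, 197, 768)                      # ViT-Base: 196 patches + CLS
cls_attn = torch.rand(2, 197)
print(compress_activations(acts, cls_attn).shape)    # torch.Size([2, 50, 768])
```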
Universal Properties of Activation Sparsity in Modern Large Language Models
Input-dependent activation sparsity is a notable property of deep learning models, which has been extensively studied in networks with ReLU activations and is associated with efficiency, robustness, and interpretability. However, the appro…
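As a minimal illustration of the input-dependent sparsity the abstract refers to, one can measure the fraction of (near-)zero hidden activations per input in a small ReLU network; the toy model and threshold below are assumptions, not the paper's measurement protocol:

```python
import torch
import torch.nn as nn

# Minimal illustration of input-dependent activation sparsity: the fraction of
# (near-)zero hidden units in a ReLU layer varies from one input to another.
torch.manual_seed(0)
mlp = nn.Sequential(nn.Linear(64, 256), nn.ReLU(), nn.Linear(256, 64))

hidden = {}
mlp[1].register_forward_hook(lambda m, inp, out: hidden.update(act=out))

x = torch.randn(8, 64)                                       # a batch of 8 inputs
mlp(x)
sparsity = (hidden["act"].abs() < 1e-6).float().mean(dim=1)  # per-input zero fraction
print(sparsity)                                              # one sparsity value per input
```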
A data attribution approach for unsupervised anomaly detection on multivariate time series
Anomaly detection is a challenging task that manifests in several forms depending on its context (e.g., fraud detection, network security, fault monitoring): given a collection of data, the goal is discovering the anomalous patterns diverg…
Mixture-of-experts graph transformers for interpretable particle collision detection
The Large Hadron Collider at CERN produces immense volumes of complex data from high-energy particle collisions, demanding sophisticated analytical techniques for effective interpretation. Neural Networks, including Graph Neural Networks, …
Micrographia in Parkinson's Disease: Automatic Recognition through Artificial Intelligence
Background: Parkinson's disease (PD) leads to handwriting abnormalities primarily characterized by micrographia. Whether micrographia manifests early in PD, worsens throughout the disease, and lastly responds to L‐Dopa is still under scient…
Interpretable classification of Levantine ceramic thin sections via neural networks
Classification of ceramic thin sections is fundamental for understanding ancient pottery production techniques, provenance, and trade networks. Although effective, traditional petrographic analysis is time-consuming. This study explores th…
Bringing AI to Clinicians: Simplifying Pleural Effusion Cytology Diagnosis with User-Friendly Models
Background: Malignant pleural effusions (MPEs) are common in advanced lung cancer patients. Cytological examination of pleural fluid is essential for identifying cell types but presents diagnostic challenges, particularly when reactive mes…
Adaptive token selection for scalable point cloud transformers
The recent surge in 3D data acquisition has spurred the development of geometric deep learning models for point cloud processing, boosted by the remarkable success of transformers in natural language processing. While point cloud transform…
Adaptive Computation Modules: Granular Conditional Computation for Efficient Inference
While transformer models have been highly successful, they are computationally inefficient. We observe that for each layer, the full width of the layer may be needed only for a small subset of tokens inside a batch and that the "effective"…
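A toy sketch of the per-token conditional-width idea described in the snippet, with a tiny router deciding how much of an FFN each token uses (illustrative only; the module names, sizes, and the masking shortcut are assumptions, not the paper's ACM design):

```python
import torch
import torch.nn as nn

class WidthRoutedFFN(nn.Module):
    """Toy per-token conditional computation: a router decides, for every token,
    whether to use the full hidden width of the FFN or only a narrow slice.
    Masking stands in for actually skipping compute; this is not the ACM design."""
    def __init__(self, dim=64, hidden=256, narrow=32):
        super().__init__()
        self.up, self.down = nn.Linear(dim, hidden), nn.Linear(hidden, dim)
        self.router = nn.Linear(dim, 1)
        self.narrow = narrow

    def forward(self, x):                                 # x: (tokens, dim)
        h = torch.relu(self.up(x))                        # (tokens, hidden)
        use_full = torch.sigmoid(self.router(x)) > 0.5    # (tokens, 1)
        mask = torch.zeros_like(h)
        mask[:, : self.narrow] = 1.0                      # narrow slice always active
        h = torch.where(use_full, h, h * mask)            # cheap tokens keep only the slice
        return self.down(h)

x = torch.randn(10, 64)
print(WidthRoutedFFN()(x).shape)                          # torch.Size([10, 64])
```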
MASS: MoErging through Adaptive Subspace Selection
Model merging has recently emerged as a lightweight alternative to ensembling, combining multiple fine-tuned models into a single set of parameters with no additional training overhead. Yet, existing merging methods fall short of matching …
Q-Filters: Leveraging QK Geometry for Efficient KV Cache Compression
Autoregressive language models rely on a Key-Value (KV) Cache, which avoids re-computing past hidden states during generation, making it faster. As model sizes and context lengths grow, the KV Cache becomes a significant memory bottleneck,…
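For context, a minimal single-head sketch of the KV cache mechanism the abstract recalls (background only; it does not implement Q-Filters):

```python
import torch

# Minimal single-head KV cache: at each generation step only the new token's
# key/value are computed and appended, instead of re-encoding the whole prefix.
d = 16
Wq, Wk, Wv = (torch.randn(d, d) for _ in range(3))
cache_k, cache_v = [], []

def step(x_new):                       # x_new: (1, d) hidden state of the new token
    cache_k.append(x_new @ Wk)
    cache_v.append(x_new @ Wv)
    K, V = torch.cat(cache_k), torch.cat(cache_v)         # (t, d): grows each step
    attn = torch.softmax((x_new @ Wq) @ K.T / d**0.5, dim=-1)
    return attn @ V                    # (1, d) attention output for the new token

for t in range(5):
    out = step(torch.randn(1, d))
print(len(cache_k), out.shape)         # 5 torch.Size([1, 16])
```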
Towards zero-shot learning in 3D change detection: improving generalization with custom augmentations and evaluation
Spatio-temporal transformers for decoding neural movement control
Objective. Deep learning tools applied to high-resolution neurophysiological data have significantly progressed, offering enhanced decoding, real-time processing, and readability for practical applications. However, the design of artifici…
Goal-oriented Communications based on Recursive Early Exit Neural Networks
This paper presents a novel framework for goal-oriented semantic communications leveraging recursive early exit models. The proposed approach is built on two key components. First, we introduce an innovative early exit strategy that dynami…
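A generic early-exit sketch of the kind of dynamic stopping the abstract mentions, with auxiliary classifiers and a confidence threshold (both assumptions here); the recursive strategy itself is described in the paper:

```python
import torch
import torch.nn as nn

# Generic early-exit inference: auxiliary classifiers after each block let a
# sample stop as soon as the prediction is confident enough. This illustrates
# the early-exit idea only, not the recursive scheme proposed in the paper.
torch.manual_seed(0)
blocks = nn.ModuleList(nn.Sequential(nn.Linear(32, 32), nn.ReLU()) for _ in range(4))
exits  = nn.ModuleList(nn.Linear(32, 10) for _ in range(4))

def predict(x, threshold=0.6):
    for depth, (block, head) in enumerate(zip(blocks, exits)):
        x = block(x)
        probs = torch.softmax(head(x), dim=-1)
        if probs.max() >= threshold:            # confident: exit without running deeper blocks
            return probs.argmax().item(), depth
    return probs.argmax().item(), depth         # fell through to the final exit

print(predict(torch.randn(1, 32)))
```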
Task Singular Vectors: Reducing Task Interference in Model Merging
Task Arithmetic has emerged as a simple yet effective method to merge models without additional training. However, by treating entire networks as flat parameter vectors, it overlooks key structural information and is susceptible to task in…
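For reference, plain task arithmetic, the flat-parameter-vector baseline the abstract critiques, in a few lines (the scaling factor alpha is an assumption):

```python
import torch

# Plain task arithmetic, the baseline the paper improves on: each fine-tuned
# model contributes a "task vector" (fine-tuned minus base weights), and the
# merged model is the base plus a scaled sum of those flat vectors.
def merge_task_arithmetic(base, finetuned_models, alpha=0.5):
    merged = {}
    for name, w in base.items():
        task_vectors = [ft[name] - w for ft in finetuned_models]
        merged[name] = w + alpha * sum(task_vectors)
    return merged

base = {"layer.weight": torch.zeros(4, 4)}
ft_a = {"layer.weight": torch.ones(4, 4)}
ft_b = {"layer.weight": 2 * torch.ones(4, 4)}
print(merge_task_arithmetic(base, [ft_a, ft_b])["layer.weight"][0, 0])   # tensor(1.5000)
```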
Interpreting Temporal Graph Neural Networks with Koopman Theory
Spatiotemporal graph neural networks (STGNNs) have shown promising results in many domains, from forecasting to epidemiology. However, understanding the dynamics learned by these models and explaining their behaviour is significantly more …
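A minimal, DMD-style sketch of the Koopman idea: fit a linear operator on consecutive hidden states of a trained model and inspect its eigenvalues. The data and dimensions below are placeholders, and the paper's actual analysis may differ:

```python
import numpy as np

# Koopman-style linearization of learned dynamics (a DMD-like sketch): fit a
# linear operator K that maps hidden states at time t to time t+1, then inspect
# its eigenvalues to read off dominant modes and their decay/oscillation rates.
rng = np.random.default_rng(0)
H = rng.standard_normal((100, 8))          # hidden states of a trained model, (timesteps, dim)

X, Y = H[:-1], H[1:]                       # pairs (h_t, h_{t+1})
K = np.linalg.lstsq(X, Y, rcond=None)[0]   # least-squares fit of h_{t+1} ≈ h_t @ K
eigvals = np.linalg.eigvals(K)
print(np.abs(eigvals))                     # |λ| < 1: decaying modes; |λ| ≈ 1: persistent
```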
How to Connect Speech Foundation Models and Large Language Models? What Matters and What Does Not
The remarkable performance achieved by Large Language Models (LLMs) has driven research efforts to leverage them for a wide range of tasks and input modalities. In speech-to-text (S2T) tasks, the emerging solution consists of projecting the…
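A hedged sketch of the projection-based recipe the abstract alludes to, where speech encoder outputs are shortened in time, mapped to the LLM embedding size, and prepended to the text embeddings (the pooling choice and dimensions are assumptions):

```python
import torch
import torch.nn as nn

# Common recipe sketched by the abstract: compress the speech encoder's output
# in time, project it into the LLM embedding space, and prepend it to the text
# token embeddings. Dimensions and the pooling choice here are assumptions.
speech_feats = torch.randn(1, 1500, 1024)            # (batch, frames, speech dim)
text_embeds  = torch.randn(1, 12, 4096)              # (batch, tokens, LLM dim)

downsample = nn.AvgPool1d(kernel_size=4, stride=4)   # shorten the frame sequence
projector  = nn.Linear(1024, 4096)                   # speech dim -> LLM embedding dim

compressed = downsample(speech_feats.transpose(1, 2)).transpose(1, 2)    # (1, 375, 1024)
prefix = projector(compressed)                                           # (1, 375, 4096)
llm_input = torch.cat([prefix, text_embeds], dim=1)                      # fed to the LLM
print(llm_input.shape)                                # torch.Size([1, 387, 4096])
```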
Topological Deep Learning with State-Space Models: A Mamba Approach for Simplicial Complexes
Graph Neural Networks based on the message-passing (MP) mechanism are a dominant approach for handling graph-structured data. However, they are inherently limited to modeling only pairwise interactions, making it difficult to explicitly ca…
Adaptive Layer Selection for Efficient Vision Transformer Fine-Tuning
Recently, foundation models based on Vision Transformers (ViTs) have become widely available. However, their fine-tuning process is highly resource-intensive, and it hinders their adoption in several edge or low-energy applications. To thi…
Enhancing High-Energy Particle Physics Collision Analysis through Graph Data Attribution Techniques
The experiments at the Large Hadron Collider at CERN generate vast amounts of complex data from high-energy particle collisions. This data presents significant challenges due to its volume and complex reconstruction, necessitating the use …
Conditional computation in neural networks: Principles and research trends
This article summarizes principles and ideas from the emerging area of applying conditional computation methods to the design of neural networks. In particular, we focus on neural networks that can dynamically activate or de-activate parts…
A Simple and Effective $L_2$ Norm-Based Strategy for KV Cache Compression
The deployment of large language models (LLMs) is often hindered by the extensive memory requirements of the Key-Value (KV) cache, especially as context lengths increase. Existing approaches to reduce the KV cache size involve either fine-…
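A hedged sketch of a key-norm-based eviction rule suggested by the title: score cached entries by the L2 norm of their keys and keep a fixed budget. Retaining the low-norm keys is one plausible reading; the exact criterion is in the paper:

```python
import torch

# Hedged sketch of a norm-based KV cache eviction rule: score each cached entry
# by the L2 norm of its key and keep only a budget of entries. Keeping the
# LOW-norm keys is one plausible reading of the title, not a verified detail.
def compress_kv(keys, values, budget):
    # keys, values: (tokens, dim); budget: number of entries to retain
    norms = keys.norm(dim=-1)                          # L2 norm of each cached key
    keep = norms.topk(budget, largest=False).indices   # retain the lowest-norm keys
    keep = keep.sort().values                          # preserve the original token order
    return keys[keep], values[keep]

k, v = torch.randn(1024, 64), torch.randn(1024, 64)
k_small, v_small = compress_kv(k, v, budget=256)
print(k_small.shape, v_small.shape)                    # torch.Size([256, 64]) twice
```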