Simone Scardapane
Communication Efficient Split Learning of ViTs with Attention-based Double Compression
This paper proposes a novel communication-efficient Split Learning (SL) framework, named Attention-based Double Compression (ADC), which reduces the communication overhead required for transmitting intermediate Vision Transformers activati…
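The snippet describes compressing intermediate ViT activations before they are sent from client to server. A minimal sketch of attention-guided token compression in that spirit, assuming CLS-attention scores are available at the split point (the function name, keep ratio, and merging rule below are illustrative, not the paper's exact ADC procedure):

```python
import torch

def compress_activations(acts, cls_attn, keep_ratio=0.25):
    """Illustrative attention-guided compression of ViT activations.

    acts:     (batch, tokens, dim) intermediate activations at the split point
    cls_attn: (batch, tokens) attention each token receives from the CLS token
    Tokens with the highest CLS attention are kept; the rest are merged into
    a single average token, shrinking what the client must transmit.
    """
    b, n, d = acts.shape
    k = max(1, int(n * keep_ratio))
    idx = cls_attn.topk(k, dim=1).indices                            # (b, k) most-attended tokens
    keep = torch.gather(acts, 1, idx.unsqueeze(-1).expand(-1, -1, d))
    mask = torch.ones(b, n, dtype=torch.bool)
    mask.scatter_(1, idx, False)
    merged = acts[mask].view(b, n - k, d).mean(dim=1, keepdim=True)  # one summary token
    return torch.cat([keep, merged], dim=1)                          # (b, k + 1, d)

acts = torch.randn(2, 197, 768)                      # ViT-Base: 196 patches + CLS
cls_attn = torch.rand(2, 197)
print(compress_activations(acts, cls_attn).shape)    # torch.Size([2, 50, 768])
```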
Universal Properties of Activation Sparsity in Modern Large Language Models
Input-dependent activation sparsity is a notable property of deep learning models, which has been extensively studied in networks with ReLU activations and is associated with efficiency, robustness, and interpretability. However, the appro…
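As a minimal illustration of the input-dependent sparsity the abstract refers to, one can measure the fraction of (near-)zero hidden activations per input in a small ReLU network; the toy model and threshold below are assumptions, not the paper's measurement protocol:

```python
import torch
import torch.nn as nn

# Minimal illustration of input-dependent activation sparsity: the fraction of
# (near-)zero hidden units in a ReLU layer varies from one input to another.
torch.manual_seed(0)
mlp = nn.Sequential(nn.Linear(64, 256), nn.ReLU(), nn.Linear(256, 64))

hidden = {}
mlp[1].register_forward_hook(lambda m, inp, out: hidden.update(act=out))

x = torch.randn(8, 64)                                       # a batch of 8 inputs
mlp(x)
sparsity = (hidden["act"].abs() < 1e-6).float().mean(dim=1)  # per-input zero fraction
print(sparsity)                                              # one sparsity value per input
```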
A data attribution approach for unsupervised anomaly detection on multivariate time series
Anomaly detection is a challenging task that manifests in several forms depending on its context (e.g., fraud detection, network security, fault monitoring): given a collection of data, the goal is discovering the anomalous patterns diverg…
Mixture-of-experts graph transformers for interpretable particle collision detection
The Large Hadron Collider at CERN produces immense volumes of complex data from high-energy particle collisions, demanding sophisticated analytical techniques for effective interpretation. Neural Networks, including Graph Neural Networks, …
Micrographia in Parkinson's Disease: Automatic Recognition through Artificial Intelligence
Background: Parkinson's disease (PD) leads to handwriting abnormalities primarily characterized by micrographia. Whether micrographia manifests early in PD, worsens throughout the disease, and lastly responds to L‐Dopa is still under scient…
Interpretable classification of Levantine ceramic thin sections via neural networks
Classification of ceramic thin sections is fundamental for understanding ancient pottery production techniques, provenance, and trade networks. Although effective, traditional petrographic analysis is time-consuming. This study explores th…
Bringing AI to Clinicians: Simplifying Pleural Effusion Cytology Diagnosis with User-Friendly Models
Background: Malignant pleural effusions (MPEs) are common in advanced lung cancer patients. Cytological examination of pleural fluid is essential for identifying cell types but presents diagnostic challenges, particularly when reactive mes…
Adaptive token selection for scalable point cloud transformers
The recent surge in 3D data acquisition has spurred the development of geometric deep learning models for point cloud processing, boosted by the remarkable success of transformers in natural language processing. While point cloud transform…
Adaptive Computation Modules: Granular Conditional Computation for Efficient Inference
While transformer models have been highly successful, they are computationally inefficient. We observe that for each layer, the full width of the layer may be needed only for a small subset of tokens inside a batch and that the "effective"…
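A toy sketch of the per-token conditional-width idea described in the snippet, with a tiny router deciding how much of an FFN each token uses (illustrative only; the module names, sizes, and the masking shortcut are assumptions, not the paper's ACM design):

```python
import torch
import torch.nn as nn

class WidthRoutedFFN(nn.Module):
    """Toy per-token conditional computation: a router decides, for every token,
    whether to use the full hidden width of the FFN or only a narrow slice.
    Masking stands in for actually skipping compute; this is not the ACM design."""
    def __init__(self, dim=64, hidden=256, narrow=32):
        super().__init__()
        self.up, self.down = nn.Linear(dim, hidden), nn.Linear(hidden, dim)
        self.router = nn.Linear(dim, 1)
        self.narrow = narrow

    def forward(self, x):                                 # x: (tokens, dim)
        h = torch.relu(self.up(x))                        # (tokens, hidden)
        use_full = torch.sigmoid(self.router(x)) > 0.5    # (tokens, 1)
        mask = torch.zeros_like(h)
        mask[:, : self.narrow] = 1.0                      # narrow slice always active
        h = torch.where(use_full, h, h * mask)            # cheap tokens keep only the slice
        return self.down(h)

x = torch.randn(10, 64)
print(WidthRoutedFFN()(x).shape)                          # torch.Size([10, 64])
```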
MASS: MoErging through Adaptive Subspace Selection
Model merging has recently emerged as a lightweight alternative to ensembling, combining multiple fine-tuned models into a single set of parameters with no additional training overhead. Yet, existing merging methods fall short of matching …
Q-Filters: Leveraging QK Geometry for Efficient KV Cache Compression
Autoregressive language models rely on a Key-Value (KV) Cache, which avoids re-computing past hidden states during generation, making it faster. As model sizes and context lengths grow, the KV Cache becomes a significant memory bottleneck,…
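For context, a minimal single-head sketch of the KV cache mechanism the abstract recalls (background only; it does not implement Q-Filters):

```python
import torch

# Minimal single-head KV cache: at each generation step only the new token's
# key/value are computed and appended, instead of re-encoding the whole prefix.
d = 16
Wq, Wk, Wv = (torch.randn(d, d) for _ in range(3))
cache_k, cache_v = [], []

def step(x_new):                       # x_new: (1, d) hidden state of the new token
    cache_k.append(x_new @ Wk)
    cache_v.append(x_new @ Wv)
    K, V = torch.cat(cache_k), torch.cat(cache_v)         # (t, d): grows each step
    attn = torch.softmax((x_new @ Wq) @ K.T / d**0.5, dim=-1)
    return attn @ V                    # (1, d) attention output for the new token

for t in range(5):
    out = step(torch.randn(1, d))
print(len(cache_k), out.shape)         # 5 torch.Size([1, 16])
```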
Towards zero-shot learning in 3D change detection: improving generalization with custom augmentations and evaluation
Spatio-temporal transformers for decoding neural movement control
Objective. Deep learning tools applied to high-resolution neurophysiological data have significantly progressed, offering enhanced decoding, real-time processing, and readability for practical applications. However, the design of artifici…
Goal-oriented Communications based on Recursive Early Exit Neural Networks
This paper presents a novel framework for goal-oriented semantic communications leveraging recursive early exit models. The proposed approach is built on two key components. First, we introduce an innovative early exit strategy that dynami…
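A generic early-exit sketch of the kind of dynamic stopping the abstract mentions, with auxiliary classifiers and a confidence threshold (both assumptions here); the recursive strategy itself is described in the paper:

```python
import torch
import torch.nn as nn

# Generic early-exit inference: auxiliary classifiers after each block let a
# sample stop as soon as the prediction is confident enough. This illustrates
# the early-exit idea only, not the recursive scheme proposed in the paper.
torch.manual_seed(0)
blocks = nn.ModuleList(nn.Sequential(nn.Linear(32, 32), nn.ReLU()) for _ in range(4))
exits  = nn.ModuleList(nn.Linear(32, 10) for _ in range(4))

def predict(x, threshold=0.6):
    for depth, (block, head) in enumerate(zip(blocks, exits)):
        x = block(x)
        probs = torch.softmax(head(x), dim=-1)
        if probs.max() >= threshold:            # confident: exit without running deeper blocks
            return probs.argmax().item(), depth
    return probs.argmax().item(), depth         # fell through to the final exit

print(predict(torch.randn(1, 32)))
```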
Task Singular Vectors: Reducing Task Interference in Model Merging
Task Arithmetic has emerged as a simple yet effective method to merge models without additional training. However, by treating entire networks as flat parameter vectors, it overlooks key structural information and is susceptible to task in…
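For reference, plain task arithmetic, the flat-parameter-vector baseline the abstract critiques, in a few lines (the scaling factor alpha is an assumption):

```python
import torch

# Plain task arithmetic, the baseline the paper improves on: each fine-tuned
# model contributes a "task vector" (fine-tuned minus base weights), and the
# merged model is the base plus a scaled sum of those flat vectors.
def merge_task_arithmetic(base, finetuned_models, alpha=0.5):
    merged = {}
    for name, w in base.items():
        task_vectors = [ft[name] - w for ft in finetuned_models]
        merged[name] = w + alpha * sum(task_vectors)
    return merged

base = {"layer.weight": torch.zeros(4, 4)}
ft_a = {"layer.weight": torch.ones(4, 4)}
ft_b = {"layer.weight": 2 * torch.ones(4, 4)}
print(merge_task_arithmetic(base, [ft_a, ft_b])["layer.weight"][0, 0])   # tensor(1.5000)
```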
Interpreting Temporal Graph Neural Networks with Koopman Theory
Spatiotemporal graph neural networks (STGNNs) have shown promising results in many domains, from forecasting to epidemiology. However, understanding the dynamics learned by these models and explaining their behaviour is significantly more …
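A minimal, DMD-style sketch of the Koopman idea: fit a linear operator on consecutive hidden states of a trained model and inspect its eigenvalues. The data and dimensions below are placeholders, and the paper's actual analysis may differ:

```python
import numpy as np

# Koopman-style linearization of learned dynamics (a DMD-like sketch): fit a
# linear operator K that maps hidden states at time t to time t+1, then inspect
# its eigenvalues to read off dominant modes and their decay/oscillation rates.
rng = np.random.default_rng(0)
H = rng.standard_normal((100, 8))          # hidden states of a trained model, (timesteps, dim)

X, Y = H[:-1], H[1:]                       # pairs (h_t, h_{t+1})
K = np.linalg.lstsq(X, Y, rcond=None)[0]   # least-squares fit of h_{t+1} ≈ h_t @ K
eigvals = np.linalg.eigvals(K)
print(np.abs(eigvals))                     # |λ| < 1: decaying modes; |λ| ≈ 1: persistent
```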
How to Connect Speech Foundation Models and Large Language Models? What Matters and What Does Not
The remarkable performance achieved by Large Language Models (LLMs) has driven research efforts to leverage them for a wide range of tasks and input modalities. In speech-to-text (S2T) tasks, the emerging solution consists of projecting the…
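A hedged sketch of the projection-based recipe the abstract alludes to, where speech encoder outputs are shortened in time, mapped to the LLM embedding size, and prepended to the text embeddings (the pooling choice and dimensions are assumptions):

```python
import torch
import torch.nn as nn

# Common recipe sketched by the abstract: compress the speech encoder's output
# in time, project it into the LLM embedding space, and prepend it to the text
# token embeddings. Dimensions and the pooling choice here are assumptions.
speech_feats = torch.randn(1, 1500, 1024)            # (batch, frames, speech dim)
text_embeds  = torch.randn(1, 12, 4096)              # (batch, tokens, LLM dim)

downsample = nn.AvgPool1d(kernel_size=4, stride=4)   # shorten the frame sequence
projector  = nn.Linear(1024, 4096)                   # speech dim -> LLM embedding dim

compressed = downsample(speech_feats.transpose(1, 2)).transpose(1, 2)    # (1, 375, 1024)
prefix = projector(compressed)                                           # (1, 375, 4096)
llm_input = torch.cat([prefix, text_embeds], dim=1)                      # fed to the LLM
print(llm_input.shape)                                # torch.Size([1, 387, 4096])
```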
Topological Deep Learning with State-Space Models: A Mamba Approach for Simplicial Complexes
Graph Neural Networks based on the message-passing (MP) mechanism are a dominant approach for handling graph-structured data. However, they are inherently limited to modeling only pairwise interactions, making it difficult to explicitly ca…
Adaptive Layer Selection for Efficient Vision Transformer Fine-Tuning
Recently, foundation models based on Vision Transformers (ViTs) have become widely available. However, their fine-tuning process is highly resource-intensive, and it hinders their adoption in several edge or low-energy applications. To thi…
Enhancing High-Energy Particle Physics Collision Analysis through Graph Data Attribution Techniques
The experiments at the Large Hadron Collider at CERN generate vast amounts of complex data from high-energy particle collisions. This data presents significant challenges due to its volume and complex reconstruction, necessitating the use …
Conditional computation in neural networks: Principles and research trends
This article summarizes principles and ideas from the emerging area of applying conditional computation methods to the design of neural networks. In particular, we focus on neural networks that can dynamically activate or de-activate parts…
A Simple and Effective $L_2$ Norm-Based Strategy for KV Cache Compression
The deployment of large language models (LLMs) is often hindered by the extensive memory requirements of the Key-Value (KV) cache, especially as context lengths increase. Existing approaches to reduce the KV cache size involve either fine-…
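A hedged sketch of a key-norm-based eviction rule suggested by the title: score cached entries by the L2 norm of their keys and keep a fixed budget. Retaining the low-norm keys is one plausible reading; the exact criterion is in the paper:

```python
import torch

# Hedged sketch of a norm-based KV cache eviction rule: score each cached entry
# by the L2 norm of its key and keep only a budget of entries. Keeping the
# LOW-norm keys is one plausible reading of the title, not a verified detail.
def compress_kv(keys, values, budget):
    # keys, values: (tokens, dim); budget: number of entries to retain
    norms = keys.norm(dim=-1)                          # L2 norm of each cached key
    keep = norms.topk(budget, largest=False).indices   # retain the lowest-norm keys
    keep = keep.sort().values                          # preserve the original token order
    return keys[keep], values[keep]

k, v = torch.randn(1024, 64), torch.randn(1024, 64)
k_small, v_small = compress_kv(k, v, budget=256)
print(k_small.shape, v_small.shape)                    # torch.Size([256, 64]) twice
```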