Alessio Devoto
Universal Properties of Activation Sparsity in Modern Large Language Models
Input-dependent activation sparsity is a notable property of deep learning models, which has been extensively studied in networks with ReLU activations and is associated with efficiency, robustness, and interpretability. However, the appro…
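As a concrete illustration of the property described above (a generic sketch, not code from the paper), the following PyTorch snippet measures input-dependent activation sparsity as the fraction of post-ReLU activations that are exactly zero for a given batch; the layer sizes and the two-layer MLP are arbitrary assumptions.

```python
import torch
import torch.nn as nn

# Hypothetical two-layer MLP, standing in for a feed-forward block in a larger model.
mlp = nn.Sequential(nn.Linear(64, 256), nn.ReLU(), nn.Linear(256, 64))

def activation_sparsity(model: nn.Sequential, x: torch.Tensor) -> float:
    """Fraction of post-ReLU activations that are exactly zero for input x."""
    hidden = model[1](model[0](x))  # pre-activation, then ReLU
    return (hidden == 0).float().mean().item()

x = torch.randn(8, 64)  # a batch of 8 random inputs
print(f"sparsity: {activation_sparsity(mlp, x):.2%}")  # roughly 50% at random init
```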
Mixture-of-experts graph transformers for interpretable particle collision detection
The Large Hadron Collider at CERN produces immense volumes of complex data from high-energy particle collisions, demanding sophisticated analytical techniques for effective interpretation. Neural Networks, including Graph Neural Networks, …
Interpretable classification of Levantine ceramic thin sections via neural networks
Classification of ceramic thin sections is fundamental for understanding ancient pottery production techniques, provenance, and trade networks. Although effective, traditional petrographic analysis is time-consuming. This study explores th…
Adaptive Computation Modules: Granular Conditional Computation for Efficient Inference
While transformer models have been highly successful, they are computationally inefficient. We observe that for each layer, the full width of the layer may be needed only for a small subset of tokens inside a batch and that the "effective"…
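To make the observation concrete, here is a hypothetical sketch of token-level conditional computation in the spirit the abstract describes: a learned scorer picks a subset of tokens for the expensive path, while the rest pass through on the residual connection. The class name, the top-k rule, and all sizes are illustrative assumptions, not the Adaptive Computation Modules design itself.

```python
import torch
import torch.nn as nn

class GatedBlock(nn.Module):
    """Toy conditional-computation block: only the top-k scored tokens pass
    through the expensive sub-module; the rest are forwarded unchanged."""
    def __init__(self, dim: int, keep_ratio: float = 0.25):
        super().__init__()
        self.scorer = nn.Linear(dim, 1)  # per-token importance score
        self.ffn = nn.Sequential(
            nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim)
        )
        self.keep_ratio = keep_ratio

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (tokens, dim)
        k = max(1, int(self.keep_ratio * x.size(0)))
        scores = self.scorer(x).squeeze(-1)   # (tokens,)
        idx = scores.topk(k).indices          # tokens that get full compute
        out = x.clone()
        out[idx] = x[idx] + self.ffn(x[idx])  # heavy path only for selected tokens
        return out

x = torch.randn(16, 32)
print(GatedBlock(32)(x).shape)  # torch.Size([16, 32])
```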
Q-Filters: Leveraging QK Geometry for Efficient KV Cache Compression
Autoregressive language models rely on a Key-Value (KV) Cache, which avoids re-computing past hidden states during generation, making it faster. As model sizes and context lengths grow, the KV Cache becomes a significant memory bottleneck,…
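For context on the bottleneck the abstract refers to, here is a minimal sketch of a KV cache during autoregressive decoding, showing why its memory footprint grows linearly with generated length. This is illustrative only; it does not implement Q-Filters itself, and the single-head setup and dimensions are assumptions.

```python
import torch

dim, cache_k, cache_v = 64, [], []  # toy single-head attention state

def decode_step(q, k, v):
    """Append this step's key/value, then attend over the whole cache.
    Memory grows linearly with the number of generated tokens."""
    cache_k.append(k)
    cache_v.append(v)
    K = torch.stack(cache_k)  # (t, dim), where t grows every step
    V = torch.stack(cache_v)
    attn = torch.softmax(q @ K.T / dim**0.5, dim=-1)
    return attn @ V

for _ in range(16):  # 16 decoding steps
    out = decode_step(torch.randn(dim), torch.randn(dim), torch.randn(dim))
print(len(cache_k), "cached key/value pairs")  # 16
```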
Goal-oriented Communications based on Recursive Early Exit Neural Networks
This paper presents a novel framework for goal-oriented semantic communications leveraging recursive early exit models. The proposed approach is built on two key components. First, we introduce an innovative early exit strategy that dynami…
Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering
Large language models (LLMs) can store a significant amount of factual knowledge in their parameters. However, their parametric knowledge may conflict with the information provided in the context; this phenomenon, known as context-…
Analysing the Residual Stream of Language Models Under Knowledge Conflicts
Large language models (LLMs) can store a significant amount of factual knowledge in their parameters. However, their parametric knowledge may conflict with the information provided in the context. Such conflicts can lead to undesirable mod…
Adaptive Layer Selection for Efficient Vision Transformer Fine-Tuning
Recently, foundation models based on Vision Transformers (ViTs) have become widely available. However, their fine-tuning process is highly resource-intensive, which hinders their adoption in several edge or low-energy applications. To thi…
Conditional computation in neural networks: Principles and research trends
This article summarizes principles and ideas from the emerging area of applying conditional computation methods to the design of neural networks. In particular, we focus on neural networks that can dynamically activate or de-activate parts…
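One of the simplest forms of conditional computation is early exiting, where later layers are de-activated once an intermediate prediction is confident. A generic sketch follows; the architecture, sizes, and confidence threshold are arbitrary assumptions, not a specific design from the article.

```python
import torch
import torch.nn as nn

class EarlyExitNet(nn.Module):
    """Generic early-exit classifier: after each block, an auxiliary head
    predicts; if its confidence clears a threshold, later blocks are skipped."""
    def __init__(self, dim: int = 32, n_classes: int = 10, n_blocks: int = 4):
        super().__init__()
        self.blocks = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, dim), nn.ReLU()) for _ in range(n_blocks)
        )
        self.heads = nn.ModuleList(nn.Linear(dim, n_classes) for _ in range(n_blocks))

    def forward(self, x: torch.Tensor, threshold: float = 0.9) -> torch.Tensor:
        for block, head in zip(self.blocks, self.heads):
            x = block(x)
            probs = head(x).softmax(dim=-1)
            if probs.max() >= threshold:  # confident enough: skip remaining blocks
                return probs
        return probs  # fell through: every block was used

net = EarlyExitNet()
print(net(torch.randn(32)).shape)  # torch.Size([10])
```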
A Simple and Effective $L_2$ Norm-Based Strategy for KV Cache Compression
The deployment of large language models (LLMs) is often hindered by the extensive memory requirements of the Key-Value (KV) cache, especially as context lengths increase. Existing approaches to reduce the KV cache size involve either fine-…
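A hedged sketch of what such a compression step might look like, assuming (as the title suggests) that cached entries are scored by the $L_2$ norm of their key vectors and the lowest-norm entries are retained; the function name, shapes, and budget are illustrative.

```python
import torch

def compress_kv(keys: torch.Tensor, values: torch.Tensor, keep: int):
    """Score each cached position by the L2 norm of its key vector and
    retain the `keep` lowest-norm entries. Shapes: (t, dim)."""
    norms = keys.norm(dim=-1)                      # (t,) one score per position
    idx = norms.topk(keep, largest=False).indices  # lowest-norm positions
    idx, _ = idx.sort()                            # preserve original token order
    return keys[idx], values[idx]

keys, values = torch.randn(128, 64), torch.randn(128, 64)
k2, v2 = compress_kv(keys, values, keep=32)  # 4x smaller cache
```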
Are We Done with MMLU?
Maybe not. We identify and analyse errors in the popular Massive Multitask Language Understanding (MMLU) benchmark. Even though MMLU is widely adopted, our analysis demonstrates numerous ground truth errors that obscure the true capabiliti…
Adaptive Semantic Token Selection for AI-native Goal-oriented Communications
In this paper, we propose a novel design for AI-native goal-oriented communications, exploiting transformer neural networks under dynamic inference constraints on bandwidth and computation. Transformers have become the standard architectur…
Class incremental learning with probability dampening and cascaded gated classifier
Humans are capable of acquiring new knowledge and transferring it to different domains while incurring little forgetting. The same ability, called Continual Learning, is challenging to achieve when operating with neural networ…
On the robustness of vision transformers for in-flight monocular depth estimation
Monocular depth estimation (MDE) has shown impressive performance recently, even in zero-shot or few-shot scenarios. In this paper, we consider the use of MDE on board low-altitude drone flights, which is required in a number of safety-cri…
Reidentification of Objects From Aerial Photos With Hybrid Siamese Neural Networks
In this paper, we consider the task of re-identifying the same object in different photos taken from separate positions and angles during aerial reconnaissance, which is a crucial task for the maintenance and surveillance of critical large…