Explanipedia

Leveraging knowledge graphs and LLMs for content-based reviewer assignment Open

F Bagheri, Davide Buscaldi, Diego Reforgiato Recupero · 2025

The growing volume of academic submissions in recent years has highlighted the need for scalable and accurate reviewer assignment systems, able to go beyond techniques based on manual processes and basic keyword matching. We propose a novel pi…

Unveiling Decision-Making in LLMs for Text Classification: Extraction of influential and interpretable concepts with Sparse Autoencoders Open

Mathis Le Bail, Jérémie Dentan, Davide Buscaldi · 2025

Sparse Autoencoders (SAEs) have been successfully used to probe Large Language Models (LLMs) and extract interpretable concepts from their internal representations. These concepts are linear combinations of neuron activations that correspo…

CS-KG 2.0: A Large-scale Knowledge Graph of Computer Science Open

Danilo Dessì, Francesco Osborne, Davide Buscaldi, Diego Reforgiato Recupero, Enrico Motta · 2025

The rapid evolution of AI and the increased accessibility of scientific articles through open access mark a pivotal moment in research. AI-driven tools are reshaping how scientists explore, interpret, and contribute to the body of scienti…

PACT: Pruning and Clustering-Based Token Reduction for Faster Visual Language Models Open

Mohamed Dhouib, Davide Buscaldi, Sonia Vanier, Aymen Shabou · 2025

Visual Language Models require substantial computational resources for inference due to the additional input tokens needed to represent visual information. However, these visual tokens often contain redundant and unimportant information, r…

Research hypothesis generation over scientific knowledge graphs Open

Agustín Borrego, Danilo Dessì, Daniel Ayala, Inma Hernández, Francesco Osborne, et al. · 2025

Computer science

Generating research hypotheses is a crucial step in scientific investigation that involves the creation of precise, verifiable, and logically valid statements that can be empirically examined. Therefore, many efforts have been made to auto…

Rewiring Techniques to Mitigate Oversquashing and Oversmoothing in GNNs: A Survey Open

Hugo Attali, Davide Buscaldi, Nathalie Pernelle · 2024

Environmental science Geology

Graph Neural Networks (GNNs) are powerful tools for learning from graph-structured data, but their effectiveness is often constrained by two critical challenges: oversquashing, where the excessive compression of information from distant no…

Predicting memorization within Large Language Models fine-tuned for classification Open

Jérémie Dentan, Davide Buscaldi, Aymen Shabou, Sonia Vanier · 2024

Computer science Psychology Philosophy

Large Language Models have received significant attention due to their abilities to solve a wide range of complex tasks. However, these models memorize a significant proportion of their training data, posing a serious threat when disclosed …

Workshop on Deep Learning and Large Language Models for Knowledge Graphs (DL4KG) Open

Mehwish Alam, Davide Buscaldi, Michael Cochez, Genet Asefa Gesese, Francesco Osborne, et al. · 2024

Computer science

The use of Knowledge Graphs (KGs), which constitute large networks of real-world entities and their interrelationships, has grown rapidly. A substantial body of research has emerged, exploring the integration of deep learning (DL) and large…

Triplétoile: Extraction of knowledge from microblogging text Open

Vanni Zavarella, Sergio Consoli, Diego Reforgiato Recupero, Gianni Fenu, Simone Angioni, et al. · 2024

Computer science Mathematics

Numerous methods and pipelines have recently emerged for the automatic extraction of knowledge graphs from documents such as scientific publications and patents. However, adapting these methods to incorporate alternative text sources like …

An Ensemble Method Based on the Combination of Transformers with Convolutional Neural Networks to Detect Artificially Generated Text Open

Vijini Liyanage, Davide Buscaldi · 2023

Computer science Engineering

Thanks to the state-of-the-art Large Language Models (LLMs), language generation has reached outstanding levels. These models are capable of generating high quality content, thus making it a challenging task to detect generated text from h…

A Knowledge Graph-Based Method for the Geolocation of Tweets Open

Fernando Lovera, Yudith Cardinale, Davide Buscaldi, Thierry Charnois · 2023

Computer science Geography

Twitter geolocation is useful for various purposes, including tracking COVID-19 perceptions, analyzing political trends, and managing natural disasters. However, accurately predicting geolocations based on tweet content remains a challenge…

Data produced in the context of RCLN participation to the Visual WSD task at SemEval 2023 Open

Davide Buscaldi · 2023

Computer science Geography Engineering

The data contain generated captions from the train, trial, and test images, and generated images from the diffusion model. Refer to https://github.com/dbuscaldi/VisualWSD23 for the code.

RCLN at SemEval-2023 Task 1: Leveraging Stable Diffusion and Image Captions for Visual WSD Open

Antonina Mijatovic, Davide Buscaldi, Ekaterina Borisova · 2023

Computer science Mathematics Economics

This paper describes the participation of the RCLN team at the Visual Word Sense Disambiguation task at SemEval 2023. The participation was focused on the use of CLIP as a base model for the matching between text and images with additional…

ArXiV-Entity/Relation annotated dataset Open

Davide Buscaldi, Seynabou Sarr, Juan Luis Garcia-Mendoza · 2022

Computer science

This dataset is a collection of abstracts from the CS section of ArXiV, each annotated with DyGIE++ (SciERC model). The dataset can be used to train triple extractors or to cluster triples (in the Computer Science and AI domains). Supersede…

SciCheck Open

Agustín Borrego, Danilo Dessì, Inma Hernández, Francesco Osborne, Diego Reforgiato Recupero, et al. · 2022

Computer science

This archive contains AI-KG with additional 300K triples used in the paper "Completing Scientific Facts in Knowledge Graphs of Research Concepts", accepted in IEEE Access.

Word Sense Induction with Hierarchical Clustering and Mutual Information Maximization Open

Hadi Abdine, Moussa Kamal Eddine, Michalis Vazirgiannis, Davide Buscaldi · 2022

Computer science Mathematics Political science

Word sense induction (WSI) is a difficult problem in natural language processing that involves the unsupervised automatic detection of a word's senses (i.e. meanings). Recent work achieves significant results on the WSI task by pre-trainin…

ArXiV-AIKG dataset Open

Davide Buscaldi, Seynabou Sarr · 2022

Computer science

This dataset is a collection of abstracts from the CS section of ArXiV, each paired with triples from the Artificial Intelligence Knowledge Graph (AIKG, https://scholkg.kmi.open.ac.uk/). The pairing is determined by the fact that one or more…

Editorial of the Special Issue on Deep Learning and Knowledge Graphs Open

Mehwish Alam, Davide Buscaldi, Michael Cochez, Francesco Osborne, Diego Reforgiato Recupero, et al. · 2022

Computer science Psychology

This special issue aims to reinforce the relationships between these communities and foster interdisciplinary research in the areas of KG, Deep Learning, and Natural Language Processing. The works that we have requested from authors should …

A Benchmark Corpus for the Detection of Automatically Generated Text in Academic Publications Open

Vijini Liyanage, Davide Buscaldi, Adeline Nazarenko · 2022

Computer science Geography Economics

Automatic text generation based on neural language models has achieved performance levels that make the generated text almost indistinguishable from that written by humans. Despite the value that text generation can have in various applic…

Completing Scientific Facts in Knowledge Graphs of Research Concepts Open

Agustín Borrego, Danilo Dessì, Inma Hernández, Francesco Osborne, Diego Reforgiato Recupero, et al. · 2022

Computer science Mathematics Philosophy

In the last few years, we have witnessed the emergence of several knowledge graphs that explicitly describe research knowledge with the aim of enabling intelligent systems for supporting and accelerating the scientific process. These resou…

SciCheck Open

Agustín Borrego, Danilo Dessì, Inma Hernández, Francesco Osborne, Diego Reforgiato Recupero, et al. · 2021

Computer science

This archive contains AI-KG with additional 300K triples used in the paper "Completing Scientific Facts in Knowledge Graphs of Research Concepts", accepted in IEEE Access.
