Marc Bertin
Citation Context Analysis: Evaluating Human vs. AI Annotations in Gameplay Bricks Research Open
Recent advances in citation analysis have moved beyond traditional bibliometric approaches to explore the contextual roles of citations in academic discourse. While Large Language Models (LLMs) offer new possibilities for analyzing citatio…
Categorization of scientometric data in a Benfordian context Open
This article aims to improve our understanding of scientometric data in a Benfordian context. Recently, Benford’s law has been used to detect scientific fraud. However, we need to better understand its application to scientometric data. Th…
Scientometric Dataset for Benford's law Open
This dataset proposes datasets built around Benford's law in the field of Scientometrics. This new version adds data on the ratio. All first digit data are available.
CriticalMinds: Enhancing ML Models for ESG Impact Analysis Categorisation Using Linguistic Resources and Aspect-Based Sentiment Analysis Open
Breaking Boundaries in Citation Parsing: A Comparative Study of Generative LLMs and Traditional Out-of-the-box Citation Parsers Open
Synthetic Dataset of Citation Strings in 12 Styles Open
This dataset was produced in the aim of testing different tools for citation string parsing, as part of the experiment reported in the paper: Iana Atanassova and Marc Bertin, 2024. "Breaking Boundaries in Citation Parsing: A Comparative St…
Semantic annotation of PLoS journal citation contexts Open
Dataset
Scientometric Dataset for Benford's law Open
This dataset proposes datasets built around Benford's law in the field of Scientometrics. This new version adds data on the ratio
Contextual Analysis of Citations Using Rule-Based Approaches: "The Best Soups are Made in Old Pots" Open
In this paper, we present an experiment on annotating citation contexts using a rule-based approach to investigate the extent of text around citations. The study of citation contexts and their types has been a central issue in recent work …
Multilinguism in References: a Study of the ISTEX Dataset Open
Citing Foreign Language Sources: an Analysis of the S2ORC Dataset Open
The multilingual aspect of citation contexts Open
English is the language of reference among scientists. Dissemination in a dominant language, such as English, necessarily influences the production of knowledge and this can have a number of consequences especially in the evaluative metric…
Scientometric Dataset for Benford's law Open
This dataset proposes datasets built around Benford's law in the field of Scientometrics
Editorial: Mining Scientific Papers, Volume II: Knowledge Discovery and Data Exploitation Open
Preprint Citations in PLOS Dataset Open
Preprints are research articles that have been published online before undergoing peer review. The role of preprints in the scientific production has been growing in recent years. Our objective is to study these practices and evaluate the …
Determining Citation Blocks using End-to-end Neural Coreference Resolution Model for Citation Context Analysis Open
Identifying the Conceptual Space of Citation Contexts using Coreferences. Open
Editorial: Mining Scientific Papers: NLP-enhanced Bibliometrics Open
A preliminary study to compare deep learning with rule-based approaches for citation classification Open
Studying Uncertainty in Science: a distributional analysis through the IMRaD structure Open
Interec: In-Text Reference Corpus - Single References Dataset Open
This dataset contains a set of sentences extracted from articles published by the Public Library of Science (PLOS) up to September 2013. Information is given on the position of the sentences relative to the article and the section in which…
Editorial for the Second Workshop on Mining Scientific Papers: Computational Linguistics and Bibliometrics (CLBib2017) Open