Benoît Crabbé
RETROcode: Leveraging a Code Database for Improved Natural Language to Code Generation
As text and code resources have expanded, large-scale pre-trained models have shown promising capabilities in code generation tasks, typically employing supervised fine-tuning with problem statement-program pairs. However, increasing model…
NuNER: Entity Recognition Encoder Pre-training via LLM-Annotated Data
Large Language Models (LLMs) have shown impressive abilities in data annotation, opening the way for new approaches to solve classic NLP problems. In this paper, we show how to use LLMs to create NuNER, a compact language representation mo…
Assessing the Capacity of Transformer to Abstract Syntactic Representations: A Contrastive Analysis Based on Long-distance Agreement
Many studies have shown that transformers are able to predict subject-verb agreement, demonstrating their ability to uncover an abstract representation of the sentence in an unsupervised way. Recently, Li et al. (2021) found that transform…
Assessing the Capacity of Transformer to Abstract Syntactic Representations: A Contrastive Analysis Based on Long-distance Agreement
The long-distance agreement, evidence for syntactic structure, is increasingly used to assess the syntactic generalization of Neural Language Models. Much work has shown that transformers are capable of high accuracy in varied agreement ta…
The impact of lexical and grammatical processing on generating code from natural language
Considering the seq2seq architecture of TranX for natural language to code translation, we identify four key components of importance: grammatical constraints, lexical preprocessing, input representations, and copy mechanisms. To study the…
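One of the components named in this abstract, the copy mechanism, can be illustrated with a minimal sketch: the decoder mixes a generation distribution over the target vocabulary with a copy distribution over source tokens, so that identifiers mentioned in the problem statement can be copied verbatim into the generated code. All names, shapes, and the gating formulation here are hypothetical simplifications, not the actual TranX implementation.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def copy_or_generate(dec_state, src_states, src_tokens, vocab, W_gen, W_copy, w_gate):
    """Mix a generation distribution over `vocab` with a copy distribution
    over the source tokens, gated by a scalar p_copy (hypothetical sketch)."""
    gen_scores = W_gen @ dec_state                    # one score per vocab item
    copy_scores = src_states @ (W_copy @ dec_state)   # one score per source token
    p_gen = softmax(gen_scores)
    p_copy_dist = softmax(copy_scores)
    p_copy = 1.0 / (1.0 + np.exp(-float(w_gate @ dec_state)))  # gate in (0, 1)
    # final distribution: scatter the copy mass onto the matching vocab ids
    final = (1.0 - p_copy) * p_gen
    for tok, pc in zip(src_tokens, p_copy_dist):
        if tok in vocab:
            final[vocab[tok]] += p_copy * pc
    return final
```

The point of the gate is that the model can learn when to trust the vocabulary (keywords, operators) and when to reuse a rare identifier from the input.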
How Distributed are Distributed Representations? An Observation on the Locality of Syntactic Information in Verb Agreement Tasks
This work addresses the question of the localization of syntactic information encoded in the transformers representations. We tackle this question from two perspectives, considering the object-past participle agreement in French, by identi…
Unifying Parsing and Tree-Structured Models for Generating Sentence Semantic Representations
Are Transformers a Modern Version of ELIZA? Observations on French Object Verb Agreement
Many recent works have demonstrated that unsupervised sentence representations of neural networks encode syntactic information by observing that neural language models are able to predict the agreement between a verb and its subject. We ta…
Word order in French: the role of animacy
A major goal of the quantitative study of syntax has been to identify factors that have predictive power on speaker choices in the face of word-order or valence alternations (e.g. Arnold et al. 2000; Bresnan et al. 2007; Bresnan & Ford…
Can RNNs learn Recursive Nested Subject-Verb Agreements?
One of the fundamental principles of contemporary linguistics states that language processing requires the ability to extract recursively nested tree structures. However, it remains unclear whether and how this code could be implemented in…
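The kind of stimulus such studies rely on can be sketched as a small generator of center-embedded agreement items: each added level nests a relative clause, and each verb must agree in number with its matching subject. The toy lexicon and the alternating-number scheme below are illustrative assumptions, not the materials of the paper.

```python
def nested_agreement(depth):
    """Build a center-embedded subject-verb agreement item with `depth`
    nested relative clauses (toy English lexicon, hypothetical)."""
    nouns = {"sg": "the boy", "pl": "the boys"}
    verbs = {"sg": "sees", "pl": "see"}
    # alternate grammatical number at each embedding level
    numbers = ["sg" if i % 2 == 0 else "pl" for i in range(depth + 1)]
    subjects = [nouns[n] for n in numbers]
    # verbs surface inside-out: the innermost subject's verb comes first
    verb_seq = [verbs[n] for n in reversed(numbers)]
    words = subjects[:1]
    for s in subjects[1:]:
        words += ["that", s]
    return " ".join(words + verb_seq)

# depth 1 yields: "the boy that the boys see sees" -- the final verb must
# agree with the outermost subject across the intervening clause.
```

A model that tracks the nesting must match each verb to the subject at the same depth, which is exactly what makes these items a probe for recursive structure.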
Contrasting distinct structured views to learn sentence embeddings
We propose a self-supervised method that builds sentence embeddings from the combination of diverse explicit syntactic structures of a sentence. We assume structure is crucial to building consistent representations as we expect sentence me…
How Many Layers and Why? An Analysis of the Model Depth in Transformers
Antoine Simoulin, Benoit Crabbé. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: Student Research Workshop. 2021.
Unlexicalized Transition-based Discontinuous Constituency Parsing
Lexicalized parsing models are based on the assumptions that (i) constituents are organized around a lexical head and (ii) bilexical statistics are crucial to solve ambiguities. In this paper, we introduce an unlexicalized transition-based…
Using Wiktionary as a resource for WSD: the case of French verbs
As opposed to word sense induction, word sense disambiguation (WSD) has the advantage of using interpretable senses, but requires annotated data, which are quite rare for most languages except English (Miller et al. 1993; Fellbaum, 1998).…
Taraldsen’s generalization in diachrony: evidence from a diachronic corpus
Multilingual Lexicalized Constituency Parsing with Word-Level Auxiliary Tasks
We introduce a constituency parser based on a bi-LSTM encoder adapted from recent work (Cross and Huang, 2016b; Kiperwasser and Goldberg, 2016), which can incorporate a lower level character biLSTM (Ballesteros et al., 2015; Plank et al., …
Incremental Discontinuous Phrase Structure Parsing with the GAP Transition
This article introduces a novel transition system for discontinuous lexicalized constituent parsing called SR-GAP. It is an extension of the shift-reduce algorithm with an additional gap transition. Evaluation on two German treebanks shows…
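The core idea of a gap transition can be illustrated with a toy shift-reduce parser: GAP sets the stack top aside so that a later reduction can combine non-adjacent items, producing a discontinuous constituent. This is a deliberate simplification for illustration; the actual SR-GAP configurations, transitions, and oracle in the article differ.

```python
class GapParser:
    """Toy shift-reduce parser with a GAP transition (hypothetical
    simplification of the idea behind SR-GAP)."""
    def __init__(self, tokens):
        self.stack, self.deque, self.buffer = [], [], list(tokens)

    def shift(self):
        # push the next input token onto the stack
        self.stack.append(self.buffer.pop(0))

    def gap(self):
        # set the stack top aside so a later reduce can reach the item below it
        self.deque.append(self.stack.pop())

    def reduce(self, label):
        # combine the two accessible items into a constituent; any gapped
        # material ends up outside the new node -> a discontinuous constituent
        right = self.stack.pop()
        left = self.stack.pop()
        while self.deque:
            self.stack.append(self.deque.pop())
        self.stack.append((label, left, right))
```

For input A B C, the sequence shift, shift, gap, shift, reduce("X") builds a constituent X over A and C while B remains outside it, which a plain shift-reduce system cannot do.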
Boosting for Efficient Model Selection for Syntactic Parsing
Natural Language Processing, 60 years after the Chomsky-Schützenberger hierarchy
Overview of Natural Language Processing, 60 years after the Chomsky-Schützenberger hierarchy
Neural Greedy Constituent Parsing with Dynamic Oracles
Dynamic oracle training has shown substantial improvements for dependency parsing in various settings, but has not been explored for constituent parsing. The present article introduces a dynamic oracle for transition-based constituent parsi…