Nathan Noiry
Learning to rank anomalies: scalar performance criteria and maximization of rank statistics Open
The ability to collect and store ever more massive data, unlabeled in many cases, has been accompanied by the need to process them efficiently in order to extract relevant information and possibly design solutions based on the latter. In v…
Towards More Robust NLP System Evaluation: Handling Missing Scores in Benchmarks Open
A Novel Information-Theoretic Objective to Disentangle Representations for Fair Classification Open
One of the pursued objectives of deep learning is to provide tools that learn abstract representations of reality from the observation of multiple contextual situations. More precisely, one wishes to extract disentangled representations wh…
Toward Stronger Textual Attack Detectors Open
The landscape of available textual adversarial attacks keeps growing, posing severe threats and raising concerns regarding the deep NLP system's integrity. However, the crucial problem of defending against malicious attacks has only drawn …
Online Matching in Geometric Random Graphs Open
We investigate online maximum cardinality matching, a central problem in ad allocation. In this problem, users are revealed sequentially, and each new user can be paired with any previously unmatched campaign that it is compatible with. De…
A Functional Data Perspective and Baseline On Multi-Layer Out-of-Distribution Detection Open
A key feature of out-of-distribution (OOD) detection is to exploit a trained neural network by extracting statistical patterns and relationships through the multi-layer classifier to detect shifts in the expected input data distribution. D…
Towards More Robust NLP System Evaluation: Handling Missing Scores in Benchmarks Open
The evaluation of natural language processing (NLP) systems is crucial for advancing the field, but current benchmarking approaches often assume that all systems have scores available for all tasks, which is not always practical. In realit…
The Glass Ceiling of Automatic Evaluation in Natural Language Generation Open
A Novel Information Theoretic Objective to Disentangle Representations for Fair Classification Open
Toward Stronger Textual Attack Detectors Open
Beyond Mahalanobis-Based Scores for Textual OOD Detection Open
Deep learning methods have boosted the adoption of NLP systems in real-life applications. However, they turn out to be vulnerable to distribution shifts over time which may cause severe dysfunctions in production systems, urging practition…
Mitigating Gender Bias in Face Recognition Using the von Mises-Fisher Mixture Model Open
In spite of the high performance and reliability of deep learning algorithms in a wide range of everyday applications, many investigations tend to show that a lot of models exhibit biases, discriminating against specific subgroups of the p…
The Glass Ceiling of Automatic Evaluation in Natural Language Generation Open
Automatic evaluation metrics capable of replacing human judgments are critical to allowing fast development of new methods. Thus, numerous research efforts have focused on crafting such metrics. In this work, we take a step back and analyz…
Learning Disentangled Textual Representations via Statistical Measures of Similarity Open
When working with textual data, a natural application of disentangled representations is fair classification where the goal is to make predictions without being biased (or influenced) by sensitive attributes that may be present in the data…
Large deviations for spectral measures of some spiked matrices Open
We prove large deviation principles for spectral measures of perturbed (or spiked) matrix models in the direction of an eigenvector of the perturbation. In each model under study, we provide two approaches, one of which relies on large d…
What are the best systems? New perspectives on NLP Benchmarking Open
In Machine Learning, a benchmark refers to an ensemble of datasets associated with one or multiple metrics, together with a way to aggregate different systems' performances. They are instrumental in (i) assessing the progress of new methods …
Depth first exploration of a configuration model Open
We introduce an algorithm that constructs a random uniform graph with prescribed degree sequence together with a depth first exploration of it. In the so-called supercritical regime where the graph contains a giant component, we prove t…
Learning Disentangled Textual Representations via Statistical Measures of Similarity Open
When working with textual data, a natural application of disentangled representations is fair classification, where the goal is to make predictions without being biased (or influenced) by sensitive attributes that may be present in the d…
Learning to Rank Anomalies: Scalar Performance Criteria and Maximization of Two-Sample Rank Statistics Open
The ability to collect and store ever more massive databases has been accompanied by the need to process them efficiently. In many cases, most observations have the same behavior, while a probably small proportion of these observations are…
Learning from Biased Data: A Semi-Parametric Approach Open
Online Matching in Sparse Random Graphs: Non-Asymptotic Performances of Greedy Algorithm Open
Motivated by sequential budgeted allocation problems, we investigate online matching problems where connections between vertices are not i.i.d., but they have fixed degree distributions -- the so-called configuration model. We estimate …
Online Matching in Sparse Random Graphs: Non-Asymptotic Performances of Greedy Algorithm Open
Motivated by sequential budgeted allocation problems, we investigate online matching problems where connections between vertices are not i.i.d., but they have fixed degree distributions -- the so-called configuration model. We estimate the…
Long induced paths in a configuration model Open
In an article published in 1987 in Combinatorica, Frieze and Jackson established a lower bound on the length of the longest induced path (and cycle) in a sparse random graph. Their bound is obtained through a rough analysis…
A solvable class of renewal processes and its applications Open
When the distribution of the inter-arrival times of a renewal process is a mixture of geometric laws, we prove that the renewal function of the process is given by the moments of a probability measure which is explicitly related to the mix…
A solvable class of renewal processes Open
When the distribution of the inter-arrival times of a renewal process is a mixture of geometric laws, we prove that the renewal function of the process is given by the moments of a probability measure which is explicitly related to the mix…
Depth First Exploration of a Configuration Model Open
We introduce an algorithm that constructs a random uniform graph with prescribed degree sequence together with a depth first exploration of it. In the so-called supercritical regime where the graph contains a giant component, we prove that…
Spectra of Wishart Matrices with size-dependent entries Open
We prove the convergence of the empirical spectral measure of Wishart matrices with size-dependent entries and characterize the limiting law by its moments. We apply our result to the cases where the entries are Bernoulli variables with pa…