Jeremy Goldwasser
Gaussian Rank Verification
Statistical experiments often seek to identify random variables with the largest population means. This inferential task, known as rank verification, has been well-studied on Gaussian data with equal variances. This work provides the first…
Unifying Image Counterfactuals and Feature Attributions with Latent-Space Adversarial Attacks
Counterfactuals are a popular framework for interpreting machine learning predictions. These what-if explanations are notoriously challenging to create for computer vision models: standard gradient-based methods are prone to produce advers…
Challenges in Estimating Time-Varying Epidemic Severity Rates from Aggregate Data
Severity rates like the case-fatality rate and infection-fatality rate are key metrics in public health. To guide decision-making in response to changes like new variants or vaccines, it is imperative to understand how these rates shift in…
Ascle—A Python Natural Language Processing Toolkit for Medical Text Generation: Development and Evaluation Study
Background: Medical texts present significant domain-specific challenges, and manually curating these texts is a time-consuming and labor-intensive process. To address this, natural language processing (NLP) algorithms have been developed t…
Ascle: A Python Natural Language Processing Toolkit for Medical Text Generation (Preprint)
Background: Medical texts present significant domain-specific challenges, and manually curating these texts is a time-consuming and labor-intensive process. Therefore, natural language processing (NLP) algorithms have been developed to aut…
Statistical Significance of Feature Importance Rankings
Feature importance scores are ubiquitous tools for understanding the predictions of machine learning models. However, many popular attribution methods suffer from high instability due to random sampling. Leveraging novel ideas from hypothe…
Ascle: A Python Natural Language Processing Toolkit for Medical Text Generation
This study introduces Ascle, a pioneering natural language processing (NLP) toolkit designed for medical text generation. Ascle is tailored for biomedical researchers and healthcare professionals with an easy-to-use, all-in-one solution th…
Stabilizing Estimates of Shapley Values with Control Variates
Shapley values are among the most popular tools for explaining predictions of blackbox machine learning models. However, their high computational cost motivates the use of sampling approximations, inducing a considerable degree of uncertai…
Neural Natural Language Processing for Unstructured Data in Electronic Health Records: a Review
Electronic health records (EHRs), digital collections of patient healthcare events and observations, are ubiquitous in medicine and critical to healthcare delivery, operations, and research. Despite this central role, EHRs are notoriously …
Forest Fire Clustering: Iterative Label Propagation Clustering and Monte Carlo Validation For Single-cell Sequencing Analysis
With the rise of single-cell sequencing technologies, there is a growing need for robust clustering algorithms to extract deeper insights from data. Here, we introduce an intuitive and efficient clustering method, Forest Fire Clustering, f…
Forest Fire Clustering for Single-cell Sequencing with Iterative Label Propagation and Parallelized Monte Carlo Simulation
In the era of single-cell sequencing, there is a growing need to extract insights from data with clustering methods. Here, we introduce Forest Fire Clustering, an efficient and interpretable method for cell-type discovery from single-cell …
Forest Fire Clustering: Cluster-oriented Label Propagation Clustering and Monte Carlo Verification Inspired by Forest Fire Dynamics.
Clustering methods group data points together and assign them group-level labels. However, it has been difficult to evaluate the confidence of the clustering results. Here, we introduce a novel method that could not only find robust cluste…