David B. Dunson
Common to rare transfer learning (CORAL) enables inference and prediction for a quarter million rare Malagasy arthropods
DNA-based biodiversity surveys result in massive-scale data, including up to millions of species—of which, most are rare. Making the most of such data for inference and prediction requires modeling approaches that can relate species occurr…
Learning Discrete Bayesian Networks with Hierarchical Dirichlet Shrinkage
Discrete Bayesian networks (DBNs) provide a broadly useful framework for modeling dependence structures in multivariate categorical data. There is a vast literature on methods for inferring conditional probabilities and graphical structure…
Inference on covariance structure in high-dimensional multi-view data
This article focuses on covariance estimation for multi-view data. Popular approaches rely on factor-analytic decompositions that have shared and view-specific latent factors. Posterior computation is conducted via expensive and brittle Ma…
Bayesian learning of clinically meaningful sepsis phenotypes in northern Tanzania
Sepsis is a life-threatening condition caused by a dysregulated host response to infection. Recently, researchers have hypothesized that sepsis consists of a heterogeneous spectrum of distinct subtypes, motivating several studies to identi…
Local graph estimation: Interpretable network discovery for complex data
Large, complex datasets often include a small set of variables of primary interest, such as clinical outcomes or known biomarkers, whose relation to the broader system is the main focus of analysis. In these situations, exhaustively estima…
Correction
Bhattacharya et al. (2015, Journal of the American Statistical Association 110(512): 1479-1490) introduce a novel prior, the Dirichlet-Laplace (DL) prior, and propose a Markov chain Monte Carlo (MCMC) method to simulate posterior draws und…
Workflow for Statistical Analysis of Environmental Mixtures
We note several methods may be equally appropriate for a specific context. This article does not present a comparison or contrast of methods or recommend one method over another. Rather, the presented workflow can be used to identify a set…
Targeted empirical Bayes for more supervised joint factor analysis
Joint Bayesian factor models are popular for characterizing relationships between multivariate correlated predictors and a response variable. Standard models assume that all variables, including both the predictors and the response, are co…
Bayesian Deep Latent Class Regression
High-dimensional categorical data arise in diverse scientific domains and are often accompanied by covariates. Latent class regression models are routinely used in such settings, reducing dimensionality by assuming conditional independence…
A Bayesian theory for estimation of biodiversity
Statistical inference on biodiversity has a rich history going back to R. A. Fisher. An influential ecological theory suggests the existence of a fundamental biodiversity number, denoted $\alpha$, which coincides with the precision parameter of a …
On the Statistical Capacity of Deep Generative Models
Deep generative models are routinely used in generating samples from complex, high-dimensional distributions. Despite their apparent successes, their statistical properties are not well understood. A common assumption is that with enough t…
Environmental Mixtures Analysis (E-MIX) Workflow and Methods Repository
Human exposure to complex, changing, and variably correlated mixtures of environmental chemicals has presented analytical challenges to epidemiologists and human health researchers. There have been a wide variety of recent advances in stat…
Joint species distribution modeling of abundance data through latent variable barcodes
Accelerating global biodiversity loss has highlighted the role of complex relationships and shared patterns among species in determining their responses to environmental changes. The structure of an ecological community, represented by pat…
Nested exemplar latent space models for dimension reduction in dynamic networks
Dynamic latent space models are widely used for characterizing changes in networks and relational data over time. These models assign to each node latent attributes that characterize connectivity with other nodes, with these latent attribu…
Marginally interpretable spatial logistic regression with bridge processes
In including random effects to account for dependent observations, the odds ratio interpretation of logistic regression coefficients is changed from population-averaged to subject-specific. This is unappealing in many applications, motivat…
Bayesian Clustering via Fusing of Localized Densities
Bayesian clustering typically relies on mixture models, with each component interpreted as a different cluster. After defining a prior for the component parameters and weights, Markov chain Monte Carlo (MCMC) algorithms are commonly used t…
Inferring Covariance Structure from Multiple Data Sources via Subspace Factor Analysis
Factor analysis provides a canonical framework for imposing lower-dimensional structure such as sparse covariance in high-dimensional data. High-dimensional data on the same set of variables are often collected under different conditions, …
Nonparametric IPSS: Fast, flexible feature selection with false discovery control
Feature selection is a critical task in machine learning and statistics. However, existing feature selection methods either (i) rely on parametric methods such as linear or generalized linear models, (ii) lack theoretical false discovery c…
Common to rare transfer learning (CORAL) enables inference and prediction for a quarter million rare Malagasy arthropods
Modern DNA-based biodiversity surveys result in massive-scale data, including up to millions of species – of which most are rare. Making the most of such data for inference and prediction requires modelling approaches that can relate speci…
Bayesian Semiparametric Inference in Longitudinal Metabolomics Data: The EarlyBird Study
The article is motivated by an application to the EarlyBird cohort study aiming to explore how anthropometrics and clinical and metabolic processes are associated with obesity and glucose control during childhood. There is interest in infer…
Annotation aggregation of multi-label ecological datasets via Bayesian modeling
Ecological and conservation studies monitoring bird communities typically rely on species classification based on bird vocalizations. Historically, this has been based on expert volunteers going into the field and making lists of the bird …
Bayesian Deep Generative Models for Multiplex Networks with Multiscale Overlapping Clusters
Our interest is in multiplex network data with multiple network samples observed across the same set of nodes. Examples originate from a variety of fields, including brain connectivity, international trade networks, and social networks, am…
Exact Sampling of Spanning Trees via Fast-forwarded Random Walks
Tree graphs are routinely used in statistics. When estimating a Bayesian model with a tree component, sampling the posterior remains a core difficulty. Existing Markov chain Monte Carlo methods tend to rely on local moves, often leading to…
Bayesian Learning of Clinically Meaningful Sepsis Phenotypes in Northern Tanzania
Sepsis is a life-threatening condition caused by a dysregulated host response to infection. Recently, researchers have hypothesized that sepsis consists of a heterogeneous spectrum of distinct subtypes, motivating several studies to identi…
Spatial predictions on physically constrained domains: Applications to Arctic sea salinity data
In this paper we predict sea surface salinity (SSS) in the Arctic Ocean based on satellite measurements. SSS is a crucial indicator for ongoing changes in the Arctic Ocean and can offer important insights about climate change. We particula…
Blessing of dimension in Bayesian inference on covariance matrices
Bayesian factor analysis is routinely used for dimensionality reduction in modeling of high-dimensional covariance matrices. Factor analytic decompositions express the covariance as a sum of a low rank and diagonal matrix. In practice, Gib…
Centered Partition Processes: Informative Priors for Clustering (with Discussion)
There is a very rich literature proposing Bayesian approaches for clustering starting with a prior probability distribution on partitions. Most approaches assume exchangeability, leading to simple representations in terms of Exchangeable P…
Bayesian Level Set Clustering
Classically, Bayesian clustering interprets each component of a mixture model as a cluster. The inferred clustering posterior is highly sensitive to any inaccuracies in the kernel within each component. As this kernel is made more flexible…