Giles Hooker
The importance of accounting for spatial heterogeneity in studies of plant competition and coexistence Open
Spatial environmental heterogeneity maintains species diversity in ecological communities, but it is difficult to model and its effects on individuals are often ignored in empirical studies of species coexistence. This is problematic becau…
Statistical Inference for Gradient Boosting Regression Open
Gradient boosting is widely popular due to its flexibility and predictive accuracy. However, statistical inference and uncertainty quantification for gradient boosting remain challenging and under-explored. We propose a unified framework f…
Gaussian Rank Verification Open
Statistical experiments often seek to identify random variables with the largest population means. This inferential task, known as rank verification, has been well‐studied on Gaussian data with equal variances. This work provides the first…
Pitfalls in machine learning interpretability: Manipulating partial dependence plots to hide discrimination Open
The adoption of artificial intelligence (AI) across industries has led to the widespread use of complex black-box models and interpretation tools for decision making. This paper proposes an adversarial framework to uncover the vulnerabilit…
Unifying Image Counterfactuals and Feature Attributions with Latent-Space Adversarial Attacks Open
Counterfactuals are a popular framework for interpreting machine learning predictions. These what-if explanations are notoriously challenging to create for computer vision models: standard gradient-based methods are prone to produce advers…
Theory of Random Forests: A Review Open
Random forests (RF) have a long history, originally defined by Leo Breiman in 2001, but with antecedents in bagging methods introduced in 1996. They have become one of the most widely adopted machine learning tools thanks to their computat…
CLaRe: Compact near-lossless Latent Representations of High-Dimensional Object Data Open
Latent feature representation methods play an important role in the dimension reduction and statistical modeling of high-dimensional complex data objects. However, existing approaches to assess the quality of these methods often rely on ag…
Gaussian Rank Verification Open
Statistical experiments often seek to identify random variables with the largest population means. This inferential task, known as rank verification, has been well-studied on Gaussian data with equal variances. This work provides the first…
Targeted Maximum Likelihood Estimation for Integral Projection Models in Population Ecology Open
Integral projection models (IPMs) are widely used to study population growth and the dynamics of demographic structure (e.g. age and size distributions) within a population. These models use data on individuals' growth, survival, and reprod…
Targeted Learning for Variable Importance Open
Variable importance is one of the most widely used measures for interpreting machine learning with significant interest from both statistics and machine learning communities. Recently, increasing attention has been directed toward uncertai…
A novel device for conservation of supplemental oxygen: A randomized, crossover study of the efficiency and non-inferiority of the OXFO System Open
Background Oxygen is critical to the prevention of hypoxemia-induced morbidity and mortality; yet, access remains inadequate in most low- and middle-income countries despite billions of dollars spent over decades to increase supply. We pre…
It's about (taking up) space: Discreteness of individuals and the strength of spatial coexistence mechanisms Open
One strand of modern coexistence theory (MCT) partitions invader growth rates (IGR) to quantify how different mechanisms contribute to species coexistence, highlighting fluctuation‐dependent mechanisms. A general conclusion from the classi…
Variable Importance Measures for Multivariate Random Forests Open
Multivariate random forests (or MVRFs) are an extension of tree-based ensembles to examine multivariate responses. MVRF can be particularly helpful where some of the responses exhibit sparse (e.g., zero-inflated) distributions, making borr…
A generic approach for reproducible model distillation Open
Model distillation has been a popular method for producing interpretable machine learning. It uses an interpretable “student” model to mimic the predictions made by the black box “teacher” model. However, when the student model is sensitiv…
Accelerated Inference for Partially Observed Markov Processes using Automatic Differentiation Open
Automatic differentiation (AD) has driven recent advances in machine learning, including deep neural networks and Hamiltonian Markov Chain Monte Carlo methods. Partially observed nonlinear stochastic dynamical systems have proved resistant…
An Understanding of Principal Differential Analysis Open
In functional data analysis, replicate observations of a smooth functional process and its derivatives offer a unique opportunity to flexibly estimate continuous-time ordinary differential equation models. Ramsay (1996) first proposed to e…
Differentiable Programming for Differential Equations: A Review Open
The differentiable programming paradigm is a cornerstone of modern scientific computing. It refers to numerical methods for computing the gradient of a numerical model's output. Many scientific models are based on differential equations, w…
Why You Should Not Trust Interpretations in Machine Learning: Adversarial Attacks on Partial Dependence Plots Open
The adoption of artificial intelligence (AI) across industries has led to the widespread use of complex black-box models and interpretation tools for decision making. This paper proposes an adversarial framework to uncover the vulnerabilit…
The natural history of luck: A synthesis study of structured population models Open
Chance pervades life. In turn, life histories are described by probabilities (e.g., survival, breeding) and averages across individuals (e.g., mean growth rate, age at maturity). In this study, we explored patterns of luck in lifetime outc…
Statistical Significance of Feature Importance Rankings Open
Feature importance scores are ubiquitous tools for understanding the predictions of machine learning models. However, many popular attribution methods suffer from high instability due to random sampling. Leveraging novel ideas from hypothe…
Stabilizing Estimates of Shapley Values with Control Variates Open
Shapley values are among the most popular tools for explaining predictions of blackbox machine learning models. However, their high computational cost motivates the use of sampling approximations, inducing a considerable degree of uncertai…
Supplemental code and data for Hernandez et al. "The natural history of luck: A synthesis study of structured population models" Open
This dataset enables the user to repeat the analyses presented in the manuscript "The natural history of luck: A synthesis study of structured population models." It comprises two compressed archives: one which contains code, and one…
Variable Importance Measures for Variable Selection and Statistical Inference in Multivariate Random Forests Open
Multivariate random forests (or MVRFs) are an extension of tree-based ensembles to examine multivariate responses. MVRF can be particularly helpful where some of the responses exhibit sparse (i.e., zero-inflated) distributions, making borr…
An exact version of Life Table Response Experiment analysis, and the R package exactLTRE Open
Matrix population models are frequently built and used by ecologists to analyse demography and elucidate the processes driving population growth or decline. Life Table Response Experiments (LTREs) are comparative analyses that decompose th…
A Generic Approach for Reproducible Model Distillation Open
Model distillation has been a popular method for producing interpretable machine learning. It uses an interpretable "student" model to mimic the predictions made by the black box "teacher" model. However, when the student model is sensitiv…
Decision tree boosted varying coefficient models Open
Varying coefficient models are a flexible extension of generic parametric models whose coefficients are functions of a set of effect-modifying covariates instead of fitted constants. They are capable of achieving higher model complexity wh…