Dianne Cook
Automated Residual Plot Assessment With the R Package autovi and the Shiny Application autovi.web
Visual assessment of residual plots is a common approach for diagnosing linear models, but it relies on manual evaluation, which does not scale well and can lead to inconsistent decisions across analysts. The lineup protocol, which embeds …
Effect of Human Factors on Visual Statistical Inference
Visual statistical inference determines the significance of patterns found in data exploration through graphics. It involves human observers inspecting a lineup of plots, with one real data plot randomly placed among decoys. Each observer'…
Demonstrating the Capabilities of the lionfish Software for Interactive Visualization of Market Segmentation Partitions
Market segmentation partitions multivariate data using some clustering algorithm, resulting in some number of homogeneous clusters of consumers for marketing purposes. Often this type of data has no clear cluster structure, that is, no sepa…
Is this normal? A new projection pursuit index to assess a sample against a multivariate null distribution
Many data problems contain some reference or normal conditions, upon which to compare newly collected data. This scenario occurs in data collected as part of clinical trials to detect adverse events, or for measuring climate change against…
The Noisy Work of Uncertainty Visualisation Research: A Review
Uncertainty visualisation is quickly becoming a hot topic in information visualisation. Existing reviews in the field take the definition and purpose of an uncertainty visualisation to be self-evident, which results in a large amount of co…
Automated Assessment of Residual Plots with Computer Vision Models
Plotting the residuals is a recommended procedure to diagnose deviations from linear model assumptions, such as non-linearity, heteroscedasticity, and non-normality. The presence of structure in residual plots can be tested using the lineu…
Designing the Australian Cancer Atlas: visualizing geostatistical model uncertainty for multiple audiences
Objective: The Australian Cancer Atlas (ACA) aims to provide small-area estimates of cancer incidence and survival in Australia to help identify and address geographical health disparities. We report on the 21-month user-centered design stu…
Squintability and Other Metrics for Assessing Projection Pursuit Indexes, and Guiding Optimization Choices
The projection pursuit (PP) guided tour optimizes a criterion function, known as the PP index, to gradually reveal projections of interest from high-dimensional data through animation. Optimization of some PP indexes can be non-trivial, if…
cardinalR: Collection of Data Structures
A collection of simple simulation datasets designed for generating Nonlinear Dimension Reduction representations techniques such as t-distributed Stochastic Neighbor Embedding, and Uniform Manifold Approximation and Projection. These datase…
quollr: Visualising How Nonlinear Dimension Reduction Warps Your Data
Version 0.1.1
Exploring local explanations of nonlinear models using animated linear projections
The increased predictive power of machine learning models comes at the cost of increased complexity and loss of interpretability, particularly in comparison to parametric statistical models. This trade-off has led to the emergence of eXpla…
A Tidy Framework and Infrastructure to Systematically Assemble Spatio-temporal Indexes from Multivariate Data
Indexes are useful for summarizing multivariate information into single metrics for monitoring, communicating, and decision-making. While most work has focused on defining new indexes for specific purposes, more attention needs to be direc…
cubble: An R Package for Organizing and Wrangling Multivariate Spatio-Temporal Data
Multivariate spatio-temporal data refers to multiple measurements taken across space and time. For many analyses, spatial and time components can be separately studied: for example, to explore the temporal trend of one variable for a singl…
Frame to frame interpolation for high-dimensional data visualisation using the woylier package
The woylier package implements tour interpolation paths between frames using Givens rotations. This provides an alternative to the geodesic interpolation between planes currently available in the tourr package. Tours are used to visualise …
A Clustering Algorithm to Organize Satellite Hotspot Data for the Purpose of Tracking Bushfires Remotely
This paper proposes a spatiotemporal clustering algorithm and its implementation in the R package spotoroo. This work is motivated by the catastrophic bushfires in Australia throughout the summer of 2019-2020 and made possible by the avail…
A Hexagon Tile Map Algorithm for Displaying Spatial Data
Spatial distributions have been presented on alternative representations of geography, such as cartograms, for many years. In modern times, interactivity and animation have allowed alternative displays to play a larger role. Alternative re…
A Plot is Worth a Thousand Tests: Assessing Residual Diagnostics with the Lineup Protocol
Regression experts consistently recommend plotting residuals for model diagnosis, despite the availability of many numerical hypothesis test procedures designed to use residuals to assess problems with a model fit. Here we provide evidence…
Performance is not enough: the story told by a Rashomon quartet
The usual goal of supervised learning is to find the best model, the one that optimizes a particular performance measure. However, what if the explanation provided by this model is completely different from another model and different agai…
Index construction: a pipeline approach for transparency and diagnostics
Indexes are commonly used to combine multivariate information into a single number for monitoring, communicating, and decision-making. They are applied in many areas including the environment (e.g. drought index, Southern Oscillation Index…
New and simplified manual controls for projection and slice tours, with application to exploring classification boundaries in high dimensions
This paper describes new user controls for examining high-dimensional data using low-dimensional linear projections and slices. A user can interactively change the contribution of a given variable to a low-dimensional projection, which is …
Expanding Tidy Data Principles to Facilitate Missing Data Exploration, Visualization and Assessment of Imputations
Despite the large body of research on missing value distributions and imputation, there is comparatively little literature with a focus on how to make it easy to handle, explore, and impute missing values in data. This paper addresses this…