J. S. Marron
Advanced Distribution Theory for Significance in Scale Space
Smoothing methods find signals in noisy data. A challenge for statistical inference is the choice of smoothing parameter. SiZer addressed this challenge in one dimension by detecting significant slopes across multiple scales, but was not a…
Multifaceted Neuroimaging Data Integration via Analysis of Subspaces
Neuroimaging studies, such as the Human Connectome Project (HCP), often collect multifaceted data to study the human brain. However, these data are often analyzed in a pairwise fashion, which can hinder our understanding of how different b…
Variable screening based on Gaussian Centered L-moments
Pairwise Nonlinear Dependence Analysis of Genomic Data
In The Cancer Genome Atlas (TCGA) data set, there are many interesting nonlinear dependencies between pairs of genes that reveal important relationships and subtypes of cancer. Such genomic data analysis requires a rapid, powerful and inte…
Investigating the relationship between radiographic joint space width loss and deep learning-derived magnetic resonance imaging-based cartilage thickness loss in the medial weight-bearing region of the tibiofemoral joint
Multi-faceted Neuroimaging Data Integration via Analysis of Subspaces
Neuroimaging studies, such as the Human Connectome Project (HCP), often collect multi-faceted and multi-block data to study the complex human brain. However, these data are often analyzed in a pairwise fashion, which can hinder our underst…
Angle-based joint and individual variation explained
Fast Algorithms for Large-Scale Generalized Distance Weighted Discrimination
High-dimension-low-sample size statistical analysis is important in a wide range of applications. In such situations, the highly appealing discrimination method, support vector machine, can be improved to alleviate data piling at the margi…
Interior Object Geometry via Fitted Frames
We propose a means of computing fitted frames on the boundary and in the interior of objects and using them to provide the basis for producing geometric features from them that are not only alignment-free but most importantly can be made t…
The Poisson distribution model fits UMI-based single-cell RNA-sequencing data
Supplementary Figure S1 from A 10-Gene Classifier for Distinguishing Head and Neck Squamous Cell Carcinoma and Lung Squamous Cell Carcinoma
Translating transcriptomic findings from cancer model systems to humans through joint dimension reduction
Patterns of variation among baseline femoral and tibial cartilage thickness and clinical features: Data from the osteoarthritis initiative
This exploratory analysis, combining the rich OAI dataset with novel methods for determining and visualizing cartilage thickness, reinforces known associations in knee OA while providing insights into the potential for data integration in …
Evidence for Multiple Subpopulations of Herpesvirus-Latently Infected Cells
Latency is the defining characteristic of the Herpesviridae and central to the tumorigenesis phenotype of Kaposi’s sarcoma-associated herpesvirus (KSHV). KSHV-driven primary effusion lymphomas (PEL) rapidly develop resistance to therapy, s…
Joint and individual analysis of breast cancer histologic images and genomic covariates
The two main approaches in the study of breast cancer are histopathology (analyzing visual characteristics of tumors) and genomics. While both histopathology and genomics are fundamental to cancer research, the connections between these fi…
Asymptotic optimality of the least-squares cross-validation bandwidth for kernel estimates of intensity functions
In this paper, kernel function methods are considered for estimating the intensity function of a non-homogeneous Poisson process. A least-squares cross-validation bandwidth for the kernel intensity estimator is introduced, and it is proven…
Statistical Inference for Data Integration
In the age of big data, data integration is a critical step, especially in understanding how diverse data types work together and separately. Among the data integration methods, the Angle-Based Joint and Individual Variation Exp…
Visual High Dimensional Hypothesis Testing
In exploratory data analysis of known classes of high dimensional data, a central question is how distinct are the classes? The Direction Projection Permutation (DiProPerm) hypothesis test provides an answer to this that is directly connec…
Geometric insights into support vector machine behavior using the KKT conditions
The support vector machine (SVM) is a powerful and widely used classification algorithm. This paper uses the Karush-Kuhn-Tucker conditions to provide rigorous mathematical proof for new insights into the behavior of SVM. These insights pro…
diproperm: An R Package for the DiProPerm Test
High-dimensional low sample size (HDLSS) data sets frequently emerge in many biomedical applications. The direction-projection-permutation (DiProPerm) test is a two-sample hypothesis test for comparing two high-dimensional distributions. The…
Statistical Significance of Clustering Using Soft Thresholding
Clustering methods have led to a number of important discoveries in bioinformatics and beyond. A major challenge in their use is determining which clusters represent important underlying structure, as opposed to spurious sampling artifacts…
diproperm: An R Package for the DiProPerm Test
High-dimensional low sample size (HDLSS) data sets emerge frequently in many biomedical applications. A common task for analyzing HDLSS data is to assign data to the correct class using a classifier. Classifiers which use two labels and a …
Theory of high-dimensional outliers
This study concerns the issue of high dimensional outliers which are challenging to distinguish from inliers due to the special structure of high dimensional space. We introduce a new notion of high dimensional outliers that embraces vario…
Variable screening based on Gaussian Centered L-moments
An important challenge in big data is identification of important variables. In this paper, we propose methods of discovering variables with non-standard univariate marginal distributions. The conventional moments-based summary statistics …
Methods for quantitative characterization of bone injury from computed-tomography images, supplemental information
Supplemental information for SPIE Medical Imaging 2019 submission.
Eigenvalue Significance Testing for Genetic Association
Genotype eigenvectors are widely used as covariates for control of spurious stratification in genetic association. Significance testing for the accompanying eigenvalues has typically been based on a standard Tracy–Widom limiting di…
Persistent homology analysis of brain artery trees
New representations of tree-structured data objects, using ideas from topological data analysis, enable improved statistical analyses of a population of brain artery trees. A number of representations of each data tree arise from persisten…
A General Framework for Constrained Smoothing
There is a wide array of smoothing methods available for finding structure in data. A general framework is developed which shows that many of these can be viewed as a projection of the data, with respect to appropriate norms. The underlyi…
High dimension low sample size asymptotics of robust PCA
Conventional principal component analysis is highly susceptible to outliers. In particular, a sufficiently outlying single data point can draw the leading principal component toward itself. In this paper, we study the effects of outliers …