Anru R. Zhang
Reliable Curation of EHR Dataset via Large Language Models under Environmental Constraints
Electronic health records (EHRs) are central to modern healthcare delivery and research; yet, many researchers lack the database expertise necessary to write complex SQL queries or generate effective visualizations, limiting efficient data…
CaDAVEr: a metagenome-assembled genome catalog of microbial decomposers across vertebrate environments
Microbial degradation of organic matter is a fundamental Earth process, yet the microbial metabolisms and successional ecology involved in decomposition remain poorly understood. Here, we announce the recovery o…
Leadership in Linking Statistical Theory and Scientific Inquiry: A COPSS-NISS Leadership Webinar with Drs. Michael Kosorok and Daniela Witten
A functional tensor model for dynamic multilayer networks with common invariant subspaces and the RKHS estimation
Dynamic multilayer networks are frequently used to describe the structure and temporal evolution of multiple relationships among common entities, with applications in fields such as sociology, economics, and neuroscience. However, explorat…
Smooth Flow Matching
Functional data, i.e., smooth random functions observed over a continuous domain, are increasingly available in areas such as biomedical research, health informatics, and epidemiology. However, effective statistical analysis for functional…
Functional Tensor Regression
Vispro improves imaging analysis for Visium spatial transcriptomics
Machine Learning Computer Vision Point of Care Decision Support of Echocardiographic Identification of Hypertrophic Cardiomyopathy
Optimization of abdominal CT based on a model of total risk minimization by putting radiation risk in perspective with imaging benefit
TEMPTED: time-informed dimensionality reduction for longitudinal microbiome studies
A Modified Mediterranean Ketogenic Diet reverses signatures and mitigates modifiable risk factors of Alzheimer’s disease
Background: Alzheimer’s disease (AD) is a neurodegenerative disorder with significant environmental factors, including diet, that influence its onset and progression. While the ketogenic diet (KD) holds promise in reducing metabolic risks a…
Federated PCA and Estimation for Spiked Covariance Matrices: Optimal Rates and Efficient Algorithm
Federated Learning (FL) has gained significant recent attention in machine learning for its enhanced privacy and data security, making it indispensable in fields such as healthcare, finance, and personalized services. This paper investigat…
Tensor Decomposition with Unaligned Observations
This paper presents a canonical polyadic (CP) tensor decomposition that addresses unaligned observations. The mode with unaligned observations is represented using functions in a reproducing kernel Hilbert space (RKHS). We introduce a vers…
Vispro improves imaging analysis for Visium spatial transcriptomics
Spatial transcriptomics (ST) enables the comprehensive analysis of gene expression while preserving the spatial context of tissues. The histological images accompanying ST data provide spatially cohesive information that is often challengi…
Statistical Inference for Low-Rank Tensors: Heteroskedasticity, Subgaussianity, and Applications
In this paper, we consider inference and uncertainty quantification for low Tucker rank tensors with additive noise in the high-dimensional regime. Focusing on the output of the higher-order orthogonal iteration (HOOI) algorithm, a commonl…
Functional Singular Value Decomposition
Heterogeneous functional data commonly arise in time series and longitudinal studies. To uncover the statistical structures of such data, we propose Functional Singular Value Decomposition (FSVD), a unified framework encompassing various t…
Nonconvex Factorization and Manifold Formulations Are Almost Equivalent in Low-Rank Matrix Optimization
In this paper, we consider the geometric landscape connection of the widely studied manifold and factorization formulations in low-rank positive semidefinite (PSD) and general matrix optimization. We establish a sandwich relation on the sp…
Tensor Decomposition Meets RKHS: Efficient Algorithms for Smooth and Misaligned Data
The canonical polyadic (CP) tensor decomposition decomposes a multidimensional data array into a sum of outer products of finite-dimensional vectors. Instead, we can replace some or all of the vectors with continuous functions (infinite-di…
Serum and CSF metabolomics analysis shows Mediterranean Ketogenic Diet mitigates risk factors of Alzheimer’s disease
Alzheimer’s disease (AD) is influenced by a variety of modifiable risk factors, including a person’s dietary habits. While the ketogenic diet (KD) holds promise in reducing metabolic risks and potentially affecting AD progression, only a f…
Functional Post-Clustering Selective Inference with Applications to EHR Data Analysis
In electronic health records (EHR) analysis, clustering patients according to patterns in their data is crucial for uncovering new subtypes of diseases. Existing medical literature often relies on classical hypothesis testing methods to te…
Blessing of dimension in Bayesian inference on covariance matrices
Bayesian factor analysis is routinely used for dimensionality reduction in modeling of high-dimensional covariance matrices. Factor analytic decompositions express the covariance as a sum of a low rank and diagonal matrix. In practice, Gib…
Expansion and transmission dynamics of high risk carbapenem-resistant Klebsiella pneumoniae subclones in China: An epidemiological, spatial, genomic analysis
The high-risk ST11 KL64 CRKP subclone showed strong expansion potential and survival advantages, probably owing to genetic factors.
A conserved interdomain microbial network underpins cadaver decomposition despite environmental variables
Increase in antioxidant capacity associated with the successful subclone of hypervirulent carbapenem-resistant Klebsiella pneumoniae ST11-KL64
Cocaine Use Prediction With Tensor-Based Machine Learning on Multimodal MRI Connectome Data
This letter considers the use of machine learning algorithms for predicting cocaine use based on magnetic resonance imaging (MRI) connectomic data. The study used functional MRI (fMRI) and diffusion MRI (dMRI) data collected from 275 indiv…
A Modified Mediterranean Ketogenic Diet mitigates modifiable risk factors of Alzheimer’s Disease: a serum and CSF-based metabolic analysis
Alzheimer’s disease (AD) is influenced by a variety of modifiable risk factors, including a person’s dietary habits. While the ketogenic diet (KD) holds promise in reducing metabolic risks and potentially affecting AD progression, only a f…
Soft Phenotyping for Sepsis via EHR Time-aware Soft Clustering
Objective: Sepsis is one of the most serious hospital conditions associated with high mortality. Sepsis is the result of a dysregulated immune response to infection that can lead to multiple organ dysfunction and death. Due to the wide var…
Computational and Statistical Thresholds in Multi-layer Stochastic Block Models
We study the problem of community recovery and detection in multi-layer stochastic block models, focusing on the critical network density threshold for consistent community structure inference. Using a prototypical two-block model, we reve…
Reliable Generation of Privacy-preserving Synthetic Electronic Health Record Time Series via Diffusion Models
Electronic Health Records (EHRs) are rich sources of patient-level data, offering valuable resources for medical data analysis. However, privacy concerns often restrict access to EHRs, hindering downstream analysis. Current EHR de-identifi…
Cocaine Use Prediction with Tensor-based Machine Learning on Multimodal MRI Connectome Data
This paper considers the use of machine learning algorithms for predicting cocaine use based on magnetic resonance imaging (MRI) connectomic data. The study utilized functional MRI (fMRI) and diffusion MRI (dMRI) data collected from 275 in…