Explanipedia

Atacformer: A transformer-based foundation model for analysis and interpretation of ATAC-seq data Open

Nathan J. LeRoy, Guangtao Zheng, Oleksandr Khoroshevskyi, Donald R. Campbell, Aidong Zhang, et al. · 2025

Introduction Chromatin accessibility profiling is an important tool for understanding gene regulation and cellular function. While public repositories house nearly 10,000 scATAC-seq experiments, unifying this data for meaningful analysis r…

ConceptDrift: leveraging spatial, temporal and semantic evolution of biomedical concepts for hypothesis generation Open

Amir Hassan Shariatmadari, Alireza Jafari, Sikun Guo, S. Srinivasan, Nathan C. Sheffield, et al. · 2025

Motivation Hypothesis generation is a fundamental problem in biomedical text mining that aims to generate ideas that are new, interesting, and plausible by discovering unexplored links between biomedical concepts. Despite significant advan…

KDD 2025 Panel on AI for Science Open

Vipin Kumar, Yan Liu, Aidong Zhang · 2025

AI and Science Day Open

Aidong Zhang, Vipin Kumar, Yan Liu · 2025

IdeaBench: Benchmarking Large Language Models for Research Idea Generation Open

Sikun Guo, Amir Hassan Shariatmadari, Guangzhi Xiong, Albert Huang, Myles Kim, et al. · 2025

Improving Group Robustness on Spurious Correlation via Evidential Alignment Open

Wenqian Ye, Guangtao Zheng, Aidong Zhang · 2025

Deep neural networks often learn and rely on spurious correlations, i.e., superficial associations between non-causal features and the targets. For instance, an image classifier may identify camels based on the desert backgrounds. While it…

NeuronTune: Towards Self-Guided Spurious Bias Mitigation Open

Guangtao Zheng, Wenqian Ye, Aidong Zhang · 2025

Deep neural networks often develop spurious bias, reliance on correlations between non-essential features and classes for predictions. For example, a model may identify objects based on frequently co-occurring backgrounds rather than intri…

ShortcutProbe: Probing Prediction Shortcuts for Learning Robust Models Open

Guangtao Zheng, Wenqian Ye, Aidong Zhang · 2025

Deep learning models often achieve high performance by inadvertently learning spurious correlations between targets and non-essential features. For example, an image classifier may identify an object via its background that spuriously corr…

Client-Centric Federated Adaptive Optimization Open

Jianhui Sun, Xidong Wu, Heng Huang, Aidong Zhang · 2025

Federated Learning (FL) is a distributed learning paradigm where clients collaboratively train a model while keeping their own data private. With an increasing scale of clients and models, FL encounters two key challenges, client drift due…

ASCENT-ViT: Attention-based Scale-aware Concept Learning Framework for Enhanced Alignment in Vision Transformers Open

Sanchit Sinha, Guangzhi Xiong, Aidong Zhang · 2025

As Vision Transformers (ViTs) are increasingly adopted in sensitive vision applications, there is a growing demand for improved interpretability. This has led to efforts to forward-align these models with carefully annotated abstract, huma…

Determining the Importance of Clinical Modalities for NeuroDegenerative Disorders and Risk of Patient Injury Using Machine Learning and Survival Analysis. Open

Kazi Noshin, Mary Regina Boland, Bojian Hou, Weiqing He, Victoria Lu, et al. · 2025

Falls among the elderly and especially those with NeuroDegenerative Disorders (NDD) reduce life expectancy. The purpose of this study is to explore the role of Machine Learning on Electronic Health Records (EHR) data for time-to-event sur…

COCO-Tree: Compositional Hierarchical Concept Trees for Enhanced Reasoning in Vision-Language Models Open

Sanchit Sinha, Guangzhi Xiong, Aidong Zhang · 2025

InfAL: Inference Time Adversarial Learning for Improving Research Ideation Open

Sikun Guo, Amir Hassan Shariatmadari, Peng Wang, Albert Huang, Aidong Zhang · 2025

Embracing Foundation Models for Advancing Scientific Discovery Open

Sikun Guo, Amir Hassan Shariatmadari, Guangzhi Xiong, Aidong Zhang · 2024

Machine learning foundation models, particularly large language models (LLMs) such as GPT-4o, have revolutionized traditional applications in computer vision and natural language processing, marking a significant shift in recent years. Bui…

Uncovering Important Diagnostic Features for Alzheimer’s, Parkinson’s and Other Dementias Using Interpretable Association Mining Methods Open

Kazi Noshin, Mary Regina Boland, Bojian Hou, Victoria Lu, Carol Manning, et al. · 2024

Alzheimer's Disease and Related Dementias (ADRD) afflict almost 7 million people in the USA alone. The majority of research in ADRD is conducted using post-mortem samples of brain tissue or carefully recruited clinical trial patients. Whil…

Ensuring Safety and Trust: Analyzing the Risks of Large Language Models in Medicine Open

Yifan Yang, Qiao Jin, Robert Leaman, Xiaoyu Liu, Guangzhi Xiong, et al. · 2024

The remarkable capabilities of Large Language Models (LLMs) make them increasingly compelling for adoption in real-world healthcare applications. However, the risks associated with using LLMs in medical applications have not been systemati…

Improving Scientific Hypothesis Generation with Knowledge Grounded Large Language Models Open

Guangzhi Xiong, Eric Xie, Amir Hassan Shariatmadari, Sikun Guo, Stefan Bekiranov, et al. · 2024

Large language models (LLMs) have demonstrated remarkable capabilities in various scientific domains, from natural language processing to complex problem-solving tasks. Their ability to understand and generate human-like text has opened up…

IdeaBench: Benchmarking Large Language Models for Research Idea Generation Open

Sikun Guo, Amir Hassan Shariatmadari, Guangzhi Xiong, Albert Huang, Eric Xie, et al. · 2024

Large Language Models (LLMs) have transformed how people interact with artificial intelligence (AI) systems, achieving state-of-the-art results in various tasks, including scientific discovery and hypothesis generation. However, the lack o…

Demystifying Large Language Models for Medicine: A Primer Open

Qiao Jin, Nicholas Wan, Robert Leaman, Shubo Tian, Zhizheng Wang, et al. · 2024

Large language models (LLMs) represent a transformative class of AI tools capable of revolutionizing various aspects of healthcare by generating human-like responses across diverse contexts and adapting to novel tasks following human instr…

Structural Causality-based Generalizable Concept Discovery Models Open

Sanchit Sinha, Guangzhi Xiong, Aidong Zhang · 2024

The rising need for explainable deep neural network architectures has utilized semantic concepts as explainable units. Several approaches utilizing disentangled representation learning estimate the generative factors and utilize them as co…

ProtoNAM: Prototypical Neural Additive Models for Interpretable Deep Tabular Learning Open

Guangzhi Xiong, Sanchit Sinha, Aidong Zhang · 2024

Generalized additive models (GAMs) have long been a powerful white-box tool for the intelligible analysis of tabular data, revealing the influence of each feature on the model predictions. Despite the success of neural networks (NNs) in va…

BEDMS: A metadata standardizer for genomic region attributes Open

Saanika Tambe, Oleksandr Khoroshevskyi, Sang‐Hoon Park, Nathan J. LeRoy, Donald R. Campbell, et al. · 2024

High-throughput sequencing technologies have generated vast omics data annotating genomic regions. A challenge arises in integrating this data because the associated metadata does not follow a uniform schema. This hinders data management, …

Benchmarking Spurious Bias in Few-Shot Image Classifiers Open

Guangtao Zheng, Wenqian Ye, Aidong Zhang · 2024

Few-shot image classifiers are designed to recognize and classify new data with minimal supervision and limited data but often show reliance on spurious correlations between classes and spurious attributes, known as spurious bias. Spurious…

CoLiDR: Concept Learning using Aggregated Disentangled Representations Open

Sanchit Sinha, Guangzhi Xiong, Aidong Zhang · 2024

Interpretability of Deep Neural Networks using concept-based models offers a promising way to explain model behavior through human understandable concepts. A parallel line of research focuses on disentangling the data distribution into its…

Spuriousness-Aware Meta-Learning for Learning Robust Classifiers Open

Guangtao Zheng, Wenqian Ye, Aidong Zhang · 2024

Spurious correlations are brittle associations between certain attributes of inputs and target variables, such as the correlation between an image background and an object class. Deep image classifiers often leverage them for predictions, …

MAML-en-LLM: Model Agnostic Meta-Training of LLMs for Improved In-Context Learning Open

Sanchit Sinha, Yuguang Yue, Victor Soto, Mayank Kulkarni, Jianhua Lu, et al. · 2024

Adapting large language models (LLMs) to unseen tasks with in-context training samples without fine-tuning remains an important research problem. To learn a robust LLM that adapts well to unseen tasks, multiple meta-training approaches have…

WRKY transcription factor 40 from eggplant (Solanum melongena L.) regulates ABA and salt stress responses Open

Aidong Zhang, Jing Shang, Kai Xiao, Min Zhang, Shengjie Wang, et al. · 2024

Methods for constructing and evaluating consensus genomic interval sets Open

Julia Rymuza, Yuchen Sun, Guangtao Zheng, Nathan J. LeRoy, Maria Murach, et al. · 2024

The amount of genomic region data continues to increase. Integrating across diverse genomic region sets requires consensus regions, which enable comparing regions across experiments, but also by necessity lose precision in region definitio…

CoLiDR: Concept Learning using Aggregated Disentangled Representations Open

Sanchit Sinha, Guangzhi Xiong, Aidong Zhang · 2024

Interpretability of Deep Neural Networks using concept-based models offers a promising way to explain model behavior through human-understandable concepts. A parallel line of research focuses on disentangling the data distribution into its…

Fast clustering and cell-type annotation of scATAC data using pre-trained embeddings Open

Nathan J. LeRoy, Jason P. Smith, Guangtao Zheng, Julia Rymuza, Erfaneh Gharavi, et al. · 2024

Data from the single-cell assay for transposase-accessible chromatin using sequencing (scATAC-seq) are now widely available. One major computational challenge is dealing with high dimensionality and inherent sparsity, which is typically ad…

Aidong Zhang