Rick Stevens
BV-BRC: a unified bacterial and viral bioinformatics resource with expanded functionality and AI integration
The Bacterial and Viral Bioinformatics Resource Center (BV-BRC; https://www.bv-brc.org) is a comprehensive resource supporting research on bacterial and viral pathogens. It currently hosts over 14 million publicly available genomes and 33 …
A Workflow for Error Analysis for Drug Response Prediction via Statistical Standardization and Distribution Analysis
BioR5: A Three-Layer Architecture for Biological Reasoning in Scientific AI
CACHE Challenge #3: Targeting the Nsp3 Macrodomain of SARS-CoV-2
The third Critical Assessment of Computational Hit-finding Experiments (CACHE) challenged computational teams to identify chemically novel ligands targeting the macrodomain 1 of SARS-CoV-2 Nsp3, a promising coronavirus drug target. Twenty-…
Automated MCQA Benchmarking at Scale: Evaluating Reasoning Traces as Retrieval Sources for Domain Adaptation of Small Language Models
As scientific knowledge grows at an unprecedented pace, evaluation benchmarks must evolve to reflect new discoveries and ensure language models are tested on current, diverse literature. We propose a scalable, modular framework for generat…
HiPerRAG: High-Performance Retrieval Augmented Generation for Scientific Insights
The volume of scientific literature is growing exponentially, leading to underutilized discoveries, duplicated efforts, and limited cross-disciplinary collaboration. Retrieval Augmented Generation (RAG) offers a way to assist scientists by…
Active Advantage-Aligned Online Reinforcement Learning with Offline Data
Online reinforcement learning (RL) enhances policies through direct interactions with the environment, but faces challenges related to sample efficiency. In contrast, offline RL leverages extensive pre-collected data to learn policies, but…
ScaffoldGPT: A Scaffold-based GPT Model for Drug Optimization
Drug optimization has become increasingly crucial in light of fast-mutating virus strains and drug-resistant cancer cells. Nevertheless, it remains challenging as it necessitates retaining the beneficial properties of the original drug whi…
Scaling Large Vision-Language Models for Enhanced Multimodal Comprehension In Biomedical Image Analysis
Large language models (LLMs) have demonstrated immense capabilities in understanding textual data and are increasingly being adopted to help researchers accelerate scientific discovery through knowledge extraction (information retrieval), …
Binding Affinity Prediction: From Conventional to Machine Learning-Based Approaches
Protein-ligand binding is the process by which a small molecule (drug or inhibitor) attaches to a target protein. Binding affinity, which characterizes the strength of biomolecular interactions, is essential for tackling diverse challenges…
Assessing Reusability of Deep Learning-Based Monotherapy Drug Response Prediction Models Trained with Omics Data
Cancer drug response prediction (DRP) models present a promising approach towards precision oncology, tailoring treatments to individual patient profiles. While deep learning (DL) methods have shown great potential in this area, models tha…
Impact of Molecular Representations on Deep Learning Model Comparisons in Drug Response Predictions
Deep learning (DL) plays a crucial role in tackling the complexity and heterogeneity of cancer, particularly in predicting drug response. However, the effectiveness of these models is often hindered by inconsistent benchmarks and disparate…
LUCID Thrust 1 - Dataset Identification and Biodata Catalog Creation
The LUCID DOE consortium, part of the Department of Energy’s Biological and Environmental Research (BER) program, advances Low Dose Radiation (LDR) research through multidisciplinary efforts across seven key thrusts. This document focuses …
CACHE Challenge #1: targeting the WDR domain of LRRK2, a Parkinson’s Disease associated protein
The CACHE challenges are a series of prospective benchmarking exercises meant to evaluate progress in the field of computational hit-finding. Here we report the results of the inaugural CACHE #1 challenge in which 23 computational teams ea…
Entropy-Reinforced Planning with Large Language Models for Drug Discovery
The objective of drug discovery is to identify chemical compounds that possess specific pharmaceutical properties toward a binding target. Existing large language models (LLMs) can achieve high token matching scores in terms of likelihood …
Causal Discovery over High-Dimensional Structured Hypothesis Spaces with Causal Graph Partitioning
The aim in many sciences is to understand the mechanisms that underlie the observed distribution of variables, starting from a set of initial hypotheses. Causal discovery allows us to infer mechanisms as sets of cause and effect relationsh…
Assessment of BER Research in Low-Dose Radiation: Report from the BERAC Advisory Committee
On April 6, 2023, Dr. Asmeret Asefaw Berhe, director of the U.S. Department of Energy (DOE) Office of Science, charged the Biological and Environmental Research (BER) program’s advisory committee with “formulating potential research opport…
Advanced Research Directions on AI for Energy
This AI for Energy report further details grand challenges that provide significant opportunities for energy applications across nuclear energy, the power grid, carbon management, energy storage, and energy materials over the next decade. …
Data Imbalance in Drug Response Prediction - Multi-Objective Optimization Approach in Deep Learning Setting
Drug response prediction (DRP) methods tackle the complex task of associating the effectiveness of small molecules with the specific genetic makeup of the patient. Anti-cancer DRP is a particularly challenging task requiring costly experi…
Trillion Parameter AI Serving Infrastructure for Scientific Discovery: A Survey and Vision
Deep learning methods are transforming research, enabling new techniques, and ultimately leading to new discoveries. As the demand for more capable AI models continues to grow, we are now entering an era of Trillion Parameter Models (TPM),…
A Comprehensive Investigation of Active Learning Strategies for Conducting Anti-Cancer Drug Screening
It is well-known that cancers of the same histology type can respond differently to a treatment. Thus, computational drug response prediction is of paramount importance for both preclinical drug screening studies and clinical treatment des…
A yeast love triangle: multiple hybridizations shape genome evolution in the Pichia cactophila species complex
Integration of Computational Docking into Anti-Cancer Drug Response Prediction Models
Cancer is a heterogeneous disease in that tumors of the same histology type can respond differently to a treatment. Anti-cancer drug response prediction is of paramount importance for both drug development and patient treatment design. Alt…
WordScape: a Pipeline to extract multilingual, visually rich Documents with Layout Annotations from Web Crawl Data
We introduce WordScape, a novel pipeline for the creation of cross-disciplinary, multilingual corpora comprising millions of pages with annotations for document layout detection. Relating visual and textual items on document pages has gain…
GenSLMs: Genome-scale language models reveal SARS-CoV-2 evolutionary dynamics
We seek to transform how new and emergent variants of pandemic-causing viruses, specifically SARS-CoV-2, are identified and classified. By adapting large language models (LLMs) for genomic data, we build genome-scale language models (GenSL…
Influencing factors on false positive rates when classifying tumor cell line response to drug treatment
Informed selection of drug candidates for laboratory experimentation provides an efficient means of identifying suitable anti-cancer treatments. The advancement of artificial intelligence has led to the development of computational models …
DeepSpeed4Science Initiative: Enabling Large-Scale Scientific Discovery through Sophisticated AI System Technologies
In the upcoming decade, deep learning may revolutionize the natural sciences, enhancing our capacity to model and predict natural occurrences. This could herald a new era of scientific exploration, bringing significant advancements across …
Blending Imitation and Reinforcement Learning for Robust Policy Improvement
While reinforcement learning (RL) has shown promising performance, its sample complexity continues to be a substantial hurdle, restricting its broader application across a variety of domains. Imitation learning (IL) utilizes oracles to imp…