Keyan Ding
CoT-Evo: Evolutionary Distillation of Chain-of-Thought for Scientific Reasoning
While chain-of-thought (CoT) distillation from advanced large language models (LLMs) has proven effective in general reasoning tasks, it struggles in scientific domains where even advanced models often produce incorrect or superficial reas…
Enhancing Safe and Controllable Protein Generation via Knowledge Preference Optimization
Protein language models have emerged as powerful tools for sequence generation, offering substantial advantages in functional optimization and de novo design. However, these models also present significant risks of generating harmful protei…
KEPLA: A Knowledge-Enhanced Deep Learning Framework for Accurate Protein-Ligand Binding Affinity Prediction
Accurate prediction of protein-ligand binding affinity is critical for drug discovery. While recent deep learning approaches have demonstrated promising results, they often rely solely on structural features of proteins and ligands, overlo…
OneEval: Benchmarking LLM Knowledge-intensive Reasoning over Diverse Knowledge Bases
Large Language Models (LLMs) have demonstrated substantial progress on reasoning tasks involving unstructured text, yet their capabilities significantly deteriorate when reasoning requires integrating structured external knowledge such as …
SciCUEval: A Comprehensive Dataset for Evaluating Scientific Context Understanding in Large Language Models
Large Language Models (LLMs) have shown impressive capabilities in contextual understanding and reasoning. However, evaluating their performance across diverse scientific domains remains underexplored, as existing benchmarks primarily focu…
SAFER: Advancing Safety Alignment via Efficient Ex-Ante Reasoning
Recent advancements in large language models (LLMs) have accelerated progress toward artificial general intelligence, yet their potential to generate harmful content poses critical safety challenges. Existing alignment methods often strugg…
Integrating protein language models and automatic biofoundry for enhanced protein evolution
SciToolAgent: A Knowledge Graph-Driven Scientific Agent for Multi-Tool Integration
Scientific research increasingly relies on specialized computational tools, yet effectively utilizing these tools demands substantial domain expertise. While Large Language Models (LLMs) show promise in tool automation, they struggle to se…
Boosting LLM’s Molecular Structure Elucidation with Knowledge Enhanced Tree Search Reasoning
Multi-purpose controllable protein generation via prompted language models
Deep learning is increasingly powerful for designing proteins that meet structural and functional requirements. However, most existing methods follow a conventional pipeline: first defining a backbone structure and then generating sequence…
Advancing biomolecular understanding and design following human instructions
Understanding and designing biomolecules, such as proteins and small molecules, is central to advancing drug discovery, synthetic biology and enzyme engineering. Recent breakthroughs in artificial intelligence have revolutionized biomolecu…
SciSafeEval: A Comprehensive Benchmark for Safety Alignment of Large Language Models in Scientific Tasks
Large language models (LLMs) have a transformative impact on a variety of scientific tasks across disciplines including biology, chemistry, medicine, and physics. However, ensuring the safety alignment of these models in scientific researc…
Retrosynthesis prediction with an iterative string editing model
SciKnowEval: Evaluating Multi-level Scientific Knowledge of Large Language Models
Large language models (LLMs) are playing an increasingly important role in scientific research, yet there remains a lack of comprehensive benchmarks to evaluate the breadth and depth of scientific knowledge embedded in these models. To add…
Opinion-Unaware Blind Image Quality Assessment using Multi-Scale Deep Feature Statistics
Deep learning-based methods have significantly influenced the blind image quality assessment (BIQA) field; however, these methods often require training using large amounts of human rating data. In contrast, traditional knowledge-based met…
Sample-Efficient Human Evaluation of Large Language Models via Maximum Discrepancy Competition
Reliable evaluation of large language models (LLMs) is impeded by two key challenges: objective metrics often fail to reflect human perception of natural language, and exhaustive human labeling is prohibitively expensive. Here, we propose …
Deep Shape-Texture Statistics for Completely Blind Image Quality Evaluation
Opinion-Unaware Blind Image Quality Assessment (OU-BIQA) models aim to predict image quality without training on reference images and subjective quality scores. Among these, image statistical comparison is a classic paradigm, while the perfo…
Learning Invariant Molecular Representation in Latent Discrete Space
Molecular representation learning lays the foundation for drug discovery. However, existing methods suffer from poor out-of-distribution (OOD) generalization, particularly when data for training and testing originate from different environ…
InstructProtein: Aligning Human and Protein Language via Knowledge Instruction
Large Language Models (LLMs) have revolutionized the field of natural language processing, but they fall short in comprehending biological sequences such as proteins. To address this challenge, we propose InstructProtein, an innovative LLM…
Active Finetuning Protein Language Model: A Budget-Friendly Method for Directed Evolution
Directed evolution is a widely-used strategy of protein engineering to improve protein function via mimicking natural mutation and selection. Machine learning-assisted directed evolution (MLDE) approaches aim to learn a fitness predictor, …
Graph Sampling-based Meta-Learning for Molecular Property Prediction
Molecular property is usually observed with a limited number of samples, and researchers have considered property prediction as a few-shot problem. One important fact that has been ignored by prior works is that each molecule can be record…
Locally Adaptive Structure and Texture Similarity for Image Quality Assessment
The latest advances in full-reference image quality assessment (IQA) involve unifying structure and texture similarity based on deep representations. The resulting Deep Image Structure and Texture Similarity (DISTS) metric, however, mak…
A Comparative Study of Image Quality Assessment Models through Perceptual Optimization
The performance of objective image quality assessment (IQA) models has been evaluated primarily by comparing model predictions to human quality judgments. Perceptual datasets gathered for this purpose have provided useful benchmarks for im…
Image Quality Assessment: Unifying Structure and Texture Similarity
Objective measures of image quality generally operate by comparing pixels of a "degraded" image to those of the original. Relative to human observers, these measures are overly sensitive to resampling of texture regions (e.g., replacing on…
A Simple Method to improve Initialization Robustness for Active Contours driven by Local Region Fitting Energy
Active contour models based on local region fitting energy can segment images with intensity inhomogeneity effectively, but their segmentation results are prone to error if the initial contour is inappropriate. In this paper, we present a s…