Kaize Ding
Uncertainty Quantification for Multiple-Choice Questions is Just One-Token Deep
Revisiting Multivariate Time Series Forecasting with Missing Values
Missing values are common in real-world time series, and multivariate time series forecasting with missing values (MTSF-M) has become a crucial area of research for ensuring reliable predictions. To address the challenge of missing data, c…
AMANDA: Agentic Medical Knowledge Augmentation for Data-Efficient Medical Visual Question Answering
Medical Multimodal Large Language Models (Med-MLLMs) have shown great promise in medical visual question answering (Med-VQA). However, when deployed in low-resource settings where abundant labeled data are unavailable, existing Med-MLLMs c…
Fact or Facsimile? Evaluating the Factual Robustness of Modern Retrievers
Dense retrievers and rerankers are central to retrieval-augmented generation (RAG) pipelines, where accurately retrieving factual information is crucial for maintaining system trustworthiness and defending against RAG poisoning. However, l…
RelKD 2025: The Third International Workshop on Resource-Efficient Learning for Knowledge Discovery
Data-Efficient Graph Learning
A Survey on Model Extraction Attacks and Defenses for Large Language Models
Model extraction attacks pose significant security threats to deployed language models, potentially compromising intellectual property and user privacy. This survey provides a comprehensive taxonomy of LLM-specific extraction attacks and d…
Cross-Domain Conditional Diffusion Models for Time Series Imputation
Cross-domain time series imputation is an underexplored data-centric research task that presents significant challenges, particularly when the target domain suffers from high missing rates and domain shifts in temporal dynamics. Existing t…
Resource-Efficient Learning for the Web
RelWeb 2025: The International Workshop on Resource-Efficient Learning for the Web
Histone methyltransferase SMYD2 regulates the activation of hepatic stellate cells by activating TLR4 signaling
Liver fibrosis represents a pathological outcome in the progression of chronic liver diseases, primarily driven by the activation of hepatic stellate cells (HSCs) induced by various chronic liver injury factors. Substantial evidence indica…
Survey of Uncertainty Estimation in Large Language Models -Sources, Methods, Applications, and Challenge
Large Language Models (LLMs) have demonstrated exceptional performance across a wide range of domains, including everyday life, finance, law, and healthcare. However, inaccurate LLM generation has led to significant penalties in sensitive…
A Survey of Model Extraction Attacks and Defenses in Distributed Computing Environments
Model Extraction Attacks (MEAs) threaten modern machine learning systems by enabling adversaries to steal models, exposing intellectual property and training data. With the increasing deployment of machine learning models in distributed co…
ALERT: An LLM-powered Benchmark for Automatic Evaluation of Recommendation Explanations
Explaining Length Bias in LLM-Based Preference Evaluations
RobustFT: Robust Supervised Fine-tuning for Large Language Models under Noisy Response
Supervised fine-tuning (SFT) plays a crucial role in adapting large language models (LLMs) to specific domains or tasks. However, as demonstrated by empirical experiments, the collected data inevitably contains noise in practical applicati…
AD-LLM: Benchmarking Large Language Models for Anomaly Detection
Anomaly detection (AD) is an important machine learning task with many real-world uses, including fraud detection, medical diagnosis, and industrial monitoring. Within natural language processing (NLP), AD helps detect issues like spam, mi…
Political-LLM: Large Language Models in Political Science
In recent years, large language models (LLMs) have been widely adopted in political science tasks such as election prediction, sentiment analysis, policy impact assessment, and misinformation detection. Meanwhile, the need to systematicall…
Fusion Matters: Learning Fusion in Deep Click-through Rate Prediction Models
The evolution of previous Click-Through Rate (CTR) models has mainly been driven by proposing complex components, whether shallow or deep, that are adept at modeling feature interactions. However, there has been less focus on improving fus…
A Survey of Deep Graph Learning under Distribution Shifts: from Graph Out-of-Distribution Generalization to Adaptation
Distribution shifts on graphs -- the discrepancies in data distribution between training and employing a graph machine learning model -- are ubiquitous and often unavoidable in real-world scenarios. These shifts may severely deteriorate mo…
LEGO-Learn: Label-Efficient Graph Open-Set Learning
How can we train graph-based models to recognize unseen classes while keeping labeling costs low? Graph open-set learning (GOL) and out-of-distribution (OOD) detection aim to address this challenge by training models that can accurately cl…
Data‐efficient graph learning: Problems, progress, and prospects
Graph‐structured data, ranging from social networks to financial transaction networks, from citation networks to gene regulatory networks, have been widely used for modeling a myriad of real‐world systems. As a prevailing model architectur…
Let's Ask GNN: Empowering Large Language Model for Graph In-Context Learning
Textual Attributed Graphs (TAGs) are crucial for modeling complex real-world systems, yet leveraging large language models (LLMs) for TAGs presents unique challenges due to the gap between sequential text processing and graph-structured da…
Large Language Models for Anomaly and Out-of-Distribution Detection: A Survey
Detecting anomalies or out-of-distribution (OOD) samples is critical for maintaining the reliability and trustworthiness of machine learning systems. Recently, Large Language Models (LLMs) have demonstrated their effectiveness not only in …
Mastering Long-Tail Complexity on Graphs: Characterization, Learning, and Generalization
In the context of long-tail classification on graphs, the vast majority of existing work primarily revolves around the development of model debiasing strategies, intending to mitigate class imbalances and enhance the overall performance. D…