Hui Xiong
Enhancing Conversational Recommender Systems with Tree-Structured Knowledge and Pretrained Language Models Open
Recent advances in pretrained language models (PLMs) have significantly improved conversational recommender systems (CRS), enabling more fluent and context-aware interactions. To further enhance accuracy and mitigate hallucination, many me…
From Events to Clarity: The Event-Guided Diffusion Framework for Dehazing Open
Clear imaging under hazy conditions is a critical task. Prior-based and neural methods have improved results. However, they operate on RGB frames, which suffer from limited dynamic range. Therefore, dehazing remains ill-posed and can erase…
The 2025 China report of the Lancet Countdown on health and climate change: empowering cities for synergistic action Open
Multifractal Comparison of Billboard and AI-Generated Music Open
See&Trek: Training-Free Spatial Prompting for Multimodal Large Language Model Open
We introduce SEE&TREK, the first training-free prompting framework tailored to enhance the spatial understanding of Multimodal Large Language Models (MLLMs) under vision-only constraints. While prior efforts have incorporated modalities li…
Launching a new low-carbon, healthy journey Open
VSI: Visual Subtitle Integration for Keyframe Selection to enhance Long Video Understanding Open
Multimodal large language models (MLLMs) demonstrate exceptional performance in vision-language tasks, yet their processing of long videos is constrained by input context length and high computational costs. Sparse frame sampling thus beco…
SciHorizon: Benchmarking AI-for-Science Readiness from Scientific Data to Large Language Models Open
Unleashing The Power of Pre-Trained Language Models for Irregularly Sampled Time Series Open
Structure-Enhanced Protein Instruction Tuning: Towards General-Purpose Protein Understanding with LLMs Open
Filter-And-Refine: A MLLM Based Cascade System for Industrial-Scale Video Content Moderation Open
Effective content moderation is essential for video platforms to safeguard user experience and uphold community standards. While traditional video classification models effectively handle well-defined moderation tasks, they struggle with c…
Transparent prediction of financial analyst recommendation quality using generalized additive model Open
Unveiling the Learning Mind of Language Models: A Cognitive Framework and Empirical Study Open
Large language models (LLMs) have shown impressive capabilities across tasks such as mathematics, coding, and reasoning, yet their learning ability, which is crucial for adapting to dynamic environments and acquiring new knowledge, remains…
Deep Generative Architectures for Automated Music Composition: Optimizing Neural Structures and Multimodal Inputs for Style-Conscious Melody and Harmony Generation Open
This study explores the application of deep generative models in the field of intelligent composition, focusing on the impact of network architecture optimization and multimodal input integration on music style fidelity and emotional expre…
ScIRGen: Synthesize Realistic and Large-Scale RAG Dataset for Scientific Research Open
Scientific researchers need intensive information about datasets to effectively evaluate and develop theories and methodologies. The information needs regarding datasets are implicitly embedded in particular research tasks, rather than exp…
On the Transferability and Discriminability of Representation Learning in Unsupervised Domain Adaptation Open
In this paper, we addressed the limitation of relying solely on distribution alignment and source-domain empirical risk minimization in Unsupervised Domain Adaptation (UDA). Our information-theoretic analysis showed that this standard adve…
LLMs as Better Recommenders with Natural Language Collaborative Signals: A Self-Assessing Retrieval Approach Open
Incorporating collaborative information (CI) effectively is crucial for leveraging LLMs in recommendation tasks. Existing approaches often encode CI using soft tokens or abstract identifiers, which introduces a semantic misalignment with t…
GCAL: Adapting Graph Models to Evolving Domain Shifts Open
This paper addresses the challenge of graph domain adaptation on evolving, multiple out-of-distribution (OOD) graphs. Conventional graph domain adaptation methods are confined to single-step adaptation, making them ineffective in handling …
STP: single-cell partition for subcellular spatially-resolved transcriptomics Open
npj Artificial Intelligence—Editorial journal inauguration Open
LLM-powered Multi-agent Framework for Goal-oriented Learning in Intelligent Tutoring System Open
Investigating the early intervention effect of Taohong Siwu Decoction (桃红四物汤) on rats’ H-Type Vessel-MSC coupling in fracture healing via the HIF-1α/VEGF/FAK pathway Open
Hierarchical Time-Aware Mixture of Experts for Multi-Modal Sequential Recommendation Open
Unleashing the Power of Large Language Model for Denoising Recommendation Open
Recommender systems are crucial for personalizing user experiences but often depend on implicit feedback data, which can be noisy and misleading. Existing denoising studies involve incorporating auxiliary information or learning strategies…
Automatic Instruction Data Selection for Large Language Models via Uncertainty-Aware Influence Maximization Open
Acetyl-11-keto-β-boswellic acid alleviates hepatic metabolic dysfunction by inhibiting MGLL activity Open
A Survey of Reasoning with Foundation Models: Concepts, Methodologies, and Outlook Open
Reasoning, a crucial ability for complex problem-solving, plays a pivotal role in various real-world settings such as negotiation, medical diagnosis, and criminal investigation. It serves as a fundamental methodology in the field of Artifi…
TP-RAG: Benchmarking Retrieval-Augmented Large Language Model Agents for Spatiotemporal-Aware Travel Planning Open
Large language models (LLMs) have shown promise in automating travel planning, yet they often fall short in addressing nuanced spatiotemporal rationality. While existing benchmarks focus on basic plan validity, they neglect critical aspect…
SciHorizon: Benchmarking AI-for-Science Readiness from Scientific Data to Large Language Models Open
In recent years, the rapid advancement of Artificial Intelligence (AI) technologies, particularly Large Language Models (LLMs), has revolutionized the paradigm of scientific discovery, establishing AI-for-Science (AI4Science) as a dynamic …
TimeFound: A Foundation Model for Time Series Forecasting Open
We present TimeFound, an encoder-decoder transformer-based time series foundation model for out-of-the-box zero-shot forecasting. To handle time series data from various domains, TimeFound employs a multi-resolution patching strategy to ca…