Explanipedia

Bridging VLMs and Embodied Intelligence with Deliberate Practice Policy Optimization

Chao Liu, Xiaolu Ren, H. Ni, Yingji Zhang, Shuai Zhang, et al. · 2025

Developing a universal and versatile embodied intelligence system presents two primary challenges: the critical embodied data bottleneck, where real-world data is scarce and expensive, and the algorithmic inefficiency of existing methods, …

A Polynomial-time Algorithm for Online Sparse Linear Regression with Improved Regret Bound under Weaker Conditions

Junfan Li, Shizhong Liao, Zenglin Xu, Liqiang Nie · 2025

In this paper, we study the problem of online sparse linear regression (OSLR) where the algorithms are restricted to accessing only $k$ out of $d$ attributes per instance for prediction, which was proved to be NP-hard. Previous work gave p…

NEXUS-O: An Omni-Perceptive and -Interactive Model for Language, Audio, and Vision

Che Liu, Yingji Zhang, Dong Zhang, Weijie Zhang, Chenggong Gong, et al. · 2025

IndexNet: Timestamp and Variable-Aware Modeling for Time Series Forecasting

Beiliang Wu, Peiyuan Liu, Yifan Hu, Luyan Zhang, Ao Hu, et al. · 2025

Multivariate time series forecasting (MTSF) plays a vital role in a wide range of real-world applications, such as weather prediction and traffic flow forecasting. Although recent advances have significantly improved the modeling of tempor…

From Implicit Exploration to Structured Reasoning: Leveraging Guideline and Refinement for LLMs

Jiaxiang Chen, Zhuo Wang, Mingxi Zou, Zhigang Li, Zhijian Zhou, et al. · 2025

Large language models (LLMs) have advanced general-purpose reasoning, showing strong performance across diverse tasks. However, existing methods often rely on implicit exploration, where the model follows stochastic and unguided reasoning …

Introducing Academia AI and Applications: a new platform for responsible and interdisciplinary AI research

Zenglin Xu, Irwin King · 2025

FindRec: Stein-Guided Entropic Flow for Multi-Modal Sequential Recommendation

Maolin Wang, Yutian Xiao, B. L. Wang, Sheng Zhang, Shanshan Ye, et al. · 2025

Modern recommendation systems face significant challenges in processing multimodal sequential data, particularly in temporal dynamics modeling and information flow coordination. Traditional approaches struggle with distribution discrepanci…

Efficient Network Automatic Relevance Determination

Hongwei Zhang, Ziqi Ye, Xinyuan Wang, Xin Guo, Zenglin Xu, et al. · 2025

We propose Network Automatic Relevance Determination (NARD), an extension of ARD for linearly probabilistic models, to simultaneously model sparse relationships between inputs $X \in \mathbb R^{d \times N}$ and outputs $Y \in \mathbb R^{m …

Simple Yet Effective: Extracting Private Data Across Clients in Federated Fine-Tuning of Large Language Models

Yabin Hu, Zhuo Zhang, Jingyuan Zhang, Lizhen Qu, Zenglin Xu · 2025

Federated fine-tuning of large language models (FedLLMs) presents a promising approach for achieving strong model performance while preserving data privacy in sensitive domains. However, the inherent memorization ability of LLMs makes them…

Heterogeneous Group-Based Reinforcement Learning for LLM-based Multi-Agent Systems

Guanzhong Chen, Shaoxiong Yang, Chao Li, Wei Liu, Jian Luan, et al. · 2025

Large Language Models (LLMs) have achieved remarkable success across diverse natural language processing tasks, yet their deployment in real-world applications is hindered by fixed knowledge cutoffs and difficulties in generating controlla…

FL@FM-TheWebConf'25: International Workshop on Federated Foundation Models for the Web 2025

Irwin King, Guodong Long, Zenglin Xu, Yifei Zhang, Han Yu · 2025

GeoPro-Net: Learning Interpretable Spatiotemporal Prediction Models Through Statistically-Guided Geo-Prototyping

Bang An, Xun Yu Zhou, Zirui Zhou, Ronilo Ragodos, Zenglin Xu, et al. · 2025

The problem of forecasting spatiotemporal events such as crimes and accidents is crucial to public safety and city management. Besides accuracy, interpretability is also a key requirement for spatiotemporal forecasting models to justify th…

Position: LLMs Can be Good Tutors in English Education

Jingheng Ye, S. Wang, D. Zou, Yibo Yan, Kun Wang, et al. · 2025

While recent efforts have begun integrating large language models (LLMs) into English education, they often rely on traditional approaches to learning tasks without fully embracing educational methodologies, thus lacking adaptability to la…

From Implicit Exploration to Structured Reasoning: Guideline and Refinement for LLMs

Jiaxiang Chen, Zhuo Wang, Mingxi Zou, Zhigang Li, Zhijian Zhou, et al. · 2025

Towards Incomplete Multimodal Learning with Prompt-Based Hierarchical Knowledge Distillation

Ruiting Dai, Xin Gao, Lisi Mo, Zonghang Li, Taiping He, et al. · 2025

From Continuous Pre-Training to Alignment: A Comprehensive Toolkit for Large Language Models in Federated Learning

Zhuo Zhang, Yukun Zhang, Guanzhong Chen, Lizhen Qu, Xun Zhou, et al. · 2025

Enhancing Progressive Ensemble Learning via Normalized Extra-Gradient Initialization

Zheshun Wu, Yu Pan, Dun Zeng, Qifan Wang, Zenglin Xu, et al. · 2025

FinHEAR: Human Expertise and Adaptive Risk-Aware Temporal Reasoning for Financial Decision-Making

Jiaxiang Chen, M. Zou, Zhuo Wang, Qifan Wang, Dandan Sun, et al. · 2025

GeoPro-Net: Learning Interpretable Spatiotemporal Prediction Models through Statistically-Guided Geo-Prototyping

Bang An, Xun Yu Zhou, Zirui Zhou, Ronilo Ragodos, Zenglin Xu, et al. · 2024

The problem of forecasting spatiotemporal events such as crimes and accidents is crucial to public safety and city management. Besides accuracy, interpretability is also a key requirement for spatiotemporal forecasting models to justify th…

CENTAUR: Bridging the Impossible Trinity of Privacy, Efficiency, and Performance in Privacy-Preserving Transformer Inference

Jinglong Luo, Guanzhong Chen, Yehong Zhang, Shiyu Liu, Hui Wang, et al. · 2024

With the growing deployment of pre-trained models like Transformers on cloud platforms, privacy concerns about model parameters and inference data are intensifying. Existing Privacy-Preserving Transformer Inference (PPTI) frameworks face t…

Unveiling the Vulnerability of Private Fine-Tuning in Split-Based Frameworks for Large Language Models: A Bidirectionally Enhanced Attack

G Chen, Zhenghan Qin, Mingxin Yang, Yajie Zhou, Tao Fan, et al. · 2024

Recent advancements in pre-trained large language models (LLMs) have significantly influenced various domains. Adapting these models for specific tasks often involves fine-tuning (FT) with private, domain-specific data. However, privacy co…

Understanding Generalization of Federated Learning: the Trade-off between Model Stability and Optimization

Dun Zeng, Zheshun Wu, Shiyu Liu, Yu Pan, Xiaoying Tang, et al. · 2024

Federated Learning (FL) is a distributed learning approach that trains machine learning models across multiple devices while keeping their local data private. However, FL often faces challenges due to data heterogeneity, leading to inconsi…

CUPID: Improving Battle Fairness and Position Satisfaction in Online MOBA Games with a Re-matchmaking System

Ge Fan, Chaoyun Zhang, Kai Wang, Yingjie Li, Junyang Chen, et al. · 2024

The multiplayer online battle arena (MOBA) genre has gained significant popularity and economic success, attracting considerable research interest within the Human-Computer Interaction community. Enhancing the gaming experience requires a …

MME-Finance: A Multimodal Finance Benchmark for Expert-level Understanding and Reasoning

Ziliang Gan, Lu Yu, Dong Zhang, Haohan Li, Che Liu, et al. · 2024

In recent years, multimodal benchmarks for general domains have guided the rapid development of multimodal models on general tasks. However, the financial field has its peculiarities. It features unique graphical images (e.g., candlestick …

FedCVD: The First Real-World Federated Learning Benchmark on Cardiovascular Disease Data

Yukun Zhang, Guanzhong Chen, Zenglin Xu, Jianyong Wang, Dun Zeng, et al. · 2024

Cardiovascular diseases (CVDs) are currently the leading cause of death worldwide, highlighting the critical need for early diagnosis and treatment. Machine learning (ML) methods can help diagnose CVDs early, but their performance relies o…

TimeCNN: Refining Cross-Variable Interaction on Time Point for Time Series Forecasting

Ao Hu, Dongkai Wang, Yong Dai, Shiyi Qi, Liangjian Wen, et al. · 2024

Time series forecasting is extensively applied across diverse domains. Transformer-based models demonstrate significant potential in modeling cross-time and cross-variable interaction. However, we notice that the cross-variable correlation…

M$^2$PT: Multimodal Prompt Tuning for Zero-shot Instruction Learning

Taowen Wang, Yiyang Liu, Jia Liang, Jie Zhao, Yiming Cui, et al. · 2024

Multimodal Large Language Models (MLLMs) demonstrate remarkable performance across a wide range of domains, with increasing emphasis on enhancing their zero-shot generalization capabilities for unseen tasks across various modalities. Instr…

Can we only use guideline instead of shot in prompt?

Jiaxiang Chen, Song Wang, Zhigang Li, Wayne Xiong, Lizhen Qu, et al. · 2024

Currently, prompting techniques can be mainly divided into two categories: 1) the shot method implicitly inspires the model to answer the question by mimicking the steps in the given example, e.g., the few-shot CoT; 2) the guideline method explicitly…

Zenglin Xu