Explanipedia

Learning Dynamics of VLM Finetuning

J. Q. Zhang, Kaitong Cai, Jing Yang, Keze Wang · 2025

Preference-based finetuning of vision-language models (VLMs) is brittle: trivially wrong negatives inject uninformative gradients that destabilize training. We recast alignment as learning-dynamics-aware optimization and introdu…

GAM-Agent: Game-Theoretic and Uncertainty-Aware Collaboration for Complex Visual Reasoning

J. Q. Zhang, Yutong Fan, Ruiqi Chen, Haoyi Jiang, Wenhao Chai, et al. · 2025

We propose GAM-Agent, a game-theoretic multi-agent framework for enhancing vision-language reasoning. Unlike prior single-agent or monolithic models, GAM-Agent formulates the reasoning process as a non-zero-sum game between base agents--ea…

A cartographic generalization method for 3D visualization of trajectories in space–time cubes: case study of epidemic spread

Fei Yang, Jie Shen, Fengzhen Zhu, J. Q. Zhang · 2025

The widespread adoption of positioning technology and location-based services has resulted in the continuous generation of substantial volumes of accessible spatiotemporal trajectory data. While many studies focus on 2D trajectory visualiz…

A Study on Blueberry Variety Classification Based on the HMS-ResNeXt50 Model

Rongli Gai, J. Q. Zhang, Ming Gao, Liya Hu, Guokai Xu, et al. · 2025
