Yangfan He
Benchmarking Vision-Language Models on Chinese Ancient Documents: From OCR to Knowledge Reasoning
Chinese ancient documents, invaluable carriers of millennia of Chinese history and culture, hold rich knowledge across diverse fields but face challenges in digitization and understanding, i.e., traditional methods only scan images, while …
Attention and Risk-Aware Decision Framework for Safe Autonomous Driving
Autonomous driving has attracted great interest due to its potential for fully unsupervised driving. Model-based and learning-based methods are widely used in autonomous driving. Model-based methods rely on pre-defined models of t…
Adaptive Evolution Factor Risk Ellipse Framework for Reliable and Safe Autonomous Driving
In recent years, ensuring safety, efficiency, and comfort in interactive autonomous driving has become a critical challenge. Traditional model-based techniques, such as game-theoretic methods and robust control, are often overly conservati…
Enhanced Mean Field Game for Interactive Decision-Making with Varied Stylish Multi-Vehicles
This paper presents an MFG-based decision-making framework for autonomous driving in heterogeneous traffic. To capture diverse human behaviors, we propose a quantitative driving style representation that maps abstract traits to parameters …
Enhancing Commentary Strategies for Guandan: A Study of LLMs in Game Commentary Generation
Recent advancements in large language models (LLMs) have unlocked the potential for generating high-quality game commentary. However, producing insightful and engaging commentary for complex games with incomplete information remains a sign…
See the Forest and the Trees: A Synergistic Reasoning Framework for Knowledge-Based Visual Question Answering
Multimodal Large Language Models (MLLMs) have pushed the frontiers of Knowledge-Based Visual Question Answering (KBVQA), yet their reasoning is fundamentally bottlenecked by a reliance on uni-dimensional evidence. This "seeing only the tre…
MountainLion: A Multi-Modal LLM-Based Agent System for Interpretable and Adaptive Financial Trading
Cryptocurrency trading is a challenging task requiring the integration of heterogeneous data from multiple modalities. Traditional deep learning and reinforcement learning approaches typically demand large training datasets and encode dive…
GLIMPSE: Do Large Vision-Language Models Truly Think With Videos or Just Glimpse at Them?
Existing video benchmarks often resemble image-based benchmarks, with question types like "What actions does the person perform throughout the video?" or "What color is the woman's dress in the video?" For these, models can often answer by…
FASIONAD++ : Integrating High-Level Instruction and Information Bottleneck in FAt-Slow fusION Systems for Enhanced Safety in Autonomous Driving with Adaptive Feedback
Ensuring safe, comfortable, and efficient planning is crucial for autonomous driving systems. While end-to-end models trained on large datasets perform well in standard driving scenarios, they struggle with complex low-frequency events. Re…
MaRI: Material Retrieval Integration across Domains
Accurate material retrieval is critical for creating realistic 3D assets. Existing methods rely on datasets that capture shape-invariant and lighting-varied representations of materials, which are scarce and face challenges due to limited …
PromptLNet: Region-Adaptive Aesthetic Enhancement via Prompt Guidance in Low-Light Enhancement Net
Learning and improving large language models through human preference feedback has become a mainstream approach, but it has rarely been applied to the field of low-light image enhancement. Existing low-light enhancement evaluations typical…
Enhancing Intent Understanding for Ambiguous prompt: A Human-Machine Co-Adaption Strategy
Current image generation systems produce high-quality images but struggle with ambiguous user prompts, making interpretation of actual user intentions difficult. Many users must modify their prompts several times to ensure the generated im…
Enhancing Low-Cost Video Editing with Lightweight Adaptors and Temporal-Aware Inversion
Recent advancements in text-to-image (T2I) generation using diffusion models have enabled cost-effective video-editing applications by leveraging pre-trained models, eliminating the need for resource-intensive training. However, the frame-…
ArtFormer: Controllable Generation of Diverse 3D Articulated Objects
This paper presents a novel framework for modeling and conditional generation of 3D articulated objects. Troubled by flexibility-quality tradeoffs, existing methods are often limited to using predefined structures or retrieving shapes from…
FASIONAD : FAst and Slow FusION Thinking Systems for Human-Like Autonomous Driving with Adaptive Feedback
Ensuring safe, comfortable, and efficient navigation is a critical goal for autonomous driving systems. While end-to-end models trained on large-scale datasets excel in common driving scenarios, they often struggle with rare, long-tail eve…
FALCON: Feedback-driven Adaptive Long/short-term memory reinforced Coding Optimization system
Recently, large language models (LLMs) have achieved significant progress in automated code generation. Despite their strong instruction-following capabilities, these models frequently struggle to align with user intent in coding scenario…
Systematic review and meta-analysis of breathing exercises effects on lung function and quality of life in postoperative lung cancer patients
This study indicates that breathing exercises significantly improve postoperative pulmonary function and QoL in lung cancer patients. Future research should delve into the mechanisms behind these exercises and evaluate their long-term reha…
Azaphosphinate Dyes: A Low Molecular Weight Near‐Infrared Scaffold for Development of Photoacoustic or Fluorescence Imaging Probes
Near‐infrared (NIR) dyes are desirable for biological imaging applications including photoacoustic (PA) and fluorescence imaging. Nonetheless, current NIR dyes are often plagued by relatively large molecular weights, poor water solubility,…
Azaphosphinate Dyes: A Low Molecular Weight Near-Infrared Scaffold for Development of Photoacoustic and Fluorescence Imaging Probes
Near-infrared (NIR) dyes are desirable for biological imaging applications including photoacoustic and fluorescence imaging. Nonetheless, current NIR dyes are often plagued by relatively large molecular weights, poor water solubility, and …
Classification and Generation of Light Sources Using Gamma Fitting
In general, the typical approach to discriminating antibunching, bunching, or superbunching categories relies on calculating the second-order coherence function $g^{(2)}(\tau)$ of light. Although the classical light sources correspond to the…
Frequency-Diverse Bunching Metamaterial Antenna for Coincidence Imaging
A frequency-diverse bunching metamaterial antenna for coincidence imaging in the Ka band is proposed in this paper. The bunching metamaterial antenna includes a broadband circular array and a frequency-diverse bunching metalens. Firstly, i…
Wideband polarization-independent anomalous reflection metasurface with multiple resonance modes
An ultra-thin metasurface is proposed to realize wideband polarization-independent anomalous reflection. The sub-wavelength resonator can produce different resonance modes, which are the result of the combined effect of dielectric and the …