Explanipedia

Evidence of scaling advantage on an NP-Complete problem with enhanced quantum solvers Open

Quanfeng Lu, Keren Li, Yan Bao, Muxi Zheng, Haoran Zhang, et al. · 2025

Achieving quantum advantage remains a key milestone in the noisy intermediate-scale quantum era. Without rigorous complexity proofs, scaling advantage, where quantum resource requirements grow more slowly than their classical counterparts, s…

Quantum-classical hybrid algorithm for solving the learning-with-errors problem on NISQ devices Open

Muxi Zheng, Jinfeng Zeng, Wentao Yang, Pei-Jie Chang, Quanfeng Lu, et al. · 2025

Computer science · Physics

The Learning-With-Errors (LWE) problem is a fundamental computational challenge with implications for post-quantum cryptography and computational learning theory. Here we propose a quantum-classical hybrid algorithm with an Ising model to add…

MM-Eureka: Exploring the Frontiers of Multimodal Reasoning with Rule-based Reinforcement Learning Open

Fanqing Meng, Lingxiao Du, Zongkai Liu, Zhixiang Zhou, Quanfeng Lu, et al. · 2025

DeepSeek R1 and o1 have demonstrated powerful reasoning capabilities in the text domain through stable large-scale reinforcement learning. To enable broader applications, some works have attempted to transfer these capabilities to multimo…

An Investigation of Energy Consumption Characteristics of the Pump-Control System for Electric Excavator Arms Open

Anpeng He, Liejiang Wei, Quanfeng Lu, Pengfei He · 2024

Environmental science · Engineering · Physics

The conventional hydraulic system of excavators suffers from significant valve throttling losses and inadequate matching between the hydraulic power source and the load, which substantially impact the system’s overall energy consumption an…

Towards World Simulator: Crafting Physical Commonsense-Based Benchmark for Video Generation Open

Fanqing Meng, Jiaqi Liao, Xinyu Tan, Wenqi Shao, Quanfeng Lu, et al. · 2024

Computer science · Geography

Text-to-video (T2V) models like Sora have made significant strides in visualizing complex prompts, which is increasingly viewed as a promising path towards constructing the universal world simulator. Cognitive psychologists believe that th…

MMIU: Multimodal Multi-image Understanding for Evaluating Large Vision-Language Models Open

Fanqing Meng, Jin Wang, Chuanhao Li, Quanfeng Lu, Hao Tian, et al. · 2024

Computer science

The capability to process multiple images is crucial for Large Vision-Language Models (LVLMs) to develop a more thorough and nuanced understanding of a scene. Recent multi-image LVLMs have begun to address this need. However, their evaluat…

PhyBench: A Physical Commonsense Benchmark for Evaluating Text-to-Image Models Open

Fanqing Meng, Wenqi Shao, Lixin Luo, Yahong Wang, Yiran Chen, et al. · 2024

Computer science · Geography

Text-to-image (T2I) models have made substantial progress in generating images from textual prompts. However, they frequently fail to produce images consistent with physical commonsense, a vital capability for applications in world simulat…

GUIOdyssey: A Comprehensive Dataset for Cross-App GUI Navigation on Mobile Devices Open

Quanfeng Lu, Wenqi Shao, Zitao Liu, Fanqing Meng, Boxuan Li, et al. · 2024

Computer science

Autonomous Graphical User Interface (GUI) navigation agents can enhance user experience in communication, entertainment, and productivity by streamlining workflows and reducing manual intervention. However, prior GUI agents often trained w…

MMT-Bench: A Comprehensive Multimodal Benchmark for Evaluating Large Vision-Language Models Towards Multitask AGI Open

Kaining Ying, Fanqing Meng, Jin Wang, Zhiqian Li, Lin Han, et al. · 2024

Computer science · Geography

Large Vision-Language Models (LVLMs) show significant strides in general-purpose multimodal applications such as visual dialogue and embodied navigation. However, existing multimodal evaluation benchmarks cover a limited number of multimod…

OmniMedVQA: A New Large-Scale Comprehensive Evaluation Benchmark for Medical LVLM Open

Yutao Hu, Tianbin Li, Quanfeng Lu, Wenqi Shao, Junjun He, et al. · 2024

Computer science · Geography

Large Vision-Language Models (LVLMs) have demonstrated remarkable capabilities in various multimodal tasks. However, their potential in the medical domain remains largely unexplored. A significant challenge arises from the scarcity of dive…

ChartAssisstant: A Universal Chart Multimodal Language Model via Chart-to-Table Pre-training and Multitask Instruction Tuning Open

Fanqing Meng, Wenqi Shao, Quanfeng Lu, Peng Gao, Kaipeng Zhang, et al. · 2024

Computer science · Mathematics

Charts play a vital role in data visualization, understanding data patterns, and informed decision-making. However, their unique combination of graphical elements (e.g., bars, lines) and textual components (e.g., labels, legends) poses cha…

Quanfeng Lu