Fei Huang
Conceptual design report of the Super Tau-Charm Facility: the accelerator
Electron–positron colliders operating in the GeV center-of-mass range, or tau-charm energy region, have been shown to enable competitive frontier research due to several unique features. With the progress of high-energy physics in the las…
InspectCoder: Dynamic Analysis-Enabled Self Repair through interactive LLM-Debugger Collaboration
Large Language Models (LLMs) frequently generate buggy code with complex logic errors that are challenging to diagnose. While existing LLM-based self-repair approaches conduct intensive static semantic analysis or rely on superficial exec…
Scaling Generalist Data-Analytic Agents
Data-analytic agents are emerging as a key catalyst for automated scientific discovery and for the vision of Innovating AI. Current approaches, however, rely heavily on prompt engineering over proprietary models, while open-source models s…
GSID: Generative Semantic Indexing for E-Commerce Product Understanding
Structured representation of product information is a major bottleneck for the efficiency of e-commerce platforms, especially second-hand e-commerce platforms. Currently, most product information is organized based on manually curated p…
PARL-MT: Learning to Call Functions in Multi-Turn Conversation with Progress Awareness
Large language models (LLMs) have achieved impressive success in single-turn function calling, yet real-world applications such as travel planning or multi-stage data analysis typically unfold across multi-turn conversations. In these sett…
Towards General Agentic Intelligence via Environment Scaling
Advanced agentic intelligence is a prerequisite for deploying Large Language Models in practical, real-world applications. Diverse real-world APIs demand precise, robust function-calling intelligence, which needs agents to develop these ca…
WebResearcher: Unleashing unbounded reasoning capability in Long-Horizon Agents
Recent advances in deep-research systems have demonstrated the potential for AI agents to autonomously discover and synthesize knowledge from external sources. In this paper, we introduce WebResearcher, a novel framework for building such …
CultureSynth: A Hierarchical Taxonomy-Guided and Retrieval-Augmented Framework for Cultural Question-Answer Synthesis
Cultural competence, defined as the ability to understand and adapt to multicultural contexts, is increasingly vital for large language models (LLMs) in global environments. While several cultural benchmarks exist to assess LLMs' cultural …
WebWatcher: Breaking New Frontier of Vision-Language Deep Research Agent
Web agents such as Deep Research have demonstrated superhuman cognitive abilities, capable of solving highly challenging information-seeking problems. However, most research remains primarily text-centric, overlooking visual information in…
TimeHC-RL: Temporal-aware Hierarchical Cognitive Reinforcement Learning for Enhancing LLMs' Social Intelligence
Recently, Large Language Models (LLMs) have made significant progress in IQ-related domains that require careful thinking, such as mathematics and coding. However, enhancing LLMs' cognitive development in social domains, particularly from …
ChARM: Character-based Act-adaptive Reward Modeling for Advanced Role-Playing Language Agents
Role-Playing Language Agents (RPLAs) aim to simulate characters for realistic and engaging human-computer interactions. However, traditional reward models often struggle with scalability and adapting to subjective conversational preference…
Socratic-PRMBench: Benchmarking Process Reward Models with Systematic Reasoning Patterns
Process Reward Models (PRMs) are crucial in complex reasoning and problem-solving tasks (e.g., LLM agents with long-horizon decision-making) by verifying the correctness of each intermediate reasoning step. In real-world scenarios, LLMs ma…
Mobile-Agent-V: A Video-Guided Approach for Effortless and Efficient Operational Knowledge Injection in Mobile Automation
The exponential rise in mobile device usage necessitates streamlined automation for effective task management, yet many AI frameworks fall short due to inadequate operational expertise. While manually written knowledge can bridge this gap,…
Qwen3 Technical Report
In this work, we present Qwen3, the latest version of the Qwen model family. Qwen3 comprises a series of large language models (LLMs) designed to advance performance, efficiency, and multilingual capabilities. The Qwen3 series includes mod…
WritingBench: A Comprehensive Benchmark for Generative Writing
Recent advancements in large language models (LLMs) have significantly enhanced text generation capabilities, yet evaluating their performance in generative writing remains a challenge. Existing benchmarks primarily focus on generic text g…
Detecting Knowledge Boundary of Vision Large Language Models by Sampling-Based Inference
Despite the advancements made in Vision Large Language Models (VLLMs), like text Large Language Models (LLMs), they have limitations in addressing questions that require real-time information or are knowledge-intensive. Indiscriminately ad…
Multilingual Non-Autoregressive Machine Translation without Knowledge Distillation
Multilingual neural machine translation (MNMT) aims at using one single model for multiple translation directions. Recent work applies non-autoregressive Transformers to improve the efficiency of MNMT, but requires expensive knowledge dist…
Mobile-Agent-E: Self-Evolving Mobile Assistant for Complex Tasks
Smartphones have become indispensable in modern life, yet navigating complex tasks on mobile devices often remains frustrating. Recent advancements in large multimodal model (LMM)-based mobile agents have demonstrated the ability to percei…
OpenOmni: Advancing Open-Source Omnimodal Large Language Models with Progressive Multimodal Alignment and Real-Time Self-Aware Emotional Speech Synthesis
Recent advancements in omnimodal learning have significantly improved understanding and generation across images, text, and speech, yet these developments remain predominantly confined to proprietary models. The lack of high-quality omnimo…
ProductAgent: Benchmarking Conversational Product Search Agent with Asking Clarification Questions
NOVA-63: Native Omni-lingual Versatile Assessments of 63 Disciplines
Dimensionality Reduction and Classification Based on Enhanced Graph and Hypergraph Joint Discriminative Learning
mPLUG-DocOwl2: High-resolution Compressing for OCR-free Multi-page Document Understanding
KBM: Delineating Knowledge Boundary for Adaptive Retrieval in Large Language Models
DecoupleSearch: Decouple Planning and Search via Hierarchical Reward Modeling
Supportiveness-based Knowledge Rewriting for Retrieval-augmented Language Modeling