Jie Tang
Glyph: Scaling Context Windows via Visual-Text Compression
Large language models (LLMs) increasingly rely on long-context modeling for tasks such as document understanding, code analysis, and multi-step reasoning. However, scaling context windows to the million-token level brings prohibitive compu…
UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning
The development of autonomous agents for graphical user interfaces (GUIs) presents major challenges in artificial intelligence. While recent advances in native agent models have shown promise by unifying perception, reasoning, action, and …
ReST-RL: Achieving Accurate Code Reasoning of LLMs with Optimized Self-Training and Decoding
With respect to improving the reasoning accuracy of LLMs, the representative reinforcement learning (RL) method GRPO faces failure due to insignificant reward variance, while verification methods based on process reward models (PRMs) suffe…
UavNetSim-v1: A Python-based Simulation Platform for UAV Communication Networks
In unmanned aerial vehicle (UAV) networks, communication protocols and algorithms are essential for cooperation and collaboration between UAVs. Simulation provides a cost-effective solution for prototyping, debugging, and analyzing protoco…
ProtGO: universal protein function prediction utilizing multi-modal gene ontology knowledge
Motivation: As one of the recalcitrant challenges in life sciences and biomedicine, protein function prediction suffers from a deluge of AI-designed proteins, particularly having to face multi-modal information in the era of big data. Impor…
Colony Binary Classification Based on Persistent Homology Feature Extraction and Improved EfficientNet
Classifying newly formed colonies is instrumental in uncovering sources of infection and enabling precision medicine, holding significant clinical value. However, due to the ambiguous features of early-stage colony images in culture dishes…
Deep learning-based automated segmentation for the quantitative diagnosis of cerebral small vessel disease via multisequence MRI
Objective: Existing visual scoring systems for cerebral small vessel disease (CSVD) cannot assess the global lesion load accurately and quantitatively. We aimed to develop an automated segmentation method based on deep learning (DL) to quan…
Small Language Model Makes an Effective Long Text Extractor
Named Entity Recognition (NER) is a fundamental problem in natural language processing (NLP). However, the task of extracting longer entity spans (e.g., awards) from extended texts (e.g., homepages) is barely explored. Current NER methods …
StepMathAgent: A Step-Wise Agent for Evaluating Mathematical Processes through Tree-of-Error
Evaluating mathematical capabilities is critical for assessing the overall performance of large language models (LLMs). However, existing evaluation methods often focus solely on final answers, resulting in highly inaccurate and uninterpre…
HPSS: Heuristic Prompting Strategy Search for LLM Evaluators
Since the adoption of large language models (LLMs) for text evaluation has become increasingly prevalent in the field of natural language processing (NLP), a series of existing works attempt to optimize the prompts for LLM evaluators to im…
WCN25-1183 IMPLEMENTING CKD PATIENT EDUCATION IN A LOW-RESOURCE SETTING: A MIXED METHODS FEASIBILITY STUDY IN WESTERN KENYA
Effect of magnetization on antibacterial, lipid-lowering and antioxidant activities of isoquinoline alkaloids
MotionBench: Benchmarking and Improving Fine-grained Video Motion Understanding for Vision Language Models
In recent years, vision language models (VLMs) have made significant advancements in video understanding. However, a crucial capability - fine-grained motion comprehension - remains under-explored in current benchmarks. To address this gap…
Dynamic Scaling of Unit Tests for Code Reward Modeling
Current large language models (LLMs) often struggle to produce accurate responses on the first attempt for complex reasoning tasks like code generation. Prior research tackles this challenge by generating multiple candidate solutions and v…
A Machine Learning-Based Online Prognostic Prediction Model for Patients with Pancreatitis Complicated by Sepsis: Development and Validation in Two Retrospective Cohorts
LogicGame: Benchmarking Rule-Based Reasoning Abilities of Large Language Models
A Survey of Post-Training Scaling in Large Language Models
VisionReward: Fine-Grained Multi-Dimensional Human Preference Learning for Image and Video Generation
Visual generative models have achieved remarkable progress in synthesizing photorealistic images and videos, yet aligning their outputs with human preferences across critical dimensions remains a persistent challenge. Though reinforcement …
LongBench v2: Towards Deeper Understanding and Reasoning on Realistic Long-context Multitasks
This paper introduces LongBench v2, a benchmark designed to assess the ability of LLMs to handle long-context problems requiring deep understanding and reasoning across real-world multitasks. LongBench v2 consists of 503 challenging multip…
The Superalignment of Superhuman Intelligence with Large Language Models
We have witnessed superhuman intelligence thanks to the fast development of large language models and multimodal language models. As the application of such superhuman models becomes more and more popular, a critical question arises here: …
GuARD: Effective Anomaly Detection through a Text-Rich and Graph-Informed Language Model
Anomaly detection on text-rich graphs is widely prevalent in real life, such as detecting incorrectly assigned academic papers to authors and detecting bots in social networks. The remarkable capabilities of large language models (LLMs) pa…
Application Research on Improving Few shot Semi supervised Network In Fault Diagnosis of Rocket Artillery Rotation Machine
The success of fault diagnosis based on deep learning is attributed to a large number of labeled samples. However, in the application of fault diagnosis for artillery rotating devices, the scarcity of labeled samples can easily lead to ove…
A Deep Learning-Based Framework for Bearing RUL Prediction to Optimize Laser Shock Peening Remanufacturing
Accurate prediction of the remaining useful life (RUL) of bearings is crucial for maintaining the reliability and efficiency of industrial systems. This study introduces a novel methodology integrating advanced machine learning and optimiz…
AndroidLab: Training and Systematic Benchmarking of Android Autonomous Agents
Autonomous agents have become increasingly important for interacting with the real world. Android agents, in particular, have been recently a frequently-mentioned interaction method. However, existing studies for training and evaluating An…