Explanipedia

CodeRL+: Improving Code Generation via Reinforcement with Execution Semantics Alignment Open

Xue Jiang, Yihong Dong, Meng-Yang Liu, H. Deng, Wang Tian, et al. · 2025

While Large Language Models (LLMs) excel at code generation by learning from vast code corpora, a fundamental semantic gap remains between their training on textual patterns and the goal of functional correctness, which is governed by form…

Saber: An Efficient Sampling with Adaptive Acceleration and Backtracking Enhanced Remasking for Diffusion Language Model Open

Yihong Dong, Zhaoyu Ma, Xue Jiang, Zheng Fan, Jiaru Qian, et al. · 2025

Diffusion language models (DLMs) are emerging as a powerful and promising alternative to the dominant autoregressive paradigm, offering inherent advantages in parallel generation and bidirectional context modeling. However, the performance…

Format-Adapter: Improving Reasoning Capability of LLMs by Adapting Suitable Format Open

Dingzirui Wang, Xiang Zhang, Rongyu Cao, Longxu Dou, Xi Luo, et al. · 2025

Generating and voting multiple answers is an effective method to mitigate reasoning inconsistencies of large language models (LLMs). Prior works have shown that multiple reasoning formats outperform a single format when generating multiple…

SWE-GPT: A Process-Centric Language Model for Automated Software Improvement Open

Yingwei Ma, Rongyu Cao, Yongchang Cao, Yue Zhang, Jue Chen, et al. · 2025

Large language models (LLMs) have demonstrated remarkable performance in code generation, significantly enhancing the coding efficiency of developers. Recent advancements in LLM-based agents have led to significant progress in end-to-end a…

Do Code LLMs Understand Design Patterns? Open

Zhenyu Pan, X. M. Song, Yunkun Wang, Rongyu Cao, Binhua Li, et al. · 2025

Code Large Language Models (LLMs) demonstrate great versatility in adapting to various downstream tasks, including code generation and completion, as well as bug detection and fixing. However, Code LLMs often fail to capture existing codin…

LLMs as Continuous Learners: Improving the Reproduction of Defective Code in Software Issues Open

Yalan Lin, Yingwei Ma, Rongyu Cao, Binhua Li, Fei Huang, et al. · 2024

Reproducing buggy code is the first and crucially important step in issue resolving, as it aids in identifying the underlying problems and validating that generated patches resolve the problem. While numerous approaches have been proposed …

Lingma SWE-GPT: An Open Development-Process-Centric Language Model for Automated Software Improvement Open

Yingwei Ma, Rongyu Cao, Yongchang Cao, Yue Zhang, Jue Chen, et al. · 2024

Recent advancements in LLM-based agents have led to significant progress in automatic software engineering, particularly in software maintenance and evolution. Despite these encouraging advances, current research faces two major challenges…

Codev-Bench: How Do LLMs Understand Developer-Centric Code Completion? Open

Zhenyu Pan, Rongyu Cao, Yongchang Cao, Yingwei Ma, Binhua Li, et al. · 2024

Code completion, a key downstream task in code generation, is one of the most frequent and impactful methods for enhancing developer productivity in software development. As intelligent completion tools evolve, we need a robust evaluation …

In-Context Transfer Learning: Demonstration Synthesis by Transferring Similar Tasks Open

Dingzirui Wang, X. Zhang, Qiguang Chen, Longxu Dou, Xu Xiao, et al. · 2024

In-context learning (ICL) is an effective approach to help large language models (LLMs) adapt to various tasks by providing demonstrations of the target task. Considering the high cost of labeling demonstrations, many methods propose synth…

Alibaba LingmaAgent: Improving Automated Issue Resolution via Comprehensive Repository Exploration Open

Yingwei Ma, Qingping Yang, Rongyu Cao, Binhua Li, Fei Huang, et al. · 2024

This paper presents Alibaba LingmaAgent, a novel Automated Software Engineering method designed to comprehensively understand and utilize whole software repositories for issue resolution. Deployed in TONGYI Lingma, an IDE-based coding assi…

CircAGFG1 Promotes Ovarian Cancer Progression Through the miR-409-3p/ZEB1 Axis Open

Jie Luo, Hua Zhong, Mei Guo, Peihong Xiao, Rongyu Cao, et al. · 2024

Objectives: Circular RNAs (circRNAs) serve a crucial regulatory role in ovarian cancer (OC). Circular RNA ArfGAP with FG repeats 1 (circAGFG1) has been shown to be involved in promoting the progression of several cancers, containing triple-…

Benefit distribution of integrated regional energy systems under carbon trading mechanisms based on improved Shapley value methods Open

Yubao Wang, Rongyu Cao, Da Luo, Pengpeng Li, Xiang Cheng · 2023

Carbon trading mechanisms and the development of integrated energy systems are important ways to realize the “carbon peaking and carbon neutrality” goal, and the problem of benefit distribution is of paramount importance to achieving the g…

CATS: A Pragmatic Chinese Answer-to-Sequence Dataset with Large Scale and High Quality Open

Liang Li, Ruiying Geng, Chengyang Fang, Bing Li, Can Ma, et al. · 2023

There are three problems existing in the popular data-to-text datasets. First, the large-scale datasets either contain noise or lack real application scenarios. Second, the datasets close to real applications are relatively small in size. …

Can LLM Already Serve as A Database Interface? A BIg Bench for Large-Scale Database Grounded Text-to-SQLs Open

Jinyang Li, Binyuan Hui, Ge Qu, Binhua Li, Jiaxi Yang, et al. · 2023

Text-to-SQL parsing, which aims at converting natural language instructions into executable SQLs, has gained increasing attention in recent years. In particular, Codex and ChatGPT have shown impressive results in this task. However, most o…

CATS: A Pragmatic Chinese Answer-to-Sequence Dataset with Large Scale and High Quality Open

Liang Li, Ruiying Geng, Chengyang Fang, Bing Li, Can Ma, et al. · 2023

Liang Li, Ruiying Geng, Chengyang Fang, Bing Li, Can Ma, Rongyu Cao, Binhua Li, Fei Huang, Yongbin Li. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2023.

Application of long non-coding RNA RBAT1 in improving diagnosis and prognosis of ovarian carcinoma Open

Jie Luo, Yuqing Zhang, Ting Zheng, Yongping Jing, Rongyu Cao, et al. · 2022

Tumorigenesis of bladder cancer and retinoblastoma is correlated with long non-coding RNA (lncRNA) RBAT1. However, the role of RBAT1 in ovarian carcinoma (OC) is unclear. Thus, the study explored the role of RBAT1 in OC. This research enro…

A Survey on Text-to-SQL Parsing: Concepts, Methods, and Future Directions Open

Bowen Qin, Binyuan Hui, Lihan Wang, Min Yang, Jinyang Li, et al. · 2022

Text-to-SQL parsing is an essential and challenging task. The goal of text-to-SQL parsing is to convert a natural language (NL) question to its corresponding structured query language (SQL) based on the evidences provided by relational dat…

Extracting Variable-Depth Logical Document Hierarchy from Long Documents: Method, Evaluation, and Application Open

Rongyu Cao, Yixuan Cao, Ganbin Zhou, Ping Luo · 2022

Extracting Zero-shot Structured Information from Form-like Documents: Pretraining with Keys and Triggers Open

Rongyu Cao, Ping Luo · 2021

In this paper, we revisit the problem of extracting the values of a given set of key fields from form-like documents. It is the vital step to support many downstream applications, such as knowledge base construction, question answering, do…

LncRNA DLGAP1-AS2 Suppresses the Maturation of miR-16 to Suppress Cell Invasion and Migration of Ovarian Cancer Cells Open

Jie Luo, Yuqiang Zhang, Ting Zheng, Yongping Jing, Rongyu Cao, et al. · 2021

Background: This study aimed to explore the role of lncRNA DLGAP1-AS2 in ovarian cancer (OC). Methods: Expression of DLGAP1-AS2, mature miR-16 and miR-16 precursor in paired OC tissues and non-tumor tissues collected from 62 OC patients wa…

Hierarchical Neural Network for Extracting Knowledgeable Snippets and Documents Open

Ganbin Zhou, Rongyu Cao, Xiang Ao, Ping Luo, Lin Fen, et al. · 2018

In this study, we focus on extracting knowledgeable snippets and annotating knowledgeable documents from Web corpus, consisting of the documents from social media and We-media. Informally, knowledgeable snippets refer to the text descri…

Hierarchical Neural Network for Extracting Knowledgeable Snippets and Documents Open

Ganbin Zhou, Rongyu Cao, Xiang Ao, Ping Luo, Fen Lin, et al. · 2018

In this study, we focus on extracting knowledgeable snippets and annotating knowledgeable documents from Web corpus, consisting of the documents from social media and We-media. Informally, knowledgeable snippets refer to the text describin…

Tree-Structured Neural Machine for Linguistics-Aware Sentence Generation Open

Ganbin Zhou, Ping Luo, Rongyu Cao, Yijun Xiao, Fen Lin, et al. · 2018

Different from other sequential data, sentences in natural language are structured by linguistic grammars. Previous generative conversational models with chain-structured decoder ignore this structure in human language and might generate p…

Tree-Structured Neural Machine for Linguistics-Aware Sentence Generation Open

Ganbin Zhou, Ping Luo, Rongyu Cao, Yijun Xiao, Fen Lin, et al. · 2017

Different from other sequential data, sentences in natural language are structured by linguistic grammars. Previous generative conversational models with chain-structured decoder ignore this structure in human language and might generate p…

Generative Neural Machine for Tree Structures Open

Ganbin Zhou, Ping Luo, Rongyu Cao, Yijun Xiao, Fen Lin, et al. · 2017

Tree structures are commonly used in the tasks of semantic analysis and understanding over the data of different modalities, such as natural language, 2D or 3D graphics and images, or Web pages. Previous studies model the structures in a …

Mechanism-Aware Neural Machine for Dialogue Response Generation Open

Ganbin Zhou, Ping Luo, Rongyu Cao, Fen Lin, Bo Chen, et al. · 2017

To the same utterance, people's responses in everyday dialogue may be diverse largely in terms of content semantics, speaking styles, communication intentions and so on. Previous generative conversational models ignore these 1-to-n relatio…

Robust Indoor Human Activity Recognition Using Wireless Signals Open

Yi Wang, Xinli Jiang, Rongyu Cao, Xiyang Wang · 2015

Wireless signals–based activity detection and recognition technology may be complementary to the existing vision-based methods, especially under the circumstance of occlusions, viewpoint change, complex background, lighting condition chang…

Rongyu Cao