Explanipedia

A Syntactic Neural Model for General-Purpose Code Generation

Pengcheng Yin, Graham Neubig · 2017

Computer science Philosophy

We consider the problem of parsing natural language descriptions into source code written in a general-purpose programming language like Python. Existing data-driven methods treat this problem as a language generation task without consider…

Convolutional Neural Networks over Tree Structures for Programming Language Processing

Lili Mou, Ge Li, Lu Zhang, Tao Wang, Zhi Jin · 2016

Computer science

Programming language processing (similar to natural language processing) is a hot research topic in the field of software engineering; it has also aroused growing interest in the artificial intelligence community. However, different from a…

Using Machine Learning for Vulnerability Detection and Classification

Tiago Baptista, Nuno Oliveira, Pedro Rangel Henriques · 2021

Computer science Chemistry Mathematics

The work described in this paper aims at developing a machine learning based tool for automatic identification of vulnerabilities on programs (source, high level code), that uses an abstract syntax tree representation. It is based on FastS…

Abstract Syntax Networks for Code Generation and Semantic Parsing

Maxim Rabinovich, Mitchell Stern, Dan Klein · 2017

Computer science Geography Economics

Tasks like code generation and semantic parsing require mapping unstructured (or partially structured) inputs to well-formed, executable outputs. We introduce abstract syntax networks, a modeling framework for these problems. The outputs a…

Bill Hillier’s Legacy: Space Syntax—A Synopsis of Basic Concepts, Measures, and Empirical Application

Claudia Yamu, Akkelies van Nes, Chiara Garau · 2021

Computer science Philosophy

Bill Hillier’s space syntax method and theory enables us to describe the spatial properties of a sustainable city. Empirical testing of the space syntax method over time has confirmed the capacity and innovativeness of analyzing spatial re…

CodeGeeX: A Pre-Trained Model for Code Generation with Multilingual Benchmarking on HumanEval-X

Qinkai Zheng, Xiao Xia, Xu Zou, Yuxiao Dong, Shan Wang, et al. · 2023

Computer science Mathematics

Large pre-trained code generation models, such as OpenAI Codex, can generate syntax- and function-correct code, making the coding of programmers more productive. In this paper, we introduce CodeGeeX, a multilingual model with 13 billion par…

Traduction Non Supervisée de Langages de Programmation

Baptiste Rozière · 2022

Computer science

A transcompiler, also known as source-to-source translator, is a system that converts source code from a high-level programming language (such as C++ or Python) to another. Transcompilers are primarily used for interoperability, and to por…

TreeGen: A Tree-Based Transformer Architecture for Code Generation

Zeyu Sun, Qihao Zhu, Yingfei Xiong, Yican Sun, Lili Mou, et al. · 2020

Computer science

A code generation system generates programming language code based on an input natural language description. State-of-the-art approaches rely on neural networks for code generation. However, these code generators suffer from two problems. …

GraphCodeBERT: Pre-training Code Representations with Data Flow

Daya Guo, Shuo Ren, Shuai Lu, Zhangyin Feng, Duyu Tang, et al. · 2020

Computer science

Pre-trained models for programming language have achieved dramatic empirical improvements on a variety of code-related tasks such as code search, code completion, code summarization, etc. However, existing pre-trained models regard a code …

Deep learning similarities from different representations of source code

Michele Tufano, Cody Watson, Gabriele Bavota, Massimiliano Di Penta, Martin White, et al. · 2018

Computer science Philosophy Economics

Assessing the similarity between code components plays a pivotal role in a number of Software Engineering (SE) tasks, such as clone detection, impact analysis, refactoring, etc. Code similarity is generally measured by relying on manually …

Software Defect Prediction via Attention-Based Recurrent Neural Network

Guisheng Fan, Xuyang Diao, Huiqun Yu, Kang Yang, Liqiong Chen · 2019

Computer science

In order to improve software reliability, software defect prediction is applied to the process of software maintenance to identify potential bugs. Traditional methods of software defect prediction mainly focus on designing static code metr…

Improved Neural Machine Translation with a Syntax-Aware Encoder and Decoder

Huadong Chen, Shujian Huang, David Chiang, Jiajun Chen · 2017

Computer science Chemistry Philosophy

Most neural machine translation (NMT) models are based on the sequential encoder-decoder framework, which makes no use of syntactic information. In this paper, we improve this model by explicitly incorporating source-side syntactic trees. …

Modeling Source Syntax for Neural Machine Translation

Junhui Li, Deyi Xiong, Zhaopeng Tu, Muhua Zhu, Min Zhang, et al. · 2017

Computer science Philosophy Chemistry

Even though a linguistics-free sequence to sequence model in neural machine translation (NMT) has certain capability of implicitly learning syntactic information of source sentences, this paper shows that source syntax can be explicitly in…

Seml: A Semantic LSTM Model for Software Defect Prediction

Hongliang Liang, Yue Yu, Lin Jiang, Zhuosi Xie · 2019

Computer science Philosophy

Software defect prediction can assist developers in finding potential bugs and reducing maintenance cost. Traditional approaches usually utilize software metrics (Lines of Code, Cyclomatic Complexity, etc.) as features to build classifiers…

CODIT: Code Editing With Tree-Based Neural Models

Saikat Chakraborty, Yangruibo Ding, Miltiadis Allamanis, Baishakhi Ray · 2020

Computer science Mathematics Philosophy

The way developers edit day-to-day code tends to be repetitive, often using existing code elements. Many researchers have tried to automate repetitive code changes by learning from specific change templates which are applied to limited …

A deep tree-based model for software defect prediction

Hoa Khanh Dam, Trang Pham, Shien Wee Ng, Truyen Tran, John Grundy, et al. · 2018

Computer science Mathematics Physics

Defects are common in software systems and can potentially cause various problems to software users. Different methods have been developed to quickly predict the most likely locations of defects in large code bases. Most of them focus on d…

A Unified Syntax-aware Framework for Semantic Role Labeling

Zuchao Li, Shexia He, Jiaxun Cai, Zhuosheng Zhang, Hai Zhao, et al. · 2018

Computer science Chemistry

Semantic role labeling (SRL) aims to recognize the predicate-argument structure of a sentence. Syntactic information has been paid a great attention over the role of enhancing SRL. However, the latest advance shows that syntax would not be…

Retrieval-Based Neural Code Generation

Shirley Anugrah Hayati, Raphaël Olivier, Pravalika Avvaru, Pengcheng Yin, Anthony Tomasic, et al. · 2018

Computer science Mathematics

In models to generate program source code from natural language, representing this code in a tree structure has been a common approach. However, existing methods often fail to generate complex code correctly due to a lack of ability to mem…

Code Completion by Modeling Flattened Abstract Syntax Trees as Graphs

Yanlin Wang, Hui Li · 2021

Computer science Economics Political science

Code completion has become an essential component of integrated development environments. Contemporary code completion methods rely on the abstract syntax tree (AST) to generate syntactically correct code. However, they cannot fully captur…

Automated Correction for Syntax Errors in Programming Assignments using Recurrent Neural Networks

Sahil Bhatia, Rishabh Singh · 2016

Computer science

We present a method for automatically generating repair feedback for syntax errors for introductory programming problems. Syntax errors constitute one of the largest classes of errors (34%) in our dataset of student submissions obtained fr…

Correlating Neural and Symbolic Representations of Language

Grzegorz Chrupała, Afra Alishahi · 2019

Computer science Philosophy

Analysis methods which enable us to better understand the representations and functioning of neural models of language are increasingly needed as deep learning becomes the dominant approach in NLP. Here we present two methods based on R…

Stack-propagation: Improved Representation Learning for Syntax

Yuan Zhang, David J. Weiss · 2016

Computer science Political science

Traditional syntax models typically leverage part-of-speech (POS) information by constructing features from hand-tuned templates. We demonstrate that a better approach is to utilize POS tags as a regularizer of learned representations. We pr…

Language-Agnostic Representation Learning of Source Code from Structure and Context

Daniel Zügner, Tobias Kirschstein, Michele Catasta, Jure Leskovec, Stephan Günnemann · 2021

Computer science Political science Biology

Source code (Context) and its parsed abstract syntax tree (AST; Structure) are two complementary representations of the same computer program. Traditionally, designers of machine learning models have relied predominantly either on Structur…

Automated Classification of Overfitting Patches With Statically Extracted Code Features

He Ye, Jian Gu, Matías Martínez, Thomas Durieux, Martin Monperrus · 2021

Computer science

Automatic program repair (APR) aims to reduce the cost of manually fixing software defects. However, APR suffers from generating a multitude of overfitting patches, those patches that fail to correctly repair the defect beyond making th…

Unsupervised Translation of Programming Languages

Marie-Anne Lachaux, Baptiste Rozière, Lowik Chanussot, Guillaume Lample · 2020

Computer science

A transcompiler, also known as source-to-source translator, is a system that converts source code from a high-level programming language (such as C++ or Python) to another. Transcompilers are primarily used for interoperability, and to por…

code2seq: Generating Sequences from Structured Representations of Code

Uri Alon, Shaked Brody, Omer Levy, Eran Yahav · 2018

Computer science Mathematics Chemistry

The ability to generate natural language sequences from source code snippets has a variety of applications such as code summarization, documentation, and retrieval. Sequence-to-sequence (seq2seq) models, adopted from neural machine transla…

Getting More Out Of Syntax with PropS

Gabriel Stanovsky, Jessica Ficler, Ido Dagan, Yoav Goldberg · 2016

Computer science Political science Philosophy

Semantic NLP applications often rely on dependency trees to recognize major elements of the proposition structure of sentences. Yet, while much semantic structure is indeed expressed by syntax, many phenomena are not easily read out of dep…

Syntax in the Treetops

Shigeru Miyagawa · 2022

Computer science Psychology History

A proposal that syntax extends to the domain of discourse in making core syntax link to the conversational context. In Syntax in the Treetops, Shigeru Miyagawa proposes that syntax extends into the domain of discourse by making linkages be…

Deep Learning With Customized Abstract Syntax Tree for Bug Localization

Hongliang Liang, Lu Sun, Meilin Wang, Yuxing Yang · 2019

Computer science Mathematics

Given a bug report, bug localization technique can help developers automatically locate potential buggy files. Information retrieval and deep learning approaches have been applied in bug localization by extracting lexical features in bug r…

Software defect prediction using hybrid model (CBIL) of convolutional neural network (CNN) and bidirectional long short-term memory (Bi-LSTM)

Ahmed Bahaa Farid, E. Fathy, Ahmed Sharaf Eldin, Laila A. Abd-Elmegid · 2021

Computer science

In recent years, the software industry has invested substantial effort to improve software quality in organizations. Applying proactive software defect prediction will help developers and white box testers to find the defects earlier, and …

Abstract syntax tree ≈ Abstract syntax tree