Explanipedia

Multi Language Models for On-the-Fly Syntax Highlighting Open

Marco Edoardo Palma, Pooja Rani, Harald C. Gall · 2025

Syntax highlighting is a critical feature in modern software development environments, enhancing code readability and developer productivity. However, delivering accurate highlighting in real time remains challenging for online and web-bas…

Anchor Attention, Small Cache: Code Generation with Large Language Models Open

Xiangyu Zhang, Yu Zhou, Guang Yang, Harald C. Gall, Taolue Chen · 2024

Computer science

The development of large language models (LLMs) has revolutionized automated code generation. However, their high demand of computation resources has hindered a broader deployment and raised environmental concerns. A common strategy for di…

Toward granular search-based automatic unit test case generation Open

Fabiano Pecorelli, Giovanni Grano, Fabio Palomba, Harald C. Gall, Andrea De Lucia · 2024

Computer science Geology

Unit testing verifies the presence of faults in individual software components. Previous research has been targeting the automatic generation of unit tests through the adoption of random or search-based algorithms. Despite their effectiven…

Trustworthy Distributed Certification of Program Execution Open

F. Alexander Wolf, Marco Edoardo Palma, Pasquale Salza, Harald C. Gall · 2024

Computer science Political science

Verifying the execution of a program is complicated and often limited by the inability to validate the code's correctness. It is a crucial aspect of scientific research, where it is needed to ensure the reproducibility and validity of expe…

On-the-Fly Syntax Highlighting Generalisation and Speed-ups - Replication Package Open

Marco Edoardo Palma, F. Alexander Wolf, Pasquale Salza, Harald C. Gall · 2024

Computer science Philosophy

This replication package contains both the extended syntax highlighting dataset and the software needed to reproduce the study presented in the paper On-the-Fly Syntax Highlighting Generalisability and Speed-ups. It can be reused for futur…

DRIVE: Dockerfile Rule Mining and Violation Detection Open

Yu Zhou, Weilin Zhan, Zi Li, Tingting Han, Taolue Chen, et al. · 2023

Computer science Mathematics Political science

A Dockerfile defines a set of instructions to build Docker images, which can then be instantiated to support containerized applications. Recent studies have revealed a considerable amount of quality issues with Dockerfiles. In this paper, …

Towards Top-Down Automated Development in Limited Scopes: A Neuro-Symbolic Framework from Expressibles to Executables Open

Jian Gu, Harald C. Gall · 2023

Computer science Physics Philosophy

Deep code generation is a topic of deep learning for software engineering (DL4SE), which adopts neural models to generate code for the intended functions. Since end-to-end neural methods lack domain knowledge and software hierarchy awar…

On-the-fly syntax highlighting using neural networks Open

Marco Edoardo Palma, Pasquale Salza, Harald C. Gall · 2022

Computer science Biology

With the presence of online collaborative tools for software developers, source code is shared and consulted frequently, from code viewers to merge requests and code snippets. Typically, code highlighting quality in such scenarios is sa…

Continuous Deep Learning: A Workflow to Bring Models into Production Open

Janosch Baltensperger, Pasquale Salza, Harald C. Gall · 2022

Computer science

Researchers have been highly active to investigate the classical machine learning workflow and integrate best practices from the software engineering lifecycle. However, deep learning exhibits deviations that are not yet covered in this co…

Synthetic End-User Testing: Modeling Realistic Agents Based on Behavioral Examples Open

Pasquale Salza, Marco Edoardo Palma, Harald C. Gall · 2022

Computer science Mathematics Physics

For software interacting directly with real-world end-users, it is common practice to script scenario tests validating the system's compliance with a number of its features. However, these do not accommodate the replication of the type of …

On-the-Fly Syntax Highlighting Using Neural Networks - Replication Package (Data) Open

Marco Edoardo Palma, Pasquale Salza, Harald C. Gall · 2022

Computer science Biology

This dataset includes the data to replicate the study for the paper On-the-Fly Syntax Highlighting Using Neural Networks. It can be reused for future research in the field. We also include the detailed results obtained by executing our app…

On the Effectiveness of Transfer Learning for Code Search Open

Pasquale Salza, Christoph Schwizer, Jian Gu, Harald C. Gall · 2022

Computer science Art Physics

The Transformer architecture and transfer learning have marked a quantum leap in natural language processing, improving the state of the art across a range of text-based tasks. This paper examines how these advancements can be applied t…

On the Effectiveness of Transfer Learning for Code Search - Replication Package Open

Pasquale Salza, Christoph Schwizer, Jian Gu, Harald C. Gall · 2022

Computer science Biology

This repository represents the replication package for the paper On the Effectiveness of Transfer Learning for Code Search. The paper is published in the journal IEEE Transactions on Software Engineering (TSE). In this replication package,…

Toward Granular Automatic Unit Test Case Generation Open

Fabiano Pecorelli, Giovanni Grano, Fabio Palomba, Harald C. Gall, Andrea De Lucia · 2022

Computer science Engineering Economics

Unit testing verifies the presence of faults in individual software components. Previous research has been targeting the automatic generation of unit tests through the adoption of random or search-based algorithms. Despite their effectiven…

Assemble Foundation Models for Automatic Code Summarization Open

Jian Gu, Pasquale Salza, Harald C. Gall · 2022

Computer science Geography

Automatic code summarization is beneficial to daily software development since it could help reduce the requirement of manual writing. Currently, artificial intelligence is undergoing a paradigm shift. The foundation models pretrained on m…

Replication Package "Applying Test Case Prioritization to Software Microbenchmarks" Open

Christoph Laaber, Harald C. Gall, Philipp Leitner · 2021

Computer science Biology Engineering

Replication package for the paper "Applying Test Case Prioritization to Software Microbenchmarks" accepted for publication in Empirical Software Engineering.

Replication Package "Applying Test Case Prioritization to Software Microbenchmarks" Open

Christoph Laaber, Harald C. Gall, Philipp Leitner · 2021

Computer science Biology Business

Replication package for the paper "Applying Test Case Prioritization to Software Microbenchmarks" accepted for publication in Empirical Software Engineering.

On the Effectiveness of Transfer Learning for Code Search Open

Pasquale Salza, Christoph Schwizer, Jian Gu, Harald C. Gall · 2021

Computer science Physics Art

The Transformer architecture and transfer learning have marked a quantum leap in natural language processing, improving the state of the art across a range of text-based tasks. This paper examines how these advancements can be applied to a…

Adversarial Robustness of Deep Code Comment Generation Open

Yu Zhou, Xiaoqing Zhang, Juanjuan Shen, Tingting Han, Taolue Chen, et al. · 2021

Computer science Chemistry

Deep neural networks (DNNs) have shown remarkable performance in a variety of domains such as computer vision, speech recognition, or natural language processing. Recently they also have been applied to various software engineering tasks, …

Boosting API Recommendation With Implicit Feedback Open

Yu Zhou, Yang Xinying, Taolue Chen, Zhiqiu Huang, Xiaoxing Ma, et al. · 2021

Computer science Mathematics

Developers often need to use appropriate APIs to program efficiently, but it is usually a difficult task to identify the exact one they need from a vast of candidates. To ease the burden, a multitude of API recommendation approaches have b…

Dynamically reconfiguring software microbenchmarks: reducing execution time without sacrificing result quality Open

Christoph Laaber, Stefan Würsten, Harald C. Gall, Philipp Leitner · 2020

Computer science Geography

Executing software microbenchmarks, a form of small-scale performance tests predominantly used for libraries and frameworks, is a costly endeavor. Full benchmark suites take up to multiple hours or days to execute, rendering frequent check…

Configuration smells in continuous delivery pipelines: a linter and a six-month study on GitLab Open

Carmine Vassallo, Sebastian Proksch, Anna Jancso, Harald C. Gall, Massimiliano Di Penta · 2020

Computer science Engineering Philosophy

An effective and efficient application of Continuous Integration (CI) and Delivery (CD) requires software projects to follow certain principles and good practices. Configuring such a CI/CD pipeline is challenging and error-prone. Therefore…

Configuration Smells in Continuous Delivery Pipelines: A Linter and A Six-Month Study on GitLab Open

Carmine Vassallo, Sebastian Proksch, Anna Jancso, Harald C. Gall, Massimiliano Di Penta · 2020

Computer science Engineering

An effective and efficient application of Continuous Integration (CI) and Delivery (CD) requires software projects to follow certain principles and good practices. Configuring such a CI/CD pipeline is challenging and error-prone. Therefore…
