Explanipedia

Resolving Build Conflicts via Example-Based and Rule-Based Program Transformations Open

Sheikh Shadab Towqir, Fei He, Todd Mytkowicz, Na Meng · 2025

Merge conflicts often arise when developers integrate changes from different software branches. The conflicts can result from overlapping edits in programs (i.e., textual conflicts) or cause build and test errors (i.e., build and test conf…

Program merge conflict resolution via neural transformers Open

A. Svyatkovskiy, Sarah Fakhoury, Negar Ghorbani, Todd Mytkowicz, Elizabeth Dinella , et al. · 2022

Computer science

Collaborative software development is an integral part of the modern software development life cycle, essential to the success of large-scale software projects. When multiple developers make concurrent changes around the same lines of c…

TOGA Open

Elizabeth Dinella, Gabriel Ryan, Todd Mytkowicz, Shuvendu K. Lahiri · 2022

Computer science Biology

Testing is widely recognized as an important stage of the software development lifecycle. Effective software testing can provide benefits such as bug finding, preventing regressions, and documentation. In terms of documentation, unit te…

CodeExp: Explanatory Code Document Generation Open

Haotian Cui, Chenglong Wang, Junjie Huang, Jeevana Priya Inala, Todd Mytkowicz , et al. · 2022

Computer science Geology Economics

Developing models that can automatically generate detailed code explanation can greatly benefit software maintenance and programming education. However, existing code-to-text generation models often produce only high-level summaries of cod…

Can Pre-trained Language Models be Used to Resolve Textual and Semantic Merge Conflicts? Open

Jialu Zhang, Todd Mytkowicz, Mike Kaufman, Ružica Piskač, Shuvendu K. Lahiri · 2021

Computer science

Program merging is standard practice when developers integrate their individual changes to a common code base. When the merge algorithm fails, this is called a merge conflict. The conflict either manifests in textual merge conflicts where …

Synthesizing Collective Communication Algorithms for Heterogeneous Networks with TACCL Open

Aashaka Shah, Vijay Chidambaram, Meghan Cowan, Saeed Maleki, Madan Musuvathi , et al. · 2021

Computer science

Large ML models and datasets have necessitated the use of multi-GPU systems for distributed model training. To harness the power offered by multi-GPU systems, it is critical to eliminate bottlenecks in inter-GPU communication - a problem m…

TACCL: Guiding Collective Algorithm Synthesis using Communication Sketches Open

Aashaka Shah, Vijay Chidambaram, Meghan Cowan, Saeed Maleki, Madan Musuvathi , et al. · 2021

Computer science

Machine learning models are increasingly being trained across multiple GPUs and servers. In this setting, data is transferred between GPUs using communication collectives such as AlltoAll and AllReduce, which can become a significant bottl…

Understanding the Efficiency of Social Tagging Systems Using Information Theory Open

Ed H., Todd Mytkowicz · 2021

Computer science Geography Psychology

Given the rise in popularity of social tagging systems, it seems only natural to ask how efficient is the organically evolved tagging vocabulary in describing any underlying document objects? Does this distributed process really provide a …

Neural Unit Test Suggestions. Open

Elizabeth Dinella, Shuvendu K. Lahiri, Todd Mytkowicz, Gabriel Ryan · 2021

Computer science Biology

Testing is widely recognized as an important stage of the software development lifecycle. Effective software testing can provide benefits such as documentation, bug finding, and preventing regressions. In particular, unit tests document a …

DeepMerge: Learning to Merge Programs Open

Elizabeth Dinella, Todd Mytkowicz, A. Svyatkovskiy, Christian Bird, Mayur Naik , et al. · 2021

Computer science

In collaborative software development, program merging is the mechanism to integrate changes from multiple programmers. Merge algorithms in modern version control systems report a conflict when changes interfere textually. Merge conflicts …

Breaking the Computation and Communication Abstraction Barrier in Distributed Machine Learning Workloads Open

Abhinav Jangda, Jun Huang, Guodong Liu, Amir Hossein Nodehi Sabet, Saeed Maleki , et al. · 2021

Computer science Philosophy

The recent trend towards increasingly large machine learning models requires both training and inference tasks to be distributed. Considering the huge cost of training these models, it is imperative to unlock optimizations in computation and comm…

Distributed Training of Embeddings using Graph Analytics Open

Gurbinder Gill, Roshan Dathathri, Saeed Maleki, Madan Musuvathi, Todd Mytkowicz , et al. · 2021

Computer science

Many applications today, such as NLP, network analysis, and code analysis, rely on semantically embedding objects into low-dimensional fixed-length vectors. Such embeddings naturally provide a way to perform useful downstream tasks, such a…

Scaling Distributed Training with Adaptive Summation Open

Saeed Maleki, Madan Musuvathi, Todd Mytkowicz, Olli Saarikivi, Tianju Xu , et al. · 2020

Computer science Mathematics Geography

Stochastic gradient descent (SGD) is an inherently sequential training algorithm--computing the gradient at batch $i$ depends on the model parameters learned from batch $i-1$. Prior approaches that break this dependence do not honor them (…

Niijima Open

Guoqing Xu, Margus Veanes, Michael P. Barnett, Madan Musuvathi, Todd Mytkowicz , et al. · 2019

Computer science

Multilingual data-parallel pipelines, such as Microsoft's Scope and Apache Spark, are widely used in real-world analytical tasks. While the involvement of multiple languages (often including both managed and native languages) provides much…

Distributed Word2Vec using Graph Analytics Frameworks. Open

Gurbinder Gill, Roshan Dathathri, Saeed Maleki, Madan Musuvathi, Todd Mytkowicz , et al. · 2019

Computer science

Word embeddings capture semantic and syntactic similarities of words, represented as vectors. Word2Vec is a popular implementation of word embeddings; it takes as input a large corpus of text and learns a model that maps unique words in th…

CHET: Compiler and Runtime for Homomorphic Evaluation of Tensor Programs Open

Roshan Dathathri, Olli Saarikivi, Hao Chen, Kim Laine, Kristin Lauter , et al. · 2018

Computer science

Fully Homomorphic Encryption (FHE) refers to a set of encryption schemes that allow computations to be applied directly on encrypted data without requiring a secret key. This enables novel application scenarios where a client can safely of…

High Five: Improving Gesture Recognition by Embracing Uncertainty Open

Diman Zad Tootaghaj, Adrian Sampson, Todd Mytkowicz, Kathryn S. McKinley · 2017

Computer science

Sensors on mobile devices (accelerometers, gyroscopes, pressure meters, and GPS) invite new applications in gesture recognition, gaming, and fitness tracking. However, programming them remains challenging because human gestures captured …

Debugging probabilistic programs Open

Chandrakana Nandi, Dan Grossman, Adrian Sampson, Todd Mytkowicz, Kathryn S. McKinley · 2017

Computer science Mathematics Art

Many applications compute with estimated and uncertain data. While advances in probabilistic programming help developers build such applications, debugging them remains extremely challenging. New types of errors in probabilistic programs i…

Parallel Stochastic Gradient Descent with Sound Combiners Open

Saeed Maleki, Madanlal Musuvathi, Todd Mytkowicz · 2017

Computer science Mathematics Physics

Stochastic gradient descent (SGD) is a well-known method for regression and classification tasks. However, it is an inherently sequential algorithm: at each step, the processing of the current example depends on the parameters learned from …

Jumping the ORDER BY Barrier in Large-Scale Pattern Matching Open

Daniel Lupei, Mike Barnett, Saeed Maleki, Madan Musuvathi, Todd Mytkowicz · 2017

Computer science Mathematics Geography

Event-series pattern matching is a major component of large-scale data analytics pipelines, enabling a wide range of system diagnostics tasks. A precursor to pattern matching is an expensive "shuffle the world" stage wherein data are orde…

Efficient parallelization using rank convergence in dynamic programming algorithms Open

Saeed Maleki, Madanlal Musuvathi, Todd Mytkowicz · 2016

Computer science Mathematics

This paper proposes an efficient parallel algorithm for an important class of dynamic programming problems that includes Viterbi, Needleman-Wunsch, Smith-Waterman, and Longest Common Subsequence. In dynamic programming, the subproblems t…

Low-Rank Methods for Parallelizing Dynamic Programming Algorithms Open

Saeed Maleki, Madanlal Musuvathi, Todd Mytkowicz · 2016

Computer science Mathematics

This article proposes efficient parallel methods for an important class of dynamic programming problems that includes Viterbi, Needleman-Wunsch, Smith-Waterman, and Longest Common Subsequence. In dynamic programming, the subproblems that d…

Guest Editors' Introduction: Approximate Computing Open

Qiang Xu, Todd Mytkowicz, Nam Sung Kim · 2016

Computer science Mathematics Philosophy

Classical technology scaling, also known as Dennard's scaling, has tremendously improved computer performance over the past decades, which in turn has enabled countless innovative applications benefiting our daily lives today. However…

Approximate and Probabilistic Computing: Design, Coding, Verification (Dagstuhl Seminar 15491) Open

Antonio Filieri, Marta Kwiatkowska, Saša Misailović, Todd Mytkowicz · 2016

Computer science Mathematics

Computing has entered the era of approximation, in which hardware and software generate and reason about estimates. Navigation applications turn maps and location estimates from hardware GPS sensors into driving directions; speech recognit…

Yinyang K-Means: A Drop-In Replacement of the Classic K-Means with Consistent Speedup Open

Yufei Ding, Yue Zhao, Xipeng Shen, Madanlal Musuvathi, Todd Mytkowicz · 2015

Computer science Mathematics Economics

This paper presents Yinyang K-means, a new algorithm for K-means clustering. By clustering the centers in the initial stage, and leveraging efficiently maintained lower and upper bounds between a point and centers, it more effectively av…

InterPoll: Crowd-Sourced Internet Polls Open

Benjamin Livshits, Todd Mytkowicz · 2015

Computer science Engineering

Crowd-sourcing is increasingly being used to provide answers to online polls and surveys. However, existing systems, while taking care of the mechanics of attracting crowd workers, poll building, and payment, provide little to help the sur…

Todd Mytkowicz