Commit 2c602f9

Add papers.
Author: Miltos Allamanis
1 parent 014e184 commit 2c602f9

2 files changed: +24 -0 lines changed

Diff for: _publications/brody2020neural.markdown

@@ -0,0 +1,12 @@
+---
+layout: publication
+title: "Neural Edit Completion"
+authors: Shaked Brody, Uri Alon, Eran Yahav
+conference:
+year: 2020
+bibkey: brody2020neural
+additional_links:
+- {name: "ArXiV", url: "https://arxiv.org/abs/2005.13209"}
+tags: ["edit", "AST", "autocomplete"]
+---
+We address the problem of predicting edit completions based on a learned model that was trained on past edits. Given a code snippet that is partially edited, our goal is to predict a completion of the edit for the rest of the snippet. We refer to this task as the EditCompletion task and present a novel approach for tackling it. The main idea is to directly represent structural edits. This allows us to model the likelihood of the edit itself, rather than learning the likelihood of the edited code. We represent an edit operation as a path in the program's Abstract Syntax Tree (AST), originating from the source of the edit to the target of the edit. Using this representation, we present a powerful and lightweight neural model for the EditCompletion task. We conduct a thorough evaluation, comparing our approach to a variety of representation and modeling approaches that are driven by multiple strong models such as LSTMs, Transformers, and neural CRFs. Our experiments show that our model achieves 28% relative gain over state-of-the-art sequential models and 2× higher accuracy than syntactic models that learn to generate the edited code instead of modeling the edits directly. We make our code, dataset, and trained models publicly available.
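Reviewer note: the abstract above describes an edit operation as a path in the AST from the source of the edit to its target. The snippet below is only a minimal sketch of what such a path can look like. It uses Python's standard `ast` module and two hypothetical helpers (`path_to`, `ast_path`) that are not part of the paper or of this repository, and it makes no claim to match the authors' actual representation.

```python
import ast


def path_to(root, target):
    """Return the list of nodes from root down to target, or None if absent."""
    if root is target:
        return [root]
    for child in ast.iter_child_nodes(root):
        sub = path_to(child, target)
        if sub is not None:
            return [root] + sub
    return None


def ast_path(root, source, target):
    """Node-type names along the path source -> common ancestor -> target."""
    up = path_to(root, source)
    down = path_to(root, target)
    # Count the shared prefix; its last node is the lowest common ancestor.
    i = 0
    while i < min(len(up), len(down)) and up[i] is down[i]:
        i += 1
    climb = [type(n).__name__ for n in reversed(up[i:])]
    descend = [type(n).__name__ for n in down[i:]]
    return climb + [type(up[i - 1]).__name__] + descend


# Illustrative input (an assumption, not taken from the paper's dataset).
tree = ast.parse("x = foo(a, b)")
call = tree.body[0].value
first_arg, second_arg = call.args
example_path = ast_path(tree, first_arg, second_arg)
print(example_path)
```

Here the path runs from the first call argument up to the enclosing `Call` node and down to the second argument. A learned model in the spirit of the abstract would score such paths rather than the edited token sequence.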

Diff for: _publications/nair2020funcgnn.markdown

@@ -0,0 +1,12 @@
+---
+layout: publication
+title: "funcGNN: A Graph Neural Network Approach to Program Similarity"
+authors: Aravind Nair, Avijit Roy, Karl Meinke
+conference: ESEM
+year: 2020
+bibkey: nair2020funcgnn
+additional_links:
+- {name: "ArXiV", url: "https://arxiv.org/abs/2007.13239"}
+tags: ["GNN", "clone"]
+---
+Program similarity is a fundamental concept, central to the solution of software engineering tasks such as software plagiarism, clone identification, code refactoring and code search. Accurate similarity estimation between programs requires an in-depth understanding of their structure, semantics and flow. A control flow graph (CFG) is a graphical representation of a program which captures its logical control flow and hence its semantics. A common approach is to estimate program similarity by analysing CFGs using graph similarity measures, e.g. graph edit distance (GED). However, computing graph edit distance is an NP-hard problem and computationally expensive, making the application of graph similarity techniques to complex software programs impractical. This study intends to examine the effectiveness of graph neural networks to estimate program similarity, by analysing the associated control flow graphs. We introduce funcGNN, which is a graph neural network trained on labeled CFG pairs to predict the GED between unseen program pairs by utilizing an effective embedding vector. To our knowledge, this is the first time graph neural networks have been applied to labeled CFGs for estimating the similarity between high-level language programs. Results: We demonstrate the effectiveness of funcGNN in estimating the GED between programs. Our experimental analysis shows that it achieves a lower error rate (0.00194), runs faster (23 times faster than the quickest traditional GED approximation method), and scales better than state-of-the-art methods. funcGNN possesses the inductive learning ability to infer program structure and generalise to unseen programs. The graph embedding of a program proposed by our methodology could be applied to several related software engineering problems (such as code plagiarism and clone identification), thus opening multiple research directions.
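Reviewer note: the abstract above treats graph edit distance between control flow graphs as the quantity to predict and notes that computing it exactly is expensive. The sketch below is an illustration of that quantity only, not funcGNN and not any method from the paper. It assumes unlabeled directed graphs and unit cost for every node or edge insertion and deletion, and the function name `graph_edit_distance` and the two toy CFGs are hypothetical.

```python
from itertools import permutations


def graph_edit_distance(nodes_a, edges_a, nodes_b, edges_b):
    """Exact unit-cost GED between two tiny directed graphs by brute force."""
    a, b = list(nodes_a), list(nodes_b)
    edges_a, edges_b = set(edges_a), set(edges_b)
    # Each node of A maps to a distinct node of B, or to None (node deleted).
    candidates = b + [None] * len(a)
    best = None
    for image in permutations(candidates, len(a)):
        mapping = dict(zip(a, image))
        deleted = sum(1 for t in image if t is None)
        # Nodes of B that no node of A maps to must be inserted.
        inserted = len(b) - (len(a) - deleted)
        # Edges of A that survive the mapping and also exist in B.
        kept = {
            (mapping[u], mapping[v])
            for u, v in edges_a
            if mapping[u] is not None and mapping[v] is not None
        } & edges_b
        edge_cost = (len(edges_a) - len(kept)) + (len(edges_b) - len(kept))
        cost = deleted + inserted + edge_cost
        if best is None or cost < best:
            best = cost
    return best


# Toy CFGs (assumed for illustration): straight-line code vs. an `if` without `else`.
straight = ([0, 1, 2], [(0, 1), (1, 2)])
branch = ([0, 1, 2, 3], [(0, 1), (1, 2), (1, 3), (2, 3)])
ged = graph_edit_distance(*straight, *branch)
print(ged)
```

The search enumerates every node mapping, so its running time grows factorially with graph size. That blow-up is the practical obstacle the abstract refers to, and it is why a learned estimator of GED is attractive for larger programs.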
