Learning to Represent Programs with Graphs
Authors: Miltiadis Allamanis, Marc Brockschmidt, Mahmoud Khademi
ICLR 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our models on a large dataset of 2.9 million lines of real-world source code, showing that our best model achieves 32.9% accuracy on the VARNAMING task and 85.5% accuracy on the VARMISUSE task, beating simpler baselines (cf. section 5). |
| Researcher Affiliation | Collaboration | Miltiadis Allamanis, Microsoft Research, Cambridge, UK (miallama@microsoft.com); Marc Brockschmidt, Microsoft Research, Cambridge, UK (mabrocks@microsoft.com); Mahmoud Khademi, Simon Fraser University, Burnaby, BC, Canada (mkhademi@sfu.ca) |
| Pseudocode | No | The paper describes the methods for transforming source code into program graphs and the Gated Graph Neural Network (GGNN) model in detailed text, but it does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our implementation of graph neural networks (on a simpler task) can be found at https://github.com/Microsoft/gated-graph-neural-network-samples and the dataset can be found at https://aka.ms/iclr18-prog-graphs-dataset. [...] Our (generic) implementation of GGNNs is available at https://github.com/Microsoft/gated-graph-neural-network-samples, using a simpler demonstration task. |
| Open Datasets | Yes | Our implementation of graph neural networks (on a simpler task) can be found at https://github.com/Microsoft/gated-graph-neural-network-samples and the dataset can be found at https://aka.ms/iclr18-prog-graphs-dataset. |
| Dataset Splits | Yes | We split the remaining 23 projects into train/validation/test sets in the proportion 60-10-30, splitting along files (i.e., all examples from one source file are in the same set). |
| Hardware Specification | Yes | Our TensorFlow (Abadi et al., 2016) implementation scales to 55 graphs per second during training and 219 graphs per second during test-time using a single NVIDIA GeForce GTX Titan X with graphs having on average 2,228 (median 936) nodes and 8,350 (median 3,274) edges and 8 GGNN unrolling iterations, all 20 edge types (forward and backward edges for 10 original edge types) and the size of the hidden layer set to 64. |
| Software Dependencies | No | The paper mentions using 'TensorFlow (Abadi et al., 2016)' but does not specify a version number for TensorFlow or any other software dependencies, such as Python or specific libraries. |
| Experiment Setup | Yes | Using the initial node representations, concatenated with an extra bit that is set to one for the candidate nodes vt,v, we run GGNN propagation for 8 time steps. [...] the size of the hidden layer set to 64. [...] We train using a max-margin objective. |
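The dataset-split row quotes a 60-10-30 train/validation/test split made along files, so that all examples from one source file land in the same set. The paper does not say how the file-level assignment was implemented; the sketch below is one hypothetical way to get a deterministic per-file assignment by hashing the file path (the `split_for` name and the hash-bucketing scheme are assumptions, not from the paper).

```python
import hashlib

def split_for(path, train=0.6, valid=0.1):
    # Hypothetical scheme: map the file path to a deterministic bucket in
    # [0, 1), so every example from one source file gets the same split.
    bucket = int(hashlib.md5(path.encode("utf-8")).hexdigest(), 16) / 16**32
    if bucket < train:
        return "train"
    if bucket < train + valid:
        return "valid"
    return "test"
```

Because the bucket depends only on the path, re-running the split (or adding new examples from an already-seen file) never moves a file between sets.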
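The setup rows describe GGNN propagation run for 8 time steps over graphs with 20 edge types and a hidden size of 64. As a rough illustration of that propagation scheme (per-edge-type linear message functions aggregated by sum, followed by a GRU-style state update), here is a toy NumPy sketch; the random weights, the 5-node example graph, and the bias-free GRU are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
H, T, E, N = 64, 8, 20, 5  # hidden size, steps, edge types (paper values); toy node count

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Per-edge-type message weights and GRU parameters (toy random init, no biases).
W_msg = rng.normal(scale=0.1, size=(E, H, H))
Wz, Uz, Wr, Ur, Wh, Uh = (rng.normal(scale=0.1, size=(H, H)) for _ in range(6))

# Toy graph: edges[e] lists (src, dst) pairs for edge type e.
edges = {0: [(0, 1), (1, 2)], 3: [(2, 3), (3, 4)], 7: [(4, 0)]}

h = rng.normal(scale=0.1, size=(N, H))  # initial node states

for _ in range(T):  # 8 unrolled propagation steps
    m = np.zeros((N, H))  # messages aggregated per destination node
    for e, pairs in edges.items():
        for src, dst in pairs:
            m[dst] += h[src] @ W_msg[e]
    # GRU-style update of node states from aggregated messages.
    z = sigmoid(m @ Wz + h @ Uz)
    r = sigmoid(m @ Wr + h @ Ur)
    h_tilde = np.tanh(m @ Wh + (r * h) @ Uh)
    h = (1 - z) * h + z * h_tilde
```

After the loop, `h` holds one 64-dimensional state per node, which downstream tasks (e.g. scoring VARMISUSE candidates) would read out.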
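The setup row also quotes training with a max-margin objective over candidate nodes. The paper does not spell out the loss formula in this excerpt; a common hinge-style version (the margin value and the max over negatives are assumptions here) looks like:

```python
import numpy as np

def max_margin_loss(scores, correct_idx, margin=1.0):
    # Hinge loss: penalize the hardest wrong candidate that scores within
    # `margin` of the correct one. `margin=1.0` is an assumed default.
    negatives = np.delete(scores, correct_idx)
    return float(np.maximum(0.0, margin - scores[correct_idx] + negatives).max())
```

The loss is zero once the correct candidate outscores every alternative by at least the margin, and grows linearly with the violation otherwise.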