Learning to Represent Programs with Graphs

Authors: Miltiadis Allamanis, Marc Brockschmidt, Mahmoud Khademi

ICLR 2018

Reproducibility assessment. Each entry below lists a reproducibility variable, its result, and the supporting LLM response (quoted or paraphrased from the paper).
Research Type: Experimental
"We evaluate our models on a large dataset of 2.9 million lines of real-world source code, showing that our best model achieves 32.9% accuracy on the VARNAMING task and 85.5% accuracy on the VARMISUSE task, beating simpler baselines (cf. section 5)."
Researcher Affiliation: Collaboration
Miltiadis Allamanis, Microsoft Research, Cambridge, UK (miallama@microsoft.com); Marc Brockschmidt, Microsoft Research, Cambridge, UK (mabrocks@microsoft.com); Mahmoud Khademi, Simon Fraser University, Burnaby, BC, Canada (mkhademi@sfu.ca)
Pseudocode: No
The paper describes the methods for transforming source code into program graphs and the Gated Graph Neural Network (GGNN) model in detailed text, but it does not include any explicitly labeled pseudocode or algorithm blocks. (A minimal propagation sketch is given after this table.)
Open Source Code: Yes
"Our implementation of graph neural networks (on a simpler task) can be found at https://github.com/Microsoft/gated-graph-neural-network-samples and the dataset can be found at https://aka.ms/iclr18-prog-graphs-dataset. [...] Our (generic) implementation of GGNNs is available at https://github.com/Microsoft/gated-graph-neural-network-samples, using a simpler demonstration task."
Open Datasets: Yes
"Our implementation of graph neural networks (on a simpler task) can be found at https://github.com/Microsoft/gated-graph-neural-network-samples and the dataset can be found at https://aka.ms/iclr18-prog-graphs-dataset."
Dataset Splits: Yes
"We split the remaining 23 projects into train/validation/test sets in the proportion 60-10-30, splitting along files (i.e., all examples from one source file are in the same set)." (A file-level split sketch is given after this table.)
Hardware Specification: Yes
"Our TensorFlow (Abadi et al., 2016) implementation scales to 55 graphs per second during training and 219 graphs per second during test-time using a single NVIDIA GeForce GTX Titan X with graphs having on average 2,228 (median 936) nodes and 8,350 (median 3,274) edges and 8 GGNN unrolling iterations, all 20 edge types (forward and backward edges for 10 original edge types) and the size of the hidden layer set to 64."
Software Dependencies: No
The paper mentions using TensorFlow (Abadi et al., 2016) but does not specify a version number for TensorFlow or any other software dependencies, such as Python or specific libraries.
Experiment Setup: Yes
"Using the initial node representations, concatenated with an extra bit that is set to one for the candidate nodes v_{t,v}, we run GGNN propagation for 8 time steps. [...] the size of the hidden layer set to 64. [...] We train using a max-margin objective." (A sketch of the candidate bit and objective is given after this table.)
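
Since the paper provides no pseudocode, the following is a minimal NumPy sketch of GGNN propagation matching the quoted configuration (8 unrolling steps, 20 edge types, hidden size 64). All class, function, and variable names here are illustrative, not taken from the authors' code, and the weight initialization and GRU parameterization are assumptions.

```python
import numpy as np

# Hyperparameters taken from the quoted setup.
HIDDEN = 64           # size of the hidden layer per node
NUM_EDGE_TYPES = 20   # forward and backward edges for 10 original edge types
NUM_STEPS = 8         # GGNN unrolling iterations

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class GGNNSketch:
    """Illustrative Gated Graph Neural Network propagation (Li et al., 2016
    style), as described in the paper's text. Not the authors' implementation."""

    def __init__(self):
        # One message-passing weight matrix per edge type.
        self.edge_w = rng.normal(0.0, 0.1, (NUM_EDGE_TYPES, HIDDEN, HIDDEN))
        # GRU cell parameters: update gate z, reset gate r, candidate state.
        self.w_z = rng.normal(0.0, 0.1, (2 * HIDDEN, HIDDEN))
        self.w_r = rng.normal(0.0, 0.1, (2 * HIDDEN, HIDDEN))
        self.w_h = rng.normal(0.0, 0.1, (2 * HIDDEN, HIDDEN))

    def gru(self, msg, h):
        z = sigmoid(np.concatenate([msg, h], axis=-1) @ self.w_z)
        r = sigmoid(np.concatenate([msg, h], axis=-1) @ self.w_r)
        h_tilde = np.tanh(np.concatenate([msg, r * h], axis=-1) @ self.w_h)
        return (1.0 - z) * h + z * h_tilde

    def propagate(self, h, edges):
        # h: (num_nodes, HIDDEN) initial node states.
        # edges: list of (source, target, edge_type) triples.
        for _ in range(NUM_STEPS):
            msg = np.zeros_like(h)
            for src, tgt, etype in edges:
                # Each node sends its state through the edge-type matrix.
                msg[tgt] += h[src] @ self.edge_w[etype]
            h = self.gru(msg, h)  # gated state update per node
        return h
```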
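For the 60-10-30 file-level split, a sketch of the described procedure, assuming examples arrive as (file_path, example) pairs; the shuffle seed and data layout are assumptions, not from the paper.

```python
import random
from collections import defaultdict

def split_by_file(examples, seed=0):
    """Split examples 60/10/30 into train/validation/test along files,
    so that all examples from one source file land in the same set."""
    by_file = defaultdict(list)
    for path, ex in examples:
        by_file[path].append(ex)

    files = sorted(by_file)
    random.Random(seed).shuffle(files)

    n = len(files)
    train_files = files[: int(0.6 * n)]
    valid_files = files[int(0.6 * n) : int(0.7 * n)]
    test_files = files[int(0.7 * n) :]

    pick = lambda fs: [ex for f in fs for ex in by_file[f]]
    return pick(train_files), pick(valid_files), pick(test_files)
```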
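The experiment-setup quote combines a candidate-marker bit on the initial node states with a max-margin training objective. Below is a hedged sketch of both pieces, assuming each candidate receives a scalar score and a margin of 1.0; the margin value and the scoring function are assumptions, as the quote does not specify them.

```python
import numpy as np

def append_candidate_bit(node_states, candidate_ids):
    """Concatenate an extra bit onto every initial node state; the bit is 1
    for the candidate nodes v_{t,v} and 0 otherwise, per the quoted setup."""
    bit = np.zeros((node_states.shape[0], 1))
    bit[candidate_ids] = 1.0
    return np.concatenate([node_states, bit], axis=1)

def max_margin_loss(scores, correct_idx, margin=1.0):
    """Hinge-style max-margin objective: the correct candidate's score should
    exceed every other candidate's by at least `margin` (assumed to be 1.0)."""
    diffs = margin - (scores[correct_idx] - np.delete(scores, correct_idx))
    return np.maximum(0.0, diffs).sum()
```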