reproducibilityindex.ai

Generative Code Modeling with Graphs

Authors: Marc Brockschmidt, Miltiadis Allamanis, Alexander L. Gaunt, Oleksandr Polozov

ICLR 2019

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	An experimental evaluation shows that our new model can generate semantically meaningful expressions, outperforming a range of strong baselines.
Researcher Affiliation	Industry	Marc Brockschmidt, Miltiadis Allamanis, Alexander Gaunt: Microsoft Research, Cambridge, UK, {mabrocks,miallama,algaunt}@microsoft.com. Oleksandr Polozov: Microsoft Research, Redmond, WA, USA, polozov@microsoft.com.
Pseudocode	Yes	Algorithm 1: Pseudocode for Expand. Input: Context c, partial AST a, node v to expand. Algorithm 2: Pseudocode for ComputeEdge. Input: Partial AST a, node v.
Open Source Code	Yes	We have released the code for this on https://github.com/Microsoft/graph-based-code-modelling.
Open Datasets	Yes	We have collected a dataset for our ExprGen task from 593 highly-starred open-source C# projects on GitHub, removing any near-duplicate files, following the work of Lopes et al. (2017). Samples from our dataset can be found in the supplementary material.
Dataset Splits	Yes	We split the data into four separate sets. A test-only dataset is made up from 100k samples generated from 114 projects. The remaining data we split into training-validation-test sets (3:1:1), keeping all expressions collected from a single source file within a single fold.
Hardware Specification	No	The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments.
Software Dependencies	No	The paper mentions software components like GRU, GGNN, and the C# compiler Roslyn, but does not provide specific version numbers for these or other software dependencies.
Experiment Setup	No	The paper describes the training objective as "maximum likelihood objective without pre-trained components" and mentions "beam search decoding with beam width 5", but it does not provide specific hyperparameter values (e.g., learning rate, batch size, number of epochs) or detailed optimizer settings needed for reproduction.
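
The Dataset Splits entry describes a 3:1:1 training-validation-test split in which all expressions from one source file stay in the same fold. The sketch below illustrates one way such a file-level split could be done. It is not the authors' procedure: hashing the file path to pick a fold is an assumption made here for determinism, the names `assign_fold` and `split_by_file` are hypothetical, and the separate test-only set drawn from 114 projects is not modeled.

```python
import hashlib
from collections import defaultdict


def assign_fold(file_path: str) -> str:
    """Map a source file path to a fold, in roughly 3:1:1 proportions.

    Hashing the path (an assumption, not taken from the paper) makes the
    assignment deterministic and independent of sample order.
    """
    digest = hashlib.sha256(file_path.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 5
    if bucket < 3:
        return "train"
    if bucket == 3:
        return "valid"
    return "test"


def split_by_file(samples):
    """Group (file_path, expression) pairs into folds by source file.

    Every expression from the same file lands in the same fold, so no
    file contributes samples to more than one fold.
    """
    folds = defaultdict(list)
    for file_path, expression in samples:
        folds[assign_fold(file_path)].append((file_path, expression))
    return folds
```

Because the fold depends only on the file path, the proportions are approximate rather than exact, and rerunning the split on the same files reproduces the same assignment.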