Open Vocabulary Learning on Source Code with a Graph-Structured Cache

Authors: Milan Cvitkovic, Badal Singh, Animashree Anandkumar

ICML 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We empirically evaluated the utility of a Graph Structured Cache on two tasks: a code completion (a.k.a. fill in the blank) task and a variable naming task. We found that using a GSC improved performance on both tasks at the cost of an approximately 30% increase in training time. More precisely: even when using hyperparameters optimized for the baseline model, adding a GSC to a baseline model improved its accuracy by at least 7% on the fill in the blank task and 103% on the variable naming task. (The two tasks are illustrated in the first sketch following this table.)
Researcher Affiliation | Collaboration | 1) Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, California, USA; 2) Amazon Web Services, Seattle, Washington, USA. Correspondence to: Milan Cvitkovic <mcvitkov@caltech.edu>.
Pseudocode | No | The paper describes the model's procedure in narrative text and a diagram, but does not provide structured pseudocode or an algorithm block.
Open Source Code | Yes | Code to reproduce all experiments is available online at https://github.com/mwcvitkovic/Deep_Learning_On_Code_With_A_Graph_Vocabulary--Code_Preprocessor and https://github.com/mwcvitkovic/Deep_Learning_On_Code_With_A_Graph_Vocabulary.
Open Datasets | No | The paper states that its dataset was built from Java repositories in the Maven repository and refers to 'Supplementary Table 5 for the list' of repositories. However, the paper does not include Supplementary Table 5, does not link to its processed dataset or a public version of it, and does not cite a pre-existing public dataset that matches the experimental data.
Dataset Splits | Yes | We then separated out 15% of the files in the remaining 15 repositories to serve as our Seen Repos test set. The remaining files served as our training set, from which we separated 15% of the datapoints to act as a validation set. (A split of this form is sketched in the second code block following this table.)
Hardware Specification | No | The paper does not specify hardware details such as GPU models, CPU types, or memory; it only refers generally to a 'GPU' when discussing computational cost.
Software Dependencies | No | The paper mentions using Javaparser and Apache MXNet but does not provide version numbers for these dependencies, which are needed for reproducibility.
Experiment Setup | Yes | All hidden states in the GNN contained 64 units; all GNNs ran for 8 rounds of message passing; all models used a 2 layer Char CNN with max pooling to perform the name embedding; all models were optimized using the Adam optimizer (Kingma & Ba, 2015); all inputs to the GNNs were truncated to a maximum size of 500 nodes... The only regularization we used was early stopping... we tuned all hyperparameters on the Closed Vocab baseline model, and also did a small amount of extra learning rate exploration for the Pointer Sentinel baseline model... (The stated hyperparameters are collected in the final sketch following this table.)
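
For concreteness, here is a toy illustration of the two evaluation tasks mentioned in the Research Type row. The Java snippet, the <FILL-IN-THE-BLANK> and <NAME-ME> token names, and the dictionary layout are assumptions chosen for exposition, not the paper's actual data format.

```python
# Hypothetical illustration of the two evaluation tasks on a toy Java snippet.
# Token names and the dict layout are assumptions, not the paper's data format.

java_snippet = """
int total = 0;
for (int price : prices) {
    total += price;
}
return total;
"""

# Fill-in-the-blank: one usage of a variable is hidden, and the model must say
# which in-scope variable belongs at the blank.
fill_in_the_blank_example = {
    "input": java_snippet.replace("return total;", "return <FILL-IN-THE-BLANK>;"),
    "candidates": ["total", "price", "prices"],
    "target": "total",
}

# Variable naming: every usage of one variable is masked, and the model must
# produce the variable's name (the open-vocabulary setting the GSC targets).
variable_naming_example = {
    "input": java_snippet.replace("total", "<NAME-ME>"),
    "target_subtokens": ["total"],
}
```

The quoted split in the Dataset Splits row can be read as a two-stage procedure: a file-level hold-out for the Seen Repos test set, then a datapoint-level hold-out for validation. The sketch below, in plain Python, assumes lists of source-file paths and a hypothetical extract_datapoints function; only the 15% proportions and the file-vs-datapoint distinction come from the quoted text, and the Unseen Repos test set built from entirely held-out repositories is included as an assumption about the paper's broader evaluation setup.

```python
import random

random.seed(0)

def split_dataset(seen_repo_files, unseen_repo_files, extract_datapoints):
    """Sketch of the described split.

    seen_repo_files / unseen_repo_files: lists of source-file paths.
    extract_datapoints: hypothetical function mapping a file to its task datapoints.
    """
    # Files from entirely held-out repositories form the Unseen Repos test set.
    unseen_repos_test = [dp for f in unseen_repo_files for dp in extract_datapoints(f)]

    # 15% of the files from the remaining repositories form the Seen Repos test set.
    files = list(seen_repo_files)
    random.shuffle(files)
    n_test_files = int(0.15 * len(files))
    seen_repos_test = [dp for f in files[:n_test_files] for dp in extract_datapoints(f)]

    # The remaining files are training data; 15% of their datapoints become validation.
    train_datapoints = [dp for f in files[n_test_files:] for dp in extract_datapoints(f)]
    random.shuffle(train_datapoints)
    n_val = int(0.15 * len(train_datapoints))
    validation, training = train_datapoints[:n_val], train_datapoints[n_val:]

    return training, validation, seen_repos_test, unseen_repos_test
```

The Experiment Setup row pins down several architectural hyperparameters. The sketch below collects them into a configuration and shows a 2-layer character CNN with max pooling of the kind described. It is written in PyTorch rather than the Apache MXNet the authors used, and the character-vocabulary size, embedding width, kernel sizes, and learning rate are placeholders that are not reported in the excerpt.

```python
import torch
import torch.nn as nn

# Hyperparameters quoted in the Experiment Setup row; values marked as
# placeholders are NOT reported in the excerpt above.
CONFIG = {
    "gnn_hidden_units": 64,        # stated
    "message_passing_rounds": 8,   # stated
    "max_input_nodes": 500,        # stated: GNN inputs truncated to 500 nodes
    "optimizer": "adam",           # stated
    "learning_rate": 1e-3,         # placeholder
    "char_vocab_size": 100,        # placeholder
    "char_embedding_dim": 32,      # placeholder
}

class CharCNNNameEmbedder(nn.Module):
    """Sketch of a 2-layer character CNN with max pooling that embeds
    identifier names into the GNN's 64-dimensional hidden space."""

    def __init__(self, cfg=CONFIG):
        super().__init__()
        self.char_embed = nn.Embedding(cfg["char_vocab_size"], cfg["char_embedding_dim"])
        self.conv1 = nn.Conv1d(cfg["char_embedding_dim"], cfg["gnn_hidden_units"],
                               kernel_size=3, padding=1)
        self.conv2 = nn.Conv1d(cfg["gnn_hidden_units"], cfg["gnn_hidden_units"],
                               kernel_size=3, padding=1)
        self.pool = nn.AdaptiveMaxPool1d(1)  # max pooling over character positions

    def forward(self, char_ids):
        # char_ids: (batch, name_length) integer character indices
        x = self.char_embed(char_ids).transpose(1, 2)   # (batch, emb_dim, length)
        x = torch.relu(self.conv1(x))
        x = torch.relu(self.conv2(x))
        return self.pool(x).squeeze(-1)                 # (batch, 64)

optimizer = torch.optim.Adam(CharCNNNameEmbedder().parameters(),
                             lr=CONFIG["learning_rate"])
```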
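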
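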