Structural Language Models of Code

Authors: Uri Alon, Roy Sadaka, Omer Levy, Eran Yahav

ICML 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We evaluate SLMs on Java any-code completion, achieving a new state of the art: exact-match accuracy@1 of 18.04% and accuracy@5 of 24.83%..." See also Section 4 ("Experimental Setup"), Table 1 ("Results on any-code completion in Java"), and Section 6 ("Ablation Study").
Researcher Affiliation | Collaboration | ¹Technion, Israel; ²Tel Aviv University; ³Facebook AI Research.
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks labeled as such.
Open Source Code | Yes | "Our code, data, and trained models are available at http://github.com/tech-srl/slm-code-generation/."
Open Datasets | Yes | "We take the Java-small dataset of Alon et al. (2019a), which is a re-split of the dataset of Allamanis et al. (2016)." The paper also "extracted examples from the raw dataset of Allamanis et al. (2018) using their unseen projects test set."
Dataset Splits | Yes | "Ultimately, this dataset contains 1.3M/10k/20k train/dev/test examples" (the Java corpus); "This dataset contains 16k/8k/3k train/dev/test examples" (the C# corpus).
Hardware Specification | Yes | "We train the model end-to-end on a single V100 GPU, using cross entropy and the Adam optimizer (Kingma & Ba, 2015), an initial learning rate of 10⁻⁴ multiplied by 0.95 every 20k steps." (A minimal sketch of this recipe appears after the table.)
Software Dependencies | No | The paper mentions the Adam optimizer and OpenNMT (for baselines), but does not provide version numbers for key software dependencies (e.g., Python, PyTorch/TensorFlow, or other libraries) used in its own model's implementation.
Experiment Setup | Yes | "We use embeddings of size 512, 2 layers of LSTMs with 256 units, and 4 transformer layers with 8 attention heads"; "initial learning rate of 10⁻⁴ multiplied by 0.95 every 20k steps"; batch size is varied "such that each batch contains about 512 targets"; "dropout of 0.25 in the Transformer layers, and a recurrent dropout of 0.5 in the LSTMs." (These hyperparameters are consolidated in the config sketch below.)
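
To make the training recipe in the Hardware Specification row concrete, here is a minimal PyTorch sketch, assuming the scheduler is stepped once per training step. The placeholder model and the train_step helper are hypothetical; the paper does not tie the recipe to a specific framework.

```python
import torch

# Hypothetical placeholder for the SLM; the real architecture combines
# LSTMs and Transformer layers (see the Experiment Setup row).
model = torch.nn.Linear(512, 512)

# Adam with an initial learning rate of 1e-4, as reported in the paper.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

# Multiply the learning rate by 0.95 every 20k steps, assuming the
# scheduler advances once per optimizer step.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=20_000, gamma=0.95)

loss_fn = torch.nn.CrossEntropyLoss()  # "cross entropy" per the quoted setup

def train_step(inputs: torch.Tensor, targets: torch.Tensor) -> float:
    """One optimization step of the reported recipe (hypothetical helper)."""
    optimizer.zero_grad()
    loss = loss_fn(model(inputs), targets)
    loss.backward()
    optimizer.step()
    scheduler.step()  # advances the 20k-step decay schedule
    return loss.item()
```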
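
Similarly, the hyperparameters quoted in the Experiment Setup row can be gathered into a single configuration. The dict and its key names below are illustrative, not taken from the authors' released code.

```python
# Illustrative consolidation of the quoted hyperparameters; the key names
# are hypothetical, not the authors' actual configuration schema.
SLM_HYPERPARAMS = {
    "embedding_size": 512,
    "lstm_layers": 2,
    "lstm_units": 256,
    "transformer_layers": 4,
    "attention_heads": 8,
    "transformer_dropout": 0.25,
    "lstm_recurrent_dropout": 0.5,
    # Batch size varies so that each batch holds roughly this many targets.
    "targets_per_batch": 512,
    # Learning-rate schedule from the Hardware Specification row.
    "initial_learning_rate": 1e-4,
    "lr_decay_factor": 0.95,
    "lr_decay_every_steps": 20_000,
}
```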