Emergent Representations of Program Semantics in Language Models Trained on Programs

Authors: Charles Jin, Martin Rinard

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Specifically, we train a Transformer model on a synthetic corpus of programs written in a domain-specific language for navigating 2D grid world environments. Each program in the corpus is preceded by a (partial) specification in the form of several input-output grid world states. Despite providing no further inductive biases, we find that a probing classifier is able to extract increasingly accurate representations of the unobserved, intermediate grid world states from the LM hidden states over the course of training, suggesting the LM acquires an emergent ability to interpret programs in the formal sense. (A probing-classifier sketch follows the table.)
Researcher Affiliation | Academia | CSAIL, MIT, Cambridge, MA, USA. Correspondence to: Charles Jin <ccj@csail.mit.edu>.
Pseudocode | No | The paper contains mathematical equations and descriptive text for processes like the autoregressive loop and trace generation, but it does not include explicitly labeled 'Pseudocode' or 'Algorithm' blocks.
Open Source Code | Yes | To aid in reproducibility, we open source all our code, including the code we use to generate the training data, train the LM, and conduct the probing experiments, at https://github.com/charlesjin/emergent-semantics.
Open Datasets | No | Our training set consists of 500,000 randomly sampled Karel programs of lengths between 6 and 10, inclusive. For each program, we randomly sample 5 grid worlds as input, then evaluate the program to obtain 5 output grids. We create textual representations for Karel grid worlds by scanning the grid in row order, with one token per grid space. (A grid-serialization sketch follows the table.)
Dataset Splits | No | Our training set consists of 500,000 randomly sampled Karel programs of lengths between 6 and 10, inclusive. ... We also generate a test set of 10,000 specifications in the same manner, except that we sample reference programs of length between 1 and 10.
Hardware Specification | Yes | On a single NVIDIA A100 GPU with 80GB of VRAM, training takes around 8 days.
Software Dependencies | No | We train a 350M-parameter variant of the CodeGen architecture (Nijkamp et al., 2023) from the Hugging Face Transformers library (Wolf et al., 2020), implemented in PyTorch (Paszke et al., 2019). Specific version numbers for these libraries are not provided. (A model-instantiation sketch follows the table.)
Experiment Setup | Yes | We use the Adam optimizer, a learning rate of 5e-5, a block size of 2048, and a batch size of 32768 tokens. We train for 2.5 billion tokens, which was close to 6 passes over our training corpus; we did not observe any instabilities with training (see the results in the main text). We use a warm-up over roughly the first 3000 batches, then linearly decay the learning rate to 0 after 80000 batches (training runs over 76000 batches). (A training-schedule sketch follows the table.)
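The Research Type row describes probing the LM's hidden states for the unobserved intermediate grid world states. The excerpt does not specify the probe's implementation, so the following is a minimal sketch of a linear probe trained on frozen hidden states; the single-linear-layer architecture, the training loop, and all tensor shapes and hyperparameters are illustrative assumptions, not the authors' exact setup.

```python
import torch
import torch.nn as nn

# Minimal linear-probe sketch (assumed setup, not the paper's exact probe):
# given a frozen LM hidden state for a program token, predict a property of
# the unobserved intermediate grid world state (a categorical label).
class LinearProbe(nn.Module):
    def __init__(self, hidden_dim: int, num_classes: int):
        super().__init__()
        self.classifier = nn.Linear(hidden_dim, num_classes)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # hidden_states: (batch, hidden_dim), detached from the frozen LM.
        return self.classifier(hidden_states)

# Hypothetical training loop over precomputed (hidden_state, label) pairs.
def train_probe(probe, loader, epochs=5, lr=1e-3):
    opt = torch.optim.Adam(probe.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for h, y in loader:  # h: (B, hidden_dim), y: (B,)
            opt.zero_grad()
            loss = loss_fn(probe(h), y)
            loss.backward()
            opt.step()
    return probe
```

Tracking probe accuracy at checkpoints taken over the course of LM training is what lets the assessment speak of "increasingly accurate representations".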
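The Open Datasets and Dataset Splits rows describe the Karel corpus: 500,000 training programs of length 6 to 10 with 5 input/output grids each, a 10,000-specification test set with program lengths 1 to 10, and grid worlds serialized in row order with one token per grid space. The sketch below covers only the row-order serialization step; the cell-to-token vocabulary and the integer grid encoding are assumptions for illustration.

```python
# Sketch of row-order grid serialization (one token per grid space).
# The cell-to-token vocabulary below is assumed for illustration; the paper
# excerpt only states that grids are scanned in row order, one token per space.
from typing import List

CELL_TOKENS = {0: "<empty>", 1: "<wall>", 2: "<marker>", 3: "<robot>"}

def serialize_grid(grid: List[List[int]]) -> List[str]:
    """Scan the grid in row order and emit one token per grid space."""
    tokens = []
    for row in grid:
        for cell in row:
            tokens.append(CELL_TOKENS[cell])
    return tokens

# Example: a 2x3 grid containing a wall and the robot.
example = [[0, 1, 0],
           [3, 0, 0]]
print(serialize_grid(example))
# ['<empty>', '<wall>', '<empty>', '<robot>', '<empty>', '<empty>']
```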
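The Software Dependencies row names a 350M-parameter CodeGen variant from Hugging Face Transformers, trained in PyTorch, without pinned versions. Below is a minimal sketch of instantiating such a model from scratch; the `Salesforce/codegen-350M-mono` config name, the custom vocabulary size, and the `AutoConfig`/`AutoModelForCausalLM` wiring are assumptions about one reasonable way to do this, not the authors' released code.

```python
# Sketch: instantiate a 350M-parameter CodeGen-style model with fresh weights.
# Assumes the Hugging Face Transformers library; the checkpoint name used only
# to pull the architecture config and the custom vocab size are illustrative.
from transformers import AutoConfig, AutoModelForCausalLM

config = AutoConfig.from_pretrained("Salesforce/codegen-350M-mono")
config.vocab_size = 512  # hypothetical size of the Karel DSL + grid vocabulary

model = AutoModelForCausalLM.from_config(config)  # random init, trained from scratch
print(sum(p.numel() for p in model.parameters()) / 1e6, "M parameters")
```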
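The Experiment Setup row gives the optimizer and schedule: Adam at a learning rate of 5e-5, block size 2048, batch size 32768 tokens, warm-up over roughly the first 3000 batches, then linear decay to zero at 80000 batches, with the run ending near 76000 batches. The sketch below reproduces only that schedule shape; pairing Adam with the standard `get_linear_schedule_with_warmup` helper and the 16-sequences-per-batch arithmetic are assumptions, not the authors' training script.

```python
# Sketch of the reported optimizer / LR schedule (assumed wiring, not the
# authors' training script): Adam at 5e-5 with ~3000 warm-up batches and a
# linear decay to 0 at batch 80000; training stops around batch 76000.
import torch
from transformers import get_linear_schedule_with_warmup

def make_optimizer_and_schedule(model):
    optimizer = torch.optim.Adam(model.parameters(), lr=5e-5)
    scheduler = get_linear_schedule_with_warmup(
        optimizer,
        num_warmup_steps=3000,
        num_training_steps=80000,  # decay target; the actual run ends near 76000
    )
    return optimizer, scheduler

# Each optimizer step consumes 32768 tokens, i.e. 16 sequences of block size 2048.
BLOCK_SIZE = 2048
BATCH_TOKENS = 32768
SEQUENCES_PER_BATCH = BATCH_TOKENS // BLOCK_SIZE  # 16
```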