reproducibilityindex.ai

Learning to Compile Programs to Neural Networks

Authors: Logan Weber, Jesse Michel, Alex Renda, Michael Carbin

ICML 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We implement neural surrogate compilers using hypernetworks trained on a dataset of C programs and find they produce neural surrogates that are 1.91-9.50× as data-efficient and train in 4.31-7.28× fewer epochs than neural surrogates trained from scratch.
Researcher Affiliation	Academia	MIT CSAIL, Cambridge, MA. Correspondence to: Logan Weber <loganweb@mit.edu>.
Pseudocode	No	The paper does not contain any explicitly labeled "Pseudocode" or "Algorithm" blocks, nor does it present structured steps for a method or procedure formatted like code.
Open Source Code	No	The paper does not include an unambiguous statement from the authors about releasing their source code, nor does it provide a direct link to a code repository for the methodology described.
Open Datasets	Yes	To fulfill this requirement, we developed EXESTACK, a dataset of numerical, executable, deterministic C programs and corresponding input-output examples. EXESTACK is based on The Stack, a dataset of 3 TB of permissively licensed source code written in various programming languages scraped from GitHub (Kocetkov et al., 2022).
Dataset Splits	Yes	From the full set of programs, we created a train, validation, and test set using an 80/10/10 split.
Hardware Specification	Yes	GPU: NVIDIA Tesla T4 16GB
Software Dependencies	No	The paper mentions using the BERT-Tiny architecture and the Adam optimizer, but it does not specify version numbers for any software libraries, programming languages, or specific frameworks like PyTorch or TensorFlow.
Experiment Setup	Yes	COMPNETs are controlled by the following hyperparameters: program batch size, input batch size, learning rate, number of training epochs, dataset program split, dataset input split, and the surrogate topology.
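
The Dataset Splits and Experiment Setup rows can be made concrete with a small sketch. It is an illustration only and not the authors' code, since the paper releases none. The class `CompNetConfig`, its field names and types, and the helper `split_programs` are hypothetical names introduced here. The shuffle-then-cut strategy is an assumption, because the source states only that the split is 80/10/10.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class CompNetConfig:
    # Hypothetical container: the fields mirror the hyperparameters listed
    # in the Experiment Setup row. Types are assumptions, and no values
    # from the paper are reproduced here.
    program_batch_size: int
    input_batch_size: int
    learning_rate: float
    num_epochs: int
    dataset_program_split: tuple[float, float, float]
    dataset_input_split: tuple[float, float, float]
    surrogate_topology: tuple[int, ...]


def split_programs(programs, fractions=(0.8, 0.1, 0.1), seed=0):
    """Shuffle deterministically, then cut into train/validation/test.

    Assumption: a seeded random shuffle. The source states only the
    80/10/10 proportions, not how programs were assigned to each set.
    """
    items = list(programs)
    random.Random(seed).shuffle(items)
    n_train = int(fractions[0] * len(items))
    n_val = int(fractions[1] * len(items))
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


# Placeholder program names, not entries from EXESTACK.
programs = [f"prog_{i}.c" for i in range(100)]
train, val, test = split_programs(programs)
print(len(train), len(val), len(test))  # 80 10 10
```

Fixing the seed keeps the partition stable across runs, which matters when comparing surrogates trained from scratch against compiled ones on the same held-out programs.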