Neural Functional Programming

Authors: John K. Feser, Marc Brockschmidt, Alexander L. Gaunt, Daniel Tarlow

ICLR 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Our empirical evaluation shows that this language allows to learn far more programs than existing baselines."
Researcher Affiliation | Collaboration | John K. Feser, Massachusetts Institute of Technology (feser@csail.mit.edu); Marc Brockschmidt, Alexander L. Gaunt, Daniel Tarlow, Microsoft Research ({mabrocks,t-algaun,dtarlow}@microsoft.com)
Pseudocode | Yes | "function FOLDLI(list, acc, func): idx ← 0; for ele in list do acc ← func(acc, ele, idx); idx ← idx + 1; return acc" (a runnable sketch follows the table)
Open Source Code | No | "We aim to release TerpreT, together with these models, under an open source license in the near future."
Open Datasets | No | The paper mentions generating input/output pairs for tasks ("For all tasks, three groups of five input/output example pairs were sampled as training data"), but it does not name, cite, or link a publicly available dataset that others could access.
Dataset Splits | No | The paper specifies training and test splits ("three groups of five input/output example pairs were sampled as training data and another 25 input/output pairs as test data") but does not mention a separate validation set (see the sampling sketch after the table).
Hardware Specification | No | The paper does not provide any details about the hardware (e.g., GPU model, CPU type, memory) used to run the experiments.
Software Dependencies | No | "All of our models are implemented in TerpreT (Gaunt et al., 2016b) and we learn using TerpreT's TensorFlow (Abadi et al., 2015) backend." Software names (TerpreT, TensorFlow) are mentioned, but specific version numbers are not provided.
Experiment Setup | Yes | "After training for 3500 epochs (tests with longer training runs showed no significant changes in the outcomes)... We ran the remaining experiments with the best configuration obtained by this process: the RMSProp optimization algorithm, a learning rate of 0.1, clipped gradients at 1, and no gradient noise." (a configuration sketch follows the table)
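
The FOLDLI pseudocode quoted in the Pseudocode row translates directly into a short, runnable function. The Python version below is a minimal sketch; the names follow the paper's pseudocode, and the usage example is illustrative only.

```python
def foldli(lst, acc, func):
    """Left fold that also passes the element index to func,
    mirroring the FOLDLI pseudocode quoted in the table above."""
    idx = 0
    for ele in lst:
        acc = func(acc, ele, idx)
        idx = idx + 1
    return acc

# Usage example: index-weighted sum of the elements.
print(foldli([3, 4, 5], 0, lambda acc, ele, idx: acc + ele * idx))  # prints 14
```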
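
The Dataset Splits row quotes the sampling regime: three groups of five input/output training pairs plus 25 test pairs per task. The sketch below illustrates only those split sizes on a toy task (the index-weighted sum from the foldli sketch above); the paper's actual benchmark tasks and input distributions are not specified here, so everything beyond the counts is an assumption.

```python
# Illustrative sketch of the reported split sizes only: 3 groups of 5
# training pairs and 25 test pairs per task. The toy target program and
# the input distribution are assumptions, not the paper's benchmarks.
import random

def target_program(lst):
    # Stand-in ground-truth program (index-weighted sum, as in the foldli sketch).
    return sum(idx * ele for idx, ele in enumerate(lst))

def sample_pair(length=5, lo=0, hi=9):
    xs = [random.randint(lo, hi) for _ in range(length)]
    return xs, target_program(xs)

train_groups = [[sample_pair() for _ in range(5)] for _ in range(3)]  # 3 x 5 training pairs
test_pairs = [sample_pair() for _ in range(25)]                       # 25 test pairs
```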
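
The Experiment Setup row reports the chosen hyperparameters: RMSProp, learning rate 0.1, gradients clipped at 1, no gradient noise, 3500 epochs. The sketch below wires those values into a generic TensorFlow training loop under stated assumptions: it uses the modern tf.keras API rather than the TerpreT backend the authors used, it assumes clipping by value (the paper does not say value vs. norm), and the model, loss, and data are placeholders rather than the paper's differentiable programs.

```python
# Hedged sketch of the reported training configuration, not the paper's
# TerpreT model: RMSProp, learning rate 0.1, gradient clipping at 1
# (assumed to be by value), no gradient noise, 3500 epochs.
import tensorflow as tf

optimizer = tf.keras.optimizers.RMSprop(learning_rate=0.1, clipvalue=1.0)
model = tf.keras.Sequential([tf.keras.layers.Dense(1)])  # placeholder model
loss_fn = tf.keras.losses.MeanSquaredError()             # placeholder loss

x = tf.random.normal([15, 4])   # placeholder inputs (e.g., 3 groups x 5 examples)
y = tf.random.normal([15, 1])   # placeholder outputs

for epoch in range(3500):
    with tf.GradientTape() as tape:
        loss = loss_fn(y, model(x, training=True))
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
```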