Lifelong Perceptual Programming By Example

Authors: Alexander L. Gaunt, Marc Brockschmidt, Nate Kushman, Daniel Tarlow

ICLR 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Empirically we show that this leads to a lifelong learning system that transfers knowledge to new tasks more effectively than baselines, and the performance on earlier tasks continues to improve even as the system learns on new, different tasks."
Researcher Affiliation | Industry | Alexander L. Gaunt, Marc Brockschmidt, Nate Kushman, Daniel Tarlow (Microsoft Research)
Pseudocode | Yes | "Figure 1: (NEURAL) TERPRET programs for counting symbols on a tape, with input-output examples. Both programs describe an interpreter with instructions to MOVE on the tape and READ the tape according to source code parametrized by instr. (left) A TERPRET program that counts 1s. (right) A NEURAL TERPRET program that additionally learns a classifier is_dinosaur." (see the interpreter sketch below)
Open Source Code | No | The paper does not provide an explicit statement about releasing its own source code for the described methodology, nor a link to a repository. It only links to the implementation of a baseline model: "We use the original authors' implementation available at https://github.com/tensorflow/models/tree/master/neural_gpu"
Open Datasets | Yes | "ADD2X2 scenario: The first scenario in Fig. 2(a) uses a 2x2 grid of MNIST digits... APPLY2X2 scenario: The second scenario in Fig. 2(b) presents a 2x2 grid of handwritten arithmetic operators." (see the grid-assembly sketch below)
Dataset Splits | Yes | "We detect convergence to the correct program by a rapid increase in the accuracy on a validation set (typically occurring after around 30k training examples)." (see the convergence-check sketch below)
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory, or cloud instances) used to run the experiments.
Software Dependencies | No | The paper mentions "TensorFlow (Abadi et al., 2016)" as part of its compilation process, but it does not specify a version number for TensorFlow or any other software dependency.
Experiment Setup | Yes | "We refer to the 2 networks in the shared library as net_0 and net_1. Both networks have similar architectures: they take a 28x28 monochrome image as input and pass this sequentially through two fully connected layers each with 256 neurons and ReLU activations. The last hidden vector is passed through a fully connected layer and a softmax to produce a 10 dimensional output (net_0) or 4 dimensional output (net_1) to feed to the differentiable interpreter." (see the network sketch below)
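
Short illustrative sketches for the technical rows above follow. First, the Pseudocode row quotes Figure 1's tape-counting interpreter. Below is a minimal sketch of its discrete semantics; the instruction encoding and helper names are assumptions, and the actual (NEURAL) TERPRET program is a differentiable model whose source code `instr` is learned, not plain Python.

```python
# Minimal sketch of the Figure 1 tape-counting interpreter (discrete version).
# The MOVE/READ codes and program layout are illustrative assumptions; in
# (NEURAL) TERPRET the `instr` parameters are inferred from input-output examples.
MOVE, READ = 0, 1

def run_program(instr, tape):
    """Run a straight-line program of MOVE/READ instructions over a 0/1 tape.

    instr: list of instruction codes (the "source code" TERPRET would learn).
    tape:  list of 0/1 symbols.
    Returns the count accumulated by READ instructions.
    """
    pos, count = 0, 0
    for op in instr:
        if op == MOVE:
            pos = (pos + 1) % len(tape)  # advance the tape head (wraps around)
        elif op == READ:
            count += tape[pos]           # add the symbol under the head
    return count

# A hand-written program that counts the 1s on a 4-cell tape.
tape = [1, 0, 1, 1]
program = [READ, MOVE] * len(tape)
print(run_program(program, tape))  # -> 3
```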
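The Open Datasets row describes 2x2 grids built from MNIST images. A hedged sketch of assembling one ADD2X2-style input follows; using torchvision for MNIST access is an assumption, as the paper does not state how its grids were generated.

```python
# Illustrative sketch: tile four 28x28 MNIST digits into one 56x56 grid image.
# The APPLY2X2 variant would tile handwritten operator glyphs instead of digits.
import numpy as np
from torchvision import datasets

mnist = datasets.MNIST(root="./data", train=True, download=True)

def make_2x2_grid(indices):
    """Return a (56, 56) float array tiling four digits, plus their labels."""
    imgs = [np.asarray(mnist[i][0], dtype=np.float32) / 255.0 for i in indices]
    labels = [mnist[i][1] for i in indices]
    top = np.hstack(imgs[:2])      # top-left, top-right
    bottom = np.hstack(imgs[2:])   # bottom-left, bottom-right
    return np.vstack([top, bottom]), labels

grid, labels = make_2x2_grid([0, 1, 2, 3])
print(grid.shape, labels)  # (56, 56) and the four digit labels
```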
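The Dataset Splits row says convergence to the correct program is detected by a rapid increase in validation accuracy. A minimal sketch of such a detector follows; the evaluation window and jump threshold are assumptions, not values from the paper.

```python
# Sketch of a convergence check: flag success once validation accuracy rises
# sharply within a short window of evaluations. `window` and `jump` are
# illustrative hyperparameters.
def converged(acc_history, window=5, jump=0.3):
    """True once accuracy improved by at least `jump` over the last `window` evals."""
    if len(acc_history) < window:
        return False
    recent = acc_history[-window:]
    return recent[-1] - recent[0] >= jump

val_accs = [0.10, 0.12, 0.11, 0.15, 0.55, 0.97]
print(converged(val_accs))  # True: a rapid jump signals the correct program
```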
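Finally, the Experiment Setup row fully specifies the two shared-library networks. A Keras sketch matching that description follows; the paper compiles its models to TensorFlow but does not provide this code, so the net_0/net_1 construction here is an assumption.

```python
# Sketch of the shared-library perceptual networks as described in the quote:
# 28x28 monochrome input -> two 256-unit ReLU layers -> softmax output.
import tensorflow as tf

def make_net(output_dim):
    """Build one library network with `output_dim` softmax classes."""
    return tf.keras.Sequential([
        tf.keras.Input(shape=(28, 28)),  # monochrome image
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(256, activation="relu"),
        tf.keras.layers.Dense(256, activation="relu"),
        tf.keras.layers.Dense(output_dim, activation="softmax"),
    ])

net_0 = make_net(10)  # 10-dimensional output (e.g., digit classes)
net_1 = make_net(4)   # 4-dimensional output (e.g., operator classes)
net_0.summary()
```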