Convolutional Conditional Neural Processes

Authors: Jonathan Gordon, Wessel P. Bruinsma, Andrew Y. K. Foong, James Requeima, Yann Dubois, Richard E. Turner

ICLR 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate CONVCNPs in several settings, demonstrating that they achieve state-of-the-art performance compared to existing NPs. We demonstrate that building in translation equivariance enables zero-shot generalization to challenging, out-of-domain tasks.
Researcher Affiliation | Collaboration | Jonathan Gordon, University of Cambridge, jg801@cam.ac.uk; Wessel P. Bruinsma, University of Cambridge and Invenia Labs, wpb23@cam.ac.uk; Andrew Y. K. Foong, University of Cambridge, ykf21@cam.ac.uk; James Requeima, University of Cambridge and Invenia Labs, jrr41@cam.ac.uk; Yann Dubois, University of Cambridge, yanndubois96@gmail.com; Richard E. Turner, University of Cambridge and Microsoft Research, ret26@cam.ac.uk
Pseudocode | Yes | Figure 1: (a) Illustration of the CONVCNP forward pass in the off-the-grid case and pseudo-code for (b) off-the-grid and (c) on-the-grid data. (See the forward-pass sketch below the table.)
Open Source Code | Yes | Source code available at https://github.com/cambridge-mlg/convcnp.
Open Datasets | Yes | The PLAsTiCC data set (Allam Jr et al., 2018) is a simulation... We evaluate the model on four common benchmarks: MNIST (LeCun et al., 1998), SVHN (Netzer et al., 2011), and 32x32 and 64x64 CelebA (Liu et al., 2018).
Dataset Splits | No | The paper mentions 'training' and 'testing' procedures and describes how context/target points are sampled for batches (see the context-sampling sketch below the table). However, it does not explicitly provide dataset splits (e.g., percentages or counts) for training, validation, and test sets, nor does it state whether standard validation splits were used for the benchmark datasets.
Hardware Specification | No | The paper reports memory usage in VRAM (945 MB, 5839 MB, 1443 MB) and notes that ATTNCNP 'could not fit onto a 32GB GPU' (Section 5.4). While it specifies VRAM capacity, it does not identify the specific GPU or CPU models used for the experiments.
Software Dependencies | No | The paper mentions using Adam (Kingma & Ba, 2015) for optimization. However, it does not list specific versions for programming languages, machine learning frameworks (e.g., PyTorch, TensorFlow), or other libraries used in the implementation.
Experiment Setup | Yes | All models in this experiment were trained for 200 epochs using 256 batches per epoch with a batch size of 16. ... We use a learning rate of 3e-4 for all models, except for CONVCNPXL on the sawtooth data, where we use a learning rate of 1e-3. ... The weights are optimised using Adam (Kingma & Ba, 2015) with a learning rate of 5e-4. We use a maximum of 100 epochs, with early stopping using a patience of 15 epochs. (See the training-loop sketch below the table.)
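
The off-the-grid forward pass referenced in the Pseudocode row (Figure 1b) proceeds in three steps: embed the context set onto a uniform grid using an RBF set convolution that produces a density channel and a normalised data channel, apply a translation-equivariant CNN to the gridded representation, and smooth the CNN output back onto the target inputs to obtain predictive means and standard deviations. The sketch below is a minimal, hedged PyTorch rendition of that pipeline, not the authors' implementation (see https://github.com/cambridge-mlg/convcnp); the grid resolution, CNN depth, and channel widths are illustrative assumptions.

# Minimal sketch (PyTorch) of the off-the-grid CONVCNP forward pass (Figure 1b).
# Not the authors' implementation; grid resolution, CNN depth, and channel
# widths below are illustrative assumptions.
import torch
import torch.nn as nn


def rbf(dists, lengthscale):
    # RBF (EQ) kernel evaluated on signed distances.
    return torch.exp(-0.5 * (dists / lengthscale) ** 2)


class ConvCNP1d(nn.Module):
    def __init__(self, points_per_unit=32, channels=16):
        super().__init__()
        self.points_per_unit = points_per_unit
        # Learnable length-scales for the encoder and decoder set convolutions.
        init = torch.log(torch.tensor(2.0 / points_per_unit))
        self.log_scale_enc = nn.Parameter(init.clone())
        self.log_scale_dec = nn.Parameter(init.clone())
        # Translation-equivariant CNN acting on the gridded representation.
        self.cnn = nn.Sequential(
            nn.Conv1d(2, channels, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(channels, channels, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(channels, 2, kernel_size=5, padding=2),
        )

    def forward(self, x_ctx, y_ctx, x_trg):
        # x_ctx, y_ctx: (batch, n_ctx, 1); x_trg: (batch, n_trg, 1).
        # 1. Lay down a uniform grid covering both context and target inputs.
        lo = torch.min(torch.cat([x_ctx, x_trg], dim=1)) - 0.1
        hi = torch.max(torch.cat([x_ctx, x_trg], dim=1)) + 0.1
        n_grid = max(int((hi - lo).item() * self.points_per_unit), 8)
        t = torch.linspace(lo.item(), hi.item(), n_grid, device=x_ctx.device)

        # 2. Encoder set convolution: density channel plus normalised data channel.
        w = rbf(t[None, :, None] - x_ctx.transpose(1, 2), self.log_scale_enc.exp())
        density = w.sum(-1, keepdim=True)                          # (batch, n_grid, 1)
        signal = torch.bmm(w, y_ctx) / (density + 1e-8)            # (batch, n_grid, 1)
        h = torch.cat([density, signal], dim=-1).transpose(1, 2)   # (batch, 2, n_grid)

        # 3. CNN over the grid.
        f = self.cnn(h).transpose(1, 2)                            # (batch, n_grid, 2)

        # 4. Decoder set convolution: smooth grid values onto the target inputs.
        w_out = rbf(x_trg - t[None, None, :], self.log_scale_dec.exp())
        out = torch.bmm(w_out, f) / (w_out.sum(-1, keepdim=True) + 1e-8)
        mean, std = out[..., :1], nn.functional.softplus(out[..., 1:])
        return mean, std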
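
For the on-the-grid image benchmarks (Figure 1c), the context set is a random subset of pixels represented by a mask: the density channel is the mask itself and the signal channel is the masked image, and both are convolved before being passed to the CNN. The helper below is one possible rendering of that encoding; the context fraction, the fixed smoothing filter (the paper learns this first convolution), and the name on_the_grid_encoding are assumptions made for illustration.

import torch
import torch.nn.functional as F


def on_the_grid_encoding(image, context_frac=0.3, kernel_size=9):
    # image: (batch, C, H, W); the context is a random subset of pixels.
    b, c, h, w = image.shape
    mask = (torch.rand(b, 1, h, w, device=image.device) < context_frac).float()
    # Fixed depthwise smoothing filter standing in for the learnable first convolution.
    k = torch.ones(c + 1, 1, kernel_size, kernel_size, device=image.device)
    k = k / k.sum(dim=(-1, -2), keepdim=True)
    inp = torch.cat([mask, mask * image], dim=1)                 # density and signal channels
    out = F.conv2d(inp, k, padding=kernel_size // 2, groups=c + 1)
    density, signal = out[:, :1], out[:, 1:]
    # Normalise the signal by the density, as in the off-the-grid case,
    # then pass [density, signal] to the translation-equivariant CNN.
    return torch.cat([density, signal / (density + 1e-8)], dim=1)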
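
The quoted 1D experiment setup (200 epochs, 256 batches per epoch, batch size 16, Adam with learning rate 3e-4) corresponds to a standard maximum-likelihood training loop for conditional NPs. The sketch below assumes a hypothetical sample_task generator and the ConvCNP1d module above; it is illustrative only and does not reproduce the authors' data pipeline.

import torch


def train(model, sample_task, epochs=200, batches_per_epoch=256, lr=3e-4):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for _ in range(batches_per_epoch):
            # sample_task (hypothetical) returns a batch of 16 tasks:
            # context inputs/outputs and target inputs/outputs.
            x_ctx, y_ctx, x_trg, y_trg = sample_task(batch_size=16)
            mean, std = model(x_ctx, y_ctx, x_trg)
            # Conditional NPs are trained by maximising the factorised Gaussian
            # predictive log-likelihood of the target outputs.
            loss = -torch.distributions.Normal(mean, std).log_prob(y_trg).mean()
            opt.zero_grad()
            loss.backward()
            opt.step()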