Approximately Equivariant Neural Processes

Authors: Matthew Ashman, Cristiana Diaconu, Adrian Weller, Wessel Bruinsma, Richard Turner

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate the effectiveness of our approach on a number of synthetic and real-world regression experiments, showing that approximately equivariant NP models can outperform both their non-equivariant and strictly equivariant counterparts.
Researcher Affiliation | Collaboration | Matthew Ashman (University of Cambridge, mca39@cam.ac.uk); Cristiana Diaconu (University of Cambridge, cdd43@cam.ac.uk); Adrian Weller (University of Cambridge and The Alan Turing Institute, aw665@cam.ac.uk); Wessel Bruinsma (Microsoft Research AI for Science, wessel.p.bruinsma@gmail.com); Richard E. Turner (University of Cambridge, Microsoft Research AI for Science, and The Alan Turing Institute, ret23@cam.ac.uk)
Pseudocode | Yes | Algorithm 1: Forward pass through the ConvCNP (T) for off-the-grid data.
Open Source Code | Yes | An implementation of our models can be found at cambridge-mlg/aenp.
Open Datasets | Yes | We consider a synthetic 1-D regression task with datasets drawn from a Gaussian process (GP) with the Gibbs kernel [Gibbs, 1998]. The real-world data are derived from ERA5 [Copernicus Climate Change Service, 2020], consisting of surface air temperatures for the years 2018 and 2019. (A GP-sampling sketch follows the table.)
Dataset Splits | Yes | For each task, we sample the number of context points Nc ~ U{1, 64} and set the number of target points to Nt = 128. The context range [xc,min, xc,max] (from which the context points are uniformly sampled) is an interval of length 4, with its centre randomly sampled according to U[-7, 7] for the ID task and according to U[13, 27] for the OOD task. The target range is [xt,min, xt,max] = [xc,min - 1, xc,max + 1]. The same procedure is used during testing, with the test set consisting of 80,000 datasets. (A task-sampling sketch follows the table.)
Hardware Specification | Yes | We train and evaluate all models on a single 11 GB NVIDIA GeForce RTX 2080 Ti GPU.
Software Dependencies | No | The paper mentions 'AdamW [Loshchilov and Hutter, 2017]' as the optimizer but does not specify versions for core software libraries such as Python, PyTorch, or TensorFlow.
Experiment Setup | Yes | For all models, we optimise the model parameters using AdamW [Loshchilov and Hutter, 2017] with a learning rate of 5 × 10^-4 and a batch size of 16. Gradient value magnitudes are clipped at 0.5. We train for a maximum of 500 epochs, with each epoch consisting of 16,000 datasets (10,000 iterations per epoch). (A training-setup sketch follows the table.)
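
The synthetic data described under "Open Datasets" are drawn from a GP with the Gibbs kernel. The sketch below is a minimal illustration, not the authors' code: it samples 1-D functions from a GP using the standard Gibbs (1998) non-stationary kernel, with a hypothetical lengthscale function ell, since the paper's exact lengthscale choice is not quoted here.

    # Minimal sketch (not the authors' code): sample 1-D functions from a GP
    # with the Gibbs (1998) non-stationary kernel.
    import numpy as np

    def ell(x):
        # Hypothetical input-dependent lengthscale, for illustration only.
        return 0.5 + 0.3 * np.abs(np.sin(0.5 * x))

    def gibbs_kernel(x1, x2):
        # k(x, x') = sqrt(2 l(x) l(x') / (l(x)^2 + l(x')^2))
        #            * exp(-(x - x')^2 / (l(x)^2 + l(x')^2))
        l1, l2 = ell(x1)[:, None], ell(x2)[None, :]
        sq = l1 ** 2 + l2 ** 2
        d = x1[:, None] - x2[None, :]
        return np.sqrt(2.0 * l1 * l2 / sq) * np.exp(-d ** 2 / sq)

    x = np.linspace(-9.0, 9.0, 256)
    K = gibbs_kernel(x, x) + 1e-6 * np.eye(len(x))   # jitter for numerical stability
    f = np.random.default_rng(0).multivariate_normal(np.zeros(len(x)), K)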
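The task construction quoted under "Dataset Splits" can be sketched as follows. This is an assumption-laden illustration rather than the authors' data pipeline; in particular, drawing the target inputs uniformly from the target range is an assumption, as the quote only states this for the context points.

    # Minimal sketch (assumption, not the authors' code) of the task construction:
    # Nc ~ U{1, ..., 64}, Nt = 128, a length-4 context interval whose centre is drawn
    # from U[-7, 7] (ID) or U[13, 27] (OOD), and a target interval padded by 1 per side.
    import numpy as np

    def sample_task(rng, ood=False):
        nc = rng.integers(1, 65)                       # number of context points in {1, ..., 64}
        nt = 128                                       # number of target points
        centre = rng.uniform(13.0, 27.0) if ood else rng.uniform(-7.0, 7.0)
        xc_min, xc_max = centre - 2.0, centre + 2.0    # context interval of length 4
        xt_min, xt_max = xc_min - 1.0, xc_max + 1.0    # target interval padded by 1
        xc = rng.uniform(xc_min, xc_max, size=nc)      # context inputs, uniform in the context range
        xt = rng.uniform(xt_min, xt_max, size=nt)      # target inputs (uniform sampling assumed)
        return xc, xt

    rng = np.random.default_rng(0)
    xc, xt = sample_task(rng)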
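The optimisation settings quoted under "Experiment Setup" could be configured roughly as below. This is a sketch with a placeholder model and loss, not the authors' training script, and it reads "gradient value magnitudes are clipped at 0.5" as value (rather than norm) clipping.

    # Minimal sketch (assumption) of one training step with the quoted settings:
    # AdamW, learning rate 5e-4, batch size 16, gradient values clipped at 0.5.
    import torch

    model = torch.nn.Linear(1, 2)                      # placeholder for an NP model
    optimiser = torch.optim.AdamW(model.parameters(), lr=5e-4)

    x = torch.randn(16, 1)                             # placeholder batch of 16 tasks
    loss = model(x).pow(2).mean()                      # placeholder loss; the NP objective is not shown here

    optimiser.zero_grad()
    loss.backward()
    torch.nn.utils.clip_grad_value_(model.parameters(), 0.5)   # clip gradient values at 0.5
    optimiser.step()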