Autoregressive Conditional Neural Processes

Authors: Wessel Bruinsma, Stratis Markou, James Requeima, Andrew Y. K. Foong, Tom Andersson, Anna Vaughan, Anthony Buonomo, Scott Hosking, Richard E Turner

ICLR 2023

Each reproducibility variable below is listed with its Result and the supporting LLM Response.
Research Type: Experimental. "Perhaps surprisingly, in an extensive range of tasks with synthetic and real data, we show that CNPs in autoregressive (AR) mode not only significantly outperform non-AR CNPs, but are also competitive with more sophisticated models that are significantly more expensive and challenging to train."

Researcher Affiliation: Collaboration. Microsoft Research AI4Science, University of Cambridge, British Antarctic Survey, The Alan Turing Institute.
Pseudocode: Yes. "Procedure 2.1 (Autoregressive application of neural process). For a neural process $\pi_\theta$, context set $D^{(c)} = (x^{(c)}, y^{(c)})$, and target inputs $x^{(t)}$, let $\mathrm{AR}_{x^{(t)}}(\pi_\theta, D^{(c)})$ be the distribution defined as follows: for $i = 1, \dots, N$,
$$y^{(t)}_i \sim \mathrm{P}_{x^{(t)}_i}\,\pi_\theta\big(x^{(c)} \oplus x^{(t)}_{1:(i-1)},\; y^{(c)} \oplus y^{(t)}_{1:(i-1)}\big), \tag{2}$$
where $a \oplus b$ concatenates two vectors $a$ and $b$. See Figure 7 in Appendix C for an illustration."
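To make Procedure 2.1 concrete, here is a minimal Python sketch of AR sampling. It assumes the CNP is exposed through a hypothetical `cnp_predict(x_ctx, y_ctx, x)` that returns the mean and standard deviation of the model's Gaussian marginal at a single target input; these names are illustrative, not the released API.

```python
import numpy as np

def ar_sample(cnp_predict, x_ctx, y_ctx, x_tgt, rng=None):
    """Sample target outputs autoregressively, feeding each sample back in as context.

    `cnp_predict(x_ctx, y_ctx, x)` is a hypothetical stand-in for the CNP: it is
    assumed to return (mean, std) of the model's Gaussian marginal at input `x`.
    """
    rng = rng or np.random.default_rng()
    x_ctx, y_ctx = list(x_ctx), list(y_ctx)
    samples = []
    for x in x_tgt:                                   # i = 1, ..., N in Procedure 2.1
        mean, std = cnp_predict(np.array(x_ctx), np.array(y_ctx), x)
        y = rng.normal(mean, std)                     # draw y_i^(t) from the marginal at x_i^(t)
        x_ctx.append(x)                               # append the new point to the context,
        y_ctx.append(y)                               # mirroring the concatenation in Eq. (2)
        samples.append(y)
    return np.array(samples)
```

Because each sampled target is fed back as context, later targets are conditioned on earlier samples, which is what gives the AR CNP a joint (rather than factorised) predictive distribution.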
Open Source Code: Yes. "We make publicly available all code necessary to reproduce our experiments as well as instructions for downloading, preprocessing, and modelling the Antarctic cloud cover data." Code: https://github.com/wesselb/neuralprocesses; Antarctic cloud cover pipeline: https://github.com/tom-andersson/iclr2023-antarctic-arconvcnp.

Open Datasets: Yes. "All our experiments are carried out using either synthetic or publicly available datasets. The EEG data set is available through the UCI database, and the environmental data are also publicly available through the European Climate Data Service."
Dataset Splits: Yes. "To train a model, we consider batches of 16 tasks at a time, compute an objective function value, and update the model parameters using ADAM (Kingma & Ba, 2015). The learning rate is specified separately for every experiment. We define an epoch to consist of $2^{14} \approx 16$k tasks. We typically train a model for between 100 and 1000 epochs. For an experiment, we split up the meta data set into a training set, a cross-validation set, and an evaluation set. The model is trained on the training set. During training, after every epoch, the model is cross-validated on the cross-validation set. Cross-validation uses $2^{12}$ fixed tasks."
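As a rough illustration of this training regime, the following PyTorch-style sketch uses the quantities quoted above (batches of 16 tasks, an epoch of 2^14 tasks, Adam, cross-validation on 2^12 fixed tasks after every epoch). The helpers `sample_tasks` and `loss_fn`, and the choice of PyTorch itself, are assumptions made for illustration, not the released training code.

```python
import torch

TASKS_PER_EPOCH = 2 ** 14   # an epoch is defined as 2^14 (~16k) tasks
BATCH_SIZE = 16             # 16 tasks per parameter update
NUM_CV_TASKS = 2 ** 12      # cross-validation uses 2^12 fixed tasks

def train(model, loss_fn, sample_tasks, num_epochs=100, lr=3e-4):
    """Hypothetical meta-training loop; `loss_fn` and `sample_tasks` are stand-ins."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    cv_tasks = sample_tasks(NUM_CV_TASKS)               # fixed cross-validation set
    for epoch in range(num_epochs):
        for _ in range(TASKS_PER_EPOCH // BATCH_SIZE):   # one epoch of updates
            batch = sample_tasks(BATCH_SIZE)
            opt.zero_grad()
            loss = loss_fn(model, batch)                 # objective averaged over the batch
            loss.backward()
            opt.step()
        with torch.no_grad():                            # cross-validate after every epoch
            cv_loss = loss_fn(model, cv_tasks)
        print(f"epoch {epoch + 1}: cross-validation loss {cv_loss.item():.3f}")
```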
Hardware Specification: Yes. "The time taken to train each model on a Tesla V100 GPU is as follows: ConvCNP: 25.0 hours, ConvGNP: 27.5 hours, ConvLNP: 43.6 hours. ... It took 14 minutes to generate these AR ConvCNP samples on a Tesla V100 GPU."

Software Dependencies: No. The paper mentions using 'ADAM' as an optimizer but does not specify its version number or any other software libraries (e.g., PyTorch, TensorFlow, CUDA) with version numbers.
Experiment Setup: Yes. "For this experiment, the learning rate is $3 \times 10^{-4}$, the margin is 0.1, and the points per unit is 64. We trained the models for 100 epochs."
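For reference, the quoted hyperparameters can be gathered into a single configuration. The key names below are illustrative rather than taken from the released code, and the glosses for `margin` and `points_per_unit` reflect the usual ConvCNP discretisation settings rather than a documented definition.

```python
# Reported hyperparameters for this experiment; key names and comments are illustrative.
experiment_config = {
    "learning_rate": 3e-4,   # Adam learning rate (3 x 10^-4)
    "margin": 0.1,           # assumed: padding around the data range for the internal grid
    "points_per_unit": 64,   # assumed: density of the ConvCNP's internal discretisation
    "num_epochs": 100,       # training length for this experiment
    "batch_size": 16,        # tasks per batch, from the general training setup
}
```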