Practical Equivariances via Relational Conditional Neural Processes
Authors: Daolang Huang, Manuel Haussmann, Ulpu Remes, ST John, Grégoire Clarté, Kevin Luck, Samuel Kaski, Luigi Acerbi
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically demonstrate the competitive performance of RCNPs on a large array of tasks naturally containing equivariances. In this section, we evaluate the proposed relational models on several tasks and compare their performance with other conditional neural process models. |
| Researcher Affiliation | Academia | (1) Department of Computer Science, Aalto University, Finland; (2) Department of Mathematics and Statistics, University of Helsinki; (3) Department of Computer Science, University of Helsinki; (4) Department of Electrical Engineering and Automation (EEA), Aalto University, Finland; (5) Department of Computer Science, University of Manchester; (6) Department of Computer Science, Vrije Universiteit Amsterdam, The Netherlands |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is available at https://github.com/acerbilab/relational-neural-processes. |
| Open Datasets | Yes | We run this benchmark evaluation using the simulator and experimental setup proposed by [3]. Here the CNP models are trained with simulated data and evaluated with both simulated and real data; the learning tasks represented in the training and evaluation data include interpolation, forecasting, and reconstruction. The evaluation results presented in Table 3 indicate that the best model depends on the task, but overall our proposed relational CNP models with translational equivariance perform comparably to their convolutional and attentive counterparts, showing that our simpler approach does not hamper performance on real data. (The quoted passage refers to the Lotka–Volterra model in Section 5.3, which uses the 'hare–lynx dataset [32]'; Appendix H.3 additionally uses 'MNIST [24] and CelebA [27] image data'.) |
| Dataset Splits | Yes | After each epoch during model training, each model undergoes validation using a pre-generated validation set. The validation score is a confidence bound based on the log-likelihood values. Specifically, the mean (µval) and standard deviation (σval) of the log-likelihood values over the entire validation set are used to calculate the validation score as µval − 1.96·σval/√Nval, where Nval is the validation dataset size (a minimal sketch of this criterion is given after the table). The validation sets used in training included 2^12 datasets, and the evaluation sets used to compare the models in interpolation (INT) and out-of-input-distribution (OOID) tasks included 2^12 datasets each. |
| Hardware Specification | Yes | All results are calculated on an Intel Core i7-12700K CPU, under the assumption that these models can be deployed on devices or local machines without GPU access. For the entire paper, we conducted all experiments, including baseline model computations and preliminary experiments not included in the paper, on a GPU cluster consisting of a mix of Tesla P100, Tesla V100, and Tesla A100 GPUs. |
| Software Dependencies | No | The experiments carried out in this work used the open-source neural processes package released with previous work [3]. The package is distributed under the MIT license and available at https://github.com/wesselb/neuralprocesses [1]. The paper mentions software used but does not provide specific version numbers for reproducibility. |
| Experiment Setup | Yes | The CNP and GNP models used in this experiment encode the context sets as 256-dimensional vectors when dx < 5 and as 128-dimensional vectors when dx ≥ 5. Similarly, all relational models, including RCNP, RGNP, Full RCNP, and Full RGNP, produce relational encodings with dimension 256 when dx < 5 and dimension 128 when dx ≥ 5. The encoding network used in both model families to produce the encoding or relational encoding is a three-layer MLP, featuring 256 hidden units per layer for dx < 5 and 128 for dx ≥ 5. We also maintain the same setting across all CNP and RCNP models in terms of the decoder network architecture, using a six-layer MLP with 256 hidden units per layer for dx < 5 and 128 for dx ≥ 5. The encoder and decoder networks use ReLU activation functions. Finally, the convolutional models ConvCNP and ConvGNP, which are included in experiments where dx ∈ {1, 2}, are employed with the configuration detailed in [3, Appendix F], and the GNP, RGNP, Full RGNP, and ConvGNP models all use linear covariance with 64 basis functions. The models were trained for 100 epochs with 2^14 datasets per epoch and a learning rate of 3 × 10^-4 (an illustrative configuration sketch follows the table). |
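
As a reading aid for the validation criterion quoted under Dataset Splits, the lower confidence bound on the mean validation log-likelihood can be computed as below. This is a minimal sketch: the function name `validation_score` and the use of NumPy are illustrative assumptions, not part of the released code.

```python
import numpy as np

def validation_score(loglik: np.ndarray) -> float:
    """Lower confidence bound on the mean validation log-likelihood.

    `loglik` holds the per-dataset log-likelihood values over the whole
    validation set; the score penalises their mean by 1.96 standard
    errors, matching the criterion quoted in the table above.
    """
    n_val = loglik.size
    mu_val = loglik.mean()
    sigma_val = loglik.std()
    return mu_val - 1.96 * sigma_val / np.sqrt(n_val)
```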
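
The Experiment Setup cell describes the encoder/decoder MLP widths and depths in prose; the following PyTorch sketch mirrors those reported sizes. It is an assumption-laden illustration, not the authors' implementation: the helper name `build_rcnp_networks`, the input/output dimensions, and the reading of "three-layer" and "six-layer" as numbers of hidden layers are all assumptions.

```python
import torch.nn as nn

def build_rcnp_networks(dx: int, dy: int = 1):
    """Illustrative encoder/decoder MLPs matching the reported widths.

    Hidden width (and encoding dimension) is 256 for dx < 5 and 128
    otherwise; the encoder uses three hidden layers and the decoder six,
    all with ReLU activations. Input/output sizes are placeholders.
    """
    width = 256 if dx < 5 else 128

    def mlp(in_dim: int, out_dim: int, n_hidden: int) -> nn.Sequential:
        layers, d = [], in_dim
        for _ in range(n_hidden):
            layers += [nn.Linear(d, width), nn.ReLU()]
            d = width
        layers.append(nn.Linear(d, out_dim))
        return nn.Sequential(*layers)

    # Encoder: maps a comparison of a (target, context) input pair plus the
    # context output to a `width`-dimensional (relational) encoding.
    encoder = mlp(2 * dx + dy, width, n_hidden=3)
    # Decoder: maps the aggregated encoding to a predictive mean and a raw
    # variance parameter for each output dimension.
    decoder = mlp(width, 2 * dy, n_hidden=6)
    return encoder, decoder
```

For example, `build_rcnp_networks(dx=3)` returns 256-wide networks, while `build_rcnp_networks(dx=6)` returns 128-wide ones, following the dx < 5 / dx ≥ 5 rule stated in the table.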