Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Neural Diffusion Processes
Authors: Vincent Dutordoir, Alan Saul, Zoubin Ghahramani, Fergus Simpson
ICML 2023 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically show that NDPs can capture functional distributions close to the true Bayesian posterior, demonstrating that they can successfully emulate the behaviour of Gaussian processes and surpass the performance of neural processes. (Section 6, Experimental Evaluation) |
| Researcher Affiliation | Collaboration | Vincent Dutordoir (1, 2), Alan Saul (2), Zoubin Ghahramani (1, 3), Fergus Simpson (2). 1: Department of Engineering, University of Cambridge, Cambridge, UK; 2: Secondmind, Cambridge, UK; 3: Google DeepMind. |
| Pseudocode | Yes | B. Algorithms In this section we list pseudo-code for training and sampling NDPs. B.1. Training Algorithm 1 Training |
| Open Source Code | Yes | The code is available at https://github.com/vdutor/neural-diffusion-processes. |
| Open Datasets | Yes | For the MNIST dataset, our task simplifies to predicting a single output value that corresponds to grayscale intensity. However, when tackling the CelebA 32×32 dataset, we deal with the added complexity of predicting three output values for each pixel to represent the RGB colour channels. We evaluate NDPs and NPs on two synthetic datasets, following the experimental setup from Bruinsma et al. (2021) but extending it to multiple input dimensions D. |
| Dataset Splits | No | The paper specifies training and test data sizes ("The training data is composed of 2^14 sample paths, whereas the test dataset comprises 128 paths."), and describes context and target sets for evaluation, but does not explicitly provide details about a distinct validation dataset split with percentages or counts. |
| Hardware Specification | Yes | Experiments were conducted on a 32-core machine and utilised a single Tesla V100-PCIE-32GB GPU. |
| Software Dependencies | No | The paper mentions GPflow and TensorFlow, but does not specify their version numbers or the versions of other key software components (e.g., Python, PyTorch). |
| Experiment Setup | Yes | All experiments share the same model architecture illustrated in Figure 2; there are, however, a number of model parameters that must be chosen. An L1 (i.e., Mean Absolute Error, MAE) loss function was used throughout. We use four or five bi-dimensional attention blocks, each consisting of multi-head self-attention blocks (Vaswani et al., 2017) containing a representation dimensionality of H = 64 and 8 heads. Each experiment used either 500 or 1000 diffusion steps... The Adam optimiser is used throughout. Our learning rate follows a cosine-decay function, with a 20 epochs linear learning rate warm-up to a maximum learning rate of η = 0.001 before decaying. All NDP models were trained for 250 epochs... Each epoch contained 4096 example training (y0, x0) pairs. Training data was provided in batches of 32... Table 3: Experiment configuration and training time. |
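The quoted training configuration (20-epoch linear warm-up to η = 0.001, then cosine decay over 250 epochs, with 4096 examples per epoch in batches of 32) fully determines a learning-rate schedule. The sketch below is a hypothetical re-implementation of that schedule for illustration; the function name and the assumption that the decay ends at zero after the final epoch are ours, not the paper's.

```python
import math

def ndp_learning_rate(step, steps_per_epoch, warmup_epochs=20,
                      total_epochs=250, peak_lr=1e-3):
    """Linear warm-up followed by cosine decay, per the setup quoted above.

    Hypothetical sketch: parameter names and the decay-to-zero endpoint
    are assumptions; only the warm-up length, peak rate, and epoch count
    come from the paper.
    """
    warmup_steps = warmup_epochs * steps_per_epoch
    total_steps = total_epochs * steps_per_epoch
    if step < warmup_steps:
        # Linear ramp from 0 to peak_lr over the first 20 epochs.
        return peak_lr * step / warmup_steps
    # Cosine decay from peak_lr toward 0 over the remaining epochs.
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

# 4096 training pairs per epoch in batches of 32 gives 128 steps per epoch.
steps_per_epoch = 4096 // 32
```

With these values the rate is 0 at step 0, reaches 0.001 at the end of epoch 20, and decays to 0 by the end of epoch 250.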