Deep Stochastic Processes via Functional Markov Transition Operators
Authors: Jin Xu, Emilien Dupont, Kaspar Märtens, Tom Rainforth, Yee Whye Teh
NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments demonstrate clear advantages of MNPs over baseline models on a variety of tasks. To empirically demonstrate the value of our proposed approach, we first benchmark MNPs against baselines on 1D function regression tasks, and conduct ablation studies in this controlled setting. |
| Researcher Affiliation | Collaboration | Jin Xu¹, Emilien Dupont¹, Kaspar Märtens², Tom Rainforth¹, Yee Whye Teh¹. ¹ Department of Statistics, University of Oxford, UK. ² Big Data Institute, University of Oxford, UK. Corresponding author: <jin.xu@stats.ox.ac.uk>. ED is now at Google DeepMind; this work was done while ED was at Oxford. YWT is at both Google DeepMind and Oxford; this work was done at Oxford. |
| Pseudocode | No | The paper presents mathematical derivations, equations, and architectural diagrams, but it does not include any formal pseudocode blocks or algorithms. |
| Open Source Code | Yes | For additional experimental details such as hyperparameters and architectures, please refer to Appendix B.2 and our reference implementation at https://github.com/jinxu06/mnp. |
| Open Datasets | Yes | We generate the Geo Fluvial dataset using the meanderpy [54] package. [54] Zoltán Sylvester, Paul Durkin, and Jacob A. Covault. High curvatures drive river meandering. Geology, 47(3):263–266, 2019. |
| Dataset Splits | Yes | For all the aforementioned datasets, we use the following set sizes: 50000 for the training set, 5000 for the validation set, and 5000 for the test set. We train our model on a training set of 20k simulations and evaluate it on a test set of 5k simulations. (A split sketch follows the table.) |
| Hardware Specification | Yes | On a single GeForce GTX 1080 GPU card, a standard 7-step MNP takes approximately one day to train for 200k steps on 1D functions. |
| Software Dependencies | No | The paper states: "All experiments are performed using PyTorch [44]." However, it does not specify the version number for PyTorch or any other software dependency. |
| Experiment Setup | Yes | For set transformers, we stack two layers of set attention blocks (SABs) with a hidden dimension of 64 and 4 heads. Conditional normalising flows ... It has two hidden layers and a hidden dimension of 128. MLPs ... It has two hidden layers and a hidden dimension of 64. We use Adam [31] optimiser with a learning rate of 0.0001. We use a batch size of 100 for 1D synthetic data and a batch size of 20 for the geological data. We use 80 frequencies randomly sampled from a standard normal. (A configuration sketch follows the table.) |
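
The 50,000 / 5,000 / 5,000 split quoted in the Dataset Splits row can be expressed as a minimal PyTorch sketch. The placeholder tensors, sequence length, and generator seed below are assumptions for illustration, not details taken from the paper or the reference implementation.

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Hypothetical stand-in for one of the 1D function-regression datasets:
# 60,000 functions, each with 128 (x, y) points (everything except the
# 50k/5k/5k split sizes is an illustrative assumption).
xs = torch.randn(60_000, 128, 1)
ys = torch.randn(60_000, 128, 1)
full_dataset = TensorDataset(xs, ys)

# 50,000 training / 5,000 validation / 5,000 test examples, as quoted above.
train_set, val_set, test_set = random_split(
    full_dataset,
    lengths=[50_000, 5_000, 5_000],
    generator=torch.Generator().manual_seed(0),  # seed is an assumption
)
```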
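
The hyperparameters quoted in the Experiment Setup row map onto a short PyTorch configuration sketch: an MLP with two hidden layers of width 64, 80 random frequencies drawn from a standard normal, and Adam with a learning rate of 0.0001. The sine/cosine feature map, the activation function, and the output width are assumptions; the set attention blocks and conditional normalising flows are omitted here, so this is a sketch of the reported settings rather than the authors' model.

```python
import torch
import torch.nn as nn

# 80 random frequencies sampled from a standard normal, as quoted above.
# How the frequencies enter the encoder is an assumption; a sine/cosine
# random-feature map is shown for illustration.
num_freqs = 80
freqs = torch.randn(num_freqs)

def random_features(x):
    """Map inputs of shape (batch, 1) to random features of shape (batch, 160)."""
    proj = x * freqs                                   # (batch, 80)
    return torch.cat([torch.sin(proj), torch.cos(proj)], dim=-1)

# MLP with two hidden layers and hidden dimension 64, per the quote
# (activation and output width are assumptions).
mlp = nn.Sequential(
    nn.Linear(2 * num_freqs, 64),
    nn.ReLU(),
    nn.Linear(64, 64),
    nn.ReLU(),
    nn.Linear(64, 2),
)

# Adam optimiser with learning rate 0.0001, as reported.
optimizer = torch.optim.Adam(mlp.parameters(), lr=1e-4)

# Batch sizes as reported: 100 for 1D synthetic data, 20 for geological data.
batch_size = 100
```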