Deep Stochastic Processes via Functional Markov Transition Operators

Authors: Jin Xu, Emilien Dupont, Kaspar Märtens, Thomas Rainforth, Yee Whye Teh

NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments demonstrate clear advantages of MNPs over baseline models on a variety of tasks. To empirically demonstrate the value of our proposed approach, we first benchmark MNPs against baselines on 1D function regression tasks, and conduct ablation studies in this controlled setting.
Researcher Affiliation | Collaboration | Jin Xu¹, Emilien Dupont¹, Kaspar Märtens², Tom Rainforth¹, Yee Whye Teh¹. ¹Department of Statistics, University of Oxford, UK. ²Big Data Institute, University of Oxford, UK. Corresponding author: <jin.xu@stats.ox.ac.uk>. ED is now at Google DeepMind; this work was done while ED was at Oxford. YWT is at both Google DeepMind and Oxford; this work was done at Oxford.
Pseudocode | No | The paper presents mathematical derivations, equations, and architectural diagrams, but it does not include any formal pseudocode blocks or algorithms.
Open Source Code | Yes | For additional experimental details such as hyperparameters and architectures, please refer to Appendix B.2 and our reference implementation at https://github.com/jinxu06/mnp.
Open Datasets | Yes | We generate the GeoFluvial dataset using the meanderpy [54] package. [54] Zoltán Sylvester, Paul Durkin, and Jacob A. Covault. High curvatures drive river meandering. Geology, 47(3):263–266, 2019.
Dataset Splits | Yes | For all the aforementioned datasets, we use the following set sizes: 50000 for the training set, 5000 for the validation set, and 5000 for the test set. We train our model on a training set of 20k simulations and evaluate it on a test set of 5k simulations. (See the dataset-split sketch below the table.)
Hardware Specification | Yes | On a single GeForce GTX 1080 GPU card, a standard 7-step MNP takes approximately one day to train for 200k steps on 1D functions.
Software Dependencies | No | The paper states: "All experiments are performed using PyTorch [44]." However, it does not specify the version number for PyTorch or any other software dependency.
Experiment Setup | Yes | For set transformers, we stack two layers of set attention blocks (SABs) with a hidden dimension of 64 and 4 heads. Conditional normalising flows...It has two hidden layers and a hidden dimension of 128. MLPs...It has two hidden layers and a hidden dimension of 64. We use the Adam [31] optimiser with a learning rate of 0.0001. We use a batch size of 100 for 1D synthetic data and a batch size of 20 for the geological data. We use 80 frequencies randomly sampled from a standard normal. (See the configuration sketch below the table.)
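
The Dataset Splits row reports fixed set sizes of 50000/5000/5000. Below is a minimal PyTorch sketch of how such a split could be realised; the placeholder dataset, the per-function size of 128 points, and the use of random_split are assumptions for illustration, not details taken from the paper or its reference implementation.

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Split sizes reported in the paper: 50000 train / 5000 validation / 5000 test.
TRAIN_SIZE, VAL_SIZE, TEST_SIZE = 50_000, 5_000, 5_000
TOTAL = TRAIN_SIZE + VAL_SIZE + TEST_SIZE

# Placeholder data standing in for the generated 1D functions
# (128 input/output points per function is an assumed size, not from the paper).
xs = torch.rand(TOTAL, 128, 1)
ys = torch.sin(xs)
full_dataset = TensorDataset(xs, ys)

# Deterministic split so the partition is reproducible across runs.
generator = torch.Generator().manual_seed(0)
train_set, val_set, test_set = random_split(
    full_dataset, [TRAIN_SIZE, VAL_SIZE, TEST_SIZE], generator=generator
)
print(len(train_set), len(val_set), len(test_set))  # 50000 5000 5000
```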
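The Experiment Setup row lists concrete hyperparameters: two SAB layers with hidden dimension 64 and 4 heads, a flow conditioner with hidden dimension 128, MLPs with hidden dimension 64, Adam at a learning rate of 0.0001, batch sizes of 100 (1D synthetic) and 20 (geological), and 80 random Fourier frequencies drawn from a standard normal. The sketch below collects these values into a PyTorch-style configuration and shows one plausible way to set up the Fourier embedding, an MLP, and the optimiser; the module names, feature scaling, and overall wiring are assumptions rather than the authors' implementation (the actual code is at https://github.com/jinxu06/mnp).

```python
import math
import torch
import torch.nn as nn

# Hyperparameter values quoted in the paper; the grouping into a dict is ours.
CONFIG = {
    "sab_layers": 2,          # set attention blocks in the set transformer
    "sab_hidden_dim": 64,
    "sab_heads": 4,
    "flow_hidden_dim": 128,   # conditional normalising flow conditioner
    "mlp_hidden_dim": 64,
    "learning_rate": 1e-4,
    "batch_size_1d": 100,     # 1D synthetic functions
    "batch_size_geo": 20,     # geological (fluvial) data
    "num_fourier_freqs": 80,  # frequencies sampled from a standard normal
}


class RandomFourierFeatures(nn.Module):
    """Fixed random Fourier embedding of scalar inputs (assumed design)."""

    def __init__(self, num_freqs: int, in_dim: int = 1):
        super().__init__()
        # 80 frequencies drawn from N(0, 1), kept fixed during training.
        self.register_buffer("freqs", torch.randn(in_dim, num_freqs))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        proj = 2 * math.pi * x @ self.freqs
        return torch.cat([torch.sin(proj), torch.cos(proj)], dim=-1)


def make_mlp(in_dim: int, out_dim: int, hidden: int) -> nn.Sequential:
    """Two hidden layers, matching the widths quoted above."""
    return nn.Sequential(
        nn.Linear(in_dim, hidden), nn.ReLU(),
        nn.Linear(hidden, hidden), nn.ReLU(),
        nn.Linear(hidden, out_dim),
    )


embed = RandomFourierFeatures(CONFIG["num_fourier_freqs"])
decoder = make_mlp(2 * CONFIG["num_fourier_freqs"], 1, CONFIG["mlp_hidden_dim"])
params = list(embed.parameters()) + list(decoder.parameters())
optimizer = torch.optim.Adam(params, lr=CONFIG["learning_rate"])

# One forward pass on a batch of 1D inputs (batch size 100, 128 points each).
x = torch.rand(CONFIG["batch_size_1d"], 128, 1)
y_hat = decoder(embed(x))  # shape: (100, 128, 1)
```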