Neural SDEs as Infinite-Dimensional GANs
Authors: Patrick Kidger, James Foster, Xuechen Li, Terry J Lyons
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We perform experiments across four datasets; each one is selected to represent a different regime. |
| Researcher Affiliation | Academia | (1) Mathematical Institute, University of Oxford; (2) The Alan Turing Institute, The British Library; (3) Stanford. |
| Pseudocode | No | No pseudocode or algorithm blocks found. |
| Open Source Code | Yes | Example code has been made available as part of the torchsde repository. (Abstract) and Li, X. torchsde, 2020. https://github.com/google-research/torchsde. (References) A usage sketch follows the table. |
| Open Datasets | Yes | Next we consider a dataset consisting of Google/Alphabet stock prices, obtained from LOBSTER (Haase, 2013). (Section 4.2) Next we consider a dataset of the air quality in Beijing, from the UCI repository (Zhang et al., 2017; Dua & Graff, 2017). (Section 4.3) We train several small convolutional networks on MNIST (LeCun et al., 2010)... (Section 4.4) |
| Dataset Splits | No | The paper discusses metrics such as "train-on-synthetic-test-on-real (TSTR)", which implies a train/test separation, but it does not provide specific split percentages or counts for training/validation/test sets. (The TSTR protocol is sketched below the table.) |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) are mentioned for the experimental setup. |
| Software Dependencies | No | PyTorch is cited in the references, and the Adadelta and Adam optimizers are mentioned in Section 4.5, but no version numbers for any software dependencies are provided in the main text. |
| Experiment Setup | Yes | In all cases see Appendix A for details of hyperparameters, learning rates, optimisers and so on. (Section 4) Final tanh nonlinearity: Using a final tanh nonlinearity (on both drift and diffusion, for both generator and discriminator)... (Section 4.5) Stochastic weight averaging: Using the Cesàro mean of both the generator and discriminator weights, averaged over training, improves performance... (Section 4.5) Adadelta... Amongst all optimisers considered, Adadelta produced substantially better performance. (Section 4.5) Weight decay: Nonzero weight decay also helped to damp the oscillatory behaviour... (Section 4.5) (A sketch combining these tricks follows the table.) |
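To ground the open-source-code row, here is a minimal sketch of defining and solving an SDE with torchsde, the package in which the paper releases its example code. Everything concrete here is an assumption for illustration: the `GeneratorSDE` name, the network sizes, and the Itô/Euler solver configuration are not taken from the paper.

```python
import torch
import torchsde


class GeneratorSDE(torch.nn.Module):
    # torchsde reads these two class attributes to pick valid solvers.
    noise_type = "general"  # g returns a (batch, state_size, bm_size) matrix
    sde_type = "ito"        # illustrative; the paper's solver setup may differ

    def __init__(self, state_size=4, bm_size=3, hidden=64):
        super().__init__()
        self.state_size, self.bm_size = state_size, bm_size
        # Final tanh on both drift and diffusion, echoing Section 4.5.
        self.drift = torch.nn.Sequential(
            torch.nn.Linear(state_size, hidden), torch.nn.Tanh(),
            torch.nn.Linear(hidden, state_size), torch.nn.Tanh(),
        )
        self.diffusion = torch.nn.Sequential(
            torch.nn.Linear(state_size, hidden), torch.nn.Tanh(),
            torch.nn.Linear(hidden, state_size * bm_size), torch.nn.Tanh(),
        )

    def f(self, t, y):  # drift: (batch, state_size) -> (batch, state_size)
        return self.drift(y)

    def g(self, t, y):  # diffusion: (batch, state_size) -> (batch, state_size, bm_size)
        return self.diffusion(y).view(y.size(0), self.state_size, self.bm_size)


sde = GeneratorSDE()
y0 = torch.randn(32, sde.state_size)               # initial state, one per sample
ts = torch.linspace(0, 1, 20)                      # evaluation times
ys = torchsde.sdeint(sde, y0, ts, method="euler")  # shape (20, 32, state_size)
```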
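The dataset-splits row cites the train-on-synthetic-test-on-real (TSTR) metric. The sketch below is a hypothetical miniature of that protocol: a predictor is fit purely on generated sequences and scored purely on real ones. The proxy task, the `tstr_score` helper, and the scikit-learn model are all illustrative assumptions, not the paper's evaluation code.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error


def tstr_score(synthetic, real):
    """Train on synthetic (N, T) sequences; test on real ones.

    The proxy task here is one-step-ahead prediction: features are the
    first T-1 observations, the target is the final observation.
    """
    Xs, ys = synthetic[:, :-1], synthetic[:, -1]
    Xr, yr = real[:, :-1], real[:, -1]
    model = Ridge().fit(Xs, ys)                       # train on synthetic only
    return mean_squared_error(yr, model.predict(Xr))  # evaluate on real only


rng = np.random.default_rng(0)
fake = rng.standard_normal((256, 20)).cumsum(axis=1)  # stand-in generator output
true = rng.standard_normal((256, 20)).cumsum(axis=1)  # stand-in real data
print(tstr_score(fake, true))
```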
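Finally, the experiment-setup row lists several training tricks from Section 4.5: a final tanh nonlinearity (shown in the SDE sketch above), a Cesàro mean of the weights over training, the Adadelta optimiser, and nonzero weight decay. Below is a hedged PyTorch sketch of how the last three might combine; the model, loss, and all hyperparameter values are placeholders, and the paper averages the discriminator's weights as well.

```python
import torch

generator = torch.nn.Linear(4, 4)  # stand-in for the SDE generator above

# Adadelta with nonzero weight decay, per the quoted Section 4.5 findings.
opt = torch.optim.Adadelta(generator.parameters(), lr=1.0, weight_decay=0.01)

# AveragedModel's default averaging function is an equal-weight running mean,
# i.e. exactly the Cesàro mean of the weight iterates seen so far.
averaged = torch.optim.swa_utils.AveragedModel(generator)

for step in range(1000):
    opt.zero_grad()
    # Placeholder objective; the paper actually trains a GAN loss.
    loss = generator(torch.randn(32, 4)).pow(2).mean()
    loss.backward()
    opt.step()
    averaged.update_parameters(generator)  # accumulate the running average

# Use `averaged` (not `generator`) at evaluation time, as with standard SWA.
```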