Amortized Reparametrization: Efficient and Scalable Variational Inference for Latent SDEs

Authors: Kevin Course, Prasanth Nair

NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In Section 4 we provide a number of numerical studies including learning latent neural SDEs from video and performance benchmarking on a motion capture dataset."
Researcher Affiliation | Academia | Kevin Course, University of Toronto (kevin.course@mail.utoronto.ca); Prasanth B. Nair, University of Toronto (prasanth.nair@utoronto.ca)
Pseudocode | No | The paper describes its method using text and mathematical equations but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks, nor does it present structured steps formatted like code.
Open Source Code | Yes | "All code is available at github.com/coursekevin/arlatentsde."
Open Datasets | Yes | "In this experiment we consider the motion capture dataset from [32]. The dataset consists of 16 training, 3 validation, and 4 independent test sequences of a subject walking. We made use of the preprocessed data from [34]."
Dataset Splits | Yes | "We use the first 50 seconds for training and reserve the remaining 15 seconds for validation." (A time-based split sketch follows the table.)
Hardware Specification | Yes | "Experiments were performed on an Ubuntu server with a dual E5-2680 v3 with a total of 24 cores, 128 GB of RAM, and an NVIDIA GeForce RTX 4090 GPU."
Software Dependencies | No | The paper mentions software such as PyTorch, torchdiffeq, torchsde, and pytorch_lightning, but does not provide specific version numbers for these components. (A version-recording snippet follows the table.)
Experiment Setup | Yes | "In terms of hyperparameters we set the schedule on the KL-divergence to increase from 0 to 1 over 1000 iterations. We choose a learning rate of 10^-3 with exponential learning rate decay, where the learning rate was decayed as lr <- gamma * lr every iteration with gamma = exp(log(0.9)/1000) (i.e., the effective decay is lr <- 0.9 * lr every 1000 iterations). We used the nested Monte-Carlo approximation described in Equation (10) with R = 1, S = 10, and M = 256." (See the training-schedule sketch below the table.)
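The split quoted in the Dataset Splits row is purely time-based, so it can be reproduced in a few lines. Below is a minimal sketch assuming trajectories stored as NumPy arrays of timestamps and observations; the helper name `time_split` and the synthetic 65-second signal are our own illustrations, not code from the paper's repository.

```python
import numpy as np

def time_split(t, y, t_train_end=50.0):
    """Split a trajectory (t, y) at t_train_end seconds (hypothetical helper)."""
    train = t <= t_train_end
    return (t[train], y[train]), (t[~train], y[~train])

# Illustrative 65-second trajectory sampled at 100 Hz: the first 50 s go to
# training and the remaining 15 s to validation, matching the quoted protocol.
t = np.linspace(0.0, 65.0, 6501)
y = np.sin(t)[:, None]  # stand-in for the observed states
(train_t, train_y), (val_t, val_y) = time_split(t, y)
```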
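Because the dependency versions are unrecorded, anyone reproducing the experiments may want to log the versions they actually ran against. The snippet below is not from the paper; it only queries the local environment for the four packages named above, assuming their standard PyPI names.

```python
# Log installed versions of the dependencies the paper names but does not pin.
from importlib import metadata

for pkg in ("torch", "torchdiffeq", "torchsde", "pytorch_lightning"):
    try:
        print(f"{pkg}=={metadata.version(pkg)}")
    except metadata.PackageNotFoundError:
        print(f"{pkg}: not installed")
```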
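The Experiment Setup row fully specifies the learning-rate decay and KL annealing schedules, so both can be sketched directly in PyTorch. In the sketch below, the model, objective, and training loop are placeholders (the paper's actual objective is its amortized ELBO with the nested Monte-Carlo estimator, R = 1, S = 10, M = 256); only the schedule constants come from the quoted text.

```python
import math
import torch

# Quoted hyperparameters: lr = 1e-3 with per-iteration decay lr <- gamma * lr,
# gamma = exp(log(0.9)/1000), i.e. the lr shrinks by a factor of 0.9 every
# 1000 iterations.
gamma = math.exp(math.log(0.9) / 1000)

model = torch.nn.Linear(4, 4)  # placeholder for the latent SDE model
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=gamma)

def kl_weight(iteration, warmup=1000):
    """Linear KL annealing from 0 to 1 over the first 1000 iterations."""
    return min(iteration / warmup, 1.0)

for it in range(5000):
    beta = kl_weight(it)
    x = torch.randn(8, 4)
    recon = (model(x) - x).pow(2).mean()  # stand-in for the reconstruction term
    kl = model.weight.pow(2).mean()       # stand-in for the KL-divergence term
    loss = recon + beta * kl
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()  # applies lr <- gamma * lr once per iteration
```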