Latent SDEs on Homogeneous Spaces

Authors: Sebastian Zeng, Florian Graf, Roland Kwitt

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments demonstrate that a latent SDE of the proposed type can be learned efficiently by means of an existing one-step geometric Euler-Maruyama scheme. Despite restricting ourselves to a less rich class of SDEs, we achieve competitive or even state-of-the-art results on various time series interpolation/classification problems.
Researcher Affiliation | Academia | University of Salzburg, Austria
Pseudocode | Yes | Algorithm 1: Geometric Euler-Maruyama Algorithm (g-EM) [36]. Listing 1: SDE solver in the Lie algebra so(n). Listing 2: Generate basis of so(n). Listing 3: Generate SDE solution on S^{n-1}. Listing 4: Example usage with dummy K, µ, κ. (A hedged sketch of one g-EM step is given below the table.)
Open Source Code | Yes | Source code is available at https://github.com/plus-rkwitt/LatentSDEonHS.
Open Datasets | Yes | Human Activity: https://doi.org/10.24432/C57G8X; PhysioNet (2012): [51]; Rotating MNIST: [9]
Dataset Splits | Yes | Human Activity: the dataset is split into 4,194 sequences for training, 1,311 for testing, and 1,049 for validation. Pendulum regression: 2,000 images are used for training, 1,000 for testing, and an additional 1,000 for validation and hyperparameter selection. Rotating MNIST: 360 images are used for training, 360 for testing, and 36 for validation and hyperparameter selection.
Hardware Specification | Yes | All experiments were executed on systems running Ubuntu 22.04.2 LTS (kernel 5.15.0-71-generic, x86_64) with 128 GB of main memory, equipped with either NVIDIA GeForce RTX 2080 Ti or GeForce RTX 3090 Ti GPUs.
Software Dependencies | Yes | All experiments are implemented in PyTorch (v1.13; also tested on v2.0). The reference implementation of the power spherical distribution from [8] and the einops package are also required. (A minimal environment and usage sketch is given below the table.)
Experiment Setup | Yes | We optimize all model parameters using Adam [28] with a cyclic cosine learning rate schedule [18] (990 epochs with a cycle length of 60 epochs and learning rate within [1e-6, 1e-3]). The weighting of the KL divergences in the ELBO is selected on validation splits (if available) or set to 1e-5 without any annealing schedule. We fix the dimensionality of the latent space to n = 16 and use K = 6 polynomials in Eq. (11), unless stated otherwise. (A sketch of this optimizer configuration is given below the table.)
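
To make the pseudocode row concrete: below is a minimal sketch of one geometric Euler-Maruyama (g-EM) step on S^{n-1}, written from the description above. The constant drift K and scalar noise scale sigma are simplifying assumptions for illustration; the paper's solver uses a time-dependent drift parametrized by K polynomials (Eq. (11)), not this fixed matrix.

```python
import torch

def so_n_basis(n: int) -> torch.Tensor:
    """Standard basis of the Lie algebra so(n): the n(n-1)/2 skew-symmetric
    matrices E_ij - E_ji for i < j (cf. Listing 2 in the paper)."""
    idx = torch.triu_indices(n, n, offset=1)
    m = idx.shape[1]
    B = torch.zeros(m, n, n)
    B[torch.arange(m), idx[0], idx[1]] = 1.0
    B[torch.arange(m), idx[1], idx[0]] = -1.0
    return B

def gem_step(z, K, sigma, dt, basis):
    """One g-EM step: z_{k+1} = expm(K*dt + sigma * sum_i dW_i B_i) z_k.
    The increment lies in so(n), so its matrix exponential is a rotation and
    the step stays on the unit sphere up to floating-point error."""
    dW = torch.randn(basis.shape[0]) * dt ** 0.5   # Brownian increments
    noise = torch.einsum('i,ijk->jk', dW, basis)   # random element of so(n)
    return torch.linalg.matrix_exp(K * dt + sigma * noise) @ z

# Usage: simulate a short path on S^2 with a fixed, hypothetical drift.
n, dt = 3, 0.01
B = so_n_basis(n)
K = torch.zeros(n, n)
K[0, 1], K[1, 0] = 1.0, -1.0                       # placeholder drift in so(n)
z = torch.nn.functional.normalize(torch.randn(n), dim=0)
for _ in range(100):
    z = gem_step(z, K, sigma=0.2, dt=dt, basis=B)
print(z.norm())                                    # ~1.0: the path stays on the sphere
```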
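For the software dependencies row: a plausible minimal environment and a short usage sketch of the power spherical distribution from [8]. The package names, installation route, and the dimension convention of HypersphericalUniform are assumptions based on the reference implementation (nicola-decao/power_spherical); check them against that repository.

```python
# Assumed installation (names taken from the cited repositories, not verified here):
#   pip install torch einops
#   pip install git+https://github.com/nicola-decao/power_spherical.git
import torch
from einops import rearrange
from power_spherical import PowerSpherical, HypersphericalUniform

n = 16                                             # latent dimension used in the paper
mu = torch.nn.functional.normalize(torch.randn(4, n), dim=-1)  # batch of 4 locations
kappa = 10.0 * torch.ones(4)                       # concentration parameters

q = PowerSpherical(mu, kappa)                      # distribution over points on S^{n-1}
z0 = q.rsample()                                   # reparametrized samples, shape (4, n)
z0 = rearrange(z0, 'b n -> b n 1')                 # einops handles reshaping like this

# KL term against the uniform distribution on the sphere; the dim argument
# convention here is an assumption, check the package documentation.
kl = torch.distributions.kl_divergence(q, HypersphericalUniform(n - 1))
```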
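And for the experiment setup row: a sketch of the reported optimizer configuration, approximating the cyclic cosine schedule with PyTorch's CosineAnnealingWarmRestarts. The stand-in model and loss are placeholders, and whether the authors used this exact scheduler class is an assumption; only the hyperparameters (990 epochs, 60-epoch cycles, learning rate in [1e-6, 1e-3], KL weight 1e-5) come from the paper.

```python
import torch

model = torch.nn.Linear(16, 16)                    # stand-in for the actual model (n = 16)

# Adam with a cyclic cosine schedule: 990 epochs in 60-epoch cycles,
# learning rate oscillating within [1e-6, 1e-3].
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
sched = torch.optim.lr_scheduler.CosineAnnealingWarmRestarts(opt, T_0=60, eta_min=1e-6)

kl_weight = 1e-5                                   # fixed ELBO KL weighting, no annealing
x = torch.randn(8, 16)                             # dummy batch

for epoch in range(990):
    opt.zero_grad()
    loss = model(x).pow(2).mean()                  # placeholder for recon + kl_weight * KL
    loss.backward()
    opt.step()
    sched.step()
```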