Path-Gradient Estimators for Continuous Normalizing Flows
Authors: Lorenz Vaitl, Kim Andrea Nicoli, Shinichi Nakajima, Pan Kessel
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate for two application domains, i.e., VAEs and lattice field theories in theoretical physics, that a simple replacement of the standard total gradient by the path-gradient estimator improves performance across different continuous normalizing flow architectures and datasets, as well as for fixed and adaptive step-size ODE solvers. |
| Researcher Affiliation | Academia | 1Machine Learning Group, Department of Electrical Engineering & Computer Science, Technische Universität Berlin, Germany 2BIFOLD Berlin Institute for the Foundations of Learning and Data, Technische Universität Berlin, Berlin, Germany 3RIKEN Center for AIP, 103-0027 Tokyo, Chuo City, Japan. |
| Pseudocode | Yes | Algorithm 1 Forw-Aug: Forward-mode derivative for path-wise gradient estimators for CNFs (...) Algorithm 2 Full path gradient computation (...) (a hedged sketch of the path-gradient idea follows the table) |
| Open Source Code | Yes | We provide code to reproduce the experiments using VAEs: https://github.com/lenz3000/ffjord-path |
| Open Datasets | Yes | We repeat the VAE experiments in Grathwohl et al. (2019) which train a VAE for four datasets using a FFJORD flow. (...) MNIST, OMNIGLOT, CALTECH SILHOUETTES, FREY FACES |
| Dataset Splits | No | The training also used early stopping, which necessarily implies that training has no fixed runtime. |
| Hardware Specification | Yes | Each model was trained on a single A100 GPU. |
| Software Dependencies | No | The ODE solver was Dopri5. |
| Experiment Setup | Yes | Training was done with a learning rate of 0.001, the Adam optimizer (Kingma & Ba, 2015), and a batch size of 100. The ODE solver was Dopri5. (a matching training-loop sketch follows the table) |
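The pseudocode row points to the paper's two algorithms; Algorithm 1 computes the path-wise derivative by forward-mode differentiation through the ODE integration, which does not fit in a short excerpt. As an illustration of the underlying idea only, the sketch below implements a path-gradient (stick-the-landing style) reverse-KL estimator on a toy invertible flow. `AffineFlow`, `reverse_kl_path_gradient`, and the Gaussian target are illustrative placeholders, not the paper's code.

```python
import copy
import torch


class AffineFlow(torch.nn.Module):
    """Toy invertible flow x = exp(s) * z + b; a stand-in for a CNF."""

    def __init__(self, dim):
        super().__init__()
        self.s = torch.nn.Parameter(torch.zeros(dim))
        self.b = torch.nn.Parameter(torch.zeros(dim))

    def forward(self, z):
        x = z * torch.exp(self.s) + self.b
        log_det = self.s.sum().expand(z.shape[0])       # log|det df/dz|
        return x, log_det

    def inverse(self, x):
        z = (x - self.b) * torch.exp(-self.s)
        log_det = (-self.s).sum().expand(x.shape[0])    # log|det df^-1/dx|
        return z, log_det


def reverse_kl_path_gradient(flow, prior, target_log_prob, batch_size):
    # Sample the "path": reparametrised draw x = f_theta(z).
    z = prior.sample((batch_size,))
    x, _ = flow(z)
    # Re-evaluate log q_theta(x) with the *explicit* parameter dependence
    # detached: a frozen copy routes gradients only through x itself.
    # (A real implementation would avoid the per-step deepcopy; this is
    # written for clarity, not efficiency.)
    frozen = copy.deepcopy(flow)
    for p in frozen.parameters():
        p.requires_grad_(False)
    z_inv, log_det_inv = frozen.inverse(x)
    log_q = prior.log_prob(z_inv).sum(-1) + log_det_inv
    return (log_q - target_log_prob(x)).mean()


# Usage: fit the toy flow to a shifted Gaussian target.
torch.manual_seed(0)
flow = AffineFlow(2)
prior = torch.distributions.Normal(torch.zeros(2), torch.ones(2))
target = lambda x: -0.5 * ((x - 1.0) ** 2).sum(-1)  # unnormalised log-target
loss = reverse_kl_path_gradient(flow, prior, target, batch_size=100)
loss.backward()  # gradients reach the parameters only along the sample path
```

The frozen copy plays the role of a stop-gradient on the flow parameters: log q_theta(x) is re-evaluated so that gradients reach theta only through the sample x, which is what distinguishes the path gradient from the standard total gradient the paper replaces.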
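The experiment-setup row translates directly into a short training-loop skeleton. The sketch below assumes PyTorch with torchdiffeq's `odeint` (the solver library behind FFJORD); `ODEFunc` and the quadratic loss are placeholders rather than the paper's architecture or objective. Only the optimizer, learning rate, batch size, and Dopri5 solver come from the table above.

```python
import torch
from torchdiffeq import odeint  # provides the Dopri5 adaptive solver


class ODEFunc(torch.nn.Module):
    """Toy dynamics network; a stand-in, not the paper's FFJORD model."""

    def __init__(self, dim):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(dim, 64),
            torch.nn.Tanh(),
            torch.nn.Linear(64, dim),
        )

    def forward(self, t, x):
        return self.net(x)


dim, batch_size = 2, 100                                  # batch size 100 (paper)
func = ODEFunc(dim)
optimizer = torch.optim.Adam(func.parameters(), lr=1e-3)  # Adam, lr 0.001 (paper)

x0 = torch.randn(batch_size, dim)
t = torch.tensor([0.0, 1.0])

optimizer.zero_grad()
xT = odeint(func, x0, t, method="dopri5")[-1]  # integrate with Dopri5 (paper)
loss = xT.pow(2).mean()                        # placeholder objective
loss.backward()
optimizer.step()
```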