Moser Flow: Divergence-based Generative Modeling on Manifolds
Authors: Noam Rozen, Aditya Grover, Maximilian Nickel, Yaron Lipman
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, we demonstrate for the first time the use of flow models for sampling from general curved surfaces and achieve significant improvements in density estimation, sample quality, and training complexity over existing CNFs on challenging synthetic geometries and real-world benchmarks from the earth and climate sciences. We evaluate Moser Flows on a wide range of challenging real and synthetic problems defined over many different domains. On synthetic problems, we demonstrate improvements in convergence speed for attaining a desired level of detail in generation quality. We then experiment with two kinds of complex geometries. First, we show significant improvements of 49% on average over Riemannian CNFs (Mathieu and Nickel, 2020) for density estimation, as well as high-fidelity generation, on 4 earth and climate science datasets corresponding to global locations of volcano eruptions, earthquakes, floods, and wildfires on spherical geometries. Finally, we go beyond spherical geometries to demonstrate, for the first time, generative models on general curved surfaces. |
| Researcher Affiliation | Collaboration | Noam Rozen (1), Aditya Grover (2, 3), Maximilian Nickel (2), Yaron Lipman (1, 2); affiliations: (1) Weizmann Institute of Science, (2) Facebook AI Research, (3) UCLA |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any concrete access information (e.g., a link or explicit statement of code release) for its source code. |
| Open Datasets | Yes | We consider a collection of challenging toy 2D datasets explored in prior works (Chen et al., 2020; Huang et al., 2021). We evaluate our model on the earth and climate datasets gathered in Mathieu and Nickel (2020). We trained an SDF f for the Stanford Bunny surface M using the method in Gropp et al. (2020). |
| Dataset Splits | No | The paper mentions training and testing data but does not provide specific details on validation splits (e.g., percentages, sample counts, or explicit validation set descriptions). |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper mentions software components like "Adam optimizer" and "Softplus activation" and the "Marching Cubes algorithm" but does not specify their version numbers or other ancillary software details needed for reproduction. |
| Experiment Setup | Yes | All models were trained using the Adam optimizer (Kingma and Ba, 2014), and in all neural networks the activation is Softplus with β = 100. ... The MLP architecture used for Moser Flows consists of 3 hidden layers with 256 units each, whereas in the bottom two we used 4 hidden layers with 256 neurons each... We set λ = 2. ... We used a batch size of 10k. We used a learning rate of 1e-5 for Moser Flow and 1e-4 for FFJORD. ... We parameterize vθ as an MLP with 6 hidden layers of 512 neurons each. We used full batches for the NLL loss and batches of size 150k for our integral approximation. We trained for 30k epochs, with a learning rate of 1e-4. We used λ = 100. ... We take vθ to be an MLP with 6 hidden layers of dimension 512. We use a batch size of 10k for both the NLL loss and the integral approximation; we ran for 1000 epochs with a learning rate of 1e-4. We used λ = λ+ = 1. (A hedged code sketch of this setup follows the table.) |
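The Experiment Setup row contains enough detail to sketch one training step in code. Below is a minimal sketch, not the authors' implementation: the paper does not name its framework, so PyTorch is an assumption. It wires together an MLP vector field vθ with Softplus(β = 100) activations, the Moser Flow density μθ(x) = ν(x) − div vθ(x), and the NLL-plus-penalty objective with weight λ, optimized with Adam at one of the quoted learning rates. The names `VectorField`, `divergence`, and `moser_flow_loss` are ours, and the uniform prior ν on [0, 1]² is illustrative.

```python
import torch
import torch.nn as nn

class VectorField(nn.Module):
    """MLP v_theta: R^d -> R^d with Softplus(beta=100) activations,
    matching the architecture quoted in the setup row (class name is ours)."""
    def __init__(self, dim, hidden=512, depth=6):
        super().__init__()
        layers, width = [], dim
        for _ in range(depth):
            layers += [nn.Linear(width, hidden), nn.Softplus(beta=100)]
            width = hidden
        layers.append(nn.Linear(width, dim))
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)

def divergence(v, x):
    """Exact divergence of v at x via autograd, one gradient call per
    coordinate (cheap for the low-dimensional data used in the paper)."""
    x = x.requires_grad_(True)
    vx = v(x)
    div = torch.zeros(x.shape[0], device=x.device)
    for i in range(x.shape[1]):
        div = div + torch.autograd.grad(
            vx[:, i].sum(), x, create_graph=True)[0][:, i]
    return div

def moser_flow_loss(v, x_data, y_unif, prior_density, lam, eps=1e-5):
    """Moser Flow objective (Euclidean sketch): the model density is
    mu(x) = nu(x) - div v(x); maximize data likelihood under the positive
    part mu+ = max(eps, mu) and penalize the negative part
    mu- = mu+ - mu on uniform samples, weighted by lambda."""
    mu_data = prior_density(x_data) - divergence(v, x_data)
    mu_unif = prior_density(y_unif) - divergence(v, y_unif)
    mu_plus = torch.clamp(mu_data, min=eps)       # mu+ on data points
    mu_minus = torch.clamp(eps - mu_unif, min=0)  # mu- on penalty points
    return -torch.log(mu_plus).mean() + lam * mu_minus.mean()

# Illustrative training step on 2D data with a uniform prior nu on [0,1]^2.
dim, lam = 2, 2.0                                 # lambda = 2 per the setup row
v = VectorField(dim)
opt = torch.optim.Adam(v.parameters(), lr=1e-4)   # one of the quoted rates
uniform = lambda x: torch.ones(x.shape[0])        # density of U([0,1]^2)

x_data = torch.rand(10_000, dim)                  # stand-in for a data batch
y_unif = torch.rand(10_000, dim)                  # integral-approximation batch
loss = moser_flow_loss(v, x_data, y_unif, uniform, lam)
opt.zero_grad(); loss.backward(); opt.step()
```

On spherical and general surface geometries, the paper replaces this Euclidean divergence with the Riemannian divergence and draws the penalty samples from the manifold; the sketch above covers only the flat 2D case.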