Large-Scale Wasserstein Gradient Flows

Authors: Petr Mokrov, Alexander Korotin, Lingxiao Li, Aude Genevay, Justin M. Solomon, Evgeny Burnaev

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In this section, we evaluate our method on toy and real-world applications. Our code is written in PyTorch and is publicly available at https://github.com/PetrMokrov/Large-Scale-Wasserstein-Gradient-Flows. The experiments are conducted on a GTX 1080Ti. In most cases, we performed several random restarts to obtain the mean and variation of the considered metric. As a result, experiments require about 100-150 hours of computation."
Researcher Affiliation | Academia | Petr Mokrov (Skolkovo Institute of Science and Technology; Moscow Institute of Physics and Technology, Moscow, Russia; petr.mokrov@skoltech.ru), Alexander Korotin (Skolkovo Institute of Science and Technology; Artificial Intelligence Research Institute, Moscow, Russia; a.korotin@skoltech.ru), Lingxiao Li (Massachusetts Institute of Technology, Cambridge, Massachusetts, USA; lingxiao@mit.edu), Aude Genevay (Massachusetts Institute of Technology, Cambridge, Massachusetts, USA; aude.genevay@gmail.com), Justin Solomon (Massachusetts Institute of Technology, Cambridge, Massachusetts, USA; jsolomon@mit.edu), Evgeny Burnaev (Skolkovo Institute of Science and Technology; Artificial Intelligence Research Institute, Moscow, Russia; e.burnaev@skoltech.ru)
Pseudocode | Yes | Algorithm 1 (Fokker-Planck JKO via ICNNs). Input: initial measure ρ_0 accessible by samples; JKO discretization step h > 0; number of JKO steps K > 0; target potential Φ(x); diffusion-process temperature β^{-1}; batch size N. Output: trained ICNN models {ψ^{(k)}}_{k=1}^{K} representing the JKO steps. The body iterates "for k = 1, 2, ..., K do ...". (A hedged PyTorch sketch of one such step appears after the table.)
Open Source Code | Yes | "Our code is written in PyTorch and is publicly available at https://github.com/PetrMokrov/Large-Scale-Wasserstein-Gradient-Flows."
Open Datasets | Yes | "For evaluation, we consider the Bayesian logistic regression setup of [42]. We use the 8 datasets from [47]. The number of features ranges from 2 to 60 and the dataset size from 700 to 7400 data points. We also use the Covertype dataset with 500K data points and 54 features (https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html)."
Dataset Splits | No | "We randomly split each dataset into train S_train and test S_test parts with ratio 4:1 and apply the inference on the posterior p(x|S_train)." The paper specifies a train/test split ratio (4:1) but does not mention a separate validation split or its details. (A hedged sketch of such a split appears after the table.)
Hardware Specification | Yes | "The experiments are conducted on a GTX 1080Ti."
Software Dependencies | No | "Our code is written in PyTorch and is publicly available at https://github.com/PetrMokrov/Large-Scale-Wasserstein-Gradient-Flows." The paper names PyTorch but does not specify its version or any other software dependencies with version numbers.
Experiment Setup | Yes | "In our method, we perform K = 40 JKO steps with step size h = 0.1. We approximate the dynamics of the process by our method with JKO step h = 0.05. In all experiments, we use the DenseICNN [37, Appendix B.2] architecture for ψ_θ in Algorithm 1 with SoftPlus activations." (A hedged sketch of a driver loop with these hyperparameters appears after the table.)
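
The quoted Algorithm 1 trains one ICNN per JKO step. Below is a minimal, hedged PyTorch sketch of what the objective of a single step could look like. The ToyICNN class is a simplified stand-in for the paper's DenseICNN, and jko_step_loss follows the JKO decomposition described in the paper: a Wasserstein-2 transport term, a potential-energy term, and an entropy term via the log-determinant of the Hessian of ψ. Layer sizes, the jitter constant, and the exact weight-clamping scheme are illustrative assumptions, not the authors' implementation.

```python
import torch

# A toy input-convex network (illustrative stand-in for the paper's DenseICNN):
# psi is convex in x because hidden-to-hidden weights are clamped non-negative
# and Softplus is convex and non-decreasing.
class ToyICNN(torch.nn.Module):
    def __init__(self, dim, width=64):
        super().__init__()
        self.Wx0 = torch.nn.Linear(dim, width)   # first layer: unconstrained
        self.Wx1 = torch.nn.Linear(dim, width)   # skip connection from the input
        self.Wz1 = torch.nn.Parameter(0.1 * torch.rand(width, width))  # kept >= 0
        self.out = torch.nn.Parameter(0.1 * torch.rand(width))         # kept >= 0
        self.act = torch.nn.Softplus()

    def forward(self, x):                        # x: (batch, dim) -> (batch,)
        z = self.act(self.Wx0(x))
        z = self.act(self.Wx1(x) + z @ self.Wz1.clamp(min=0).t())
        return z @ self.out.clamp(min=0)

def jko_step_loss(psi, x, target_potential, h=0.1, beta=1.0):
    """Stochastic JKO objective for one step: transport + energy + entropy."""
    x = x.detach().requires_grad_(True)
    # The optimal map is parameterized as T(x) = grad psi(x).
    (grad_psi,) = torch.autograd.grad(psi(x).sum(), x, create_graph=True)
    transport = ((grad_psi - x) ** 2).sum(dim=1).mean() / (2.0 * h)  # W2 cost
    energy = target_potential(grad_psi).mean()                       # E Phi(T(x))
    # Entropy enters through -beta^{-1} E[log det Hess psi(x)] (change of
    # variables under T = grad psi); exact per-sample Hessians are fine in
    # low dimension, though far slower than estimators used at scale.
    eye = 1e-6 * torch.eye(x.shape[1])
    logdets = [
        torch.logdet(torch.autograd.functional.hessian(
            lambda v: psi(v.unsqueeze(0)).squeeze(), xi, create_graph=True) + eye)
        for xi in x
    ]
    return transport + energy - torch.stack(logdets).mean() / beta
```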
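
For the Covertype experiment, the quoted 4:1 split could be reproduced along the following lines. The LIBSVM file name and the use of scikit-learn's load_svmlight_file are assumptions (the paper only links the dataset page), and the random seed is arbitrary.

```python
import numpy as np
from sklearn.datasets import load_svmlight_file

# Load the LIBSVM copy of Covertype (~500K points, 54 features); the exact
# file name below is an assumption based on the linked dataset page.
X, y = load_svmlight_file("covtype.libsvm.binary.scale.bz2")
X = np.asarray(X.todense())

# Random 4:1 train/test split, as in the quoted setup (no validation split
# is described in the paper).
rng = np.random.default_rng(0)
idx = rng.permutation(len(y))
cut = int(0.8 * len(y))
X_train, y_train = X[idx[:cut]], y[idx[:cut]]
X_test, y_test = X[idx[cut:]], y[idx[cut:]]
```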
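
Putting the quoted hyperparameters together, a driver loop in the spirit of Algorithm 1 could look as follows. It reuses ToyICNN and jko_step_loss from the first sketch; the inner iteration count, learning rate, batch size, and quadratic toy potential are illustrative assumptions rather than values reported in the paper.

```python
import torch

K, h, beta = 40, 0.1, 1.0                    # K JKO steps of size h, as quoted
dim, batch_size, inner_iters = 2, 512, 500   # assumptions, not from the paper

def target_potential(x):                     # toy quadratic potential (assumption)
    return 0.5 * (x ** 2).sum(dim=1)

def sample_rho0(n):                          # initial measure, given by samples
    return torch.randn(n, dim)

def push_forward(x, models):
    # Transport samples through the already-trained steps: x -> grad psi(x).
    for psi in models:
        x = x.detach().requires_grad_(True)
        (x,) = torch.autograd.grad(psi(x).sum(), x)
    return x.detach()

trained = []                                 # plays the role of {psi^(k)}_{k=1}^{K}
for k in range(K):
    psi = ToyICNN(dim)                       # stand-in for DenseICNN with SoftPlus
    opt = torch.optim.Adam(psi.parameters(), lr=1e-3)
    for _ in range(inner_iters):
        x = push_forward(sample_rho0(batch_size), trained)
        loss = jko_step_loss(psi, x, target_potential, h=h, beta=beta)
        opt.zero_grad()
        loss.backward()
        opt.step()
    trained.append(psi)
```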