Non-geodesically-convex optimization in the Wasserstein space
Authors: Hoang Phuc Hau Luu, Hanlin Yu, Bernardo Williams, Petrus Mikkola, Marcelo Hartmann, Kai Puolamäki, Arto Klami
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We perform numerical sampling experiments from non-log-concave distributions: the Gaussian mixture distribution and the distance-to-set-prior [61] relaxed von Mises-Fisher distribution. Both are log-DC and the latter has non-differentiable logarithmic probability density (see Appx. C). Fig. 1 presents the sampling results. Experiment details are in Appx. B and Appx. C.1. |
| Researcher Affiliation | Academia | Hoang Phuc Hau Luu, Hanlin Yu, Bernardo Williams, Petrus Mikkola, Marcelo Hartmann, Kai Puolamäki, Arto Klami; Department of Computer Science, University of Helsinki |
| Pseudocode | Yes | Algorithm 1 (Semi FB Euler for sampling) and Algorithm 2 (FB Euler for sampling), both given in Appendix B. |
| Open Source Code | Yes | Our code is available at https://github.com/MCS-hub/OW24 |
| Open Datasets | No | The paper defines synthetic distributions (Gaussian mixture, relaxed von Mises-Fisher) with parameters in Appendix C for its experiments, rather than using pre-existing public datasets. No specific link, DOI, or formal citation to a public dataset is provided. |
| Dataset Splits | No | The paper conducts numerical sampling experiments and describes training parameters (e.g., iterations, learning rates, batch size) but does not define explicit training, validation, or test dataset splits in the conventional supervised learning sense. |
| Hardware Specification | No | We perform numerical experiments on a high-performance computing cluster with GPU support. We allocate 8 GB of memory for the experiments. |
| Software Dependencies | No | We use Python version 3.8.0. Our implementation is based on the code of [53] (MIT license) with the Dense ICNN architecture [41]. |
| Experiment Setup | Yes | Experiment details: We set K = 5 and randomly generate x1, x2, ..., x5 ∈ R^2. We set σ = 1. The initial distribution is µ0 = N(0, 16I). We use η = 0.1 for both FB Euler and semi FB Euler. We train both algorithms for 40 iterations using the Adam optimizer with a batch size of 512, in which the first 20 iterations use a learning rate of 5 × 10^-3 while the latter 20 iterations use 2 × 10^-3. For the baseline ULA, we run 10000 chains in parallel for 4000 iterations with a learning rate of 10^-3. |
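
To make the experiment setup above concrete, the sketch below implements the equal-weight Gaussian-mixture target (K = 5 components with σ = 1) and the ULA baseline with the quoted settings (10000 parallel chains, 4000 iterations, step size 10^-3, initialization from µ0 = N(0, 16I)). This is a minimal illustrative sketch, not the authors' implementation (which builds on the code of [53] with the Dense ICNN architecture [41]); the component means are a hypothetical random draw, since their exact values are not reported.

```python
# Hedged sketch (not the paper's code): Gaussian-mixture target and ULA baseline
# with the parameters quoted in the experiment setup above.
import numpy as np

rng = np.random.default_rng(0)

K, sigma = 5, 1.0                   # K mixture components, sigma = 1
means = rng.normal(size=(K, 2))     # x1, ..., x5 in R^2 (illustrative random draw)

def grad_log_pi(x):
    """Gradient of the log-density of an equal-weight Gaussian mixture.

    x: (n, 2) array of chain states; returns an (n, 2) array.
    """
    diffs = x[:, None, :] - means[None, :, :]                # (n, K, 2): x - mean_k
    w = np.exp(-0.5 * np.sum(diffs**2, axis=-1) / sigma**2)  # unnormalised responsibilities
    w /= w.sum(axis=1, keepdims=True)
    # grad log pi(x) = -sum_k w_k (x - mean_k) / sigma^2
    return -(w[:, :, None] * diffs).sum(axis=1) / sigma**2

# ULA baseline: 10000 parallel chains, 4000 iterations, step size 1e-3,
# initialised from mu0 = N(0, 16 I).
n_chains, n_iters, eta = 10_000, 4_000, 1e-3
x = rng.normal(scale=4.0, size=(n_chains, 2))   # std 4 gives covariance 16 I
for _ in range(n_iters):
    x = x + eta * grad_log_pi(x) + np.sqrt(2 * eta) * rng.normal(size=x.shape)

# `x` now holds approximate samples from the mixture target.
```

The semi FB Euler and FB Euler schemes of Algorithms 1 and 2 replace this plain Langevin update with the forward-backward splitting studied in the paper; their details are in Appendix B and are not reproduced here.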