Interaction-Force Transport Gradient Flows
Authors: Egor Gladin, Pavel Dvurechensky, Alexander Mielke, Jia-Jie Zhu
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We then empirically demonstrate the use of the IFT gradient flow for the MMD inference task. Compared to the original MMD-energy-flow algorithm of Arbel et al. [2019], the IFT flow does not suffer from the collapsing-to-mode issue. Leveraging the first-principled spherical IFT gradient flow, our method does not require a heuristic noise injection that is commonly tuned over the iterations in practice; see [Korba et al., 2021] for a discussion. Our method can also be viewed as addressing a long-standing issue of the kernel mean embedding methods [Smola et al., 2007, Muandet et al., 2017, Lacoste-Julien et al., 2015] for optimizing the support of distributions. (Section 4, Numerical Example) The overall goal of the numerical experiments is to approximate the target measure π by minimizing the squared MMD energy, i.e., min_{µ ∈ P} MMD²(µ, π). (A minimal MMD² sketch follows the table.) |
| Researcher Affiliation | Academia | Egor Gladin, Humboldt University of Berlin, Berlin, Germany & HSE University, egorgladin@yandex.ru; Pavel Dvurechensky, Weierstrass Institute for Applied Analysis and Stochastics, Berlin, Germany, pavel.dvurechensky@wias-berlin.de; Alexander Mielke, Humboldt University of Berlin & WIAS, Berlin, Germany, alexander.mielke@wias-berlin.de; Jia-Jie Zhu, Weierstrass Institute for Applied Analysis and Stochastics, Berlin, Germany, jia-jie.zhu@wias-berlin.de |
| Pseudocode | Yes | We summarize the resulting overall IFT particle gradient descent from the JKO splitting scheme in Algorithm 1 in the appendix. ... Algorithm 1: A JKO-splitting for IFT particle gradient descent |
| Open Source Code | Yes | We provide the code for the implementation at https://github.com/egorgladin/ift_flow. |
| Open Datasets | Yes | The overall goal of the numerical experiments is to approximate the target measure π by minimizing the squared MMD energy, i.e., min_{µ ∈ P} MMD²(µ, π). In all the experiments, we have access to the target measure π in the form of samples y_i ~ π. This setting was studied in [Arbel et al., 2019] as well as in many deep generative model applications. ... µ₀ ~ N(5·𝟙, I) and π ~ N(0, [[1, 1/2], [1/2, 2]]). ... this time the target is a mixture of equally weighted Gaussian distributions, N(0, [[1, 1/2], [1/2, 2]]), N([3, 1]ᵀ, I), and N([1, 4]ᵀ, [[3, 1/2], [1/2, 1]]). (A sampling sketch for these targets follows the table.) |
| Dataset Splits | No | The paper does not explicitly mention a 'validation set' or a 'validation split' of the data. The experiments focus on approximating a target distribution and sampling, rather than supervised learning with distinct validation data. |
| Hardware Specification | No | The main text of the paper does not provide specific hardware details such as GPU/CPU models, processor types, or memory amounts used for running the experiments. The NeurIPS checklist states that 'Our experiments are small-scale and can be reproduced on a standard laptop', but this general statement lacks the specificity required for hardware reproduction. |
| Software Dependencies | No | The paper does not explicitly provide specific software dependencies with version numbers (e.g., library names like PyTorch with a version number) within its main text or appendix. |
| Experiment Setup | Yes | A Gaussian kernel with bandwidth σ = 10 was used. For all three algorithms, we chose the largest stepsize that didn't cause unstable behavior, τ = 50. The parameter η in (23) was set to 0.1. (See the descent-step sketch after the table.) |
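As a concrete reference point for the objective and setup quoted above, the sketch below estimates the squared MMD between a particle set and target samples with a Gaussian kernel and takes one plain particle-descent step on it. This is a minimal illustration only: the bandwidth convention exp(-||x - y||² / (2σ²)), the biased V-statistic estimator, and the plain (Wasserstein-only) update are assumptions on my part; the paper's IFT/spherical flow additionally involves an interaction-force component and the parameter η from its Eq. (23), neither of which is reproduced here.

```python
import numpy as np

def gaussian_kernel(A, B, sigma=10.0):
    """Gaussian kernel matrix k(a, b) = exp(-||a - b||^2 / (2 sigma^2)).

    The (2 sigma^2) convention is an assumption; the paper only states
    "a Gaussian kernel with bandwidth sigma = 10".
    """
    sq = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2.0 * A @ B.T
    return np.exp(-sq / (2.0 * sigma**2))

def squared_mmd(X, Y, sigma=10.0):
    """Biased (V-statistic) estimate of MMD^2(mu, pi) for particles X and samples Y."""
    return (gaussian_kernel(X, X, sigma).mean()
            + gaussian_kernel(Y, Y, sigma).mean()
            - 2.0 * gaussian_kernel(X, Y, sigma).mean())

def mmd_particle_grad(X, Y, sigma=10.0):
    """Euclidean gradient of the biased MMD^2 estimate w.r.t. each particle in X."""
    n, m = len(X), len(Y)
    Kxx = gaussian_kernel(X, X, sigma)                       # (n, n)
    Kxy = gaussian_kernel(X, Y, sigma)                       # (n, m)
    # sum_j k(x_i, z_j) * (x_i - z_j) / sigma^2, assembled with matrix products
    g_xx = (Kxx.sum(1, keepdims=True) * X - Kxx @ X) / sigma**2
    g_xy = (Kxy.sum(1, keepdims=True) * X - Kxy @ Y) / sigma**2
    return -2.0 * g_xx / n**2 + 2.0 * g_xy / (n * m)

def mmd_descent_step(X, Y, tau=50.0, sigma=10.0):
    """One plain MMD-descent step with the stepsize quoted in the setup (tau = 50)."""
    return X - tau * mmd_particle_grad(X, Y, sigma)
```

A full run would iterate `mmd_descent_step` and, for the IFT flow, also update the particle weights; for the authors' actual implementation, see the linked repository.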
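For completeness, here is a minimal sketch of how the target and initial measures quoted in the Open Datasets row could be sampled (the 2-D single-Gaussian target, the three-component equally weighted mixture, and µ₀ ~ N(5·𝟙, I)); the sample sizes, seed, and function names are illustrative choices, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_single_gaussian_target(n):
    """Target of the first experiment: pi ~ N(0, [[1, 1/2], [1/2, 2]])."""
    cov = np.array([[1.0, 0.5], [0.5, 2.0]])
    return rng.multivariate_normal(np.zeros(2), cov, size=n)

def sample_mixture_target(n):
    """Second experiment: equally weighted mixture of three 2-D Gaussians."""
    means = [np.zeros(2), np.array([3.0, 1.0]), np.array([1.0, 4.0])]
    covs = [np.array([[1.0, 0.5], [0.5, 2.0]]), np.eye(2),
            np.array([[3.0, 0.5], [0.5, 1.0]])]
    comp = rng.integers(0, 3, size=n)                 # equal weights 1/3 each
    return np.stack([rng.multivariate_normal(means[c], covs[c]) for c in comp])

def sample_initial_particles(n):
    """Initial measure mu_0 ~ N(5 * ones, I)."""
    return rng.multivariate_normal(5.0 * np.ones(2), np.eye(2), size=n)
```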