Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
An Explicit Expansion of the Kullback-Leibler Divergence along its Fisher-Rao Gradient Flow
Authors: Carles Domingo-Enrich, Aram-Alexandre Pooladian
TMLR 2023 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conclude with simple synthetic experiments that demonstrate our theoretical findings are indeed tight. Based on our numerical findings, we conjecture that the asymptotic rates of convergence for Wasserstein-Fisher-Rao gradient flows may be related to this expansion in some cases. |
| Researcher Affiliation | Academia | Carles Domingo-Enrich EMAIL Courant Institute of Mathematical Sciences New York University Aram-Alexandre Pooladian EMAIL Center for Data Science New York University |
| Pseudocode | Yes | We used the following discretizations for the Fisher-Rao, Wasserstein and Wasserstein-Fisher-Rao gradient flows: (i) Fisher-Rao GF: We use mirror descent in log-space. The update reads: $x_{k+1} \leftarrow x_k + \epsilon(-v - x_k)$, $x_{k+1} \leftarrow x_{k+1} - \log \sum_{i=1}^{n} e^{x^i_{k+1}}$. (ii) Wasserstein GF: We approximate numerically the gradient and the Laplacian of the log-density: $\forall i \in [n]$, $(\nabla x_k)^i \approx (x^{i+1}_k - x^{i-1}_k)/(2h)$; $\forall i \in [n]$, $(\Delta x_k)^i \approx (x^{i+1}_k + x^{i-1}_k - 2x^i_k)/h^2$; $x_{k+1} \leftarrow x_k + \epsilon(\Delta v + \Delta x_k + \nabla(v + x_k) \cdot \nabla x_k)$. (iii) Wasserstein-Fisher-Rao GF: We combine the two previous updates. Letting $\nabla x_k$ and $\Delta x_k$ be as in Eq. (41), we have $x_{k+1} \leftarrow x_k + \epsilon(-v - x_k + \Delta v + \Delta x_k + \nabla(v + x_k) \cdot \nabla x_k)$, $x_{k+1} \leftarrow x_{k+1} - \log \sum_{i=1}^{n} e^{x^i_{k+1}}$. |
| Open Source Code | No | The paper does not provide any concrete access information (e.g., repository link, explicit code release statement) for the source code. |
| Open Datasets | No | We present simple numerical simulations that demonstrate our asymptotic convergence rate of the KL divergence along the FR gradient flow, as well as a comparison with the WFR- and W-GFs. We consider two target distributions over the set $[-\pi, \pi)$, each with two initializations: 1. Target distribution $\pi_1$: We set $\pi_1 \propto e^{-V_1}$ with $V_1(x) = 2.5\cos(2x) + 0.5\sin(x)$. [...] 2. Target distribution $\pi_2$: We set $\pi_2 \propto e^{-V_2}$ with $V_2(x) = 6\cos(x)$. [...] |
| Dataset Splits | No | The paper uses synthetically generated data by defining specific potential functions for target and initial distributions. There is no mention of dataset splits (e.g., training, test, validation sets) as would be found with pre-existing datasets. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific software dependency details (e.g., library names with version numbers) needed to replicate the experiment. |
| Experiment Setup | Yes | To run the simulations in Section 4, we discretized the interval $[-\pi, \pi)$ into $n = 2000$ equispaced points. Let $h = 2\pi/n$. For each algorithm and initialization, we construct sequences $(x_k)_{k \geq 0}$, where $x_k \in \mathbb{R}^n$ represents the normalized log-density at each point. ... We used stepsizes $\epsilon = 2.5 \times 10^{-6}$ and $\epsilon = 1 \times 10^{-6}$ for the experiments on target distributions (1) and (2), respectively. ... We use periodic boundary conditions, so that the first discretization point is adjacent to the last one for the purposes of computing derivatives. ... We use different values of $t_1$ and $t_2$ for each target distribution; $t_1$ and $t_2$ must be large enough to capture the asymptotic slope of the curve, but not too large, to avoid numerical errors. For all the curves corresponding to target $\pi_1$, we take $t_1 = 7.0$ and $t_2 = 7.5$. For target $\pi_2$, we take: for FR, $t_1 = 6.875$ and $t_2 = 7.0$; for WFR, $t_1 = 1.875$ and $t_2 = 2.0$; for W, $t_1 = 2.75$ and $t_2 = 2.875$. |
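The discretized updates quoted in the Pseudocode row, combined with the grid and stepsizes from the Experiment Setup row, can be sketched as follows. This is a minimal NumPy illustration under the assumption that the target is $\pi \propto e^{-v}$ (consistent with the Fisher-Rao drift $-v - x_k$); all function and variable names are our own, not the authors' code.

```python
import numpy as np

n = 2000                    # grid points, as in the experiment setup
h = 2 * np.pi / n           # grid spacing
eps = 2.5e-6                # stepsize quoted for target distribution (1)

grid = np.linspace(-np.pi, np.pi, n, endpoint=False)
v = 2.5 * np.cos(2 * grid) + 0.5 * np.sin(grid)   # potential V1

def normalize(x):
    """Log-sum-exp step: project back to a normalized log-density."""
    return x - np.log(np.sum(np.exp(x)))

def grad(x):
    """Central difference with periodic boundary conditions."""
    return (np.roll(x, -1) - np.roll(x, 1)) / (2 * h)

def laplacian(x):
    """Second-order central difference with periodic boundaries."""
    return (np.roll(x, -1) + np.roll(x, 1) - 2 * x) / h**2

def fr_step(x):
    # Fisher-Rao GF: mirror descent in log-space, then renormalize.
    return normalize(x + eps * (-v - x))

def w_step(x):
    # Wasserstein GF: update on the log-density via finite differences.
    return x + eps * (laplacian(v) + laplacian(x) + grad(v + x) * grad(x))

def wfr_step(x):
    # Wasserstein-Fisher-Rao GF: combine the two previous updates.
    return normalize(x + eps * (-v - x + laplacian(v) + laplacian(x)
                                + grad(v + x) * grad(x)))

# One illustrative step from a uniform initialization.
x0 = normalize(np.zeros(n))
x1 = fr_step(x0)
```

In practice one would iterate these steps until the times $t_1, t_2$ quoted above and read off the asymptotic slope of the KL curve; the sketch only shows the update rules themselves.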