Score-based Generative Modeling Secretly Minimizes the Wasserstein Distance
Authors: Dohyun Kwon, Ying Fan, Kangwook Lee
NeurIPS 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our numerical experiments support our findings. By analyzing our upper bounds, we provide a few techniques to obtain tighter upper bounds. |
| Researcher Affiliation | Academia | Dohyun Kwon, Ying Fan, Kangwook Lee; University of Wisconsin-Madison |
| Pseudocode | No | No pseudocode or algorithm block was found in the paper. |
| Open Source Code | Yes | Code is available at https://github.com/UW-Madison-Lee-Lab/score-wasserstein. |
| Open Datasets | Yes | Here we adopt three 2D datasets for simulation: one-cluster Gaussian N(0, 0.1I), two moons in [28], and a four-cluster Gaussian mixture N((±0.5, ±0.5), 0.01I) with equal weights for each cluster. (A generation sketch follows the table.) |
| Dataset Splits | No | The paper does not explicitly provide training/validation/test dataset splits with percentages, sample counts, or references to predefined splits for the datasets used. |
| Hardware Specification | No | No specific hardware details (e.g., GPU models, CPU types, or memory) used for running the experiments are mentioned. |
| Software Dependencies | No | The paper mentions software like AdamW, POT, and scikit-learn, but does not specify their version numbers. |
| Experiment Setup | Yes | We use a 4-layer neural network as the score matching model, with ReLU nonlinearity and a skip-connection at the final output. Each layer is composed of a linear layer with 64 hidden neurons and an embedding layer for 10 timesteps. For the optimizer, we use AdamW [22] with learning rate 0.001 and weight decay coefficient 0.01. For the loss function, we use JDSM with λ(t) = g(t)² and batch size 128. (See the sketch after the table.) |
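
Below is a minimal sketch of how the three 2D datasets quoted in the Open Datasets row could be generated, assuming the two-moons data come from scikit-learn's `make_moons` (the paper's reference [28]) and the Gaussian data are drawn with NumPy. The helper names, sample counts, and moons noise level are illustrative and not taken from the released code.

```python
import numpy as np
from sklearn.datasets import make_moons


def sample_one_cluster(n, rng):
    # One-cluster Gaussian N(0, 0.1 I) in 2D.
    return rng.multivariate_normal(np.zeros(2), 0.1 * np.eye(2), size=n)


def sample_two_moons(n, rng):
    # Two-moons dataset from scikit-learn [28]; noise level is illustrative.
    x, _ = make_moons(n_samples=n, noise=0.05, random_state=int(rng.integers(2**31)))
    return x


def sample_four_clusters(n, rng):
    # Four-cluster Gaussian mixture N((±0.5, ±0.5), 0.01 I) with equal weights.
    centers = np.array([[0.5, 0.5], [0.5, -0.5], [-0.5, 0.5], [-0.5, -0.5]])
    idx = rng.integers(0, 4, size=n)
    return centers[idx] + rng.multivariate_normal(np.zeros(2), 0.01 * np.eye(2), size=n)


rng = np.random.default_rng(0)
data = sample_four_clusters(1000, rng)  # e.g., draw 1000 samples from the mixture
```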
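The Experiment Setup row translates into roughly the following PyTorch sketch. Only the stated hyperparameters (4 layers, 64 hidden units, 10 timesteps, ReLU, skip-connection at the final output, AdamW with learning rate 0.001 and weight decay 0.01, batch size 128) come from the quote; the class name `ScoreNet`, the input/output projections, and the exact wiring of the timestep embeddings and skip connection are our assumptions, not the authors' released architecture.

```python
import torch
import torch.nn as nn


class ScoreNet(nn.Module):
    """Hypothetical reading of the quoted setup: 4 layers, each a linear layer
    with 64 hidden units plus an embedding over 10 discrete timesteps, ReLU
    nonlinearity, and a skip-connection added at the final output."""

    def __init__(self, dim=2, hidden=64, num_timesteps=10, num_layers=4):
        super().__init__()
        self.input = nn.Linear(dim, hidden)
        self.layers = nn.ModuleList([nn.Linear(hidden, hidden) for _ in range(num_layers)])
        self.embeds = nn.ModuleList([nn.Embedding(num_timesteps, hidden) for _ in range(num_layers)])
        self.output = nn.Linear(hidden, dim)
        self.skip = nn.Linear(dim, dim)  # skip-connection at the final output

    def forward(self, x, t):
        h = torch.relu(self.input(x))
        for layer, embed in zip(self.layers, self.embeds):
            h = torch.relu(layer(h) + embed(t))
        return self.output(h) + self.skip(x)


model = ScoreNet()
# AdamW [22] with learning rate 0.001 and weight decay 0.01, as stated above.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=0.01)

# Shape check with the quoted batch size; the JDSM loss with λ(t) = g(t)²
# would be computed from these score estimates and the forward-SDE noise.
x = torch.randn(128, 2)             # batch size 128
t = torch.randint(0, 10, (128,))    # one of 10 discrete timesteps per sample
score = model(x, t)                 # shape (128, 2)
```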