Shedding a PAC-Bayesian Light on Adaptive Sliced-Wasserstein Distances
Authors: Ruben Ohana, Kimia Nadjahi, Alain Rakotomamonjy, Liva Ralaivola
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We provide three types of results: i) PAC-Bayesian generalization bounds that hold on what we refer to as adaptive Sliced-Wasserstein distances, i.e. SW defined with respect to arbitrary distributions of slices (among which data-dependent distributions), ii) a principled procedure to learn the distribution of slices that yields maximally discriminative SW, by optimizing our theoretical bounds, and iii) empirical illustrations of our theoretical findings. |
| Researcher Affiliation | Collaboration | 1Flatiron Institute, USA 2MIT, USA 3Criteo AI Lab, France. |
| Pseudocode | Yes | Algorithm 1 PAC-SW: Adaptive SW via PAC-Bayes bound optimization. ... Algorithm A2 PAC-Bayes bound optimization for vMF-based SW |
| Open Source Code | Yes | All our numerical experiments presented in Section 5 can be reproduced using the code we provided in https://github.com/rubenohana/PACBayesian_Sliced-Wasserstein. |
| Open Datasets | Yes | We consider a generative modeling task on MNIST data (Deng, 2012) |
| Dataset Splits | No | The paper mentions training and test data, but does not explicitly describe a validation split or its methodology. |
| Hardware Specification | Yes | Timing results of this experiment were obtained with an NVIDIA A100 GPU (80 GB), compared to Figure 4, which was obtained on an NVIDIA V100. |
| Software Dependencies | No | The paper mentions using Adam (Kingma & Ba, 2015) with its default parameters, but no specific version numbers for software libraries or dependencies are provided. |
| Experiment Setup | Yes | We draw n = 500 samples from µ and ν and optimize ρ(µn, νn): the optimization is performed on the space of vMF distributions, using Adam (Kingma & Ba, 2015) with its default parameters. ... For each minibatch of size 512, the distribution ρ is learned by optimizing 100 projections over 100 iterations and the generative model is trained over 400 epochs. |
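The experiment-setup row can be illustrated with a minimal PyTorch sketch of the core idea: learn a slice distribution that maximizes the sliced-Wasserstein discrepancy between two empirical samples. This is an assumption-laden simplification, not the authors' implementation (see their repository for that): the vMF slice distribution is stood in for by Gaussian perturbations of a learned direction renormalized to the unit sphere, a larger-than-default Adam learning rate is used so a short run converges, and the PAC-Bayes bound objective is replaced by the raw SW value.

```python
import torch

def sliced_wasserstein(x, y, thetas):
    """Average squared 1-D Wasserstein distance over the given slices."""
    px, _ = torch.sort(x @ thetas.T, dim=0)  # (n, K) sorted projections of x
    py, _ = torch.sort(y @ thetas.T, dim=0)  # (n, K) sorted projections of y
    return ((px - py) ** 2).mean()

torch.manual_seed(0)
d, n, K = 2, 500, 100
x = torch.randn(n, d)                              # sample from mu: N(0, I)
y = torch.randn(n, d) + torch.tensor([3.0, 0.0])   # sample from nu: shifted along axis 0

# Location parameter of the slice distribution (arbitrary symmetric init).
mu = torch.tensor([0.5, 0.5], requires_grad=True)
opt = torch.optim.Adam([mu], lr=0.1)  # lr=0.1 is an assumption, larger than Adam's default
for _ in range(100):
    opt.zero_grad()
    # Crude stand-in for vMF sampling: Gaussian perturbations around mu,
    # renormalized onto the unit sphere (differentiable w.r.t. mu).
    thetas = torch.nn.functional.normalize(
        mu.unsqueeze(0) + 0.1 * torch.randn(K, d), dim=1)
    loss = -sliced_wasserstein(x, y, thetas)  # maximize discriminative power
    loss.backward()
    opt.step()

# The learned direction should align with the axis where mu and nu differ.
mu_hat = torch.nn.functional.normalize(mu.detach(), dim=0)
```

With the two distributions differing only along the first coordinate, the learned slice direction `mu_hat` concentrates on that axis, which is the "maximally discriminative SW" behavior the paper's procedure targets.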