Meta Optimal Transport
Authors: Brandon Amos, Giulia Luise, Samuel Cohen, Ievgen Redko
ICML 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We instantiate Meta OT models in discrete and continuous settings between grayscale images, spherical data, classification labels, and color palettes and use them to improve the computational time of standard OT solvers. (Sect. 4, Experiments:) We demonstrate how Meta OT models improve the convergence of the state-of-the-art solvers in settings where solving multiple OT problems naturally arises. |
| Researcher Affiliation | Collaboration | 1Meta AI 2Microsoft Research 3University College London 4Fairgen 5Noah's Ark Lab, Huawei 6Aalto University. |
| Pseudocode | Yes | Algorithm 1: Sinkhorn(α, β, c, ϵ, f0 = 0); Algorithm 2: W2GN(α, β, φ0); Algorithm 3: Training Meta OT. (A minimal Sinkhorn sketch follows the table.) |
| Open Source Code | Yes | Our source code is available at http://github.com/facebookresearch/meta-ot. |
| Open Datasets | Yes | Our Meta OT model f̂θ (Sect. 3) is an MLP that predicts the transport map between pairs of MNIST digits. We evaluate the efficiency of such a learning strategy on three computer vision datasets, namely: Fashion MNIST, Cifar-10 and Cifar-100. |
| Dataset Splits | No | The paper mentions training on datasets and evaluating on 'test instances' or 'test data' (e.g., Table 1, Figure 3), but it does not specify explicit numerical splits for training, validation, and test sets. It implies standard splits for datasets like MNIST but doesn't detail them. |
| Hardware Specification | Yes | App. B covers further experimental and implementation details, and shows that all of our experiments take a few hours to run on our single Quadro GP100 GPU. |
| Software Dependencies | No | The paper mentions several software packages such as JAX, Matplotlib, numpy, and Optimal Transport Tools (OTT), but it does not specify their version numbers and only provides citations. |
| Experiment Setup | Yes | Table 5 (discrete OT hyper-parameters): batch size 128; training iterations 50,000; MLP hidden sizes [1024, 1024, 1024]; Adam learning rate 1e-3. Table 6 (continuous OT hyper-parameters): meta batch size (for α, β) 8; inner batch size (to estimate L) 1024; cycle loss weight (γ) 3; Adam learning rate 1e-3; ℓ2 weight penalty 1e-6; max grad norm (for clipping) 1; training iterations 200,000; Meta ICNN encoder ResNet-18; encoder output size (both measures) 256 × 2; Meta ICNN decoder hidden sizes [512]. (A configuration sketch using the Table 5 settings follows the table.) |
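
To make the quoted pseudocode row concrete, here is a minimal log-domain Sinkhorn sketch in JAX corresponding to Algorithm 1, assuming discrete weights `a` and `b`, a cost matrix `C`, regularizer `eps`, and an optional warm-start potential `f0` (the quantity Meta OT predicts). Variable names and the fixed iteration count are illustrative, not taken from the paper's released code.

```python
# Minimal log-domain Sinkhorn sketch (Algorithm 1), assuming discrete measures
# with weights a, b, cost matrix C, entropic regularizer eps, and an optional
# warm-start dual potential f0. Names and iteration count are illustrative.
import jax
import jax.numpy as jnp
from jax.scipy.special import logsumexp

def sinkhorn(a, b, C, eps, f0=None, num_iters=100):
    """Return dual potentials (f, g) and the entropic transport plan P."""
    f = jnp.zeros_like(a) if f0 is None else f0
    log_a, log_b = jnp.log(a), jnp.log(b)

    def body(_, fg):
        f, g = fg
        # Alternating soft c-transform updates in the log domain.
        g = eps * (log_b - logsumexp((f[:, None] - C) / eps, axis=0))
        f = eps * (log_a - logsumexp((g[None, :] - C) / eps, axis=1))
        return f, g

    f, g = jax.lax.fori_loop(0, num_iters, body, (f, jnp.zeros_like(b)))
    P = jnp.exp((f[:, None] + g[None, :] - C) / eps)  # primal transport plan
    return f, g, P
```

A good warm start `f0` lets the loop above reach the same marginal error in far fewer iterations, which is the speedup the paper reports.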
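The configuration sketch below shows one way the Table 5 discrete-OT settings could fit together: an MLP with hidden sizes [1024, 1024, 1024] maps a pair of histograms to an initial dual potential and is trained with Adam (learning rate 1e-3, batch size 128) to maximize the entropic dual, with the other potential recovered by a soft c-transform. The module structure, function names, and objective wiring are assumptions for illustration; the released code at github.com/facebookresearch/meta-ot is the authoritative implementation.

```python
# Sketch of the discrete Meta OT warm-start model under the Table 5 settings.
# All names here are illustrative; only the hyper-parameter values come from Table 5.
import jax
import jax.numpy as jnp
import optax
from jax.scipy.special import logsumexp

def init_mlp(key, sizes):
    keys = jax.random.split(key, len(sizes) - 1)
    return [(jax.random.normal(k, (m, n)) * jnp.sqrt(2.0 / m), jnp.zeros(n))
            for k, m, n in zip(keys, sizes[:-1], sizes[1:])]

def mlp(params, x):
    for W, b in params[:-1]:
        x = jax.nn.relu(x @ W + b)
    W, b = params[-1]
    return x @ W + b

def dual_objective(f, a, b, C, eps):
    # g is the soft c-transform of f, so the objective depends on f alone.
    g = eps * (jnp.log(b) - logsumexp((f[:, None] - C) / eps, axis=0))
    return f @ a + g @ b - eps * jnp.sum(jnp.exp((f[:, None] + g[None, :] - C) / eps))

def loss_fn(params, a_batch, b_batch, C, eps):
    f_batch = mlp(params, jnp.concatenate([a_batch, b_batch], axis=-1))
    duals = jax.vmap(dual_objective, in_axes=(0, 0, 0, None, None))(
        f_batch, a_batch, b_batch, C, eps)
    return -jnp.mean(duals)  # maximize the amortized entropic dual

n = 28 * 28                           # e.g. flattened MNIST histograms
params = init_mlp(jax.random.PRNGKey(0), [2 * n, 1024, 1024, 1024, n])
optimizer = optax.adam(1e-3)          # Table 5: Adam learning rate 1e-3
opt_state = optimizer.init(params)

@jax.jit
def train_step(params, opt_state, a_batch, b_batch, C, eps):
    loss, grads = jax.value_and_grad(loss_fn)(params, a_batch, b_batch, C, eps)
    updates, opt_state = optimizer.update(grads, opt_state, params)
    return optax.apply_updates(params, updates), opt_state, loss
```

At evaluation time, the predicted potential `mlp(params, concat(a, b))` would be passed as the `f0` warm start to the Sinkhorn sketch above.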