Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Gromov-Wasserstein Autoencoders
Authors: Nao Nakagawa, Ren Togo, Takahiro Ogawa, Miki Haseyama
ICLR 2023 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The empirical comparisons with VAE-based models show that GWAE models work in two prominent meta-priors, disentanglement and clustering, with their GW objective unchanged. We conduct empirical evaluations on the capability of GWAE in prominent meta-priors: disentanglement and clustering. Several experiments on image datasets CelebA (Liu et al., 2015), MNIST (LeCun et al., 1998), and 3D Shapes (Burgess & Kim, 2018), show that GWAE models outperform the VAE-based representation learning methods whereas their GW objective is not changed over different meta-priors. |
| Researcher Affiliation | Academia | Graduate School of Information Science and Technology, Hokkaido University, Japan Faculty of Information Science and Technology, Hokkaido University, Japan |
| Pseudocode | No | The paper describes its methods through text and mathematical formulations but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | To ensure reproducibility, our code is available online at https://github.com/ganmodokix/gwae and is provided as the supplementary material. |
| Open Datasets | Yes | For the reported experiments in Section 4, we used the following datasets: MNIST (LeCun et al., 1998). CelebA (Liu et al., 2015). 3D Shapes (Burgess & Kim, 2018). Omniglot (Lake et al., 2015). CIFAR-10 (Krizhevsky & Hinton, 2009). |
| Dataset Splits | Yes | We used the original test set and randomly split the original training set into 54,000 training images and 6,000 validation images. (for MNIST) We randomly split the entire dataset into 384,000/48,000/48,000 images for the train/validation/test set, respectively. (for 3D Shapes) |
| Hardware Specification | Yes | For the reported experimental results, we used a single GPU of NVIDIA GeForce RTX 2080 Ti, and a single run of the entire GWAE training process until convergence takes about eight hours. |
| Software Dependencies | No | The paper mentions 'PyTorch (Paszke et al., 2019)' as a framework and optimizers like 'RMSProp' and 'Adam (Kingma & Ba, 2015)', but does not provide specific version numbers for these software components or any other libraries used. |
| Experiment Setup | Yes | For quantitative evaluations, we selected hyperparameters from $\lambda_W \in [10^0, 10^1]$, $\lambda_D \in [10^0, 10^1]$, and $\lambda_H \in [10^{-4}, 10^0]$ using their performance on the validation set. For the optimizers of GWAE, we used RMSProp with a learning rate of $10^{-4}$ for the main autoencoder network and used RMSProp with a learning rate of $5 \times 10^{-5}$ for the critic network. For all the compared methods except for GWAE, we used the Adam (Kingma & Ba, 2015) optimizer with a learning rate of $10^{-4}$. In the experiments, we used an equal batch size of 64 for all evaluated models. |
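
The random splits quoted in the Dataset Splits row can be illustrated with a minimal sketch. It is not taken from the authors' repository: the helper name `split_indices` and the seed value are hypothetical, and the quoted text does not say which seed or shuffling procedure was used. The only figures carried over are the split sizes, together with the standard sizes of the MNIST training set (60,000 images) and of 3D Shapes (480,000 images).

```python
import random


def split_indices(n_items, sizes, seed=0):
    """Shuffle range(n_items) and cut it into consecutive parts of the given sizes.

    Hypothetical helper for illustration; the seed is an assumption,
    not a value reported in the paper.
    """
    if sum(sizes) > n_items:
        raise ValueError("split sizes exceed the number of items")
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # local RNG, so global state is untouched
    parts, start = [], 0
    for size in sizes:
        parts.append(indices[start:start + size])
        start += size
    return parts


# MNIST: original training set -> 54,000 training / 6,000 validation images
# (the original test set is kept as the test split).
mnist_train, mnist_val = split_indices(60_000, [54_000, 6_000])

# 3D Shapes: entire dataset -> 384,000 / 48,000 / 48,000 train/validation/test.
shapes_train, shapes_val, shapes_test = split_indices(
    480_000, [384_000, 48_000, 48_000]
)
```

The resulting index lists could then be used to select subsets of the loaded datasets; fixing the seed keeps the split identical across runs, which matters when hyperparameters are chosen on the validation set.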