Gromov-Wasserstein Autoencoders
Authors: Nao Nakagawa, Ren Togo, Takahiro Ogawa, Miki Haseyama
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The empirical comparisons with VAE-based models show that GWAE models work in two prominent meta-priors, disentanglement and clustering, with their GW objective unchanged. We conduct empirical evaluations on the capability of GWAE in prominent meta-priors: disentanglement and clustering. Several experiments on image datasets CelebA (Liu et al., 2015), MNIST (LeCun et al., 1998), and 3D Shapes (Burgess & Kim, 2018) show that GWAE models outperform the VAE-based representation learning methods whereas their GW objective is not changed over different meta-priors. |
| Researcher Affiliation | Academia | Graduate School of Information Science and Technology, Hokkaido University, Japan; Faculty of Information Science and Technology, Hokkaido University, Japan |
| Pseudocode | No | The paper describes its methods through text and mathematical formulations but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | To ensure reproducibility, our code is available online at https://github.com/ganmodokix/gwae and is provided as the supplementary material. |
| Open Datasets | Yes | For the reported experiments in Section 4, we used the following datasets: MNIST (LeCun et al., 1998), CelebA (Liu et al., 2015), 3D Shapes (Burgess & Kim, 2018), Omniglot (Lake et al., 2015), and CIFAR-10 (Krizhevsky & Hinton, 2009). |
| Dataset Splits | Yes | For MNIST, we used the original test set and randomly split the original training set into 54,000 training images and 6,000 validation images. For 3D Shapes, we randomly split the entire dataset into 384,000/48,000/48,000 images for the train/validation/test sets, respectively. (A minimal split sketch is given after the table.) |
| Hardware Specification | Yes | For the reported experimental results, we used a single NVIDIA GeForce RTX 2080 Ti GPU, and a single run of the entire GWAE training process until convergence takes about eight hours. |
| Software Dependencies | No | The paper mentions 'PyTorch (Paszke et al., 2019)' as a framework and optimizers like 'RMSProp' and 'Adam (Kingma & Ba, 2015)', but does not provide specific version numbers for these software components or any other libraries used. |
| Experiment Setup | Yes | For quantitative evaluations, we selected hyperparameters from λ_W ∈ [10^0, 10^1], λ_D ∈ [10^0, 10^1], and λ_H ∈ [10^-4, 10^0] using their performance on the validation set. For the optimizers of GWAE, we used RMSProp with a learning rate of 10^-4 for the main autoencoder network and RMSProp with a learning rate of 5 × 10^-5 for the critic network. For all the compared methods except for GWAE, we used the Adam (Kingma & Ba, 2015) optimizer with a learning rate of 10^-4. In the experiments, we used an equal batch size of 64 for all evaluated models. (An optimizer-setup sketch is given after the table.) |
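
The dataset splits reported in the table (MNIST: original test set kept, 54,000/6,000 train/validation split; 3D Shapes: 384,000/48,000/48,000 train/validation/test split) can be reproduced along the following lines. This is a minimal sketch assuming the standard torchvision MNIST loader and PyTorch's `random_split`; the seed and the `shapes3d_dataset` placeholder are assumptions, and the paper's actual loading code is in the released repository (https://github.com/ganmodokix/gwae).

```python
# Minimal split sketch; seed and dataset placeholders are assumptions.
import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

# MNIST: keep the original test set, split the 60,000 training images
# into 54,000 training / 6,000 validation images as reported.
mnist_train_full = datasets.MNIST("data", train=True, download=True,
                                  transform=transforms.ToTensor())
mnist_test = datasets.MNIST("data", train=False, download=True,
                            transform=transforms.ToTensor())
generator = torch.Generator().manual_seed(0)  # seed is an assumption
mnist_train, mnist_val = random_split(mnist_train_full, [54_000, 6_000],
                                      generator=generator)

# 3D Shapes: the 480,000-image dataset is split 384,000/48,000/48,000;
# `shapes3d_dataset` is a placeholder for an actual torch Dataset object.
# shapes3d_train, shapes3d_val, shapes3d_test = random_split(
#     shapes3d_dataset, [384_000, 48_000, 48_000], generator=generator)
```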
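
The optimizer configuration from the Experiment Setup row can likewise be written as a short PyTorch sketch. The network definitions below are placeholders (the actual GWAE autoencoder, critic, and compared VAE-based baselines are defined in the released code); only the optimizer choices, learning rates, batch size, and hyperparameter search ranges come from the paper.

```python
# Optimizer-setup sketch; network architectures below are placeholders.
import torch
from torch import nn

autoencoder = nn.Sequential(nn.Linear(784, 64), nn.ReLU(), nn.Linear(64, 784))
critic = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 1))
baseline = nn.Sequential(nn.Linear(784, 64), nn.ReLU(), nn.Linear(64, 784))

BATCH_SIZE = 64  # equal batch size reported for all evaluated models

# GWAE: RMSProp with a learning rate of 10^-4 for the main autoencoder
# network and 5 x 10^-5 for the critic network.
opt_autoencoder = torch.optim.RMSprop(autoencoder.parameters(), lr=1e-4)
opt_critic = torch.optim.RMSprop(critic.parameters(), lr=5e-5)

# Compared methods other than GWAE: Adam with a learning rate of 10^-4.
opt_baseline = torch.optim.Adam(baseline.parameters(), lr=1e-4)

# Hyperparameter ranges searched on the validation set; the tuples only
# record the reported interval endpoints, not the concrete grid points.
LAMBDA_W_RANGE = (1e0, 1e1)   # lambda_W in [10^0, 10^1]
LAMBDA_D_RANGE = (1e0, 1e1)   # lambda_D in [10^0, 10^1]
LAMBDA_H_RANGE = (1e-4, 1e0)  # lambda_H in [10^-4, 10^0]
```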