Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.
Vector Quantized Wasserstein Auto-Encoder
Authors: Long Tung Vuong, Trung Le, He Zhao, Chuanxia Zheng, Mehrtash Harandi, Jianfei Cai, Dinh Phung
ICML 2023 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct comprehensive experiments to demonstrate our key contributions by comparing with VQ-VAE (Van Den Oord et al., 2017) and SQ-VAE (Takida et al., 2022) (i.e., the recent work that can improve the codebook utilization). The experimental results show that our VQ-WAE can achieve better codebook utilization with higher codebook perplexity, hence leading to lower (compared with VQ-VAE) or comparable (compared with SQ-VAE) reconstruction error, with significantly lower reconstructed Fréchet Inception Distance (FID) score (Heusel et al., 2017). |
| Researcher Affiliation | Collaboration | 1 Monash University, Australia; 2 VinAI, Vietnam; 3 CSIRO's Data61, Australia; 4 University of Oxford, United Kingdom. |
| Pseudocode | Yes | Algorithm 1 VQ-WAE |
| Open Source Code | No | The paper does not provide a direct link to a code repository or explicitly state that the source code for their method is released. |
| Open Datasets | Yes | Datasets: We empirically evaluate the proposed VQ-WAE in comparison with VQ-VAE (Van Den Oord et al., 2017) that is the baseline method, VQ-GAN (Esser et al., 2021) and recently proposed SQ-VAE (Takida et al., 2022) which is the state-of-the-art work of improving the codebook usage, on five different benchmark datasets: CIFAR10 (Van Den Oord et al., 2017), MNIST (Deng, 2012), SVHN (Netzer et al., 2011), CelebA dataset (Liu et al., 2015; Takida et al., 2022) and the high-resolution images dataset FFHQ. |
| Dataset Splits | No | The paper mentions 'test-set reconstruction results' but does not explicitly provide details about training, validation, or test splits with percentages, absolute counts, or references to predefined standard splits. |
| Hardware Specification | Yes | Precisely on the system of a GPU NVIDIA Tesla V100 with dual CPUs Intel Xeon E5-2698 v4, training VQ-WAE takes about 64 seconds for one epoch on CIFAR10 dataset, while training a standard VQ-VAE only takes approximately 40 seconds for one epoch. |
| Software Dependencies | No | The paper mentions the use of an "adam optimizer" but does not specify version numbers for any software libraries, frameworks (e.g., PyTorch, TensorFlow), or programming languages. |
| Experiment Setup | Yes | Additionally, in the primary setting, we use a codeword (discrete latent) dimensionality of 64 and codebook size |C| = 512 for all datasets except FFHQ, which has a codeword dimensionality of 256 and codebook size |C| = 1024, while the hyper-parameters {β, τ, λ} are specified as presented in the original papers, i.e., β = 0.25 for VQ-VAE and VQ-GAN (Esser et al., 2021), τ = 1e-5 for SQ-VAE and λ = 1e-3, λr = 1.0 for our VQ-WAE. The details of the experimental settings are presented in Appendix D. |
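
The primary setting quoted in the Experiment Setup row can be written down as a small lookup. This is a minimal sketch, not the authors' code, since the table above records that no source code is released. The names `PrimarySetting`, `primary_setting`, `HYPERPARAMS`, and all field names are hypothetical. Only the numeric values come from the quoted text, with the exponents of τ and λ taken as negative (1e-5 and 1e-3).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PrimarySetting:
    """Codebook shape for one dataset (hypothetical container)."""
    codeword_dim: int    # dimensionality of each codeword (discrete latent)
    codebook_size: int   # number of codewords, |C|


# Datasets listed in the Open Datasets row.
DATASETS = ("CIFAR10", "MNIST", "SVHN", "CelebA", "FFHQ")

# Values quoted in the Experiment Setup row.
DEFAULT = PrimarySetting(codeword_dim=64, codebook_size=512)
FFHQ = PrimarySetting(codeword_dim=256, codebook_size=1024)

# Method-specific hyper-parameters as quoted; the key names are assumptions.
HYPERPARAMS = {
    "VQ-VAE": {"beta": 0.25},
    "VQ-GAN": {"beta": 0.25},
    "SQ-VAE": {"tau": 1e-5},
    "VQ-WAE": {"lambda": 1e-3, "lambda_r": 1.0},
}


def primary_setting(dataset: str) -> PrimarySetting:
    """Return the codebook shape used for `dataset` in the primary setting."""
    if dataset not in DATASETS:
        raise ValueError(f"unknown dataset: {dataset!r}")
    return FFHQ if dataset == "FFHQ" else DEFAULT
```

Calling `primary_setting("CIFAR10")` returns the 64-dimensional, 512-entry codebook shape, while `primary_setting("FFHQ")` returns the 256-dimensional, 1024-entry one. Further settings are in Appendix D of the paper and are not reproduced here.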