Vector Quantization-Based Regularization for Autoencoders

Authors: Hanwei Wu, Markus Flierl

AAAI 2020, pp. 6380-6387 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We show that our proposed regularization method results in improved latent representations for both supervised learning and clustering downstream tasks when compared to autoencoders using other bottleneck structures. ... We test our proposed model on datasets MNIST, SVHN and CIFAR-10.
Researcher Affiliation | Academia | 1. KTH Royal Institute of Technology, Stockholm, Sweden; 2. Research Institutes of Sweden, Stockholm, Sweden
Pseudocode | No | The paper includes 'Figure 2: Description of the soft VQ-VAE', which is a diagram, but it does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | The source code of the paper is publicly available: https://github.com/AlbertOh90/Soft-VQ-VAE/
Open Datasets | Yes | We test our proposed model on datasets MNIST, SVHN and CIFAR-10.
Dataset Splits | No | The paper mentions 'Early stopping at 10000 iterations is applied by soft VQ-VAE on SVHN and CIFAR-10 datasets,' which implies the use of a validation set, but it does not provide specific split percentages or sample counts for validation data. Only 'training set' and 'test set' are explicitly mentioned for the main splits, without quantitative details for a three-way split.
Hardware Specification | No | No specific hardware details such as GPU models, CPU types, or memory amounts used for running experiments are mentioned in the paper.
Software Dependencies | No | The paper mentions using the 'Adam optimizer' and 'Glorot uniform initializer' but does not specify any software libraries or frameworks with version numbers (e.g., 'PyTorch 1.9', 'TensorFlow 2.0').
Experiment Setup | Yes | For the models tested on the CIFAR-10 and SVHN datasets, the encoder consists of 4 convolutional layers with stride 2 and filter size 3×3. The number of channels is doubled for each encoder layer, and the number of channels of the first layer is set to 64. The decoder follows a symmetric structure to the encoder. For the MNIST dataset, we use multilayer perceptron networks (MLP) to construct the autoencoder. The dimensions of the dense layers of the encoder and decoder are D-500-500-2000-d and d-2000-500-500-D respectively, where d is the dimension of the learned latents and D is the dimension of the input datapoints. All the layers use rectified linear units (ReLU) as activation functions. We use the Glorot uniform initializer (Glorot and Bengio 2010) for the weights of the encoder-decoder networks. The codebook is initialized by uniform unit scaling. All models are trained using the Adam optimizer (Kingma and Ba 2015) with learning rate 3e-4, and performance is evaluated after 40000 iterations with batch size 32. Early stopping at 10000 iterations is applied by soft VQ-VAE on the SVHN and CIFAR-10 datasets.
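
To make the reported configuration concrete, below is a minimal PyTorch sketch of the convolutional encoder/decoder, the MNIST MLP, and the optimizer settings described in the Experiment Setup row. This is not the authors' implementation (see the repository linked above): the 32×32×3 input resolution for SVHN/CIFAR-10, the padding choices, the default latent dimension, and all function names are assumptions made for illustration, and the soft VQ bottleneck and codebook update rules are omitted.

```python
# Sketch only: architecture and optimizer inferred from the paper's setup
# description; the soft VQ bottleneck/codebook is intentionally left out.
import torch
import torch.nn as nn


def conv_encoder(in_channels=3, base_channels=64, num_layers=4):
    """4 conv layers, stride 2, 3x3 filters, channels doubled per layer (64, 128, 256, 512)."""
    layers, c_in = [], in_channels
    for i in range(num_layers):
        c_out = base_channels * (2 ** i)
        layers += [nn.Conv2d(c_in, c_out, kernel_size=3, stride=2, padding=1),
                   nn.ReLU()]
        c_in = c_out
    return nn.Sequential(*layers)


def conv_decoder(out_channels=3, base_channels=64, num_layers=4):
    """Decoder symmetric to the encoder, built from transposed convolutions."""
    layers = []
    c_in = base_channels * (2 ** (num_layers - 1))
    for i in reversed(range(num_layers)):
        c_out = base_channels * (2 ** (i - 1)) if i > 0 else out_channels
        layers += [nn.ConvTranspose2d(c_in, c_out, kernel_size=3, stride=2,
                                      padding=1, output_padding=1)]
        layers += [nn.ReLU()] if i > 0 else []
        c_in = c_out
    return nn.Sequential(*layers)


def mlp_autoencoder(D=784, d=10):
    """MNIST MLP: encoder D-500-500-2000-d, decoder d-2000-500-500-D (d assumed here)."""
    enc = nn.Sequential(nn.Linear(D, 500), nn.ReLU(),
                        nn.Linear(500, 500), nn.ReLU(),
                        nn.Linear(500, 2000), nn.ReLU(),
                        nn.Linear(2000, d))
    dec = nn.Sequential(nn.Linear(d, 2000), nn.ReLU(),
                        nn.Linear(2000, 500), nn.ReLU(),
                        nn.Linear(500, 500), nn.ReLU(),
                        nn.Linear(500, D))
    return enc, dec


def glorot_init(module):
    """Glorot (Xavier) uniform initialization for the encoder-decoder weights."""
    if isinstance(module, (nn.Conv2d, nn.ConvTranspose2d, nn.Linear)):
        nn.init.xavier_uniform_(module.weight)
        nn.init.zeros_(module.bias)


encoder, decoder = conv_encoder(), conv_decoder()
encoder.apply(glorot_init)
decoder.apply(glorot_init)

# Training configuration stated in the setup: Adam, learning rate 3e-4, batch size 32.
optimizer = torch.optim.Adam(
    list(encoder.parameters()) + list(decoder.parameters()), lr=3e-4)
```

The transposed-convolution decoder mirrors the encoder so that each stage undoes one factor-of-two downsampling; training would then run for the reported 40000 iterations (with early stopping at 10000 iterations for soft VQ-VAE on SVHN and CIFAR-10).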