Generative Pseudo-Inverse Memory

Authors: Kha Pham, Hung Le, Man Ngo, Truyen Tran, Bao Ho, Svetha Venkatesh

ICLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirically we demonstrate the efficiency and versatility of GPM on a comprehensive suite of experiments involving binarized MNIST, binarized Omniglot, Fashion MNIST, CIFAR10 & CIFAR100 and Celeb A.
Researcher Affiliation | Academia | Kha Pham (1), Hung Le (1), Man Ngo (2), Truyen Tran (1), Bao Ho (3) and Svetha Venkatesh (1). (1) Applied Artificial Intelligence Institute, Deakin University; (2) Faculty of Mathematics and Computer Science, VNUHCM-University of Science; (3) Vietnam Institute for Advanced Study in Mathematics
Pseudocode | Yes | Algorithm 1: Single training step of Generative Pseudo-Inverse Memory
Open Source Code | Yes | Codes are available at https://github.com/phamtienkha/generative-pseudoinverse-memory.
Open Datasets | Yes | We validate these theoretical insights through a comprehensive suite of experiments on binarized MNIST (Le Cun et al., 2010), binarized Omniglot (Burda et al., 2016), Fashion MNIST (Xiao et al., 2017), CIFAR10 & CIFAR100 (Krizhevsky, 2009) and Celeb A (Liu et al., 2015), demonstrating superior results.
Dataset Splits | No | The paper specifies a training and test split for the Omniglot dataset ('24,345 training and 8,070 test examples') but does not mention a validation split.
Hardware Specification | No | The paper states 'All operations are computed on a single GPU.' but does not specify the model or type of GPU, CPU, or any other specific hardware details.
Software Dependencies | Yes | We use the inverse function of Pytorch 1.8.0 (Paszke et al., 2017) for batch matrix inverse.
Experiment Setup | Yes | In all experiments, we use the Adam optimizer with learning rate varying from 5e-5 to 5e-4 depending on the dataset. We use weight decay of 1e-3 along with gradient clipping at threshold 10. The encoder consists of 4 layers, each of which is a convolutional layer with a 4x4 filter and stride 2, followed by a ResNet block with bottleneck (He et al., 2016). The decoder is simply a mirror of the encoder with transpose convolutional layers. We use the swish activation function (Ramachandran et al., 2017) for the non-linear layers. We run the Ben-Cohen algorithm for 7 steps to approximate the pseudo-inverses, with the initial term being 10^-3 times the transpose of the matrix whose pseudo-inverse we want to calculate.
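
The Pseudocode row above refers to Algorithm 1, a single training step of GPM. The full step in the paper involves the encoder, decoder and a generative training objective; the snippet below is only a minimal sketch of the pseudo-inverse write/read operation that gives the model its name, with illustrative names (write_memory, read_memory) that are not taken from the released code.

    # Minimal sketch of a pseudo-inverse associative write/read (illustrative only;
    # Algorithm 1 in the paper additionally involves the encoder/decoder and a
    # generative training objective, which are omitted here).
    import torch

    def write_memory(addresses, contents):
        # Least-squares write: M = A^+ Z, so that addresses @ M approximates contents.
        # addresses: (num_items, num_slots), contents: (num_items, code_dim)
        return torch.pinverse(addresses) @ contents   # memory of shape (num_slots, code_dim)

    def read_memory(memory, query_addresses):
        # Read by projecting query addresses through the memory: Z_hat = A @ M.
        return query_addresses @ memory

    # Toy usage: with more memory slots than stored items, the write is (numerically) exact.
    A = torch.randn(8, 16)    # 8 items addressed over 16 memory slots
    Z = torch.randn(8, 32)    # 8 encoded items of dimension 32
    M = write_memory(A, Z)
    print(torch.allclose(read_memory(M, A), Z, atol=1e-4))   # True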
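
The Software Dependencies row quotes the use of PyTorch 1.8.0's inverse function for batch matrix inverse. torch.inverse accepts a batch of square matrices of shape (*, n, n); a small self-contained illustration with made-up shapes:

    # torch.inverse in PyTorch 1.8.0 accepts batched input of shape (*, n, n).
    import torch

    batch = torch.randn(64, 16, 16) + 16 * torch.eye(16)    # 64 well-conditioned square matrices
    inv = torch.inverse(batch)                               # shape (64, 16, 16)
    identity = torch.eye(16).expand(64, 16, 16)
    print(torch.allclose(batch @ inv, identity, atol=1e-4))  # True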
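
The Experiment Setup row describes approximating pseudo-inverses with 7 steps of the 'Ben-Cohen algorithm', starting from 10^-3 times the transpose of the target matrix. Read as the Ben-Israel and Cohen iteration X_{k+1} = 2*X_k - X_k A X_k, a hedged sketch looks as follows; the function name and test matrix are illustrative and not taken from the released code. The iteration is stable when the initial scale is small relative to the largest singular value of the matrix, and the error keeps shrinking as the number of steps grows.

    # Hedged sketch of the quoted pseudo-inverse approximation, read here as the
    # Ben-Israel & Cohen iteration X_{k+1} = 2*X_k - X_k @ A @ X_k, started from
    # X_0 = 1e-3 * A^T and run for 7 steps as in the quoted setup.
    import torch

    def approx_pinv(A, steps=7, init_scale=1e-3):
        X = init_scale * A.transpose(-2, -1)   # initial term: 1e-3 times the transpose
        for _ in range(steps):
            X = 2 * X - X @ A @ X              # hyper-power update toward the pseudo-inverse A^+
        return X

    A = torch.randn(8, 4)
    for steps in (7, 15, 30):
        err = torch.dist(approx_pinv(A, steps=steps), torch.pinverse(A))
        print(steps, err.item())   # the error shrinks toward zero as steps grows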