Capsule Routing via Variational Bayes

Authors: Fabio De Sousa Ribeiro, Georgios Leontidis, Stefanos Kollias

AAAI 2020, pp. 3749-3756

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We outperform the state-of-the-art on smallNORB using 50% fewer capsules than previously reported, achieve competitive performances on CIFAR-10, Fashion-MNIST, SVHN, and demonstrate significant improvement in MNIST to affNIST generalisation over previous works. (Section 4: Experiments)
Researcher Affiliation | Academia | Fabio De Sousa Ribeiro, Georgios Leontidis, Stefanos Kollias, Machine Learning Group, School of Computer Science, University of Lincoln, UK. {fdesousaribeiro, gleontidis, skollias}@lincoln.ac.uk
Pseudocode | Yes | Algorithm 1: Variational Bayes Capsule Routing (an illustrative routing sketch is given below the table)
Open Source Code | Yes | https://github.com/fabio-deep/Variational-Capsule-Routing
Open Datasets | Yes | The main comparative results are reported in Table 1, using smallNORB (LeCun et al. 2004), Fashion-MNIST (Xiao, Rasul, and Vollgraf 2017), SVHN (Netzer et al. 2011) and CIFAR-10 (Krizhevsky, Hinton, and others 2009).
Dataset Splits | Yes | A 20% validation split of the training set was used to tune hyperparameters. During training, validation used the portion of test data containing the same viewpoints as in training, and generalisation to novel viewpoints was measured after matching performance on the familiar ones. (see the split sketch below the table)
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments.
Software Dependencies | No | The paper does not list software dependencies with explicit version numbers (e.g., 'PyTorch 1.9' or 'CUDA 11.1').
Experiment Setup | Yes | In all cases, we use the diagonal parameterisation in Eq. (11), 3 VB routing iterations and batch size 32. All hyperparameters were tuned using validation sets, then models were retrained with the full training set until convergence before testing. Our best model {64, 8, 16, 16, 5} was trained for 350 epochs using Adam, the L_NLL loss, and a 3e-3 initial learning rate with exponential decay. (see the training sketch below the table)
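
To give a concrete picture of what Algorithm 1 does, below is a minimal NumPy sketch of variational Bayes routing with a diagonal Gaussian parameterisation, in the spirit of the paper's approach: lower-capsule votes are clustered into higher capsules by alternating conjugate Dirichlet / Normal-Gamma posterior updates with responsibility updates, and responsibilities are weighted by lower-capsule activations. The function name `vb_routing`, the prior values, and the initialisation are illustrative assumptions rather than the paper's implementation; the released repository linked above is the authoritative version.

```python
import numpy as np
from scipy.special import digamma, logsumexp

def vb_routing(votes, activations, n_iters=3,
               alpha0=1.0, beta0=1.0, a0=1.0, b0=1.0):
    """Illustrative VB routing over diagonal Gaussian capsule clusters.

    votes:       (N, K, D) array, vote of lower capsule i for higher capsule j.
    activations: (N,) array of lower-capsule activations in [0, 1].
    Returns responsibilities (N, K) and posterior cluster means (K, D).
    """
    N, K, D = votes.shape
    m0 = votes.mean(axis=0)                                  # (K, D) prior means from votes
    # Start from uniform responsibilities, weighted by lower-capsule activations.
    r = np.full((N, K), 1.0 / K) * activations[:, None]

    for _ in range(n_iters):
        # ---- M-step: update conjugate (Dirichlet / Normal-Gamma) posteriors ----
        Nk = r.sum(axis=0) + 1e-8                                        # (K,)
        xbar = np.einsum('nk,nkd->kd', r, votes) / Nk[:, None]          # (K, D)
        Sk = np.einsum('nk,nkd->kd', r, (votes - xbar) ** 2) / Nk[:, None]

        alpha = alpha0 + Nk                                              # Dirichlet counts
        beta = beta0 + Nk
        m = (beta0 * m0 + Nk[:, None] * xbar) / beta[:, None]           # posterior means
        a = a0 + 0.5 * Nk                                                # Gamma shape
        b = b0 + 0.5 * (Nk[:, None] * Sk
                        + (beta0 * Nk / (beta0 + Nk))[:, None] * (xbar - m0) ** 2)

        # ---- E-step: expected log-likelihood of each vote under each cluster ----
        E_log_pi = digamma(alpha) - digamma(alpha.sum())                 # (K,)
        E_log_lam = digamma(a)[:, None] - np.log(b)                      # (K, D)
        E_quad = (a[:, None] / b) * (votes - m) ** 2 + 1.0 / beta[:, None]
        log_rho = (E_log_pi
                   + 0.5 * E_log_lam.sum(axis=1)
                   - 0.5 * D * np.log(2 * np.pi)
                   - 0.5 * E_quad.sum(axis=2))                           # (N, K)
        r = np.exp(log_rho - logsumexp(log_rho, axis=1, keepdims=True))
        r = r * activations[:, None]        # weight responsibilities by activations

    return r, m

# Example usage with random inputs (shapes are arbitrary).
votes = np.random.randn(32, 10, 16)         # 32 lower capsules, 10 higher capsules, 16-D poses
acts = np.random.rand(32)
resp, means = vb_routing(votes, acts, n_iters=3)
```

Note that the paper's Algorithm 1 additionally computes the activations of the higher-level capsules from the fitted posteriors, which is omitted here for brevity.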
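
The 20% validation split itself is straightforward to reproduce. Below is a small sketch assuming PyTorch, with a placeholder tensor dataset standing in for the real training data; the smallNORB familiar/novel viewpoint partitioning is dataset-specific and not reproduced here.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset, random_split

# Placeholder stand-in for the real training set; sizes and shapes are illustrative only.
full_train = TensorDataset(torch.randn(1000, 2, 96, 96), torch.randint(0, 5, (1000,)))

# Hold out 20% of the training set for hyperparameter tuning.
n_val = int(0.2 * len(full_train))
train_set, val_set = random_split(
    full_train, [len(full_train) - n_val, n_val],
    generator=torch.Generator().manual_seed(0))     # fixed seed for a reproducible split

train_loader = DataLoader(train_set, batch_size=32, shuffle=True)
val_loader = DataLoader(val_set, batch_size=32)
```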
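
The reported optimisation settings (Adam, 3e-3 initial learning rate with exponential decay, 350 epochs, batch size 32) translate into the following PyTorch sketch. The placeholder network, the NLL stand-in for the paper's L_NLL loss, the decay factor gamma=0.96, and stepping the scheduler once per epoch are all assumptions not stated above; `train_loader` is reused from the split sketch.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Placeholder network standing in for the capsule model with widths {64, 8, 16, 16, 5};
# only the optimisation settings below mirror the reported setup.
model = nn.Sequential(nn.Flatten(), nn.Linear(2 * 96 * 96, 64), nn.ReLU(), nn.Linear(64, 5))

optimizer = torch.optim.Adam(model.parameters(), lr=3e-3)       # 3e-3 initial learning rate
# Exponential learning-rate decay; gamma=0.96 is an assumed value, not reported above.
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.96)

for epoch in range(350):                                        # 350 epochs, as reported
    for x, y in train_loader:                                   # batch size 32, as reported
        optimizer.zero_grad()
        log_probs = F.log_softmax(model(x), dim=1)
        loss = F.nll_loss(log_probs, y)                         # stand-in for the L_NLL loss
        loss.backward()
        optimizer.step()
    scheduler.step()                                            # decay once per epoch (assumption)
```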