Building Deep Equivariant Capsule Networks

Authors: Sai Raam Venkataraman, S. Balasubramanian, R. Raghunatha Sarma

ICLR 2020

Reproducibility assessment — each item below gives the variable, the assessed result, and the supporting LLM response:
Research Type: Experimental
LLM response: "We conduct several experiments on standard object-classification datasets that showcase the increased transformation-robustness, as well as general performance, of our model to several capsule baselines."
Researcher Affiliation: Academia
LLM response: "Sai Raam Venkataraman, S. Balasubramanian & R. Raghunatha Sarma, Department of Mathematics and Computer Science, Sri Sathya Sai Institute of Higher Learning, {vsairaam,sbalasubramanian,rraghunathasarma}@sssihl.edu.in"
Pseudocode: Yes
LLM response: "Algorithm 1: A general summation-based routing algorithm for SOVNET. Algorithm 2: The degree-centrality based routing algorithm for SOVNET."
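
To make the routed summation concrete, below is a minimal PyTorch sketch of degree-centrality-based routing for a single output capsule type, in the spirit of Algorithm 2. The tensor layout, the dot-product agreement measure, and the softmax normalisation are assumptions made here for illustration; the authors' exact procedure is given in the paper and the linked repository.

    # A sketch (my reading, not the authors' code) of degree-centrality-based
    # routing: votes that agree with many other votes get larger routing weights.
    import torch
    import torch.nn.functional as F

    def squash(s, dim=-1, eps=1e-8):
        # Standard capsule squashing non-linearity (Sabour et al., 2017).
        norm_sq = (s ** 2).sum(dim=dim, keepdim=True)
        return (norm_sq / (1.0 + norm_sq)) * s / torch.sqrt(norm_sq + eps)

    def degree_centrality_routing(votes):
        # votes: (batch, num_input_capsules, capsule_dim) -- predictions for one
        # output capsule type, produced by the shared prediction network.
        agreement = torch.einsum('bid,bjd->bij', votes, votes)  # pairwise dot-products
        degree = agreement.sum(dim=-1)                          # degree centrality per vote
        weights = F.softmax(degree, dim=-1).unsqueeze(-1)       # normalised routing weights
        return squash((weights * votes).sum(dim=1))             # routed output capsule

As quoted, Algorithm 1 is the general summation-based template; swapping the degree-based weights for any other normalised weighting of the votes would instantiate it.
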
Open Source Code: Yes
LLM response: "The complete details, both architecture-wise and about the training, can be found in the anonymised github repository https://github.com/sairaamVenkatraman/SOVNET."
Open Datasets: Yes
LLM response: "Specifically, we perform experiments on MNIST (LeCun & Cortes, 2010), Fashion-MNIST (Xiao et al., 2017), and CIFAR-10 (Krizhevsky & Hinton, 2009). We trained and tested SOVNET on KMNIST (Clanuwat et al., 2018) and SVHN (Netzer et al., 2011)."
Dataset Splits: No
LLM response: "For each of these datasets, we created 5 variations of the train and test-splits by randomly transforming data according to the extents of the transformations given in Table 1." The paper discusses train and test splits, but does not explicitly mention a separate validation split or its size/methodology.
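
As an illustration of how such transformed train/test variants could be generated (not the authors' exact pipeline), the torchvision sketch below builds five variants of MNIST. The (rotation, translation) extents are placeholders standing in for Table 1, which is not reproduced here, and MNIST stands in for the other datasets.

    # Hypothetical reconstruction of the transformed-split generation.
    from torchvision import datasets, transforms

    def make_split(degrees, translate, train):
        # Random rotation/translation applied independently to each sample.
        tfm = transforms.Compose([
            transforms.RandomAffine(degrees=degrees, translate=(translate, translate)),
            transforms.ToTensor(),
        ])
        return datasets.MNIST(root='./data', train=train, download=True, transform=tfm)

    # Placeholder extents; the actual five extents are given in Table 1 of the paper.
    extents = [(0, 0.0), (30, 0.075), (60, 0.15), (90, 0.225), (180, 0.3)]
    variants = [(make_split(d, t, train=True), make_split(d, t, train=False))
                for d, t in extents]
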
Hardware Specification: No
LLM response: The paper does not explicitly describe the specific hardware (e.g., GPU/CPU models, memory) used for running its experiments.
Software Dependencies: No
LLM response: The paper mentions using PyTorch for implementing baselines, but does not provide specific version numbers for any software dependencies used for its own model implementation.
Experiment Setup: Yes
LLM response: "As in (Sabour et al., 2017), we used a margin loss and a regularising reconstruction loss to train the networks. The positive and negative margins were set to 0.9 and 0.1, respectively. Further, the negative margin-loss was weighted by 0.5, as in (Sabour et al., 2017). These values were used for the first half of the training epochs. In order to facilitate better predictions, these values were changed to 0.95, 0.05, and 0.8, respectively, for the second half of the training. The reconstruction loss ... was weighed by 0.0005 ... We used the Adam optimiser and an exponential learning rate scheduler that reduced the learning rate by a factor of 0.9 each epoch."
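
The quoted setup maps fairly directly onto PyTorch. The sketch below is a hedged reconstruction: model (assumed to return class-capsule lengths and a reconstruction) and train_loader are hypothetical placeholders, num_epochs is not quoted above, and the sum-reduced reconstruction loss follows Sabour et al. (2017) rather than an explicit statement in the paper. Only the margins, the loss weights, Adam, and the 0.9-per-epoch exponential decay come from the excerpt.

    # Assumption-laden training sketch; `model` and `train_loader` are placeholders.
    import torch
    import torch.nn.functional as F

    def margin_loss(lengths, labels, m_pos, m_neg, neg_weight):
        # Margin loss of Sabour et al. (2017) on capsule lengths (batch, classes).
        t = F.one_hot(labels, lengths.size(1)).float()
        pos = t * F.relu(m_pos - lengths) ** 2
        neg = neg_weight * (1.0 - t) * F.relu(lengths - m_neg) ** 2
        return (pos + neg).sum(dim=1).mean()

    num_epochs = 100  # placeholder; the paper's epoch count is not quoted above
    optimizer = torch.optim.Adam(model.parameters())
    scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.9)  # x0.9 per epoch

    for epoch in range(num_epochs):
        # 0.9 / 0.1 / 0.5 for the first half of training, 0.95 / 0.05 / 0.8 after.
        m_pos, m_neg, w = (0.9, 0.1, 0.5) if epoch < num_epochs // 2 else (0.95, 0.05, 0.8)
        for images, labels in train_loader:
            lengths, reconstruction = model(images)
            recon = F.mse_loss(reconstruction, images.flatten(1), reduction='sum')
            loss = margin_loss(lengths, labels, m_pos, m_neg, w) + 0.0005 * recon
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        scheduler.step()
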