Building Deep Equivariant Capsule Networks
Authors: Sai Raam Venkataraman, S. Balasubramanian, R. Raghunatha Sarma
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct several experiments on standard object-classification datasets that showcase the increased transformation-robustness, as well as general performance, of our model to several capsule baselines. |
| Researcher Affiliation | Academia | Sai Raam Venkataraman, S. Balasubramanian & R. Raghunatha Sarma Department of Mathematics and Computer Science Sri Sathya Sai Institute of Higher Learning {vsairaam,sbalasubramanian,rraghunathasarma}@sssihl.edu.in |
| Pseudocode | Yes | Algorithm 1: A general summation-based routing algorithm for SOVNET. Algorithm 2: The degree-centrality based routing algorithm for SOVNET. (A hedged sketch of such a routing step follows the table.) |
| Open Source Code | Yes | The complete details, both architecture-wise and about the training, can be found in the anonymised github repository https://github.com/sairaamVenkatraman/SOVNET. |
| Open Datasets | Yes | Specifically, we perform experiments on MNIST (LeCun & Cortes, 2010), Fashion-MNIST (Xiao et al., 2017), and CIFAR-10 (Krizhevsky & Hinton, 2009). We trained and tested SOVNET on KMNIST (Clanuwat et al., 2018) and SVHN (Netzer et al., 2011). |
| Dataset Splits | No | For each of these datasets, we created 5 variations of the train and test-splits by randomly transforming data according to the extents of the transformations given in Table 1. The paper discusses train and test splits, but does not explicitly mention a separate validation split or its size/methodology. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU/CPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper mentions using PyTorch for implementing baselines, but does not provide specific version numbers for any software dependencies used for its own model implementation. |
| Experiment Setup | Yes | As in (Sabour et al., 2017), we used a margin loss and a regularising reconstruction loss to train the networks. The positive and negative margins for half of the training epochs were set to 0.9 and 0.1, respectively. Further, the negative margin-loss was weighted by 0.5, as in (Sabour et al., 2017). These values were used for the first half of the training epochs. In order to facilitate better predictions, these values were changed to 0.95, 0.05, and 0.8, respectively for the second half of the training. The reconstruction loss... was weighed by 0.0005... We used the Adam optimiser and an exponential learning rate scheduler that reduced the learning rate by a factor of 0.9 each epoch. |
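
For concreteness, here is a minimal PyTorch sketch of the quoted loss and optimiser setup. The margin values, the 0.5/0.8 negative-loss weights, the 0.0005 reconstruction weight, and the 0.9-per-epoch learning-rate decay are taken from the quote above; the function names, the sum-of-squared-errors reconstruction term, and the default Adam hyperparameters are illustrative assumptions, not the authors' code.

```python
import torch
import torch.nn.functional as F


def margin_loss(class_probs, labels, pos_margin, neg_margin, neg_weight):
    """Margin loss in the style of Sabour et al. (2017).

    class_probs: (batch, num_classes) capsule lengths; labels: (batch,) class indices.
    """
    one_hot = F.one_hot(labels, num_classes=class_probs.size(1)).float()
    pos_term = one_hot * F.relu(pos_margin - class_probs) ** 2
    neg_term = neg_weight * (1.0 - one_hot) * F.relu(class_probs - neg_margin) ** 2
    return (pos_term + neg_term).sum(dim=1).mean()


def total_loss(class_probs, labels, reconstruction, images, epoch, num_epochs):
    # Margins quoted in the paper: (0.9, 0.1, 0.5) for the first half of
    # training and (0.95, 0.05, 0.8) for the second half.
    if epoch < num_epochs // 2:
        pos_m, neg_m, neg_w = 0.9, 0.1, 0.5
    else:
        pos_m, neg_m, neg_w = 0.95, 0.05, 0.8
    # Reconstruction regulariser weighted by 0.0005, as quoted above; the
    # sum-of-squared-errors form is an assumption carried over from Sabour et al.
    recon = F.mse_loss(reconstruction, images.view_as(reconstruction), reduction='sum')
    return margin_loss(class_probs, labels, pos_m, neg_m, neg_w) + 0.0005 * recon


# Adam with an exponential schedule that multiplies the learning rate by 0.9
# each epoch, as quoted (hyperparameters not reported are left at PyTorch defaults).
# optimiser = torch.optim.Adam(model.parameters())
# scheduler = torch.optim.lr_scheduler.ExponentialLR(optimiser, gamma=0.9)
```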
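
The routing algorithms referenced in the Pseudocode row are only named there. As a rough illustration of the general shape of a degree-centrality, summation-based routing step, the following sketch assumes agreement between votes is measured by pairwise dot products and that the resulting centralities are softmax-normalised before the weighted sum. The function `degree_centrality_routing`, the tensor layout, and the normalisation choice are assumptions for illustration only; the exact procedures are Algorithms 1 and 2 in the paper and the linked repository.

```python
import torch
import torch.nn.functional as F


def squash(s, dim=-1, eps=1e-8):
    """Squashing non-linearity from Sabour et al. (2017)."""
    norm_sq = (s ** 2).sum(dim=dim, keepdim=True)
    norm = torch.sqrt(norm_sq + eps)
    return (norm_sq / (1.0 + norm_sq)) * (s / norm)


def degree_centrality_routing(votes):
    """Illustrative degree-centrality routing for one higher-level capsule type.

    votes: (num_lower_types, batch, capsule_dim) predictions made by the
    lower-level capsule types for this higher-level capsule.
    """
    # Pairwise agreement between votes via dot products: (I, I, batch).
    agreement = torch.einsum('ibd,jbd->ijb', votes, votes)
    # Degree centrality of a vote = its total agreement with all votes.
    centrality = agreement.sum(dim=1)                      # (I, batch)
    # Normalise centralities into routing weights (softmax is an assumption).
    weights = F.softmax(centrality, dim=0).unsqueeze(-1)   # (I, batch, 1)
    # Summation-based combination of votes, followed by squash.
    return squash((weights * votes).sum(dim=0))            # (batch, capsule_dim)
```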