Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Momentum Capsule Networks
Authors: Josef Gugglberger, Antonio Rodríguez-Sánchez, David Peer
TMLR 2022 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We will show that MoCapsNet beats the accuracy of baseline capsule networks on MNIST, SVHN, CIFAR-10 and CIFAR-100 while using considerably less memory. The source code is available on https://github.com/moejoe95/MoCapsNet. (Section 4, Experimental evaluation) We evaluated MoCapsNet on four different datasets and compared it to other capsule network models. |
| Researcher Affiliation | Collaboration | Josef Gugglberger EMAIL Department of Computer Science University of Innsbruck, Austria David Peer EMAIL Deep Opinion Antonio Rodríguez-Sánchez EMAIL Department of Computer Science University of Innsbruck, Austria |
| Pseudocode | Yes | Algorithm 1: High level Python code of the modified forward step of a momentum capsule network. The current inputs along with the residual layers are passed into the forward function below. Algorithm 2: High level Python code of the modified backward step of a momentum capsule network. The current gradient along with the residual layers are arguments passed to the backward function. |
| Open Source Code | Yes | We will show that MoCapsNet beats the accuracy of baseline capsule networks on MNIST, SVHN, CIFAR-10 and CIFAR-100 while using considerably less memory. The source code is available on https://github.com/moejoe95/MoCapsNet. |
| Open Datasets | Yes | We evaluated our model on four different, popular datasets: MNIST (LeCun et al., 2010), SVHN (Netzer et al., 2012), CIFAR-10 and CIFAR-100 (Krizhevsky, 2009). |
| Dataset Splits | No | The paper mentions popular datasets (MNIST, SVHN, CIFAR-10, CIFAR-100) and describes preprocessing steps, but does not explicitly state the training, validation, or test dataset splits used for these experiments. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU models, CPU types) used to run the experiments. It only mentions software frameworks like PyTorch. |
| Software Dependencies | No | The paper states 'Our implementation is implemented in Pytorch (Paszke et al., 2019)' and refers to optimizers like ADAM and Ranger21, but does not specify exact version numbers for PyTorch or any other software libraries or dependencies used in the implementation. |
| Experiment Setup | Yes | Our implementation is implemented in PyTorch (Paszke et al., 2019)... The weight of the momentum term was set to γ=0.9. We initialize the weights of the transformation matrices at random from a normal distribution with mean 0 and standard deviation 0.01. The batch size for training was 128 and we trained each model for 30 (MNIST) or 60 (SVHN, CIFAR-10/100) epochs. We optimized our network weights with ADAM (Kingma & Ba, 2014), using an initial learning rate of 10⁻³ and an exponential decay of 0.96. We use 32 capsules in each capsule layer that is located inside a residual block... In the Primary Capsule layer, we do a convolution with a kernel size of 9 and a stride of 2... Between capsule layers, we perform dynamic routing for 3 routing iterations. The reconstruction network is made up of three dense layers, where the first two layers use ReLU, and the last layer implements a sigmoid activation function... In our experiments, we set λ = 5·10⁻⁴. |
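The pseudocode row above refers to modified forward and backward steps that make the network's residual blocks invertible, which is where the reported memory savings come from: activations can be reconstructed during the backward pass instead of being stored. The paper's actual algorithms are not reproduced here; the following is a minimal NumPy sketch of a momentum-style reversible residual update with the reported γ=0.9, where `f` is a hypothetical stand-in for a capsule layer and the update form (v ← γv + (1−γ)f(x); x ← x + v) is an assumption based on standard momentum residual networks, not taken from the paper.

```python
import numpy as np

def f(x):
    # Hypothetical residual function standing in for a capsule layer.
    return np.tanh(x)

def momentum_forward(x, v, gamma=0.9, n_layers=3):
    # Assumed update: v <- gamma * v + (1 - gamma) * f(x); x <- x + v.
    # Intermediate x, v need not be stored for the backward pass.
    for _ in range(n_layers):
        v = gamma * v + (1.0 - gamma) * f(x)
        x = x + v
    return x, v

def momentum_inverse(x, v, gamma=0.9, n_layers=3):
    # Reconstruct the block's inputs exactly from its outputs by
    # running the update in reverse, enabling activation-free backprop.
    for _ in range(n_layers):
        x = x - v
        v = (v - (1.0 - gamma) * f(x)) / gamma
    return x, v
```

Because each step is algebraically invertible, round-tripping through `momentum_forward` and then `momentum_inverse` recovers the original `(x, v)` up to floating-point error; a real implementation would recompute activations this way inside a custom autograd backward.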
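The experiment-setup row reports an initial learning rate of 10⁻³ with an exponential decay of 0.96. Assuming the decay is applied once per epoch (the paper excerpt does not state the step unit), the schedule can be sketched as:

```python
def lr_at_epoch(epoch, base_lr=1e-3, decay=0.96):
    # Exponential decay applied per epoch (step unit is an assumption).
    return base_lr * decay ** epoch
```

For the 60-epoch SVHN/CIFAR runs this would end training at roughly a tenth of the initial rate.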