Group Equivariant Convolutional Networks
Authors: Taco Cohen, Max Welling
ICML 2016 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In section 8 we report experimental results on MNIST-rot and CIFAR10, where G-CNNs achieve state of the art results (2.28% error on MNIST-rot, and 4.19% resp. 6.46% on augmented and plain CIFAR10). |
| Researcher Affiliation | Academia | Taco S. Cohen (T.S.COHEN@UVA.NL), University of Amsterdam; Max Welling (M.WELLING@UVA.NL), University of Amsterdam, University of California Irvine, Canadian Institute for Advanced Research |
| Pseudocode | No | The paper describes implementation details for G-convolutions (Section 7) but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks (a minimal sketch of the filter-transform idea is given after this table). |
| Open Source Code | No | The paper does not provide an explicit statement about releasing code or a link to a code repository. |
| Open Datasets | Yes | The rotated MNIST dataset (Larochelle et al., 2007) contains 62000 randomly rotated handwritten digits. The dataset is split into training, validation and test sets of size 10000, 2000 and 50000, respectively. ... The CIFAR-10 dataset consists of 60k images of size 32×32, divided into 10 classes. The dataset is split into 40k training, 10k validation and 10k testing splits. |
| Dataset Splits | Yes | The rotated MNIST dataset (Larochelle et al., 2007) contains 62000 randomly rotated handwritten digits. The dataset is split into training, validation and test sets of size 10000, 2000 and 50000, respectively. ... The CIFAR-10 dataset consists of 60k images of size 32×32, divided into 10 classes. The dataset is split into 40k training, 10k validation and 10k testing splits (see the split sketch below the table). |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments. It only mentions general support from Google and Facebook in the acknowledgements. |
| Software Dependencies | No | The paper mentions using the 'Adam algorithm (Kingma & Ba, 2015)' for optimization and 'stochastic gradient descent' but does not specify any software names with version numbers (e.g., TensorFlow 2.x, PyTorch 1.x). |
| Experiment Setup | Yes | We performed model selection using the validation set, yielding a CNN architecture (Z2CNN) with 7 layers of 3×3 convolutions (4×4 in the final layer), 20 channels in each layer, relu activation functions, batch normalization, dropout, and max-pooling after layer 2. For optimization, we used the Adam algorithm (Kingma & Ba, 2015). ... For the ResNets, we used stochastic gradient descent with an initial learning rate of 0.05 and momentum 0.9. The learning rate was divided by 10 at epochs 50, 100 and 150, and training was continued for 300 epochs (see the training-schedule sketch below the table). |
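
Section 7 of the paper describes G-convolutions as a filter transformation followed by an ordinary planar convolution. The snippet below is a minimal sketch of that idea for the first-layer case (Z² input, p4 output) in PyTorch. The function name `p4_conv_z2` and the use of PyTorch are illustrative assumptions, not the authors' (unreleased) code, and a full p4-to-p4 layer would additionally permute the rotation channels of its input.

```python
import torch
import torch.nn.functional as F

def p4_conv_z2(x, weight):
    """Sketch of a first-layer p4 group convolution (Z^2 -> p4).

    x:      (batch, in_channels, H, W) input feature map on Z^2
    weight: (out_channels, in_channels, k, k) learnable filter bank

    Returns a tensor of shape (batch, out_channels, 4, H', W'),
    where axis 2 indexes the four planar 90-degree rotations.
    """
    outs = []
    for r in range(4):
        # rotate every filter by r * 90 degrees in the spatial plane,
        # then apply an ordinary planar convolution
        w_r = torch.rot90(weight, r, dims=(2, 3))
        outs.append(F.conv2d(x, w_r))
    # stack the four rotated responses along a new rotation axis
    return torch.stack(outs, dim=2)
```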
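The CIFAR-10 splits quoted in the table (40k train / 10k validation / 10k test) can be reproduced with a sketch like the one below. The paper does not state how the validation set was carved out of the 50k training images, so the random split and fixed seed are assumptions, as is the use of torchvision for loading; the rotated MNIST dataset has no standard torchvision loader and is omitted here.

```python
import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

# CIFAR-10: 50k training images -> assumed 40k train / 10k validation split,
# plus the standard 10k test set
full_train = datasets.CIFAR10(root="./data", train=True, download=True,
                              transform=transforms.ToTensor())
test_set = datasets.CIFAR10(root="./data", train=False, download=True,
                            transform=transforms.ToTensor())
train_set, val_set = random_split(full_train, [40_000, 10_000],
                                  generator=torch.Generator().manual_seed(0))
```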
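For the ResNet experiments, the quoted setup specifies SGD with an initial learning rate of 0.05, momentum 0.9, the rate divided by 10 at epochs 50, 100 and 150, and 300 epochs of training. The following is a minimal sketch of that schedule in PyTorch; `model` and `train_loader` are hypothetical placeholders, and the network architecture itself is not reproduced here.

```python
import torch
import torch.nn.functional as F

# `model` and `train_loader` are hypothetical placeholders for the ResNet
# and the CIFAR-10 training loader; they are not specified in the paper's text.
optimizer = torch.optim.SGD(model.parameters(), lr=0.05, momentum=0.9)
# divide the learning rate by 10 at epochs 50, 100 and 150
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[50, 100, 150], gamma=0.1)

for epoch in range(300):
    for x, y in train_loader:
        optimizer.zero_grad()
        loss = F.cross_entropy(model(x), y)
        loss.backward()
        optimizer.step()
    scheduler.step()
```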