Efficient Equivariant Network
Authors: Lingshen He, Yuxuan Chen, Zhengyang Shen, Yiming Dong, Yisen Wang, Zhouchen Lin
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments verify that our model can significantly improve previous works with smaller model size. Especially, under the setting of training on 1/5 data of CIFAR10, our model improves G-CNNs by 5%+ accuracy, while using only 56% parameters and 68% FLOPs. (Abstract) |
| Researcher Affiliation | Collaboration | ¹Key Laboratory of Machine Perception (MOE), School of Artificial Intelligence, Peking University; ²Institute for Artificial Intelligence, Peking University; ³School of Mathematical Sciences and LMAM, Peking University; ⁴Pazhou Lab, Guangzhou 510330, China |
| Pseudocode | No | The paper describes the layer implementation in text and includes Figure 1, which is a diagram illustrating the E4-layer, but it does not contain structured pseudocode or an algorithm block. |
| Open Source Code | No | The paper does not provide an explicit statement about releasing source code or a link to a code repository. The 'Limitation and Future Work' section mentions plans for future implementation: 'In the future, we will try to implement a customized CUDA kernel for GPU acceleration to reduce training and inference time of our model,' implying the code is not currently publicly available. |
| Open Datasets | Yes | The MNIST-rot dataset [33] is the most widely used benchmark to test the equivariant models. It contains 62k 28×28 randomly rotated gray-scale handwritten digits. Images in the dataset are split into 10k for training, 2k for validation and 50k for testing. (Section 5.1) The CIFAR-10 and the CIFAR100 datasets consist of 32×32 images... Both of the datasets contain 50k training data and 10k testing data. (Section 5.2) |
| Dataset Splits | Yes | Images in the dataset are split into 10k for training, 2k for validation and 50k for testing. (Section 5.1) |
| Hardware Specification | Yes | All the experiments are done on the GeForce RTX 3090 GPU. |
| Software Dependencies | No | The paper mentions optimizers like 'Adam optimizer' and 'stochastic gradient descent' but does not specify versions for any software libraries, frameworks, or programming languages used (e.g., PyTorch 1.9, Python 3.8). |
| Experiment Setup | Yes | Our model is trained using the Adam optimizer [27] for 200 epochs with a batch size of 128. The learning rate is initialized as 0.02 and is reduced by 10 at the 60th, 120th and 160th epochs. The weight decay is set as 0.0001 and no data augmentation is used during training. (Section 5.1) We use the stochastic gradient descent with an initial learning rate of 0.1, a Nesterov momentum of 0.9 and a weight decay of 0.0005. The learning rate is reduced by 5 at 60th, 120th, and 160th epochs. Models are trained for 200 epochs using 128 batch size. (Section 5.2) |
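The MNIST-rot split quoted in the Dataset Splits row (10k train / 2k validation / 50k test) can be materialized with a short loader. The sketch below is an assumption: it presumes the common `.amat` distribution of MNIST-rot, in which the 12k train/validation file is ordered with the training images first; the file names are not taken from the paper.

```python
import numpy as np

# Assumed file names from the common MNIST-rot distribution;
# each row holds 784 pixel values followed by a class label.
train_valid = np.loadtxt("mnist_all_rotation_normalized_float_train_valid.amat")
test = np.loadtxt("mnist_all_rotation_normalized_float_test.amat")

# First 10k rows as training data, remaining 2k as validation (assumed ordering).
x_train, y_train = train_valid[:10000, :-1].reshape(-1, 28, 28), train_valid[:10000, -1]
x_valid, y_valid = train_valid[10000:, :-1].reshape(-1, 28, 28), train_valid[10000:, -1]
x_test, y_test = test[:, :-1].reshape(-1, 28, 28), test[:, -1]

# Sanity check against the split sizes reported in Section 5.1.
assert x_train.shape[0] == 10000 and x_valid.shape[0] == 2000 and x_test.shape[0] == 50000
```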
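The hyperparameters quoted in the Experiment Setup row map onto a standard PyTorch optimizer/scheduler configuration. The sketch below is a reconstruction under stated assumptions, not the authors' code: `model` is a placeholder for the paper's E4-layer network, "reduced by 10" is read as a 10× learning-rate decay (gamma = 0.1) for the MNIST-rot/Adam setting, and "reduced by 5" as a 5× decay (gamma = 0.2) for the CIFAR/SGD setting.

```python
import torch

# Placeholder for the paper's E4-layer network; any nn.Module works for this sketch.
model = torch.nn.Linear(28 * 28, 10)

# MNIST-rot setting (Section 5.1): Adam, lr 0.02, weight decay 1e-4,
# 200 epochs, batch size 128, lr reduced at epochs 60/120/160.
# Assumption: "reduced by 10" means a 10x decay (gamma = 0.1).
optimizer = torch.optim.Adam(model.parameters(), lr=0.02, weight_decay=1e-4)
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[60, 120, 160], gamma=0.1)

# CIFAR-10/100 setting (Section 5.2): SGD, lr 0.1, Nesterov momentum 0.9,
# weight decay 5e-4, same milestones over 200 epochs with batch size 128.
# Assumption: "reduced by 5" means a 5x decay (gamma = 0.2).
optimizer_cifar = torch.optim.SGD(
    model.parameters(), lr=0.1, momentum=0.9, nesterov=True, weight_decay=5e-4)
scheduler_cifar = torch.optim.lr_scheduler.MultiStepLR(
    optimizer_cifar, milestones=[60, 120, 160], gamma=0.2)

for epoch in range(200):
    # ... one training pass over 128-sample mini-batches goes here ...
    scheduler.step()
```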