Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Deformation Robust Roto-Scale-Translation Equivariant CNNs
Authors: Liyao Gao, Guang Lin, Wei Zhu
TMLR 2022 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Numerical experiments on MNIST, Fashion-MNIST, and STL-10 demonstrate that the proposed model yields remarkable gains over prior arts, especially in the small data regime where both rotation and scaling variations are present within the data. |
| Researcher Affiliation | Academia | L. Mars Gao EMAIL Paul G. Allen School of Computer Science & Engineering University of Washington Seattle, WA 98195, USA; Guang Lin EMAIL Department of Mathematics and School of Mechanical Engineering Purdue University West Lafayette, IN 47907, USA; Wei Zhu EMAIL Department of Mathematics and Statistics University of Massachusetts Amherst Amherst, MA 01003, USA |
| Pseudocode | No | The paper describes methods using mathematical equations and propositions, but does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code and experiments in our paper are available at https://github.com/gaoliyao/Roto-scale-translation-Equivariant-CNN. We specifically include the experiments for MNIST and Fashion-MNIST for this version. |
| Open Datasets | Yes | We conduct the experiments on the Rotated-and-Scaled MNIST (RS-MNIST), Rotated-and-Scaled Fashion MNIST (RS-Fashion), SIM2MNIST (Esteves et al., 2017), as well as the STL-10 data sets (Coates et al., 2011b). |
| Dataset Splits | Yes | We generate five independent realizations of the rotated and rescaled data [cf. Section 6.1], which are split into Ntr = 5,000 or 2,000 images for training, 2,000 images for validation, and 50,000 images for testing. |
| Hardware Specification | No | In order for all models to be trained on a single GPU, we choose a ResNet (He et al., 2016) with 16 layers as the baseline... (This only mentions 'a single GPU' without any specific model or detailed specifications.) |
| Software Dependencies | No | The paper mentions using Python, PyTorch, and CUDA in the GitHub repository context in the reproducibility section, but does not provide specific version numbers for these or other key software dependencies within the main text. |
| Experiment Setup | Yes | We use the Adam optimizer (Kingma & Ba, 2014) to train all models for 60 epochs with the batch size set to 128. We set the initial learning rate to 0.01, which is scheduled to decrease tenfold after 30 epochs. ... We train all models for 1000 epochs with a batch size of 64, using an SGD optimizer with Nesterov momentum set to 0.9 and weight decay set to 5×10⁻⁴. Learning rate starts at 0.1, and is scheduled to decrease tenfold after 300, 400, 600, and 800 epochs. |
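The two step learning-rate schedules quoted in the Experiment Setup row can be sketched as follows. This is a minimal illustration, not the authors' code: the helper `lr_at_epoch` is a hypothetical name, and the milestone semantics ("decrease tenfold after epoch N") follow the convention of PyTorch's `MultiStepLR`.

```python
def lr_at_epoch(epoch, base_lr, milestones, gamma=0.1):
    """Step schedule: multiply the base rate by `gamma` once per milestone passed."""
    return base_lr * gamma ** sum(epoch >= m for m in milestones)

# MNIST / Fashion-MNIST setup (as reported): Adam, initial lr 0.01,
# tenfold decrease after epoch 30, trained for 60 epochs.
mnist_lrs = [lr_at_epoch(e, base_lr=0.01, milestones=[30]) for e in range(60)]

# STL-10 setup (as reported): SGD with Nesterov momentum 0.9 and weight
# decay 5e-4, initial lr 0.1, tenfold decreases after epochs 300, 400,
# 600, and 800, trained for 1000 epochs.
stl10_lrs = [lr_at_epoch(e, base_lr=0.1, milestones=[300, 400, 600, 800])
             for e in range(1000)]
```

The same schedules are expressible in PyTorch as `torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[300, 400, 600, 800], gamma=0.1)` for the STL-10 run.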