Gradually Updated Neural Networks for Large-Scale Image Recognition
Authors: Siyuan Qiao, Zhishuai Zhang, Wei Shen, Bo Wang, Alan Yuille
ICML 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments show that the networks based on our method achieve the state-of-the-art performances on CIFAR and ImageNet datasets. |
| Researcher Affiliation | Collaboration | Johns Hopkins University, Shanghai University, Hikvision Research. |
| Pseudocode | Yes | Algorithm 1 Back-propagation for GUNN. Input: U(·) = (U_{c_l} ∘ U_{c_{l−1}} ∘ ... ∘ U_{c_1})(·), input x, output y = U(x), gradients ∂L/∂y, and parameters Θ for U. Output: ∂L/∂Θ, ∂L/∂x. Initialize ∂L/∂x ← ∂L/∂y; for i ← l to 1 do: y_c ← x_c, ∀c ∈ c_i; ∂L/∂y, ∂L/∂Θ_{c_i} ← BP(y, ∂L/∂x, U_{c_i}, Θ_{c_i}); (∂L/∂x)_c ← (∂L/∂y)_c, ∀c ∈ c_i; (∂L/∂x)_c ← (∂L/∂x)_c + (∂L/∂y)_c, ∀c ∉ c_i; end. (A code sketch of the gradual update follows the table.) |
| Open Source Code | No | The paper does not provide an explicit statement about releasing source code for the methodology, nor does it include any links to a code repository. |
| Open Datasets | Yes | We test our proposed GUNN on highly competitive benchmark datasets, i.e. CIFAR (Krizhevsky & Hinton, 2009) and ImageNet (Russakovsky et al., 2015). |
| Dataset Splits | Yes | For both of the datasets, the training and test set contain 50,000 and 10,000 images, respectively. ... The ImageNet dataset (Russakovsky et al., 2015) contains about 1.28 million color images for training and 50,000 for validation. |
| Hardware Specification | Yes | All the results reported for CIFAR, regardless of the detailed configurations, were trained using 4 NVIDIA Titan X GPUs with the data parallelism. ... We use 8 Tesla V100 GPUs with the data parallelism to get the reported results. |
| Software Dependencies | No | The paper mentions using 'stochastic gradient descents' and 'data parallelism' but does not specify any software names with version numbers, such as a deep learning framework or specific libraries. |
| Experiment Setup | Yes | the initial learning rate is set to 0.1, the weight decay is set to 1e-4, and the momentum is set to 0.9 without dampening. We train the models for 300 epochs. The learning rate is divided by 10 at 150th epoch and 225th epoch. We set the batch size to 64, following (Huang et al., 2017b). |
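To make the gradual update that Algorithm 1 back-propagates through concrete, here is a minimal PyTorch-style sketch (the paper does not name a framework; the group count and the BN-ReLU-Conv segment update are illustrative assumptions, not the paper's exact architecture). Channels are overwritten one segment at a time, so autograd yields exactly the channel-wise gradient replacement and accumulation spelled out in the pseudocode.

```python
import torch
import torch.nn as nn


class GUNNBlock(nn.Module):
    """Sketch of a gradually updated layer: the channels are split into
    `num_groups` segments that are recomputed one after another, each update
    seeing the partially updated feature map, i.e. U = U_{c_l} ∘ ... ∘ U_{c_1}."""

    def __init__(self, channels: int, num_groups: int = 4):
        super().__init__()
        assert channels % num_groups == 0
        self.seg = channels // num_groups
        # Assumed per-segment update: BN -> ReLU -> 3x3 conv producing one segment.
        self.updates = nn.ModuleList([
            nn.Sequential(
                nn.BatchNorm2d(channels),
                nn.ReLU(inplace=True),
                nn.Conv2d(channels, self.seg, kernel_size=3, padding=1, bias=False),
            )
            for _ in range(num_groups)
        ])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for i, update in enumerate(self.updates):
            new_seg = update(x)  # computed from the current, partly updated map
            lo, hi = i * self.seg, (i + 1) * self.seg
            # Overwrite only the channels in c_i; all other channels pass through.
            x = torch.cat([x[:, :lo], new_seg, x[:, hi:]], dim=1)
        return x


if __name__ == "__main__":
    block = GUNNBlock(channels=64, num_groups=4)
    out = block(torch.randn(2, 64, 32, 32))
    print(out.shape)  # torch.Size([2, 64, 32, 32])
```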
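The quoted CIFAR optimization settings map onto a standard SGD recipe with step decay. Below is a minimal sketch assuming PyTorch (again, the paper names no framework); `GUNNBlock` from the sketch above stands in for the full GUNN model, the training-loop body is omitted, and the `nn.DataParallel` wrapper mirrors the reported multi-GPU data parallelism.

```python
import torch
import torch.nn as nn

# Reported recipe: SGD, lr 0.1, momentum 0.9 without dampening, weight decay
# 1e-4, 300 epochs, lr divided by 10 at epochs 150 and 225, batch size 64.
model = GUNNBlock(channels=64)             # stand-in for the full GUNN network
if torch.cuda.is_available():
    model = nn.DataParallel(model.cuda())  # data parallelism across available GPUs

optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9,
                            dampening=0, weight_decay=1e-4)
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer,
                                                 milestones=[150, 225], gamma=0.1)

for epoch in range(300):
    # train_one_epoch(model, optimizer, train_loader)  # batch size 64; loop omitted
    scheduler.step()
```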