Charged Point Normalization: An Efficient Solution to the Saddle Point Problem

Authors: Armen Aghajanyan

ICLR 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The system drastically improves learning in a range of deep neural networks on various data-sets in comparison to non-CPN neural networks.
Researcher Affiliation | Academia | Armen Aghajanyan, Bellevue, WA 98007, USA, armen.ag@live.com. The paper does not give an explicit university or company affiliation, only a personal email address and a city/state. However, since it is a research paper submitted to ICLR (an academic conference), it is classified as academic.
Pseudocode | No | The paper does not contain any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper mentions that CPN was implemented in Theano and Keras, which are third-party libraries, but does not provide any statement or link to the open-source code for their specific CPN implementation.
Open Datasets | Yes | The first test conducted was using a multilayer perceptron on the MNIST dataset. ... The next experiment conducted was using a convolutional neural network on the CIFAR10 (Krizhevsky et al., a). ... The CIFAR100 (Krizhevsky et al., b) setup was nearly identical... We selected the path-finding problem of the BABI dataset... A dataset-loading sketch is given after the table.
Dataset Splits | No | We do not show results on a validation set, because we care about the efficiency and performance of the optimization algorithm, not whether or not it overfits. ... We used the train split of each data-set. The paper trains on subsets of CIFAR (10,000 or 20,000 random images) but does not specify explicit train/test/validation splits (percentages or counts) needed for full reproducibility.
Hardware Specification | Yes | All training and testing was run on a Nvidia GTX 980 GPU.
Software Dependencies | No | Charged Point Normalization was implemented in Theano (Bastien et al., 2012) and integrated with the Keras (Chollet, 2015) library. The paper does not specify version numbers for these software components.
Experiment Setup | Yes | The CPN hyper-parameters were: β = 0.001, λ = 0.1, with the moving average parameter α = 0.95. ... The optimization algorithm used was stochastic gradient descent with a learning rate of 0.01, decay of 1e-6, momentum of 0.9, with Nesterov acceleration. The batch size used was 32. ... The ADAM (Kingma & Ba, 2014) optimization algorithm was used for both recurrent structures with the parameters: α = 0.001, β1 = 0.9, β2 = 0.999, ϵ = 1e-08. An optimizer-configuration sketch is given after the table.
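
Dataset-loading sketch. The datasets named in the Open Datasets row (MNIST, CIFAR10, CIFAR100) all have standard loaders in keras.datasets; bAbI does not and is omitted here. This is a minimal illustration, not the authors' code: the subset size and random seed for the CIFAR subset are assumptions, since the paper reportedly does not describe the exact sampling procedure.

```python
# Minimal sketch (not from the paper): load the public datasets named in the
# report via the standard keras.datasets loaders, and draw a random CIFAR
# subset of the size the report mentions. Subset size and seed are assumptions.
import numpy as np
from keras.datasets import mnist, cifar10, cifar100

(x_train, y_train), (x_test, y_test) = mnist.load_data()   # MNIST MLP experiment
(c10_x, c10_y), _ = cifar10.load_data()                     # CIFAR10 CNN experiment
(c100_x, c100_y), _ = cifar100.load_data()                  # CIFAR100 CNN experiment

# The report states training used random CIFAR subsets of 10,000 or 20,000
# images; the exact sampling procedure is not specified, so this is one
# plausible reading.
rng = np.random.RandomState(0)
idx = rng.choice(len(c10_x), size=10000, replace=False)
c10_subset_x, c10_subset_y = c10_x[idx], c10_y[idx]
```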
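Optimizer-configuration sketch. The Experiment Setup row quotes concrete SGD and ADAM settings, and the paper says CPN was integrated with Keras, so the settings are expressed below with the standard Keras optimizer classes. This is a sketch under those assumptions, not the authors' implementation: the model definition and the CPN mechanism itself are omitted, and the commented compile call is hypothetical.

```python
# Minimal sketch (not the authors' code): the optimizer settings quoted in the
# Experiment Setup row, written with the standard Keras optimizer classes.
from keras.optimizers import SGD, Adam

# CNN/MLP experiments: SGD with lr 0.01, decay 1e-6, momentum 0.9,
# Nesterov acceleration, batch size 32.
sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)

# Recurrent experiments: ADAM with the parameters quoted in the report.
adam = Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-08)

# CPN hyper-parameters as reported; how they enter the update rule is defined
# in the paper, not reproduced here.
cpn_beta, cpn_lambda, cpn_alpha = 0.001, 0.1, 0.95

# Hypothetical usage, assuming a Keras `model` object exists:
# model.compile(optimizer=sgd, loss='categorical_crossentropy')
# model.fit(x_train, y_train, batch_size=32)
```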