Dynamic Optimization of Neural Network Structures Using Probabilistic Modeling
Authors: Shinichi Shirakawa, Yasushi Iwata, Youhei Akimoto
AAAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We apply the proposed method to several structure optimization problems such as selection of layers, selection of unit types, and selection of connections using the MNIST, CIFAR-10, and CIFAR-100 datasets. The experimental results show that the proposed method can find the appropriate and competitive network structures. |
| Researcher Affiliation | Academia | Shinichi Shirakawa, Yokohama National University, shirakawa-shinichi-bg@ynu.ac.jp; Yasushi Iwata, Yokohama National University, iwata-yasushi-ct@ynu.jp; Youhei Akimoto, Shinshu University, y_akimoto@shinshu-u.ac.jp |
| Pseudocode | Yes | Algorithm 1 Optimization procedure of the proposed method instantiated with Bernoulli distribution. |
| Open Source Code | No | The paper mentions that algorithms are implemented using the Chainer framework, but it does not state that the authors' own implementation code for the proposed methodology is open-source or publicly available. |
| Open Datasets | Yes | We use the MNIST handwritten digits dataset containing the 60,000 training examples and 10,000 test examples of 28×28 gray-scale images. ... We use the CIFAR-10 and CIFAR-100 datasets in which the numbers of classes are 10 and 100, respectively. The numbers of training and test images are 50,000 and 10,000, respectively, and the size of the images is 32×32. |
| Dataset Splits | Yes | The training data is split into training and validation sets in the ratio of nine to one; the validation set is used to evaluate a hyper-parameter after training the neural network with a candidate hyper-parameter. |
| Hardware Specification | Yes | The algorithms are implemented by the Chainer framework (Tokui et al. 2015) (version 1.23.0) on NVIDIA Geforce GTX 1070 GPU for experiments (I) to (III) and on NVIDIA TITAN X GPU for experiment (IV). |
| Software Dependencies | Yes | The algorithms are implemented by the Chainer framework (Tokui et al. 2015) (version 1.23.0). ... We use the GPyOpt package (version 1.0.3, http://github.com/SheffieldML/GPyOpt) for the Bayesian optimization implementation and adopt the default parameter setting. |
| Experiment Setup | Yes | In all experiments, the SGD with a Nesterov momentum (Sutskever et al. 2013) of 0.9 and a weight decay of 10^-4 is used to optimize the weight parameters. The learning rate is divided by 10 at 1/2 and 3/4 of the maximum number of epochs. ... For (I) Selection of Layers: The data sample size and the number of epochs are set to N = 64 and 100 for Adaptive Layer (a), respectively, and N = 128 and 200 for other algorithms. ... We initialize the learning rate of SGD by 0.01 and the Bernoulli parameters by θ_init = 0.5 or θ_init = 1 - 1/31 ≈ 0.968. (A hedged sketch of the Bernoulli-parameter update described in the Pseudocode and Experiment Setup rows follows the table.) |
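The Pseudocode and Experiment Setup rows describe a Bernoulli-parameterized structure distribution whose parameters θ are updated alongside the network weights. The snippet below is a minimal NumPy sketch of a single θ-update step in that spirit; it is not the authors' Chainer implementation, and the rank-based utility, learning rate `eta`, clipping constant `eps`, and the toy sizes `D` and `N` are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def update_theta(theta, losses, masks, eta=0.1, eps=1e-3):
    """Sketch of one update of Bernoulli parameters theta (shape (D,)).

    masks:  (N, D) binary structure samples drawn from Bernoulli(theta).
    losses: (N,)   losses of the networks built from those samples.
    """
    # Rank-based utility: reward the best quarter, penalize the worst quarter
    # (the exact utility weights here are an assumption, not the paper's).
    order = np.argsort(losses)
    q = max(1, len(losses) // 4)
    u = np.zeros(len(losses))
    u[order[:q]] = 1.0
    u[order[-q:]] = -1.0
    # For a Bernoulli distribution, the natural gradient of log p(m | theta)
    # reduces to (m - theta), so the update is a utility-weighted average step.
    grad = (u[:, None] * (masks - theta)).mean(axis=0)
    # Clip to keep theta strictly inside (0, 1).
    return np.clip(theta + eta * grad, eps, 1.0 - eps)

# Toy usage with illustrative sizes: D structure bits, N sampled structures.
D, N = 31, 64
theta = np.full(D, 0.5)                       # e.g. theta_init = 0.5
masks = (rng.random((N, D)) < theta).astype(float)
losses = rng.random(N)                        # placeholder for per-sample losses
theta = update_theta(theta, losses, masks)
```

In the actual method, the sampled masks also gate the forward pass so that the weights are trained by SGD on the sampled structures within the same loop; the sketch above isolates only the distribution-parameter step.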