Deep Learning with S-Shaped Rectified Linear Activation Units
Authors: Xiaojie Jin, Chunyan Xu, Jiashi Feng, Yunchao Wei, Junjun Xiong, Shuicheng Yan
AAAI 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments with two popular CNN architectures, Network in Network and GoogLeNet on scale-various benchmarks including CIFAR10, CIFAR100, MNIST and ImageNet demonstrate that SReLU achieves remarkable improvement compared to other activation functions. |
| Researcher Affiliation | Collaboration | (1) NUS Graduate School for Integrative Science and Engineering, NUS; (2) Department of ECE, NUS; (3) Beijing Samsung Telecom R&D Center; (4) School of CSE, Nanjing University of Science and Technology |
| Pseudocode | No | The paper describes the mathematical formulations and update rules for SReLU but does not provide a formal pseudocode or algorithm block (a hedged sketch of the piecewise definition is given after this table). |
| Open Source Code | Yes | The codes of SReLU are available at https://github.com/AIROBOTAI/caffe/tree/SReLU. |
| Open Datasets | Yes | We conduct experiments on four datasets with different scales, including CIFAR-10, CIFAR-100 (Krizhevsky and Hinton 2009), MNIST (LeCun et al. 1998) and a much larger dataset, ImageNet (Deng et al. 2009) |
| Dataset Splits | Yes | For every dataset, we randomly sample 20% of the total training data as the validation set to configure the needed hyperparameters in different methods. |
| Hardware Specification | Yes | To reduce the training time, four NVIDIA TITAN GPUs are employed in parallel for training. Other hardware information of the PCs we use includes Intel Core i7 3.3GHz CPU, 64G RAM and 2T hard disk. |
| Software Dependencies | No | The paper states 'We choose Caffe (Jia et al. 2014) as the platform to conduct our experiments,' but it does not specify a version number for Caffe or any other software dependencies. |
| Experiment Setup | Yes | For the setting of hyperparameters (such as learning rate, weight decay and dropout ratio, etc.), we follow the published configurations of original networks. ... For SReLU, we use a^l = 0.2 and k = 0.9|X_i| for all datasets. |
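
As the Pseudocode row notes, the paper presents SReLU only as mathematical formulations. The snippet below is a minimal NumPy sketch of the standard piecewise SReLU definition (four learnable parameters t^r, a^r, t^l, a^l, learned per channel in the paper but treated as scalars here for brevity). Only the left slope a^l = 0.2 comes from the quoted setup; the other values are illustrative assumptions, and this is not the released Caffe implementation.

```python
import numpy as np

def srelu(x, t_r, a_r, t_l, a_l):
    """Piecewise-linear SReLU:
       slope a_r above the right threshold t_r,
       identity for t_l < x < t_r,
       slope a_l below the left threshold t_l."""
    return np.where(
        x >= t_r, t_r + a_r * (x - t_r),           # right linear segment
        np.where(x <= t_l, t_l + a_l * (x - t_l),  # left linear segment
                 x))                                # identity in between

# a_l = 0.2 matches the setting quoted above; t_r, a_r, t_l are illustrative.
x = np.linspace(-3.0, 3.0, 7)
print(srelu(x, t_r=1.0, a_r=0.5, t_l=0.0, a_l=0.2))
```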
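
The Dataset Splits row states that 20% of the training data is randomly held out as a validation set for hyperparameter selection. A minimal sketch of such a split, with random arrays standing in for any of the datasets (array shapes are illustrative, not taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in training set with CIFAR-like sample shapes (purely illustrative).
train_x = rng.normal(size=(1000, 3, 32, 32)).astype(np.float32)
train_y = rng.integers(0, 10, size=1000)

# Randomly hold out 20% of the training data as the validation set.
perm = rng.permutation(len(train_x))
n_val = int(0.2 * len(train_x))
val_idx, tr_idx = perm[:n_val], perm[n_val:]
val_x, val_y = train_x[val_idx], train_y[val_idx]
tr_x, tr_y = train_x[tr_idx], train_y[tr_idx]
print(tr_x.shape, val_x.shape)  # (800, 3, 32, 32) (200, 3, 32, 32)
```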