Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Nonparametrically Learning Activation Functions in Deep Neural Nets

Authors: Carson Eisenach, Zhaoran Wang, Han Liu

ICLR 2017 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To demonstrate the power of our novel techniques, we test them on image recognition datasets and achieve up to a 15% relative increase in test performance compared to the baseline.
Researcher Affiliation | Academia | Carson Eisenach, Princeton University (EMAIL); Han Liu, Princeton University (EMAIL); Zhaoran Wang, Princeton University (EMAIL)
Pseudocode | Yes | Algorithm 1: Generic Two Stage Training for Deep Convolutional Neural Networks
Open Source Code | No | The paper mentions using open-source libraries such as Theano, CUDA, and cuDNN, but it does not provide a link to, or an explicit statement about the availability of, the authors' own implementation code.
Open Datasets | Yes | This dataset consists of 60,000 training and 10,000 testing images. ... The dataset is from LeCun & Cortes (1999). (for MNIST) and The CIFAR-10 dataset is due to Krizhevsky (2009). This dataset consists of 50,000 training and 10,000 test images.
Dataset Splits | No | The paper explicitly states the numbers of training and testing images for MNIST and CIFAR-10, but it does not specify a separate validation split.
Hardware Specification | Yes | We ran our simulations on Princeton's SMILE server and TIGER computing cluster. The SMILE server was equipped with a single Tesla K40c GPU, while the TIGER cluster has 200 K20 GPUs.
Software Dependencies | No | The paper mentions using 'Theano (Bergstra et al., 2010)', 'CUDA (John Nickolls, 2008)', and 'cuDNN (Chetlur et al., 2014)', but it does not provide specific version numbers for these software components.
Experiment Setup | Yes | A mini-batch size of 250 was used for all experiments. For dropout nets we follow Srivastava et al. (2014) and use a dropout of 0.9 on the input, 0.75 in the convolution layers and 0.5 in the fully connected layers.
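To make the quoted configuration concrete, below is a minimal sketch of the reported setup. It is not the authors' implementation (which was built on Theano, CUDA, and cuDNN): the network layers and the validation carve-out are assumptions added for illustration, and only the MNIST train/test sizes, the mini-batch size of 250, and the dropout rates quoted above (read as keep-probabilities, following Srivastava et al., 2014) come from the excerpts in the table.

```python
# Illustrative sketch only, not the authors' code. PyTorch is used here purely to make
# the quoted settings concrete; the paper's implementation used Theano/CUDA/cuDNN.
import torch
from torch import nn
from torch.utils.data import DataLoader, random_split
from torchvision import datasets, transforms

# MNIST ships with 60,000 training and 10,000 test images, as quoted above. The paper
# reports no separate validation split, so the 55,000/5,000 carve-out below is an
# assumption made only for illustration.
mnist_train = datasets.MNIST("data", train=True, download=True, transform=transforms.ToTensor())
mnist_test = datasets.MNIST("data", train=False, download=True, transform=transforms.ToTensor())
train_set, val_set = random_split(mnist_train, [55_000, 5_000])

# "A mini-batch size of 250 was used for all experiments."
train_loader = DataLoader(train_set, batch_size=250, shuffle=True)

class SmallConvNet(nn.Module):
    """Hypothetical architecture; the paper's exact layer sizes are not reproduced here."""
    def __init__(self, num_classes: int = 10):
        super().__init__()
        # The quoted rates (0.9 input, 0.75 conv, 0.5 fully connected) are keep-probabilities
        # in the convention of Srivastava et al. (2014); nn.Dropout takes the *drop*
        # probability, so they become p = 0.1 / 0.25 / 0.5 here.
        self.features = nn.Sequential(
            nn.Dropout(0.1),                                     # keep-prob 0.9 on the input
            nn.Conv2d(1, 32, kernel_size=5), nn.ReLU(), nn.MaxPool2d(2),
            nn.Dropout(0.25),                                    # keep-prob 0.75 in conv layers
            nn.Conv2d(32, 64, kernel_size=5), nn.ReLU(), nn.MaxPool2d(2),
            nn.Dropout(0.25),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 4 * 4, 256), nn.ReLU(),               # 28x28 MNIST input -> 4x4x64 here
            nn.Dropout(0.5),                                     # keep-prob 0.5 in fully connected layers
            nn.Linear(256, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))
```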