Understanding and Improving Convolutional Neural Networks via Concatenated Rectified Linear Units
Authors: Wenling Shang, Kihyuk Sohn, Diogo Almeida, Honglak Lee
ICML 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We integrate CReLU into several state-of-the-art CNN architectures and demonstrate improvement in their recognition performance on CIFAR-10/100 and ImageNet datasets with fewer trainable parameters. [see the CReLU sketch below the table] |
| Researcher Affiliation | Collaboration | 1University of Michigan, Ann Arbor; 2NEC Laboratories America; 3Enlitic; 4Oculus VR |
| Pseudocode | Yes | We use a simple linear reconstruction algorithm (see Algorithm 1 in the supplementary materials) to reconstruct the original image from conv1-conv4 features (left to right). |
| Open Source Code | No | The paper does not contain an explicit statement about releasing source code or provide any links to a code repository. |
| Open Datasets | Yes | We evaluate the effectiveness of the CReLU activation scheme on three benchmark datasets: CIFAR-10, CIFAR-100 (Krizhevsky, 2009) and ImageNet (Deng et al., 2009). |
| Dataset Splits | Yes | Since the datasets don't provide pre-defined validation set, we conduct two different cross-validation schemes: 1. Single: we hold out a subset of training set for initial training and retrain the network from scratch using the whole training set until we reach at the same loss on a hold out set (Goodfellow et al., 2013). 2. 10-folds: we divide training set into 10 folds and do validation on each of 10 folds while training the networks on the rest of 9 folds. |
| Hardware Specification | No | The paper only mentions "NVIDIA for the donation of GPUs" in the acknowledgments, which is too general and lacks specific model numbers or configurations required for reproducibility. |
| Software Dependencies | No | The paper does not provide specific software names with version numbers (e.g., Python 3.8, PyTorch 1.9, CUDA 11.1) in its main text. |
| Experiment Setup | Yes | Note that the models with CReLU activation don't need significant hyperparameter tuning from the baseline ReLU model, and in most of our experiments, we only tune dropout rate while other hyperparameters (e.g., learning rate, mini-batch size) remain the same. We also replace ReLU with AVR for comparison with CReLU. [...] We subtract the mean and divide by the standard deviation for preprocessing and use random horizontal flip for data augmentation. |
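
The table repeatedly quotes the paper's CReLU activation, which concatenates a feature map with its negation along the channel axis before rectifying. The sketch below is a minimal illustration of that operation; the PyTorch framing, layer sizes, and identifiers are our own assumptions, not code released by the authors.

```python
import torch
import torch.nn as nn


class CReLU(nn.Module):
    """Concatenated ReLU: concatenate the input with its negation along the
    channel axis and apply ReLU, preserving both phases of each filter
    response and doubling the number of output channels."""

    def forward(self, x):
        # [x, -x] along the channel dimension, then rectify.
        return torch.relu(torch.cat([x, -x], dim=1))


# Hypothetical usage on a CIFAR-10-sized mini-batch: a conv layer followed by
# CReLU yields 2 * out_channels feature maps, so the next layer must expect
# twice as many input channels.
conv = nn.Conv2d(3, 32, kernel_size=3, padding=1)
x = torch.randn(8, 3, 32, 32)
y = CReLU()(conv(x))  # shape: (8, 64, 32, 32)
```

The channel doubling is what lets the paper halve the number of filters per convolutional layer while still reporting improved recognition performance with fewer trainable parameters.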