Regularizing by the Variance of the Activations' Sample-Variances
Authors: Etai Littwin, Lior Wolf
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments demonstrate an improvement in accuracy over the batchnorm technique for both CNNs and fully connected networks. |
| Researcher Affiliation | Collaboration | 1Tel Aviv University 2Facebook AI Research |
| Pseudocode | No | The paper does not include any pseudocode or algorithm blocks. |
| Open Source Code | No | To support reproducibility, the entire code of all of our experiments is to be promptly released. |
| Open Datasets | Yes | The two CIFAR datasets (Krizhevsky & Hinton, 2009) consist of colored natural images sized at 32×32 pixels. The Tiny ImageNet dataset consists of a subset of ImageNet [16]. UCI: We also apply VCL to the 44 UCI datasets with more than 1000 samples. The train/test splits were provided by the authors of [11]. |
| Dataset Splits | Yes | For each dataset, there are 50,000 training images and 10,000 images reserved for testing. The Tiny ImageNet dataset consists of a subset of ImageNet [16], with 200 different classes, each of which has 500 training images and 50 validation images, downscaled to 64×64. The train/test splits were provided by the authors of [11]. |
| Hardware Specification | Yes | Table 1: Time in seconds per 100 iterations (CIFAR-100); timing columns compare an Intel i7 CPU against a Volta GPU per method. |
| Software Dependencies | No | The paper does not specify version numbers for any software dependencies. |
| Experiment Setup | Yes | For all experiments, 500 epochs are used and a batch size N of 250. We employ a learning rate of 0.05, which was reduced at epoch 180 to 0.02, and further reduced by a factor of 10 every 100 epochs. A momentum of 0.9 was used and the L2 regularization term was weighted by 0.0001. The hyperparameters of VCL are fixed: the weight of the VCL regularization is set to γ = 0.01. |
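To make the regularizer named in the title and setup above concrete, the following is a minimal NumPy sketch of a VCL-style penalty. It assumes one plausible reading of "the variance of the activations' sample-variances": split the batch into sub-batches, compute each unit's sample variance in each sub-batch, and penalize the empirical variance of those estimates, weighted by γ = 0.01 as in the reported setup. The function name, the two-way split, and the exact formulation are illustrative assumptions, not the paper's definition.

```python
import numpy as np

def vcl_penalty(activations, num_splits=2, gamma=0.01):
    """Hedged sketch of a VCL-style penalty.

    activations: array of shape (N, D), a batch of N activation
    vectors with D units. The batch is split into `num_splits`
    sub-batches; for each sub-batch we compute the per-unit sample
    variance, then penalize the variance of those estimates across
    sub-batches, averaged over units. `num_splits` and this exact
    form are assumptions for illustration.
    """
    splits = np.array_split(activations, num_splits, axis=0)
    # Per-split, per-unit sample variances: shape (num_splits, D).
    sample_vars = np.stack([s.var(axis=0, ddof=1) for s in splits])
    # Variance of the sample-variances across splits, per unit.
    var_of_vars = sample_vars.var(axis=0)
    return gamma * var_of_vars.mean()
```

Under this reading, activations whose spread is identical across sub-batches incur zero penalty, so the term pushes the network toward stable per-unit variance statistics without normalizing the activations themselves, consistent with the paper's positioning as an alternative to batchnorm.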