Information Plane Analysis for Dropout Neural Networks
Authors: Linara Adilova, Bernhard C. Geiger, Asja Fischer
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate in a range of experiments that this enables a meaningful information plane analysis for a class of dropout neural networks that is widely used in practice. |
| Researcher Affiliation | Collaboration | Linara Adilova, Ruhr University Bochum, Faculty of Computer Science (linara.adilova@ruhr-uni-bochum.de); Bernhard C. Geiger, Know-Center GmbH (geiger@ieee.org); Asja Fischer, Ruhr University Bochum, Faculty of Computer Science (asja.fischer@ruhr-uni-bochum.de). The Know-Center is funded within the Austrian COMET Program (Competence Centers for Excellent Technologies) under the auspices of the Austrian Federal Ministry of Climate Action, Environment, Energy, Mobility, Innovation and Technology, the Austrian Federal Ministry of Digital and Economic Affairs, and by the State of Styria. COMET is managed by the Austrian Research Promotion Agency FFG. |
| Pseudocode | Yes | Algorithm 1: Estimation of MI under Gaussian dropout (an illustrative sketch is given below the table) |
| Open Source Code | Yes | Code for the experiments is public on https://github.com/link-er/IP_dropout. |
| Open Datasets | Yes | The analysis on the MNIST dataset was performed for a LeNet network... We also analyze the IPs for a ResNet18 trained on CIFAR10 |
| Dataset Splits | No | Not found. The paper discusses training and testing, and uses terms like 'train error' and 'test error', but does not specify the dataset split percentages or sample counts for training, validation, and testing sets. |
| Hardware Specification | No | Not found. The paper does not specify any hardware details such as GPU or CPU models used for running experiments. |
| Software Dependencies | No | Not found. The paper does not specify any software dependencies with version numbers. |
| Experiment Setup | Yes | Training proceeded for 200 epochs using SGD with momentum and, different from the original setup, with only one dropout layer after the third convolutional layer. The batch size was set to 100, and the learning rate was initially set to 0.05 and reduced by multiplying it by 0.1 after the 40th, 80th, and 120th epochs. (A training-loop sketch of this setup is given below.) |
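
The Pseudocode row only quotes the title of the paper's Algorithm 1. As a rough illustration of what estimating mutual information under Gaussian dropout can look like, the sketch below computes a Monte Carlo estimate of I(X; Z) assuming multiplicative Gaussian noise Z = h(X) ∘ ε with ε ~ N(1, αI), so that Z | X = x is Gaussian and p(z) is a Gaussian mixture over the samples. This is not the paper's Algorithm 1; the function name `mi_gaussian_dropout` and all parameters are hypothetical.

```python
import numpy as np

def mi_gaussian_dropout(hidden, alpha, n_mc=10):
    """Illustrative Monte Carlo estimate of I(X; Z) in bits, assuming
    Z = h(X) * eps with element-wise eps ~ N(1, alpha).

    hidden : (n, d) array of pre-dropout activations h(x_i)
    alpha  : variance of the multiplicative Gaussian dropout noise
    n_mc   : Monte Carlo samples of Z per data point
    """
    n, d = hidden.shape
    # Conditional covariance of Z given X=x_i is diagonal: alpha * h(x_i)^2
    var = alpha * hidden ** 2 + 1e-12                       # (n, d)

    # h(Z|X): mean differential entropy of the conditional Gaussians
    h_z_given_x = np.mean(0.5 * np.sum(np.log(2 * np.pi * np.e * var), axis=1))

    # h(Z): Monte Carlo estimate, treating p(z) as the mixture
    # (1/n) * sum_i N(z; h(x_i), diag(alpha * h(x_i)^2))
    log_pz = []
    for i in range(n):
        for _ in range(n_mc):
            z = hidden[i] + np.sqrt(var[i]) * np.random.randn(d)
            # log-density of z under every mixture component
            log_comp = -0.5 * np.sum(
                np.log(2 * np.pi * var) + (z - hidden) ** 2 / var, axis=1)
            # log of the mean component density (log-sum-exp for stability)
            m = log_comp.max()
            log_pz.append(m + np.log(np.mean(np.exp(log_comp - m))))
    h_z = -np.mean(log_pz)

    # I(X; Z) = h(Z) - h(Z|X); convert nats to bits
    return (h_z - h_z_given_x) / np.log(2)
```

The estimator scales as O(n^2 * n_mc * d), so it is only meant for small subsets of activations; the paper's actual procedure and any bounds it uses are given in its Algorithm 1.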
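
The Experiment Setup row translates directly into a standard PyTorch training loop. The sketch below is a minimal illustration of the quoted hyperparameters (200 epochs, batch size 100, initial learning rate 0.05, decay by 0.1 after epochs 40, 80, and 120); the momentum value of 0.9, the placeholder model, and the dummy data loader are assumptions, since they are not specified in the excerpt.

```python
import torch
from torch import nn, optim
from torch.optim.lr_scheduler import MultiStepLR

# Placeholder model; the paper uses a LeNet variant with a single dropout
# layer after the third convolutional layer (architecture details omitted).
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))

# Dummy loader standing in for MNIST batches of size 100 (assumption).
train_loader = [(torch.randn(100, 1, 28, 28), torch.randint(0, 10, (100,)))]

# Quoted setup: SGD with momentum, lr 0.05, decay 0.1 at epochs 40/80/120.
# The momentum value 0.9 is an assumption, not stated in the excerpt.
optimizer = optim.SGD(model.parameters(), lr=0.05, momentum=0.9)
scheduler = MultiStepLR(optimizer, milestones=[40, 80, 120], gamma=0.1)
criterion = nn.CrossEntropyLoss()

for epoch in range(200):
    for inputs, targets in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(inputs), targets)
        loss.backward()
        optimizer.step()
    scheduler.step()  # apply the step-wise learning-rate decay per epoch
```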