Information Plane Analysis for Dropout Neural Networks

Authors: Linara Adilova, Bernhard C. Geiger, Asja Fischer

ICLR 2023 | Conference PDF | Archive PDF

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate in a range of experiments that this enables a meaningful information plane analysis for a class of dropout neural networks that is widely used in practice.
Researcher Affiliation | Collaboration | Linara Adilova, Ruhr University Bochum, Faculty of Computer Science (linara.adilova@ruhr-uni-bochum.de); Bernhard C. Geiger, Know-Center GmbH (geiger@ieee.org); Asja Fischer, Ruhr University Bochum, Faculty of Computer Science (asja.fischer@ruhr-uni-bochum.de). The Know-Center is funded within the Austrian COMET Program Competence Centers for Excellent Technologies under the auspices of the Austrian Federal Ministry of Climate Action, Environment, Energy, Mobility, Innovation and Technology, the Austrian Federal Ministry of Digital and Economic Affairs, and by the State of Styria. COMET is managed by the Austrian Research Promotion Agency FFG.
Pseudocode | Yes | Algorithm 1: Estimation of MI under Gaussian dropout (a hedged sketch of such an estimator follows this table).
Open Source Code | Yes | Code for the experiments is public on https://github.com/link-er/IP_dropout.
Open Datasets | Yes | The analysis on the MNIST dataset was performed for a LeNet network... We also analyze the IPs for a ResNet18 trained on CIFAR10.
Dataset Splits | No | Not found. The paper discusses training and testing, and uses terms such as "train error" and "test error", but does not specify split percentages or sample counts for the training, validation, and test sets.
Hardware Specification | No | Not found. The paper does not specify any hardware details, such as GPU or CPU models, used for running the experiments.
Software Dependencies | No | Not found. The paper does not specify any software dependencies with version numbers.
Experiment Setup | Yes | Training proceeded for 200 epochs using SGD with momentum and, different from the original setup, with only one dropout layer after the third convolutional layer. The batch size was set to 100, the learning rate was initially set to 0.05 and was reduced by multiplying it with 0.1 after the 40th, 80th, and 120th epoch. (A training-loop sketch of this schedule follows this table.)
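The Pseudocode row above refers to Algorithm 1, an estimator of the mutual information I(X; Z) under Gaussian dropout. The following is a minimal sketch, not the paper's implementation: it assumes multiplicative Gaussian noise Z = h(X) * eps with eps ~ N(1, sigma^2), approximates the marginal p(z) by the empirical mixture over the sample, and Monte Carlo averages the resulting KL divergences. The function names and default parameters are illustrative assumptions.

```python
import numpy as np
from scipy.special import logsumexp


def diag_gauss_logpdf(z, mean, var):
    """Log-density of a diagonal Gaussian N(mean, diag(var)) at z, summed over the last axis."""
    return -0.5 * np.sum(np.log(2.0 * np.pi * var) + (z - mean) ** 2 / var, axis=-1)


def mi_gaussian_dropout(h, sigma2=0.25, n_mc=16, jitter=1e-12, rng=None):
    """Monte Carlo estimate (in bits) of I(X; Z) under multiplicative Gaussian dropout.

    h      : (N, D) pre-dropout activations h(x_i) for a sample of N inputs.
    sigma2 : dropout noise variance, so Z = h(X) * eps with eps ~ N(1, sigma2).
    n_mc   : number of dropout samples drawn per input.

    Uses I(X; Z) = E_X[ KL(p(z|X) || p(z)) ] with p(z|x_i) = N(h_i, sigma2 * h_i^2)
    and p(z) approximated by the mixture (1/N) * sum_j p(z|x_j).
    """
    rng = np.random.default_rng() if rng is None else rng
    n, d = h.shape
    var = sigma2 * h ** 2 + jitter  # conditional variances; jitter avoids zero variance
    mi_nats = 0.0
    for _ in range(n_mc):
        # draw one noisy representation z_i ~ p(z | x_i) for every input
        z = h + np.sqrt(var) * rng.standard_normal((n, d))
        log_cond = diag_gauss_logpdf(z, h, var)  # log p(z_i | x_i), shape (N,)
        log_pairs = diag_gauss_logpdf(z[:, None, :], h[None, :, :], var[None, :, :])  # (N, N)
        log_marg = logsumexp(log_pairs, axis=1) - np.log(n)  # log of the mixture marginal
        mi_nats += np.mean(log_cond - log_marg)
    return mi_nats / n_mc / np.log(2.0)
```

The pairwise evaluation is O(N^2 * D) in time and memory, which is workable for the moderate sample sizes typically used for information plane plots; how the paper's Algorithm 1 actually approximates the marginal may differ.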
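The Experiment Setup row pins down the optimizer schedule but not every hyperparameter (the momentum coefficient, for instance, is not quoted). A minimal PyTorch sketch of that schedule, with the model, data loader, and momentum value of 0.9 as assumptions:

```python
import torch
from torch import nn, optim
from torch.optim.lr_scheduler import MultiStepLR


def train(model: nn.Module, train_loader, epochs: int = 200, device: str = "cpu"):
    """SGD with momentum, lr 0.05, decayed by a factor of 0.1 after epochs 40, 80, and 120.

    The batch size of 100 is assumed to be set in `train_loader`; the momentum
    value (0.9) is an assumption, as the paper does not state it.
    """
    model.to(device)
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.SGD(model.parameters(), lr=0.05, momentum=0.9)
    scheduler = MultiStepLR(optimizer, milestones=[40, 80, 120], gamma=0.1)

    for _ in range(epochs):
        model.train()
        for inputs, targets in train_loader:
            inputs, targets = inputs.to(device), targets.to(device)
            optimizer.zero_grad()
            loss = criterion(model(inputs), targets)
            loss.backward()
            optimizer.step()
        scheduler.step()  # applies the 0.1 learning-rate drop at the milestone epochs
    return model
```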