DICE: Diversity in Deep Ensembles via Conditional Redundancy Adversarial Estimation

Authors: Alexandre Rame, Matthieu Cord

ICLR 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We obtain state-of-the-art accuracy results on CIFAR-10/100: for example, an ensemble of 5 networks trained with DICE matches an ensemble of 7 networks trained independently. We further analyze the consequences on calibration, uncertainty estimation, out-of-distribution detection and online co-distillation. In this section, we present our experimental results on the CIFAR-10 and CIFAR-100 (Krizhevsky et al., 2009) datasets.
Researcher Affiliation | Collaboration | Alexandre Rame, Sorbonne Université, Paris, France, alexandre.rame@lip6.fr; Matthieu Cord, Sorbonne Université & valeo.ai, Paris, France, matthieu.cord@lip6.fr
Pseudocode | Yes | Algorithm 1: Full DICE Procedure for M = 2 members
Open Source Code | No | The paper states: 'We borrowed the evaluation code from https://github.com/uoguelph-mlrg/confidence_estimation (De Vries & Taylor, 2018).' It does not provide an explicit statement or link for the open-source code of their own proposed method (DICE).
Open Datasets | Yes | In this section, we present our experimental results on the CIFAR-10 and CIFAR-100 (Krizhevsky et al., 2009) datasets.
Dataset Splits | Yes | Hyperparameters for adversarial training and information bottleneck were fine-tuned on a validation dataset made of 5% of the training dataset, see Appendix D.1. For hyperparameter selection and ablation studies, we train on 95% of the training dataset, and analyze performances on the validation dataset made of the remaining 5%. (A train/validation split sketch follows after this table.)
Hardware Specification | No | This work was granted access to the HPC resources of IDRIS under the allocation 20XXAD011011953 made by GENCI.
Software Dependencies | No | The paper mentions general software components like 'ResNet' and 'Wide ResNet' architectures and optimization algorithms like 'SGD', but it does not specify version numbers for programming languages or libraries (e.g., Python, PyTorch, TensorFlow).
Experiment Setup | Yes | Following (Chen et al., 2020b), we used SGD with Nesterov momentum of 0.9, mini-batch size of 128, weight decay of 5e-4, 300 epochs, and a standard learning rate scheduler that sets values {0.1, 0.001, 0.0001} at steps {0, 150, 225} for CIFAR-10/100. log(β_ceb) reaches values {100, 10, 2, 1.5, 1} at steps {0, 8, 175, 250, 300}. (A training-loop sketch follows after this table.)
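
The Dataset Splits row above describes holding out 5% of the CIFAR training set for hyperparameter selection and training on the remaining 95%. Below is a minimal PyTorch sketch of how such a split might be built; it is not the authors' code, and the transforms, random seed, and DataLoader settings are assumptions.

```python
# Minimal sketch (not the authors' released code) of the 95%/5% train/validation
# split described in the "Dataset Splits" row. Transforms, the seed and the
# DataLoader settings are assumptions.
import torch
from torch.utils.data import DataLoader, Subset
from torchvision import datasets, transforms

train_tf = transforms.Compose([
    transforms.RandomCrop(32, padding=4),   # standard CIFAR augmentation (assumed)
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
])
eval_tf = transforms.ToTensor()             # no augmentation on the validation split

# Two views of the same underlying training data so each split keeps its own transform.
train_view = datasets.CIFAR10("./data", train=True, download=True, transform=train_tf)
val_view = datasets.CIFAR10("./data", train=True, download=True, transform=eval_tf)

perm = torch.randperm(len(train_view), generator=torch.Generator().manual_seed(0))
n_val = int(0.05 * len(train_view))         # 5% held out for hyperparameter tuning
val_set = Subset(val_view, perm[:n_val].tolist())
train_set = Subset(train_view, perm[n_val:].tolist())  # remaining 95% for training

train_loader = DataLoader(train_set, batch_size=128, shuffle=True, num_workers=4)
val_loader = DataLoader(val_set, batch_size=128, shuffle=False, num_workers=4)
```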
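
The Experiment Setup row reports SGD with Nesterov momentum 0.9, mini-batch size 128, weight decay 5e-4, 300 epochs, and a piecewise-constant learning rate of {0.1, 0.001, 0.0001} starting at epochs {0, 150, 225}. The sketch below shows one way that schedule could be wired into a training loop; the tiny placeholder model and plain cross-entropy loss are stand-ins, and the DICE-specific adversarial conditional-redundancy objective and the log(β_ceb) annealing are deliberately omitted.

```python
# Minimal sketch (not the authors' implementation) of the reported optimization
# setup: SGD with Nesterov momentum 0.9, weight decay 5e-4, 300 epochs, and the
# piecewise-constant learning-rate schedule quoted above. The linear model and
# plain cross-entropy loss are placeholders; the DICE adversarial terms and the
# log(beta_ceb) annealing are omitted.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))  # placeholder network

optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9,
                            nesterov=True, weight_decay=5e-4)

# Learning rate is the value attached to the last boundary reached:
# 0.1 from epoch 0, 0.001 from epoch 150, 0.0001 from epoch 225.
LR_SCHEDULE = {0: 0.1, 150: 0.001, 225: 0.0001}

def lr_for_epoch(epoch: int) -> float:
    """Piecewise-constant learning rate matching the reported schedule."""
    return LR_SCHEDULE[max(step for step in LR_SCHEDULE if step <= epoch)]

criterion = nn.CrossEntropyLoss()
for epoch in range(300):
    for group in optimizer.param_groups:
        group["lr"] = lr_for_epoch(epoch)
    for images, labels in train_loader:          # train_loader from the split sketch above
        optimizer.zero_grad()
        loss = criterion(model(images), labels)  # DICE would add its diversity terms here
        loss.backward()
        optimizer.step()
```

Setting the learning rate explicitly per epoch avoids expressing the non-uniform decay (x0.01 at epoch 150, then x0.1 at epoch 225) through a single multiplicative scheduler factor.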