DIBS: Diversity Inducing Information Bottleneck in Model Ensembles
Authors: Samarth Sinha, Homanga Bharadhwaj, Anirudh Goyal, Hugo Larochelle, Animesh Garg, Florian Shkurti
AAAI 2021, pp. 9666-9674 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our method on benchmark datasets: MNIST, CIFAR100, Tiny Image Net and MIT Places 2, and compared to the most competitive baselines show significant improvements in classification accuracy, under a shift in the data distribution and in out-of-distribution detection. |
| Researcher Affiliation | Collaboration | 1 University of Toronto, 2 Vector Institute, 3 Mila, 4 Google Brain |
| Pseudocode | No | The paper describes mathematical formulations and algorithmic steps but does not include a clearly labeled pseudocode or algorithm block. |
| Open Source Code | No | The paper does not provide an explicit statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | We evaluate our method on benchmark datasets: MNIST, CIFAR100, Tiny Image Net and MIT Places 2, and compared to the most competitive baselines show significant improvements in classification accuracy, under a shift in the data distribution and in out-of-distribution detection. |
| Dataset Splits | No | The paper mentions training and test sets but does not explicitly provide details about a validation set or specific percentages for train/validation/test splits, nor does it refer to predefined splits with citations for all experiments. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, memory, or processor types used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., library names like PyTorch or TensorFlow with their respective versions) needed to replicate the experiment. |
| Experiment Setup | Yes | For optimization, we use Stochastic Gradient Descent (SGD) (Bottou 2010) with a learning rate of 0.05 and momentum of 0.9 (Sutskever et al. 2013). We decay the learning rate by a factor of 10 every 30 epochs of training. |
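
The "Experiment Setup" row quotes the paper's optimizer settings: SGD with a learning rate of 0.05, momentum of 0.9, and a tenfold learning-rate decay every 30 epochs. Below is a minimal sketch of that schedule. Since the paper does not state its software stack (see the "Software Dependencies" row), the use of PyTorch, the ResNet-18 backbone, and the 90-epoch budget are assumptions made purely for illustration.

```python
# Hedged sketch of the reported optimizer schedule.
# PyTorch, the ResNet-18 model, and the epoch count are assumptions,
# not details taken from the paper.
import torch
import torchvision

model = torchvision.models.resnet18(num_classes=100)  # e.g. CIFAR100 (assumed backbone)

# SGD with learning rate 0.05 and momentum 0.9, as quoted from the paper.
optimizer = torch.optim.SGD(model.parameters(), lr=0.05, momentum=0.9)

# Decay the learning rate by a factor of 10 every 30 epochs.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.1)

for epoch in range(90):   # total number of epochs is not given in the excerpt
    # ... one training pass over the data would go here ...
    scheduler.step()      # apply the per-epoch decay schedule
```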