Maximizing Overall Diversity for Improved Uncertainty Estimates in Deep Ensembles

Authors: Siddhartha Jain, Ge Liu, Jonas Mueller, David Gifford

AAAI 2020, pp. 4264–4271

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We apply MOD to regression tasks including 38 Protein-DNA binding datasets, 9 UCI datasets, and the IMDB-Wiki image dataset. We also explore variants that utilize adversarial training techniques and data density estimation. For out-of-distribution test examples, MOD significantly improves predictive performance and uncertainty calibration without sacrificing performance on test data drawn from same distribution as the training data."
Researcher Affiliation | Collaboration | Siddhartha Jain*¹, Ge Liu*¹, Jonas Mueller², David Gifford¹ (*equal contribution; ¹CSAIL, MIT; ²Amazon Web Services); {sj1, geliu, gifford}@mit.edu, jonasmue@amazon.com
Pseudocode | Yes | "Algorithm 1 MOD Training Procedure (+ Variants)" (a hedged PyTorch sketch of the core idea appears below this table)
Open Source Code | No | The paper does not provide an explicit statement about releasing open-source code for the described methodology, nor does it include a link to a code repository.
Open Datasets | No | The paper mentions using well-known datasets such as the '38 Protein-DNA binding datasets,' '9 UCI datasets,' and the 'IMDB-Wiki image dataset,' but it does not provide a direct link, DOI, or formal citation with author and year for accessing these specific datasets.
Dataset Splits | Yes | "We separate them into extremely small training set (300 examples) and validation set (300 examples), and use the rest as in-distribution test set. ... we used 40% of the data for training and 10% for validation. The remaining data is used as an in-distribution test set." (a split sketch appears below this table)
Hardware Specification | Yes | "All experiments were run on Nvidia Titan X 1080 Ti and Nvidia Titan X 2080 Ti GPUs"
Software Dependencies | Yes | "All experiments were run on Nvidia Titan X 1080 Ti and Nvidia Titan X 2080 Ti GPUs with PyTorch version 1.0."
Experiment Setup | Yes | "All hyperparameters including learning rate, ℓ2-regularization, γ for MOD/Negative Correlation, and adversarial training δ were tuned based on validation set NLL. In every regression task, the search for hyperparameter γ was over the values 0.01, 0.1, 1, 5, 10, 20, 50. For MOD-Adv, we search for δ over 0.2, 1.0, 3.0, 5.0 for UCI and 0.1, 0.5, 1 for the image data." (a grid-search sketch appears below this table)
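
To make the Pseudocode row concrete, here is a minimal PyTorch sketch of the idea behind Algorithm 1, assuming the MOD objective described in the paper: each ensemble member minimizes a Gaussian NLL minus γ times the ensemble's predictive variance on auxiliary inputs. `GaussianMLP`, `sample_aux_inputs`, and the noise-based sampling of auxiliary points are illustrative assumptions, not the authors' exact procedure (the paper's variants use adversarial perturbations and data density estimation instead).

```python
import torch
import torch.nn as nn

class GaussianMLP(nn.Module):
    """Small regressor emitting a Gaussian mean and log-variance."""
    def __init__(self, d_in, d_hidden=50):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(d_in, d_hidden), nn.ReLU())
        self.mean = nn.Linear(d_hidden, 1)
        self.logvar = nn.Linear(d_hidden, 1)

    def forward(self, x):
        h = self.body(x)
        return self.mean(h), self.logvar(h)

def gaussian_nll(mu, logvar, y):
    # Heteroscedastic Gaussian negative log-likelihood.
    return 0.5 * (logvar + (y - mu) ** 2 / logvar.exp()).mean()

def sample_aux_inputs(x, scale=0.5):
    # Illustrative assumption: noisy perturbations of the batch stand in
    # for the paper's out-of-distribution input sampling.
    return x + scale * torch.randn_like(x)

def mod_step(ensemble, optimizers, x, y, gamma):
    """One step per member: Gaussian NLL minus gamma * ensemble variance."""
    x_aux = sample_aux_inputs(x)
    for model, opt in zip(ensemble, optimizers):
        mu, logvar = model(x)
        own_aux = model(x_aux)[0]
        other_aux = [m(x_aux)[0].detach() for m in ensemble if m is not model]
        preds = torch.stack([own_aux] + other_aux)  # (members, batch, 1)
        diversity = preds.var(dim=0).mean()         # overall ensemble variance
        loss = gaussian_nll(mu, logvar, y) - gamma * diversity
        opt.zero_grad()
        loss.backward()
        opt.step()
```

For instance, `ensemble = [GaussianMLP(8) for _ in range(5)]` with one Adam optimizer per member recovers the deep-ensemble setting that the diversity penalty is applied to.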
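The 40%/10% protocol quoted in the Dataset Splits row is straightforward to mirror; the sketch below uses dummy arrays as stand-ins for the actual datasets.

```python
import numpy as np

# 40% train / 10% validation / remaining 50% in-distribution test,
# as quoted above. X and y are dummy stand-ins for a real dataset.
rng = np.random.default_rng(0)
X, y = rng.normal(size=(1000, 8)), rng.normal(size=1000)

perm = rng.permutation(len(X))
n_train, n_val = int(0.4 * len(X)), int(0.1 * len(X))
train_idx = perm[:n_train]
val_idx = perm[n_train:n_train + n_val]
test_idx = perm[n_train + n_val:]
X_train, y_train = X[train_idx], y[train_idx]
```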
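Finally, the Experiment Setup row describes a plain grid search selected by validation-set NLL. In the sketch below, `train_ensemble` and `validation_nll` are placeholder helpers, not the paper's code; only the grid values come from the quoted text.

```python
import random

def train_ensemble(gamma, delta):
    # Placeholder for the MOD / MOD-Adv training loop sketched above.
    return {"gamma": gamma, "delta": delta}

def validation_nll(ensemble):
    # Placeholder: in practice, average Gaussian NLL on the validation set.
    return random.random()

GAMMA_GRID = [0.01, 0.1, 1, 5, 10, 20, 50]  # gamma values from the paper
DELTA_GRID_UCI = [0.2, 1.0, 3.0, 5.0]       # MOD-Adv delta values for UCI

best = None
for gamma in GAMMA_GRID:
    for delta in DELTA_GRID_UCI:
        nll = validation_nll(train_ensemble(gamma, delta))
        if best is None or nll < best[0]:
            best = (nll, gamma, delta)
print("selected (nll, gamma, delta):", best)
```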