Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
On Power Laws in Deep Ensembles
Authors: Ekaterina Lobacheva, Nadezhda Chirkova, Maxim Kodryan, Dmitry P. Vetrov
NeurIPS 2020 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct our experiments with convolutional neural networks, Wide ResNet [33] and VGG16 [27], on CIFAR-10 [16] and CIFAR-100 [17] datasets. [...] The empirical results presented in sections 4, 5, 6, 7 were supported by the Russian Science Foundation grant 19-71-30020. |
| Researcher Affiliation | Collaboration | (1) Samsung-HSE Laboratory, National Research University Higher School of Economics; (2) Samsung AI Center Moscow; Moscow, Russia |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our source code is available at https://github.com/nadiinchi/power_laws_deep_ensembles. |
| Open Datasets | Yes | We conduct our experiments with convolutional neural networks, Wide ResNet [33] and VGG16 [27], on CIFAR-10 [16] and CIFAR-100 [17] datasets. |
| Dataset Splits | Yes | Following [2], we use the test-time cross-validation to compute the CNLL. [...] For each network size, we tune hyperparameters (weight decay and dropout) using grid search. |
| Hardware Specification | No | The paper mentions support from 'HPC facilities at NRU HSE' but does not provide specific details on CPU, GPU models, or other hardware specifications used for experiments. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | For each network size, we tune hyperparameters (weight decay and dropout) using grid search. We train all networks for 200 epochs with SGD with an annealing learning schedule and a batch size of 128. |