Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Deep Gamblers: Learning to Abstain with Portfolio Theory
Authors: Ziyin Liu, Zhikang Wang, Paul Pu Liang, Russ R. Salakhutdinov, Louis-Philippe Morency, Masahito Ueda
NeurIPS 2019 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments show that our method can identify uncertainty in data points, and achieves strong results on SVHN and CIFAR10 at various coverages of the data. |
| Researcher Affiliation | Academia | Institute for Physics of Intelligence & Department of Physics, University of Tokyo; Machine Learning Department, Carnegie Mellon University; Language Technologies Institute, Carnegie Mellon University |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | No | The paper does not provide any statement about releasing its source code or a link to a code repository. |
| Open Datasets | Yes | Experiments show that our method can identify uncertainty in data points, and achieves strong results on SVHN and CIFAR10 at various coverages of the data. ... SVHN [31] (Table 3), CIFAR10 [23] (Table 4) and Cat vs. Dog (Table 5). |
| Dataset Splits | Yes | The best models of ours for a given coverage are chosen using a validation set, which is separated from the test set by a fixed random seed, and the best single model is chosen by using the model that achieves overall best validation accuracy. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments, such as GPU models, CPU types, or cloud instance specifications. |
| Software Dependencies | No | The paper mentions using a 'version of VGG16' but does not specify any software dependencies with version numbers (e.g., programming languages, libraries, frameworks). |
| Experiment Setup | Yes | We use a version of VGG16 that is especially optimized for small datasets [27] with batchnorm and dropout. ... A grid search is done over hyperparameter o with a step size of 0.2. ... we train a network with 2 hidden layers each with 50 neurons and tanh activation. ... The model is a simple 4-layer CNN. |
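
The Dataset Splits row quotes a validation set that is separated from the test set by a fixed random seed, with the best model chosen by validation accuracy. The sketch below illustrates that general pattern only. The function names `split_validation_test` and `select_best`, the 50/50 split fraction, and the seed value are assumptions made for illustration; none of them are taken from the paper.

```python
import random


def split_validation_test(indices, val_fraction=0.5, seed=0):
    """Split held-out indices into validation and test parts.

    A fixed seed makes the split identical across runs.
    The fraction and seed are illustrative assumptions, not the paper's values.
    """
    rng = random.Random(seed)      # local RNG, so global random state is untouched
    shuffled = list(indices)
    rng.shuffle(shuffled)
    n_val = int(len(shuffled) * val_fraction)
    return shuffled[:n_val], shuffled[n_val:]


def select_best(val_accuracy_by_model):
    """Return the key of the candidate with the highest validation accuracy."""
    return max(val_accuracy_by_model, key=val_accuracy_by_model.get)


if __name__ == "__main__":
    val_idx, test_idx = split_validation_test(range(10), val_fraction=0.5, seed=0)
    print("validation:", sorted(val_idx))
    print("test:", sorted(test_idx))
    # Hypothetical validation accuracies for two candidate models.
    print("best:", select_best({"model_a": 0.90, "model_b": 0.95}))
```

Because the split depends only on the seed and the input order, rerunning the script reproduces the same validation and test partitions, which is what makes model selection on the validation part repeatable.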