Towards Reliable Learning for High Stakes Applications
Authors: Jinyang Gao, Junjie Yao, Yingxia Shao (pp. 3614-3621)
AAAI 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically evaluate GALVE on the CIFAR10 and SVHN datasets. The results demonstrate that the errors occurring among the samples rated most reliable by GALVE are only 40-45% of the errors occurring among the samples rated most reliable by confidence on these computer vision tasks. |
| Researcher Affiliation | Collaboration | Jinyang Gao (Alibaba Group), Junjie Yao (ECNU), Yingxia Shao (BUPT); jinyang.gjy@alibaba-inc.com, junjie.yao@sei.ecnu.edu.cn, shaoyx@bupt.edu.cn |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks, nor are there any clearly labeled algorithm sections. |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described, either through a specific repository link, an explicit code release statement, or code in supplementary materials. |
| Open Datasets | Yes | CIFAR10: the CIFAR10 dataset consists of 32×32 images drawn from 10 classes; the training and testing sets contain 50,000 and 10,000 images respectively. SVHN: the Street View House Numbers (SVHN) dataset contains 32×32 images from Google Street View. (A data-loading sketch follows the table.) |
| Dataset Splits | No | The paper specifies training and testing set sizes (e.g., 'The training and testing sets contain 50,000 and 10,000 images respectively' for CIFAR10), but does not mention a validation split, so the full data partitioning cannot be reproduced. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper mentions software components like WGAN-GP and WGAN, but does not provide specific ancillary software details, such as library or solver names with version numbers, needed to replicate the experiment. |
| Experiment Setup | Yes | We use a network architecture with 56 layers (Conv + 18×3-layer bottleneck learning blocks + Softmax), where the basic width of the main path is 64 channels and the width of the bottleneck is 16 channels. The model is trained using standard mini-batch SGD with a batch size of 128 and a momentum of 0.9. (An architecture sketch follows the table.) |
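
The reported dataset setup is straightforward to reproduce with standard tooling. Below is a minimal data-loading sketch using `torchvision`, which is an assumption on our part: the paper names no framework, and the `data` root path and transform are illustrative only.

```python
# Minimal loading sketch for the CIFAR10 / SVHN setup quoted above.
# torchvision is an assumed dependency; the paper does not state its tooling.
import torch
from torchvision import datasets, transforms

to_tensor = transforms.ToTensor()

# CIFAR10: 32x32 images, 10 classes, 50,000 train / 10,000 test.
cifar_train = datasets.CIFAR10("data", train=True, download=True, transform=to_tensor)
cifar_test = datasets.CIFAR10("data", train=False, download=True, transform=to_tensor)

# SVHN: 32x32 digit images from Google Street View.
svhn_train = datasets.SVHN("data", split="train", download=True, transform=to_tensor)
svhn_test = datasets.SVHN("data", split="test", download=True, transform=to_tensor)

# A mini-batch size of 128 matches the training setup quoted in the table.
train_loader = torch.utils.data.DataLoader(cifar_train, batch_size=128, shuffle=True)
```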
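
The quoted experiment setup also pins down the architecture closely enough to sketch. The following is a minimal PyTorch interpretation assuming ResNet-style conventions: the residual connections, BatchNorm placement, and learning rate are assumptions, while the 56-layer structure, the widths (64 main path / 16 bottleneck), the batch size, and the momentum come from the quoted text.

```python
# Sketch of the described 56-layer network:
# 1 stem conv + 18 bottleneck blocks x 3 convs (54) + 1 linear head = 56 layers.
import torch
import torch.nn as nn

class Bottleneck(nn.Module):
    """3-layer bottleneck: 1x1 reduce (64->16), 3x3 (16), 1x1 expand (16->64)."""
    def __init__(self, width=64, bottleneck=16):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(width, bottleneck, 1, bias=False),
            nn.BatchNorm2d(bottleneck), nn.ReLU(inplace=True),
            nn.Conv2d(bottleneck, bottleneck, 3, padding=1, bias=False),
            nn.BatchNorm2d(bottleneck), nn.ReLU(inplace=True),
            nn.Conv2d(bottleneck, width, 1, bias=False),
            nn.BatchNorm2d(width),
        )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        # Residual connection is an assumption; the paper only lists layer counts.
        return self.relu(x + self.body(x))

class Net56(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.stem = nn.Sequential(
            nn.Conv2d(3, 64, 3, padding=1, bias=False),
            nn.BatchNorm2d(64), nn.ReLU(inplace=True),
        )
        self.blocks = nn.Sequential(*[Bottleneck() for _ in range(18)])
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, num_classes),
        )

    def forward(self, x):
        # Returns logits; the Softmax from the paper's description lives in the loss.
        return self.head(self.blocks(self.stem(x)))

model = Net56()
# Batch size 128 and momentum 0.9 are from the paper; lr=0.1 is assumed.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
criterion = nn.CrossEntropyLoss()  # applies log-softmax internally
```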