Towards Reliable Learning for High Stakes Applications
Authors: Jinyang Gao, Junjie Yao, Yingxia Shao (pp. 3614-3621)
AAAI 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically evaluate GALVE on the CIFAR10 and SVHN datasets. The results demonstrate that the errors occurring among the samples rated most reliable by GALVE are only 40-45% of the errors occurring among the samples rated most reliable by confidence on these computer vision tasks. |
| Researcher Affiliation | Collaboration | Jinyang Gao (Alibaba Group), Junjie Yao (ECNU), Yingxia Shao (BUPT); jinyang.gjy@alibaba-inc.com, junjie.yao@sei.ecnu.edu.cn, shaoyx@bupt.edu.cn |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks, nor are there any clearly labeled algorithm sections. |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described, either through a specific repository link, an explicit code release statement, or code in supplementary materials. |
| Open Datasets | Yes | CIFAR10: the CIFAR10 dataset consists of 32×32 images drawn from 10 classes; the training and testing sets contain 50,000 and 10,000 images respectively. SVHN: the Street View House Numbers (SVHN) dataset contains 32×32 images from Google Street View. (A data-loading sketch follows the table.) |
| Dataset Splits | No | The paper specifies training and testing set sizes (e.g., 'The training and testing sets contain 50,000 and 10,000 images respectively' for CIFAR10), but does not mention a validation split, so the full data partitioning cannot be reproduced. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper mentions software components like WGAN-GP and WGAN, but does not provide specific ancillary software details, such as library or solver names with version numbers, needed to replicate the experiment. |
| Experiment Setup | Yes | We use a network architecture with 56 layers (Conv + 18×3-layer bottleneck learning blocks + Softmax), where the basic width of the main path is 64 channels and the width of the bottleneck is 16 channels. The model is trained using standard mini-batch SGD with a batch size of 128 and a momentum of 0.9. (An architecture sketch follows the table.) |
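
The reported dataset setup is straightforward to reproduce with standard tooling. Below is a minimal data-loading sketch using `torchvision`, which is an assumption on our part: the paper names no framework, and the `data` root path and transform are illustrative only.

```python
# Minimal loading sketch for the CIFAR10 / SVHN setup quoted above.
# torchvision is an assumed dependency; the paper does not state its tooling.
import torch
from torchvision import datasets, transforms

to_tensor = transforms.ToTensor()

# CIFAR10: 32x32 images, 10 classes, 50,000 train / 10,000 test.
cifar_train = datasets.CIFAR10("data", train=True, download=True, transform=to_tensor)
cifar_test = datasets.CIFAR10("data", train=False, download=True, transform=to_tensor)

# SVHN: 32x32 digit images from Google Street View.
svhn_train = datasets.SVHN("data", split="train", download=True, transform=to_tensor)
svhn_test = datasets.SVHN("data", split="test", download=True, transform=to_tensor)

# A mini-batch size of 128 matches the training setup quoted in the table.
train_loader = torch.utils.data.DataLoader(cifar_train, batch_size=128, shuffle=True)
```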
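
The quoted experiment setup also pins down the architecture closely enough to sketch. The following is a minimal PyTorch interpretation assuming ResNet-style conventions: the residual connections, BatchNorm placement, and learning rate are assumptions, while the 56-layer structure, the widths (64 main path / 16 bottleneck), the batch size, and the momentum come from the quoted text.

```python
# Sketch of the described 56-layer network:
# 1 stem conv + 18 bottleneck blocks x 3 convs (54) + 1 linear head = 56 layers.
import torch
import torch.nn as nn

class Bottleneck(nn.Module):
    """3-layer bottleneck: 1x1 reduce (64->16), 3x3 (16), 1x1 expand (16->64)."""
    def __init__(self, width=64, bottleneck=16):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(width, bottleneck, 1, bias=False),
            nn.BatchNorm2d(bottleneck), nn.ReLU(inplace=True),
            nn.Conv2d(bottleneck, bottleneck, 3, padding=1, bias=False),
            nn.BatchNorm2d(bottleneck), nn.ReLU(inplace=True),
            nn.Conv2d(bottleneck, width, 1, bias=False),
            nn.BatchNorm2d(width),
        )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        # Residual connection is an assumption; the paper only lists layer counts.
        return self.relu(x + self.body(x))

class Net56(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.stem = nn.Sequential(
            nn.Conv2d(3, 64, 3, padding=1, bias=False),
            nn.BatchNorm2d(64), nn.ReLU(inplace=True),
        )
        self.blocks = nn.Sequential(*[Bottleneck() for _ in range(18)])
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, num_classes),
        )

    def forward(self, x):
        # Returns logits; the Softmax from the paper's description lives in the loss.
        return self.head(self.blocks(self.stem(x)))

model = Net56()
# Batch size 128 and momentum 0.9 are from the paper; lr=0.1 is assumed.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
criterion = nn.CrossEntropyLoss()  # applies log-softmax internally
```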