SelectiveNet: A Deep Neural Network with an Integrated Reject Option
Authors: Yonatan Geifman, Ran El-Yaniv
ICML 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In our experiments, we show a consistently improved risk coverage trade-off over several well-known classification and regression datasets, thus reaching new state-of-the-art results for deep selective classification. |
| Researcher Affiliation | Academia | Yonatan Geifman and Ran El-Yaniv, Technion – Israel Institute of Technology. Correspondence to: Yonatan Geifman <yonatan.g@cs.technion.ac.il>. |
| Pseudocode | No | The paper describes the architecture and optimization procedure in text and figures, but does not include a formal pseudocode block or algorithm. |
| Open Source Code | Yes | Our complete code can be downloaded from the following link: https://github.com/geifmany/SelectiveNet. |
| Open Datasets | Yes | Street View House Numbers (SVHN). The SVHN dataset (Netzer et al., 2011) is an image classification dataset... CIFAR-10. The CIFAR-10 dataset (Krizhevsky & Hinton, 2009) is an image classification dataset... Cats vs. Dogs. Cats vs. Dogs is an image classification dataset extracted from the ASIRRA dataset... Concrete Compressive Strength. The Concrete Compressive Strength dataset is a regression dataset from the UCI repository (Dheeru & Karra Taniskidou, 2017). |
| Dataset Splits | No | Train/test splits are given: Cats vs. Dogs was randomly split into 20,000 training images and 5,000 test images; SVHN contains 73,257 training and 26,032 test images; CIFAR-10 comprises 50,000 training and 10,000 test images. However, although a validation set is mentioned for post-training calibration and validation accuracy is reported, the train/validation/test split percentages or absolute sample counts used for hyperparameter tuning are not provided consistently across all datasets. |
| Hardware Specification | No | The paper describes the model architecture, training parameters, and datasets, but does not provide any specific hardware details such as GPU models, CPU types, or memory used for running the experiments. |
| Software Dependencies | No | The paper mentions training algorithms and techniques like SGD, ADAM, batch normalization, and dropout, but does not provide specific software dependencies or library version numbers (e.g., TensorFlow, PyTorch, scikit-learn, with their corresponding versions) required to reproduce the experiments. |
| Experiment Setup | Yes | For the convolutional neural network (CNN) experiments, we used the well-known VGG-16 architecture... The network was optimized using stochastic gradient descent (SGD) with a momentum of 0.9, an initial learning rate of 0.1, and a weight decay of 5×10−4. The learning rate was halved every 25 epochs, and the network was trained for 300 epochs... The value of α (the convex combination between the selective loss and the auxiliary loss) was set to 0.5 for all experiments, and λ was set to 32... The regression model was optimized using the ADAM algorithm (Kingma & Ba, 2014) with a learning rate of 5×10−4, a mini-batch size of 256, and 800 epochs. We used squared loss with a weight decay of 1×10−4 during optimization. (A minimal sketch of this loss and optimizer setup follows the table.) |
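For concreteness, here is a minimal PyTorch sketch of the objective and optimizer settings quoted in the Experiment Setup row. The selective-risk plus coverage-penalty form of the loss and the hyperparameter values (α = 0.5, λ = 32, SGD with momentum 0.9, weight decay 5×10−4, learning rate halved every 25 epochs) follow the paper; the `ToySelectiveNet` module, the head names `f`/`g`/`h`, and all tensor shapes are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def selectivenet_loss(f_logits, g_select, h_logits, targets,
                      coverage=0.8, lam=32.0, alpha=0.5):
    """SelectiveNet objective: alpha * (selective risk + coverage penalty)
    + (1 - alpha) * auxiliary cross-entropy (Geifman & El-Yaniv, 2019)."""
    ce = F.cross_entropy(f_logits, targets, reduction='none')  # per-example loss
    phi = g_select.mean()                                      # empirical coverage
    selective_risk = (ce * g_select).mean() / phi              # coverage-normalized risk
    penalty = lam * torch.clamp(coverage - phi, min=0.0) ** 2  # quadratic penalty below target c
    aux = F.cross_entropy(h_logits, targets)                   # auxiliary head, all examples
    return alpha * (selective_risk + penalty) + (1.0 - alpha) * aux

class ToySelectiveNet(nn.Module):
    """Tiny three-headed stand-in for the paper's VGG-16 backbone (hypothetical)."""
    def __init__(self, dim=32, classes=10):
        super().__init__()
        self.body = nn.Linear(dim, 64)
        self.f = nn.Linear(64, classes)  # prediction head
        self.g = nn.Linear(64, 1)        # selection head, output in [0, 1]
        self.h = nn.Linear(64, classes)  # auxiliary head

    def forward(self, x):
        z = F.relu(self.body(x))
        return self.f(z), torch.sigmoid(self.g(z)).squeeze(-1), self.h(z)

model = ToySelectiveNet()
# Optimizer settings quoted in the table: SGD, momentum 0.9, lr 0.1, weight decay 5e-4.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                            momentum=0.9, weight_decay=5e-4)
# Halve the learning rate every 25 epochs (call scheduler.step() once per epoch).
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=25, gamma=0.5)

x, y = torch.randn(256, 32), torch.randint(0, 10, (256,))
f_logits, g_select, h_logits = model(x)
loss = selectivenet_loss(f_logits, g_select, h_logits, y, lam=32.0, alpha=0.5)
loss.backward()
optimizer.step()
```

Note the normalization of the selective risk by the empirical coverage φ̂: without it, the network could trivially shrink the loss by rejecting everything, which the coverage penalty and the normalization jointly prevent.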