Confident Multiple Choice Learning
Authors: Kimin Lee, Changho Hwang, KyoungSoo Park, Jinwoo Shin
ICML 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the effect of CMCL via experiments on image classification on CIFAR and SVHN, and foreground-background segmentation on iCoseg. In particular, CMCL using 5 residual networks provides 14.05% and 6.60% relative reductions in the top-1 error rates from the corresponding IE scheme for the classification task on CIFAR and SVHN, respectively. |
| Researcher Affiliation | Academia | School of Electrical Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, Republic of Korea. Correspondence to: Jinwoo Shin <jinwoos@kaist.ac.kr>. |
| Pseudocode | Yes | Algorithm 1 Confident MCL (CMCL); a hedged sketch of the confident oracle loss it minimizes follows this table. |
| Open Source Code | Yes | Our code is available at https://github.com/chhwang/cmcl. |
| Open Datasets | Yes | We evaluate our algorithm for both classification and foreground-background segmentation tasks using CIFAR-10 (Krizhevsky & Hinton, 2009), SVHN (Netzer et al., 2011) and iCoseg (Batra et al., 2010) datasets. |
| Dataset Splits | Yes | The CIFAR-10 dataset consists of 50,000 training and 10,000 test images with 10 image classes... The SVHN dataset consists of 73,257 training and 26,032 test images... For each class, we randomly split 80% and 20% of the data into training and test sets, respectively. |
| Hardware Specification | No | The paper does not specify the exact hardware (e.g., GPU models, CPU types, memory) used for running the experiments. It only mentions training various CNNs. |
| Software Dependencies | No | The paper mentions using specific models (VGGNet, GoogLeNet, ResNet, FCNs) and optimization methods (stochastic gradient descent with Nesterov momentum), but does not provide specific version numbers for any software libraries, frameworks, or programming languages used (e.g., TensorFlow 2.x, PyTorch 1.x, Python 3.x). |
| Experiment Setup | Yes | For all models, we choose the best hyperparameters for confident oracle loss among the penalty parameter β ∈ {0.5, 0.75, 1, 1.25, 1.5} and the overlapping parameter K ∈ {2, 3, 4}. We use the softmax classifier, and train each model by minimizing the cross-entropy loss using the stochastic gradient descent method with Nesterov momentum. We share the nonlinear activated features right before the first pooling layer, i.e., the 6th, 2nd, and 1st ReLU activations for ResNet with 20 layers, VGGNet with 17 layers, and GoogLeNet with 18 layers, respectively. A sketch of this setup appears after the table. |
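The pseudocode row refers to Algorithm 1, which minimizes the paper's confident oracle loss: for each training example, only the K lowest-loss ensemble members keep the usual cross-entropy, while every other member is pushed toward the uniform distribution by a KL penalty weighted by β. Below is a minimal PyTorch sketch of that loss, assuming the description above; the function name `confident_oracle_loss` and its defaults are ours, not the authors' (their official TensorFlow implementation is at https://github.com/chhwang/cmcl).

```python
import math
import torch
import torch.nn.functional as F

def confident_oracle_loss(logits_list, targets, beta=0.75, k_overlap=2):
    """Hedged sketch of the CMCL confident oracle loss.

    logits_list: list of (batch, num_classes) logits, one per ensemble member.
    targets:     (batch,) integer class labels.
    For each example, the k_overlap lowest-loss models keep the task loss;
    every other model pays beta * KL(U || p_m), which pushes its prediction
    toward the uniform distribution U.
    """
    num_classes = logits_list[0].size(1)
    # Per-model, per-example cross-entropy: shape (num_models, batch).
    ce = torch.stack([F.cross_entropy(lg, targets, reduction="none")
                      for lg in logits_list])
    # v_im = 1 for the k_overlap best (lowest-loss) models of each example.
    _, best = ce.topk(k_overlap, dim=0, largest=False)
    assigned = torch.zeros_like(ce).scatter_(0, best, 1.0)
    # KL(U || p_m) = -log(C) - mean_c log p_m(c), from the log-softmax.
    log_probs = torch.stack([F.log_softmax(lg, dim=1) for lg in logits_list])
    kl_uniform = -log_probs.mean(dim=2) - math.log(num_classes)
    per_model = assigned * ce + beta * (1.0 - assigned) * kl_uniform
    return per_model.sum(dim=0).mean()
```

Setting beta to 0 recovers the plain oracle (MCL) objective, whose non-specialized members become overconfident on examples they were never assigned; the KL term is what keeps their predictions near-uniform so the ensemble's averaged prediction stays reliable.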
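The experiment-setup row quotes a grid search over β and K and training via SGD with Nesterov momentum. The sketch below wires those pieces together around the loss above; only the β/K grids and the Nesterov-momentum choice come from the paper, while `build_ensemble`, `evaluate`, `train_loader`, and the learning-rate/momentum values are hypothetical placeholders.

```python
import itertools
import torch

def grid_search(build_ensemble, train_loader, evaluate, epochs=1):
    """Sweep beta and K, returning the best (error, beta, K) triple."""
    best = None
    for beta, k in itertools.product([0.5, 0.75, 1.0, 1.25, 1.5], [2, 3, 4]):
        models = build_ensemble()  # hypothetical: returns a list of nn.Modules
        params = [p for m in models for p in m.parameters()]
        # "SGD with Nesterov momentum" per the paper; lr/momentum are guesses.
        opt = torch.optim.SGD(params, lr=0.1, momentum=0.9, nesterov=True)
        for _ in range(epochs):
            for x, y in train_loader:
                opt.zero_grad()
                loss = confident_oracle_loss([m(x) for m in models], y,
                                             beta=beta, k_overlap=k)
                loss.backward()
                opt.step()
        err = evaluate(models)  # hypothetical: ensemble top-1 test error
        if best is None or err < best[0]:
            best = (err, beta, k)
    return best
```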