Ensemble Soft-Margin Softmax Loss for Image Classification
Authors: Xiaobo Wang, Shifeng Zhang, Zhen Lei, Si Liu, Xiaojie Guo, Stan Z. Li
IJCAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on benchmark datasets are conducted to show the superiority of our design over the baseline softmax loss and several state-of-the-art alternatives. |
| Researcher Affiliation | Academia | 1 CBSR&NLPR, Institute of Automation, Chinese Academy of Sciences, Beijing, China; 2 University of Chinese Academy of Sciences, Beijing, China; 3 School of Computer Science and Engineering, Beihang University, Beijing, China; 4 School of Computer Software, Tianjin University, Tianjin, China; 5 Faculty of Information Technology, Macau University of Science and Technology, Macau, China. Emails: {xiaobo.wang,shifeng.zhang,zlei,szli}@nlpr.ia.ac.cn, fifthzombiesi@gmail.com, xj.max.guo@gmail.com |
| Pseudocode | No | The paper describes the method mathematically but does not include any pseudocode or algorithm blocks (a hedged code sketch of the loss follows the table). |
| Open Source Code | No | The paper does not provide a direct link or explicit statement about the public availability of their specific implementation code for the EM-Softmax loss. |
| Open Datasets | Yes | MNIST [LeCun et al., 1998b]: The MNIST is a dataset of handwritten digits (from 0 to 9)... CIFAR10/CIFAR10+ [Krizhevsky and Hinton, 2009]: The CIFAR10 contains 10 classes... CIFAR100/CIFAR100+ [Krizhevsky and Hinton, 2009]: We also evaluate the performance of the proposed EM-Softmax loss on the CIFAR100 dataset. ImageNet32 [Chrabaszcz et al., 2017]: The ImageNet32 is a downsampled version of the ImageNet 2012 challenge dataset... |
| Dataset Splits | Yes | MNIST...There are 60,000 training images and 10,000 test images. CIFAR10...each with 5,000 training samples and 1,000 test samples. CIFAR100...There are 500 training images and 100 testing images per class. ImageNet32...1,281,167 training images and 50,000 validation images for 1,000 classes. |
| Hardware Specification | Yes | the training time on 2 Titan X GPUs is about 1.01h... The testing time on CPU (Intel Xeon E5-2660v0 @ 2.20GHz) is about 3.1m... |
| Software Dependencies | No | The paper states 'we implement the CNNs using the well-known Caffe [Jia et al., 2014] library', but does not give a Caffe version number. |
| Experiment Setup | Yes | We start with a learning rate of 0.1, use a weight decay of 0.0005 and momentum of 0.9. For MNIST, the learning rate is divided by 10 at 8k and 14k iterations. For CIFAR10/CIFAR10+, the learning rate is also divided by 10 at 8k and 14k iterations. For CIFAR100/CIFAR100+, the learning rate is divided by 10 at 12k and 15k iterations. For all these three datasets, the training eventually terminates at 20k iterations. For ImageNet32, the learning rate is divided by 10 at 15k, 25k and 35k iterations, and the maximal iteration is 40k. (This schedule is sketched in code below the table.) |
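
Since the paper ships no pseudocode, the following is a minimal PyTorch sketch of the two ingredients it describes mathematically: a soft-margin softmax that subtracts a margin m from the ground-truth logit before cross-entropy, and an ensemble of K such classifiers over a shared feature. The class names, the branch count `k`, the margin value, and the plain averaging of branches are illustrative assumptions, not taken from the paper; the paper's exact ensembling strategy (including any diversity regularization) is not reproduced here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SoftMarginSoftmaxLoss(nn.Module):
    """Soft-margin softmax: subtract a margin m from the ground-truth
    class logit, then apply standard cross-entropy (sketch)."""

    def __init__(self, margin: float = 0.5):  # margin value is illustrative
        super().__init__()
        self.margin = margin

    def forward(self, logits: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
        # One-hot mask so the margin is subtracted only at the target class.
        one_hot = F.one_hot(target, num_classes=logits.size(1)).to(logits.dtype)
        return F.cross_entropy(logits - self.margin * one_hot, target)


class EnsembleSoftMarginHead(nn.Module):
    """K parallel linear classifiers on a shared feature; training averages
    the per-branch soft-margin losses, inference averages the per-branch
    softmax probabilities. Branch count k is a hypothetical choice."""

    def __init__(self, feat_dim: int, num_classes: int, k: int = 4, margin: float = 0.5):
        super().__init__()
        self.branches = nn.ModuleList(nn.Linear(feat_dim, num_classes) for _ in range(k))
        self.criterion = SoftMarginSoftmaxLoss(margin)

    def forward(self, feat: torch.Tensor, target: torch.Tensor = None):
        logits = [branch(feat) for branch in self.branches]
        probs = torch.stack([F.softmax(lg, dim=1) for lg in logits]).mean(dim=0)
        if target is None:
            return probs
        loss = torch.stack([self.criterion(lg, target) for lg in logits]).mean()
        return probs, loss


# Usage on a dummy batch:
head = EnsembleSoftMarginHead(feat_dim=512, num_classes=10)
features = torch.randn(8, 512)
labels = torch.randint(0, 10, (8,))
probs, loss = head(features, labels)
```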
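
The reported optimizer settings and step schedule are directly scriptable. This is only an illustration: the authors used Caffe, while the sketch below expresses the same values with PyTorch's `SGD` and `MultiStepLR`; the model and loss are stand-ins for the paper's CNN and EM-Softmax loss.

```python
import torch
from torch import nn
from torch.optim import SGD
from torch.optim.lr_scheduler import MultiStepLR

model = nn.Linear(512, 10)  # stand-in for the actual CNN backbone
optimizer = SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=0.0005)

# MNIST and CIFAR10(+): lr divided by 10 at 8k and 14k iterations, stop at 20k.
# CIFAR100(+): milestones [12_000, 15_000], same 20k cap.
# ImageNet32: milestones [15_000, 25_000, 35_000], 40k cap.
scheduler = MultiStepLR(optimizer, milestones=[8_000, 14_000], gamma=0.1)

for iteration in range(20_000):
    optimizer.zero_grad()
    x = torch.randn(256, 512)                # dummy batch; real runs use the dataset
    loss = model(x).logsumexp(dim=1).mean()  # dummy loss for illustration
    loss.backward()
    optimizer.step()
    scheduler.step()  # schedules are iteration-based, as in the paper
```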