Improving CNN Performance with Min-Max Objective

Authors: Weiwei Shi, Yihong Gong, Jinjun Wang

IJCAI 2016

Each entry below gives a reproducibility variable, its result, and the supporting LLM response:
Research Type: Experimental. "Experiments with shallow and deep models on four benchmark datasets including CIFAR-10, CIFAR-100, SVHN and MNIST demonstrate that CNN models trained with the Min-Max objective achieve remarkable performance improvements compared to the corresponding baseline models."
Researcher Affiliation: Academia. "Weiwei Shi, Yihong Gong, Jinjun Wang. Institute of Artificial Intelligence and Robotics, Xi'an Jiaotong University, Xi'an 710049, China"
Pseudocode: No. The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code: No. "All the models are implemented using the Caffe platform [Jia et al., 2014] from scratch without pre-training." The paper does not provide concrete access to its own source code.
Open Datasets: Yes. "We conduct performance evaluations using four benchmark datasets, i.e. CIFAR-10, CIFAR-100, MNIST and SVHN."
Dataset Splits: Yes. "The Street View House Numbers (SVHN) dataset [Netzer et al., 2011] consists of 630,420 color images of 32x32 pixels in size, which are divided into the training set, testing set and an extra set with 73,257, 26,032 and 531,131 images, respectively. [...] 400 samples per class selected from the training set and 200 samples per class from the extra set were used for validation, while the remaining 598,388 images of the training and the extra sets were used for training. The validation set was only used for tuning hyper-parameters and was not used for training the model."
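The SVHN split sizes quoted above are internally consistent, as a quick arithmetic check shows (a minimal sketch; the 10-class count and per-class validation sizes are taken from the quoted text):

```python
# Sanity-check the SVHN split sizes quoted from the paper.
n_train_orig = 73_257    # original SVHN training set
n_test = 26_032          # testing set
n_extra = 531_131        # extra set
n_classes = 10           # SVHN has 10 digit classes

# Validation: 400 samples/class from the training set
# plus 200 samples/class from the extra set.
n_val = n_classes * (400 + 200)              # 6,000 images

# Remaining training + extra images used for actual training.
n_train_used = n_train_orig + n_extra - n_val

print(n_train_used)                          # 598388, matching the quoted figure
print(n_train_orig + n_test + n_extra)       # 630420, the quoted dataset total
```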
Hardware Specification: No. The paper does not provide specific hardware details used for running its experiments.
Software Dependencies: No. "All the models are implemented using the Caffe platform [Jia et al., 2014] from scratch without pre-training." No specific version numbers for software dependencies are provided.
Experiment Setup: Yes. "For simplicity, we set k1 = 5, k2 = 10 for all the experiments, and it is possible that better results can be obtained by tuning k1 and k2. σ2 is empirically selected from {0.1, 0.5}, and λ ∈ [10^-6, 10^-9]."
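The quoted hyper-parameter settings can be collected into a small search space (a hedged sketch; only the values k1 = 5, k2 = 10, σ² ∈ {0.1, 0.5}, and λ between 10⁻⁹ and 10⁻⁶ come from the paper, while the dictionary layout and the log-spaced λ grid are illustrative assumptions):

```python
import itertools

# Fixed for all experiments, per the paper.
k1, k2 = 5, 10

# sigma^2 is empirically selected from {0.1, 0.5}; lambda lies between
# 1e-9 and 1e-6 (sampling it on a log-spaced grid is an assumption here).
sigma2_choices = [0.1, 0.5]
lambda_choices = [10.0 ** -e for e in (9, 8, 7, 6)]  # 1e-9 ... 1e-6

search_space = [
    {"k1": k1, "k2": k2, "sigma2": s2, "lambda": lam}
    for s2, lam in itertools.product(sigma2_choices, lambda_choices)
]
print(len(search_space))  # 8 candidate hyper-parameter settings
```

Per the quoted split description, such candidates would be compared on the held-out validation set only, never used for training.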