Training Group Orthogonal Neural Networks with Privileged Information

Authors: Yunpeng Chen, Xiaojie Jin, Jiashi Feng, Shuicheng Yan

IJCAI 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on two benchmark datasets, ImageNet and PASCAL VOC, clearly demonstrate the strong generalization ability of our proposed GoCNN model.
Researcher Affiliation | Collaboration | National University of Singapore; Qihoo 360 AI Institute
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any explicit statement or link to open-source code for the described methodology.
Open Datasets | Yes | We evaluate the performance of GoCNN in image classification on two benchmark datasets, i.e., the ImageNet [Deng et al., 2009] dataset and the PASCAL VOC 2012 dataset [Everingham et al., 2010].
Dataset Splits | Yes | We use the original validation set of ImageNet for evaluation. For the classification task, there are 5,717 images for training and 5,823 images for validation.
Hardware Specification | No | The paper mentions 'single node' and '48 GPUs' but does not specify the exact models or types of GPUs, CPUs, or other hardware components used.
Software Dependencies | No | The paper states 'We use MXNet [Chen et al., 2015] to conduct model training and testing' but does not give a version number for MXNet or any other software dependency.
Experiment Setup | Yes | Images are resized with the shorter side randomly sampled within [256, 480] for scale augmentation, and 224×224 crops are randomly sampled during training [He et al., 2015]. We use SGD with a base learning rate of 0.1 and reduce the learning rate by a factor of 10 when the validation accuracy saturates. For the experiments on ResNet-18 we use a single node with a mini-batch size of 512. For ResNet-152 we use 48 GPUs with a mini-batch size of 32 per GPU. Following [He et al., 2015], we use a weight decay of 0.0001 and a momentum of 0.9 during training.
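The experiment setup above can be sketched in code. The paper trained with MXNet; the pure-Python sketch below only illustrates the described schedule logic, not the authors' implementation. In particular, the paper says the learning rate is divided by 10 "when the validation accuracy saturates" without defining saturation, so the `patience` threshold here is an assumption, and `sample_shorter_side` is a hypothetical helper for the scale-augmentation range.

```python
import random

# Hyperparameters stated in the paper.
BASE_LR = 0.1
LR_DECAY_FACTOR = 0.1   # "reduce the learning rate by a factor of 10"
WEIGHT_DECAY = 1e-4
MOMENTUM = 0.9

def sample_shorter_side(rng=random):
    """Scale augmentation: pick the resized shorter side uniformly in
    [256, 480]; a 224x224 crop is then taken from the resized image."""
    return rng.randint(256, 480)  # randint bounds are inclusive

class ReduceOnSaturation:
    """Divide the LR by 10 when validation accuracy stops improving.

    `patience` (epochs without improvement before decaying) is NOT
    specified in the paper; 3 is an illustrative default.
    """

    def __init__(self, base_lr=BASE_LR, factor=LR_DECAY_FACTOR, patience=3):
        self.lr = base_lr
        self.factor = factor
        self.patience = patience
        self.best = float("-inf")
        self.bad_epochs = 0

    def step(self, val_acc):
        if val_acc > self.best:
            self.best = val_acc
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
            if self.bad_epochs >= self.patience:
                self.lr *= self.factor
                self.bad_epochs = 0
        return self.lr
```

For reference, the effective global batch sizes implied by the table are 512 for ResNet-18 (single node) and 48 × 32 = 1536 for ResNet-152.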