Training Group Orthogonal Neural Networks with Privileged Information
Authors: Yunpeng Chen, Xiaojie Jin, Jiashi Feng, Shuicheng Yan
IJCAI 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on two benchmark datasets ImageNet and PASCAL VOC clearly demonstrate the strong generalization ability of our proposed GoCNN model. |
| Researcher Affiliation | Collaboration | National University of Singapore; Qihoo 360 AI Institute |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any explicit statement or link for open-source code for the described methodology. |
| Open Datasets | Yes | We evaluate the performance of GoCNN in image classification on two benchmark datasets, i.e., the ImageNet [Deng et al., 2009] dataset and the PASCAL VOC 2012 dataset [Everingham et al., 2010]. |
| Dataset Splits | Yes | We use the original validation set of ImageNet for evaluation. For the classification task, there are 5,717 images for training and 5,823 images for validation. |
| Hardware Specification | No | The paper mentions 'single node' and '48 GPUs' but does not specify the exact models or types of GPUs, CPUs, or other specific hardware components used. |
| Software Dependencies | No | The paper states 'We use MXNet [Chen et al., 2015] to conduct model training and testing' but does not provide a specific version number for MXNet or any other software dependencies. |
| Experiment Setup | Yes | Images are resized with a shorter side randomly sampled within [256, 480] for scale augmentation and 224×224 crops are randomly sampled during training [He et al., 2015]. We use SGD with base learning rate equal to 0.1 at the beginning and reduce the learning rate by a factor of 10 when the validation accuracy saturates. For the experiments on ResNet-18 we use a single node with a mini-batch size of 512. For ResNet-152 we use 48 GPUs with a mini-batch size of 32 per GPU. Following [He et al., 2015], we use a weight decay of 0.0001 and a momentum of 0.9 in the training. |
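The optimization recipe quoted above (SGD, momentum 0.9, weight decay 1e-4, base learning rate 0.1, divided by 10 when validation accuracy saturates) can be sketched in plain Python. This is an assumption-level illustration of the reported settings, not the authors' code; the function names and the plateau-patience heuristic are hypothetical.

```python
# Sketch of the paper's reported SGD settings (hypothetical helper names,
# not the authors' implementation).

def sgd_step(w, grad, velocity, lr, momentum=0.9, weight_decay=1e-4):
    """One SGD update with momentum; L2 weight decay is folded into the gradient."""
    new_velocity = [momentum * v + g + weight_decay * wi
                    for v, g, wi in zip(velocity, grad, w)]
    new_w = [wi - lr * v for wi, v in zip(w, new_velocity)]
    return new_w, new_velocity

def drop_lr_on_plateau(lr, val_acc_history, patience=3, factor=0.1):
    """Divide the learning rate by 10 once validation accuracy stops improving.

    The paper only says the rate is reduced 'when the validation accuracy
    saturates'; the patience window here is an assumed concretization.
    """
    if (len(val_acc_history) > patience
            and max(val_acc_history[-patience:]) <= max(val_acc_history[:-patience])):
        return lr * factor
    return lr

# Toy usage: minimize f(w) = w0^2 + w1^2 from (1, 1) with the reported
# base learning rate, momentum, and weight decay.
w, v, lr = [1.0, 1.0], [0.0, 0.0], 0.1
for _ in range(100):
    grad = [2 * wi for wi in w]
    w, v = sgd_step(w, grad, v, lr)
```

Note that the reported effective mini-batch sizes differ between models: 512 on a single node for ResNet-18, versus 48 GPUs × 32 = 1,536 for ResNet-152.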