Dual Path Networks

Authors: Yunpeng Chen, Jianan Li, Huaxin Xiao, Xiaojie Jin, Shuicheng Yan, Jiashi Feng

NeurIPS 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on three benchmark datasets, ImageNet-1k, Places365 and PASCAL VOC, clearly demonstrate superior performance of the proposed DPN over state-of-the-arts.
Researcher Affiliation | Collaboration | National University of Singapore; Beijing Institute of Technology; National University of Defense Technology; Qihoo 360 AI Institute
Pseudocode | No | No pseudocode or algorithm blocks were found.
Open Source Code | No | The paper does not provide any statement or link regarding the open-sourcing of the code for the described methodology.
Open Datasets | Yes | Extensive experiments are conducted for evaluating the proposed Dual Path Networks. Specifically, we evaluate the proposed architecture on three tasks: image classification, object detection and semantic segmentation, using three standard benchmark datasets: the ImageNet-1k dataset, Places365-Standard dataset and the PASCAL VOC datasets.
Dataset Splits | No | The paper refers to a validation set for evaluation (e.g., 'Single crop validation error rate (%) on validation set' in Table 2), but it does not give the percentages or sample counts of the train/validation/test splits, nor does it cite a source defining those exact splits in the main text.
Hardware Specification | Yes | We implement the DPNs using MXNet [2] on a cluster with 40 K80 graphic cards.
Software Dependencies | No | The paper mentions using 'MXNet [2]' for implementation but does not specify the version number of MXNet or any other software dependencies.
Experiment Setup | Yes | Following [3], we adopt standard data augmentation methods and train the networks using SGD with a mini-batch size of 32 for each GPU. For the deepest network, i.e. DPN-131, the mini-batch size is limited to 24 because of the 12GB GPU memory constraint. The learning rate starts from 0.1 for DPN-92 and DPN-131, and from 0.4 for DPN-98. It drops in a stepwise manner by a factor of 0.1. Following [5], batch normalization layers are refined after training.
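
The Experiment Setup row quotes the optimizer, per-GPU batch sizes, and learning-rate policy but not a complete recipe. Below is a minimal sketch, assuming MXNet's Gluon API, of how that configuration might be expressed. The step boundaries of the learning-rate schedule, the momentum and weight-decay values, and the ResNet stand-in from the Gluon model zoo (DPN itself is not in the model zoo) are illustrative assumptions, not details from the paper; the data augmentation pipeline and the post-training batch-normalization refinement from [5] are omitted.

```python
# Minimal sketch (not the authors' code) of the training configuration quoted
# above, written against MXNet's Gluon API. Values marked "assumed" are not
# stated in the paper and are placeholders for illustration only.
import mxnet as mx
from mxnet import gluon

num_gpus = 40                        # "a cluster with 40 K80 graphic cards"
ctx = [mx.gpu(i) for i in range(num_gpus)]
batch_size = 32 * num_gpus           # 32 per GPU (24 per GPU for DPN-131)

# Stand-in network: DPN is not in the Gluon model zoo, so a ResNet is used
# here purely as a placeholder for the sketch.
net = gluon.model_zoo.vision.resnet50_v1(classes=1000)
net.initialize(mx.init.MSRAPrelu(), ctx=ctx)

# Step-wise learning-rate decay by a factor of 0.1; the iteration counts at
# which the drops happen are assumed, since the paper does not list them.
lr_schedule = mx.lr_scheduler.MultiFactorScheduler(
    step=[150_000, 300_000, 450_000],   # assumed step boundaries
    factor=0.1,
)

optimizer = mx.optimizer.SGD(
    learning_rate=0.1,       # 0.1 for DPN-92 and DPN-131; 0.4 for DPN-98
    momentum=0.9,            # assumed; not stated in the paper
    wd=1e-4,                 # assumed; not stated in the paper
    lr_scheduler=lr_schedule,
)
trainer = gluon.Trainer(net.collect_params(), optimizer)
```

The resulting `trainer` would then be driven by a standard data-parallel training loop that splits each mini-batch across the GPU contexts in `ctx` (e.g., with `gluon.utils.split_and_load`) and calls `trainer.step(batch_size)` after each backward pass.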