CNN²: Viewpoint Generalization via a Binocular Vision

Authors: Wei-Da Chen, Shan-Hung (Brandon) Wu

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirical evaluation shows that CNN² has improved viewpoint generalizability compared to vanilla CNNs. Furthermore, CNN² is easy to implement and train, and is compatible with existing CNN-based techniques specialized for different applications. Experiments use binocular images from the SmallNORB (LeCun et al., 2004), ModelNet (Wu et al., 2015), and larger-scale RGB-D Object (Lai et al., 2011) datasets.
Researcher Affiliation | Academia | Wei-Da Chen, Department of Computer Science, National Tsing-Hua University, Taiwan, R.O.C. (wdchen@datalab.cs.nthu.edu.tw); Shan-Hung Wu, Department of Computer Science, National Tsing-Hua University, Taiwan, R.O.C. (shwu@cs.nthu.edu.tw)
Pseudocode | No | The paper contains no structured pseudocode or algorithm blocks; the model architecture and training process are described in text and diagrams. (An illustrative two-stream sketch follows the table.)
Open Source Code | No | The paper neither states that the CNN² source code is released nor links to a code repository.
Open Datasets | Yes | Experiments use three binocular image datasets: 1) the ModelNet2D dataset rendered from ModelNet40 (Wu et al., 2015) following the settings of LeCun et al. (2004); 2) the SmallNORB dataset (LeCun et al., 2004); and 3) the RGB-D Object dataset (Lai et al., 2011). (See the loading sketch after the table.)
Dataset Splits | Yes | On ModelNet2D, images taken from azimuths of 50 to 125 degrees form the training set, 30 to 45 and 130 to 145 degrees the validation set, and the remaining degrees the test set. On SmallNORB, azimuths of 20 to 80 degrees form the training set, 0 and 100 degrees the validation set, and the rest the test set. In the RGB-D Object dataset, each object is photographed from different viewpoints, so images from one third of each object's continuous viewpoints form the training set and the remaining images the test set; one third of the training images with continuous viewpoints is further split off as the validation set. (A re-partitioning sketch follows the table.)
Hardware Specification | Yes | Experiments run on a computer with an Intel Core i7-6900K CPU, 64 GB RAM, and an NVIDIA GeForce GTX 1070 GPU.
Software Dependencies | No | The paper mentions using TensorFlow but gives no version number for it or any other key software component used in the experiments.
Experiment Setup | No | The paper reports some details, such as not augmenting data during training and using fewer filters for CNN² (50 vs. 112 for the vanilla CNN), but omits learning rates, batch sizes, optimizers, and training schedules. It states that "we search for the best architecture for a given dataset" without listing the resulting hyperparameter values. (A placeholder search sketch follows the table.)
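
Since the paper contains no pseudocode, the following is a minimal sketch of a generic two-stream ("binocular") CNN in TensorFlow/Keras, the framework the paper reports using. Everything in it, including the layer counts, kernel sizes, concatenation-based fusion, and the 96×96×1 input shape (SmallNORB's image size), is an assumption for illustration, not the paper's CNN² architecture.

```python
import tensorflow as tf

def binocular_cnn(input_shape=(96, 96, 1), num_classes=5, filters=50):
    """Generic two-stream CNN over a stereo (left, right) image pair.

    Illustrative only: the actual CNN^2 architecture is described in the
    paper's text and diagrams and is not reproduced here. The filter
    count of 50 echoes the number the paper reports for CNN^2.
    """
    left = tf.keras.Input(shape=input_shape, name="left_eye")
    right = tf.keras.Input(shape=input_shape, name="right_eye")

    # Shared convolutional trunk applied to each eye's image.
    trunk = tf.keras.Sequential([
        tf.keras.layers.Conv2D(filters, 5, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(filters, 5, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
    ])

    # Fuse the two streams by channel concatenation, then classify.
    merged = tf.keras.layers.Concatenate()([trunk(left), trunk(right)])
    merged = tf.keras.layers.Flatten()(merged)
    merged = tf.keras.layers.Dense(128, activation="relu")(merged)
    out = tf.keras.layers.Dense(num_classes, activation="softmax")(merged)
    return tf.keras.Model(inputs=[left, right], outputs=out)

model = binocular_cnn()
model.summary()
```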
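Of the three datasets, SmallNORB is the easiest to obtain programmatically: it ships with tensorflow_datasets under the name smallnorb, and each example carries both stereo camera views. ModelNet40 and RGB-D Object must be fetched from their respective project pages. A loading sketch (field names follow the TFDS catalog; treat them as assumptions if your TFDS version differs):

```python
import tensorflow_datasets as tfds

# Each SmallNORB example holds both stereo views plus pose labels.
ds, info = tfds.load("smallnorb", split="train", with_info=True)
print(info.features)

for example in tfds.as_numpy(ds.take(1)):
    left = example["image"]    # (96, 96, 1) uint8, first camera
    right = example["image2"]  # (96, 96, 1) uint8, second camera
    print(left.shape, right.shape, example["label_azimuth"])
```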
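The splits in the table are defined by azimuth ranges rather than by the datasets' built-in train/test partitions, so a reproducer must re-partition by the azimuth label. The sketch below does this for SmallNORB, under the assumption that TFDS's label_azimuth index times 20 gives the azimuth in degrees (0 to 340 in 20-degree steps):

```python
import tensorflow as tf
import tensorflow_datasets as tfds

def azimuth_deg(example):
    # Assumption: label_azimuth indexes 18 azimuths spaced 20 degrees apart.
    return tf.cast(example["label_azimuth"], tf.int32) * 20

# Pool both original splits, then re-partition by azimuth as the paper describes:
# 20-80 degrees for training, {0, 100} for validation, the rest for testing.
ds = tfds.load("smallnorb", split="train+test")

train_ds = ds.filter(lambda ex: (azimuth_deg(ex) >= 20) & (azimuth_deg(ex) <= 80))
val_ds = ds.filter(lambda ex: (azimuth_deg(ex) == 0) | (azimuth_deg(ex) == 100))
test_ds = ds.filter(lambda ex: azimuth_deg(ex) >= 120)  # the remaining azimuths
```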
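Finally, since the paper fixes only the filter count and states that the best architecture is searched per dataset, any reproduction must choose the remaining hyperparameters itself. Below is a minimal grid-search sketch reusing the model and splits from the sketches above; the learning rates, batch sizes, optimizer, and epoch count are placeholders, not values from the paper:

```python
import itertools
import tensorflow as tf

def to_inputs(ex):
    # Map a TFDS example to (inputs, label); scale pixels to [0, 1].
    left = tf.cast(ex["image"], tf.float32) / 255.0
    right = tf.cast(ex["image2"], tf.float32) / 255.0
    return {"left_eye": left, "right_eye": right}, ex["label_category"]

# Placeholder search space -- the paper does not report these values.
grid = {"learning_rate": [1e-3, 1e-4], "batch_size": [32, 64]}

best_acc, best_config = 0.0, None
for lr, bs in itertools.product(grid["learning_rate"], grid["batch_size"]):
    model = binocular_cnn(filters=50)  # sketch model defined above
    model.compile(optimizer=tf.keras.optimizers.Adam(lr),
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    history = model.fit(train_ds.map(to_inputs).batch(bs),
                        validation_data=val_ds.map(to_inputs).batch(bs),
                        epochs=10, verbose=0)
    acc = max(history.history["val_accuracy"])
    if acc > best_acc:
        best_acc, best_config = acc, {"learning_rate": lr, "batch_size": bs}

print(best_config, best_acc)
```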