Perspective Transformer Nets: Learning Single-View 3D Object Reconstruction without 3D Supervision
Authors: Xinchen Yan, Jimei Yang, Ersin Yumer, Yijie Guo, Honglak Lee
NeurIPS 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the ability of the model in generating 3D volume from a single 2D image with three sets of experiments: (1) learning from single-class objects; (2) learning from multi-class objects; and (3) testing on novel object classes. Results show superior performance and better generalization ability for 3D object reconstruction when the projection loss is involved. We conduct experimental evaluations using a subset of 3D models from ShapeNetCore [1]. Results from single-class and multi-class training demonstrate excellent performance of our network for volumetric 3D reconstruction. |
| Researcher Affiliation | Collaboration | 1University of Michigan, Ann Arbor 2Adobe Research 3Google Brain |
| Pseudocode | No | The paper describes procedures, including the perspective transformer and projection loss, using mathematical equations but does not present a formal pseudocode or algorithm block (a hedged code sketch of the projection operation appears below the table). |
| Open Source Code | Yes | To download the code, please refer to the project webpage: http://goo.gl/YEJ2H6. |
| Open Datasets | Yes | ShapeNetCore. This dataset contains about 51,300 unique 3D models from 55 common object categories [1]. [1] A. X. Chang, T. Funkhouser, L. Guibas, P. Hanrahan, Q. Huang, Z. Li, S. Savarese, M. Savva, S. Song, H. Su, et al. ShapeNet: An information-rich 3D model repository. arXiv preprint arXiv:1512.03012, 2015. |
| Dataset Splits | No | For the multi-category experiment, the training set includes 13 major categories: airplane, bench, dresser, car, chair, display, lamp, loudspeaker, rifle, sofa, table, telephone and vessel. Basically, we preserved 20% of instances from each category as testing data. |
| Hardware Specification | No | We acknowledge NVIDIA for the donation of GPUs. |
| Software Dependencies | Yes | The models including the perspective transformer nets are implemented using Torch [3]. [3] R. Collobert, K. Kavukcuoglu, and C. Farabet. Torch7: A Matlab-like environment for machine learning. In BigLearn, NIPS Workshop, number EPFL-CONF-192376, 2011. |
| Experiment Setup | Yes | Implementation Details. We used the ADAM [7] solver for stochastic optimization in all the experiments. During the pre-training stage (for the encoder), we used mini-batches of size 32, 32, 8, 4, 3 and 2 for training RNN-1, RNN-2, RNN-4, RNN-8, RNN-12 and RNN-16, as used in Yang et al. [23]. We used a learning rate of 10^-4 for RNN-1, and 10^-5 for the rest of the recurrent neural networks. During the fine-tuning stage (for the volume decoder), we used a mini-batch of size 6 and a learning rate of 10^-4. For each object in a mini-batch, we include projections from all 24 views as supervision. (A hedged configuration sketch of this two-stage schedule also appears below the table.) |
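
As noted in the Pseudocode row, the paper specifies the perspective transformer only through equations. The following is a minimal PyTorch sketch of that projection operation, assuming a 4x4 homogeneous camera matrix, trilinear resampling via `grid_sample`, and an L2 projection loss; the grid resolution, the clamp epsilon, and the function names are illustrative choices, not details confirmed by the paper.

```python
import torch
import torch.nn.functional as F

def perspective_transform(volume, theta, out_shape=(32, 32, 32)):
    """Resample a 3D occupancy volume into the camera frame.

    volume: (N, 1, D, H, W) occupancies in [0, 1].
    theta:  (N, 4, 4) homogeneous camera matrices (assumed form).
    """
    N = volume.size(0)
    D, H, W = out_shape
    # Regular target grid over the camera-frame volume in [-1, 1]^3.
    zs, ys, xs = (torch.linspace(-1.0, 1.0, s) for s in (D, H, W))
    z, y, x = torch.meshgrid(zs, ys, xs, indexing="ij")
    grid = torch.stack([x, y, z, torch.ones_like(x)], dim=-1)  # homogeneous
    grid = grid.view(1, -1, 4).expand(N, -1, -1)
    # Map each target point back into the source volume, applying the
    # perspective divide by the fourth homogeneous coordinate.
    src = torch.bmm(grid, theta.transpose(1, 2))
    src = src[..., :3] / src[..., 3:].clamp(min=1e-6)  # guard the divide
    src = src.view(N, D, H, W, 3)
    # Trilinear sampling of the source volume at the projected locations.
    return F.grid_sample(volume, src, align_corners=True)

def projected_silhouette(volume, theta):
    # Max projection along the depth axis: S(x, y) = max_z V'(z, y, x).
    return perspective_transform(volume, theta).max(dim=2).values

def projection_loss(pred_volume, theta, target_mask):
    # L2 loss between the projected silhouette and the ground-truth 2D mask.
    return F.mse_loss(projected_silhouette(pred_volume, theta), target_mask)

# Toy usage: identity camera on a random volume.
vol = torch.rand(2, 1, 32, 32, 32)
eye = torch.eye(4).repeat(2, 1, 1)
mask = torch.rand(2, 1, 32, 32)
print(projection_loss(vol, eye, mask).item())
```

Because the silhouette is obtained by differentiable sampling and a max over depth, gradients flow from the 2D mask loss back into the predicted volume, which is what lets the network train without 3D supervision.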
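
The Experiment Setup row above maps onto a small configuration sketch. Only the batch sizes, learning rates, and view count below come from the paper; the use of PyTorch's `Adam` (in place of the authors' Torch/Lua setup) and the placeholder `make_optimizer` helper are assumptions.

```python
import torch

# Stage 1: encoder pre-training; one setting per recurrent model,
# following the batch sizes and learning rates quoted above.
PRETRAIN = {
    "RNN-1":  {"batch_size": 32, "lr": 1e-4},
    "RNN-2":  {"batch_size": 32, "lr": 1e-5},
    "RNN-4":  {"batch_size": 8,  "lr": 1e-5},
    "RNN-8":  {"batch_size": 4,  "lr": 1e-5},
    "RNN-12": {"batch_size": 3,  "lr": 1e-5},
    "RNN-16": {"batch_size": 2,  "lr": 1e-5},
}

# Stage 2: volume-decoder fine-tuning with all 24 views per object.
FINETUNE = {"batch_size": 6, "lr": 1e-4, "views_per_object": 24}

def make_optimizer(model, cfg):
    # ADAM is the stochastic optimizer in all of the paper's experiments.
    return torch.optim.Adam(model.parameters(), lr=cfg["lr"])
```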