Learning Efficient Point Cloud Generation for Dense 3D Object Reconstruction
Authors: Chen-Hsuan Lin, Chen Kong, Simon Lucey
AAAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results for single-image 3D object reconstruction tasks show that we outperform state-of-the-art methods in terms of shape similarity and prediction density. Our experimental results show that we generate much denser and more accurate shapes than state-of-the-art 3D prediction methods. We evaluate our proposed method by analyzing its performance in the application of single-image 3D reconstruction and comparing against state-of-the-art methods. |
| Researcher Affiliation | Academia | Chen-Hsuan Lin, Chen Kong, Simon Lucey The Robotics Institute Carnegie Mellon University chlin@cmu.edu, {chenk,slucey}@cs.cmu.edu |
| Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not include any statement about releasing its source code, nor does it provide a link to a code repository. |
| Open Datasets | Yes | We train and evaluate all networks using the ShapeNet database (Chang et al. 2015), which contains a large collection of categorized 3D CAD models. |
| Dataset Splits | No | The paper mentions an '80%-20% training/test split' but does not specify a validation split. |
| Hardware Specification | No | The paper vaguely mentions 'high-end GPU nodes' but does not provide specific hardware details (e.g., GPU models, CPU types, memory) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers, such as libraries or frameworks used in implementation. |
| Experiment Setup | Yes | The structure generator predicts a 4N-channel image, which consists of the x, y, z coordinates and the binary mask from each of the N fixed viewpoints. We chose N = 8 with those viewpoints looking from the 8 corners of a centered cube. Orthographic projection is assumed in the transformation in (1) and (2). We take a two-stage training procedure: the structure generator is first pretrained to predict the x, y regular grids and depth images (z) from the N viewpoints (also pre-rendered with size 128 × 128), and then the network is fine-tuned with joint 2D projection optimization. |
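
To make the Experiment Setup row concrete, below is a minimal PyTorch sketch of the multi-view output format it describes: a head that predicts a 4N-channel image (x, y, z coordinates and a binary mask for each of the N = 8 fixed viewpoints), followed by a fusion step that back-projects the per-view maps into a single dense point cloud under the paper's orthographic assumption. The class and function names (`StructureHead`, `fuse_point_cloud`), the layer sizes, and the pose format are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of the 4N-channel multi-view prediction described in
# the Experiment Setup row. Names and layer sizes are assumptions.
import torch
import torch.nn as nn

N_VIEWS = 8          # viewpoints at the 8 corners of a centered cube
H = W = 128          # pre-rendered image size (128 x 128)

class StructureHead(nn.Module):
    """Maps a feature map to a 4N-channel image: (x, y, z, mask) per view."""
    def __init__(self, in_channels: int = 64):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, 4 * N_VIEWS, kernel_size=1)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        out = self.conv(feats)                   # (B, 4N, H, W)
        return out.view(-1, N_VIEWS, 4, H, W)    # (B, N, 4, H, W)

def fuse_point_cloud(pred: torch.Tensor, poses: torch.Tensor) -> torch.Tensor:
    """Back-project per-view (x, y, z) maps into one world-frame cloud.

    pred:  (N, 4, H, W) prediction for a single sample.
    poses: (N, 3, 4) camera-to-world rigid transforms, one per viewpoint
           (assumed given; projection is orthographic, so no intrinsics).
    """
    points = []
    for v in range(N_VIEWS):
        xyz = pred[v, :3].reshape(3, -1)          # camera-frame coordinates
        mask = pred[v, 3].reshape(-1) > 0         # keep foreground pixels only
        R, t = poses[v, :, :3], poses[v, :, 3:]   # rotation, translation
        world = R @ xyz[:, mask] + t              # rigid transform to world frame
        points.append(world.T)                    # (M_v, 3)
    return torch.cat(points, dim=0)               # dense fused point cloud
```

Because the projection is orthographic, fusing the views reduces to one rigid transform per viewpoint; a perspective camera model would additionally require intrinsics in the back-projection.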