Pose Adaptive Dual Mixup for Few-Shot Single-View 3D Reconstruction
Authors: Ta-Ying Cheng, Hsuan-Ru Yang, Niki Trigoni, Hwann-Tzong Chen, Tyng-Luh Liu
AAAI 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our PADMix significantly outperforms previous literature on few-shot settings over the ShapeNet dataset and sets new benchmarks on the more challenging real-world Pix3D dataset. We extensively study the generalization results of PADMix on the ShapeNet dataset (Chang et al. 2015), following the identical settings as previous work in the 80-20 split of base classes {airplanes, cars, chairs, displays, phone, speakers, tables} and novel classes {cabinet, sofa, bench, watercraft, rifle, lamp}. All procedures are trained using eight Nvidia Tesla V100s for 100 epochs with a batch size of 32. |
| Researcher Affiliation | Academia | 1 Institute of Information Science, Academia Sinica, Taiwan 2 Department of Computer Science, National Tsing Hua University, Taiwan 3 Department of Computer Science, University of Oxford, UK |
| Pseudocode | No | The paper describes the model architecture and training procedures using text and diagrams (Figure 1, 2, 3), but it does not include any explicit pseudocode blocks or algorithm listings. |
| Open Source Code | No | The paper does not contain an explicit statement about the release of source code for the described methodology, nor does it provide a link to a code repository. |
| Open Datasets | Yes | Our empirical study on the popular ShapeNet dataset (Chang et al. 2015) shows that an image-prior encoder on par with previous work can improve significantly and achieve state-of-the-art results with the addition of PADMix. Finally, we extend PADMix to the challenging Pix3D dataset (Sun et al. 2018) to create a new benchmark in few-shot real-world object reconstruction. |
| Dataset Splits | Yes | We extensively study the generalization results of PADMix on the ShapeNet dataset (Chang et al. 2015), following the identical settings as previous work in the 80-20 split of base classes {airplanes, cars, chairs, displays, phone, speakers, tables} and novel classes {cabinet, sofa, bench, watercraft, rifle, lamp}. We extract all training and testing data from the standard S1 split described in the Mesh R-CNN paper (Gkioxari, Malik, and Johnson 2019), which contains 7539 train images and 2530 test images. |
| Hardware Specification | Yes | All procedures are trained using eight Nvidia Tesla V100s for 100 epochs with a batch size of 32. |
| Software Dependencies | No | The paper mentions using an "ImageNet-pretrained ResNet-34" but does not specify any software libraries (e.g., PyTorch, TensorFlow) or their version numbers, nor the Python version used for implementation. |
| Experiment Setup | Yes | All procedures are trained using eight Nvidia Tesla V100s for 100 epochs with a batch size of 32. In terms of hyperparameters, µ is set to 0.1, α to 0.2, and w_BCE and w_ADP to 10 and 0.5. The learning rates of the entire base network and the additional shape encoder E_GT are set to 1e-3 and 1e-4, respectively. |
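For readers attempting a reproduction, the reported hyperparameters can be collected into a small configuration sketch. The field names below (`lr_base`, `w_bce`, etc.) are our own illustrative labels for the values quoted above, not identifiers from the paper; sampling the mixup coefficient from Beta(α, α) follows the standard mixup formulation with the reported α = 0.2.

```python
import random
from dataclasses import dataclass


@dataclass
class PADMixConfig:
    """Hyperparameters reported in the paper; field names are illustrative."""
    epochs: int = 100
    batch_size: int = 32
    mu: float = 0.1               # µ
    alpha: float = 0.2            # α, Beta parameter for mixup
    w_bce: float = 10.0           # w_BCE loss weight
    w_adp: float = 0.5            # w_ADP loss weight
    lr_base: float = 1e-3         # learning rate, entire base network
    lr_shape_encoder: float = 1e-4  # learning rate, shape encoder E_GT


def sample_mixup_lambda(cfg: PADMixConfig) -> float:
    """Sample the mixup interpolation coefficient λ ~ Beta(α, α)."""
    return random.betavariate(cfg.alpha, cfg.alpha)
```

With α = 0.2 the Beta distribution is strongly bimodal near 0 and 1, so most mixed samples stay close to one of the two originals, which is the usual behavior of mixup at small α.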