LASSIE: Learning Articulated Shapes from Sparse Image Ensemble via 3D Part Discovery

Authors: Chun-Han Yao, Wei-Chih Hung, Yuanzhen Li, Michael Rubinstein, Ming-Hsuan Yang, Varun Jampani

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on Pascal-Part and self-collected in-the-wild animal datasets demonstrate considerably better 3D reconstructions as well as both 2D and 3D part discovery compared to prior arts.
Researcher Affiliation | Collaboration | Chun-Han Yao (1), Wei-Chih Hung (2), Yuanzhen Li (3), Michael Rubinstein (3), Ming-Hsuan Yang (1, 3, 4), Varun Jampani (3). Affiliations: (1) UC Merced, (2) Waymo, (3) Google Research, (4) Yonsei University.
Pseudocode | No | The paper does not contain any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | Project page: https://chhankyao.github.io/lassie/. The paper also answers the question "Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)?" with [Yes].
Open Datasets | Yes | We conduct extensive experiments on image ensembles from the Pascal-Part Dataset [7] with quantitative evaluations using 2D ground-truth part segmentation and keypoints.
Dataset Splits | No | The paper states 'The LASSIE optimization and evaluations are performed on each image ensemble separately' and does not provide explicit train/validation/test splits, percentages, or sample counts for its datasets.
Hardware Specification | No | The paper does not provide specific details regarding the hardware used for experiments, such as GPU or CPU models.
Software Dependencies | No | The paper states 'We implement the framework in PyTorch [33]' but does not specify the PyTorch version or any other software dependencies with version numbers.
Experiment Setup | Yes | The overall optimization objective is: $L_{\text{mask}} + \lambda_1 L_{\text{sem}} + \lambda_2 L_{\text{pose}} + \lambda_3 L_{\text{ang}} + \lambda_4 L_{\text{lap}} + \lambda_5 L_{\text{norm}}$, where $\{\lambda_i\}$ are weighting hyper-parameters. We first pre-train the part prior MLP $F_p$ with 3D geometric primitives and freeze it during the optimization on a given image ensemble. For each image ensemble of an animal species, we perform multi-stage optimization on the camera, pose, and shape parameters until convergence. That is, we update the camera viewpoints and fix the rest first, then optimize the bone transformations, and finally the latent part codes as well as part deformation MLPs. In each iteration, we first update the semantic features of 3D surfaces, then use the updated features to update 3D surfaces, forming an EM-style optimization. We implement the framework in PyTorch [33] and update all the learnable parameters using an Adam optimizer [16].
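
The weighted objective and the staged update order described in the Experiment Setup row can be illustrated with a minimal sketch. It is not the LASSIE implementation, and everything in it is an assumption made for illustration: the function names, the three scalar parameters, the quadratic loss terms, and the lambda values are all hypothetical. It uses plain gradient descent with numerical gradients in place of Adam so that it runs with the Python standard library alone, and it omits the part deformation MLPs and the EM-style feature update.

```python
# Illustrative sketch only: toy scalar parameters and toy loss terms,
# not the LASSIE implementation. All names and values are hypothetical.

LOSS_NAMES = ["mask", "sem", "pose", "ang", "lap", "norm"]


def total_loss(terms, lambdas):
    """Return L_mask + sum_i lambda_i * L_i over the five weighted terms."""
    weighted = sum(lambdas[name] * terms[name] for name in LOSS_NAMES[1:])
    return terms["mask"] + weighted


def toy_terms(params):
    """Hypothetical stand-ins: each loss term is a quadratic in the parameters."""
    cam, bone, code = params["camera"], params["bones"], params["part_codes"]
    return {
        "mask": (cam - 1.0) ** 2 + (bone - 2.0) ** 2 + (code + 1.0) ** 2,
        "sem": (cam - 1.0) ** 2,
        "pose": bone ** 2,
        "ang": (bone - 2.0) ** 2,
        "lap": code ** 2,
        "norm": (code + 1.0) ** 2,
    }


def objective(params, lambdas):
    return total_loss(toy_terms(params), lambdas)


def optimize_stage(params, active, lambdas, lr=0.05, steps=200, eps=1e-5):
    """Gradient descent on the `active` groups only; all others stay frozen."""
    for _ in range(steps):
        grads = {}
        for name in active:
            # Central-difference gradient with respect to one parameter group.
            up = dict(params, **{name: params[name] + eps})
            down = dict(params, **{name: params[name] - eps})
            grads[name] = (objective(up, lambdas) - objective(down, lambdas)) / (2 * eps)
        for name in active:
            params[name] -= lr * grads[name]
    return params


# Stage order from the text: camera first, then bones, then part codes.
STAGES = [("camera",), ("bones",), ("part_codes",)]


def run(lambdas):
    params = {"camera": 0.0, "bones": 0.0, "part_codes": 0.0}
    history = [objective(params, lambdas)]  # objective before any stage
    for active in STAGES:
        optimize_stage(params, active, lambdas)
        history.append(objective(params, lambdas))  # objective after each stage
    return params, history


LAMBDAS = {"sem": 1.0, "pose": 0.1, "ang": 0.1, "lap": 0.1, "norm": 0.1}
final_params, loss_history = run(LAMBDAS)
```

Calling `run(LAMBDAS)` returns the final toy parameters and the objective value recorded before the first stage and after each stage. In a PyTorch setting the same freezing could plausibly be expressed by giving the optimizer only the active parameter group at each stage, though the source does not say how the authors implement it.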