PF-LRM: Pose-Free Large Reconstruction Model for Joint Pose and Shape Prediction

Authors: Peng Wang, Hao Tan, Sai Bi, Yinghao Xu, Fujun Luan, Kalyan Sunkavalli, Wenping Wang, Zexiang Xu, Kai Zhang

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | 4 EXPERIMENTS; 4.1 EXPERIMENTAL SETTINGS; 4.2 EXPERIMENT RESULTS; 4.3 ABLATION STUDIES
Researcher Affiliation | Collaboration | Peng Wang (Adobe Research & HKU, totoro97@outlook.com); Hao Tan (Adobe Research, hatan@adobe.com); Sai Bi (Adobe Research, sbi@adobe.com); Yinghao Xu (Adobe Research & Stanford, yhxu@stanford.edu); Fujun Luan (Adobe Research, fluan@adobe.com); Kalyan Sunkavalli (Adobe Research, sunkaval@adobe.com); Wenping Wang (Texas A&M University, wenping@tamu.edu); Zexiang Xu (Adobe Research, zexu@adobe.com); Kai Zhang (Adobe Research, kaiz@adobe.com)
Pseudocode | No | The paper describes the pipeline with diagrams and mathematical equations, but does not include an explicitly labeled 'Pseudocode' or 'Algorithm' block.
Open Source Code | Yes | Our project website is at: https://totoro97.github.io/pf-lrm.
Open Datasets | Yes | We use a mixture of multi-view posed renderings from Objaverse (Deitke et al., 2023) and posed real captures from MVImgNet (Yu et al., 2023) for training.
Dataset Splits | No | The paper states 'Our model only requires multi-view posed images to train' and details evaluation on various datasets, but it does not explicitly specify train/validation/test splits for its own training data beyond the use of separate evaluation datasets.
Hardware Specification | Yes | 1.3 seconds on a single A100 GPU. ... It has 24 self-attention layers with 1024 token dimension, and is trained on 8 A100 GPUs for 20 epochs (~100k iterations), which takes around 5 days. In addition, to show the scaling law with respect to model sizes, we train a large model (Ours (L)) on 128 GPUs for 100 epochs (~70k iterations). (See the architecture sketch after this table.)
Software Dependencies | No | The paper mentions using the AdamW optimizer and FlashAttention V2, but does not provide specific version numbers for these or for any other software dependencies such as PyTorch or Python.
Experiment Setup | Yes | We set the loss weights γ_C, γ_C′, γ_p, γ_α, γ_y to 1, 2, 1, 1, 1, respectively. For more details, please refer to Sec. A.3 of the appendix. ... We use AdamW (Loshchilov & Hutter, 2017) (β1 = 0.9, β2 = 0.95) optimizer with weight decay 0.05 for model optimization. The initial learning rate is zero, linearly warmed up to 4e-4 over the first 3k steps and then decayed to zero by cosine scheduling. (See the optimizer sketch after this table.)
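
The Hardware Specification row quotes an architecture of 24 self-attention layers with a 1024-dimensional token stream. As a rough illustration of that capacity, here is a minimal PyTorch sketch; the head count, feed-forward width, and pre-norm placement are assumptions (the quoted text does not specify them), and this is not the authors' released implementation.

```python
import torch.nn as nn

# Minimal stand-in matching the reported capacity: 24 self-attention layers
# with a 1024-dimensional token stream (both reported in the paper).
encoder_layer = nn.TransformerEncoderLayer(
    d_model=1024,          # token dimension (reported)
    nhead=16,              # assumed; not specified in the quoted text
    dim_feedforward=4096,  # assumed 4x expansion; not specified
    batch_first=True,
    norm_first=True,       # assumed pre-norm placement
)
backbone = nn.TransformerEncoder(encoder_layer, num_layers=24)  # 24 layers (reported)
```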
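The Experiment Setup row describes the optimization recipe: AdamW with β1 = 0.9, β2 = 0.95, weight decay 0.05, a linear warmup from zero to 4e-4 over the first 3k steps, and cosine decay to zero. A minimal PyTorch sketch of that schedule follows; the stand-in model is hypothetical, and the 100k total steps match the base model's reported iteration count.

```python
import math
import torch

# Hypothetical stand-in model; PF-LRM's real network is not reproduced here.
model = torch.nn.Linear(1024, 1024)

# AdamW as quoted: beta1 = 0.9, beta2 = 0.95, weight decay 0.05, peak LR 4e-4.
optimizer = torch.optim.AdamW(
    model.parameters(), lr=4e-4, betas=(0.9, 0.95), weight_decay=0.05
)

WARMUP_STEPS = 3_000    # linear warmup from zero (reported)
TOTAL_STEPS = 100_000   # ~100k iterations for the base model (reported)

def lr_lambda(step: int) -> float:
    # Linear warmup to the peak LR, then cosine decay to zero.
    if step < WARMUP_STEPS:
        return step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    return 0.5 * (1.0 + math.cos(math.pi * progress))

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)
# Per training step: optimizer.step() followed by scheduler.step().
```

LambdaLR multiplies the base learning rate of 4e-4 by the returned factor, so the schedule starts at zero, reaches its peak exactly at step 3k, and decays back to zero by the final step.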