PF-LRM: Pose-Free Large Reconstruction Model for Joint Pose and Shape Prediction
Authors: Peng Wang, Hao Tan, Sai Bi, Yinghao Xu, Fujun Luan, Kalyan Sunkavalli, Wenping Wang, Zexiang Xu, Kai Zhang
ICLR 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 4 EXPERIMENTS; 4.1 EXPERIMENTAL SETTINGS; 4.2 EXPERIMENT RESULTS; 4.3 ABLATION STUDIES |
| Researcher Affiliation | Collaboration | Peng Wang (Adobe Research & HKU, totoro97@outlook.com); Hao Tan (Adobe Research, hatan@adobe.com); Sai Bi (Adobe Research, sbi@adobe.com); Yinghao Xu (Adobe Research & Stanford, yhxu@stanford.edu); Fujun Luan (Adobe Research, fluan@adobe.com); Kalyan Sunkavalli (Adobe Research, sunkaval@adobe.com); Wenping Wang (Texas A&M University, wenping@tamu.edu); Zexiang Xu (Adobe Research, zexu@adobe.com); Kai Zhang (Adobe Research, kaiz@adobe.com) |
| Pseudocode | No | The paper describes the pipeline with diagrams and mathematical equations, but does not include an explicitly labeled 'Pseudocode' or 'Algorithm' block. |
| Open Source Code | Yes | Our project website is at: https://totoro97.github.io/pf-lrm. |
| Open Datasets | Yes | We use a mixture of multi-view posed renderings from Objaverse (Deitke et al., 2023) and posed real captures from MVImgNet (Yu et al., 2023) for training. |
| Dataset Splits | No | The paper states 'Our model only requires multi-view posed images to train' and details evaluation on various datasets, but it does not explicitly specify train/validation/test splits for its own training data; it only relies on separate datasets for evaluation. |
| Hardware Specification | Yes | 1.3 seconds on a single A100 GPU. ... It has 24 self-attention layers with 1024 token dimension, and is trained on 8 A100 GPUs for 20 epochs (~100k iterations), which takes around 5 days. In addition, to show the scaling law with respect to model sizes, we train a large model (Ours (L)) on 128 GPUs for 100 epochs (~70k iterations). |
| Software Dependencies | No | The paper mentions using the AdamW optimizer and FlashAttention V2, but does not provide specific version numbers for these or any other software dependencies like PyTorch or Python. |
| Experiment Setup | Yes | We set the loss weights γ_C, γ_{C′}, γ_p, γ_α, γ_y to 1, 2, 1, 1, 1, respectively. For more details, please refer to Sec. A.3 of the appendix. ... We use the AdamW (Loshchilov & Hutter, 2017) optimizer (β1 = 0.9, β2 = 0.95) with weight decay 0.05 for model optimization. The initial learning rate is zero, which is linearly warmed up to 4e-4 for the first 3k steps and then decayed to zero by cosine scheduling. |
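
For reference, a minimal PyTorch sketch of the optimizer and learning-rate schedule quoted above (AdamW with β1 = 0.9, β2 = 0.95, weight decay 0.05; linear warmup from zero to 4e-4 over the first 3k steps, then cosine decay back to zero) might look like the following. This is not the authors' code: the placeholder `model`, the use of `LambdaLR`, and the total step count (taken from the "~100k iterations" figure quoted under Hardware Specification) are assumptions for illustration only.

```python
import math
import torch

# Placeholder model; the paper's transformer (24 self-attention layers,
# 1024 token dimension) is not reproduced here.
model = torch.nn.Linear(1024, 1024)

total_steps = 100_000   # "~100k iterations" (assumed as the schedule length)
warmup_steps = 3_000    # "linearly warmed up ... for the first 3k steps"
peak_lr = 4e-4          # warmup target learning rate

# AdamW settings as quoted: betas=(0.9, 0.95), weight decay 0.05.
optimizer = torch.optim.AdamW(
    model.parameters(), lr=peak_lr, betas=(0.9, 0.95), weight_decay=0.05
)

def lr_lambda(step: int) -> float:
    """Scale factor on peak_lr: linear warmup, then cosine decay to zero."""
    if step < warmup_steps:
        return step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * (1.0 + math.cos(math.pi * progress))

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)

for step in range(1, total_steps + 1):
    optimizer.step()    # placeholder: real training computes a loss and calls backward() first
    scheduler.step()
    if step in (1_500, 3_000, 50_000, 100_000):
        print(f"step {step:>6d}: lr = {scheduler.get_last_lr()[0]:.2e}")
```

With this schedule the learning rate starts at zero, reaches 4e-4 at step 3,000, and falls back to zero by the final step, matching the description quoted from the paper.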