L4GM: Large 4D Gaussian Reconstruction Model
Authors: Jiawei Ren, Cheng Xie, Ashkan Mirzaei, Hanxue Liang, Xiaohui Zeng, Karsten Kreis, Ziwei Liu, Antonio Torralba, Sanja Fidler, Seung Wook Kim, Huan Ling
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our L4GM on the benchmark provided by Consistent4D [21]. The paper also includes a dedicated 6 Experiments section. |
| Researcher Affiliation | Collaboration | 1 NVIDIA, 2 University of Toronto, 3 University of Cambridge, 4 MIT, 5 S-Lab, Nanyang Technological University |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | Project page: https://research.nvidia.com/labs/toronto-ai/l4gm and Justification: code to be released upon internal approval. |
| Open Datasets | Yes | Key to our method is a new large-scale dataset containing 12 million multiview videos of rendered animated 3D objects from Objaverse 1.0 [11]. and To collect a large-scale dataset for the 4D reconstruction task, we render all animated objects in Objaverse 1.0 [11]. |
| Dataset Splits | No | The paper describes the creation of the Objaverse-4D dataset and its use for training, but does not explicitly provide training/validation/test dataset splits (e.g., percentages or counts) or their usage in the experimental setup. |
| Hardware Specification | Yes | We train the model with one 8-frame clip per GPU on 128 A100 80GB GPUs. and We test on a 16GB RTX 4080 Super GPU. |
| Software Dependencies | No | The paper mentions software like Blender, EEVEE engine, and the LGM model but does not provide specific version numbers for the software dependencies used in the experiments. |
| Experiment Setup | Yes | We downsample the clips to 8 FPS and train the model for 200 epochs. In training, we set T = 8 and use 4 input cameras and 4 supervision cameras. During inference, we use T = 16. Following LGM, we use a combination of an LPIPS [69] loss and an MSE loss on RGB images, and an MSE loss on segmentation masks to supervise the reconstruction (see the loss sketch after the table). |
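
The supervision quoted in the Experiment Setup row can be summarized as a short PyTorch sketch. This is a minimal illustration, assuming the `lpips` package and `[0, 1]`-normalized renders; the loss weights `w_lpips`, `w_rgb`, and `w_mask`, the function name, and the tensor shapes are assumptions for illustration, since the paper states only the combination of losses, not its coefficients.

```python
# Minimal sketch of the described supervision: LPIPS + MSE on RGB renders,
# plus MSE on segmentation masks. Weights and shapes are hypothetical.
import torch
import torch.nn.functional as F
import lpips  # https://github.com/richzhang/PerceptualSimilarity

lpips_fn = lpips.LPIPS(net='vgg')  # perceptual similarity network

def reconstruction_loss(pred_rgb, gt_rgb, pred_mask, gt_mask,
                        w_lpips=1.0, w_rgb=1.0, w_mask=1.0):
    """pred_rgb / gt_rgb: (N, 3, H, W) in [0, 1]; masks: (N, 1, H, W) in [0, 1]."""
    # LPIPS expects inputs scaled to [-1, 1].
    loss_lpips = lpips_fn(pred_rgb * 2 - 1, gt_rgb * 2 - 1).mean()
    loss_rgb = F.mse_loss(pred_rgb, gt_rgb)
    loss_mask = F.mse_loss(pred_mask, gt_mask)
    return w_lpips * loss_lpips + w_rgb * loss_rgb + w_mask * loss_mask
```

In a training loop matching the quoted setup, each GPU would process one 8-frame clip (T = 8), render the reconstructed Gaussians from the 4 supervision cameras at every frame, and apply this loss between the renders and the ground-truth views.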