4Diffusion: Multi-view Video Diffusion Model for 4D Generation

Authors: Haiyu Zhang, Xinyuan Chen, Yaohui Wang, Xihui Liu, Yunhong Wang, Yu Qiao

NeurIPS 2024

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "Extensive qualitative and quantitative experiments demonstrate that our method achieves superior performance compared to previous methods." |
| Researcher Affiliation | Academia | Haiyu Zhang (1,2), Xinyuan Chen (2), Yaohui Wang (2), Xihui Liu (3), Yunhong Wang (1), Yu Qiao (2); 1: Beihang University, 2: Shanghai AI Laboratory, 3: The University of Hong Kong; emails: {zhyzhy,yhwang}@buaa.edu.cn, {chenxinyuan,wangyaohui,qiaoyu}@pjlab.org.cn, xihuiliu@eee.hku.hk |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | "We release our code and data at https://aejion.github.io/4diffusion." |
| Open Datasets | Yes | "We utilize Objaverse dataset [11] to train 4DM." |
| Dataset Splits | No | The paper mentions using a 'curated subset' for training and 'test cases' for evaluation, but does not specify a validation split or its details. |
| Hardware Specification | Yes | "The training takes about 2 days with 16 NVIDIA Tesla A100 GPUs." |
| Software Dependencies | No | The paper mentions the Stable Diffusion and threestudio frameworks but does not give version numbers for them or for other dependencies such as Python or PyTorch. |
| Experiment Setup | Yes | "We train 4DM with multi-view videos at 256×256 resolution for 30,000 steps with a batch size of 32, using the AdamW optimizer with a learning rate of 1e-4." |
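For the Open Datasets row, a minimal sketch of pulling Objaverse assets with the `objaverse` PyPI package is shown below. The paper trains on a curated subset of Objaverse whose filtering criteria are not reproduced here, so the slice taken in this sketch is purely illustrative.

```python
# Hedged sketch: fetching a few Objaverse assets via the `objaverse`
# PyPI package (pip install objaverse). The paper uses a curated subset;
# its selection criteria are not specified here, so this slice is
# illustrative only.
import objaverse

uids = objaverse.load_uids()                 # list of ~800k object UIDs
sample = uids[:8]                            # tiny illustrative slice
paths = objaverse.load_objects(uids=sample)  # dict: uid -> local .glb path
for uid, path in paths.items():
    print(uid, path)
```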
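For the Experiment Setup row, the following is a minimal PyTorch sketch of the reported optimization settings. The dataset class, tensor dimensions, and model are hypothetical placeholders (the actual 4DM is a multi-view video diffusion model, released at the URL above); only the resolution, step count, batch size, optimizer, and learning rate come from the paper.

```python
# Minimal sketch of the reported 4DM training configuration: 30,000 steps,
# batch size 32, AdamW with lr 1e-4, 256x256 multi-view video clips.
# Everything else (dataset, view/frame counts, model, loss) is a placeholder;
# see https://aejion.github.io/4diffusion for the authoritative code.
import torch
from torch import nn
from torch.utils.data import DataLoader, Dataset


class MultiViewVideoDataset(Dataset):
    """Hypothetical stand-in yielding 256x256 multi-view video clips."""

    def __init__(self, num_samples: int = 1024, views: int = 2, frames: int = 4):
        self.num_samples, self.views, self.frames = num_samples, views, frames

    def __len__(self) -> int:
        return self.num_samples

    def __getitem__(self, idx: int) -> torch.Tensor:
        # (views, frames, channels, height, width) at the reported resolution.
        return torch.randn(self.views, self.frames, 3, 256, 256)


model = nn.Conv3d(3, 3, kernel_size=3, padding=1)  # placeholder for 4DM
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)  # reported settings
loader = DataLoader(MultiViewVideoDataset(), batch_size=32, shuffle=True)

step, max_steps = 0, 30_000  # reported training length
while step < max_steps:
    for batch in loader:
        # Fold the view axis into the batch so the placeholder Conv3d accepts
        # the input; the real model consumes the multi-view structure directly.
        b, v, f, c, h, w = batch.shape
        x = batch.view(b * v, c, f, h, w)
        loss = ((model(x) - x) ** 2).mean()  # dummy reconstruction loss
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        step += 1
        if step >= max_steps:
            break
```

The reported wall-clock figure (about 2 days on 16 A100 GPUs) implies distributed data-parallel training; the single-process loop above omits that for brevity.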