Animate3D: Animating Any 3D Model with Multi-view Video Diffusion
Authors: Yanqin Jiang, Chaohui Yu, Chenjie Cao, Fan Wang, Weiming Hu, Jin Gao
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Qualitative and quantitative experiments demonstrate that Animate3D significantly outperforms previous approaches. |
| Researcher Affiliation | Collaboration | Yanqin Jiang (1,2), Chaohui Yu (3,4), Chenjie Cao (3,4), Fan Wang (3,4), Weiming Hu (1,2,5), Jin Gao (1,2). (1) State Key Laboratory of Multimodal Artificial Intelligence Systems (MAIS), CASIA; (2) School of Artificial Intelligence, University of Chinese Academy of Sciences; (3) DAMO Academy, Alibaba Group; (4) Hupan Lab; (5) School of Information Science and Technology, ShanghaiTech University |
| Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks. |
| Open Source Code | Yes | Data, code, and models are open-released. |
| Open Datasets | Yes | To train our MV-VDM, we build a large-scale multi-view video dataset, MV-Video. ... We will release this dataset to further advance the field of 4D generative research. |
| Dataset Splits | Yes | Following the evaluation setting of VBench [23], we use four different random seeds for each object and report the average results. |
| Hardware Specification | Yes | It costs 3 days to train our MV-VDM on 32 80G A800 GPUs, and the optimization for 4D generation takes around 30 minutes on a single A800 GPU per object. |
| Software Dependencies | No | The paper mentions an optimizer (AdamW) and a method (FreeInit), but does not provide specific version numbers for general software dependencies such as programming languages or deep learning frameworks. |
| Experiment Setup | Yes | We use the AdamW optimizer with a learning rate of 4e-4 and a weight decay 0.01, and train the model for 20 epochs with a batch size of 2048. ... Learning rate is 0.0015 initially and decreases linearly to 0.0005 at the end of reconstruction stage. λ1, λ2 and λ3 in Eq. 11 are set to 10.0, 0.01 and 0.5, respectively. |
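
The hyperparameters quoted in the Experiment Setup row can be collected into a small configuration sketch. This is only an illustration and not the authors' code: the class, field, and function names are hypothetical, the learning rate is read as 4e-4, and the number of reconstruction steps is not stated in the excerpt, so `total_steps` is left as a caller-supplied assumption. The `linear_lr` helper shows one plausible reading of "decreases linearly" from 0.0015 to 0.0005.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MVVDMTrainConfig:
    """Values quoted in the Experiment Setup row; field names are hypothetical."""
    optimizer: str = "AdamW"
    learning_rate: float = 4e-4
    weight_decay: float = 0.01
    epochs: int = 20
    batch_size: int = 2048


# Loss weights for Eq. 11 as quoted; the key names are hypothetical.
LOSS_WEIGHTS = {"lambda1": 10.0, "lambda2": 0.01, "lambda3": 0.5}


def linear_lr(step: int, total_steps: int,
              lr_start: float = 0.0015, lr_end: float = 0.0005) -> float:
    """Linearly interpolate the learning rate from lr_start to lr_end.

    total_steps is an assumption: the excerpt does not give the length
    of the reconstruction stage.
    """
    if total_steps <= 1:
        return lr_end
    # Clamp the step so values outside the stage stay at the endpoints.
    clamped = min(max(step, 0), total_steps - 1)
    fraction = clamped / (total_steps - 1)
    return lr_start + fraction * (lr_end - lr_start)


if __name__ == "__main__":
    cfg = MVVDMTrainConfig()
    print(cfg)
    # Print the schedule at the start, middle, and end of a 101-step stage.
    for s in (0, 50, 100):
        print(s, linear_lr(s, total_steps=101))
```

With a hypothetical 101-step stage, the schedule starts at 0.0015, passes through roughly 0.001 at the midpoint, and ends at 0.0005.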