InterDreamer: Zero-Shot Text to 3D Dynamic Human-Object Interaction
Authors: Sirui Xu, Ziyin Wang, Yu-Xiong Wang, Liang-Yan Gui
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We apply InterDreamer to the BEHAVE, OMOMO, and CHAIRS datasets, and our comprehensive experimental analysis demonstrates its capability to generate realistic and coherent interaction sequences that seamlessly align with the text directives. |
| Researcher Affiliation | Academia | Sirui Xu, Ziyin Wang, Yu-Xiong Wang, Liang-Yan Gui. University of Illinois Urbana-Champaign. (Author footnotes: Equal Contribution; Equal Advising.) {siruixu2, ziyin, yxw, lgui}@illinois.edu |
| Pseudocode | No | The paper describes its methodology in narrative text and diagrams (Figure 2) but does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | Our code is not available at this time but will be released in the future. |
| Open Datasets | Yes | We use BEHAVE [7], CHAIRS [47], and OMOMO [66] datasets for quantitative evaluation. |
| Dataset Splits | No | The paper refers to a 'training set' and a 'test set' but does not specify how the data are partitioned, and it does not mention a validation set or validation split. |
| Hardware Specification | Yes | The dynamic model is trained on an NVIDIA A40 GPU for a day. This work used computational resources on NCSA Delta and PTI Jetstream2 through allocations CIS220014, CIS230012, CIS230013, and CIS240311 from the Advanced Cyberinfrastructure Coordination Ecosystem: Services & Support (ACCESS) program, and on TACC Frontera through the National Artificial Intelligence Research Resource (NAIRR) Pilot. |
| Software Dependencies | No | The paper mentions various models and frameworks (e.g., GPT-4, Llama-2, MDM, and, implicitly, PyTorch) but does not provide version numbers for any of these software dependencies. |
| Experiment Setup | Yes | The dynamics model, which includes 2 dynamics blocks as described in the main paper, is trained on the BEHAVE training set [7], with a batch size of 32, a latent dimension of 64, and for 500 epochs. The optimization process is conducted over 300 epochs, utilizing a learning rate of 0.01. |
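
The hyperparameters quoted in the Experiment Setup row can be collected into a small configuration object, which makes the two separate epoch counts easier to tell apart. This is only a sketch: the class and field names below are hypothetical and do not come from the paper or its (unreleased) code. The values are the ones quoted above, and the row does not say which stage "the optimization process" refers to.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class DynamicsTrainConfig:
    """Dynamics-model training settings as quoted in the table (names are hypothetical)."""
    dataset: str = "BEHAVE"          # trained on the BEHAVE training set
    num_dynamics_blocks: int = 2     # "2 dynamics blocks"
    batch_size: int = 32
    latent_dim: int = 64
    train_epochs: int = 500


@dataclass(frozen=True)
class OptimizationConfig:
    """Settings for the separate 'optimization process' (names are hypothetical)."""
    epochs: int = 300
    learning_rate: float = 0.01


if __name__ == "__main__":
    # Print both configurations as plain dictionaries for logging or comparison.
    print(asdict(DynamicsTrainConfig()))
    print(asdict(OptimizationConfig()))
```

Keeping the two stages in separate objects avoids conflating the 500 training epochs with the 300 optimization epochs.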