Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
InterDreamer: Zero-Shot Text to 3D Dynamic Human-Object Interaction
Authors: Sirui Xu, Ziyin Wang, Yu-Xiong Wang, Liang-Yan Gui
NeurIPS 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We apply InterDreamer to the BEHAVE, OMOMO, and CHAIRS datasets, and our comprehensive experimental analysis demonstrates its capability to generate realistic and coherent interaction sequences that seamlessly align with the text directives. |
| Researcher Affiliation | Academia | Sirui Xu, Ziyin Wang, Yu-Xiong Wang, Liang-Yan Gui (Equal Contribution; Equal Advising), University of Illinois Urbana-Champaign |
| Pseudocode | No | The paper describes its methodology in narrative text and diagrams (Figure 2) but does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | Our code is not available at this time but will be released in the future. |
| Open Datasets | Yes | We use BEHAVE [7], CHAIRS [47], and OMOMO [66] datasets for quantitative evaluation. |
| Dataset Splits | No | The paper mentions training on a 'training set' and evaluating on a 'test set' but does not provide explicit split details or mention a 'validation set' for dataset partitioning. |
| Hardware Specification | Yes | The dynamic model is trained on an NVIDIA A40 GPU for a day. This work used computational resources on NCSA Delta and PTI Jetstream2 through allocations CIS220014, CIS230012, CIS230013, and CIS240311 from the Advanced Cyberinfrastructure Coordination Ecosystem: Services & Support (ACCESS) program, and on TACC Frontera through the National Artificial Intelligence Research Resource (NAIRR) Pilot. |
| Software Dependencies | No | The paper mentions various models and frameworks (e.g., GPT-4, Llama-2, MDM, and implicitly PyTorch) but does not provide specific version numbers for these software dependencies. |
| Experiment Setup | Yes | The dynamics model, which includes 2 dynamics blocks as described in the main paper, is trained on the BEHAVE training set [7], with a batch size of 32, a latent dimension of 64, and for 500 epochs. The optimization process is conducted over 300 epochs, utilizing a learning rate of 0.01. |
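The hyperparameters quoted in the Experiment Setup row can be collected into a small configuration sketch. Since the paper's code is unreleased, the class and field names below are hypothetical, not taken from any official implementation; only the numeric values come from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DynamicsTrainConfig:
    """Illustrative config; values quoted from the paper's Experiment Setup."""
    num_dynamics_blocks: int = 2    # dynamics blocks described in the main paper
    batch_size: int = 32            # training batch size on the BEHAVE training set
    latent_dim: int = 64            # latent dimension of the dynamics model
    train_epochs: int = 500         # dynamics-model training epochs
    optimization_epochs: int = 300  # separate optimization process
    learning_rate: float = 0.01     # learning rate for the optimization process


cfg = DynamicsTrainConfig()
print(cfg)
```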