CigTime: Corrective Instruction Generation Through Inverse Motion Editing
Authors: Qihang Fang, Chengcheng Tang, Bugra Tekin, Yanchao Yang
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present both qualitative and quantitative results across a diverse range of applications that largely improve upon baselines. Our approach demonstrates its effectiveness in instructional scenarios, offering text-based guidance to correct and enhance user performance. |
| Researcher Affiliation | Collaboration | Qihang Fang¹, Chengcheng Tang², Bugra Tekin², and Yanchao Yang¹*; ¹The University of Hong Kong, ²Meta Reality Labs. Emails: {qihfang}@gmail.com, {chengcheng.tang,bugratekin}@meta.com, {yanchaoy}@hku.hk |
| Pseudocode | No | The paper describes the methodology using text and equations but does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | We will release the codes and our generated dataset after acceptance. |
| Open Datasets | Yes | Datasets We obtain the source motion sequences from HumanML3D [13], a dataset containing 3D human motions and associated language descriptions. ... To evaluate the generalization ability of our algorithm, we collected 1525 samples from the Fit3D [11] dataset. ... We further evaluate our method and baselines on the KIT dataset. |
| Dataset Splits | Yes | We split HumanML3D following the original setting, and for each motion sequence in HumanML3D, we randomly select one instruction from the corresponding split for editing the sequence. |
| Hardware Specification | Yes | We use a batch size of 512 and train on four NVIDIA Tesla A100 GPUs for eight epochs, which takes approximately 5 hours to complete. |
| Software Dependencies | No | The paper mentions models like 'Llama-3-8B' and 'Adam optimizer' but does not provide specific version numbers for programming languages, libraries (e.g., PyTorch), or other ancillary software dependencies. |
| Experiment Setup | Yes | We fine-tune a pre-trained Llama-3-8B [30] using full-parameter fine-tuning for corrective instruction generation. The model is optimized using the Adam optimizer with an initial learning rate of 10⁻⁵. We use a batch size of 512 and train on four NVIDIA Tesla A100 GPUs for eight epochs, which takes approximately 5 hours to complete. |
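
The Experiment Setup and Hardware rows pin down enough hyperparameters (Llama-3-8B, full-parameter fine-tuning, Adam, initial learning rate 10⁻⁵, batch size 512, eight epochs on four A100s) to sketch the training configuration. The snippet below is a minimal sketch, assuming Hugging Face Transformers and PyTorch; the model identifier, the prompt serialization of motion pairs, and the single-GPU gradient-accumulation arrangement are assumptions, since the authors' code has not yet been released.

```python
# Minimal sketch of the reported fine-tuning setup (Llama-3-8B, full-parameter,
# Adam, lr 1e-5, effective batch 512, 8 epochs). Model loading assumes Hugging
# Face Transformers; the dataset handling below is a placeholder, not the
# authors' released pipeline.
import torch
from torch.utils.data import DataLoader
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "meta-llama/Meta-Llama-3-8B"  # base model cited in the paper
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
tokenizer.pad_token = tokenizer.eos_token  # Llama tokenizers ship without a pad token
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME, torch_dtype=torch.bfloat16)
model.to(device)
model.train()  # full-parameter fine-tuning: every weight stays trainable

# Hypothetical (motion-pair prompt, corrective instruction) examples; the paper
# builds such pairs via inverse motion editing, with its own serialization.
train_pairs = [
    ("<src_motion_tokens> <tgt_motion_tokens>", "Raise your left arm higher."),
]

def collate(batch):
    # Concatenate prompt and target instruction into a causal-LM training example.
    texts = [f"{prompt} {instruction}{tokenizer.eos_token}" for prompt, instruction in batch]
    enc = tokenizer(texts, return_tensors="pt", padding=True, truncation=True, max_length=1024)
    enc["labels"] = enc["input_ids"].clone()
    return enc

# Reported effective batch size is 512 across four A100s; on a single device this
# sketch approximates it with per-device batch 32 and gradient accumulation
# (32 per step x 4 GPUs x 4 accumulation steps = 512 in the multi-GPU setting).
loader = DataLoader(train_pairs, batch_size=32, shuffle=True, collate_fn=collate)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)  # Adam, initial LR 10^-5
accum_steps = 4

for epoch in range(8):  # eight epochs as reported
    for step, batch in enumerate(loader):
        batch = {k: v.to(device) for k, v in batch.items()}
        loss = model(**batch).loss / accum_steps
        loss.backward()
        if (step + 1) % accum_steps == 0:
            optimizer.step()
            optimizer.zero_grad()
```

In the reported four-GPU setting, the same loop would typically run under a data-parallel launcher (e.g., torchrun with DistributedDataParallel); the paper does not specify the parallelism strategy, so that arrangement is also an assumption.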