Automatic Translation of Music-to-Dance for In-Game Characters
Authors: Yinglin Duan, Tianyang Shi, Zhipeng Hu, Zhengxia Zou, Changjie Fan, Yi Yuan, Xi Li
IJCAI 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 4 Experiments 4.1 Dataset and Experimental Setup: We test our method on the music-dance creation platform of a role-playing game named Heaven mobile and also generate both 2D and 3D animations for experiments. We built two datasets for our task. Labeled Dance-Music Dataset: In this dataset, we first recorded 1,101 different dance phrases (≈ 2.3 hours) using motion capture devices (Vicon V16 cameras). ... Experimental results suggest that our method not only generalizes well over various styles of music but also succeeds in choreography for game players. |
| Researcher Affiliation | Collaboration | ¹NetEase Fuxi AI Lab, ²Zhejiang University, ³University of Michigan, Ann Arbor |
| Pseudocode | No | The paper describes its methodology in text and uses mathematical equations, but it does not include formal pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | Yes | Our project, including the large-scale dataset and supplemental materials, is available at https://github.com/FuxiCV/music-to-dance. |
| Open Datasets | Yes | Our project, including the large-scale dataset and supplemental materials, is available at https://github.com/FuxiCV/music-to-dance. Labeled Dance-Music Dataset: In this dataset, we first recorded 1,101 different dance phrases (≈ 2.3 hours) using motion capture devices (Vicon V16 cameras). ... For performance evaluation, we split this dataset into a training set (90%) and a test set (10%). |
| Dataset Splits | No | The paper states "we split this dataset into a training set (90%) and a test set (10%)" but does not mention a separate validation split or how one was used (e.g., for hyperparameter tuning or early stopping); see the split sketch after the table. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU models, CPU types, or memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions software such as librosa, the Adam optimizer, SGD, ResNet50, DCGAN, and SENet, but it does not provide version numbers for any of them; see the version-capture snippet after the table. |
| Experiment Setup | Yes | In the pre-training stage, we use the Adam optimizer [Kingma and Ba, 2014] to train our model with a learning rate of 10⁻⁴ and stop at 200 epochs. The learning rate decay is set to 0.1 per 50 epochs. We set the loss coefficients β₁ = β₂ = 1 and β₃ = 10. In the supervised fine-tuning stage, we train our translator by SGD with a learning rate of 10⁻², momentum 0.9, weight decay 5 × 10⁻⁴, and a max-epoch number of 500. In the co-ascent stage, we set the learning rate to 10⁻⁵, update pseudo labels every 5 epochs, initialize the transition matrix M based on the style of dance phrases (i.e., similar dance moves are allowed to transfer), and further clip the range of M within [0.01, 1] to improve stability. (A hedged sketch of this schedule follows the table.) |
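
Since the paper reports only a 90%/10% train/test split, here is a minimal sketch of one reproducible way to carve a validation subset out of the training portion. The seed, the 10% validation fraction, and the function name `split_dataset` are assumptions for illustration, not the authors' protocol.

```python
import random

def split_dataset(items, test_frac=0.10, val_frac=0.10, seed=0):
    """Shuffle deterministically, hold out test_frac for testing,
    then carve val_frac of the remainder for validation.

    The validation carve-out is an assumption; the paper only
    reports a 90%/10% train/test split.
    """
    rng = random.Random(seed)
    items = list(items)
    rng.shuffle(items)
    n_test = int(len(items) * test_frac)
    test, rest = items[:n_test], items[n_test:]
    n_val = int(len(rest) * val_frac)
    val, train = rest[:n_val], rest[n_val:]
    return train, val, test

# e.g. over the 1,101 recorded dance phrases mentioned in the paper
train, val, test = split_dataset(range(1101), seed=42)
print(len(train), len(val), len(test))  # 892 99 110
```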
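On the missing software versions: one low-effort way to make such a setup reproducible is to record each package's version string at run time. A generic snippet, assuming the packages are installed; the paper itself reports none of these numbers.

```python
import sys
import librosa
import torch

# Capture the interpreter and library versions used for a run,
# since the paper omits them.
print("python ", sys.version.split()[0])
print("librosa", librosa.__version__)
print("torch  ", torch.__version__)
```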
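The reported hyperparameters map directly onto standard PyTorch optimizers. Below is a minimal sketch of the three training stages, assuming PyTorch; the placeholder `model`, the toy transition matrix `M`, and the shortened loops are illustrative assumptions, not the authors' released code.

```python
import torch

# Placeholder standing in for the music-to-dance translator network;
# the real architecture is not reproduced here.
model = torch.nn.Linear(128, 64)

# Stage 1: pre-training -- Adam, lr 1e-4, 200 epochs,
# learning rate decayed by a factor of 0.1 every 50 epochs.
pre_opt = torch.optim.Adam(model.parameters(), lr=1e-4)
pre_sched = torch.optim.lr_scheduler.StepLR(pre_opt, step_size=50, gamma=0.1)
for epoch in range(200):
    # ... forward pass, weighted loss (β1 = β2 = 1, β3 = 10), backward,
    # pre_opt.step() would go here ...
    pre_sched.step()

# Stage 2: supervised fine-tuning -- SGD, lr 1e-2, momentum 0.9,
# weight decay 5e-4, at most 500 epochs.
ft_opt = torch.optim.SGD(model.parameters(), lr=1e-2,
                         momentum=0.9, weight_decay=5e-4)

# Stage 3: co-ascent -- lr 1e-5; pseudo labels refreshed every 5 epochs,
# transition matrix M clipped into [0.01, 1] for stability.
ca_opt = torch.optim.SGD(model.parameters(), lr=1e-5)
M = torch.rand(16, 16)   # placeholder style-transition matrix
for epoch in range(10):  # shortened; the paper does not state this stage's budget
    if epoch % 5 == 0:
        pass             # pseudo-label refresh would go here
    with torch.no_grad():
        M.clamp_(0.01, 1.0)  # keep transition weights in [0.01, 1]
```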