GenéLive! Generating Rhythm Actions in Love Live!
Authors: Atsushi Takada, Daichi Yamazaki, Yudai Yoshida, Nyamkhuu Ganbat, Takayuki Shimotomai, Naoki Hamada, Likun Liu, Taiga Yamamoto, Daisuke Sakurai
AAAI 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this article, we evaluate the generative performance of GenéLive! using production datasets at KLab as well as open datasets for reproducibility, while the model continues to operate in their business. We have included a thorough evaluation of this improved model, employing a supercomputer and user feedback from the application's operation in the gaming business. The benchmarks show that our improved model outperformed the state-of-the-art model known as Dance Dance Convolution (DDC; Donahue, Lipton, and McAuley 2017). For evaluation, we conducted a few experiments, while the business feedback is found in section 6. We took the ablation approach for evaluation, in which the GenéLive! model with all the presented methods applied was compared against models each lacking a single component. Metrics: Following Donahue, Lipton, and McAuley (2017), we use the F1-score as the evaluation metric (a minimal F1 computation sketch follows the table). |
| Researcher Affiliation | Collaboration | 1 KLab Inc., 2 Kyushu University. {takada-at, yamazaki-d, yoshida-yud, ganbat-n, shimotomai-t, hamada-n}@klab.com, {liu.likun.654, yamamoto.taiga.160}@s.kyushu-u.ac.jp, d.sakurai@ieee.org |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code and the model, tuned and trained using a supercomputer, are publicly available. Our PyTorch-based source code and the trained models, found after extensive hyperparameter tuning (over 80,000 GPU hours of Tesla P100 on a supercomputer), are publicly available. All appendices are available on https://github.com/KLab/AAAI-23.6040. |
| Open Datasets | Yes | We acquired songs and charts used in Love Live! School Idol Festival All Stars (in short, Love Live! All Stars) and Utano Princesama Shining Live (Utapri), operated by KLab. Both the songs and charts are provided by multiple artists. In addition, we use openly accessible songs and charts from Fraxtil and In The Groove for the open-source game StepMania, which were also used in the prior work (Donahue, Lipton, and McAuley 2017). See appendix A for details on the datasets. |
| Dataset Splits | Yes | The dataset was split 8:1:1 into training, validation, and test sets, with holdout employed (a holdout-split sketch follows the table). |
| Hardware Specification | Yes | We conducted the experiments using our supercomputer's NVIDIA Tesla P100 GPUs. We employed 64 of these GPUs to run the grid search in parallel. |
| Software Dependencies | No | The implementation is based on PyTorch, and librosa was employed to pre-process the audio data. See the conda.yaml file in the supplementary material for the complete list of software dependencies. The main text does not specify version numbers for PyTorch or librosa, only listing the software names. |
| Experiment Setup | Yes | In each training, we used BCE as the loss function. Model parameters were updated using the Adam optimizer (Kingma and Ba 2014). The cosine annealing scheduler (Loshchilov and Hutter 2017) tuned the learning rate for better convergence. During training, the dropout strategy was employed in both the fully connected layer and the BiLSTM layer. The supercomputer let us conduct a grid search to determine the optimal combination of the following hyperparameters: the learning rate, η_min in the cosine annealing scheduler (Loshchilov and Hutter 2017), the choice of conv-stack, the width and scale of the fuzzy label (Liang, Li, and Ikeda 2019), the dropout rate in the linear layer, the dropout rate in the RNN layer, the number of RNN layers, and the weighting factor in the BCE loss (a training-setup sketch follows the table). |
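
For concreteness, below is a minimal sketch of the F1-score evaluation referenced in the Research Type row. The paper follows Donahue, Lipton, and McAuley (2017), who match predicted note onsets to ground-truth onsets within a small temporal tolerance; the quoted text does not restate the exact tolerance or matching procedure, so the greedy one-to-one matching and the 20 ms window here are assumptions.

```python
def onset_f1(predicted, reference, tolerance=0.02):
    """F1-score for onset detection: greedily match each predicted onset
    time (seconds) to at most one reference onset within +/- tolerance,
    then compute precision/recall over the matches."""
    predicted = sorted(predicted)
    reference = sorted(reference)
    ref_used = [False] * len(reference)
    matched = 0
    j = 0
    for p in predicted:
        # Skip reference onsets that are already too far in the past.
        while j < len(reference) and reference[j] < p - tolerance:
            j += 1
        # Match the first unused reference onset inside the window.
        k = j
        while k < len(reference) and reference[k] <= p + tolerance:
            if not ref_used[k]:
                ref_used[k] = True
                matched += 1
                break
            k += 1
    precision = matched / len(predicted) if predicted else 0.0
    recall = matched / len(reference) if reference else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```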
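
The 8:1:1 holdout split from the Dataset Splits row could be reproduced along these lines. The shuffling, the fixed seed, and splitting at the chart level (rather than the song level) are assumptions, since the quoted text does not specify them.

```python
import random

def holdout_split(charts, seed=0):
    """Shuffle once and split into train/validation/test at 8:1:1."""
    rng = random.Random(seed)
    charts = list(charts)
    rng.shuffle(charts)
    n_train = int(len(charts) * 0.8)
    n_val = int(len(charts) * 0.1)
    return (charts[:n_train],
            charts[n_train:n_train + n_val],
            charts[n_train + n_val:])
```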
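
Finally, a hedged PyTorch sketch of the training setup described in the Experiment Setup row: weighted BCE loss (via `pos_weight`), the Adam optimizer, cosine annealing with a tunable `eta_min`, and dropout in both the linear and BiLSTM layers. All layer sizes, dropout rates, and hyperparameter values are placeholders rather than the paper's grid-searched configuration, and the architecture is only loosely modeled on the conv-stack + BiLSTM design the paper mentions.

```python
import torch
import torch.nn as nn

class NoteOnsetModel(nn.Module):
    """Illustrative conv-stack -> BiLSTM -> linear classifier.
    Sizes are placeholders, not the paper's configuration."""
    def __init__(self, n_mels=80, hidden=128, rnn_layers=2,
                 fc_dropout=0.5, rnn_dropout=0.3):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d((1, 2)),  # pool over the mel axis only
        )
        self.rnn = nn.LSTM(16 * (n_mels // 2), hidden, rnn_layers,
                           batch_first=True, bidirectional=True,
                           dropout=rnn_dropout)  # dropout in the RNN layer
        self.dropout = nn.Dropout(fc_dropout)    # dropout in the linear layer
        self.fc = nn.Linear(2 * hidden, 1)

    def forward(self, x):  # x: (batch, time, n_mels) mel spectrogram
        h = self.conv(x.unsqueeze(1))          # (batch, 16, time, n_mels//2)
        h = h.permute(0, 2, 1, 3).flatten(2)   # (batch, time, features)
        h, _ = self.rnn(h)
        return self.fc(self.dropout(h)).squeeze(-1)  # per-frame logits

model = NoteOnsetModel()
# Weighted BCE: pos_weight counteracts the scarcity of note frames
# (the paper's weighting factor was grid-searched; 4.0 is a placeholder).
criterion = nn.BCEWithLogitsLoss(pos_weight=torch.tensor(4.0))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(
    optimizer, T_max=100, eta_min=1e-5)  # step once per epoch

def train_step(x, y):
    """One training step; y holds 0/1 note targets shaped like the logits."""
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```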