M2Beats: When Motion Meets Beats in Short-form Videos
Authors: Dongxiang Jiang, Yongchang Zhang, Shuai He, Anlong Ming
IJCAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results substantiate the superior performance of our method compared to other existing algorithms in the domain of motion rhythm analysis. We propose M2BNet, a novel approach for extracting motion rhythm from videos by incorporating the MFE block, which demonstrates superior performance compared to previous methodologies. The present study proposes a novel method for enhancing the rhythm of videos. Our experimental findings in the domain of short-form video processing underscore the substantial potential inherent in this approach. |
| Researcher Affiliation | Academia | Dongxiang Jiang, Yongchang Zhang, Shuai He, Anlong Ming, School of Computer Science (National Pilot Software Engineering School), Beijing University of Posts and Telecommunications. {jiangdx, zhangyongchang, hs19951021, mal}@bupt.edu.cn |
| Pseudocode | No | The paper describes its methods verbally and with architectural diagrams, but it does not include any formal pseudocode or algorithm blocks with numbered steps or code-like formatting. |
| Open Source Code | Yes | Our code is available at https://github.com/mRobotit/M2Beats. |
| Open Datasets | Yes | To address these challenges, we present the motion rhythm dataset AIST-M2B, which is annotated with meticulously curated motion rhythm labels derived from the profound correlation between motion and music in professional dance. Building upon the AIST++ dataset, which comprises hundreds of dance motions and music compositions, we present AIST-M2B. |
| Dataset Splits | Yes | In order to assess the performance of M2BNet on AIST-M2B, a random selection is made where 90% of the videos are assigned as the training set, while the remaining 10% serves as the validation set. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU models, or memory specifications used for running the experiments. It only mentions the software and models used. |
| Software Dependencies | No | The paper mentions several software tools and libraries like YOLOX, ResNet50, LibROSA, and madmom, along with citations to their respective papers. However, it does not provide specific version numbers for these software components or any other dependencies, which are necessary for reproducible setup. |
| Experiment Setup | Yes | The training process encompasses 200 epochs with a learning rate of 0.001. The model that exhibits superior performance on the validation set is chosen for testing purposes, with a confidence threshold set to 0.9. M2BNet consists of 10 MFE blocks; the TCN has a kernel size of (9, 1) in all MFE blocks. In the 5th and 8th MFE blocks, the TCN has a stride of (2, 1), while the others have a stride of (1, 1). We set w to 30 based on the distribution of rhythmic and non-rhythmic labels in the AIST-M2B dataset. In this study, we set the time window to 0.07 seconds. |
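The 90%/10% split in the Dataset Splits row is straightforward to reproduce. Below is a minimal sketch, assuming the videos are identified by a flat list of IDs; the function name, fixed seed, and ID format are illustrative assumptions, not details from the paper.

```python
import random

def split_aist_m2b(video_ids, train_ratio=0.9, seed=0):
    """Randomly assign 90% of videos to training and the rest to
    validation, as described in the paper. The fixed seed is an
    assumption for reproducibility; the paper does not specify one."""
    rng = random.Random(seed)
    ids = list(video_ids)
    rng.shuffle(ids)
    cut = int(len(ids) * train_ratio)
    return ids[:cut], ids[cut:]

# Illustrative usage with placeholder IDs.
train_ids, val_ids = split_aist_m2b([f"seq_{i:04d}" for i in range(100)])
```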
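The Software Dependencies row notes that the paper relies on LibROSA and madmom for audio analysis without pinning versions. For reference, beat extraction with LibROSA's documented API looks like the sketch below; the file path is a placeholder, and this is not necessarily the exact call the authors use.

```python
import librosa

# Load audio and run LibROSA's standard beat tracker.
y, sr = librosa.load("clip_audio.wav")                   # placeholder path
tempo, beat_frames = librosa.beat.beat_track(y=y, sr=sr)
beat_times = librosa.frames_to_time(beat_frames, sr=sr)  # beat times in seconds
```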
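The Experiment Setup row pins down the TCN kernel and stride schedule across the 10 MFE blocks. A minimal PyTorch sketch of just that schedule follows; the MFE block internals and channel widths are not given in the excerpt, so the uniform 64-channel width and the plain Conv2d stand-in are assumptions.

```python
import torch.nn as nn

def tcn_conv(in_ch, out_ch, block_idx):
    """TCN convolution for MFE block `block_idx` (1-indexed): kernel (9, 1)
    everywhere; blocks 5 and 8 downsample time with stride (2, 1)."""
    stride = (2, 1) if block_idx in (5, 8) else (1, 1)
    return nn.Conv2d(in_ch, out_ch, kernel_size=(9, 1),
                     stride=stride, padding=(4, 0))

# 10 blocks total; the 64-channel width is an assumption, not from the paper.
tcn_stack = nn.Sequential(*[tcn_conv(64, 64, i) for i in range(1, 11)])
```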
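The 0.07-second time window quoted in the Experiment Setup row matches the conventional ±70 ms tolerance used in beat-tracking evaluation for counting a predicted beat as a hit. Assuming that is its role here, matching predicted rhythm points to reference beats would look like the sketch below; the function and variable names are illustrative.

```python
def match_beats(pred_times, ref_times, window=0.07):
    """Count predictions landing within +/- `window` seconds of a
    reference beat; each reference beat is matched at most once."""
    used = set()
    hits = 0
    for p in pred_times:
        for j, r in enumerate(ref_times):
            if j not in used and abs(p - r) <= window:
                used.add(j)
                hits += 1
                break
    return hits

# e.g. match_beats([0.51, 1.02], [0.50, 1.00, 1.50]) -> 2
```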