Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

MEGADance: Mixture-of-Experts Architecture for Genre-Aware 3D Dance Generation

Authors: Kaixing Yang, Xulong Tang, Ziqiao Peng, Yuxuan Hu, Jun He, Hongyan Liu

NeurIPS 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on the Fine Dance and AIST++ dataset demonstrate the state-of-the-art performance of MEGADance both qualitatively and quantitatively. Code is available at https://github.com/XulongT/MEGADance.
Researcher Affiliation | Collaboration | 1 Renmin University of China; 2 Tsinghua University; 3 Malou Tech Inc
Pseudocode | No | The paper describes the methodology in prose and mathematical equations but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | Code is available at https://github.com/XulongT/MEGADance.
Open Datasets | Yes | 1) Fine Dance. Fine Dance [2] is the largest public dataset for 3D music-to-dance generation... 2) AIST++. AIST++ [1] is a widely used benchmark comprising...
Dataset Splits | Yes | Following [8], we evaluate on test-set music clips, generating 1024-frame (34.13s) dance sequences. Following [1], we use test-set music clips to generate 1200-frame (20.00s) sequences. For data construction, we augment the training set using a sliding window approach with a window size of 240 and a stride of 16.
Hardware Specification | Yes | All "Run Times" are conducted on an RTX 3090 GPU with an Intel Xeon Gold 5218 CPU.
Software Dependencies | No | We define each music feature m_t as a 35-dim vector [8] extracted by Librosa [36]... The paper mentions using Librosa and the Adam optimizer, but does not provide specific version numbers for these or other software libraries.
Experiment Setup | Yes | The model is trained on 8-second SMPL 6D rotation sequences sampled at 30fps... The codebook size is 4375, with L = [7, 5, 5, 5, 5], and the feature dimension is set to 512. For reconstruction, we use both SMPL-parameter loss L_smpl and joint-position loss L_joint, with velocity and acceleration terms weighted by α1 = 0.5 and α2 = 0.25, respectively. The model is trained for 200 epochs using the Adam optimizer, with exponential decay rates of 0.5 and 0.99 for the first and second moment estimates. A fixed learning rate is used with a batch size of 32. In MEGADance, the Music Encoder consists of L = 6 processing layers. The Mamba block is configured with a model dimension of 512, state size of 16, convolution kernel size of 4, and expansion factor of 2. The Transformer block uses a hidden size of 512, 8 attention heads, a feedforward dimension of 2048, and a dropout rate of 0.25. For Slide Window Attention, we set the autoregressive step to 22 and the sliding window step to 8 to construct the attention matrix. The model is optimized using Adam with exponential decay rates of 0.9 and 0.99 for the first and second moment estimates, respectively, trained for 80 epochs with a fixed learning rate and a batch size of 64.
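
Several of the numbers quoted in the Dataset Splits and Experiment Setup rows can be cross-checked with simple arithmetic. The sketch below is illustrative only and is not taken from the MEGADance repository; the helper names (codebook_size, window_starts, duration_seconds) are hypothetical. It assumes that the codebook size is the product of the per-dimension levels L = [7, 5, 5, 5, 5], which is consistent with the quoted value of 4375 but is not stated explicitly in the excerpt, and that sliding-window augmentation starts a new window every 16 frames. The 60 fps figure for the 1200-frame sequences is inferred from the quoted 20.00 s duration, not quoted from the paper.

```python
from math import prod


def codebook_size(levels):
    """Number of codes if each quantized dimension takes `levels[i]` values.

    Assumption: the codebook is the Cartesian product of the levels.
    """
    return prod(levels)


def window_starts(num_frames, window=240, stride=16):
    """Start indices of every full window that fits in a sequence.

    Assumption: windows are dropped when fewer than `window` frames remain.
    """
    if num_frames < window:
        return []
    return list(range(0, num_frames - window + 1, stride))


def duration_seconds(num_frames, fps):
    """Clip length in seconds for a given frame count and frame rate."""
    return num_frames / fps


if __name__ == "__main__":
    # 7 * 5 * 5 * 5 * 5 matches the quoted codebook size.
    print(codebook_size([7, 5, 5, 5, 5]))          # 4375

    # 8-second clips at 30 fps give the quoted window size of 240 frames.
    print(8 * 30)                                   # 240

    # A hypothetical 272-frame training sequence yields three windows.
    print(window_starts(272))                       # [0, 16, 32]

    # 1024 frames at 30 fps is the quoted 34.13 s.
    print(round(duration_seconds(1024, 30), 2))     # 34.13

    # 1200 frames lasting 20.00 s implies 60 fps (inferred, not quoted).
    print(duration_seconds(1200, 60))               # 20.0
```

A check of this kind only confirms that the reported values are internally consistent; it says nothing about how the authors' code actually builds the codebook or the training windows.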