Music Style Transfer with Time-Varying Inversion of Diffusion Models

Authors: Sifei Li, Yuxin Zhang, Fan Tang, Chongyang Ma, Weiming Dong, Changsheng Xu

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experimental results demonstrate that our method can transfer the style of specific instruments, as well as incorporate natural sounds to compose melodies. Samples and source code are available at https://lsfhuihuiff.github.io/Music TI/. We conducted qualitative evaluation, quantitative evaluation and ablation study to demonstrate the effectiveness of our method, which performs well in both content preservation and style fit."
Researcher Affiliation | Collaboration | Sifei Li (1,2), Yuxin Zhang (1,2), Fan Tang (3), Chongyang Ma (4), Weiming Dong (1,2)*, Changsheng Xu (1,2). Affiliations: 1: MAIS, Institute of Automation, Chinese Academy of Sciences; 2: School of Artificial Intelligence, University of Chinese Academy of Sciences; 3: Institute of Computing Technology, Chinese Academy of Sciences; 4: Kuaishou Technology
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | "Samples and source code are available at https://lsfhuihuiff.github.io/Music TI/."
Open Datasets | Yes | "We collected a small-scale dataset from a website (https://pixabay.com) where all the content is free for use."
Dataset Splits | No | The paper reports the total number of clips and their style/content categories but does not specify train, validation, or test splits.
Hardware Specification | Yes | "The training process on each style takes approximately 30 minutes using an NVIDIA GeForce RTX 3090 with a batch size of 1, less than the more than 60 minutes required for TI."
Software Dependencies | No | The paper mentions software components and models such as Riffusion, LDMs, CLIP, DDIM, VAE, and Griffin-Lim, but does not provide version numbers for these dependencies.
Experiment Setup | Yes | "We use the default hyperparameters of LDMs and set a base learning rate of 0.001. ... During inference, our approach employs two hyperparameters: strength and scale. ... We achieved the best results when strength ranged from 0.6 to 0.7 and the scale ranged from 3.0 to 5.0."
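The strength and scale hyperparameters quoted above follow the usual img2img diffusion convention: strength controls how far along the diffusion trajectory the input mel-spectrogram is re-noised, and scale is the classifier-free guidance weight. A minimal sketch of that convention, assuming a Riffusion/LDM-style pipeline (function names here are illustrative, not taken from the paper's code):

```python
def start_step(strength: float, num_inference_steps: int = 50) -> int:
    """Index of the first denoising step actually run in an img2img pass.

    With strength s in [0, 1], the input (here, a mel-spectrogram) is
    noised to fraction s of the diffusion trajectory, and only the final
    s * num_inference_steps steps are denoised: strength = 1.0 discards
    the input entirely, strength = 0.0 returns it unchanged.
    """
    init = min(int(num_inference_steps * strength), num_inference_steps)
    return num_inference_steps - init


def guided_noise(eps_uncond: float, eps_cond: float, scale: float) -> float:
    """Classifier-free guidance: push the conditional noise prediction
    away from the unconditional one by the guidance weight 'scale'."""
    return eps_uncond + scale * (eps_cond - eps_uncond)
```

Under this reading, the paper's best range (strength 0.6 to 0.7 over a 50-step schedule) re-denoises roughly the last 30 to 35 steps, consistent with the trade-off the report notes: higher strength gives stronger stylization at the cost of content preservation, while scale 3.0 to 5.0 weights adherence to the style condition.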