Generative Fractional Diffusion Models

Authors: Gabriel Nobis, Maximilian Springenberg, Marco Aversa, Michael Detzel, Rembert Daems, Roderick Murray-Smith, Shinichi Nakajima, Sebastian Lapuschkin, Stefano Ermon, Tolga Birdal, Manfred Opper, Christoph Knochenhauer, Luis Oala, Wojciech Samek

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Our evaluations on real image datasets demonstrate that GFDM achieves greater pixel-wise diversity and enhanced image quality, as indicated by a lower FID, offering a promising alternative to traditional diffusion models." "Our experimental evaluation validates our contributions, demonstrating the gains of correlated noise with long-term memory, approximated by a combination of a finite number of Markov processes, where the number of processes further controls the diversity."
Researcher Affiliation | Collaboration | Gabriel Nobis (Fraunhofer HHI); Maximilian Springenberg (Fraunhofer HHI); Marco Aversa (Dotphoton); Michael Detzel (Fraunhofer HHI); Rembert Daems (Ghent University, Flanders Make, MIRO); Roderick Murray-Smith (University of Glasgow); Shinichi Nakajima (BIFOLD, TU Berlin, RIKEN AIP); Sebastian Lapuschkin (Fraunhofer HHI); Stefano Ermon (Stanford University); Tolga Birdal (Imperial College London); Manfred Opper (TU Berlin, University of Potsdam, University of Birmingham); Christoph Knochenhauer (Technical University of Munich); Luis Oala (Dotphoton); Wojciech Samek (Fraunhofer HHI, TU Berlin, BIFOLD)
Pseudocode | No | The paper contains mathematical formulations and descriptions of processes, but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks.
Open Source Code | Yes | "The implementation of our framework is available at https://github.com/GabrielNobis/gfdm."
Open Datasets | Yes | "We conduct experiments on MNIST and CIFAR10 to evaluate the ability of GFDM to generate real images." (A loading sketch follows the table.)
Dataset Splits | No | The paper uses the MNIST and CIFAR10 datasets for its experiments but does not explicitly provide the train/validation/test splits (e.g., percentages or counts) needed for reproduction.
Hardware Specification | Yes | "For all MNIST training runs we used one A100 GPU per run, taking approximately 17 hours." "All computations have been carried out on a GPU Tesla V100 with 32 GB RAM."
Software Dependencies | No | "We used for all experiments a conditional U-Net [70] architecture and the Adam optimizer [71] with PyTorch's One Cycle learning rate scheduler [72]." (This text names software but lacks version numbers for it and for other key dependencies such as Python or CUDA; an illustrative optimizer/scheduler sketch follows the table.)
Experiment Setup | Yes | "We used an attention resolution of [4, 2], 3 resnet blocks and a channel multiplication of [1, 2, 2, 2, 2] and trained with a maximal learning rate of 1e-4 for 50k iterations and a batch size of 1024." "For the experiments without EMA, we used the same setup as with MNIST, but trained the models in parallel on two A100 GPUs for 300k iterations with an effective batch size of 1024." "When training with EMA, we followed the setup of Song et al. [16], using an EMA decay of 0.9999 for all FVP dynamics and an EMA decay of 0.999 for all FVE dynamics. In contrast to Song et al. [16], we used PyTorch's One Cycle learning rate scheduler with a maximal learning rate of 2e-4 and trained for only 1M iterations instead of the 1.3M iterations in Song et al. [16]." (An illustrative EMA sketch follows the table.)
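
For reference, a minimal sketch of how the MNIST and CIFAR10 datasets cited in the Open Datasets row could be loaded with torchvision. The transform and data root are illustrative assumptions, not values reported in the paper; only the batch size of 1024 comes from the quoted setup.

```python
import torch
from torchvision import datasets, transforms

# Both datasets ship with a canonical train/test split selected via the
# `train` flag, which may explain why the paper reports no explicit split
# percentages (see the Dataset Splits row).
transform = transforms.ToTensor()  # preprocessing is an assumption; the paper does not state it

mnist_train = datasets.MNIST("data", train=True, download=True, transform=transform)
cifar_train = datasets.CIFAR10("data", train=True, download=True, transform=transform)

# Batch size 1024 matches the MNIST setup quoted in the Experiment Setup row.
loader = torch.utils.data.DataLoader(mnist_train, batch_size=1024, shuffle=True)
```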
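The Software Dependencies row names Adam and PyTorch's One Cycle scheduler without versions. Below is a minimal sketch of that combination, assuming a placeholder model in place of the paper's conditional U-Net and using the maximal learning rate of 1e-4 over 50k iterations quoted for the MNIST runs; the loss is a stand-in, not the authors' training objective.

```python
import torch

# Placeholder model; the paper uses a conditional U-Net [70].
model = torch.nn.Linear(10, 10)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

# OneCycleLR ramps the learning rate up to max_lr and anneals it back down;
# total_steps is set to the 50k MNIST iterations quoted in the table.
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer, max_lr=1e-4, total_steps=50_000
)

for step in range(50_000):
    optimizer.zero_grad()
    loss = model(torch.randn(32, 10)).pow(2).mean()  # stand-in for the score-matching loss
    loss.backward()
    optimizer.step()
    scheduler.step()  # one scheduler step per optimizer step
```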
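The EMA decays quoted in the Experiment Setup row (0.9999 for FVP dynamics, 0.999 for FVE dynamics) describe an exponential moving average over model parameters. The sketch below is an illustrative implementation of that technique, not the authors' code; the `EMA` class and its interface are assumptions.

```python
import copy
import torch

class EMA:
    """Exponential moving average of a model's parameters (illustrative)."""

    def __init__(self, model: torch.nn.Module, decay: float = 0.9999):
        self.decay = decay
        # Frozen copy of the model that tracks the averaged weights.
        self.shadow = copy.deepcopy(model).eval()
        for p in self.shadow.parameters():
            p.requires_grad_(False)

    @torch.no_grad()
    def update(self, model: torch.nn.Module):
        # shadow <- decay * shadow + (1 - decay) * current weights
        for s, p in zip(self.shadow.parameters(), model.parameters()):
            s.mul_(self.decay).add_(p, alpha=1.0 - self.decay)

model = torch.nn.Linear(10, 10)         # placeholder for the conditional U-Net
ema = EMA(model, decay=0.9999)          # 0.999 for the FVE dynamics per the table
# After each optimizer step: ema.update(model); sample from ema.shadow.
```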