Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Continual Unsupervised Generative Modelling via Online Optimal Transport

Authors: Fei Ye, Adrian G. Bors, Kun Zhang

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experimental results show that the proposed approach achieves state-of-the-art performance in both supervised and unsupervised learning. Table 1: Evaluation of the image generation performance using FID for class-incremental learning. Table 2: Average classification accuracy on continual learning benchmarks, considering 10 runs for various models. Figure 3: Ablation results for SDDM."
Researcher Affiliation | Academia | "1School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu; 2Department of Computer Science, University of York, York YO10 5GH, UK; 3MBZUAI, Abu Dhabi, UAE; 4Carnegie Mellon University, Pittsburgh, PA, USA"
Pseudocode | Yes | "The pseudocode used for implementing the SDDM memory system with the dynamic model mechanism is provided in Algorithm 2 in Appendix A of SM1."
Open Source Code | Yes | "Code: https://github.com/dtuzi123/Dual Memory System"
Open Datasets | Yes | "We consider the class-incremental learning of the Split MNIST, Split Fashion, Split SVHN and Split CIFAR10, where each learning task consists of data from 2 consecutive classes of the original datasets MNIST, Fashion, SVHN, CIFAR10, as in (Aljundi, Kelchtermans, and Tuytelaars 2019), using a memory buffer of 2,000 data samples. [...] Methods / Resolution / CelebA-HQ / CACD / FFHQ"
Dataset Splits | Yes | "We consider the class-incremental learning of the Split MNIST, Split Fashion, Split SVHN and Split CIFAR10, where each learning task consists of data from 2 consecutive classes of the original datasets MNIST, Fashion, SVHN, CIFAR10, as in (Aljundi, Kelchtermans, and Tuytelaars 2019), using a memory buffer of 2,000 data samples. The number of training epochs for each training session (time) is 6 for all models, and the FID score is calculated on 5,000 testing samples after the whole training process is completed. [...] We follow the experiment setting from the standard benchmark (Buzzega et al. 2020) in which the maximum memory size is 500 for the Split CIFAR10, Split Tiny ImageNet (Split TI) and P-MNIST."
Hardware Specification | No | "No specific hardware details such as GPU/CPU models or processor types are mentioned in the paper."
Software Dependencies | No | "The paper mentions using the Denoising Diffusion Probabilistic Model (DDPM) but does not specify software library names with version numbers for implementation."
Experiment Setup | Yes | "The number of training epochs for each training session (time) is 6 for all models, and the FID score is calculated on 5,000 testing samples after the whole training process is completed. The final hyperparameter λ for Split MNIST, Split Fashion, Split SVHN and Split CIFAR10 is 44, 44, 43 and 44, respectively. [...] The hyperparameter λd from Eq. (13) is 32, 33, 32 and 33, for Split MNIST, Split Fashion, Split SVHN and Split CIFAR10, respectively."
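The class-incremental protocol quoted above (each task holds 2 consecutive classes of the original dataset, with a fixed memory buffer of 2,000 samples) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function and class names (`make_task_splits`, `ReservoirBuffer`) are hypothetical, the labels are synthetic stand-ins for CIFAR10, and reservoir sampling is assumed as a generic buffer-maintenance strategy since the paper excerpt does not specify one.

```python
import random

def make_task_splits(labels, classes_per_task=2):
    """Group sample indices into tasks of consecutive class labels
    (e.g. CIFAR10's 10 classes -> 5 tasks of 2 classes each)."""
    num_classes = max(labels) + 1
    tasks = []
    for start in range(0, num_classes, classes_per_task):
        task_classes = set(range(start, start + classes_per_task))
        tasks.append([i for i, y in enumerate(labels) if y in task_classes])
    return tasks

class ReservoirBuffer:
    """Fixed-capacity memory buffer filled by reservoir sampling,
    so every sample seen so far has equal probability of being kept."""
    def __init__(self, capacity=2000, seed=0):
        self.capacity = capacity
        self.items = []
        self.seen = 0
        self.rng = random.Random(seed)

    def add(self, item):
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append(item)
        else:
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.items[j] = item

# Toy labels standing in for CIFAR10: 50,000 samples over 10 classes.
labels = [i % 10 for i in range(50_000)]
tasks = make_task_splits(labels, classes_per_task=2)
buffer = ReservoirBuffer(capacity=2000)
for task in tasks:          # stream the tasks sequentially
    for idx in task:
        buffer.add(idx)
print(len(tasks), len(buffer.items))  # 5 tasks, buffer capped at 2000
```

Streaming the tasks one after another mirrors the continual-learning setting: the model only ever sees the current task's data plus whatever the bounded buffer has retained from earlier tasks.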