Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Autoencoder-Based Hybrid Replay for Class-Incremental Learning
Authors: Milad Khademi Nori, Il Min Kim, Guanghui Wang
ICML 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We provide comprehensive experiments to demonstrate the strong performance of AHR: we conduct our experiments across five benchmarks and ten baselines to showcase the effectiveness of AHR utilizing HAE and RFA while operating with the same memory and compute budgets. ... Table 2: Empirical evaluation of AHR against a suite of CIL baselines. Accuracies and the SEMs. |
| Researcher Affiliation | Academia | 1Department of Computer Science, Toronto Metropolitan University, Toronto, Ontario, Canada 2Electrical and Computer Engineering, Queen's University, Kingston, Ontario, Canada. Correspondence to: Milad Khademi Nori <EMAIL>. |
| Pseudocode | Yes | Algorithm 1: Autoencoder-Based Hybrid Replay; Algorithm 2: CCE Placement; Algorithm 3: HAE Training; Algorithm 4: Memory Population |
| Open Source Code | Yes | Implementation is available at github.com/miladkhademinori/autoencoderhybrid-replay-cil. |
| Open Datasets | Yes | We have MNIST(5/2) (LeCun et al., 2010), Balanced SVHN(5/2) (Netzer et al., 2011), CIFAR-10(5/2) (Krizhevsky et al., 2009), CIFAR-100(10/10) (Krizhevsky et al., 2009), and miniImageNet(20/5) (Vinyals et al., 2016) benchmarks. |
| Dataset Splits | Yes | The series of tasks for CIL are constructed according to (Masana et al., 2020; Ven et al., 2021; Zając et al., 2023), where the popular image classification datasets are split up such that each task presents data pertaining to a subset of classes, in a non-overlapping manner. For naming benchmarks, we follow (Masana et al., 2020), where dataset D is divided into T tasks with C classes for each task. Hence, a benchmark is named as D(T/C). |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) are provided in the paper. |
| Software Dependencies | No | The paper mentions using 'Adam (Kingma & Ba, 2014) as the optimizer' and 'ResNet-32' for network architecture, but does not provide specific version numbers for software libraries or frameworks (e.g., Python, PyTorch, TensorFlow, CUDA). |
| Experiment Setup | No | The main text states: 'The learning rates, batch sizes, and strategy-dependent hyperparameters are detailed in Appendix B in the supplementary material.' However, specific concrete values for these hyperparameters are not provided in the main text of the paper. |
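
The D(T/C) naming convention quoted in the Dataset Splits row can be illustrated with a minimal sketch. This is not the authors' code: the helper names `split_classes_into_tasks` and `benchmark_name` are hypothetical, and assigning classes to tasks in sequential label order is an assumption (the paper's actual class ordering is not stated in the excerpt above).

```python
from typing import List, Sequence


def split_classes_into_tasks(
    class_ids: Sequence[int], num_tasks: int, classes_per_task: int
) -> List[List[int]]:
    """Partition class labels into non-overlapping tasks.

    Hypothetical helper: classes are assigned in the order given,
    which is an assumption made for illustration only.
    """
    needed = num_tasks * classes_per_task
    if needed > len(class_ids):
        raise ValueError(
            f"need {needed} classes for {num_tasks} tasks of "
            f"{classes_per_task}, but only {len(class_ids)} are available"
        )
    return [
        list(class_ids[t * classes_per_task:(t + 1) * classes_per_task])
        for t in range(num_tasks)
    ]


def benchmark_name(dataset: str, num_tasks: int, classes_per_task: int) -> str:
    """Format a benchmark name as D(T/C): dataset D, T tasks, C classes per task."""
    return f"{dataset}({num_tasks}/{classes_per_task})"


if __name__ == "__main__":
    # CIFAR-10(5/2): 10 classes split into 5 tasks of 2 classes each.
    tasks = split_classes_into_tasks(range(10), num_tasks=5, classes_per_task=2)
    print(benchmark_name("CIFAR-10", 5, 2), tasks)
```

Under this reading, CIFAR-100(10/10) would yield ten tasks of ten classes each, and miniImageNet(20/5) twenty tasks of five classes each.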