Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. [K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026].

Continuous Subspace Optimization for Continual Learning

Authors: Quan Cheng, Yuanyu Wan, Lingyu Wu, Chenping Hou, Lijun Zhang

NeurIPS 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experiments on multiple datasets demonstrate that CoSO significantly outperforms state-of-the-art methods, especially in challenging scenarios with long task sequences. Experimental results on CIFAR100, ImageNet-R, and DomainNet show that CoSO consistently outperforms state-of-the-art methods by a significant margin across diverse continual learning settings, particularly in challenging scenarios involving long task sequences.
Researcher Affiliation	Academia	(1) National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China; (2) School of Artificial Intelligence, Nanjing University, Nanjing, China; (3) School of Software Technology, Zhejiang University, Ningbo, China; (4) Hangzhou High-Tech Zone (Binjiang) Institute of Blockchain and Data Security, Hangzhou, China; (5) College of Science, National University of Defense Technology, Changsha, China
Pseudocode	Yes	Appendix A, "CoSO Algorithm": We present the detailed procedure in Algorithm 1.
Open Source Code	Yes	Does the paper provide open access to the data and code, with sufficient instructions to faithfully reproduce the main experimental results, as described in supplemental material? Answer: [Yes] Justification: Please refer to the supplemental materials.
Open Datasets	Yes	Following previous works [Wang et al., 2022b, Liang and Li, 2024], we evaluate CoSO on three widely-used continual learning benchmarks: ImageNet-R [Hendrycks et al., 2021], CIFAR100 [Krizhevsky, 2009], and DomainNet [Peng et al., 2019].
Dataset Splits	Yes	Similar to existing works [Smith et al., 2023, Liang and Li, 2024, Wu et al., 2025], we create three different splits of ImageNet-R: 5 tasks with 40 classes per task, 10 tasks with 20 classes per task, and 20 tasks with 10 classes per task. For CIFAR100, we divide it into 10 tasks, each containing 10 classes. DomainNet consists of 345 classes across six distinct domains and is split into 5 tasks, with 69 classes per task.
Hardware Specification	Yes	All experiments were conducted on NVIDIA A6000 GPUs with 48GB memory using PyTorch 2.5.1.
Software Dependencies	Yes	All experiments were conducted on NVIDIA A6000 GPUs with 48GB memory using PyTorch 2.5.1.
Experiment Setup	Yes	The optimization is performed using the Adam [Kingma, 2014] optimizer with β1 = 0.9 and β2 = 0.999. The training epochs vary across datasets: 40 epochs for ImageNet-R, 20 epochs for CIFAR100, and 5 epochs for DomainNet. We maintain a consistent batch size of 128 across all experiments. We present the detailed hyperparameter settings of CoSO in Table 4. For all datasets, we employ minimal data augmentation, consisting of random resized cropping to 224 × 224 pixels and random horizontal flipping during training, without any additional augmentation techniques. To prevent overfitting, we followed VPT-NSP2 [Lu et al., 2024], setting the temperature parameter in the cross-entropy loss to 3 for all datasets.
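
The task splits quoted under Dataset Splits reduce to partitioning a benchmark's class indices into equally sized groups. The following is a minimal sketch of that arithmetic, not the authors' code: the helper name `split_classes` is hypothetical, the 200-class total for ImageNet-R is inferred from the quoted 5 × 40 split, and contiguous class ordering is an assumption (the quoted text does not say whether class order is shuffled).

```python
def split_classes(num_classes, num_tasks):
    """Partition class indices 0..num_classes-1 into equal, contiguous tasks.

    Hypothetical helper: contiguous ordering is an assumption; the quoted
    text does not specify how classes are ordered before splitting.
    """
    if num_classes % num_tasks != 0:
        raise ValueError("num_classes must be divisible by num_tasks")
    per_task = num_classes // num_tasks
    return [list(range(t * per_task, (t + 1) * per_task)) for t in range(num_tasks)]


# (total classes, number of tasks) for each split quoted in the table.
# The ImageNet-R total of 200 is inferred from 5 tasks x 40 classes.
SPLITS = {
    "ImageNet-R, 5 tasks": (200, 5),
    "ImageNet-R, 10 tasks": (200, 10),
    "ImageNet-R, 20 tasks": (200, 20),
    "CIFAR100, 10 tasks": (100, 10),
    "DomainNet, 5 tasks": (345, 5),
}

for name, (num_classes, num_tasks) in SPLITS.items():
    tasks = split_classes(num_classes, num_tasks)
    print(f"{name}: {len(tasks)} tasks of {len(tasks[0])} classes")
```

Each task's class list can then be used to filter the training and test sets for that step of the sequence.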
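
The Experiment Setup entry mentions a temperature of 3 in the cross-entropy loss. One common way to apply such a temperature is to divide the logits by it before the softmax; whether the paper (or VPT-NSP2) uses exactly this form is an assumption here, and the function name below is hypothetical. The sketch uses only the standard library so the arithmetic is easy to inspect.

```python
import math


def cross_entropy_with_temperature(logits, target, temperature=3.0):
    """Cross-entropy of softmax(logits / temperature) against a class index.

    Assumption: the temperature divides the logits. The quoted text only
    states that the temperature is set to 3, not how it is applied.
    """
    scaled = [z / temperature for z in logits]
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(scaled)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in scaled))
    return log_sum_exp - scaled[target]


logits = [3.0, 0.0, 0.0]
print(cross_entropy_with_temperature(logits, target=0, temperature=1.0))
print(cross_entropy_with_temperature(logits, target=0, temperature=3.0))
```

Under this form, a temperature above 1 flattens the predicted distribution, so a confidently correct prediction still incurs a noticeable loss, which is consistent with the stated goal of limiting overfitting.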