Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Concept-Driven Continual Learning
Authors: Sin-Han Yang, Tuomas Oikarinen, Tsui-Wei Weng
TMLR 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our methods are designed to enhance interpretability, providing transparency and control over the continual training process. While our primary focus is to provide a new framework to design continual learning algorithms based on interpretability instead of improving performance, we observe that our methods often surpass existing ones: IG-CL employs interpretability tools to guide neural networks, showing an improvement of up to 1.4% in average incremental accuracy over existing methods; IN2, inspired by the Concept Bottleneck Model, adeptly adjusts concept units for both new and existing tasks, reducing average incremental forgetting by up to 9.1%. Both our frameworks demonstrate superior performance compared to exemplar-free methods, are competitive with exemplar-based methods, and can further improve their performance by up to 18% when combined with exemplar-based strategies. Additionally, IG-CL and IN2 are memory-efficient as they do not require extra memory space for storing data from previous tasks. These advancements mark a promising new direction in continual learning through enhanced interpretability. ... To evaluate our methods, we perform experiments on two datasets: CIFAR-100 (Krizhevsky et al., 2009) and Tiny ImageNet (Le & Yang, 2015). Experiments on CIFAR-10 (Krizhevsky et al., 2009) and CUB-200 (Wah et al., 2011) are discussed in Appendix B and C. We consider the T = 5, 10, 20 task scenarios in the class-incremental setting. We use ResNet18 (He et al., 2016) as the experiment model. Experiment, dataset, result reporting and hyperparameter selection details are in Appendix D.1. |
| Researcher Affiliation | Academia | Sin-Han Yang EMAIL National Taiwan University Tuomas Oikarinen EMAIL UC San Diego Tsui-Wei Weng EMAIL UC San Diego |
| Pseudocode | Yes | IG-CL's full algorithm is summarized in Algorithm 1 in Appendix C.2. ... Algorithm 1 IG-CL: Freeze the subnetworks of the concept units. Require: dataset D; regularization coefficient µ; connection threshold τ; regularization factor λ; neural network parameters θ. 1: for t ← 1,...,T do 2: if t = 1 then 3: train θ¹ on D¹ by solving min over θ¹ of L(θ¹; D¹) + µ Σ_{l=1}^{L} ‖W¹_l‖_{2,1} 4: else 5: train θᵗ on Dᵗ by solving Eq. 1 6: Concept-Unit ← CLIP-Dissect(Wᵗ) 7: Prev-active ← Concept-Unit 8: for layer l ← L,...,1 do ▷ find the subnetwork of the concept units 9: for unit u_l ← 1,...,U_l do 10: if Prev-active[u_l] is True then ▷ u_l is in subnetwork 11: for unit u_{l−1} ← 1,...,U_{l−1} do 12: if ‖Wᵗ_{u_l,u_{l−1}}‖₁ > τ then ▷ weight exceeds threshold 13: Active[u_{l−1}] ← True 14: if using freeze-all then 15: if Prev-active[u_l] is True, freeze Wᵗ_{u_l,:} 16: if Active[u_{l−1}] is True, freeze Wᵗ_{:,u_{l−1}} 17: else if using freeze-part then 18: if Prev-active[u_l] is True, freeze Wᵗ_{u_l,:} 19: Prev-active ← Active |
| Open Source Code | Yes | 1Our code is available at https://github.com/Trustworthy-ML-Lab/concept-driven-continual-learning |
| Open Datasets | Yes | To evaluate our methods, we perform experiments on two datasets: CIFAR-100 (Krizhevsky et al., 2009) and Tiny ImageNet (Le & Yang, 2015). Experiments on CIFAR-10 (Krizhevsky et al., 2009) and CUB-200 (Wah et al., 2011) are discussed in Appendix B and C. |
| Dataset Splits | Yes | To evaluate our methods, we perform experiments on two datasets: CIFAR-100 (Krizhevsky et al., 2009) and Tiny ImageNet (Le & Yang, 2015). Experiments on CIFAR-10 (Krizhevsky et al., 2009) and CUB-200 (Wah et al., 2011) are discussed in Appendix B and C. We consider the T = 5, 10, 20 task scenarios in the class-incremental setting. ... We split each dataset by 3 different random seeds, and run each class distribution 3 times. The code and full training details will be released to the public upon acceptance. ... The class distributions are in Appendix D.3. ... Table 36: Class distribution of CIFAR-10, separated by random seed 3456. Task 1: automobile, dog; Task 2: deer, horse; Task 3: bird, frog; Task 4: ship, truck; Task 5: airplane, cat. |
| Hardware Specification | Yes | All models are trained on a single NVIDIA V100 (32 GB SXM2). |
| Software Dependencies | No | The paper mentions using a 'continual learning library Avalanche (Lomonaco et al., 2021)' and also references 'CLIP-Dissect (Oikarinen & Weng, 2023)' and 'GPT-3 (Brown et al., 2020)', but it does not specify explicit version numbers for these software components or any other key software dependencies. |
| Experiment Setup | Yes | The hyperparameters used for our methods are in Table 31. We tune the hyperparameters for best performance in A_T. The hyperparameter tuning results are in Tables 32, 33, 34 and 35. ... Table 31: Hyperparameters for our methods. µ in Eq. (1): 10^-6; λ in Eq. (1): 0.4; λ in Eq. (3): 0; γ in Eq. (3): 0.4; τ in IG-CL's Step 2: 0.15. |
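The backward subnetwork-tracing loop in Algorithm 1 (the Pseudocode row above) can be sketched as plain NumPy. This is a minimal illustration, not the authors' implementation: it assumes dense per-layer weight matrices of shape `(units_l, units_{l-1})`, a boolean concept-unit mask over the top layer (produced by CLIP-Dissect in the paper), and a simple absolute-value comparison against the threshold τ; the function name `find_frozen_masks` is hypothetical.

```python
import numpy as np

def find_frozen_masks(weights, concept_units, tau):
    """Trace the subnetworks of concept units backward through layers.

    weights: list of weight matrices, ordered input -> output; weights[l]
             has shape (units in layer l+1, units in layer l).
    concept_units: boolean mask over the top layer's units flagged as
                   concept units.
    tau: connection threshold on |W[u, v]| (Algorithm 1, line 12).

    Returns boolean masks, one per layer from top to input, marking units
    that belong to some concept unit's subnetwork and would be frozen.
    """
    prev_active = np.asarray(concept_units, dtype=bool)
    masks = [prev_active]
    # Walk layers from output to input, as in Algorithm 1 (line 8).
    for W in reversed(weights):
        active = np.zeros(W.shape[1], dtype=bool)
        for u in np.flatnonzero(prev_active):
            # A lower-layer unit joins the subnetwork if its connection
            # to an active upper unit exceeds the threshold (lines 11-13).
            active |= np.abs(W[u]) > tau
        masks.append(active)
        prev_active = active
    return masks  # masks[0]: top layer, masks[-1]: input layer
```

In the freeze-all variant the returned masks would freeze both the outgoing rows of active units and the incoming columns of newly activated lower units; freeze-part freezes only the rows.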
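The first-task objective quoted in the Pseudocode row adds a group-sparsity term µ Σ_l ‖W¹_l‖_{2,1}, where ‖·‖_{2,1} sums the L2 norms of the matrix's rows. A minimal sketch of that penalty, assuming NumPy weight matrices with one row per output unit (the function name is hypothetical):

```python
import numpy as np

def group_lasso_penalty(weight_matrices, mu=1e-6):
    """mu * sum_l ||W_l||_{2,1}: the sum over layers of row-wise L2 norms.

    Encourages entire rows (units) to go to zero, so individual units
    become inactive rather than individual weights.
    """
    return mu * sum(np.linalg.norm(W, axis=1).sum() for W in weight_matrices)
```

With µ = 10⁻⁶ as in Table 31, this term is added to the task loss when training on the first task.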