Adaptive Shortcut Debiasing for Online Continual Learning

Authors: Doyoung Kim, Dongmin Park, Yooju Shin, Jihwan Bang, Hwanjun Song, Jae-Gil Lee

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on five benchmark datasets demonstrate that, when combined with various OCL algorithms, DropTop increases the average accuracy by up to 10.4% and decreases the forgetting by up to 63.2%.
Researcher Affiliation | Academia | Doyoung Kim, Dongmin Park, Yooju Shin, Jihwan Bang, Hwanjun Song, Jae-Gil Lee* KAIST, Daejeon, Republic of Korea {dodokim, dongminpark, yooju.shin, jihwan.bang, songhwanjun, jaegil}@kaist.ac.kr
Pseudocode | Yes | Appendix A describes the pseudocode of adaptive intensity shifting, which is self-explanatory. (A generic, non-authoritative illustration of such an intensity-shifting loop is sketched after the table.)
Open Source Code | Yes | All algorithms are implemented using PyTorch 1.12.1 and tested on a single NVIDIA RTX 2080Ti GPU, and the source code is available at https://github.com/kaist-dmlab/DropTop.
Open Datasets | Yes | We use the Split CIFAR-10 (Krizhevsky, Hinton et al. 2009), Split CIFAR-100 (Krizhevsky, Hinton et al. 2009), and Split ImageNet-9 (Xiao et al. 2020) for the biased setup. ... In ImageNet-Only FG (Xiao et al. 2020), the background is removed to evaluate the dependency on the background in image recognition; in ImageNet-Stylized (Geirhos et al. 2019), the local texture is shifted by style transfer, and the reliance of a model on the local texture cue is removed. (A task-split sketch for Split CIFAR-10 follows the table.)
Dataset Splits | No | The paper describes an online continual learning setting where data streams emerge continually. While it uses an 'episodic memory' for internal loss calculation and 'validates' performance, it does not specify traditional fixed training/validation/test dataset splits with percentages or absolute counts for the entire dataset.
Hardware Specification | Yes | All algorithms are implemented using PyTorch 1.12.1 and tested on a single NVIDIA RTX 2080Ti GPU
Software Dependencies | Yes | All algorithms are implemented using PyTorch 1.12.1 and tested on a single NVIDIA RTX 2080Ti GPU, and the source code is available at https://github.com/kaist-dmlab/DropTop.
Experiment Setup | Yes | For all algorithms and datasets, the size of a minibatch from the data stream and the replay memory is set to 32, following (Buzzega et al. 2020). The size of episodic memory is set to 500 for Split CIFAR-10 and Split ImageNet-9 and 2,000 for Split CIFAR-100, depending on the total number of classes. ... We train ResNet18 using SGD with a learning rate of 0.1 (Buzzega et al. 2020; Shim et al. 2021) for all ResNet-based algorithms. We optimize L2P and DualPrompt with a pretrained ViT-B/16 using Adam with a learning rate of 0.05, β1 of 0.9, and β2 of 0.999. ... For attentive debiasing, we fix the total drop ratio γ to 5.0% and set the initial drop intensity κ0 to 5.0% for ER, DER++, and MIR and to 0.5% for GSS and ASER, differently depending on the sampling method. For L2P and DualPrompt, we set γ and κ0 to 2.0% and 1.0%, respectively, owing to the difference of the backbone network. For adaptive intensity shifting, we fix the history length l to 10... The alternating period p = 3 and the shifting step size α = 0.9 are adequate across the algorithms and datasets. (These hyperparameters are collected in the config sketch after the table.)
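
The quoted evidence refers to Split CIFAR-10 without spelling out the task construction. Below is a minimal sketch, not the authors' code, of the conventional class-incremental protocol (10 classes split into 5 tasks of 2 classes each, stream minibatch size 32 as quoted); the paper's exact class ordering and stream construction may differ.

```python
# Minimal Split CIFAR-10 task construction (conventional 5-task / 2-class protocol).
# This is an illustrative sketch, not the DropTop repository's data pipeline.
import torch
from torch.utils.data import DataLoader, Subset
from torchvision import datasets, transforms

train_set = datasets.CIFAR10(root="./data", train=True, download=True,
                             transform=transforms.ToTensor())
targets = torch.tensor(train_set.targets)

num_tasks, classes_per_task = 5, 2
task_loaders = []
for t in range(num_tasks):
    task_classes = torch.arange(t * classes_per_task, (t + 1) * classes_per_task)
    idx = torch.isin(targets, task_classes).nonzero(as_tuple=True)[0].tolist()
    # Stream minibatch size of 32, as in the quoted setup (Buzzega et al. 2020).
    task_loaders.append(DataLoader(Subset(train_set, idx), batch_size=32, shuffle=True))
```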
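
The hyperparameters quoted in the Experiment Setup row can be summarized in a single configuration object. The key names below (e.g., total_drop_ratio, init_drop_intensity) are illustrative and need not match the argument names used in the official DropTop repository; the values are taken from the quoted text.

```python
# Illustrative summary of the quoted hyperparameters; key names are hypothetical.
CONFIG = {
    "batch_size": 32,  # minibatch from both the data stream and the replay memory
    "memory_size": {"split_cifar10": 500, "split_imagenet9": 500, "split_cifar100": 2000},
    "resnet18": {"optimizer": "SGD", "lr": 0.1},
    "l2p_dualprompt": {"backbone": "ViT-B/16", "optimizer": "Adam",
                       "lr": 0.05, "betas": (0.9, 0.999)},
    "attentive_debiasing": {
        "total_drop_ratio": {"resnet_methods": 0.05, "l2p_dualprompt": 0.02},   # gamma
        "init_drop_intensity": {"ER/DER++/MIR": 0.05, "GSS/ASER": 0.005,
                                "L2P/DualPrompt": 0.01},                        # kappa_0
    },
    "adaptive_intensity_shifting": {
        "history_length": 10,     # l
        "alternating_period": 3,  # p
        "shift_step_size": 0.9,   # alpha
    },
}
```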
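
The Pseudocode row notes that adaptive intensity shifting is specified in the paper's Appendix A, which is not reproduced here. The skeleton below is only a generic illustration of how the quoted knobs (history length l = 10, alternating period p = 3, step size α = 0.9, initial intensity κ0) could plug into a training loop; the decision rule is a placeholder and the paper's actual criterion differs.

```python
# Generic intensity-shifting skeleton, NOT the paper's Appendix A pseudocode.
from collections import deque

l, p, alpha = 10, 3, 0.9
kappa = 0.05                        # kappa_0 for ER/DER++/MIR in the quoted setup
loss_history = deque(maxlen=l)      # most recent l training losses

dummy_losses = [2.3, 2.1, 2.0, 1.9, 1.85, 1.8, 1.78, 1.75, 1.74, 1.72, 1.7, 1.69]

for step, loss_value in enumerate(dummy_losses):   # stand-in for real training losses
    loss_history.append(loss_value)
    if step % p == 0 and len(loss_history) == l:
        improving = loss_history[-1] < sum(loss_history) / l   # placeholder trend check
        kappa = kappa / alpha if improving else kappa * alpha  # shift intensity by alpha
```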