Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Hybrid Re-matching for Continual Learning with Parameter-Efficient Tuning

Authors: Weicheng Wang, Guoli Jia, Xialei Liu, Liang Lin, Jufeng Yang

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experiments conducted on four datasets under five pre-trained settings demonstrate that HRM-PET performs favorably against the state-of-the-art methods. The code is available in the https://github.com/wei-cheng777/HRM-PET. ... We first carry out experiments on two widely used datasets in CIL performance evaluation: Split CIFAR-100 and Split ImageNet-R. ... We perform extensive comparative experiments with 7 state-of-the-art PET-based methods. This comparison involves five pre-trained models across four datasets: Split CIFAR100, Split ImageNet-R, ImageNet-A, and 5-Datasets. ... As illustrated in Table 3, we conduct a comprehensive analysis of each module, across five pre-trained models on the Split ImageNet-R datasets. AN is utilized as measurement.
Researcher Affiliation	Academia	1 VCIP & TMCC & DISSec, College of Computer Science, Nankai University, Tianjin, China. 2 Pengcheng Laboratory, Shenzhen, China. 3 Department of Electronic Engineering, Tsinghua University, Beijing, China. 4 School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China. 5 Nankai International Advanced Research Institute (SHENZHEN FUTIAN), Shenzhen, China.
Pseudocode	No	The paper describes the methodology using textual explanations and mathematical formulations (e.g., equations 1-11) but does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code	Yes	The code is available in the https://github.com/wei-cheng777/HRM-PET.
Open Datasets	Yes	We first carry out experiments on two widely used datasets in CIL performance evaluation: Split CIFAR-100 and Split ImageNet-R. Split CIFAR-100 splits CIFAR-100 [41] dataset into 10 tasks. ... Split ImageNet-R [88] is a recently proposed, more challenging dataset. ... To investigate performance on datasets with the larger domain gap from the pre-trained ImageNet, we evaluate HRM-PET on ImageNet-A [27] ... Moreover, 5-Datasets [15] consisting of CIFAR-10 [41], MNIST [42], Fashion-MNIST [91], SVHN [62] and notMNIST [4] is examined...
Dataset Splits	No	Split CIFAR-100 splits CIFAR-100 [41] dataset into 10 tasks. ... ImageNet-R ... is similarly split into 10 tasks. ... ImageNet-A [27] ... is also divided into 10 tasks and 20 classes for every task [59]. ... 5-Datasets [15] ... are divided into 5 tasks according to different datasets. This describes how data is grouped into tasks, but not the specific training, validation, and testing splits (e.g., percentages or counts) within those tasks or for the overall dataset.
Hardware Specification	Yes	Results are fairly compared on RTX 3090 at the same batch size. ... Experimentally, we measured the training time of baseline and HRM-PET on ImageNet-R on an RTX 3090.
Software Dependencies	No	The paper does not explicitly list specific software dependencies with version numbers (e.g., Python, PyTorch, CUDA versions) that would be needed to replicate the experiment.
Experiment Setup	Yes	During training, we get probability distribution d(x) = g(h(x; p_t, θ_ptm); θ_g) by using the pre-trained model followed with current parameter p_t. Then, p_t of the current task t and classifier g_{θ_g} are optimized by cross-entropy. ... N is set to be 2. ... The λ_CT calibrates the trade-off between the acquisition of task-invariant and task-specific expertise. ... In all experiments, the threshold τ is consistently set as -10. ... HRM-PET improves as λ_CT increases and achieves the best performance when λ_CT = 0.2. ... we utilize the common setting K = 5 to reduce the training computational consumption. ... In HRM-PET, R_l = 8 is suitable to learn task-specific knowledge.
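
The Dataset Splits row notes that the paper describes how classes are grouped into tasks but not the exact splits. As a minimal sketch of what such a task grouping could look like, the Python snippet below partitions class indices into equally sized, disjoint tasks. The function name `split_classes_into_tasks`, the optional shuffling, and the seed value are illustrative assumptions; they are not taken from the paper or its repository, and the actual class order used by the authors may differ.

```python
import random


def split_classes_into_tasks(num_classes, num_tasks, seed=None):
    """Partition class indices 0..num_classes-1 into num_tasks disjoint groups.

    Hypothetical helper for illustration only: the class ordering (sequential
    by default, shuffled when a seed is given) is an assumption.
    """
    if num_classes % num_tasks != 0:
        raise ValueError("num_classes must be divisible by num_tasks")
    classes = list(range(num_classes))
    if seed is not None:
        # Shuffle with a local generator so the global random state is untouched.
        random.Random(seed).shuffle(classes)
    per_task = num_classes // num_tasks
    return [classes[i * per_task:(i + 1) * per_task] for i in range(num_tasks)]


# Split CIFAR-100: 100 classes grouped into 10 tasks.
cifar_tasks = split_classes_into_tasks(100, 10)

# ImageNet-A as described above: 10 tasks with 20 classes each (200 classes).
imagenet_a_tasks = split_classes_into_tasks(200, 10, seed=0)
```

This covers only the class-to-task grouping; the train/validation/test partition within each task remains unspecified, which is why the Dataset Splits variable is marked "No".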