Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Knowledge Graph Enhanced Generative Multi-modal Models for Class-Incremental Learning

Authors: Xusheng Cao, Haori Lu, Linlan Huang, Fei Yang, Xialei Liu, Ming-Ming Cheng

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experiments demonstrate that our method effectively leverages relational information to help the model correct mispredictions, achieving state-of-the-art results in both conventional CIL and few-shot CIL settings, confirming the efficacy of knowledge graphs at preserving knowledge in the continual learning scenarios. We test our model on two commonly used continual learning benchmarks: Tiny-ImageNet and ImageNet-R, and two few-shot continual learning benchmarks: CIFAR100 and Mini-ImageNet.
Researcher Affiliation	Academia	1 NKIARI, Shenzhen Futian; 2 VCIP, CS, Nankai University; EMAIL, EMAIL
Pseudocode	Yes	Algorithm 1 Graph Construction For Task t
Open Source Code	No	The NeurIPS checklist states: Question: Does the paper provide open access to the data and code...? Answer: [No] Justification: We will soon open source our code.
Open Datasets	Yes	Datasets. We test our model on two commonly used continual learning benchmarks: Tiny-ImageNet and ImageNet-R, and two few-shot continual learning benchmarks: CIFAR100 and Mini-ImageNet. ... We build our expanding knowledge graph based on ConceptNet, which is a large-scale, multilingual knowledge graph...
Dataset Splits	Yes	For conventional continual learning, we follow the two standard configurations used in GMM [5]: B0, in which all classes are equally divided among different tasks, and B100 (i.e. Tiny-ImageNet) in which the first task contains 100 classes (half of the dataset) and the rest are equally divided into subsequent tasks. For few-shot continual learning, we follow the data splits proposed by [48]. For both datasets, we divide the data into two parts: a base session and incremental sessions. The base session consists of 60 classes with full access to all associated data. Each incremental session follows a 5-way 5-shot setting, introducing 5 new classes with only 5 samples per class.
Hardware Specification	No	The paper mentions 'Computation is supported by the Supercomputing Center of Nankai University.' but does not provide specific hardware details such as GPU models, CPU models, or memory amounts used for the experiments.
Software Dependencies	No	The paper mentions using 'MiniGPT-4 [68] framework' and 'BERT tokenizer' but does not specify version numbers for these or any other software libraries or programming languages used.
Experiment Setup	Yes	In the B0 setting of all datasets, we employ a 200-iteration warmup with a learning rate of 3e-6 and a learning rate from 3e-5 to 3e-6 with a cosine decay scheduler in the following fine-tuning phase. In the B100 setting, we first employ a learning rate of 3e-6, and then on the subsequent tasks, we adopt a lower learning rate of 3e-7, both employing a cosine decay scheduler. ... Balancing inference time with performance gains, we select r = 3 as our final hyperparameter. ... Inference time was measured as the averaged processing duration per batch (size=64).
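
The class splits quoted under Dataset Splits can be illustrated with a small sketch. It is not the authors' code. The function name `split_classes`, the sequential class ordering, and the step size of 20 classes per task are assumptions made for illustration; the quoted text does not give the number of tasks for B0 or B100. The few-shot figures (60 base classes, 5 new classes per session) come from the quoted text, and the 100-class total for CIFAR100 and Mini-ImageNet is the standard size of those benchmarks.

```python
def split_classes(num_classes, base_size, step_size):
    """Return one list of class ids per task (or session).

    base_size == 0 models the B0 setting (all classes split equally);
    base_size > 0 models a large first task followed by equal increments.
    Class ids are taken in sequential order, which is an assumption.
    """
    if base_size < 0 or step_size <= 0 or base_size > num_classes:
        raise ValueError("invalid split sizes")
    if (num_classes - base_size) % step_size != 0:
        raise ValueError("remaining classes must divide evenly into steps")

    classes = list(range(num_classes))
    tasks = []
    if base_size > 0:
        tasks.append(classes[:base_size])
    for start in range(base_size, num_classes, step_size):
        tasks.append(classes[start:start + step_size])
    return tasks


# B0 on a 200-class dataset; 20 classes per task is a hypothetical choice.
b0 = split_classes(200, base_size=0, step_size=20)

# B100: the first task holds 100 classes, the rest are divided equally
# (again with a hypothetical step of 20 classes).
b100 = split_classes(200, base_size=100, step_size=20)

# Few-shot: 60 base classes, then 5-way incremental sessions on 100 classes.
few_shot = split_classes(100, base_size=60, step_size=5)
```

Under these assumptions the few-shot split gives one base session followed by eight incremental sessions. The 5-shot constraint (five training samples per new class) would be applied when sampling images for each incremental session and is not modeled here.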
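
The B0 learning-rate schedule quoted under Experiment Setup can be written as a short function. This is a sketch under stated assumptions, not the paper's implementation. The quoted text gives a 200-iteration warmup at 3e-6 followed by cosine decay from 3e-5 to 3e-6. It does not say whether the warmup is constant or ramps up, and it does not give the total number of iterations. The sketch assumes a linear ramp from 3e-6 to 3e-5, and `total_steps` is a hypothetical parameter.

```python
import math


def b0_learning_rate(step, total_steps, warmup_steps=200,
                     warmup_lr=3e-6, peak_lr=3e-5, min_lr=3e-6):
    """Learning rate at a given iteration for the B0 setting (sketch).

    Assumption: linear warmup from warmup_lr to peak_lr, then cosine
    decay from peak_lr down to min_lr over the remaining iterations.
    """
    if step < warmup_steps:
        # Linear ramp during warmup (assumed shape).
        frac = step / warmup_steps
        return warmup_lr + frac * (peak_lr - warmup_lr)

    # Cosine decay over the iterations that follow the warmup.
    decay_steps = max(1, total_steps - warmup_steps)
    progress = min(1.0, (step - warmup_steps) / decay_steps)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return min_lr + (peak_lr - min_lr) * cosine


# Example with a hypothetical budget of 1000 iterations.
schedule = [b0_learning_rate(s, total_steps=1000) for s in range(1001)]
```

The B100 setting described in the same row would use the cosine part only, starting from 3e-6 on the first task and from 3e-7 on subsequent tasks. The end value of that decay is not stated in the quoted text.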