Black-Box Forgetting

Authors: Yusuke Kuwana, Yuta Goto, Takashi Shibata, Go Irie

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on four standard benchmark datasets demonstrate the superiority of our method with reasonable baselines. We evaluate the class forgetting performance of our method on image classification tasks. We first describe our experimental setup, including the datasets, baselines, implementation details, and evaluation metrics. We then report the main comparative results between our method and the baselines, as well as a series of analyses of our method.
Researcher Affiliation | Collaboration | Yusuke Kuwana (Tokyo University of Science, 4624513@ed.tus.ac.jp); Yuta Goto (Tokyo University of Science, 4623511@ed.tus.ac.jp); Takashi Shibata (NEC Corporation, t.shibata@ieee.org); Go Irie (Tokyo University of Science, goirie@ieee.org)
Pseudocode | No | The paper describes its methods in text and figures but does not contain any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | The code used for the experiments in this paper can be accessed here: https://github.com/yusukekwn/Black-Box-Forgetting.
Open Datasets | Yes | We use four benchmark datasets, i.e., CIFAR-10, CIFAR-100, CUB-200-2011, and ImageNet30. CIFAR-10 and CIFAR-100 each comprise 50,000 training images and 10,000 test images [Krizhevsky et al., 2009]. CUB-200-2011 [Wah et al., 2011] comprises images of 200 distinct bird species, with 5,994 training images and 5,794 test images. ImageNet30 [Hendrycks et al., 2019] is a 30-class subset of the original ImageNet-1k dataset [Deng et al., 2009].
Dataset Splits | Yes | We randomly select different k samples of each class from the original training images to construct a k-shot training set and a k-shot validation set. All the hyperparameters are tuned on the validation sets, which are distinct from the training and test sets. (A minimal reproduction sketch of this split follows the table.)
Hardware Specification | Yes | We use a single GV100 GPU with 12.885 GB memory for all the experiments.
Software Dependencies | No | The paper mentions models such as CLIP and ViT-B/16 and algorithms such as CMA-ES, but it does not specify software dependencies with version numbers (e.g., Python version, PyTorch/TensorFlow version, CUDA version).
Experiment Setup | Yes | We set the number of latent contexts m = 4 for CIFAR-10, and m = 16 for CIFAR-100, CUB-200-2011, and ImageNet30. The dimension of a latent context in BBT d, of the Shared Latent Context (SLC) ds, and of the Unique Latent Contexts (ULC) du are set to d = 10, ds = 20, du = 5 for CIFAR-10, and d = 125, ds = 400, du = 100 for CIFAR-100, CUB-200-2011, and ImageNet30. For optimization, CMA-ES with a population size of 20 is applied in all the conditions... We optimize the latent contexts for 400 iterations for CIFAR-10 and ImageNet30, and 800 iterations for CIFAR-100 and CUB-200-2011. (An illustrative CMA-ES loop with these settings follows the table.)
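
The k-shot split protocol quoted under Dataset Splits is straightforward to reproduce. Below is a minimal sketch of one way to build such a split, assuming an indexable torchvision-style dataset of (image, label) pairs; the function name `make_k_shot_split` and the seed handling are illustrative assumptions, not taken from the authors' code.

```python
import random
from collections import defaultdict

def make_k_shot_split(dataset, k, seed=0):
    """Draw 2k distinct samples per class: k for training, k for validation.

    `dataset` is assumed to be an indexable sequence of (image, label)
    pairs, e.g. torchvision's CIFAR10. Returns two disjoint index lists.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx in range(len(dataset)):
        _, label = dataset[idx]
        by_class[label].append(idx)

    train_idx, val_idx = [], []
    for indices in by_class.values():
        picked = rng.sample(indices, 2 * k)  # 2k distinct samples per class
        train_idx.extend(picked[:k])         # k-shot training set
        val_idx.extend(picked[k:])           # disjoint k-shot validation set
    return train_idx, val_idx
```

Because the two index lists are drawn from the same shuffle without replacement, the validation set is guaranteed to be distinct from the training set, matching the quoted protocol.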
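For the optimization itself, the Experiment Setup row quotes CMA-ES with a population size of 20 over the latent contexts. The sketch below shows the corresponding loop using the `cma` package with the CIFAR-100 dimensions quoted above. The flat parametrization (one shared context concatenated with m unique contexts) and the placeholder objective are assumptions for illustration only: in the actual method, the latent vector would be decoded into prompt embeddings, the black-box CLIP model queried, and the paper's combined forgetting/retention loss returned.

```python
import numpy as np
import cma  # pip install cma

# Dimensions quoted for CIFAR-100: m = 16 latent contexts,
# shared context ds = 400, unique contexts du = 100 each.
m, ds, du = 16, 400, 100
D = ds + m * du  # assumed flat search vector: one SLC plus m ULCs

def objective(z):
    # Placeholder objective (dummy quadratic). The real loss would build
    # prompts from the latent contexts, query the black-box model, and
    # combine the forgetting and retention terms described in the paper.
    return float(np.dot(z, z))

es = cma.CMAEvolutionStrategy(np.zeros(D), 0.5, {"popsize": 20, "seed": 1})
for _ in range(800):  # 800 iterations for CIFAR-100 per the quoted setup
    candidates = es.ask()                      # sample a population of 20
    es.tell(candidates, [objective(np.asarray(c)) for c in candidates])
best_z = es.result.xbest                       # best latent contexts found
```

Note that CMA-ES only needs loss values, never gradients, which is what makes this derivative-free loop compatible with a black-box model that exposes outputs alone.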