Black-Box Forgetting

Authors: Yusuke Kuwana, Yuta Goto, Takashi Shibata, Go Irie

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on four standard benchmark datasets demonstrate the superiority of our method with reasonable baselines. We evaluate the class forgetting performance of our method on image classification tasks. We first describe our experimental setup, including the datasets, baselines, implementation details, and evaluation metrics. We then report the main comparative results between our method and the baselines, as well as a series of analyses of our method.
Researcher Affiliation | Collaboration | Yusuke Kuwana (Tokyo University of Science, 4624513@ed.tus.ac.jp); Yuta Goto (Tokyo University of Science, 4623511@ed.tus.ac.jp); Takashi Shibata (NEC Corporation, t.shibata@ieee.org); Go Irie (Tokyo University of Science, goirie@ieee.org)
Pseudocode | No | The paper describes its methods in text and figures but does not contain any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | The code used for the experiments in this paper can be accessed here: https://github.com/yusukekwn/Black-Box-Forgetting.
Open Datasets | Yes | We use four benchmark datasets, i.e., CIFAR-10, CIFAR-100, CUB-200-2011, and ImageNet30. CIFAR-10 and CIFAR-100 each comprise 50,000 training images and 10,000 test images [Krizhevsky et al., 2009]. CUB-200-2011 [Wah et al., 2011] comprises images of 200 distinct bird species, with 5,994 training images and 5,794 test images. ImageNet30 [Hendrycks et al., 2019] is a 30-class subset of the original ImageNet-1k dataset [Deng et al., 2009].
Dataset Splits | Yes | We randomly select different k samples of each class from the original training images to construct a k-shot training set and a k-shot validation set. All the hyperparameters are tuned on the validation sets, which are distinct from the training and test sets. (A minimal reproduction sketch of this split follows the table.)
Hardware Specification | Yes | We use a single GV100 GPU with 12.885 GB memory for all the experiments.
Software Dependencies | No | The paper mentions models such as CLIP and ViT-B/16 and algorithms such as CMA-ES, but it does not specify software dependencies with version numbers (e.g., Python version, PyTorch/TensorFlow version, CUDA version).
Experiment Setup | Yes | We set the number of latent contexts m = 4 for CIFAR-10, and m = 16 for CIFAR-100, CUB-200-2011, and ImageNet30. The dimension of a latent context in BBT d, of the Shared Latent Context (SLC) ds, and of the Unique Latent Contexts (ULC) du are set to d = 10, ds = 20, du = 5 for CIFAR-10, and d = 125, ds = 400, du = 100 for CIFAR-100, CUB-200-2011, and ImageNet30. For optimization, CMA-ES with a population size of 20 is applied in all the conditions... We optimize the latent contexts for 400 iterations for CIFAR-10 and ImageNet30, and 800 iterations for CIFAR-100 and CUB-200-2011. (An illustrative CMA-ES loop with these settings follows the table.)
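
The k-shot split protocol quoted under Dataset Splits is straightforward to reproduce. Below is a minimal sketch of one way to build such a split, assuming an indexable torchvision-style dataset of (image, label) pairs; the function name `make_k_shot_split` and the seed handling are illustrative assumptions, not taken from the authors' code.

```python
import random
from collections import defaultdict

def make_k_shot_split(dataset, k, seed=0):
    """Draw 2k distinct samples per class: k for training, k for validation.

    `dataset` is assumed to be an indexable sequence of (image, label)
    pairs, e.g. torchvision's CIFAR10. Returns two disjoint index lists.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx in range(len(dataset)):
        _, label = dataset[idx]
        by_class[label].append(idx)

    train_idx, val_idx = [], []
    for indices in by_class.values():
        picked = rng.sample(indices, 2 * k)  # 2k distinct samples per class
        train_idx.extend(picked[:k])         # k-shot training set
        val_idx.extend(picked[k:])           # disjoint k-shot validation set
    return train_idx, val_idx
```

Because the two index lists are drawn from the same shuffle without replacement, the validation set is guaranteed to be distinct from the training set, matching the quoted protocol.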
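For the optimization itself, the Experiment Setup row quotes CMA-ES with a population size of 20 over the latent contexts. The sketch below shows the corresponding loop using the `cma` package with the CIFAR-100 dimensions quoted above. The flat parametrization (one shared context concatenated with m unique contexts) and the placeholder objective are assumptions for illustration only: in the actual method, the latent vector would be decoded into prompt embeddings, the black-box CLIP model queried, and the paper's combined forgetting/retention loss returned.

```python
import numpy as np
import cma  # pip install cma

# Dimensions quoted for CIFAR-100: m = 16 latent contexts,
# shared context ds = 400, unique contexts du = 100 each.
m, ds, du = 16, 400, 100
D = ds + m * du  # assumed flat search vector: one SLC plus m ULCs

def objective(z):
    # Placeholder objective (dummy quadratic). The real loss would build
    # prompts from the latent contexts, query the black-box model, and
    # combine the forgetting and retention terms described in the paper.
    return float(np.dot(z, z))

es = cma.CMAEvolutionStrategy(np.zeros(D), 0.5, {"popsize": 20, "seed": 1})
for _ in range(800):  # 800 iterations for CIFAR-100 per the quoted setup
    candidates = es.ask()                      # sample a population of 20
    es.tell(candidates, [objective(np.asarray(c)) for c in candidates])
best_z = es.result.xbest                       # best latent contexts found
```

Note that CMA-ES only needs loss values, never gradients, which is what makes this derivative-free loop compatible with a black-box model that exposes outputs alone.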