Probabilistic Bilevel Coreset Selection

Authors: Xiao Zhou, Renjie Pi, Weizhong Zhang, Yong Lin, Zonghao Chen, Tong Zhang

ICML 2022

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | We conduct the following experiments in common application scenarios of coreset selection: 1) data summarization, where the selected coreset is directly used to train the model; 2) continual learning (Kirkpatrick et al., 2017; Lopez-Paz & Ranzato, 2017; Rebuffi et al., 2017) and streaming (Aljundi et al., 2019b; Hayes et al., 2019; Chrysakis & Moens, 2020), where coresets are selected from the training data to construct the replay memory and resist catastrophic forgetting after sequentially learning a series of tasks; 3) feature selection (Cai et al., 2018; Li et al., 2017; Miao & Niu, 2016), where only a subset of features is selected for training and inference. |
| Researcher Affiliation | Collaboration | 1) The Hong Kong University of Science and Technology; 2) Google Research. |
| Pseudocode | Yes | Algorithm 1: Probabilistic Bilevel Coreset Selection |
| Open Source Code | No | The paper does not provide an explicit statement or a link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | We conduct experiments on two widely used benchmarks, i.e., MNIST (Deng, 2012) and CIFAR10 (Krizhevsky et al., 2009). |
| Dataset Splits | Yes | The outer objective is calculated based on a held-out balanced validation dataset with 100 samples, comprised of 10 uniformly sampled data from each class. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU models, or memory specifications used for running the experiments; it only mentions general aspects of training and models. |
| Software Dependencies | No | The paper mentions optimizers (SGD, Adam) and model architectures (ResNet18) but does not provide version numbers for any software dependencies or libraries (e.g., Python, PyTorch, TensorFlow) used in the experiments. |
| Experiment Setup | Yes | We use the following hyper-parameters during optimization for our experiments. For the inner loop, the model is trained for 100 epochs using SGD with a learning rate of 0.1 and momentum of 0.9. For the outer loop, the probabilities s are optimized by Adam with a learning rate of 2.5 and a cosine scheduler. The outer loop is updated 500-2000 times. |
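The bilevel structure referenced in the Pseudocode row (Algorithm 1) can be illustrated with a minimal toy sketch: sample-inclusion probabilities s are learned in an outer loop by sampling Bernoulli masks, solving the inner training problem on the kept points, and following a score-function (REINFORCE-style) gradient of the validation loss. Everything here is an illustrative assumption for exposition, not the paper's implementation: the inner loop uses a closed-form ridge solve on synthetic data instead of 100 epochs of SGD, and the gradient estimator, baseline, and clipping are common generic choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: linear regression where the first 20 training points have
# heavily corrupted labels, so a good coreset should down-weight them.
n_train, n_val, d = 40, 20, 3
w_true = rng.normal(size=d)
X_tr = rng.normal(size=(n_train, d))
y_tr = X_tr @ w_true
y_tr[:20] += rng.normal(scale=5.0, size=20)  # corrupted points
X_val = rng.normal(size=(n_val, d))
y_val = X_val @ w_true

def inner_solve(mask, lam=1e-3):
    """Inner problem: fit the model on the points kept by the sampled mask.
    (Closed-form ridge stands in for the paper's inner SGD training.)"""
    Xm, ym = X_tr[mask], y_tr[mask]
    return np.linalg.solve(Xm.T @ Xm + lam * np.eye(d), Xm.T @ ym)

def val_loss(w):
    """Outer objective: loss on the held-out validation set."""
    return np.mean((X_val @ w - y_val) ** 2)

s = np.full(n_train, 0.5)  # keep-probabilities to be learned
lr, n_samples = 0.2, 10
for step in range(300):
    masks, losses = [], []
    for _ in range(n_samples):
        m = rng.random(n_train) < s          # Bernoulli(s) mask
        if m.sum() < d:                      # need enough points to fit
            continue
        masks.append(m)
        losses.append(val_loss(inner_solve(m)))
    if not losses:
        continue
    b = np.mean(losses)                      # baseline for variance reduction
    # Score-function estimate: d/ds log p(m) = (m - s) / (s * (1 - s))
    g = np.zeros(n_train)
    for m, L in zip(masks, losses):
        g += (L - b) * (m.astype(float) - s) / (s * (1 - s))
    g /= len(losses)
    s = np.clip(s - lr * g, 0.01, 0.99)      # gradient step on probabilities
```

After training, the learned probabilities should be lower on the corrupted first half of the training set than on the clean second half, which is the qualitative behavior the method targets.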
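The Dataset Splits row describes a balanced held-out validation set of 100 samples, 10 drawn uniformly from each of the 10 classes. A generic way to construct such a split (the helper name and signature are assumptions, not from the paper) is:

```python
import numpy as np

def balanced_validation_split(labels, per_class=10, seed=0):
    """Hold out `per_class` uniformly sampled indices from each class;
    everything else stays in the training pool."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    val_parts = []
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        val_parts.append(rng.choice(idx, size=per_class, replace=False))
    val_idx = np.concatenate(val_parts)
    train_idx = np.setdiff1d(np.arange(len(labels)), val_idx)
    return train_idx, val_idx
```

With 10 classes and `per_class=10` this yields the 100-sample balanced validation set used for the outer objective.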
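The Experiment Setup row mentions that the outer-loop probabilities are optimized by Adam with a learning rate of 2.5 under a cosine scheduler. The paper does not specify which cosine variant; one standard form (annealing from the base rate to zero over the run, here over an assumed 2000 outer updates) is:

```python
import math

def cosine_lr(step, total_steps, base_lr=2.5, min_lr=0.0):
    """Cosine-annealed learning rate: base_lr at step 0, min_lr at total_steps.
    base_lr=2.5 matches the outer-loop setting quoted above; the schedule
    shape itself is a common convention, not taken from the paper."""
    t = min(step, total_steps) / total_steps
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * t))
```

For example, the rate starts at 2.5, reaches 1.25 halfway through, and decays to 0 at the final outer update.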