Probabilistic Bilevel Coreset Selection
Authors: Xiao Zhou, Renjie Pi, Weizhong Zhang, Yong Lin, Zonghao Chen, Tong Zhang
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct the following experiments in common application scenarios of coreset selection: 1) data summarization, where the selected coreset is directly used to train the model; 2) continual learning (Kirkpatrick et al., 2017; Lopez-Paz & Ranzato, 2017; Rebuffi et al., 2017) and streaming (Aljundi et al., 2019b; Hayes et al., 2019; Chrysakis & Moens, 2020), where coresets are selected from training data to construct the replay memory and resist catastrophic forgetting after sequentially learning a series of tasks; 3) feature selection (Cai et al., 2018; Li et al., 2017; Miao & Niu, 2016), where only a subset of features are selected for training and inference. |
| Researcher Affiliation | Collaboration | 1The Hong Kong University of Science and Technology 2Google Research. |
| Pseudocode | Yes | Algorithm 1 Probabilistic Bilevel Coreset Selection |
| Open Source Code | No | The paper does not provide an explicit statement or a link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | We conduct experiments on two widely used benchmarks, i.e., MNIST (Deng, 2012) and CIFAR10 (Krizhevsky et al., 2009). |
| Dataset Splits | Yes | The outer objective is calculated based on a held-out balanced validation dataset with 100 samples, comprised of 10 uniformly sampled data from each class. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU models, or memory specifications used for running the experiments. It only mentions general aspects of training and models. |
| Software Dependencies | No | The paper mentions optimizers (SGD, Adam) and model architectures (ResNet18) but does not provide specific version numbers for any software dependencies or libraries (e.g., Python, PyTorch, TensorFlow versions) used in the experiments. |
| Experiment Setup | Yes | We use the following hyper-parameters during optimization for our experiments. For the inner loop, the model is trained for 100 epochs using SGD with a learning rate of 0.1 and momentum of 0.9. For the outer loop, the probabilities s are optimized by Adam with a learning rate of 2.5 and a cosine scheduler. The outer loop is updated between 500 and 2000 times. |
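To make the reported setup concrete, below is a minimal numpy sketch of the bilevel structure the table describes: an inner loop that fits model weights on a sampled coreset, and an outer loop that updates per-example selection probabilities `s` against a held-out validation loss. The toy data, the logistic-regression model, the coreset size `k`, the loop counts, and the REINFORCE-style (score-function) outer gradient are all illustrative assumptions for this sketch, not the paper's exact algorithm or hyper-parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linearly separable data: two Gaussian blobs (assumed, not from the paper).
n_train, n_val, d = 40, 20, 2
X = rng.normal(size=(n_train + n_val, d)) + np.repeat([[2.0, 2.0], [-2.0, -2.0]],
                                                      (n_train + n_val) // 2, axis=0)
y = np.repeat([1.0, 0.0], (n_train + n_val) // 2)
idx = rng.permutation(n_train + n_val)
Xtr, ytr = X[idx[:n_train]], y[idx[:n_train]]
Xva, yva = X[idx[n_train:]], y[idx[n_train:]]   # held-out validation split

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def val_loss(w):
    """Outer objective: mean logistic loss on the validation set."""
    p = sigmoid(Xva @ w)
    return float(-(yva * np.log(p + 1e-12) + (1 - yva) * np.log(1 - p + 1e-12)).mean())

def inner_train(mask, epochs=50, lr=0.1):
    """Inner loop: fit logistic-regression weights on the masked subset via GD."""
    w = np.zeros(d)
    for _ in range(epochs):
        p = sigmoid(Xtr @ w)
        grad = Xtr.T @ ((p - ytr) * mask) / max(mask.sum(), 1.0)
        w -= lr * grad
    return w

# Outer loop: update selection probabilities s with a score-function
# (REINFORCE-style) estimate of d(val_loss)/ds -- a simplified stand-in
# for the paper's probabilistic update.
s = np.full(n_train, 0.5)   # selection probability per training example
k = 10                      # target coreset size (assumed)
outer_lr = 0.5
for step in range(100):
    m = (rng.random(n_train) < s).astype(float)      # sample a coreset mask
    loss = val_loss(inner_train(m))
    # d log P(m | s) / ds = (m - s) / (s * (1 - s))
    g = loss * (m - s) / (s * (1.0 - s) + 1e-8)
    g += 0.01 * np.sign(s.sum() - k)                 # soft expected-size penalty
    s = np.clip(s - outer_lr * g, 0.01, 0.99)

# Final coreset: the k examples with the highest selection probability.
coreset = np.argsort(-s)[:k]
w_core = inner_train(np.isin(np.arange(n_train), coreset).astype(float))
```

In the paper's setting the inner loop is 100 epochs of SGD (lr 0.1, momentum 0.9) on a deep network and the outer loop runs Adam (lr 2.5, cosine schedule) for 500-2000 updates; the sketch shrinks both loops so it runs in seconds.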