IDEAL: Query-Efficient Data-Free Learning from Black-Box Models
Authors: Jie Zhang, Chen Chen, Lingjuan Lyu
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on various real-world datasets show the effectiveness of the proposed IDEAL. For instance, IDEAL outperforms the best baseline method DFME by 5.83% on the CIFAR10 dataset with only 0.02× DFME's query budget. |
| Researcher Affiliation | Collaboration | Jie Zhang, Zhejiang University ({zj zhangjie}@zju.edu.cn); Chen Chen and Lingjuan Lyu, Sony AI ({Chen A.Chen, Lingjuan.Lv}@sony.com) |
| Pseudocode | Yes | The training procedure is presented as Algorithm 1 in the Appendix, and the training process of IDEAL is illustrated in Fig. 1 (a generic, illustrative training-loop sketch is given below the table). |
| Open Source Code | No | The paper does not provide an explicit statement or link to open-source code for the described methodology. |
| Open Datasets | Yes | Our experiments are conducted on 7 real-world datasets: MNIST (LeCun et al., 1998), Fashion-MNIST (FMNIST) (Xiao et al., 2017), CIFAR10 and CIFAR100 (Krizhevsky et al., 2009), SVHN (Netzer et al., 2011), Tiny-ImageNet (Le & Yang, 2015), and an ImageNet subset (Deng et al., 2009). |
| Dataset Splits | No | The paper focuses on data-free knowledge distillation for the student model, meaning the student does not access real training or validation data. While teacher models were trained on datasets, the paper does not specify the train/validation/test splits for these original datasets, nor does it define a validation split for the synthetic data used by the student model. |
| Hardware Specification | No | The paper mentions running experiments on 'Microsoft Azure' for one section and 'resource-limited edge devices' in general, but does not specify any particular CPU, GPU, or other hardware components used for running their experiments. |
| Software Dependencies | No | The paper mentions using 'Adam Optimizer' and 'SGD optimizer' but does not provide specific version numbers for any software dependencies, libraries, or frameworks. |
| Experiment Setup | Yes | To update the generator, we use the Adam optimizer with learning rate ηG = 1e-3. To train the student model, we use the SGD optimizer with momentum = 0.9 and learning rate ηS = 1e-2. We set the batch size B = 250 for MNIST, FMNIST, SVHN, CIFAR10, and the ImageNet subset, and B = 1000 for CIFAR100 and Tiny-ImageNet. By default, we set the number of iterations in data generation EG = 5 and the scaling factor λ = 5 (a configuration sketch appears below the table). |
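
The Pseudocode row points to Algorithm 1 in the paper's appendix, which is not reproduced in this report. For orientation only, below is a minimal, generic sketch of a data-free black-box distillation loop of the kind the table describes. The function name `distill_data_free`, the `query_teacher` callable, the latent dimension, and the entropy-based generator objective are all illustrative placeholders, not IDEAL's actual Algorithm 1 or losses.

```python
import torch
import torch.nn.functional as F


def distill_data_free(generator, student, query_teacher,
                      opt_g, opt_s, rounds, e_g=5, batch_size=250,
                      latent_dim=100, device="cpu"):
    """Generic data-free black-box distillation loop (illustrative only).

    query_teacher: callable taking a batch of synthetic inputs and returning
    the black-box teacher's predicted class indices (hard labels).
    """
    for _ in range(rounds):
        # 1) Data generation: update the generator for e_g iterations
        #    against the current student (no teacher queries in this step).
        for _ in range(e_g):
            z = torch.randn(batch_size, latent_dim, device=device)
            x = generator(z)
            # Placeholder objective: push the student toward confident
            # (low-entropy) predictions on synthetic data.
            probs = F.softmax(student(x), dim=1)
            loss_g = -(probs * probs.clamp_min(1e-8).log()).sum(dim=1).mean()
            opt_g.zero_grad()
            loss_g.backward()
            opt_g.step()

        # 2) Query the black-box teacher on a synthetic batch.
        with torch.no_grad():
            z = torch.randn(batch_size, latent_dim, device=device)
            x = generator(z)
            y_teacher = query_teacher(x)  # hard labels from the black box

        # 3) Knowledge distillation: fit the student to the teacher's labels.
        loss_s = F.cross_entropy(student(x), y_teacher)
        opt_s.zero_grad()
        loss_s.backward()
        opt_s.step()
```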
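
The Experiment Setup row translates directly into optimizer configuration. The following is a minimal sketch, assuming PyTorch; the hyperparameter values come from the row above, while the stand-in `generator` and `student` modules are placeholders used only so the snippet runs (the paper's actual architectures are described in its experiments section).

```python
import torch
import torch.nn as nn

# Hyperparameters as reported in the Experiment Setup row.
LR_GENERATOR = 1e-3   # Adam learning rate eta_G
LR_STUDENT = 1e-2     # SGD learning rate eta_S
MOMENTUM = 0.9
E_G = 5               # data-generation iterations per round
LAMBDA = 5            # scaling factor lambda
BATCH_SIZE = 250      # 1000 for CIFAR100 and Tiny-ImageNet

# Stand-in modules so the snippet is self-contained; replace with the
# actual generator and student architectures.
generator = nn.Sequential(nn.Linear(100, 3 * 32 * 32), nn.Tanh())
student = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))

opt_g = torch.optim.Adam(generator.parameters(), lr=LR_GENERATOR)
opt_s = torch.optim.SGD(student.parameters(), lr=LR_STUDENT, momentum=MOMENTUM)
```

With these in place, the generic loop sketched above could be driven as `distill_data_free(generator, student, query_teacher, opt_g, opt_s, rounds, e_g=E_G, batch_size=BATCH_SIZE)`, where `query_teacher` wraps whatever black-box prediction API is available and `rounds` is set by the query budget.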