Learning to Learn from APIs: Black-Box Data-Free Meta-Learning

Authors: Zixuan Hu, Li Shen, Zhenyi Wang, Baoyuan Wu, Chun Yuan, Dacheng Tao

ICML 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments in various real-world scenarios show the superior performance of our BiDf-MKD framework. (Sections: 5. Experiments; 5.1. Experimental Setup; 5.2. Experiments of black-box DFML in API-SS; 5.3. Experiments of black-box DFML in API-SH; 5.4. Experiments of black-box DFML in API-MH; 5.5. Ablation Studies)
Researcher Affiliation | Collaboration | 1 Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, China; 2 JD Explore Academy, Beijing, China; 3 Department of Computer Science and Engineering, University at Buffalo, NY, USA; 4 School of Data Science, The Chinese University of Hong Kong, Shenzhen, China; 5 School of Computer Science, The University of Sydney, Sydney, Australia
Pseudocode | Yes | Overall, we integrate BiDf-MKD and task memory replay in an end-to-end manner, which is summarized in Alg. 1 of App. D. (Algorithm 1: Black-box data-free meta-learning)
Open Source Code | No | The paper does not include any explicit statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets | Yes | We evaluate our BiDf-MKD framework on the meta-testing subsets of CIFAR-FS (Bertinetto et al., 2018), MiniImageNet (Vinyals et al., 2016), and CUB-200-2011 (CUB) (Wah et al., 2011).
Dataset Splits | Yes | We split each dataset into three subsets following (Wang et al., 2022c): 64 classes for meta-training, 16 classes for meta-validation, and 20 classes for meta-testing.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments.
Software Dependencies | No | The paper mentions 'PyTorch (Paszke et al., 2017)' and 'TensorFlow (Abadi et al., 2015)' as frameworks used for automatic differentiation, but it does not specify their exact version numbers or the versions of any other software dependencies.
Experiment Setup | Yes | We adopt the Adam optimizer to pre-train the model inside each black-box API via standard supervised learning with a learning rate of 0.01. In practice, we collect 100 black-box APIs. For BiDf-MKD, we recover 30 images each for the support set and the query set. We adopt the Adam optimizer to optimize the generator parameters θ_G and the input z simultaneously by minimizing Eq. (2) with a learning rate of 0.001 for 200 epochs. We adopt the Adam optimizer to optimize the meta-model parameters θ by minimizing Eq. (8) with an inner-level learning rate of 0.01 and an outer-level learning rate of 0.001. For boundary query set recovery, we empirically set the coefficient λ_Q to 1. For task memory replay, we adopt MAML to perform meta-learning on the interpolated tasks, using the Adam optimizer with an inner-level learning rate of 0.01 and an outer-level learning rate of 0.001. For the zero-order gradient estimator, we query each API with 100 random direction vectors drawn from the sphere of a unit ball, and we set the smoothing parameter µ to 0.005 in Eq. (5).
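
For readers attempting reproduction, the zero-order gradient estimation described in the setup can be illustrated with a short sketch. Since the paper releases no code, this is only a minimal PyTorch-style example, not the authors' implementation: it assumes a hypothetical scalar oracle `api_loss` standing in for one black-box API query, and uses the reported settings of 100 unit-sphere directions and smoothing parameter µ = 0.005.

```python
import torch

def zero_order_grad(api_loss, w, num_dirs=100, mu=0.005):
    """Two-point zero-order estimate of the gradient of a black-box loss.

    Generic sketch of the standard random-direction estimator; it mirrors
    the settings reported in the paper but is not the authors' code.

    api_loss: callable mapping a flat parameter vector to a scalar loss;
              each call corresponds to one query of the black-box API.
    w:        flat 1-D tensor of parameters at which to estimate the gradient.
    """
    d = w.numel()
    base = api_loss(w)                   # shared baseline query
    grad = torch.zeros_like(w)
    for _ in range(num_dirs):
        u = torch.randn(d)
        u = u / u.norm()                 # random direction on the unit sphere
        grad += (api_loss(w + mu * u) - base) / mu * u
    return grad * d / num_dirs           # dimension-scaled average
```

With forward differences against a shared baseline, one gradient estimate costs num_dirs + 1 API queries, so the paper's choice of 100 directions trades query cost against estimator variance.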