Reinforced Few-Shot Acquisition Function Learning for Bayesian Optimization
Authors: Bing-Jing Hsieh, Ping-Chun Hsieh, Xi Liu
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through extensive experiments, we demonstrate that FSAF achieves comparable or better regrets than the state-of-the-art benchmarks on a wide variety of synthetic and real-world test functions. (Section 4: Experimental Results) |
| Researcher Affiliation | Collaboration | (1) Department of Computer Science, National Yang Ming Chiao Tung University, Hsinchu, Taiwan; (2) Applied Machine Learning, Facebook AI, Menlo Park, CA, USA |
| Pseudocode | Yes | The pseudo code of the training procedure of FSAF is provided in Appendix A. |
| Open Source Code | Yes | The source code for our experiments has been made publicly available at https://github.com/pinghsieh/FSAF. |
| Open Datasets | Yes | We proceed to evaluate FSAF on test functions obtained from five open-source real-world tasks in different application domains. The detailed description of the datasets is in Appendix C. In this setting, we consider 1-shot adaptation for FSAF, a rather sample-efficient scenario of few-shot learning. From Figure 3, we observe that FSAF remains the best or among the best for all the five real-world test functions, despite the salient structural differences of the datasets. For training, we construct a collection of training tasks, each of which is a class of GP functions with either an RBF, Matern-3/2, or a spectral mixture kernel with different parameters (e.g., lengthscale and periods). |
| Dataset Splits | Yes | For any initial model parameters θ and training set D_τ^tr of a task τ, let M(θ, D_τ^tr) be an algorithm that outputs the adapted model parameters by applying few-shot fast adaptation to θ based on D_τ^tr. The performance of the adapted model is evaluated on D_τ^val by a meta-loss function L(M(θ, D_τ^tr), D_τ^val). (See the adaptation sketch below the table.) |
| Hardware Specification | Yes | Our experiments are conducted on NVIDIA GeForce RTX 2080 Ti GPUs. The average training time for FSAF is around 25 GPU-hours. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python version, library versions like PyTorch or TensorFlow). |
| Experiment Setup | Yes | Configuration of FSAF. For training, we construct a collection of training tasks, each of which is a class of GP functions with either an RBF, Matern-3/2, or a spectral mixture kernel with different parameters (e.g., lengthscale and periods). For the reward design of FSAF, we use g(z) = log z to encourage high-accuracy solutions. We choose N = 5, K = 1, and S = 1 given the limitation of GPU memory. For testing, we use the model with the best average total return during training as our initial model, which is later fine-tuned via few-shot fast adaptation for each task. For a fair comparison, we ensure that FSAF and MetaBO-T use the same amount of meta-data in each experiment. (See the GP-task sketch below the table.) |
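
The Dataset Splits row describes a meta-train/meta-validation split per task: adapt the initial parameters θ on D_τ^tr, then score the adapted parameters on D_τ^val with a meta-loss. The sketch below shows this MAML-style pattern in generic PyTorch; the names `adapt`, `meta_objective`, and the toy loss are illustrative placeholders, not the authors' implementation.

```python
import torch

def adapt(theta, D_tr, loss_fn, inner_lr=0.01, steps=1):
    """Few-shot fast adaptation M(theta, D_tr): a few gradient steps on D_tr."""
    adapted = list(theta)
    for _ in range(steps):
        loss = loss_fn(adapted, D_tr)
        # create_graph keeps the inner-loop graph so meta-gradients can flow to theta
        grads = torch.autograd.grad(loss, adapted, create_graph=True)
        adapted = [p - inner_lr * g for p, g in zip(adapted, grads)]
    return adapted

def meta_objective(theta, tasks, loss_fn):
    """Average meta-loss L(M(theta, D_tr), D_val) over a batch of tasks."""
    total = 0.0
    for D_tr, D_val in tasks:          # each task tau provides (D_tr, D_val)
        adapted = adapt(theta, D_tr, loss_fn)
        total = total + loss_fn(adapted, D_val)
    return total / len(tasks)

# Toy usage: 1-D linear regression with 1-shot adaptation (illustrative only).
theta = [torch.zeros(1, requires_grad=True)]
mse = lambda params, data: ((data[0] * params[0] - data[1]) ** 2).mean()
x, y = torch.randn(5), torch.randn(5)
loss = meta_objective(theta, [((x, y), (x, y))], mse)
loss.backward()  # meta-gradient with respect to theta
```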
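The Experiment Setup row states that each training task is a class of GP functions with an RBF, Matern-3/2, or spectral mixture kernel, and that the reward uses g(z) = log z. The sketch below shows one conventional way to draw such GP sample functions on a random grid (spectral mixture omitted for brevity); the kernel formulas are standard, the function names are assumptions, and the argument z of the reward is left as a placeholder for the paper's improvement-type quantity.

```python
import numpy as np

def rbf_kernel(X, lengthscale=0.3):
    """Squared-exponential (RBF) kernel matrix on inputs X of shape (n, dim)."""
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * d2 / lengthscale ** 2)

def matern32_kernel(X, lengthscale=0.3):
    """Matern-3/2 kernel matrix on inputs X of shape (n, dim)."""
    d = np.sqrt(np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1))
    a = np.sqrt(3.0) * d / lengthscale
    return (1.0 + a) * np.exp(-a)

def sample_gp_task(n_grid=100, dim=1, kernel="rbf", lengthscale=0.3, seed=None):
    """Draw one training function f evaluated on a random grid from a GP prior."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(0.0, 1.0, size=(n_grid, dim))
    K = {"rbf": rbf_kernel, "matern32": matern32_kernel}[kernel](X, lengthscale)
    f = rng.multivariate_normal(np.zeros(n_grid), K + 1e-8 * np.eye(n_grid))
    return X, f

def reward(z):
    """Reward design g(z) = log z; z stands in for the quantity rewarded in the paper."""
    return np.log(z)

X, f = sample_gp_task(kernel="matern32", lengthscale=0.2, seed=0)
```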