Meta-learning with an Adaptive Task Scheduler
Authors: Huaxiu Yao, Yu Wang, Ying Wei, Peilin Zhao, Mehrdad Mahdavi, Defu Lian, Chelsea Finn
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Under the setting of meta-learning with noise and limited budgets, ATS improves the performance on both miniImagenet and a real-world drug discovery benchmark by up to 13% and 18%, respectively, compared to state-of-the-art task schedulers. In this section, we empirically demonstrate the effectiveness of the proposed ATS through comprehensive experiments on both regression and classification problems. |
| Researcher Affiliation | Collaboration | 1 Stanford University, 2 University of Science and Technology of China, 3 Tencent AI Lab, 4 Pennsylvania State University, 5 City University of Hong Kong |
| Pseudocode | Yes | Algorithm 1 Meta-training Process with ATS (a hedged sketch of this loop is given after the table). |
| Open Source Code | No | We will open-source the code once the paper is accepted. |
| Open Datasets | Yes | First, we use miniImagenet as the classification dataset, where we apply the conventional N-way, K-shot setting to create tasks [4]. ... The second dataset aims to predict the activity of drug compounds [21]... These datasets are public datasets and we have cited the related reference. |
| Dataset Splits | Yes | There are 4,276 assays in total, and we split 4,100 / 76 / 100 tasks for meta-training / validation / testing, respectively. |
| Hardware Specification | Yes | All experiments are conducted on an NVIDIA RTX 2080 Ti GPU. |
| Software Dependencies | No | The paper mentions implementing the model using PyTorch but does not specify its version or any other software dependencies with version numbers. |
| Experiment Setup | Yes | We use the Adam optimizer with a learning rate of 0.001 and set the momentum parameters to 0.9 and 0.999. The batch size is set to 4 tasks... The number of meta-training iterations is set to 60000. For the inner loop, we perform 5 gradient descent steps, and the learning rate α is set to 0.01. For the neural scheduler, the learning rate for φ is set to 0.001 and the number of layers of the MLP is 2, with hidden size 64. (Illustrative sketches of the scheduler loop and this configuration follow the table.) |
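
The Pseudocode row points to Algorithm 1, the meta-training process with ATS. The snippet below is a minimal, hedged sketch of that loop under assumed interfaces: a small MLP scheduler scores a pool of candidate tasks, a batch of 4 tasks is sampled from the resulting distribution, the meta-learner is updated on those tasks, and the scheduler parameters φ are then updated from a validation-style signal. Only the hyperparameters reported in the table (hidden size 64, batch of 4 tasks, learning rates) come from the paper; the feature construction, the toy meta-model, and the placeholder reward are illustrative assumptions, not the authors' released code.

```python
# Hedged sketch of Algorithm 1 (meta-training with a neural task scheduler).
# Values marked "per the paper" come from the table above; all other names
# (FEATURE_DIM, candidate_feats, the toy meta_model, the reward) are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

FEATURE_DIM, HIDDEN, BATCH_TASKS = 8, 64, 4      # hidden size 64, 4 tasks/batch per the paper

class NeuralScheduler(nn.Module):
    """Two-layer MLP that assigns a score to each candidate task."""
    def __init__(self, feature_dim=FEATURE_DIM, hidden=HIDDEN):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(feature_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, task_feats):                # (num_candidates, feature_dim)
        return self.net(task_feats).squeeze(-1)   # one logit per candidate task

scheduler = NeuralScheduler()
meta_model = nn.Linear(FEATURE_DIM, 5)            # toy stand-in for the real meta-learner
meta_opt = torch.optim.Adam(meta_model.parameters(), lr=1e-3, betas=(0.9, 0.999))
sched_opt = torch.optim.Adam(scheduler.parameters(), lr=1e-3)   # phi learning rate 0.001 per the paper

for iteration in range(3):                        # the paper trains for 60,000 iterations
    # 1. Score a pool of candidate tasks and sample a batch of 4 tasks.
    candidate_feats = torch.randn(32, FEATURE_DIM)            # placeholder task features
    probs = F.softmax(scheduler(candidate_feats), dim=0)
    picked = torch.multinomial(probs, BATCH_TASKS, replacement=False)

    # 2. Meta-update the model on the sampled tasks (the MAML-style inner loop of
    #    5 steps with learning rate 0.01 is omitted here; see the next snippet).
    task_x = candidate_feats[picked]
    task_y = torch.randint(0, 5, (BATCH_TASKS,))
    meta_loss = F.cross_entropy(meta_model(task_x), task_y)
    meta_opt.zero_grad()
    meta_loss.backward()
    meta_opt.step()

    # 3. Update the scheduler parameters phi from a validation-style signal,
    #    here a REINFORCE-like placeholder that rewards low meta-loss.
    reward = -meta_loss.detach()
    sched_loss = -torch.log(probs[picked]).sum() * reward
    sched_opt.zero_grad()
    sched_loss.backward()
    sched_opt.step()
```

The Experiment Setup row can likewise be turned into a small configuration sketch. The snippet below wires the reported values (Adam with learning rate 0.001 and momentum parameters 0.9/0.999, 5 inner-loop steps with step size α = 0.01, a batch of 4 tasks) into a second-order MAML-style update. The toy model, the random support/query tensors, and the 5-way shapes are placeholders rather than the paper's architecture or data pipeline, and the snippet assumes PyTorch ≥ 2.0 for `torch.func.functional_call`.

```python
# Hedged sketch of the reported training configuration. Only the optimizer
# settings, inner-loop steps/step size, and task batch size come from the
# paper; the model and data below are illustrative placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.func import functional_call   # requires PyTorch >= 2.0

model = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 5))
outer_opt = torch.optim.Adam(model.parameters(), lr=1e-3, betas=(0.9, 0.999))

def adapt(support_x, support_y, steps=5, alpha=0.01):
    """Inner loop: 5 gradient steps on the support set, keeping the graph
    so the outer (meta) update can differentiate through the adaptation."""
    params = dict(model.named_parameters())
    for _ in range(steps):
        logits = functional_call(model, params, (support_x,))
        loss = F.cross_entropy(logits, support_y)
        grads = torch.autograd.grad(loss, tuple(params.values()), create_graph=True)
        params = {name: p - alpha * g for (name, p), g in zip(params.items(), grads)}
    return params

# One toy meta-training step over a batch of 4 tasks (5-way classification).
outer_opt.zero_grad()
for _ in range(4):
    sx, sy = torch.randn(25, 16), torch.randint(0, 5, (25,))   # placeholder support set
    qx, qy = torch.randn(75, 16), torch.randint(0, 5, (75,))   # placeholder query set
    adapted = adapt(sx, sy)
    query_loss = F.cross_entropy(functional_call(model, adapted, (qx,)), qy)
    (query_loss / 4).backward()   # average the meta-gradient over the 4 tasks
outer_opt.step()
```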
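
Both snippets are sketches under the stated assumptions; the authors report that their implementation uses PyTorch but, as noted in the Software Dependencies row, no version or full dependency list is given, so exact reproduction would still require the authors' released code.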