Making Scalable Meta Learning Practical
Authors: Sang Keun Choe, Sanket Vaibhav Mehta, Hwijeen Ahn, Willie Neiswanger, Pengtao Xie, Emma Strubell, Eric Xing
NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Evaluated on multiple large-scale meta learning benchmarks, SAMA showcases up to 1.7/4.8x increase in throughput and 2.0/3.8x decrease in memory consumption respectively on single-/multi-GPU setups compared to other baseline meta learning algorithms. |
| Researcher Affiliation | Academia | Carnegie Mellon University, Stanford University, UCSD, Allen Institute for AI, MBZUAI |
| Pseudocode | No | The paper does not contain any clearly labeled "Pseudocode" or "Algorithm" blocks. |
| Open Source Code | Yes | To facilitate research in scalable meta learning, we provide our implementation of SAMA with the above communication optimization in Betty that only requires a one-line change in the configuration. (See the configuration sketch below the table.) |
| Open Datasets | Yes | text classification with a BERT-base model with 110M parameters on multiple weak supervision datasets from the WRENCH benchmark [67]. |
| Dataset Splits | Yes | WRENCH dev set |
| Hardware Specification | Yes | We used 1 NVIDIA RTX 2080Ti GPU for the main experiment, and 4 NVIDIA Tesla V100 GPUs for the throughput-memory analysis in Table 2 and Figure 1. |
| Software Dependencies | No | The paper mentions "PyTorch [46]" and the "Betty" library, but does not provide specific version numbers for these software components. |
| Experiment Setup | Yes | model: BERT-base, optimizer: Adam, init_lr: 1e-5, lr_scheduler: cosine, wdecay: 0, dataset: WRENCH train set (with majority voting), unroll step: 10, SAMA α: 1.0 |
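The "one-line change in the configuration" quoted in the Open Source Code row plausibly corresponds to selecting SAMA as the hypergradient method in Betty's per-problem config. The snippet below is a minimal sketch, not the authors' released code: it assumes Betty's `Config` dataclass exposes a hypergradient `type` field with a `"sama"` option and an `unroll_steps` field, and that the config attaches to the lower-level problem; all of these should be verified against the released library.

```python
# A sketch, not the authors' exact code: we assume Betty's Config accepts a
# hypergradient `type` field and that SAMA is enabled with type="sama".
from betty.configs import Config

inner_config = Config(
    type="sama",      # assumed one-line switch that selects the SAMA hypergradient
    unroll_steps=10,  # matches the "unroll step: 10" entry in the Experiment Setup row
)
```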
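For concreteness, the hyperparameters in the Experiment Setup row translate into roughly the following lower-level training loop. This is a sketch under stated assumptions (the Hugging Face `bert-base-uncased` checkpoint, an assumed step budget `NUM_STEPS`, and a plain Adam/cosine pairing), not the authors' training script; the upper-level meta problem trained on the WRENCH dev set is omitted.

```python
# Sketch of the reported lower-level setup: BERT-base, Adam, init_lr 1e-5,
# weight decay 0, cosine schedule. NUM_STEPS and the checkpoint name are
# illustrative assumptions, not values from the paper.
import torch
from transformers import BertForSequenceClassification

NUM_STEPS = 10_000  # assumed; the total step budget is not listed in the table

model = BertForSequenceClassification.from_pretrained("bert-base-uncased")  # ~110M params
optimizer = torch.optim.Adam(model.parameters(), lr=1e-5, weight_decay=0.0)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=NUM_STEPS)

for step in range(NUM_STEPS):
    # Forward/backward on a WRENCH train batch labeled by majority voting
    # would go here before the parameter update.
    optimizer.step()
    scheduler.step()
    optimizer.zero_grad()
```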