AANG: Automating Auxiliary Learning
Authors: Lucio M. Dery, Paul Michel, Mikhail Khodak, Graham Neubig, Ameet Talwalkar
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | With natural language processing (NLP) as our domain of study, we demonstrate that our automated auxiliary learning pipeline leads to strong improvements over competitive baselines across continued training experiments on a pre-trained model on 5 NLP tasks. |
| Researcher Affiliation | Collaboration | Lucio M. Dery¹, Paul Michel², Mikhail Khodak¹, Graham Neubig¹, Ameet Talwalkar¹,³ (¹Carnegie Mellon University, ²ENS PSL University, ³Hewlett Packard Enterprise) |
| Pseudocode | Yes | Algorithm 1 AANG |
| Open Source Code | Yes | Code available at: https://github.com/ldery/Automating-Auxiliary-Learning |
| Open Datasets | Yes | Table 4 in Appendix C provides details of the 5 datasets used: CHEMPROT (BIOMED, Kringelum et al., 2016); SCIERC (CS, Luan et al., 2018); SE-2016-6 (STANCE, Mohammad et al., 2016); ACL-ARC (CS, Jurgens et al., 2018); H.PARTISAN (NEWS, Kiesel et al., 2019) |
| Dataset Splits | Yes | Table 4: Specifications of datasets used to evaluate our methods, including Train Size, Dev Size, and Test Size columns. |
| Hardware Specification | Yes | All models were trained on one of two types of GPUs: NVIDIA A100 or NVIDIA A6000. |
| Software Dependencies | No | The paper mentions using the "AdamW optimizer" and "RoBERTa-base" but does not provide specific version numbers for these or other software libraries/dependencies. |
| Experiment Setup | Yes | Training Details: Please see Appendix D for more details about hyper-parameter configurations. ... We use a batch size of 128 for all end-tasks except H.PARTISAN, where we use a batch size of 64. ... We use the AdamW optimizer (Loshchilov & Hutter, 2017) with weight decay of 0.01 for all experiments. (A minimal configuration sketch follows the table.) |
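
The quoted hyper-parameters can be summarized in a short configuration sketch. This is a minimal illustration assuming PyTorch and HuggingFace Transformers: the AdamW optimizer, the weight decay of 0.01, and the per-task batch sizes come from the excerpt above, while the learning rate, label count, and model-loading details are hypothetical placeholders (the paper defers these to Appendix D).

```python
# Minimal sketch of the reported fine-tuning configuration (not the authors' code).
import torch
from transformers import RobertaForSequenceClassification, RobertaTokenizerFast

# Batch sizes quoted in the paper: 128 for all end-tasks except H.PARTISAN (64).
TASK_BATCH_SIZES = {
    "CHEMPROT": 128,
    "SCIERC": 128,
    "SE-2016-6": 128,
    "ACL-ARC": 128,
    "H.PARTISAN": 64,
}

def build_setup(task_name: str, num_labels: int):
    """Return (model, tokenizer, optimizer, batch_size) for one end-task."""
    model = RobertaForSequenceClassification.from_pretrained(
        "roberta-base", num_labels=num_labels
    )
    tokenizer = RobertaTokenizerFast.from_pretrained("roberta-base")
    # AdamW with weight decay 0.01, as reported; the learning rate is an
    # assumed placeholder (actual values are in Appendix D of the paper).
    optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5, weight_decay=0.01)
    return model, tokenizer, optimizer, TASK_BATCH_SIZES[task_name]

# Example usage with an illustrative label count.
model, tokenizer, optimizer, batch_size = build_setup("H.PARTISAN", num_labels=2)
```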