Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Weighted Training for Cross-Task Learning
Authors: Shuxiao Chen, Koby Crammer, Hangfeng He, Dan Roth, Weijie J. Su
ICLR 2022 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The effectiveness of TAWT is corroborated through extensive experiments with BERT on four sequence tagging tasks in natural language processing (NLP), including part-of-speech (PoS) tagging, chunking, predicate detection, and named entity recognition (NER). |
| Researcher Affiliation | Academia | Shuxiao Chen (University of Pennsylvania), Koby Crammer (The Technion), Hangfeng He (University of Pennsylvania), Dan Roth (University of Pennsylvania), Weijie J. Su (University of Pennsylvania) |
| Pseudocode | Yes | Algorithm 1: Target-Aware Weighted Training (TAWT) |
| Open Source Code | Yes | Our code is publicly available at http://cogcomp.org/page/publication_view/963. |
| Open Datasets | Yes | In our experiments, we mainly use two widely-used NLP datasets, OntoNotes 5.0 (Hovy et al., 2006) and CoNLL-2000 (Tjong Kim Sang & Buchholz, 2000). |
| Dataset Splits | Yes | There are about 116K sentences, 16K sentences, and 12K sentences in the training, development, and test sets for tasks in OntoNotes 5.0. As for CoNLL-2000, there are about 9K sentences and 2K sentences in the training and test sets. |
| Hardware Specification | Yes | It usually costs about half an hour to run the experiment for each setting (e.g. one number in Table 1) on one GeForce RTX 2080 GPU. |
| Software Dependencies | No | Specifically, we use the pre-trained case-sensitive BERT-base PyTorch implementation (Wolf et al., 2020), and the common hyperparameters for BERT. ... the optimizer is Adam (Kingma & Ba, 2015). No specific version numbers for PyTorch or other libraries are provided. |
| Experiment Setup | Yes | Specifically, the max length is 128, the batch size is 32, the epoch number is 4, and the learning rate is 5e-5. ... In our experiments, we simply set the size of the randomly sampled subset of the training set as 64 ... In our experiments, we choose ηk = 1.0 in the mirror descent update (2.8). (A hedged sketch of this weight update appears below the table.) |
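
The Experiment Setup row quotes a mirror descent update on the task weights with step size ηk = 1.0. The minimal sketch below assumes this is an exponentiated-gradient (mirror descent) step on the probability simplex over the four source tasks; the function name `mirror_descent_step` and the `task_grads` values are hypothetical stand-ins for illustration only, not the paper's actual gradient estimate, which TAWT computes from the learned representations.

```python
# Hedged sketch: an exponentiated-gradient (mirror descent) step that keeps
# the task weights on the probability simplex, using the step size quoted
# above (eta_k = 1.0). The gradient vector here is made up for illustration;
# it does NOT reproduce how TAWT estimates task-weight gradients.
import torch

def mirror_descent_step(weights: torch.Tensor,
                        task_grads: torch.Tensor,
                        eta: float = 1.0) -> torch.Tensor:
    """One mirror descent update: multiplicative step, then renormalize."""
    new_weights = weights * torch.exp(-eta * task_grads)
    return new_weights / new_weights.sum()

# Toy usage: four source tasks (PoS, chunking, predicate detection, NER),
# initialised uniformly, with a hypothetical gradient vector.
weights = torch.full((4,), 0.25)
task_grads = torch.tensor([0.10, -0.05, 0.02, -0.08])  # hypothetical values
weights = mirror_descent_step(weights, task_grads, eta=1.0)
print(weights)  # still sums to 1; tasks with negative gradients gain weight
```

This multiplicative-update-and-renormalize form is the standard mirror descent step under the entropy mirror map, which is the usual reading of an update like (2.8); the paper itself should be consulted for the exact gradient estimator it plugs in.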