Toward Adversarial Training on Contextualized Language Representation
Authors: Hongqiu Wu, Yongxiang Liu, Hanwen Shi, Hai Zhao, Min Zhang
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Beyond the success story of adversarial training (AT) in the recent text domain on top of pre-trained language models (PLMs), our empirical study showcases the inconsistent gains from AT on some tasks, e.g. commonsense reasoning, named entity recognition. |
| Researcher Affiliation | Academia | Hongqiu Wu1,2 & Yongxiang Liu1,2 & Hanwen Shi1,2 & Hai Zhao1,2, & Min Zhang3 1Department of Computer Science and Engineering, Shanghai Jiao Tong University 2Key Laboratory of Shanghai Education Commission for Intelligent Interaction and Cognitive Engineering, Shanghai Jiao Tong University, Shanghai, China 3School of Computer Science and Technology, Soochow University, Suzhou, China |
| Pseudocode | Yes | Algorithm 1 Contextualized representation-Adversarial Training |
| Open Source Code | Yes | https://github.com/gingasan/CreAT |
| Open Datasets | Yes | For the training corpus, we use a subset (nearly 100GB) of C4 (Raffel et al., 2020). |
| Dataset Splits | Yes | For dev sets (upper), we report the results over five runs and report the mean and variance for each. For test sets (bottom), the results are taken from the official leaderboard, where Cre AT achieved the new state-of-the-art on March 16, 2022. |
| Hardware Specification | Yes | Training a base/large-size model takes about 30/100 hours on 16 V100 GPUs with FP16. |
| Software Dependencies | No | The paper mentions “The implementation is based on transformers (Wolf et al., 2020)” but does not specify version numbers for this or any other software dependencies. |
| Experiment Setup | Yes | Table 7: Hyperparameters for pre-training. Table 8: Hyperparameters for fine-tuning BERT. (dp: dropout rate, bsz: batch size, lr: learning rate, wd: weight decay, msl: max sequence length, wp: warmup, ep: epochs). |
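
As a companion to the pseudocode and setup rows above, the following is a minimal PyTorch sketch of the general idea of adversarial training applied to contextualized representations, i.e. perturbing the encoder's output hidden states rather than the input embeddings. It is not a reproduction of the paper's Algorithm 1 (CreAT): the toy encoder, pooling head, `epsilon`, the single gradient-ascent step, and the clean-plus-adversarial loss combination are all illustrative assumptions, not values or design choices taken from the paper.

```python
# Illustrative sketch: adversarial training on contextualized (encoder-output)
# representations. NOT the paper's Algorithm 1 (CreAT); all components below
# are assumptions chosen to keep the example small and runnable.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ToyEncoderClassifier(nn.Module):
    """Toy Transformer encoder + linear head standing in for a PLM."""

    def __init__(self, vocab_size=1000, d_model=64, num_labels=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, num_labels)

    def contextualize(self, input_ids):
        # Contextualized representations: the encoder's output hidden states.
        return self.encoder(self.embed(input_ids))

    def classify(self, hidden):
        # Mean-pool the (possibly perturbed) hidden states, then classify.
        return self.head(hidden.mean(dim=1))


def adversarial_step(model, input_ids, labels, epsilon=1e-2):
    """One training step: clean loss plus loss under a perturbation of the
    contextualized representations found by a single gradient-ascent step.
    (epsilon and the single-step ascent are illustrative assumptions.)"""
    hidden = model.contextualize(input_ids)
    clean_loss = F.cross_entropy(model.classify(hidden), labels)

    # Find a worst-case direction on the hidden states; detach so the ascent
    # step does not itself contribute gradients to the model parameters.
    hidden_adv = hidden.detach().requires_grad_(True)
    adv_obj = F.cross_entropy(model.classify(hidden_adv), labels)
    grad, = torch.autograd.grad(adv_obj, hidden_adv)
    delta = epsilon * grad / (grad.norm(dim=-1, keepdim=True) + 1e-12)

    # Apply the fixed perturbation to the contextualized representations and
    # take the adversarial loss; gradients flow through the encoder via `hidden`.
    adv_loss = F.cross_entropy(model.classify(hidden + delta.detach()), labels)
    return clean_loss + adv_loss


if __name__ == "__main__":
    torch.manual_seed(0)
    model = ToyEncoderClassifier()
    opt = torch.optim.AdamW(model.parameters(), lr=1e-3)
    input_ids = torch.randint(0, 1000, (8, 16))  # fake batch of token ids
    labels = torch.randint(0, 2, (8,))
    loss = adversarial_step(model, input_ids, labels)
    opt.zero_grad()
    loss.backward()
    opt.step()
    print(f"combined loss: {loss.item():.4f}")
```

For the authors' actual algorithm, loss formulation, and hyperparameters, refer to Algorithm 1 and Tables 7–8 in the paper and the released code at https://github.com/gingasan/CreAT.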