Adversarial Language Games for Advanced Natural Language Intelligence

Authors: Yuan Yao, Haoxi Zhong, Zhengyan Zhang, Xu Han, Xiaozhi Wang, Kai Zhang, Chaojun Xiao, Guoyang Zeng, Zhiyuan Liu, Maosong Sun (pp. 14248-14256)

AAAI 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct comprehensive experiments, including simulations between agents and games between agents and human players. Experimental results show that simple attack and defense strategies can achieve promising and interesting results.
Researcher Affiliation | Academia | Department of Computer Science and Technology, Institute for Artificial Intelligence, Tsinghua University, Beijing, China; Beijing National Research Center for Information Science and Technology, China; {yuan-yao18,zhonghx18}@mails.tsinghua.edu.cn
Pseudocode | No | The paper describes methods and strategies in text, but does not include any explicit pseudocode or algorithm blocks.
Open Source Code | Yes | The code and datasets of this paper can be obtained from https://github.com/thunlp/AdversarialTaboo.
Open Datasets | Yes | Specifically, we select 563 target words from English Wikipedia (https://en.wikipedia.org) articles for Open QA-based simulation, and 567 target words from the Reddit conversation dataset (Zhou et al. 2018) for the chatbot-based experiment.
Dataset Splits | No | In our experiments, A and D are trained (or fine-tuned) on two disjoint splits of the Reddit dataset, ensuring that the training data of D is invisible to A. The paper mentions training data and a disjoint split but does not specify the explicit train/validation/test splits (e.g., percentages or counts) needed for reproduction; a hypothetical split sketch follows the table.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments.
Software Dependencies | No | The paper mentions software components such as GPT-2, BERT, DialoGPT, ConceptFlow, and BM25, but does not provide specific version numbers for any of these components or their underlying libraries.
Experiment Setup | No | The paper describes the general setup of the game, the models used (e.g., fine-tuned GPT-2, BERT, DialoGPT), and some high-level strategies. However, it does not provide specific experimental setup details such as hyperparameter values (e.g., learning rates, batch sizes, number of epochs) or specific training configurations for the models.
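
The Dataset Splits row notes that attacker A and defender D are trained on disjoint portions of the Reddit data, but the paper does not report how the split is made. The sketch below is a minimal, hypothetical illustration of producing such a disjoint split; the 50/50 ratio, random seed, and line-delimited JSON format are assumptions for illustration only, not details taken from the paper.

```python
# Hypothetical sketch: split sizes, seed, and file format are assumptions,
# since the paper does not report them.
import json
import random

def disjoint_attacker_defender_splits(dialogue_path, ratio=0.5, seed=0):
    """Split a Reddit-style dialogue corpus into two disjoint subsets so that
    the defender's training data stays invisible to the attacker."""
    with open(dialogue_path, encoding="utf-8") as f:
        # Assumed format: one JSON-encoded dialogue per line.
        dialogues = [json.loads(line) for line in f]

    random.Random(seed).shuffle(dialogues)
    cut = int(len(dialogues) * ratio)
    attacker_split, defender_split = dialogues[:cut], dialogues[cut:]

    # Sanity check: the two training sets share no dialogues.
    assert not set(map(json.dumps, attacker_split)) & set(map(json.dumps, defender_split))
    return attacker_split, defender_split
```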