Copy is All You Need
Authors: Tian Lan, Deng Cai, Yan Wang, Heyan Huang, Xian-Ling Mao
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct extensive experiments to verify the effectiveness of our proposed COG. On the standard language modeling benchmark (WikiText-103), our proposed COG substantially outperforms standard baselines on automatic metrics (26.14 vs. 23.43 MAUVE (Pillutla et al., 2021)) and human evaluation (48% vs. 28% human preference). (from the Introduction and Section 4, Experimental Setup) |
| Researcher Affiliation | Collaboration | Tencent AI Lab; School of Computer Science and Technology, Beijing Institute of Technology |
| Pseudocode | Yes | Algorithm 1: Phrase Segmentation Algorithm |
| Open Source Code | Yes | Our source codes are publicly available at https://github.com/gmftbyGMFTBY/Copyisallyouneed. |
| Open Datasets | Yes | On the standard language modeling benchmark (WikiText-103) ... The WikiText-103 dataset (Merity et al., 2017) contains an extensive collection of Wikipedia articles with over 100 million words... we use the English part of Law-MT (Koehn & Knowles, 2017)... The En-Wiki corpus contains a large-scale collection of Wikipedia articles with over 3 billion words |
| Dataset Splits | Yes | WikiText-103: 1,801,350 (train) / 3,760 (dev) / 4,358 (test); Law-MT: 389,292 (train) / 2,000 (dev) / 2,000 (test) |
| Hardware Specification | Yes | We train baselines and COG for 400,000 steps on 8 Tesla-V100 GPUs. |
| Software Dependencies | No | The paper mentions 'Huggingface transformers package' and specific models like 'GPT2 model' and 'BERT-base-cased model', but does not provide specific version numbers for these software dependencies or other libraries. |
| Experiment Setup | Yes | For all the baselines, the learning rate, dropout rate, and gradient clipping are set as 5e-5, 0.1, and 1.0, respectively. Due to memory limitation, the batch size is set to contain 256 phrases. For the BERT model in the phrase encoder, the maximum sequence length is set as 256. For the GPT2 model in the prefix encoder, the maximum sequence length is set as 512. (A hedged configuration sketch of these settings follows the table.) |
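
The sketch below is a minimal reconstruction of the reported training configuration using the Huggingface transformers package, which the paper names but does not pin to a version. The checkpoint identifiers (`gpt2`, `bert-base-cased`), the AdamW optimizer, and the `training_step` helper are assumptions for illustration only; the hyperparameter values come from the rows above, not from the authors' released code.

```python
# Hedged sketch of the reported COG training configuration.
# Assumptions (not stated in the paper): any recent transformers release,
# the AdamW optimizer, and the checkpoint names "gpt2" / "bert-base-cased".
import torch
from transformers import AutoModel, AutoTokenizer, GPT2LMHeadModel, GPT2Tokenizer

# Prefix encoder: GPT2, maximum sequence length 512 (as reported).
prefix_tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
prefix_encoder = GPT2LMHeadModel.from_pretrained("gpt2")
PREFIX_MAX_LEN = 512

# Phrase encoder: BERT-base-cased, maximum sequence length 256 (as reported).
phrase_tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
phrase_encoder = AutoModel.from_pretrained("bert-base-cased")
PHRASE_MAX_LEN = 256

# Reported hyperparameters: learning rate 5e-5, dropout 0.1 (also the default
# in both the GPT2 and BERT-base configs), gradient clipping 1.0, batches of
# 256 phrases, and 400,000 training steps on 8 Tesla-V100 GPUs.
LEARNING_RATE = 5e-5
GRAD_CLIP = 1.0
PHRASES_PER_BATCH = 256
TRAIN_STEPS = 400_000

params = list(prefix_encoder.parameters()) + list(phrase_encoder.parameters())
optimizer = torch.optim.AdamW(params, lr=LEARNING_RATE)  # optimizer choice assumed


def training_step(loss: torch.Tensor) -> None:
    """One optimization step with the reported gradient-clipping value."""
    optimizer.zero_grad()
    loss.backward()
    torch.nn.utils.clip_grad_norm_(params, GRAD_CLIP)
    optimizer.step()
```

The phrase-retrieval training objective itself (how the loss over 256 phrases per batch is computed) is specific to COG and is not reproduced here; the sketch only shows how the reported encoders and optimization hyperparameters fit together.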