Data-dependent Gaussian Prior Objective for Language Generation
Authors: Zuchao Li, Rui Wang, Kehai Chen, Masao Utiyama, Eiichiro Sumita, Zhuosheng Zhang, Hai Zhao
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results show that the proposed method makes effective use of a more detailed prior in the data and has improved performance in typical language generation tasks, including supervised and unsupervised machine translation, text summarization, storytelling, and image captioning. (Abstract) |
| Researcher Affiliation | Collaboration | (1) Department of Computer Science and Engineering, Shanghai Jiao Tong University; (2) Key Laboratory of Shanghai Education Commission for Intelligent Interaction and Cognitive Engineering, Shanghai Jiao Tong University, Shanghai, China; (3) MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University; (4) National Institute of Information and Communications Technology (NICT), Kyoto, Japan |
| Pseudocode | No | The paper describes the proposed method using mathematical equations and prose but does not include any structured pseudocode or algorithm blocks. (A hedged sketch of the objective is given after this table.) |
| Open Source Code | No | The paper does not provide an explicit statement or a link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | We evaluated the model on several widely used translation tasks: WMT14 English-to-German (EN→DE), English-to-French (EN→FR), and WMT16 English-to-Romanian (EN→RO)... (Section 5.2) The Annotated Gigaword corpus (Napoles et al., 2012) was used as the benchmark... (Section 5.4) ...on the MSCOCO 2014 caption dataset (Lin et al., 2014)... (Section 5.6) |
| Dataset Splits | Yes | The newstest2013 and newstest2014 datasets were used as the dev set and test set, respectively. (A.3) The newstest2012 and newstest2013 datasets were combined for validation and newstest2014 was used as the test set... (A.3) The data include approximately 3.8M training samples, 400,000 validation samples, and 2000 test samples. (A.4) |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU models, CPU specifications) used to run the experiments. |
| Software Dependencies | No | The paper mentions several software tools and libraries (e.g., fastText, Transformer NMT, multi-bleu.pl, ROUGE, SPICE, CIDEr, METEOR) but does not provide specific version numbers for these components, which would be necessary for reproducible software dependencies. |
| Experiment Setup | Yes | During training with our D2GPo, the weight of the KL divergence item λ was set to 0.1, and the softmax temperature was T = 2.0 in all experiments. (A.7) ...we carried out experiments on WMT14 EN-DE with the Transformer-base model as the baseline and set λ as [0, 0.1, 0.2, 0.5, 1.0], T as [1.0, 2.0, 5.0, 10.0]. (A.7) |
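
For orientation, here is a minimal PyTorch sketch of what a D2GPo-style objective looks like; the quotes in the table state only the hyperparameters. This is an illustration under stated assumptions, not the authors' implementation: the names `d2gpo_prior` and `d2gpo_loss` are invented here, the Gaussian kernel is applied directly to embedding distances (the paper builds the prior from the distance-sorted vocabulary order), σ is left as a free parameter, and the `ce + λ·KL` mixing is one common convention. The defaults λ = 0.1 and T = 2.0 follow the quoted setup.

```python
import torch
import torch.nn.functional as F

def d2gpo_prior(embeddings: torch.Tensor, gold: torch.Tensor,
                sigma: float = 1.0, temperature: float = 2.0) -> torch.Tensor:
    """Data-dependent Gaussian prior over the vocabulary for each gold token.

    embeddings: (V, d) fixed pre-trained token embeddings (e.g. fastText);
    gold: (B,) gold token ids. Returns a (B, V) distribution that places
    more mass on tokens whose embeddings lie near the gold token's embedding.
    Assumption: the kernel acts on raw distances, not on the sorted order.
    """
    gold_vecs = embeddings[gold]                          # (B, d)
    dist = torch.cdist(gold_vecs, embeddings)             # (B, V) Euclidean distances
    kernel = torch.exp(-dist.pow(2) / (2 * sigma ** 2))   # Gaussian kernel on distances
    return F.softmax(kernel / temperature, dim=-1)        # temperature-smoothed prior

def d2gpo_loss(logits: torch.Tensor, gold: torch.Tensor,
               embeddings: torch.Tensor, lam: float = 0.1,
               temperature: float = 2.0) -> torch.Tensor:
    """Cross-entropy plus a KL(prior || model) term weighted by `lam`."""
    ce = F.cross_entropy(logits, gold)
    log_p = F.log_softmax(logits, dim=-1)                 # model distribution (log space)
    with torch.no_grad():                                 # prior is fixed, not learned
        q = d2gpo_prior(embeddings, gold, temperature=temperature)
    kl = F.kl_div(log_p, q, reduction="batchmean")        # KL(q || p)
    return ce + lam * kl                                  # mixing convention assumed here
```

At λ = 0 the KL term vanishes and training reduces to plain maximum likelihood, which is consistent with the sweep λ ∈ [0, 0.1, 0.2, 0.5, 1.0] quoted above.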