Character n-Gram Embeddings to Improve RNN Language Models

Authors: Sho Takase, Jun Suzuki, Masaaki Nagata

AAAI 2019, pp. 5074-5082

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments indicate that the proposed method outperforms neural language models trained with well-tuned hyperparameters and achieves state-of-the-art scores on each dataset. In addition, we incorporate our proposed method into a standard neural encoder-decoder model and investigate its effect on machine translation and headline generation. We indicate that the proposed method also has a positive effect on such tasks.
Researcher Affiliation | Collaboration | NTT Communication Science Laboratories; Tohoku University. Emails: sho.takase@nlp.c.titech.ac.jp, jun.suzuki@ecei.tohoku.ac.jp, nagata.masaaki@lab.ntt.co.jp. Current affiliation: Tokyo Institute of Technology.
Pseudocode | No | The paper does not include any pseudocode or algorithm blocks.
Open Source Code | No | The paper mentions external implementations (e.g., https://github.com/salesforce/awd-lstm-lm) for baseline models but does not provide an explicit statement or link to the source code of the authors' own proposed method or implementation.
Open Datasets | Yes | We used the standard benchmark datasets for word-level language modeling: Penn Treebank (PTB) (Marcus, Marcinkiewicz, and Santorini 1993), WikiText-2 (WT2), and WikiText-103 (WT103) (Merity et al. 2017). Mikolov et al. (2010) and Merity et al. (2017) published pre-processed versions of PTB (http://www.fit.vutbr.cz/~mikolov/rnnlm/), WT2, and WT103 (https://einstein.ai/research/the-wikitext-long-term-dependency-language-modeling-dataset). Following the previous studies, we used these pre-processed datasets for our experiments. Table 1 describes the statistics of the datasets. (A generic data-loading sketch is included after this table.)
Dataset Splits | Yes | Table 1 ("Statistics of PTB, WT2, and WT103") provides Vocab, Train, Valid, and Test counts for each dataset.
Hardware Specification | Yes | We calculated it on the NVIDIA Tesla P100.
Software Dependencies | No | The paper mentions using specific models such as LSTM and QRNN and refers to implementations of baseline models, but it does not provide version numbers for any software dependencies (e.g., Python, PyTorch/TensorFlow, or other libraries).
Experiment Setup | Yes | We set the embedding size and dimension of the LSTM hidden state to 500 for machine translation and 400 for headline generation. The mini-batch size is 64 for machine translation and 256 for headline generation. For other hyperparameters, we followed the configurations described in (Kiyono et al. 2017).
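
The Open Datasets entry above refers to the standard pre-processed, word-level versions of PTB, WT2, and WT103. The sketch below shows one common way such token files are read and numericalized; it is not taken from the paper, and the file names (ptb.train.txt, ptb.valid.txt) and helper functions (read_tokens, build_vocab, encode) are illustrative assumptions.

```python
# Minimal sketch (not from the paper): reading pre-processed, word-level
# language-modeling files and mapping tokens to integer ids.
from collections import Counter

def read_tokens(path):
    """Return the file as a flat list of word tokens, with <eos> appended per line."""
    tokens = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            tokens.extend(line.split() + ["<eos>"])
    return tokens

def build_vocab(tokens):
    """Map every word type to an integer id, most frequent first."""
    counts = Counter(tokens)
    return {w: i for i, (w, _) in enumerate(counts.most_common())}

def encode(tokens, vocab):
    """Convert a token list into a list of integer ids."""
    return [vocab[w] for w in tokens]

if __name__ == "__main__":
    train = read_tokens("ptb.train.txt")   # assumed file name
    vocab = build_vocab(train)
    train_ids = encode(train, vocab)
    print(f"vocab size: {len(vocab)}, train tokens: {len(train_ids)}")
```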
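
The Experiment Setup entry reports only the embedding/hidden sizes (500 for machine translation, 400 for headline generation) and the mini-batch sizes (64 and 256). The sketch below wires those numbers into a generic PyTorch LSTM encoder-decoder as a sanity check that the reported dimensions fit together; the vocabulary size of 32,000 and the model structure are assumptions, not the authors' implementation, and the remaining hyperparameters (taken from Kiyono et al. 2017) are not reproduced here.

```python
# Sketch of the reported sizes only; a generic LSTM encoder-decoder,
# NOT the authors' implementation.
import torch
import torch.nn as nn

CONFIGS = {
    "machine_translation": {"emb": 500, "hidden": 500, "batch_size": 64},
    "headline_generation": {"emb": 400, "hidden": 400, "batch_size": 256},
}

class Seq2Seq(nn.Module):
    def __init__(self, src_vocab, tgt_vocab, emb, hidden):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, emb)
        self.tgt_emb = nn.Embedding(tgt_vocab, emb)
        self.encoder = nn.LSTM(emb, hidden, batch_first=True)
        self.decoder = nn.LSTM(emb, hidden, batch_first=True)
        self.out = nn.Linear(hidden, tgt_vocab)

    def forward(self, src, tgt):
        _, state = self.encoder(self.src_emb(src))            # encode source tokens
        dec_out, _ = self.decoder(self.tgt_emb(tgt), state)   # teacher-forced decoding
        return self.out(dec_out)                              # logits over target vocab

cfg = CONFIGS["machine_translation"]
model = Seq2Seq(src_vocab=32000, tgt_vocab=32000,             # 32k vocab is an assumption
                emb=cfg["emb"], hidden=cfg["hidden"])
src = torch.randint(0, 32000, (cfg["batch_size"], 20))        # dummy batch of token ids
tgt = torch.randint(0, 32000, (cfg["batch_size"], 22))
logits = model(src, tgt)                                      # shape: (64, 22, 32000)
```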