Kernelized Bayesian Softmax for Text Generation
Authors: Ning Miao, Hao Zhou, Chengqi Zhao, Wenxian Shi, Lei Li
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct experiments on a variety of text generation tasks including machine translation, language modeling, and dialog generation. The empirical results verify the effectiveness of KerBS. Ablation study indicates that each part of KerBS, including the Bayesian composition and the kernel function, is necessary for the performance improvement. |
| Researcher Affiliation | Industry | Ning Miao, Hao Zhou, Chengqi Zhao, Wenxian Shi, Lei Li (ByteDance AI Lab) {miaoning,zhouhao.nlp,zhaochengqi.d,shiwenxian,lileilab}@bytedance.com |
| Pseudocode | Yes | Algorithm 1: Training scheme for KerBS (a hedged sketch of the KerBS scoring layer appears after the table). |
| Open Source Code | No | The paper does not include any statement or link providing access to the source code for the described methodology. |
| Open Datasets | Yes | We employ the Daily Dialog dataset from Li et al. [2017] for experiment, by deleting the overlapping of train and test sets in advance. |
| Dataset Splits | Yes | Following previous work, we use a 300k, 10k and 30k subset of One-Billion-Word Corpus for training, validating and testing, respectively. (A split sketch follows the table.) |
| Hardware Specification | No | The paper does not specify any hardware details (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions tools and algorithms like (...) but does not list the software libraries or version numbers needed to reproduce the experiments. |
| Experiment Setup | Yes | For Seq2Seq, (hidden size, embedding dimension) are set to (512, 256) and (1024, 512), respectively. For Transformer, (hidden size, embedding dim, dropout, layer num, head num) is set to (288, 507, 0.1, 5, 2) for both MT and Dialog, following Lee et al. [2018]. All models are trained on sentences with up to 80 words. We set the batch size to 128 and the beam size to 5 for decoding. (...) For LM, we set the initial learning rate to 1.0, and the decay rate to 0.8. For MT and Dialog, the initial learning rate is 5e-4 and the decay rate is 0.5. (These values are collected in a config sketch after the table.) |
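
To make the "Pseudocode" row concrete: below is a minimal, hypothetical PyTorch sketch of a KerBS-style output layer. It is not the authors' implementation (the table notes no code was released). It keeps the two ingredients the ablation quote names, the Bayesian composition (several sense embeddings per word, combined by log-sum-exp) and a learnable kernel (simplified here to a per-word temperature on cosine similarity; the paper's exact kernel form differs), and it omits Algorithm 1's dynamic sense reallocation during training. All names (`KerBSHead`, `senses_per_word`, `theta`) are invented for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class KerBSHead(nn.Module):
    """Hypothetical sketch of a KerBS-style output layer (not the authors' code).

    Each word owns `senses_per_word` embeddings; a word's logit is a
    log-sum-exp over kernel responses between the decoder state and each
    of its senses (the "Bayesian composition"). The kernel is simplified
    to a per-word learnable temperature on cosine similarity.
    """

    def __init__(self, vocab_size: int, hidden_dim: int, senses_per_word: int = 3):
        super().__init__()
        self.senses_per_word = senses_per_word
        # Senses of word v occupy rows v*S .. v*S + S - 1.
        self.sense_emb = nn.Parameter(
            torch.randn(vocab_size * senses_per_word, hidden_dim) * 0.02)
        # One learnable kernel temperature per word.
        self.theta = nn.Parameter(torch.ones(vocab_size))

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        """h: (batch, hidden) decoder states -> (batch, vocab) log-probs."""
        sim = F.normalize(h, dim=-1) @ F.normalize(self.sense_emb, dim=-1).t()
        sim = sim.view(h.size(0), -1, self.senses_per_word)   # (B, V, S)
        theta = F.softplus(self.theta).view(1, -1, 1)         # keep temperatures > 0
        word_scores = torch.logsumexp(theta * sim, dim=-1)    # Bayesian composition
        return F.log_softmax(word_scores, dim=-1)

head = KerBSHead(vocab_size=10_000, hidden_dim=512)
log_probs = head(torch.randn(4, 512))  # shape: (4, 10000)
```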
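The "Dataset Splits" row gives the subset sizes (300k/10k/30k) but not how they were drawn from the One-Billion-Word Corpus. A minimal sketch, assuming disjoint random sampling (the sampling procedure is an assumption, not stated in the paper):

```python
import random

def subset_split(sentences, sizes=(300_000, 10_000, 30_000), seed=0):
    """Draw disjoint train/valid/test subsets of the stated sizes."""
    rng = random.Random(seed)
    picks = rng.sample(range(len(sentences)), sum(sizes))
    train = [sentences[i] for i in picks[:sizes[0]]]
    valid = [sentences[i] for i in picks[sizes[0]:sizes[0] + sizes[1]]]
    test  = [sentences[i] for i in picks[sizes[0] + sizes[1]:]]
    return train, valid, test
```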
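The hyperparameters quoted in the "Experiment Setup" row can be collected into one configuration sketch. The dictionary names below are invented for illustration; the values are verbatim from the quote. The paper does not say whether the learning-rate decay fires per epoch or on a validation plateau, so the scheduler shown is a plain multiplicative decay used as a placeholder.

```python
import torch

# Hypothetical config names; values are verbatim from the quoted setup.
SEQ2SEQ = [dict(hidden_size=512, embedding_dim=256),
           dict(hidden_size=1024, embedding_dim=512)]
TRANSFORMER = dict(hidden_size=288, embedding_dim=507, dropout=0.1,
                   num_layers=5, num_heads=2)  # MT and Dialog, after Lee et al. [2018]
TRAINING = dict(max_sentence_len=80, batch_size=128, beam_size=5)
LR = dict(lm=dict(init=1.0, decay=0.8),
          mt_dialog=dict(init=5e-4, decay=0.5))

# Placeholder schedule: multiplies the LR by the decay factor on each
# step() call; when the decay is triggered is not stated in the paper.
model = torch.nn.Linear(8, 8)  # stand-in model
opt = torch.optim.Adam(model.parameters(), lr=LR["mt_dialog"]["init"])
sched = torch.optim.lr_scheduler.ExponentialLR(opt, gamma=LR["mt_dialog"]["decay"])
```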