Neural Rule-Execution Tracking Machine For Transformer-Based Text Generation
Authors: Yufei Wang, Can Xu, Huang Hu, Chongyang Tao, Stephen Wan, Mark Dras, Mark Johnson, Daxin Jiang
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on several benchmarks verify the effectiveness of our proposed model in both controllable and general text generation tasks. For evaluation, we select three representative benchmarks because all of them involve constraints or prior knowledge, allowing us to verify the effectiveness of our proposed NRETM model: ROCStories [12] are five-sentence stories with complicated predicate constraints over the story structure; Commonsense Generation task [13] with the constraints of mentioning all input concepts; TED15 Zh-En document-level machine translation benchmark [14] with prior knowledge of translating input sentences one by one. |
| Researcher Affiliation | Collaboration | Macquarie University, Sydney, Australia (1); Microsoft Corporation, Beijing, China (2); CSIRO Data61, Sydney, Australia (3). Contact: yufei.wang@students.mq.edu.au, {mark.dras,mark.johnson}@mq.edu.au, {caxu,huahu,chongyang.tao,djiang}@microsoft.com, stephen.wan@data61.csiro.au |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. It describes the model components and their interactions in text and diagrams. |
| Open Source Code | Yes | Our Source Code can be found in https://github.com/GaryYufei/NRETM |
| Open Datasets | Yes | For evaluation, we select three representative benchmarks because all of them involve constraints or prior knowledge, allowing us to verify the effectiveness of our proposed NRETM model: ROCStories [12] are five-sentence stories with complicated predicate constraints over the story structure; Commonsense Generation task [13] with the constraints of mentioning all input concepts; TED15 Zh-En document-level machine translation benchmark [14] with prior knowledge of translating input sentences one by one. |
| Dataset Splits | Yes | TED15 Zh-En (from IWSLT 2014 and 2015 [25, 26]) is used as the training and validation set, and the 2010-2013 TED talks as the test set. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments, such as CPU or GPU models. |
| Software Dependencies | No | The paper mentions using transformer-based models like T5-Base and T5-Large, but does not provide specific version numbers for software dependencies or libraries (e.g., PyTorch, TensorFlow, HuggingFace Transformers). |
| Experiment Setup | Yes | All models use the beam search decoding algorithm with beam size 5. When implementing our model, we use the same pre-processing method, blocks segmentation strategy and beam search setting as [3]. |
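The "Experiment Setup" row reports that all models decode with beam search and beam size 5 on T5 backbones. Below is a minimal sketch of that decoding configuration, assuming the Hugging Face `transformers` library and a plain `t5-base` checkpoint; the paper does not state which framework it uses, and NRETM's rule-tracking components and the pre-processing pipeline from [3] are not reproduced here. The input string is a hypothetical concept-to-text prompt for illustration only.

```python
# Sketch of beam-search decoding with beam size 5, as reported in the paper.
# Assumes the Hugging Face `transformers` library; not the authors' NRETM code.
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")

# Hypothetical constrained-generation input; NRETM's predicate/rule encoding
# is not modeled in this sketch.
inputs = tokenizer(
    "generate a sentence with: dog ; frisbee ; catch ; throw",
    return_tensors="pt",
)

outputs = model.generate(
    **inputs,
    num_beams=5,        # beam size 5, matching the reported setup
    max_length=64,
    early_stopping=True,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The same `num_beams=5` setting would apply to both the T5-Base and T5-Large variants mentioned in the Software Dependencies row.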