Multi-Agent Discussion Mechanism for Natural Language Generation

Authors: Xu Li, Mingming Sun, Ping Li (pp. 6096-6103)

AAAI 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We train and evaluate the discussion mechanism on Table to Text Generation, Text Summarization and Image Caption tasks, respectively. Our empirical results demonstrate that the proposed multi-agent discussion mechanism is helpful for maximizing the utility of the communication between agents.
Researcher Affiliation | Industry | Xu Li, Mingming Sun, Ping Li; Cognitive Computing Lab (CCL), Baidu Research; {lixu13,sunmingming01,liping11}@baidu.com
Pseudocode | Yes | Algorithm 1 describes the discussion mechanism of a single decoder time step. (A hedged sketch of such a step is given after this table.)
Open Source Code | No | The paper does not provide any statement about making its source code available or include a link to a code repository.
Open Datasets | Yes | We train and evaluate the proposed discussion mechanism on three types of input information. The first is the NBA game description generation task, in which different views of an NBA match are organized as several tables. The second is the text summarization task, in which the text to be summarized is too long to be encoded by a single encoder agent. Lastly, we evaluate the discussion mechanism on the image captioning task. The datasets are ROTOWIRE (Wiseman, Shieber, and Rush 2017), CNN/Daily Mail (Nallapati, Zhai, and Zhou 2017; Hermann et al. 2015), and MS COCO (Lin et al. 2014).
Dataset Splits | Yes | In this experiment, we split the dataset to 3,398 for training, 750 for validation and 750 for testing. ... The preprocessed data has 287,226 training pairs, 13,368 validation pairs, and 11,490 test pairs. ... all 82,783 images from the training set for training, 5,000 images from the validation set for validation and the remaining 5,000 from the validation set for testing. (These split sizes are collected in a short snippet after this table.)
Hardware Specification | No | The paper does not specify any hardware details such as GPU models, CPU types, or cloud computing instances used for the experiments.
Software Dependencies | No | The paper mentions various models and optimizers (e.g., LSTM, GRU, Adam, Adagrad) but does not provide version numbers for any software libraries, frameworks, or programming languages used.
Experiment Setup | Yes | Hidden state dimension is set to 600, and the embedding dimension is set to 128. We train the model by Adam optimizer with initialized learning rate 0.001. Clipped gradient by 1 and L2 regularization has been adopted to avoid gradient explosion. Batch normalization has been adopted in each time step of decoder GRU to accelerate the training process. ... The hidden dimension is set to 512, and the embedding dimension is 128. Adagrad (Duchi, Hazan, and Singer 2011) with a learning rate of 0.15 and an initial accumulator value of 0.1 has been adopted as the optimizer during training. We use gradient clipping with a maximum gradient norm of 2. ... The hidden state dimension is set to 512, and the embedding dimension is set to 256. We train the model by Adam optimizer with initialized learning rate 0.001. We clip the gradient by 1 and use L2 regularization on the prediction layer. (A hedged training-setup sketch is given after this table.)
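
The paper's Algorithm 1 is not reproduced here, so the following is only a minimal sketch of what a single decoder time step with a multi-agent discussion could look like: each encoder agent summarizes its own view by attending over it with the current decoder state, the agents exchange messages for a fixed number of discussion rounds, and the decoder GRU consumes the aggregated result. The class name DiscussionStep, the mean-pooling aggregation, and the GRU-cell message update are illustrative assumptions, not the paper's exact procedure.

    # Minimal sketch of one decoder step with a multi-agent discussion (assumed structure).
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class DiscussionStep(nn.Module):
        def __init__(self, hidden_dim: int, rounds: int = 2):
            super().__init__()
            self.rounds = rounds
            self.attn = nn.Linear(hidden_dim * 2, 1)           # per-agent attention scorer
            self.exchange = nn.GRUCell(hidden_dim, hidden_dim)  # revises an agent's message
            self.decoder_cell = nn.GRUCell(hidden_dim, hidden_dim)

        def forward(self, dec_state, agent_memories):
            # dec_state: (batch, hidden); agent_memories: list of (batch, len_i, hidden)
            messages = []
            for mem in agent_memories:
                # each agent summarizes its own view conditioned on the decoder state
                query = dec_state.unsqueeze(1).expand(-1, mem.size(1), -1)
                scores = self.attn(torch.cat([mem, query], dim=-1)).squeeze(-1)
                weights = F.softmax(scores, dim=-1)
                messages.append(torch.bmm(weights.unsqueeze(1), mem).squeeze(1))

            for _ in range(self.rounds):
                # discussion round: every agent revises its message given the pooled
                # messages of all agents (mean pooling is one possible aggregation)
                pooled = torch.stack(messages, dim=0).mean(dim=0)
                messages = [self.exchange(pooled, m) for m in messages]

            # the decoder GRU cell consumes the aggregated discussion outcome
            context = torch.stack(messages, dim=0).mean(dim=0)
            return self.decoder_cell(context, dec_state)

    # toy usage: 3 encoder agents, batch of 4, hidden size 600 (as in the table-to-text setup)
    step = DiscussionStep(hidden_dim=600)
    state = torch.zeros(4, 600)
    views = [torch.randn(4, 10, 600) for _ in range(3)]
    new_state = step(state, views)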
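
For convenience, the split sizes quoted in the Dataset Splits row can be collected in a single mapping; the dataset keys below are shorthand labels, not names used by the paper.

    # Split sizes as quoted in the paper (keys are illustrative shorthands).
    SPLITS = {
        "rotowire":      {"train": 3398,   "valid": 750,   "test": 750},
        "cnn_dailymail": {"train": 287226, "valid": 13368, "test": 11490},
        "ms_coco":       {"train": 82783,  "valid": 5000,  "test": 5000},  # valid/test both drawn from the COCO validation set
    }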
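
The Experiment Setup row quotes three per-task configurations. Below is a hedged sketch of how the table-to-text configuration (hidden size 600, embedding size 128, Adam with learning rate 0.001, gradients clipped at 1, L2 regularization) might be wired up in PyTorch, with the summarization optimizer (Adagrad, learning rate 0.15, initial accumulator value 0.1, gradient norm clipped at 2) shown as a variant. The stand-in model, the interpretation of clipping as norm clipping, and the use of weight decay (with an assumed strength) for L2 regularization are assumptions, since the paper does not state them.

    # Hedged sketch of the quoted training configurations; not the authors' code.
    import torch
    import torch.nn as nn

    model = nn.GRU(input_size=128, hidden_size=600)  # stand-in for the full decoder
    # L2 regularization realized as weight decay; the strength 1e-5 is an assumption
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-5)

    def train_step(loss: torch.Tensor) -> None:
        optimizer.zero_grad()
        loss.backward()
        # "clipped gradient by 1", read here as clipping the gradient norm at 1
        torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
        optimizer.step()

    # Summarization variant: Adagrad with lr 0.15, initial accumulator 0.1,
    # and gradient norm clipped at 2, as quoted above.
    summ_optimizer = torch.optim.Adagrad(
        model.parameters(), lr=0.15, initial_accumulator_value=0.1
    )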