MeanSum: A Neural Model for Unsupervised Multi-Document Abstractive Summarization
Authors: Eric Chu, Peter Liu
ICML 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show through automated metrics and human evaluation that the generated summaries are highly abstractive, fluent, relevant, and representative of the average sentiment of the input reviews. Finally, we collect a reference evaluation dataset and show that our model outperforms a strong extractive baseline. |
| Researcher Affiliation | Collaboration | ¹MIT Media Lab, ²Google Brain. |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code is available online. https://github.com/sosuperic/MeanSum |
| Open Datasets | Yes | We tuned our models primarily on a dataset of customer reviews provided in the Yelp Dataset Challenge, where each review is accompanied by a 5-star rating. https://www.yelp.com/dataset/challenge |
| Dataset Splits | Yes | The final training, validation, and test splits consist of 10695, 1337, and 1337 businesses, and 1038184, 129856, and 129840 reviews, respectively. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for its experiments; it only describes the model architecture and training parameters. |
| Software Dependencies | No | The paper mentions general algorithms and models like "multiplicative LSTM" and "Adam" but does not provide specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | The language model, encoders, and decoders were multiplicative LSTMs (Krause et al., 2016) with 512 hidden units, a 0.1 dropout rate, a word embedding size of 256, and layer normalization (Ba et al., 2016). We trained with Adam (Kingma & Ba, 2014), using a learning rate of 0.001 for the language model, 0.0001 for the classifier, and 0.0005 for the summarization model, with β1 = 0.9 and β2 = 0.999. The initial temperature for the Gumbel-softmax was set to 2.0. One input item to the language model was k = 8 reviews from the same business or product concatenated together with end-of-review delimiters, with each update step operating on a subsequence of 256 subtokens. The review-rating classifier was a multi-channel text convolutional neural network similar to Kim (2014) with filter widths of 3, 4, and 5, 128 feature maps per filter, and a 0.5 dropout rate (a minimal sketch of this setup follows the table). |
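The sketches below are hypothetical PyTorch illustrations of two pieces of the reported experiment setup. They are consistent with the hyperparameters quoted in the table but are not taken from the authors' repository; names such as `ReviewRatingCNN` and the vocabulary size are assumptions.

The first snippet shows the kind of Gumbel-softmax sampling the summarization model relies on, using the reported initial temperature of 2.0:

```python
# Minimal sketch: differentiable "soft" word sampling via the Gumbel-softmax,
# with the paper's reported initial temperature of 2.0. Shapes are illustrative.
import torch
import torch.nn.functional as F

logits = torch.randn(4, 32000)                   # (batch, vocab_size) decoder logits
soft_tokens = F.gumbel_softmax(logits, tau=2.0)  # rows are near-one-hot distributions
```

The second snippet sketches the review-rating classifier row (a multi-channel text CNN in the style of Kim (2014)) together with the reported Adam settings:

```python
# Minimal sketch, assuming a PyTorch implementation: text CNN with filter widths
# 3/4/5, 128 feature maps per filter, 0.5 dropout, and a 256-dimensional embedding.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ReviewRatingCNN(nn.Module):
    def __init__(self, vocab_size, emb_dim=256, num_classes=5,
                 filter_widths=(3, 4, 5), feature_maps=128, dropout=0.5):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim)
        self.convs = nn.ModuleList(
            nn.Conv1d(emb_dim, feature_maps, kernel_size=w) for w in filter_widths
        )
        self.dropout = nn.Dropout(dropout)
        self.fc = nn.Linear(feature_maps * len(filter_widths), num_classes)

    def forward(self, token_ids):                      # token_ids: (batch, seq_len)
        x = self.embedding(token_ids).transpose(1, 2)  # -> (batch, emb_dim, seq_len)
        # Convolve, apply ReLU, max-pool each channel over time, then concatenate.
        pooled = [F.relu(conv(x)).max(dim=2).values for conv in self.convs]
        return self.fc(self.dropout(torch.cat(pooled, dim=1)))


# Adam with beta1 = 0.9, beta2 = 0.999 and the reported per-module learning rates
# (0.001 for the language model, 0.0001 for the classifier, 0.0005 for the summarizer).
classifier = ReviewRatingCNN(vocab_size=32000)         # vocab size is an assumption
clf_optimizer = torch.optim.Adam(classifier.parameters(), lr=1e-4, betas=(0.9, 0.999))
```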