Structured Neural Summarization
Authors: Patrick Fernandes, Miltiadis Allamanis, Marc Brockschmidt
ICLR 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In an extensive evaluation, we show that the resulting hybrid sequence-graph models outperform both pure sequence models as well as pure graph models on a range of summarization tasks. |
| Researcher Affiliation | Industry | Patrick Fernandes, Miltiadis Allamanis & Marc Brockschmidt Microsoft Research Cambridge, United Kingdom {t-pafern,miallama,mabrocks}@microsoft.com |
| Pseudocode | No | The paper describes the model architecture and mathematical formulations but does not include any explicit pseudocode or algorithm blocks. |
| Open Source Code | Yes | We release all used code and data at https://github.com/CoderPat/structured-neural-summarization. |
| Open Datasets | Yes | We consider the Java (small) dataset of Alon et al. (2018a), re-using the train-validation-test splits they have picked. We additionally generated a new dataset from 23 open-source C# projects mined from GitHub... We use the CNN/DM dataset (Hermann et al., 2015) using the exact data and split provided by See et al. (2017). |
| Dataset Splits | Yes | First, we consider the Java (small) dataset of Alon et al. (2018a), re-using the train-validation-test splits they have picked. The C# dataset is split 85-5-10%. |
| Hardware Specification | No | The paper mentions 'efficient computation' and 'TensorFlow's unsorted_segment_* operations' but does not provide specific hardware details such as GPU/CPU models, memory, or cloud instance types used for the experiments. (A sketch of this aggregation pattern follows the table.) |
| Software Dependencies | Yes | We use Stanford CoreNLP (Manning et al., 2014) (version 3.9.1) to tokenize the text and provide the resulting tokens to the encoder. (A tokenization sketch follows the table.) |
| Experiment Setup | Yes | Concretely, we combine two encoders (a bidirectional LSTM encoder with 1 layer and 256 hidden units, and its sequence GNN extension with 128 hidden units unrolled over 8 timesteps) with two decoders (an LSTM decoder with 1 layer and 256 hidden units with attention over the input sequence, and an extension using a pointer network-style copying mechanism (Vinyals et al., 2015a)). (A configuration sketch follows the table.) |
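
The `unsorted_segment_*` operations cited in the Hardware Specification row are TensorFlow's standard primitives for batching message passing over graphs of varying size. The sketch below is an illustrative single propagation step using `tf.math.unsorted_segment_sum`; it is not the authors' released implementation, and the tensor names, layer choices, and sizes are assumptions.

```python
import tensorflow as tf

def gnn_propagation_step(node_states, edge_sources, edge_targets, num_nodes,
                         message_layer, gru_cell):
    """One GNN message-passing step (illustrative sketch).

    node_states:   [num_nodes, hidden] float tensor of current node states
    edge_sources:  [num_edges] int tensor of source node indices
    edge_targets:  [num_edges] int tensor of target node indices
    message_layer: Dense layer mapping source states to messages
    gru_cell:      GRUCell updating node states from aggregated messages
    """
    # Gather each edge's source state and compute its message.
    source_states = tf.gather(node_states, edge_sources)      # [num_edges, hidden]
    messages = message_layer(source_states)                    # [num_edges, hidden]

    # Sum all incoming messages per target node in one vectorized call.
    aggregated = tf.math.unsorted_segment_sum(messages, edge_targets, num_nodes)

    # Update node states with a GRU, treating aggregated messages as inputs.
    updated, _ = gru_cell(aggregated, [node_states])
    return updated

# Example usage with hypothetical sizes (128 hidden units, 5 nodes, 4 edges).
hidden = 128
cell = tf.keras.layers.GRUCell(hidden)
msg = tf.keras.layers.Dense(hidden, use_bias=False)
states = tf.zeros([5, hidden])
src = tf.constant([0, 1, 2, 3])
tgt = tf.constant([1, 2, 3, 4])
states = gnn_propagation_step(states, src, tgt, 5, msg, cell)
```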
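
For the Software Dependencies row, a minimal way to reproduce the tokenization step is to query a locally running CoreNLP 3.9.1 server over HTTP. The paper does not say which interface the authors used, so the plain-`requests` approach below is an assumption, as is the port number.

```python
import json
import requests

def corenlp_tokenize(text, server_url="http://localhost:9000/"):
    """Tokenize text with a running Stanford CoreNLP server (assumed on port 9000)."""
    props = {"annotators": "tokenize,ssplit", "outputFormat": "json"}
    response = requests.post(server_url,
                             params={"properties": json.dumps(props)},
                             data=text.encode("utf-8"))
    annotation = response.json()
    # Flatten sentence-level token lists into a single token sequence for the encoder.
    return [tok["word"] for sent in annotation["sentences"] for tok in sent["tokens"]]

# Example: tokens = corenlp_tokenize("Returns the sum of two integers.")
```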
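
The Experiment Setup row fully specifies the encoder and decoder sizes. As a reading aid, here is a hedged sketch of those hyperparameters laid out as a configuration dictionary; the key names and structure are illustrative and not taken from the released code.

```python
# Hyperparameters as quoted in the paper's experiment setup; the dictionary
# layout and key names are illustrative, not the authors' actual config schema.
MODEL_CONFIG = {
    "encoder": {
        "type": "bilstm",           # bidirectional LSTM encoder
        "num_layers": 1,
        "hidden_units": 256,
        "gnn_extension": {          # sequence GNN extension over the token states
            "hidden_units": 128,
            "timesteps": 8,         # number of unrolled propagation steps
        },
    },
    "decoder": {
        "type": "lstm",
        "num_layers": 1,
        "hidden_units": 256,
        "attention": "input_sequence",
        "copy_mechanism": True,     # pointer-network-style copying (Vinyals et al., 2015a)
    },
}
```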