Deep Attentive Ranking Networks for Learning to Order Sentences
Authors: Pawan Kumar, Dhanajit Brahma, Harish Karnick, Piyush Rai (pp. 8115–8122)
AAAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct an extensive evaluation of our model on six benchmark datasets for the sentence ordering task. We evaluate our model on two standard metrics: (1) Kendall's tau (τ), (2) Perfect Match Ratio (PMR). We surpass the state-of-the-art models in terms of τ on all benchmark datasets. We also give our model's performance in terms of PMR on all the datasets. Our model excels in making accurate first and last sentence predictions, achieving better performance than previous state-of-the-art approaches. We also provide visualizations of sentence representation and sentence level attention. On the order discrimination task, we show improvements over current state-of-the-art on Accidents dataset and give competitive results on Earthquakes dataset. We conduct a comprehensive analysis of our approach on various benchmark datasets and compare our model with other state-of-the-art approaches. We also demonstrate the effectiveness of different components of our models by performing ablation analysis. (A hedged sketch of both metrics follows the table.) |
| Researcher Affiliation | Academia | Pawan Kumar, Dhanajit Brahma, Harish Karnick, Piyush Rai Department of Computer Science and Engineering, IIT Kanpur, India {kpawan, dhanajit, hk, piyush}@cse.iitk.ac.in |
| Pseudocode | No | The paper contains architectural diagrams and mathematical formulations but no structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain any explicit statement about releasing source code or provide a link to a code repository. |
| Open Datasets | Yes | Following (Cui et al. 2018) and previous works we run our sentence ordering experiments on NIPS abstracts, AAN/ACL abstracts and NSF abstracts datasets from (Logeswaran, Lee, and Radev 2018); arXiv abstracts and SIND/VIST captions datasets from (Gong et al. 2016; Agrawal et al. 2016; Huang et al. 2016); and ROCStory dataset from (Wang and Wan 2019; Mostafazadeh et al. 2016). Table 3 provides the statistics for each dataset. |
| Dataset Splits | Yes | Table 3 provides the statistics for each dataset. For example, for NIPS abstracts, it lists Train 2448, Val 409, Test 402. |
| Hardware Specification | No | The paper does not specify any particular hardware details such as GPU models, CPU models, or memory used for running the experiments. |
| Software Dependencies | No | The paper mentions the use of 'pre-trained BERT-Base model' and 'Adam optimizer', but it does not provide specific version numbers for these or any other software dependencies. |
| Experiment Setup | Yes | For sentence encoder, we use the pre-trained BERT-Base model with 12 Transformer blocks, the hidden size as 768, and 12 self-attention heads. The feed-forward intermediate layer size is 4 × 768, i.e., 3072. The paragraph encoder is a Transformer Network having 2 Transformer blocks, with hidden size 768 and a feed-forward intermediate layer size of 4 × 768, i.e., 3072... We train the model with Adam optimizer (Kingma and Ba 2014) with initial learning rate, 5 × 10⁻⁵ for sentence encoder and paragraph encoder and 5 × 10⁻³ for decoder; β1 = 0.9, β2 = 0.999; and batch size of 400. For pairwise ranking loss, the value of the margin hyperparameter, γ, is set to 1. (A hedged setup sketch follows the table.) |
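
The Research Type row quotes the two evaluation metrics, Kendall's tau (τ) and Perfect Match Ratio (PMR). The sketch below implements both under their standard definitions for sentence ordering, τ = 1 − 2·(inversions)/C(n, 2) and PMR = fraction of paragraphs whose predicted order matches the gold order exactly; the function names are illustrative and not taken from the paper.

```python
# Hedged sketch of the two reported metrics; standard definitions assumed.
from itertools import combinations

def kendall_tau(pred_order, gold_order):
    """tau = 1 - 2 * (#inversions) / C(n, 2), where an inversion is a pair of
    sentences whose relative order in the prediction disagrees with the gold order."""
    n = len(gold_order)
    if n < 2:
        return 1.0
    # Position of each sentence id in the predicted order.
    pos = {sid: i for i, sid in enumerate(pred_order)}
    inversions = sum(
        1
        for a, b in combinations(gold_order, 2)  # (a, b) ordered as in gold
        if pos[a] > pos[b]                       # but flipped in the prediction
    )
    return 1.0 - 2.0 * inversions / (n * (n - 1) / 2)

def perfect_match_ratio(pred_orders, gold_orders):
    """Fraction of paragraphs whose predicted order matches the gold order exactly."""
    exact = sum(int(p == g) for p, g in zip(pred_orders, gold_orders))
    return exact / len(gold_orders)

# Toy usage: gold order is [0, 1, 2, 3]; the prediction swaps the last two sentences.
print(kendall_tau([0, 1, 3, 2], [0, 1, 2, 3]))              # ~0.667
print(perfect_match_ratio([[0, 1, 3, 2]], [[0, 1, 2, 3]]))  # 0.0
```

The Experiment Setup row reports the encoder sizes, per-component learning rates, Adam betas, and ranking-loss margin. The sketch below wires those numbers together, assuming a PyTorch + Hugging Face Transformers stack (the paper does not name its framework or versions); the `decoder` module and the paragraph encoder's head count are placeholders, not details from the paper.

```python
# Hedged sketch of the reported training setup, assuming PyTorch + Hugging Face
# Transformers. `decoder` is a placeholder for the paper's ranking decoder.
import torch
import torch.nn as nn
from transformers import BertModel

# Sentence encoder: pre-trained BERT-Base (12 blocks, hidden 768, 12 heads, FFN 3072).
sentence_encoder = BertModel.from_pretrained("bert-base-uncased")

# Paragraph encoder: 2 Transformer blocks, hidden 768, FFN 4 * 768 = 3072
# (head count is not stated in the quoted setup; 12 is assumed here).
paragraph_encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=768, nhead=12, dim_feedforward=3072),
    num_layers=2,
)

# Placeholder decoder that scores each sentence for ranking (architecture assumed).
decoder = nn.Linear(768, 1)

# Adam with the reported per-component learning rates and betas
# (the reported batch size of 400 is omitted from this toy example).
optimizer = torch.optim.Adam(
    [
        {"params": sentence_encoder.parameters(), "lr": 5e-5},
        {"params": paragraph_encoder.parameters(), "lr": 5e-5},
        {"params": decoder.parameters(), "lr": 5e-3},
    ],
    betas=(0.9, 0.999),
)

# Pairwise ranking loss with margin gamma = 1: a sentence that should come earlier
# is pushed to score higher than one that should come later.
ranking_loss = nn.MarginRankingLoss(margin=1.0)

# Toy loss computation: target +1 means the first argument should be ranked higher.
score_earlier, score_later = torch.randn(8), torch.randn(8)
loss = ranking_loss(score_earlier, score_later, torch.ones(8))
```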
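The margin ranking loss is used here because the quoted setup explicitly names a pairwise ranking loss with margin γ = 1; how sentence scores are produced from the encoders is the part of the pipeline this sketch deliberately leaves abstract.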