Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Deep Attentive Ranking Networks for Learning to Order Sentences
Authors: Pawan Kumar, Dhanajit Brahma, Harish Karnick, Piyush Rai (pp. 8115-8122)
AAAI 2020 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct an extensive evaluation of our model on six benchmark datasets for the sentence ordering task. We evaluate our model on two standard metrics: (1) Kendall's tau (τ), (2) Perfect Match Ratio (PMR). We surpass the state-of-the-art models in terms of τ on all benchmark datasets. We also give our model's performance in terms of PMR on all the datasets. Our model excels in making accurate first and last sentence predictions, achieving better performance than previous state-of-the-art approaches. We also provide visualizations of sentence representation and sentence level attention. On the order discrimination task, we show improvements over current state-of-the-art on Accidents dataset and give competitive results on Earthquakes dataset. We conduct a comprehensive analysis of our approach on various benchmark datasets and compare our model with other state-of-the-art approaches. We also demonstrate the effectiveness of different components of our models by performing ablation analysis. |
| Researcher Affiliation | Academia | Pawan Kumar, Dhanajit Brahma, Harish Karnick, Piyush Rai Department of Computer Science and Engineering, IIT Kanpur, India EMAIL |
| Pseudocode | No | The paper contains architectural diagrams and mathematical formulations but no structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain any explicit statement about releasing source code or provide a link to a code repository. |
| Open Datasets | Yes | Following (Cui et al. 2018) and previous works we run our sentence ordering experiments on NIPS abstracts, AAN/ACL abstracts and NSF abstracts datasets from (Logeswaran, Lee, and Radev 2018); arXiv abstracts and SIND/VIST captions datasets from (Gong et al. 2016; Agrawal et al. 2016; Huang et al. 2016); and ROCStory dataset from (Wang and Wan 2019; Mostafazadeh et al. 2016). Table 3 provides the statistics for each dataset. |
| Dataset Splits | Yes | Table 3 provides the statistics for each dataset. For example, for NIPS abstracts, it lists Train 2448, Val 409, Test 402. |
| Hardware Specification | No | The paper does not specify any particular hardware details such as GPU models, CPU models, or memory used for running the experiments. |
| Software Dependencies | No | The paper mentions the use of 'pre-trained BERTBASE model' and 'Adam optimizer', but it does not provide specific version numbers for these or any other software dependencies. |
| Experiment Setup | Yes | For sentence encoder, we use the pre-trained BERTBASE model with 12 Transformer blocks, the hidden size as 768, and 12 self-attention heads. The feed-forward intermediate layer size is 4 × 768, i.e., 3072. The paragraph encoder is a Transformer Network having 2 Transformer blocks, with hidden size 768 and a feed-forward intermediate layer size of 4 × 768, i.e., 3072... We train the model with Adam optimizer (Kingma and Ba 2014) with initial learning rate, 5 × 10⁻⁵ for sentence encoder and paragraph encoder and 5 × 10⁻³ for decoder; β1 = 0.9, β2 = 0.999; and batch size of 400. For pairwise ranking loss, the value of the margin hyperparameter, γ, is set to 1. |
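
The two metrics named in the Research Type row, Kendall's tau and Perfect Match Ratio (PMR), can be illustrated with a minimal sketch. It assumes the usual sentence-ordering definitions: tau is 1 minus twice the number of inverted sentence pairs divided by the total number of pairs, and PMR is the fraction of paragraphs whose predicted order matches the gold order exactly. The function names and the single-sentence convention are illustrative assumptions, not taken from the paper.

```python
from itertools import combinations


def kendall_tau(predicted, gold):
    """Kendall's tau between a predicted and a gold sentence order.

    Both arguments are sequences holding the same sentence ids.
    """
    n = len(gold)
    if n < 2:
        # Assumed convention: a single-sentence paragraph is trivially ordered.
        return 1.0
    gold_rank = {sent: rank for rank, sent in enumerate(gold)}
    ranks = [gold_rank[sent] for sent in predicted]
    # An inversion is a pair placed in the opposite relative order to gold.
    inversions = sum(1 for a, b in combinations(ranks, 2) if a > b)
    total_pairs = n * (n - 1) / 2
    return 1.0 - 2.0 * inversions / total_pairs


def perfect_match_ratio(predictions, golds):
    """Fraction of paragraphs whose predicted order equals the gold order."""
    matches = sum(1 for p, g in zip(predictions, golds) if list(p) == list(g))
    return matches / len(golds)
```

A fully reversed order gives a tau of -1 and an exact match gives 1, so tau rewards partially correct orderings while PMR only counts perfect ones.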
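
The hyperparameters quoted in the Experiment Setup row can be collected in one place, as in the sketch below. The field names are hypothetical; only the values come from the quoted text. The `pairwise_margin_loss` function is a generic hinge-style ranking loss, included only to show the role of the margin γ. It assumes that a sentence appearing earlier in the gold order should receive a higher score, and it is not necessarily the exact loss used in the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentSetup:
    # Values as quoted in the table; field names are illustrative.
    sentence_encoder_blocks: int = 12
    attention_heads: int = 12
    hidden_size: int = 768
    paragraph_encoder_blocks: int = 2
    lr_encoders: float = 5e-5   # sentence and paragraph encoders
    lr_decoder: float = 5e-3
    adam_beta1: float = 0.9
    adam_beta2: float = 0.999
    batch_size: int = 400
    margin: float = 1.0         # gamma in the pairwise ranking loss

    @property
    def feed_forward_size(self):
        # The feed-forward layer is 4 x hidden size, i.e. 3072.
        return 4 * self.hidden_size


def pairwise_margin_loss(scores, gold_positions, margin):
    """Mean hinge loss over all sentence pairs (assumed formulation).

    scores[i] is the model score of sentence i and gold_positions[i] is
    its position in the gold order. Assumption: earlier sentences should
    score higher by at least `margin`.
    """
    losses = []
    for i in range(len(scores)):
        for j in range(len(scores)):
            if gold_positions[i] < gold_positions[j]:
                losses.append(max(0.0, margin - (scores[i] - scores[j])))
    return sum(losses) / len(losses) if losses else 0.0
```

For example, `pairwise_margin_loss([3.0, 1.0], [0, 1], ExperimentSetup().margin)` is zero because the earlier sentence already outscores the later one by more than the margin.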