Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Deep Attentive Ranking Networks for Learning to Order Sentences
Authors: Pawan Kumar, Dhanajit Brahma, Harish Karnick, Piyush Rai (pp. 8115-8122)
AAAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct an extensive evaluation of our model on six benchmark datasets for the sentence ordering task. We evaluate our model on two standard metrics: (1) Kendall's tau (τ), (2) Perfect Match Ratio (PMR). We surpass the state-of-the-art models in terms of τ on all benchmark datasets. We also give our model's performance in terms of PMR on all the datasets. Our model excels in making accurate first and last sentence predictions, achieving better performance than previous state-of-the-art approaches. We also provide visualizations of sentence representation and sentence-level attention. On the order discrimination task, we show improvements over the current state-of-the-art on the Accidents dataset and give competitive results on the Earthquakes dataset. We conduct a comprehensive analysis of our approach on various benchmark datasets and compare our model with other state-of-the-art approaches. We also demonstrate the effectiveness of different components of our models by performing ablation analysis. |
| Researcher Affiliation | Academia | Pawan Kumar, Dhanajit Brahma, Harish Karnick, Piyush Rai Department of Computer Science and Engineering, IIT Kanpur, India EMAIL |
| Pseudocode | No | The paper contains architectural diagrams and mathematical formulations but no structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain any explicit statement about releasing source code or provide a link to a code repository. |
| Open Datasets | Yes | Following (Cui et al. 2018) and previous works we run our sentence ordering experiments on NIPS abstracts, AAN/ACL abstracts and NSF abstracts datasets from (Logeswaran, Lee, and Radev 2018); arXiv abstracts and SIND/VIST captions datasets from (Gong et al. 2016; Agrawal et al. 2016; Huang et al. 2016); and ROCStory dataset from (Wang and Wan 2019; Mostafazadeh et al. 2016). Table 3 provides the statistics for each dataset. |
| Dataset Splits | Yes | Table 3 provides the statistics for each dataset. For example, for NIPS abstracts, it lists Train 2448, Val 409, Test 402. |
| Hardware Specification | No | The paper does not specify any particular hardware details such as GPU models, CPU models, or memory used for running the experiments. |
| Software Dependencies | No | The paper mentions the use of 'pre-trained BERTBASE model' and 'Adam optimizer', but it does not provide specific version numbers for these or any other software dependencies. |
| Experiment Setup | Yes | For sentence encoder, we use the pre-trained BERTBASE model with 12 Transformer blocks, the hidden size as 768, and 12 self-attention heads. The feed-forward intermediate layer size is 4 × 768, i.e., 3072. The paragraph encoder is a Transformer Network having 2 Transformer blocks, with hidden size 768 and a feed-forward intermediate layer size of 4 × 768, i.e., 3072... We train the model with Adam optimizer (Kingma and Ba 2014) with initial learning rate, 5 × 10⁻⁵ for sentence encoder and paragraph encoder and 5 × 10⁻³ for decoder; β₁ = 0.9, β₂ = 0.999; and batch size of 400. For pairwise ranking loss, the value of the margin hyperparameter, γ, is set to 1. |
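The two evaluation metrics quoted above, Kendall's tau (τ) and Perfect Match Ratio (PMR), have simple standard definitions for sentence ordering: τ rescales the number of out-of-order sentence pairs into [-1, 1], and PMR is the fraction of paragraphs reordered exactly right. A minimal sketch of both (function names are our own, not from the paper):

```python
from itertools import combinations

def kendall_tau(pred, gold):
    """Kendall's tau between a predicted and a gold sentence order.

    `pred` and `gold` are permutations of the same sentence indices.
    tau = 1 - 2 * (# discordant pairs) / (n choose 2).
    """
    n = len(gold)
    # Position of each sentence index in the predicted order.
    pos = {s: i for i, s in enumerate(pred)}
    discordant = sum(
        1
        for a, b in combinations(gold, 2)  # pairs in gold order
        if pos[a] > pos[b]                 # flipped in the prediction
    )
    return 1 - 2 * discordant / (n * (n - 1) / 2)

def perfect_match_ratio(preds, golds):
    """Fraction of paragraphs whose predicted order matches gold exactly."""
    return sum(p == g for p, g in zip(preds, golds)) / len(golds)
```

For example, swapping one adjacent pair in a four-sentence paragraph gives τ = 1 - 2·1/6 ≈ 0.67, while a fully reversed order gives τ = -1.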
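The setup row also mentions a pairwise ranking loss with margin γ = 1. The paper does not reproduce the formula here, but a common form of margin-based pairwise ranking (a sketch under that assumption, with higher scores meaning earlier position) looks like:

```python
def pairwise_ranking_loss(scores, gold_order, margin=1.0):
    """Margin-based pairwise ranking loss over one paragraph's sentence scores.

    For every pair (i, j) where sentence i precedes sentence j in the gold
    order, incur a penalty unless scores[i] exceeds scores[j] by `margin`.
    """
    loss = 0.0
    for a in range(len(gold_order)):
        for b in range(a + 1, len(gold_order)):
            i, j = gold_order[a], gold_order[b]
            loss += max(0.0, margin - (scores[i] - scores[j]))
    return loss
```

With γ = 1, a correctly ordered pair still contributes loss until its score gap reaches the margin, which pushes the decoder toward well-separated rankings rather than bare correctness.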