Listwise Learning to Rank Based on Approximate Rank Indicators

Authors: Thibaut Thonet, Yagmur Gizem Cinar, Eric Gaussier, Minghan Li, Jean-Michel Renders
Pages: 8494-8502

AAAI 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We first prove theoretically that the approximations proposed are of good quality, prior to validating them experimentally on both learning to rank and text-based information retrieval tasks.
Researcher Affiliation | Collaboration | (1) NAVER LABS Europe, (2) Amazon UK, (3) Univ. Grenoble Alpes, CNRS
Pseudocode | No | The paper describes methods in prose and mathematical formulations but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | The losses defined by ListNET, ListAP, Approx and SmoothI were implemented in PyTorch (Paszke et al. 2019), using our own implementation for ListNET, Approx and SmoothI (https://github.com/ygcinar/SmoothI). (A hedged sketch of a ListNet-style loss follows the table.)
Open Datasets | Yes | To evaluate our approach, we conducted learning to rank experiments on standard, publicly available datasets, namely LETOR 4.0 MQ2007, MQ2008 and MSLR-Web30K (Qin and Liu 2013), respectively containing 1,692/69,623, 784/15,211 and 31,531/3,771,125 queries/documents, and the Yahoo learning to rank Set-1 dataset (Chapelle and Chang 2010), containing 29,921/709,877 queries/documents.
Dataset Splits | Yes | We rely on the standard 5-fold train/validation/test split for the LETOR collections and the standard train/validation/test split for YLTR.
Hardware Specification | Yes | The random seed integer was set to 66 and we ran our experiments on an Intel Xeon server with an Nvidia GTX 1080 Ti GPU.
Software Dependencies | No | The paper mentions 'PyTorch' and other libraries/frameworks by name (e.g., TF-Ranking), but it does not provide specific version numbers for these software components, which would be needed to fully reproduce the software environment.
Experiment Setup | Yes | All models are trained for 100 epochs using the Adam optimizer with a learning rate of 2×10^-5 for BERT, as suggested in MacAvaney et al. (2019), and 10^-3 for the top dense layer, which is a common default value. As mentioned before, the batch size is set to four and gradient accumulation is used every eight steps (MacAvaney et al. 2019). (A hedged sketch of this optimizer and accumulation setup follows the table.)
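
To give context for the Open Source Code row, below is a minimal sketch of a top-one-probability ListNet-style loss in PyTorch. It is not the authors' implementation (their code is released at https://github.com/ygcinar/SmoothI); the function name, tensor shapes, and synthetic data are illustrative assumptions only.

```python
import torch
import torch.nn.functional as F

def listnet_loss(scores: torch.Tensor, relevance: torch.Tensor) -> torch.Tensor:
    """Cross-entropy between the top-one distributions induced by the predicted
    scores and by the ground-truth relevance labels (ListNet-style objective).

    scores, relevance: tensors of shape (n_queries, list_size).
    """
    pred_dist = F.softmax(scores, dim=-1)             # top-one probabilities from scores
    true_dist = F.softmax(relevance.float(), dim=-1)  # top-one probabilities from labels
    return -(true_dist * torch.log(pred_dist + 1e-12)).sum(dim=-1).mean()

# Usage with synthetic data: 4 queries, 10 candidate documents each.
scores = torch.randn(4, 10, requires_grad=True)
labels = torch.randint(0, 3, (4, 10))   # graded relevance in {0, 1, 2}
loss = listnet_loss(scores, labels)
loss.backward()
```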
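The Experiment Setup row quotes concrete hyperparameters: Adam, a learning rate of 2×10^-5 for BERT and 10^-3 for the top dense layer, batch size four, and gradient accumulation every eight steps. The sketch below shows, under stated assumptions, how such a setup can be expressed with PyTorch parameter groups and gradient accumulation; the TinyRanker module, the listwise_loss helper, and the synthetic data are hypothetical placeholders, not the paper's BERT-based ranker.

```python
import torch
from torch import nn

# Hypothetical stand-in ranker: a small "encoder" plus a top dense scoring layer.
# nn.Linear replaces the BERT encoder of the paper only so that the sketch runs.
class TinyRanker(nn.Module):
    def __init__(self, dim: int = 32):
        super().__init__()
        self.encoder = nn.Linear(dim, 16)  # placeholder for the BERT encoder
        self.dense = nn.Linear(16, 1)      # top dense layer: one score per document
    def forward(self, x):                  # x: (batch, list_size, dim)
        return self.dense(torch.relu(self.encoder(x))).squeeze(-1)

def listwise_loss(scores, relevance):
    # Top-one-probability cross-entropy, same form as the ListNet sketch above.
    return -(torch.softmax(relevance.float(), dim=-1)
             * torch.log_softmax(scores, dim=-1)).sum(dim=-1).mean()

ranker = TinyRanker()
# Two parameter groups with the learning rates quoted in the table.
optimizer = torch.optim.Adam([
    {"params": ranker.encoder.parameters(), "lr": 2e-5},  # encoder (BERT-side) rate
    {"params": ranker.dense.parameters(), "lr": 1e-3},    # top dense layer rate
])

accumulation_steps = 8                    # gradient accumulation every eight steps
batch_size, list_size, dim = 4, 10, 32    # batch size of four, as in the quote

optimizer.zero_grad()
for step in range(32):                    # synthetic training steps
    features = torch.randn(batch_size, list_size, dim)
    labels = torch.randint(0, 3, (batch_size, list_size))
    loss = listwise_loss(ranker(features), labels) / accumulation_steps
    loss.backward()                       # gradients accumulate across mini-batches
    if (step + 1) % accumulation_steps == 0:
        optimizer.step()                  # apply the accumulated update
        optimizer.zero_grad()
```

Dividing the loss by the number of accumulation steps keeps the effective gradient magnitude comparable to a single large batch; this is a common convention, not something the quoted excerpt specifies.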