PiRank: Scalable Learning To Rank via Differentiable Sorting
Authors: Robin Swezey, Aditya Grover, Bruno Charron, Stefano Ermon
NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, we benchmark PiRank against 5 competing methods on two of the largest publicly available LTR datasets: MSLR-WEB30K [20] and Yahoo! C14. We find that PiRank is superior or competitive on 13 out of 16 ranking metrics and their variants, including 9 on which it is significantly superior to all baselines, and that it is able to scale to very large item lists. We also provide several ablation experiments to understand the impact of various factors on performance. |
| Researcher Affiliation | Collaboration | Robin Swezey (Amazon), Aditya Grover (University of California, Los Angeles; Facebook AI Research), Bruno Charron (Amazon), Stefano Ermon (Stanford University) |
| Pseudocode | No | The paper describes algorithmic strategies but does not include any explicitly labeled "Pseudocode" or "Algorithm" blocks. |
| Open Source Code | Yes | Finally, we provide an open-source implementation based on TensorFlow Ranking [21]: https://github.com/ermongroup/pirank |
| Open Datasets | Yes | To empirically test PiRank, we consider two of the largest open-source benchmarks for LTR: the MSLR-WEB30K and the Yahoo! LTR dataset C14. Both datasets have relevance scores on a 5-point scale of 0 to 4, with 0 denoting complete irrelevance and 4 denoting perfect relevance. We give extensive details on the datasets and experimental protocol in Appendix C. (A minimal NDCG@k illustration for this 0-4 relevance scale follows the table.) |
| Dataset Splits | No | The paper mentions "at validation" and refers to Appendix C for experimental details, but the main text does not specify exact training/validation/test split percentages or sample counts. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for experiments, such as CPU or GPU models. |
| Software Dependencies | No | The paper mentions "TensorFlow Ranking [21]" but does not specify its version or the versions of other software dependencies. |
| Experiment Setup | Yes | All approaches use the same 3-layer fully connected network architecture with ReLU activations to compute the scores ŷ for all (query, item) pairs, trained on 100,000 iterations. The maximum list size for each group of items to score and rank is fixed to 200, for both training and testing. (A hedged sketch of such a scoring network follows the table.) |
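
The Experiment Setup row describes the scorer shared by all compared methods: a 3-layer fully connected network with ReLU activations that maps each (query, item) feature vector to a scalar score ŷ, with the list size capped at 200. The sketch below is a minimal, hypothetical Keras rendering of that description, not the authors' code: the hidden-layer widths (256/128/64), the 136-dimensional input (the MSLR-WEB30K feature count), and the interpretation that the three ReLU layers are followed by a separate linear scoring layer are all assumptions, since the quoted setup fixes only the depth, activation, and list size.

```python
import tensorflow as tf

NUM_FEATURES = 136   # assumption: MSLR-WEB30K provides 136 features per (query, item) pair
LIST_SIZE = 200      # maximum list size used for both training and testing (from the paper)

def build_scoring_network(hidden_units=(256, 128, 64)):
    """Map [batch, LIST_SIZE, NUM_FEATURES] feature tensors to [batch, LIST_SIZE] scores."""
    inputs = tf.keras.Input(shape=(LIST_SIZE, NUM_FEATURES))
    x = inputs
    for units in hidden_units:                       # three fully connected ReLU layers (widths assumed)
        x = tf.keras.layers.Dense(units, activation="relu")(x)
    scores = tf.keras.layers.Dense(1)(x)             # one scalar score per item: [batch, LIST_SIZE, 1]
    scores = tf.keras.layers.Reshape((LIST_SIZE,))(scores)  # drop the trailing axis: [batch, LIST_SIZE]
    return tf.keras.Model(inputs, scores)

model = build_scoring_network()
model.summary()
```

In the paper's pipeline, the scores produced by such a network would be fed to the differentiable-sorting-based PiRank loss during training; that loss is not reproduced here.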
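
Both benchmarks use graded relevance on a 0-4 scale, and the metrics reported in the paper are NDCG variants. The snippet below is a generic, minimal NDCG@k computation using the exponential gain 2^rel - 1 that is standard for these benchmarks; it illustrates the metric only and is not the paper's evaluation code (which builds on TensorFlow Ranking).

```python
import numpy as np

def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k positions of a ranked list."""
    rel = np.asarray(relevances, dtype=float)[:k]
    gains = 2.0 ** rel - 1.0                         # exponential gain for graded relevance
    discounts = np.log2(np.arange(2, rel.size + 2))  # log2(rank + 1), ranks starting at 1
    return float(np.sum(gains / discounts))

def ndcg_at_k(relevances, k):
    """NDCG@k: DCG of the predicted ordering divided by the ideal (sorted) DCG."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

# Example: items in model-ranked order, with 0-4 graded relevance labels.
print(ndcg_at_k([4, 2, 0, 3, 1], k=5))   # ~0.95 for this near-ideal ordering
```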