Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
LBD: Decouple Relevance and Observation for Individual-Level Unbiased Learning to Rank
Authors: Mouxiang Chen, Chenghao Liu, Zemin Liu, Jianling Sun
NeurIPS 2022 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical results on two LTR benchmark datasets show that the proposed model outperforms the state-of-the-art baselines and verify its effectiveness in debiasing data. |
| Researcher Affiliation | Collaboration | Mouxiang Chen¹,⁴, Chenghao Liu², Zemin Liu³, Jianling Sun¹,⁴; ¹Zhejiang University, ²Salesforce Research Asia, ³National University of Singapore, ⁴Alibaba-Zhejiang University Joint Institute of Frontier Technologies |
| Pseudocode | No | The paper describes the model implementation and objective function in Section 5, but it does not include a clearly labeled pseudocode or algorithm block. |
| Open Source Code | Yes | Our codes are available at https://github.com/Keytoyze/Lipschitz-Bernoulli-Decoupling. |
| Open Datasets | Yes | We conducted semi-synthetic experiments on two widely used benchmark datasets: Yahoo! LETOR³ [12] and Istella-S⁴ [33]. We provide further details for these datasets in Appendix C.1. We followed the given data split of training, validation and testing. ³ https://webscope.sandbox.yahoo.com/ ⁴ http://quickrank.isti.cnr.it/istella-dataset/ |
| Dataset Splits | Yes | We followed the given data split of training, validation and testing. |
| Hardware Specification | No | This work is not resource-intensive. |
| Software Dependencies | No | The paper mentions using a 'neural network' for the ranking and observation models, and adopting 'the codes in ULTRA framework' for baselines. However, it does not specify concrete versions for ancillary software or libraries. |
| Experiment Setup | Yes | Following the steps proposed by [13], we set the relevance probability to be: $\Pr(R = 1 \mid X = x) = \epsilon + (1 - \epsilon)\,\frac{2^{y_x} - 1}{2^{y_{\max}} - 1}$ (3) ... $\epsilon$ is the click noise level and we set $\epsilon = 0.1$ as the default setting. ... $w$ is a 10-dimensional vector uniformly drawn from $[-\eta, \eta]$, where $\eta$ is a hyperparameter to control the dependency between the observation and crux features. |
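
The relevance probability quoted in the Experiment Setup row (Eq. 3) can be illustrated with a short sketch. This is not taken from the authors' repository: the function name `relevance_probability` is hypothetical, and the default `y_max=4` assumes five-level relevance labels (0 to 4), which the excerpt above does not state. Only `eps=0.1` comes from the quoted text.

```python
def relevance_probability(y: int, y_max: int = 4, eps: float = 0.1) -> float:
    """Pr(R = 1 | X = x) = eps + (1 - eps) * (2^y - 1) / (2^y_max - 1).

    y     -- graded relevance label of the document (y_x in Eq. 3)
    y_max -- largest label in the dataset (assumed to be 4 here)
    eps   -- click noise level (0.1 is the default quoted above)
    """
    if not 0 <= y <= y_max:
        raise ValueError("label y must lie in [0, y_max]")
    if not 0.0 <= eps <= 1.0:
        raise ValueError("eps must lie in [0, 1]")
    return eps + (1.0 - eps) * (2 ** y - 1) / (2 ** y_max - 1)


if __name__ == "__main__":
    # Print the probability for every label under the assumed 0..4 scale.
    for label in range(5):
        print(label, round(relevance_probability(label), 4))
```

Under these assumptions, a document with the lowest label is still considered relevant with probability `eps`, which is how the formula models click noise, while the highest label maps to probability 1.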