Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
SQL-Rank: A Listwise Approach to Collaborative Ranking
Authors: Liwei Wu, Cho-Jui Hsieh, James Sharpnack
ICML 2018 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we compare our proposed algorithm (SQL-Rank) with other state-of-the-art algorithms on real world datasets. |
| Researcher Affiliation | Academia | 1Department of Statistics, University of California, Davis, CA, USA 2Department of Computer Science, University of California, Davis, CA, USA. |
| Pseudocode | Yes | Algorithm 1 SQL-Rank: General Framework |
| Open Source Code | Yes | SQL-Rank: our proposed algorithm implemented in Julia 1. 1https://github.com/wuliwei9278/SQL-Rank |
| Open Datasets | Yes | We experiment on the following four datasets. Note that the original data of Movielens1m, Amazon and Yahoo-music are ratings from 1 to 5, so we follow the procedure in (Rendle et al., 2009; Yu et al., 2017) to preprocess the data. ... Movielens1m: a popular movie recommendation data with 6,040 users and 3,952 items. Amazon: the Amazon purchase rating data for musical instruments 3 with 339,232 users and 83,047 items. Yahoo-music: the Yahoo music rating data set 4 which contains 15,400 users and 1,000 items. Foursquare: a location check-in data5. |
| Dataset Splits | Yes | We use rank r = 100 and tune regularization parameters for all three algorithms using a random sampled validation set. |
| Hardware Specification | Yes | All experiments are conducted on a server with an Intel Xeon E5-2640 2.40GHz CPU and 64G RAM. |
| Software Dependencies | No | The paper mentions 'Julia' and 'C++' but does not provide specific version numbers for these or other software dependencies, aside from an ambiguous 'Julia 1'. |
| Experiment Setup | Yes | We use rank r = 100 and tune regularization parameters for all three algorithms using a random sampled validation set. For Weighted-MF, we also tune the confidence weights on unobserved data. For BPR and SQL-Rank, we fix the ratio of subsampled unobserved 0s versus observed 1s to be 3:1, which gives the best performance for both BPR and SQL-rank in practice. |
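
The 3:1 subsampling ratio quoted in the Experiment Setup row can be illustrated with a minimal Python sketch. The quoted text states only the ratio, so everything else here is an assumption for illustration: sampling is done per user, uniformly and without replacement, and the function name `subsample_unobserved`, the dictionary layout, and the toy data are hypothetical. This is not taken from the authors' Julia implementation.

```python
import random


def subsample_unobserved(observed, n_items, ratio=3, seed=0):
    """Draw `ratio` unobserved items (label 0) per observed item (label 1).

    Assumptions (not stated in the source): sampling is per user, uniform,
    and without replacement. `observed` maps a user id to the list of item
    ids that user interacted with.
    """
    rng = random.Random(seed)
    sampled = {}
    for user, pos_items in observed.items():
        pos = set(pos_items)
        # Every item the user has not interacted with is a candidate 0.
        candidates = [i for i in range(n_items) if i not in pos]
        # Cap at the number of available candidates for very active users.
        k = min(ratio * len(pos), len(candidates))
        sampled[user] = rng.sample(candidates, k)
    return sampled


# Toy data: two users, 20 items in total (hypothetical).
observed = {0: [1, 4], 1: [0, 2, 3]}
negatives = subsample_unobserved(observed, n_items=20)
for user, neg in negatives.items():
    print(user, len(observed[user]), len(neg))
```

With the toy data, each user ends up with three sampled unobserved items for every observed one; the cap only matters when a user has interacted with more than a quarter of the catalogue.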