Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Aggregating Quantitative Relative Judgments: From Social Choice to Ranking Prediction

Authors: Yixuan Xu, Hanrui Zhang, Yu Cheng, Vincent Conitzer

NeurIPS 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We conduct experiments on real-world datasets to compare the performance of ℓ1 and ℓ2 QRJA with existing methods.
Researcher Affiliation	Academia	Yixuan Even Xu (Carnegie Mellon University), Hanrui Zhang (Chinese University of Hong Kong), Yu Cheng (Brown University), Vincent Conitzer (Carnegie Mellon University)
Pseudocode	Yes	Algorithm 1 Subsampling Judgments
Open Source Code	Yes	All source code is available at https://github.com/YixuanEvenXu/quantitative-judgment-aggregation.
Open Datasets	Yes	We use the data from https://www.marathonguide.com/, which publishes results of all major marathon events. ... Codeforces (https://codeforces.com), a website hosting frequent online programming contests...
Dataset Splits	No	The paper states, 'We use the results of the first i − 1 contests to predict the results of the i-th contest,' which describes a temporal train/test split. However, it does not explicitly mention a separate validation set or how hyperparameters were tuned.
Hardware Specification	Yes	All experiments are done on a server with 56 CPU cores and 504G RAM. No GPU is used.
Software Dependencies	Yes	We use Gurobi [Gurobi Optimization, LLC, 2023] and NetworkX [Hagberg et al., 2008] to implement ℓ1 QRJA and the least-square regression implementation in SciPy [Jones et al., 2014] to implement ℓ2 QRJA.
Experiment Setup	No	The paper states that 'We set all weights to 1' and discusses variants of Matrix Factorization with 'r = 1, 2, 5' and the use of 'gradient descent for a fixed number of epochs on a deterministic initialization.' However, specific numerical hyperparameters like the learning rate or the exact number of epochs are not provided in the main text or appendices.
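
The temporal protocol quoted under Dataset Splits (use the first i − 1 contests to predict the i-th) can be illustrated with a minimal sketch. The function name `rolling_splits` and the choice to skip the first contest, which has no history, are assumptions made for illustration; they are not taken from the paper or its repository.

```python
from typing import Iterator, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def rolling_splits(contests: Sequence[T]) -> Iterator[Tuple[List[T], T]]:
    """Yield (history, target) pairs in chronological order.

    For the i-th contest (1-indexed), history holds contests 1..i-1
    and target is contest i. Hypothetical helper, not the authors' code.
    """
    # Assumption: start at the second contest, since the first has no history.
    for i in range(1, len(contests)):
        yield list(contests[:i]), contests[i]


if __name__ == "__main__":
    # Placeholder contest identifiers, assumed to be sorted by date.
    demo = ["contest_1", "contest_2", "contest_3", "contest_4"]
    for history, target in rolling_splits(demo):
        print(f"fit on {history} -> predict {target}")
```

Each target contest is evaluated only against earlier contests, so no future results leak into the prediction. The sketch holds out no validation set, which matches the reason the index gives for the "No" under Dataset Splits.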