Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Aggregating Quantitative Relative Judgments: From Social Choice to Ranking Prediction

Authors: Yixuan Xu, Hanrui Zhang, Yu Cheng, Vincent Conitzer

NeurIPS 2024

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | We conduct experiments on real-world datasets to compare the performance of ℓ1 and ℓ2 QRJA with existing methods. |
| Researcher Affiliation | Academia | Yixuan Even Xu (Carnegie Mellon University); Hanrui Zhang (Chinese University of Hong Kong); Yu Cheng (Brown University); Vincent Conitzer (Carnegie Mellon University) |
| Pseudocode | Yes | Algorithm 1 Subsampling Judgments |
| Open Source Code | Yes | All source code is available at https://github.com/YixuanEvenXu/quantitative-judgment-aggregation. |
| Open Datasets | Yes | We use the data from https://www.marathonguide.com/, which publishes results of all major marathon events. ... Codeforces (https://codeforces.com), a website hosting frequent online programming contests... |
| Dataset Splits | No | The paper states, 'We use the results of the first i − 1 contests to predict the results of the i-th contest,' which describes a temporal train/test split. However, it does not explicitly mention a separate validation set or how hyperparameters were tuned. |
| Hardware Specification | Yes | All experiments are done on a server with 56 CPU cores and 504G RAM. No GPU is used. |
| Software Dependencies | Yes | We use Gurobi [Gurobi Optimization, LLC, 2023] and NetworkX [Hagberg et al., 2008] to implement ℓ1 QRJA and the least-square regression implementation in SciPy [Jones et al., 2014] to implement ℓ2 QRJA. |
| Experiment Setup | No | The paper states that 'We set all weights to 1' and discusses variants of Matrix Factorization with 'r = 1, 2, 5' and the use of 'gradient descent for a fixed number of epochs on a deterministic initialization.' However, specific numerical hyperparameters like the learning rate or the exact number of epochs are not provided in the main text or appendices. |
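
The temporal split quoted under Dataset Splits (fit on the first i − 1 contests, predict the i-th) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the `mean_score_predictor` baseline, the `pairwise_accuracy` metric, and the toy contest data are hypothetical and are not the paper's QRJA methods, metrics, or datasets.

```python
from collections import defaultdict
from typing import Callable, Dict, List

# Hypothetical data model: one contest maps contestant -> score (higher is better).
Contest = Dict[str, float]
Predictor = Callable[[List[Contest]], Dict[str, float]]


def mean_score_predictor(history: List[Contest]) -> Dict[str, float]:
    """Illustrative baseline (not QRJA): predict each contestant's mean past score."""
    totals: Dict[str, float] = defaultdict(float)
    counts: Dict[str, int] = defaultdict(int)
    for contest in history:
        for name, score in contest.items():
            totals[name] += score
            counts[name] += 1
    return {name: totals[name] / counts[name] for name in totals}


def pairwise_accuracy(predicted: Dict[str, float], actual: Contest) -> float:
    """Fraction of contestant pairs ordered the same way in prediction and outcome.

    Contestants absent from the history are ignored; tied pairs are skipped.
    """
    names = [n for n in actual if n in predicted]
    correct = 0
    total = 0
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            actual_diff = actual[names[i]] - actual[names[j]]
            predicted_diff = predicted[names[i]] - predicted[names[j]]
            if actual_diff == 0 or predicted_diff == 0:
                continue
            total += 1
            correct += (actual_diff > 0) == (predicted_diff > 0)
    return correct / total if total else float("nan")


def temporal_evaluation(contests: List[Contest], fit: Predictor) -> List[float]:
    """For each i >= 1, fit on contests[:i] only and score on contests[i]."""
    scores = []
    for i in range(1, len(contests)):
        predicted = fit(contests[:i])  # no information from contest i or later
        scores.append(pairwise_accuracy(predicted, contests[i]))
    return scores


# Toy data, invented for this example.
contests = [
    {"alice": 3.0, "bob": 2.0, "carol": 1.0},
    {"alice": 2.5, "bob": 3.0, "carol": 1.5},
    {"alice": 3.0, "bob": 2.0, "carol": 2.5},
]
scores = temporal_evaluation(contests, mean_score_predictor)
print(scores)
```

Because each prediction only sees earlier contests, this loop has no held-out validation set. That matches the scorecard's observation that the split is temporal but that hyperparameter tuning is not described.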