Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Generalized Top-k Mallows Model for Ranked Choices

Authors: Shahrzad Haddadan, Sara Ahmadian

NeurIPS 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Furthermore, through extensive experiments on synthetic data and real-world data, we demonstrate the scalability and accuracy of our proposed methods, and we compare the predictive power of Mallows model for top-k lists compared to the simpler Multinomial Logit model. [...] 6 Experiments In this section, we present our experimental analysis, designed to achieve two main objectives: (1) to compare the predictive power of the top-k Mallows model (TOPKGMM) with that of the multinomial logit model (MNL), and (2) to evaluate the accuracy and computational complexity of our methods, namely PRIM sampling algorithm, the DYPCHIP choice probability computation, and the two learning algorithms, FINDTOP and BUCCHOI.
Researcher Affiliation	Collaboration	Shahrzad Haddadan Rutgers Business School Piscataway, NJ EMAIL Sara Ahmadian Google research Seattle, WA EMAIL
Pseudocode	Yes	Algorithm 1 FINDTOP [...] Algorithm 2 BUCCHOI [...] Algorithm 3 TOPKGMMSAMPLING (TOPKGMM) [...] Algorithm 4 PROFILE-BASED RIM(PRIM) [...] Algorithm 5 PROFILE PROBABILITY [...] Algorithm 6 SORTCNTR [...] Algorithm 7 BUCCHOI-II
Open Source Code	Yes	The code and log files are available publicly [3]. [3] Link to the code: https://github.com/Shahrzad Git/topkmallows-choices
Open Datasets	Yes	We used Sushi Preference Data Set (Kamishima et al., 2005) which contains preference of customers over a set of 100 different sushi types [4]. [4] Link to the Sushi Preference Data Set: https://www.kamishima.net/sushi/.
Dataset Splits	Yes	We begin by randomly splitting the 5K top-10 preference data into a training set (80%) and a test set (20%).
Hardware Specification	Yes	Results are generated by running the code on a MacBook Pro M1 Max, 32GB RAM.
Software Dependencies	No	The paper does not explicitly state specific software dependencies with version numbers for libraries, frameworks, or programming languages used in the experiments. It mentions the
Experiment Setup	Yes	We apply BUCCHOI using assortments of size one or two (Algorithm 7) to the training set using various values of p and β to learn the center of the distribution. With the learned parameters, we use DYPCHIP to compute the corresponding choice probabilities. For evaluation, we use empirical choice probabilities on the test set by repeatedly sampling random assortments and recording corresponding choices. These empirical estimates are then compared to the predictions from DYPCHIP to assess out-of-sample accuracy; the errors are reported in Table 1. We tune parameters β and p by performing a grid search over a range of values as in Table 1.
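
The random 80/20 split and the empirical choice probabilities quoted in the Dataset Splits and Experiment Setup rows can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the seeds, the synthetic rankings, and the rule that a customer chooses the best-ranked item of their top-k list that appears in the assortment (and nothing otherwise) are all hypothetical choices made for the example.

```python
import random
from collections import Counter


def split_train_test(rankings, train_frac=0.8, seed=0):
    """Randomly split a list of top-k rankings into train and test parts."""
    rng = random.Random(seed)
    shuffled = list(rankings)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_frac)
    return shuffled[:cut], shuffled[cut:]


def choose(ranking, assortment):
    """Assumed choice rule: the best-ranked offered item, or None if none is listed."""
    for item in ranking:
        if item in assortment:
            return item
    return None


def empirical_choice_probs(rankings, assortment):
    """Fraction of rankings that choose each item (key None means no choice)."""
    counts = Counter(choose(r, assortment) for r in rankings)
    n = len(rankings)
    return {item: c / n for item, c in counts.items()}


# Synthetic stand-in for the 5K top-10 lists over 100 items (not the Sushi data).
data_rng = random.Random(1)
data = [data_rng.sample(range(100), 10) for _ in range(5000)]

train, test = split_train_test(data)
probs = empirical_choice_probs(test, assortment={3, 17})
```

In the setup described above, estimates like `probs` would be computed for repeatedly sampled random assortments and compared with the model's predicted choice probabilities for each grid value of β and p; that model-side computation (DYPCHIP) is not reproduced here.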