reproducibilityindex.ai

Mallows Models for Top-k Lists

Authors: Flavio Chierichetti, Anirban Dasgupta, Shahrzad Haddadan, Ravi Kumar, Silvio Lattanzi

NeurIPS 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Unlike many earlier works, our model is both analytically tractable and computationally efﬁcient. We demonstrate this by studying two basic problems in this model, namely, sampling and reconstruction, from both algorithmic and experimental points of view. We also conduct experiments on both synthetic and real-world datasets to demonstrate the efﬁciency and usability of our algorithms in practice.
Researcher Affiliation	Collaboration	Flavio Chierichetti Sapienza University, Rome, Italy flavio@di.uniroma1.it Anirban Dasgupta IIT, Gandhinagar, India anirban.dasgupta@gmail.com Shahrzad Haddadan Sapienza University, Rome, Italy shahrzad.haddadan@uniroma1.it Ravi Kumar Google, Mountain View, CA ravi.k53@gmail.com Silvio Lattanzi Google, Zurich, Switzerland silviolat@gmail.com
Pseudocode	No	The paper describes algorithms in text but does not include a formally labeled pseudocode or algorithm block.
Open Source Code	No	The paper does not provide any explicit statement or link for the release of open-source code for the described methodology.
Open Datasets	Yes	We acquired 176 movie lists from www.criterion.com/explore/top10 and truncated each list to the ﬁrst 10 movies.
Dataset Splits	No	The paper does not provide explicit details about training, validation, or test dataset splits. It mentions using synthetic and real-world datasets but does not specify how they were partitioned for model training, validation, and testing.
Hardware Specification	No	The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, memory) used to run the experiments.
Software Dependencies	No	The paper does not provide specific software dependencies or their version numbers (e.g., programming languages, libraries, frameworks, or solvers) needed to reproduce the experiments.
Experiment Setup	Yes	We ran the algorithms on synthetic datasets, as well as on a real world dataset. The synthetic datasets were all generated by the top-k Mallows model, having a single center. We ﬁrst produced a number of synthetic top-k lists by sampling from M(p) β,Ik,k,n, for a number of choices of n, k, β, p, by running the Markov chain (Section 3.2) for 1000 steps. Figure 2 shows the probability of returning the center (averaged over various runs) of the algorithms on these inputs. The Condorcet algorithm clearly outperforms the other two. The ﬁrst three plots detail the results for n = 10 and k = 5, p = 0.5, and β {0.2, 0.5, 1.0}. The last plot, instead, has n = 100 and k = 10, and p = β = 0.5.