reproducibilityindex.ai

Concentric mixtures of Mallows models for top-$k$ rankings: sampling and identifiability

Authors: Fabien Collas, Ekhine Irurozki

ICML 2021

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In this section, we validate empirically our proposal. The experimental framework is as follows. In the first two experiments, we generate a sample of partial rankings, using Algorithm 1, with parameters n = 30 and k = 10, from a mixture of concentric MM, both centered at a random σ0 and with two dispersion parameters, θb, θg. The mixture parameter is denoted r.
Researcher Affiliation	Academia	¹Basque Center for Applied Mathematics, Bilbao, Spain. ²LTCI, Telecom Paris, Institut Polytechnique de Paris.
Pseudocode	Yes	Algorithm 1: Sample top-k in O(k log k). Data: n, k, θ, σ0. Result: σ, a top-k ranking of n items distributed according to M(σ0, θ). for j ∈ [1, k] do: V_j(πσ0⁻¹) = random choice in [n − j] with choice probabilities of Eq. (3); πσ0⁻¹ = transform V(πσ0⁻¹) with the bijection in (McClellan et al., 1974); return π⁻¹; end
Open Source Code	Yes	Software implementing the algorithms described here is distributed in https://github.com/ekhiru/top-k-mallows.
Open Datasets	Yes	To test the identifiability on real data, we used a dataset already used in (Fligner & Verducci, 1986), for which 98 college students were asked to rank five words according to its strength of association with the word “idea”.
Dataset Splits	No	The paper describes generating samples for experiments and their sizes (e.g., “mg = 40 rankings from a M(σ0, θg)”, “using the same growing sample, with size {1, 2, 3, ..., 44}”), but does not provide specific train/validation/test splits or methodology for data partitioning.
Hardware Specification	No	The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used for running its experiments.
Software Dependencies	No	The paper provides a link to its software implementation but does not list specific software dependencies (e.g., libraries, frameworks) with their version numbers required for reproduction.
Experiment Setup	Yes	In the first two experiments, we generate a sample of partial rankings, using Algorithm 1, with parameters n = 30 and k = 10, from a mixture of concentric MM, both centered at a random σ0 and with two dispersion parameters, θb, θg. The mixture parameter is denoted r. ... mg = 40 rankings from a M(σ0, θg) such that E[d(γ, σ0)] ∈ {3, 8, 13, ..., 48}; mb = 60 rankings from a M(σ0, θb) such that E[d(β, σ0)] = c E[d(γ, σ0)] with 40 > c ≥ 3 and E[d(γ, σ0)] ≤ 217 (bound corresponding to the uniform distribution).
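
The sampling procedure quoted under Pseudocode and Experiment Setup can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation. It assumes the standard Kendall-tau factorization of the Mallows model, in which each code entry V_j is drawn independently with probability proportional to exp(-θ·r) for r in {0, ..., n − j}, since Eq. (3) itself is not reproduced in this report. The code is decoded naively with list pops, which costs O(nk) rather than the O(k log k) of Algorithm 1, and it does not use the McClellan et al. bijection. The function names `sample_top_k` and `sample_mixture` and the dispersion values in the usage lines are hypothetical.

```python
import math
import random


def sample_top_k(n, k, theta, sigma0, rng=random):
    """Draw the top-k items of one ranking from a Mallows model (Kendall tau).

    sigma0 is the central ordering: sigma0[0] is the item ranked first.
    Assumption: P(V_j = r) is proportional to exp(-theta * r).
    """
    if len(sigma0) != n or not 0 <= k <= n:
        raise ValueError("sigma0 must have n items and k must lie in [0, n]")
    remaining = list(sigma0)
    top_k = []
    for _ in range(k):
        size = len(remaining)  # number of items not yet placed
        weights = [math.exp(-theta * r) for r in range(size)]
        v_j = rng.choices(range(size), weights=weights)[0]
        # Naive decoding: take the v_j-th remaining item (O(n) per pop).
        top_k.append(remaining.pop(v_j))
    return top_k


def sample_mixture(n, k, theta_g, theta_b, m_g, m_b, sigma0, rng=random):
    """Draw m_g rankings with dispersion theta_g and m_b with theta_b,
    all centered at the same sigma0 (a concentric mixture)."""
    good = [sample_top_k(n, k, theta_g, sigma0, rng) for _ in range(m_g)]
    bad = [sample_top_k(n, k, theta_b, sigma0, rng) for _ in range(m_b)]
    return good + bad


# Usage with the sizes quoted above (n = 30, k = 10, mg = 40, mb = 60).
# The theta values are illustrative only; the paper fixes them through
# the expected distances E[d(., sigma0)] instead.
rng = random.Random(0)
sigma0 = list(range(30))
rng.shuffle(sigma0)
sample = sample_mixture(30, 10, theta_g=1.0, theta_b=0.1,
                        m_g=40, m_b=60, sigma0=sigma0, rng=rng)
```

A larger θ concentrates the draws near σ0, so the component with θg plays the role of the low-noise rankings and the one with θb the high-noise rankings.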