A Minimax Optimal Algorithm for Crowdsourcing

Authors: Thomas Bonald, Richard Combes

NeurIPS 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We conclude by assessing the performance of TE and other state-of-the-art algorithms on both synthetic and real-world data. In Section 7 we present numerical experiments on synthetic and real-world data sets, and Section 8 concludes the paper."
Researcher Affiliation | Academia | Thomas Bonald (Telecom ParisTech, thomas.bonald@telecom-paristech.fr); Richard Combes (CentraleSupelec / L2S, richard.combes@supelec.fr)
Pseudocode | No | The paper describes the TE algorithm in Section 6 using descriptive text and mathematical formulas, but it does not present the algorithm in a formal pseudocode block or clearly labeled algorithm environment.
Open Source Code | No | The paper does not provide any links, statements, or references to open-source code for the methodology described.
Open Datasets | Yes | "We next consider 6 publicly available data sets (see [Whitehill et al., 2009, Zhou et al., 2015] and summary information in Table 3)."
Dataset Splits | No | The paper describes the synthetic and real-world datasets used, including preprocessing steps for the real-world data, but it does not specify training, validation, or test splits, nor does it mention cross-validation.
Hardware Specification | No | The paper does not report any hardware details, such as GPU or CPU models, processor types, or memory amounts, used to run its experiments.
Software Dependencies | No | The paper does not list ancillary software, such as library names with version numbers, that would be needed to replicate the experiments.
Experiment Setup | Yes | "We consider three instances: (i) n = 50, t = 10^3, α = 0.25, θ_i = a if i ≤ n/2 and 0 otherwise; (ii) n = 50, t = 10^4, α = 0.25, θ = (1, a, a, 0, ..., 0); (iii) n = 50, t = 10^4, α = 0.25, a = 0.9, θ = (a, a, a, a, b/√(n−4), ..., b/√(n−4)). For each instance we average the performance of the algorithms over 10^3 independent runs and apply a random permutation of the components of θ before each run. First, for data sets with more than 2 possible label values, we split the label values into two groups and associate them with −1 and +1 respectively. Second, we remove any worker who provides fewer than 10 labels."
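The quoted setup (three synthetic reliability vectors θ, a random permutation before each run, and the two real-data preprocessing steps) can be sketched as follows. This is a minimal illustration assuming NumPy; the function names, the default value of b, and the `binarize` grouping are our own placeholders — only n = 50, the shape of each θ, a = 0.9 for instance (iii), the ±1 label convention, and the 10-label cutoff come from the excerpt.

```python
import numpy as np


def instance_theta(which, n=50, a=0.9, b=0.5, rng=None):
    """Worker-reliability vector theta for synthetic instances (i)-(iii).

    Hypothetical helper: `b` has no stated value in the excerpt, and
    a = 0.9 is only given for instance (iii); both default values here
    are illustrative.
    """
    rng = np.random.default_rng() if rng is None else rng
    if which == 1:
        # (i): theta_i = a for the first n/2 workers, 0 for the rest
        theta = np.where(np.arange(1, n + 1) <= n / 2, a, 0.0)
    elif which == 2:
        # (ii): one perfect worker, two workers at a, the rest uninformative
        theta = np.zeros(n)
        theta[0] = 1.0
        theta[1:3] = a
    else:
        # (iii): four experts at a, the remaining n - 4 workers at b / sqrt(n - 4)
        theta = np.full(n, b / np.sqrt(n - 4))
        theta[:4] = a
    # "apply a random permutation of the components of theta before each run"
    return rng.permutation(theta)


def preprocess(worker_labels, binarize, min_labels=10):
    """Real-data preprocessing: map raw label values into {-1, +1} via a
    caller-supplied two-group mapping, then drop any worker who provides
    fewer than `min_labels` labels."""
    binarized = {
        worker: [(task, binarize(y)) for task, y in labels]
        for worker, labels in worker_labels.items()
    }
    return {w: ls for w, ls in binarized.items() if len(ls) >= min_labels}
```

A run of instance (i), for example, is `instance_theta(1, rng=np.random.default_rng(0))`, giving 25 components equal to 0.9 and 25 equal to 0 in random order; averaging over 10^3 such permuted runs matches the protocol described above.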