Streaming Bayesian Inference for Crowdsourced Classification

Authors: Edoardo Manino, Long Tran-Thanh, Nicholas Jennings

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In Section 4 we compute its asymptotical accuracy. In Section 5 we compare its performance with the state of the art on both synthetic and real-world datasets.
Researcher Affiliation | Academia | Edoardo Manino, University of Southampton, E.Manino@soton.ac.uk; Long Tran-Thanh, University of Southampton, l.tran-thanh@soton.ac.uk; Nicholas R. Jennings, Imperial College, London, n.jennings@imperial.ac.uk
Pseudocode | Yes | Algorithm 1 Fast SBIC. Input: dataset X, availability a, policy π, prior θ. Output: final predictions ŷ^T ... Algorithm 2 Sorted SBIC. Input: dataset X, availability a, policy π, prior θ. Output: final predictions ŷ^T
Open Source Code | No | The paper does not provide a direct link to open-source code for the SBIC algorithm or explicitly state that the code is publicly available.
Open Datasets | Yes | Second, we consider the 5 publicly available datasets listed in Table 1, which come with binary annotations and ground-truth values. For more information on the datasets see [Snow et al., 2008; Welinder et al., 2010; Lease and Kazai, 2011].
Dataset Splits | No | The paper discusses synthetic and real-world datasets and analyzes prediction error, but it does not explicitly provide details about training, validation, or test dataset splits (e.g., percentages or counts).
Hardware Specification | No | The authors acknowledge the use of the IRIDIS High Performance Computing Facility, and associated support services at the University of Southampton. This names a general facility, but no specific hardware components (e.g., GPU/CPU models, memory) are detailed.
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x) that would be needed to reproduce the experiments.
Experiment Setup | Yes | To do so, we extract workers from a distribution p_j ∼ Beta(4, 3), representing a non-uniform population with large variance. ... Additionally, we set the number of tasks to M = 1000 and the number of labels per worker to L = 10. ... we run EM, AMF, MC and SBIC with parameters α and β matching the distribution of p_j. ... we run EM, AMF, MC and SBIC with the generic prior α = 2, β = 1 and q = 1/2 as proposed in Liu et al. [2012].
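
The synthetic setup quoted in the Experiment Setup row can be sketched roughly as follows. This is a minimal illustration, not the authors' code: it assumes a one-coin worker model (worker j labels correctly with probability p_j), a uniformly random assignment of tasks to workers, and a hypothetical worker count of 500, none of which are stated in the excerpt above. The majority-vote step is only a simple baseline for computing a prediction error; it is not SBIC, EM, AMF or MC.

```python
import random


def simulate_labels(num_workers, num_tasks=1000, labels_per_worker=10,
                    alpha=4.0, beta=3.0, q=0.5, seed=0):
    """Draw a synthetic binary crowdsourcing dataset.

    M = 1000, L = 10 and Beta(4, 3) come from the quoted setup;
    num_workers and the uniform task assignment are assumptions.
    """
    rng = random.Random(seed)
    # Ground truth in {-1, +1}; a task is positive with probability q.
    truth = [1 if rng.random() < q else -1 for _ in range(num_tasks)]
    # Worker accuracies p_j ~ Beta(alpha, beta).
    accuracy = [rng.betavariate(alpha, beta) for _ in range(num_workers)]
    labels = []  # tuples of (task index, worker index, label)
    for j, p_j in enumerate(accuracy):
        # Assumed policy: each worker labels L distinct tasks chosen uniformly.
        for i in rng.sample(range(num_tasks), labels_per_worker):
            is_correct = rng.random() < p_j
            labels.append((i, j, truth[i] if is_correct else -truth[i]))
    return truth, accuracy, labels


def majority_vote(labels, num_tasks):
    """Baseline aggregator: sign of the vote sum; ties and unlabeled tasks map to +1."""
    votes = [0] * num_tasks
    for i, _, x in labels:
        votes[i] += x
    return [1 if v >= 0 else -1 for v in votes]


truth, accuracy, labels = simulate_labels(num_workers=500)
predictions = majority_vote(labels, len(truth))
error = sum(p != t for p, t in zip(predictions, truth)) / len(truth)
print(f"labels: {len(labels)}, majority-vote error: {error:.3f}")
```

Since Beta(4, 3) has mean 4/7, the simulated workers are only slightly better than chance on average, which is why an aggregator that ignores worker accuracy is a weak baseline in this regime.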