Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Bayesian Nonparametric Crowdsourcing
Authors: Pablo G. Moreno, Antonio Artés-Rodríguez, Yee Whye Teh, Fernando Pérez-Cruz
JMLR 2015 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we perform experiments, both on synthetic and real databases, to show the advantages of our models over state-of-the-art algorithms. |
| Researcher Affiliation | Collaboration | Pablo G. Moreno EMAIL Gregorio Marañón Health Research Institute Department of Signal Theory and Communications Universidad Carlos III de Madrid Avda. de la Universidad, 30 28911 Leganés (Madrid, Spain) Yee Whye Teh EMAIL Department of Statistics 1 South Parks Road Oxford OX1 3TG, UK Fernando Perez-Cruz EMAIL Gregorio Marañón Health Research Institute EMAIL Department of Signal Theory and Communications Universidad Carlos III de Madrid Avda. de la Universidad, 30 28911 Leganés (Madrid, Spain) Bell Labs (Alcatel-Lucent) 600 Mountain Avenue New Providence, NJ 07974 |
| Pseudocode | No | The paper describes the inference algorithms (Gibbs sampling, Reuse algorithm) in paragraph text within Section 3 ('Inference') but does not present them in a structured pseudocode or algorithm block. |
| Open Source Code | No | The paper does not contain any explicit statement about releasing source code, nor does it provide a link to a code repository. |
| Open Datasets | Yes | In the second part we use publicly available real data-sets with C = 2 whose principal characteristics are described in Table 1 (Raykar and Yu, 2012). |
| Dataset Splits | No | The paper describes generating synthetic data with varying sparsity and creating random databases for each sparsity level, but it does not specify explicit training, validation, or test splits for either the synthetic or real datasets. It refers to 'accuracy predicting the ground truth' without detailing how the data were partitioned, which would be needed for reproduction. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU models, CPU types, memory) used for running the experiments. |
| Software Dependencies | No | The paper discusses algorithmic approaches like Gibbs sampling and MCMC but does not list any specific software or library names with version numbers that were used for implementation. |
| Experiment Setup | Yes | In the iBCC, the diagonal elements of η are 0.7 while the off-diagonal elements are 0.3, which reflects our prior belief that users perform better than random. All the elements of β are 3. In the cBCC model, the hyperparameters of α are aα = 1 and bα = 10. [...] Finally, in the hcBCC model, we set γ and φ to the values of η and β in the cBCC model respectively. All the components of at are set to 30 while all the components of bt are set to 2. [...] We run the MCMC for 10,000 iterations. After the first 3,000 we collect 7,000 samples to compute z and π. In the cBCC and hcBCC, we set to five the number of iterations used to sample α following the algorithm proposed by Escobar (1994). In the hcBCC we fix the number of auxiliary clusters used by the Reuse Algorithm to h = 10. |
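The experiment-setup row above pins down a concrete MCMC schedule (10,000 iterations, a 3,000-iteration burn-in, 7,000 collected samples) and the prior hyperparameters for the iBCC, cBCC, and hcBCC models. The following minimal Python sketch shows how those reported settings could be organized into a configuration and wired into a burn-in/collection loop. All names here (`config`, `gibbs_step`, `run_chain`, and the config keys) are illustrative assumptions, and `gibbs_step` is a placeholder: the paper's actual Gibbs and Reuse-algorithm updates are not reproduced in this report.

```python
import random

# Hyperparameters as reported in the table above. The key names are
# this sketch's own invention, not the paper's notation.
config = {
    # iBCC: confusion-matrix prior favoring better-than-random users
    "ibcc_eta_diag": 0.7,
    "ibcc_eta_offdiag": 0.3,
    "ibcc_beta": 3.0,
    # cBCC: hyperparameters of the concentration parameter alpha
    "cbcc_a_alpha": 1.0,
    "cbcc_b_alpha": 10.0,
    # hcBCC: gamma and phi reuse the cBCC values of eta and beta;
    # components of a_t and b_t as reported
    "hcbcc_a_t": 30.0,
    "hcbcc_b_t": 2.0,
    "reuse_h": 10,        # auxiliary clusters for the Reuse algorithm
    # MCMC schedule
    "n_iter": 10_000,
    "burn_in": 3_000,
    "alpha_subiters": 5,  # Escobar (1994) steps for sampling alpha
}

def gibbs_step(state, rng):
    """Placeholder for one Gibbs sweep over the latent variables
    (e.g., z and pi). The real updates depend on the model and are
    not specified in this report."""
    return state

def run_chain(config, seed=0):
    """Run the chain and keep only post-burn-in samples."""
    rng = random.Random(seed)
    state = {}
    samples = []
    for it in range(config["n_iter"]):
        state = gibbs_step(state, rng)
        if it >= config["burn_in"]:
            samples.append(state)
    return samples

samples = run_chain(config)
print(len(samples))  # 10,000 iterations minus 3,000 burn-in -> 7,000
```

Under this schedule the collected-sample count matches the paper's reported 7,000 samples used to compute z and π.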