Estimating Accuracy from Unlabeled Data: A Bayesian Approach
Authors: Emmanouil Antonios Platanios, Avinava Dubey, Tom Mitchell
ICML 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on two real-world data sets produce accuracy estimates within a few percent of the true accuracy, using solely unlabeled data. Our models also outperform existing state-of-the-art solutions in both estimating accuracies, and combining multiple classifier outputs. |
| Researcher Affiliation | Academia | Emmanouil Antonios Platanios E.A.PLATANIOS@CS.CMU.EDU Avinava Dubey AKDUBEY@CS.CMU.EDU Tom Mitchell TOM.MITCHELL@CS.CMU.EDU Carnegie Mellon University, Pittsburgh, PA 15213, USA |
| Pseudocode | No | The paper describes models and inference processes textually, but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | the code for our methods and experiments, and the data we used can be found at http://platanios.org/code/ |
| Open Datasets | Yes | NELL Data Set: This data set consists of data samples... (Carlson et al., 2010; Mitchell et al., 2015)... Brain Data Set: Functional Magnetic Resonance Imaging (fMRI) data were collected... (Rowling, 2012)... (Wehbe et al., 2014). |
| Dataset Splits | Yes | The held-out data set consisted of 10% of the total amount of data we had available, which was randomly sampled. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments. |
| Software Dependencies | No | The paper describes the Gibbs sampling procedure and hyperparameter settings but does not specify any software names with version numbers (e.g., libraries, frameworks, programming language versions). |
| Experiment Setup | Yes | For all our experiments and all three of our models, the Gibbs sampling inference procedure we used consisted of the following steps: (i) we sample 4,000 samples that we throw away (i.e., burn-in samples), (ii) we sample 2,000 samples and keep every 10th sample... Regarding the hyperparameters of our models, we set them as: Labels Prior: αp and βp are both set to 1... Error Rates Prior: αe is set to 1 and βe is set to 10. |
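The sampling schedule quoted above (4,000 burn-in draws, then 2,000 draws thinned to every 10th) can be sketched as follows. This is a minimal illustration of the schedule only, not the paper's full conditionals: `step` is a hypothetical single-transition function supplied by the caller, and the toy transition below simply draws from the quoted Beta(αe=1, βe=10) error-rate prior.

```python
import random

# Hyperparameters quoted in the paper: a uniform Beta prior on labels
# (alpha_p = beta_p = 1) and a Beta(1, 10) prior on error rates.
ALPHA_P, BETA_P = 1.0, 1.0
ALPHA_E, BETA_E = 1.0, 10.0

def gibbs_schedule(step, burn_in=4000, n_samples=2000, thin=10, seed=0):
    """Run the quoted schedule: discard `burn_in` draws, then draw
    `n_samples` more and keep every `thin`-th one.

    `step` is a hypothetical callable mapping the current state to the
    next state (a stand-in for the paper's Gibbs transition).
    """
    random.seed(seed)
    state = None
    kept = []
    for i in range(burn_in + n_samples):
        state = step(state)
        # Keep every `thin`-th post-burn-in draw.
        if i >= burn_in and (i - burn_in) % thin == thin - 1:
            kept.append(state)
    return kept

# Toy transition: an independent draw from the error-rate prior,
# NOT the model's actual full conditional.
samples = gibbs_schedule(lambda s: random.betavariate(ALPHA_E, BETA_E))
print(len(samples))  # 2,000 / 10 = 200 retained samples
```

With these settings the procedure retains 200 samples per chain; the Beta(1, 10) prior concentrates mass near zero, encoding the assumption that classifier error rates are below one half.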