Estimating Accuracy from Unlabeled Data: A Bayesian Approach

Authors: Emmanouil Antonios Platanios, Avinava Dubey, Tom Mitchell

ICML 2016 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on two real-world data sets produce accuracy estimates within a few percent of the true accuracy, using solely unlabeled data. Our models also outperform existing state-of-the-art solutions in both estimating accuracies, and combining multiple classifier outputs.
Researcher Affiliation | Academia | Emmanouil Antonios Platanios (E.A.PLATANIOS@CS.CMU.EDU), Avinava Dubey (AKDUBEY@CS.CMU.EDU), Tom Mitchell (TOM.MITCHELL@CS.CMU.EDU), Carnegie Mellon University, Pittsburgh, PA 15213, USA
Pseudocode | No | The paper describes its models and inference processes textually, but does not include any structured pseudocode or algorithm blocks.
Open Source Code | Yes | the code for our methods and experiments, and the data we used can be found at http://platanios.org/code/
Open Datasets | Yes | NELL Data Set: This data set consists of data samples... (Carlson et al., 2010; Mitchell et al., 2015)... Brain Data Set: Functional Magnetic Resonance Imaging (fMRI) data were collected... (Rowling, 2012)... (Wehbe et al., 2014).
Dataset Splits | Yes | The held-out data set consisted of 10% of the total amount of data we had available, which was randomly sampled. (A split sketch follows the table.)
Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments.
Software Dependencies | No | The paper describes the Gibbs sampling procedure and hyperparameter settings but does not specify any software names with version numbers (e.g., libraries, frameworks, programming language versions).
Experiment Setup | Yes | For all our experiments and all three of our models, the Gibbs sampling inference procedure we used consisted of the following steps: (i) we sample 4,000 samples that we throw away (i.e., burn-in samples), (ii) we sample 2,000 samples and keep every 10th sample... Regarding the hyperparameters of our models, we set them as: Labels Prior: α_p and β_p are both set to 1... Error Rates Prior: α_e is set to 1 and β_e is set to 10. (A hedged sketch of this schedule appears after the table.)
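The Dataset Splits row reports a randomly sampled 10% held-out set. A minimal sketch of such a split, assuming the data are held in a NumPy array; the function name, seed, and fraction default are illustrative and not taken from the authors' released code:

```python
import numpy as np

def random_holdout_split(data, holdout_fraction=0.1, seed=0):
    """Randomly hold out a fraction of the samples; return (train, holdout)."""
    rng = np.random.default_rng(seed)
    permutation = rng.permutation(len(data))
    n_holdout = int(round(holdout_fraction * len(data)))
    holdout_idx = permutation[:n_holdout]   # randomly sampled 10%
    train_idx = permutation[n_holdout:]     # remaining 90%
    return data[train_idx], data[holdout_idx]
```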
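The Experiment Setup row quotes a Gibbs sampling schedule (4,000 burn-in samples, then 2,000 samples thinned to every 10th) and Beta prior hyperparameters (α_p = β_p = 1 for the labels prior; α_e = 1, β_e = 10 for the error-rates prior). Below is a hedged sketch of how such a schedule might be wired around a sampler, assuming a user-supplied `gibbs_step` callable that draws one posterior sample; the names and structure are placeholders, not the authors' API:

```python
import numpy as np

# Hyperparameters reported in the paper (Beta priors).
ALPHA_P, BETA_P = 1.0, 1.0   # labels prior
ALPHA_E, BETA_E = 1.0, 10.0  # error rates prior

BURN_IN = 4000    # samples discarded as burn-in
N_SAMPLES = 2000  # samples drawn after burn-in
THIN = 10         # keep every 10th sample

def run_gibbs(gibbs_step, initial_state, rng=None):
    """Run the burn-in/thinning schedule around a user-supplied Gibbs step."""
    rng = np.random.default_rng(0) if rng is None else rng
    state = initial_state
    kept = []
    for _ in range(BURN_IN):            # burn-in phase: samples are discarded
        state = gibbs_step(state, rng)
    for i in range(N_SAMPLES):          # sampling phase
        state = gibbs_step(state, rng)
        if (i + 1) % THIN == 0:         # thinning: retain every 10th sample
            kept.append(state)
    return kept
```

With this schedule, 200 thinned samples are retained for computing posterior estimates of the accuracies and labels.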