Estimating Accuracy from Unlabeled Data: A Bayesian Approach
Authors: Emmanouil Antonios Platanios, Avinava Dubey, Tom Mitchell
ICML 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on two real-world data sets produce accuracy estimates within a few percent of the true accuracy, using solely unlabeled data. Our models also outperform existing state-of-the-art solutions in both estimating accuracies, and combining multiple classifier outputs. |
| Researcher Affiliation | Academia | Emmanouil Antonios Platanios E.A.PLATANIOS@CS.CMU.EDU Avinava Dubey AKDUBEY@CS.CMU.EDU Tom Mitchell TOM.MITCHELL@CS.CMU.EDU Carnegie Mellon University, Pittsburgh, PA 15213, USA |
| Pseudocode | No | The paper describes models and inference processes textually, but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | the code for our methods and experiments, and the data we used can be found at http://platanios.org/code/ |
| Open Datasets | Yes | NELL Data Set: This data set consists of data samples... (Carlson et al., 2010; Mitchell et al., 2015)... Brain Data Set: Functional Magnetic Resonance Imaging (fMRI) data were collected... (Rowling, 2012)... (Wehbe et al., 2014). |
| Dataset Splits | Yes | The held-out data set consisted of 10% of the total amount of data we had available, which was randomly sampled. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments. |
| Software Dependencies | No | The paper describes the Gibbs sampling procedure and hyperparameter settings but does not specify any software names with version numbers (e.g., libraries, frameworks, programming language versions). |
| Experiment Setup | Yes | For all our experiments and all three of our models, the Gibbs sampling inference procedure we used consisted of the following steps: (i) we sample 4,000 samples that we throw away (i.e., burn-in samples), (ii) we sample 2,000 samples and keep every 10th sample... Regarding the hyperparameters of our models, we set them as: Labels Prior: αp and βp are both set to 1... Error Rates Prior: αe is set to 1 and βe is set to 10. |
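The sampling schedule quoted above (4,000 burn-in draws, then 2,000 draws thinned to every 10th) can be sketched as follows. This is a minimal illustration of the schedule only, not the paper's full conditionals: `step` is a hypothetical single-transition function supplied by the caller, and the toy transition below simply draws from the quoted Beta(αe=1, βe=10) error-rate prior.

```python
import random

# Hyperparameters quoted in the paper: a uniform Beta prior on labels
# (alpha_p = beta_p = 1) and a Beta(1, 10) prior on error rates.
ALPHA_P, BETA_P = 1.0, 1.0
ALPHA_E, BETA_E = 1.0, 10.0

def gibbs_schedule(step, burn_in=4000, n_samples=2000, thin=10, seed=0):
    """Run the quoted schedule: discard `burn_in` draws, then draw
    `n_samples` more and keep every `thin`-th one.

    `step` is a hypothetical callable mapping the current state to the
    next state (a stand-in for the paper's Gibbs transition).
    """
    random.seed(seed)
    state = None
    kept = []
    for i in range(burn_in + n_samples):
        state = step(state)
        # Keep every `thin`-th post-burn-in draw.
        if i >= burn_in and (i - burn_in) % thin == thin - 1:
            kept.append(state)
    return kept

# Toy transition: an independent draw from the error-rate prior,
# NOT the model's actual full conditional.
samples = gibbs_schedule(lambda s: random.betavariate(ALPHA_E, BETA_E))
print(len(samples))  # 2,000 / 10 = 200 retained samples
```

With these settings the procedure retains 200 samples per chain; the Beta(1, 10) prior concentrates mass near zero, encoding the assumption that classifier error rates are below one half.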