Estimating Accuracy from Unlabeled Data: A Probabilistic Logic Approach

Authors: Emmanouil Platanios, Hoifung Poon, Tom M. Mitchell, Eric J. Horvitz

NeurIPS 2017

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Experiments on four real-world data sets produce accuracy estimates within a few percent of the true accuracy, using solely unlabeled data. Our models also outperform existing state-of-the-art solutions in both estimating accuracies and combining multiple classifier outputs." |
| Researcher Affiliation | Collaboration | Emmanouil A. Platanios (Carnegie Mellon University, Pittsburgh, PA; e.a.platanios@cs.cmu.edu); Hoifung Poon (Microsoft Research, Redmond, WA; hoifung@microsoft.com); Tom M. Mitchell (Carnegie Mellon University, Pittsburgh, PA; tom.mitchell@cs.cmu.edu); Eric Horvitz (Microsoft Research, Redmond, WA; horvitz@microsoft.com) |
| Pseudocode | Yes | "Our grounding algorithm is shown in the supplementary material and is based on the idea that a ground rule is only useful if the function approximation predicate that appears in its body is observed. [...] The algorithm is summarized in the supplementary material." |
| Open Source Code | Yes | "Our implementation as well as the experiment data sets are available at https://github.com/eaplatanios/makina." |
| Open Datasets | Yes | "Our implementation as well as the experiment data sets are available at https://github.com/eaplatanios/makina." |
| Dataset Splits | No | The paper describes the datasets used (NELL-7, NELL-11, uNELL, uBRAIN) but does not provide specific details on how they were split into training, validation, or test sets (e.g., percentages, sample counts, or a cross-validation scheme). |
| Hardware Specification | Yes | "First, note that the largest execution time of our method among all data sets was about 10 minutes, using a 2013 15-inch MacBook Pro." |
| Software Dependencies | No | The paper mentions using Probabilistic Soft Logic (PSL) but does not provide version numbers for PSL or any other software libraries, frameworks, or programming languages used in the implementation. |
| Experiment Setup | No | The paper describes the theoretical model and inference process but does not provide concrete experimental-setup details such as hyperparameter values (e.g., learning rates, batch sizes, epochs), optimizer settings, or other system-level training configurations. |
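The grounding principle quoted in the Pseudocode row (a ground rule is kept only if the function-approximation predicate in its body is observed) can be sketched in a few lines. This is an illustrative reconstruction, not the authors' PSL implementation; every name in it (`ground_rules`, `body_candidates`, the `(classifier, instance)` keys) is hypothetical.

```python
# Hypothetical sketch of lazy grounding: instantiate rule templates, but keep
# only ground rules whose body predicate -- classifier j's prediction on
# instance i -- was actually observed. Unobserved groundings add nothing to
# inference, so skipping them shrinks the ground model.

def ground_rules(rule_templates, observed_predictions):
    """Return ground rules whose body predicate is observed.

    rule_templates: list of dicts with a "name" and candidate
        (classifier, instance) pairs for the body predicate.
    observed_predictions: dict mapping (classifier, instance) -> soft
        truth value in [0, 1].
    """
    grounded = []
    for template in rule_templates:
        for classifier, instance in template["body_candidates"]:
            # Lazy-grounding test: drop groundings never observed.
            if (classifier, instance) in observed_predictions:
                grounded.append({
                    "rule": template["name"],
                    "classifier": classifier,
                    "instance": instance,
                    "value": observed_predictions[(classifier, instance)],
                })
    return grounded


# Toy usage: two classifiers, two instances, but only two predictions observed.
observed = {("f1", "x1"): 0.9, ("f2", "x1"): 0.2}
templates = [{"name": "error_rule",
              "body_candidates": [("f1", "x1"), ("f1", "x2"),
                                  ("f2", "x1"), ("f2", "x2")]}]
print(len(ground_rules(templates, observed)))  # prints 2
```

Of the four candidate groundings, only the two with observed predictions survive, which is the source of the speedups the paper attributes to its grounding algorithm.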