Recovering True Classifier Performance in Positive-Unlabeled Learning
Authors: Shantanu Jain, Martha White, Predrag Radivojac
AAAI 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Using state-of-the-art algorithms to estimate the positive class prior and the proportion of noise, we experimentally evaluate two correction approaches and demonstrate their efficacy on real-life data. |
| Researcher Affiliation | Academia | Shantanu Jain, Martha White, Predrag Radivojac; Department of Computer Science, Indiana University, Bloomington, Indiana, USA; {shajain, martha, predrag}@indiana.edu |
| Pseudocode | No | The paper states 'The full algorithm for the indirect recovery is given in the arXiv supplement of this paper.' but no pseudocode or algorithm blocks are present in the provided text. |
| Open Source Code | No | The paper does not provide any explicit statement about releasing source code or a link to a code repository for the methodology described. |
| Open Datasets | Yes | Our estimators were evaluated using twelve real-life data sets from the UCI Machine Learning Repository (Lichman 2013). |
| Dataset Splits | Yes | A validation set containing 25% of the training data was used to terminate training. The number of actual positive examples in each labeled set was a function of parameter β ∈ {1, 0.95, 0.75} (one possible construction is sketched after the table). |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running experiments. |
| Software Dependencies | No | The paper names the algorithms used (feedforward neural networks, resilient propagation) but does not list specific software dependencies with version numbers (e.g., Python, PyTorch, or TensorFlow versions). |
| Experiment Setup | Yes | Classifiers were constructed as ensembles of 100 feedforward neural networks... Each network had five hidden neurons and was trained using resilient propagation... A validation set containing 25% of the training data was used to terminate training. (A sketch of this setup follows the table.) |
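The β values quoted in the Dataset Splits row control how many of the "labeled positive" examples are actual positives, with the remaining fraction 1 − β acting as mislabeled noise. A minimal sketch of one way such a labeled set could be assembled is below; the function name, toy data, and labeled-set size are assumptions, and only the β values come from the paper.

```python
import numpy as np

def make_labeled_set(X_pos, X_neg, n_labeled, beta, rng):
    # A fraction beta of the labeled set are true positives; the rest are
    # negatives mislabeled as positive (the paper's "noise").
    n_true = int(round(beta * n_labeled))      # actual positives
    n_noise = n_labeled - n_true               # mislabeled negatives
    pos = X_pos[rng.choice(len(X_pos), n_true, replace=False)]
    noise = X_neg[rng.choice(len(X_neg), n_noise, replace=False)]
    return np.vstack([pos, noise])

rng = np.random.default_rng(0)
X_pos = rng.normal(+1.0, 1.0, size=(500, 2))   # toy positive class
X_neg = rng.normal(-1.0, 1.0, size=(500, 2))   # toy negative class
for beta in (1.0, 0.95, 0.75):                 # values used in the paper
    labeled = make_labeled_set(X_pos, X_neg, n_labeled=100, beta=beta, rng=rng)
    print(beta, labeled.shape)                 # (100, 2) in every case
```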
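The Experiment Setup row pins down the classifier: an ensemble of 100 feedforward networks with five hidden neurons each, trained by resilient propagation and early-stopped on a 25% validation split. Below is a hedged PyTorch sketch under those constraints; the optimizer defaults, epoch and patience limits, activation choices, and all function names are assumptions rather than the authors' code, with resilient propagation approximated by torch.optim.Rprop.

```python
import copy

import numpy as np
import torch
import torch.nn as nn
from sklearn.model_selection import train_test_split

def make_net(n_features: int) -> nn.Module:
    # Five hidden neurons, per the paper; sigmoid output for binary scoring.
    return nn.Sequential(
        nn.Linear(n_features, 5), nn.Sigmoid(),
        nn.Linear(5, 1), nn.Sigmoid(),
    )

def train_one(X_tr, y_tr, X_va, y_va, max_epochs=500, patience=10):
    net = make_net(X_tr.shape[1])
    opt = torch.optim.Rprop(net.parameters())  # resilient propagation (full batch)
    loss_fn = nn.BCELoss()
    best_va, best_state, stall = float("inf"), None, 0
    for _ in range(max_epochs):
        opt.zero_grad()
        loss_fn(net(X_tr).squeeze(1), y_tr).backward()
        opt.step()
        with torch.no_grad():
            va = loss_fn(net(X_va).squeeze(1), y_va).item()
        if va < best_va:
            best_va, best_state, stall = va, copy.deepcopy(net.state_dict()), 0
        else:
            stall += 1
            if stall >= patience:  # validation loss terminates training
                break
    net.load_state_dict(best_state)
    return net

def train_ensemble(X, y, n_members=100, seed=0):
    members = []
    for i in range(n_members):
        # Hold out 25% of the training data as the validation set.
        X_tr, X_va, y_tr, y_va = train_test_split(
            X, y, test_size=0.25, random_state=seed + i)
        to_t = lambda a: torch.as_tensor(a, dtype=torch.float32)
        members.append(train_one(to_t(X_tr), to_t(y_tr), to_t(X_va), to_t(y_va)))
    return members

def ensemble_score(members, X):
    # Average the member outputs into a single ensemble score.
    X = torch.as_tensor(np.asarray(X), dtype=torch.float32)
    with torch.no_grad():
        return torch.stack([m(X).squeeze(1) for m in members]).mean(0).numpy()
```

Resilient propagation adapts a per-weight step size from the sign of the gradient, which is why the sketch trains on the full batch each epoch rather than on minibatches.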