Class Prior Estimation with Biased Positives and Unlabeled Examples
Authors: Shantanu Jain, Justin Delano, Himanshu Sharma, Predrag Radivojac
Pages: 4255-4263
AAAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our empirical investigation suggests feasibility of the correction strategy and overall good performance. Experiments and Results summarize our empirical investigation, covering the datasets, experimental protocols and results. |
| Researcher Affiliation | Academia | Shantanu Jain, Justin D. Delano, Himanshu Sharma, Predrag Radivojac Khoury College of Computer Sciences Northeastern University, Boston, MA, U.S.A. |
| Pseudocode | Yes | Algorithm 1: Algorithm for class prior estimation with biased positives and unlabeled examples. // maxK specifies the maximum number of clusters. Require: M, C, maxK. Ensure: α. // Partition the biased positive set by k-means clustering. The number of clusters is picked to be the one giving a clustering with the maximum Silhouette coefficient, up to a maximum of maxK. cPart[i] stores the positives in the ith cluster. cPart ← kMeansSilhouette(C, maxK) |
| Open Source Code | No | The paper does not provide explicit statements or links for the open-sourcing of the described methodology's code. |
| Open Datasets | Yes | Our experiments were carried out on twelve real-life datasets from the UCI Machine Learning Repository (Lichman 2013). |
| Dataset Splits | No | The paper describes the generation of biased and unbiased positive-unlabeled datasets but does not explicitly provide training, validation, and test splits with percentages or counts. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory, or cloud instances) used for running the experiments. |
| Software Dependencies | No | The paper mentions algorithms and libraries (e.g., 'k-means algorithm', 'AlphaMax', 'Elkan-Noto algorithm') but does not provide specific version numbers for software dependencies. |
| Experiment Setup | Yes | Corrected is an exact implementation of Algorithm 1 with maxK initialized to 5. To generate biased positive examples and unlabeled data, the positive examples were clustered using k-means, where the number of clusters, K, was determined based on the Silhouette coefficient. |
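
The partitioning step quoted in the Pseudocode and Experiment Setup rows (k-means, with the number of clusters chosen by the maximum Silhouette coefficient, up to maxK) can be sketched as follows. This is a minimal illustration, not the authors' code, which is not released. It assumes scikit-learn, and the function name `kmeans_silhouette`, the search range starting at 2 clusters, and the `seed` parameter are all hypothetical choices made here.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score


def kmeans_silhouette(C, max_k=5, seed=0):
    """Partition the rows of C by k-means.

    The number of clusters is the k in [2, max_k] whose clustering has the
    highest Silhouette coefficient. Returns a list of arrays, one per cluster.
    Assumption: the search starts at k=2, since the Silhouette coefficient
    is undefined for a single cluster.
    """
    C = np.asarray(C, dtype=float)
    best_score, best_labels = -np.inf, None
    # silhouette_score needs at most n_samples - 1 distinct labels.
    upper = min(max_k, len(C) - 1)
    for k in range(2, upper + 1):
        labels = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(C)
        if len(np.unique(labels)) < 2:
            continue  # degenerate clustering, Silhouette not defined
        score = silhouette_score(C, labels)
        if score > best_score:
            best_score, best_labels = score, labels
    if best_labels is None:
        return [C]  # too few points to split; keep a single partition
    return [C[best_labels == j] for j in np.unique(best_labels)]
```

Called as `kmeans_silhouette(C, max_k=5)`, the sketch mirrors the maxK setting of 5 reported for the Corrected method. The returned list plays the role of cPart, with one entry per cluster.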