Learning from Positive and Unlabeled Data with Arbitrary Positive Shift

Authors: Zayd Hammoudeh, Daniel Lowd

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results demonstrate our methods' effectiveness across numerous real-world datasets and forms of positive bias, including disjoint positive class-conditional supports. Additionally, we propose a general, simplified approach to address PU risk estimation overfitting.
Researcher Affiliation | Academia | Zayd Hammoudeh, Daniel Lowd; Department of Computer & Information Science, University of Oregon, Eugene, OR, USA; {zayd, lowd}@cs.uoregon.edu
Pseudocode | Yes | Algorithm 1 Two-step unlabeled-unlabeled aPU learning. Input: Labeled-positive set Xp and unlabeled sets Xtr-u, Xte-u. Output: g's model parameters θ
Open Source Code | Yes | Our implementation is publicly available at: https://github.com/ZaydH/arbitrary_pu.
Open Datasets | Yes | We empirically studied the effectiveness of our methods PURR, PU2wUU, and PU2aPNU using synthetic and real-world data. Limited space allows us to discuss only two experiment sets here. Suppl. Section E details experiments on: synthetic data, 10 LIBSVM datasets [30] under a totally different positive-bias condition, and a study of our methods' robustness to negative-class shift. ... Datasets: Section 7.2 considers the MNIST [31], CIFAR10 [32], and 20 Newsgroups [33] datasets with binary classes formed by partitioning each dataset's labels. Section 7.3 uses two different TREC [34] spam email datasets to demonstrate our methods' performance under real-world adversarial concept drift.
Dataset Splits | Yes | All methods saw identical training/test data splits and, where applicable, used the same initial weights. ... Table 1: Mean inductive misclassification rate (%) over 100 trials... Each dataset has four experimental conditions... (1) Ptrain = Ptest... (2 & 3 resp.) partially disjoint positive supports without and with prior shift, and (4) disjoint positive class definitions. ... Figure 2: Mean inductive misclassification rate over 100 trials...
Hardware Specification | No | The paper mentions 'University of Oregon high performance computer, Talapas' but does not specify any particular CPU models, GPU models, or other detailed hardware specifications for the experimental setup.
Software Dependencies | No | The paper mentions software like 'AdamW [35] with AMSGrad [36]', 'DenseNet-121 [39]', 'ELMo [37]', and 'PyTorch [43]', but it does not provide specific version numbers for these software components, which are necessary for reproducibility.
Experiment Setup | Yes | Our only individually tuned hyperparameters are learning rate and weight decay. We assume the worst case of no a priori knowledge about the positive shift, so the midpoint value ρ = 0.5 was used. ... Probabilistic classifier, σ̂, used our abs-PU risk estimator with logistic loss. All other learners used sigmoid loss for ℓ. ... stochastic optimization (i.e., AdamW [35] with AMSGrad [36]). ... NNs to at most three fully-connected layers of 300 neurons.
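The two-step aPU procedure quoted in the Pseudocode row (learn a probabilistic classifier from the labeled positives and training-distribution unlabeled set, then train the final classifier from the two unlabeled sets) can be sketched as a minimal skeleton. This is an illustration only, not the paper's implementation: `fit_pu` and `fit_weighted_uu` are hypothetical trainer callables standing in for the PU and weighted unlabeled-unlabeled learning steps.

```python
def two_step_apu(x_p, x_tr_u, x_te_u, prior, fit_pu, fit_weighted_uu):
    """Hypothetical skeleton of two-step aPU learning.

    x_p      -- labeled-positive examples
    x_tr_u   -- unlabeled examples from the training distribution
    x_te_u   -- unlabeled examples from the test distribution
    prior    -- assumed positive class prior
    """
    # Step 1: learn a probabilistic classifier sigma_hat via ordinary PU
    # learning on the labeled positives and training-distribution unlabeled set.
    sigma_hat = fit_pu(x_p, x_tr_u, prior)

    # Step 2: use sigma_hat's posteriors to weight x_tr_u, then train the
    # final classifier g from the two unlabeled sets (weighted UU learning).
    weights = [sigma_hat(x) for x in x_tr_u]
    theta = fit_weighted_uu(x_tr_u, weights, x_te_u)
    return theta
```

The split matters because only Step 1 touches the (possibly shifted) labeled positives; Step 2 relies solely on unlabeled data, which is what allows arbitrary positive shift between training and test.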
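The Experiment Setup row mentions the paper's abs-PU risk estimator, its "simplified approach to address PU risk estimation overfitting." A minimal sketch follows, assuming the estimator replaces the max(0, ·) correction of non-negative PU risk with an absolute value, i.e. R = π·R_p⁺ + |R_u⁻ − π·R_p⁻|, with the sigmoid surrogate loss; the function names here are illustrative, not the paper's API.

```python
import math

def sigmoid_loss(z, y):
    # Sigmoid surrogate loss: l(z, y) = 1 / (1 + exp(y * z)),
    # where y in {+1, -1} and z is the classifier's real-valued score.
    return 1.0 / (1.0 + math.exp(y * z))

def mean_loss(scores, y):
    # Empirical risk of labeling every example in `scores` as class y.
    return sum(sigmoid_loss(z, y) for z in scores) / len(scores)

def abs_pu_risk(pos_scores, unl_scores, prior):
    """Sketch of an abs-PU risk estimate (assumed form, see lead-in).

    The unbiased negative-class risk (r_u_neg - prior * r_p_neg) can go
    negative on finite samples, which drives overfitting; taking its
    absolute value keeps the total risk estimate non-negative.
    """
    r_p_pos = mean_loss(pos_scores, +1)   # positive risk on labeled positives
    r_p_neg = mean_loss(pos_scores, -1)   # negative risk on labeled positives
    r_u_neg = mean_loss(unl_scores, -1)   # negative risk on unlabeled data
    return prior * r_p_pos + abs(r_u_neg - prior * r_p_neg)
```

Compared with clamping at zero, the absolute value still penalizes the model when the corrected negative risk dips below zero, rather than giving it a flat (zero-gradient) region to exploit.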