Covariate Shift Adaptation on Learning from Positive and Unlabeled Data

Authors: Tomoya Sakai, Nobuyuki Shimizu

AAAI 2019, pp. 4838-4845

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In this paper, we address the PU learning problem under the covariate shift. We propose an importance-weighted PU learning method and reveal in which situations the importance-weighting is necessary. Moreover, we derive the convergence rate of the proposed method under mild conditions and experimentally demonstrate its effectiveness. ... Finally, we demonstrate the effectiveness of our proposed method through numerical experiments. ... 4 Experiments: In this section, we show the effectiveness of the proposed PUc classification method. ... Table 1 summarizes the results of dataset shift in the positive (resp. negative) class. ... Table 2 summarizes the average with standard error of misclassification rates..."
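The importance-weighted PU risk the excerpt refers to can be sketched as below. This is a minimal illustration built on the standard unbiased PU risk estimator with per-sample importance weights and the paper's squared loss; the function names and the exact weighting scheme are assumptions, not the paper's precise formulation.

```python
import numpy as np

def squared_loss(margin):
    # Squared loss from the paper's setup: l(m) = (1 - m)^2 / 4
    return (1.0 - margin) ** 2 / 4.0

def iw_pu_risk(g, x_p, x_u, w_p, w_u, pi_p):
    """Importance-weighted empirical PU risk (illustrative sketch).

    g        : decision function mapping an (n, d) array to (n,) scores
    x_p      : positive training samples
    x_u      : unlabeled training samples
    w_p, w_u : importance weights p_test(x) / p_train(x) per sample
    pi_p     : class-prior probability of the positive class
    """
    g_p, g_u = g(x_p), g(x_u)
    # pi_P * E_P[w l(g)] + E_U[w l(-g)] - pi_P * E_P[w l(-g)],
    # i.e. the unbiased PU risk with each expectation importance-weighted.
    return (pi_p * np.mean(w_p * squared_loss(g_p))
            + np.mean(w_u * squared_loss(-g_u))
            - pi_p * np.mean(w_p * squared_loss(-g_p)))
```

With all weights equal to 1 this reduces to the ordinary unbiased PU risk estimator, which is why importance weighting only matters when training and test input distributions differ.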
Researcher Affiliation | Collaboration | "Tomoya Sakai, NEC Corporation, t-sakai@ah.jp.nec.com; Nobuyuki Shimizu, Yahoo Japan Corporation, nobushim@yahoo-corp.jp. Part of this work was done while at the University of Tokyo and RIKEN."
Pseudocode | No | The paper describes the mathematical formulations and implementation details in prose and equations, but it does not include a formal pseudocode block or algorithm.
Open Source Code | No | The paper does not provide any statement about releasing source code or a link to a code repository for the methodology described.
Open Datasets | Yes | "We used the MNIST dataset (LeCun et al. 1998) ... benchmark datasets taken from the website of LIBSVM (Chang and Lin 2011), the IDA Benchmark (Rätsch, Onoda, and Müller 2001), and the 20 Newsgroups (Lang 1995)."
Dataset Splits | Yes | "All hyper-parameters were determined by 5-fold IWCV (importance-weighted cross-validation) described in Section 3.3. ... In this experiment, we split the data set into training and test data based on the median of the feature vector. ... With probability 0.9 and 0.1, the samples whose indices were in the first set were chosen as training data and test data, respectively. In contrast, the samples whose indices were in the second set were chosen as training data and test data with probability 0.1 and 0.9, respectively."
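The biased train/test split quoted above (partition the indices at the median of a feature, then sample into train/test with probabilities 0.9/0.1 on one side and 0.1/0.9 on the other) can be sketched as follows. Which feature is used, and which side of the median is favored for training, are assumptions made here for illustration.

```python
import numpy as np

def covariate_shift_split(X, feature=0, p=0.9, seed=None):
    """Sketch of a median-based biased split inducing covariate shift.

    Samples whose chosen feature is at or below the median go to the
    training set with probability p (test otherwise); samples above
    the median go to training with probability 1 - p.  Returns
    (train_indices, test_indices).
    """
    rng = np.random.default_rng(seed)
    below = X[:, feature] <= np.median(X[:, feature])
    # Draw one uniform number per sample; threshold depends on the side.
    to_train = np.where(below,
                        rng.random(len(X)) < p,
                        rng.random(len(X)) < 1 - p)
    return np.where(to_train)[0], np.where(~to_train)[0]
```

Because membership is random per sample, the two index sets always partition the data, but the training set is dominated by one side of the median and the test set by the other, which is exactly the kind of input-distribution shift the paper studies.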
Hardware Specification | No | The paper does not specify any hardware details (e.g., CPU, GPU models, memory, or cloud instances) used for running the experiments.
Software Dependencies | No | The paper mentions software such as LIBSVM, but it does not specify concrete version numbers for any software, libraries, or dependencies used in the experiments.
Experiment Setup | Yes | "In all the experiments, the class-prior probabilities for training and test were set at π_P^tr = π_P^te = 0.5. ... we used the linear model g(x) = w^T x + w_0. In Sections 4.3 and 4.4, we used the linear-in-parameter model g(x) = w^T φ(x) with the Gaussian kernel basis function φ_ℓ(x) = exp(−‖x − x_ℓ‖² / (2σ²)), where σ > 0 was the bandwidth, the number of basis functions was set at b = min(200, n_U^te), and {x_ℓ}_{ℓ=1}^b was a set of samples selected randomly from the unlabeled test samples {x_k^{U,te}}_{k=1}^{n_U^te}. ... we use the squared loss function ℓ(m) = (1 − m)²/4."
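The linear-in-parameter model described in this row can be sketched as below: Gaussian kernel basis functions centered on at most 200 randomly chosen unlabeled test points. The helper names are assumptions for illustration; how the weight vector w is fitted is outside this sketch.

```python
import numpy as np

def gaussian_basis(X, centers, sigma):
    # phi_l(x) = exp(-||x - x_l||^2 / (2 sigma^2)) for each center x_l.
    sq_dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))

def build_model(X_u_test, sigma, n_basis_cap=200, seed=None):
    """Build g(x) = w^T phi(x) with b = min(n_basis_cap, n_U^te)
    Gaussian bases centered on randomly selected unlabeled test
    samples, mirroring the setup quoted above."""
    rng = np.random.default_rng(seed)
    b = min(n_basis_cap, len(X_u_test))
    centers = X_u_test[rng.choice(len(X_u_test), size=b, replace=False)]
    def g(X, w):
        return gaussian_basis(X, centers, sigma) @ w
    return g, b
```

Note that φ_ℓ(x_ℓ) = 1 by construction, so each basis function peaks at its own center and decays with squared distance at the rate set by the bandwidth σ.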