Convex Batch Mode Active Sampling via α-Relative Pearson Divergence

Authors: Hanmo Wang, Liang Du, Peng Zhou, Lei Shi, Yi-Dong Shen

AAAI 2015

Reproducibility Variable Result LLM Response
Research Type Experimental Our empirical studies on UCI datasets demonstrate the effectiveness of the proposed approach compared with the state-of-the-art batch mode active learning methods.
Researcher Affiliation Academia ¹State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences, Beijing 100190, China; ²University of Chinese Academy of Sciences, Beijing 100049, China
Pseudocode Yes Algorithm 1: RPEactive
Input: parameters α, λ; kernel matrix K; constants n_u, n_l, n_s
Output: indicator variable β
1: compute θ^(0) according to (26)
2: θ̂ ← θ^(0)
3: k ← 0
4: while not converged do
5:   compute β̂ according to (21)
6:   compute g(θ^(k)) according to (24)
7:   update θ^(k+1) according to (25)
8:   k ← k + 1
9:   if G(θ^(k)) < G(θ̂) then
10:     θ̂ ← θ^(k)
11:   end if
12: end while
13: compute φ according to (20) with θ = θ̂
14: compute β according to (21)
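The control flow of Algorithm 1 can be sketched as a gradient-style iteration that tracks the best iterate θ̂ by objective value and finally recovers β from θ̂. This is only a structural sketch: the paper's equations (20)–(26) are not reproduced here, so `initial_theta`, `beta_from_theta`, `gradient`, and `objective` below are hypothetical toy placeholders standing in for them.

```python
import numpy as np

# Hypothetical stand-ins for the paper's equations (20)-(26). The real
# objective G, gradient g, and updates depend on the kernel matrix K and
# the α-relative Pearson divergence; here they are toy quadratics so the
# loop structure of Algorithm 1 is runnable.
def initial_theta(K):            # eq. (26), placeholder initialization
    return np.ones(K.shape[0]) / K.shape[0]

def beta_from_theta(theta):      # eq. (21), placeholder recovery of β
    return theta / theta.sum()

def gradient(theta):             # eq. (24), placeholder gradient g(θ)
    return theta - 0.5

def objective(theta):            # G(θ), placeholder objective
    return 0.5 * np.sum((theta - 0.5) ** 2)

def rpe_active(K, step=0.1, max_iter=100, tol=1e-8):
    """Skeleton of Algorithm 1: iterate θ, keep the best iterate θ̂
    seen so far (lines 9-11), then compute β from θ̂ (lines 13-14)."""
    theta = initial_theta(K)          # line 1
    theta_hat = theta.copy()          # line 2
    for _ in range(max_iter):         # lines 4-12
        g = gradient(theta)           # line 6
        new_theta = theta - step * g  # line 7, placeholder update (25)
        converged = np.linalg.norm(new_theta - theta) < tol
        theta = new_theta
        if objective(theta) < objective(theta_hat):
            theta_hat = theta.copy()  # line 10
        if converged:
            break
    return beta_from_theta(theta_hat)
```

Keeping the best-so-far θ̂ rather than the last iterate makes the output robust even if a step overshoots and temporarily increases G.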
Open Source Code No The paper does not contain an explicit statement about releasing source code or a link to a code repository for the described methodology.
Open Datasets Yes In our experiment, we evaluate the performance of our proposed RPEactive algorithm on 6 datasets from the UCI repository, namely iris, australian, sonar, heart, wine and arcene.
Dataset Splits No The paper mentions 'We randomly divide each dataset into unlabeled set (60%) and testing set (40%)' but does not specify a separate validation dataset split.
Hardware Specification No The paper does not provide specific details about the hardware used for experiments (e.g., GPU/CPU models, memory specifications).
Software Dependencies No The paper mentions 'Support Vector Machines is used as classification model' and 'We use Gaussian kernel' but does not provide specific version numbers for any software dependencies.
Experiment Setup Yes For a fixed batch size n_s, each method selects data samples for labeling at each iteration. The batch size n_s is set to 5 on the datasets iris and arcene due to their small sizes, and to 10 on the other datasets. The experiment is repeated 20 times and the average result is reported. A Support Vector Machine is used as the classification model to evaluate the performance of the labeled instances. Parameters α and λ are chosen from {0, 0.05, ..., 0.95} and {10^-5, 10^-4, ..., 1} respectively. A Gaussian kernel is used for all datasets, with the kernel width searched over a relatively large range. All parameters are selected by exhaustively searching every combination and choosing the one with the best average accuracy on the test data (40%).
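The parameter-selection loop described above can be sketched as an exhaustive search over the (α, λ) grid, averaging accuracy over 20 repetitions. The function `evaluate_accuracy` is a hypothetical stand-in for the full pipeline (RPEactive selection on the 60% unlabeled split, SVM training, evaluation on the 40% test split); here it returns a synthetic score so the search loop itself is runnable.

```python
import itertools
import numpy as np

def evaluate_accuracy(alpha, lam, seed):
    """Hypothetical stand-in: would run RPEactive with (α, λ), train an
    SVM on the selected labels, and return test-set accuracy. Here a
    synthetic surface peaked at α=0.5, λ=1e-3, plus small split noise."""
    rng = np.random.default_rng(seed)
    return (0.8
            - (alpha - 0.5) ** 2
            - 0.01 * (np.log10(lam) + 3) ** 2
            + rng.normal(0.0, 0.001))

# Parameter grids from the paper's setup.
alphas = np.arange(0.0, 1.0, 0.05)            # {0, 0.05, ..., 0.95}
lambdas = [10.0 ** e for e in range(-5, 1)]   # {10^-5, 10^-4, ..., 1}

best = None
for alpha, lam in itertools.product(alphas, lambdas):
    # Average accuracy over 20 random unlabeled (60%) / test (40%) splits.
    acc = float(np.mean([evaluate_accuracy(alpha, lam, seed)
                         for seed in range(20)]))
    if best is None or acc > best[0]:
        best = (acc, float(alpha), lam)

print(f"best avg accuracy {best[0]:.3f} at alpha={best[1]:.2f}, lambda={best[2]:g}")
```

Averaging over the 20 repetitions before comparing combinations keeps the selection from being driven by a single lucky split.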