Convex Batch Mode Active Sampling via α-Relative Pearson Divergence
Authors: Hanmo Wang, Liang Du, Peng Zhou, Lei Shi, Yi-Dong Shen
AAAI 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our empirical studies on UCI datasets demonstrate the effectiveness of the proposed approach compared with the state-of-the-art batch mode active learning methods. |
| Researcher Affiliation | Academia | 1State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences, Beijing 100190, China 2University of Chinese Academy of Sciences, Beijing 100049, China |
| Pseudocode | Yes | Algorithm 1: RPEactive. Input: parameters α, λ; kernel matrix K; constants n_u, n_l, n_s. Output: indicator variable β. 1: compute θ^(0) according to (26); 2: θ̂ ← θ^(0); 3: k ← 0; 4: while not converged do; 5: compute β̂ according to (21); 6: compute g(θ^(k)) according to (24); 7: update θ^(k+1) according to (25); 8: k ← k + 1; 9: if G(θ^(k)) < G(θ̂) then; 10: θ̂ ← θ^(k); 11: end if; 12: end while; 13: compute φ according to (20) with θ = θ̂; 14: compute β according to (21) |
| Open Source Code | No | The paper does not contain an explicit statement about releasing source code or a link to a code repository for the described methodology. |
| Open Datasets | Yes | In our experiment, we evaluate the performance of our proposed RPEactive algorithm on 6 datasets from the UCI repository, namely iris, australian, sonar, heart, wine and arcene. |
| Dataset Splits | No | The paper mentions 'We randomly divide each dataset into unlabeled set (60%) and testing set (40%)' but does not specify a separate validation dataset split. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for experiments (e.g., GPU/CPU models, memory specifications). |
| Software Dependencies | No | The paper mentions 'Support Vector Machines is used as classification model' and 'We use Gaussian kernel' but does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | For a fixed batch size n_s, each method selects data samples for labeling at each iteration. The batch size n_s is set to 5 on the iris and arcene datasets due to their small sizes, and to 10 on the other datasets. The experiment is repeated 20 times and the average result is reported. A Support Vector Machine is used as the classification model to evaluate the performance of the labeled instances. Parameters α and λ are chosen from {0, 0.05, ..., 0.95} and {10^-5, 10^-4, ..., 1} respectively. A Gaussian kernel is used for all datasets, with the kernel width searched over a relatively large range. All parameters are selected by an exhaustive search over all parameter combinations, and the combination with the best average accuracy on the test data (40%) is chosen. |
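The pseudocode row above shows Algorithm 1's structure: iterate an update on θ, keep the best objective value G seen so far, and recover β from the best iterate θ̂. Since equations (20)–(26) are not reproduced in the table, the following is only a control-flow sketch under stated assumptions: `rpe_active_sketch` is a hypothetical name, and a plain gradient-descent step on a toy quadratic objective stands in for the paper's actual update (25) and objective G.

```python
def rpe_active_sketch(G, grad_G, theta0, step=0.1, max_iter=500, tol=1e-10):
    """Control-flow sketch of Algorithm 1: gradient updates on theta with
    best-so-far tracking.  G, grad_G, and the update rule are stand-ins
    for the paper's equations (20)-(26)."""
    theta = list(theta0)
    theta_hat = list(theta)                      # best iterate so far (theta-hat)
    for _ in range(max_iter):
        g = grad_G(theta)                        # step 6: gradient g(theta^(k))
        theta_new = [t - step * gi for t, gi in zip(theta, g)]  # step 7 (stand-in for eq. 25)
        if G(theta_new) < G(theta_hat):          # steps 9-11: keep the best iterate
            theta_hat = list(theta_new)
        if sum((a - b) ** 2 for a, b in zip(theta_new, theta)) ** 0.5 < tol:
            break                                # step 4: "while not converged"
        theta = theta_new
    return theta_hat                             # beta is then recovered via eq. (21)

# Toy usage: minimize the convex quadratic G(theta) = ||theta - c||^2 / 2,
# whose minimizer is c, so the sketch should return (approximately) c.
c = [1.0, -2.0, 0.5]
theta_hat = rpe_active_sketch(
    lambda t: 0.5 * sum((ti - ci) ** 2 for ti, ci in zip(t, c)),
    lambda t: [ti - ci for ti, ci in zip(t, c)],
    theta0=[0.0, 0.0, 0.0],
)
```

The best-so-far bookkeeping (steps 9–11) matters because a fixed-step update need not decrease G monotonically; returning θ̂ rather than the final iterate guards against a worse last step.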
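The parameter selection described in the setup row (search all (α, λ) combinations, keep the one with the best average test accuracy) is an exhaustive grid search. A minimal sketch follows; `grid_search` and `toy_accuracy` are hypothetical names, and the toy scoring function merely stands in for the paper's mean SVM test accuracy over 20 repeated runs.

```python
from itertools import product

def grid_search(evaluate, alphas, lambdas):
    """Exhaustively evaluate every (alpha, lambda) pair and keep the best.

    In the paper, evaluate() would be the average SVM accuracy on the
    40% test split over 20 repetitions; here it is caller-supplied."""
    best_params, best_score = None, float("-inf")
    for alpha, lam in product(alphas, lambdas):
        score = evaluate(alpha, lam)
        if score > best_score:
            best_params, best_score = (alpha, lam), score
    return best_params, best_score

# Grids matching the ranges quoted from the paper.
alphas = [i * 0.05 for i in range(20)]       # {0, 0.05, ..., 0.95}
lambdas = [10.0 ** e for e in range(-5, 1)]  # {1e-5, 1e-4, ..., 1}

# Hypothetical score peaking at (0.5, 0.01), for illustration only.
def toy_accuracy(alpha, lam):
    return -((alpha - 0.5) ** 2) - ((lam - 0.01) ** 2)

best_params, best_score = grid_search(toy_accuracy, alphas, lambdas)
```

Note that selecting parameters by accuracy on the same 40% test split used for reporting, as the quoted setup does, is why the table flags the absence of a separate validation split.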