Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Efficient Active Learning of Halfspaces via Query Synthesis
Authors: Ibrahim Alabdulmohsin, Xin Gao, Xiangliang Zhang
AAAI 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimentally, it exhibits a significant improvement over traditional approaches such as uncertainty sampling and representative sampling. |
| Researcher Affiliation | Academia | King Abdullah University of Science & Technology (KAUST) Thuwal, Saudi Arabia 23955 |
| Pseudocode | Yes | Algorithm 1: Query synthesis algorithm for halfspaces. Data: Observations $\{(x_i, y_i)\}_{i=1,2,\dots,t}$. Result: $k$ synthetic queries $\{x_{t+1}, x_{t+2}, \dots, x_{t+k}\}$. Begin: 1. Solve the optimization problem in (5) or (6). Let $\mu$ and $\Sigma$ be the optimal solutions. 2. Compute $N$, which is the orthonormal basis to the orthogonal complement of $\mu$ (the null-space of $\mu^T$). 3. Compute $\alpha_1, \alpha_2, \dots, \alpha_k$, which are the top $k$ eigenvectors of the matrix $N^T \Sigma N$. 4. Return $x_{t+1} = N\alpha_1, \dots, x_{t+k} = N\alpha_k$. End |
| Open Source Code | No | MATLAB implementation codes will be made available at http://mine.kaust.edu.sa/Pages/Software.aspx |
| Open Datasets | No | The experiments involved generating data: 'For a fixed choice of $k$ and $d$, we began with a random choice of a unit-norm $w \in \mathbb{R}^d$, a single positive example, and a single negative example.' No concrete access information for a publicly available or open dataset was provided. |
| Dataset Splits | No | The paper describes an iterative query synthesis process and sample complexity, but does not specify a traditional training/validation/test dataset split or cross-validation setup. |
| Hardware Specification | No | No specific hardware details (like GPU/CPU models, memory, or cloud instances) were mentioned for the experiments. |
| Software Dependencies | Yes | The optimization problem was solved using CVX (CVX Research 2012). CVX: Matlab software for disciplined convex programming, version 2.0. Additionally, linear SVM was implemented using the LIBLINEAR package (Fan et al. 2008). |
| Experiment Setup | Yes | For a fixed choice of $k$ and $d$, we began with a random choice of a unit-norm $w \in \mathbb{R}^d$, a single positive example, and a single negative example. After that, we ran the different query synthesis algorithms in parallel up to a total of 1,000 queries. In the batch setting, we used $k = 5$. Also, all experiments were repeated for $d \in \{25, 50, 75\}$. |
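
Steps 2 to 4 of Algorithm 1, as quoted in the Pseudocode row, can be sketched in a few lines of NumPy. This is an illustration under stated assumptions and not the authors' MATLAB implementation. It assumes `mu` and `sigma` have already been obtained from step 1 (the optimization problem in (5) or (6), which is not reproduced on this page). The function name `synthesize_queries` and its argument names are hypothetical.

```python
import numpy as np


def synthesize_queries(mu, sigma, k):
    """Sketch of steps 2-4 of Algorithm 1 (hypothetical helper).

    mu    : (d,) nonzero vector, assumed to come from step 1
    sigma : (d, d) symmetric matrix, assumed to come from step 1
    k     : number of synthetic queries, 1 <= k <= d - 1
    Returns a (k, d) array whose rows are the synthetic queries.
    """
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    d = mu.shape[0]
    if not 1 <= k <= d - 1:
        raise ValueError("k must satisfy 1 <= k <= d - 1")
    if np.linalg.norm(mu) == 0.0:
        raise ValueError("mu must be nonzero")

    # Step 2: orthonormal basis N of the orthogonal complement of mu,
    # i.e. the null-space of mu^T. In the full SVD of the 1 x d matrix
    # mu^T, the first right-singular vector is parallel to mu and the
    # remaining d - 1 span the null-space.
    _, _, vt = np.linalg.svd(mu.reshape(1, -1), full_matrices=True)
    N = vt[1:].T  # shape (d, d - 1)

    # Step 3: top-k eigenvectors of N^T Sigma N.
    M = N.T @ sigma @ N
    M = 0.5 * (M + M.T)  # symmetrize against round-off
    _, eigvecs = np.linalg.eigh(M)  # eigenvalues in ascending order
    alphas = eigvecs[:, ::-1][:, :k]  # columns for the k largest eigenvalues

    # Step 4: map the eigenvectors back to the original space.
    return (N @ alphas).T
```

Because `N` has orthonormal columns and the eigenvectors returned by `eigh` are orthonormal, every query produced this way has unit norm and is orthogonal to `mu`. Calling `synthesize_queries(mu, sigma, k=5)` would match the batch size reported in the Experiment Setup row.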