Near-Optimal Active Learning of Halfspaces via Query Synthesis in the Noisy Setting
Authors: Lin Chen, Hamed Hassani, Amin Karbasi
AAAI 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our empirical experiments demonstrate that DC runs orders of magnitude faster than the existing methods. In this section, we extensively evaluate the performance of DC against the following baselines: RANDOM-SAMPLING: Queries are generated by sampling uniformly at random from the unit sphere S^{d-1}. Our metrics to compare different algorithms are: (a) estimation error, (b) query complexity, and (c) execution time. |
| Researcher Affiliation | Academia | Lin Chen,^{1,2} Hamed Hassani,^{3} Amin Karbasi^{1,2} — ^1 Department of Electrical Engineering, ^2 Yale Institute for Network Science, Yale University; ^3 Computer Science Department, ETH Zürich. {lin.chen, amin.karbasi}@yale.edu, hamed@inf.ethz.ch |
| Pseudocode | Yes | Algorithm 1 (DC2) — Input: orthonormal vectors e1, e2; estimation error at most ϵ; success probability at least 1 − δ. Output: a unit vector ê which is an estimate for the normalized orthogonal projection of h onto span{e1, e2}. ... Algorithm 2 (Dimension Coupling, DC) — Input: an orthonormal basis E = {e1, e2, ..., ed} of R^d. Output: a unit vector ĥ which is an estimate for h. |
| Open Source Code | No | The paper does not provide any specific links or statements about the availability of its own source code. It mentions 'the fastest available implementations in MATLAB' for baselines, but not for their proposed DC method. |
| Open Datasets | No | By nature, in active learning via query synthesis, all data points and queries are generated synthetically. For all the baselines, we used the fastest available implementations in MATLAB. |
| Dataset Splits | No | The paper evaluates performance based on 'Number of Queries' and 'Estimation Error' on synthetically generated data but does not specify traditional training, validation, or test dataset splits in terms of percentages or counts. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments, such as CPU or GPU models, or memory specifications. |
| Software Dependencies | No | The paper mentions 'MATLAB' was used for baselines but does not provide specific version numbers for MATLAB or any other software dependencies, libraries, or solvers relevant to their method. |
| Experiment Setup | No | The paper discusses algorithmic parameters such as T_{ϵ,δ} and the noise level ρ, but it does not provide common machine learning experimental setup details such as learning rates, batch sizes, number of epochs, or optimizer settings for training models. |
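The RANDOM-SAMPLING baseline quoted in the table (queries drawn uniformly at random from the unit sphere S^{d-1}) is simple enough to sketch. The snippet below is an illustrative reimplementation, not code from the paper: it draws a uniform direction by normalizing a standard Gaussian vector, which works because the multivariate normal distribution is rotationally invariant.

```python
import numpy as np

def random_query(d, rng=None):
    """Sample a query uniformly at random from the unit sphere S^{d-1}.

    A standard Gaussian vector has a rotationally invariant distribution,
    so its normalization is uniform over directions.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = rng.standard_normal(d)
    return x / np.linalg.norm(x)

# Generate a batch of baseline queries in R^10
queries = [random_query(10) for _ in range(5)]
```

Each returned vector has unit Euclidean norm, matching the query domain S^{d-1} used by all the compared algorithms.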