Near-Optimal Active Learning of Halfspaces via Query Synthesis in the Noisy Setting

Authors: Lin Chen, Hamed Hassani, Amin Karbasi

AAAI 2017

Reproducibility assessment (each entry gives the variable, the result, and the LLM response):
Research Type: Experimental. Evidence: "Our empirical experiments demonstrate that DC runs orders of magnitude faster than the existing methods. In this section, we extensively evaluate the performance of DC against the following baselines: RANDOM-SAMPLING: queries are generated by sampling uniformly at random from the unit sphere S^{d-1}. Our metrics to compare different algorithms are: (a) estimation error, (b) query complexity, and (c) execution time."
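The RANDOM-SAMPLING baseline quoted above is easy to state precisely. Below is a minimal sketch of uniform sampling from S^{d-1} using the standard Gaussian-normalization trick; the function name is illustrative, not from the paper.

```python
import numpy as np

def sample_unit_sphere(d, rng=None):
    """Draw a point uniformly at random from the unit sphere S^{d-1}.

    Uses the standard trick: a standard Gaussian vector, once
    normalized to unit length, is uniformly distributed on the sphere.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = rng.standard_normal(d)
    return x / np.linalg.norm(x)

# Example: one random query direction in R^10.
q = sample_unit_sphere(10)
```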
Researcher Affiliation: Academia. Evidence: "Lin Chen (1,2), Hamed Hassani (3), Amin Karbasi (1,2); (1) Department of Electrical Engineering, (2) Yale Institute for Network Science, Yale University; (3) Computer Science Department, ETH Zürich; {lin.chen, amin.karbasi}@yale.edu, hamed@inf.ethz.ch"
Pseudocode: Yes. Evidence: "Algorithm 1 DC2. Input: orthonormal vectors e1, e2; estimation error at most ε; success probability at least 1 − δ. Output: a unit vector ê which is an estimate for the normalized orthogonal projection of h onto span{e1, e2}. ... Algorithm 2 Dimension Coupling (DC). Input: an orthonormal basis E = {e1, e2, ..., ed} of R^d. Output: a unit vector ĥ which is an estimate for h."
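The structure the pseudocode describes, reducing the d-dimensional problem to d − 1 two-dimensional subproblems, can be illustrated with a short sketch. Everything below is a reconstruction from the input/output descriptions quoted above, not the authors' implementation: the dc2 stand-in does a noiseless bisection over the angle in the 2-D span, whereas the paper's DC2 also tolerates label noise; the function names and the demo are hypothetical.

```python
import numpy as np

def dc2(oracle, e1, e2, tol=1e-6):
    """Illustrative stand-in for Algorithm 1 (DC2): estimate the
    normalized projection of the hidden normal h onto span{e1, e2}.
    Within this 2-D span, sign(<h, x>) flips exactly once along each
    half-circle, so bisection over the angle locates the decision
    boundary; the projection direction lies 90 degrees before it.
    """
    point = lambda t: np.cos(t) * e1 + np.sin(t) * e2
    # Pick a half-circle whose start is labeled + and whose end is -.
    lo, hi = (0.0, np.pi) if oracle(point(0.0)) > 0 else (np.pi, 2 * np.pi)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if oracle(point(mid)) > 0:
            lo = mid
        else:
            hi = mid
    boundary = 0.5 * (lo + hi)           # angle of the decision boundary
    return point(boundary - np.pi / 2)   # unit vector along the projection

def dimension_coupling(oracle, E, tol=1e-6):
    """Outer loop of Algorithm 2 (DC): fold in one new basis direction
    per round by coupling it with the current estimate. Assumes h has a
    nonzero projection on each intermediate span."""
    v = E[0]
    for i in range(1, len(E)):
        # v lies in span{E[0..i-1]} and E[i] is orthogonal to that span,
        # so (v, E[i]) is an orthonormal pair, as dc2 expects.
        v = dc2(oracle, v, E[i], tol)
    return v

# Noiseless demo: recover a hidden halfspace normal in R^5.
rng = np.random.default_rng(0)
h = rng.standard_normal(5); h /= np.linalg.norm(h)
oracle = lambda x: np.sign(h @ x)
h_hat = dimension_coupling(oracle, list(np.eye(5)))
print(np.linalg.norm(h_hat - h))  # small estimation error
```

The point the sketch makes concrete is the coupling step: each round calls the two-dimensional subroutine on the span of the running estimate and the next basis vector, so learning in R^d costs d − 1 calls to DC2.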
Open Source Code: No. The paper provides no link to, or statement about, the availability of its own source code. It mentions "the fastest available implementations in MATLAB" for the baselines, but nothing for the proposed DC method.
Open Datasets: No. Evidence: "By nature, in active learning via query synthesis, all data points and queries are generated synthetically. For all the baselines, we used the fastest available implementations in MATLAB."
Dataset Splits: No. The paper evaluates performance in terms of number of queries and estimation error on synthetically generated data, but does not specify traditional training, validation, or test splits as percentages or counts.
Hardware Specification: No. The paper gives no details about the hardware used to run the experiments, such as CPU or GPU models or memory specifications.
Software Dependencies: No. The paper mentions that MATLAB was used for the baselines but does not give version numbers for MATLAB or for any other software dependencies, libraries, or solvers relevant to the method.
Experiment Setup: No. The paper discusses algorithmic parameters such as T_{ε,δ} and the noise level ρ, but does not report common machine-learning setup details such as learning rates, batch sizes, numbers of epochs, or optimizer settings.
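For context on the noise level ρ mentioned above: it is the probability that a label query is answered incorrectly. A minimal sketch of such a noisy label oracle follows; the function name and demo values are assumptions for illustration, not the authors' code.

```python
import numpy as np

def noisy_oracle(h, x, rho, rng):
    """Answer a label query sign(<h, x>), flipping the answer
    independently with probability rho (the noise level)."""
    label = np.sign(h @ x)
    return -label if rng.random() < rho else label

# Example: query one random point against a hidden normal, with 10% noise.
rng = np.random.default_rng(1)
h = rng.standard_normal(4); h /= np.linalg.norm(h)
x = rng.standard_normal(4); x /= np.linalg.norm(x)
print(noisy_oracle(h, x, rho=0.1, rng=rng))
```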