Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Active Learning with Safety Constraints

Authors: Romain Camilleri, Andrew Wagenmaker, Jamie H. Morgenstern, Lalit Jain, Kevin G. Jamieson

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In practice, we demonstrate that this approach performs well on synthetic and real world datasets.
Researcher Affiliation | Academia | University of Washington, Seattle, WA
Pseudocode | Yes | Algorithm 1: Best Safe Arm Identification (BESIDE), on page 4; Algorithm 2: Active constrained classification with randomized exploration, on page 7.
Open Source Code | Yes | Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes] Refer to Appendix.
Open Datasets | Yes | We evaluate on the adult income data set [27] (48,842 examples)... [27] M. Lichman. UCI Machine Learning Repository, 2013. URL http://archive.ics.uci.edu/. We consider the German Credit Dataset originally from the Statlog Project Databases [24]... [24] E. Keogh, C. Blake, and C. J. Merz. UCI Repository of Machine Learning Databases, 1998. URL http://archive.ics.uci.edu/ml.
Dataset Splits | No | The paper describes a pool-based active learning setup in which labels are acquired dynamically; it does not specify fixed training, validation, and test splits (percentages or sample counts) for the overall dataset.
Hardware Specification | No | The paper states 'See Appendix' for compute resources, but the Appendix does not provide specific hardware details such as GPU/CPU models or memory specifications.
Software Dependencies | No | The paper does not list software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x).
Experiment Setup | Yes | For the Adult dataset, we randomly sample 2000 points from the dataset... batch size is set to 25 and initial number of queried labels is 50. For the German Credit dataset, we use the entire dataset (1000 points)... batch size is set to 25 and initial number of queried labels is 50. In the active classification experiments we set the number of rounds L = 100, the number of classifiers per round k = 10 and the perturbation variance σ = 0.05.
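To make the pool-based labeling schedule in the Experiment Setup row concrete, here is a minimal Python sketch. It models only the bookkeeping the row states (a pool of 2000 points for Adult or 1000 for German Credit, 50 initial labels, batches of 25). It is not the paper's algorithm: `PoolSetup`, `random_select`, and `query_schedule` are hypothetical names, the default selection rule is uniform random sampling used as a placeholder, and the parameters L, k, and σ are not modeled.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class PoolSetup:
    """Labeling budget as reported in the table (names are hypothetical)."""
    pool_size: int = 2000      # Adult: 2000 sampled points; German Credit: 1000
    initial_labels: int = 50   # labels queried before the first batch
    batch_size: int = 25       # labels queried in each later batch


def random_select(unlabeled, labeled, batch_size, rng):
    """Placeholder rule: sample uniformly from the unlabeled pool.

    This is NOT the paper's selection rule; a real strategy would use
    `labeled` (and a fitted model) to decide which points to query.
    """
    return rng.sample(unlabeled, min(batch_size, len(unlabeled)))


def query_schedule(setup, num_batches, select=random_select, seed=0):
    """Return the list of index batches queried from the pool.

    The first batch holds the initial labels and is always drawn at
    random; each later batch is chosen by `select`.
    """
    rng = random.Random(seed)
    unlabeled = list(range(setup.pool_size))
    labeled = []
    batches = []
    sizes = [setup.initial_labels] + [setup.batch_size] * num_batches
    for round_index, size in enumerate(sizes):
        if not unlabeled:
            break  # pool exhausted before the budget was spent
        rule = random_select if round_index == 0 else select
        batch = rule(unlabeled, labeled, size, rng)
        chosen = set(batch)
        # Remove queried points so no index is labeled twice.
        unlabeled = [i for i in unlabeled if i not in chosen]
        labeled.extend(batch)
        batches.append(batch)
    return batches
```

With the defaults, `query_schedule(PoolSetup(), num_batches=4)` queries 50 + 4 × 25 = 150 distinct labels; `PoolSetup(pool_size=1000)` gives the German Credit pool size. A different strategy can be passed through the `select` argument without changing the loop.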