Improved Active Learning via Dependent Leverage Score Sampling

Authors: Atsushi Shimizu, Xiaoou Cheng, Christopher Musco, Jonathan Weare

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We experimentally evaluate our pivotal sampling methods on active regression problems with low-dimensional structure.
Researcher Affiliation | Academia | Atsushi Shimizu, Xiaoou Cheng, Christopher Musco, Jonathan Weare New York University {as15106,chengxo,cmusco,weare}@nyu.edu
Pseudocode | Yes | Algorithm 1 Binary Tree Based Pivotal Sampling (Deville and Tillé, 1998)
Open Source Code | No | The paper does not provide any explicit statement about open-sourcing the code for the described methodology or a link to a code repository.
Open Datasets | No | For both problems, we construct A by uniformly selecting n = 10^5 data points in the 2-dimensional parameter range of interest. We then add all polynomial features of degree p = 12 as discussed in Section 1.2. We compute sampled entries from the target vector b using standard MATLAB routines.
Dataset Splits | No | The paper discusses 'train' and 'test' data conceptually but does not provide explicit details about a 'validation' split or how it was used in the experimental setup.
Hardware Specification | No | The Acknowledgements section mentions 'NYU IT for the use of the Greene computing cluster'. However, no specific hardware details such as GPU/CPU models, memory, or detailed cluster configurations are provided.
Software Dependencies | No | The paper mentions the use of 'standard MATLAB routines' for computing sampled entries from the target vector. However, no specific version numbers for MATLAB or any other software dependencies are provided.
Experiment Setup | Yes | We report median normalized error $\|A\tilde{x}^* - b\|_2^2 / \|b\|_2^2$ after 1000 trials. By drawing more samples from b, the errors of all methods eventually converge to the optimal error $\|Ax^* - b\|_2^2 / \|b\|_2^2$, but clearly the pivotal method requires fewer samples to achieve a given level of accuracy, confirming the benefits of spatially-aware sampling.
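Since no code is released, the setup quoted in the table can be illustrated with a rough sketch. The following is not the authors' implementation: it is written in Python with NumPy rather than MATLAB, it uses the simple sequential form of pivotal sampling in index order instead of the binary-tree variant of Algorithm 1, and the target function, problem sizes, and all function names (`poly_features`, `leverage_scores`, `pivotal_sample`, `normalized_error`) are hypothetical choices made only for illustration.

```python
import numpy as np

def poly_features(points, degree):
    """Monomials x^i * y^j with i + j <= degree for 2-D points (one per row)."""
    x, y = points[:, 0], points[:, 1]
    cols = [x**i * y**j for i in range(degree + 1) for j in range(degree + 1 - i)]
    return np.column_stack(cols)

def leverage_scores(A):
    """Row leverage scores: squared row norms of an orthonormal basis from QR."""
    Q, _ = np.linalg.qr(A)
    return np.sum(Q**2, axis=1)

def pivotal_sample(probs, rng, eps=1e-9):
    """Sequential pivotal sampling (a simplification, not the binary-tree version).
    Unit i is included with marginal probability probs[i]; assumes a non-empty input."""
    p = np.array(probs, dtype=float)
    resolved = lambda v: v < eps or v > 1 - eps
    i = 0  # index of the currently unresolved unit
    for j in range(1, len(p)):
        if resolved(p[i]):
            i = j
            continue
        if resolved(p[j]):
            continue
        s = p[i] + p[j]
        if s <= 1:  # one unit absorbs all the probability, the other is dropped
            if rng.random() < p[i] / s:
                p[i], p[j] = s, 0.0
            else:
                p[i], p[j] = 0.0, s
                i = j
        else:       # one unit is selected, the other keeps the remainder
            if rng.random() < (1 - p[j]) / (2 - s):
                p[i], p[j] = 1.0, s - 1.0
                i = j
            else:
                p[i], p[j] = s - 1.0, 1.0
    if not resolved(p[i]):  # leftover mass when the probabilities do not sum to an integer
        p[i] = 1.0 if rng.random() < p[i] else 0.0
    return np.flatnonzero(p > 0.5)

def normalized_error(A, x, b):
    """The reported metric: ||A x - b||_2^2 / ||b||_2^2."""
    return np.sum((A @ x - b) ** 2) / np.sum(b**2)

# Small demo; the paper reports n = 10^5, p = 12, and 1000 trials.
rng = np.random.default_rng(0)
pts = rng.uniform(-1, 1, size=(2000, 2))
A = poly_features(pts, 3)
b = np.sin(3 * pts[:, 0]) * np.cos(2 * pts[:, 1])  # placeholder target, an assumption
tau = leverage_scores(A)
k = 40  # target number of samples
probs = np.minimum(1.0, k * tau / tau.sum())  # sums to k only if nothing is clipped

errors = []
for _ in range(20):
    S = pivotal_sample(probs, rng)
    w = 1.0 / np.sqrt(probs[S])  # importance weights for the sampled rows
    x_tilde = np.linalg.lstsq(A[S] * w[:, None], b[S] * w, rcond=None)[0]
    errors.append(normalized_error(A, x_tilde, b))

x_opt = np.linalg.lstsq(A, b, rcond=None)[0]
print("median sampled error:", np.median(errors))
print("optimal error:", normalized_error(A, x_opt, b))
```

The sketch processes rows in index order, so it does not capture the spatially-aware ordering that the paper credits for the improvement; reproducing that would require ordering or tree-structuring the rows by their coordinates as described in the paper.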