Active Feature Acquisition with Generative Surrogate Models

Authors: Yang Li, Junier Oliva

ICML 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we evaluate our method on several benchmark environments built upon the UCI repository (Dua & Graff, 2017) and MNIST dataset (LeCun, 1998). We compare our method to another RL-based approach, JAFA (Shim et al., 2018), which jointly trains an agent and a classifier. We also compare to a greedy policy, EDDI (Ma et al., 2018), that estimates the utility for each candidate feature using a VAE-based model and selects the feature with the highest utility at each acquisition step. As a baseline, we also acquire features greedily using our surrogate model that estimates the utility following (6), (8) and (9). We use a fixed cost for each feature and report multiple results with different α in (1) to control the trade-off between task performance and acquisition cost. We cross validate the best architecture and hyperparameters for baselines. Architectural details, hyperparameters and sensitivity analysis are provided in the appendix. (An illustrative sketch of such a greedy acquisition loop appears below the table.)
Researcher Affiliation | Academia | Department of Computer Science, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA. Correspondence to: Yang Li <yangli95@cs.unc.edu>, Junier B. Oliva <joliva@cs.unc.edu>.
Pseudocode | Yes | Please refer to Algorithm 1 for pseudo-code of the acquisition process with our GSMRL framework. Please also see Algorithm 2 in the appendix for a detailed version.
Open Source Code | Yes | We open-source a standardized environment inheriting the OpenAI gym interfaces (Brockman et al., 2016) to assist future research on active feature acquisition. Code is publicly available at https://github.com/lupalab/GSMRL. (An illustrative sketch of such a gym-style acquisition environment appears below the table.)
Open Datasets | Yes | In this section, we evaluate our method on several benchmark environments built upon the UCI repository (Dua & Graff, 2017) and MNIST dataset (LeCun, 1998).
Dataset Splits | Yes | We cross validate the best architecture and hyperparameters for baselines. Architectural details, hyperparameters and sensitivity analysis are provided in the appendix.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory amounts, or detailed computer specifications) used for running its experiments.
Software Dependencies | Yes | Our implementation is based on PyTorch (Paszke et al., 2019) and Python 3.8.
Experiment Setup | No | Architectural details, hyperparameters and sensitivity analysis are provided in the appendix.
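
To make the greedy baseline quoted under Research Type more concrete, the following is a minimal sketch of a surrogate-driven greedy acquisition loop with a fixed per-feature cost and a trade-off weight α. The `surrogate.utility` and `surrogate.predict` interfaces and the stopping rule are illustrative assumptions, not the paper's actual equations (6), (8) and (9).

```python
import numpy as np

def greedy_acquire(surrogate, x_full, feature_cost=0.05, alpha=1.0):
    """Greedy feature acquisition driven by a surrogate model (illustrative sketch)."""
    n = len(x_full)
    acquired = np.zeros(n, dtype=bool)          # mask of features acquired so far
    while not acquired.all():
        candidates = np.flatnonzero(~acquired)
        # assumed interface: surrogate scores the expected utility of acquiring
        # feature j given the currently observed (masked) features
        utilities = np.array([
            surrogate.utility(x_full * acquired, acquired, j) for j in candidates
        ])
        # assumed stopping rule: stop once the best utility no longer outweighs
        # the cost weighted by the trade-off parameter alpha
        if utilities.max() <= alpha * feature_cost:
            break
        acquired[candidates[np.argmax(utilities)]] = True
    # final prediction from the surrogate using only the acquired features
    return surrogate.predict(x_full * acquired, acquired), acquired
```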
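
The Open Source Code row notes a standardized environment that inherits the OpenAI gym interfaces. The class below is a minimal sketch of what such an acquisition environment could look like, assuming a fixed per-feature cost and a mask-augmented observation; the class name, observation encoding, and reward handling are illustrative assumptions rather than the released GSMRL environment.

```python
import numpy as np

class AFAEnv:
    """Illustrative gym-style environment for active feature acquisition."""

    def __init__(self, X, y, feature_cost=0.05):
        self.X, self.y = X, y                 # full feature matrix and labels
        self.n_features = X.shape[1]
        self.feature_cost = feature_cost      # fixed per-feature acquisition cost

    def reset(self):
        self.idx = np.random.randint(len(self.X))
        self.acquired = np.zeros(self.n_features, dtype=bool)
        return self._observation()

    def step(self, action):
        # the action indexes the next feature to acquire
        self.acquired[action] = True
        reward = -self.feature_cost           # pay the acquisition cost
        done = self.acquired.all()            # a real env would also expose a "stop" action
        return self._observation(), reward, done, {}

    def _observation(self):
        # observed feature values plus a mask of which features have been acquired
        values = np.where(self.acquired, self.X[self.idx], 0.0)
        return np.concatenate([values, self.acquired.astype(float)])
```

In the classic gym API, `reset` returns the initial observation and `step` returns `(observation, reward, done, info)`; the mask channel lets an agent see which features it has already paid for.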