INVASE: Instance-wise Variable Selection using Neural Networks

Authors: Jinsung Yoon, James Jordon, Mihaela van der Schaar

ICLR 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate through a mixture of synthetic and real data experiments that INVASE significantly outperforms state-of-the-art benchmarks.
Researcher Affiliation | Academia | Jinsung Yoon, Department of Electrical and Computer Engineering, UCLA, California, USA (jsyoon0823@g.ucla.edu); James Jordon, Engineering Science Department, University of Oxford, UK (james.jordon@wolfson.ox.ac.uk); Mihaela van der Schaar, University of Cambridge, UK / Department of Electrical and Computer Engineering, UCLA, California, USA / Alan Turing Institute, London, UK (mihaela@ee.ucla.edu)
Pseudocode | Yes | Pseudo-code of INVASE is given in Algorithm 1. (A hedged sketch of one such training step appears after this table.)
Open Source Code | Yes | Implementation of INVASE can be found at https://github.com/jsyoon0823/INVASE.
Open Datasets | Yes | In this section we use two real-world datasets to perform a series of further experiments. The first, the Meta-Analysis Global Group in Chronic Heart Failure (MAGGIC) dataset [23], has 40,409 patients each with 31 measured features. [...] The second, the Prostate, Lung, Colorectal and Ovarian (PLCO) Cancer Screening Trial in the US and the European Randomized Study of Screening for Prostate Cancer (ERSPC) dataset [8; 26], contains 38,001 patients each with 106 measured features.
Dataset Splits | Yes | For each of Syn1 to Syn6 we draw 20,000 samples from the data generation model and separate each into training ($D_{\mathrm{train}} = \{(x_i, y_i)\}_{i=1}^{10000}$) and testing ($D_{\mathrm{test}} = \{(x_j, y_j)\}_{j=1}^{10000}$) sets. For real-world data: We use cross-validation to select λ among {0.1, 0.3, 0.5, 1, 2, 5, 10}. (A short split sketch follows the table.)
Hardware Specification | No | The paper mentions 'CPU times' and that experiments were conducted, but does not provide specific hardware details such as GPU/CPU models, memory, or cloud instance types.
Software Dependencies | No | The paper mentions 'tensorflow' and 'scikit-learn' but does not specify their version numbers or any other software dependencies with versioning information.
Experiment Setup | Yes | In the experiments, the depth of the selector, predictor, and baseline networks is set to 3. The number of hidden nodes in each layer is d and 2d, respectively. We use either ReLU or SELU as the activation functions of each layer except for the output layer, where we use the sigmoid activation function for the selector network and the softmax activation function for the predictor and baseline networks. The number of samples in each mini-batch is 1000 for the selector, predictor, and baseline networks. We use cross-validation to select λ among {0.1, 0.3, 0.5, 1, 2, 5, 10}. (A hedged sketch of these network shapes follows.)
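The "Experiment Setup" row pins down the network shapes well enough to sketch them. Below is a minimal Keras sketch, not the authors' code: we read "depth 3" as two hidden layers plus an output layer, and "d and 2d" as the hidden widths of the selector and of the predictor/baseline respectively. Both readings, and all function names, are assumptions.

```python
# Hedged sketch of the quoted setup: depth-3 networks, hidden widths d / 2d,
# ReLU or SELU hidden units, sigmoid selector output, softmax predictor and
# baseline outputs. The "two hidden layers + output" reading of depth 3 and
# the assignment of widths d vs. 2d are assumptions, not the authors' code.
import tensorflow as tf
from tensorflow.keras import layers, Model

def build_selector(d: int) -> Model:
    """Maps x in R^d to per-feature selection probabilities pi(x) in [0,1]^d."""
    x = layers.Input(shape=(d,))
    h = layers.Dense(d, activation="selu")(x)
    h = layers.Dense(d, activation="selu")(h)
    out = layers.Dense(d, activation="sigmoid")(h)
    return Model(x, out)

def build_classifier(d: int, n_classes: int) -> Model:
    """Predictor/baseline: depth-3 MLP ending in a softmax over class labels."""
    x = layers.Input(shape=(d,))
    h = layers.Dense(2 * d, activation="relu")(x)
    h = layers.Dense(2 * d, activation="relu")(h)
    out = layers.Dense(n_classes, activation="softmax")(h)
    return Model(x, out)
```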
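Algorithm 1 itself is in the paper; the following is a hedged sketch of one INVASE-style update consistent with the paper's description: the selector samples a binary feature mask, the predictor is fit on the masked input, the baseline on the full input, and the selector is pushed with a REINFORCE-style gradient whose per-sample weight is the predictor/baseline loss gap plus the sparsity penalty λ‖s‖₀. The optimizer handling, the 1e-8 log-clipping constant, and the function name `invase_step` are assumptions.

```python
import tensorflow as tf

# Per-sample cross-entropy; y is assumed one-hot encoded.
cce = tf.keras.losses.CategoricalCrossentropy(reduction="none")

def invase_step(selector, predictor, baseline, opt_s, opt_p, opt_b, x, y, lam):
    """One training step in the spirit of Algorithm 1 (sketch; see assumptions above)."""
    # 1) Sample a binary feature mask s ~ Bernoulli(pi(x)) from the selector.
    probs = selector(x)
    s = tf.cast(tf.random.uniform(tf.shape(probs)) < probs, tf.float32)

    # 2) Fit the predictor on the masked input x * s.
    with tf.GradientTape() as tp:
        loss_p = tf.reduce_mean(cce(y, predictor(x * s)))
    opt_p.apply_gradients(zip(tp.gradient(loss_p, predictor.trainable_variables),
                              predictor.trainable_variables))

    # 3) Fit the baseline on the full input x.
    with tf.GradientTape() as tb:
        loss_b = tf.reduce_mean(cce(y, baseline(x)))
    opt_b.apply_gradients(zip(tb.gradient(loss_b, baseline.trainable_variables),
                              baseline.trainable_variables))

    # 4) REINFORCE update of the selector. The per-sample weight is the loss
    #    gap (masked predictor minus full-input baseline) plus lam * ||s||_0,
    #    so masks that hurt prediction or select many features are discouraged.
    weight = (cce(y, predictor(x * s)) - cce(y, baseline(x))
              + lam * tf.reduce_sum(s, axis=1))
    with tf.GradientTape() as ts:
        p = selector(x)
        log_pi = tf.reduce_sum(s * tf.math.log(p + 1e-8)
                               + (1.0 - s) * tf.math.log(1.0 - p + 1e-8), axis=1)
        loss_s = tf.reduce_mean(weight * log_pi)  # surrogate; weight is constant here
    opt_s.apply_gradients(zip(ts.gradient(loss_s, selector.trainable_variables),
                              selector.trainable_variables))
    return float(loss_p), float(loss_b)
```

Per the quoted setup, each call would receive a mini-batch of 1,000 samples; the three optimizers (e.g. one Adam per network, created once outside the loop) are our choice, not stated in the excerpt.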
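The split protocol in the "Dataset Splits" row is concrete enough for a short sketch. The placeholder data, the 11-feature dimension for Syn1 to Syn6, and the use of scikit-learn's splitter are assumptions; only the 10,000/10,000 split and the λ grid are quoted from the paper.

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(20000, 11))    # placeholder draw; 11 features is an assumed Syn1-Syn6 dimension
y = rng.integers(0, 2, size=20000)  # placeholder binary labels

# 10,000 / 10,000 train/test split, as quoted from the paper.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=10000, test_size=10000, random_state=0)

# Lambda grid quoted for cross-validation on the real-world datasets.
lambda_grid = [0.1, 0.3, 0.5, 1, 2, 5, 10]
```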