INVASE: Instance-wise Variable Selection using Neural Networks

Authors: Jinsung Yoon, James Jordon, Mihaela van der Schaar

ICLR 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate through a mixture of synthetic and real data experiments that INVASE significantly outperforms state-of-the-art benchmarks.
Researcher Affiliation | Academia | Jinsung Yoon, Department of Electrical and Computer Engineering, UCLA, California, USA (jsyoon0823@g.ucla.edu); James Jordon, Engineering Science Department, University of Oxford, UK (james.jordon@wolfson.ox.ac.uk); Mihaela van der Schaar, University of Cambridge, UK / Department of Electrical and Computer Engineering, UCLA, California, USA / Alan Turing Institute, London, UK (mihaela@ee.ucla.edu)
Pseudocode | Yes | Pseudo-code of INVASE is given in Algorithm 1. (A hedged sketch of one such training step appears after this table.)
Open Source Code | Yes | Implementation of INVASE can be found at https://github.com/jsyoon0823/INVASE.
Open Datasets | Yes | In this section we use two real-world datasets to perform a series of further experiments. The first, the Meta-Analysis Global Group in Chronic Heart Failure (MAGGIC) dataset [23], has 40,409 patients each with 31 measured features. [...] The second, the Prostate, Lung, Colorectal and Ovarian (PLCO) Cancer Screening Trial in the US and the European Randomized Study of Screening for Prostate Cancer (ERSPC) dataset [8; 26], contains 38,001 patients each with 106 measured features.
Dataset Splits | Yes | For each of Syn1 to Syn6 we draw 20,000 samples from the data generation model and separate each into training ($D_{\mathrm{train}} = \{(x_i, y_i)\}_{i=1}^{10000}$) and testing ($D_{\mathrm{test}} = \{(x_j, y_j)\}_{j=1}^{10000}$) sets. For real-world data: We use cross-validation to select λ among {0.1, 0.3, 0.5, 1, 2, 5, 10}. (A short split sketch follows the table.)
Hardware Specification | No | The paper mentions 'CPU times' and that experiments were conducted, but does not provide specific hardware details such as GPU/CPU models, memory, or cloud instance types.
Software Dependencies | No | The paper mentions 'tensorflow' and 'scikit-learn' but does not specify their version numbers or any other software dependencies with versioning information.
Experiment Setup | Yes | In the experiments, the depth of the selector, predictor, and baseline networks is set to 3. The number of hidden nodes in each layer is d and 2d, respectively. We use either ReLU or SELU as the activation functions of each layer except for the output layer, where we use the sigmoid activation function for the selector network and the softmax activation function for the predictor and baseline networks. The number of samples in each mini-batch is 1000 for the selector, predictor, and baseline networks. We use cross-validation to select λ among {0.1, 0.3, 0.5, 1, 2, 5, 10}. (A hedged sketch of these network shapes follows.)
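The "Experiment Setup" row pins down the network shapes well enough to sketch them. Below is a minimal Keras sketch, not the authors' code: we read "depth 3" as two hidden layers plus an output layer, and "d and 2d" as the hidden widths of the selector and of the predictor/baseline respectively. Both readings, and all function names, are assumptions.

```python
# Hedged sketch of the quoted setup: depth-3 networks, hidden widths d / 2d,
# ReLU or SELU hidden units, sigmoid selector output, softmax predictor and
# baseline outputs. The "two hidden layers + output" reading of depth 3 and
# the assignment of widths d vs. 2d are assumptions, not the authors' code.
import tensorflow as tf
from tensorflow.keras import layers, Model

def build_selector(d: int) -> Model:
    """Maps x in R^d to per-feature selection probabilities pi(x) in [0,1]^d."""
    x = layers.Input(shape=(d,))
    h = layers.Dense(d, activation="selu")(x)
    h = layers.Dense(d, activation="selu")(h)
    out = layers.Dense(d, activation="sigmoid")(h)
    return Model(x, out)

def build_classifier(d: int, n_classes: int) -> Model:
    """Predictor/baseline: depth-3 MLP ending in a softmax over class labels."""
    x = layers.Input(shape=(d,))
    h = layers.Dense(2 * d, activation="relu")(x)
    h = layers.Dense(2 * d, activation="relu")(h)
    out = layers.Dense(n_classes, activation="softmax")(h)
    return Model(x, out)
```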
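Algorithm 1 itself is in the paper; the following is a hedged sketch of one INVASE-style update consistent with the paper's description: the selector samples a binary feature mask, the predictor is fit on the masked input, the baseline on the full input, and the selector is pushed with a REINFORCE-style gradient whose per-sample weight is the predictor/baseline loss gap plus the sparsity penalty λ‖s‖₀. The optimizer handling, the 1e-8 log-clipping constant, and the function name `invase_step` are assumptions.

```python
import tensorflow as tf

# Per-sample cross-entropy; y is assumed one-hot encoded.
cce = tf.keras.losses.CategoricalCrossentropy(reduction="none")

def invase_step(selector, predictor, baseline, opt_s, opt_p, opt_b, x, y, lam):
    """One training step in the spirit of Algorithm 1 (sketch; see assumptions above)."""
    # 1) Sample a binary feature mask s ~ Bernoulli(pi(x)) from the selector.
    probs = selector(x)
    s = tf.cast(tf.random.uniform(tf.shape(probs)) < probs, tf.float32)

    # 2) Fit the predictor on the masked input x * s.
    with tf.GradientTape() as tp:
        loss_p = tf.reduce_mean(cce(y, predictor(x * s)))
    opt_p.apply_gradients(zip(tp.gradient(loss_p, predictor.trainable_variables),
                              predictor.trainable_variables))

    # 3) Fit the baseline on the full input x.
    with tf.GradientTape() as tb:
        loss_b = tf.reduce_mean(cce(y, baseline(x)))
    opt_b.apply_gradients(zip(tb.gradient(loss_b, baseline.trainable_variables),
                              baseline.trainable_variables))

    # 4) REINFORCE update of the selector. The per-sample weight is the loss
    #    gap (masked predictor minus full-input baseline) plus lam * ||s||_0,
    #    so masks that hurt prediction or select many features are discouraged.
    weight = (cce(y, predictor(x * s)) - cce(y, baseline(x))
              + lam * tf.reduce_sum(s, axis=1))
    with tf.GradientTape() as ts:
        p = selector(x)
        log_pi = tf.reduce_sum(s * tf.math.log(p + 1e-8)
                               + (1.0 - s) * tf.math.log(1.0 - p + 1e-8), axis=1)
        loss_s = tf.reduce_mean(weight * log_pi)  # surrogate; weight is constant here
    opt_s.apply_gradients(zip(ts.gradient(loss_s, selector.trainable_variables),
                              selector.trainable_variables))
    return float(loss_p), float(loss_b)
```

Per the quoted setup, each call would receive a mini-batch of 1,000 samples; the three optimizers (e.g. one Adam per network, created once outside the loop) are our choice, not stated in the excerpt.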
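The split protocol in the "Dataset Splits" row is concrete enough for a short sketch. The placeholder data, the 11-feature dimension for Syn1 to Syn6, and the use of scikit-learn's splitter are assumptions; only the 10,000/10,000 split and the λ grid are quoted from the paper.

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(20000, 11))    # placeholder draw; 11 features is an assumed Syn1-Syn6 dimension
y = rng.integers(0, 2, size=20000)  # placeholder binary labels

# 10,000 / 10,000 train/test split, as quoted from the paper.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=10000, test_size=10000, random_state=0)

# Lambda grid quoted for cross-validation on the real-world datasets.
lambda_grid = [0.1, 0.3, 0.5, 1, 2, 5, 10]
```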