reproducibilityindex.ai

KnockoffGAN: Generating Knockoffs for Feature Selection using Generative Adversarial Networks

Authors: James Jordon, Jinsung Yoon, Mihaela van der Schaar

ICLR 2019

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We demonstrate the capability of our model to perform feature selection, showing that it performs as well as the originally proposed knockoff generation model in the Gaussian setting and that it outperforms the original model in non-Gaussian settings, including on a real-world dataset. ... 5 EXPERIMENTS In this section we demonstrate the capability of our method to match the results of [7] in settings where their model is correctly specified (i.e. when the underlying feature distribution is Gaussian) and then go on to show, in settings where the underlying feature distribution is non-Gaussian, that our method is able to outperform their Gaussian approximation.
Researcher Affiliation	Academia	James Jordon Engineering Science Department University of Oxford, UK james.jordon@wolfson.ox.ac.uk Jinsung Yoon Department of Electrical and Computer Engineering UCLA, California, USA jsyoon0823@g.ucla.edu Mihaela van der Schaar University of Cambridge, UK Department of Electrical and Computer Engineering, UCLA, California, USA Alan Turing Institute, London, UK mihaela@ee.ucla.edu
Pseudocode	Yes	Algorithm 1 Pseudo-code of Knockoff GAN
Open Source Code	No	No explicit statement or link to their own open-source code for KnockoffGAN. Links provided are for benchmarks.
Open Datasets	No	In this section we use a biobank dataset (with 387 dimensions)... To preserve anonymity of the authors, the full details of this dataset will be given upon acceptance of the paper.
Dataset Splits	No	The paper states 'the number of samples to be n = 3000' for synthetic data but does not provide explicit train/validation/test splits, percentages, or counts for its datasets.
Hardware Specification	No	The paper does not provide any specific details about the hardware used to run the experiments (e.g., GPU models, CPU types, memory).
Software Dependencies	No	The paper states 'Knockoff GAN is implemented in tensorflow.' but does not provide specific version numbers for TensorFlow or any other software dependencies required for reproducibility.
Experiment Setup	Yes	The number of hidden nodes in each layer is d/4, d/16, d/4 for the generator, discriminator, and WGAN-GP, respectively. For the power network, we use 2 diagonal matrices for each layer to make two hidden nodes for each feature separately. We use ReLU and tanh as the activation functions for each layer except for the output layer where we use a linear activation function for the generator, power network and WGAN-GP networks and sigmoid activation function for the discriminator network. The number of samples in each mini-batch is 128. ... where λ, µ are hyper-parameters (set to 1 in the experiments section). ... η is a hyper-parameter (set to 10 in practice). ... We chose 0.9 to balance this, following the implementation of [38].
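
The values quoted in the Experiment Setup row can be gathered into a small configuration object, which may help anyone attempting a re-implementation. The following is a minimal sketch, not the authors' code: the class name, field names, and the helper `hidden_widths` are hypothetical, the hidden-width list is read as three ratios (d/4, d/16, d/4), and rounding the ratios down to integers is an assumption, since the quoted text gives only the ratios. The role of the quoted 0.9 is elided in the excerpt, so it is stored under a neutral name.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KnockoffGANSetup:
    """Hyper-parameters quoted in the Experiment Setup row.

    Field names are hypothetical; only the values come from the quoted text.
    """
    batch_size: int = 128       # samples per mini-batch
    lam: float = 1.0            # lambda, set to 1 in the experiments section
    mu: float = 1.0             # mu, set to 1 in the experiments section
    eta: float = 10.0           # eta, set to 10 in practice
    quoted_value: float = 0.9   # the quoted "0.9"; the excerpt elides what it balances


def hidden_widths(d: int) -> dict:
    """Hidden-layer width per network for d input features.

    Assumption: ratios are floored to integers and clamped to at least 1.
    """
    return {
        "generator": max(1, d // 4),
        "discriminator": max(1, d // 16),
        "wgan_gp": max(1, d // 4),
    }


if __name__ == "__main__":
    setup = KnockoffGANSetup()
    # 387 is the dimensionality of the biobank dataset mentioned in the Open Datasets row.
    print(setup)
    print(hidden_widths(387))
```

Under these assumptions, the 387-dimensional biobank dataset would give hidden widths of 96, 24, and 96 for the generator, discriminator, and WGAN-GP networks respectively. Optimizer, learning rate, and number of training iterations are not part of the quoted text and are left out here.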