Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Composite Feature Selection Using Deep Ensembles

Authors: Fergus Imrie, Alexander Norcliffe, Pietro Liò, Mihaela van der Schaar

NeurIPS 2022

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate CompFS using several synthetic and semi-synthetic datasets where ground truth feature importances and group structure are known. In addition, we illustrate our method on an image dataset (MNIST) and a real-world cancer dataset (METABRIC). |
| Researcher Affiliation | Academia | Fergus Imrie, University of California, Los Angeles; Alexander Norcliffe, University of Cambridge; Pietro Liò, University of Cambridge; Mihaela van der Schaar, University of Cambridge, The Alan Turing Institute, University of California, Los Angeles |
| Pseudocode | No | The paper describes its methodology in prose and figures, but it does not include a dedicated section or block explicitly labeled as 'Pseudocode' or 'Algorithm'. |
| Open Source Code | Yes | The code for our method and experiments is available on GitHub. |
| Open Datasets | Yes | We use several of the datasets constructed by [62], some of which were also used by [75]. (Section 5.2 Semi-Synthetic Experiments) We investigate CompFS on the MNIST dataset [48]. (Section 5.3 Image Dataset: MNIST) Finally, we assess CompFS on a real-world dataset, METABRIC [21, 68]. (Section 5.4 Real-World Data: METABRIC) |
| Dataset Splits | Yes | For all datasets, we hold out 10% of the data for validation and use 10% of the data for testing, using the remaining 80% for training. |
| Hardware Specification | Yes | All experiments were run on a single NVIDIA GeForce RTX 3090 GPU or on a commercially available laptop with an Intel i7-10750H CPU and 16GB of RAM. |
| Software Dependencies | No | The paper mentions the use of an Adam optimizer but does not specify software dependencies with version numbers (e.g., specific Python versions, machine learning frameworks such as PyTorch or TensorFlow with their versions, or library versions). |
| Experiment Setup | Yes | For all our experiments, we use a two-layer feedforward neural network with 128 hidden units and ReLU non-linearities for the group encoder and predictor network. We use a batch size of 128 and train with the Adam optimizer [44] with learning rate 0.001. (Appendix C) Hyperparameters for each experiment are provided in Table 4. (Question 3b in the checklist; Table 4 provides specific values for learning rate, batch size, epochs, etc.) |
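The split protocol quoted in the "Dataset Splits" row (10% validation, 10% test, 80% train) can be sketched in plain Python as below. The function name, seed, and shuffling strategy are illustrative assumptions, not details taken from the paper or its released code.

```python
import random

def train_val_test_split(indices, val_frac=0.10, test_frac=0.10, seed=0):
    """Hold out val_frac for validation and test_frac for testing;
    the remainder is used for training (80/10/10 by default)."""
    idx = list(indices)
    random.Random(seed).shuffle(idx)  # fixed seed for a reproducible split
    n = len(idx)
    n_val = int(n * val_frac)
    n_test = int(n * test_frac)
    val = idx[:n_val]
    test = idx[n_val:n_val + n_test]
    train = idx[n_val + n_test:]
    return train, val, test

train, val, test = train_val_test_split(range(1000))
print(len(train), len(val), len(test))  # 800 100 100
```

Splitting by shuffled indices rather than by slicing the raw arrays keeps the three subsets disjoint regardless of how the underlying dataset is stored.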