Instance-wise Feature Grouping

Authors: Aria Masoomi, Chieh Wu, Tingting Zhao, Zifeng Wang, Peter Castaldi, Jennifer Dy

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on synthetic data validate our theoretical claims. Experiments on MNIST, Fashion MNIST, and gene expression datasets show that our method discovers feature groups with high classification accuracies.
Researcher Affiliation | Academia | 1 Department of Electrical and Computer Engineering, Northeastern University, Boston, MA, US; 2 Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, MA, US
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. It describes the method textually and with a flowchart.
Open Source Code | Yes | We make the source code publicly available at https://github.com/ariahimself/Instance-wise-Feature-Grouping.
Open Datasets | Yes | We additionally test on benchmark image datasets from MNIST and Fashion MNIST (F-MNIST) [34, 35]... We also evaluate our method on real-world gene expression data, as quantified by RNA sequencing, from the COPDGene Study, an observational study to identify genomic markers associated with chronic obstructive pulmonary disease (COPD) [32].
Dataset Splits | Yes | We generate 100000 training, 1000 validation, and 1000 test samples for each combination. ... All λs are identified by maximizing the objective given a validation set.
Hardware Specification | Yes | The experiments are implemented with Python, Numpy, Sklearn, and TensorFlow [36, 37, 38, 39] on a single NVIDIA GTX 1060Ti GPU.
Software Dependencies | No | The paper mentions software such as Python, Numpy, Sklearn, and TensorFlow but does not specify their version numbers, which are required for a reproducible description of ancillary software.
Experiment Setup | Yes | We use a neural network of width 100 and depth 2 to generate the probability inputs for the Gumbel-Softmax to obtain G and S; the Gumbel temperature was set to 0.1. ReLU was used as the activation function, with softmax at the final layer for prediction. The Adam optimizer with a learning rate of 0.001 and hyperparameters β1 = 0.9, β2 = 0.999 was used without further tuning. All datasets are centered to 0 and normalized to have a standard deviation of 1. For all data, we used two fully connected layers of width 32 and 16. All λs are identified by maximizing the objective given a validation set.
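The Experiment Setup row is concrete enough to sketch in code. Below is a minimal, illustrative TensorFlow 2 / Keras sketch of that configuration only, not the authors' released implementation (see the GitHub repository linked above). The quoted hyperparameters (width-100, depth-2 selector; Gumbel temperature 0.1; ReLU activations with a softmax output; Adam with learning rate 0.001, β1 = 0.9, β2 = 0.999; 32- and 16-unit predictor layers; zero-mean, unit-variance standardization) come from the row; the number of groups, the exact shapes of G and S, the wiring between selector and predictor, and the helper names (build_selector, build_predictor, sample_gumbel_softmax) are assumptions.

```python
import tensorflow as tf

# Hyperparameters quoted in the Experiment Setup row.
GUMBEL_TEMPERATURE = 0.1
LEARNING_RATE = 1e-3

def sample_gumbel_softmax(logits, temperature=GUMBEL_TEMPERATURE):
    """Differentiable (soft) sample from a Gumbel-Softmax distribution over the last axis."""
    uniform = tf.random.uniform(tf.shape(logits), minval=1e-8, maxval=1.0)
    gumbel_noise = -tf.math.log(-tf.math.log(uniform))
    return tf.nn.softmax((logits + gumbel_noise) / temperature, axis=-1)

def build_selector(num_features, num_groups):
    """Width-100, depth-2 network producing the probability inputs (logits) for the
    Gumbel-Softmax that yields G and S. The per-feature, per-group logit layout is an assumption."""
    inputs = tf.keras.Input(shape=(num_features,))
    h = tf.keras.layers.Dense(100, activation="relu")(inputs)
    h = tf.keras.layers.Dense(100, activation="relu")(h)
    logits = tf.keras.layers.Dense(num_features * num_groups)(h)
    logits = tf.keras.layers.Reshape((num_features, num_groups))(logits)
    return tf.keras.Model(inputs, logits, name="selector")

def build_predictor(input_dim, num_classes):
    """Two fully connected layers of width 32 and 16, with softmax at the final layer."""
    inputs = tf.keras.Input(shape=(input_dim,))
    h = tf.keras.layers.Dense(32, activation="relu")(inputs)
    h = tf.keras.layers.Dense(16, activation="relu")(h)
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(h)
    return tf.keras.Model(inputs, outputs, name="predictor")

def standardize(x_train, x_eval):
    """Center each feature to 0 and scale to unit standard deviation, using training statistics."""
    mean = x_train.mean(axis=0)
    std = x_train.std(axis=0) + 1e-8
    return (x_train - mean) / std, (x_eval - mean) / std

# Adam with the quoted learning rate and betas, used without further tuning.
optimizer = tf.keras.optimizers.Adam(learning_rate=LEARNING_RATE, beta_1=0.9, beta_2=0.999)

# Example instantiation for an MNIST-sized input; the group count of 10 is an assumption.
selector = build_selector(num_features=784, num_groups=10)
predictor = build_predictor(input_dim=784, num_classes=10)
```

How the soft group assignments from the selector are combined with the input before prediction, and how the λ-weighted objective is formed and maximized on the validation set, are specified in the paper and the released code rather than in this sketch.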