Feature Selection in the Contrastive Analysis Setting

Authors: Ethan Weinberger, Ian Covert, Su-In Lee

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We motivate our approach with a novel information-theoretic analysis of representation learning in the CA setting, and we empirically validate CFS on a semi-synthetic dataset and four real-world biomedical datasets."
Researcher Affiliation | Academia | "Ethan Weinberger, Paul G. Allen School of Computer Science, University of Washington, Seattle, WA 98195, ewein@cs.washington.edu; Ian C. Covert, Department of Computer Science, Stanford University, Stanford, CA 94305, icovert@stanford.edu; Su-In Lee, Paul G. Allen School of Computer Science, University of Washington, Seattle, WA 98195, suinlee@cs.washington.edu"
Pseudocode | No | The paper describes the proposed method (CFS) using text descriptions and mathematical equations, accompanied by a diagram (Figure 2), but it does not include any explicit pseudocode or algorithm blocks.
Open Source Code | Yes | "An open-source implementation of our method is available at https://github.com/suinleelab/CFS."
Open Datasets | Yes | "We validate our approach empirically through extensive experiments on a semi-synthetic dataset introduced in prior work as well as four real-world biomedical datasets... Raw data was downloaded from https://archive.ics.uci.edu/ml/machine-learning-databases/00342/."
Dataset Splits | Yes | "For all experiments we divided our data using an 80-20 train-test split, and we report the mean ± standard error over five random seeds for each method."
Hardware Specification | Yes | "All experiments were performed on a system running CentOS 7.9.2009 equipped with an NVIDIA RTX 2080 Ti GPU with CUDA 11.7."
Software Dependencies | Yes | "CFS models were implemented using PyTorch [50] (version 1.13) with the PyTorch Lightning API... equipped with an NVIDIA RTX 2080 Ti GPU with CUDA 11.7."
Experiment Setup | Yes | "For all CFS variants we let our reconstruction function f be a multilayer perceptron with two hidden layers of size 512 with ReLU activation functions... All CFS models were trained using the PyTorch implementation of the Adam [51] optimizer with default hyperparameters. Batch sizes of 128 were used for all experiments."
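The setup quoted above (an 80-20 train-test split, a reconstruction MLP with two hidden layers of size 512 and ReLU activations, batch size 128) can be sketched as follows. This is a minimal NumPy illustration of the described architecture and split, not the authors' PyTorch/CFS implementation; the input dimensionality and dataset size are hypothetical placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data for illustration; the paper's datasets vary in size.
n_samples, n_features = 1000, 64
X = rng.normal(size=(n_samples, n_features))

# 80-20 train-test split, as reported in the experimental setup.
n_train = int(0.8 * n_samples)
perm = rng.permutation(n_samples)
train_idx, test_idx = perm[:n_train], perm[n_train:]

def init_mlp(d_in, d_out, hidden=512):
    """Two hidden layers of size 512, matching the described reconstruction f."""
    sizes = [d_in, hidden, hidden, d_out]
    return [(rng.normal(scale=np.sqrt(2.0 / m), size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(params, x):
    for i, (W, b) in enumerate(params):
        x = x @ W + b
        if i < len(params) - 1:   # ReLU on hidden layers only
            x = np.maximum(x, 0.0)
    return x

# Reconstruction network maps features back to feature space.
params = init_mlp(n_features, n_features)
recon = forward(params, X[train_idx][:128])   # batch size 128, as reported
print(recon.shape)  # (128, 64)
```

In the paper's actual setup this network would be trained with PyTorch's Adam optimizer at its default hyperparameters; the sketch above only exercises the forward pass and data split.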