Contextual Feature Selection with Conditional Stochastic Gates
Authors: Ram Dyuthi Sristi, Ofir Lindenbaum, Shira Lifshitz, Maria Lavzin, Jackie Schiller, Gal Mishne, Hadas Benisty
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we conduct an extensive benchmark using simulated and real-world datasets across multiple domains demonstrating that c-STG can lead to improved feature selection capabilities while enhancing prediction accuracy and interpretability. and We conduct comprehensive empirical evaluations on simulated and real-world datasets across healthcare, housing, and neuroscience, demonstrating the effectiveness and adaptability of our proposed methods compared to existing techniques. |
| Researcher Affiliation | Academia | 1) University of California San Diego, La Jolla, California, USA; 2) Bar-Ilan University, Ramat Gan, Israel; 3) Technion - Israel Institute of Technology, Haifa, Israel |
| Pseudocode | Yes | Algorithm 1 Weighted c-STG |
| Open Source Code | Yes | Code for c-STG is available at https://github.com/Mishne-Lab/Conditional-STG |
| Open Datasets | Yes | Heart disease dataset: We now focus on medical data, specifically, the heart disease dataset from UCI ML repository (Janosi et al., 1988). and The Housing dataset (Lianjia, 2017). |
| Dataset Splits | Yes | The selection of model parameters/hyperparameters was based on preventing issues like underfitting and overfitting and ensuring optimal 5-fold cross-validated performance. and Table 1 shows the 5-fold cross-validation accuracy where we surpass other methods. |
| Hardware Specification | Yes | We trained all networks using CUDA-accelerated PyTorch implementations on an NVIDIA Quadro RTX8000 GPU. |
| Software Dependencies | No | It mentions "CUDA-accelerated PyTorch implementations" but does not provide version numbers for PyTorch or CUDA. |
| Experiment Setup | Yes | To determine the best hyperparameters, namely the learning rate (η) and regularization coefficient (λ), we performed a grid search over the following values: η ∈ {1e-1, 5e-2, 1e-2, 5e-3, 1e-3, 5e-4, 1e-4} and λ ∈ {1, 5e-1, 1e-1, 5e-2, 1e-2, 5e-3, 1e-3}. The same set of values was used for the grid search across all the datasets. |
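The grid search quoted above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' code: the grids for η and λ are the values reported in the paper, while `evaluate` is a hypothetical placeholder for the 5-fold cross-validated training run that c-STG would actually perform.

```python
from itertools import product

# Hyperparameter grids reported in the paper: learning rate (eta) and
# regularization coefficient (lam).
etas = [1e-1, 5e-2, 1e-2, 5e-3, 1e-3, 5e-4, 1e-4]
lams = [1, 5e-1, 1e-1, 5e-2, 1e-2, 5e-3, 1e-3]

def evaluate(eta, lam):
    """Hypothetical stand-in for mean 5-fold CV accuracy.

    A real run would train the c-STG model on each fold and average the
    held-out scores; this toy score simply peaks at (1e-2, 1e-1).
    """
    return -((eta - 1e-2) ** 2 + (lam - 1e-1) ** 2)

# Exhaustive search over the 7 x 7 grid, keeping the best-scoring pair.
best_eta, best_lam = max(product(etas, lams), key=lambda p: evaluate(*p))
print(best_eta, best_lam)
```

Swapping the toy `evaluate` for a real cross-validation loop reproduces the selection protocol the paper describes: the same 49-point grid is scanned for every dataset, and the pair with the best 5-fold CV performance is kept.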