Efficient High-Order Interaction-Aware Feature Selection Based on Conditional Mutual Information
Authors: Alexander Shishkin, Anastasia Bezzubtseva, Alexey Drutsa, Ilia Shishkov, Ekaterina Gladkikh, Gleb Gusev, Pavel Serdyukov
NeurIPS 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The superiority of our approach is demonstrated by comparison with recently proposed interaction-aware filters and several interaction-agnostic state-of-the-art ones on ten publicly available benchmark datasets. We also empirically validate our approach with 3 state-of-the-art classification models on 10 publicly available benchmark datasets and compare it with known interaction-aware SFS-based filters and several state-of-the-art ones. |
| Researcher Affiliation | Industry | Alexander Shishkin, Anastasia Bezzubtseva, Alexey Drutsa, Ilia Shishkov, Ekaterina Gladkikh, Gleb Gusev, Pavel Serdyukov Yandex; 16 Leo Tolstoy St., Moscow 119021, Russia {sisoid,nstbezz,adrutsa,ishfb,kglad,gleb57,pavser}@yandex-team.ru |
| Pseudocode | Yes (see the illustrative sketch below the table) | Algorithm 1 Pseudo-code of the CMICOT feature selection method (an implementation of this algorithm is available at https://github.com/yandex/CMICOT). |
| Open Source Code | Yes | Algorithm 1 Pseudo-code of the CMICOT feature selection method (an implementation of this algorithm is available at https://github.com/yandex/CMICOT). |
| Open Datasets | Yes | on 10 publicly available benchmark datasets from the UCI ML Repo (that include the NIPS 2003 FS competition) |
| Dataset Splits | Yes | The curves on Fig. 1 (b,c) are obtained over a test set, while a 10-fold cross-validation [2, 18] is also applied for several key points (e.g. k = 10, 20, 50) to estimate the significance of differences in classification quality. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory, or cloud instance types) used for running its experiments. |
| Software Dependencies | No | The paper mentions using Naive Bayes Classifier (NBC), k-Nearest Neighbor (kNN), and AdaBoost, but it does not provide specific version numbers for these or any other software libraries or dependencies. |
| Experiment Setup | No | The paper mentions general experimental parameters like k = 1..50 for feature selection and t = 1..10 for their method, and discusses preprocessing such as discretization. However, it does not provide specific hyperparameters (e.g., learning rate, batch size, optimizer settings) for the classifiers used (NBC, kNN, AdaBoost) or other detailed system-level training configurations in the main text. |
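To give a concrete feel for the class of method assessed above, here is a minimal, hypothetical sketch of a generic sequential forward selection (SFS) filter scored by conditional mutual information, written in Python. It is not the authors' CMICOT algorithm (CMICOT scores each candidate via a max-min search over bounded feature "teams" rather than conditioning on the full selected set); all names here (`entropy`, `cmi`, `cmi_forward_selection`) are illustrative, and the official implementation is at the GitHub link quoted in the table.

```python
import numpy as np
from collections import Counter


def entropy(values):
    """Empirical Shannon entropy (in bits) of a discrete sequence."""
    counts = np.array(list(Counter(values).values()), dtype=float)
    probs = counts / counts.sum()
    return -np.sum(probs * np.log2(probs))


def cmi(x, y, z):
    """I(X; Y | Z) for discrete sequences via the entropy decomposition
    I(X; Y | Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z)."""
    xz = list(zip(x, z))
    yz = list(zip(y, z))
    xyz = list(zip(x, y, z))
    return entropy(xz) + entropy(yz) - entropy(xyz) - entropy(z)


def cmi_forward_selection(X, y, k):
    """Greedy SFS: at each step, add the feature with the largest
    mutual information with y conditioned on the joint value of the
    features selected so far (represented as one tuple per sample)."""
    n_samples, n_features = X.shape
    selected = []
    context = [()] * n_samples  # joint value of selected features per sample
    for _ in range(k):
        candidates = [f for f in range(n_features) if f not in selected]
        best = max(candidates, key=lambda f: cmi(X[:, f], y, context))
        selected.append(best)
        context = [c + (v,) for c, v in zip(context, X[:, best])]
    return selected


# Tiny synthetic demo: y is determined by features 0 and 1,
# so the filter should rank them ahead of the noise features.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(500, 5))
y = X[:, 0] + 2 * X[:, 1]
print(cmi_forward_selection(X, y, k=3))  # e.g. [0, 1, <noise feature>]
```

Note the statistical limitation this sketch runs into: conditioning on the full joint of the selected set fragments the sample over exponentially many context cells as k grows. Bounding the conditioning set is, as we understand it, exactly what CMICOT's team-size parameter t (the t = 1..10 cited in the Experiment Setup row) is for.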