Sequential Attention for Feature Selection
Authors: Taisuke Yasuda, Mohammadhossein Bateni, Lin Chen, Matthew Fahrbach, Gang Fu, Vahab Mirrokni
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | This work introduces the Sequential Attention algorithm for supervised feature selection... Empirically, Sequential Attention achieves state-of-the-art feature selection results for neural networks on standard benchmarks. The code for our algorithm and experiments is publicly available. |
| Researcher Affiliation | Collaboration | Taisuke Yasuda* Carnegie Mellon University taisukey@cs.cmu.edu Mohammad Hossein Bateni, Lin Chen, Matthew Fahrbach, Gang Fu*, and Vahab Mirrokni Google Research {bateni,linche,fahrbach,thomasfu,mirrokni}@google.com |
| Pseudocode | Yes | Algorithm 1 Sequential Attention for feature selection. Algorithm 2 Orthogonal Matching Pursuit (Pati et al., 1993). Algorithm 3 Sequential LASSO (Luo & Chen, 2014). |
| Open Source Code | Yes | The code for our algorithm and experiments is publicly available. The code is available at: github.com/google-research/google-research/tree/master/sequential_attention |
| Open Datasets | Yes | In these experiments, we consider six datasets used in experiments in Lemhadri et al. (2021); Balın et al. (2019), and select k = 50 features... Table 1: Statistics about benchmark datasets. Mice Protein: 1,080 examples, 77 features, 8 classes (Biology); MNIST: 60,000 examples, 784 features, 10 classes (Image); MNIST-Fashion: 60,000 examples, 784 features, 10 classes (Image); ISOLET: 7,797 examples, 617 features, 26 classes (Speech); COIL-20: 1,440 examples, 400 features, 20 classes (Image); Activity: 5,744 examples, 561 features, 6 classes (Sensor) |
| Dataset Splits | No | The paper mentions 'test data' and 'prediction accuracies' but does not explicitly specify the training/test/validation dataset splits (e.g., percentages or counts) or reference predefined splits with citations for reproducibility. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments, such as GPU models, CPU specifications, or cloud computing instance types. |
| Software Dependencies | No | The paper does not specify any software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions, or specific library versions). |
| Experiment Setup | Yes | In these experiments, we consider six datasets used in experiments in Lemhadri et al. (2021); Balın et al. (2019), and select k = 50 features using a one-layer neural network with hidden width 67 and ReLU activation... Table 4: Epochs and batch size used to compare the efficiency of feature selection algorithms... For this experiment, we use a dense neural network with 768, 256, and 128 neurons in each of the three hidden layers with ReLU activations. |
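The pseudocode row above lists Orthogonal Matching Pursuit (Pati et al., 1993) as one of the paper's baseline algorithms. As a point of reference for the greedy feature-selection loop these algorithms share, here is a minimal NumPy sketch of OMP; it is an illustrative reimplementation from the standard description, not the paper's released code, and the function name `omp` is our own.

```python
import numpy as np

def omp(X, y, k):
    """Orthogonal Matching Pursuit: greedily pick k columns of X that
    best explain y. Illustrative sketch, not the paper's implementation."""
    n, d = X.shape
    selected = []
    residual = y.astype(float).copy()
    for _ in range(k):
        # Score each feature by its correlation with the current residual.
        scores = np.abs(X.T @ residual)
        scores[selected] = -np.inf  # never re-select a chosen feature
        j = int(np.argmax(scores))
        selected.append(j)
        # Refit least squares on all selected columns and update the residual.
        coef, *_ = np.linalg.lstsq(X[:, selected], y, rcond=None)
        residual = y - X[:, selected] @ coef
    return selected
```

On a synthetic regression problem whose target depends on only a few columns, the loop recovers those columns first; Sequential Attention replaces the correlation-based scoring step with attention weights learned jointly with the model.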