Concrete Autoencoders: Differentiable Feature Selection and Reconstruction

Authors: Muhammed Fatih Balın, Abubakar Abid, James Zou

ICML 2019

Reproducibility variables, results, and the supporting LLM responses:
Research Type: Experimental
LLM Response: "We evaluate concrete autoencoders on a variety of datasets, where they significantly outperform state-of-the-art methods for feature selection and data reconstruction. In this section, we carry out experiments to compare the performance of concrete autoencoders to other feature subset selections on standard public datasets."
Researcher Affiliation: Collaboration
LLM Response: "1 Department of Electrical Engineering, Stanford University, Stanford, United States; 2 Department of Computer Engineering, Bogazici University, Istanbul, Turkey; 3 Department of Biomedical Data Sciences, Stanford University, Stanford, United States; 4 Chan-Zuckerberg Biohub, San Francisco, United States."
Pseudocode: Yes
LLM Response: "Figure 2. Concrete autoencoder architecture and pseudocode. (b) Here, we show pseudocode for the concrete autoencoder algorithm; see Appendix C for more details."
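The pseudocode in the paper's Figure 2 centers on a concrete selector layer that samples relaxed one-hot feature masks via the Gumbel-softmax (Concrete) reparameterization. A minimal NumPy sketch of that sampling step (the function name and array shapes are illustrative, not the authors' code):

```python
import numpy as np

def concrete_selector(alpha_logits, temperature, rng):
    """Sample a relaxed one-hot selection matrix from the Concrete
    (Gumbel-softmax) distribution.

    alpha_logits: (k, d) array of unnormalized log-probabilities,
                  one row per selected feature.
    Returns a (k, d) matrix whose rows each sum to 1 and become
    approximately one-hot as temperature -> 0.
    """
    # Standard Gumbel noise for the reparameterization trick.
    gumbel = -np.log(-np.log(rng.uniform(size=alpha_logits.shape)))
    logits = (alpha_logits + gumbel) / temperature
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
# k = 20 selected features out of d = 784 inputs (MNIST-sized, illustrative).
mask_cold = concrete_selector(np.zeros((20, 784)), temperature=0.01, rng=rng)
mask_hot = concrete_selector(np.zeros((20, 784)), temperature=10.0, rng=rng)
```

At low temperature each row of the mask concentrates on a single input feature, so multiplying the input by this matrix selects (approximately) k concrete features; at high temperature the rows are near-uniform, which is what makes the selection differentiable early in training.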
Open Source Code: Yes
LLM Response: "We have made the code for our algorithm and experiments available on a public repository. Code available at: https://github.com/mfbalin/Concrete-Autoencoders"
Open Datasets: Yes
LLM Response: "We evaluate these methods on a number of datasets (the sizes of the datasets are in Table 1): MNIST and MNIST-Fashion consist of 28-by-28 grayscale images of hand-written digits and clothing items, respectively. We choose these datasets because they are widely known in the machine learning community."
Dataset Splits: Yes
LLM Response: "Furthermore, since the reconstruction fθ(·) can overfit to patterns particular to the training set, we divide each dataset randomly into train, validation, and test datasets according to a 72-8-20 split that was held constant for all experiments. We use the training set to learn the parameters of the concrete autoencoders, the validation set to select optimal hyperparameters, and the test set to evaluate generalization performance, which we report below."
Footnote: "For the MNIST, MNIST-Fashion, and Epileptic datasets, we only used 6000, 6000, and 8000 samples respectively to train and validate the model (using a 90-10 train-validation split), because of the long runtime of the UDFS algorithm. The remaining samples were used for the holdout test set."
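A 72-8-20 split held constant across experiments can be reproduced by permuting indices once under a fixed seed; a sketch (the function name and seed are illustrative, not taken from the paper's code):

```python
import numpy as np

def split_indices(n, seed=0):
    """Shuffle indices once with a fixed seed, then split 72/8/20
    into train/validation/test (percentages from the paper)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n)
    # Integer arithmetic avoids floating-point rounding surprises.
    n_train = n * 72 // 100
    n_val = n * 8 // 100
    return (idx[:n_train],
            idx[n_train:n_train + n_val],
            idx[n_train + n_val:])

train_idx, val_idx, test_idx = split_indices(10000)
```

Fixing the seed is what makes the split "held constant for all experiments": every method is evaluated on the same held-out test indices.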
Hardware Specification: No
Note: The paper does not describe the hardware used to run its experiments; no device models or specifications are given.
Software Dependencies: No
LLM Response: "For all of the experiments, we use Adam optimizer with a learning rate of 10^-3. Where available, we made use of scikit-feature implementation of each method (Li et al., 2016)."
Note: No version numbers are provided for the optimizer implementation or for scikit-feature.
Experiment Setup: Yes
LLM Response: "For all of the experiments, we use Adam optimizer with a learning rate of 10^-3. The initial temperature of the concrete autoencoder T0 was set to 10 and the final temperature TB to 0.01. We trained the concrete autoencoder until the mean of the highest probabilities in α^(i) exceeded 0.99."
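The quoted setup fixes only the temperature endpoints T0 = 10 and TB = 0.01. Assuming an exponential annealing schedule between them (a common choice for Concrete relaxations, but not confirmed by the quote above), the per-epoch temperature can be sketched as:

```python
def temperature(epoch, total_epochs, t0=10.0, tb=0.01):
    """Anneal the Concrete temperature from t0 down to tb.
    Exponential decay is an assumption here; the quoted setup
    only fixes the endpoints T0 = 10 and TB = 0.01."""
    return t0 * (tb / t0) ** (epoch / total_epochs)

# Temperature at each of 100 training epochs (epoch count illustrative).
temps = [temperature(e, 100) for e in range(101)]
```

Training then proceeds until the stopping criterion from the quote is met: the mean over rows of the highest selection probability in α^(i) exceeds 0.99, i.e. every selector row has become nearly one-hot.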