Learning Feature Engineering for Classification

Authors: Fatemeh Nargesian, Horst Samulowitz, Udayan Khurana, Elias B. Khalil, Deepak Turaga

IJCAI 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our empirical results show that LFE outperforms other feature engineering approaches for an overwhelming majority (89%) of the datasets from various sources while incurring a substantially lower computational cost.
Researcher Affiliation | Collaboration | University of Toronto, IBM Research, Georgia Institute of Technology
Pseudocode | No | The paper describes its methods in prose but does not include structured pseudocode or algorithm blocks.
Open Source Code | No | The paper states that components were implemented in TensorFlow and Scikit-learn, but it does not provide concrete access to the source code for the described methodology, nor does it explicitly state that the code is open source or otherwise available.
Open Datasets | Yes | We collected 900 classification datasets from the OpenML and UCI repositories to train transformation classifiers. [Lichman, 2013] M. Lichman. UCI Machine Learning Repository, 2013. [Vanschoren et al., 2014] Joaquin Vanschoren, Jan N. van Rijn, Bernd Bischl, and Luis Torgo. OpenML: Networked science in machine learning. SIGKDD Explor. Newsl., 15(2):49–60, June 2014.
Dataset Splits | Yes | Training samples were generated for Random Forest and Logistic Regression using 10-fold cross validation and a performance improvement threshold of 1%.
Hardware Specification | No | The paper does not provide any specific hardware details such as exact GPU/CPU models, processor types, or memory amounts used for running its experiments.
Software Dependencies | No | The paper mentions implementing components in TensorFlow and Scikit-learn, but it does not provide version numbers for these or any other software dependencies.
Experiment Setup | Yes | All transformation classifiers are MLPs with one hidden layer. We tuned the number of hidden units to optimize the F-score for each classifier, and they vary from 400 to 500. We use Stochastic Gradient Descent with minibatches to train transformation MLPs. In order to prevent overfitting, we apply regularization and drop-out [Srivastava et al., 2014]. We considered scaling range of [-10, 10] and quantile data sketch size of 200 bins.
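The Dataset Splits row above (10-fold cross validation with a 1% improvement threshold) can be illustrated with a minimal sketch. This is not the authors' code: it assumes the transformed feature is appended to the original feature matrix, uses scikit-learn's cross_val_score with a Random Forest, and the function name make_label and the default threshold value are purely illustrative.

```python
# Illustrative sketch (not from the paper): label a (feature, transformation)
# pair as useful when adding the transformed feature raises the mean 10-fold
# cross-validated score by at least theta = 0.01 (the 1% threshold).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score


def make_label(X, y, feature_idx, transform, theta=0.01):
    """Return 1 if transforming feature `feature_idx` helps the model, else 0."""
    base = cross_val_score(RandomForestClassifier(), X, y, cv=10).mean()

    # Append the transformed feature as a new column (an assumption; the
    # paper's exact sample-generation protocol may differ).
    X_aug = np.column_stack([X, transform(X[:, feature_idx])])
    augmented = cross_val_score(RandomForestClassifier(), X_aug, y, cv=10).mean()

    return int(augmented - base >= theta)


# Example usage with a hypothetical dataset (X, y):
# label = make_label(X, y, feature_idx=0, transform=np.log1p)
```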
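The Experiment Setup row describes the input representation (quantile data sketches of 200 bins over feature values scaled to [-10, 10]) and the transformation classifiers (one-hidden-layer MLPs with 400 to 500 units, trained with mini-batch SGD, dropout, and regularization). Below is a minimal sketch of what such a pipeline could look like in TensorFlow/Keras. The ReLU activation, dropout rate, L2 weight, learning rate, and batch size are assumptions, the helper names quantile_sketch and feature_representation are illustrative, and a binary classification task is assumed so that one sketch per class can be concatenated.

```python
# Illustrative sketch (not the authors' implementation) of a quantile-sketch
# input representation and a one-hidden-layer MLP transformation classifier.
import numpy as np
import tensorflow as tf

BINS = 200          # quantile data sketch size from the paper
SCALE = (-10, 10)   # scaling range from the paper


def quantile_sketch(values, bins=BINS, scale=SCALE):
    """Fixed-size, normalized histogram of one feature's values for one class."""
    lo, hi = scale
    v = np.asarray(values, dtype=float)
    # Scale raw values into [lo, hi] before binning.
    v = lo + (hi - lo) * (v - v.min()) / (v.max() - v.min() + 1e-12)
    hist, _ = np.histogram(v, bins=bins, range=(lo, hi))
    return hist / max(hist.sum(), 1)


def feature_representation(values, labels):
    """Concatenate per-class sketches (binary classification assumed here)."""
    return np.concatenate([quantile_sketch(values[labels == c]) for c in (0, 1)])


# One-hidden-layer MLP; hidden units were tuned in the 400-500 range in the paper.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(2 * BINS,)),
    tf.keras.layers.Dense(450, activation="relu",
                          kernel_regularizer=tf.keras.regularizers.l2(1e-4)),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.01),
              loss="binary_crossentropy", metrics=["accuracy"])
# model.fit(X_sketches, y_useful, batch_size=32, epochs=20)
```

In the setup described by the paper, one such classifier would be trained per transformation on samples drawn from the 900 collected OpenML and UCI datasets.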