Greedy Feature Construction
Authors: Dino Oglic, Thomas Gärtner
NeurIPS 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To evaluate the effectiveness of our approach empirically, we compare it to other related approaches by training linear ridge regression models in the feature spaces constructed by these methods. Our empirical results indicate a superior performance of the proposed approach over competing methods. The results are presented in Section 3 and the approaches are discussed in Section 4. |
| Researcher Affiliation | Academia | Dino Oglic (dino.oglic@uni-bonn.de), Institut für Informatik III, Universität Bonn, Germany; Thomas Gärtner (thomas.gaertner@nottingham.ac.uk), School of Computer Science, The University of Nottingham, UK |
| Pseudocode | Yes | Algorithm 1 GREEDYDESCENT and Algorithm 2 GREEDY FEATURE CONSTRUCTION (GFC) provide structured pseudocode descriptions of the proposed algorithms. |
| Open Source Code | No | The paper does not provide any explicit statements about making its source code available or links to a code repository for the methodology described. |
| Open Datasets | Yes | The paper explicitly states: 'available from Luís Torgo [28]. Repository with regression data sets. http://www.dcc.fc.up.pt/~ltorgo/Regression/DataSets.html, accessed September 22, 2016.' |
| Dataset Splits | Yes | For each considered data set, we split the data into 10 folds; we refer to these splits as the outer cross-validation folds. In each step of the outer cross-validation, we use nine folds as the training sample and one fold as the test sample. For the purpose of the hyperparameter tuning we split the training sample into five folds; we refer to these splits as the inner cross-validation folds. |
| Hardware Specification | No | The paper mentions: 'We are grateful for access to the University of Nottingham High Performance Computing Facility.' However, it does not specify any particular CPU or GPU models, memory, or other specific hardware components used for the experiments. |
| Software Dependencies | No | The paper discusses various algorithms and methods (e.g., 'linear SVM [13]', 'à la carte'), but it does not specify any software libraries, frameworks, or programming languages with their version numbers that were used for the implementation or experiments. |
| Experiment Setup | Yes | We run all algorithms on identical outer cross-validation folds and construct feature representations with 100 and 500 features. To control the smoothness of newly constructed features, we penalize the objective in line 3 so that solutions with a small L²ρ(X)-norm are preferred. Following this observation, we have simulated the greedy descent with Ω(c, w) = ‖c‖₂². The performance of the algorithms is assessed by comparing the root mean squared error of linear ridge regression models trained in the constructed feature spaces and the average time needed for the outer cross-validation of one fold. The best performing configuration of à la carte on the development data sets is the one with Q = 5 components. For 95% confidence, the threshold value of the Wilcoxon signed rank test with 16 data sets is T = 30. We perform the paired Welch t-test [29] with p = 0.05. |
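The evaluation protocol reported above (10 outer cross-validation folds, 5 inner folds for hyperparameter tuning, linear ridge regression scored by RMSE) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the synthetic data, and the candidate regularization values are all hypothetical stand-ins.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge regression: w = (X'X + lam*I)^{-1} X'y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def rmse(X, y, w):
    """Root mean squared error of a linear model w on (X, y)."""
    return float(np.sqrt(np.mean((X @ w - y) ** 2)))

def nested_cv_rmse(X, y, lambdas, outer=10, inner=5, seed=0):
    """10-fold outer CV; inner 5-fold CV on each training sample
    selects the ridge regularization parameter, mirroring the
    splits described in the reproducibility table."""
    rng = np.random.default_rng(seed)
    outer_folds = np.array_split(rng.permutation(len(y)), outer)
    scores = []
    for k in range(outer):
        test = outer_folds[k]
        train = np.concatenate([outer_folds[j] for j in range(outer) if j != k])
        # Inner cross-validation: tune lambda on the training sample only.
        inner_folds = np.array_split(train, inner)
        best_lam, best_err = None, np.inf
        for lam in lambdas:
            errs = []
            for i in range(inner):
                val = inner_folds[i]
                tr = np.concatenate([inner_folds[j] for j in range(inner) if j != i])
                errs.append(rmse(X[val], y[val], ridge_fit(X[tr], y[tr], lam)))
            if np.mean(errs) < best_err:
                best_lam, best_err = lam, float(np.mean(errs))
        # Refit on the full training sample, score on the held-out fold.
        w = ridge_fit(X[train], y[train], best_lam)
        scores.append(rmse(X[test], y[test], w))
    return float(np.mean(scores))

# Synthetic stand-in for a constructed feature representation.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 5))
y = X @ rng.normal(size=5) + 0.1 * rng.normal(size=200)
score = nested_cv_rmse(X, y, lambdas=[0.01, 0.1, 1.0])
print(round(score, 3))
```

Because hyperparameters are chosen only from inner folds of each training sample, the outer-fold RMSE is an unbiased estimate of generalization error, which is the property the paper's protocol relies on when comparing feature-construction methods.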