Learning Using Unselected Features (LUFe)

Authors: Joseph G. Taylor, Viktoriia Sharmanska, Kristian Kersting, David Weir, Novi Quadrianto

IJCAI 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We provide exhaustive experimental evaluation on 49 textual datasets; the results demonstrate that LUFe can indeed improve classification performance compared to traditional combinatorial feature selection, without incurring extra costs at test time.
Researcher Affiliation | Academia | Joseph G. Taylor,¹ Viktoriia Sharmanska,¹ Kristian Kersting,² David Weir,¹ and Novi Quadrianto¹. ¹SMiLe CLiNiC and TAG Lab, University of Sussex, Brighton, UK; ²Technische Universität Dortmund, Dortmund, Germany.
Pseudocode | Yes | For the pseudocode of LUFe, please refer to Algorithm 1. [a hedged sketch of the pipeline follows the table]
Open Source Code | No | No explicit statement or link providing access to the open-source code for the described methodology was found.
Open Datasets | Yes | Dataset: We follow the protocol of [Paul et al., 2015], in using a subset of the TechTC-300 collection consisting of 49 datasets, pre-processed to remove all features corresponding to any word of less than 5 characters. The TechTC-300 collection consists of 300 textual datasets, which have baseline SVM error rates uniformly distributed between 0.4 and 0.0 (http://techtc.cs.technion.ac.il/techtc300/techtc300.html). [preprocessing sketched after the table]
Dataset Splits | Yes | All experimental settings were tested over 100 repeats, and in each repeat, stratified 5-fold cross-validation was used to estimate the λ parameters for each setting.
Hardware Specification | No | No specific hardware details (such as GPU/CPU models, memory, or cloud instances) used for running the experiments were mentioned in the paper.
Software Dependencies | No | The paper mentions using SVM and a quadratic programming (QP) solver, but does not provide specific version numbers for any software dependencies or libraries.
Experiment Setup | Yes | Model selection: A consistent model selection procedure was carried out in all experiments. All experimental settings were tested over 100 repeats, and in each repeat, stratified 5-fold cross-validation was used to estimate the λ parameters for each setting. All parameters were selected from seven log-spaced values in the range {10⁻³, …, 10³}. The two SVM+ parameters for LUFe (λ₁ and λ₂) were jointly optimised through grid search; that is, 7 × 7 = 49 combinations were assessed. [grid search sketched after the table]
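
The paper trains an SVM+ model with unselected features as privileged information, solved via a QP. Since the authors' solver is not released (see the Open Source Code row), below is a minimal sketch of the standard linear SVM+ dual (Vapnik and Vashist's LUPI formulation), not the authors' exact implementation. The parameter names lam1 and lam2 mirror the λ₁ and λ₂ of the model-selection row, and the bias heuristic at the end is an assumption, not something the paper specifies.

```python
import numpy as np
from cvxopt import matrix, solvers

def svm_plus_fit(X, X_priv, y, lam1=1.0, lam2=1.0):
    """Fit a linear SVM+ by solving the standard dual QP.

    X      : (n, d)  selected features (available at train and test time)
    X_priv : (n, p)  unselected features, used as privileged information
    y      : (n,)    labels in {-1, +1}
    lam1   : regulariser on the correcting (privileged) space
    lam2   : margin-violation cost
    """
    y = y.astype(float)
    n = X.shape[0]
    gamma, C = float(lam1), float(lam2)

    K = X @ X.T                      # kernel on the decision space
    Kp = X_priv @ X_priv.T           # kernel on the privileged space
    Q = (y[:, None] * y[None, :]) * K

    # Dual variables z = [alpha; beta]; minimise 0.5 z'Pz + q'z.
    P = np.block([[Q + Kp / gamma, Kp / gamma],
                  [Kp / gamma,     Kp / gamma]])
    P += 1e-8 * np.eye(2 * n)        # small ridge for numerical stability
    k1 = (C / gamma) * (Kp @ np.ones(n))
    q = np.concatenate([-np.ones(n) - k1, -k1])

    G = -np.eye(2 * n)               # alpha >= 0, beta >= 0
    h = np.zeros(2 * n)
    A = np.vstack([np.concatenate([y, np.zeros(n)]),  # sum_i alpha_i y_i = 0
                   np.ones(2 * n)])                   # sum_i (alpha_i + beta_i) = n*C
    b = np.array([0.0, n * C])

    solvers.options["show_progress"] = False
    sol = solvers.qp(matrix(P), matrix(q), matrix(G), matrix(h),
                     matrix(A), matrix(b))
    z = np.asarray(sol["x"]).ravel()
    alpha, beta = z[:n], z[n:]

    w = (alpha * y) @ X
    # Bias: where alpha_i > 0 and beta_i > 0, the KKT conditions imply zero
    # slack, so y_i (w.x_i + b) = 1 (a common heuristic, not from the paper).
    sv = (alpha > 1e-6) & (beta > 1e-6)
    if not sv.any():
        sv = alpha > 1e-6
    b0 = float(np.mean(y[sv] - X[sv] @ w))
    return w, b0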
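```

Algorithm 1 itself is only referenced, not quoted, in the table above. From the paper's description, the LUFe recipe is: run an ordinary feature-selection step, keep the selected features as the decision space, and feed the unselected features to SVM+ as the privileged space instead of discarding them. In the sketch below, the ranking criterion (scikit-learn's mutual_info_classif) and the number of selected features are illustrative assumptions; the paper follows the selection protocol of [Paul et al., 2015].

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif

def lufe_fit(X, y, n_selected=300, lam1=1.0, lam2=1.0):
    """Rank features, keep the top n_selected as the decision space, and
    pass the remainder to SVM+ as privileged information."""
    scores = mutual_info_classif(X, y)       # stand-in ranking criterion
    order = np.argsort(scores)[::-1]
    sel, unsel = order[:n_selected], order[n_selected:]
    w, b = svm_plus_fit(X[:, sel], X[:, unsel], y, lam1, lam2)
    return w, b, sel

def lufe_predict(X, w, b, sel):
    # Only the selected columns are used: no extra cost at test time.
    return np.sign(X[:, sel] @ w + b)
```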
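
The Open Datasets row notes that features for words shorter than five characters were removed. A minimal sketch of that filter, assuming a document-term matrix X with a parallel vocabulary list vocab (both hypothetical names):

```python
import numpy as np

def drop_short_words(X, vocab, min_len=5):
    """Keep only the columns whose word has at least min_len characters."""
    keep = np.array([len(word) >= min_len for word in vocab])
    return X[:, keep], [w for w, k in zip(vocab, keep) if k]
```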
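
The model-selection protocol in the last two rows is concrete enough to sketch: seven log-spaced values per parameter, a joint 7 × 7 = 49-point grid over (λ₁, λ₂), each combination scored by stratified 5-fold cross-validation. The use of lufe_fit/lufe_predict from the sketches above is an assumption; accuracy as the fold score is also assumed.

```python
import itertools
import numpy as np
from sklearn.model_selection import StratifiedKFold

LAMBDAS = np.logspace(-3, 3, 7)     # seven log-spaced values in {10^-3, ..., 10^3}

def select_lambdas(X, y, n_selected=300, seed=0):
    """Jointly grid-search (lambda1, lambda2) over 49 combinations,
    scoring each by stratified 5-fold cross-validation."""
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    best, best_acc = None, -np.inf
    for lam1, lam2 in itertools.product(LAMBDAS, LAMBDAS):
        fold_acc = []
        for tr, te in cv.split(X, y):
            w, b, sel = lufe_fit(X[tr], y[tr], n_selected, lam1, lam2)
            fold_acc.append(np.mean(lufe_predict(X[te], w, b, sel) == y[te]))
        if np.mean(fold_acc) > best_acc:
            best, best_acc = (lam1, lam2), float(np.mean(fold_acc))
    return best
```

The paper's 100 repeats per experimental setting would correspond to wrapping this selection in an outer loop with a fresh seed (and hence fresh folds) on each repeat.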