Top-k Supervised Feature Selection via ADMM for Integer Programming

Authors: Mingyu Fan, Xiaojun Chang, Xiaoqin Zhang, Di Wang, Liang Du

IJCAI 2017

Reproducibility Variable Result LLM Response
Research Type Experimental Experiments have been conducted on benchmark data sets to show the effectiveness of the proposed method. In this section, the proposed method is compared with state-of-the-art supervised feature selection methods on benchmark image data sets. The experiments include supervised classification by the Nearest Neighbor classifier (1-NN) and the Support Vector Machine (SVM) under various experimental settings. The numerical convergence analysis of the proposed method is also included.
Researcher Affiliation Academia Mingyu Fan (1), Xiaojun Chang (2), Xiaoqin Zhang (1), Di Wang (1), Liang Du (3). (1) School of Maths & Info. Science, Wenzhou University, Wenzhou 325035, China; (2) School of Computer Science, Carnegie Mellon University, PA 15213, USA; (3) School of Computer & Information Technology, Shanxi University, Taiyuan 030006, China
Pseudocode Yes Algorithm 1: ADMM for solving problem (7)
Input: data matrix X, label matrix Y, γ; A is initialized as the identity matrix I, v = 1_D, v_1 = v_2 = 0_D, ρ = 1, and µ = 1.05
Output: projection matrix A and vector v
1: while not converged do
2:   Update A^(t+1) as in (10);
3:   Update v^(t+1) as in (11);
4:   Update v_1^(t+1) and v_2^(t+1) through projections onto S_b and S_p as in (12);
5:   Update y_1^(t+1), y_2^(t+1), y_3^(t+1) and ρ as in Eq. (13).
6:   If not converged, set t ← t + 1.
7: end while
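Step 4 of Algorithm 1 relies on projections onto S_b and S_p, which this summary does not define. The sketch below assumes they follow the common lp-box ADMM construction, where S_b is the box [0, 1]^D and S_p is the sphere centred at (1/2)·1_D with radius sqrt(D)/2; it also assumes the penalty grows geometrically as ρ ← µρ. These definitions and the function names are illustrative assumptions, not the authors' Eqs. (12) and (13).

```python
import math

def project_box(x):
    """Project x onto the box [0, 1]^D (assumed form of S_b)."""
    return [min(1.0, max(0.0, xi)) for xi in x]

def project_sphere(x):
    """Project x onto {v : ||v - 0.5||_2 = sqrt(D)/2} (assumed form of S_p)."""
    d = len(x)
    radius = math.sqrt(d) / 2.0
    shifted = [xi - 0.5 for xi in x]
    norm = math.sqrt(sum(s * s for s in shifted))
    if norm == 0.0:
        # x is the sphere's centre, so every point on the sphere is equally
        # close; return the all-ones vector as an arbitrary choice.
        return [1.0] * d
    return [0.5 + radius * s / norm for s in shifted]

def grow_penalty(rho, mu=1.05):
    """Assumed penalty schedule: multiply rho by mu after each iteration."""
    return mu * rho

# Binary vectors lie in both sets, so both projections leave them unchanged.
binary = [1.0, 0.0, 1.0, 0.0]
boxed = project_box(binary)
on_sphere = project_sphere(binary)
```

Under these assumptions, the intersection of the two sets is exactly {0, 1}^D, which is why alternating the two projections can stand in for the integer constraint on v.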
Open Source Code Yes The Matlab code is published online (footnote 1). Footnote 1: https://github.com/cxj273/IJCAI2017_1274
Open Datasets Yes The Coil-20 data set (footnote 2) contains 1440 image samples from 20 classes and each image is transformed into a 1024-dimensional data point. There are 72 samples in each class. The MNIST handwritten digit image data set (footnote 3) has 6996 data points of digits 0-9. Each sample is a 784-dimensional feature vector. There are 2114 frontal-face images of 38 individuals in the Yale-B face image data set (footnote 4). Each image is stacked to a 1024-dimensional data vector. Footnote 2: http://www.cs.columbia.edu/CAVE/software/softlib/coil20.php Footnote 3: http://www.escience.cn/people/fpnie/ Footnote 4: http://www.cad.zju.edu.cn/home/dengcai/Data/FaceData.html
Dataset Splits No Given a data set, we randomly select p percent from each class to form the training data Xtrain and the remaining data are used as the test data. No explicit mention of a separate validation dataset split was found; the paper describes only a train/test split.
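The per-class random split described above can be sketched as follows. This is a minimal illustration, not the authors' Matlab code: the function name `stratified_split`, the fixed seed, and the rounding rule for the per-class training count are all assumptions, since the summary does not say how fractional counts are handled.

```python
import random
from collections import defaultdict

def stratified_split(labels, p, seed=0):
    """Return (train_idx, test_idx) with p percent of each class in training."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    train, test = [], []
    for y in sorted(by_class):
        idx = by_class[y][:]
        rng.shuffle(idx)
        # Rounding to the nearest integer is an assumption of this sketch.
        n_train = round(len(idx) * p / 100)
        train.extend(idx[:n_train])
        test.extend(idx[n_train:])
    return sorted(train), sorted(test)

# Toy labels shaped like Coil-20: 20 classes with 72 samples each.
labels = [c for c in range(20) for _ in range(72)]
train_idx, test_idx = stratified_split(labels, p=30)
```

Because the split is drawn independently inside each class, the class proportions of the full data set carry over to both the training and test parts.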
Hardware Specification No No specific hardware details (e.g., CPU or GPU models, memory size, or cloud instance types) used for running experiments were mentioned in the paper.
Software Dependencies No The paper mentions 'The Matlab code is published online' but does not specify the version of Matlab or any other software dependencies with version numbers.
Experiment Setup Yes The Spectral method requires the neighborhood size as a key parameter, which is tuned in the range {4, 6, 8, 10}. The regularization parameter λA for the RFS and DLSR methods is searched in the range {0.001, 0.05, 0.01, 0.02, 0.05, 0.1, 0.2, 0.5, 1}. To make our results reproducible, the regularization parameter γ = 0.2 is used for our method throughout the experiments. ... The percentage of labeled training data is p = 30. ... 50 percent of the data in each class are used as the training data (p = 50) and the top 200 features are utilized.
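The evaluation protocol above (keep the top-k features, then classify with 1-NN) can be illustrated with a small sketch. It assumes some feature selection method has already produced one score per feature; the helper names, the toy data, and the scores are hypothetical and chosen only to make the example self-contained.

```python
def top_k_indices(scores, k):
    """Indices of the k highest-scoring features, in ascending index order."""
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(order[:k])

def select_columns(rows, cols):
    """Keep only the given feature columns of each sample."""
    return [[row[j] for j in cols] for row in rows]

def one_nn_predict(train_x, train_y, test_x):
    """Label each test sample with the label of its nearest training sample."""
    preds = []
    for q in test_x:
        nearest = min(
            range(len(train_x)),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], q)),
        )
        preds.append(train_y[nearest])
    return preds

def accuracy(pred, truth):
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)

# Hypothetical toy data: features 0 and 2 carry the class signal,
# feature 1 is noise; the scores stand in for a selector's output.
train_x = [[0.0, 5.0, 0.1], [0.1, -3.0, 0.0], [1.0, 4.0, 0.9], [0.9, -4.0, 1.0]]
train_y = [0, 0, 1, 1]
test_x = [[0.05, -4.0, 0.05], [0.95, 5.0, 0.95]]
test_y = [0, 1]
scores = [0.9, 0.1, 0.8]

cols = top_k_indices(scores, k=2)
pred = one_nn_predict(select_columns(train_x, cols), train_y,
                      select_columns(test_x, cols))
acc = accuracy(pred, test_y)
```

In the reported setup k would be 200 and the inputs would be the image data sets; the SVM evaluation follows the same pattern with a different classifier in place of `one_nn_predict`.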