Clustering-Based Joint Feature Selection for Semantic Attribute Prediction
Authors: Lin Chen, Baoxin Li
IJCAI 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on both a synthetic dataset and multiple public benchmark datasets demonstrate that the proposed approach effectively captures the correlation among multiple attributes and significantly outperforms the state-of-the-art approaches. |
| Researcher Affiliation | Academia | Lin Chen and Baoxin Li, Arizona State University, Tempe, Arizona {lin.chen.cs, baoxin.li}@asu.edu |
| Pseudocode | Yes | Algorithm 1 Feature Selection Optimization; Algorithm 2 Cluster Assignment Estimation |
| Open Source Code | No | The paper does not provide concrete access to its own source code for the methodology described. |
| Open Datasets | Yes | We verify the feature selection capability on general multilabel datasets in this section. The experiment is conducted on 6 public benchmark feature selection datasets including one object image dataset COIL100 [COI, 1996], one handwritten digit image dataset USPS [Hull, 1994], one spoken letter speech dataset Isolet [Fanty and Cole, 1991], and three face image datasets Yale B [Georghiades et al., 2001], ORL [Samaria and Harter, 1994], and PIX10P. The statistics of the datasets are summarized in Table 2. PIX10P is publicly available from https://featureselection.asu.edu/datasets.php |
| Dataset Splits | Yes | In each setup, 50% of the samples are randomly selected for training and the remainder is used for testing. Specific constraints are imposed to ensure that the class labels of the training set are balanced. The whole experiment is conducted for 10 rounds and average accuracies are reported. For the SUN dataset, 60% of the categories are randomly split out as seen categories in each round, with the rest as unseen categories. During training, 50% of the samples are randomly and carefully drawn from each seen category to ensure the balance of positive and negative attribute labels. (A sketch of this split protocol follows the table.) |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers needed to replicate the experiment. |
| Experiment Setup | Yes | The proposed framework has three important parameters: one controlling the sparsity of W, β controlling the contribution of modeling label correlation, and γ controlling the global penalty. We study the effect of each parameter by fixing the others and observing how the performance of the proposed approach varies with the number of selected features. Figure 3 demonstrates the performance variance w.r.t. different parameters and the number of selected features. (A parameter-sweep sketch follows the table.) |
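
The training/testing protocol quoted in the Dataset Splits row can be illustrated with a minimal Python sketch. It assumes only NumPy; the helper names (`balanced_half_split`, `run_rounds`) and the balancing rule (taking half of each class for training) are illustrative assumptions, not the authors' released code.

```python
import numpy as np

def balanced_half_split(labels, rng):
    """Select 50% of the samples per class for training; the rest form the test set."""
    train_idx = []
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        rng.shuffle(idx)
        train_idx.extend(idx[: len(idx) // 2])  # half of each class keeps the training set balanced
    train_idx = np.sort(np.array(train_idx))
    test_idx = np.setdiff1d(np.arange(len(labels)), train_idx)
    return train_idx, test_idx

def run_rounds(features, labels, train_and_score, n_rounds=10, seed=0):
    """Repeat the random split for 10 rounds and report the average accuracy."""
    rng = np.random.default_rng(seed)
    scores = []
    for _ in range(n_rounds):
        tr, te = balanced_half_split(labels, rng)
        scores.append(train_and_score(features[tr], labels[tr], features[te], labels[te]))
    return float(np.mean(scores))
```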
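Similarly, the parameter study described in the Experiment Setup row amounts to a one-at-a-time sweep over the three regularization weights and the number of selected features. The sketch below is a hypothetical outline: the default values, candidate grids, the row-norm ranking of W, and the `fit_selector`/`evaluate` callables are assumptions standing in for the paper's optimization (Algorithm 1) and its downstream classifier.

```python
import numpy as np
from itertools import product

def top_k_features(W, k):
    """Rank features by the l2 norm of their row in W and keep the top-k indices."""
    return np.argsort(np.linalg.norm(W, axis=1))[::-1][:k]

def sensitivity_sweep(fit_selector, evaluate, data, labels):
    """Vary one parameter (sparsity, label correlation, global penalty) while fixing the others."""
    defaults = {"alpha": 1.0, "beta": 1.0, "gamma": 1.0}  # assumed default weights
    candidates = [1e-3, 1e-2, 1e-1, 1.0, 10.0]            # assumed candidate grid
    feature_counts = [50, 100, 150, 200]                  # assumed numbers of selected features
    results = {}
    for name in defaults:
        for value, k in product(candidates, feature_counts):
            params = {**defaults, name: value}
            W = fit_selector(data, labels, **params)      # placeholder for the paper's feature selection
            selected = top_k_features(W, k)
            results[(name, value, k)] = evaluate(data[:, selected], labels)
    return results
```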