reproducibilityindex.ai

Clustering-Based Joint Feature Selection for Semantic Attribute Prediction

Authors: Lin Chen, Baoxin Li

IJCAI 2016

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experiments on both a synthetic dataset and multiple public benchmark datasets demonstrate that the proposed approach effectively captures the correlation among multiple attributes and significantly outperforms the state-of-the-art approaches.
Researcher Affiliation	Academia	Lin Chen and Baoxin Li Arizona State University, Tempe Arizona {lin.chen.cs, baoxin.li}@asu.edu
Pseudocode	Yes	Algorithm 1 Feature Selection Optimization; Algorithm 2 Cluster Assignment Estimation
Open Source Code	No	The paper does not provide concrete access to its own source code for the methodology described.
Open Datasets	Yes	We verify the feature selection capability on general multilabel datasets in this section. The experiment is conducted on 6 public benchmark feature selection datasets, including one object image dataset, COIL100 [COI, 1996]; one handwritten digit image dataset, USPS [Hull, 1994]; one spoken letter speech dataset, Isolet [Fanty and Cole, 1991]; and three face image datasets, Yale B [Georghiades et al., 2001], ORL [Samaria and Harter, 1994] and PIX10P. The statistics of the datasets are summarized in Table 2. PIX10P is publicly available from https://featureselection.asu.edu/datasets.php
Dataset Splits	Yes	In each setup, 50% of the samples are randomly selected for training and the remainder is used for testing. Specific constraints are imposed to make sure the class labels of the training set are balanced. The whole experiment is conducted for 10 rounds and average accuracies are reported. For the SUN dataset, 60% of the categories are randomly split out as seen categories in each round, with the rest as unseen categories. During training, 50% of the samples are randomly and carefully drawn from each seen category to ensure the balance of the positive and negative attribute labels.
Hardware Specification	No	The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments.
Software Dependencies	No	The paper does not list specific software dependencies with version numbers needed to replicate the experiment.
Experiment Setup	Yes	The proposed framework has three important parameters: one controlling the sparsity of W, β controlling the contribution of modeling label correlation, and γ controlling the global penalty. We study the effect of each parameter by fixing the others to see how the performance of the proposed approach varies with the number of selected features. Figure 3 demonstrates the performance variance w.r.t. different parameters and the number of selected features.
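
The Dataset Splits row describes a 50/50 random split with balanced training labels, repeated for 10 rounds with averaged accuracy. The paper excerpt does not spell out the "specific constraints" used for balancing, so the following is only a minimal sketch under one assumption: half of each class's samples go to training. The names `balanced_half_split`, `average_accuracy`, and the `evaluate` callback are hypothetical and do not come from the paper.

```python
import random
from collections import defaultdict
from statistics import mean


def balanced_half_split(labels, rng):
    """Split sample indices so that half of each class lands in training.

    Assumption: per-class sampling stands in for the paper's unspecified
    balancing constraints.
    """
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label][:]
        rng.shuffle(indices)
        half = len(indices) // 2
        train_idx.extend(indices[:half])
        test_idx.extend(indices[half:])
    return sorted(train_idx), sorted(test_idx)


def average_accuracy(labels, evaluate, rounds=10, seed=0):
    """Repeat the split `rounds` times and average the reported score.

    `evaluate(train_idx, test_idx)` is a hypothetical callback that trains
    on the first index list and returns accuracy on the second.
    """
    rng = random.Random(seed)
    scores = []
    for _ in range(rounds):
        train_idx, test_idx = balanced_half_split(labels, rng)
        scores.append(evaluate(train_idx, test_idx))
    return mean(scores)
```

Seeding the generator once and drawing all 10 splits from it keeps a run repeatable while still giving a different split in each round.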