reproducibilityindex.ai

Structure Aware L1 Graph for Data Clustering

Authors: Shuchu Han, Hong Qin

AAAI 2016

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experimental Results: To evaluate the performance of our proposed algorithm, we examine it through spectral clustering applications and compare it to different graphs: the Gaussian similarity (GS) graph and the L1 graph. Six UCI datasets are selected. The clustering performance is measured by Normalized Mutual Information (NMI) and Accuracy (AC). In our experiment setting, we select α = 0.99 for manifold ranking, and K equal to 10%, 20%, and 30% of the total number of data samples. Our experimental results show that the SA-L1 graph generally has better clustering performance than the L1 graph.
Researcher Affiliation	Academia	Shuchu Han, Hong Qin; Computer Science Department, Stony Brook University, Stony Brook, NY 11790
Pseudocode	Yes	Algorithm 1: SA-L1 graph. Input: Data samples X = [x_1, x_2, ..., x_n], where x_i ∈ X; parameter K. Output: Adjacency matrix W of sparse graph. 1. Calculate the manifold ranking score matrix F; 2. Normalize the data sample x_i with ‖x_i‖_2 = 1; 3. for x_i ∈ X do 4. Select top K atoms from F(i), and build Φ̂^i; 5. Solve: min_{α_i} ‖α_i‖_1, s.t. x_i = Φ̂^i α_i, α_i ≥ 0; 6. W(i, :) = α_i; 7. end 8. return W;
Open Source Code	No	The paper does not provide concrete access to source code for the methodology described.
Open Datasets	Yes	Six UCI datasets are selected.
Dataset Splits	No	The paper mentions 'Six UCI datasets are selected' but does not provide specific data split information for training, validation, or test sets.
Hardware Specification	No	The paper does not provide specific hardware details (like GPU/CPU models or memory) used for running its experiments.
Software Dependencies	No	The paper does not provide specific ancillary software details with version numbers.
Experiment Setup	Yes	In our experiment setting, we select α = 0.99 for manifold ranking, and K equal to 10%, 20%, and 30% of the total number of data samples.
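
As a rough illustration of Algorithm 1 from the Pseudocode row, the following Python sketch builds an SA-L1 adjacency matrix with NumPy and SciPy. Several details are assumptions that the extracted text does not confirm: the manifold ranking matrix is taken to be the closed form F = (I - αS)^(-1) over a Gaussian affinity with a hypothetical default bandwidth `sigma`; samples are stored as rows of `X`; each sample is excluded from its own dictionary; and, because α_i ≥ 0 makes the L1 norm a plain sum, step 5 is solved as a linear program with `scipy.optimize.linprog`. The function names are illustrative and are not the authors' code.

```python
import numpy as np
from scipy.optimize import linprog


def manifold_ranking_scores(X, alpha=0.99, sigma=None):
    """Return F = (I - alpha * S)^-1 for a Gaussian affinity (assumed form)."""
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    if sigma is None:
        # Hypothetical default bandwidth: mean pairwise distance.
        sigma = np.sqrt(sq_dists).mean()
    affinity = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(affinity, 0.0)
    # Symmetric normalization S = D^-1/2 * A * D^-1/2.
    d_inv_sqrt = 1.0 / np.sqrt(affinity.sum(axis=1))
    S = d_inv_sqrt[:, None] * affinity * d_inv_sqrt[None, :]
    return np.linalg.inv(np.eye(len(X)) - alpha * S)


def sa_l1_graph(X, K, alpha=0.99, sigma=None):
    """Sketch of Algorithm 1; X holds one sample per row (n x d)."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    K = min(K, n - 1)  # a sample cannot be its own atom
    F = manifold_ranking_scores(X, alpha, sigma)        # step 1
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)   # step 2
    W = np.zeros((n, n))
    for i in range(n):                                  # step 3
        scores = F[i].copy()
        scores[i] = -np.inf                             # exclude x_i itself
        top = np.argsort(scores)[::-1][:K]              # step 4: top-K atoms
        Phi = Xn[top].T                                 # dictionary, d x K
        # Step 5: with alpha_i >= 0, ||alpha_i||_1 equals sum(alpha_i),
        # so the problem is a linear program.
        res = linprog(np.ones(K), A_eq=Phi, b_eq=Xn[i],
                      bounds=(0, None), method="highs")
        if res.success:
            W[i, top] = res.x                           # step 6
        # Assumption: if the exact constraint is infeasible,
        # row i is left as zeros.
    return W
```

To mirror the reported setup, `K` could be set to `int(0.1 * n)`, `int(0.2 * n)`, or `int(0.3 * n)` with `alpha=0.99`. The exact equality constraint can be infeasible for some samples, in which case this sketch leaves the corresponding row empty; a practical implementation would likely need a noise-tolerant formulation.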