Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Efficient One-Pass Multi-View Subspace Clustering with Consensus Anchors

Authors: Suyuan Liu, Siwei Wang, Pei Zhang, Kai Xu, Xinwang Liu, Changwang Zhang, Feng Gao

Pages: 7576-7584

AAAI 2022 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	To evaluate the performance of EOMSC-CA, we conduct experiments in this section. [...] Table 3 compares the clustering performance of the EOMSC-CA with other methods on nine benchmark datasets. [...] The run time of various algorithms on nine datasets are compared in Figure 2.
Researcher Affiliation	Academia	Suyuan Liu,1* Siwei Wang,1* Pei Zhang,1 Kai Xu,1 Xinwang Liu,1 Changwang Zhang,2 Feng Gao3 1 School of Computer, National University of Defense Technology, Changsha, China, 410073 2 CCF Theoretical Computer Science Technical Committee, Shenzhen, China, 518064 3 School of Arts, Peking University, Beijing, China, 100871
Pseudocode	Yes	Algorithm 1: Algorithm for optimizing Z. [...] Algorithm 2: EOMSC-CA
Open Source Code	Yes	Our code is publicly available at https://github.com/Tracesource/EOMSC-CA.
Open Datasets	Yes	We perform experiments on nine widely used multi-view benchmark datasets: ORL mtv, Caltech101-7, Mfeat, Caltech101-20, Caltech101-all, SUNRGBD, NUSWIDEOBJ, AWA, Youtube Face. The details of them are shown in Table 2. Specifically, ORL mtv contains 400 images in 40 classes. Caltech101-7 with 1474 instances in 7 categories and Caltech101-20 with 2386 subjects in 20 classes are both subsets of the image dataset Caltech101 (Fei-Fei, Fergus, and Perona 2004). Mfeat was generated from UCI machine learning repository, which consists of the digits from 0 to 9. SUNRGBD (Song, Lichtenberg, and Xiao 2015) consists of 10335 indoor scene images spread over 45 classes. NUSWIDEOBJ (Chua et al. 2009) is an object recognition database with 30000 objects. AWA contains 50 different animals with their six features. Youtube Face is produced from You Tube with 101499 instances.
Dataset Splits	No	The paper states 'We run 50 times k-means and report the best result' but does not specify detailed train/validation/test dataset splits, percentages, or absolute counts for reproduction.
Hardware Specification	Yes	All the experiments are performed on a desktop with Intel Core i9-10900X CPU and 64G RAM, MATLAB 2019b(64-bit).
Software Dependencies	Yes	All the experiments are performed on a desktop with Intel Core i9-10900X CPU and 64G RAM, MATLAB 2019b(64-bit).
Experiment Setup	Yes	The proposed method has no hyper-parameters to be tuned, but we need to determine the anchors number and the dimension of anchor matrix. In the experiments, the anchors number and the dimension of anchor matrix are both traverse [k, 2k, ..., 7k] where k is the clustering number of each dataset. For the compared algorithms, we search their best parameters for fairness. Moreover, we run 50 times k-means and report the best result.
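
The search described in the Experiment Setup row can be illustrated with a minimal Python sketch. It is not the authors' MATLAB code. It assumes the anchors number and the anchor-matrix dimension are traversed jointly as a full grid, which the quoted text does not confirm, and that "best result" means the highest value of some clustering score. The names `candidate_values`, `candidate_grid`, `best_of_restarts`, and `run_once` are hypothetical.

```python
import itertools
from typing import Callable, List, Tuple


def candidate_values(k: int, max_multiple: int = 7) -> List[int]:
    """Return [k, 2k, ..., max_multiple * k] for a dataset with k clusters."""
    return [i * k for i in range(1, max_multiple + 1)]


def candidate_grid(k: int) -> List[Tuple[int, int]]:
    """All (num_anchors, anchor_dim) pairs.

    Assumption: both quantities are searched as a full Cartesian grid.
    """
    values = candidate_values(k)
    return list(itertools.product(values, values))


def best_of_restarts(run_once: Callable[[int], float], n_restarts: int = 50) -> float:
    """Call run_once(seed) n_restarts times and keep the highest score.

    Assumption: run_once wraps one k-means run and returns a score
    where larger is better (for example, clustering accuracy).
    """
    return max(run_once(seed) for seed in range(n_restarts))


if __name__ == "__main__":
    k = 7  # Caltech101-7 has 7 categories
    grid = candidate_grid(k)
    print(len(grid), grid[0], grid[-1])  # 49 (7, 7) (49, 49)
```

Under the full-grid assumption this gives 49 configurations per dataset, each evaluated with 50 k-means restarts. If the two quantities were instead varied independently, the count would differ.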