Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

New l1-Norm Relaxations and Optimizations for Graph Clustering

Authors: Feiping Nie, Hua Wang, Cheng Deng, Xinbo Gao, Xuelong Li, Heng Huang

AAAI 2016

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	The extensive experiments have been performed on three two-way clustering and eight multi-way clustering benchmark data sets. All empirical results show that our new relaxation methods consistently enhance the normalized cut and ratio cut clustering results.
Researcher Affiliation	Academia	1 Department of Computer Science and Engineering, University of Texas at Arlington, USA; 2 Department of Electrical Engineering and Computer Science, Colorado School of Mines, USA; 3 School of Electronic Engineering, Xidian University, Xi'an, China; 4 Xi'an Institute of Optics and Precision Mechanics, Chinese Academy of Sciences, China
Pseudocode	Yes	Algorithm 1: Algorithm to solve the general problem (18). ... Algorithm 2: Algorithm to solve the problem (13).
Open Source Code	No	The paper does not provide any concrete access to source code or explicitly state that the code is open-source or available.
Open Datasets	Yes	Three benchmark data sets from UCI machine learning repository are used in our experiments, including hepatitis database with 155 instances and 20 attributes, ionosphere database with 351 instances and 34 attributes, breast cancer database with 286 instances and 9 attributes. ... Eight benchmark data sets are used in the experiments, including two UCI data sets, dermatology and ecoli, one object data set, COIL-20 (Nene, Nayar, and Murase 1996), one digit and character data sets, Binalpha, and four face data sets, Umist (Graham and Allinson 1998), AR (Martinez and Benavente 1998), Yale B (Georghiades, Belhumeur, and Kriegman 2001), and PIE (Sim and Baker 2003).
Dataset Splits	No	The paper discusses evaluation metrics and iterations but does not explicitly provide training, validation, or test dataset splits or cross-validation details.
Hardware Specification	No	The paper does not provide any specific hardware details used for running its experiments.
Software Dependencies	No	The paper does not provide specific ancillary software details with version numbers.
Experiment Setup	Yes	We construct nearest-neighbor graph for each data set following (Gu and Zhou 2009). ... Again, we construct nearest-neighbor graph for each data set and set the neighborhood size for graph construction as 10 (Gu and Zhou 2009). The dimension of PCA+K-means is searched from five candidates ranging from 10 to the dimension of data. ... To reduce statistical variety, we independently repeat all clustering algorithms for 50 times with random initializations, and then we report the results corresponding to the best objective values.
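
The protocol quoted in the Experiment Setup row (a nearest-neighbor graph with neighborhood size 10, then 50 randomly initialized runs with the best objective value reported) might be sketched as follows. This is an illustrative sketch, not the authors' code: the function names are hypothetical, the binary symmetric edge weights are an assumption (the quoted text does not state the weighting), and plain K-means stands in for the paper's normalized cut and ratio cut solvers purely to show the restart-and-select step. "Best" is taken to mean the lowest objective, which is also an assumption.

```python
import math
import random


def knn_graph(points, k=10):
    """Symmetric 0/1 k-nearest-neighbor adjacency matrix.

    Binary weights and "either endpoint" symmetrization are assumptions.
    """
    n = len(points)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        # Rank all other points by Euclidean distance to point i.
        order = sorted((j for j in range(n) if j != i),
                       key=lambda j: math.dist(points[i], points[j]))
        for j in order[:k]:
            adj[i][j] = adj[j][i] = 1
    return adj


def kmeans_once(points, n_clusters, rng, n_iter=100):
    """One K-means run from a random initialization (stand-in solver)."""
    centers = rng.sample(points, n_clusters)
    labels = None
    for _ in range(n_iter):
        new_labels = [min(range(n_clusters),
                          key=lambda c: math.dist(p, centers[c]))
                      for p in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(n_clusters):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # leave an empty cluster's center where it is
                centers[c] = tuple(sum(x) / len(members)
                                   for x in zip(*members))
    # Sum of squared distances from each point to its assigned center.
    objective = sum(math.dist(p, centers[lab]) ** 2
                    for p, lab in zip(points, labels))
    return objective, labels


def best_of_restarts(points, n_clusters, n_restarts=50, seed=0):
    """Repeat the solver and keep the run with the lowest objective."""
    rng = random.Random(seed)
    runs = (kmeans_once(points, n_clusters, rng) for _ in range(n_restarts))
    return min(runs, key=lambda run: run[0])
```

Fixing the seed makes the 50 restarts repeatable. The page records hardware, software versions, and dataset splits as unreported, so a sketch like this cannot be expected to reproduce the paper's numbers.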