Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Online Clustering with Nearly Optimal Consistency
Authors: T-H. Hubert Chan, Shaofeng Jiang, Tianyi Wu, Mengshi Zhao
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate the performance of our algorithm on real datasets by plugging in the practically efficient k-MEANS++ algorithm. Our online algorithm makes k-MEANS++ achieve good consistency with little overhead to the quality of solutions. ... We validate (in Section 5) the performance for this new consistent k-MEANS++ on 3 real datasets, and compare with the vanilla k-MEANS++ as well as a previous algorithm (Lattanzi & Vassilvitskii, 2017), as baselines. Our experiments show that with a cost similar to k-MEANS++ and the algorithm in Lattanzi & Vassilvitskii (2017), our algorithm achieves much lower consistency. ... In this section, we present experimental results evaluating the performance of our algorithms. ... We depict the consistency and cost curves in Figures 1 and 2 for k = 10, comparing the performance of the algorithms naive, LV17, ours-faithful, and ours-heuristic, as described earlier. |
| Researcher Affiliation | Academia | T-H. Hubert Chan, The University of Hong Kong. Shaofeng H.-C. Jiang, Peking University. Tianyi Wu, Peking University. Mengshi Zhao, The University of Hong Kong. |
| Pseudocode | Yes | Algorithm 1: Consistent online algorithm for (k, z)-CLUSTERING. Algorithm 2: Robustify, on weighted point set P, center set C and a center point c ∈ C. Algorithm 3: Make Robust, on weighted point set P and center set C with witness mapping wit_C. |
| Open Source Code | No | The paper does not provide an explicit statement about the release of source code for the methodology described, nor does it include a link to a code repository. |
| Open Datasets | Yes | Our experiment is conducted on the SKIN (Bhatt & Dhall, 2009), SHUTTLE (Catlett), and COVERTYPE (Blackard, 1998) datasets from the publicly available UCI repository, which were also used in the experiments of previous works such as (Lattanzi & Vassilvitskii, 2017). DOI: https://doi.org/10.24432/C5T30C. DOI: https://doi.org/10.24432/C5WS31. DOI: https://doi.org/10.24432/C50K5N. |
| Dataset Splits | No | The paper mentions the total number of points for each dataset (e.g., 'The SKIN dataset consists of 245,057 points') and coreset sizes ('Our experiments employ coreset sizes ranging from 1000 to 2000 elements'), but it does not specify any training, testing, or validation splits for these datasets. |
| Hardware Specification | No | The paper discusses theoretical running time complexity and observed performance (e.g., 'ours-faithful still runs in about 0.02 seconds per insertion'), but it does not provide specific details about the hardware used for running the experiments (e.g., CPU, GPU models, memory). |
| Software Dependencies | No | The paper mentions the use of 'k-MEANS++ (Arthur & Vassilvitskii, 2007)' as an algorithm integrated into their framework, but it does not list any specific software dependencies with version numbers (e.g., programming languages, libraries, frameworks). |
| Experiment Setup | No | The paper provides a high-level description of the experimental approach, such as using an online coreset construction and integrating k-MEANS++ into their framework. It mentions 'coreset sizes ranging from 1000 to 2000 elements'. However, it does not detail specific hyperparameters (e.g., learning rate, batch size, number of epochs) or system-level training settings. |
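For context on the metric the paper optimizes, "consistency" in this setting counts how often the maintained center set changes as points arrive online. The sketch below is illustrative only and is not the paper's algorithm: it implements standard k-MEANS++ D²-seeding (Arthur & Vassilvitskii, 2007) with NumPy and, as the naive baseline would, reseeds from scratch after each insertion while counting changed centers. The function names `kmeans_pp_seed` and `count_center_changes` are hypothetical.

```python
import numpy as np

def kmeans_pp_seed(points, k, rng):
    """Pick k initial centers via D^2 sampling (standard k-means++ seeding)."""
    centers = [points[rng.integers(len(points))]]
    for _ in range(k - 1):
        # Squared distance from each point to its nearest chosen center.
        d2 = np.min(
            ((points[:, None, :] - np.asarray(centers)[None, :, :]) ** 2).sum(-1),
            axis=1,
        )
        # Sample the next center proportionally to squared distance.
        centers.append(points[rng.choice(len(points), p=d2 / d2.sum())])
    return np.asarray(centers)

def count_center_changes(old, new, tol=1e-9):
    """Count centers in `new` with no near-duplicate in `old` (crude consistency proxy)."""
    return sum(
        1 for c in new if np.min(((old - c) ** 2).sum(-1)) > tol
    )

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    stream = rng.normal(size=(200, 2))  # synthetic 2-D point stream
    k = 5
    centers = kmeans_pp_seed(stream[:50], k, rng)
    total_changes = 0
    for t in range(50, 200):
        # Naive baseline: recompute the solution from scratch on every insertion.
        new_centers = kmeans_pp_seed(stream[: t + 1], k, rng)
        total_changes += count_center_changes(centers, new_centers)
        centers = new_centers
    print("total center changes:", total_changes)
```

The paper's contribution is precisely to avoid this naive behavior: its online framework keeps the number of center changes per insertion small while retaining a clustering cost comparable to k-MEANS++.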