reproducibilityindex.ai

Near-Optimal $k$-Clustering in the Sliding Window Model

Authors: David Woodruff, Peilin Zhong, Samson Zhou

NeurIPS 2023

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In this section, we conduct simple empirical demonstrations as proof-of-concepts to illustrate the benefits of our algorithm. Our empirical evaluations were conducted using Python 3.10 using a 64-bit operating system on an AMD Ryzen 7 5700U CPU, with 8GB RAM and 8 cores with base clock 1.80 GHz.
Researcher Affiliation	Collaboration	David P. Woodruff, CMU, dwoodruf@cs.cmu.edu; Peilin Zhong, Google Research, peilinz@google.com; Samson Zhou, Texas A&M University, samsonzhou@gmail.com
Pseudocode	Yes	Algorithm 1: RINGSAMPLE; Algorithm 2: Merge-and-reduce framework for randomized algorithms in the sliding window model, using randomized constructions of online coresets
Open Source Code	No	The paper does not provide any statements about releasing code for the described methodology or links to a code repository.
Open Datasets	Yes	The first component of our dataset consists of the points of the SKIN (Skin Segmentation) dataset X from the publicly available UCI repository [6], which was also used in the experiments of [8].
Dataset Splits	No	The paper describes aspects of the experimental setup such as iterations and initialization methods, and states the ranges of m and k values tested. However, it does not specify explicit training, validation, or test dataset splits (e.g., percentages or absolute counts).
Hardware Specification	Yes	Our empirical evaluations were conducted using Python 3.10 using a 64-bit operating system on an AMD Ryzen 7 5700U CPU, with 8GB RAM and 8 cores with base clock 1.80 GHz.
Software Dependencies	No	The paper mentions 'Python 3.10', but it does not list multiple key software components with their specific version numbers or a self-contained solver with a version number, which is required for a reproducible description.
Experiment Setup	Yes	For each of the instances of Lloyd's algorithm, either on the entire dataset X or the sampled coreset C, we use 10 iterations using the k-means++ initialization.
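
Since no code was released, the setup quoted in the Experiment Setup row can only be approximated. The following is a minimal sketch, not the authors' implementation: it runs 10 Lloyd iterations from a k-means++ seeding using NumPy only. The function names (`kmeans_pp_init`, `lloyd`, `kmeans_cost`), the synthetic data, and the uniform weighted sample standing in for the coreset C are all assumptions made for illustration; the paper's coreset construction (RINGSAMPLE with merge-and-reduce) is not reproduced here.

```python
import numpy as np

def kmeans_pp_init(X, k, rng, w):
    """k-means++ seeding: sample each new center with probability ~ w * D^2."""
    n = X.shape[0]
    centers = [X[rng.choice(n, p=w / w.sum())]]
    d2 = np.sum((X - centers[0]) ** 2, axis=1)
    for _ in range(1, k):
        probs = w * d2
        total = probs.sum()
        # Fall back to weight-proportional sampling if every point sits on a center.
        probs = probs / total if total > 0 else w / w.sum()
        centers.append(X[rng.choice(n, p=probs)])
        d2 = np.minimum(d2, np.sum((X - centers[-1]) ** 2, axis=1))
    return np.array(centers, dtype=float)

def lloyd(X, k, iters=10, weights=None, seed=0):
    """Weighted Lloyd's algorithm with k-means++ initialization."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    w = np.ones(len(X)) if weights is None else np.asarray(weights, dtype=float)
    centers = kmeans_pp_init(X, k, rng, w)
    for _ in range(iters):
        # Assignment step: nearest center for every point.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Update step: weighted mean of each non-empty cluster.
        for j in range(k):
            mask = labels == j
            if w[mask].sum() > 0:
                centers[j] = np.average(X[mask], axis=0, weights=w[mask])
    return centers

def kmeans_cost(X, centers, weights=None):
    """Weighted sum of squared distances to the nearest center."""
    X = np.asarray(X, dtype=float)
    w = np.ones(len(X)) if weights is None else np.asarray(weights, dtype=float)
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return float((w * d2.min(axis=1)).sum())

# Synthetic stand-in data (assumption): three Gaussian blobs, NOT the SKIN dataset.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(loc=c, scale=0.5, size=(500, 2)) for c in (0.0, 5.0, 10.0)])

# Uniform weighted sample as a placeholder for the coreset C (assumption).
m = 150
idx = rng.choice(len(X), size=m, replace=False)
C, C_weights = X[idx], np.full(m, len(X) / m)

centers_full = lloyd(X, k=3, iters=10)
centers_core = lloyd(C, k=3, iters=10, weights=C_weights)

# Both solutions are evaluated on the full dataset X.
print("cost (centers from X):", kmeans_cost(X, centers_full))
print("cost (centers from C):", kmeans_cost(X, centers_core))
```

Evaluating both sets of centers on the full dataset X mirrors the comparison described in the row above; the fixed `seed` argument only makes the sketch repeatable and is not something the paper specifies.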