reproducibilityindex.ai

BanditPAM: Almost Linear Time k-Medoids Clustering via Multi-Armed Bandits

Authors: Mo Tiwari, Martin J. Zhang, James Mayclin, Sebastian Thrun, Chris Piech, Ilan Shomorony

NeurIPS 2020

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We empirically validate our results on several large real-world datasets, including a coding exercise submissions dataset from Code.org, the 10x Genomics 68k PBMC single-cell RNA sequencing dataset, and the MNIST handwritten digits dataset.
Researcher Affiliation	Academia	Mo Tiwari, Department of Computer Science, Stanford University (motiwari@stanford.edu); Martin Jinye Zhang, Department of Epidemiology, Harvard T.H. Chan School of Public Health (jinyezhang@hsph.harvard.edu); James Mayclin, Department of Computer Science, Stanford University (jmayclin@stanford.edu); Sebastian Thrun, Department of Computer Science, Stanford University (thrun@stanford.edu); Chris Piech, Department of Computer Science, Stanford University (piech@cs.stanford.edu); Ilan Shomorony, Electrical and Computer Engineering, University of Illinois at Urbana-Champaign (ilans@illinois.edu)
Pseudocode	Yes	Algorithm 1 Adaptive-Search(S_tar, S_ref, g_x(·), B, δ, σ_x)
Open Source Code	Yes	We also release highly optimized Python and C++ implementations of our algorithm (footnote 1). Footnote 1: https://github.com/ThrunGroup/BanditPAM
Open Datasets	Yes	The MNIST dataset [26] consists of 70,000 black-and-white images of handwritten digits... The HOC4 dataset from Code.org [11] consists of 3,360 unique solutions to a block-based programming exercise. [11] Code.org. Research at code.org. In https://code.org/research, 2013.
Dataset Splits	No	The paper discusses the datasets used but does not provide specific details on how they were split into training, validation, or test sets.
Hardware Specification	No	The paper does not provide specific details about the hardware used to run the experiments, such as GPU or CPU models.
Software Dependencies	No	The paper mentions that the algorithm is implemented in Python and C++ but does not provide specific version numbers for these languages or any other software dependencies.
Experiment Setup	Yes	In all experiments, the batch size B is set to 100 and the error probability δ is set to δ = 1/(1000|S_tar|).
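
The Experiment Setup row fixes two hyperparameters: the batch size B = 100 and the error probability δ = 1/(1000|S_tar|), where |S_tar| is the number of target points passed to Adaptive-Search. The minimal Python sketch below only illustrates that arithmetic. The class name `AdaptiveSearchSettings`, its method, and the target-set size of 10,000 are hypothetical and chosen for illustration; none of them come from the paper or from the released BanditPAM code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AdaptiveSearchSettings:
    """Hypothetical container for the two reported hyperparameters."""

    # Batch size B, reported as 100 in all experiments.
    batch_size: int = 100

    def error_probability(self, num_targets: int) -> float:
        """Return delta = 1 / (1000 * |S_tar|) for a target set of the given size."""
        if num_targets <= 0:
            raise ValueError("num_targets must be a positive integer")
        return 1.0 / (1000 * num_targets)


settings = AdaptiveSearchSettings()

# Illustrative target-set size (an assumption, not a value from the paper).
example_num_targets = 10_000
delta = settings.error_probability(example_num_targets)

print(f"B = {settings.batch_size}")
print(f"delta for |S_tar| = {example_num_targets}: {delta:.3e}")
```

Because δ scales inversely with |S_tar|, a larger target set yields a smaller per-call error probability under this setting.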