BanditPAM: Almost Linear Time k-Medoids Clustering via Multi-Armed Bandits

Authors: Mo Tiwari, Martin J. Zhang, James Mayclin, Sebastian Thrun, Chris Piech, Ilan Shomorony

NeurIPS 2020

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | We empirically validate our results on several large real-world datasets, including a coding exercise submissions dataset from Code.org, the 10x Genomics 68k PBMC single-cell RNA sequencing dataset, and the MNIST handwritten digits dataset. |
| Researcher Affiliation | Academia | Mo Tiwari, Department of Computer Science, Stanford University, motiwari@stanford.edu; Martin Jinye Zhang, Department of Epidemiology, Harvard T.H. Chan School of Public Health, jinyezhang@hsph.harvard.edu; James Mayclin, Department of Computer Science, Stanford University, jmayclin@stanford.edu; Sebastian Thrun, Department of Computer Science, Stanford University, thrun@stanford.edu; Chris Piech, Department of Computer Science, Stanford University, piech@cs.stanford.edu; Ilan Shomorony, Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, ilans@illinois.edu |
| Pseudocode | Yes | Algorithm 1 Adaptive-Search($S_{\text{tar}}$, $S_{\text{ref}}$, $g_x(\cdot)$, $B$, $\delta$, $\sigma_x$) |
| Open Source Code | Yes | We also release highly optimized Python and C++ implementations of our algorithm¹. ¹ https://github.com/ThrunGroup/BanditPAM |
| Open Datasets | Yes | The MNIST dataset [26] consists of 70,000 black-and-white images of handwritten digits... The HOC4 dataset from Code.org [11] consists of 3,360 unique solutions to a block-based programming exercise. [11] Code.org. Research at code.org. In https://code.org/research, 2013. |
| Dataset Splits | No | The paper discusses the datasets used but does not provide specific details on how they were split into training, validation, or test sets. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as GPU or CPU models. |
| Software Dependencies | No | The paper mentions that the algorithm is implemented in Python and C++ but does not provide specific version numbers for these languages or any other software dependencies. |
| Experiment Setup | Yes | In all experiments, the batch size $B$ is set to 100 and the error probability $\delta$ is set to $\delta = \frac{1}{1000\,\lvert S_{\text{tar}}\rvert}$. |
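
To make the Adaptive-Search signature and the experiment settings in the table more concrete, here is a minimal Python sketch of a batched successive-elimination search over a target set. It is an illustration under assumptions, not the authors' released implementation: the function name `adaptive_search`, the confidence radius formula, the single shared `sigma` (standing in for the per-target σx), and the exact-computation fallback are all choices made for this sketch. The defaults follow the table: batch size B = 100 and δ = 1/(1000 · |S_tar|).

```python
import math
import random


def adaptive_search(targets, refs, g, batch_size=100, delta=None, sigma=1.0, seed=0):
    """Sketch: find the target x minimizing the mean of g(x, y) over y in refs.

    Assumptions of this sketch (not taken from the summary table):
    - one shared noise scale `sigma` instead of a per-target sigma_x;
    - confidence radius sigma * sqrt(log(1/delta) / n_samples);
    - exact computation over the survivors if sampling does not isolate a winner.
    """
    rng = random.Random(seed)
    if delta is None:
        # Error probability setting quoted in the table: 1 / (1000 * |S_tar|).
        delta = 1.0 / (1000 * len(targets))

    candidates = list(range(len(targets)))  # indices of targets still in play
    mean = [0.0] * len(targets)             # running estimate per target
    n_used = 0                              # reference samples drawn so far

    while n_used < len(refs) and len(candidates) > 1:
        # Draw one batch of reference points, with replacement, shared by all candidates.
        batch = [rng.choice(refs) for _ in range(batch_size)]
        n_new = n_used + batch_size
        radius = sigma * math.sqrt(math.log(1.0 / delta) / n_new)

        # Update the running mean of every surviving candidate.
        for i in candidates:
            batch_sum = sum(g(targets[i], y) for y in batch)
            mean[i] = (n_used * mean[i] + batch_sum) / n_new

        # Keep candidates whose lower bound does not exceed the best upper bound.
        best_upper = min(mean[i] + radius for i in candidates)
        candidates = [i for i in candidates if mean[i] - radius <= best_upper]
        n_used = n_new

    if len(candidates) == 1:
        return targets[candidates[0]]

    # Sampling budget exhausted: score the remaining candidates exactly.
    return min((targets[i] for i in candidates),
               key=lambda x: sum(g(x, y) for y in refs))


if __name__ == "__main__":
    # Toy 1-D example: pick the candidate point closest, on average, to 0..1999.
    refs = list(range(2000))
    targets = [0, 500, 1000, 1500, 1999]
    print(adaptive_search(targets, refs, lambda x, y: abs(x - y), sigma=600.0))
```

In this sketch `sigma` has to be on the scale of the values returned by `g`; setting it too small can eliminate good candidates early, while setting it too large only costs extra samples before the exact fallback takes over.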