BanditPAM: Almost Linear Time k-Medoids Clustering via Multi-Armed Bandits
Authors: Mo Tiwari, Martin J. Zhang, James Mayclin, Sebastian Thrun, Chris Piech, Ilan Shomorony
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically validate our results on several large real-world datasets, including a coding exercise submissions dataset from Code.org, the 10x Genomics 68k PBMC single-cell RNA sequencing dataset, and the MNIST handwritten digits dataset. |
| Researcher Affiliation | Academia | Mo Tiwari, Department of Computer Science, Stanford University, motiwari@stanford.edu; Martin Jinye Zhang, Department of Epidemiology, Harvard T.H. Chan School of Public Health, jinyezhang@hsph.harvard.edu; James Mayclin, Department of Computer Science, Stanford University, jmayclin@stanford.edu; Sebastian Thrun, Department of Computer Science, Stanford University, thrun@stanford.edu; Chris Piech, Department of Computer Science, Stanford University, piech@cs.stanford.edu; Ilan Shomorony, Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, ilans@illinois.edu |
| Pseudocode | Yes | Algorithm 1 Adaptive-Search ($\mathcal{S}_{\text{tar}}$, $\mathcal{S}_{\text{ref}}$, $g_x(\cdot)$, $B$, $\delta$, $\sigma_x$) |
| Open Source Code | Yes | We also release highly optimized Python and C++ implementations of our algorithm. Footnote 1: https://github.com/ThrunGroup/BanditPAM |
| Open Datasets | Yes | The MNIST dataset [26] consists of 70,000 black-and-white images of handwritten digits... The HOC4 dataset from Code.org [11] consists of 3,360 unique solutions to a block-based programming exercise. [11] Code.org. Research at code.org. In https://code.org/research, 2013. |
| Dataset Splits | No | The paper discusses the datasets used but does not provide specific details on how they were split into training, validation, or test sets. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as GPU or CPU models. |
| Software Dependencies | No | The paper mentions that the algorithm is implemented in Python and C++ but does not provide specific version numbers for these languages or any other software dependencies. |
| Experiment Setup | Yes | In all experiments, the batch size $B$ is set to 100 and the error probability $\delta$ is set to $\delta = \frac{1}{1000\,|\mathcal{S}_{\text{tar}}|}$. |
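
The Experiment Setup row fixes two hyperparameters: a batch size of 100 and an error probability that shrinks as the target set grows. The following is a minimal sketch of that arithmetic only, not the authors' implementation. The names `BATCH_SIZE` and `error_probability` are hypothetical, and the target-set sizes in the loop are illustrative values chosen for this example.

```python
# Sketch of the hyperparameter settings quoted in the Experiment Setup row.
# BATCH_SIZE and error_probability are hypothetical names for illustration;
# they are not taken from the BanditPAM code base.

BATCH_SIZE = 100  # B, the batch size reported for all experiments


def error_probability(num_targets: int) -> float:
    """Return delta = 1 / (1000 * |S_tar|) for a target set of the given size."""
    if num_targets <= 0:
        raise ValueError("the target set must contain at least one point")
    return 1.0 / (1000 * num_targets)


if __name__ == "__main__":
    # Illustrative target-set sizes (assumed for this example only).
    for num_targets in (10, 1_000, 70_000):
        print(f"|S_tar| = {num_targets}: delta = {error_probability(num_targets):.3e}")
```

Because the error probability is inversely proportional to the size of the target set, larger target sets receive a proportionally smaller per-call error probability.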