Query K-means Clustering and the Double Dixie Cup Problem
Authors: I Chien, Chao Pan, Olgica Milenkovic
NeurIPS 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We illustrate the performance of the proposed algorithm both on synthetic and real datasets, including MNIST and CIFAR 10. We experimentally tested the proposed algorithms on synthetic and real datasets in terms of the approximation accuracy for the potential function, query complexity and the misclassification ratio |
| Researcher Affiliation | Academia | I (Eli) Chien Department ECE UIUC ichien3@illinois.edu Chao Pan Department ECE UIUC chaopan2@illinois.edu Olgica Milenkovic Department ECE UIUC milenkov@illinois.edu |
| Pseudocode | Yes | Algorithm 1: Approximate Noiseless Query K-means Clustering. Algorithm 2: Approximate Noisy Query K-means Clustering with Outliers |
| Open Source Code | No | The paper does not contain any explicit statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | We use the following two image classification datasets for which the ground-truth clusters are known... 1) The well-known MNIST dataset [26] comprises 60, 000 training and 10, 000 test images of handwritten digits... 2) The CIFAR-10 dataset [27] contains 60, 000 color images... |
| Dataset Splits | No | The paper mentions '60,000 training and 10,000 test images' for MNIST, but it does not specify a validation set or cross-validation strategy for any of the datasets used. |
| Hardware Specification | No | The paper describes the experimental setup and parameters but does not specify any hardware details (e.g., GPU/CPU models, memory, or specific computing platforms) used for running the experiments. |
| Software Dependencies | No | The paper describes the algorithms and experiments but does not provide specific software dependencies with version numbers. |
| Experiment Setup | Yes | The parameters are d = 20, K = [2 : 20], α = [1, 6], σi = [0, 2], δ = ϵ = 0.2, po = pe = 0.05. Figures ((a), (e)) plot the potential, Figures ((b), (f)) the query complexity, and Figures ((c), (g)) the misclassification ratio. |