Clustering Stable Instances of Euclidean k-means.
Authors: Aravindan Vijayaraghavan, Abhratanu Dutta, Alex Wang
NeurIPS 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate Algorithm 3.1 on multiple real world datasets and compare its performance to the performance of k-means++, and also check how well these datasets satisfy our geometric conditions. Table 1: Comparison of k-means cost for Alg 3.1 and k-means++ |
| Researcher Affiliation | Academia | Abhratanu Dutta Northwestern University adutta@u.northwestern.edu Aravindan Vijayaraghavan Northwestern University aravindv@northwestern.edu Alex Wang Carnegie Mellon University alexwang@u.northwestern.edu |
| Pseudocode | Yes | Algorithm 3.1. Input: $X = \{x_1, \dots, x_n\}$, $k$. 1: for all pairs $a, b$ of distinct points in $\{x_i\}$ do 2: Let $r = \lVert a - b \rVert$ be our guess for $\rho$ 3: procedure INITIALIZE 4: Create graph $G$ on vertex set $\{x_1, \dots, x_n\}$ where $x_i$ and $x_j$ have an edge iff $\lVert x_i - x_j \rVert < r$ 5: Let $a_1, \dots, a_k \in \mathbb{R}^d$ where $a_i$ is the mean of the $i$th largest connected component of $G$ 6: procedure ASSIGN 7: Let $C_1, \dots, C_k$ be the clusters obtained by assigning each point in $X$ to the closest $a_i$ 8: Calculate the k-means objective of $C_1, \dots, C_k$ 9: Return clustering with smallest k-means objective found above |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described. |
| Open Datasets | Yes | Experiments were run on unnormalized and normalized versions of four labeled datasets from the UCI Machine Learning Repository: Wine (n = 178, k = 3, d = 13), Iris (n = 150, k = 3, d = 4), Banknote Authentication (n = 1372, k = 2, d = 5), and Letter Recognition (n = 20,000, k = 26, d = 16). |
| Dataset Splits | No | The paper does not provide specific dataset split information (exact percentages, sample counts, citations to predefined splits, or detailed splitting methodology) needed to reproduce the data partitioning. |
| Hardware Specification | No | The paper does not provide specific hardware details used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiment. |
| Experiment Setup | No | The paper does not contain specific experimental setup details (concrete hyperparameter values, training configurations, or system-level settings) in the main text. |
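
Since no source code is released, the steps quoted in the Pseudocode row can be sketched in plain Python as follows. This is a minimal, unoptimized reading of Algorithm 3.1, not the authors' implementation. The function names (`components`, `mean`, `kmeans_cost`, `cluster_stable`) are hypothetical, and skipping any guess `r` whose graph has fewer than `k` connected components is an assumption the quoted pseudocode does not spell out.

```python
import itertools
import math


def components(points, r):
    """Connected components of the graph with an edge iff distance < r."""
    n = len(points)
    label = [-1] * n
    comps = []
    for start in range(n):
        if label[start] != -1:
            continue
        label[start] = len(comps)
        comp, stack = [start], [start]
        while stack:  # depth-first search from `start`
            u = stack.pop()
            for v in range(n):
                if label[v] == -1 and math.dist(points[u], points[v]) < r:
                    label[v] = label[start]
                    comp.append(v)
                    stack.append(v)
        comps.append(comp)
    return comps


def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return [sum(col) / len(points) for col in zip(*points)]


def kmeans_cost(points, assign, k):
    """Sum of squared distances from each point to its cluster's mean."""
    cost = 0.0
    for c in range(k):
        members = [p for p, a in zip(points, assign) if a == c]
        if members:
            mu = mean(members)
            cost += sum(math.dist(p, mu) ** 2 for p in members)
    return cost


def cluster_stable(points, k):
    """Try every pairwise distance as the guess r; keep the cheapest clustering."""
    best_cost, best_assign = math.inf, None
    for a, b in itertools.combinations(points, 2):
        r = math.dist(a, b)  # step 2: guess for rho
        comps = components(points, r)  # step 4: threshold graph
        if len(comps) < k:
            continue  # assumption: skip guesses with fewer than k components
        comps.sort(key=len, reverse=True)
        # step 5: means of the k largest components
        centers = [mean([points[i] for i in comp]) for comp in comps[:k]]
        # step 7: assign each point to its closest center
        assign = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        cost = kmeans_cost(points, assign, k)  # step 8
        if cost < best_cost:
            best_cost, best_assign = cost, assign
    return best_assign, best_cost
```

Calling `cluster_stable(points, k)` with a list of coordinate tuples returns a list of cluster indices and the corresponding k-means cost. The sketch rebuilds the graph for each of the O(n²) guesses, so it is only practical for small inputs such as Iris or Wine; a dataset the size of Letter Recognition would need a more careful implementation.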