Refined Learning Bounds for Kernel and Approximate $k$-Means
Authors: Yong Liu
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In Section 5, we validate our theoretical findings by performing experiments on both simulated and real data. |
| Researcher Affiliation | Academia | Yong Liu (1, 2); (1) Gaoling School of Artificial Intelligence, Renmin University of China, Beijing, China; (2) Beijing Key Laboratory of Big Data Management and Analysis Methods, Beijing, China; liuyonggsai@ruc.edu.cn |
| Pseudocode | Yes | For completeness, we briefly describe the improved k-means++ in the following; please refer to [25] for more details. 1: If $\lvert C \rvert < k$, sample a point $x \in S$ with probability $\frac{\mathrm{cost}(\{\psi(x)\}, C)}{\sum_{x \in S} \mathrm{cost}(\{\psi(x)\}, C)}$, where $\mathrm{cost}(P, C) = \sum_{x_i \in P} \min_{c \in C} \lVert \Phi_i - c \rVert$, and add $\psi(x)$ to $C$. 2: If $\lvert C \rvert \geq k$, sample $x \in S$ with probability $\frac{\mathrm{cost}(\{\psi(x)\}, C)}{\sum_{x \in S} \mathrm{cost}(\{\psi(x)\}, C)}$ and check whether there exists a point $c \in C$ such that $\mathrm{cost}(S, C \setminus \{c\} \cup \{\psi(x)\}) < \mathrm{cost}(S, C)$. If this is the case, we replace c by the point in C that reduces the cost function by the largest amount. |
| Open Source Code | No | The paper does not provide a direct link or explicit statement about the availability of its own source code. |
| Open Datasets | Yes | We use 6 publicly available datasets, dna, segment, mushrooms, mnist, skin-nonskin and covtype, from the LIBSVM Data. |
| Dataset Splits | Yes | We generate $\sum_{i=1}^{k} \lvert C_i \rvert$ samples of k clustering centers for training and 10,000 samples for testing. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU, GPU models, or memory) used for running the experiments. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., Python, PyTorch, or specific libraries with their versions). |
| Experiment Setup | No | The paper mentions using a "Gaussian kernel $\kappa(x, x') = \exp(-\lVert x - x' \rVert^2 / \sigma^2)$" but does not specify the value of σ (sigma) or any other hyperparameters for the kernel or for Lloyd's algorithm used in the experiments, making it not fully reproducible. |
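
The two seeding steps quoted in the Pseudocode row, together with the Gaussian kernel from the Experiment Setup row, can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the function names (`gaussian_kernel`, `point_costs`, `seed_with_local_search`), the `n_swaps` budget, and the default `sigma=1.0` are hypothetical, since the paper does not report σ. The sketch also assumes squared feature-space distances (the usual k-means convention), a uniformly sampled first center, and centers represented as indices of sampled points so that every distance can be read off the kernel matrix.

```python
import numpy as np


def gaussian_kernel(X, Y, sigma=1.0):
    """Kernel matrix with entries exp(-||x - y||^2 / sigma^2).

    sigma=1.0 is a placeholder; the paper does not state its value.
    """
    sq = (X ** 2).sum(axis=1)[:, None] + (Y ** 2).sum(axis=1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-np.maximum(sq, 0.0) / sigma ** 2)


def point_costs(K, centers):
    """Squared feature-space distance from each point to its nearest center.

    Uses ||psi(x) - psi(c)||^2 = K[x, x] + K[c, c] - 2 K[x, c].
    Assumption: squared distances, as in standard k-means.
    """
    diag = np.diag(K)
    d2 = diag[:, None] + diag[centers][None, :] - 2.0 * K[:, centers]
    return np.maximum(d2, 0.0).min(axis=1)


def seed_with_local_search(K, k, n_swaps=20, seed=0):
    """Return k center indices: cost-proportional seeding, then swap steps."""
    rng = np.random.default_rng(seed)
    n = K.shape[0]
    # Assumption: the first center is drawn uniformly, because the cost
    # is undefined while the center set is empty.
    centers = [int(rng.integers(n))]

    # Step 1: while |C| < k, sample x with probability proportional to its cost.
    while len(centers) < k:
        costs = point_costs(K, centers)
        total = costs.sum()
        if total <= 0.0:  # every point already coincides with a center
            break
        centers.append(int(rng.choice(n, p=costs / total)))

    # Step 2: sample a candidate x the same way and swap it in for the
    # center whose replacement lowers the total cost the most, if any does.
    for _ in range(n_swaps):
        costs = point_costs(K, centers)
        total = costs.sum()
        if total <= 0.0:
            break
        x = int(rng.choice(n, p=costs / total))
        trial = [
            point_costs(K, centers[:j] + centers[j + 1:] + [x]).sum()
            for j in range(len(centers))
        ]
        j = int(np.argmin(trial))
        if trial[j] < total:
            centers[j] = x
    return centers
```

Because a swap is accepted only when it strictly lowers the total cost, the cost after the swap phase can never exceed the cost of the initial seeding. Each swap step re-evaluates the cost once per center, which keeps the sketch short but is more expensive than an incremental update would be.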