Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et alK. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026..
Efficient Clustering for Stretched Mixtures: Landscape and Optimality
Authors: Kaizheng Wang, Yuling Yan, Mateo Diaz
NeurIPS 2020 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we conduct numerical experiments on a real dataset. We randomly select N1 (resp. M1) T-shirts/tops and N2 (resp. M2) pullovers from the Fashion-MNIST (Xiao et al., 2017) training (resp. testing) dataset, each of which is a 28 28 grayscale image represented by a vector in [0, 1]28 28. The goal is clustering, i.e. learning from those N = N1 + N2 unlabeled images to predict the class labels of both N training samples and M = M1 + M2 testing samples.Figure 2 shows the learning curves of CURE over 50 independent trials. Even when the classes are unbalanced, CURE still reliably achieves low misclassification rates. Figure 3 presents histograms of testing data under the feature transform learned by the last (50th) trial of CURE, showing two seperated clusters around 1 corresponding to the two classes. To demonstrate the efficacy of CURE, we compare its misclassification rates with those of K-means and spectral methods on the training sets. We include the standard deviation over 50 independent trials for CURE due to its random initializations; other methods use the default settings (in Python) and thus are regarded as deterministic algorithms. As is shown in Table 1, CURE has the best performance under all settings. |
| Researcher Affiliation | Academia | Kaizheng Wang Columbia University EMAIL Yuling Yan Princeton University EMAIL Mateo Díaz Cornell University EMAIL |
| Pseudocode | Yes | Algorithm 1 Clustering via Uncoupled REgression (meta-algorithm) Input: Data {Xi}n i=1 in a feature space X, embedding space Y, target distribution nu over Y, discrepancy measure D, function class F, classification rule g. Embedding: find an approximation solution to min F D( # rn, nu). Output: Yi = g[ (Xi)] for i [n].Algorithm 2 Perturbed gradient descent Initialize 0 = 0. For t = 0, 1, . . . do If perturbation condition holds: Perturb t t + t with t U(B(0, r)) If termination condition holds: Return t Update t+1 t L1( t). |
| Open Source Code | No | The paper does not contain any explicit statements about releasing source code, nor does it provide a link to a code repository or mention code being available in supplementary materials. |
| Open Datasets | Yes | We randomly select N1 (resp. M1) T-shirts/tops and N2 (resp. M2) pullovers from the Fashion-MNIST (Xiao et al., 2017) training (resp. testing) dataset, each of which is a 28 28 grayscale image represented by a vector in [0, 1]28 28. |
| Dataset Splits | No | The paper mentions using 'Fashion-MNIST training (resp. testing) dataset' but does not specify any validation dataset splits or methods like cross-validation. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used to run the experiments. |
| Software Dependencies | No | The paper mentions 'default settings (in Python)' for comparison methods but does not provide specific version numbers for Python or any other software libraries used in their own implementation. |
| Experiment Setup | Yes | We use gradient descent with random initialization from the unit sphere and learning rate 10^-3 (instead of perturbed gradient descent) to solve (7) as that requires less tuning. |