reproducibilityindex.ai

Efficient Clustering for Stretched Mixtures: Landscape and Optimality

Authors: Kaizheng Wang, Yuling Yan, Mateo Diaz

NeurIPS 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In this section, we conduct numerical experiments on a real dataset. We randomly select N1 (resp. M1) T-shirts/tops and N2 (resp. M2) pullovers from the Fashion-MNIST (Xiao et al., 2017) training (resp. testing) dataset, each of which is a 28 28 grayscale image represented by a vector in [0, 1]28 28. The goal is clustering, i.e. learning from those N = N1 + N2 unlabeled images to predict the class labels of both N training samples and M = M1 + M2 testing samples.Figure 2 shows the learning curves of CURE over 50 independent trials. Even when the classes are unbalanced, CURE still reliably achieves low misclassiﬁcation rates. Figure 3 presents histograms of testing data under the feature transform learned by the last (50th) trial of CURE, showing two seperated clusters around 1 corresponding to the two classes. To demonstrate the efﬁcacy of CURE, we compare its misclassiﬁcation rates with those of K-means and spectral methods on the training sets. We include the standard deviation over 50 independent trials for CURE due to its random initializations; other methods use the default settings (in Python) and thus are regarded as deterministic algorithms. As is shown in Table 1, CURE has the best performance under all settings.
Researcher Affiliation	Academia	Kaizheng Wang Columbia University kaizheng.wang@columbia.edu Yuling Yan Princeton University yulingy@princeton.edu Mateo Díaz Cornell University md825@cornell.edu
Pseudocode	Yes	Algorithm 1 Clustering via Uncoupled REgression (meta-algorithm) Input: Data {Xi}n i=1 in a feature space X, embedding space Y, target distribution nu over Y, discrepancy measure D, function class F, classiﬁcation rule g. Embedding: ﬁnd an approximation solution to min F D( # rn, nu). Output: Yi = g[ (Xi)] for i [n].Algorithm 2 Perturbed gradient descent Initialize 0 = 0. For t = 0, 1, . . . do If perturbation condition holds: Perturb t t + t with t U(B(0, r)) If termination condition holds: Return t Update t+1 t L1( t).
Open Source Code	No	The paper does not contain any explicit statements about releasing source code, nor does it provide a link to a code repository or mention code being available in supplementary materials.
Open Datasets	Yes	We randomly select N1 (resp. M1) T-shirts/tops and N2 (resp. M2) pullovers from the Fashion-MNIST (Xiao et al., 2017) training (resp. testing) dataset, each of which is a 28 28 grayscale image represented by a vector in [0, 1]28 28.
Dataset Splits	No	The paper mentions using 'Fashion-MNIST training (resp. testing) dataset' but does not specify any validation dataset splits or methods like cross-validation.
Hardware Specification	No	The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used to run the experiments.
Software Dependencies	No	The paper mentions 'default settings (in Python)' for comparison methods but does not provide specific version numbers for Python or any other software libraries used in their own implementation.
Experiment Setup	Yes	We use gradient descent with random initialization from the unit sphere and learning rate 10^-3 (instead of perturbed gradient descent) to solve (7) as that requires less tuning.