Achieving Optimal Clustering in Gaussian Mixture Models with Anisotropic Covariance Structures

Authors: Xin Chen, Anderson Ye Zhang

NeurIPS 2024

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 4 Numerical Studies. "In this section, we compare the performance of our methods with other popular clustering methods on synthetic and real datasets under different settings." |
| Researcher Affiliation | Academia | Xin Chen, Princeton University, xc5557@princeton.edu; Anderson Ye Zhang, University of Pennsylvania, ayz@wharton.upenn.edu |
| Pseudocode | Yes | "Algorithm 1: Adjusted Lloyd's Algorithm for Model 1" and "Algorithm 2: Adjusted Lloyd's Algorithm for Model 2". |
| Open Source Code | No | The paper does not contain any explicit statement about making its source code available or a direct link to a code repository. |
| Open Datasets | Yes | "To further demonstrate the effectiveness of our methods, we conduct experiments using the Fashion-MNIST dataset [23]." |
| Dataset Splits | No | The paper conducts numerical studies on synthetic and real datasets (Fashion-MNIST) but does not specify the explicit training, validation, and test dataset splits used for these experiments. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, memory) used to conduct the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python, PyTorch, specific libraries). |
| Experiment Setup | Yes | "In this section, we compare the performance of our methods with other popular clustering methods on synthetic and real datasets under different settings. ... We independently generate n = 1200 samples with dimension d = 50 from k = 30 clusters. Each cluster has 40 samples. We set Σ = U^T Λ U, where Λ is a 50 × 50 diagonal matrix with diagonal elements selected from 0.5 to 8 with equal space and U is a randomly generated orthogonal matrix. The centers {θ_a}_{a ∈ [n]} are orthogonal to each other with ‖θ_1‖ = ... = ‖θ_30‖ = 9." and "In this case, we take n = 1200, k = 2, and d = 9. We set Σ_1 = I_d and Σ_2 = Λ_2, a diagonal matrix where the first diagonal entry is 0.5 and the remaining entries are 5. We set the cluster sizes to be 900 and 300, respectively. To simplify the calculation of SNR′, we set θ_1 = 0 and θ_2 = 5e_1..." and "Additionally, the dashed lines in the left and right panels represent the optimal exponents SNR²/8 and SNR′²/8 of the minimax bounds, respectively. It is observed that both Algorithm 1 and Algorithm 2 meet these benchmarks after three iterations." and "we apply PCA to reduce dimensionality from 784 to 50 by retaining the top 50 principal components." |
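
The two synthetic settings quoted under Experiment Setup are specific enough to sketch in code. The following is a minimal NumPy sketch, not the authors' code (none is released). It rests on several assumptions: the function names, the seed, the QR-based draw of the orthogonal matrix U, and the choice of scaled rows of a second random orthogonal matrix as the orthogonal centers are all illustrative, and the centers are read as having norm 9.

```python
import numpy as np


def random_orthogonal(d, rng):
    """Draw a d x d orthogonal matrix via QR of a Gaussian matrix (assumed method)."""
    q, r = np.linalg.qr(rng.standard_normal((d, d)))
    # Fix column signs so the draw does not depend on the QR sign convention.
    return q * np.sign(np.diag(r))


def sample_shared_covariance(rng, n=1200, d=50, k=30, center_norm=9.0):
    """First setting: k equal-sized clusters sharing one anisotropic covariance."""
    u = random_orthogonal(d, rng)
    lam = np.linspace(0.5, 8.0, d)           # equally spaced diagonal of Lambda
    sigma = u.T @ np.diag(lam) @ u           # Sigma = U^T Lambda U
    # Orthogonal centers of equal norm: scaled rows of an orthogonal matrix (assumption).
    centers = center_norm * random_orthogonal(d, rng)[:k]
    labels = np.repeat(np.arange(k), n // k)  # 40 samples per cluster
    # Rows z @ A with A = Lambda^{1/2} U have covariance A^T A = Sigma.
    noise = rng.standard_normal((n, d)) @ (np.sqrt(lam)[:, None] * u)
    return centers[labels] + noise, labels, centers, sigma


def sample_two_covariances(rng, sizes=(900, 300), d=9, shift=5.0):
    """Second setting: two clusters with different diagonal covariances."""
    diag2 = np.full(d, 5.0)
    diag2[0] = 0.5                           # Sigma_2 = diag(0.5, 5, ..., 5)
    centers = np.zeros((2, d))               # theta_1 = 0
    centers[1, 0] = shift                    # theta_2 = 5 e_1
    labels = np.repeat([0, 1], sizes)
    scales = np.sqrt(np.vstack([np.ones(d), diag2]))  # per-cluster std devs
    y = centers[labels] + rng.standard_normal((sum(sizes), d)) * scales[labels]
    return y, labels, centers


rng = np.random.default_rng(0)               # seed is arbitrary
y1, z1, theta1, sigma = sample_shared_covariance(rng)
y2, z2, theta2 = sample_two_covariances(rng)
```

Here `y1` and `y2` hold the samples and `z1` and `z2` the true labels, so any clustering method can be scored against them by its misclustering rate up to label permutation.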
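
The PCA preprocessing step quoted above (784 to 50 dimensions) can be illustrated in the same way. This is a sketch under assumptions: the paper reports no software stack, so the SVD-based `pca_reduce` helper is hypothetical, and the input is random stand-in data with the shape of flattened 28 × 28 images, not Fashion-MNIST itself.

```python
import numpy as np


def pca_reduce(x, n_components=50):
    """Project the rows of x onto the top principal components."""
    centered = x - x.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


# Stand-in for flattened 28 x 28 images: random data, not Fashion-MNIST.
rng = np.random.default_rng(0)
images = rng.random((200, 784))
reduced = pca_reduce(images)                 # shape (200, 50)
```

The clustering methods would then run on `reduced` rather than on the raw 784-dimensional pixels.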