Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Geometric Analysis of Nonlinear Manifold Clustering
Authors: Nimita Shinde, Tianjiao Ding, Daniel Robinson, René Vidal
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In addition to providing proof of correctness in this setting, a numerical comparison with state-of-the-art methods on CIFAR datasets shows that our method performs competitively although marginally worse than methods without theoretical guarantees. |
| Researcher Affiliation | Academia | Nimita Shinde (Lehigh University), Tianjiao Ding (University of Pennsylvania), Daniel P. Robinson (Lehigh University), René Vidal (University of Pennsylvania) |
| Pseudocode | Yes | Algorithm 1 Pseudocode for clustering data using (WMC) |
| Open Source Code | Yes | We have provided our code in the Supplementary material. |
| Open Datasets | Yes | The CIFAR dataset consists of 60000 color images of size 32 × 32 that are divided into 10, 20, and 100 classes for CIFAR-10, CIFAR-20, CIFAR-100, respectively. |
| Dataset Splits | No | The paper mentions using a 'grid search' for hyperparameter tuning, which implies some form of validation, but it does not specify the explicit training, validation, and test data splits (e.g., percentages or sample counts) for the datasets used in the experiments. It only describes the dataset as divided into classes. |
| Hardware Specification | Yes | The experiments are performed on a machine with Intel(R) Xeon(R) Gold 6130 CPU operating at 2.10 GHz frequency and with 37 GB RAM. |
| Software Dependencies | No | The paper mentions 'We implemented the ADMM algorithm that solves SMCE [58] in Python.' but does not specify the version of Python or any other software libraries with their version numbers. |
| Experiment Setup | Yes | We use grid search over the following parameter values: η ∈ {1, 20, 100, 400} and λ ∈ {20, 50} × λ0, where λ0 is the smallest value of λ that generates a non-trivial (non-zero) solution. We report the best accuracy results in Table 1. Furthermore, Table 2 provides the values of the parameters λ and η corresponding to the clustering results reported in rows 1 (L-WMC) and 2 (E-WMC) in Table 1. |
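
The grid search quoted in the Experiment Setup row can be sketched as follows. This is a minimal illustration, not the authors' code: `grid_search`, `evaluate`, and `toy_evaluate` are hypothetical names, the λ grid is read as multiples of λ0, and the scoring function is a placeholder standing in for a real run of the clustering method followed by an accuracy computation.

```python
from itertools import product


def grid_search(evaluate, lambda0, etas=(1, 20, 100, 400), multipliers=(20, 50)):
    """Return (best_accuracy, best_eta, best_lambda) over the parameter grid.

    `evaluate(eta, lam)` is a hypothetical callback that is assumed to run the
    clustering method with the given parameters and return its accuracy.
    `lambda0` is assumed to be computed beforehand as the smallest lambda
    that yields a non-zero solution.
    """
    best = None
    for eta, multiplier in product(etas, multipliers):
        lam = multiplier * lambda0  # lambda values are multiples of lambda0
        accuracy = evaluate(eta, lam)
        if best is None or accuracy > best[0]:
            best = (accuracy, eta, lam)
    return best


def toy_evaluate(eta, lam):
    # Placeholder score (not a real clustering accuracy); peaks at eta=20, lam=25.0.
    return 1.0 / (1.0 + abs(eta - 20) + abs(lam - 25.0))


if __name__ == "__main__":
    accuracy, eta, lam = grid_search(toy_evaluate, lambda0=0.5)
    print(f"best accuracy={accuracy:.3f} at eta={eta}, lambda={lam}")
```

Reporting only the best accuracy, as in this sketch, matches the quoted setup, but because no held-out split is described, the selected parameters should be read as tuned on the same data that is evaluated.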