Deep Transformation-Invariant Clustering

Authors: Tom Monnier, Thibault Groueix, Mathieu Aubry

NeurIPS 2020

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "We demonstrate that our novel approach yields competitive and highly promising results on standard image clustering benchmarks. Finally, we showcase its robustness and the advantages of its improved interpretability by visualizing clustering results over real photograph collections. In this section, we first analyze our approach and compare it to state-of-the-art, then showcase its interest for image collection analysis and visualization." |
| Researcher Affiliation | Academia | "LIGM, Ecole des Ponts, Univ Gustave Eiffel, CNRS, France {tom.monnier,thibault.groueix,mathieu.aubry}@enpc.fr" |
| Pseudocode | Yes | "Algorithm 1: Deep Transformation-Invariant Gaussian Mixture Model" |
| Open Source Code | Yes | "Code, data, models as well as more visual results are available on our project webpage: http://imagine.enpc.fr/~monniert/DTIClustering/" |
| Open Datasets | Yes | "(MNIST [31], USPS [17]), a clothing dataset (Fashion MNIST [47]) and a face dataset (FRGC [43]). We also report results for SVHN [42]." affNIST-test (https://www.cs.toronto.edu/~tijmen/affNIST/) is the result of random affine transformations. |
| Dataset Splits | No | The paper mentions training and testing but does not explicitly provide details about validation dataset splits (e.g., percentages or counts) or a clear methodology for how a validation set was used for hyperparameter tuning. |
| Hardware Specification | Yes | "Training DTI K-means or DTI GMM on MNIST takes approximately 50 minutes on a single Nvidia GeForce RTX 2080 Ti GPU" |
| Software Dependencies | No | The paper mentions using the "Adam optimizer [27]" but does not provide version numbers for any software dependencies, such as programming languages or libraries. |
| Experiment Setup | Yes | "We sequentially add transformation modules at a constant learning rate of 0.001 then divide the learning rate by 10 after convergence... We use a batch size of 64 for real photograph collections and 128 otherwise." |
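The Pseudocode and Experiment Setup rows describe a transformation-invariant clustering loop: each sample is assigned to the cluster whose *transformed* prototype matches it best, and prototypes are updated from the aligned samples. Below is a minimal, hedged sketch of that idea — not the authors' implementation. The paper learns affine, morphological, and thin-plate-spline transformations with deep prediction networks trained by Adam; here the transformation family is just a tiny set of integer image shifts searched exhaustively, and all helper names are invented for illustration.

```python
import numpy as np

def shift(img, dy, dx):
    """Translate a 2-D image by (dy, dx), zero-padding uncovered pixels."""
    out = np.zeros_like(img)
    h, w = img.shape
    out[max(dy, 0):min(h, h + dy), max(dx, 0):min(w, w + dx)] = \
        img[max(-dy, 0):min(h, h - dy), max(-dx, 0):min(w, w - dx)]
    return out

def dti_kmeans(images, n_clusters=2, shifts=(-1, 0, 1), n_iter=5):
    """Toy DTI K-means with integer shifts as the transformation family.

    Farthest-point initialisation keeps the sketch deterministic.
    """
    protos = [images[0].astype(float)]
    while len(protos) < n_clusters:
        d = [min(np.sum((p - x) ** 2) for p in protos) for x in images]
        protos.append(images[int(np.argmax(d))].astype(float))
    protos = np.stack(protos)

    labels = np.zeros(len(images), dtype=int)
    for _ in range(n_iter):
        aligned = []
        # E-step: best (cluster, transformation) pair per sample.
        for i, img in enumerate(images):
            best = None  # (distance, cluster, dy, dx)
            for k, p in enumerate(protos):
                for dy in shifts:
                    for dx in shifts:
                        dist = np.sum((shift(p, dy, dx) - img) ** 2)
                        if best is None or dist < best[0]:
                            best = (dist, k, dy, dx)
            labels[i] = best[1]
            # Apply the inverse shift to bring the sample back into the
            # prototype's frame before averaging.
            aligned.append(shift(img, -best[2], -best[3]))
        # M-step: update each prototype from its aligned members.
        for k in range(n_clusters):
            members = [aligned[i] for i in range(len(images)) if labels[i] == k]
            if members:
                protos[k] = np.mean(members, axis=0)
    return labels, protos
```

The key difference from plain K-means is that the distance is minimized over the transformation family, so shifted copies of the same pattern land in one cluster and the prototype stays sharp instead of blurring across misalignments. The real method replaces the exhaustive search with networks that predict transformation parameters per sample, trained jointly with the prototypes using Adam as quoted in the Experiment Setup row.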