Multi-Facet Clustering Variational Autoencoders
Authors: Fabian Falck, Haoting Zhang, Matthew Willetts, George Nicholson, Christopher Yau, Chris C Holmes
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | On image benchmarks, we demonstrate that our approach separates out and clusters over different aspects of the data in a disentangled manner. We demonstrate the usefulness of our model and its prior structure in four experimental analyses: (a) discovering a multi-facet structure, (b) compositionality of latent facets, (c) generative, unsupervised classification, and (d) diversity of generated samples from our model. We train our model on three image datasets: MNIST [5], 3DShapes (two configurations) [24] and SVHN [25]. |
| Researcher Affiliation | Academia | 1University of Oxford, 2University of Cambridge, 3University College London, 4University of Manchester, 5Health Data Research UK, 6The Alan Turing Institute |
| Pseudocode | No | No pseudocode or algorithm blocks were found in the paper. |
| Open Source Code | Yes | We also provide our code implementing MFCVAE, using PyTorch Distributions [26], and reproducing our results at https://github.com/FabianFalck/mfcvae. (A minimal torch.distributions sketch follows the table.) |
| Open Datasets | Yes | We train our model on three image datasets: MNIST [5], 3DShapes (two configurations) [24] and SVHN [25]. |
| Dataset Splits | No | For all datasets, we use the standard training and test splits. |
| Hardware Specification | No | All computational experiments were carried out on a cluster infrastructure. No specific hardware details (e.g., GPU/CPU models, memory) were provided. |
| Software Dependencies | No | The paper lists software packages like PyTorch, NumPy, Matplotlib, Seaborn, OpenCV, and Scikit-learn but does not provide specific version numbers for any of them. |
| Experiment Setup | Yes | We run each model for 500 epochs with Adam [60] with a learning rate of 1e-4 and a batch size of 128. We found this set of hyperparameters to work well across all datasets and settings for the number of facets and for each facet's number of components... We use a learning rate of 1e-4, and we warm-up the learning rate to 1e-4 over 1000 steps, and then linearly decay the learning rate to 0 over 500000 steps. We use a batch size of 128. (A training-schedule sketch based on these settings follows the table.) |
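
The code row above notes that MFCVAE is implemented with PyTorch Distributions. As a rough, hypothetical illustration of the kind of building block involved (not the authors' implementation, which lives in the linked repository), a mixture-of-Gaussians prior over a single latent facet can be assembled from `torch.distributions` as follows; the component count and latent dimensionality here are placeholder values.

```python
import torch
import torch.distributions as D

# Hypothetical sizes for one latent facet: 10 mixture components, 8-dimensional latent space.
n_components, latent_dim = 10, 8

# Mixture-of-Gaussians prior over a single facet, assembled from torch.distributions.
mixing = D.Categorical(logits=torch.zeros(n_components))
components = D.Independent(
    D.Normal(torch.randn(n_components, latent_dim), torch.ones(n_components, latent_dim)),
    reinterpreted_batch_ndims=1,
)
prior = D.MixtureSameFamily(mixing, components)

z = prior.sample((128,))   # one latent per item in a batch of 128 -> shape (128, 8)
log_p = prior.log_prob(z)  # log-density of each sample under the mixture prior
```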
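
The experiment-setup row quotes Adam with a learning rate of 1e-4, a batch size of 128, 500 epochs, a 1,000-step warm-up, and a linear decay to 0 over 500,000 steps. A minimal sketch of that schedule in PyTorch, assuming these quoted values, is shown below; the `nn.Flatten`/`nn.Linear` model and the squared-norm loss are placeholders standing in for MFCVAE and its ELBO objective, and the MNIST loader uses the standard torchvision training split.

```python
import torch
from torch import nn, optim
from torch.optim.lr_scheduler import LambdaLR
from torchvision import datasets, transforms

# Standard training split, as quoted in the table (MNIST shown; SVHN is analogous).
train_set = datasets.MNIST("data", train=True, download=True, transform=transforms.ToTensor())
train_loader = torch.utils.data.DataLoader(train_set, batch_size=128, shuffle=True)

# Placeholder model and loss; the real objective is the MFCVAE ELBO from the authors' repository.
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 20))
optimizer = optim.Adam(model.parameters(), lr=1e-4)  # Adam, learning rate 1e-4

# Linear warm-up to the base rate over 1,000 steps, then linear decay to 0 over 500,000 steps.
warmup_steps, decay_steps = 1_000, 500_000

def lr_lambda(step: int) -> float:
    if step < warmup_steps:
        return step / warmup_steps
    return max(0.0, 1.0 - (step - warmup_steps) / decay_steps)

scheduler = LambdaLR(optimizer, lr_lambda)

for epoch in range(500):  # 500 epochs, as quoted
    for x, _ in train_loader:
        optimizer.zero_grad()
        loss = model(x).pow(2).mean()  # placeholder; not the paper's loss
        loss.backward()
        optimizer.step()
        scheduler.step()
```

A single `LambdaLR` multiplier handles both phases here: it scales the base rate up from 0 during warm-up and back down to 0 afterwards, which matches the piecewise-linear schedule described in the quote.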