No Representation Rules Them All in Category Discovery
Authors: Sagar Vaze, Andrea Vedaldi, Andrew Zisserman
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We use this dataset to demonstrate the limitations of unsupervised clustering in the GCD setting, showing that even very strong unsupervised models fail on Clevr-4. We further use Clevr-4 to examine the weaknesses of existing GCD algorithms, and propose a new method which addresses these shortcomings, leveraging consistent findings from the representation learning literature to do so. Our simple solution, which is based on mean teachers and termed µGCD, substantially outperforms implemented baselines on Clevr-4. Finally, when we transfer these findings to real data on the challenging Semantic Shift Benchmark (SSB), we find that µGCD outperforms all prior work, setting a new state-of-the-art. |
| Researcher Affiliation | Academia | Sagar Vaze Andrea Vedaldi Andrew Zisserman Visual Geometry Group University of Oxford |
| Pseudocode | Yes | In this section, we detail a simple but strong method for GCD, µGCD, already motivated in section 4 and illustrated in fig. 2. In a first phase, the algorithm proceeds in the same way as the GCD baseline [7], learning the representation. Next, we append a classification head and fine-tune the model with a mean teacher setup [17], similarly to SimGCD but yielding more robust pseudo-labels. |
| Open Source Code | No | The paper provides a URL (www.robots.ox.ac.uk/~vgg/data/clevr4/) which is described as containing the 'Clevr-4' dataset, but it does not contain an explicit statement or link confirming that the source code for the proposed µGCD method is open-source or publicly available. |
| Open Datasets | Yes | To this end, in section 3, we first introduce the Clevr-4 dataset. Clevr-4 is a synthetic dataset where each image is fully parameterized by a set of four attributes, and where each attribute defines an equally valid grouping of the data (see fig. 1). Clevr-4 extends the original CLEVR dataset [12]... We further synthesize 8.4K images for GCD development (summarized in table 1), and further make a larger 100K image dataset available. The full generation procedure is detailed in appendix A.1. |
| Dataset Splits | Yes | Finally, we create GCD splits for each taxonomy in Clevr-4, following standard practise and reserving half the categories for the labelled set, and half for the unlabelled set. We further subsample 50% of the images from the labelled categories and add them to the unlabelled set. |
| Hardware Specification | Yes | We implement all models in PyTorch [69] on a single NVIDIA P40 or M40. |
| Software Dependencies | No | The paper states 'We implement all models in PyTorch [69]' but does not provide specific version numbers for PyTorch or any other relevant software libraries or dependencies required for reproduction. |
| Experiment Setup | Yes | For the tradeoff between the unsupervised and supervised components of the losses, λ1 is set to 0.35 for all methods. For the entropy regularization, we follow SimGCD and use λ2 = 1.0 for FGVC-Aircraft and Stanford Cars, and λ2 = 2.0 for all other datasets. We also train with L2 weight decay, set to 10e-4 for all models. |
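
The split procedure quoted in the Dataset Splits row can be illustrated with a minimal sketch. It is not the authors' code: the function name `make_gcd_split`, its parameters, and the use of a seeded shuffle to choose the labelled categories are assumptions made for illustration. The paper excerpt only states that half the categories are reserved for the labelled set and that 50% of the images from those categories are moved to the unlabelled set.

```python
import random
from collections import defaultdict


def make_gcd_split(labels, labelled_class_frac=0.5, labelled_image_frac=0.5, seed=0):
    """Hypothetical helper: split sample indices into labelled and unlabelled sets.

    labels: list of integer class ids, one per image.
    Returns (labelled_idx, unlabelled_idx, labelled_classes).
    """
    rng = random.Random(seed)

    # Assumption: the labelled ("old") categories are picked by a seeded shuffle.
    classes = sorted(set(labels))
    rng.shuffle(classes)
    n_labelled_classes = int(len(classes) * labelled_class_frac)
    labelled_classes = set(classes[:n_labelled_classes])

    # Group image indices by class.
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)

    labelled_idx, unlabelled_idx = [], []
    for y, idxs in by_class.items():
        if y in labelled_classes:
            # Keep a fraction of each labelled category's images as labelled;
            # the remainder joins the unlabelled set.
            rng.shuffle(idxs)
            n_keep = int(len(idxs) * labelled_image_frac)
            labelled_idx.extend(idxs[:n_keep])
            unlabelled_idx.extend(idxs[n_keep:])
        else:
            # Images from the remaining categories are all unlabelled.
            unlabelled_idx.extend(idxs)

    return sorted(labelled_idx), sorted(unlabelled_idx), labelled_classes


# Toy usage: 10 categories with 20 images each.
toy_labels = [c for c in range(10) for _ in range(20)]
labelled, unlabelled, old_classes = make_gcd_split(toy_labels)
```

With this toy input, the unlabelled set mixes images from the labelled categories with all images from the remaining ones, which is the setting GCD methods are evaluated in.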