Quantifying Learnability and Describability of Visual Concepts Emerging in Representation Learning
Authors: Iro Laina, Ruth Fong, Andrea Vedaldi
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments are organized as follows. First, we examine the representations learned by two state-of-the-art approaches, namely SeLa [3] and MoCo [27], and use our learnability metric (Eq. (1)) to quantify the semantic coherence of their learned representations. We then repeat these experiments by providing human-annotated, class-level descriptions to measure the respective describability. |
| Researcher Affiliation | Academia | Visual Geometry Group, University of Oxford, {iro, ruthfong, vedaldi}@robots.ox.ac.uk |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper mentions using publicly available implementations of other models and tools, but does not provide an explicit statement or link to the open-source code for the methodology described in this paper. |
| Open Datasets | Yes | We use data from the training set of ImageNet [17]. |
| Dataset Splits | No | The paper mentions using the 'training set of ImageNet [17]' and evaluating on '20 selected ImageNet classes', but it does not provide specific percentages or counts for training, validation, and test splits required for reproducibility. |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU models, CPU types, or cloud computing specifications used for running its experiments. |
| Software Dependencies | No | The paper mentions using 'ResNet-50', 'Sentence-BERT [59]', 'BERT-large as the backbone', and refers to a 'publicly available implementation' (Att2in), but it does not list specific version numbers for these or other software libraries/frameworks crucial for reproducibility. |
| Experiment Setup | Yes | For the semantic coherence experiments, each HIT consists of a reference set of 10 example images randomly sampled from the class and two query images. To obtain $X_c^{\text{MoCo}}$, we apply k-means on top of MoCo-v1 feature vectors (obtained using the official implementation) and set k = 3000 for a fair comparison with [3]. We then extract 1024-dimensional caption embeddings using Sentence-BERT [59] (with BERT-large as the backbone). |
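
The "Research Type" and "Experiment Setup" rows describe a two-alternative forced-choice protocol: each HIT shows a 10-image reference set from one class and two query images, and learnability (Eq. (1)) is the learner's accuracy at picking the in-class query. Below is a minimal sketch of that accuracy computation over feature vectors; the nearest-centroid decision rule is an illustrative stand-in for the actual learner (human or machine), which the quoted excerpts do not specify at this level of detail.

```python
# Hedged sketch of the 2AFC-style learnability estimate: one trial shows a
# reference set and two queries; learnability is the mean accuracy over trials.
import numpy as np

def trial_correct(reference, query_pos, query_neg):
    """One trial: pick the query with higher cosine similarity to the
    mean of the reference features (an assumed, illustrative learner)."""
    centroid = reference.mean(axis=0)
    centroid = centroid / np.linalg.norm(centroid)
    def sim(q):
        return float(q @ centroid) / np.linalg.norm(q)
    return sim(query_pos) > sim(query_neg)

def learnability(trials):
    """Mean accuracy over (reference, in-class query, out-of-class query) trials."""
    return float(np.mean([trial_correct(r, p, n) for r, p, n in trials]))
```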
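For the clustering step quoted in the "Experiment Setup" row, the following sketch runs k-means with k = 3000 over pre-extracted MoCo-v1 features. The file name `moco_features.npy` is hypothetical, and `MiniBatchKMeans` is chosen here only for tractability at this k; the paper does not state which k-means implementation was used.

```python
# Hedged sketch: k-means (k = 3000) over pre-extracted MoCo-v1 features.
# Feature extraction itself (ResNet-50 forward passes via the official MoCo
# implementation) is assumed to have already produced "moco_features.npy",
# a hypothetical array of shape (num_images, feature_dim).
import numpy as np
from sklearn.cluster import MiniBatchKMeans

features = np.load("moco_features.npy")
# L2-normalize each feature vector (a common choice; not stated in the paper)
features = features / np.linalg.norm(features, axis=1, keepdims=True)

kmeans = MiniBatchKMeans(n_clusters=3000, batch_size=10_000, random_state=0)
cluster_ids = kmeans.fit_predict(features)  # one pseudo-label per image
```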
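For the caption-embedding step, a sketch assuming the `sentence-transformers` package is given below. The specific checkpoint `bert-large-nli-mean-tokens`, which produces 1024-dimensional embeddings from a BERT-large backbone, is an assumption: the quoted excerpt only states Sentence-BERT with BERT-large as the backbone.

```python
# Hedged sketch: 1024-dimensional caption embeddings via Sentence-BERT.
# The checkpoint name is an assumption consistent with a BERT-large backbone.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("bert-large-nli-mean-tokens")
captions = ["a dog catching a frisbee", "a bowl of ramen"]  # example captions
embeddings = model.encode(captions)  # array of shape (2, 1024)
```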