Transitivity Recovering Decompositions: Interpretable and Robust Fine-Grained Relationships
Authors: Abhra Chaudhuri, Massimiliano Mancini, Zeynep Akata, Anjan Dutta
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We begin by theoretically showing that abstract relational representations are nothing but a way of recovering transitive relationships among local views. Based on this, we design Transitivity Recovering Decompositions (TRD), a graph-space search algorithm that identifies interpretable equivalents of abstract emergent relationships at both instance and class levels, and with no post-hoc computations. We additionally show that TRD is provably robust to noisy views, with empirical evidence also supporting this finding. The latter allows TRD to perform at par or even better than the state-of-the-art, while being fully interpretable. |
| Researcher Affiliation | Academia | ¹University of Exeter, ²University of Trento, ³University of Tübingen, ⁴MPI for Informatics, ⁵The Alan Turing Institute, ⁶University of Surrey |
| Pseudocode | No | The paper describes methods in text and uses mathematical formulations but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Implementation is available at https://github.com/abhrac/trd. |
| Open Datasets | Yes | To verify the generalizability and scalability of our method, we evaluate it on small, medium and large-scale FGVC benchmarks. We perform small-scale evaluation on the Soy and Cotton Cultivar datasets [89], while we choose FGVC Aircraft [53], Stanford Cars [42], CUB [80], and NABirds [75] for medium-scale evaluation. For large-scale evaluation, we choose the iNaturalist dataset [76], which has over 675K train-set and 182K test-set images. (A hedged data-loading sketch follows the table.) |
| Dataset Splits | Yes | For large-scale evaluation, we choose the iNaturalist dataset [76], which has over 675K train-set and 182K test-set images. The concept graph is obtained by performing an online clustering on the node and edge embeddings of the training instances of the corresponding class. (An illustrative clustering sketch follows the table.) |
| Hardware Specification | Yes | We implement TRD with a single NVIDIA GeForce RTX 3090 GPU, an 8-core Intel Xeon processor, and 32 GB of RAM. |
| Software Dependencies | No | The paper names components such as ResNet50 and GAT, but does not provide specific version numbers for these or for other key software dependencies (e.g., PyTorch, TensorFlow, CUDA). |
| Experiment Setup | Yes | For obtaining the global view, we follow [84, 94] by selecting the smallest bounding box containing the largest connected component of the thresholded final-layer feature map obtained from an ImageNet-1K [17] pre-trained ResNet50 [30], which we also use as the relation-agnostic encoder f. The global view is resized to 224×224. We then obtain 64 local views by randomly cropping 28×28 regions within the global crop and resizing them to 224×224. We use an 8-layer Graph Attention Network (GAT) [77], with 4 attention heads in each hidden layer, normalized via GraphNorm [7], to obtain the Semantic Relevance Graph. We train TRD for 1000 epochs using the Adam optimizer, at an initial learning rate of 0.005 (decayed by a factor of 0.1 every 100 epochs), with a weight decay of 5×10⁻⁴. We generally followed [77, 10] for choosing the above settings. (Illustrative sketches of the view generation and model/optimizer setup follow the table.) |
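The medium-scale benchmarks quoted in the "Open Datasets" row include two datasets with built-in torchvision loaders. Below is a minimal loading sketch, not the paper's data pipeline: it assumes torchvision ≥ 0.13, that the download mirrors are reachable, and an illustrative transform; CUB, NABirds, Soy/Cotton Cultivar, and iNaturalist must be obtained separately.

```python
# Minimal sketch: loading two of the medium-scale FGVC benchmarks with
# torchvision's built-in dataset classes. Illustrative only; the paper
# does not publish its data-loading code.
from torchvision import datasets, transforms

transform = transforms.Compose([
    transforms.Resize((224, 224)),  # illustrative; TRD derives its own views
    transforms.ToTensor(),
])

aircraft_train = datasets.FGVCAircraft(
    root="data", split="train", transform=transform, download=True)
cars_train = datasets.StanfordCars(
    root="data", split="train", transform=transform, download=True)
```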
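The "Dataset Splits" row states only that the class-level concept graph comes from an online clustering over node and edge embeddings of the training instances. One plausible, heavily hedged realization is sketched below using scikit-learn's MiniBatchKMeans; the clustering algorithm, the cluster count `n_concepts`, and the helper `build_concept_nodes` are all assumptions, not the paper's implementation.

```python
# Hedged sketch: incrementally clustering one class's node/edge embeddings
# to obtain concept-node centroids. MiniBatchKMeans and n_concepts are
# assumptions; the paper only says an online clustering is performed.
from sklearn.cluster import MiniBatchKMeans

def build_concept_nodes(embedding_batches, n_concepts=10):
    """embedding_batches: iterable of (batch_size, dim) NumPy arrays for one
    class; each batch should contain at least n_concepts samples."""
    km = MiniBatchKMeans(n_clusters=n_concepts, random_state=0)
    for batch in embedding_batches:
        km.partial_fit(batch)  # online update: no need to keep all embeddings
    return km.cluster_centers_  # (n_concepts, dim) concept-node embeddings
```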
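The view-generation step in the "Experiment Setup" row can be pictured as follows. The 224×224 global view, the 64 local 28×28 crops, and the ImageNet-pretrained ResNet50 backbone follow the quoted text; thresholding the feature map at its mean, the SciPy connected-component step, and the helper names `global_view`/`local_views` are my assumptions (the paper defers the exact procedure to [84, 94]).

```python
# Sketch of view generation: the global view is the tightest box around the
# largest connected component of the thresholded ResNet50 feature map, and
# 64 random 28x28 patches inside it become the local views.
import torch
import torch.nn.functional as F
from torchvision import models
from scipy import ndimage

resnet = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
backbone = torch.nn.Sequential(*list(resnet.children())[:-2]).eval()

def global_view(img):
    """img: (3, H, W) normalized tensor. Returns a 224x224 global crop."""
    with torch.no_grad():
        fmap = backbone(img.unsqueeze(0)).mean(dim=1)[0]  # (h, w) activation map
    mask = (fmap > fmap.mean()).cpu().numpy()             # mean threshold (assumed)
    labels, n = ndimage.label(mask)
    if n == 0:  # degenerate case: fall back to the full image
        return F.interpolate(img.unsqueeze(0), size=224, mode="bilinear")[0]
    sizes = ndimage.sum(mask, labels, range(1, n + 1))
    ys, xs = (labels == (sizes.argmax() + 1)).nonzero()   # largest component
    sy, sx = img.shape[1] / fmap.shape[0], img.shape[2] / fmap.shape[1]
    y0, y1 = int(ys.min() * sy), int((ys.max() + 1) * sy)
    x0, x1 = int(xs.min() * sx), int((xs.max() + 1) * sx)
    crop = img[:, y0:y1, x0:x1]                           # smallest bounding box
    return F.interpolate(crop.unsqueeze(0), size=224, mode="bilinear")[0]

def local_views(global_crop, n_views=64, size=28):
    """Sample n_views random size x size patches, each resized to 224x224."""
    views = []
    for _ in range(n_views):
        y = torch.randint(0, 224 - size + 1, (1,)).item()
        x = torch.randint(0, 224 - size + 1, (1,)).item()
        patch = global_crop[:, y:y + size, x:x + size]
        views.append(F.interpolate(patch.unsqueeze(0), size=224,
                                   mode="bilinear")[0])
    return torch.stack(views)  # (64, 3, 224, 224)
```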
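Finally, a sketch of the relational encoder and optimization settings quoted above, written with PyTorch Geometric (assumed; the paper does not name its graph library). The depth (8 layers), 4 attention heads per hidden layer, GraphNorm, and the Adam/step-decay hyperparameters follow the text; the channel widths and the class name `SemanticRelevanceGAT` are illustrative assumptions.

```python
# Hedged sketch: 8-layer GAT with 4 heads per hidden layer and GraphNorm,
# trained with the optimizer settings quoted in the table. Widths are
# assumed; the paper does not report them.
import torch
import torch.nn.functional as F
from torch_geometric.nn import GATConv, GraphNorm

class SemanticRelevanceGAT(torch.nn.Module):
    def __init__(self, in_dim=2048, hidden=256, out_dim=256,
                 n_layers=8, heads=4):
        super().__init__()
        self.convs, self.norms = torch.nn.ModuleList(), torch.nn.ModuleList()
        dims = [in_dim] + [hidden * heads] * (n_layers - 1)
        for i in range(n_layers - 1):                      # 7 hidden layers
            self.convs.append(GATConv(dims[i], hidden, heads=heads))
            self.norms.append(GraphNorm(hidden * heads))
        self.convs.append(GATConv(dims[-1], out_dim, heads=1))  # output layer

    def forward(self, x, edge_index):
        for conv, norm in zip(self.convs, self.norms):     # runs 7 times
            x = F.elu(norm(conv(x, edge_index)))
        return self.convs[-1](x, edge_index)

model = SemanticRelevanceGAT()
optimizer = torch.optim.Adam(model.parameters(), lr=0.005, weight_decay=5e-4)
# lr decayed by 0.1 every 100 epochs, trained for 1000 epochs total
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=100, gamma=0.1)
```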