DINO as a von Mises-Fisher mixture model
Authors: Hariprasath Govindarajan, Per Sidén, Jacob Roll, Fredrik Lindsten
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Section 5 (Experiments): We conducted ablation experiments to study the impact of our proposed modifications to DINO. |
| Researcher Affiliation | Collaboration | 1 Linköping University, Sweden; 2 Qualcomm Technologies, Inc. |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | We intend to make a public release of the code repository and pre-trained models in order to aid the research community to reproduce our experiments. |
| Open Datasets | Yes | The models are pre-trained on the ImageNet dataset (Deng et al., 2009). |
| Dataset Splits | Yes | We report the kNN top-1 classification accuracy on ImageNet in Table 1 by averaging over 2 runs. In Table 12 and Table 13, we show the mean and standard deviation of the top-1 validation classification accuracy over the three splits. |
| Hardware Specification | Yes | The model trainings are done on a single A100 node, consisting of 8 GPUs. |
| Software Dependencies | No | The paper mentions using the 'scipy' implementation for comparison, but no specific version numbers for any software dependencies are provided. |
| Experiment Setup | Yes | The student temperature τ is set to 0.1 and the teacher temperature is linearly scaled from 0.04 to 0.07 over some initial epochs (50 epochs for ViT-Small/16 and 30 epochs for ViT-Base/16). The batch sizes are adapted to fit the node and adjusted based on the model architecture (batch size=64 per GPU for ViT-Base/16 and 128 for ViT-Small/16). |
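
The Experiment Setup row quotes a linear warmup of the teacher temperature followed by a constant value. The sketch below is a minimal illustration of such a schedule; the helper name, the 100-epoch total training length, and the NumPy implementation are assumptions for illustration, not the authors' released code.

```python
import numpy as np

def teacher_temperature_schedule(total_epochs, warmup_epochs,
                                 warmup_temp=0.04, final_temp=0.07):
    """Hypothetical helper: linearly warm the teacher temperature from
    warmup_temp to final_temp over the first warmup_epochs, then hold it
    constant for the remaining epochs."""
    warmup = np.linspace(warmup_temp, final_temp, warmup_epochs)
    plateau = np.full(total_epochs - warmup_epochs, final_temp)
    return np.concatenate([warmup, plateau])

# Example: ViT-Small/16 uses a 50-epoch warmup (30 for ViT-Base/16);
# the student temperature stays fixed at 0.1 throughout training.
# The 100-epoch total below is illustrative only.
STUDENT_TEMP = 0.1
teacher_temps = teacher_temperature_schedule(total_epochs=100, warmup_epochs=50)
```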