Learning from Offline Foundation Features with Tensor Augmentations

Authors: Emir Konuk, Christos Matsoukas, Moein Sorkhei, Phitchapha Lertsiravarameth, Kevin Smith

NeurIPS 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	To evaluate the effectiveness of LOFF-TA we benchmark over eleven datasets from various domains using different foundation models, model capacities and image resolutions. In this section we show that LOFF-TA achieves competitive, sometimes superior, results compared to the baselines while significantly reducing memory usage and training time.
Researcher Affiliation	Academia	1 KTH Royal Institute of Technology, Stockholm, Sweden 2 Science for Life Laboratory, Stockholm, Sweden
Pseudocode	No	The paper describes its method verbally and with diagrams, but does not include any structured pseudocode or algorithm blocks.
Open Source Code	Yes	The source code used in this work can be found at https://github.com/emirkonuk/loffta.
Open Datasets	Yes	Our evaluation spans eleven image classification datasets, covering a diverse spectrum of object categories. We include APTOS2019 [21] for diabetic retinopathy detection, DDSM [29] for identifying masses in mammography, ISIC [40, 8, 9] for skin lesion classification, AID [43] for aerial image classification, and NABirds [41] for fine-grained bird species classification. The resolution of these datasets varies, but we resize them to 512 × 512. We extend our evaluation to a number of standard 256 × 256 resolution benchmark datasets: Flowers102 [34], NABirds [41], Stanford Cars [26], Stanford Dogs [22], Oxford-IIIT Pet [36], Caltech-101 [12], and SUN397 [44].
Dataset Splits	Yes	We adhere to official train/validation/test splits when available, or follow [24] in their absence.
Hardware Specification	Yes	It was not possible to train ViT-G on a single GPU with batch size of 64. Instead we report the memory footprint across 8 NVIDIA Quadro RTX 8000 using distributed training. We measure each approach in terms of performance, throughput (TP), and memory (Mem.) footprint.
Software Dependencies	No	The paper mentions several software components and models used (e.g., AdamW optimizer [32], DeiT-S [39], DINOv2 [35], CLIP [37], OpenCLIP [19]), but it does not specify concrete version numbers for these software packages or any underlying frameworks/libraries (e.g., PyTorch, TensorFlow).
Experiment Setup	Yes	In our experiments, we utilize the AdamW optimizer [32], and a batch size of 64. We incorporate a learning rate warm-up strategy and manually decrease the learning rate by a factor of 0.1 when the validation performance plateaus. For lightweight classifier in LOFF and LOFF-TA, we implement modifications to the DeiT-S architecture [39]. We remove the patchifier from the model's stem and introduce a linear projection layer followed by a normalization layer as detailed in 3.
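
The learning rate policy quoted under Experiment Setup (warm-up, then a 0.1 cut when validation performance plateaus) can be illustrated with a small framework-free sketch. This is not the authors' code. The paper describes the decrease as manual, so the automatic patience rule here is a stand-in. The class name `WarmupPlateauLR`, the linear warm-up shape, and the values of `base_lr`, `warmup_steps`, and `patience` are all assumptions, since none of them is stated in the summary above. Only the 0.1 factor comes from the quoted setup.

```python
from dataclasses import dataclass


@dataclass
class WarmupPlateauLR:
    """Linear warm-up, then multiply the rate by `factor` when validation stalls."""

    base_lr: float = 1e-3        # hypothetical value, not given in the summary
    warmup_steps: int = 500      # hypothetical value
    patience: int = 5            # hypothetical: evaluations allowed without improvement
    factor: float = 0.1          # decay factor quoted in the experiment setup
    scale: float = 1.0           # cumulative decay applied so far
    best: float = float("-inf")  # best validation score seen so far
    stale: int = 0               # evaluations since the last improvement

    def lr_at(self, step: int) -> float:
        """Return the learning rate for a 0-indexed optimizer step."""
        warm = min(1.0, (step + 1) / max(1, self.warmup_steps))
        return self.base_lr * warm * self.scale

    def report_validation(self, score: float) -> None:
        """Record a validation score (higher is better); decay on a plateau."""
        if score > self.best:
            self.best = score
            self.stale = 0
            return
        self.stale += 1
        if self.stale >= self.patience:
            self.scale *= self.factor
            self.stale = 0


if __name__ == "__main__":
    schedule = WarmupPlateauLR(base_lr=1e-3, warmup_steps=4, patience=2)
    for step in range(6):
        print(step, schedule.lr_at(step))
```

In a training loop, `lr_at(step)` would be written into the optimizer before each update and `report_validation` called after each evaluation pass. The optimizer (AdamW) and the batch size of 64 are outside the scope of this sketch.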