Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Geometry-Aware Adaptation for Pretrained Models

Authors: Nicholas Roberts, Xintong Li, Dyah Adila, Sonia Cromp, Tzu-Heng Huang, Jitian Zhao, Frederic Sala

NeurIPS 2023 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirically, using easily-available external metrics, our proposed approach, LOKI, gains up to 29.7% relative improvement over SimCLR on ImageNet and scales to hundreds of thousands of classes. When no such metric is available, LOKI can use self-derived metrics from class embeddings and obtains a 10.5% improvement on pretrained zero-shot models such as CLIP.
Researcher Affiliation | Academia | Nicholas Roberts, Xintong Li, Dyah Adila, Sonia Cromp, Tzu-Heng Huang, Jitian Zhao, Frederic Sala, University of Wisconsin-Madison
Pseudocode | Yes | Algorithm 1: Locus cover for phylogenetic trees; Algorithm 2: Computing a pairwise decomposable locus; Algorithm 3: Computing a generic locus
Open Source Code | Yes | Code implementing all of our experiments is available here: https://github.com/SprocketLab/loki
Open Datasets | Yes | We evaluate the capability of LOKI to improve upon zero-shot models where all classes are observed. Setup: Our experiment compares the zero-shot prediction performance of CLIP [32] on CIFAR-100 [20] to CLIP logits used with LOKI... For ImageNet, we use the WordNet phylogenetic tree as the metric space [2]... For PubMed, we derive our metric from Euclidean distances between SimCSE class embeddings [11]. Finally, for LSHTC, we summarize the default graph by randomly selecting nodes and merging them with their neighbors until we obtain a graph with 10,000 supernodes representing sets of classes.
Dataset Splits | Yes | To construct our datasets, we randomly sample 50 images for each class from ImageNet as our training dataset, then use the validation dataset in ImageNet to evaluate LOKI's performance.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU or GPU models, or memory specifications) used for running the experiments.
Software Dependencies | No | The paper mentions software like 'CLIP frozen weights' and 'SimCLRv1' but does not provide specific version numbers for these or other key software dependencies (e.g., Python, PyTorch, TensorFlow, CUDA).
Experiment Setup | No | The paper mentions some general settings, such as 'no training and hyperparameters involved in experiments involving CLIP, except for the Softmax temperature in the calibration analysis' and using a '5-NN model' for LSHTC, but it lacks specific hyperparameter values (e.g., learning rate, batch size, number of epochs) or the detailed training configuration typically found in a reproducible experimental setup section.
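The LSHTC graph-summarization step quoted under Open Datasets (randomly selecting nodes and merging them with their neighbors until 10,000 supernodes remain) can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes an undirected graph stored as adjacency sets, assumes the graph is connected (so the loop always terminates), and the function name `summarize_graph` and fixed random seed are our own.

```python
import random

def summarize_graph(adj, target_supernodes, seed=0):
    """Randomly contract a node into one of its neighbors, repeating
    until only `target_supernodes` supernodes remain.

    `adj` maps node -> set of neighbor nodes (undirected).
    Returns (coarsened adjacency, mapping supernode -> set of the
    original classes it represents).
    """
    rng = random.Random(seed)
    adj = {u: set(vs) for u, vs in adj.items()}  # work on a copy
    members = {u: {u} for u in adj}              # classes behind each supernode
    while len(adj) > target_supernodes:
        u = rng.choice(sorted(adj))
        if not adj[u]:
            continue  # isolated node: nothing to merge with
        v = rng.choice(sorted(adj[u]))
        # Contract u into v: v inherits u's member classes and neighbors.
        members[v] |= members.pop(u)
        for w in adj.pop(u):
            adj[w].discard(u)
            if w != v:          # avoid creating a self-loop on v
                adj[v].add(w)
                adj[w].add(v)
    return adj, members
```

For example, coarsening a 10-node path graph to 3 supernodes yields 3 supernodes whose member sets partition the original 10 classes; with the real LSHTC hierarchy, `target_supernodes` would be 10,000 as described in the paper.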