OTTER: Effortless Label Distribution Adaptation of Zero-shot Models
Authors: Changho Shin, Jitian Zhao, Sonia Cromp, Harit Vishwakarma, Frederic Sala
NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, we validate our method in a wide array of zero-shot image and text classification tasks, improving accuracy by 4.8% and 15.9% on average, and beating baselines like prior matching often by significant margins in 17 out of 21 datasets. |
| Researcher Affiliation | Academia | Department of Computer Sciences University of Wisconsin-Madison {cshin23, jzhao326, cromp, hvishwakarma, fsala}@wisc.edu |
| Pseudocode | Yes | Algorithm 1 OTTER. 1: Input: $X = \{x_1, \ldots, x_n\}$, label distribution specification $(p_1, \ldots, p_K)$, cost matrix $C \in \mathbb{R}^{n \times K}$. 2: Define input marginal $\mu = \frac{1}{n}\mathbf{1}$, prediction marginal $\nu = (p_1, \ldots, p_K)$. 3: Run optimal transport and obtain transport plan $\pi$ s.t. $\pi = \arg\min_{\gamma \in \Pi(\mu,\nu)} \langle \gamma, C \rangle$. 4: Get modified classification outputs $\hat{y}_i = \arg\max_{j \in [K]} \pi_{i,j}$. Return $\{\hat{y}_i\}_{i \in [n]}$. (An illustrative implementation sketch follows the table.) |
| Open Source Code | Yes | Our code is available at https://github.com/SprocketLab/OTTER. |
| Open Datasets | Yes | We used 17 image classification datasets and 4 text classification datasets. ... CIFAR10, CIFAR100 [33], Caltech101 [22], Caltech256 [25], Food101 [8], STL10 [16], SUN397 [67], Flower102 [42], EuroSAT [27], Oxford-IIIT Pet [44], Stanford Cars [32], DTD [14], CUB [61], ImageNet [18], ImageNet-R [29], and ImageNet-Sketch [63]. Zero-shot text classification datasets: We use Amazon [41], Gender [20], Civil Comments [7], and HateXplain [39]. |
| Dataset Splits | Yes | We selected hyperparameters through grid search, by evaluating their performance on a validation set, consisting of 10 labeled examples per class. |
| Hardware Specification | Yes | Measurements were taken using a machine equipped with an Intel Core i7-11700K @ 3.60GHz processor, 64GB RAM, and NVIDIA GPU RTX-4090. |
| Software Dependencies | No | The paper mentions using CLIP [49] and BERT [19] models, but does not provide specific version numbers for these or any other software libraries, frameworks, or programming languages used in the experiments. |
| Experiment Setup | Yes | We selected hyperparameters through grid search, by evaluating their performance on a validation set, consisting of 10 labeled examples per class. ... Temperature: [1e-3, 1e-4, 1e-5, 1e-6, 1e-7] Learning rate: [1e-3, 1e-4, 1e-5, 1e-6, 1e-7] ... For zero-shot image classification, we employ CLIP [49] models. We used an 'a photo of a [CLASS]' prompt. Scores are computed by $s_\theta(x_i, j) = \frac{\exp(\cos(f(x_i), g(y_j))/\tau)}{\sum_{j'=1}^{K} \exp(\cos(f(x_i), g(y_{j'}))/\tau)}$ for image $x_i$ and label $j$, given the image encoder $f$ and the text encoder $g$. The cost matrix is constructed as $C = [C_{ij}]_{i \in [n], j \in [K]}$, where $C_{ij} = -\log s_\theta(x_i, j)$. We run Algorithm 1 with the true class balance of the test dataset. (A code sketch of this pipeline follows the table.) |
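The score and cost-matrix construction quoted in the Experiment Setup row can be made concrete with a short sketch. The helper below is illustrative only (the names `clip_scores_and_cost`, `image_embs`, and `text_embs` are assumptions, not taken from the authors' repository); it assumes CLIP embeddings have already been computed and forms the per-image softmax scores over classes at temperature tau, plus the negative-log-score cost matrix.

```python
import numpy as np

def clip_scores_and_cost(image_embs, text_embs, tau=0.01):
    """Softmax-over-classes CLIP scores and the corresponding cost matrix.

    image_embs: (n, d) image embeddings from the CLIP image encoder f
    text_embs:  (K, d) prompt embeddings from the CLIP text encoder g
    tau:        softmax temperature (illustrative default, not the paper's)
    """
    # Cosine similarities cos(f(x_i), g(y_j)) via normalized dot products.
    img = image_embs / np.linalg.norm(image_embs, axis=1, keepdims=True)
    txt = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    sims = img @ txt.T                              # shape (n, K)

    # s_theta(x_i, j): softmax over the K classes at temperature tau.
    logits = sims / tau
    logits -= logits.max(axis=1, keepdims=True)     # numerical stability
    scores = np.exp(logits)
    scores /= scores.sum(axis=1, keepdims=True)

    # Cost matrix C_ij = -log s_theta(x_i, j).
    cost = -np.log(scores + 1e-12)
    return scores, cost
```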
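Algorithm 1 itself then reduces to a single linear optimal transport solve followed by a row-wise argmax. Below is a minimal sketch using the POT library's exact solver `ot.emd`; the function name `otter_predict` and its interface are assumptions for illustration, not the authors' released implementation.

```python
import numpy as np
import ot  # POT: Python Optimal Transport (pip install pot)

def otter_predict(cost, class_balance):
    """OTTER-style relabeling of zero-shot outputs via optimal transport.

    cost:          (n, K) cost matrix, e.g. -log of zero-shot scores
    class_balance: length-K label distribution specification (p_1, ..., p_K)
    """
    n, K = cost.shape
    mu = np.full(n, 1.0 / n)                    # input marginal: uniform over examples
    nu = np.asarray(class_balance, dtype=float)
    nu = nu / nu.sum()                          # prediction marginal: target label distribution

    # Transport plan pi = argmin over Pi(mu, nu) of <gamma, C>.
    pi = ot.emd(mu, nu, cost)

    # Modified predictions: y_hat_i = argmax_j pi_{i,j}.
    return pi.argmax(axis=1)
```

Chaining the two sketches, `otter_predict(clip_scores_and_cost(image_embs, text_embs)[1], class_balance)` walks through the pipeline described above: zero-shot scores, cost matrix, transport plan, and rebalanced predictions.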