DualCoOp: Fast Adaptation to Multi-Label Recognition with Limited Annotations

Authors: Ximeng Sun, Ping Hu, Kate Saenko

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on standard multi-label recognition benchmarks across two challenging low-label settings demonstrate the advantages of our approach over state-of-the-art methods.
Researcher Affiliation | Collaboration | 1Boston University, 2MIT-IBM Watson AI Lab, IBM Research; {sunxm, pinghu, saenko}@bu.edu
Pseudocode | No | The paper describes its methods using text and mathematical equations but does not include any pseudocode or algorithm blocks.
Open Source Code | Yes | Project page: https://cs-people.bu.edu/sunxm/DualCoOp/project.html [...] Please check our project page included in the abstract and https://github.com/sunxm2357/DualCoOp for our code.
Open Datasets | Yes | Datasets. We conduct experiments on MS-COCO [34], VOC2007 [16] and BigEarth [5] to evaluate multi-label recognition with partial labels.
Dataset Splits | Yes | MS-COCO [34] contains 80 common object categories and we use the official train2014 (82K images) and val2014 (40K images) splits for training and test.
Hardware Specification | Yes | Training is done with one RTX A6000. [...] We compare the computational cost between DualCoOp and SARB [46] in terms of training/testing latency and memory (see Table 3) using the same device (one Nvidia A100 GPU).
Software Dependencies | No | The paper mentions models and frameworks (e.g., ResNet-101, CLIP, Transformer, SGD optimizer, ASL loss) but does not provide specific version numbers for any software libraries or dependencies.
Experiment Setup | Yes | For each class/label, we learn two independent context vectors with 16 context tokens (N = 16) following [71], which is the only learnable part of DualCoOp. We use the SGD optimizer with an initial learning rate of 0.002, which is decayed by the cosine annealing rule. We train context vectors for 50 epochs with a batch size of 32/8/32 for MS-COCO/VOC2007/BigEarth, respectively. For ASL loss, we choose γ+ = 1, γ− = 2 and c = 0.05 via validation.
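Since the paper reports these hyperparameters but no code for the training objective, the setup above can be sketched in PyTorch. This is a minimal illustration, assuming the standard asymmetric loss (ASL) formulation with the paper's reported values (γ+ = 1, γ− = 2, clip c = 0.05); the class name `AsymmetricLoss` and the parameter `prompt_ctx` are hypothetical stand-ins, not identifiers from the released code.

```python
import torch
import torch.nn as nn

class AsymmetricLoss(nn.Module):
    """Sketch of the ASL objective with the paper's reported settings:
    gamma_pos=1, gamma_neg=2, probability clip c=0.05."""

    def __init__(self, gamma_pos=1.0, gamma_neg=2.0, clip=0.05, eps=1e-8):
        super().__init__()
        self.gamma_pos = gamma_pos
        self.gamma_neg = gamma_neg
        self.clip = clip
        self.eps = eps

    def forward(self, logits, targets):
        # Per-label probabilities for a multi-label problem.
        p = torch.sigmoid(logits)
        p_neg = 1.0 - p
        # Probability shifting: hard-threshold very easy negatives.
        if self.clip > 0:
            p_neg = (p_neg + self.clip).clamp(max=1.0)
        # Asymmetric focusing: down-weight easy positives/negatives
        # with separate exponents.
        loss_pos = targets * torch.log(p.clamp(min=self.eps)) \
            * (1.0 - p) ** self.gamma_pos
        loss_neg = (1.0 - targets) * torch.log(p_neg.clamp(min=self.eps)) \
            * (1.0 - p_neg) ** self.gamma_neg
        return -(loss_pos + loss_neg).mean()

# Training-schedule sketch: the context vectors are the only learnable
# parameters (here a dummy tensor shaped as two prompts of N=16 tokens),
# optimized by SGD at lr=0.002 with cosine annealing over 50 epochs.
prompt_ctx = nn.Parameter(torch.randn(2, 16, 512))
optimizer = torch.optim.SGD([prompt_ctx], lr=0.002)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=50)
```

In a training loop, `scheduler.step()` would be called once per epoch so the learning rate follows the cosine decay from 0.002 toward zero across the 50 epochs.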