Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Enhancing Compositional Generalization via Compositional Feature Alignment
Authors: Haoxiang Wang, Haozhe Si, Huajie Shao, Han Zhao
TMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We further conduct extensive experiments on CG-Bench for CLIP and DINOv2, two powerful pretrained vision foundation models. Experiment results show that CFA outperforms common finetuning techniques in compositional generalization, corroborating CFA's efficacy in compositional feature learning. The code is released at https://github.com/Haoxiang-Wang/Compositional-Feature-Alignment. |
| Researcher Affiliation | Academia | Haoxiang Wang EMAIL Department of Electrical and Computer Engineering University of Illinois Urbana-Champaign Haozhe Si EMAIL Department of Electrical and Computer Engineering University of Illinois Urbana-Champaign Huajie Shao EMAIL Department of Computer Science William and Mary Han Zhao EMAIL Department of Computer Science University of Illinois Urbana-Champaign |
| Pseudocode | No | The paper describes the method and its stages using textual descriptions and a diagram (Figure 3), but it does not include a clearly labeled pseudocode block or algorithm section. |
| Open Source Code | Yes | The code is released at https://github.com/Haoxiang-Wang/Compositional-Feature-Alignment. |
| Open Datasets | Yes | We create CG-Bench, a compositional generalization benchmark built on four datasets previously designed for DG research: Office-Home (Venkateswara et al., 2017), DomainNet (Peng et al., 2019), and iWildCam (Beery et al., 2020) & FMoW (Christie et al., 2018) from the WILDS benchmark (Koh et al., 2021). |
| Dataset Splits | Yes | We randomly divide DomainNet into training and evaluation sets, with an 80:20 split. A CLIP model is fully fine-tuned on this training data, and evaluated on validation data from all domain-class combinations. ... The ID data is then further segregated into a training set and an ID validation set at a 9:1 ratio. Meanwhile, the OOD data is divided between OOD validation and test sets. |
| Hardware Specification | Yes | The experiments described in this paper are executed on NVIDIA RTX A6000 GPUs with 48GB memory, utilizing a total of 12 GPUs. |
| Software Dependencies | No | The paper mentions using specific models like CLIP (Radford et al., 2021) and DINOv2 (Oquab et al., 2023), and an optimizer like AdamW (Loshchilov & Hutter, 2017). However, it does not specify version numbers for these or for broader software dependencies like Python, PyTorch, or CUDA. |
| Experiment Setup | Yes | We present the hyperparameter settings for our CFA models in Table 4. The parameters for each stage are chosen based on model performances on the OOD validation set. Note that λ is the domain loss coefficient in (3), and λ_ortho is the coefficient for the orthogonality regularization loss ‖W₁ᵀW₂‖²_F that we use to ensure orthogonality of heads in Stage 1. The hyperparameters in Stage 2 are also used for the two baseline algorithms, full fine-tuning (FT) and LP-FT. ... Table 4 (hyperparameters per model and dataset; Stage 1 = linear probing, Stage 2 = fine-tuning): CLIP — Office-Home: Stage 1 λ=1, λ_ortho=100, 200 epochs; Stage 2: 3 epochs, lr 1e-5. DomainNet: Stage 1 λ=1, λ_ortho=10000, 200 epochs; Stage 2: 3 epochs, lr 1e-5. iWildCam: Stage 1 λ=10, λ_ortho=10, 200 epochs; Stage 2: 5 epochs, lr 1e-5. FMoW: Stage 1 λ=10, λ_ortho=100, 200 epochs; Stage 2: 3 epochs, lr 1e-5. DINOv2 — Office-Home: Stage 1 λ=1, λ_ortho=100, 200 epochs; Stage 2: 3 epochs, lr 5e-5. DomainNet: Stage 1 λ=1000, λ_ortho=10, 200 epochs; Stage 2: 10 epochs, lr 5e-5. iWildCam: Stage 1 λ=1, λ_ortho=100, 200 epochs; Stage 2: 5 epochs, lr 1e-5. FMoW: Stage 1 λ=100, λ_ortho=1000, 200 epochs; Stage 2: 4 epochs, lr 5e-5. |
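The Experiment Setup row quotes a Stage 1 penalty of the form λ_ortho · ‖W₁ᵀW₂‖²_F, which pushes the two linear heads toward orthogonality. A minimal NumPy sketch of that penalty follows; the function name, head shapes, and training context are illustrative assumptions, not the authors' released code:

```python
import numpy as np

def ortho_penalty(W1: np.ndarray, W2: np.ndarray) -> float:
    """Squared Frobenius norm ||W1^T W2||_F^2.

    Equals zero exactly when every column of W1 is orthogonal
    to every column of W2.
    """
    return float(np.sum((W1.T @ W2) ** 2))

# Toy example: a class head (16x4) and a domain head (16x3)
# acting on a shared 16-dimensional feature space.
rng = np.random.default_rng(0)
W1 = rng.standard_normal((16, 4))
W2 = rng.standard_normal((16, 3))

lam_ortho = 100.0  # λ_ortho, cf. the Table 4 values quoted above
reg_loss = lam_ortho * ortho_penalty(W1, W2)
```

In Stage 1 training, a term like `reg_loss` would presumably be added to the linear-probing objective alongside the class and domain losses; the released repository may implement this differently.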