START: A Generalized State Space Model with Saliency-Driven Token-Aware Transformation
Authors: Jintao Guo, Lei Qi, Yinghuan Shi, Yang Gao
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on five benchmarks demonstrate that START outperforms existing SOTA DG methods with efficient linear complexity. |
| Researcher Affiliation | Academia | 1 Nanjing University, 2 Southeast University. guojintao@smail.nju.edu.cn, qilei@seu.edu.cn, {syh, gaoy}@nju.edu.cn |
| Pseudocode | No | The paper presents mathematical equations and describes procedures in detail, such as in Sections 3.1 and 3.3, but does not provide a formal pseudocode block or algorithm labeled as such. |
| Open Source Code | Yes | Our code is available at https://github.com/lingeringlight/START. |
| Open Datasets | Yes | We perform an extensive evaluation on five DG datasets: PACS [24] comprises 9,991 images of 7 classes from 4 domains: Photo, Art Painting, Cartoon, and Sketch. OfficeHome [79] includes 15,588 images of 65 classes from four diverse domains: Artistic, Clipart, Product, and Real-World, exhibiting a large domain gap. VLCS [80] contains 10,729 images of 5 categories from 4 domains: Pascal, LabelMe, Caltech, and Sun. TerraIncognita [81] comprises photographs of wild animals taken by 4 camera-trap domains, with 10 classes and a total of 24,788 images. DomainNet [5] is large-scale with 586,575 images, having 345 classes from 6 domains, i.e., Clipart, Infograph, Painting, Quickdraw, Real, and Sketch. |
| Dataset Splits | Yes | We apply the leave-one-domain-out protocol for all benchmarks, where one domain is used for testing and the remaining domains are employed for training. (See the first sketch after this table for an illustration.) |
| Hardware Specification | Yes | All the experiments are run on 4 NVIDIA Tesla V100 GPUs. |
| Software Dependencies | No | The paper mentions using standard training components (e.g., the AdamW optimizer) but does not specify software dependencies or library versions. |
| Experiment Setup | Yes | We train the model for 50 epochs using the AdamW optimizer and a cosine decay schedule, with a batch size of 64, an initial learning rate of 5e-4, and a momentum of 0.9. For all experiments, we set the ratio P_token of augmented tokens to 0.75. (See the second sketch after this table for how these settings map onto PyTorch.) |
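The leave-one-domain-out protocol quoted in the Dataset Splits row is simple enough to pin down in code. Below is a minimal Python sketch; the `leave_one_domain_out` helper and the PACS domain list are written here for illustration and are not taken from the authors' repository.

```python
from typing import List, Tuple

def leave_one_domain_out(domains: List[str], test_domain: str) -> Tuple[List[str], List[str]]:
    """Split domain names per the leave-one-domain-out protocol:
    one domain is held out for testing, all others are used for training."""
    assert test_domain in domains, f"unknown domain: {test_domain}"
    train_domains = [d for d in domains if d != test_domain]
    return train_domains, [test_domain]

# Example with the four PACS domains, holding out Sketch:
train, test = leave_one_domain_out(
    ["Photo", "Art Painting", "Cartoon", "Sketch"], "Sketch"
)
print(train)  # ['Photo', 'Art Painting', 'Cartoon']
print(test)   # ['Sketch']
```

Repeating this over every choice of held-out domain yields the per-domain results that DG benchmarks typically report.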
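The Experiment Setup row likewise translates into standard PyTorch calls. The sketch below shows one plausible mapping, assuming the reported "momentum of 0.9" refers to AdamW's beta1 and that the cosine schedule is stepped once per epoch; the `torch.nn.Linear` model is a stand-in for the actual START network, which lives in the linked repository.

```python
import torch

# Stand-in model; the real START architecture is in the authors' repo
# (https://github.com/lingeringlight/START), not reproduced here.
model = torch.nn.Linear(512, 7)

EPOCHS = 50      # reported: 50 training epochs
BATCH_SIZE = 64  # reported batch size (used when building the DataLoader)

# AdamW with the reported initial learning rate of 5e-4; the reported
# "momentum of 0.9" is assumed to map to beta1.
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-4, betas=(0.9, 0.999))

# Cosine decay of the learning rate over the whole run.
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=EPOCHS)

for epoch in range(EPOCHS):
    # ... forward pass, loss, and loss.backward() over batches of 64 go here ...
    optimizer.step()   # placeholder; a real loop steps once per batch
    scheduler.step()   # epoch-level learning-rate decay
```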