START: A Generalized State Space Model with Saliency-Driven Token-Aware Transformation

Authors: Jintao Guo, Lei Qi, Yinghuan Shi, Yang Gao

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on five benchmarks demonstrate that START outperforms existing SOTA DG methods with efficient linear complexity.
Researcher Affiliation | Academia | 1 Nanjing University, 2 Southeast University; guojintao@smail.nju.edu.cn, qilei@seu.edu.cn, {syh, gaoy}@nju.edu.cn
Pseudocode | No | The paper presents mathematical equations and describes procedures in detail, such as in Sections 3.1 and 3.3, but does not provide a formal pseudocode block or a block labeled as an algorithm.
Open Source Code | Yes | Our code is available at https://github.com/lingeringlight/START.
Open Datasets | Yes | We perform an extensive evaluation on five DG datasets: PACS [24] comprises 9,991 images of 7 classes from 4 domains: Photo, Art Painting, Cartoon, and Sketch. OfficeHome [79] includes 15,588 images of 65 classes from four diverse domains: Artistic, Clipart, Product, and Real-World, exhibiting a large domain gap. VLCS [80] contains 10,729 images of 5 categories from 4 domains: Pascal, LabelMe, Caltech, and Sun. TerraIncognita [81] comprises photographs of wild animals taken by 4 camera-trap domains, with 10 classes and a total of 24,788 images. DomainNet [5] is large-scale with 586,575 images, having 345 classes from 6 domains, i.e., Clipart, Infograph, Painting, Quickdraw, Real, and Sketch.
Dataset Splits | Yes | We apply the leave-one-domain-out protocol for all benchmarks, where one domain is used for testing, and the remaining domains are employed for training.
Hardware Specification | Yes | All the experiments are run on 4 NVIDIA Tesla V100 GPUs.
Software Dependencies | No | The paper mentions using
Experiment Setup | Yes | We train the model for 50 epochs using the AdamW optimizer and a cosine decay schedule, with a batch size of 64, an initial learning rate of 5e-4, and a momentum of 0.9. For all experiments, we set the ratio P_token of augmented tokens to 0.75.
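
The leave-one-domain-out protocol and the cosine decay schedule described above can be illustrated with a minimal Python sketch. It is not taken from the START repository. The function names `leave_one_domain_out` and `cosine_lr` are hypothetical, and the schedule assumes no warmup and a final learning rate of zero, neither of which is stated in the summary.

```python
import math

# Domain names as listed in the dataset descriptions above.
DOMAINS = {
    "PACS": ["Photo", "Art Painting", "Cartoon", "Sketch"],
    "OfficeHome": ["Artistic", "Clipart", "Product", "Real-World"],
    "VLCS": ["Pascal", "LabelMe", "Caltech", "Sun"],
}


def leave_one_domain_out(domains):
    """Yield (train_domains, test_domain) pairs, holding out each domain once."""
    for i, test_domain in enumerate(domains):
        train_domains = domains[:i] + domains[i + 1:]
        yield train_domains, test_domain


def cosine_lr(epoch, total_epochs=50, base_lr=5e-4, min_lr=0.0):
    """Cosine decay from base_lr at epoch 0 to min_lr at total_epochs.

    Assumption: no warmup and min_lr = 0; the summary specifies neither.
    """
    progress = epoch / total_epochs
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    # One training run per held-out domain.
    for train, test in leave_one_domain_out(DOMAINS["PACS"]):
        print(f"train on {train}, test on {test}")
    # Learning rate at the start, midpoint, and end of training.
    for epoch in (0, 25, 50):
        print(epoch, cosine_lr(epoch))
```

Under this protocol a benchmark with four domains yields four separate training runs, and the reported accuracy is typically the average over the held-out domains. In a PyTorch setup the same schedule would likely be expressed with `torch.optim.AdamW` and `torch.optim.lr_scheduler.CosineAnnealingLR`, with the stated momentum of 0.9 presumably corresponding to the first beta coefficient; that mapping is an assumption here.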