Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Learning to Generate Gradients for Test-Time Adaptation via Test-Time Training Layers
Authors: Qi Deng, Shuaicheng Niu, Ronghao Zhang, Yaofo Chen, Runhao Zeng, Jian Chen, Xiping Hu
AAAI 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Promising results on ImageNet-C/R/Sketch/A indicate that our method surpasses current state-of-the-art methods with fewer updates, less data, and significantly shorter adaptation times. Compared with a previous SOTA SAR, we achieve 7.4% accuracy improvement and 4.2× faster adaptation speed on ImageNet-C. [...] Extensive results indicate that our method surpasses existing SOTAs with fewer updates, fewer data, and significantly shorter adaptation times. [...] Experiments in Table 6 demonstrate that 128 images are sufficient for our method to achieve excellent performance. |
| Researcher Affiliation | Academia | 1 South China University of Technology, 2 Nanyang Technological University, 3 Artificial Intelligence Research Institute, Shenzhen MSU-BIT University; {dengqi.kei; niushuaicheng; zhangronghao16; runhaozeng.cs}@gmail.com; EMAIL; |
| Pseudocode | Yes | We summarize the overall pseudo-code of our method in Algorithm 1. Algorithm 1: The pre-training/TTA pipeline of MGTTA. |
| Open Source Code | Yes | Code https://github.com/keikeiqi/MGTTA |
| Open Datasets | Yes | We conduct experiments on four benchmark datasets, including 1) ImageNet-C (Hendrycks and Dietterich 2019), which contains corrupted images in 15 types across 4 main categories, each with 5 severity levels. [...] 2) ImageNet-R (Hendrycks et al. 2021a) contains various artistic renditions of 200 ImageNet classes. 3) ImageNet-Sketch (Wang et al. 2019) includes sketch-style images representing 1,000 ImageNet classes. 4) ImageNet-A (Hendrycks et al. 2021b) consists of natural adversarial examples. |
| Dataset Splits | Yes | For pre-training MGG, we randomly select 128 unlabeled samples from the ImageNet-C validation set. [...] In our main experiments, we randomly sample 128 images without labels from the held-out validation set of ImageNet-C as the training set of MGG, and then test the trained MGG on all ImageNet-C testing datasets and other ImageNet variants. [...] Total #batches is 782, with batch size 64. |
| Hardware Specification | Yes | Table 5: Wall-clock runtime for processing 50,000 images of ImageNet-C on an RTX 4090 GPU, and Acc. averaged over 15 corruptions. |
| Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers. |
| Experiment Setup | Yes | The learning rate is set to 1e-4 for θ and 1e-2 for ϕr. We update θ and ϕr for T=2,000 iterations with a batch size of 2. The GML hidden size is set to 8. During TTA, the batch size is 64, ϕr is fixed, and the learning rate for θ is set to 1e-3. |
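The Dataset Splits row describes drawing 128 unlabeled images from a held-out validation set to pre-train MGG. The sketch below illustrates that sampling step only; the function name and the use of index-level sampling are illustrative assumptions, not the authors' released code.

```python
import random

def sample_unlabeled_subset(indices, k=128, seed=0):
    """Draw k sample indices without replacement from a held-out
    validation set. Labels are never consulted, mirroring the paper's
    128 unlabeled ImageNet-C validation images used to train MGG.
    (Hypothetical helper; the paper does not specify a seed.)"""
    rng = random.Random(seed)
    return rng.sample(indices, k)

# e.g. a 50,000-image validation split, addressed by index
subset = sample_unlabeled_subset(list(range(50_000)), k=128)
```

In practice the selected indices would feed a dataset wrapper that returns only images, so the MGG pre-training loop cannot accidentally depend on labels.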
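The Experiment Setup row lists distinct hyperparameters for the MGG pre-training phase and the TTA phase. Collecting them in two config objects makes the split explicit; the class and field names here are illustrative assumptions (the quoted values are from the paper).

```python
from dataclasses import dataclass

@dataclass
class MGGPretrainConfig:
    """Hyperparameters quoted for pre-training MGG (names are ours)."""
    lr_theta: float = 1e-4      # learning rate for model parameters θ
    lr_phi_r: float = 1e-2      # learning rate for generator parameters ϕr
    iterations: int = 2000      # T update iterations
    batch_size: int = 2
    gml_hidden_size: int = 8

@dataclass
class TTAConfig:
    """Hyperparameters quoted for the test-time adaptation phase."""
    lr_theta: float = 1e-3      # θ is updated during TTA
    batch_size: int = 64
    phi_r_frozen: bool = True   # ϕr is fixed at test time

pretrain_cfg = MGGPretrainConfig()
tta_cfg = TTAConfig()
```

Keeping the two phases in separate objects also documents the key asymmetry: ϕr is trained only in pre-training and frozen during adaptation.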