Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Proxy Denoising for Source-Free Domain Adaptation

Authors: Song Tang, Wenxin Su, Yan Gan, Mao Ye, Jianwei Zhang, Xiatian Zhu

ICLR 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	4 Experiments. Datasets: We evaluate four widely used domain adaptation benchmarks. Among them, Office31 (Saenko et al., 2010) and Office-Home (Venkateswara et al., 2017) are small-scale and medium-scale datasets, respectively, whilst VisDA (Peng et al., 2017) and DomainNet-126 (Saito et al., 2019) are both challenging large-scale datasets. Their details are provided in Appendix C. Settings: We consider a variety of SFDA settings: (1) closed-set, (2) partial-set (initialized in SHOT (Liang et al., 2020)), (3) open-set (initialized in SHOT (Liang et al., 2020)), (4) generalized SFDA (Yang et al., 2021b), (5) multi-target (SF-MTDA, detailed in (Kumar et al., 2023)), (6) multi-source (SF-MSDA, detailed in (Ahmed et al., 2021)), and (7) test-time adaptation (TTA) (Wang et al., 2021a). More details are given in Appendix D.
Researcher Affiliation	Collaboration	Song Tang [1,2,3], Wenxin Su [1], Yan Gan [4], Mao Ye [5,*], Jianwei Zhang [2] & Xiatian Zhu [6]. [1] University of Shanghai for Science and Technology, [2] Universität Hamburg, [3] ComOriginMat Inc., [4] Chongqing University, [5] University of Electronic Science and Technology of China, [6] University of Surrey
Pseudocode	Yes	Algorithm 1: Training of ProDe. Input: source model θs, ViL model θv, target dataset X_T, C prompts with context v, number of iterations M. Procedure: 1: Initialisation: set target model θt = θs, prompt context v = "a photo of a". 2: for m = 1:M do 3: Sample a batch X_T^b from X_T. 4: Forward updated prompts and X_T^b through θv. 5: Forward X_T^b through θt. 6: Conduct proxy denoising for the ViL predictions of X_T^b (Eq. (5)). 7: Update model θt and prompt context v by optimizing objective L_ProDe (Eq. (6)). 8: end for 9: return adapted target model θt.
Open Source Code	Yes	Our code and data are available at https://github.com/tntek/source-free-domain-adaptation.
Open Datasets	Yes	Datasets: We evaluate four widely used domain adaptation benchmarks. Among them, Office31 (Saenko et al., 2010) and Office-Home (Venkateswara et al., 2017) are small-scale and medium-scale datasets, respectively, whilst VisDA (Peng et al., 2017) and DomainNet-126 (Saito et al., 2019) are both challenging large-scale datasets.
Dataset Splits	Yes	The source dataset is divided into the training set and testing set in a 0.9:0.1 ratio.
Hardware Specification	No	All experiments are conducted with PyTorch on a single GPU of NVIDIA RTX.
Software Dependencies	No	All experiments are conducted with PyTorch on a single GPU of NVIDIA RTX. The paper mentions PyTorch but does not specify a version number.
Experiment Setup	Yes	Hyper-parameter setting. The ProDe model involves four parameters: The correction strength factor ω in Eq. (5) and two trade-off parameters α, β and γ in objective L_ProDe (Eq. (6)). On all four datasets, we set (ω, α, β) = (1, 1, 0.4). Parameter γ is sensitive to the dataset scale, also noted in the TPDS method (Tang et al., 2024a). In practice, the setting of γ = 1.0/1.0/0.1/0.5 is employed on Office-31, Office-Home, VisDA and DomainNet-126, respectively. Training setting. We chose a batch size of 64 and utilized the SGD optimizer with a momentum of 0.9 and 15 training epochs on all datasets.
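
For readers who want to check the reported setup against their own runs, the values quoted in the Dataset Splits and Experiment Setup rows can be collected in one place. The following is a minimal sketch, not the authors' code: the names ProDeSetup, GAMMA_BY_DATASET and split_source are hypothetical, and the shuffle seed is an assumption because the quoted text does not report one. The learning rate is left out because it does not appear in the quoted rows.

```python
import random
from dataclasses import dataclass

# Per-dataset gamma values as quoted in the Experiment Setup row.
# The dictionary name and key spellings are illustrative, not from the paper's code.
GAMMA_BY_DATASET = {
    "Office-31": 1.0,
    "Office-Home": 1.0,
    "VisDA": 0.1,
    "DomainNet-126": 0.5,
}


@dataclass(frozen=True)
class ProDeSetup:
    """Hypothetical container for the hyper-parameters quoted in the table."""

    dataset: str
    omega: float = 1.0      # correction strength factor (Eq. (5))
    alpha: float = 1.0      # trade-off parameter in the objective (Eq. (6))
    beta: float = 0.4       # trade-off parameter in the objective (Eq. (6))
    batch_size: int = 64
    momentum: float = 0.9   # SGD momentum
    epochs: int = 15

    @property
    def gamma(self) -> float:
        # gamma depends on the dataset scale, so it is looked up per dataset.
        return GAMMA_BY_DATASET[self.dataset]


def split_source(items, train_ratio=0.9, seed=0):
    """Shuffle and split source samples 0.9:0.1 into train and test lists.

    The seed is an assumption; the quoted text gives only the ratio.
    """
    items = list(items)
    random.Random(seed).shuffle(items)
    cut = round(len(items) * train_ratio)
    return items[:cut], items[cut:]


if __name__ == "__main__":
    setup = ProDeSetup("VisDA")
    train, test = split_source(range(1000))
    print(setup, setup.gamma, len(train), len(test))
```

Keeping γ in a lookup table rather than as a field default mirrors the row's statement that only γ varies across the four benchmarks.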