Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Leveraging Endo- and Exo-Temporal Regularization for Black-box Video Domain Adaptation

Authors: Yuecong Xu, Jianfei Yang, Haozhi Cao, Min Wu, Xiaoli Li, Lihua Xie, Zhenghua Chen

TMLR 2024

Reproducibility Variables

Research Type: Experimental
LLM Response: Empirical results demonstrate the state-of-the-art performance of EXTERN across various cross-domain closed-set and partial-set action recognition benchmarks, which even surpasses most existing video domain adaptation methods with source data accessibility. Code will be available at https://xuyu0010.github.io/b2vda.html. In this section, we evaluate our proposed EXTERN across a variety of cross-domain action recognition benchmarks, covering a wide range of cross-domain scenarios. We demonstrate exceptional performances on all benchmarks. Moreover, thorough ablation studies and analysis of EXTERN are performed to further justify the design of EXTERN.

Researcher Affiliation: Academia
LLM Response: Yuecong Xu (EMAIL), Department of Electrical and Computer Engineering, National University of Singapore; Jianfei Yang (EMAIL), School of Mechanical and Aerospace Engineering, Nanyang Technological University, Singapore; Haozhi Cao (EMAIL), School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore; Min Wu, Xiaoli Li (EMAIL), Institute for Infocomm Research, Agency for Science, Technology and Research (A*STAR), Singapore; Lihua Xie (EMAIL), School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore (IEEE Fellow); Zhenghua Chen (EMAIL), Institute for Infocomm Research and Centre for Frontier AI Research, Agency for Science, Technology and Research (A*STAR), Singapore

Pseudocode: No
LLM Response: The paper describes the methodology using mathematical formulations and descriptive text, but it does not include a clearly labeled pseudocode or algorithm block.

Open Source Code: No
LLM Response: Code will be available at https://xuyu0010.github.io/b2vda.html.

Open Datasets: Yes
LLM Response: Datasets. We evaluate EXTERN on three benchmarks: UCF-HMDBfull (Chen et al., 2019a), Sports-DA (Xu et al., 2023) and Daily-DA (Xu et al., 2023). UCF-HMDBfull is one of the most common benchmarks for VUDA and is constructed from UCF101 (U101) (Soomro et al., 2012) and HMDB51 (H51) (Kuehne et al., 2011)... Sports-DA is a large-scale benchmark with three domains, built from UCF101, Sports-1M (S1M) (Karpathy et al., 2014), and Kinetics (K600). Daily-DA incorporates both normal and low-illumination videos with four domains, built from ARID (A11) (Xu et al., 2021b) (a low-illumination video dataset), HMDB51, Moments-in-Time (MIT) (Monfort et al., 2019), and Kinetics (Kay et al., 2017).

Dataset Splits: No
LLM Response: The paper mentions evaluating on different benchmarks and tasks (e.g., U-14 H-7, H-10 A-5), which implies certain data configurations, but it does not explicitly provide specific training, validation, or test dataset splits (e.g., percentages, sample counts, or references to predefined splits) in the main text.

Hardware Specification: No
LLM Response: The paper states: "The batch size is set to 32 input videos per GPU." However, it does not specify the model or type of GPU, CPU, or any other specific hardware component used for the experiments.

Software Dependencies: No
LLM Response: We implement our method with the PyTorch (Paszke et al., 2019) library.

Experiment Setup: Yes
LLM Response: Implementation. We implement our method with the PyTorch (Paszke et al., 2019) library... For the black-box source model, training lasts 100 epochs for tasks related to the Sports-DA and Mini Kinetics-UCF datasets, and 50 epochs for all other datasets. For the target model, training lasts 20 epochs for tasks related to the UCF-HMDBfull and UCF-HMDBpartial datasets, 30 epochs for tasks related to the Daily-DA and HMDB-ARIDpartial datasets, and 50 epochs for the Sports-DA and Mini Kinetics-UCF datasets. The stochastic gradient descent (SGD) algorithm (Bottou, 2010) is used for optimization, with the weight decay set to 0.0001 and the momentum set to 0.9. The batch size is set to 32 input videos per GPU. Hyper-parameters αv = 0.3 and βreg = 1.0 are empirically set and fixed. We follow the settings of DINE (Liang et al., 2022) by setting αt to 0.3 and c to 3.
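The quoted optimizer settings (SGD with momentum 0.9 and weight decay 0.0001) can be sketched as a single parameter-update step. This is a minimal illustration, not the authors' code: the learning rate `lr` is a placeholder since the excerpt does not state one, and the coupled weight-decay convention (decay added to the gradient before the momentum update) mirrors the common PyTorch `torch.optim.SGD` behavior.

```python
def sgd_momentum_step(w, g, v, lr, momentum=0.9, weight_decay=1e-4):
    """One SGD step with momentum and coupled weight decay.

    w: parameter values, g: gradients, v: momentum buffer (all lists
    of floats). Returns the updated parameters and buffer.
    """
    # Weight decay is folded into the gradient, then the momentum
    # buffer is updated: v <- momentum * v + (g + wd * w).
    new_v = [momentum * vi + gi + weight_decay * wi
             for wi, gi, vi in zip(w, g, v)]
    # Parameters move against the buffered direction: w <- w - lr * v.
    new_w = [wi - lr * nvi for wi, nvi in zip(w, new_v)]
    return new_w, new_v

# Example: one parameter, zero-initialized momentum buffer,
# hypothetical learning rate of 0.01.
w, v = [1.0], [0.0]
w, v = sgd_momentum_step(w, [0.5], v, lr=0.01)
```

After one step the buffer holds 0.5 + 0.0001 * 1.0 = 0.5001, and the parameter becomes 1.0 - 0.01 * 0.5001 ≈ 0.994999, which shows how the small weight-decay term nudges parameters toward zero on every update.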