Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

ALBAR: Adversarial Learning approach to mitigate Biases in Action Recognition

Authors: Joseph Fioresi, Ishan Rajendrakumar Dave, Mubarak Shah

ICLR 2025

Reproducibility Variables

Research Type: Experimental
LLM Response: We evaluate our method on established background and foreground bias protocols, setting a new state-of-the-art and strongly improving combined debiasing performance by over 12% absolute on HMDB51. Furthermore, we identify an issue of background leakage in the existing UCF101 protocol for bias evaluation, which provides a shortcut to predict actions and does not provide an accurate measure of the debiasing capability of a model. We address this issue by proposing more fine-grained segmentation boundaries for the actor, where our method also outperforms existing approaches.

Researcher Affiliation: Academia
LLM Response: Joseph Fioresi, Ishan Rajendrakumar Dave, Mubarak Shah; Center for Research in Computer Vision, University of Central Florida, Orlando, USA; EMAIL, EMAIL

Pseudocode: No
LLM Response: The paper describes the methodology using prose and mathematical equations but does not include any distinct pseudocode or algorithm blocks.

Open Source Code: No
LLM Response: Project Page: https://joefioresi718.github.io/ALBAR_webpage/. This is a project page, not a direct code repository. The paper does not explicitly state that the source code for the described methodology is released, nor does it provide a direct link to a code repository.

Open Datasets: Yes
LLM Response: We evaluate our method on established background and foreground bias protocols, setting a new state-of-the-art and strongly improving combined debiasing performance by over 12% absolute on HMDB51. ... SCUBA and SCUFO Li et al. (2023) are background and foreground bias evaluation benchmarks for action recognition based on common benchmarks Kinetics400 Carreira & Zisserman (2017), UCF101 Soomro et al. (2012), and HMDB51 Kuehne et al. (2011).

Dataset Splits: Yes
LLM Response: UCF101 Soomro et al. (2012) ... has three train/test splits available. Following Li et al. (2023), we utilize only the first (split 1) train/test split for all training and evaluation in this work. ... HMDB51 Kuehne et al. (2011) ... has three potential train/test splits... we only use the first (split 1) train/test split for all training and evaluation in this work. ... Kinetics400 Carreira & Zisserman (2017) ... It has a single dedicated train/val/test split. In this work, we train on the train split and evaluate IID on the test split. ... A standard validation set does not exist for HMDB51 and UCF101. We randomly sample 20% of the respective training sets to use for validation...
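The validation protocol quoted above (randomly holding out 20% of the training set, since HMDB51 and UCF101 ship no standard validation split) can be sketched as follows. This is a minimal illustration; the function name, seed handling, and data representation are assumptions, not taken from the paper's code.

```python
import random

def split_train_val(samples, val_frac=0.2, seed=0):
    """Randomly hold out a fraction of the training samples for validation,
    as described for HMDB51/UCF101, which lack a standard validation split."""
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    indices = list(range(len(samples)))
    rng.shuffle(indices)
    n_val = int(len(samples) * val_frac)
    val = [samples[i] for i in indices[:n_val]]
    train = [samples[i] for i in indices[n_val:]]
    return train, val
```

Fixing the seed matters here: without it, each run would train and validate on a different partition, undermining the comparability the reproducibility check is probing for.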
Hardware Specification: Yes
LLM Response: All experiments are performed on a local computing cluster with access to V100 and A100 GPUs of various memory configurations up to 80GB.

Software Dependencies: No
LLM Response: The PyTorch Paszke et al. (2019) library is utilized for all experiments. This mentions PyTorch but does not specify a version number.

Experiment Setup: Yes
LLM Response: For all experiments, we use a clip resolution of 224 × 224. We follow Li et al. (2023) and use Kinetics400 Carreira & Zisserman (2017) pretrained Swin-T Liu et al. (2022) with 32 frame clips at a skip rate of 2. We adopt the same common augmentations used in Li et al. (2023): random resized cropping and random horizontal flipping. Our chosen optimizer is AdamW Kingma & Ba (2014); Loshchilov & Hutter (2017) with default parameters β1 = 0.9, β2 = 0.999, and weight decay of 0.01. We follow the linear scaling rule Goyal et al. (2017) with a base learning rate of 1e-4 corresponding to a batch size of 64. For training, we utilize a linear warmup of 5 epochs and a cosine learning rate scheduler.
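The learning-rate recipe quoted above (linear scaling rule with base LR 1e-4 at batch size 64, 5-epoch linear warmup, cosine decay) can be sketched in a few lines. This is an assumed reconstruction from the quoted hyperparameters, not the paper's released code; the resulting schedule would feed an AdamW optimizer with β1 = 0.9, β2 = 0.999, and weight decay 0.01.

```python
import math

def scaled_lr(batch_size, base_lr=1e-4, base_batch=64):
    """Linear scaling rule (Goyal et al., 2017): the peak learning rate
    grows proportionally with batch size relative to the reference of 64."""
    return base_lr * batch_size / base_batch

def lr_at_epoch(epoch, total_epochs, peak_lr, warmup_epochs=5):
    """Per-epoch LR: linear warmup over the first 5 epochs, then cosine
    decay from the peak down to zero over the remaining epochs."""
    if epoch < warmup_epochs:
        # ramp linearly so that the last warmup epoch reaches peak_lr
        return peak_lr * (epoch + 1) / warmup_epochs
    progress = (epoch - warmup_epochs) / (total_epochs - warmup_epochs)
    return 0.5 * peak_lr * (1.0 + math.cos(math.pi * progress))
```

For example, doubling the batch size to 128 would double the peak learning rate to 2e-4 under the scaling rule, and the schedule is continuous at the warmup/cosine boundary (both sides evaluate to the peak).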