Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Preventing Shortcuts in Adapter Training via Providing the Shortcuts

Authors: Anujraaj Goyal, Guocheng Qian, Huseyin Coskun, Aarush Gupta, Himmy Tam, Daniil Ostashev, Ju Hu, Dhritiman Sagar, Sergey Tulyakov, Kfir Aberman, Kuan-Chieh Wang

NeurIPS 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In this section, we show a number of experiments for different instances of Shortcut-Rerouted Adapter Training. First, we show results for Shortcut Rerouting in the setting of face adapters using both LoRA and ControlNet as Shortcut Rerouting mechanisms (§4.2). The resulting adapters demonstrate improved prior preservation, head pose control and expression control. Then, we show results for Shortcut Rerouting in the setting of body adapters (§4.3). Table 1: Quantitative comparison for face adapters. Our Shortcut-Rerouted methods (SR-LoRA and SR-ControlNet) outperform prior baselines in head pose control and prior preservation, while maintaining competitive identity fidelity. All models are based on Flux Dev.
Researcher Affiliation	Industry	Anujraaj Argo Goyal, Guocheng Gordon Qian, Huseyin Coskun, Aarush Gupta, Himmy Tam, Daniil Ostashev, Ju Hu, Dhritiman Sagar, Sergey Tulyakov, Kfir Aberman, Kuan-Chieh Jackson Wang; Snap Inc.; https://snap-research.github.io/shortcut-rerouting/
Pseudocode	No	The paper describes methods using mathematical formalisms in Section 3.1 but does not present any clearly labeled pseudocode or algorithm blocks.
Open Source Code	No	At the time of submission, we are unable to release the code or dataset due to legal and compliance constraints within our organization. We recognize the value of open access and reproducibility and are actively exploring the possibility of releasing portions of the code or evaluation tools pending internal review.
Open Datasets	No	We curate an internal large-scale dataset of a few million high-quality human images, filtered to retain only single-subject photos and remove low-quality, NSFW, or watermarked content.
Dataset Splits	No	The paper mentions curating an internal dataset and preparing inputs by extracting and aligning face/body crops, but it does not specify explicit training, validation, or test dataset splits (e.g., percentages, sample counts, or predefined splits with citations) for reproducibility.
Hardware Specification	Yes	Training is performed on 8 A100 GPUs (80GB each) using AdamW [Loshchilov and Hutter, 2019] with a learning rate of 5e-5 and a global batch size of 32 for 250K iterations.
Software Dependencies	No	All methods are implemented in PyTorch [Paszke et al., 2019] using the Hugging Face Diffusers [von Platen et al., 2022] framework, based on the FLUX.1 [Dev] [Labs, 2024] model with a DiT [Peebles and Xie, 2023] backbone and Conditional Flow Matching objective [Esser et al., 2024]. The paper lists software libraries and frameworks used but does not provide specific version numbers for them (e.g., PyTorch 1.9, Diffusers 0.x.x).
Experiment Setup	Yes	Training is performed on 8 A100 GPUs (80GB each) using AdamW [Loshchilov and Hutter, 2019] with a learning rate of 5e-5 and a global batch size of 32 for 250K iterations. Inference is standardized across all methods with IP scale 1.0, CFG 3.5, 28 steps, and 1024×1024 resolution.
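
The reported training and inference settings can be collected into a small configuration sketch, which also makes two derived quantities explicit. This is an illustrative sketch only, not the authors' code (which is not released): the class names, field names, and helper functions are hypothetical, and the per-GPU batch size assumes no gradient accumulation, which the paper does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    # Values as reported in the Experiment Setup row; names are hypothetical.
    num_gpus: int = 8               # A100, 80GB each
    optimizer: str = "AdamW"
    learning_rate: float = 5e-5
    global_batch_size: int = 32
    iterations: int = 250_000


@dataclass(frozen=True)
class InferenceConfig:
    # Inference settings standardized across all compared methods.
    ip_scale: float = 1.0
    cfg_scale: float = 3.5
    num_steps: int = 28
    height: int = 1024
    width: int = 1024


def per_gpu_batch_size(cfg: TrainConfig, grad_accum_steps: int = 1) -> int:
    """Batch size per GPU per step. grad_accum_steps=1 is an assumption."""
    per_gpu, remainder = divmod(cfg.global_batch_size,
                                cfg.num_gpus * grad_accum_steps)
    if remainder:
        raise ValueError("global batch size is not evenly divisible")
    return per_gpu


def samples_seen(cfg: TrainConfig) -> int:
    """Total training samples drawn over the full run."""
    return cfg.global_batch_size * cfg.iterations


if __name__ == "__main__":
    train = TrainConfig()
    print(per_gpu_batch_size(train))  # 4
    print(samples_seen(train))        # 8000000
```

Under these assumptions, each GPU processes 4 images per step and the run draws 8 million samples in total. Software versions remain unspecified in the paper, so this sketch pins none.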