Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Focus-Then-Reuse: Fast Adaptation in Visual Perturbation Environments

Authors: Jiahui Wang, Chao Chen, Jiacheng Xu, Zongzhang Zhang, Yang Yu

NeurIPS 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experimental results on challenging tasks based on DeepMind Control Suite and Franka Emika Robotics demonstrate that FTR enables rapid adaptation in visual perturbation environments and achieves state-of-the-art performance.
Researcher Affiliation	Academia	¹National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China; ²School of Artificial Intelligence, Nanjing University, Nanjing, China; ³Nanyang Technological University, Singapore. EMAIL, EMAIL, EMAIL
Pseudocode	Yes	Algorithm 1 Focus-Then-Reuse
Open Source Code	Yes	The source code is available at https://github.com/LAMDA-RL/FTR.
Open Datasets	Yes	In this section, we present the experimental results of FTR on 11 tasks, including 8 tasks of DeepMind Control Suite (DMC) [67] and 3 tasks of Franka Emika Robotics [68, 69]. ... To simulate real-world visual disturbances, we use five diverse videos from the DMC-Generalization Benchmark [17], as shown in Fig. 3.
Dataset Splits	No	Policies are trained using the DrQ-v2 algorithm. We perform three independent runs and choose the policy with the best performance as the original policy π_ori. ... Generalization method (SimGRL): Policies are trained in the source domain for 500k steps across three runs. ... Adaptation methods (FTR, PAD): ... adapted to each of the five target domains using three different seeds for 200k steps.
Hardware Specification	Yes	Most experiments are conducted on a server outfitted with 2 AMD EPYC 7542 32-Core Processor CPUs, 504GB of RAM, and 8 GPUs, each with a performance of over 35 TFLOPS, running Ubuntu 22.04.
Software Dependencies	No	The source code is available at https://github.com/LAMDA-RL/FTR. The code is modified from DMC-Generalization Benchmark [17] and FTD [10]. The PPO algorithm used in FTR is implemented based on https://iclr-blog-track.github.io/2022/03/25/ppo-implementation-details/. The DrQ-v2 algorithm is implemented based on https://github.com/facebookresearch/drqv2. ... Segment Anything Model 2 (SAM 2) serves as the default segmentation model and tracking model in FTR. ... The default VLM used in FTR is Qwen-VL-Max [33].
Experiment Setup	Yes	Table 3: Hyperparameters. Environments: frame size 168 × 168 (franka-push, franka-door), 84 × 84 (otherwise); frame stack 3; episode length 200 (franka-push, franka-door), 1000 (otherwise); action repeat 2 (finger-spin, pendulum-swingup), 4 (otherwise). DrQ-v2: train steps 5 × 10^5; replay buffer size 1 × 10^5; exploration steps 1 × 10^4; n-step returns 3; batch size 256; optimizer Adam; actor & critic learning rate 1 × 10^-4; discount factor 0.99; critic Q-function soft-update rate τ 0.01; exploration stddev. clip 0.3; exploration stddev. schedule linear(1.0, 0.1, 100000). Focus stage: SAM 2 checkpoint sam2_hiera_tiny; adapt steps 2 × 10^5; number of segments k 9; selection interval T_sel 20; SL-to-RL transition timestep T_1 5000; transition end timestep T_2 10000; policy stddev. σ_h 0.1; optimizer Adam; batch size 128; learning rate 3 × 10^-4; clip ratio of PPO 0.2; discount factor 0.5; GAE lambda 0.95; L_SL objective margin δ 0.1
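
For readers who want to work with the Table 3 values programmatically, the environment and DrQ-v2 hyperparameters quoted in the Experiment Setup row can be gathered into a small configuration. The following is a minimal sketch, not the authors' code: the names `DrQv2Config`, `frame_size`, `action_repeat`, and `linear_schedule` are hypothetical. The schedule assumes that linear(1.0, 0.1, 100000) means a linear interpolation from 1.0 to 0.1 over 100000 steps, held constant afterwards. The table does not state whether the step counter counts agent steps or environment frames.

```python
from dataclasses import dataclass

# Hypothetical container for the DrQ-v2 values quoted from Table 3.
@dataclass(frozen=True)
class DrQv2Config:
    train_steps: int = 500_000          # 5 x 10^5
    replay_buffer_size: int = 100_000   # 1 x 10^5
    exploration_steps: int = 10_000     # 1 x 10^4
    n_step_returns: int = 3
    batch_size: int = 256
    learning_rate: float = 1e-4         # actor and critic, Adam
    discount: float = 0.99
    critic_tau: float = 0.01            # critic soft-update rate
    stddev_clip: float = 0.3


def frame_size(task: str) -> int:
    """Side length of the square input frame for a task (per Table 3)."""
    return 168 if task in {"franka-push", "franka-door"} else 84


def action_repeat(task: str) -> int:
    """Action repeat for a task (per Table 3)."""
    return 2 if task in {"finger-spin", "pendulum-swingup"} else 4


def linear_schedule(init: float, final: float, duration: int, step: int) -> float:
    """Assumed meaning of linear(init, final, duration): interpolate
    linearly over `duration` steps, then hold `final`."""
    mix = min(max(step / duration, 0.0), 1.0)
    return (1.0 - mix) * init + mix * final


if __name__ == "__main__":
    cfg = DrQv2Config()
    for step in (0, 50_000, 100_000, cfg.train_steps):
        stddev = linear_schedule(1.0, 0.1, 100_000, step)
        print(f"step={step:>7d}  exploration stddev={stddev:.3f}")
```

Under this reading, the exploration standard deviation reaches its floor of 0.1 after the first 100000 steps, a fifth of the 5 × 10^5 training steps listed in the table.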