Transfer Learning for Efficient Iterative Safety Validation
Authors: Anthony Corso, Mykel J. Kochenderfer
AAAI 2021, pp. 7125–7132 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct experiments on safety validation tasks in gridworld and autonomous driving scenarios. We show that transfer learning can improve the initial and final performance of validation algorithms and reduce the number of training steps. |
| Researcher Affiliation | Academia | Anthony Corso and Mykel J. Kochenderfer Stanford University, Department of Aeronautics and Astronautics, 496 Lomita Mall, Stanford, CA 94305 {acorso, mykel}@stanford.edu |
| Pseudocode | No | The paper describes algorithms like DQN and A2T within the main text, but it does not contain structured pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described, nor does it explicitly state that the code is being released or is available. |
| Open Datasets | No | The paper describes custom simulation environments (Gridworld with Adversary, Autonomous Vehicle) for generating data, but it does not provide concrete access information (specific link, DOI, repository name, formal citation with authors/year) for a publicly available or open dataset. |
| Dataset Splits | No | Table 1 mentions 'Training steps' and 'Evaluation 300 episodes every 2000 steps', indicating evaluation during training, but the paper does not provide the split information (exact percentages, sample counts, or citations to predefined splits) needed to reproduce a training/validation/test partition. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper mentions software components like DQN, prioritized replay, double Q-learning, and Huber loss, but it does not provide specific version numbers for these or other ancillary software dependencies (e.g., Python, PyTorch, TensorFlow versions) needed to replicate the experiments. |
| Experiment Setup | Yes | Table 1 (Network architectures and hyperparameters): Base network: 3 hidden layers, [64, 32, 16] ReLUs; Attention network: 1 hidden layer, 16 ReLUs; Training steps: 3×10⁶; Batch size: 64; Learning rate α: 4×10⁻⁵ (GW), 5×10⁻⁵ (AD); Target update frequency: 2000 (GW), 3000 (AD); Evaluation: 300 episodes every 2000 steps; Exploration policy: ϵ-greedy with ϵ annealed from 1 to 0.1 |
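Since the paper reports the Table 1 architecture and hyperparameters but releases no code, the following is a minimal sketch of how those values might be wired up. It assumes PyTorch (the paper names no framework), a linear ϵ schedule (the paper gives only the range [1, 0.1]), and an A2T-style softmax attention head; the layer sizes and hyperparameter values are from Table 1, while all class names and glue code are assumptions, not the authors' implementation.

```python
# Sketch of the Table 1 architecture. Framework (PyTorch), module names,
# softmax attention head, and the linear epsilon schedule are assumptions;
# layer sizes and hyperparameter values are taken from the paper's Table 1.
import torch
import torch.nn as nn


class BaseNetwork(nn.Module):
    """Base Q-network: 3 hidden layers of [64, 32, 16] ReLUs (Table 1)."""

    def __init__(self, obs_dim: int, n_actions: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 64), nn.ReLU(),
            nn.Linear(64, 32), nn.ReLU(),
            nn.Linear(32, 16), nn.ReLU(),
            nn.Linear(16, n_actions),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs)


class AttentionNetwork(nn.Module):
    """Attention network: 1 hidden layer of 16 ReLUs (Table 1).

    Outputs a weight per source policy plus one for the base network,
    as in A2T-style transfer; the softmax head is an assumption.
    """

    def __init__(self, obs_dim: int, n_sources: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 16), nn.ReLU(),
            nn.Linear(16, n_sources + 1),  # +1 slot for the base network
            nn.Softmax(dim=-1),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs)


# Hyperparameters from Table 1 (gridworld values shown):
LEARNING_RATE = 4e-5       # 5e-5 for the autonomous-driving task
TARGET_UPDATE_FREQ = 2000  # 3000 for the autonomous-driving task
BATCH_SIZE = 64
TRAINING_STEPS = 3_000_000


def epsilon(step: int, final_step: int = TRAINING_STEPS) -> float:
    """Anneal epsilon from 1.0 to 0.1; the linear schedule is an
    assumption -- the paper only states the range [1, 0.1]."""
    frac = min(step / final_step, 1.0)
    return 1.0 + frac * (0.1 - 1.0)
```

A re-implementation would still need details the paper leaves unspecified, such as the observation encoding for each environment and how prioritized replay, double Q-learning, and the Huber loss are combined in the training loop.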