Transfer Learning for Efficient Iterative Safety Validation
Authors: Anthony Corso, Mykel J. Kochenderfer
AAAI 2021, pp. 7125–7132 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct experiments on safety validation tasks in gridworld and autonomous driving scenarios. We show that transfer learning can improve the initial and final performance of validation algorithms and reduce the number of training steps. |
| Researcher Affiliation | Academia | Anthony Corso and Mykel J. Kochenderfer Stanford University, Department of Aeronautics and Astronautics, 496 Lomita Mall, Stanford, CA 94305 {acorso, mykel}@stanford.edu |
| Pseudocode | No | The paper describes algorithms like DQN and A2T within the main text, but it does not contain structured pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described, nor does it explicitly state that the code is being released or is available. |
| Open Datasets | No | The paper describes custom simulation environments (Gridworld with Adversary, Autonomous Vehicle) for generating data, but it does not provide concrete access information (specific link, DOI, repository name, formal citation with authors/year) for a publicly available or open dataset. |
| Dataset Splits | No | Table 1 mentions 'Training steps' and 'Evaluation 300 episodes every 2000 steps', indicating evaluation during training, but the paper does not provide the split information (exact percentages, sample counts, or citations to predefined splits) needed to reproduce a training/validation/test partition. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper mentions software components like DQN, prioritized replay, double Q-learning, and Huber loss, but it does not provide specific version numbers for these or other ancillary software dependencies (e.g., Python, PyTorch, TensorFlow versions) needed to replicate the experiments. |
| Experiment Setup | Yes | Table 1 (Network architectures and hyperparameters): Base network: 3 hidden layers, [64, 32, 16] ReLUs; Attention network: 1 hidden layer, 16 ReLUs; Training steps: 3×10⁶; Batch size: 64; Learning rate α: 4×10⁻⁵ (GW), 5×10⁻⁵ (AD); Target update frequency: 2000 (GW), 3000 (AD); Evaluation: 300 episodes every 2000 steps; Exploration policy: ϵ-greedy with ϵ annealed from 1 to 0.1 |
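Since the paper reports the Table 1 architecture and hyperparameters but releases no code, the following is a minimal sketch of how those values might be wired up. It assumes PyTorch (the paper names no framework), a linear ϵ schedule (the paper gives only the range [1, 0.1]), and an A2T-style softmax attention head; the layer sizes and hyperparameter values are from Table 1, while all class names and glue code are assumptions, not the authors' implementation.

```python
# Sketch of the Table 1 architecture. Framework (PyTorch), module names,
# softmax attention head, and the linear epsilon schedule are assumptions;
# layer sizes and hyperparameter values are taken from the paper's Table 1.
import torch
import torch.nn as nn


class BaseNetwork(nn.Module):
    """Base Q-network: 3 hidden layers of [64, 32, 16] ReLUs (Table 1)."""

    def __init__(self, obs_dim: int, n_actions: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 64), nn.ReLU(),
            nn.Linear(64, 32), nn.ReLU(),
            nn.Linear(32, 16), nn.ReLU(),
            nn.Linear(16, n_actions),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs)


class AttentionNetwork(nn.Module):
    """Attention network: 1 hidden layer of 16 ReLUs (Table 1).

    Outputs a weight per source policy plus one for the base network,
    as in A2T-style transfer; the softmax head is an assumption.
    """

    def __init__(self, obs_dim: int, n_sources: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 16), nn.ReLU(),
            nn.Linear(16, n_sources + 1),  # +1 slot for the base network
            nn.Softmax(dim=-1),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs)


# Hyperparameters from Table 1 (gridworld values shown):
LEARNING_RATE = 4e-5       # 5e-5 for the autonomous-driving task
TARGET_UPDATE_FREQ = 2000  # 3000 for the autonomous-driving task
BATCH_SIZE = 64
TRAINING_STEPS = 3_000_000


def epsilon(step: int, final_step: int = TRAINING_STEPS) -> float:
    """Anneal epsilon from 1.0 to 0.1; the linear schedule is an
    assumption -- the paper only states the range [1, 0.1]."""
    frac = min(step / final_step, 1.0)
    return 1.0 + frac * (0.1 - 1.0)
```

A re-implementation would still need details the paper leaves unspecified, such as the observation encoding for each environment and how prioritized replay, double Q-learning, and the Huber loss are combined in the training loop.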