Transfer Value Iteration Networks

Authors: Junyi Shen, Hankz Hankui Zhuo, Jin Xu, Bin Zhong, Sinno Pan
Pages: 5676-5683

AAAI 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To evaluate the effectiveness of our proposed TVIN, we conduct experiments to transfer knowledge between different 2D RL domains, including 2D mazes and Differential Drive (Lee et al. 2018). We evaluate the transfer performance of TVIN with varying environments, maze sizes, dataset sizes and hyperparameters, etc. Extensive experiments empirically show that our proposed TVIN is able to learn a target-domain policy significantly faster and reach a higher generalization performance, compared with the conventional VIN and another heuristic transfer learning method.
Researcher Affiliation | Collaboration | (1) School of Data and Computer Science, Sun Yat-Sen University, Guangzhou, China; (2) Data Quality Team, WeChat, Tencent Inc., China; (3) Nanyang Technological University, Singapore
Pseudocode | Yes | Algorithm 1 Transfer Value Iteration Algorithm
Open Source Code | No | The paper does not provide any links or explicit statements about the availability of open-source code for the described methodology.
Open Datasets | No | The paper states: 'The RL task domains for our experiments are synthetic 2D maps with randomly placed obstacles...' and 'Experimentally, our ground-truth label is created with a maze generation process that uses depth-first search with the recursive back-tracker algorithm (Cormen et al. 2009).' While the paper references (Lee et al. 2018) for its environments, it describes the authors' own data generation process without providing concrete access (link, DOI, or specific repository) to the generated datasets.
Dataset Splits | No | The paper mentions 'training data' and a 'test set' but does not specify a separate 'validation' split or dataset used for hyperparameter tuning or early stopping.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU, GPU models, or memory) used for running its experiments.
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., library names with their versions) that would allow for reproducible setup of the experiment environment.
Experiment Setup | Yes | In the implementation, we refer to (Lee et al. 2018) and set the default recurrence K relative to the maze sizes: K = 20 for 9 × 9 mazes, K = 30 for 15 × 15 mazes and K = 56 for 28 × 28 mazes. We further evaluate the effect of varying both iteration count K and kernel size F on the TVIN models.
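
The Open Datasets entry quotes a maze generation process based on depth-first search with the recursive backtracker algorithm, and the Experiment Setup entry lists a default recurrence K for each maze size. Since the paper releases no code, the following is only a minimal Python sketch of that kind of generator, not the authors' implementation. The function name `generate_maze`, the `DEFAULT_K` dictionary name, the start cell, and the grid encoding (1 = obstacle, 0 = free) are assumptions made for this example; only the K values come from the quoted text.

```python
import random

# Default recurrence K per maze size, as quoted in the Experiment Setup entry.
# The dictionary name is hypothetical.
DEFAULT_K = {9: 20, 15: 30, 28: 56}


def generate_maze(size, seed=None):
    """Carve a size x size maze with a depth-first recursive backtracker.

    Returns a grid of ints where 1 is an obstacle and 0 is a free cell
    (an assumed encoding). Rooms sit on odd coordinates, so an even size
    such as 28 leaves a slightly thicker wall on the bottom and right edges.
    """
    rng = random.Random(seed)
    grid = [[1] * size for _ in range(size)]
    grid[1][1] = 0
    stack = [(1, 1)]  # explicit stack instead of recursion
    while stack:
        r, c = stack[-1]
        # Rooms two steps away that are inside the border and still unvisited.
        unvisited = [
            (r + dr, c + dc)
            for dr, dc in ((-2, 0), (2, 0), (0, -2), (0, 2))
            if 1 <= r + dr < size - 1
            and 1 <= c + dc < size - 1
            and grid[r + dr][c + dc] == 1
        ]
        if not unvisited:
            stack.pop()  # dead end: backtrack to the previous room
            continue
        nr, nc = rng.choice(unvisited)
        grid[(r + nr) // 2][(c + nc) // 2] = 0  # open the wall in between
        grid[nr][nc] = 0
        stack.append((nr, nc))
    return grid


if __name__ == "__main__":
    for size, k in DEFAULT_K.items():
        maze = generate_maze(size, seed=0)
        free = sum(cell == 0 for row in maze for cell in row)
        print(f"{size}x{size} maze: {free} free cells, default K = {k}")
```

Because the backtracker visits every room exactly once, the free cells form a spanning tree: any two free cells are joined by exactly one path. The longest such path grows with the maze size, which is one plausible reason for scaling the number of value-iteration recurrences K with the maze size.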