Theory-Based Causal Transfer: Integrating Instance-Level Induction and Abstract-Level Structure Learning

Authors: Mark Edmonds, Xiaojian Ma, Siyuan Qi, Yixin Zhu, Hongjing Lu, Song-Chun Zhu

AAAI 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We compare performances against a set of predominate model-free RL algorithms." Section 5 (Experiments): "We compare results between predominate model-free RL algorithms with the proposed theory-based causal transfer model."
Researcher Affiliation | Academia | "UCLA Center for Vision, Cognition, Learning, and Autonomy; International Center for AI and Robot Autonomy (CARA); UCLA Computational Vision and Learning (CVL) Lab. {markedmonds, maxiaojian, syqi, yixin.zhu, hongjing}@ucla.edu, sczhu@stat.ucla.edu"
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | "The proposed algorithm and all baseline algorithms can be found on the first author's website."
Open Datasets | Yes | The OpenLock task, originally presented in Edmonds et al. (2018), requires agents to escape from a virtual room by unlocking and opening a door (see the environment sketch below).
Dataset Splits | No | The paper describes training and transfer phases (e.g., "Training sessions contain only 3-lever trials"; "In the transfer phase, the agent is tasked with a 4-lever trial"), but does not specify a distinct validation set or its split.
Hardware Specification | No | The paper does not provide specific details about the hardware used to run experiments, such as GPU models, CPU types, or memory specifications.
Software Dependencies | No | The paper mentions various RL algorithms (e.g., DQN, A2C, TRPO, PPO) but does not specify version numbers for any software, libraries, or frameworks used in their implementation.
Experiment Setup | Yes | "For every training trial, the agent is placed into a 3-lever trial and allowed 30 attempts to find all solutions." "RL agents: (i) were given more attempts per trial; and (ii) more importantly, were allowed to learn in the same trial multiple times." "During training, agents execute for 200 training iterations, where each iteration consists of looping through all six 3-lever trials." "RL agents operate directly on the state of the simulator encoded as a 16-dimensional binary vector." (The quoted protocol is sketched below.)
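
For context on the rows above, here is a minimal, purely illustrative sketch of an OpenLock-style environment. The paper only states that the state is a 16-dimensional binary vector; the class name `OpenLockEnvSketch`, the assumed bit layout (7 lever positions, 7 lever-lock flags, one door-lock bit, one door-open bit), and the toy dynamics are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

N_LEVERS = 7  # OpenLock rooms place 7 levers around a door

class OpenLockEnvSketch:
    """Illustrative stand-in for the OpenLock simulator (not the authors' code)."""

    def __init__(self, n_solution_levers=3, attempts_per_trial=30):
        self.n_solution_levers = n_solution_levers
        self.attempts_per_trial = attempts_per_trial
        self.reset_trial()

    def reset_trial(self):
        self.attempts_left = self.attempts_per_trial
        self.lever_pushed = np.zeros(N_LEVERS, dtype=np.int8)
        # Toy assumption: only the solution levers start unlocked.
        self.lever_locked = np.ones(N_LEVERS, dtype=np.int8)
        self.lever_locked[:self.n_solution_levers] = 0
        self.door_unlocked = 0
        self.door_open = 0

    def observe(self):
        # 7 + 7 + 1 + 1 = 16 binary features, matching the quoted
        # "16-dimensional binary vector"; the layout itself is a guess.
        return np.concatenate([
            self.lever_pushed,
            self.lever_locked,
            [self.door_unlocked],
            [self.door_open],
        ]).astype(np.int8)

    def step(self, lever_idx):
        """Toy dynamics: push an unlocked lever; unlock and open the door once
        all solution levers are pushed. The real simulator's causal structure
        is richer (attempts are short action sequences, not single pushes)."""
        self.attempts_left -= 1
        if not self.lever_locked[lever_idx]:
            self.lever_pushed[lever_idx] ^= 1
        if self.lever_pushed[:self.n_solution_levers].all():
            self.door_unlocked = 1
            self.door_open = 1
        return self.observe()
```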
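Building on the sketch above, the quoted training/transfer protocol could be reproduced along these lines. The trial counts (six 3-lever training trials, one 4-lever transfer trial), the 30-attempt budget, and the 200 training iterations come directly from the quotes in the table; the random placeholder policy stands in for whichever RL agent is being evaluated, and `run_protocol` is a hypothetical name.

```python
import random

def run_protocol(training_trials, transfer_trial,
                 n_iterations=200, attempts_per_trial=30):
    # Training: 200 iterations, each looping over all six 3-lever trials,
    # with 30 attempts per trial (per the quoted setup).
    for _ in range(n_iterations):
        for env in training_trials:
            env.reset_trial()
            for _ in range(attempts_per_trial):
                state = env.observe()                 # 16-dim binary state
                env.step(random.randrange(N_LEVERS))  # placeholder policy
    # Transfer: a single 4-lever trial.
    transfer_trial.reset_trial()
    for _ in range(attempts_per_trial):
        transfer_trial.step(random.randrange(N_LEVERS))

# Hypothetical usage mirroring the quoted setup.
training = [OpenLockEnvSketch(n_solution_levers=3) for _ in range(6)]
transfer = OpenLockEnvSketch(n_solution_levers=4)
run_protocol(training, transfer)
```

The outer iteration loop reflects the quoted note that RL agents "were allowed to learn in the same trial multiple times", i.e., the same six training trials are revisited on every iteration.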