Lifelong Learning with a Changing Action Set

Authors: Yash Chandak, Georgios Theocharous, Chris Nota, Philip Thomas (pp. 3373–3380)

AAAI 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In this section, we aim to empirically compare the following methods... To demonstrate the effectiveness of our proposed method(s) on lifelong learning problems, we consider a maze environment and two domains corresponding to real-world applications... The plots in Figures 3 and 4 present the evaluations on the domains considered."
Researcher Affiliation | Collaboration | (1) University of Massachusetts Amherst, (2) Adobe Research
Pseudocode | Yes | "A step-by-step pseudo-code for the LAICA algorithm is available in Algorithm 1, Appendix E."
Open Source Code | No | The paper does not state that code will be released, nor does it link to a source-code repository.
Open Datasets | No | The paper mentions a 'maze environment' that was 'constructed' and an 'existing log of users' click stream data' for the recommender-system domain. Neither is stated to be publicly available, and no links or formal citations for public access are provided. The citation (Shani, Heckerman, and Brafman 2005) refers to an MDP model, not to the datasets themselves.
Dataset Splits | No | The paper states that 'the total number of actions were randomly split into five equal sets', but gives no percentages or counts for how the data (e.g., the maze environment data or the user click-stream data) was split into training, validation, or test sets.
Hardware Specification | No | The paper gives no details about the hardware used to run the experiments, such as GPU/CPU models, memory, or cloud resources.
Software Dependencies | No | The paper mentions using 'DPG (Silver et al. 2014)' and 'an actor-critic (Sutton and Barto 2018)' but lists no specific software components with version numbers (e.g., Python, TensorFlow, or PyTorch versions).
Experiment Setup | No | While the paper names the algorithms used ('DPG' and 'actor-critic'), it does not report hyperparameter values (e.g., learning rates, batch sizes, number of epochs) or other training configurations.