Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Bayesian Experience Reuse for Learning from Multiple Demonstrators

Authors: Mike Gimelfarb, Scott Sanner, Chi-Guhn Lee

IJCAI 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate its effectiveness for minimizing multi-modal functions, and optimizing a high-dimensional supply chain with cost uncertainty, where it is also shown to improve upon the performance of the demonstrators' policies. ... 4 Empirical Evaluation. In order to demonstrate the effectiveness of BERS, we consider two problems: (1) the search for the minimum of static but high-dimensional multi-modal functions, and (2) the dynamic control of a complex supply chain network with stochastic demand.
Researcher Affiliation | Academia | Michael Gimelfarb, Scott Sanner and Chi-Guhn Lee. Department of Mechanical and Industrial Engineering, University of Toronto. Affiliate to Vector Institute, Toronto, Canada.
Pseudocode | Yes | Algorithm 1 Bayesian Experience Reuse (BERS)
Open Source Code | Yes | The appendix can be found at https://github.com/mikegimelfarb/bayesian-experience-reuse.
Open Datasets | Yes | More specifically, we use the 10-dimensional Rosenbrock, Ackley and sphere functions as source tasks, and the Rastrigin function as the target task (please see appendix for definitions and processing).
Dataset Splits | No | No explicit train/validation/test dataset splits (percentages, counts, or specific predefined splits) are mentioned in the main text.
Hardware Specification | No | No specific hardware details (e.g., exact GPU/CPU models, memory amounts, or detailed computer specifications) used for running experiments are provided in the paper.
Software Dependencies | No | The paper mentions using "DDPG [Lillicrap et al., 2016]" as the base learning agent but does not provide specific version numbers for DDPG or any other software libraries.
Experiment Setup | Yes | The search is limited to x_i ∈ [−4, 4] for all i = 1, 2, ..., 10. The global minimums of the functions are: x_Rosenbrock = 1, x_Ackley = 0, x_Sphere = 2 and x_Rastrigin = 2. ... The factory can manufacture up to 35 units of inventory per day, and the factory and the warehouses can each store up to 50 units of inventory at any given time. ... Demand for each warehouse A, B, ..., F, in units per day, is Poisson-distributed with respective means {7, 6, 6, 5, 5, 5}. ... This leads to a 2 + K + K^2 = 44-dimensional continuous action space.
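
For readers who want to experiment with the benchmark functions named in the Open Datasets and Experiment Setup rows, the following is a minimal sketch of their standard textbook forms in 10 dimensions. It is not the authors' code. The paper applies its own definitions and processing (given in its appendix), which this sketch does not reproduce; in particular, the quoted minimizers for the sphere and Rastrigin functions differ from the origin, where the unshifted forms below attain their minima. The search box is read here as [-4, 4], and the helper names clip_to_box and random_point are hypothetical.

```python
import math
import random

DIM = 10                 # dimensionality quoted in the table
LOW, HIGH = -4.0, 4.0    # search box, read here as [-4, 4] (assumption)


def sphere(x):
    """Standard sphere function; minimum 0 at the origin."""
    return sum(v * v for v in x)


def rosenbrock(x):
    """Standard Rosenbrock function; minimum 0 at (1, ..., 1)."""
    return sum(
        100.0 * (x[i + 1] - x[i] ** 2) ** 2 + (1.0 - x[i]) ** 2
        for i in range(len(x) - 1)
    )


def ackley(x):
    """Standard Ackley function; minimum 0 at the origin."""
    n = len(x)
    mean_sq = sum(v * v for v in x) / n
    mean_cos = sum(math.cos(2.0 * math.pi * v) for v in x) / n
    return (
        -20.0 * math.exp(-0.2 * math.sqrt(mean_sq))
        - math.exp(mean_cos)
        + 20.0
        + math.e
    )


def rastrigin(x):
    """Standard Rastrigin function; minimum 0 at the origin."""
    return 10.0 * len(x) + sum(
        v * v - 10.0 * math.cos(2.0 * math.pi * v) for v in x
    )


def clip_to_box(x, low=LOW, high=HIGH):
    """Project a candidate point back into the search box (hypothetical helper)."""
    return [min(max(v, low), high) for v in x]


def random_point(rng, dim=DIM):
    """Draw a uniform random point inside the search box (hypothetical helper)."""
    return [rng.uniform(LOW, HIGH) for _ in range(dim)]


if __name__ == "__main__":
    rng = random.Random(0)
    x = random_point(rng)
    for f in (sphere, rosenbrock, ackley, rastrigin):
        print(f.__name__, f(x))
```

Only the standard library is used so the sketch runs without extra dependencies, in keeping with the table's note that the paper lists no software versions.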
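
The supply chain figures quoted in the Experiment Setup row can be checked with a few lines of arithmetic. The sketch below assumes K is the number of warehouses (six, labelled A to F), which makes 2 + K + K^2 equal 44 as quoted; the quoted text does not say what each term of the sum represents, so the code does not interpret them. Demand is sampled with Knuth's multiplication method for Poisson variates, chosen here only to avoid external dependencies. All names are hypothetical, and this is not the authors' simulator.

```python
import math
import random

# Mean daily demand per warehouse, as quoted in the table.
WAREHOUSE_MEANS = {"A": 7, "B": 6, "C": 6, "D": 5, "E": 5, "F": 5}
K = len(WAREHOUSE_MEANS)   # assumption: K counts the warehouses
MAX_PRODUCTION = 35        # units the factory can manufacture per day
MAX_STORAGE = 50           # storage limit at the factory and at each warehouse


def action_dim(k):
    """Action-space dimension from the quoted formula 2 + K + K^2."""
    return 2 + k + k * k


def sample_poisson(rng, mean):
    """Draw one Poisson variate with Knuth's multiplication method."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_daily_demand(rng):
    """Sample one day of demand for every warehouse."""
    return {
        name: sample_poisson(rng, mean)
        for name, mean in WAREHOUSE_MEANS.items()
    }


if __name__ == "__main__":
    rng = random.Random(0)
    print("action dimension:", action_dim(K))
    print("total mean demand:", sum(WAREHOUSE_MEANS.values()))
    print("one sampled day:", sample_daily_demand(rng))
```

With the quoted means, expected total demand is 34 units per day, just under the factory's production limit of 35.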