Actor-Critic Alignment for Offline-to-Online Reinforcement Learning

Authors: Zishun Yu, Xinhua Zhang

ICML 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We show empirically that the proposed method improves the performance of the fine-tuned robotic agents on various simulated tasks.
Researcher Affiliation | Academia | Department of Computer Science, University of Illinois Chicago, Chicago, IL 60607, USA. Correspondence to: Zishun Yu <zyu32@uic.edu>.
Pseudocode | Yes | The pseudo-code of the offline, alignment, and online phases is provided in Algorithm 1, 2, and 3, respectively.
Open Source Code | Yes | The implementation of our ACA algorithm can be found at https://github.com/ZishunYu/ACA.
Open Datasets | Yes | We used the HalfCheetah, Hopper, and Walker2d environments from the D4RL-v2 datasets (Fu et al., 2020).
Dataset Splits | No | The paper mentions running experiments for a certain number of episodes and mini-batches but does not provide specific dataset split information (e.g., percentages or sample counts) for training, validation, or testing.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments.
Software Dependencies | No | Overall, all our implementations are from or based on d3rlpy (Takuma Seno, 2021), a popular RL library specialized for offline RL.
Experiment Setup | Yes | All offline/online experiments ran 5 random seeds. We ran all offline algorithms for 500 episodes with 1000 mini-batches each, and all online experiments for 100 episodes with 1000 environment interactions each.
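The Open Datasets and Software Dependencies rows above point to the D4RL-v2 benchmarks and the d3rlpy library. As a hedged illustration only, the sketch below shows one common way to fetch a cited environment/dataset pair through d3rlpy; the helper name `get_d4rl` and the dataset identifier `halfcheetah-medium-v2` are assumptions about the library version and data variant, not details confirmed by the paper.

```python
# Minimal sketch (not the authors' code): fetching a D4RL-v2 dataset through
# d3rlpy, the library the paper says its implementation builds on.
# `get_d4rl` and the id "halfcheetah-medium-v2" are assumptions that may
# differ across d3rlpy/D4RL versions and dataset variants.
import d3rlpy

dataset, env = d3rlpy.datasets.get_d4rl("halfcheetah-medium-v2")
print(type(dataset).__name__, env.observation_space, env.action_space)
```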
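The Experiment Setup row reports the training schedule but not the training code. Purely to make that schedule concrete, here is a hypothetical skeleton: 5 random seeds, an offline phase of 500 episodes with 1000 mini-batch updates each, and an online phase of 100 episodes with 1000 environment interactions each. `set_seed`, `update_offline`, and `interact_online` are placeholder callables, not part of the ACA implementation.

```python
# Hypothetical skeleton of the reported schedule; the callables are
# placeholders supplied by the caller, not the paper's ACA code.

SEEDS = range(5)                                  # "5 random seeds"
OFFLINE_EPISODES, BATCHES_PER_EPISODE = 500, 1000
ONLINE_EPISODES, STEPS_PER_EPISODE = 100, 1000


def run_schedule(set_seed, update_offline, interact_online):
    for seed in SEEDS:
        set_seed(seed)
        # Offline phase: 500 episodes x 1000 mini-batch updates on the dataset.
        for _ in range(OFFLINE_EPISODES):
            for _ in range(BATCHES_PER_EPISODE):
                update_offline()
        # Online fine-tuning: 100 episodes x 1000 environment interactions.
        for _ in range(ONLINE_EPISODES):
            for _ in range(STEPS_PER_EPISODE):
                interact_online()
```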