Theoretically Principled Deep RL Acceleration via Nearest Neighbor Function Approximation
Authors: Junhong Shen, Lin F. Yang (pp. 9558-9566)
AAAI 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on classical control and MuJoCo locomotion tasks show that the NN-accelerated agents achieve higher sample efficiency and stability than the baseline agents. |
| Researcher Affiliation | Academia | Junhong Shen and Lin F. Yang, University of California, Los Angeles (jhshen@ucla.edu, linyang@ee.ucla.edu) |
| Pseudocode | Yes | Algorithm 1 (Nearest Neighbor Actor-Critic) and Algorithm 2 (Soft Nearest Neighbor Update); an illustrative sketch of the nearest-neighbor estimate appears after the table. |
| Open Source Code | No | No explicit statement or link providing access to the authors' source code for the methodology described in this paper was found. |
| Open Datasets | Yes | We use the OpenAI Gym implementation (Brockman et al. 2016). |
| Dataset Splits | No | No explicit statement about specific training/validation/test dataset splits was found. The paper focuses on online reinforcement learning, where data is generated through interaction with the environment rather than pre-split datasets. |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, processor types, memory amounts) used for running experiments were mentioned. |
| Software Dependencies | No | The paper mentions the 'OpenAI Gym implementation' and the 'Stable Baselines implementation (Hill et al. 2018)' but does not provide specific version numbers for these software dependencies. |
| Experiment Setup | Yes | The discount factor γ is 0.99. The Lipschitz constant L is determined by a grid search and set to 7. All agents are trained with 5 random seeds. Evaluation is done every 1000 steps without exploration. A sketch of this evaluation protocol follows the table. |
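
To make the pseudocode row concrete, below is a minimal sketch of a Lipschitz nearest-neighbor value estimate of the kind that nearest-neighbor function approximation relies on. The class name `NNValueBuffer`, the `min_i (v_i + L * ||s - s_i||)` rule, and the interface are illustrative assumptions rather than the authors' Algorithm 1/2; only the constants L = 7 and γ = 0.99 are taken from the paper's reported setup.

```python
import numpy as np

LIPSCHITZ_L = 7.0   # Lipschitz constant reported in the paper (grid search)
GAMMA = 0.99        # discount factor reported in the paper


class NNValueBuffer:
    """Stores (state, value) pairs and returns Lipschitz-consistent estimates.

    Hypothetical helper for illustration; not the authors' implementation.
    """

    def __init__(self, lipschitz=LIPSCHITZ_L):
        self.lipschitz = lipschitz
        self.states = []   # visited states, each a 1-D numpy array
        self.values = []   # corresponding value estimates

    def add(self, state, value):
        self.states.append(np.asarray(state, dtype=np.float64))
        self.values.append(float(value))

    def estimate(self, state):
        """Upper-bound style estimate: min_i (v_i + L * ||s - s_i||)."""
        if not self.states:
            return 0.0
        state = np.asarray(state, dtype=np.float64)
        dists = np.linalg.norm(np.stack(self.states) - state, axis=1)
        return float(np.min(np.asarray(self.values) + self.lipschitz * dists))


if __name__ == "__main__":
    buf = NNValueBuffer()
    buf.add([0.0, 0.0], 1.0)
    buf.add([1.0, 0.0], 0.5)
    # A query near a stored state is bounded by that state's value + L * distance.
    print(buf.estimate([0.1, 0.0]))
```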
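
The experiment-setup row can likewise be read as an evaluation protocol. The sketch below assumes the pre-0.26 OpenAI Gym API and a placeholder `RandomAgent`; the 5 seeds, the 1000-step evaluation cadence, and the exploration-free rollouts mirror the reported setup, while the environment choice, episode counts, and agent interface are assumptions for illustration.

```python
import gym          # OpenAI Gym (Brockman et al. 2016), pre-0.26 API assumed
import numpy as np

GAMMA = 0.99        # discount factor reported in the paper
EVAL_EVERY = 1000   # "Evaluation is done every 1000 steps without exploration."
N_SEEDS = 5         # "All agents are trained with 5 random seeds."


class RandomAgent:
    """Placeholder agent; the paper's agents follow its Algorithms 1 and 2."""

    def __init__(self, action_space):
        self.action_space = action_space

    def act(self, obs, explore=True):
        return self.action_space.sample()


def evaluate(env, agent, episodes=5):
    """Exploration-free rollouts; returns the mean episode return."""
    returns = []
    for _ in range(episodes):
        obs, done, total = env.reset(), False, 0.0
        while not done:
            obs, reward, done, _ = env.step(agent.act(obs, explore=False))
            total += reward
        returns.append(total)
    return float(np.mean(returns))


if __name__ == "__main__":
    for seed in range(N_SEEDS):
        env = gym.make("CartPole-v1")   # classical-control task, chosen for illustration
        env.seed(seed)
        np.random.seed(seed)
        agent = RandomAgent(env.action_space)
        for step in range(1, 5001):
            # ... training interaction would go here ...
            if step % EVAL_EVERY == 0:
                score = evaluate(env, agent)
                print(f"seed {seed} step {step}: eval return {score:.1f}")
```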