Weighted model estimation for offline model-based reinforcement learning
Authors: Toru Hishinuma, Kei Senda
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Numerical experiments demonstrate the effectiveness of weighting with the artificial weight (Section 6, Numerical Experiment). |
| Researcher Affiliation | Academia | Toru Hishinuma (Kyoto University, hishinuma.toru.43n@kyoto-u.jp); Kei Senda (Kyoto University, senda@kuaero.kyoto-u.ac.jp) |
| Pseudocode | Yes | Algorithm 1 Weighted model estimation for policy evaluation (full version). A hedged sketch of the weighted-estimation idea appears after this table. |
| Open Source Code | No | The paper mentions modifying existing code ('This paper implements SAC by modifying the implementation code by [36]') but does not explicitly state that the source code for its own method is publicly available, nor does it provide a link. |
| Open Datasets | Yes | This paper studies policy optimization on the D4RL Benchmark [33] based on the MuJoCo simulator [34]. A dataset-loading sketch appears after this table. |
| Dataset Splits | No | The paper mentions using the D4RL Benchmark datasets but does not explicitly provide specific training/validation/test dataset splits, such as percentages or sample counts, within its text. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models, or any other computer specifications used for running its experiments. |
| Software Dependencies | No | The paper mentions using PyTorch (implicitly via a reference to a PyTorch SAC implementation) and the MuJoCo simulator, but does not provide specific version numbers for these or any other software dependencies. |
| Experiment Setup | Yes | The agent uses Pθ represented by two-layer neural networks with 8 units and tanh activation (rendered in the sketch after this table). |
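
The following sketches are not from the paper; they illustrate the rows above under stated assumptions. First, loading a D4RL dataset with the standard `d4rl` API. The environment name is an example only; the paper evaluates on D4RL MuJoCo tasks, but its exact dataset choices are not quoted above.

```python
import gym
import d4rl  # registers the offline environments with gym

# Example task name -- the paper's exact dataset versions may differ.
env = gym.make("halfcheetah-medium-v2")
dataset = d4rl.qlearning_dataset(env)  # dict of numpy arrays

states = dataset["observations"]
actions = dataset["actions"]
next_states = dataset["next_observations"]
rewards = dataset["rewards"]
```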
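
Next, a minimal sketch of weighted model estimation: fitting a Gaussian dynamics model by weighted maximum likelihood. This is a generic stand-in, not a reproduction of the paper's Algorithm 1. The network follows the quoted setup (8 units with tanh), where reading "two-layer" as two hidden layers is an assumption, and the weight `w` is a placeholder for the paper's artificial weight.

```python
import torch
import torch.nn as nn

class GaussianDynamics(nn.Module):
    """Diagonal-Gaussian transition model P_theta(s' | s, a)."""
    def __init__(self, state_dim, action_dim, hidden=8):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, hidden), nn.Tanh(),
        )
        self.mean = nn.Linear(hidden, state_dim)
        self.log_std = nn.Parameter(torch.zeros(state_dim))

    def forward(self, s, a):
        h = self.body(torch.cat([s, a], dim=-1))
        return self.mean(h), self.log_std.exp()

def weighted_nll(model, s, a, s_next, w):
    """Negative log-likelihood per sample, scaled by weights w."""
    mean, std = model(s, a)
    dist = torch.distributions.Normal(mean, std)
    nll = -dist.log_prob(s_next).sum(dim=-1)  # sum over state dims
    return (w * nll).mean()
```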
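
Finally, a hypothetical training loop tying the two sketches together. Uniform weights serve as a stand-in here; the paper's method would replace them with its artificial weight.

```python
model = GaussianDynamics(states.shape[1], actions.shape[1])
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# Small fixed batch purely for illustration.
to_t = lambda x: torch.as_tensor(x[:256], dtype=torch.float32)
s, a, s_next = to_t(states), to_t(actions), to_t(next_states)
w = torch.ones(len(s))  # placeholder for the artificial weight

for step in range(1000):
    opt.zero_grad()
    loss = weighted_nll(model, s, a, s_next, w)
    loss.backward()
    opt.step()
```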