Demystifying Linear MDPs and Novel Dynamics Aggregation Framework
Authors: Joongkyu Lee, Min-hwan Oh
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In numerical experiments, our proposed method consistently outperforms existing algorithms by significant margins. |
| Researcher Affiliation | Academia | Joongkyu Lee, Graduate School of Data Science, Seoul National University, jklee0717@snu.ac.kr; Min-hwan Oh, Graduate School of Data Science, Seoul National University, minoh@snu.ac.kr |
| Pseudocode | Yes | Algorithm 1: Upper Confidence Hierarchical RL with Transition-Targeted Regression (UC-HRL) |
| Open Source Code | No | The paper does not provide a concrete statement or link for the availability of its source code. |
| Open Datasets | No | The paper describes the "Block-River Swim" environment in Appendix H, which is a variant of River Swim (Strehl & Littman, 2008). This describes the simulation environment but does not provide access information (link, citation with authors/year for a public dataset, or explicit statement of public availability) for a static dataset used in experiments. |
| Dataset Splits | No | The paper reports "Episodic returns over 10 independent runs" but does not specify train/validation/test splits, split percentages, or predefined splits for any dataset used. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments (e.g., GPU/CPU models, memory). |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers. |
| Experiment Setup | No | The paper states, "For a fair comparison, we sweep over the hyper-parameters for each algorithm over certain ranges." However, it does not list specific hyperparameter values, training configurations, or system-level settings in the main text. |