Adaptive Model Design for Markov Decision Process
Authors: Siyu Chen, Donglin Yang, Jiayang Li, Senmiao Wang, Zhuoran Yang, Zhaoran Wang
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The paper includes sections like '7. Experiments', '7.1. Tax Design for Macroeconomic Model', '7.2. Workbench Position Design for Two-Ankle Robotic Arm', and '7.3. Result Analysis', presenting empirical studies with figures and tables. |
| Researcher Affiliation | Academia | Tsinghua University, Beijing, China; Northwestern University, Evanston, IL, USA; Yale University, New Haven, CT, USA. |
| Pseudocode | Yes | Algorithm 1 General framework for solving the RMD (14) with Ω(x) = x ln x |
| Open Source Code | No | The paper does not contain any explicit statement about providing open-source code for the described methodology, nor does it provide a link to a code repository. |
| Open Datasets | No | The paper describes experiments on a 'bi-level macroeconomic model based on (Hill et al., 2021)' and a '2D robotic arm environment'. These are described as models or environments with defined state/action spaces and reward functions, not as external public datasets with concrete access information (links, DOIs, or specific citations to a dataset). |
| Dataset Splits | No | The paper describes setting up and running experiments on defined models/environments rather than using external datasets, and therefore does not specify training, validation, or test dataset splits. |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU/CPU models, processor types, or memory amounts used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details, such as library names with version numbers, needed to replicate the experiments. |
| Experiment Setup | Yes | D. Additional Details of Experiments: D.1. Taxation Design for Macroeconomic Model - 'The learning rate η is 0.001. The initial asset for the agent follows a Gaussian distribution with mean 0 and variance 2. The initial taxation is set to (0.4, 0.4, 0.4, 0.4). The discounted factor γ1 and γ2 are both set to 0.8.' D.2. Workbench Position Design for A Two-ankle Robot Arm - 'The learning rate η is 0.01. The inner iterations K is 100. γ = 0.8 is the discount factor for the robotic arm's control, and γu = 0.8 is the discount factor for calculating the discounted cumulative energy consumption. The initial workbench's position p0 is at (1, 1).' |
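The pseudocode row above refers to the paper's Algorithm 1, a framework for solving the regularized problem RMD (14) with the negative-entropy regularizer Ω(x) = x ln x. The paper's full algorithm is not reproduced here; as a point of reference only, a single mirror-descent step under this mirror map reduces to the well-known exponentiated-gradient update on the probability simplex. The function and loss below are illustrative, not taken from the paper:

```python
import numpy as np

def entropy_mirror_descent_step(x, grad, eta):
    """One mirror-descent step on the probability simplex with the
    negative-entropy mirror map Omega(x) = sum_i x_i ln x_i.
    This is the exponentiated-gradient (multiplicative-weights) update."""
    y = x * np.exp(-eta * grad)  # multiplicative update from the dual step
    return y / y.sum()           # normalization = projection onto the simplex

# Illustrative use: minimize a fixed linear loss <c, x> over the simplex.
c = np.array([1.0, 0.5, 2.0])    # hypothetical cost vector
x = np.ones(3) / 3               # uniform initial point
for _ in range(200):
    x = entropy_mirror_descent_step(x, c, eta=0.1)
# Mass concentrates on the coordinate with the smallest cost (index 1).
```

The normalization step is what makes the entropy regularizer convenient: the Bregman projection onto the simplex is a closed-form rescaling rather than a Euclidean projection.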
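The hyperparameters quoted in the Experiment Setup row can be collected into configuration objects for a reimplementation attempt. The dictionary keys below are hypothetical names chosen for readability; only the numeric values come from Appendix D of the paper:

```python
# Hyperparameters reported in Appendix D.1 (taxation design experiment).
# Key names are illustrative; values are as stated in the paper.
tax_design_cfg = {
    "learning_rate": 0.001,               # eta
    "initial_asset_mean": 0.0,            # Gaussian initial asset distribution
    "initial_asset_variance": 2.0,
    "initial_taxation": (0.4, 0.4, 0.4, 0.4),
    "gamma_1": 0.8,                       # discount factor gamma_1
    "gamma_2": 0.8,                       # discount factor gamma_2
}

# Hyperparameters reported in Appendix D.2 (workbench position experiment).
robot_arm_cfg = {
    "learning_rate": 0.01,                # eta
    "inner_iterations": 100,              # K
    "gamma": 0.8,                         # discount for the arm's control
    "gamma_u": 0.8,                       # discount for energy consumption
    "initial_workbench_position": (1.0, 1.0),  # p0
}
```

Note that the report flags Hardware Specification and Software Dependencies as missing, so these values alone do not fully pin down the experimental environment.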