Adaptive Model Design for Markov Decision Process

Authors: Siyu Chen, Donglin Yang, Jiayang Li, Senmiao Wang, Zhuoran Yang, Zhaoran Wang

ICML 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The paper includes sections such as '7. Experiments', '7.1. Tax Design for Macroeconomic Model', '7.2. Workbench Position Design for Two-Ankle Robotic Arm', and '7.3. Result Analysis', presenting empirical studies with figures and tables.
Researcher Affiliation | Academia | Tsinghua University, Beijing, China; Northwestern University, Evanston, IL, USA; Yale University, New Haven, CT, USA.
Pseudocode | Yes | Algorithm 1: 'General framework for solving the RMD (14) with Ω(x) = x ln x'.
Open Source Code | No | The paper makes no explicit statement about releasing open-source code for the described methodology, nor does it link to a code repository.
Open Datasets | No | The paper describes experiments on a 'bi-level macroeconomic model based on (Hill et al., 2021)' and a '2D robotic arm environment'. These are simulated models or environments with defined state/action spaces and reward functions, not external public datasets with concrete access information (links, DOIs, or specific dataset citations).
Dataset Splits | No | The paper runs experiments on defined models/environments rather than external datasets, and therefore does not specify training, validation, or test splits.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, processor types, or memory amounts used to run its experiments.
Software Dependencies | No | The paper does not provide specific ancillary software details, such as library names with version numbers, needed to replicate the experiments.
Experiment Setup | Yes | Appendix D, 'Additional Details of Experiments'. D.1, Taxation Design for Macroeconomic Model: 'The learning rate η is 0.001. The initial asset for the agent follows a Gaussian distribution with mean 0 and variance 2. The initial taxation is set to (0.4, 0.4, 0.4, 0.4). The discount factors γ1 and γ2 are both set to 0.8.' D.2, Workbench Position Design for a Two-Ankle Robotic Arm: 'The learning rate η is 0.01. The inner iteration count K is 100. γ = 0.8 is the discount factor for the robotic arm's control, and γu = 0.8 is the discount factor for calculating the discounted cumulative energy consumption. The initial workbench position p0 is at (1, 1).'
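The Pseudocode row above notes that Algorithm 1 is a mirror-descent framework instantiated with the negative-entropy regularizer Ω(x) = x ln x. On the probability simplex, mirror descent with this regularizer reduces to a multiplicative-weights update. The sketch below shows that generic, well-known update only; it is not the paper's Algorithm 1 (the RMD objective and its gradient are specific to the paper, and no code was released).

```python
import numpy as np

def entropy_mirror_descent_step(x, grad, eta):
    """One mirror-descent step on the probability simplex with the
    negative-entropy regularizer Omega(x) = sum_i x_i ln x_i.
    The Bregman projection reduces to a multiplicative update
    followed by renormalization (multiplicative weights)."""
    y = x * np.exp(-eta * grad)  # downweight high-gradient coordinates
    return y / y.sum()           # project back onto the simplex
```

Each step shrinks the coordinates with larger gradient components while the iterate stays a valid probability vector, which is why the entropy regularizer is a natural choice when the decision variable is a distribution or policy.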
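Since the paper releases no code, the hyperparameters quoted in the Experiment Setup row can be collected into plain config dictionaries for anyone attempting a reimplementation. This is a minimal sketch; the key names are illustrative, not from the paper, and only the values are taken from Appendix D.

```python
# Hyperparameters reported in Appendix D.1 (taxation design for the
# macroeconomic model). Key names are our own shorthand.
tax_design_cfg = {
    "learning_rate": 0.001,            # eta
    "init_asset_mean": 0.0,            # initial asset ~ Gaussian(0, 2)
    "init_asset_variance": 2.0,
    "init_taxation": (0.4, 0.4, 0.4, 0.4),
    "gamma_1": 0.8,                    # discount factor gamma_1
    "gamma_2": 0.8,                    # discount factor gamma_2
}

# Hyperparameters reported in Appendix D.2 (workbench position design
# for the two-ankle robotic arm).
robot_arm_cfg = {
    "learning_rate": 0.01,             # eta
    "inner_iterations": 100,           # K
    "gamma": 0.8,                      # discount for arm control
    "gamma_u": 0.8,                    # discount for cumulative energy
    "init_workbench_position": (1.0, 1.0),  # p0
}
```

Everything not listed here (network architectures, outer iteration counts, random seeds) is unspecified in the paper and would have to be chosen by the reimplementer.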