Adaptive Model Design for Markov Decision Process
Authors: Siyu Chen, Donglin Yang, Jiayang Li, Senmiao Wang, Zhuoran Yang, Zhaoran Wang
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The paper includes sections like '7. Experiments', '7.1. Tax Design for Macroeconomic Model', '7.2. Workbench Position Design for Two-Ankle Robotic Arm', and '7.3. Result Analysis', presenting empirical studies with figures and tables. |
| Researcher Affiliation | Academia | Tsinghua University, Beijing, China; Northwestern University, Evanston, IL, USA; Yale University, New Haven, CT, USA. |
| Pseudocode | Yes | Algorithm 1 General framework for solving the RMD (14) with Ω(x) = x ln x |
| Open Source Code | No | The paper does not contain any explicit statement about providing open-source code for the described methodology, nor does it provide a link to a code repository. |
| Open Datasets | No | The paper describes experiments on a 'bi-level macroeconomic model based on (Hill et al., 2021)' and a '2D robotic arm environment'. These are described as models or environments with defined state/action spaces and reward functions, not as external public datasets with concrete access information (links, DOIs, or specific citations to a dataset). |
| Dataset Splits | No | The paper describes setting up and running experiments on defined models/environments rather than using external datasets, and therefore does not specify training, validation, or test dataset splits. |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU/CPU models, processor types, or memory amounts used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details, such as library names with version numbers, needed to replicate the experiments. |
| Experiment Setup | Yes | D. Additional Details of Experiments: D.1. Taxation Design for Macroeconomic Model - 'The learning rate η is 0.001. The initial asset for the agent follows a Gaussian distribution with mean 0 and variance 2. The initial taxation is set to (0.4, 0.4, 0.4, 0.4). The discounted factor γ1 and γ2 are both set to 0.8.' D.2. Workbench Position Design for A Two-ankle Robot Arm - 'The learning rate η is 0.01. The inner iterations K is 100. γ = 0.8 is the discount factor for the robotic arm's control, and γu = 0.8 is the discount factor for calculating the discounted cumulative energy consumption. The initial workbench's position p0 is at (1, 1).' |
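The pseudocode row above refers to the paper's Algorithm 1, a framework for solving the regularized problem RMD (14) with the negative-entropy regularizer Ω(x) = x ln x. The paper's full algorithm is not reproduced here; as a point of reference only, a single mirror-descent step under this mirror map reduces to the well-known exponentiated-gradient update on the probability simplex. The function and loss below are illustrative, not taken from the paper:

```python
import numpy as np

def entropy_mirror_descent_step(x, grad, eta):
    """One mirror-descent step on the probability simplex with the
    negative-entropy mirror map Omega(x) = sum_i x_i ln x_i.
    This is the exponentiated-gradient (multiplicative-weights) update."""
    y = x * np.exp(-eta * grad)  # multiplicative update from the dual step
    return y / y.sum()           # normalization = projection onto the simplex

# Illustrative use: minimize a fixed linear loss <c, x> over the simplex.
c = np.array([1.0, 0.5, 2.0])    # hypothetical cost vector
x = np.ones(3) / 3               # uniform initial point
for _ in range(200):
    x = entropy_mirror_descent_step(x, c, eta=0.1)
# Mass concentrates on the coordinate with the smallest cost (index 1).
```

The normalization step is what makes the entropy regularizer convenient: the Bregman projection onto the simplex is a closed-form rescaling rather than a Euclidean projection.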
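The hyperparameters quoted in the Experiment Setup row can be collected into configuration objects for a reimplementation attempt. The dictionary keys below are hypothetical names chosen for readability; only the numeric values come from Appendix D of the paper:

```python
# Hyperparameters reported in Appendix D.1 (taxation design experiment).
# Key names are illustrative; values are as stated in the paper.
tax_design_cfg = {
    "learning_rate": 0.001,               # eta
    "initial_asset_mean": 0.0,            # Gaussian initial asset distribution
    "initial_asset_variance": 2.0,
    "initial_taxation": (0.4, 0.4, 0.4, 0.4),
    "gamma_1": 0.8,                       # discount factor gamma_1
    "gamma_2": 0.8,                       # discount factor gamma_2
}

# Hyperparameters reported in Appendix D.2 (workbench position experiment).
robot_arm_cfg = {
    "learning_rate": 0.01,                # eta
    "inner_iterations": 100,              # K
    "gamma": 0.8,                         # discount for the arm's control
    "gamma_u": 0.8,                       # discount for energy consumption
    "initial_workbench_position": (1.0, 1.0),  # p0
}
```

Note that the report flags Hardware Specification and Software Dependencies as missing, so these values alone do not fully pin down the experimental environment.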