reproducibilityindex.ai

Risk-sensitive control as inference with Rényi divergence

Authors: Kaito Ito, Kenji Kashima

NeurIPS 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	The behavior of the risk-sensitive soft actor-critic is examined via an experiment.
Researcher Affiliation	Academia	Kaito Ito, The University of Tokyo, kaito@g.ecc.u-tokyo.ac.jp; Kenji Kashima, Kyoto University, kk@i.kyoto-u.ac.jp
Pseudocode	No	The paper describes algorithms but does not provide them in a structured pseudocode or algorithm block.
Open Source Code	Yes	The code is available at https://github.com/kaito-1111/risk-sensitive-sac.git.
Open Datasets	Yes	The environment is Pendulum-v1 in OpenAI Gymnasium.
Dataset Splits	No	The paper mentions training and testing but does not provide specific percentages or absolute counts for dataset splits (train/validation/test).
Hardware Specification	Yes	For the training, we used an Ubuntu 20.04 server (GPU: NVIDIA GeForce RTX 2080Ti).
Software Dependencies	No	The implementation of the risk-sensitive SAC (RSAC) algorithm follows the stable-baselines3 [50] version of the SAC algorithm... optimizer Adam [51]. No specific version numbers for these or other software are provided.
Experiment Setup	Yes	Now, we introduce a series of hyperparameters listed in Table 1 shared for both SAC and RSAC algorithms. Table 1: SAC and RSAC Hyperparameters. optimizer: Adam [51]; learning rate: 10^-3; discount factor: 0.99; regularization coefficient: 0.1; target smoothing coefficient: 0.005; replay buffer size: 10^5; number of critic networks: 2; number of hidden layers (all networks): 2; number of hidden units per layer: 256; activation function: ReLU.
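
The Table 1 values quoted above can be collected into a small configuration object. The following is a minimal sketch, not the authors' code: the class name, field names, and the `to_sb3_kwargs` helper are hypothetical, and mapping the "regularization coefficient" to the `ent_coef` argument of stable-baselines3 SAC is an assumption. The RSAC-specific risk-sensitivity parameter is not listed in the table and is therefore left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Table1Hyperparameters:
    """Values quoted from Table 1; the field names are illustrative only."""

    learning_rate: float = 1e-3
    discount_factor: float = 0.99
    regularization_coefficient: float = 0.1
    target_smoothing_coefficient: float = 0.005
    replay_buffer_size: int = 10**5
    n_critics: int = 2
    n_hidden_layers: int = 2
    n_hidden_units: int = 256

    def to_sb3_kwargs(self) -> dict:
        """Map the table onto stable-baselines3 SAC keyword arguments.

        Assumption: the regularization coefficient plays the role of the
        entropy coefficient (`ent_coef`). Adam and ReLU are not set here
        because they are, to my knowledge, the SAC defaults in
        stable-baselines3.
        """
        return {
            "learning_rate": self.learning_rate,
            "gamma": self.discount_factor,
            "tau": self.target_smoothing_coefficient,
            "buffer_size": self.replay_buffer_size,
            "ent_coef": self.regularization_coefficient,  # assumed mapping
            "policy_kwargs": {
                # Two hidden layers of 256 units for all networks.
                "net_arch": [self.n_hidden_units] * self.n_hidden_layers,
                "n_critics": self.n_critics,
            },
        }
```

With stable-baselines3 installed, a baseline SAC agent could then be built with something like `SAC("MlpPolicy", "Pendulum-v1", **Table1Hyperparameters().to_sb3_kwargs())`. Because no library versions are reported, the defaults relied on here may differ from those the authors used.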