Variational Intrinsic Control Revisited
Authors: Taehwan Kwon
ICLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "We substantiate our claims through rigorous mathematical derivations and experimental analyses." From Section 5 (Experiments): "In this section, we evaluate implicit VIC (Gregor et al., 2016), Algorithm 1 and Algorithm 2. We use LSTM (Hochreiter & Schmidhuber, 1997) to encode τ_t = (s_0, a_0, ..., s_t) into a vector. We conduct experiments on both deterministic and stochastic environments and evaluate results by measuring the mutual information I from samples." (See the trajectory-encoder and mutual-information sketches after this table.) |
| Researcher Affiliation | Industry | Taehwan Kwon, NC (NCSOFT), kth315@ncsoft.com |
| Pseudocode | Yes | Algorithm 1: "Implicit VIC with transitional probability model"; Algorithm 2: "Implicit VIC with Gaussian mixture model". (A hedged sketch of the fixed-variance Gaussian mixture appears after this table.) |
| Open Source Code | No | The paper does not provide any statements about releasing source code or links to a code repository. |
| Open Datasets | No | The paper describes custom '1D world', '2D world', 'tree world', and 'grid world' environments, as well as 'Half Cheetah-v3 in the Mujoco environments'. These are simulated environments or custom-built scenarios, not publicly available datasets with specific access information, citations, or repositories. |
| Dataset Splits | No | The paper does not explicitly specify training, validation, and test dataset splits with percentages, counts, or references to predefined splits for reproducibility. It discusses training phases and how data is generated from environments. |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU models, CPU models, or memory specifications used for running experiments. |
| Software Dependencies | No | The paper mentions 'LSTM' as an architecture and 'Mujoco' as an environment, but does not specify software dependencies with version numbers (e.g., a specific deep learning framework such as PyTorch or TensorFlow, or the MuJoCo version). |
| Experiment Setup | Yes | "Please see Appendix F.1 for details on the hyper-parameter settings." Table 1 (hyper-parameters used for experiments): Optimizer: Adam; Learning rate: 1e-3 / 1e-4; Betas: (0.9, 0.999); Weight initialization: Gaussian with mean 0 and std 0.1; Batch size: 128; T_smooth: 128; σ (GMM): 0.25; n_gmm (GMM): 10. (See the optimizer-setup sketch after this table.) |
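As a concrete reading of the encoder described in Section 5 (an LSTM that maps the history τ_t = (s_0, a_0, ..., s_t) to a vector), here is a minimal PyTorch sketch. The paper releases no code, so the module name, dimensions, and input layout below are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of the Section 5 trajectory encoder: an LSTM
# (Hochreiter & Schmidhuber, 1997) consumes tau_t = (s_0, a_0, ..., s_t)
# and its final hidden state serves as the encoding. Names and dimensions
# are assumptions; the paper does not release code.
import torch
import torch.nn as nn

class TrajectoryEncoder(nn.Module):
    def __init__(self, obs_dim: int, act_dim: int, hidden_dim: int = 128):
        super().__init__()
        # Each timestep feeds the concatenated (state, action) pair.
        self.lstm = nn.LSTM(obs_dim + act_dim, hidden_dim, batch_first=True)

    def forward(self, states: torch.Tensor, actions: torch.Tensor) -> torch.Tensor:
        # states: (batch, T, obs_dim); actions: (batch, T, act_dim)
        x = torch.cat([states, actions], dim=-1)
        _, (h_n, _) = self.lstm(x)  # h_n: (1, batch, hidden_dim)
        return h_n.squeeze(0)       # vector encoding of tau_t

# Usage: encode a batch of 8 length-10 trajectories.
enc = TrajectoryEncoder(obs_dim=4, act_dim=2)
z = enc(torch.randn(8, 10, 4), torch.randn(8, 10, 2))  # z: (8, 128)
```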
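The Research Type row also notes that results are evaluated "by measuring the mutual information I from samples." For the discrete 1D/2D/tree/grid worlds, one straightforward way is a plug-in estimate from empirical counts of (option, final state) pairs; the paper does not specify which estimator the authors used, so this NumPy sketch is an assumption.

```python
# Hedged sketch: plug-in mutual information between sampled options and
# final states, estimated from empirical counts. The paper's exact
# estimator is unspecified; this is one standard choice for discrete data.
import numpy as np

def mutual_information(options: np.ndarray, finals: np.ndarray) -> float:
    # options, finals: 1-D integer arrays of sampled (option, final state) pairs.
    joint = np.zeros((options.max() + 1, finals.max() + 1))
    for o, s in zip(options, finals):
        joint[o, s] += 1
    p = joint / joint.sum()                       # empirical joint distribution
    po, ps = p.sum(1, keepdims=True), p.sum(0, keepdims=True)  # marginals
    mask = p > 0
    # I = sum_{o,s} p(o,s) * log( p(o,s) / (p(o) p(s)) ), in nats.
    return float((p[mask] * np.log(p[mask] / (po @ ps)[mask])).sum())

# Usage: 4 options, 6 reachable final states, 1000 sampled rollouts.
rng = np.random.default_rng(0)
print(mutual_information(rng.integers(0, 4, 1000), rng.integers(0, 6, 1000)))
```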
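Algorithm 2 uses a Gaussian mixture model, and Table 1 fixes σ = 0.25 and n_gmm = 10. Below is a hedged sketch of a fixed-variance mixture density head with those values; how Algorithm 2 actually parameterizes the mixture is not reproduced in this report, so the linear heads are illustrative assumptions.

```python
# Hedged sketch of a fixed-variance Gaussian mixture density head matching
# the Table 1 hyper-parameters (sigma = 0.25, n_gmm = 10). The
# parameterization (two linear heads over encoder features) is an
# assumption, not the authors' design.
import torch
import torch.nn as nn

class GMMHead(nn.Module):
    def __init__(self, in_dim: int, out_dim: int, n_gmm: int = 10, sigma: float = 0.25):
        super().__init__()
        self.n_gmm, self.out_dim, self.sigma = n_gmm, out_dim, sigma
        self.means = nn.Linear(in_dim, n_gmm * out_dim)  # component means
        self.logits = nn.Linear(in_dim, n_gmm)           # mixture weights

    def log_prob(self, h: torch.Tensor, x: torch.Tensor) -> torch.Tensor:
        # h: (batch, in_dim) encoder features; x: (batch, out_dim) targets.
        mu = self.means(h).view(-1, self.n_gmm, self.out_dim)
        log_w = torch.log_softmax(self.logits(h), dim=-1)
        comp = torch.distributions.Normal(mu, self.sigma)
        # Sum per-dimension log-densities, then log-sum-exp over components.
        log_comp = comp.log_prob(x.unsqueeze(1)).sum(-1)  # (batch, n_gmm)
        return torch.logsumexp(log_w + log_comp, dim=-1)

# Usage: log-likelihood of 4-D targets under the 10-component mixture.
head = GMMHead(in_dim=128, out_dim=4)
lp = head.log_prob(torch.randn(8, 128), torch.randn(8, 4))  # lp: (8,)
```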
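Finally, the Table 1 optimization settings (Adam with betas (0.9, 0.999), learning rate 1e-3 or 1e-4 depending on the experiment, Gaussian weight initialization with mean 0 and std 0.1, batch size 128) translate directly into PyTorch. The placeholder model below is an assumption, since the paper releases no code.

```python
# Sketch of the Table 1 optimization setup: Adam, betas (0.9, 0.999),
# lr 1e-3 or 1e-4, weights drawn from N(0, 0.1^2). The model is a
# placeholder; only the hyper-parameter values come from the paper.
import torch
import torch.nn as nn

def init_gaussian(m: nn.Module) -> None:
    # Table 1: Gaussian weight initialization with std 0.1 and mean 0.
    if isinstance(m, (nn.Linear, nn.LSTM)):
        for name, p in m.named_parameters():
            if "weight" in name:
                nn.init.normal_(p, mean=0.0, std=0.1)
            elif "bias" in name:
                nn.init.zeros_(p)

model = nn.Sequential(nn.Linear(4, 128), nn.Tanh(), nn.Linear(128, 2))
model.apply(init_gaussian)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, betas=(0.9, 0.999))
```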