Optimistic Active Exploration of Dynamical Systems

Authors: Bhavya Sukhija, Lenart Treven, Cansu Sancaktar, Sebastian Blaes, Stelian Coros, Andreas Krause

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In our experiments, we compare OPAX with other heuristic active exploration approaches on several environments. Our experiments show that OPAX is not only theoretically sound but also performs well for zero-shot planning on novel downstream tasks. We evaluate OPAX on several simulated robotic tasks with state dimensions ranging from 2 to 58. The empirical results validate the theoretical conclusions, with OPAX consistently delivering strong performance across all tested environments.
Researcher Affiliation | Academia | ETH Zürich (1); MPI for Intelligent Systems (2). {sukhijab,trevenl,scoros,krausea}@ethz.ch, {cansu.sancaktar,sebastian.blaes}@tuebingen.mpg.de
Pseudocode | Yes | OPAX: Optimistic Active Exploration. Init: aleatoric uncertainty σ, probability δ, statistical model (µ_0, σ_0, β_0(δ)). For episode n = 1, ..., N: π_n = argmax_{π ∈ Π} max_{η ∈ Ξ} E[Σ_{t=0}^{T−1} Σ_{j=1}^{d_x} log(1 + σ²_{n−1,j}(x_t, π(x_t)) / σ²)] (prepare policy); D_n ← ROLLOUT(π_n) (collect measurements); update (µ_n, σ_n, β_n(δ)) with D_{1:n} (update model).
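As a rough illustration (not the authors' released code), the per-step planning objective above can be sketched in JAX; epistemic_std, sigma_noise, beta, and eta are stand-ins for σ_{n−1}, σ, β_{n−1}(δ), and η:

```python
import jax.numpy as jnp

def intrinsic_reward(epistemic_std, sigma_noise):
    # Per-step OPAX objective: sum_j log(1 + sigma_{n-1,j}^2(x_t, u_t) / sigma^2),
    # i.e. an information-gain proxy that rewards visiting states where the
    # learned dynamics model is still uncertain.
    return jnp.sum(jnp.log(1.0 + (epistemic_std / sigma_noise) ** 2))

def optimistic_step(mu, epistemic_std, beta, eta):
    # Hallucinated transition: eta in [-1, 1]^dx selects the most favorable
    # plausible next state inside the model's confidence set
    # (the inner max over eta in Xi in the objective above).
    return mu + beta * epistemic_std * eta
```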
Open Source Code | Yes | "Finally, we provide an efficient implementation of OPAX in JAX (Bradbury et al., 2018)." The code is available at https://github.com/lasgroup/opax
Open Datasets | Yes | We evaluate OPAX on the Pendulum-v1 and Mountain Car environments from the OpenAI Gym benchmark suite (Brockman et al., 2016); on Reacher, Swimmer, and Cheetah from the DeepMind Control Suite (Tassa et al., 2018); and on a high-dimensional simulated robotic manipulation task introduced by Li et al. (2020).
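For reference, the listed benchmarks can be instantiated roughly as follows. This is a sketch: the exact task variants and wrappers used in the paper may differ, and the MountainCarContinuous-v0 ID as well as the Control Suite task names are assumptions on our part:

```python
import gym
from dm_control import suite

# OpenAI Gym benchmarks (Brockman et al., 2016)
pendulum = gym.make("Pendulum-v1")
mountain_car = gym.make("MountainCarContinuous-v0")  # assumed continuous variant

# DeepMind Control Suite tasks (Tassa et al., 2018); task names are assumptions
reacher = suite.load(domain_name="reacher", task_name="easy")
swimmer = suite.load(domain_name="swimmer", task_name="swimmer6")
cheetah = suite.load(domain_name="cheetah", task_name="run")
```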
Dataset Splits | No | No explicit mention of train/validation/test dataset splits (percentages, counts, or citations to predefined splits). The paper describes an episodic setting in which data is collected online and used to update the model.
Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory amounts) were explicitly mentioned for running the experiments or training the models.
Software Dependencies | No | "Finally, we provide an efficient implementation of OPAX in JAX (Bradbury et al., 2018)." However, no specific version numbers are provided for JAX or other libraries such as Python, PyTorch, or TensorFlow.
Experiment Setup | Yes | Table 3: Hyperparameters for results in Section 5. Table 4: Parameters of the iCEM optimizer for experiments in Section 5. Table 5: Parameters of the model-based SAC optimizer for experiments in Section 5. Table 6: Environment and model settings used for the experiment results shown in Figure 4. Table 7: Base settings for iCEM as used in the intrinsic phase; the same settings are used for all methods. Table 8: iCEM hyperparameters used for zero-shot generalization in the extrinsic phase; any settings not specified there match the general settings in Table 7.
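To make the role of these tables concrete, here is a hypothetical sketch of what an iCEM configuration contains. The parameter names follow the iCEM algorithm (Pinneri et al., 2020); the values are placeholders, not the paper's actual settings from Tables 4, 7, and 8:

```python
from dataclasses import dataclass

@dataclass
class ICEMConfig:
    """Hypothetical iCEM optimizer settings; all values are illustrative placeholders."""
    horizon: int = 30                   # planning horizon (time steps)
    num_samples: int = 200              # action sequences sampled per iteration
    num_elites: int = 20                # elites kept to refit the sampling distribution
    num_iterations: int = 5             # CEM refinement iterations per planning step
    colored_noise_beta: float = 2.0     # exponent of the colored-noise action sampler
    elite_fraction_reused: float = 0.3  # fraction of elites carried over to the next step
    momentum: float = 0.1               # smoothing of mean/std updates across iterations
```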