reproducibilityindex.ai

The Differentiable Cross-Entropy Method

Authors: Brandon Amos, Denis Yarats

ICML 2020

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Our experiments demonstrate applications of the cross-entropy method in structured prediction, control, and reinforcement learning.
Researcher Affiliation	Collaboration	Facebook AI Research; New York University.
Pseudocode	Yes	Algorithm 1: DCEM(f_θ, g_φ, φ_1; τ, N, k, T) and Algorithm 2: Learning an embedded control space with DCEM
Open Source Code	Yes	Our PyTorch (Paszke et al., 2019) source code is openly available at github.com/facebookresearch/dcem and uses the PyTorch LML implementation from github.com/locuslab/lml to compute eq. (4).
Open Datasets	Yes	We use the standard cartpole dynamical system from Barto et al. (1983) with a continuous state-action space... cheetah.run and walker.walk continuous locomotion tasks from the DeepMind control suite (Tassa et al., 2018) using the MuJoCo physics engine (Todorov et al., 2012).
Dataset Splits	No	The paper mentions training on "2M timesteps" and evaluating on "100 test episodes," but it does not specify explicit percentages or counts for training, validation, and test splits needed for exact data partitioning reproducibility.
Hardware Specification	No	The paper mentions running computations that are "GPU-amenable" but does not specify any particular models of GPUs, CPUs, or other hardware components used for the experiments.
Software Dependencies	No	The paper mentions using PyTorch, the DeepMind control suite, MuJoCo, and PPO, but it does not provide specific version numbers for these software components.
Experiment Setup	Yes	For illustrative purposes we consider a simple unidimensional regression task... Both of these are trained to take 10 optimizer steps and we use an inner learning rate of 0.1 for gradient descent and with DCEM we use 10 iterations with 100 samples per iteration and 10 elite candidates, with a temperature of 1. For DCEM over the embedded space we use 10 iterations with 100 samples in each iteration and 10 elite candidates, again with a temperature of 1.
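
The sampling settings quoted in the Experiment Setup row (10 iterations, 100 samples per iteration, 10 elite candidates) can be made concrete with a minimal sketch of the vanilla cross-entropy method. This is not the authors' DCEM implementation: it uses a hard top-k selection, so it is not differentiable and has no temperature parameter, and it uses only the Python standard library rather than PyTorch. The function name `cem_minimize`, the quadratic `objective`, the initial Gaussian, and the seed are all assumptions made for illustration.

```python
import random
import statistics


def cem_minimize(f, mu=0.0, sigma=1.0, n_iter=10, n_samples=100, n_elite=10, seed=0):
    """Vanilla (non-differentiable) cross-entropy method for a 1-D objective.

    Hypothetical helper for illustration; defaults mirror the quoted setup
    (10 iterations, 100 samples per iteration, 10 elite candidates).
    """
    rng = random.Random(seed)
    for _ in range(n_iter):
        # Sample candidates from the current Gaussian sampling distribution.
        samples = [rng.gauss(mu, sigma) for _ in range(n_samples)]
        # Hard top-k selection: keep the n_elite lowest-cost candidates.
        # (DCEM replaces this step with a differentiable relaxation,
        # which this sketch does not attempt to reproduce.)
        elites = sorted(samples, key=f)[:n_elite]
        # Refit the Gaussian to the elite candidates.
        mu = statistics.mean(elites)
        sigma = statistics.pstdev(elites) + 1e-6  # small floor keeps sigma positive
    return mu


def objective(x):
    # Assumed toy objective with its minimum at x = 2.
    return (x - 2.0) ** 2


best = cem_minimize(objective)
print(f"estimated minimizer: {best:.3f}")
```

With these settings the estimate should land close to the true minimizer at 2. The hard `sorted(...)[:n_elite]` step is the part a differentiable variant would need to replace, since gradients cannot flow through a discrete selection.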