Prediction and Control with Temporal Segment Models

Authors: Nikhil Mishra, Pieter Abbeel, Igor Mordatch

ICML 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments investigate the following questions: (i) How well do segment-based models predict dynamics? (ii) How does prediction accuracy transfer to control applications? How does this scale with the difficulty of the task and stochasticity in the dynamics? (iii) How is this affected by the use of latent action priors? (iv) Is there any meaning or structure encoded by the latent space learned by the dynamics model? ... 5. Experiments
Researcher Affiliation | Collaboration | Nikhil Mishra (1), Pieter Abbeel (1, 2), Igor Mordatch (2). (1) University of California, Berkeley; (2) OpenAI.
Pseudocode | No | The paper describes the model architecture and training process in text and diagrams but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper provides a link to videos of experimental results, but there is no explicit statement about making the source code for the described methodology publicly available, nor a link to a code repository.
Open Datasets | Yes | We base our experiments on a simulated 2-DOF arm moving in a plane (as implemented in the Reacher environment in OpenAI Gym), because performing random actions in this environment results in sufficient exploration.
Dataset Splits | No | The paper mentions using a "test set of held-out trajectories" and that the "training set is comprised of trajectories", but it does not specify explicit train/validation/test splits by percentages or counts, nor does it clearly define a separate validation set for hyperparameter tuning.
Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU/CPU models, memory, or cluster specifications) used to run the experiments.
Software Dependencies | No | The paper mentions "OpenAI Gym" as an environment, but it does not list any specific software dependencies or libraries with their version numbers.
Experiment Setup | Yes | For all environments, the training set is comprised of trajectories of length T = 100 of the arm executing smooth random torques, and we used segments of length H = 10 and 8-dimensional latent spaces. ... We used Adam (Kingma & Ba, 2015) with step size 0.01 to perform this optimization and found that it generally converged in around 100 iterations.
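
To make the reported setup concrete, the sketch below shows one way a trajectory of length T = 100 could be cut into past/future segments of length H = 10. It is an illustration only, not the authors' code: the stride-1 sliding window, the helper name `segment_pairs`, the 4-dimensional placeholder state, and the random placeholder data are all assumptions, since the summary above does not say how segments were sampled from trajectories.

```python
import random

T = 100  # trajectory length, as reported in the experiment setup
H = 10   # segment length, as reported in the experiment setup


def segment_pairs(states, actions, horizon):
    """Return a list of (past_states, future_states, future_actions) windows.

    Assumption: stride-1 sliding windows over a single trajectory.
    The sampling scheme actually used in the paper is not stated here.
    """
    if len(states) != len(actions):
        raise ValueError("states and actions must have the same length")
    pairs = []
    # t marks the first step of the future segment.
    for t in range(horizon, len(states) - horizon + 1):
        past = states[t - horizon:t]
        future = states[t:t + horizon]
        future_actions = actions[t:t + horizon]
        pairs.append((past, future, future_actions))
    return pairs


# Placeholder data standing in for one simulated arm trajectory.
# STATE_DIM is hypothetical; ACTION_DIM = 2 matches one torque per joint
# of a 2-DOF arm.
STATE_DIM, ACTION_DIM = 4, 2
rng = random.Random(0)
states = [[rng.gauss(0.0, 1.0) for _ in range(STATE_DIM)] for _ in range(T)]
actions = [[rng.uniform(-1.0, 1.0) for _ in range(ACTION_DIM)] for _ in range(T)]

pairs = segment_pairs(states, actions, H)
print(len(pairs))  # T - 2*H + 1 windows
```

With these settings each trajectory yields T - 2H + 1 = 81 overlapping windows. A non-overlapping split would instead give far fewer training examples per trajectory, so the choice of stride is a design decision that the summary leaves open.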