Predictive Coding for Locally-Linear Control

Authors: Rui Shu, Tung Nguyen, Yinlam Chow, Tuan Pham, Khoat Than, Mohammad Ghavamzadeh, Stefano Ermon, Hung Bui

ICML 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on benchmark tasks show that our model reliably learns a controllable latent space that leads to superior performance when compared with state-of-the-art LCE baselines.
Researcher Affiliation | Collaboration | ¹Stanford University, ²VinAI, ³Hanoi University of Science and Technology, ⁴Google Research, ⁵Facebook AI Research.
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. It presents mathematical formulations and objectives, but not in an algorithmic format.
Open Source Code | Yes | https://github.com/VinAIResearch/PC3-pytorch
Open Datasets | No | The paper mentions "benchmark domains" (Planar System, Inverted Pendulum, Cartpole, and 3-Link Manipulator) from which data is generated, but it does not provide concrete access information (link, DOI, formal citation) for publicly available datasets used for training. Data is described as being generated through simulation.
Dataset Splits | No | The paper describes how samples are generated and used for training but does not specify explicit training, validation, or test dataset splits with percentages, sample counts, or references to predefined splits.
Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types, or memory amounts) used for running its experiments.
Software Dependencies | No | The paper implicitly suggests the use of PyTorch through its GitHub link, but it does not list specific software dependencies with version numbers (e.g., Python version, PyTorch version, CUDA version) needed for replication.
Experiment Setup | Yes | We apply the iLQR algorithm in the latent space with a quadratic cost, c(z_t, u_t) = (z_t − z_goal)^T Q (z_t − z_goal) + u_t^T R u_t, where z_t and z_goal are the encoded vectors of the current and goal observations, and Q = I_{n_z}, R = β I_{n_u}. By tuning the noise variance σ² as a hyperparameter, we can balance the latent-space retraction encouraged by ℓ_cons with the latent-space expansion encouraged by ℓ_cpc, and thus stabilize the learning of the latent space. Our overall objective is thus max_{E,F} λ₁ ℓ_cpc(E, F) + λ₂ ℓ_cons(E, F) − λ₃ ℓ_curv(F).
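The quadratic latent-space cost quoted in the row above is simple to state in code. The sketch below is illustrative only, not the authors' implementation: Q = I_{n_z} and R = β I_{n_u} are taken from the quoted setup, while the function name and the example dimensions are hypothetical.

```python
import numpy as np

def quadratic_latent_cost(z, u, z_goal, beta=1.0):
    """Quadratic tracking cost used by iLQR in the latent space:
    c(z_t, u_t) = (z_t - z_goal)^T Q (z_t - z_goal) + u_t^T R u_t,
    with Q = I_{n_z} and R = beta * I_{n_u} as in the quoted setup."""
    dz = z - z_goal
    # Q is the identity, so the state term reduces to a squared Euclidean distance.
    state_cost = dz @ dz
    # R = beta * I, so the control term is a scaled squared norm of the action.
    control_cost = beta * (u @ u)
    return state_cost + control_cost

# Hypothetical latent (n_z = 2) and control (n_u = 1) dimensions for illustration.
z = np.array([1.0, 2.0])
z_goal = np.array([0.0, 0.0])
u = np.array([0.5])
print(quadratic_latent_cost(z, u, z_goal, beta=2.0))  # 5.0 + 2.0 * 0.25 = 5.5
```

With Q fixed to the identity, the only cost hyperparameter left is β, which trades off tracking accuracy against control effort.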