Predictive Coding for Locally-Linear Control
Authors: Rui Shu, Tung Nguyen, Yinlam Chow, Tuan Pham, Khoat Than, Mohammad Ghavamzadeh, Stefano Ermon, Hung Bui
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on benchmark tasks show that our model reliably learns a controllable latent space that leads to superior performance when compared with state-of-the-art LCE baselines. |
| Researcher Affiliation | Collaboration | ¹Stanford University, ²VinAI, ³Hanoi University of Science and Technology, ⁴Google Research, ⁵Facebook AI Research. |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. It presents mathematical formulations and objectives but not in an algorithmic format. |
| Open Source Code | Yes | https://github.com/VinAIResearch/PC3-pytorch |
| Open Datasets | No | The paper mentions "benchmark domains" (Planar System, Inverted Pendulum, Cartpole, and 3-Link Manipulator) from which data is generated, but it does not provide concrete access information (link, DOI, formal citation) for publicly available datasets used for training. Data is described as being generated through simulation. |
| Dataset Splits | No | The paper describes how samples are generated and used for training but does not specify explicit training, validation, or test dataset splits with percentages, sample counts, or references to predefined splits. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types, or memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper implicitly suggests the use of PyTorch through its GitHub link, but it does not list specific software dependencies with version numbers (e.g., Python version, PyTorch version, CUDA version) needed for replication. |
| Experiment Setup | Yes | We apply the iLQR algorithm in the latent space with a quadratic cost, c(z_t, u_t) = (z_t − z_goal)ᵀ Q (z_t − z_goal) + u_tᵀ R u_t, where z_t and z_goal are the encoded vectors of the current and goal observations, and Q = I_{n_z}, R = β·I_{n_u}. By tuning the noise variance σ² as a hyperparameter, we can balance the latent space retraction encouraged by ℓ_cons with the latent space expansion encouraged by ℓ_cpc and thus stabilize the learning of the latent space. Our overall objective is thus max over E, F of λ₁·ℓ_cpc(E, F) + λ₂·ℓ_cons(E, F) − λ₃·ℓ_curv(F). |
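The quadratic latent-space cost quoted above is simple enough to sketch directly. The following is a minimal illustration (not the authors' implementation); the function name `quadratic_latent_cost` and the `beta` parameter are our own labels for the cost c(z_t, u_t) with Q = I and R = β·I described in the paper:

```python
import numpy as np

def quadratic_latent_cost(z_t, u_t, z_goal, beta=1.0):
    """Quadratic iLQR cost in latent space:
    c(z_t, u_t) = (z_t - z_goal)^T Q (z_t - z_goal) + u_t^T R u_t,
    with Q = I_{n_z} and R = beta * I_{n_u}, as stated in the paper.
    """
    dz = z_t - z_goal
    # Q is the identity, so the state term is a squared Euclidean distance.
    state_cost = dz @ dz
    # R = beta * I, so the control term is a scaled squared norm.
    control_cost = beta * (u_t @ u_t)
    return state_cost + control_cost
```

With z_t = (1, 0), z_goal = (0, 0), u_t = (0.5,) and β = 2, the cost is 1 + 2·0.25 = 1.5, matching a hand computation of the two quadratic terms.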