Learning World Models with Identifiable Factorization
Authors: Yuren Liu, Biwei Huang, Zhengmao Zhu, Honglong Tian, Mingming Gong, Yang Yu, Kun Zhang
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments in synthetic worlds demonstrate that our method accurately identifies the ground-truth latent variables, substantiating our theoretical findings. Moreover, experiments in variants of the DeepMind Control Suite and RoboDesk showcase the superior performance of our approach over baselines. |
| Researcher Affiliation | Collaboration | 1 National Key Laboratory for Novel Software Technology, Nanjing University, China 2 University of California San Diego, USA 3 Carnegie Mellon University, USA 4 Mohamed bin Zayed University of Artificial Intelligence, UAE 5 University of Melbourne, Australia 6 Polixir.ai, China 7 Peng Cheng Laboratory, China |
| Pseudocode | Yes | Algorithm 1: IFactor |
| Open Source Code | Yes | The source code is available at https://github.com/AlexLiuyuren/IFactor |
| Open Datasets | Yes | Moreover, experiments in variants of the DeepMind Control Suite and RoboDesk showcase the superior performance of our approach over baselines. |
| Dataset Splits | No | The paper describes training on collected trajectories and evaluating policies, but does not specify explicit train/validation/test dataset splits by percentage or count for fixed datasets. |
| Hardware Specification | Yes | Computing Hardware: We used a machine with the following CPU specifications: Intel(R) Xeon(R) Silver 4110 CPU @ 2.10GHz; 32 CPUs, eight physical cores per CPU, a total of 256 logical CPU units. The machine has two GeForce RTX 2080 Ti GPUs with 11GB GPU memory. |
| Software Dependencies | Yes | The models are implemented in PyTorch 1.13.1. |
| Experiment Setup | Yes | For all experiments, we assign β1 = β2 = β3 = β4 = 0.003 as the weights for the KL divergence terms. (from E.1) and Table 5: Some hyperparameters of our method in the environment of Modified Cartpole, Robodesk and DMC. |
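
The experiment setup row reports a single value shared by four KL weights. As a minimal sketch of what that setting could look like in code, assuming each β multiplies one KL divergence term and the weighted terms are summed into the training loss, the snippet below combines four per-group KL values. The names `BETAS` and `weighted_kl` are hypothetical and are not taken from the IFactor repository.

```python
# Hypothetical sketch: names are illustrative, not taken from the IFactor code.
# The only value drawn from the paper is beta1 = beta2 = beta3 = beta4 = 0.003.
BETAS = (0.003, 0.003, 0.003, 0.003)


def weighted_kl(kl_terms, betas=BETAS):
    """Return the beta-weighted sum of per-group KL divergence values.

    Assumption: one scalar KL value per beta weight, summed into one loss term.
    """
    if len(kl_terms) != len(betas):
        raise ValueError("expected one KL value per beta weight")
    return sum(b * kl for b, kl in zip(betas, kl_terms))


if __name__ == "__main__":
    # Example with made-up KL values for the four terms.
    print(weighted_kl([1.0, 2.0, 3.0, 4.0]))
```

Because all four weights are equal here, the result is simply 0.003 times the sum of the KL values; keeping the weights as a tuple makes it easy to vary them independently.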