Palm up: Playing in the Latent Manifold for Unsupervised Pretraining
Authors: Hao Liu, Tom Zahavy, Volodymyr Mnih, Satinder Singh
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments suggest that the learned representations transfer successfully to downstream tasks in both vision and reinforcement learning domains. We evaluate on CIFAR classification and out-of-distribution detection by transferring representations pretrained via unsupervised exploration in StyleGAN-based environments. The learned representations achieve results competitive with state-of-the-art methods in image recognition and out-of-distribution detection, despite being trained only on synthesized data without data augmentation. |
| Researcher Affiliation | Collaboration | Hao Liu (UC Berkeley); Tom Zahavy (DeepMind); Volodymyr Mnih (DeepMind); Satinder Singh (DeepMind) |
| Pseudocode | No | The paper describes its method and dynamics in prose and mathematical equations but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' block, nor any structured steps formatted as code. |
| Open Source Code | No | The paper does not contain any explicit statement about releasing source code for the methodology described, nor does it provide a direct link to a code repository. |
| Open Datasets | Yes | We conduct experiments on CIFAR classification and out-of-distribution detection by transferring our unsupervised exploratory pretrained representations from StyleGAN-based environments. We also train StyleGAN on observation data collected from Atari and apply our method to it. Both CIFAR and Atari are well-known public datasets. |
| Dataset Splits | No | The paper mentions training on CIFAR and Atari datasets and discusses 'online finetuning' for RL experiments. It refers to using 'the default hyperparameters of SimSiam [11]' and states that 'more details can be found in the supplemental material' regarding representation learning. However, it does not explicitly provide specific train/validation/test dataset splits (e.g., percentages, sample counts, or citations to predefined splits) in the main text. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for its experiments, such as GPU models, CPU types, or cloud-computing instance specifications. It mentions computational expense but does not name any explicit hardware. |
| Software Dependencies | No | The paper mentions various software components and models like StyleGAN [32], APT [44], SimSiam [11], and DrQ [37], but it does not specify any version numbers for these or other software dependencies, which would be necessary for reproducibility. |
| Experiment Setup | Yes | We used β = 0.95 in most of our experiments, based on our initial experiments. The hyperparameters of the representation learning follow the default hyperparameters of SimSiam [11], and more details can be found in the supplemental material. This includes specific hyperparameter values and a reference to detailed settings. |
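
For context on the SimSiam defaults cited in the setup row above, the sketch below shows the standard SimSiam objective (a projector/predictor pair trained with a symmetric negative cosine loss and a stop-gradient on the targets) in PyTorch. It is a minimal illustration, not the paper's code: the names (`SimSiamHead`, `simsiam_loss`) and the dimensions (`dim=2048`, `pred_dim=512`) are assumptions loosely following the SimSiam reference design [11].

```python
# Minimal sketch of the standard SimSiam objective (not the paper's code).
# Assumptions: feature dim 2048, predictor bottleneck 512, per SimSiam [11].
import torch
import torch.nn as nn
import torch.nn.functional as F


class SimSiamHead(nn.Module):
    """Projector + predictor pair used in SimSiam-style training."""

    def __init__(self, dim: int = 2048, pred_dim: int = 512):
        super().__init__()
        self.projector = nn.Sequential(
            nn.Linear(dim, dim), nn.BatchNorm1d(dim), nn.ReLU(inplace=True),
            nn.Linear(dim, dim),
        )
        self.predictor = nn.Sequential(
            nn.Linear(dim, pred_dim), nn.BatchNorm1d(pred_dim), nn.ReLU(inplace=True),
            nn.Linear(pred_dim, dim),
        )


def simsiam_loss(p1: torch.Tensor, z2: torch.Tensor,
                 p2: torch.Tensor, z1: torch.Tensor) -> torch.Tensor:
    """Symmetric negative cosine similarity with stop-gradient on the targets."""
    loss1 = -F.cosine_similarity(p1, z2.detach(), dim=-1).mean()
    loss2 = -F.cosine_similarity(p2, z1.detach(), dim=-1).mean()
    return 0.5 * (loss1 + loss2)


# Usage with backbone features f1, f2 from two views of the same input:
#   head = SimSiamHead()
#   z1, z2 = head.projector(f1), head.projector(f2)
#   p1, p2 = head.predictor(z1), head.predictor(z2)
#   loss = simsiam_loss(p1, z2, p2, z1)
```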