Palm up: Playing in the Latent Manifold for Unsupervised Pretraining

Authors: Hao Liu, Tom Zahavy, Volodymyr Mnih, Satinder Singh

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments suggest that the learned representations can be successfully transferred to downstream tasks in both vision and reinforcement learning domains. We conduct experiments on CIFAR classification and out-of-distribution detection by transferring our unsupervised, exploratory pretrained representations from StyleGAN-based environments. The learned representations achieve results competitive with state-of-the-art methods in image recognition and out-of-distribution detection despite being trained only on synthesized data without data augmentation. (A hedged sketch of this transfer protocol follows the table.)
Researcher Affiliation | Collaboration | Hao Liu (UC Berkeley); Tom Zahavy (DeepMind); Volodymyr Mnih (DeepMind); Satinder Singh (DeepMind)
Pseudocode | No | The paper describes its methods and dynamics using prose and mathematical equations but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks, nor does it present structured steps formatted like code.
Open Source Code | No | The paper does not contain any explicit statement about releasing source code for the methodology described, nor does it provide a direct link to a code repository.
Open Datasets | Yes | We conduct experiments on CIFAR classification and out-of-distribution detection by transferring our unsupervised, exploratory pretrained representations from StyleGAN-based environments. We also train StyleGAN on observation data collected from Atari and apply our method to it. This refers to well-known public datasets (CIFAR, Atari).
Dataset Splits | No | The paper mentions training on CIFAR and Atari and discusses 'online finetuning' for the RL experiments. It states that hyperparameters 'follow the default hyperparameters of SimSiam [11]' and that 'more details can be found in the supplemental material' for representation learning. However, the main text never specifies explicit train/validation/test splits (percentages, sample counts, or citations to predefined splits). (For reference, a sketch of CIFAR's predefined train/test split follows the table.)
Hardware Specification | No | The paper does not provide specific details about the hardware used for its experiments, such as GPU models, CPU types, or cloud computing instance specifications; it mentions computational expense but names no explicit hardware.
Software Dependencies | No | The paper mentions various software components and models such as StyleGAN [32], APT [44], SimSiam [11], and DrQ [37], but it does not specify version numbers for these or any other software dependencies, which would be necessary for reproducibility.
Experiment Setup | Yes | We used β = 0.95 in most of our experiments, based on our initial experiments. The hyperparameters of the representation learning follow the default hyperparameters of SimSiam [11]; more details can be found in the supplemental material. This gives specific hyperparameter values and points to detailed settings. (A hedged sketch of the cited SimSiam objective follows the table.)
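
Transfer protocol sketch (Research Type row). The paper's exact evaluation pipeline is not reproduced here; the following is a minimal linear-probe sketch assuming a frozen pretrained encoder. The ResNet-18 stand-in backbone, the 512-dimensional feature size, and the commented-out checkpoint path are all illustrative assumptions, not artifacts released by the paper; only the torchvision CIFAR-10 loader is standard.

```python
import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as T

device = "cuda" if torch.cuda.is_available() else "cpu"

# Stand-in for a pretrained encoder mapping images to 512-d features.
encoder = torchvision.models.resnet18(num_classes=512)
# encoder.load_state_dict(torch.load("pretrained_encoder.pt"))  # hypothetical checkpoint
encoder.to(device).eval()
for p in encoder.parameters():
    p.requires_grad = False  # freeze the encoder; only the linear head is trained

probe = nn.Linear(512, 10).to(device)  # 10 CIFAR-10 classes
opt = torch.optim.SGD(probe.parameters(), lr=0.1, momentum=0.9)

train_set = torchvision.datasets.CIFAR10(
    "data", train=True, download=True, transform=T.ToTensor()
)
loader = torch.utils.data.DataLoader(train_set, batch_size=256, shuffle=True)

# One epoch of linear-probe training on frozen features.
for images, labels in loader:
    images, labels = images.to(device), labels.to(device)
    with torch.no_grad():
        feats = encoder(images)  # frozen representation
    loss = nn.functional.cross_entropy(probe(feats), labels)
    opt.zero_grad()
    loss.backward()
    opt.step()
```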
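
Predefined split reference (Dataset Splits row). CIFAR itself ships with a standard split (50,000 training and 10,000 test images for CIFAR-10), which torchvision exposes via the `train` flag; the paper simply does not cite or restate it. CIFAR-10 is used here as the example.

```python
import torchvision

# CIFAR-10's predefined split: 50,000 train / 10,000 test images.
train_set = torchvision.datasets.CIFAR10("data", train=True, download=True)
test_set = torchvision.datasets.CIFAR10("data", train=False, download=True)
print(len(train_set), len(test_set))  # 50000 10000
```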
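
SimSiam objective sketch (Experiment Setup row). The row cites the default hyperparameters of SimSiam [11]; below is a minimal sketch of SimSiam's published symmetric stop-gradient loss, not the paper's full training loop. It also does not show where β = 0.95 enters, since the table does not define β's role. The dummy tensors in the usage example are placeholders for predictor outputs (p1, p2) and projector outputs (z1, z2) from two views.

```python
import torch
import torch.nn.functional as F

def simsiam_loss(p1, p2, z1, z2):
    """Symmetric negative cosine similarity with stop-gradient (SimSiam [11])."""
    def D(p, z):
        # Stop-gradient on the target branch is SimSiam's key ingredient.
        return -F.cosine_similarity(p, z.detach(), dim=-1).mean()
    return 0.5 * D(p1, z2) + 0.5 * D(p2, z1)

# Usage with dummy batch-of-8, 256-d predictor/projector outputs:
p1, p2, z1, z2 = (torch.randn(8, 256) for _ in range(4))
loss = simsiam_loss(p1, p2, z1, z2)
```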