Spectrum Random Masking for Generalization in Image-based Reinforcement Learning

Authors: Yangru Huang, Peixi Peng, Yifan Zhao, Guangyao Chen, Yonghong Tian

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments conducted on DMControl Generalization Benchmark demonstrate the proposed SRM achieves the state-of-the-art performance with strong generalization potentials.
Researcher Affiliation | Academia | (1) School of Computer Science, Peking University, Beijing, China; (2) Peng Cheng Laboratory, Shenzhen, China
Pseudocode | Yes | Algorithm 1: Spectrum Random Masking
Open Source Code | Yes | Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes] See supplemental material.
Open Datasets | Yes | We conduct our experiment on 5 tasks from Deep Mind Control Suite (DMControl) [34]
Dataset Splits | No | No explicit mention of validation dataset splits or percentages (e.g., 'X% for validation') was found. The paper mentions training on DMControl and testing on the DMControl Generalization Benchmark, implying a train/test split but not a validation split.
Hardware Specification | No | The paper states 'See supplemental material' for hardware specifications, but these details are not present in the provided main paper text.
Software Dependencies | No | The paper does not provide specific version numbers for software dependencies such as libraries, frameworks, or programming languages (e.g., PyTorch 1.9, Python 3.8).
Experiment Setup | Yes | For a fair comparison, we implement all methods following [13], where the same hyperparameters and network architecture are adopted. We use an 11-layer feed-forward convolution network as the shared encoder, which is followed by independent linear projections for the actor and critic. During training, the masking ratio and position of SRM are randomly chosen, and the ranges of r1 and r are set as [0, 0.5] and [0, 0.05] for each batch of observations, respectively.
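
The quoted setup applies Spectrum Random Masking (Algorithm 1) with a randomly chosen masking ratio and position for each batch of observations. The snippet below is a minimal NumPy sketch of a frequency-domain masking augmentation in that spirit; it is not the authors' implementation. The function name, the block-shaped mask, and the use of the [0, 0.5] ratio range are assumptions made for illustration, and the second quoted range, [0, 0.05], is not modeled because its exact role is not specified in the excerpt.

```python
import numpy as np


def spectrum_random_mask(obs, ratio_range=(0.0, 0.5), rng=None):
    """Zero a random rectangular block of the Fourier spectrum of `obs`.

    `obs` is a single observation shaped (H, W) or (C, H, W). The block's
    area fraction is drawn from `ratio_range` and its position is drawn
    uniformly, loosely mirroring the random ratio/position described in
    the quoted setup. Illustrative sketch only, not the paper's code.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(obs, dtype=np.float32)
    squeeze = x.ndim == 2
    if squeeze:                      # promote (H, W) to (1, H, W)
        x = x[None]
    c, h, w = x.shape

    # Draw one ratio and one position, shared by all channels of this frame.
    ratio = rng.uniform(*ratio_range)
    mh = max(1, int(round(h * np.sqrt(ratio))))
    mw = max(1, int(round(w * np.sqrt(ratio))))
    top = rng.integers(0, h - mh + 1)
    left = rng.integers(0, w - mw + 1)

    out = np.empty_like(x)
    for i in range(c):
        spec = np.fft.fftshift(np.fft.fft2(x[i]))       # centred spectrum
        spec[top:top + mh, left:left + mw] = 0.0        # mask a random block
        out[i] = np.real(np.fft.ifft2(np.fft.ifftshift(spec)))
    return out[0] if squeeze else out
```

In a training loop, such an augmentation would be applied to the observations of each sampled batch before they are fed to the shared encoder, with the ratio and position re-drawn as described in the quoted experiment setup.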