RRL: Resnet as representation for Reinforcement Learning
Authors: Rutav M Shah, Vikash Kumar
ICML 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In a simulated dexterous manipulation benchmark where state-of-the-art methods fail to make significant progress, RRL delivers contact-rich behaviors. Our experimental evaluation aims to address the following questions: (1) Do pre-trained representations acquired from large real-world image datasets allow RRL to learn complex tasks directly from raw sensory inputs (camera images and joint encoders)? (2) How do RRL's performance and efficiency compare against other state-of-the-art methods? (3) How do various representational choices influence the generality and versatility of the resulting behaviors? (4) What are the effects of various design decisions on RRL? (5) Are commonly used benchmarks for studying image-based continuous control methods effective? |
| Researcher Affiliation | Academia | ¹Department of Computer Science and Engineering, Indian Institute of Technology, Kharagpur, India; ²Department of Computer Science, University of Washington, Seattle, USA. |
| Pseudocode | Yes | Algorithm 1 RRL (a hedged sketch of the algorithm's core rollout loop appears below the table). |
| Open Source Code | No | The paper does not explicitly state that its own source code is released, nor does it provide a link to it. It cites other projects' code repositories (e.g., 'Yarats, D. and Kostrikov, I. Soft actor-critic (sac) implementation in pytorch. https://github.com/denisyarats/pytorch_sac, 2020.' and 'Subramanian, A. PyTorch-VAE. https://github.com/AntixK/PyTorch-VAE, 2020.'). |
| Open Datasets | Yes | We use the standard ResNet-34 model as RRL's feature extractor. The model is pre-trained on the ImageNet classification task, which covers 1000 classes and 1.28 million training images. (A loading sketch for this frozen extractor appears below the table.) |
| Dataset Splits | No | The paper uses a ResNet model pre-trained on ImageNet but does not state the train/validation/test splits for its own experiments on the ADROIT or DMControl suites. It reports 'samples (M)' and 'Robot Hours' as performance measures, not dataset splits that could be reproduced. |
| Hardware Specification | No | The paper does not specify the hardware used to run the experiments (e.g., GPU or CPU models, memory, or cloud instances). It mentions 'Robot Hours' as a measure of compute time but does not describe the underlying hardware. |
| Software Dependencies | No | The paper does not provide specific version numbers for software dependencies such as Python, PyTorch, TensorFlow, or other libraries used in the implementation. |
| Experiment Setup | Yes | All the hyperparameters used for training are summarized in the Appendix (Table 2). |
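
For readers who want to reproduce the feature extractor described in the Open Datasets row, here is a minimal sketch assuming torchvision's ImageNet-pretrained ResNet-34 and standard ImageNet preprocessing. The paper does not release reference code, so the `encode` helper and the preprocessing pipeline below are illustrative assumptions, not the authors' implementation.

```python
import torch
import torchvision.models as models
import torchvision.transforms as T

# Load ResNet-34 pre-trained on the ImageNet classification task
# (1000 classes, ~1.28M training images), as described in the paper.
resnet = models.resnet34(weights=models.ResNet34_Weights.IMAGENET1K_V1)

# Drop the final classification layer so the network outputs
# 512-dimensional pooled features instead of class logits.
resnet.fc = torch.nn.Identity()

# RRL keeps the encoder frozen; disable gradients and set eval mode.
resnet.eval()
for p in resnet.parameters():
    p.requires_grad = False

# Standard ImageNet preprocessing for camera frames (an assumption;
# the paper's exact image pipeline may differ).
preprocess = T.Compose([
    T.Resize(256),
    T.CenterCrop(224),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

@torch.no_grad()
def encode(frame):
    """Map a PIL camera frame to a 512-d ResNet-34 feature vector."""
    return resnet(preprocess(frame).unsqueeze(0)).squeeze(0)
```

Because the encoder is never fine-tuned, features can be computed entirely under `torch.no_grad()`, with no optimizer state kept for the encoder.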
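
Algorithm 1 itself is only named in the Pseudocode row. As a reading aid, the sketch below conveys its core structure under stated assumptions: each camera frame is embedded by the frozen encoder above, concatenated with proprioceptive readings, and the concatenation is treated as the state for a standard off-the-shelf RL learner. The gym-style `env` interface, the observation layout, `policy`, and `horizon` are hypothetical names, not the paper's API.

```python
import torch

def rrl_rollout(env, policy, encode, horizon=200):
    """Collect one trajectory with RRL-style states: the frozen ResNet
    feature of the current camera frame concatenated with the robot's
    proprioceptive reading. Any standard RL algorithm can consume `traj`.
    """
    traj = []
    proprio = env.reset()                  # assumed: joint-encoder readings
    for _ in range(horizon):
        feat = encode(env.render())        # 512-d features, no gradients
        state = torch.cat(
            [feat, torch.as_tensor(proprio, dtype=torch.float32)]
        )
        action = policy(state)             # any standard policy network
        proprio, reward, done, _ = env.step(action.detach().numpy())
        traj.append((state, action, reward))
        if done:
            break
    return traj
```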