Reinforcement Learning with Neural Radiance Fields
Authors: Danny Driess, Ingmar Schubert, Pete Florence, Yunzhu Li, Marc Toussaint
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments indicate that NeRF as supervision leads to a latent space better suited for the downstream RL tasks involving robotic object manipulations like hanging mugs on hooks, pushing objects, or opening doors. |
| Researcher Affiliation | Collaboration | Danny Driess (TU Berlin), Ingmar Schubert (TU Berlin), Pete Florence (Google), Yunzhu Li (MIT), Marc Toussaint (TU Berlin) |
| Pseudocode | No | The paper describes its methods in prose and with mathematical formulas, but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes] See website. |
| Open Datasets | No | The paper states that the environments are custom and data is collected by random interactions, rather than using a publicly available or open dataset with access information provided. |
| Dataset Splits | Yes | Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? [Yes] See appendix. |
| Hardware Specification | Yes | Additionally of particular relevance, various methods have developed latent-conditioned [42, 43, 44] or compositional/object-oriented approaches for NeRFs [45, 46, 47, 48, 49, 50, 51, 52, 53], although neither they nor other NeRF-style methods have, to our knowledge, been applied to RL. In our case, we are not constrained by inference-time computation issues, since we do not need to render images, and only have to run our latent-space encoder (with a runtime of approx. 7 ms on an RTX3090). |
| Software Dependencies | No | The paper mentions using PPO as the RL algorithm and references Stable Baselines3, but it does not specify explicit version numbers for any software dependencies. |
| Experiment Setup | Yes | We use PPO [86] as the RL algorithm and four camera views in all experiments. Refer to the appendix for more details about our environments, parameter choices, network architectures, and training times. |
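
The idea flagged in the Research Type row, using NeRF reconstruction as supervision to learn a latent state for RL, can be sketched roughly as follows. This is a minimal, hypothetical PyTorch illustration rather than the authors' released code: the layer sizes, the single-pass volume rendering, and the omission of view directions and positional encoding are all simplifying assumptions.

```python
# Hypothetical sketch: a latent-conditioned NeRF decoder supervises an image
# encoder; the resulting latent z is what the RL policy later observes.
import torch
import torch.nn as nn


class MultiViewEncoder(nn.Module):
    """Encodes a set of RGB camera views into a single latent vector z."""

    def __init__(self, num_views: int = 4, latent_dim: int = 256):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 128, 4, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(num_views * 128, latent_dim)

    def forward(self, views: torch.Tensor) -> torch.Tensor:
        b, v, c, h, w = views.shape            # (B, num_views, 3, H, W)
        feats = self.conv(views.reshape(b * v, c, h, w)).reshape(b, v * 128)
        return self.fc(feats)


class LatentConditionedNeRF(nn.Module):
    """Maps (3D point, latent z) to density and colour; view direction omitted for brevity."""

    def __init__(self, latent_dim: int = 256, hidden: int = 256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3 + latent_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),              # 1 density + 3 RGB channels
        )

    def forward(self, points: torch.Tensor, z: torch.Tensor):
        z = z.unsqueeze(1).expand(-1, points.shape[1], -1)
        out = self.mlp(torch.cat([points, z], dim=-1))
        return torch.relu(out[..., :1]), torch.sigmoid(out[..., 1:])


def render_rays(nerf, z, origins, dirs, near=0.5, far=3.0, n_samples=32):
    """Simplified volume rendering: uniform samples, no hierarchical sampling."""
    r = origins.shape[0]
    t = torch.linspace(near, far, n_samples, device=origins.device)
    points = origins[:, None, :] + dirs[:, None, :] * t[None, :, None]   # (R, S, 3)
    sigma, rgb = nerf(points.reshape(1, r * n_samples, 3), z)
    sigma, rgb = sigma.reshape(r, n_samples), rgb.reshape(r, n_samples, 3)
    alpha = 1.0 - torch.exp(-sigma * (far - near) / n_samples)
    trans = torch.cumprod(
        torch.cat([torch.ones(r, 1, device=alpha.device), 1.0 - alpha + 1e-10], dim=1),
        dim=1)[:, :-1]
    return ((alpha * trans)[..., None] * rgb).sum(dim=1)                  # (R, 3)


# Training signal: reconstruct pixel colours of the camera views from z via the
# NeRF decoder (photometric loss); the RL policy later consumes only z.
encoder, nerf = MultiViewEncoder(), LatentConditionedNeRF()
views = torch.rand(1, 4, 3, 64, 64)                                      # dummy multi-view input
origins = torch.zeros(1024, 3)                                           # dummy ray origins
dirs = nn.functional.normalize(torch.randn(1024, 3), dim=-1)             # dummy ray directions
target_rgb = torch.rand(1024, 3)                                         # dummy ground-truth pixels
z = encoder(views)
loss = ((render_rays(nerf, z, origins, dirs) - target_rgb) ** 2).mean()
loss.backward()
```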
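
On the RL side (Software Dependencies and Experiment Setup rows), the paper trains PPO on the latent produced by the frozen encoder. Below is a hedged sketch using Stable Baselines3: the environment id "MugHanging-v0", the observation wrapper, and all hyperparameters are assumptions, and since the paper does not pin library versions, current gymnasium and Stable Baselines3 APIs are assumed.

```python
# Hypothetical sketch: a frozen, NeRF-pretrained encoder turns multi-view image
# observations into a latent vector, and Stable Baselines3 PPO trains on it.
import gymnasium as gym
import numpy as np
import torch
from stable_baselines3 import PPO


class LatentObservationWrapper(gym.ObservationWrapper):
    """Replaces image observations of shape (num_views, 3, H, W) with the encoder latent."""

    def __init__(self, env, encoder, latent_dim=256):
        super().__init__(env)
        self.encoder = encoder.eval()
        self.observation_space = gym.spaces.Box(
            low=-np.inf, high=np.inf, shape=(latent_dim,), dtype=np.float32)

    def observation(self, obs):
        with torch.no_grad():                  # encoder stays frozen during RL
            views = torch.as_tensor(obs, dtype=torch.float32).unsqueeze(0)
            return self.encoder(views).squeeze(0).cpu().numpy()


encoder = MultiViewEncoder()                   # NeRF-pretrained encoder from the sketch above
env = LatentObservationWrapper(gym.make("MugHanging-v0"), encoder)   # hypothetical env id
model = PPO("MlpPolicy", env, verbose=1)       # PPO as in the paper; SB3 defaults otherwise
model.learn(total_timesteps=1_000_000)
```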