Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Instance-based Generalization in Reinforcement Learning
Authors: Martin Bertran, Natalia Martinez, Mariano Phielipp, Guillermo Sapiro
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate IRL on several standard continuous control environments, including Acrobot, CartPole, MountainCar, and LunarLander. Table 1 shows the performance of IRL on four continuous control environments. |
| Researcher Affiliation | Academia | The provided paper text does not contain any clear institutional affiliations, email domains, or author addresses to classify the affiliation type. |
| Pseudocode | Yes | Algorithm 1: Instance-based Reinforcement Learning |
| Open Source Code | Yes | The code for Instance-based Reinforcement Learning (IRL) is available at [https://github.com/IRL_project/code](https://github.com/IRL_project/code). |
| Open Datasets | Yes | We evaluate IRL on several standard continuous control environments, including Acrobot, CartPole, MountainCar, and LunarLander. |
| Dataset Splits | No | The paper mentions 'a standard 80/20 train-test split for data used to train the instance memory' but does not specify a separate validation dataset split. |
| Hardware Specification | Yes | All experiments were conducted on a machine equipped with an Intel Core i9-10900K CPU, 64GB RAM, and an NVIDIA GeForce RTX 3090 GPU. |
| Software Dependencies | Yes | Our implementation uses Python 3.8.5, PyTorch 1.10.0, and OpenAI Gym 0.21.0. |
| Experiment Setup | Yes | For all environments, we used a learning rate of 0.001, a batch size of 64, and a replay buffer size of 100,000. The discount factor was set to 0.99. We used the Adam optimizer. |
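
The Experiment Setup values quoted in the table can be gathered into a single configuration object when attempting a reproduction. The following is a minimal sketch, not the authors' code: the class name `ExperimentConfig`, its field names, and the `validate` helper are hypothetical, and only the numeric values and the optimizer name are taken from the quoted LLM response, which should itself be treated as an estimate.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class ExperimentConfig:
    """Hypothetical container for the hyperparameters quoted in the table."""

    learning_rate: float = 0.001        # "a learning rate of 0.001"
    batch_size: int = 64                # "a batch size of 64"
    replay_buffer_size: int = 100_000   # "a replay buffer size of 100,000"
    discount_factor: float = 0.99       # "The discount factor was set to 0.99"
    optimizer: str = "Adam"             # "We used the Adam optimizer"


def validate(config: ExperimentConfig) -> None:
    """Run basic sanity checks; raise ValueError on an implausible setting."""
    if config.learning_rate <= 0:
        raise ValueError("learning_rate must be positive")
    if not 0.0 < config.discount_factor <= 1.0:
        raise ValueError("discount_factor must lie in (0, 1]")
    if config.batch_size > config.replay_buffer_size:
        raise ValueError("batch_size cannot exceed replay_buffer_size")


config = ExperimentConfig()
validate(config)
print(asdict(config))
```

Freezing the dataclass keeps the recorded settings from being changed by accident during a run, and `asdict` gives a plain dictionary that can be logged alongside results.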