Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Inverse Reinforcement Learning From Like-Minded Teachers
Authors: Ritesh Noothigattu, Tom Yan, Ariel D. Procaccia
Pages: 9197-9204
AAAI 2021 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We next study the empirical performance of our algorithm for the inverse multi-armed bandit problem. |
| Researcher Affiliation | Academia | 1 Carnegie Mellon University 2 Harvard University EMAIL, EMAIL |
| Pseudocode | No | For completeness we present these algorithms, and formally state their feature-matching guarantees, in the full version of the paper. |
| Open Source Code | No | The paper does not contain any explicit statements or links indicating the availability of open-source code for the described methodology. |
| Open Datasets | No | In the first set of experiments, we fix the noise standard deviation σ to 1, generate n = 500 agents according to the noise η ∼ N(0, σ²), and vary parameter δ from 0.01 to 3. |
| Dataset Splits | No | The paper describes generating data for simulations ('generate n = 500 agents') and averaging results over multiple runs ('averaged over 1000 runs'), but it does not specify traditional training/validation/test dataset splits for a fixed dataset. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments. |
| Software Dependencies | No | The paper does not specify any software dependencies with version numbers. |
| Experiment Setup | Yes | In the first set of experiments, we fix the noise standard deviation σ to 1, generate n = 500 agents according to the noise η ∼ N(0, σ²), and vary parameter δ from 0.01 to 3. ... Next, we fix the parameter δ to 1 and generate n = 500 agents according to noise η ∼ N(0, σ²), while varying the noise parameter σ from 0.01 to 5. |
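
The two parameter sweeps quoted in the Experiment Setup row can be sketched as follows. This is a minimal illustration, not the authors' code: the number of grid points per sweep (`num_points`) and the dimension of each agent's noise vector (`dim`) are not stated in the quoted text and are hypothetical placeholders, as are all function names. Only n = 500, the fixed values σ = 1 and δ = 1, and the sweep ranges come from the quoted setup.

```python
import random

N_AGENTS = 500      # n = 500 agents, from the quoted setup
SIGMA_FIXED = 1.0   # noise standard deviation held fixed in the first sweep
DELTA_FIXED = 1.0   # delta held fixed in the second sweep


def linspace(lo, hi, num):
    """Evenly spaced grid from lo to hi inclusive (requires num >= 2)."""
    step = (hi - lo) / (num - 1)
    return [lo + i * step for i in range(num)]


def sample_noise(n_agents, sigma, dim, rng):
    """Draw one noise vector per agent with i.i.d. N(0, sigma^2) entries.

    `dim` is a hypothetical placeholder; the quoted text does not give it.
    """
    return [[rng.gauss(0.0, sigma) for _ in range(dim)] for _ in range(n_agents)]


def sweep_configs(num_points=10):
    """Return (delta, sigma) pairs for both sweeps.

    `num_points` is an assumption; the quoted text gives only the ranges.
    """
    # Sweep 1: sigma fixed at 1, delta varied from 0.01 to 3.
    configs = [(d, SIGMA_FIXED) for d in linspace(0.01, 3.0, num_points)]
    # Sweep 2: delta fixed at 1, sigma varied from 0.01 to 5.
    configs += [(DELTA_FIXED, s) for s in linspace(0.01, 5.0, num_points)]
    return configs


rng = random.Random(0)  # fixed seed so the sketch is repeatable
configs = sweep_configs()
noise = sample_noise(N_AGENTS, SIGMA_FIXED, dim=3, rng=rng)
```

Each `(delta, sigma)` pair would then drive one simulated experiment; per the Dataset Splits row, the paper reports results averaged over 1000 runs, which this sketch does not attempt to reproduce.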