Inverse Reinforcement Learning From Like-Minded Teachers
Authors: Ritesh Noothigattu, Tom Yan, Ariel D. Procaccia
AAAI 2021, pp. 9197–9204
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We next study the empirical performance of our algorithm for the inverse multi-armed bandit problem. |
| Researcher Affiliation | Academia | ¹Carnegie Mellon University, ²Harvard University; {riteshn, tyyan}@cmu.edu, arielpro@seas.harvard.edu |
| Pseudocode | No | For completeness we present these algorithms, and formally state their feature-matching guarantees, in the full version of the paper. |
| Open Source Code | No | The paper does not contain any explicit statements or links indicating the availability of open-source code for the described methodology. |
| Open Datasets | No | In the first set of experiments, we fix the noise standard deviation σ to 1, generate n = 500 agents according to the noise η ∼ N(0, σ²), and vary parameter δ from 0.01 to 3. |
| Dataset Splits | No | The paper describes generating data for simulations ('generate n = 500 agents') and averaging results over multiple runs ('averaged over 1000 runs'), but it does not specify traditional training/validation/test dataset splits for a fixed dataset. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments. |
| Software Dependencies | No | The paper does not specify any software dependencies with version numbers. |
| Experiment Setup | Yes | In the first set of experiments, we fix the noise standard deviation σ to 1, generate n = 500 agents according to the noise η ∼ N(0, σ²), and vary parameter δ from 0.01 to 3. ... Next, we fix the parameter δ to 1 and generate n = 500 agents according to noise η ∼ N(0, σ²), while varying the noise parameter σ from 0.01 to 5. |
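
The Experiment Setup row describes a fully synthetic protocol, so the data-generation side can be sketched directly in code. Below is a minimal Python sketch of the noise sweep, assuming an additive-noise reward model; `generate_agents`, `noise_sweep`, the aggregation-by-averaging step, and the error metric are all illustrative placeholders, not the paper's algorithm or evaluation. The δ sweep is omitted because the quoted setup does not say how δ enters the agents' behavior. Only the constants (n = 500 agents, η ∼ N(0, σ²), the σ range 0.01 to 5, and averaging over 1000 runs) come from the paper's quoted setup.

```python
import numpy as np

def generate_agents(true_reward, sigma, n_agents=500, rng=None):
    """One noisy copy of the ground-truth reward per agent, with i.i.d.
    Gaussian noise eta ~ N(0, sigma^2) as in the paper's quoted setup.
    The additive form (true reward + noise) is an assumption; the paper
    only states the noise distribution, not how it enters each agent."""
    rng = np.random.default_rng() if rng is None else rng
    noise = rng.normal(0.0, sigma, size=(n_agents, len(true_reward)))
    return true_reward[None, :] + noise

def noise_sweep(d_features=10, n_agents=500, n_runs=1000, seed=0):
    """Mirror the second experiment: n = 500 agents, sigma varied from
    0.01 to 5, results averaged over 1000 runs. The recovery step
    (averaging the agents' noisy rewards) and the error metric are
    placeholders standing in for the paper's actual algorithm."""
    rng = np.random.default_rng(seed)
    sigmas = np.linspace(0.01, 5.0, 20)
    errors = []
    for sigma in sigmas:
        run_errors = []
        for _ in range(n_runs):
            true_reward = rng.normal(size=d_features)
            agents = generate_agents(true_reward, sigma, n_agents, rng)
            recovered = agents.mean(axis=0)  # hypothetical aggregation step
            run_errors.append(np.linalg.norm(recovered - true_reward))
        errors.append(float(np.mean(run_errors)))
    return sigmas, np.array(errors)

if __name__ == "__main__":
    # Small n_runs for a quick smoke test; the paper averages over 1000 runs.
    sigmas, errors = noise_sweep(n_runs=50)
    for s, e in zip(sigmas, errors):
        print(f"sigma = {s:.2f}  mean recovery error = {e:.4f}")
```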