Scaling Marginalized Importance Sampling to High-Dimensional State-Spaces via State Abstraction
Authors: Brahma S. Pavse, Josiah P. Hanna
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, our empirical evaluation on difficult, high-dimensional state-space OPE tasks shows that the abstract ratios can make MIS OPE estimators achieve lower mean-squared error and be more robust to hyperparameter tuning than the ground ratios. ... 4 Empirical Study: We will now show how projecting $\mathcal{S} \to \mathcal{S}^{\phi}$ can produce more accurate OPE estimates in practice. |
| Researcher Affiliation | Academia | University of Wisconsin-Madison; pavse@wisc.edu, jphanna@cs.wisc.edu |
| Pseudocode | No | The paper describes its algorithm ('Abstract Best DICE') and an optimization problem, but it does not present pseudocode or a clearly labeled algorithm block. |
| Open Source Code | No | The paper does not include any explicit statement about releasing source code or a link to a code repository. |
| Open Datasets | Yes | Reacher (Brockman et al. 2016). A robotic arm tries to move to a goal location. Here, $s \in \mathbb{R}^{11}$ and $a \in \mathbb{R}^{2}$. ... Walker2D (Brockman et al. 2016). ... Pusher (Brockman et al. 2016). ... Ant UMaze (Fu et al. 2020). |
| Dataset Splits | No | The paper mentions 'batch size' and uses the concept of validation in relation to assumptions, but it does not specify explicit training, validation, and test dataset splits with percentages or sample counts. |
| Hardware Specification | No | The paper does not provide any specific details regarding the hardware (e.g., GPU/CPU models, memory, cloud instances) used for running the experiments. |
| Software Dependencies | No | The paper does not specify any software dependencies with version numbers. |
| Experiment Setup | No | The paper mentions hyperparameters like 'learning rates of ζ and ν, αζ and αν' and discusses hyperparameter tuning and robustness, but it does not explicitly list the specific values or ranges of these hyperparameters used in the experiments. |