Risk Estimation in a Markov Cost Process: Lower and Upper Bounds
Authors: Gugan Thoppe, Prashanth L A, Sanjay P. Bhat
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | In this paper, we are concerned with the problem of estimating a risk measure from a sample path of a discounted Markov Cost Process (MCP). In the context of RL, this is equivalent to the policy evaluation problem, albeit for a risk measure. For this problem, we derive minimax sample complexity lower bounds as well as upper bounds. |
| Researcher Affiliation | Collaboration | (1) Dept. of Computer Science and Automation, Indian Institute of Science (IISc), Bengaluru, India; Robert Bosch Centre for Data Science and Artificial Intelligence, IIT Madras, Chennai, India. (2) Dept. of Computer Science and Engineering, Indian Institute of Technology Madras, Chennai, India. (3) TCS Research, Hyderabad, India. |
| Pseudocode | No | The paper describes algorithms and estimation schemes in prose but does not include structured pseudocode blocks or sections explicitly labeled 'Algorithm'. |
| Open Source Code | No | The paper does not contain any statement regarding the release of source code or a link to a code repository for the methodology described. |
| Open Datasets | No | The paper is theoretical and focuses on deriving lower and upper bounds for risk estimation; it does not use or specify any public datasets for training or experimentation. |
| Dataset Splits | No | The paper is theoretical and does not conduct experiments with dataset splits, thus no information on training, validation, or test splits is provided. |
| Hardware Specification | No | The paper is theoretical and does not describe computational experiments, therefore no hardware specifications are provided. |
| Software Dependencies | No | The paper is theoretical and does not describe computational experiments that would require specific software dependencies with version numbers. |
| Experiment Setup | No | The paper is theoretical and focuses on mathematical derivations and proofs of bounds, thus it does not include details on experimental setup, hyperparameters, or training configurations. |
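
To make the estimation problem in the Research Type row concrete, here is a minimal Python sketch of estimating one risk measure, Conditional Value-at-Risk (CVaR), of the discounted cumulative cost of a small Markov cost process. It is not taken from the paper, which provides no code or pseudocode. The two-state chain, the cost values, the truncation horizon, the use of independent truncated trajectories, and the function names `discounted_cost` and `empirical_cvar` are all assumptions chosen for illustration.

```python
import math
import random


def discounted_cost(P, cost, gamma, start, horizon, rng):
    """Simulate one trajectory and return its truncated discounted cumulative cost."""
    state, total, discount = start, 0.0, 1.0
    for _ in range(horizon):
        total += discount * cost[state]
        discount *= gamma
        # Sample the next state from row `state` of the transition matrix.
        state = rng.choices(range(len(P)), weights=P[state])[0]
    return total


def empirical_cvar(samples, alpha):
    """Average of the largest ceil((1 - alpha) * n) samples (costs, so large is bad)."""
    ordered = sorted(samples, reverse=True)
    k = max(1, math.ceil((1 - alpha) * len(ordered)))
    return sum(ordered[:k]) / k


# Hypothetical two-state chain: state 0 is free, state 1 costs 1 per step.
rng = random.Random(0)
P = [[0.9, 0.1], [0.2, 0.8]]
cost = [0.0, 1.0]
samples = [discounted_cost(P, cost, 0.9, 0, 100, rng) for _ in range(500)]
print("mean cost:", sum(samples) / len(samples))
print("CVaR at level 0.95:", empirical_cvar(samples, 0.95))
```

Truncating at a finite horizon introduces a bias of at most `gamma**horizon * max(cost) / (1 - gamma)`. The sketch samples independent trajectories for simplicity, whereas the paper's setting is estimation from a sample path, so this should be read only as an illustration of the quantity being estimated.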