On The Statistical Complexity of Offline Decision-Making
Authors: Thanh Nguyen-Tang, Raman Arora
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We study the statistical complexity of offline decision-making with function approximation, establishing (near) minimax-optimal rates for stochastic contextual bandits and Markov decision processes. The performance limits are captured by the pseudo-dimension of the (value) function class and a new characterization of the behavior policy that strictly subsumes all the previous notions of data coverage in the offline decision-making literature. |
| Researcher Affiliation | Academia | Department of Computer Science, Johns Hopkins University, Baltimore 21218, USA. |
| Pseudocode | Yes | Algorithm 1: Hedge for Offline Decision-Making (OfDM-Hedge) |
| Open Source Code | No | The paper does not provide any concrete access information (e.g., specific repository link, explicit code release statement) for the source code of the methodology described. |
| Open Datasets | No | The paper is theoretical and does not conduct empirical experiments using a specific dataset. Therefore, it does not provide concrete access information for a publicly available or open dataset for training. |
| Dataset Splits | No | The paper is theoretical and does not conduct empirical experiments. Therefore, it does not describe dataset splits for training, validation, or testing. |
| Hardware Specification | No | The paper is theoretical and does not conduct empirical experiments. Therefore, it does not provide details about the hardware used. |
| Software Dependencies | No | The paper is theoretical and does not conduct empirical experiments. Therefore, it does not list specific software dependencies with version numbers. |
| Experiment Setup | No | The paper is theoretical and does not conduct empirical experiments. Therefore, it does not describe any specific experimental setup details like hyperparameters or system-level training settings. |