Hedging as Reward Augmentation in Probabilistic Graphical Models
Authors: Debarun Bhattacharjya, Radu Marinescu
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We illustrate the concepts with examples and counter-examples, and conduct experiments to demonstrate the properties and applicability of the proposed computational tools that enable agents to proactively identify potential hedging opportunities in real-world situations. |
| Researcher Affiliation | Industry | Debarun Bhattacharjya, IBM Research (debarunb@us.ibm.com); Radu Marinescu, IBM Research (radu.marinescu@ie.ibm.com) |
| Pseudocode | No | The paper describes methods in text and equations but does not include any pseudocode or algorithm blocks. |
| Open Source Code | No | Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [No] All the information necessary to conduct the experiments is provided in the paper itself. |
| Open Datasets | Yes | The numbers are based on real data from 1996 to 1998 (see Appendix C.1). |
| Dataset Splits | No | Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? [N/A] |
| Hardware Specification | No | Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? [No] The experiments do not require extensive compute power. |
| Software Dependencies | No | Computations were done using exact Bayesian inference available from the pgmpy package (https://github.com/pgmpy/pgmpy). However, no specific version number for pgmpy or any other software is mentioned. |
| Experiment Setup | Yes | Consider the following numerical case: σ = 1, expected performances for tasks: P1 = [2, 1], P2 = [3, 2], P3 = [1, 7]. With these numbers, policy π = h is optimal a-priori. |
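Since the paper reports running exact Bayesian inference (via pgmpy) but releases no code, a minimal sketch of what exact inference by enumeration looks like may be useful for a reader attempting reproduction. The toy two-node network and all numbers below are illustrative assumptions, not the paper's model; pgmpy's `VariableElimination` would perform the equivalent computation on a full network.

```python
# Minimal sketch of exact Bayesian inference by enumeration on a toy
# two-node network A -> B (binary variables). This is NOT the paper's
# model; it only illustrates the kind of exact inference the authors
# report running via pgmpy. All numbers are made up for illustration.

# P(A): prior over A
p_a = {0: 0.6, 1: 0.4}
# P(B | A): conditional distribution of B given A
p_b_given_a = {
    0: {0: 0.9, 1: 0.1},
    1: {0: 0.2, 1: 0.8},
}

def posterior_a_given_b(b):
    """Return P(A | B=b) by enumerating the joint P(A, B=b) and normalizing."""
    joint = {a: p_a[a] * p_b_given_a[a][b] for a in p_a}
    z = sum(joint.values())          # P(B=b), the normalizing constant
    return {a: v / z for a, v in joint.items()}

post = posterior_a_given_b(1)
# P(A=1 | B=1) = 0.4*0.8 / (0.6*0.1 + 0.4*0.8) = 0.32 / 0.38 ≈ 0.842
```

On larger networks, enumeration becomes exponential in the number of variables, which is why pgmpy implements variable elimination; for the small models reported in the paper, either approach suffices.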