Hedging as Reward Augmentation in Probabilistic Graphical Models

Authors: Debarun Bhattacharjya, Radu Marinescu

NeurIPS 2022

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "We illustrate the concepts with examples and counter-examples, and conduct experiments to demonstrate the properties and applicability of the proposed computational tools that enable agents to proactively identify potential hedging opportunities in real-world situations." |
| Researcher Affiliation | Industry | Debarun Bhattacharjya, IBM Research (debarunb@us.ibm.com); Radu Marinescu, IBM Research (radu.marinescu@ie.ibm.com) |
| Pseudocode | No | The paper describes its methods in text and equations but does not include any pseudocode or algorithm blocks. |
| Open Source Code | No | "Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [No] All the information necessary to conduct the experiments is provided in the paper itself." |
| Open Datasets | Yes | "The numbers are based on real data from 1996 to 1998 (see Appendix C.1)." |
| Dataset Splits | No | "Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? [N/A]" |
| Hardware Specification | No | "Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? [No] The experiments do not require extensive compute power." |
| Software Dependencies | No | "Computations were done using exact Bayesian inference available from the pgmpy package (https://github.com/pgmpy/pgmpy)." No specific version of pgmpy or any other software is mentioned; a minimal usage sketch follows the table. |
| Experiment Setup | Yes | "Consider the following numerical case: σ = 1, expected performances for tasks: P1 = [2, 1], P2 = [3, 2], P3 = [1, 7]. With these numbers, policy π = h is optimal a-priori." A hypothetical encoding of these numbers also follows the table. |
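
The Software Dependencies entry reports that computations used exact Bayesian inference from pgmpy, without pinning a version. The sketch below is a minimal, illustrative use of that package's exact-inference API, assuming a recent release; the network structure, variable names, and probabilities are invented for demonstration and are not taken from the paper.

```python
# Minimal sketch of exact Bayesian inference with pgmpy (the package the
# paper reports using). All variables and numbers here are hypothetical.
from pgmpy.models import BayesianNetwork
from pgmpy.factors.discrete import TabularCPD
from pgmpy.inference import VariableElimination

# Two-node toy network: Weather -> Yield (illustrative variables only).
model = BayesianNetwork([("Weather", "Yield")])
model.add_cpds(
    TabularCPD("Weather", 2, [[0.7], [0.3]]),
    TabularCPD(
        "Yield", 2,
        [[0.9, 0.4],   # P(Yield=0 | Weather=0), P(Yield=0 | Weather=1)
         [0.1, 0.6]],  # P(Yield=1 | Weather=0), P(Yield=1 | Weather=1)
        evidence=["Weather"], evidence_card=[2],
    ),
)
model.check_model()

# Exact inference by variable elimination, as in the paper's computations.
infer = VariableElimination(model)
print(infer.query(["Yield"], evidence={"Weather": 1}))
```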
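
For the quoted Experiment Setup numbers, the following hypothetical encoding shows one way to score the three tasks a priori. It does not reproduce the paper's objective or the meaning of policy π = h; the uniform prior over two states and the commit-to-one-task policy semantics are assumptions made purely for illustration.

```python
# Hypothetical encoding of the quoted numerical case. ASSUMES each task i has
# expected performance P_i[s] under two equally likely states s, and that a
# policy commits to a single task; the paper's actual model differs.
import numpy as np

sigma = 1.0  # noise scale from the quoted setup (unused in this toy scoring)
P = {
    "task1": np.array([2.0, 1.0]),
    "task2": np.array([3.0, 2.0]),
    "task3": np.array([1.0, 7.0]),
}
prior = np.array([0.5, 0.5])  # assumed uniform prior over the two states

# Expected performance of committing to each task a priori.
scores = {task: float(prior @ perf) for task, perf in P.items()}
print(scores)                              # {'task1': 1.5, 'task2': 2.5, 'task3': 4.0}
print("a-priori best:", max(scores, key=scores.get))
```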