Solving MDPs with Skew Symmetric Bilinear Utility Functions

Authors: Hugo Gilbert, Olivier Spanjaard, Paolo Viappiani, Paul Weng

IJCAI 2015

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Finally, we present and discuss experimental results where SSB-optimal policies are computed for a popular TV contest according to several instantiations of SSB utility functions.
Researcher Affiliation | Academia | (1) Sorbonne Universités, UPMC Univ Paris 06, UMR 7606, LIP6, F-75005, Paris, France; (2) CNRS, UMR 7606, LIP6, F-75005, Paris, France; (3) SYSU-CMU Joint Institute of Engineering, Guangzhou, China; (4) SYSU-CMU Shunde International Joint Research Institute, Shunde, China
Pseudocode | Yes | Algorithm 1: Double Oracle Algorithm
Open Source Code | No | The paper does not provide an explicit link to open-source code for the methodology described.
Open Datasets | Yes | We used the two models of the Spanish 2003 version of the game presented by Perea and Puerto [2007].
Dataset Splits | No | The paper does not provide specific details about training, validation, or test dataset splits.
Hardware Specification | Yes | All times are wall-clock times on a 2.4 GHz Intel Core i5 machine with 8G main memory.
Software Dependencies | Yes | Our implementation is in Python, with an external call to GUROBI version 5.6.3 in order to solve the linear programs required to find the Nash equilibria.
Experiment Setup | Yes | We computed the optimal policies for the two models according to several instantiations of the SSB utility function: the expectation (Exp), probabilistic dominance (PD), threshold probability (Th) criteria (threshold set to 2700) and a risk averse SSB utility function (RA) defined by ϕ_RA(x, y) = (x − y)/(x + y)^(2/3).
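
The four criteria named in the Experiment Setup row can be illustrated with a minimal Python sketch. It is not the authors' code. The definitions of Exp, PD and Th below are the usual textbook forms and are assumptions here, since the row only names them. The RA function assumes the trailing 2/3 in the quoted formula is an exponent on (x + y), which is the reading that keeps the function skew symmetric and risk averse. The function names, the `ssb_value` helper and the dictionary representation of a lottery are all hypothetical.

```python
from itertools import product

THRESHOLD = 2700  # threshold value quoted in the Experiment Setup row


def phi_exp(x, y):
    """Expectation (Exp): assumed to be the plain reward difference."""
    return x - y


def phi_pd(x, y):
    """Probabilistic dominance (PD): assumed to be the sign of x - y."""
    return (x > y) - (x < y)


def phi_th(x, y, t=THRESHOLD):
    """Threshold probability (Th): assumed to compare 'reaches t' indicators."""
    return (x >= t) - (y >= t)


def phi_ra(x, y):
    """Risk averse (RA): (x - y) / (x + y)^(2/3), for non-negative rewards."""
    if x + y == 0:
        return 0.0  # both rewards are zero, so there is no preference
    return (x - y) / (x + y) ** (2 / 3)


def ssb_value(p, q, phi):
    """Bilinear extension of phi to lotteries given as {reward: probability}.

    A positive value means lottery p is preferred to lottery q.
    """
    return sum(
        px * qy * phi(x, y)
        for (x, px), (y, qy) in product(p.items(), q.items())
    )


# Hypothetical lotteries: a sure 1000 versus a fair coin flip between 0 and 2000.
sure = {1000: 1.0}
gamble = {0: 0.5, 2000: 0.5}
print(ssb_value(sure, gamble, phi_exp))
print(ssb_value(sure, gamble, phi_ra))
```

Under these assumptions the expectation criterion is indifferent between the two lotteries, while the RA function gives a positive value and so prefers the sure amount.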