Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Solving MDPs with Skew Symmetric Bilinear Utility Functions
Authors: Hugo Gilbert, Olivier Spanjaard, Paolo Viappiani, Paul Weng
IJCAI 2015 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we present and discuss experimental results where SSB-optimal policies are computed for a popular TV contest according to several instantiations of SSB utility functions. |
| Researcher Affiliation | Academia | ¹Sorbonne Universités, UPMC Univ Paris 06, UMR 7606, LIP6, F-75005, Paris, France; ²CNRS, UMR 7606, LIP6, F-75005, Paris, France; ³SYSU-CMU Joint Institute of Engineering, Guangzhou, China; ⁴SYSU-CMU Shunde International Joint Research Institute, Shunde, China |
| Pseudocode | Yes | Algorithm 1: Double Oracle Algorithm |
| Open Source Code | No | The paper does not provide an explicit link to open-source code for the methodology described. |
| Open Datasets | Yes | We used the two models of the Spanish 2003 version of the game presented by Perea and Puerto [2007]. |
| Dataset Splits | No | The paper does not provide specific details about training, validation, or test dataset splits. |
| Hardware Specification | Yes | All times are wall-clock times on a 2,4 GHz Intel Core i5 machine with 8G main memory. |
| Software Dependencies | Yes | Our implementation is in Python, with an external call to GUROBI version 5.6.3 in order to solve the linear programs required to find the Nash equilibria. |
| Experiment Setup | Yes | We computed the optimal policies for the two models according to several instantiations of the SSB utility function: the expectation (Exp), probabilistic dominance (PD), threshold probability (Th) criteria (threshold set to 2700) and a risk averse SSB utility function (RA) defined by φRA(x, y) = (x − y)/(x + y) − 2/3 |
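
The Experiment Setup row names several instantiations of the SSB utility function. As a rough illustration only, and not the authors' implementation, the sketch below shows how three of them (Exp, PD, Th) might be written as skew-symmetric functions φ(x, y) and used to compare two lotteries through the bilinear form Σ p(x) q(y) φ(x, y). The function names, the dictionary representation of lotteries, and the two example lotteries are assumptions introduced here; the textbook forms used for PD and Th are likewise assumed rather than quoted from the paper. Only the threshold value 2700 comes from the table.

```python
from itertools import product

THRESHOLD = 2700  # threshold value quoted in the Experiment Setup row


def phi_exp(x, y):
    """Expectation criterion (assumed form): phi(x, y) = x - y."""
    return x - y


def phi_pd(x, y):
    """Probabilistic dominance (assumed form): sign of x - y."""
    return (x > y) - (x < y)


def phi_th(x, y, t=THRESHOLD):
    """Threshold probability (assumed form): 1[x >= t] - 1[y >= t]."""
    return int(x >= t) - int(y >= t)


def ssb_value(p, q, phi):
    """Bilinear form sum_x sum_y p(x) * q(y) * phi(x, y).

    p and q are hypothetical lotteries given as dicts {payoff: probability}.
    A non-negative value means p is weakly preferred to q under phi.
    """
    return sum(
        px * qy * phi(x, y)
        for (x, px), (y, qy) in product(p.items(), q.items())
    )


# Hypothetical lotteries, not taken from the paper's game models.
safe = {3000: 1.0}
risky = {0: 0.5, 8000: 0.5}

for name, phi in [("Exp", phi_exp), ("PD", phi_pd), ("Th", phi_th)]:
    print(name, ssb_value(safe, risky, phi))
```

With these made-up lotteries the three criteria disagree: the expectation criterion favors the risky lottery, probabilistic dominance is indifferent, and the threshold criterion favors the safe one. That kind of disagreement is why comparing policies under several instantiations can be informative.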