Online Decision Mediation
Authors: Daniel Jarrett, Alihan Hüyük, Mihaela van der Schaar
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, through experiments and sensitivities on a variety of real-world datasets, we illustrate consistent gains over applicable benchmarks on a comprehensive set of performance measures with respect to the mediator policy, the learned model, and the entire decision-making system as a unit (Section 4). |
| Researcher Affiliation | Collaboration | Daniel Jarrett¹, Alihan Hüyük¹, Mihaela van der Schaar¹,²,³; Department of Applied Mathematics and Theoretical Physics; ¹University of Cambridge, ²UCLA, ³Alan Turing Institute |
| Pseudocode | Yes | Algorithm 1 summarizes UMPIRE as applied to ODM. |
| Open Source Code | No | The code will be made available upon acceptance. |
| Open Datasets | Yes | Datasets. We experiment with six environments. In Gauss Sine, synthetic points are generated in three categories by rounding a sinusoidal latent function on 2D Gaussian input [61]. In High Energy, the task is to identify signals in high energy particles registered in a Cherenkov gamma telescope [62]. In Motion Capture, the task is to recognize hand postures from data recorded by glove markers on users [63]. In Lunar Lander, the task is to perform actions in the OpenAI Gym [64] Atari environment, with the expert defined as a PPO2 agent [65,66] trained on the true reward. In Alzheimers, the task is to perform early diagnosis of patients in the Alzheimer's Disease Neuroimaging Initiative study [67] as cognitively normal, mildly impaired, or at risk of dementia [19,20]. Lastly, in Cystic Fibrosis, the task is to perform diagnosis of patients enrolled in the UK Cystic Fibrosis registry [68] as to their GOLD grading in chronic obstructive pulmonary disease [69]. See Appendix B for additional detail. |
| Dataset Splits | No | Importantly, note that this is a more challenging objective than simply minimizing the generalization error of the model, system, or some asymptotic complexity thereof: Here we have no separation between "training" versus "testing", since losses begin accumulating from the very first step of the sequential process. |
| Hardware Specification | Yes | This work is not computation-heavy, so the details are not pertinent. (All work was done on a MacBook Pro, 13-inch, 2017). |
| Software Dependencies | No | The paper states that the underlying model policy is implemented using 'Dirichlet-based Gaussian process classifiers [61,70–72]' but does not provide specific version numbers for any software dependencies (e.g., Python, PyTorch, scikit-learn, or other libraries). |
| Experiment Setup | Yes | Each experiment run consists of n=2000 rounds of interactions (except for the synthetic Gauss Sine, for which n=500), and this is repeated for a total of 10 runs with random seeds. ... In all experiments, we set kint =0.1, = 1 2, =10% where applicable, and = 0 as noted in Section 3.2. |
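
The run structure quoted in the Experiment Setup and Dataset Splits rows (n rounds per run, 10 seeded runs, losses accumulating from the first step with no train/test split) can be sketched roughly as follows. This is only an illustration of the protocol, not the paper's UMPIRE algorithm: the three-class data stream, the majority-label learner, and the seed values 0 to 9 are all hypothetical placeholders, and no mediator policy is modeled.

```python
import random

LABELS = ["a", "b", "c"]  # placeholder classes, not from the paper


def run_online(n_rounds, seed):
    """One run: loss is counted from the very first round (no train/test split)."""
    rng = random.Random(seed)
    counts = {}  # hypothetical learner state: how often each label has been seen
    cumulative_loss = 0
    for _ in range(n_rounds):
        label = rng.choice(LABELS)  # placeholder data stream
        # Predict before the label is revealed, using only past rounds.
        prediction = max(counts, key=counts.get) if counts else LABELS[0]
        cumulative_loss += int(prediction != label)
        # Only now update the learner with the revealed label.
        counts[label] = counts.get(label, 0) + 1
    return cumulative_loss


def run_experiment(n_rounds=2000, n_runs=10):
    """Repeat the online run once per seed (seeds 0..n_runs-1 are an assumption)."""
    return [run_online(n_rounds, seed) for seed in range(n_runs)]


if __name__ == "__main__":
    losses = run_experiment()
    print("mean cumulative loss over runs:", sum(losses) / len(losses))
```

For the synthetic Gauss Sine setting, the same sketch would be called with `n_rounds=500`.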