Online Decision Mediation

Authors: Daniel Jarrett, Alihan Hüyük, Mihaela van der Schaar

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Finally, through experiments and sensitivities on a variety of real-world datasets, we illustrate consistent gains over applicable benchmarks on a comprehensive set of performance measures with respect to the mediator policy, the learned model, and the entire decision-making system as a unit (Section 4).
Researcher Affiliation | Collaboration | Daniel Jarrett (1), Alihan Hüyük (1), Mihaela van der Schaar (1,2,3). Department of Applied Mathematics and Theoretical Physics. (1) University of Cambridge, (2) UCLA, (3) Alan Turing Institute.
Pseudocode | Yes | Algorithm 1 summarizes UMPIRE as applied to ODM.
Open Source Code | No | The code will be made available upon acceptance.
Open Datasets | Yes | Datasets: We experiment with six environments. In Gauss Sine, synthetic points are generated in three categories by rounding a sinusoidal latent function on 2D Gaussian input [61]. In High Energy, the task is to identify signals in high energy particles registered in a Cherenkov gamma telescope [62]. In Motion Capture, the task is to recognize hand postures from data recorded by glove markers on users [63]. In Lunar Lander, the task is to perform actions in the OpenAI gym [64] Atari environment, with the expert defined as a PPO2 agent [65,66] trained on the true reward. In Alzheimers, the task is to perform early diagnosis of patients in the Alzheimer's Disease Neuroimaging Initiative study [67] as cognitively normal, mildly impaired, or at risk of dementia [19,20]. Lastly, in Cystic Fibrosis, the task is to perform diagnosis of patients enrolled in the UK Cystic Fibrosis registry [68] as to their GOLD grading in chronic obstructive pulmonary disease [69]. See Appendix B for additional detail.
Dataset Splits | No | Importantly, note that this is a more challenging objective than simply minimizing the generalization error of the model, system, or some asymptotic complexity thereof: Here we have no separation between "training" versus "testing", since losses begin accumulating from the very first step of the sequential process.
Hardware Specification | Yes | This work is not computation-heavy, so the details are not pertinent. (All work was done on a MacBook Pro, 13-inch, 2017).
Software Dependencies | No | The paper states that the underlying model policy is implemented using 'Dirichlet-based Gaussian process classifiers [61,70-72]' but does not provide specific version numbers for any software dependencies (e.g., Python, PyTorch, scikit-learn, or other libraries).
Experiment Setup | Yes | Each experiment run consists of n=2000 rounds of interactions (except for the synthetic Gauss Sine, for which n=500), and this is repeated for a total of 10 runs with random seeds. ... In all experiments, we set kint =0.1, = 1 2, =10% where applicable, and = 0 as noted in Section 3.2.
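
The run protocol quoted above (n=2000 rounds per run, n=500 for Gauss Sine, 10 seeded runs, and losses accumulating from the very first round with no train/test split) might be sketched roughly as follows. This is only an illustration of the online evaluation loop, not the authors' code: the MajorityLearner stand-in, the uniformly random label stream, the seed values 0 to 9, and all function names are hypothetical and do not implement UMPIRE or the paper's Gaussian process model.

```python
import random

N_RUNS = 10  # "a total of 10 runs with random seeds" (from the quoted setup)


def rounds_for(env_name):
    """Rounds per run: 500 for the synthetic Gauss Sine, 2000 otherwise."""
    return 500 if env_name == "Gauss Sine" else 2000


class MajorityLearner:
    """Hypothetical stand-in model: predicts the most frequent label so far."""

    def __init__(self):
        self.counts = {}

    def predict(self):
        if not self.counts:
            return 0  # arbitrary default before any label has been seen
        return max(self.counts, key=self.counts.get)

    def update(self, label):
        self.counts[label] = self.counts.get(label, 0) + 1


def run_once(env_name, seed, n_classes=3):
    """One online run: every round counts toward the loss, including the first."""
    rng = random.Random(seed)
    learner = MajorityLearner()
    cumulative_loss = 0
    for _ in range(rounds_for(env_name)):
        label = rng.randrange(n_classes)  # assumption: placeholder data stream
        cumulative_loss += int(learner.predict() != label)  # 0-1 loss this round
        learner.update(label)  # the model learns only after being scored
    return cumulative_loss


def run_experiment(env_name):
    """Repeat the online run over N_RUNS seeds (seeds 0..9 are an assumption)."""
    return [run_once(env_name, seed) for seed in range(N_RUNS)]
```

Calling run_experiment("Gauss Sine") returns one cumulative loss per seed. The point of the sketch is that the prediction in each round is scored before the model sees the label, so there is no held-out test phase to report separately.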