Distributionally-Aware Kernelized Bandit Problems for Risk Aversion

Authors: Sho Takemori

ICML 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Furthermore, we empirically verify our theoretical result in synthetic environments, and demonstrate that our proposed method significantly outperforms a baseline in many cases.
Researcher Affiliation | Industry | ¹Fujitsu Ltd., Kawasaki, Japan. Correspondence to: Sho Takemori <takemori.sho@fujitsu.com>
Pseudocode | Yes | Algorithm 1 UCB-type Algorithm for Kernelized CVaR Bandits
Open Source Code | Yes | Source code is publicly available as a supplementary material.
Open Datasets | No | In this section, we assume that $X$ is a discretization of the cube $[0, 1]^d$ with $d = 3$, $X = \{i/10 : i = 0, 1, \dots, 10\}^3$ ... For the first kind of environments, we consider a family of normal distributions $\rho(x) = N(\mu_m(x), \sigma_m(x))$ for $1 \le m \le 10$. Here, we constructed functions $\mu_m, \sigma_m \in H_k(X)$ independently randomly for $1 \le m \le 10$ (we provide details in the appendix). ... For the second kind of environments, we consider log-normal distributions $LN(\mu'_m(x), \sigma'_m(x))$, where $1 \le m \le 10$ and random functions $\mu'_m, \sigma'_m$ are defined similarly to $\mu_m, \sigma_m$.
Dataset Splits | Yes | To tune parameters of the algorithms, we use different but the same kind of the environments using the first 200 rounds.
Hardware Specification | Yes | We used Intel(R) Core(TM) i9-9920X CPU for our experiments.
Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies. It implies the use of an 'own implementation' of an algorithm, but no explicit software or library versions are listed.
Experiment Setup | Yes | We take $\lambda = 1$ for both the algorithms, $\delta = 10^{-2}$ for IGP-UCB. To apply CVPKE-UCB to continuous distributions, assuming there exists a discretization $Y$ for support of $\rho(x)$, we let $\delta / \lvert Y \rvert = 10^{-2}$ in the definition of $\beta^{(\mathrm{CV})}_{k,t}(\delta)$.
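
The synthetic setup quoted in the Open Datasets row (a grid over the cube with 11 points per axis in 3 dimensions, normally distributed rewards at each grid point, and CVaR as the risk measure) can be illustrated with a rough sketch. This is not the author's code. It makes several assumptions: CVaR at level alpha is taken as the mean of the lowest alpha-fraction of rewards, the second parameter of the normal distribution is treated as a standard deviation, alpha = 0.1 and 200 samples per arm are arbitrary choices, and `mu` and `sigma` are hypothetical stand-ins for the random RKHS functions described in the paper's appendix.

```python
import itertools
import math
import random


def make_grid(d=3, steps=10):
    """Return the discretized cube {i/steps : i = 0..steps}^d as a list of tuples."""
    axis = [i / steps for i in range(steps + 1)]
    return list(itertools.product(axis, repeat=d))


def empirical_cvar(samples, alpha):
    """Mean of the lowest ceil(alpha * n) samples.

    Assumption: lower-tail CVaR for rewards; the paper's exact
    definition may use a different convention.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    ordered = sorted(samples)
    k = max(1, math.ceil(alpha * len(ordered)))
    return sum(ordered[:k]) / k


# Hypothetical stand-ins for mu_m and sigma_m. The paper samples these
# functions randomly from an RKHS; these closed forms are illustration only.
def mu(x):
    return math.sin(sum(x))


def sigma(x):
    return 0.1 + 0.5 * x[0]


def best_arm_by_cvar(grid, alpha=0.1, n_samples=200, seed=0):
    """Estimate CVaR at every grid point by sampling and return the best point."""
    rng = random.Random(seed)
    scores = {}
    for x in grid:
        rewards = [rng.gauss(mu(x), sigma(x)) for _ in range(n_samples)]
        scores[x] = empirical_cvar(rewards, alpha)
    return max(scores, key=scores.get), scores


grid = make_grid()  # 11 ** 3 = 1331 candidate arms
best_x, scores = best_arm_by_cvar(grid)
print(len(grid), best_x)
```

Sampling every arm up front is only a way to see what the CVaR-optimal point looks like in such an environment. A bandit algorithm such as the UCB-type method named in the Pseudocode row would instead select one arm per round and observe a single reward.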