Distributionally-Aware Kernelized Bandit Problems for Risk Aversion
Authors: Sho Takemori
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Furthermore, we empirically verify our theoretical result in synthetic environments, and demonstrate that our proposed method significantly outperforms a baseline in many cases. |
| Researcher Affiliation | Industry | 1Fujitsu Ltd., Kawasaki, Japan. Correspondence to: Sho Takemori <takemori.sho@fujitsu.com> |
| Pseudocode | Yes | Algorithm 1 UCB-type Algorithm for Kernelized CVaR Bandits |
| Open Source Code | Yes | Source code is publicly available as a supplementary material. |
| Open Datasets | No | In this section, we assume that X is a discretization of the cube [0, 1]^d with d = 3, X = {i/10 : i = 0, 1, ..., 10}^3... For the first kind of environments, we consider a family of normal distributions ρ(x) = N(µ_m(x), σ_m(x)) for 1 ≤ m ≤ 10. Here, we constructed functions µ_m, σ_m ∈ H_k(X) independently randomly for 1 ≤ m ≤ 10 (we provide details in the appendix). ...For the second kind of environments, we consider log-normal distributions LN(µ m(x), σ m(x)), where 1 ≤ m ≤ 10 and random functions µ m, σ m are defined similarly to µ_m, σ_m. |
| Dataset Splits | Yes | To tune parameters of the algorithms, we use different but the same kind of the environments using the first 200 rounds. |
| Hardware Specification | Yes | We used Intel(R) Core(TM) i9-9920X CPU for our experiments. |
| Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies. It implies the use of an 'own implementation' of an algorithm, but no explicit software or library versions are listed. |
| Experiment Setup | Yes | We take λ = 1 for both the algorithms, δ = 10^{-2} for IGP-UCB. To apply CVPKE-UCB to continuous distributions, assuming there exists a discretization Y for support of ρ(x), we let δ/\|Y\| = 10^{-2} in the definition of β^{(CV)}_{k,t}(δ). |
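
The action set quoted in the Open Datasets row, the grid {i/10 : i = 0, 1, ..., 10}^3, can be built in a few lines. The following is a minimal sketch, not the authors' code. The function names `make_grid` and `empirical_cvar` are hypothetical. The CVaR estimator uses one common convention for rewards, the mean of the lowest α-fraction of samples, and this is an assumption that may differ from the paper's definition.

```python
import itertools
import numpy as np


def make_grid(d=3, steps=10):
    """Return all points of {i/steps : i = 0..steps}^d, shape ((steps+1)**d, d)."""
    axis = np.arange(steps + 1) / steps
    return np.array(list(itertools.product(axis, repeat=d)))


def empirical_cvar(samples, alpha):
    """Mean of the lowest alpha-fraction of the samples.

    Assumption: this lower-tail convention for rewards is illustrative only;
    the paper's exact CVaR definition is not reproduced here.
    """
    sorted_samples = np.sort(np.asarray(samples, dtype=float))
    # Keep at least one sample so the mean is always defined.
    k = max(1, int(np.ceil(alpha * len(sorted_samples))))
    return sorted_samples[:k].mean()


X = make_grid()  # 11**3 = 1331 candidate actions in [0, 1]^3
```

With d = 3 and 11 values per axis, `X` has shape `(1331, 3)`. An estimate for a single action could be obtained by drawing samples from that action's reward distribution and passing them to `empirical_cvar` with the desired risk level.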