Cost-aware Cascading Bandits
Authors: Ruida Zhou, Chao Gan, Jing Yang, Cong Shen
IJCAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The performance of the CC-UCB algorithm is evaluated with both synthetic and real-world data. In this section we will resort to numerical experiments to evaluate the performances of the CC-UCB algorithm in Algorithm 1. |
| Researcher Affiliation | Academia | Ruida Zhou¹, Chao Gan², Jing Yang², Cong Shen¹. ¹University of Science and Technology of China; ²The Pennsylvania State University |
| Pseudocode | Yes | Algorithm 1 Cost-aware Cascading UCB (CC-UCB) |
| Open Source Code | No | The paper does not provide any explicit statement or link to an open-source code repository for the methodology described. |
| Open Datasets | Yes | We test the proposed CC-UCB algorithm using real-world data extracted from the click log dataset of Yandex Challenge [Int, 2011]. The URL for the Yandex Challenge data is provided: https://academy.yandex.ru/events/data_analysis/relpred2011/. |
| Dataset Splits | No | The paper describes generating synthetic data and using a real-world click log dataset for online bandit experiments. It does not mention explicit training, validation, or test dataset splits as typically found in supervised learning setups. |
| Hardware Specification | No | The paper describes the experimental setup including parameters and datasets, but it does not provide any specific details about the hardware (e.g., CPU, GPU models, or cloud instance types) used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific details about software dependencies or their version numbers used in the experiments. |
| Experiment Setup | Yes | In this section we will resort to numerical experiments to evaluate the performances of the CC-UCB algorithm in Algorithm 1. We set $\alpha = 1.5$ and $\epsilon = 10^{-5}$. Both synthetic and real-world datasets are used. We run it for $T = 2 \times 10^5$ steps, and average the accumulative regret over 20 runs. We vary $K$ and $L$, i.e., the total number of arms $K$, and the number of arms in $I^*$, respectively. We also change $\Delta_i$, i.e., $c_i - \theta_i$, for $i \in [K] \setminus I^*$. Specifically, we set $\theta_i = 0.5$ for $i \in I^*$, $\theta_i = 0.3$ for $i \in [K] \setminus I^*$, and let $c_i$ be a constant $c$ across all arms. |
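
The synthetic configuration quoted in the Experiment Setup row can be written down as a small parameter object. The following is a minimal sketch only, not the authors' code (the paper releases none). The class name `SyntheticSetup`, the choice to place the 0.5-valued arms at the first `L` indices, and the example values `K=10`, `L=3`, `c=0.4` are all assumptions made for illustration. Only `alpha`, `eps`, `T`, the 20 runs, and the two θ levels come from the quoted text.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SyntheticSetup:
    """Hypothetical container for the synthetic experiment parameters."""
    K: int                 # total number of arms
    L: int                 # number of arms with theta_i = 0.5
    c: float               # constant cost shared by all arms
    alpha: float = 1.5     # value quoted in the setup
    eps: float = 1e-5      # value quoted in the setup
    T: int = 200_000       # number of steps, 2 x 10^5
    runs: int = 20         # regret is averaged over this many runs

    def thetas(self) -> List[float]:
        # Assumption: the L high-reward arms occupy the first L indices.
        return [0.5] * self.L + [0.3] * (self.K - self.L)

    def gaps(self) -> List[float]:
        # c_i - theta_i for the arms outside the first L.
        return [self.c - theta for theta in self.thetas()[self.L:]]


# Illustrative values only; the quoted text does not fix K, L, or c.
setup = SyntheticSetup(K=10, L=3, c=0.4)
```

Varying `c` between 0.3 and 0.5 changes the gap for the low-reward arms while keeping the first `L` arms worth examining, which is one way to read the statement that the gap is changed across experiments.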