Multi-Feedback Bandit Learning with Probabilistic Contexts

Authors: Luting Yang, Jianyi Yang, Shaolei Ren

IJCAI 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our simulation on machine learning model recommendation further validates the sub-linearity of our cumulative regret and demonstrates that our algorithm outperforms the approach that selects arms based on the most probable context.
Researcher Affiliation | Academia | Luting Yang, Jianyi Yang and Shaolei Ren, University of California, Riverside, {lyang029, jyang239, shaolei}@ucr.edu
Pseudocode | Yes | Algorithm 1 Multi-Feedback Probabilistic Contextual UCB
Open Source Code | No | The paper does not provide any explicit statements about releasing source code, nor does it include links to repositories or mention code in supplementary materials.
Open Datasets | No | For evaluation purposes, we run experiments and collect measured data of five image classification DNN models from TensorFlow Hub running on two cellphones (Vivo V1838A and Google Pixel 3a) and two tablets (Samsung Galaxy Tab A7 and Vankyo Matrix Pad Z4).
Dataset Splits | No | The paper mentions collecting its own measured data and generating probabilities for context bundles, but it does not specify explicit training, validation, or test dataset splits (e.g., percentages, sample counts, or predefined splits) for reproducibility.
Hardware Specification | No | The paper mentions the mobile devices (Vivo V1838A, Google Pixel 3a, Samsung Galaxy Tab A7, Vankyo Matrix Pad Z4) used for collecting DNN model data, but it does not specify the hardware (e.g., CPU, GPU models, memory) used to run the bandit learning algorithm simulations.
Software Dependencies | No | The paper mentions using TensorFlow Hub and a radial basis function kernel but does not provide specific version numbers for any software dependencies or libraries required for replication.
Experiment Setup | No | The paper mentions generating probabilistic contexts and random utility functions for simulation, and uses a radial basis function kernel, but it does not provide specific hyperparameter values (e.g., values for λ or β) or detailed training configurations for its experiments.