Stay With Me: Lifetime Maximization Through Heteroscedastic Linear Bandits With Reneging

Authors: Ping-Chun Hsieh, Xi Liu, Anirban Bhattacharya, P R Kumar

ICML 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Finally, we validate the performance of HR-UCB via simulations."
Researcher Affiliation | Academia | "1 Department of Electrical and Computer Engineering, Texas A&M University, College Station, USA; 2 Department of Statistics, Texas A&M University, College Station, USA."
Pseudocode | Yes | "Algorithm 1: The HR-UCB Policy"
Open Source Code | No | The paper does not provide any explicit statements or links about the availability of open-source code for the described methodology.
Open Datasets | No | The paper states: "For simplicity, the context of each user-action pair is designed to be a four-dimensional vector, which is drawn uniformly at random from a unit ball." This indicates a simulated or synthetic dataset with no public access information.
Dataset Splits | No | The paper describes the simulation setup and parameters but does not specify train, validation, or test dataset splits in the conventional sense. It refers to simulating over T = 30000 users.
Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU or GPU models, memory, or cloud instances) used to run the experiments.
Software Dependencies | No | The paper does not name specific software packages with version numbers that would be needed to replicate the experiments.
Experiment Setup | Yes | "For the mean and variance of the outcome distribution, we set θ = [0.6, 0.5, 0.5, 0.3] and φ = [0.5, 0.2, 0.8, 0.9], respectively. We consider the function f(x) = x + L with L = 2 and M_f = 1. The acceptance level of each user is drawn uniformly at random from the interval [−1, 1]. We set T = 30000 throughout the simulations. For HR-UCB, we set δ = 0.1 and λ = 1. All the results in this section are the average of 20 simulation trials."
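The experiment-setup row above lists enough parameters to sketch the paper's simulation environment. The following is a minimal sketch, not the authors' code: it assumes (as is standard for heteroscedastic linear bandits) that an outcome is drawn with mean θᵀx and variance f(φᵀx), and that the outcome distribution is Gaussian; the helper names are ours.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4
theta = np.array([0.6, 0.5, 0.5, 0.3])  # mean parameter (from the paper)
phi = np.array([0.5, 0.2, 0.8, 0.9])    # variance parameter (from the paper)
L_shift = 2.0                           # f(x) = x + L with L = 2

def f(x):
    # Variance link function stated in the setup.
    return x + L_shift

def sample_unit_ball(d, rng):
    # Uniform sample from the d-dimensional unit ball:
    # random direction, radius r with density proportional to r^(d-1).
    v = rng.normal(size=d)
    v /= np.linalg.norm(v)
    r = rng.uniform() ** (1.0 / d)
    return r * v

# One simulated user-action interaction (assumed model, not the authors' code).
x = sample_unit_ball(d, rng)            # context of a user-action pair
mean = theta @ x                        # assumed outcome mean theta^T x
var = f(phi @ x)                        # assumed heteroscedastic variance f(phi^T x)
outcome = rng.normal(mean, np.sqrt(var))  # assumed Gaussian outcome
beta = rng.uniform(-1.0, 1.0)           # user's acceptance level, uniform on [-1, 1]
user_stays = outcome >= beta            # user reneges if the outcome falls below beta
```

Note that with these parameters the variance stays positive: ‖φ‖ ≈ 1.32, so φᵀx ≥ −1.32 for any context in the unit ball, and f shifts it by 2. A full run would repeat this interaction over T = 30000 users per trial, averaging over 20 trials as the paper reports.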