An Online Learning Approach to Sequential User-Centric Selection Problems

Authors: Junpu Chen, Hong Xie
Pages: 6231-6238

AAAI 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct extensive experiments to validate the efficiency of OnLinActPrf. From Experiments, Experiment Setting: Parameter setting. We consider a generic sequential user-centric selection problem characterized by $M = 15$ arms and $K = 30$ plays by default.
Researcher Affiliation | Academia | Junpu Chen, Hong Xie; College of Computer Science, Chongqing University; ironman98@sina.cn, xiehong2018@cqu.edu.cn
Pseudocode | Yes | Algorithm 1: OffOptActPrf(µ, P, C) and Algorithm 2: OnLinActPrf(H_t)
Open Source Code | No | The paper does not provide an explicit statement or a link to open-source code for the described methodology.
Open Datasets | No | We consider a generic sequential user-centric selection problem characterized by $M = 15$ arms and $K = 30$ plays by default. Note that we also vary $M$ and $K$ to evaluate our proposed algorithm. We set the probability mass function as: $\alpha d$ if $d \le \lfloor m/2 \rfloor$; $\alpha(m + 1 - d)$ if $\lfloor m/2 \rfloor < d \le m$; $0$ otherwise, where $\alpha = 1 / \big(\sum_{d=1}^{\lfloor m/2 \rfloor} d + \sum_{d=\lfloor m/2 \rfloor + 1}^{m} (m + 1 - d)\big)$ is the normalizing factor. This probability function has a normal-distribution-like shape. Roughly, the expected number of units of resource of an arm increases in its index $m$, because as the index $m$ increases, more probability mass shifts to larger values of $d$. Each arm's rewards are sampled from Gaussian distributions, i.e., $R_m \sim N(\mu_m, \sigma^2)$, where $\mu_m \in [1, 2]$ and $\sigma > 0$.
Dataset Splits | No | We use Monte Carlo simulation to compute the average regret of each algorithm with 120 rounds of simulation. The paper does not specify train/validation/test dataset splits.
Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, or cloud resources) used to run the experiments.
Software Dependencies | No | The paper does not provide any specific software names with version numbers used in the experiments.
Experiment Setup | Yes | Parameter setting. We consider a generic sequential user-centric selection problem characterized by $M = 15$ arms and $K = 30$ plays by default. ... We set the probability mass function as: ... Each arm's rewards are sampled from Gaussian distributions, i.e., $R_m \sim N(\mu_m, \sigma^2)$, where $\mu_m \in [1, 2]$ and $\sigma > 0$. ... We set the movement cost as $c_{k,m} = \eta \lvert (k \bmod M) - m \rvert / \max\{K, M\}$, where $\eta \in \mathbb{R}_+$ is a hyper-parameter that controls the scale of the cost. Unless we vary them explicitly, we consider the following default parameters: $T = 10^5$, $\delta = 1/T$, $K = 30$ plays, $M = 15$ arms, $\eta = 1$, the exact $\sigma$ case with $\sigma = 0.2$ and the U-Shape reward.
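
The quoted experiment setup can be sketched in a few lines of Python. This is an illustrative reading of the excerpts above, not the authors' code (the paper releases none). The function names `resource_pmf`, `movement_cost`, and `sample_reward` are hypothetical. The sketch assumes the resource count d ranges over 1..m and that the split point is the floor of m/2; the floor and the ceiling give the same weights, since both branches agree at d = (m + 1)/2. The indexing convention for k and m (0-based or 1-based) is not stated in the excerpt and is left to the caller.

```python
import random
from fractions import Fraction


def resource_pmf(m):
    """Tent-shaped pmf over d = 1..m for the arm with index m.

    Weight is d on the rising side (d <= m // 2) and m + 1 - d on the
    falling side; dividing by the total weight plays the role of alpha.
    Assumption: the support starts at d = 1.
    """
    half = m // 2
    weights = {d: (d if d <= half else m + 1 - d) for d in range(1, m + 1)}
    total = sum(weights.values())  # equals 1 / alpha
    return {d: Fraction(w, total) for d, w in weights.items()}


def movement_cost(k, m, K=30, M=15, eta=1.0):
    """Movement cost c_{k,m} = eta * |(k mod M) - m| / max(K, M).

    Assumption: k and m are passed in whatever indexing the caller uses.
    """
    return eta * abs((k % M) - m) / max(K, M)


def sample_reward(mu_m, sigma=0.2, rng=random):
    """Draw one reward R_m ~ N(mu_m, sigma^2)."""
    return rng.gauss(mu_m, sigma)


if __name__ == "__main__":
    pmf = resource_pmf(15)
    mean_units = sum(d * p for d, p in pmf.items())
    print("pmf sums to", sum(pmf.values()), "with mean", mean_units)
    print("cost for play 30, arm 5:", movement_cost(30, 5))
    print("one reward draw:", sample_reward(1.5, rng=random.Random(0)))
```

Under this reading the pmf is symmetric about (m + 1)/2, so its mean is (m + 1)/2, which agrees with the quoted remark that the expected number of resource units grows with the arm index m.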