An Online Learning Approach to Sequential User-Centric Selection Problems
Authors: Junpu Chen, Hong Xie | Pages: 6231-6238
AAAI 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct extensive experiments to validate the efficiency of OnLinActPrf. ... Experiments: Experiment Setting. Parameter setting. We consider a generic sequential user-centric selection problem characterized by M = 15 arms and K = 30 plays by default. |
| Researcher Affiliation | Academia | Junpu Chen, Hong Xie College of Computer Science, Chongqing University ironman98@sina.cn, xiehong2018@cqu.edu.cn |
| Pseudocode | Yes | Algorithm 1: OffOptActPrf($\mu$, $P$, $C$) and Algorithm 2: OnLinActPrf($H_t$) |
| Open Source Code | No | The paper does not provide an explicit statement or a link to open-source code for the described methodology. |
| Open Datasets | No | We consider a generic sequential user-centric selection problem characterized by M = 15 arms and K = 30 plays by default. Note that we also vary M and K to evaluate our proposed algorithm. We set the probability mass function as: $\alpha d$ if $d \le \lfloor m/2 \rfloor$; $\alpha(m + 1 - d)$ if $\lfloor m/2 \rfloor < d \le m$; and $0$ otherwise, where $\alpha = 1 / \left( \sum_{d=1}^{\lfloor m/2 \rfloor} d + \sum_{d=\lfloor m/2 \rfloor + 1}^{m} (m + 1 - d) \right)$ is the normalizing factor. This probability function has a normal-distribution-like shape. Roughly, the expected number of units of resource of an arm increases in its index m, because as the index m increases, more probability mass shifts to larger values of d. Each arm's rewards are sampled from Gaussian distributions, i.e., $R_m \sim \mathcal{N}(\mu_m, \sigma^2)$, where $\mu_m \in [1, 2]$ and $\sigma > 0$. |
| Dataset Splits | No | We use Monte Carlo simulation to compute the average regret of each algorithm with 120 rounds of simulation. The paper does not specify train/validation/test dataset splits. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, or cloud resources) used to run the experiments. |
| Software Dependencies | No | The paper does not provide any specific software names with version numbers used in the experiments. |
| Experiment Setup | Yes | Parameter setting. We consider a generic sequential user-centric selection problem characterized by M = 15 arms and K = 30 plays by default. ... We set the probability mass function as: ... Each arm's rewards are sampled from Gaussian distributions, i.e., $R_m \sim \mathcal{N}(\mu_m, \sigma^2)$, where $\mu_m \in [1, 2]$ and $\sigma > 0$. ... We set the movement cost as $c_{k,m} = \eta \lvert (k \bmod M) - m \rvert / \max\{K, M\}$, where $\eta \in \mathbb{R}_+$ is a hyper-parameter that controls the scale of the cost. Unless we vary them explicitly, we consider the following default parameters: $T = 10^5$, $\delta = 1/T$, K = 30 plays, M = 15 arms, $\eta = 1$, the exact σ case with σ = 0.2 and the U-Shape reward. |
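
Since the paper releases no code, the quoted experiment setup can be approximated with a minimal Python sketch. It builds the triangular probability mass function, the movement cost matrix, and Gaussian reward samples from the formulas quoted in the table. Several details are assumptions, not taken from the paper: the function names (`resource_pmf`, `movement_cost`, `sample_rewards`) are hypothetical, arms and plays are indexed from 1, $m/2$ is rounded down, and the mean rewards are placeholder values in $[1, 2]$ because the excerpts do not define the "U-Shape reward".

```python
import numpy as np


def resource_pmf(m: int) -> np.ndarray:
    """Return the pmf over d = 1..m for arm m (entry d-1 holds the mass at d).

    Weights rise as d up to m // 2, then fall as m + 1 - d.
    Rounding m/2 down is an assumption about the garbled source formula.
    """
    d = np.arange(1, m + 1)
    weights = np.where(d <= m // 2, d, m + 1 - d).astype(float)
    # Dividing by the total weight plays the role of the factor alpha.
    return weights / weights.sum()


def movement_cost(K: int, M: int, eta: float = 1.0) -> np.ndarray:
    """Return a K x M matrix with c[k-1, m-1] = eta * |(k mod M) - m| / max(K, M).

    Plays k = 1..K and arms m = 1..M are 1-indexed (an assumption).
    """
    k = np.arange(1, K + 1)[:, None]
    m = np.arange(1, M + 1)[None, :]
    return eta * np.abs((k % M) - m) / max(K, M)


def sample_rewards(mu: np.ndarray, sigma: float, rng: np.random.Generator) -> np.ndarray:
    """Draw one Gaussian reward per arm, R_m ~ N(mu_m, sigma^2)."""
    return rng.normal(loc=mu, scale=sigma)


# Default parameters quoted in the table: M = 15, K = 30, eta = 1, sigma = 0.2.
M, K, ETA, SIGMA = 15, 30, 1.0, 0.2
rng = np.random.default_rng(0)

pmfs = [resource_pmf(m) for m in range(1, M + 1)]
costs = movement_cost(K, M, ETA)
mu = np.linspace(1.0, 2.0, M)  # placeholder means in [1, 2], not the paper's U-shape
rewards = sample_rewards(mu, SIGMA, rng)
```

Averaging regret over 120 independent runs, as the Dataset Splits row describes, would wrap this setup in a loop with a fresh generator seed per run.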