Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
An Online Learning Approach to Sequential User-Centric Selection Problems
Authors: Junpu Chen, Hong Xie
Pages: 6231-6238
AAAI 2022 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct extensive experiments to validate the efficiency of OnLinActPrf. ... Experiments: Experiment Setting: Parameter setting. We consider a generic sequential user-centric selection problem characterized by M = 15 arms and K = 30 plays by default. |
| Researcher Affiliation | Academia | Junpu Chen, Hong Xie College of Computer Science, Chongqing University EMAIL, EMAIL |
| Pseudocode | Yes | Algorithm 1: OffOptActPrf(µ, P, C) and Algorithm 2: OnLinActPrf(H_t) |
| Open Source Code | No | The paper does not provide an explicit statement or a link to open-source code for the described methodology. |
| Open Datasets | No | We consider a generic sequential user-centric selection problem characterized by $M = 15$ arms and $K = 30$ plays by default. Note that we also vary $M$ and $K$ to evaluate our proposed algorithm. We set the probability mass function as: $\alpha d$ if $d \le m/2$; $\alpha (m + 1 - d)$ if $m/2 < d \le m$; $0$ otherwise, where $\alpha = 1 / \big( \sum_{d \le m/2} d + \sum_{m/2 < d \le m} (m + 1 - d) \big)$ is the normalizing factor. This probability function has a normal-distribution-like shape. Roughly, the expected number of units of resource of an arm increases in its index $m$, because as the index $m$ increases, more probability mass shifts to larger values of $d$. Each arm's rewards are sampled from Gaussian distributions, i.e., $R_m \sim \mathcal{N}(\mu_m, \sigma^2)$, where $\mu_m \in [1, 2]$ and $\sigma > 0$. |
| Dataset Splits | No | We use Monte Carlo simulation to compute the average regret of each algorithm with 120 rounds of simulation. The paper does not specify train/validation/test dataset splits. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, or cloud resources) used to run the experiments. |
| Software Dependencies | No | The paper does not provide any specific software names with version numbers used in the experiments. |
| Experiment Setup | Yes | Parameter setting. We consider a generic sequential user-centric selection problem characterized by $M = 15$ arms and $K = 30$ plays by default. ... We set the probability mass function as: ... Each arm's rewards are sampled from Gaussian distributions, i.e., $R_m \sim \mathcal{N}(\mu_m, \sigma^2)$, where $\mu_m \in [1, 2]$ and $\sigma > 0$. ... We set the movement cost as $c_{k,m} = \eta \, \lvert (k \bmod M) - m \rvert / \max\{K, M\}$, where $\eta \in \mathbb{R}_+$ is a hyper-parameter that controls the scale of the cost. Unless we vary them explicitly, we consider the following default parameters: $T = 10^5$, $\delta = 1/T$, $K = 30$ plays, $M = 15$ arms, $\eta = 1$, the exact $\sigma$ case with $\sigma = 0.2$ and the U-Shape reward. |
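
The synthetic setup quoted in the table can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the function names (`resource_pmf`, `movement_cost`, `sample_reward`), the 1-based indexing of arms and plays, and the use of integer division for the m/2 threshold are all choices made here for illustration.

```python
import random


def resource_pmf(m):
    """Triangular pmf over d = 1..m; entry d-1 holds the mass at d.

    Assumption: the m/2 threshold is taken as m // 2. Rounding up instead
    gives the same weights, because both branches agree at d = (m + 1) / 2.
    """
    half = m // 2
    weights = [d if d <= half else m + 1 - d for d in range(1, m + 1)]
    total = sum(weights)  # this is 1 / alpha, the normalizing constant
    return [w / total for w in weights]


def movement_cost(k, m, K=30, M=15, eta=1.0):
    """Movement cost eta * |(k mod M) - m| / max(K, M) for play k and arm m."""
    return eta * abs((k % M) - m) / max(K, M)


def sample_reward(mu, sigma=0.2, rng=random):
    """Draw one Gaussian reward with mean mu and standard deviation sigma."""
    return rng.gauss(mu, sigma)


if __name__ == "__main__":
    pmf = resource_pmf(5)
    expected_units = sum(d * p for d, p in enumerate(pmf, start=1))
    print(pmf, expected_units)
    print(movement_cost(k=1, m=15))
    print(sample_reward(mu=1.5, rng=random.Random(0)))
```

For m = 5 the weights are proportional to 1, 2, 3, 2, 1, so the expected number of resource units is 3. More generally the mean is (m + 1) / 2, which matches the quoted remark that the expected number of units grows with the arm index.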