Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Learning from a Learning User for Optimal Recommendations
Authors: Fan Yao, Chuanhao Li, Denis Nekipelov, Hongning Wang, Haifeng Xu
ICML 2022 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on synthetic datasets demonstrate the strength of RAES for such a contemporaneous system-user learning problem. In this section, we study the empirical performance of RAES to validate our theoretical analysis by running simulations on synthetic datasets in comparison with several baselines. |
| Researcher Affiliation | Academia | (1) Department of Computer Science, University of Virginia, USA; (2) Department of Economics, University of Virginia, USA. Correspondence to: Fan Yao <EMAIL>, Hongning Wang <EMAIL>, Haifeng Xu <EMAIL>. |
| Pseudocode | Yes | Algorithm 1: Active Ellipsoid Search on Unit Sphere; Algorithm 2: Noise-robust Active Ellipsoid Search (RAES) |
| Open Source Code | No | The paper does not provide any explicit statement or link for open-source code for the described methodology. |
| Open Datasets | No | The paper mentions using 'synthetic datasets' for experiments but does not provide specific access information (link, DOI, repository, or formal citation) for these datasets. |
| Dataset Splits | No | The paper refers to synthetic datasets and uses a time horizon 'T' for experiments, but it does not provide specific details on training, validation, or test set splits, nor does it specify how the data was partitioned for evaluation. |
| Hardware Specification | No | The paper does not provide specific details regarding the hardware used to run its experiments, such as GPU or CPU models, or memory specifications. |
| Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies, libraries, or solvers used in the experiments. |
| Experiment Setup | Yes | In all experiments, we fix the action set $A = B_2^d(0, 1)$, i.e., $D_0 = D_1 = 1$, and $\delta = 0.1$, $k = 1.05$. We consider a $(1, \gamma)$-rational user with $\gamma \in \{0, 0.2\}$ and prior knowledge matrix $V_0$. The user's decision sequence $\{\beta_t^{(0)}\}$ and $\{\beta_t^{(1)}\}$ are independently drawn from $[-t^{\gamma}, t^{\gamma}]$. The ground-truth parameter $\theta$ is sampled from $B_2^d(0, 1)$ and the reported results are collected from the same problem instance and averaged over 10 independent runs. DBGD's hyper-parameters include the starting point $w_0$, and two learning rates $\delta$, $\gamma$ that control the step-lengths for proposing new points and update the current points, respectively. In the experiment, these hyper-parameters are set to (w0, δ, γ) = (0, d 1 2 ), as recommended in (Yue & Joachims, 2009). |
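
The synthetic setup quoted in the Experiment Setup row can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the dimension `d`, horizon `T`, the seeds, and the choice of a uniform distribution on the unit ball and on $[-t^{\gamma}, t^{\gamma}]$ are all assumptions, since the quoted text only says the quantities are "sampled" or "independently drawn". The function names are hypothetical.

```python
import math
import random


def sample_unit_ball(d, rng):
    """Draw a point from the d-dimensional unit L2 ball B_2^d(0, 1).

    Assumption: uniform sampling (Gaussian direction, radius U^(1/d)).
    """
    g = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(x * x for x in g))
    radius = rng.random() ** (1.0 / d)
    return [radius * x / norm for x in g]


def sample_beta_sequence(T, gamma, rng):
    """Draw beta_t from [-t^gamma, t^gamma] for t = 1..T.

    Assumption: each beta_t is uniform on its interval.
    """
    return [rng.uniform(-(t ** gamma), t ** gamma) for t in range(1, T + 1)]


def make_runs(d, T, gamma, n_runs=10, instance_seed=0):
    """Fix one problem instance (theta) and draw fresh user noise per run."""
    theta = sample_unit_ball(d, random.Random(instance_seed))
    runs = []
    for run in range(n_runs):
        # Hypothetical seeding scheme: one independent generator per run.
        rng = random.Random(1000 + run)
        beta0 = sample_beta_sequence(T, gamma, rng)
        beta1 = sample_beta_sequence(T, gamma, rng)
        runs.append((beta0, beta1))
    return theta, runs


# Illustrative values only; d and T are not given in the quoted setup.
theta, runs = make_runs(d=5, T=100, gamma=0.2)
```

With `gamma = 0` the interval is $[-1, 1]$ at every step, while `gamma = 0.2` lets the allowed range grow slowly with `t`. Any reported metric would then be averaged over the 10 entries of `runs`.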