Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Near-Optimal MNL Bandits Under Risk Criteria
Authors: Guangyu Xi, Chao Tao, Yuan Zhou | Pages: 10397-10404
AAAI 2021 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | As a complement, we also conduct experiments with both synthetic and real data to show the empirical performance of our proposed algorithms. |
| Researcher Affiliation | Academia | Guangyu Xi (1), Chao Tao (2), Yuan Zhou (3); (1) University of Maryland, College Park; (2) Indiana University Bloomington; (3) University of Illinois at Urbana-Champaign |
| Pseudocode | Yes | Algorithm 1: Risk Aware UCB(N, K, r, U) |
| Open Source Code | Yes | Please refer to https://github.com/Alanthink/aaai2021 for the source code. |
| Open Datasets | Yes | In this experiment, we consider the UCI Car Evaluation Database dataset from the Machine Learning Repository (Dua and Graff 2017) |
| Dataset Splits | No | The paper describes using synthetic and real datasets, but it does not specify explicit training, validation, or test splits, nor does it mention cross-validation. It only states the number of repetitions for experiments. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU, GPU, memory) used to run the experiments. |
| Software Dependencies | No | The paper mentions that algorithms are "implemented in Python3" and use the "BanditPyLib library," but it does not provide specific version numbers for Python, the library, or any other software dependencies. |
| Experiment Setup | Yes | In this experiment, we fix the number of products N = 10, cardinality limit K = 4, horizon T = 10^6, and set the goal to be U = CVaR_0.5. We generate 10 uniformly distributed random input instances where v_i ∈ [0, 1] and r_i ∈ [0.1, 1]. For each input instance, we run 20 repetitions and compute their average as the regret. |
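
The synthetic setup quoted in the Experiment Setup row can be sketched roughly as follows. This is a minimal illustration, not the authors' code from the linked repository. It assumes the standard MNL choice model, in which the no-purchase option has preference weight 1 and revenue 0, and it assumes CVaR at level 0.5 means the average reward over the worst 50% of outcomes. The function names `generate_instance`, `mnl_reward_distribution`, and `cvar_lower_tail` are hypothetical, as are the seed and the example assortment.

```python
import random

N, K = 10, 4          # number of products and cardinality limit (from the setup)
ALPHA = 0.5           # CVaR level (from the setup)
NUM_INSTANCES = 10    # number of random input instances (from the setup)


def generate_instance(rng, n=N):
    """Draw one instance: v_i ~ Uniform[0, 1], r_i ~ Uniform[0.1, 1]."""
    v = [rng.uniform(0.0, 1.0) for _ in range(n)]
    r = [rng.uniform(0.1, 1.0) for _ in range(n)]
    return v, r


def mnl_reward_distribution(v, r, assortment):
    """Return (reward, probability) pairs for one customer visit.

    Assumption: the no-purchase option has weight 1 and yields reward 0.
    """
    denom = 1.0 + sum(v[i] for i in assortment)
    outcomes = [(0.0, 1.0 / denom)]
    outcomes += [(r[i], v[i] / denom) for i in assortment]
    return outcomes


def cvar_lower_tail(outcomes, alpha=ALPHA):
    """Average reward over the worst alpha-fraction of a discrete distribution.

    Assumption: this lower-tail convention matches the paper's CVaR definition.
    """
    remaining = alpha
    total = 0.0
    for value, prob in sorted(outcomes):
        mass = min(prob, remaining)   # probability mass inside the tail
        total += value * mass
        remaining -= mass
        if remaining <= 0.0:
            break
    return total / alpha


rng = random.Random(0)  # hypothetical seed; the paper does not state one
instances = [generate_instance(rng) for _ in range(NUM_INSTANCES)]

v, r = instances[0]
assortment = list(range(K))  # an arbitrary assortment of size K, for illustration
print(cvar_lower_tail(mnl_reward_distribution(v, r, assortment)))
```

The sketch covers only instance generation and the risk measure for a fixed assortment. The 20 repetitions per instance and the regret over the horizon T = 10^6 would require running the bandit algorithm itself, which is not reproduced here.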