Efficient Frameworks for Generalized Low-Rank Matrix Bandit Problems

Authors: Yue Kang, Cho-Jui Hsieh, Thomas Chun Man Lee

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "For completeness, we conduct experiments to illustrate that our proposed algorithms, especially G-ESTS, are also computationally tractable and consistently outperform other state-of-the-art (generalized) linear matrix bandit methods based on a suite of simulations. In this section, we show by simulation experiments that our proposed G-ESTT (with Low GLM-UCB) and G-ESTS (with SGD-TS) outperform existing algorithms for the generalized low-rank matrix bandit problems."
Researcher Affiliation | Academia | Yue Kang, Department of Statistics, University of California, Davis, Davis, CA 95616, yuekang@ucdavis.edu; Cho-Jui Hsieh, Department of Computer Science, University of California, Los Angeles, Los Angeles, CA 90095, chohsieh@cs.ucla.edu; Thomas C. M. Lee, Department of Statistics, University of California, Davis, Davis, CA 95616, tcmlee@ucdavis.edu
Pseudocode | Yes | Algorithm 1: Generalized Explore Subspace Then Transform (G-ESTT); Algorithm 2: Low GLM-UCB; Algorithm 3: Generalized Explore Subspace Then Subtract (G-ESTS). A structural sketch of the shared two-stage design follows the table.
Open Source Code | No | The main body of the paper provides no direct link to a code repository and no explicit statement that code for the described methodology is publicly available. Checklist item 3a answers 'Yes' to including code, data, and instructions in the supplemental material or via a URL, but the main text contains no explicit public-access statement.
Open Datasets | No | "We simulate a dataset with d1 = d2 = 10 (12) and r = 1 (2): when r = 1, we set the diagonal matrix Θ* as diag(Θ*) = (0.8, 0, …, 0). When r = 2, we set Θ* = v1v1⊤ + v2v2⊤ for two random orthogonal vectors v1, v2 with ‖v1‖2 = ‖v2‖2 = 3. For arms we draw 480 (1000) random matrices from {X ∈ ℝ^(d1×d2) : ‖X‖F ≤ 1}, and we build a logistic model where the payoff yt is drawn from a Bernoulli distribution with mean µ(⟨Xt, Θ*⟩)."
Dataset Splits | No | The paper describes simulation experiments but does not provide dataset split information (percentages, sample counts, or citations to predefined splits) for training, validation, or testing.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments. The authors also marked 'N/A' for the checklist item regarding compute resources.
Software Dependencies | No | The paper mentions algorithms such as SGD-TS and GLM-UCB but does not list software dependencies with version numbers (e.g., programming languages, libraries, or frameworks).
Experiment Setup | Yes | "More details on the hyper-parameter tuning are in Appendix I. Each experiment is repeated 100 times for credibility and the average regret, along with standard deviation, is displayed in Figure 1. Note that our experiments are more comprehensive than those in [26]. Due to the expensive time complexity of UCB-based baselines (Table 2), it is formidable for us to increase d here. We simulate a dataset with d1 = d2 = 10 (12) and r = 1 (2): when r = 1, we set the diagonal matrix Θ* as diag(Θ*) = (0.8, 0, …, 0). When r = 2, we set Θ* = v1v1⊤ + v2v2⊤ for two random orthogonal vectors v1, v2 with ‖v1‖2 = ‖v2‖2 = 3. For arms we draw 480 (1000) random matrices from {X ∈ ℝ^(d1×d2) : ‖X‖F ≤ 1}, and we build a logistic model where the payoff yt is drawn from a Bernoulli distribution with mean µ(⟨Xt, Θ*⟩)." A data-generation sketch follows the table.
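Structural sketch for the pseudocode row above: both frameworks follow an explore-then-reduce design, in which a Stage-1 uniform-exploration phase estimates the row and column subspaces of the low-rank parameter Θ*, and the remaining rounds run a (generalized) linear bandit on arms expressed in those subspaces. The Python sketch below is only a rough illustration under simplifying assumptions: plain least squares stands in for the paper's nuclear-norm-regularized GLM estimator, the 'subtract' step mirrors G-ESTS only schematically, and all function names are hypothetical.

```python
import numpy as np

def stage1_subspaces(arms, pull, n_explore, r, rng):
    """Stage 1 (sketch): explore uniformly, estimate Theta, and take its
    top-r singular subspaces. The paper uses a regularized low-rank GLM
    estimator; least squares on vectorized arms is a stand-in here."""
    n, d1, d2 = arms.shape
    idx = rng.integers(n, size=n_explore)
    A = arms[idx].reshape(n_explore, -1)                     # vectorized arms
    y = np.array([pull(arms[i]) for i in idx], dtype=float)  # observed payoffs
    theta_hat = np.linalg.lstsq(A, y, rcond=None)[0].reshape(d1, d2)
    U, _, Vt = np.linalg.svd(theta_hat)
    return U, Vt.T        # full rotations; first r columns span the estimates

def reduce_arm(X, U, V, r):
    """Stage 2 preprocessing in the spirit of G-ESTS's 'subtract' step:
    rotate the arm into the estimated bases and drop the (d1-r) x (d2-r)
    block that interacts with neither estimated subspace, shrinking the
    feature dimension from d1*d2 to (d1+d2)*r - r**2. A GLM bandit such
    as SGD-TS would then run on these reduced feature vectors."""
    Z = U.T @ X @ V
    return np.concatenate([Z[:r, :].ravel(), Z[r:, :r].ravel()])
```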
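And a minimal NumPy sketch of the simulated environment quoted in the Open Datasets and Experiment Setup rows, assuming a standard logistic link µ(x) = 1/(1 + e^(-x)); the seed, the QR-based orthogonalization for r = 2, and the ball-projection scheme for sampling arms are illustrative assumptions the paper does not spell out.

```python
import numpy as np

rng = np.random.default_rng(0)   # seed is an arbitrary choice
d1 = d2 = 10                     # 12 in the second setting
r = 1                            # 2 in the second setting
n_arms = 480                     # 1000 in the second setting

# Ground-truth low-rank parameter Theta*.
if r == 1:
    Theta = np.zeros((d1, d2))
    Theta[0, 0] = 0.8            # diag(Theta*) = (0.8, 0, ..., 0)
else:
    # Two random orthogonal vectors with ||v1||_2 = ||v2||_2 = 3
    # (requires d1 == d2, as in the paper's 12 x 12 setting).
    Q, _ = np.linalg.qr(rng.standard_normal((d1, 2)))
    v1, v2 = 3.0 * Q[:, 0], 3.0 * Q[:, 1]
    Theta = np.outer(v1, v1) + np.outer(v2, v2)

# Arm set: random matrices with ||X||_F <= 1 (projection onto the
# Frobenius unit ball; the paper's exact sampling scheme is unspecified).
arms = rng.standard_normal((n_arms, d1, d2))
arms /= np.maximum(1.0, np.linalg.norm(arms, axis=(1, 2), keepdims=True))

def pull(X):
    """Logistic model: Bernoulli payoff with mean mu(<X, Theta*>)."""
    mu = 1.0 / (1.0 + np.exp(-np.tensordot(X, Theta)))
    return rng.binomial(1, mu)
```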