Continuous Mean-Covariance Bandits
Authors: Yihan Du, Siwei Wang, Zhixuan Fang, Longbo Huang
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The experimental results also demonstrate the superiority of our algorithms. In this section, we present experimental results for our algorithms on both synthetic and real-world [20] datasets. |
| Researcher Affiliation | Academia | Yihan Du, IIIS, Tsinghua University, Beijing, China, duyh18@mails.tsinghua.edu.cn; Siwei Wang, CST, Tsinghua University, Beijing, China, wangsw2020@mail.tsinghua.edu.cn; Zhixuan Fang, IIIS, Tsinghua University, Beijing, China and Shanghai Qi Zhi Institute, Shanghai, China, zfang@mail.tsinghua.edu.cn; Longbo Huang, IIIS, Tsinghua University, Beijing, China, longbohuang@mail.tsinghua.edu.cn |
| Pseudocode | Yes | Algorithm 1 MC-Empirical; Algorithm 2 MC-UCB; Algorithm 3 MC-ETE |
| Open Source Code | No | The paper does not provide a link to open-source code or explicitly state that the code for their method is publicly available. |
| Open Datasets | Yes | For the real-world dataset, we use an open dataset US Funds from Yahoo Finance on Kaggle [20], which provides financial data of 1680 ETF funds in 2010-2017. [20] Stefano Leone. Dataset: US funds dataset from Yahoo Finance. Kaggle, 2020. https://www.kaggle.com/stefanoleone992/mutual-funds-and-etfs?select=ETFs.csv |
| Dataset Splits | No | The paper does not explicitly provide training/validation/test dataset splits. For bandit problems, data is generated through interaction, and traditional static dataset splits are not typically defined. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers. |
| Experiment Setup | Yes | For the synthetic dataset, we set θ* = [0.2, 0.3, 0.2, 0.2, 0.2], and Σ* has all diagonal entries equal to 1 and all off-diagonal entries equal to 0.05. For both datasets, we set d = 5 and ρ ∈ {0.1, 10}. The random reward θ_t is drawn i.i.d. from the Gaussian distribution N(θ*, Σ*). We perform 50 independent runs for each algorithm and show the average regret and 95% confidence interval across runs. |
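The synthetic setup quoted above is concrete enough to sketch in code. The snippet below is an illustrative reconstruction (not the authors' released code): it builds the reported mean vector θ* and covariance Σ*, draws one round's reward θ_t from N(θ*, Σ*), and evaluates a mean-covariance objective of the form wᵀθ* − ρ·wᵀΣ*w for an assumed uniform weight vector w; the objective form and the weight choice are assumptions for illustration.

```python
import numpy as np

# Parameters as reported in the paper's experiment setup.
d = 5
theta_star = np.array([0.2, 0.3, 0.2, 0.2, 0.2])  # mean reward vector θ*
sigma_star = np.full((d, d), 0.05)                # off-diagonal entries 0.05
np.fill_diagonal(sigma_star, 1.0)                 # diagonal entries 1

# At each round t, the random reward θ_t is drawn i.i.d. from N(θ*, Σ*).
rng = np.random.default_rng(0)
theta_t = rng.multivariate_normal(theta_star, sigma_star)

# Illustrative mean-covariance objective for a weight vector w on the simplex,
# with risk-aversion parameter ρ ∈ {0.1, 10} as in the paper.
# (The uniform w here is an assumption, not taken from the paper.)
w = np.full(d, 1.0 / d)
rho = 0.1
objective = w @ theta_star - rho * (w @ sigma_star @ w)
```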