Adaptive Linear Estimating Equations

Authors: Mufang Ying, Koulik Khamaru, Cun-Hui Zhang

NeurIPS 2023

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we consider three settings: a two-armed bandit setting, a first-order autoregressive model setting, and a contextual bandit setting. In the two-armed bandit setting, the rewards are generated with the same arm means (θ₁, θ₂) = (0.3, 0.3), and the noise is drawn from a normal distribution with mean 0 and variance 1. To collect the two-armed bandit data, we use the ϵ-Greedy algorithm with decaying exploration rate √(log(t)/t). This rate is designed to ensure that the number of times each arm is pulled up to time n is of order greater than log(n). In the second setting, we consider the time series model yₜ = θ yₜ₋₁ + ϵₜ, (31) where θ = 1 and the noise ϵₜ is drawn from a normal distribution with mean 0 and variance 1. In the contextual bandit setting, we take the true parameter θ to be 0.3 times the all-one vector. In the initial iterations, a random context xₜ is generated from the uniform distribution on the unit sphere S^(d−1). Then, we apply the ϵ-Greedy algorithm to these pre-selected contexts with decaying exploration rate log²(t)/t. For all three settings, we run 1000 independent replications. |
| Researcher Affiliation | Academia | Mufang Ying, Department of Statistics, Rutgers University, New Brunswick, my426@scarletmail.rutgers.edu |
| Pseudocode | Yes | Algorithm 1: Modified ALEE estimate |
| Open Source Code | Yes | The code can be found at https://github.com/mufangying/ALEE. |
| Open Datasets | No | The paper describes generating synthetic data for simulations (e.g., "rewards are generated with same arm mean", "noise is generated from a normal distribution") rather than using a publicly available or open dataset. No specific link, DOI, repository, or formal citation for a public dataset is provided. |
| Dataset Splits | No | The paper describes data generation and experimental setup for simulations but does not specify training, validation, and test dataset splits in terms of percentages, absolute counts, or references to predefined splits of a static dataset. |
| Hardware Specification | No | The paper mentions running "1000 independent replications" but does not provide any specific details about the hardware used, such as GPU/CPU models, memory, or cloud computing instances. |
| Software Dependencies | No | The paper does not explicitly list specific software dependencies with version numbers (e.g., Python 3.8, PyTorch 1.9) required to reproduce the experiments. While a GitHub link is provided for the code, the dependencies are not detailed in the paper's text itself. |
| Experiment Setup | Yes | In the two-armed bandit setting, the rewards are generated with the same arm means (θ₁, θ₂) = (0.3, 0.3), and the noise is drawn from a normal distribution with mean 0 and variance 1. To collect the two-armed bandit data, we use the ϵ-Greedy algorithm with decaying exploration rate √(log(t)/t). This rate is designed to ensure that the number of times each arm is pulled up to time n is of order greater than log(n). (See the sketch after the table.) |
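
The data-generating processes quoted in the "Experiment Setup" row are simple enough to reproduce from the description alone. Below is a minimal Python sketch of the two-armed bandit collection, the unit-root AR(1) series, and the uniform-sphere context draw, assuming the quoted ϵ-Greedy rate √(log(t)/t) and standard normal noise. The function names (`two_armed_bandit`, `ar1`, `uniform_sphere`) and the horizon `n` are our own choices; this illustrates the described setup and is not the authors' code (see https://github.com/mufangying/ALEE for that).

```python
import numpy as np

def two_armed_bandit(n, theta=(0.3, 0.3), rng=None):
    """epsilon-Greedy data collection with exploration rate sqrt(log(t)/t)."""
    rng = np.random.default_rng() if rng is None else rng
    arms = np.zeros(n, dtype=int)
    rewards = np.zeros(n)
    means = np.zeros(2)   # running mean reward per arm
    counts = np.zeros(2)  # pull counts per arm
    for t in range(1, n + 1):
        eps = 1.0 if t == 1 else np.sqrt(np.log(t) / t)
        if rng.random() < eps:
            a = int(rng.integers(2))   # explore: uniform over the two arms
        else:
            a = int(np.argmax(means))  # exploit: best empirical arm so far
        r = theta[a] + rng.standard_normal()  # reward = arm mean + N(0, 1) noise
        counts[a] += 1
        means[a] += (r - means[a]) / counts[a]
        arms[t - 1], rewards[t - 1] = a, r
    return arms, rewards

def ar1(n, theta=1.0, rng=None):
    """Unit-root AR(1): y_t = theta * y_{t-1} + eps_t with eps_t ~ N(0, 1)."""
    rng = np.random.default_rng() if rng is None else rng
    y = np.zeros(n + 1)
    for t in range(1, n + 1):
        y[t] = theta * y[t - 1] + rng.standard_normal()
    return y[1:]

def uniform_sphere(d, rng):
    """A context drawn uniformly from the unit sphere S^(d-1)."""
    x = rng.standard_normal(d)
    return x / np.linalg.norm(x)

# One replication of each setting; the paper runs 1000 such replications.
rng = np.random.default_rng(0)
arms, rewards = two_armed_bandit(1000, rng=rng)
y = ar1(1000, rng=rng)
```

Repeating these generators over 1000 independently seeded runs reproduces the replication design the paper reports.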