Online Learning with Bounded Recall

Authors: Jon Schneider, Kiran Vodrahalli

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct some experiments to assess the performance of the bounded-recall learning algorithms we introduce in some simple settings.
Researcher Affiliation | Industry | 1Google. Correspondence to: Jon Schneider <jschnei@google.com>, Kiran Vodrahalli <kirannv@google.com>.
Pseudocode | Yes | Algorithm 1 PERIODICRESTART; Algorithm 3 RANDOMIZEDAVERAGERESTART; Algorithm 4 AVERAGERESTARTFULLHORIZON
Open Source Code | No | The paper does not provide explicit statements or links indicating the availability of open-source code for the described methodology.
Open Datasets | No | The paper uses simulated environments rather than publicly available datasets. For example, 'In the first environment, we simulate periodic drifting rewards...' and 'In the second environment, we simulate the adversarial example from the proof of Lemma 4.2 (with M = T/3).'
Dataset Splits | No | The paper conducts simulations in synthetic environments and does not describe standard dataset splits (training, validation, test) with specific percentages or counts.
Hardware Specification | No | The paper conducts simulations but does not specify any hardware details such as GPU models, CPU types, or memory used for the experiments.
Software Dependencies | No | The paper mentions using the 'Multiplicative Weights Update algorithm' but does not provide specific software names with version numbers for implementation, such as programming languages, libraries, or frameworks.
Experiment Setup | Yes | In particular, we consider Algorithms 1 (PERIODICRESTART), 3 (AVERAGERESTART), 4 (AVERAGERESTARTFULLHORIZON), and an M-bounded-recall mean-based algorithm, all based off of the Multiplicative Weights Update algorithm with fixed learning rate η = 1/2. All algorithms we simulate in the first environment have M = 0.15T. In the second environment, we simulate the adversarial example from the proof of Lemma 4.2 (with M = T/3). In Figure 2, we plot the cumulative regret over time for both environments, averaged over ten independent runs with T = 1000.
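Since no code is released, the setup above can only be approximated. The sketch below is a minimal illustration of the PERIODICRESTART idea (Algorithm 1): Multiplicative Weights Update whose state is wiped every M rounds, so the learner only ever depends on at most the last M rounds of history. The parameters η = 1/2, M = 0.15T, and T = 1000 follow the quoted setup; the specific "periodic drifting rewards" schedule (a sinusoidal alternation between two actions) is an assumption, since the paper's exact environment is not reproduced here.

```python
import numpy as np

def periodic_restart_mwu(rewards, M, eta=0.5):
    """MWU over a (T x K) reward matrix, restarting the weights every M rounds.

    Sketch of the PERIODICRESTART idea: forgetting all history at the start
    of each length-M block makes the algorithm M-bounded-recall.
    Returns the per-round expected reward of the played mixed action.
    """
    T, K = rewards.shape
    weights = np.ones(K)
    realized = np.zeros(T)
    for t in range(T):
        if t % M == 0:                       # restart: wipe accumulated history
            weights = np.ones(K)
        p = weights / weights.sum()          # current mixed action
        realized[t] = p @ rewards[t]         # expected reward this round
        weights *= np.exp(eta * rewards[t])  # multiplicative update (full feedback)
    return realized

# Hypothetical periodic drifting-reward environment (an assumption, not the
# paper's exact schedule): the better of two actions alternates sinusoidally.
T, K = 1000, 2
t = np.arange(T)
drift = 0.4 * np.sin(2 * np.pi * t / 200)
rewards = np.stack([0.5 + drift, 0.5 - drift], axis=1)  # values in [0.1, 0.9]

M = int(0.15 * T)                            # M = 0.15T, as in the quoted setup
realized = periodic_restart_mwu(rewards, M)
best_fixed = rewards.sum(axis=0).max()       # best single action in hindsight
regret = best_fixed - realized.sum()         # cumulative (external) regret
```

In a drifting environment like this, restarting can even beat the best fixed action in hindsight (negative regret), which is the intuition for why bounded-recall restarts are evaluated against non-stationary baselines.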