Learning to Interact With Learning Agents

Authors: Adish Singla, Hamed Hassani, Andreas Krause

AAAI 2018

Reproducibility Assessment (Variable / Result / LLM Response)
Research Type: Experimental (Simulation Results)
Quote: "Next, we evaluate the performance of the forecaster LIL via simulations and compare against the following benchmarks. EXP3: the EXP3 algorithm (Auer et al. 2002) as the forecaster for the specification in Protocol 1. ALL-LEARN: the EXP3 algorithm (Auer et al. 2002) as the forecaster for a relaxed, easier setting in which all experts j ∈ [N] observe the feedback at every time t. Adversarial losses: as our first simulation setting, we consider the same setup used in the proof of Theorem 1 and use the loss sequence shown in Figure 1(a). For this loss sequence, the losses of the actions A = {a1, a2, b} averaged over t ∈ [T] are (0.4583, 0.5, 0.7487); hence the best expert is EXP1 and the best action is a1 (cf. Equation 3). Figure 2(a) shows the regret REG(T, ALGO) for LIL, EXP3, and ALL-LEARN, and illustrates two points. First, EXP3 suffers linear regret, as dictated by the hardness result in Theorem 1. Second, LIL has sublinear regret, as proved in Theorem 2."
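For context on the EXP3 benchmark quoted above, here is a minimal sketch of the EXP3 algorithm of Auer et al. (2002) in its loss formulation. The Bernoulli loss means in the demo are illustrative values chosen near the averaged losses reported in the quote; they are an assumption for the demo, not the paper's actual adversarial loss sequence, and the fixed exploration rate `gamma = 0.1` is likewise an untuned placeholder.

```python
import math
import random

def exp3(n_arms, horizon, loss_fn, gamma=0.1):
    """EXP3 for adversarial bandits (Auer et al. 2002), losses in [0, 1].

    loss_fn(t, arm) returns the loss of `arm` at round t; only the
    pulled arm's loss is observed, matching bandit feedback.
    """
    weights = [1.0] * n_arms
    total_loss = 0.0
    for t in range(horizon):
        total_w = sum(weights)
        # Mix the exponential-weights distribution with uniform exploration.
        probs = [(1 - gamma) * w / total_w + gamma / n_arms for w in weights]
        arm = random.choices(range(n_arms), weights=probs)[0]
        loss = loss_fn(t, arm)
        total_loss += loss
        # Importance-weighted loss estimate (unbiased for the pulled arm).
        est = loss / probs[arm]
        weights[arm] *= math.exp(-gamma * est / n_arms)
    return total_loss

if __name__ == "__main__":
    random.seed(0)
    # Illustrative Bernoulli means, loosely mirroring the averaged losses
    # (0.4583, 0.5, 0.7487) reported above; an assumption for this demo.
    means = [0.46, 0.5, 0.75]
    total = exp3(3, 2000, lambda t, a: 1.0 if random.random() < means[a] else 0.0)
    print(f"cumulative loss over 2000 rounds: {total:.0f}")
```

With stochastic i.i.d. losses like these, EXP3 concentrates on the lowest-mean arm; the paper's point is that under Protocol 1's restricted feedback, this plain EXP3 forecaster instead suffers linear regret on the adversarial sequence of Theorem 1.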
Researcher Affiliation: Academia
Quote: "Adish Singla, MPI-SWS, Saarbrücken, Germany (adishs@mpi-sws.org); Hamed Hassani, University of Pennsylvania, Philadelphia, USA (hassani@seas.upenn.edu); Andreas Krause, ETH Zurich, Zurich, Switzerland (krausea@ethz.ch)"
Pseudocode: Yes
Quote: "Algorithm 2: Forecaster LIL"
Open Source Code: No
The paper provides no statements or links indicating the availability of open-source code for the described methodology.
Open Datasets: No
The paper describes generating loss sequences for its simulations (e.g., the "loss sequence shown in Figure 1(a)", and losses of actions A = {a1, a2, b} "sampled i.i.d. from Bernoulli distributions"), but it neither refers to nor provides access information for any publicly available dataset.
Dataset Splits: No
The paper gives no details about train/validation/test splits, sample counts, or cross-validation setups. It describes simulated loss sequences but no data partitioning.
Hardware Specification: No
The paper provides no hardware details (GPU or CPU models, memory, or cloud resources) for its simulations or experiments.
Software Dependencies: No
The paper discusses algorithms and frameworks (e.g., EXP3, HEDGE, Online Mirror Descent) but names no specific software packages or version numbers needed to replicate the experiments.
Experiment Setup: Yes
Quote: "Set parameters $\eta = T^{-\frac{1-\beta}{2-\beta}} (\log N)^{\frac{1}{2}\mathbb{1}\{\beta=0\}}$. Then, for sufficiently large T, the worst-case expected cumulative regret of the forecaster LIL is $\mathrm{REG}(T, \mathrm{LIL}) \le O\big(T^{\frac{1}{2-\beta}}\, N^{\frac{1}{2-\beta}}\, (\log N)^{\frac{1}{2}\mathbb{1}\{\beta=0\}}\big)$."
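The Theorem 2 statement quoted above arrives garbled by text extraction; under the reading that both T and N carry the exponent 1/(2−β) (a reconstruction from the surviving fragments, not a verbatim restatement), the bound specializes at β = 0 to the familiar multi-armed-bandit rate and degrades toward linear in T as β → 1:

```latex
\mathrm{REG}(T,\mathrm{LIL})
  \;\le\; O\!\Big(T^{\frac{1}{2-\beta}}\, N^{\frac{1}{2-\beta}}\,
                 (\log N)^{\frac{1}{2}\mathbb{1}\{\beta=0\}}\Big),
\qquad
\beta = 0 \;\Rightarrow\; O\!\big(\sqrt{T N \log N}\big),
\qquad
\lim_{\beta \to 1} \tfrac{1}{2-\beta} = 1 .
```

This consistency with the standard $\sqrt{TN\log N}$ bandit rate at β = 0 is what makes the 1/(2−β) reading plausible, but the exact exponents should be checked against the paper's PDF.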