Meta-Learning Effective Exploration Strategies for Contextual Bandits

Authors: Amr Sharaf, Hal Daumé III (pp. 9541-9548)

AAAI 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate MÊLÉE on both a natural contextual bandit problem derived from a learning to rank dataset as well as hundreds of simulated contextual bandit problems derived from classification tasks.
Researcher Affiliation | Collaboration | Amr Sharaf (1), Hal Daumé III (1,2); 1: University of Maryland, 2: Microsoft Research
Pseudocode | Yes | Algorithm 1 MÊLÉE (supervised training sets {S_m}, hypothesis class F, exploration rate µ, number of validation examples N_Val, feature extractor Φ)
Open Source Code | No | The paper does not provide any explicit statements about releasing source code for its methodology or links to a code repository.
Open Datasets | Yes | The dataset we consider is the Microsoft Learning to Rank dataset, variant MSLR-10K from (Qin and Liu 2013). ... Following Bietti, Agarwal, and Langford (2018), we use a collection of 300 binary classification datasets from openml.org for evaluation. (See the OpenML loading sketch after the table.)
Dataset Splits | Yes | Algorithm 1 MÊLÉE ... step 4: partition and permute S randomly into train Tr and validation Val where |Val| = N_Val. (See the train/validation split sketch after the table.)
Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, memory, or cloud instance types used for running experiments.
Software Dependencies | No | The paper mentions methods like 'Platt's scaling' and 'AggreVaTe' but does not specify software packages or libraries with version numbers.
Experiment Setup | Yes | In all cases, the underlying classifier f is a linear model trained with an optimizer that runs stochastic gradient descent. ... In our experiments we use only 30 fully labeled examples... In practice (§5), we find that setting µ = 0 is optimal in aggregate... To avoid correlations between the observed query-url pairs, we group the queries by the query ID, and sample a single query from each group. ... we repeat the experiment 16 times with randomly shuffled permutations of the MSLR-10K dataset. (See the experiment-setup sketch after the table.)
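
For the pseudocode and dataset-split rows, here is a minimal Python sketch of the random train/validation partition described in step 4 of Algorithm 1 (MÊLÉE). The helper name partition_train_val and the list-of-pairs representation of S are our assumptions for illustration, not the paper's code.

```python
import numpy as np

def partition_train_val(S, n_val, seed=0):
    """Randomly permute a supervised set S and split off n_val validation
    examples, mirroring step 4 of Algorithm 1: Val gets |Val| = n_val
    examples and Tr gets the rest. S is assumed to be a list of (x, y) pairs."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(S))
    val = [S[i] for i in idx[:n_val]]
    tr = [S[i] for i in idx[n_val:]]
    return tr, val

# Toy usage with N_Val = 10 on a 100-example supervised set.
S = [(np.random.randn(5), np.random.randint(2)) for _ in range(100)]
Tr, Val = partition_train_val(S, n_val=10)
assert len(Val) == 10 and len(Tr) == 90
```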
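
For the open-datasets row, a hedged sketch of pulling a single OpenML dataset with scikit-learn's fetch_openml. The data_id used below (31, the "credit-g" dataset) is only a placeholder: the IDs of the 300 binary classification datasets follow Bietti, Agarwal, and Langford (2018) and are not listed in this review.

```python
from sklearn.datasets import fetch_openml

# Placeholder: dataset ID 31 ("credit-g") stands in for any one of the 300
# binary classification datasets; the actual ID list is not reproduced here.
bunch = fetch_openml(data_id=31, as_frame=True)
X, y = bunch.data, bunch.target
print(X.shape, y.value_counts())
```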
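
For the experiment-setup row, a minimal pandas sketch of the two MSLR-10K preprocessing steps quoted above: keep a single randomly sampled query-url row per query ID, then repeat the run over 16 independently shuffled permutations. The column name query_id and the DataFrame representation are assumptions about how the data has been loaded, not part of the paper.

```python
import pandas as pd

def sample_one_per_query(df, qid_col="query_id", seed=0):
    """Keep one randomly chosen query-url row per query ID, avoiding the
    within-query correlations mentioned in the paper (pandas >= 1.1)."""
    return df.groupby(qid_col, group_keys=False).sample(n=1, random_state=seed)

def shuffled_repeats(df, n_repeats=16, seed=0):
    """Yield independently shuffled permutations of the dataset, one per run."""
    for r in range(n_repeats):
        yield df.sample(frac=1.0, random_state=seed + r).reset_index(drop=True)

# Usage, assuming `mslr` is a DataFrame with a query_id column:
# subsampled = sample_one_per_query(mslr)
# for run, permuted in enumerate(shuffled_repeats(subsampled)):
#     ...  # train the linear SGD classifier and evaluate the exploration policy
```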