Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Learning to Search Better than Your Teacher

Authors: Kai-Wei Chang, Akshay Krishnamurthy, Alekh Agarwal, Hal Daumé III, John Langford

ICML 2015

Research Type: Experimental
LLM Response: "5. Experiments. This section shows that LOLS is able to improve upon a suboptimal reference policy and provides empirical evidence to support the analysis in Section 3. We conducted experiments on the following three applications."

Researcher Affiliation: Collaboration
LLM Response: Kai-Wei Chang (University of Illinois at Urbana-Champaign, IL); Akshay Krishnamurthy (Carnegie Mellon University, Pittsburgh, PA); Alekh Agarwal (Microsoft Research, New York, NY); Hal Daumé III (University of Maryland, College Park, MD); John Langford (Microsoft Research, New York, NY).

Pseudocode: Yes
LLM Response: Algorithm 1: Locally Optimal Learning to Search (LOLS); Algorithm 2: Structured Contextual Bandit Learning.

Open Source Code: No
LLM Response: "Our implementation is based on Vowpal Wabbit, a machine learning system that supports online learning and L2S." (Footnote 6: http://hunch.net/~vw/) Vowpal Wabbit is a third-party tool used by the authors, not their own source code for the methodology.

Open Datasets: Yes
LLM Response: "The experiments are conducted on the KDDCup 99 dataset generated from a computer network intrusion detection task." (Footnote 5: http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html) "We train on 38k sentences and test on 11k from the Penn Treebank (Marcus et al., 1993)." "We used data from the Penn Treebank Wall Street Journal corpus: the standard data split for training (sections 02-21) and test (section 23)."

Dataset Splits: Yes
LLM Response: "The dataset contains 5 classes, 4,898,431 training and 311,029 test instances." "We train on 38k sentences and test on 11k from the Penn Treebank (Marcus et al., 1993)." "We used data from the Penn Treebank Wall Street Journal corpus: the standard data split for training (sections 02-21) and test (section 23)."

Hardware Specification: No
LLM Response: The paper does not provide hardware details (e.g., GPU/CPU models or memory) for running the experiments; it mentions only the Vowpal Wabbit software system.

Software Dependencies: No
LLM Response: The paper states that the implementation is based on Vowpal Wabbit but gives no version number for it or for any other software dependency.

Experiment Setup: Yes
LLM Response: "For LOLS's mixture policy, we set β = 0.5. For SEARN, we set the mixture parameter to be 1 − (1 − α)^t, where t is the number of rounds and α = 10^-5. Unless stated otherwise, all the learners take 5 passes over the data."
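The mixture parameters quoted in the Experiment Setup row can be made concrete with a short sketch. This is illustrative only, not the authors' Vowpal Wabbit implementation; the function names are hypothetical. LOLS rolls out with the reference policy with probability β = 0.5 (otherwise with the current learned policy), and SEARN's interpolation weight after t rounds is 1 − (1 − α)^t with α = 10^-5.

```python
import random

# Parameter values quoted from the paper's experiment setup (Section 5).
BETA = 0.5    # LOLS mixture policy: P(roll out with the reference policy)
ALPHA = 1e-5  # SEARN interpolation rate

def searn_mixture(t: int) -> float:
    """SEARN mixture parameter after t rounds: 1 - (1 - alpha)^t.

    Grows slowly from ~alpha at t = 1 toward 1 as t increases.
    """
    return 1.0 - (1.0 - ALPHA) ** t

def choose_rollout_policy(reference, learned, rng=random):
    """Pick a roll-out policy for a LOLS-style mixture: with probability
    BETA follow the reference policy, otherwise the current learned
    policy. (Hypothetical helper; Algorithm 1 in the paper defines the
    full training loop around this choice.)
    """
    return reference if rng.random() < BETA else learned

if __name__ == "__main__":
    for t in (1, 10, 100):
        print(f"SEARN mixture after {t} rounds: {searn_mixture(t):.3e}")
```

With α this small, the mixture weight stays tiny for the 5 passes used in the experiments, so SEARN leans almost entirely on the reference policy throughout training.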