Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Budgeted Prediction with Expert Advice
Authors: Kareem Amin, Satyen Kale, Gerald Tesauro, Deepak Turaga
AAAI 2015 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In the first part of this paper, we prove our main theoretical results, stated below. We then demonstrate our algorithm's performance on real data in a setting emulating the aforementioned motivation for this work. |
| Researcher Affiliation | Collaboration | Kareem Amin, University of Pennsylvania, Philadelphia, PA (EMAIL); Satyen Kale, Yahoo! Labs, New York, NY (EMAIL); Gerald Tesauro and Deepak Turaga, IBM Research, Yorktown Heights, NY (EMAIL) |
| Pseudocode | Yes | Algorithm 1 Budgeted Experts Algorithm (BEXP). |
| Open Source Code | No | The paper provides links to third-party software (libsvm, Vowpal Wabbit) used, but does not state that the authors' own implementation code for the described methodology (BEXP, BEXP-AVG) is open-source or provide a link for it. |
| Open Datasets | Yes | We will use the year-prediction task associated with the Million Song Dataset (MSD). Each example in the MSD corresponds to a song released between 1922 and 2011. We use the same features as (Bertin-Mahieux et al. 2011): 90 acoustical features representing the timbre of a song. |
| Dataset Splits | No | The paper mentions training on 'a set of 46,382 examples of the MSD' and testing on 'a sequence of T = 5,153 examples arriving online' but does not specify a separate validation split or explicit percentages/counts for train/validation/test. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions 'MATLAB', 'libsvm', and 'Vowpal Wabbit' as software used, but does not provide specific version numbers for any of these dependencies. |
| Experiment Setup | Yes | We used 3 families of models trained with different parameter settings, for a total of 29 different experts. The models were all trained on a set of 46,382 examples of the MSD. All labels in the dataset were in the range 1929 to 2010. We normalized the labels by subtracting 1900 and dividing by 100 so that they were in the range [0, 1.1]... We used absolute loss to measure the performance of the experts... |
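The label normalization and loss described in the Experiment Setup row can be sketched in a few lines of Python. This is an illustrative reconstruction, not the authors' code; function names are hypothetical.

```python
# Sketch of the label normalization and absolute loss quoted above;
# names are illustrative, not taken from the paper's implementation.

def normalize_year(year: int) -> float:
    """Map a release year (e.g. 1929-2010) into [0, 1.1]
    by subtracting 1900 and dividing by 100."""
    return (year - 1900) / 100.0

def absolute_loss(prediction: float, label: float) -> float:
    """Absolute loss used to score each expert's prediction."""
    return abs(prediction - label)

# Example: a song released in 1975, an expert predicting 1980.
label = normalize_year(1975)       # 0.75
pred = normalize_year(1980)        # 0.80
loss = absolute_loss(pred, label)  # 0.05 (up to floating-point error)
```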