Active Learning for Accurate Estimation of Linear Models
Authors: Carlos Riquelme, Mohammad Ghavamzadeh, Alessandro Lazaric
ICML 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we provide empirical evidence to support our theoretical results. We consider both synthetic and real-world problems, and compare the performance (in terms of normalized MSE) of TRACE-UCB to uniform sampling, optimal static allocation (which requires the knowledge of noise variances), and the context-free algorithm VAR-UCB (see Remark 2). ... First, we use synthetic data to ensure that all the assumptions of our model are satisfied, namely we deal with linear regression models with Gaussian context and noise. ... Second, we consider real-world datasets in which the underlying model is non-linear and the contexts are not Gaussian, to observe how TRACE-UCB behaves (relative to the baselines) in settings where its main underlying assumptions are violated. |
| Researcher Affiliation | Collaboration | 1Stanford University, Stanford, CA, USA. 2DeepMind, Mountain View, CA, USA (the work was done while the author was with Adobe Research). 3Inria Lille, France. |
| Pseudocode | Yes | Algorithm 1 TRACE-UCB Algorithm |
| Open Source Code | No | The paper does not provide an explicit statement about the release of source code for the described methodology, nor does it include a link to a code repository. |
| Open Datasets | Yes | We consider two collaborative filtering datasets in which users provide ratings for items. ... Fig. 2(a) reports the results using the Jester dataset (Goldberg et al., 2001)... Fig. 2(b) shows the results for the MovieLens dataset (Maxwell Harper & Konstan, 2016)... |
| Dataset Splits | No | The paper mentions collecting a "training set Dn" and using the remaining k − n users for evaluation, but it does not specify explicit train/validation/test splits (e.g., percentages, sample counts, or predefined splits) that would be needed to reproduce the data partitioning directly. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., CPU, GPU models, memory) used to run the experiments. |
| Software Dependencies | No | The paper describes the algorithms and their performance but does not specify any software dependencies with version numbers (e.g., specific Python libraries, machine learning frameworks like PyTorch or TensorFlow, or statistical software versions). |
| Experiment Setup | No | The paper mentions a "regularization parameter λ = O(1/n)" as an input to the algorithm but does not provide a comprehensive set of hyperparameters or detailed system-level training configurations (e.g., learning rates, batch sizes, optimizer settings, number of epochs) for the experiments conducted. |
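
For orientation, the sketch below illustrates the kind of experiment the Research Type and Experiment Setup rows describe: several synthetic linear-regression instances with Gaussian contexts and heteroscedastic Gaussian noise, a fixed sampling budget split either uniformly or by a TRACE-UCB-style adaptive rule, and a maximum normalized MSE reported for each allocation. This is a minimal illustration, not the authors' implementation: the confidence width `delta`, the problem sizes, and the choice of normalization are assumptions made here; only the regularization λ = O(1/n) is taken from the paper as quoted in the table.

```python
# Minimal sketch of a TRACE-UCB-style allocation loop and a normalized-MSE
# comparison against uniform sampling on synthetic linear-regression instances.
# Not the authors' implementation: `delta` is a simplified placeholder for the
# paper's high-probability confidence width, and the normalization of the MSE
# is an illustrative choice.
import numpy as np

rng = np.random.default_rng(0)

m, d, n = 5, 10, 1500                     # instances, context dimension, total budget
sigmas = rng.uniform(0.5, 3.0, size=m)    # per-instance noise std (unknown to the learner)
betas = rng.normal(size=(m, d))           # true linear parameters

def sample(i, k):
    """Draw k (context, reward) pairs from instance i (Gaussian context and noise)."""
    X = rng.normal(size=(k, d))
    y = X @ betas[i] + sigmas[i] * rng.normal(size=k)
    return X, y

def ridge(X, y, lam):
    """Regularized least-squares estimate; the paper uses lambda = O(1/n)."""
    A = X.T @ X + lam * np.eye(d)
    return np.linalg.solve(A, X.T @ y), A

def trace_ucb_allocation(n, lam, n0=2 * d):
    """Allocate n samples across the m instances with a TRACE-UCB-style rule."""
    data = [sample(i, n0) for i in range(m)]          # initial round-robin phase
    for t in range(m * n0, n):
        scores = []
        for X, y in data:
            beta_hat, A = ridge(X, y, lam)
            k_i = X.shape[0]
            resid = y - X @ beta_hat
            var_hat = resid @ resid / max(k_i - d, 1)  # noise-variance estimate
            delta = np.sqrt(np.log(n) / k_i)           # placeholder confidence width
            scores.append((var_hat + delta) * np.trace(np.linalg.inv(A)))
        j = int(np.argmax(scores))                     # instance with largest upper bound
        Xj, yj = sample(j, 1)
        data[j] = (np.vstack([data[j][0], Xj]), np.concatenate([data[j][1], yj]))
    return data

def uniform_allocation(n):
    """Spread the budget evenly across the instances."""
    k = n // m
    return [sample(i, k) for i in range(m)]

def max_normalized_mse(data, lam):
    """Max over instances of ||beta_hat - beta||^2 / sigma^2 (one normalization choice)."""
    errs = []
    for i, (X, y) in enumerate(data):
        beta_hat, _ = ridge(X, y, lam)
        errs.append(np.sum((beta_hat - betas[i]) ** 2) / sigmas[i] ** 2)
    return max(errs)

lam = 1.0 / n  # regularization parameter lambda = O(1/n), as noted in the table
print("uniform   :", max_normalized_mse(uniform_allocation(n), lam))
print("trace-ucb :", max_normalized_mse(trace_ucb_allocation(n, lam), lam))
```

Under heteroscedastic noise the adaptive rule concentrates the remaining budget on the noisier instances, which is the qualitative behavior the paper's experiments compare against uniform sampling, the variance-aware optimal static allocation, and the context-free VAR-UCB baseline.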