Active Learning for Accurate Estimation of Linear Models

Authors: Carlos Riquelme, Mohammad Ghavamzadeh, Alessandro Lazaric

ICML 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we provide empirical evidence to support our theoretical results. We consider both synthetic and real-world problems, and compare the performance (in terms of normalized MSE) of TRACE-UCB to uniform sampling, optimal static allocation (which requires the knowledge of noise variances), and the context-free algorithm VAR-UCB (see Remark 2). ... First, we use synthetic data to ensure that all the assumptions of our model are satisfied, namely we deal with linear regression models with Gaussian context and noise. ... Second, we consider real-world datasets in which the underlying model is non-linear and the contexts are not Gaussian, to observe how TRACE-UCB behaves (relative to the baselines) in settings where its main underlying assumptions are violated.
Researcher Affiliation | Collaboration | (1) Stanford University, Stanford, CA, USA; (2) DeepMind, Mountain View, CA, USA (the work was done while the author was with Adobe Research); (3) Inria Lille, France.
Pseudocode | Yes | Algorithm 1 TRACE-UCB Algorithm (a hedged re-implementation sketch appears after this table)
Open Source Code | No | The paper does not provide an explicit statement about the release of source code for the described methodology, nor does it include a link to a code repository.
Open Datasets | Yes | We consider two collaborative filtering datasets in which users provide ratings for items. ... Fig. 2(a) reports the results using the Jester Dataset by (Goldberg et al., 2001)... Fig. 2(b) shows the results for the MovieLens dataset (Maxwell Harper & Konstan, 2016)...
Dataset Splits | No | The paper mentions collecting a "training set Dn" and using the remaining k − n users for evaluation, but it does not specify explicit train/validation/test splits (e.g., percentages, sample counts, or specific predefined splits) that would be needed for direct reproducibility of the data partitioning.
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., CPU, GPU models, memory) used to run the experiments.
Software Dependencies | No | The paper describes the algorithms and their performance but does not specify any software dependencies with version numbers (e.g., specific Python libraries, machine learning frameworks like PyTorch or TensorFlow, or statistical software versions).
Experiment Setup | No | The paper mentions a "regularization parameter λ = O(1/n)" as an input to the algorithm but does not provide a comprehensive set of hyperparameters or detailed system-level training configurations (e.g., learning rates, batch sizes, optimizer settings, number of epochs) for the experiments conducted.
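
Since no source code is released, the sketch below is a minimal, hypothetical re-implementation written for this report rather than the authors' method. It compares uniform (round-robin) sampling against a TRACE-UCB-style allocation rule on synthetic linear models with Gaussian contexts and noise, mirroring the synthetic setting quoted in the Research Type row. The selection score, the confidence width, the identity context covariance, the per-instance sample budget, and the use of the worst-case per-instance prediction error as the "normalized MSE" proxy are all assumptions made for illustration; only λ = O(1/n) comes from the paper.

# Hypothetical sketch, not the authors' code: uniform vs. UCB-style allocation
# across several linear regression instances with Gaussian contexts and noise.
import numpy as np

rng = np.random.default_rng(0)

def make_problems(m, d, sigmas):
    """One linear model beta_i per instance, plus its noise std sigma_i."""
    betas = rng.normal(size=(m, d))
    return betas, np.asarray(sigmas, dtype=float)

def sample(betas, sigmas, i):
    """Draw a standard Gaussian context and a noisy response from instance i."""
    d = betas.shape[1]
    x = rng.normal(size=d)
    y = x @ betas[i] + sigmas[i] * rng.normal()
    return x, y

def fit_and_loss(X, y, beta_true, lam):
    """Ridge estimate and prediction loss (beta_hat - beta)^T Sigma (beta_hat - beta);
    Sigma = I here because contexts are standard Gaussian (an assumption)."""
    d = X.shape[1]
    A = X.T @ X + lam * np.eye(d)
    beta_hat = np.linalg.solve(A, X.T @ y)
    err = beta_hat - beta_true
    return float(err @ err)

def run_uniform(betas, sigmas, n, lam):
    m, d = betas.shape
    data = [([], []) for _ in range(m)]
    for t in range(n):
        i = t % m                          # round-robin = uniform allocation
        x, y = sample(betas, sigmas, i)
        data[i][0].append(x); data[i][1].append(y)
    return max(fit_and_loss(np.array(Xs), np.array(ys), betas[i], lam)
               for i, (Xs, ys) in enumerate(data))

def run_trace_ucb_like(betas, sigmas, n, lam, delta=0.1):
    """Allocation driven by an upper confidence bound on each instance's noise
    variance divided by its sample count -- a simplified stand-in for the
    TRACE-UCB score of Algorithm 1, not a faithful reproduction."""
    m, d = betas.shape
    data = [([], []) for _ in range(m)]
    # Initialization: a few samples per instance so variance estimates exist.
    for i in range(m):
        for _ in range(d + 2):
            x, y = sample(betas, sigmas, i)
            data[i][0].append(x); data[i][1].append(y)
    for _ in range(n - m * (d + 2)):
        scores = []
        for i in range(m):
            X, y = np.array(data[i][0]), np.array(data[i][1])
            k = len(y)
            A = X.T @ X + lam * np.eye(d)
            beta_hat = np.linalg.solve(A, X.T @ y)
            resid = y - X @ beta_hat
            var_hat = resid @ resid / max(k - d, 1)          # noise variance estimate
            width = var_hat * np.sqrt(np.log(1.0 / delta) / k)  # assumed UCB width
            scores.append((var_hat + width) / k)             # assumed loss proxy
        i = int(np.argmax(scores))
        x, y = sample(betas, sigmas, i)
        data[i][0].append(x); data[i][1].append(y)
    return max(fit_and_loss(np.array(Xs), np.array(ys), betas[i], lam)
               for i, (Xs, ys) in enumerate(data))

if __name__ == "__main__":
    betas, sigmas = make_problems(m=5, d=5, sigmas=[0.1, 0.2, 0.5, 1.0, 2.0])
    n = 2000
    lam = 1.0 / n                          # lambda = O(1/n), as reported above
    print("uniform   max loss:", run_uniform(betas, sigmas, n, lam))
    print("ucb-style max loss:", run_trace_ucb_like(betas, sigmas, n, lam))

With noise variances that differ by an order of magnitude, the UCB-style rule spends more of its budget on the noisier instances and tends to reach a lower worst-case error than round-robin, which is the kind of comparison against uniform sampling that the Research Type row above describes; the exact numbers here carry no weight, since the score and error measure are only stand-ins for the paper's definitions.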