Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

New Perspectives on k-Support and Cluster Norms

Authors: Andrew M. McDonald, Massimiliano Pontil, Dimitris Stamos

JMLR 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Numerical experiments indicate that the spectral k-support and box-norms and their centered variants provide state-of-the-art performance in matrix completion and multitask learning problems, respectively. We present extensive numerical experiments on both synthetic and real matrix learning data sets. Our findings indicate that regularization with the spectral k-support and box-norms produces state-of-the-art results on a number of popular matrix completion benchmarks, and centered variants of the norms show a significant improvement in performance over the centered trace norm and the matrix elastic net on multitask learning benchmarks.
Researcher Affiliation | Academia | Andrew M. McDonald, Department of Computer Science, University College London, Gower Street, London WC1E 6BT, UK; Massimiliano Pontil, Istituto Italiano di Tecnologia, Via Morego 30, 16163 Genoa, Italy, and Department of Computer Science, University College London, Gower Street, London WC1E 6BT, UK; Dimitris Stamos, Department of Computer Science, University College London, Gower Street, London WC1E 6BT, UK
Pseudocode | Yes | Algorithm 1: Computation of x = prox_{(λ/2)‖·‖²_box}(w). Require: parameters a, b, c, λ.
1. Sort the points {α_i}_{i=1}^{2d} = {(a+λ)/|w_i| : i = 1, …, d} ∪ {(b+λ)/|w_i| : i = 1, …, d} such that α_i ≤ α_{i+1};
2. Identify consecutive points α_i and α_{i+1} such that S(α_i) ≤ c and S(α_{i+1}) ≥ c by binary search;
3. Find α* between α_i and α_{i+1} such that S(α*) = c by linear interpolation;
4. Compute θ_i(α*) for i = 1, …, d;
5. Return x_i = θ_i w_i/(θ_i + λ) for i = 1, …, d.
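The steps of Algorithm 1 can be sketched in code. The parametrisation θ_i(α) = min(b, max(a, α|w_i| − λ)) and S(α) = Σ_i θ_i(α) are inferred from the breakpoints listed in step 1 (θ_i hits a at α = (a+λ)/|w_i| and b at α = (b+λ)/|w_i|); the sketch assumes all w_i ≠ 0 and da < c < db so the constraint Σ_i θ_i = c is active.

```python
import numpy as np

def prox_box_sq(w, a, b, c, lam):
    # Sketch of Algorithm 1: x = prox of (lam/2)*||.||_box^2 at w.
    # Assumes theta_i(alpha) = min(b, max(a, alpha*|w_i| - lam)) and
    # all w_i != 0 with d*a < c < d*b (simplex constraint active).
    w = np.asarray(w, dtype=float)
    absw = np.abs(w)

    # Step 1: sort the 2d breakpoints where some theta_i changes regime.
    alphas = np.sort(np.concatenate([(a + lam) / absw, (b + lam) / absw]))

    def S(alpha):  # piecewise-linear and nondecreasing in alpha
        return np.clip(alpha * absw - lam, a, b).sum()

    # Step 2: locate breakpoints bracketing c (searchsorted plays the
    # role of the binary search, since the S values are sorted too).
    svals = np.array([S(al) for al in alphas])
    j = np.searchsorted(svals, c)
    lo, hi = alphas[j - 1], alphas[j]
    slo, shi = svals[j - 1], svals[j]

    # Step 3: S is linear between breakpoints, so interpolate exactly.
    alpha = lo + (c - slo) * (hi - lo) / (shi - slo)

    # Steps 4-5: recover theta and the proximity point.
    theta = np.clip(alpha * absw - lam, a, b)
    return theta * w / (theta + lam)
```

For example, with w = (3, 1, −2), a = 0.2, b = 1.5, c = 2 and λ = 0.5, the interpolated α is 0.56, θ = (1.18, 0.2, 0.62) sums to c, and each x_i shrinks w_i toward zero.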
Open Source Code | Yes | Matlab code used in the experiments is available at http://www0.cs.ucl.ac.uk/staff/M.Pontil/software.html.
Open Datasets | Yes | MovieLens data sets are available at http://grouplens.org/datasets/movielens/. Jester data sets are available at http://goldberg.berkeley.edu/jester-data/. The Lenk personal computer data set (Lenk et al., 1996) consists of 180 ratings of 20 profiles of computers characterized by 14 features (including a bias term). The Animals with Attributes data set (Lampert et al., 2009) consists of 30,475 images of animals from 50 classes.
Dataset Splits | Yes | Following Toh and Yun (2011), for MovieLens we uniformly sampled ρ = 50% of the available entries for each user for training, and for Jester 1, Jester 2 and Jester 3 we sampled 20, 20 and 8 ratings per user, respectively; we again used 10% for validation. For each class the examples were split into training, validation and testing data sets with a 50%/25%/25% split, and we averaged the performance over 50 runs.
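The per-user protocol quoted above can be illustrated with a short sketch. The data format (user id mapped to a list of (item, rating) pairs) and the helper name `per_user_split` are hypothetical, not from the paper; a fraction `rho` of each user's entries goes to training, a fraction `val_frac` of those is held out for validation, and the remainder is the test set.

```python
import random

def per_user_split(ratings, rho=0.5, val_frac=0.1, seed=0):
    # Hypothetical sketch: ratings maps user id -> list of (item, rating).
    rng = random.Random(seed)
    train, val, test = {}, {}, {}
    for user, entries in ratings.items():
        entries = list(entries)
        rng.shuffle(entries)
        n_train = int(round(rho * len(entries)))       # rho for training
        tr, test[user] = entries[:n_train], entries[n_train:]
        n_val = int(round(val_frac * n_train))         # held out for validation
        val[user], train[user] = tr[:n_val], tr[n_val:]
    return train, val, test
```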
Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments. It mentions using 'Matlab code' and an 'accelerated proximal gradient method (FISTA)' but no CPU, GPU, or other hardware specifications.
Software Dependencies | No | The paper mentions 'Matlab code' and 'an accelerated proximal gradient method (FISTA)' but does not provide specific version numbers for any software or libraries used, which is necessary for reproducibility.
Experiment Setup | Yes | To solve the optimization problem we used an accelerated proximal gradient method (FISTA) (see e.g. Beck and Teboulle, 2009; Nesterov, 2007), using the percentage change in the objective as the convergence criterion, with a tolerance of 10⁻⁵ (10⁻³ for the real matrix completion experiments). For the matrix completion experiments, the thresholding level was chosen by validation. The error was measured as normalized mean absolute error. We used the logistic loss, yielding the error term Σ_{i=1}^n log(1 + exp(−y_{t,i} ⟨w_t, x_i⟩)).
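The FISTA loop with the percentage-change stopping rule described above can be sketched generically; this is not the paper's Matlab solver, and the names `fista`, `grad_f`, `prox_g` and `obj` are assumptions for illustration.

```python
import numpy as np

def fista(grad_f, prox_g, obj, x0, step, tol=1e-5, max_iter=1000):
    # Generic FISTA (Beck & Teboulle, 2009): gradient step on the smooth
    # part f, proximal step on the regularizer g, Nesterov momentum, and
    # the stopping rule quoted above: relative (percentage) change in the
    # objective below tol.
    x = y = np.asarray(x0, dtype=float)
    t, prev = 1.0, obj(x)
    for _ in range(max_iter):
        x_new = prox_g(y - step * grad_f(y))
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        y = x_new + ((t - 1.0) / t_new) * (x_new - x)
        x, t = x_new, t_new
        cur = obj(x)
        if abs(prev - cur) / max(abs(prev), 1e-12) < tol:
            break
        prev = cur
    return x
```

With `prox_g` set to the box-norm proximity map of Algorithm 1 this matches the paper's overall scheme; swapping in the identity prox recovers plain accelerated gradient descent.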