reproducibilityindex.ai

Generalized Leverage Scores: Geometric Interpretation and Applications

Authors: Bruno Ordozgoiti, Antonis Matakos, Aristides Gionis

ICML 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We run numerical experiments to provide further insight on the proposed methods. We conduct experiments to gain further insight about our results.
Researcher Affiliation	Academia	1School of Electronic Engineering and Computer Science, Queen Mary University of London, United Kingdom 2Deparment of Computer Science, Aalto University, Finland 3Division of Theoretical Computer Science, KTH Royal Institute of Technology, Sweden.
Pseudocode	Yes	Algorithm 1 Generalized deterministic leverage score sampling for GCSS (Problem 2). Algorithm 2 Generalized deterministic leverage score sampling for Sparse CCA.
Open Source Code	No	The paper does not provide an explicit statement or link for the open-source code of the described methodology.
Open Datasets	Yes	We evaluate the performance of Algorithm 1 on a collection of real datasets, obtained from a repository maintained by Arizona State University for feature-selection tasks.1 (footnote 1: https://jundongl.github.io/ scikit-feature/datasets.html)
Dataset Splits	No	The paper states: 'We split each dataset in two. The ﬁrst half of the columns acts as matrix A (see def. of GCSS), and the second as B.' This describes how the data is used for the specific problem setup, but does not provide standard training/validation/test dataset splits for model reproduction.
Hardware Specification	No	The paper discusses computational complexity and running time but does not provide specific details about the hardware (e.g., CPU, GPU models) used for running the experiments.
Software Dependencies	No	The paper mentions utilizing 'off-the-shelf software' for computing SVD but does not provide specific version numbers for any software dependencies or libraries used in their implementation.
Experiment Setup	Yes	To avoid large values of the ratio σω / σµ , we only retain singular vectors paired with the leading singular values accounting for at least 75% of the total squared norm. We thus try setting the size of R to a number of values: 1/10, 1/4 and 1/2 of the retained rank. For each resulting R we compute generalized leverage scores. We use the generalized-leverage-score ordering to build a column submatrix Ck of size k, for a range of k (depending on the dataset size).