On Approximation Guarantees for Greedy Low Rank Optimization

Authors: Rajiv Khanna, Ethan R. Elenberg, Alexandros G. Dimakis, Joydeep Ghosh, Sahand Negahban

ICML 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We provide new approximation guarantees for greedy low rank matrix estimation under standard assumptions of restricted strong convexity and smoothness. Our novel analysis also uncovers previously unknown connections between the low rank estimation and combinatorial optimization, so much so that our bounds are reminiscent of corresponding approximation bounds in submodular maximization. Additionally, we also provide statistical recovery guarantees. Finally, we present empirical comparison of greedy estimation with established baselines on two important real-world problems.
Researcher Affiliation | Academia | UT Austin; Yale University. Correspondence to: Rajiv Khanna <rajivak@utexas.edu>.
Pseudocode | Yes | Algorithm 1 GREEDY(U, V, k, …) ... Algorithm 2 GECO(U, V, k, …)
Open Source Code | No | The paper does not contain an explicit statement or a link providing access to the source code for the methodology described in the paper. The only link provided (http://www.statmt.org/wmt14/training-monolingual-news-crawl) is for a dataset.
Open Datasets | Yes | For word similarity, we use the W353 dataset (Finkelstein et al., 2001) and the MEN data (Bruni et al., 2012). ... For the analogy task, we use the Microsoft Research (MSR) syntactic analogies (Mikolov et al., 2013c) and the Google mixed analogies dataset (Mikolov et al., 2013a). ... To train the word embeddings, we use the 2013 news crawl dataset.
Dataset Splits | No | The paper mentions using specific datasets for experiments but does not provide explicit details about how these datasets were split into training, validation, and test sets (e.g., percentages, sample counts, or specific predefined splits with citations).
Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments, such as CPU or GPU models, memory specifications, or types of computing clusters.
Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies or libraries used in the experiments. It generally describes the methods but lacks the precise software environment details required for reproducibility.
Experiment Setup | No | The paper describes the experimental tasks and datasets but does not provide specific details regarding hyperparameters, optimization settings (e.g., learning rates, batch sizes), or other system-level training configurations used for the models evaluated.
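The paper's GREEDY algorithm builds a low rank estimate one rank-1 term at a time. As a rough illustration of that idea (not the authors' exact procedure, which handles general restricted strongly convex and smooth losses), the sketch below greedily fits rank-1 terms to a matrix under squared (Frobenius) loss, where each greedy step reduces to a power iteration on the residual; the function name `greedy_low_rank` and the iteration counts are illustrative choices, not from the paper.

```python
import numpy as np

def greedy_low_rank(M, k, inner_iters=100):
    """Greedy rank-k approximation of M under squared (Frobenius) loss.

    Each round fits the best rank-1 term to the current residual via
    power iteration, then subtracts it. For the Frobenius loss this
    greedy deflation recovers the truncated SVD; the GREEDY algorithm
    in the paper generalizes the same forward-selection idea to other
    losses satisfying restricted strong convexity and smoothness.
    """
    R = np.asarray(M, dtype=float).copy()   # current residual
    approx = np.zeros_like(R)
    rng = np.random.default_rng(0)
    for _ in range(k):
        # Power iteration: find the top singular pair (u, v) of R.
        v = rng.standard_normal(R.shape[1])
        v /= np.linalg.norm(v)
        for _ in range(inner_iters):
            u = R @ v
            u /= np.linalg.norm(u)
            v = R.T @ u
            v /= np.linalg.norm(v)
        s = u @ R @ v                        # optimal step size for (u, v)
        approx += s * np.outer(u, v)
        R -= s * np.outer(u, v)
    return approx
```

For the squared loss this matches the best rank-k approximation, which makes it easy to sanity-check against `np.linalg.svd`; for the word-embedding experiments in the paper the loss and the rank-1 fitting step are different, but the greedy outer loop has the same shape.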