Characterizing the Loss Landscape in Non-Negative Matrix Factorization
Authors: Johan Bjorck, Anmol Kabra, Kilian Q. Weinberger, Carla P. Gomes
AAAI 2021, pp. 6768-6776
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show that such a property holds with high probability for NMF, provably in a non-worst-case model with a planted solution, and empirically across an extensive suite of real-world NMF problems spanning collaborative filtering, scientific analysis, and image analysis. Our analysis predicts that this property becomes more likely with a growing number of parameters, and experiments suggest that a similar trend might also hold for deep neural networks, turning increasing dataset sizes and model sizes into a blessing from an optimization perspective. |
| Researcher Affiliation | Academia | Johan Bjorck, Anmol Kabra, Kilian Q. Weinberger, Carla P. Gomes Cornell University {njb225,ak2426,kqw4,gomes}@cornell.edu |
| Pseudocode | No | No explicit pseudocode or algorithm blocks were found in the paper. |
| Open Source Code | No | The paper does not provide any statement about releasing source code or a link to a code repository for the described methodology. |
| Open Datasets | Yes | In Table 1, we list these datasets together with their sparsity: movielens, movie ratings, (3953, 6041, 20), sparsity 0.0419 (Harper and Konstan 2016); netflix, movie/tv-show ratings, (47928, 8963, 20), sparsity 0.0121 (Zhou et al. 2008); goodbooks, book ratings, (10000, 43461, 50), sparsity 0.0022 (Kula 2017). |
| Dataset Splits | No | The paper does not explicitly provide training/validation/test dataset splits (e.g., percentages or sample counts) needed to reproduce the experiment. It mentions computing 'loss only over observed entries' but not explicit splits. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory specifications, or cloud instances) used for running its experiments. |
| Software Dependencies | No | The paper implicitly references software such as PyTorch (e.g., via its ResNet experiments) but does not provide specific version numbers for any software dependencies (e.g., Python, PyTorch, or other libraries). |
| Experiment Setup | Yes | For simplicity, we use the same learning rate of 1e-5 for all datasets and run gradient descent until the rate of relative improvement in the loss falls below 1e-7. We initialize decomposition matrices using the half-normal distribution, which is scaled so that the mean matches with that of the dataset. To enable comparison between datasets, we scale all data matrices so that the variance of observed entries is one, and divide the loss function by the number of (observed) entries. |
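
The experiment-setup row above can be made concrete with a minimal NumPy sketch of masked-loss gradient descent for NMF. It follows the reported constants (learning rate 1e-5, relative-improvement tolerance 1e-7, half-normal initialization, unit-variance rescaling of observed entries, and a loss averaged over observed entries), but the non-negativity projection, the exact initialization scaling, the rank, and all names (`masked_nmf_gd`, `rel_tol`, etc.) are illustrative assumptions rather than the authors' released implementation.

```python
import numpy as np


def masked_nmf_gd(X, mask, rank=20, lr=1e-5, rel_tol=1e-7, max_iters=100_000, seed=0):
    """Gradient descent on a squared loss computed only over observed entries.

    X    : (m, n) data matrix; only entries where mask is True are observed.
    mask : (m, n) boolean array marking observed entries.
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    n_obs = mask.sum()

    # Rescale the data so the variance of the observed entries is one (as reported).
    X = X / X[mask].std()

    # Half-normal initialization, rescaled so the mean of the reconstruction
    # W @ H matches the mean of the observed data -- one plausible reading of
    # "scaled so that the mean matches with that of the dataset" (assumption).
    factor_mean = np.sqrt(X[mask].mean() / rank)
    scale = factor_mean / np.sqrt(2.0 / np.pi)  # E|N(0, 1)| = sqrt(2 / pi)
    W = np.abs(rng.standard_normal((m, rank))) * scale
    H = np.abs(rng.standard_normal((rank, n))) * scale

    def loss(W, H):
        resid = (W @ H - X) * mask
        return float((resid ** 2).sum()) / n_obs

    prev = loss(W, H)
    for _ in range(max_iters):
        resid = (W @ H - X) * mask            # errors on observed entries only
        grad_W = (2.0 / n_obs) * resid @ H.T
        grad_H = (2.0 / n_obs) * W.T @ resid
        W = np.clip(W - lr * grad_W, 0.0, None)   # simple non-negativity projection (assumption)
        H = np.clip(H - lr * grad_H, 0.0, None)
        cur = loss(W, H)
        if prev > 0 and (prev - cur) / prev < rel_tol:  # stop on small relative improvement
            break
        prev = cur
    return W, H, cur
```

The clipping step is just one straightforward way to keep the factors non-negative under plain gradient descent; the paper does not specify whether a projection, a reparameterization, or unconstrained descent was used, so that detail should be checked against the authors' description before reuse.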