reproducibilityindex.ai

Infinite Recommendation Networks: A Data-Centric Approach

Authors: Noveen Sachdeva, Mehak Dhaliwal, Carole-Jean Wu, Julian McAuley

NeurIPS 2022

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Both of our proposed approaches significantly outperform their respective state-of-the-art and when used together, we observe 96-105% of ∞-AE’s performance on the full dataset with as little as 0.1% of the original dataset size, leading us to explore the counter-intuitive question: Is more data what you need for better recommendation? 5 Experiments
Researcher Affiliation	Collaboration	Noveen Sachdeva, Mehak Preet Dhaliwal, Carole-Jean Wu, Julian McAuley; University of California, San Diego; Meta AI; {nosachde,mdhaliwal,jmcauley}@ucsd.edu, carolejeanwu@meta.com
Pseudocode	Yes	We also provide ∞-AE’s training and inference pseudo-codes in Appendix A, Algorithms 1 and 2. We also provide DISTILL-CF’s pseudo-code in Appendix A, Algorithm 3.
Open Source Code	Yes	Our implementation for ∞-AE is available at https://github.com/noveens/infinite_ae_cf. Our implementation for DISTILL-CF is available at https://github.com/noveens/distill_cf.
Open Datasets	Yes	We use four recommendation datasets with varying sizes and sparsity characteristics. A brief set of data statistics can be found in Appendix B.3, Table 2. For each user in the dataset, we randomly split their interaction history into 80/10/10% train-test-validation splits. Following recent warnings against unrealistic dense preprocessing of recommender system datasets [55, 57], we only prune users that have fewer than 3 interactions to enforce at least one interaction per user in the train, test, and validation sets. No such preprocessing is followed for items.
Dataset Splits	Yes	For each user in the dataset, we randomly split their interaction history into 80/10/10% train-test-validation splits.
Hardware Specification	Yes	All of our experiments are performed on a single RTX 3090 GPU, with a random-seed initialization of 42.
Software Dependencies	No	We implement both ∞-AE and DISTILL-CF using JAX [8] along with the Neural Tangents package [44] for the relevant NTK computations. While JAX and Neural Tangents are mentioned, specific version numbers for these software dependencies are not provided.
Experiment Setup	Yes	To ensure a fair comparison, we conduct a hyper-parameter search for all competitors on the validation set. More details on the hyper-parameters for ∞-AE, DISTILL-CF, and all competitors can be found in Appendix B.3, Table 3.
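
The split protocol quoted in the Open Datasets and Dataset Splits rows (drop users with fewer than 3 interactions, then randomly split each remaining user's history 80/10/10) can be illustrated with a minimal sketch. This is not the authors' code: the function name `split_user_histories`, the dict-of-lists input format, and the rounding rule (floor of 10%, with at least one interaction each for validation and test) are assumptions made for illustration, and NumPy is used here instead of JAX for brevity. The seed of 42 follows the Hardware Specification row.

```python
import numpy as np


def split_user_histories(interactions, seed=42, min_interactions=3):
    """Per-user random 80/10/10 split (illustrative sketch).

    interactions: dict mapping a user id to a list of item ids
                  (a hypothetical input format, not from the paper).
    Returns three dicts (train, val, test) keyed by user id.
    """
    rng = np.random.default_rng(seed)
    train, val, test = {}, {}, {}
    for user, items in interactions.items():
        # Prune users with fewer than 3 interactions so every kept user
        # has at least one interaction in each of the three sets.
        if len(items) < min_interactions:
            continue
        shuffled = rng.permutation(items).tolist()
        n = len(shuffled)
        # Assumed rounding rule: floor of 10%, but never fewer than one.
        n_val = max(1, n // 10)
        n_test = max(1, n // 10)
        n_train = n - n_val - n_test
        train[user] = shuffled[:n_train]
        val[user] = shuffled[n_train:n_train + n_val]
        test[user] = shuffled[n_train + n_val:]
    return train, val, test


if __name__ == "__main__":
    toy = {"u1": list(range(10)), "u2": [1, 2], "u3": [7, 8, 9]}
    tr, va, te = split_user_histories(toy)
    for u in tr:
        print(u, len(tr[u]), len(va[u]), len(te[u]))
```

Items are not filtered at all, which matches the statement that no such preprocessing is applied to items. Because the paper does not state how fractional split sizes are rounded, split sizes for short histories may differ from the released implementation.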