Scaling Law for Recommendation Models: Towards General-Purpose User Representations
Authors: Kyuyong Shin, Hanock Kwak, Su Young Kim, Max Nihlén Ramström, Jisu Jeong, Jung-Woo Ha, Kyung-Min Kim
AAAI 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We comprehensively evaluate the pretrained user representation of CLUE with multiple downstream tasks from industrial and benchmark datasets, including an online CTR evaluation. More specifically, we compare the performance of a simple multi-layer perceptron (MLP) employing our task-agnostic pretrained CLUE features with a task-specific model trained for each downstream task. Furthermore, we investigate the empirical scaling laws of training data size, model capacity, sequence length and batch size with extensive experiments, and analyze power-law scaling for training performance as a function of computing resources. |
| Researcher Affiliation | Industry | 1 NAVER, 2 NAVER AI Lab; {ky.shin, hanock.kwak2}@navercorp.com |
| Pseudocode | No | The paper describes the model architecture and mathematical formulations (e.g., equations 1-4) but does not include any explicitly labeled "Pseudocode" or "Algorithm" blocks. |
| Open Source Code | No | The paper does not contain an explicit statement about releasing source code for the described methodology or a link to a code repository. |
| Open Datasets | Yes | Benchmark dataset. We select two categories, Books and Clothing Shoes and Jewelry, from the Amazon review dataset (Ni, Li, and McAuley 2019). |
| Dataset Splits | No | The paper states, "We make sure there are no shared users between the training, validation, and test sets," indicating the use of a validation set. However, it does not provide specific details on the split percentages or sample counts for this validation set, nor does it specify the splitting methodology. |
| Hardware Specification | No | The paper acknowledges the "NAVER Smart Machine Learning (NSML) platform team... for their critical work on the software and hardware infrastructure on which all the experiments were performed," but it does not specify any particular hardware components such as GPU models, CPU models, or memory details. |
| Software Dependencies | No | The paper mentions a "software infrastructure" provided by the NSML platform team in the acknowledgments, but it does not list any specific software dependencies (e.g., libraries, frameworks) with version numbers that would be required for replication. |
| Experiment Setup | Yes | The training details and hyperparameters of the best CLUE are described in Appendix B. We train all models for 100,000 steps. CLUE is trained with 160M parameters, sequence length (128), and batch size (256). |
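The Research Type row above notes that the paper analyzes power-law scaling of training performance as a function of computing resources. The snippet below is a minimal sketch of how such a scaling curve can be fitted, assuming the common form L(C) = a·C^(-b); the compute and loss values are illustrative placeholders, not figures taken from the paper.

```python
# Minimal power-law fitting sketch, assuming loss follows L(C) = a * C^(-b)
# in training compute C. The (compute, loss) pairs are illustrative
# placeholders, not measurements reported in the paper.
import numpy as np

compute = np.array([1e15, 1e16, 1e17, 1e18, 1e19])  # hypothetical compute (FLOPs)
loss = np.array([4.2, 3.6, 3.1, 2.7, 2.4])          # hypothetical training loss

# A power law is linear in log-log space: log10 L = log10 a - b * log10 C,
# so a linear least-squares fit recovers the exponent and prefactor, and it
# stays numerically stable when compute spans many orders of magnitude.
slope, intercept = np.polyfit(np.log10(compute), np.log10(loss), 1)
b_hat = -slope
a_hat = 10.0 ** intercept

print(f"Fitted scaling law: L(C) ~ {a_hat:.3g} * C^(-{b_hat:.3g})")
```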
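The Experiment Setup row reports the headline hyperparameters (160M parameters, sequence length 128, batch size 256, 100,000 training steps). A hypothetical configuration object collecting them could look as follows; the field names are illustrative, since no code is released (per the Open Source Code row) and Appendix B of the paper remains the authoritative source.

```python
# Hypothetical training-configuration sketch for the reported best CLUE run.
# Field names are illustrative assumptions, not taken from any released code.
from dataclasses import dataclass

@dataclass(frozen=True)
class ClueTrainingConfig:
    num_parameters: int = 160_000_000  # reported model capacity (~160M)
    sequence_length: int = 128         # user behavior sequence length
    batch_size: int = 256
    training_steps: int = 100_000

config = ClueTrainingConfig()
print(config)
```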