Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.
Sparse Meets Dense: Unified Generative Recommendations with Cascaded Sparse-Dense Representations
Authors: Yuhao Yang, Zhi Ji, Zhaopeng Li, Yi Li, Zhonglin Mo, Yue Ding, Kai Chen, Zijian Zhang, Jie Li, Shuanglong Li, Lin Liu
NeurIPS 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on public datasets and offline tests validate our method's robustness. Online A/B tests on a real-world advertising platform with over 200 million daily users demonstrate substantial improvements in key metrics, highlighting COBRA's practical advantages. |
| Researcher Affiliation | Industry | Yuhao Yang, Zhi Ji, Zhaopeng Li, Yi Li, Zhonglin Mo, Yue Ding, Kai Chen, Zijian Zhang, Jie Li, Shuanglong Li, Lin Liu Baidu Inc., Beijing, China EMAIL |
| Pseudocode | Yes | For a detailed algorithmic description, please refer to the pseudocode provided in Appendix E. |
| Open Source Code | No | Due to the submission requirements of the commercial company, we are unable to include the code with our submission. However, we provide complete and detailed parameter settings and pseudocode for use by other researchers. |
| Open Datasets | Yes | In our experiments, we evaluate the performance of COBRA using the Amazon Product Reviews dataset [35, 36]. |
| Dataset Splits | Yes | For evaluation, we adopted the widely-used leave-one-out strategy: the last item in each user's sequence served as the test sample, the second-to-last as the validation sample, and the remaining items as training data. The dataset is divided into two parts: the training set D_train and the test set D_test. The training set consists of user interaction logs collected over the first 60 days, covering recommendation content and user behaviors during this period. The test set is constructed from logs recorded on the day immediately following the training period and serves as a benchmark for model performance evaluation. |
| Hardware Specification | No | The focus of this study is on theoretical innovation and methodological exploration of the algorithm, rather than specific engineering implementation and resource optimization. We are primarily concerned with the structural design of the model and the logical flow of the algorithm. At this stage, we believe that the innovativeness and effectiveness of the algorithm are the more critical factors to consider. |
| Software Dependencies | No | In our approach, we adopt a method for generating semantic IDs similar to the one used in [19]. However, unlike [19], which uses a different configuration, we employ a 3-level semantic ID structure, where each level corresponds to a codebook size of 32. These semantic IDs are generated using the T5 model. COBRA is implemented with a lightweight architecture, featuring a 1-layer encoder and a 2-layer decoder. |
| Experiment Setup | Yes | COBRA is implemented with a lightweight architecture, featuring a 1-layer encoder and a 2-layer decoder. COBRA achieves an optimal balance between recall and diversity at τ = 0.9. |
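
The leave-one-out strategy quoted in the Dataset Splits row can be illustrated with a minimal sketch. This is not the authors' code, which was not released. The function name `leave_one_out_split`, the dictionary layout, and the choice to skip users with fewer than three interactions are all assumptions made for illustration.

```python
from typing import Dict, Hashable, List, Tuple

Item = Hashable


def leave_one_out_split(
    sequences: Dict[str, List[Item]],
) -> Tuple[Dict[str, List[Item]], Dict[str, Item], Dict[str, Item]]:
    """Split chronologically ordered per-user item sequences.

    Last item -> test, second-to-last -> validation, the rest -> training.
    Hypothetical helper; the paper does not specify an implementation.
    """
    train: Dict[str, List[Item]] = {}
    valid: Dict[str, Item] = {}
    test: Dict[str, Item] = {}
    for user, items in sequences.items():
        # Assumption: users with fewer than 3 interactions are skipped,
        # because they cannot supply all three parts of the split.
        if len(items) < 3:
            continue
        train[user] = list(items[:-2])
        valid[user] = items[-2]
        test[user] = items[-1]
    return train, valid, test


# Usage with toy data (item IDs are made up).
example = {"u1": ["a", "b", "c", "d"], "u2": ["x", "y"]}
train, valid, test = leave_one_out_split(example)
```

Here `u1` contributes `["a", "b"]` to training, `"c"` to validation, and `"d"` to testing, while `u2` is dropped under the minimum-length assumption.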