Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et alK. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026..
Representation Learning-Assisted Click-Through Rate Prediction
Authors: Wentao Ouyang, Xiuwu Zhang, Shukui Ren, Chao Qi, Zhaojie Liu, Yanlong Du
IJCAI 2019 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on two large-scale datasets demonstrate that Deep MCP outperforms several state-of-the-art models for CTR prediction. |
| Researcher Affiliation | Industry | Wentao Ouyang, Xiuwu Zhang, Shukui Ren, Chao Qi, Zhaojie Liu and Yanlong Du Alibaba Group EMAIL |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | We make the implementation code of Deep MCP publicly available1. 1https://github.com/oywtece/deepmcp |
| Open Datasets | Yes | Avito advertising dataset2. This dataset contains a random sample of ad logs from avito.ru, the largest general classified website in Russia. We use the ad logs from 2015-04-28 to 2015-05-18 for training, those on 2015-05-19 for validation, and those on 2015-05-20 for testing. ... 2https://www.kaggle.com/c/avito-context-ad-clicks/data |
| Dataset Splits | Yes | We use the ad logs from 2015-04-28 to 2015-05-18 for training, those on 2015-05-19 for validation, and those on 2015-05-20 for testing. ... We use ad logs of 30 consecutive days during Aug.-Sep. 2018 for training, logs of the next day for validation, and logs of the day after the next day for testing. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | All the methods are implemented in Tensorflow and optimized by the Adagrad algorithm [Duchi et al., 2011]. |
| Experiment Setup | Yes | We set the embedding dimension of each feature as K = 10, because the number of distinct features is huge. We set the number of fully connected layers in neural network-based models as 2, with dimensions 512 and 256. We set the batch size as 128, the context window size as C = 2 and the number of negative ads as Q = 4. The dropout ratio is set to 0.5. |