Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Online Conversion Rate Prediction via Multi-Interval Screening and Synthesizing under Delayed Feedback

Authors: Qiming Liu, Xiang Ao, Yuyao Guo, Qing He

AAAI 2024 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experiments on two real-world advertising datasets validate the effectiveness of our model.
Researcher Affiliation	Academia	(1) Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS, Beijing 100190, China; (2) University of Chinese Academy of Sciences, CAS, Beijing 100049, China; (3) Institute of Intelligent Computing Technology, Suzhou, CAS
Pseudocode	No	The paper does not contain a clearly labeled pseudocode or algorithm block.
Open Source Code	Yes	We implement the MISS in Tensorflow and the source code will be available on GitHub [3]. [3] https://github.com/NealWalker/MISS
Open Datasets	Yes	Criteo Conversion Logs: Criteo [1] is a widely used dataset for CVR prediction task (Chen et al. 2022). It contains 60 days data, with a 30 days attribution window d_max. [1] https://labs.criteo.com/2013/12/conversion-logs-dataset/ Tencent Advertising Algorithm Competition 2017: Tencent dataset [2] includes 9 days of data with a 5 days attribution window d_max. The dataset contains 22 million samples. [2] https://algo.qq.com/?lang=en
Dataset Splits	Yes	Following previous work (Chen et al. 2022; Yang et al. 2021), we separate the dataset into pretraining part and streaming part. Methods are allowed to use all the data from the former part to complete pretraining as they need. Then, models keep getting evaluated and updated hour by hour on the streaming part. The online training data only contains information available at the current timestamps.
Hardware Specification	No	The paper mentions "A DNN model" and "Tensorflow" but does not specify any hardware details like GPU/CPU models or memory used for experiments.
Software Dependencies	No	The paper states, "We implement the MISS in Tensorflow," but does not specify a version number for Tensorflow or any other software dependencies.
Experiment Setup	Yes	A DNN model with a fixed hidden size (128,128) is used as the base model for all the methods. Each hidden layer is followed by the Leaky ReLU activation function (Maas et al. 2013). The synthesizing model for MISS only has one hidden layer with size [32]. L2 regularization is set to 10^-6 on Criteo Dataset and 10^-7 on Tencent Dataset. The models are updated by the Adam optimizer (Kingma and Ba 2014). For a fair comparison, we apply the grid search strategy to tune the best learning rate among {0.0001, 0.0005, 0.001}, and tune the waiting window for previous models in accordance with the original papers. MISS and FTP apply the same waiting windows, [1D, 7D, 14D, 21D, 30D] on Criteo and [1H, 6H, 24H, 48H, 120H] on Tencent.
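
The settings quoted in the Experiment Setup row can be gathered into a small configuration grid. The following is a minimal sketch, not the authors' code: the numeric values are copied from the row above, while every name (RunConfig, DATASET_SETTINGS, to_hours, build_grid) is hypothetical and chosen only for illustration. It enumerates one configuration per dataset and learning rate and converts the waiting windows to hours.

```python
from dataclasses import dataclass
from itertools import product

# Values copied from the Experiment Setup row. All names are hypothetical
# and are not taken from the authors' repository.
BASE_HIDDEN_SIZES = (128, 128)            # base DNN, Leaky ReLU after each hidden layer
SYNTH_HIDDEN_SIZES = (32,)                # MISS synthesizing model, one hidden layer
LEARNING_RATES = (0.0001, 0.0005, 0.001)  # grid-searched values
DATASET_SETTINGS = {
    "criteo": {"l2": 1e-6, "windows": ("1D", "7D", "14D", "21D", "30D")},
    "tencent": {"l2": 1e-7, "windows": ("1H", "6H", "24H", "48H", "120H")},
}


@dataclass(frozen=True)
class RunConfig:
    dataset: str
    learning_rate: float
    l2: float
    waiting_windows_hours: tuple
    base_hidden_sizes: tuple = BASE_HIDDEN_SIZES
    synth_hidden_sizes: tuple = SYNTH_HIDDEN_SIZES
    optimizer: str = "adam"


def to_hours(window: str) -> int:
    """Convert a window label such as '7D' or '6H' to a number of hours."""
    value, unit = int(window[:-1]), window[-1]
    if unit == "D":
        return value * 24
    if unit == "H":
        return value
    raise ValueError(f"unknown unit in {window!r}")


def build_grid() -> list:
    """Return one RunConfig per (dataset, learning rate) pair."""
    configs = []
    for (name, settings), lr in product(DATASET_SETTINGS.items(), LEARNING_RATES):
        hours = tuple(to_hours(w) for w in settings["windows"])
        configs.append(RunConfig(name, lr, settings["l2"], hours))
    return configs


if __name__ == "__main__":
    for cfg in build_grid():
        print(cfg)
```

In this sketch the largest waiting window on each dataset matches the attribution window quoted in the Open Datasets row: 30 days (720 hours) on Criteo and 120 hours (5 days) on Tencent.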