Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

GenAuction: A Generative Auction for Online Advertising

Authors: Yuchao Ma, Ruohan Qian, Bingzhe Wang, Qi Qi, Wenqiang Liu, Qian Tang, Zhao Shen, Wei Zhong, Bo Shen, Yixin Su, Bin Zou, Wen Yi, Zhi Guo, Shuanglong Li, Lin Liu

AAAI 2025 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We conduct extensive experiments using real-world data and online A/B tests to validate that GenAuction efficiently handles multi-objective allocation tasks, demonstrating its efficacy and potential for real-world application. In this section, we conduct both offline and online experiments using real-world datasets to evaluate the performance of GenAuction in multi-objective allocation tasks."
Researcher Affiliation | Collaboration | 1. Gaoling School of Artificial Intelligence, Renmin University of China, Beijing, China; 2. Baidu Inc., Beijing, China
Pseudocode | Yes | Algorithm 1: Transformer-based DDPG
Open Source Code | No | The text does not provide concrete access to source code for the methodology described in this paper; it only provides a link to datasets.
Open Datasets | Yes | Datasets: https://drive.google.com/drive/folders/1xHsLdHJRPWXCF2s2kdat7xUnQ6cFIILN?usp=drive_link
Dataset Splits | Yes | "This comprehensive dataset is then partitioned into a training set and a test set at a ratio of 9:1 to ensure the rigor and validity of our experiments."
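The paper describes a 9:1 train/test partition but does not publish the splitting code. A minimal sketch of such a partition (the function name, seeding, and shuffling strategy are illustrative assumptions, not from the paper):

```python
import random

def split_9_1(records, seed=0):
    """Shuffle records and partition them 90% train / 10% test."""
    rng = random.Random(seed)  # fixed seed for a reproducible split (assumption)
    shuffled = records[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * 0.9)
    return shuffled[:cut], shuffled[cut:]

train, test = split_9_1(list(range(100)))
print(len(train), len(test))  # 90 10
```

Whether the authors shuffled before splitting, or split chronologically (common for auction logs), is not stated in the paper.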
Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, processor types, or memory amounts used for its experiments. It mentions conducting an online A/B test, but without hardware specifications.
Software Dependencies | No | The paper does not provide ancillary software details with version numbers (e.g., library or solver names such as Python 3.8 or CPLEX 12.4) needed to replicate the experiments.
Experiment Setup | Yes | "The whole training algorithm process is presented in Algorithm 1. ... Initialize replay memory D with capacity C; ... Discount factor γ. ... Update target network parameters θ_targ ← ρθ_targ + (1 − ρ)θ, ϕ_targ ← ρϕ_targ + (1 − ρ)ϕ. ... In each round t ≤ T, for each candidate PV containing X_t ads, the embedding layers of the Evaluator map the PV-level features into embeddings." The reward function is r = Σ_{n,k} { b_{nt} x_{ntk} α_{ntk} β_{ntk} − Σ_i max[0, |λ_i f_i(x_{ntk}) − thr_i|] }, where f_i(x_{ntk}) is the i-th performance metric, and λ_i is a Lagrangian multiplier with thr_i serving as the corresponding threshold. For the experimental settings: "For each round t, we imposed a limit of slots for display up to X_t = 3 ads and a maximum of X = 2T ads being displayed across all T PVs. Furthermore, we introduced a tunable threshold Γ_t to quantify the relevance of ads to individual PVs."
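The Algorithm 1 excerpt above names two concrete update rules: a Polyak soft update of the target networks and a Lagrangian-penalized reward. A minimal sketch of both, assuming the reward form quoted above (variable names and array shapes are illustrative, not the authors' implementation):

```python
import numpy as np

def soft_update(target_params, params, rho=0.995):
    """Polyak averaging: theta_targ <- rho * theta_targ + (1 - rho) * theta."""
    return [rho * t + (1.0 - rho) * p for t, p in zip(target_params, params)]

def penalized_reward(b, x, alpha, beta, f_vals, lam, thr):
    """Reward = sum over (n, k) of b*x*alpha*beta, minus Lagrangian penalties.

    b, x, alpha, beta: arrays over the (n, k) allocation entries.
    f_vals: per-metric values f_i(x); lam, thr: multipliers and thresholds.
    """
    value = float(np.sum(b * x * alpha * beta))
    penalty = float(np.sum(np.maximum(0.0, np.abs(lam * f_vals - thr))))
    return value - penalty
```

For example, with two allocated entries of value 0.5 and 1.0 and every metric exactly at its threshold, the penalty term vanishes and the reward equals 1.5. The `rho` default of 0.995 is a common DDPG choice, not a value reported in the paper.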