Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Coupled Group Lasso for Web-Scale CTR Prediction in Display Advertising

Authors: Ling Yan, Wu-Jun Li, Gui-Rong Xue, Dingyi Han

ICML 2014 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experimental results on real-world data sets show that our CGL model can achieve state-of-the-art performance on web-scale CTR prediction tasks.
Researcher Affiliation	Collaboration	Ling Yan EMAIL, Shanghai Key Laboratory of Scalable Computing and Systems, Department of Computer Science and Engineering, Shanghai Jiao Tong University, China; Wu-Jun Li EMAIL, National Key Laboratory for Novel Software Technology, Department of Computer Science and Technology, Nanjing University, China; Gui-Rong Xue EMAIL, Alibaba Group, China; Dingyi Han EMAIL, Alibaba Group, China
Pseudocode	Yes	Algorithm 1 Alternate Learning for CGL
Open Source Code	No	The paper does not include an explicit statement about releasing source code for the described methodology, nor does it provide a link to a code repository.
Open Datasets	No	We conduct our experiment on three real-world data sets collected from Taobao of Alibaba group. We build our data sets from the logs of the advertisements displayed in http://www.taobao.com, one of the most famous C2C e-commerce web sites in China.
Dataset Splits	Yes	We sample 20% of each training set for validation to specify the hyper-parameters of our CGL model and other baselines.
Hardware Specification	Yes	We have an MPI-cluster with hundreds of nodes, each of which is a 24-core server with 2.2GHz Intel(R) Xeon(R) E5-2430 processor and 96GB of RAM.
Software Dependencies	No	The paper mentions using MPI and L-BFGS algorithms but does not provide specific version numbers for any software dependencies or libraries.
Experiment Setup	Yes	The k in CGL is fixed to 50 in our experiment unless otherwise stated. We vary the values of the hyper-parameter λ and draw the influence on the performance in Figure 3 (b).
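
The Dataset Splits row quotes the paper as sampling 20% of each training set for validation. The following is a minimal sketch of one way such a hold-out could be done, not the authors' procedure: the excerpt does not say how the sample was drawn, so uniform random sampling, the fixed seed, and the function name `split_train_validation` are all assumptions made for illustration.

```python
import random


def split_train_validation(examples, validation_fraction=0.2, seed=0):
    """Hold out a random fraction of the training examples for validation.

    Assumption: uniform random sampling without replacement. The paper
    excerpt only states that 20% of each training set is sampled.
    """
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    indices = list(range(len(examples)))
    rng.shuffle(indices)

    # The first n_val shuffled indices become the validation set.
    n_val = int(round(len(examples) * validation_fraction))
    validation_indices = set(indices[:n_val])

    # Preserve the original ordering inside each partition.
    train = [ex for i, ex in enumerate(examples) if i not in validation_indices]
    validation = [ex for i, ex in enumerate(examples) if i in validation_indices]
    return train, validation
```

Hyper-parameters such as λ would then be chosen by their performance on the validation part, with the remaining 80% used for fitting. Recording the seed alongside the split is one way to make this step repeatable.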