Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Coupled Group Lasso for Web-Scale CTR Prediction in Display Advertising

Authors: Ling Yan, Wu-Jun Li, Gui-Rong Xue, Dingyi Han

ICML 2014

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on real-world data sets show that our CGL model can achieve state-of-the-art performance on web-scale CTR prediction tasks.
Researcher Affiliation | Collaboration | Ling Yan EMAIL Shanghai Key Laboratory of Scalable Computing and Systems, Department of Computer Science and Engineering, Shanghai Jiao Tong University, China; Wu-Jun Li EMAIL National Key Laboratory for Novel Software Technology, Department of Computer Science and Technology, Nanjing University, China; Gui-Rong Xue EMAIL Alibaba Group, China; Dingyi Han EMAIL Alibaba Group, China
Pseudocode | Yes | Algorithm 1 Alternate Learning for CGL
Open Source Code | No | The paper does not include an explicit statement about releasing source code for the described methodology, nor does it provide a link to a code repository.
Open Datasets | No | We conduct our experiment on three real-world data sets collected from Taobao of Alibaba group. We build our data sets from the logs of the advertisements displayed in http://www.taobao.com, one of the most famous C2C e-commerce web sites in China.
Dataset Splits | Yes | We sample 20% of each training set for validation to specify the hyper-parameters of our CGL model and other baselines.
Hardware Specification | Yes | We have an MPI-cluster with hundreds of nodes, each of which is a 24-core server with 2.2GHz Intel(R) Xeon(R) E5-2430 processor and 96GB of RAM.
Software Dependencies | No | The paper mentions using MPI and L-BFGS algorithms but does not provide specific version numbers for any software dependencies or libraries.
Experiment Setup | Yes | The k in CGL is fixed to 50 in our experiment unless otherwise stated. We vary the values of the hyper-parameter λ and draw the influence on the performance in Figure 3 (b).
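
The Dataset Splits entry reports that 20% of each training set is sampled for validation. A minimal sketch of such a hold-out split follows. It assumes uniform random sampling with a fixed seed, which the quoted text does not specify; the function name `split_train_validation` and its parameters are hypothetical and not taken from the paper.

```python
import random


def split_train_validation(examples, validation_fraction=0.2, seed=0):
    """Hold out a random fraction of the training examples for validation.

    Assumption: uniform random sampling without replacement. The paper only
    states that 20% of each training set is sampled for validation.
    """
    if not 0.0 < validation_fraction < 1.0:
        raise ValueError("validation_fraction must be strictly between 0 and 1")

    # Shuffle the indices with a seeded generator so the split is repeatable.
    indices = list(range(len(examples)))
    random.Random(seed).shuffle(indices)

    # The first n_validation shuffled indices form the validation set.
    n_validation = int(round(len(examples) * validation_fraction))
    validation_idx = set(indices[:n_validation])

    # Preserve the original ordering within each part.
    train = [x for i, x in enumerate(examples) if i not in validation_idx]
    validation = [x for i, x in enumerate(examples) if i in validation_idx]
    return train, validation


# Toy usage: 1000 placeholder examples stand in for logged ad impressions.
train, validation = split_train_validation(list(range(1000)))
```

Hyper-parameters such as λ would then be selected by comparing models on `validation` rather than on the test set.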